* [PATCH 00/10] Various fixes around invalidate_page()
@ 2022-02-14 20:00 Matthew Wilcox (Oracle)
  2022-02-14 20:00 ` [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal() Matthew Wilcox (Oracle)
                   ` (9 more replies)
  0 siblings, 10 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

Blame Rik for this.  His "clean up hwpoison page cache page in fault path"
patch on Friday made me look at how invalidate_inode_page() handled tail
pages.  It's not buggy, but __invalidate_mapping_pages() has a minor
accounting bug, and all of this related code could be done a little
more efficiently by using folios instead of pages.

I don't _love_ the name mapping_shrink_folio(), but I'm having a hard
time coming up with a verb that means "remove from cache if unused".
Maybe mapping_evict_folio()?  Or filemap_evict_folio()?

The last two patches are just cleanup that should be done at some point,
and since this patchset already conflicts with everything else, why not?
I've stashed these in my for-next tree immediately after converting
__remove_mapping() to take a folio because it seems to fit best there.
The bots may complain about build problems as a result, but this is
really a patch series for humans to review.

Matthew Wilcox (Oracle) (10):
  splice: Use a folio in page_cache_pipe_buf_try_steal()
  mm/truncate: Inline invalidate_complete_page() into its one caller
  mm/truncate: Convert invalidate_inode_page() to use a folio
  mm/truncate: Replace page_mapped() call in invalidate_inode_page()
  mm: Convert remove_mapping() to take a folio
  mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio()
  mm/truncate: Convert __invalidate_mapping_pages() to use a folio
  mm: Turn deactivate_file_page() into deactivate_file_folio()
  mm/truncate: Combine invalidate_mapping_pagevec() and
    __invalidate_mapping_pages()
  fs: Move many prototypes to pagemap.h

 drivers/block/xen-blkback/xenbus.c           |   1 +
 drivers/usb/gadget/function/f_mass_storage.c |   1 +
 fs/coda/file.c                               |   1 +
 fs/iomap/fiemap.c                            |   1 +
 fs/nfsd/filecache.c                          |   1 +
 fs/nfsd/vfs.c                                |   1 +
 fs/splice.c                                  |  24 ++--
 fs/vboxsf/utils.c                            |   1 +
 include/linux/fs.h                           | 120 -------------------
 include/linux/mm.h                           |   1 -
 include/linux/pagemap.h                      | 114 ++++++++++++++++++
 include/linux/swap.h                         |   3 +-
 mm/internal.h                                |   4 +
 mm/memory-failure.c                          |   4 +-
 mm/swap.c                                    |  33 +++--
 mm/truncate.c                                | 109 ++++++++---------
 mm/vmscan.c                                  |  23 ++--
 17 files changed, 219 insertions(+), 223 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal()
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:15   ` Christoph Hellwig
  2022-02-15  8:32   ` Miaohe Lin
  2022-02-14 20:00 ` [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

This saves a lot of calls to compound_head().
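
As a rough illustration of the saving (toy structures and counters, not the real kernel types or API): page-based helpers must look up the head page on every call, whereas converting once with page_folio() lets every subsequent helper skip that lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: a compound page is a run of struct pages whose tail
 * entries point at the head page. */
struct page { struct page *head; int locked; };

static int compound_head_calls;

static struct page *compound_head(struct page *page)
{
	compound_head_calls++;
	return page->head ? page->head : page;
}

/* Page-based helpers normalise to the head page on every call... */
static void lock_page(struct page *page)   { compound_head(page)->locked = 1; }
static void unlock_page(struct page *page) { compound_head(page)->locked = 0; }

/* ...whereas folio helpers take the head page ("folio") directly. */
static void folio_lock(struct page *folio)   { folio->locked = 1; }
static void folio_unlock(struct page *folio) { folio->locked = 0; }

int count_page_api(struct page *tail)
{
	compound_head_calls = 0;
	lock_page(tail);		/* one lookup */
	unlock_page(tail);		/* and another */
	return compound_head_calls;
}

int count_folio_api(struct page *tail)
{
	compound_head_calls = 0;
	struct page *folio = compound_head(tail);	/* page_folio(): once */
	folio_lock(folio);
	folio_unlock(folio);
	return compound_head_calls;
}
```

The longer the call chain on the folio, the bigger the saving.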

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/splice.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 5dbce4dcc1a7..23ff9c303abc 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -47,26 +47,27 @@ static bool page_cache_pipe_buf_try_steal(struct pipe_inode_info *pipe,
 		struct pipe_buffer *buf)
 {
 	struct page *page = buf->page;
+	struct folio *folio = page_folio(page);
 	struct address_space *mapping;
 
-	lock_page(page);
+	folio_lock(folio);
 
-	mapping = page_mapping(page);
+	mapping = folio_mapping(folio);
 	if (mapping) {
-		WARN_ON(!PageUptodate(page));
+		WARN_ON(!folio_test_uptodate(folio));
 
 		/*
 		 * At least for ext2 with nobh option, we need to wait on
-		 * writeback completing on this page, since we'll remove it
+		 * writeback completing on this folio, since we'll remove it
 		 * from the pagecache.  Otherwise truncate wont wait on the
-		 * page, allowing the disk blocks to be reused by someone else
+		 * folio, allowing the disk blocks to be reused by someone else
 		 * before we actually wrote our data to them. fs corruption
 		 * ensues.
 		 */
-		wait_on_page_writeback(page);
+		folio_wait_writeback(folio);
 
-		if (page_has_private(page) &&
-		    !try_to_release_page(page, GFP_KERNEL))
+		if (folio_has_private(folio) &&
+		    !filemap_release_folio(folio, GFP_KERNEL))
 			goto out_unlock;
 
 		/*
@@ -80,11 +81,11 @@ static bool page_cache_pipe_buf_try_steal(struct pipe_inode_info *pipe,
 	}
 
 	/*
-	 * Raced with truncate or failed to remove page from current
+	 * Raced with truncate or failed to remove folio from current
 	 * address space, unlock and return failure.
 	 */
 out_unlock:
-	unlock_page(page);
+	folio_unlock(folio);
 	return false;
 }
 
-- 
2.34.1



* [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
  2022-02-14 20:00 ` [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal() Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-14 23:09   ` John Hubbard
                     ` (3 more replies)
  2022-02-14 20:00 ` [PATCH 03/10] mm/truncate: Convert invalidate_inode_page() to use a folio Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  9 siblings, 4 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

invalidate_inode_page() is the only caller of invalidate_complete_page(),
and inlining it reveals that the first check is unnecessary (we hold the
page locked, and we just retrieved the mapping from the page).  Removing
the check does make one difference, though: tail pages no longer fail it,
so it is now possible to remove a tail page from a mapping.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/truncate.c | 28 +++++-----------------------
 1 file changed, 5 insertions(+), 23 deletions(-)

diff --git a/mm/truncate.c b/mm/truncate.c
index 9dbf0b75da5d..e5e2edaa0b76 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -193,27 +193,6 @@ static void truncate_cleanup_folio(struct folio *folio)
 	folio_clear_mappedtodisk(folio);
 }
 
-/*
- * This is for invalidate_mapping_pages().  That function can be called at
- * any time, and is not supposed to throw away dirty pages.  But pages can
- * be marked dirty at any time too, so use remove_mapping which safely
- * discards clean, unused pages.
- *
- * Returns non-zero if the page was successfully invalidated.
- */
-static int
-invalidate_complete_page(struct address_space *mapping, struct page *page)
-{
-
-	if (page->mapping != mapping)
-		return 0;
-
-	if (page_has_private(page) && !try_to_release_page(page, 0))
-		return 0;
-
-	return remove_mapping(mapping, page);
-}
-
 int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
 {
 	if (folio->mapping != mapping)
@@ -309,7 +288,10 @@ int invalidate_inode_page(struct page *page)
 		return 0;
 	if (page_mapped(page))
 		return 0;
-	return invalidate_complete_page(mapping, page);
+	if (page_has_private(page) && !try_to_release_page(page, 0))
+		return 0;
+
+	return remove_mapping(mapping, page);
 }
 
 /**
@@ -584,7 +566,7 @@ void invalidate_mapping_pagevec(struct address_space *mapping,
 }
 
 /*
- * This is like invalidate_complete_page(), except it ignores the page's
+ * This is like invalidate_inode_page(), except it ignores the page's
  * refcount.  We do this because invalidate_inode_pages2() needs stronger
  * invalidation guarantees, and cannot afford to leave pages behind because
  * shrink_page_list() has a temp ref on them, or because they're transiently
-- 
2.34.1



* [PATCH 03/10] mm/truncate: Convert invalidate_inode_page() to use a folio
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
  2022-02-14 20:00 ` [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal() Matthew Wilcox (Oracle)
  2022-02-14 20:00 ` [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:18   ` Christoph Hellwig
  2022-02-15  8:32   ` Miaohe Lin
  2022-02-14 20:00 ` [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page() Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

This saves a number of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/truncate.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/truncate.c b/mm/truncate.c
index e5e2edaa0b76..b73c30c95cd0 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -281,14 +281,15 @@ EXPORT_SYMBOL(generic_error_remove_page);
  */
 int invalidate_inode_page(struct page *page)
 {
-	struct address_space *mapping = page_mapping(page);
+	struct folio *folio = page_folio(page);
+	struct address_space *mapping = folio_mapping(folio);
 	if (!mapping)
 		return 0;
-	if (PageDirty(page) || PageWriteback(page))
+	if (folio_test_dirty(folio) || folio_test_writeback(folio))
 		return 0;
 	if (page_mapped(page))
 		return 0;
-	if (page_has_private(page) && !try_to_release_page(page, 0))
+	if (folio_has_private(folio) && !filemap_release_folio(folio, 0))
 		return 0;
 
 	return remove_mapping(mapping, page);
-- 
2.34.1



* [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page()
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2022-02-14 20:00 ` [PATCH 03/10] mm/truncate: Convert invalidate_inode_page() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:19   ` Christoph Hellwig
                     ` (2 more replies)
  2022-02-14 20:00 ` [PATCH 05/10] mm: Convert remove_mapping() to take a folio Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  9 siblings, 3 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

folio_mapped() is expensive because it has to check each page's mapcount
field.  A cheaper check is whether there are any extra references to
the page, other than the one we own and the ones held by the page cache.
The call to remove_mapping() will fail in any case if it cannot freeze
the refcount, but failing here avoids cycling the i_pages spinlock.
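
The arithmetic behind the new check can be sketched like this (standalone toy code, not the kernel's): the page cache holds one reference per page in the folio, and we hold one more, so any count above folio_nr_pages() + 1 means some other user has a transient reference and remove_mapping()'s refcount freeze would fail anyway.

```c
#include <assert.h>
#include <stdbool.h>

/* Expected references on a cached folio we are trying to invalidate:
 * nr_pages held by the page cache, plus the one we own.  Anything
 * beyond that is an extra reference held elsewhere. */
static bool folio_has_extra_refs(long ref_count, long nr_pages)
{
	return ref_count > nr_pages + 1;
}
```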

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/truncate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/truncate.c b/mm/truncate.c
index b73c30c95cd0..d67fa8871b75 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -287,7 +287,7 @@ int invalidate_inode_page(struct page *page)
 		return 0;
 	if (folio_test_dirty(folio) || folio_test_writeback(folio))
 		return 0;
-	if (page_mapped(page))
+	if (folio_ref_count(folio) > folio_nr_pages(folio) + 1)
 		return 0;
 	if (folio_has_private(folio) && !filemap_release_folio(folio, 0))
 		return 0;
-- 
2.34.1



* [PATCH 05/10] mm: Convert remove_mapping() to take a folio
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2022-02-14 20:00 ` [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page() Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:21   ` Christoph Hellwig
  2022-02-15  8:33   ` Miaohe Lin
  2022-02-14 20:00 ` [PATCH 06/10] mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio() Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

Add kernel-doc and return the number of pages removed in order to
get the statistics right in __invalidate_mapping_pages().
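
The accounting point can be sketched in isolation (toy code under the assumption of 1-, 4- and 8-page folios, not the kernel's real loop): a boolean-style return undercounts once folios span multiple pages, while returning the page count keeps the caller's tally exact.

```c
#include <assert.h>

/* New-style return: the number of pages actually removed, 0 on failure. */
static long remove_mapping_pages(long nr_pages, int removed)
{
	return removed ? nr_pages : 0;
}

/* A caller such as __invalidate_mapping_pages() can then sum the
 * return values directly to get a correct page count. */
long count_invalidated(const long *folio_sizes, const int *removed, int n)
{
	long count = 0;
	for (int i = 0; i < n; i++)
		count += remove_mapping_pages(folio_sizes[i], removed[i]);
	return count;
}
```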

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/splice.c          |  5 ++---
 include/linux/swap.h |  2 +-
 mm/truncate.c        |  2 +-
 mm/vmscan.c          | 23 ++++++++++++++---------
 4 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 23ff9c303abc..047b79db8eb5 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -46,8 +46,7 @@
 static bool page_cache_pipe_buf_try_steal(struct pipe_inode_info *pipe,
 		struct pipe_buffer *buf)
 {
-	struct page *page = buf->page;
-	struct folio *folio = page_folio(page);
+	struct folio *folio = page_folio(buf->page);
 	struct address_space *mapping;
 
 	folio_lock(folio);
@@ -74,7 +73,7 @@ static bool page_cache_pipe_buf_try_steal(struct pipe_inode_info *pipe,
 		 * If we succeeded in removing the mapping, set LRU flag
 		 * and return good.
 		 */
-		if (remove_mapping(mapping, page)) {
+		if (remove_mapping(mapping, folio)) {
 			buf->flags |= PIPE_BUF_FLAG_LRU;
 			return true;
 		}
diff --git a/include/linux/swap.h b/include/linux/swap.h
index e7cb7a1e6ceb..304f174b4d31 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -395,7 +395,7 @@ extern unsigned long mem_cgroup_shrink_node(struct mem_cgroup *mem,
 						unsigned long *nr_scanned);
 extern unsigned long shrink_all_memory(unsigned long nr_pages);
 extern int vm_swappiness;
-extern int remove_mapping(struct address_space *mapping, struct page *page);
+long remove_mapping(struct address_space *mapping, struct folio *folio);
 
 extern unsigned long reclaim_pages(struct list_head *page_list);
 #ifdef CONFIG_NUMA
diff --git a/mm/truncate.c b/mm/truncate.c
index d67fa8871b75..8aa86e294775 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -292,7 +292,7 @@ int invalidate_inode_page(struct page *page)
 	if (folio_has_private(folio) && !filemap_release_folio(folio, 0))
 		return 0;
 
-	return remove_mapping(mapping, page);
+	return remove_mapping(mapping, folio);
 }
 
 /**
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2965be8df713..7959df4d611b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1335,23 +1335,28 @@ static int __remove_mapping(struct address_space *mapping, struct folio *folio,
 	return 0;
 }
 
-/*
- * Attempt to detach a locked page from its ->mapping.  If it is dirty or if
- * someone else has a ref on the page, abort and return 0.  If it was
- * successfully detached, return 1.  Assumes the caller has a single ref on
- * this page.
+/**
+ * remove_mapping() - Attempt to remove a folio from its mapping.
+ * @mapping: The address space.
+ * @folio: The folio to remove.
+ *
+ * If the folio is dirty, under writeback or if someone else has a ref
+ * on it, removal will fail.
+ * Return: The number of pages removed from the mapping.  0 if the folio
+ * could not be removed.
+ * Context: The caller should have a single refcount on the folio and
+ * hold its lock.
  */
-int remove_mapping(struct address_space *mapping, struct page *page)
+long remove_mapping(struct address_space *mapping, struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	if (__remove_mapping(mapping, folio, false, NULL)) {
 		/*
-		 * Unfreezing the refcount with 1 rather than 2 effectively
+		 * Unfreezing the refcount with 1 effectively
 		 * drops the pagecache ref for us without requiring another
 		 * atomic operation.
 		 */
 		folio_ref_unfreeze(folio, 1);
-		return 1;
+		return folio_nr_pages(folio);
 	}
 	return 0;
 }
-- 
2.34.1



* [PATCH 06/10] mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio()
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2022-02-14 20:00 ` [PATCH 05/10] mm: Convert remove_mapping() to take a folio Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:23   ` Christoph Hellwig
  2022-02-15  9:37   ` Miaohe Lin
  2022-02-14 20:00 ` [PATCH 07/10] mm/truncate: Convert __invalidate_mapping_pages() to use a folio Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

Some of the callers already have the address_space and can avoid calling
folio_mapping() and checking if the folio was already truncated.  Also
add kernel-doc and fix the return type (in case we ever support folios
larger than 4TB).
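
The return-type concern can be illustrated with plain arithmetic (toy code assuming 4KiB pages and a hypothetical multi-terabyte folio, not kernel code): the page count of such a folio no longer fits in a 32-bit int, while a long holds it on 64-bit systems.

```c
#include <assert.h>

/* Page count of a folio, kept in a long. */
long pages_in(long folio_bytes)
{
	return folio_bytes / 4096;	/* assuming 4KiB pages */
}

/* The same count forced through an int truncates for huge folios. */
int pages_in_as_int(long folio_bytes)
{
	return (int)(folio_bytes / 4096);
}
```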

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h  |  1 -
 mm/internal.h       |  1 +
 mm/memory-failure.c |  4 ++--
 mm/truncate.c       | 34 +++++++++++++++++++++++-----------
 4 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4637368d9455..53b301dc5c14 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1853,7 +1853,6 @@ extern void truncate_setsize(struct inode *inode, loff_t newsize);
 void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to);
 void truncate_pagecache_range(struct inode *inode, loff_t offset, loff_t end);
 int generic_error_remove_page(struct address_space *mapping, struct page *page);
-int invalidate_inode_page(struct page *page);
 
 #ifdef CONFIG_MMU
 extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
diff --git a/mm/internal.h b/mm/internal.h
index b7a2195c12b1..927a17d58b85 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -100,6 +100,7 @@ void filemap_free_folio(struct address_space *mapping, struct folio *folio);
 int truncate_inode_folio(struct address_space *mapping, struct folio *folio);
 bool truncate_inode_partial_folio(struct folio *folio, loff_t start,
 		loff_t end);
+long invalidate_inode_page(struct page *page);
 
 /**
  * folio_evictable - Test whether a folio is evictable.
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 97a9ed8f87a9..0b72a936b8dd 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2139,7 +2139,7 @@ static bool isolate_page(struct page *page, struct list_head *pagelist)
  */
 static int __soft_offline_page(struct page *page)
 {
-	int ret = 0;
+	long ret = 0;
 	unsigned long pfn = page_to_pfn(page);
 	struct page *hpage = compound_head(page);
 	char const *msg_page[] = {"page", "hugepage"};
@@ -2196,7 +2196,7 @@ static int __soft_offline_page(struct page *page)
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
 
-			pr_info("soft offline: %#lx: %s migration failed %d, type %pGp\n",
+			pr_info("soft offline: %#lx: %s migration failed %ld, type %pGp\n",
 				pfn, msg_page[huge], ret, &page->flags);
 			if (ret > 0)
 				ret = -EBUSY;
diff --git a/mm/truncate.c b/mm/truncate.c
index 8aa86e294775..b1bdc61198f6 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -273,18 +273,9 @@ int generic_error_remove_page(struct address_space *mapping, struct page *page)
 }
 EXPORT_SYMBOL(generic_error_remove_page);
 
-/*
- * Safely invalidate one page from its pagecache mapping.
- * It only drops clean, unused pages. The page must be locked.
- *
- * Returns 1 if the page is successfully invalidated, otherwise 0.
- */
-int invalidate_inode_page(struct page *page)
+static long mapping_shrink_folio(struct address_space *mapping,
+		struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
-	struct address_space *mapping = folio_mapping(folio);
-	if (!mapping)
-		return 0;
 	if (folio_test_dirty(folio) || folio_test_writeback(folio))
 		return 0;
 	if (folio_ref_count(folio) > folio_nr_pages(folio) + 1)
@@ -295,6 +286,27 @@ int invalidate_inode_page(struct page *page)
 	return remove_mapping(mapping, folio);
 }
 
+/**
+ * invalidate_inode_page() - Remove an unused page from the pagecache.
+ * @page: The page to remove.
+ *
+ * Safely invalidate one page from its pagecache mapping.
+ * It only drops clean, unused pages.
+ *
+ * Context: Page must be locked.
+ * Return: The number of pages successfully removed.
+ */
+long invalidate_inode_page(struct page *page)
+{
+	struct folio *folio = page_folio(page);
+	struct address_space *mapping = folio_mapping(folio);
+
+	/* The page may have been truncated before it was locked */
+	if (!mapping)
+		return 0;
+	return mapping_shrink_folio(mapping, folio);
+}
+
 /**
  * truncate_inode_pages_range - truncate range of pages specified by start & end byte offsets
  * @mapping: mapping to truncate
-- 
2.34.1



* [PATCH 07/10] mm/truncate: Convert __invalidate_mapping_pages() to use a folio
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2022-02-14 20:00 ` [PATCH 06/10] mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio() Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:24   ` Christoph Hellwig
  2022-02-15  9:37   ` Miaohe Lin
  2022-02-14 20:00 ` [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio() Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

Now we can call mapping_shrink_folio() instead of invalidate_inode_page()
and save a few calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/truncate.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/truncate.c b/mm/truncate.c
index b1bdc61198f6..567557c36d45 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -505,27 +505,27 @@ static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
 	folio_batch_init(&fbatch);
 	while (find_lock_entries(mapping, index, end, &fbatch, indices)) {
 		for (i = 0; i < folio_batch_count(&fbatch); i++) {
-			struct page *page = &fbatch.folios[i]->page;
+			struct folio *folio = fbatch.folios[i];
 
-			/* We rely upon deletion not changing page->index */
+			/* We rely upon deletion not changing folio->index */
 			index = indices[i];
 
-			if (xa_is_value(page)) {
+			if (xa_is_value(folio)) {
 				count += invalidate_exceptional_entry(mapping,
 								      index,
-								      page);
+								      folio);
 				continue;
 			}
-			index += thp_nr_pages(page) - 1;
+			index += folio_nr_pages(folio) - 1;
 
-			ret = invalidate_inode_page(page);
-			unlock_page(page);
+			ret = mapping_shrink_folio(mapping, folio);
+			folio_unlock(folio);
 			/*
-			 * Invalidation is a hint that the page is no longer
+			 * Invalidation is a hint that the folio is no longer
 			 * of interest and try to speed up its reclaim.
 			 */
 			if (!ret) {
-				deactivate_file_page(page);
+				deactivate_file_page(&folio->page);
 				/* It is likely on the pagevec of a remote CPU */
 				if (nr_pagevec)
 					(*nr_pagevec)++;
-- 
2.34.1



* [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio()
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2022-02-14 20:00 ` [PATCH 07/10] mm/truncate: Convert __invalidate_mapping_pages() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:25   ` Christoph Hellwig
  2022-02-15  8:26   ` Miaohe Lin
  2022-02-14 20:00 ` [PATCH 09/10] mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages() Matthew Wilcox (Oracle)
  2022-02-14 20:00 ` [PATCH 10/10] fs: Move many prototypes to pagemap.h Matthew Wilcox (Oracle)
  9 siblings, 2 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

This function has one caller which already has a reference to the
page, so we don't need to use get_page_unless_zero().  Also move the
prototype to mm/internal.h.
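
The refcounting contract behind dropping get_page_unless_zero() can be sketched with a toy counter (no concurrency, illustrative names only): the speculative get must be able to fail when the object may already be freed, whereas a plain get is legal precisely because the caller's existing reference pins the count at one or more.

```c
#include <assert.h>
#include <stdbool.h>

struct ref { long count; };

/* Speculative get, as in get_page_unless_zero(): must refuse an
 * object whose count has already dropped to zero. */
bool get_unless_zero(struct ref *r)
{
	if (r->count == 0)
		return false;
	r->count++;
	return true;
}

/* Plain get, as in folio_get(): only legal when the caller already
 * holds a reference, which guarantees count >= 1. */
void get(struct ref *r)
{
	assert(r->count >= 1);	/* caller's reference pins the object */
	r->count++;
}
```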

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  1 -
 mm/internal.h        |  1 +
 mm/swap.c            | 33 ++++++++++++++++-----------------
 mm/truncate.c        |  2 +-
 4 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 304f174b4d31..064e60e9f63f 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -372,7 +372,6 @@ extern void lru_add_drain(void);
 extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_cpu_zone(struct zone *zone);
 extern void lru_add_drain_all(void);
-extern void deactivate_file_page(struct page *page);
 extern void deactivate_page(struct page *page);
 extern void mark_page_lazyfree(struct page *page);
 extern void swap_setup(void);
diff --git a/mm/internal.h b/mm/internal.h
index 927a17d58b85..d886b87b1294 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -66,6 +66,7 @@ static inline void wake_throttle_isolated(pg_data_t *pgdat)
 vm_fault_t do_swap_page(struct vm_fault *vmf);
 void folio_rotate_reclaimable(struct folio *folio);
 bool __folio_end_writeback(struct folio *folio);
+void deactivate_file_folio(struct folio *folio);
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 		unsigned long floor, unsigned long ceiling);
diff --git a/mm/swap.c b/mm/swap.c
index bcf3ac288b56..745915127b1f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -639,32 +639,31 @@ void lru_add_drain_cpu(int cpu)
 }
 
 /**
- * deactivate_file_page - forcefully deactivate a file page
- * @page: page to deactivate
+ * deactivate_file_folio() - Forcefully deactivate a file folio.
+ * @folio: Folio to deactivate.
  *
- * This function hints the VM that @page is a good reclaim candidate,
- * for example if its invalidation fails due to the page being dirty
+ * This function hints to the VM that @folio is a good reclaim candidate,
+ * for example if its invalidation fails due to the folio being dirty
  * or under writeback.
  */
-void deactivate_file_page(struct page *page)
+void deactivate_file_folio(struct folio *folio)
 {
+	struct pagevec *pvec;
+
 	/*
-	 * In a workload with many unevictable page such as mprotect,
-	 * unevictable page deactivation for accelerating reclaim is pointless.
+	 * In a workload with many unevictable pages such as mprotect,
+	 * unevictable folio deactivation for accelerating reclaim is pointless.
 	 */
-	if (PageUnevictable(page))
+	if (folio_test_unevictable(folio))
 		return;
 
-	if (likely(get_page_unless_zero(page))) {
-		struct pagevec *pvec;
-
-		local_lock(&lru_pvecs.lock);
-		pvec = this_cpu_ptr(&lru_pvecs.lru_deactivate_file);
+	folio_get(folio);
+	local_lock(&lru_pvecs.lock);
+	pvec = this_cpu_ptr(&lru_pvecs.lru_deactivate_file);
 
-		if (pagevec_add_and_need_flush(pvec, page))
-			pagevec_lru_move_fn(pvec, lru_deactivate_file_fn);
-		local_unlock(&lru_pvecs.lock);
-	}
+	if (pagevec_add_and_need_flush(pvec, &folio->page))
+		pagevec_lru_move_fn(pvec, lru_deactivate_file_fn);
+	local_unlock(&lru_pvecs.lock);
 }
 
 /*
diff --git a/mm/truncate.c b/mm/truncate.c
index 567557c36d45..14486e75ec28 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -525,7 +525,7 @@ static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
 			 * of interest and try to speed up its reclaim.
 			 */
 			if (!ret) {
-				deactivate_file_page(&folio->page);
+				deactivate_file_folio(folio);
 				/* It is likely on the pagevec of a remote CPU */
 				if (nr_pagevec)
 					(*nr_pagevec)++;
-- 
2.34.1



* [PATCH 09/10] mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages()
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
                   ` (7 preceding siblings ...)
  2022-02-14 20:00 ` [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio() Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:26   ` Christoph Hellwig
  2022-02-15  9:37   ` Miaohe Lin
  2022-02-14 20:00 ` [PATCH 10/10] fs: Move many prototypes to pagemap.h Matthew Wilcox (Oracle)
  9 siblings, 2 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

We can save a function call by combining these two functions, which
are identical except for the return value.  Also move the prototype
to mm/internal.h.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/fs.h |  4 ----
 mm/internal.h      |  2 ++
 mm/truncate.c      | 32 +++++++++++++-------------------
 3 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index e2d892b201b0..85c584c5c623 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2749,10 +2749,6 @@ extern bool is_bad_inode(struct inode *);
 unsigned long invalidate_mapping_pages(struct address_space *mapping,
 					pgoff_t start, pgoff_t end);
 
-void invalidate_mapping_pagevec(struct address_space *mapping,
-				pgoff_t start, pgoff_t end,
-				unsigned long *nr_pagevec);
-
 static inline void invalidate_remote_inode(struct inode *inode)
 {
 	if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
diff --git a/mm/internal.h b/mm/internal.h
index d886b87b1294..6bbe40a1880a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -102,6 +102,8 @@ int truncate_inode_folio(struct address_space *mapping, struct folio *folio);
 bool truncate_inode_partial_folio(struct folio *folio, loff_t start,
 		loff_t end);
 long invalidate_inode_page(struct page *page);
+unsigned long invalidate_mapping_pagevec(struct address_space *mapping,
+		pgoff_t start, pgoff_t end, unsigned long *nr_pagevec);
 
 /**
  * folio_evictable - Test whether a folio is evictable.
diff --git a/mm/truncate.c b/mm/truncate.c
index 14486e75ec28..6b94b00f4307 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -492,7 +492,18 @@ void truncate_inode_pages_final(struct address_space *mapping)
 }
 EXPORT_SYMBOL(truncate_inode_pages_final);
 
-static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
+/**
+ * invalidate_mapping_pagevec - Invalidate all the unlocked pages of one inode
+ * @mapping: the address_space which holds the pages to invalidate
+ * @start: the offset 'from' which to invalidate
+ * @end: the offset 'to' which to invalidate (inclusive)
+ * @nr_pagevec: invalidate failed page number for caller
+ *
+ * This helper is similar to invalidate_mapping_pages(), except that it accounts
+ * for pages that are likely on a pagevec and counts them in @nr_pagevec, which
+ * will be used by the caller.
+ */
+unsigned long invalidate_mapping_pagevec(struct address_space *mapping,
 		pgoff_t start, pgoff_t end, unsigned long *nr_pagevec)
 {
 	pgoff_t indices[PAGEVEC_SIZE];
@@ -557,27 +568,10 @@ static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
 unsigned long invalidate_mapping_pages(struct address_space *mapping,
 		pgoff_t start, pgoff_t end)
 {
-	return __invalidate_mapping_pages(mapping, start, end, NULL);
+	return invalidate_mapping_pagevec(mapping, start, end, NULL);
 }
 EXPORT_SYMBOL(invalidate_mapping_pages);
 
-/**
- * invalidate_mapping_pagevec - Invalidate all the unlocked pages of one inode
- * @mapping: the address_space which holds the pages to invalidate
- * @start: the offset 'from' which to invalidate
- * @end: the offset 'to' which to invalidate (inclusive)
- * @nr_pagevec: invalidate failed page number for caller
- *
- * This helper is similar to invalidate_mapping_pages(), except that it accounts
- * for pages that are likely on a pagevec and counts them in @nr_pagevec, which
- * will be used by the caller.
- */
-void invalidate_mapping_pagevec(struct address_space *mapping,
-		pgoff_t start, pgoff_t end, unsigned long *nr_pagevec)
-{
-	__invalidate_mapping_pages(mapping, start, end, nr_pagevec);
-}
-
 /*
  * This is like invalidate_inode_page(), except it ignores the page's
  * refcount.  We do this because invalidate_inode_pages2() needs stronger
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 10/10] fs: Move many prototypes to pagemap.h
  2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
                   ` (8 preceding siblings ...)
  2022-02-14 20:00 ` [PATCH 09/10] mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages() Matthew Wilcox (Oracle)
@ 2022-02-14 20:00 ` Matthew Wilcox (Oracle)
  2022-02-15  7:27   ` Christoph Hellwig
  2022-02-15  9:38   ` Miaohe Lin
  9 siblings, 2 replies; 42+ messages in thread
From: Matthew Wilcox (Oracle) @ 2022-02-14 20:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle)

These functions are page cache functionality and don't need to be
declared in fs.h.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 drivers/block/xen-blkback/xenbus.c           |   1 +
 drivers/usb/gadget/function/f_mass_storage.c |   1 +
 fs/coda/file.c                               |   1 +
 fs/iomap/fiemap.c                            |   1 +
 fs/nfsd/filecache.c                          |   1 +
 fs/nfsd/vfs.c                                |   1 +
 fs/vboxsf/utils.c                            |   1 +
 include/linux/fs.h                           | 116 -------------------
 include/linux/pagemap.h                      | 114 ++++++++++++++++++
 9 files changed, 121 insertions(+), 116 deletions(-)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 62125fd4af4a..f09040435e2e 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -10,6 +10,7 @@
 
 #include <linux/module.h>
 #include <linux/kthread.h>
+#include <linux/pagemap.h>
 #include <xen/events.h>
 #include <xen/grant_table.h>
 #include "common.h"
diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
index 46dd11dcb3a8..7371c6e65b10 100644
--- a/drivers/usb/gadget/function/f_mass_storage.c
+++ b/drivers/usb/gadget/function/f_mass_storage.c
@@ -179,6 +179,7 @@
 #include <linux/kthread.h>
 #include <linux/sched/signal.h>
 #include <linux/limits.h>
+#include <linux/pagemap.h>
 #include <linux/rwsem.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
diff --git a/fs/coda/file.c b/fs/coda/file.c
index 29dd87be2fb8..3f3c81e6b1ab 100644
--- a/fs/coda/file.c
+++ b/fs/coda/file.c
@@ -14,6 +14,7 @@
 #include <linux/time.h>
 #include <linux/file.h>
 #include <linux/fs.h>
+#include <linux/pagemap.h>
 #include <linux/stat.h>
 #include <linux/cred.h>
 #include <linux/errno.h>
diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index 66cf267c68ae..610ca6f1ec9b 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -7,6 +7,7 @@
 #include <linux/fs.h>
 #include <linux/iomap.h>
 #include <linux/fiemap.h>
+#include <linux/pagemap.h>
 
 static int iomap_to_fiemap(struct fiemap_extent_info *fi,
 		const struct iomap *iomap, u32 flags)
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 8bc807c5fea4..47f804e0ec93 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -7,6 +7,7 @@
 #include <linux/hash.h>
 #include <linux/slab.h>
 #include <linux/file.h>
+#include <linux/pagemap.h>
 #include <linux/sched.h>
 #include <linux/list_lru.h>
 #include <linux/fsnotify_backend.h>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 91600e71be19..fe0d7abbc1b1 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -26,6 +26,7 @@
 #include <linux/xattr.h>
 #include <linux/jhash.h>
 #include <linux/ima.h>
+#include <linux/pagemap.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/exportfs.h>
diff --git a/fs/vboxsf/utils.c b/fs/vboxsf/utils.c
index aec2ebf7d25a..e1db0f3f7e5e 100644
--- a/fs/vboxsf/utils.c
+++ b/fs/vboxsf/utils.c
@@ -9,6 +9,7 @@
 #include <linux/namei.h>
 #include <linux/nls.h>
 #include <linux/sizes.h>
+#include <linux/pagemap.h>
 #include <linux/vfs.h>
 #include "vfsmod.h"
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 85c584c5c623..0961c979e949 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2746,50 +2746,6 @@ extern void init_special_inode(struct inode *, umode_t, dev_t);
 extern void make_bad_inode(struct inode *);
 extern bool is_bad_inode(struct inode *);
 
-unsigned long invalidate_mapping_pages(struct address_space *mapping,
-					pgoff_t start, pgoff_t end);
-
-static inline void invalidate_remote_inode(struct inode *inode)
-{
-	if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
-	    S_ISLNK(inode->i_mode))
-		invalidate_mapping_pages(inode->i_mapping, 0, -1);
-}
-extern int invalidate_inode_pages2(struct address_space *mapping);
-extern int invalidate_inode_pages2_range(struct address_space *mapping,
-					 pgoff_t start, pgoff_t end);
-extern int write_inode_now(struct inode *, int);
-extern int filemap_fdatawrite(struct address_space *);
-extern int filemap_flush(struct address_space *);
-extern int filemap_fdatawait_keep_errors(struct address_space *mapping);
-extern int filemap_fdatawait_range(struct address_space *, loff_t lstart,
-				   loff_t lend);
-extern int filemap_fdatawait_range_keep_errors(struct address_space *mapping,
-		loff_t start_byte, loff_t end_byte);
-
-static inline int filemap_fdatawait(struct address_space *mapping)
-{
-	return filemap_fdatawait_range(mapping, 0, LLONG_MAX);
-}
-
-extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
-				  loff_t lend);
-extern int filemap_write_and_wait_range(struct address_space *mapping,
-				        loff_t lstart, loff_t lend);
-extern int __filemap_fdatawrite_range(struct address_space *mapping,
-				loff_t start, loff_t end, int sync_mode);
-extern int filemap_fdatawrite_range(struct address_space *mapping,
-				loff_t start, loff_t end);
-extern int filemap_check_errors(struct address_space *mapping);
-extern void __filemap_set_wb_err(struct address_space *mapping, int err);
-int filemap_fdatawrite_wbc(struct address_space *mapping,
-			   struct writeback_control *wbc);
-
-static inline int filemap_write_and_wait(struct address_space *mapping)
-{
-	return filemap_write_and_wait_range(mapping, 0, LLONG_MAX);
-}
-
 extern int __must_check file_fdatawait_range(struct file *file, loff_t lstart,
 						loff_t lend);
 extern int __must_check file_check_and_advance_wb_err(struct file *file);
@@ -2801,67 +2757,6 @@ static inline int file_write_and_wait(struct file *file)
 	return file_write_and_wait_range(file, 0, LLONG_MAX);
 }
 
-/**
- * filemap_set_wb_err - set a writeback error on an address_space
- * @mapping: mapping in which to set writeback error
- * @err: error to be set in mapping
- *
- * When writeback fails in some way, we must record that error so that
- * userspace can be informed when fsync and the like are called.  We endeavor
- * to report errors on any file that was open at the time of the error.  Some
- * internal callers also need to know when writeback errors have occurred.
- *
- * When a writeback error occurs, most filesystems will want to call
- * filemap_set_wb_err to record the error in the mapping so that it will be
- * automatically reported whenever fsync is called on the file.
- */
-static inline void filemap_set_wb_err(struct address_space *mapping, int err)
-{
-	/* Fastpath for common case of no error */
-	if (unlikely(err))
-		__filemap_set_wb_err(mapping, err);
-}
-
-/**
- * filemap_check_wb_err - has an error occurred since the mark was sampled?
- * @mapping: mapping to check for writeback errors
- * @since: previously-sampled errseq_t
- *
- * Grab the errseq_t value from the mapping, and see if it has changed "since"
- * the given value was sampled.
- *
- * If it has then report the latest error set, otherwise return 0.
- */
-static inline int filemap_check_wb_err(struct address_space *mapping,
-					errseq_t since)
-{
-	return errseq_check(&mapping->wb_err, since);
-}
-
-/**
- * filemap_sample_wb_err - sample the current errseq_t to test for later errors
- * @mapping: mapping to be sampled
- *
- * Writeback errors are always reported relative to a particular sample point
- * in the past. This function provides those sample points.
- */
-static inline errseq_t filemap_sample_wb_err(struct address_space *mapping)
-{
-	return errseq_sample(&mapping->wb_err);
-}
-
-/**
- * file_sample_sb_err - sample the current errseq_t to test for later errors
- * @file: file pointer to be sampled
- *
- * Grab the most current superblock-level errseq_t value for the given
- * struct file.
- */
-static inline errseq_t file_sample_sb_err(struct file *file)
-{
-	return errseq_sample(&file->f_path.dentry->d_sb->s_wb_err);
-}
-
 extern int vfs_fsync_range(struct file *file, loff_t start, loff_t end,
 			   int datasync);
 extern int vfs_fsync(struct file *file, int datasync);
@@ -3604,15 +3499,4 @@ extern int vfs_fadvise(struct file *file, loff_t offset, loff_t len,
 extern int generic_fadvise(struct file *file, loff_t offset, loff_t len,
 			   int advice);
 
-/*
- * Flush file data before changing attributes.  Caller must hold any locks
- * required to prevent further writes to this file until we're done setting
- * flags.
- */
-static inline int inode_drain_writes(struct inode *inode)
-{
-	inode_dio_wait(inode);
-	return filemap_write_and_wait(inode->i_mapping);
-}
-
 #endif /* _LINUX_FS_H */
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index cdb3f118603a..f968b36ad771 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -18,6 +18,120 @@
 
 struct folio_batch;
 
+unsigned long invalidate_mapping_pages(struct address_space *mapping,
+					pgoff_t start, pgoff_t end);
+
+static inline void invalidate_remote_inode(struct inode *inode)
+{
+	if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
+	    S_ISLNK(inode->i_mode))
+		invalidate_mapping_pages(inode->i_mapping, 0, -1);
+}
+int invalidate_inode_pages2(struct address_space *mapping);
+int invalidate_inode_pages2_range(struct address_space *mapping,
+		pgoff_t start, pgoff_t end);
+int write_inode_now(struct inode *, int sync);
+int filemap_fdatawrite(struct address_space *);
+int filemap_flush(struct address_space *);
+int filemap_fdatawait_keep_errors(struct address_space *mapping);
+int filemap_fdatawait_range(struct address_space *, loff_t lstart, loff_t lend);
+int filemap_fdatawait_range_keep_errors(struct address_space *mapping,
+		loff_t start_byte, loff_t end_byte);
+
+static inline int filemap_fdatawait(struct address_space *mapping)
+{
+	return filemap_fdatawait_range(mapping, 0, LLONG_MAX);
+}
+
+bool filemap_range_has_page(struct address_space *, loff_t lstart, loff_t lend);
+int filemap_write_and_wait_range(struct address_space *mapping,
+		loff_t lstart, loff_t lend);
+int __filemap_fdatawrite_range(struct address_space *mapping,
+		loff_t start, loff_t end, int sync_mode);
+int filemap_fdatawrite_range(struct address_space *mapping,
+		loff_t start, loff_t end);
+int filemap_check_errors(struct address_space *mapping);
+void __filemap_set_wb_err(struct address_space *mapping, int err);
+int filemap_fdatawrite_wbc(struct address_space *mapping,
+			   struct writeback_control *wbc);
+
+static inline int filemap_write_and_wait(struct address_space *mapping)
+{
+	return filemap_write_and_wait_range(mapping, 0, LLONG_MAX);
+}
+
+/**
+ * filemap_set_wb_err - set a writeback error on an address_space
+ * @mapping: mapping in which to set writeback error
+ * @err: error to be set in mapping
+ *
+ * When writeback fails in some way, we must record that error so that
+ * userspace can be informed when fsync and the like are called.  We endeavor
+ * to report errors on any file that was open at the time of the error.  Some
+ * internal callers also need to know when writeback errors have occurred.
+ *
+ * When a writeback error occurs, most filesystems will want to call
+ * filemap_set_wb_err to record the error in the mapping so that it will be
+ * automatically reported whenever fsync is called on the file.
+ */
+static inline void filemap_set_wb_err(struct address_space *mapping, int err)
+{
+	/* Fastpath for common case of no error */
+	if (unlikely(err))
+		__filemap_set_wb_err(mapping, err);
+}
+
+/**
+ * filemap_check_wb_err - has an error occurred since the mark was sampled?
+ * @mapping: mapping to check for writeback errors
+ * @since: previously-sampled errseq_t
+ *
+ * Grab the errseq_t value from the mapping, and see if it has changed "since"
+ * the given value was sampled.
+ *
+ * If it has then report the latest error set, otherwise return 0.
+ */
+static inline int filemap_check_wb_err(struct address_space *mapping,
+					errseq_t since)
+{
+	return errseq_check(&mapping->wb_err, since);
+}
+
+/**
+ * filemap_sample_wb_err - sample the current errseq_t to test for later errors
+ * @mapping: mapping to be sampled
+ *
+ * Writeback errors are always reported relative to a particular sample point
+ * in the past. This function provides those sample points.
+ */
+static inline errseq_t filemap_sample_wb_err(struct address_space *mapping)
+{
+	return errseq_sample(&mapping->wb_err);
+}
+
+/**
+ * file_sample_sb_err - sample the current errseq_t to test for later errors
+ * @file: file pointer to be sampled
+ *
+ * Grab the most current superblock-level errseq_t value for the given
+ * struct file.
+ */
+static inline errseq_t file_sample_sb_err(struct file *file)
+{
+	return errseq_sample(&file->f_path.dentry->d_sb->s_wb_err);
+}
+
+/*
+ * Flush file data before changing attributes.  Caller must hold any locks
+ * required to prevent further writes to this file until we're done setting
+ * flags.
+ */
+static inline int inode_drain_writes(struct inode *inode)
+{
+	inode_dio_wait(inode);
+	return filemap_write_and_wait(inode->i_mapping);
+}
+
 static inline bool mapping_empty(struct address_space *mapping)
 {
 	return xa_empty(&mapping->i_pages);
-- 
2.34.1



* Re: [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-14 20:00 ` [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller Matthew Wilcox (Oracle)
@ 2022-02-14 23:09   ` John Hubbard
  2022-02-14 23:32     ` Matthew Wilcox
  2022-02-15  7:17   ` Christoph Hellwig
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 42+ messages in thread
From: John Hubbard @ 2022-02-14 23:09 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2/14/22 12:00, Matthew Wilcox (Oracle) wrote:
> invalidate_inode_page() is the only caller of invalidate_complete_page()
> and inlining it reveals that the first check is unnecessary (because we
> hold the page locked, and we just retrieved the mapping from the page).

I just noticed this yesterday, when reviewing Rik's page poisoning fix.
I had a patch for it squirreled away, but I missed the point about
removing that extraneous mapping check. Glad you spotted it.

...
> @@ -309,7 +288,10 @@ int invalidate_inode_page(struct page *page)

It would be nice to retain some of the original comments. May I suggest
this (it has an additional paragraph) for an updated version of comments
above invalidate_inode_page():

/*
  * Safely invalidate one page from its pagecache mapping.
  * It only drops clean, unused pages. The page must be locked.
  *
  * This function can be called at any time, and is not supposed to throw away
  * dirty pages.  But pages can be marked dirty at any time too, so use
  * remove_mapping(), which safely discards clean, unused pages.
  *
  * Returns 1 if the page is successfully invalidated, otherwise 0.
  */


Also, as long as you're there, a newline after the mapping declaration
would bring this routine into compliance with that convention.

hmmm, now I wonder why this isn't a boolean function. And I think the
reason is that it's quite old.

Either way, looks good:

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

>   		return 0;
>   	if (page_mapped(page))
>   		return 0;
> -	return invalidate_complete_page(mapping, page);
> +	if (page_has_private(page) && !try_to_release_page(page, 0))
> +		return 0;
> +
> +	return remove_mapping(mapping, page);
>   }
>   
>   /**
> @@ -584,7 +566,7 @@ void invalidate_mapping_pagevec(struct address_space *mapping,
>   }
>   
>   /*
> - * This is like invalidate_complete_page(), except it ignores the page's
> + * This is like invalidate_inode_page(), except it ignores the page's
>    * refcount.  We do this because invalidate_inode_pages2() needs stronger
>    * invalidation guarantees, and cannot afford to leave pages behind because
>    * shrink_page_list() has a temp ref on them, or because they're transiently



* Re: [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-14 23:09   ` John Hubbard
@ 2022-02-14 23:32     ` Matthew Wilcox
  2022-02-14 23:51       ` John Hubbard
  0 siblings, 1 reply; 42+ messages in thread
From: Matthew Wilcox @ 2022-02-14 23:32 UTC (permalink / raw)
  To: John Hubbard; +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 03:09:09PM -0800, John Hubbard wrote:
> > @@ -309,7 +288,10 @@ int invalidate_inode_page(struct page *page)
> 
> It would be nice to retain some of the original comments. May I suggest
> this (it has an additional paragraph) for an updated version of comments
> above invalidate_inode_page():
> 
> /*
>  * Safely invalidate one page from its pagecache mapping.
>  * It only drops clean, unused pages. The page must be locked.
>  *
>  * This function can be called at any time, and is not supposed to throw away
>  * dirty pages.  But pages can be marked dirty at any time too, so use
>  * remove_mapping(), which safely discards clean, unused pages.
>  *
>  * Returns 1 if the page is successfully invalidated, otherwise 0.
>  */

By the end of this series, it becomes:

/**
 * invalidate_inode_page() - Remove an unused page from the pagecache.
 * @page: The page to remove.
 *
 * Safely invalidate one page from its pagecache mapping.
 * It only drops clean, unused pages.
 *
 * Context: Page must be locked.
 * Return: The number of pages successfully removed.
 */

> Also, as long as you're there, a newline after the mapping declaration
> would bring this routine into compliance with that convention.

Again, by the end, we're at:

        struct folio *folio = page_folio(page);
        struct address_space *mapping = folio_mapping(folio);

        /* The page may have been truncated before it was locked */
        if (!mapping)
                return 0;
        return mapping_shrink_folio(mapping, folio);

> hmmm, now I wonder why this isn't a boolean function. And I think the
> reason is that it's quite old.

We could make this return a bool and have the one user that cares
call folio_nr_pages().  I don't have a strong preference.

> Either way, looks good:
> 
> Reviewed-by: John Hubbard <jhubbard@nvidia.com>

Thanks!


* Re: [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-14 23:32     ` Matthew Wilcox
@ 2022-02-14 23:51       ` John Hubbard
  0 siblings, 0 replies; 42+ messages in thread
From: John Hubbard @ 2022-02-14 23:51 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel, linux-mm

On 2/14/22 3:32 PM, Matthew Wilcox wrote:
> On Mon, Feb 14, 2022 at 03:09:09PM -0800, John Hubbard wrote:
>>> @@ -309,7 +288,10 @@ int invalidate_inode_page(struct page *page)
>>
>> It would be nice to retain some of the original comments. May I suggest
>> this (it has an additional paragraph) for an updated version of comments
>> above invalidate_inode_page():
>>
>> /*
>>   * Safely invalidate one page from its pagecache mapping.
>>   * It only drops clean, unused pages. The page must be locked.
>>   *
>>   * This function can be called at any time, and is not supposed to throw away
>>   * dirty pages.  But pages can be marked dirty at any time too, so use
>>   * remove_mapping(), which safely discards clean, unused pages.
>>   *
>>   * Returns 1 if the page is successfully invalidated, otherwise 0.
>>   */
> 
> By the end of this series, it becomes:
> 
> /**
>   * invalidate_inode_page() - Remove an unused page from the pagecache.
>   * @page: The page to remove.
>   *
>   * Safely invalidate one page from its pagecache mapping.
>   * It only drops clean, unused pages.
>   *
>   * Context: Page must be locked.
>   * Return: The number of pages successfully removed.
>   */

OK.

> 
>> Also, as long as you're there, a newline after the mapping declaration
>> would bring this routine into compliance with that convention.
> 
> Again, by the end, we're at:
> 
>          struct folio *folio = page_folio(page);
>          struct address_space *mapping = folio_mapping(folio);
> 
>          /* The page may have been truncated before it was locked */
>          if (!mapping)
>                  return 0;
>          return mapping_shrink_folio(mapping, folio);
> 

Also good.

>> hmmm, now I wonder why this isn't a boolean function. And I think the
>> reason is that it's quite old.
> 
> We could make this return a bool and have the one user that cares
> call folio_nr_pages().  I don't have a strong preference.

Neither do I. No need to add churn for that.


thanks,

-- 
John Hubbard
NVIDIA


* Re: [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal()
  2022-02-14 20:00 ` [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal() Matthew Wilcox (Oracle)
@ 2022-02-15  7:15   ` Christoph Hellwig
  2022-02-15  8:32   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:15 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 08:00:08PM +0000, Matthew Wilcox (Oracle) wrote:
> This saves a lot of calls to compound_head().

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-14 20:00 ` [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller Matthew Wilcox (Oracle)
  2022-02-14 23:09   ` John Hubbard
@ 2022-02-15  7:17   ` Christoph Hellwig
  2022-02-15  7:45   ` Miaohe Lin
  2022-02-16  2:45   ` Miaohe Lin
  3 siblings, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:17 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 08:00:09PM +0000, Matthew Wilcox (Oracle) wrote:
> invalidate_inode_page() is the only caller of invalidate_complete_page()
> and inlining it reveals that the first check is unnecessary (because we
> hold the page locked, and we just retrieved the mapping from the page).
> Actually, it does make a difference, in that tail pages no longer fail
> at this check, so it's now possible to remove a tail page from a mapping.

There is a comment that references invalidate_complete_page in
kernel/futex/core.c which needs updating.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 03/10] mm/truncate: Convert invalidate_inode_page() to use a folio
  2022-02-14 20:00 ` [PATCH 03/10] mm/truncate: Convert invalidate_inode_page() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-15  7:18   ` Christoph Hellwig
  2022-02-15  8:32   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:18 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 08:00:10PM +0000, Matthew Wilcox (Oracle) wrote:
> This saves a number of calls to compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page()
  2022-02-14 20:00 ` [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page() Matthew Wilcox (Oracle)
@ 2022-02-15  7:19   ` Christoph Hellwig
  2022-02-15 20:12     ` Matthew Wilcox
  2022-02-15  8:32   ` Miaohe Lin
  2022-02-25  1:31   ` Matthew Wilcox
  2 siblings, 1 reply; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:19 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 08:00:11PM +0000, Matthew Wilcox (Oracle) wrote:
> folio_mapped() is expensive because it has to check each page's mapcount
> field.  A cheaper check is whether there are any extra references to
> the page, other than the one we own and the ones held by the page cache.
> The call to remove_mapping() will fail in any case if it cannot freeze
> the refcount, but failing here avoids cycling the i_pages spinlock.

I wonder if something like this should also be in a comment near
the check in the code.


* Re: [PATCH 05/10] mm: Convert remove_mapping() to take a folio
  2022-02-14 20:00 ` [PATCH 05/10] mm: Convert remove_mapping() to take a folio Matthew Wilcox (Oracle)
@ 2022-02-15  7:21   ` Christoph Hellwig
  2022-02-15  8:33   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:21 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 08:00:12PM +0000, Matthew Wilcox (Oracle) wrote:
> Add kernel-doc and return the number of pages removed in order to
> get the statistics right in __invalidate_mapping_pages().

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 06/10] mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio()
  2022-02-14 20:00 ` [PATCH 06/10] mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio() Matthew Wilcox (Oracle)
@ 2022-02-15  7:23   ` Christoph Hellwig
  2022-02-15  9:37   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:23 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

The patch looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 07/10] mm/truncate: Convert __invalidate_mapping_pages() to use a folio
  2022-02-14 20:00 ` [PATCH 07/10] mm/truncate: Convert __invalidate_mapping_pages() to use a folio Matthew Wilcox (Oracle)
@ 2022-02-15  7:24   ` Christoph Hellwig
  2022-02-15  9:37   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:24 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 08:00:14PM +0000, Matthew Wilcox (Oracle) wrote:
> Now we can call mapping_shrink_folio() instead of invalidate_inode_page()
> and save a few calls to compound_head().

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio()
  2022-02-14 20:00 ` [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio() Matthew Wilcox (Oracle)
@ 2022-02-15  7:25   ` Christoph Hellwig
  2022-02-15  8:26   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:25 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 08:00:15PM +0000, Matthew Wilcox (Oracle) wrote:
> This function has one caller which already has a reference to the
> page, so we don't need to use get_page_unless_zero().  Also move the
> prototype to mm/internal.h.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 09/10] mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages()
  2022-02-14 20:00 ` [PATCH 09/10] mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages() Matthew Wilcox (Oracle)
@ 2022-02-15  7:26   ` Christoph Hellwig
  2022-02-15  9:37   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:26 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 10/10] fs: Move many prototypes to pagemap.h
  2022-02-14 20:00 ` [PATCH 10/10] fs: Move many prototypes to pagemap.h Matthew Wilcox (Oracle)
@ 2022-02-15  7:27   ` Christoph Hellwig
  2022-02-15  9:38   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Christoph Hellwig @ 2022-02-15  7:27 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-14 20:00 ` [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller Matthew Wilcox (Oracle)
  2022-02-14 23:09   ` John Hubbard
  2022-02-15  7:17   ` Christoph Hellwig
@ 2022-02-15  7:45   ` Miaohe Lin
  2022-02-15 20:09     ` Matthew Wilcox
  2022-02-16  2:45   ` Miaohe Lin
  3 siblings, 1 reply; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  7:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Linux FS-devel Mailing List, Linux-MM

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> invalidate_inode_page() is the only caller of invalidate_complete_page()
> and inlining it reveals that the first check is unnecessary (because we
> hold the page locked, and we just retrieved the mapping from the page).
> Actually, it does make a difference, in that tail pages no longer fail
> at this check, so it's now possible to remove a tail page from a mapping.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/truncate.c | 28 +++++-----------------------
>  1 file changed, 5 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 9dbf0b75da5d..e5e2edaa0b76 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -193,27 +193,6 @@ static void truncate_cleanup_folio(struct folio *folio)
>  	folio_clear_mappedtodisk(folio);
>  }
>  
> -/*
> - * This is for invalidate_mapping_pages().  That function can be called at
> - * any time, and is not supposed to throw away dirty pages.  But pages can
> - * be marked dirty at any time too, so use remove_mapping which safely
> - * discards clean, unused pages.
> - *
> - * Returns non-zero if the page was successfully invalidated.
> - */
> -static int
> -invalidate_complete_page(struct address_space *mapping, struct page *page)
> -{
> -
> -	if (page->mapping != mapping)
> -		return 0;
> -
> -	if (page_has_private(page) && !try_to_release_page(page, 0))
> -		return 0;
> -
> -	return remove_mapping(mapping, page);
> -}
> -
>  int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
>  {
>  	if (folio->mapping != mapping)
> @@ -309,7 +288,10 @@ int invalidate_inode_page(struct page *page)
>  		return 0;
>  	if (page_mapped(page))
>  		return 0;
> -	return invalidate_complete_page(mapping, page);

It seems the check of page->mapping != mapping is removed here.
IIUC, this could have an unexpected side effect, because a
swapcache page can now be invalidated. I think this function is
not intended to deal with the swapcache, even though it now could.
Or am I missing anything?

Many thanks!

> +	if (page_has_private(page) && !try_to_release_page(page, 0))
> +		return 0;
> +
> +	return remove_mapping(mapping, page);
>  }
>  
>  /**
> @@ -584,7 +566,7 @@ void invalidate_mapping_pagevec(struct address_space *mapping,
>  }
>  
>  /*
> - * This is like invalidate_complete_page(), except it ignores the page's
> + * This is like invalidate_inode_page(), except it ignores the page's
>   * refcount.  We do this because invalidate_inode_pages2() needs stronger
>   * invalidation guarantees, and cannot afford to leave pages behind because
>   * shrink_page_list() has a temp ref on them, or because they're transiently
> 



* Re: [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio()
  2022-02-14 20:00 ` [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio() Matthew Wilcox (Oracle)
  2022-02-15  7:25   ` Christoph Hellwig
@ 2022-02-15  8:26   ` Miaohe Lin
  2022-02-15 20:33     ` Matthew Wilcox
  1 sibling, 1 reply; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  8:26 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> This function has one caller which already has a reference to the
> page, so we don't need to use get_page_unless_zero().  Also move the
> prototype to mm/internal.h.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/swap.h |  1 -
>  mm/internal.h        |  1 +
>  mm/swap.c            | 33 ++++++++++++++++-----------------
>  mm/truncate.c        |  2 +-
>  4 files changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 304f174b4d31..064e60e9f63f 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -372,7 +372,6 @@ extern void lru_add_drain(void);
>  extern void lru_add_drain_cpu(int cpu);
>  extern void lru_add_drain_cpu_zone(struct zone *zone);
>  extern void lru_add_drain_all(void);
> -extern void deactivate_file_page(struct page *page);
>  extern void deactivate_page(struct page *page);
>  extern void mark_page_lazyfree(struct page *page);
>  extern void swap_setup(void);
> diff --git a/mm/internal.h b/mm/internal.h
> index 927a17d58b85..d886b87b1294 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -66,6 +66,7 @@ static inline void wake_throttle_isolated(pg_data_t *pgdat)
>  vm_fault_t do_swap_page(struct vm_fault *vmf);
>  void folio_rotate_reclaimable(struct folio *folio);
>  bool __folio_end_writeback(struct folio *folio);
> +void deactivate_file_folio(struct folio *folio);
>  
>  void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
>  		unsigned long floor, unsigned long ceiling);
> diff --git a/mm/swap.c b/mm/swap.c
> index bcf3ac288b56..745915127b1f 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -639,32 +639,31 @@ void lru_add_drain_cpu(int cpu)
>  }
>  
>  /**
> - * deactivate_file_page - forcefully deactivate a file page
> - * @page: page to deactivate
> + * deactivate_file_folio() - Forcefully deactivate a file folio.
> + * @folio: Folio to deactivate.
>   *
> - * This function hints the VM that @page is a good reclaim candidate,
> - * for example if its invalidation fails due to the page being dirty
> + * This function hints to the VM that @folio is a good reclaim candidate,
> + * for example if its invalidation fails due to the folio being dirty
>   * or under writeback.
>   */
> -void deactivate_file_page(struct page *page)
> +void deactivate_file_folio(struct folio *folio)
>  {
> +	struct pagevec *pvec;
> +
>  	/*
> -	 * In a workload with many unevictable page such as mprotect,
> -	 * unevictable page deactivation for accelerating reclaim is pointless.
> +	 * In a workload with many unevictable pages such as mprotect,
> +	 * unevictable folio deactivation for accelerating reclaim is pointless.
>  	 */
> -	if (PageUnevictable(page))
> +	if (folio_test_unevictable(folio))
>  		return;
>  
> -	if (likely(get_page_unless_zero(page))) {
> -		struct pagevec *pvec;
> -
> -		local_lock(&lru_pvecs.lock);
> -		pvec = this_cpu_ptr(&lru_pvecs.lru_deactivate_file);
> +	folio_get(folio);

Should we add a comment documenting the assumption that the caller already holds a refcount?

Anyway, this patch looks good to me. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> +	local_lock(&lru_pvecs.lock);
> +	pvec = this_cpu_ptr(&lru_pvecs.lru_deactivate_file);
>  
> -		if (pagevec_add_and_need_flush(pvec, page))
> -			pagevec_lru_move_fn(pvec, lru_deactivate_file_fn);
> -		local_unlock(&lru_pvecs.lock);
> -	}
> +	if (pagevec_add_and_need_flush(pvec, &folio->page))
> +		pagevec_lru_move_fn(pvec, lru_deactivate_file_fn);
> +	local_unlock(&lru_pvecs.lock);
>  }
>  
>  /*
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 567557c36d45..14486e75ec28 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -525,7 +525,7 @@ static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
>  			 * of interest and try to speed up its reclaim.
>  			 */
>  			if (!ret) {
> -				deactivate_file_page(&folio->page);
> +				deactivate_file_folio(folio);
>  				/* It is likely on the pagevec of a remote CPU */
>  				if (nr_pagevec)
>  					(*nr_pagevec)++;
> 



* Re: [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal()
  2022-02-14 20:00 ` [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal() Matthew Wilcox (Oracle)
  2022-02-15  7:15   ` Christoph Hellwig
@ 2022-02-15  8:32   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  8:32 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> This saves a lot of calls to compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  fs/splice.c | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
> 

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> diff --git a/fs/splice.c b/fs/splice.c
> index 5dbce4dcc1a7..23ff9c303abc 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -47,26 +47,27 @@ static bool page_cache_pipe_buf_try_steal(struct pipe_inode_info *pipe,
>  		struct pipe_buffer *buf)
>  {
>  	struct page *page = buf->page;
> +	struct folio *folio = page_folio(page);
>  	struct address_space *mapping;
>  
> -	lock_page(page);
> +	folio_lock(folio);
>  
> -	mapping = page_mapping(page);
> +	mapping = folio_mapping(folio);
>  	if (mapping) {
> -		WARN_ON(!PageUptodate(page));
> +		WARN_ON(!folio_test_uptodate(folio));
>  
>  		/*
>  		 * At least for ext2 with nobh option, we need to wait on
> -		 * writeback completing on this page, since we'll remove it
> +		 * writeback completing on this folio, since we'll remove it
>  		 * from the pagecache.  Otherwise truncate wont wait on the
> -		 * page, allowing the disk blocks to be reused by someone else
> +		 * folio, allowing the disk blocks to be reused by someone else
>  		 * before we actually wrote our data to them. fs corruption
>  		 * ensues.
>  		 */
> -		wait_on_page_writeback(page);
> +		folio_wait_writeback(folio);
>  
> -		if (page_has_private(page) &&
> -		    !try_to_release_page(page, GFP_KERNEL))
> +		if (folio_has_private(folio) &&
> +		    !filemap_release_folio(folio, GFP_KERNEL))
>  			goto out_unlock;
>  
>  		/*
> @@ -80,11 +81,11 @@ static bool page_cache_pipe_buf_try_steal(struct pipe_inode_info *pipe,
>  	}
>  
>  	/*
> -	 * Raced with truncate or failed to remove page from current
> +	 * Raced with truncate or failed to remove folio from current
>  	 * address space, unlock and return failure.
>  	 */
>  out_unlock:
> -	unlock_page(page);
> +	folio_unlock(folio);
>  	return false;
>  }
>  
> 



* Re: [PATCH 03/10] mm/truncate: Convert invalidate_inode_page() to use a folio
  2022-02-14 20:00 ` [PATCH 03/10] mm/truncate: Convert invalidate_inode_page() to use a folio Matthew Wilcox (Oracle)
  2022-02-15  7:18   ` Christoph Hellwig
@ 2022-02-15  8:32   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  8:32 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> This saves a number of calls to compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

>  mm/truncate.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index e5e2edaa0b76..b73c30c95cd0 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -281,14 +281,15 @@ EXPORT_SYMBOL(generic_error_remove_page);
>   */
>  int invalidate_inode_page(struct page *page)
>  {
> -	struct address_space *mapping = page_mapping(page);
> +	struct folio *folio = page_folio(page);
> +	struct address_space *mapping = folio_mapping(folio);
>  	if (!mapping)
>  		return 0;
> -	if (PageDirty(page) || PageWriteback(page))
> +	if (folio_test_dirty(folio) || folio_test_writeback(folio))
>  		return 0;
>  	if (page_mapped(page))
>  		return 0;
> -	if (page_has_private(page) && !try_to_release_page(page, 0))
> +	if (folio_has_private(folio) && !filemap_release_folio(folio, 0))
>  		return 0;
>  
>  	return remove_mapping(mapping, page);
> 



* Re: [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page()
  2022-02-14 20:00 ` [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page() Matthew Wilcox (Oracle)
  2022-02-15  7:19   ` Christoph Hellwig
@ 2022-02-15  8:32   ` Miaohe Lin
  2022-02-25  1:31   ` Matthew Wilcox
  2 siblings, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  8:32 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> folio_mapped() is expensive because it has to check each page's mapcount
> field.  A cheaper check is whether there are any extra references to
> the page, other than the one we own and the ones held by the page cache.
> The call to remove_mapping() will fail in any case if it cannot freeze
> the refcount, but failing here avoids cycling the i_pages spinlock.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

>  mm/truncate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index b73c30c95cd0..d67fa8871b75 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -287,7 +287,7 @@ int invalidate_inode_page(struct page *page)
>  		return 0;
>  	if (folio_test_dirty(folio) || folio_test_writeback(folio))
>  		return 0;
> -	if (page_mapped(page))
> +	if (folio_ref_count(folio) > folio_nr_pages(folio) + 1)
>  		return 0;
>  	if (folio_has_private(folio) && !filemap_release_folio(folio, 0))
>  		return 0;
> 



* Re: [PATCH 05/10] mm: Convert remove_mapping() to take a folio
  2022-02-14 20:00 ` [PATCH 05/10] mm: Convert remove_mapping() to take a folio Matthew Wilcox (Oracle)
  2022-02-15  7:21   ` Christoph Hellwig
@ 2022-02-15  8:33   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  8:33 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> Add kernel-doc and return the number of pages removed in order to
> get the statistics right in __invalidate_mapping_pages().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> ---
>  fs/splice.c          |  5 ++---
>  include/linux/swap.h |  2 +-
>  mm/truncate.c        |  2 +-
>  mm/vmscan.c          | 23 ++++++++++++++---------
>  4 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/splice.c b/fs/splice.c
> index 23ff9c303abc..047b79db8eb5 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -46,8 +46,7 @@
>  static bool page_cache_pipe_buf_try_steal(struct pipe_inode_info *pipe,
>  		struct pipe_buffer *buf)
>  {
> -	struct page *page = buf->page;
> -	struct folio *folio = page_folio(page);
> +	struct folio *folio = page_folio(buf->page);
>  	struct address_space *mapping;
>  
>  	folio_lock(folio);
> @@ -74,7 +73,7 @@ static bool page_cache_pipe_buf_try_steal(struct pipe_inode_info *pipe,
>  		 * If we succeeded in removing the mapping, set LRU flag
>  		 * and return good.
>  		 */
> -		if (remove_mapping(mapping, page)) {
> +		if (remove_mapping(mapping, folio)) {
>  			buf->flags |= PIPE_BUF_FLAG_LRU;
>  			return true;
>  		}
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index e7cb7a1e6ceb..304f174b4d31 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -395,7 +395,7 @@ extern unsigned long mem_cgroup_shrink_node(struct mem_cgroup *mem,
>  						unsigned long *nr_scanned);
>  extern unsigned long shrink_all_memory(unsigned long nr_pages);
>  extern int vm_swappiness;
> -extern int remove_mapping(struct address_space *mapping, struct page *page);
> +long remove_mapping(struct address_space *mapping, struct folio *folio);
>  
>  extern unsigned long reclaim_pages(struct list_head *page_list);
>  #ifdef CONFIG_NUMA
> diff --git a/mm/truncate.c b/mm/truncate.c
> index d67fa8871b75..8aa86e294775 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -292,7 +292,7 @@ int invalidate_inode_page(struct page *page)
>  	if (folio_has_private(folio) && !filemap_release_folio(folio, 0))
>  		return 0;
>  
> -	return remove_mapping(mapping, page);
> +	return remove_mapping(mapping, folio);
>  }
>  
>  /**
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 2965be8df713..7959df4d611b 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1335,23 +1335,28 @@ static int __remove_mapping(struct address_space *mapping, struct folio *folio,
>  	return 0;
>  }
>  
> -/*
> - * Attempt to detach a locked page from its ->mapping.  If it is dirty or if
> - * someone else has a ref on the page, abort and return 0.  If it was
> - * successfully detached, return 1.  Assumes the caller has a single ref on
> - * this page.
> +/**
> + * remove_mapping() - Attempt to remove a folio from its mapping.
> + * @mapping: The address space.
> + * @folio: The folio to remove.
> + *
> + * If the folio is dirty, under writeback or if someone else has a ref
> + * on it, removal will fail.
> + * Return: The number of pages removed from the mapping.  0 if the folio
> + * could not be removed.
> + * Context: The caller should have a single refcount on the folio and
> + * hold its lock.
>   */
> -int remove_mapping(struct address_space *mapping, struct page *page)
> +long remove_mapping(struct address_space *mapping, struct folio *folio)
>  {
> -	struct folio *folio = page_folio(page);
>  	if (__remove_mapping(mapping, folio, false, NULL)) {
>  		/*
> -		 * Unfreezing the refcount with 1 rather than 2 effectively
> +		 * Unfreezing the refcount with 1 effectively
>  		 * drops the pagecache ref for us without requiring another
>  		 * atomic operation.
>  		 */
>  		folio_ref_unfreeze(folio, 1);
> -		return 1;
> +		return folio_nr_pages(folio);
>  	}
>  	return 0;
>  }
> 



* Re: [PATCH 06/10] mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio()
  2022-02-14 20:00 ` [PATCH 06/10] mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio() Matthew Wilcox (Oracle)
  2022-02-15  7:23   ` Christoph Hellwig
@ 2022-02-15  9:37   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  9:37 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> Some of the callers already have the address_space and can avoid calling
> folio_mapping() and checking if the folio was already truncated.  Also
> add kernel-doc and fix the return type (in case we ever support folios
> larger than 4TB).
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  include/linux/mm.h  |  1 -
>  mm/internal.h       |  1 +
>  mm/memory-failure.c |  4 ++--
>  mm/truncate.c       | 34 +++++++++++++++++++++++-----------
>  4 files changed, 26 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 4637368d9455..53b301dc5c14 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1853,7 +1853,6 @@ extern void truncate_setsize(struct inode *inode, loff_t newsize);
>  void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to);
>  void truncate_pagecache_range(struct inode *inode, loff_t offset, loff_t end);
>  int generic_error_remove_page(struct address_space *mapping, struct page *page);
> -int invalidate_inode_page(struct page *page);
>  
>  #ifdef CONFIG_MMU
>  extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
> diff --git a/mm/internal.h b/mm/internal.h
> index b7a2195c12b1..927a17d58b85 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -100,6 +100,7 @@ void filemap_free_folio(struct address_space *mapping, struct folio *folio);
>  int truncate_inode_folio(struct address_space *mapping, struct folio *folio);
>  bool truncate_inode_partial_folio(struct folio *folio, loff_t start,
>  		loff_t end);
> +long invalidate_inode_page(struct page *page);
>  
>  /**
>   * folio_evictable - Test whether a folio is evictable.
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 97a9ed8f87a9..0b72a936b8dd 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -2139,7 +2139,7 @@ static bool isolate_page(struct page *page, struct list_head *pagelist)
>   */
>  static int __soft_offline_page(struct page *page)
>  {
> -	int ret = 0;
> +	long ret = 0;
>  	unsigned long pfn = page_to_pfn(page);
>  	struct page *hpage = compound_head(page);
>  	char const *msg_page[] = {"page", "hugepage"};
> @@ -2196,7 +2196,7 @@ static int __soft_offline_page(struct page *page)
>  			if (!list_empty(&pagelist))
>  				putback_movable_pages(&pagelist);
>  
> -			pr_info("soft offline: %#lx: %s migration failed %d, type %pGp\n",
> +			pr_info("soft offline: %#lx: %s migration failed %ld, type %pGp\n",
>  				pfn, msg_page[huge], ret, &page->flags);
>  			if (ret > 0)
>  				ret = -EBUSY;
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 8aa86e294775..b1bdc61198f6 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -273,18 +273,9 @@ int generic_error_remove_page(struct address_space *mapping, struct page *page)
>  }
>  EXPORT_SYMBOL(generic_error_remove_page);
>  
> -/*
> - * Safely invalidate one page from its pagecache mapping.
> - * It only drops clean, unused pages. The page must be locked.
> - *
> - * Returns 1 if the page is successfully invalidated, otherwise 0.
> - */
> -int invalidate_inode_page(struct page *page)
> +static long mapping_shrink_folio(struct address_space *mapping,
> +		struct folio *folio)
>  {
> -	struct folio *folio = page_folio(page);
> -	struct address_space *mapping = folio_mapping(folio);
> -	if (!mapping)
> -		return 0;
>  	if (folio_test_dirty(folio) || folio_test_writeback(folio))
>  		return 0;
>  	if (folio_ref_count(folio) > folio_nr_pages(folio) + 1)
> @@ -295,6 +286,27 @@ int invalidate_inode_page(struct page *page)
>  	return remove_mapping(mapping, folio);
>  }
>  
> +/**
> + * invalidate_inode_page() - Remove an unused page from the pagecache.
> + * @page: The page to remove.
> + *
> + * Safely invalidate one page from its pagecache mapping.
> + * It only drops clean, unused pages.
> + *
> + * Context: Page must be locked.
> + * Return: The number of pages successfully removed.
> + */
> +long invalidate_inode_page(struct page *page)
> +{
> +	struct folio *folio = page_folio(page);
> +	struct address_space *mapping = folio_mapping(folio);
> +
> +	/* The page may have been truncated before it was locked */
> +	if (!mapping)
> +		return 0;
> +	return mapping_shrink_folio(mapping, folio);
> +}
> +
>  /**
>   * truncate_inode_pages_range - truncate range of pages specified by start & end byte offsets
>   * @mapping: mapping to truncate
> 



* Re: [PATCH 07/10] mm/truncate: Convert __invalidate_mapping_pages() to use a folio
  2022-02-14 20:00 ` [PATCH 07/10] mm/truncate: Convert __invalidate_mapping_pages() to use a folio Matthew Wilcox (Oracle)
  2022-02-15  7:24   ` Christoph Hellwig
@ 2022-02-15  9:37   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  9:37 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> Now we can call mapping_shrink_folio() instead of invalidate_inode_page()
> and save a few calls to compound_head().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> ---
>  mm/truncate.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index b1bdc61198f6..567557c36d45 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -505,27 +505,27 @@ static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
>  	folio_batch_init(&fbatch);
>  	while (find_lock_entries(mapping, index, end, &fbatch, indices)) {
>  		for (i = 0; i < folio_batch_count(&fbatch); i++) {
> -			struct page *page = &fbatch.folios[i]->page;
> +			struct folio *folio = fbatch.folios[i];
>  
> -			/* We rely upon deletion not changing page->index */
> +			/* We rely upon deletion not changing folio->index */
>  			index = indices[i];
>  
> -			if (xa_is_value(page)) {
> +			if (xa_is_value(folio)) {
>  				count += invalidate_exceptional_entry(mapping,
>  								      index,
> -								      page);
> +								      folio);
>  				continue;
>  			}
> -			index += thp_nr_pages(page) - 1;
> +			index += folio_nr_pages(folio) - 1;
>  
> -			ret = invalidate_inode_page(page);
> -			unlock_page(page);
> +			ret = mapping_shrink_folio(mapping, folio);
> +			folio_unlock(folio);
>  			/*
> -			 * Invalidation is a hint that the page is no longer
> +			 * Invalidation is a hint that the folio is no longer
>  			 * of interest and try to speed up its reclaim.
>  			 */
>  			if (!ret) {
> -				deactivate_file_page(page);
> +				deactivate_file_page(&folio->page);
>  				/* It is likely on the pagevec of a remote CPU */
>  				if (nr_pagevec)
>  					(*nr_pagevec)++;
> 



* Re: [PATCH 09/10] mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages()
  2022-02-14 20:00 ` [PATCH 09/10] mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages() Matthew Wilcox (Oracle)
  2022-02-15  7:26   ` Christoph Hellwig
@ 2022-02-15  9:37   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  9:37 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> We can save a function call by combining these two functions, which
> are identical except for the return value.  Also move the prototype
> to mm/internal.h.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> ---
>  include/linux/fs.h |  4 ----
>  mm/internal.h      |  2 ++
>  mm/truncate.c      | 32 +++++++++++++-------------------
>  3 files changed, 15 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index e2d892b201b0..85c584c5c623 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2749,10 +2749,6 @@ extern bool is_bad_inode(struct inode *);
>  unsigned long invalidate_mapping_pages(struct address_space *mapping,
>  					pgoff_t start, pgoff_t end);
>  
> -void invalidate_mapping_pagevec(struct address_space *mapping,
> -				pgoff_t start, pgoff_t end,
> -				unsigned long *nr_pagevec);
> -
>  static inline void invalidate_remote_inode(struct inode *inode)
>  {
>  	if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
> diff --git a/mm/internal.h b/mm/internal.h
> index d886b87b1294..6bbe40a1880a 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -102,6 +102,8 @@ int truncate_inode_folio(struct address_space *mapping, struct folio *folio);
>  bool truncate_inode_partial_folio(struct folio *folio, loff_t start,
>  		loff_t end);
>  long invalidate_inode_page(struct page *page);
> +unsigned long invalidate_mapping_pagevec(struct address_space *mapping,
> +		pgoff_t start, pgoff_t end, unsigned long *nr_pagevec);
>  
>  /**
>   * folio_evictable - Test whether a folio is evictable.
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 14486e75ec28..6b94b00f4307 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -492,7 +492,18 @@ void truncate_inode_pages_final(struct address_space *mapping)
>  }
>  EXPORT_SYMBOL(truncate_inode_pages_final);
>  
> -static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
> +/**
> + * invalidate_mapping_pagevec - Invalidate all the unlocked pages of one inode
> + * @mapping: the address_space which holds the pages to invalidate
> + * @start: the offset 'from' which to invalidate
> + * @end: the offset 'to' which to invalidate (inclusive)
> + * @nr_pagevec: invalidate failed page number for caller
> + *
> + * This helper is similar to invalidate_mapping_pages(), except that it accounts
> + * for pages that are likely on a pagevec and counts them in @nr_pagevec, which
> + * will be used by the caller.
> + */
> +unsigned long invalidate_mapping_pagevec(struct address_space *mapping,
>  		pgoff_t start, pgoff_t end, unsigned long *nr_pagevec)
>  {
>  	pgoff_t indices[PAGEVEC_SIZE];
> @@ -557,27 +568,10 @@ static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
>  unsigned long invalidate_mapping_pages(struct address_space *mapping,
>  		pgoff_t start, pgoff_t end)
>  {
> -	return __invalidate_mapping_pages(mapping, start, end, NULL);
> +	return invalidate_mapping_pagevec(mapping, start, end, NULL);
>  }
>  EXPORT_SYMBOL(invalidate_mapping_pages);
>  
> -/**
> - * invalidate_mapping_pagevec - Invalidate all the unlocked pages of one inode
> - * @mapping: the address_space which holds the pages to invalidate
> - * @start: the offset 'from' which to invalidate
> - * @end: the offset 'to' which to invalidate (inclusive)
> - * @nr_pagevec: invalidate failed page number for caller
> - *
> - * This helper is similar to invalidate_mapping_pages(), except that it accounts
> - * for pages that are likely on a pagevec and counts them in @nr_pagevec, which
> - * will be used by the caller.
> - */
> -void invalidate_mapping_pagevec(struct address_space *mapping,
> -		pgoff_t start, pgoff_t end, unsigned long *nr_pagevec)
> -{
> -	__invalidate_mapping_pages(mapping, start, end, nr_pagevec);
> -}
> -
>  /*
>   * This is like invalidate_inode_page(), except it ignores the page's
>   * refcount.  We do this because invalidate_inode_pages2() needs stronger
> 



* Re: [PATCH 10/10] fs: Move many prototypes to pagemap.h
  2022-02-14 20:00 ` [PATCH 10/10] fs: Move many prototypes to pagemap.h Matthew Wilcox (Oracle)
  2022-02-15  7:27   ` Christoph Hellwig
@ 2022-02-15  9:38   ` Miaohe Lin
  1 sibling, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-15  9:38 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> These functions are page cache functionality and don't need to be
> declared in fs.h.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  drivers/block/xen-blkback/xenbus.c           |   1 +
>  drivers/usb/gadget/function/f_mass_storage.c |   1 +
>  fs/coda/file.c                               |   1 +
>  fs/iomap/fiemap.c                            |   1 +
>  fs/nfsd/filecache.c                          |   1 +
>  fs/nfsd/vfs.c                                |   1 +
>  fs/vboxsf/utils.c                            |   1 +
>  include/linux/fs.h                           | 116 -------------------
>  include/linux/pagemap.h                      | 114 ++++++++++++++++++
>  9 files changed, 121 insertions(+), 116 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index 62125fd4af4a..f09040435e2e 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -10,6 +10,7 @@
>  
>  #include <linux/module.h>
>  #include <linux/kthread.h>
> +#include <linux/pagemap.h>
>  #include <xen/events.h>
>  #include <xen/grant_table.h>
>  #include "common.h"
> diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
> index 46dd11dcb3a8..7371c6e65b10 100644
> --- a/drivers/usb/gadget/function/f_mass_storage.c
> +++ b/drivers/usb/gadget/function/f_mass_storage.c
> @@ -179,6 +179,7 @@
>  #include <linux/kthread.h>
>  #include <linux/sched/signal.h>
>  #include <linux/limits.h>
> +#include <linux/pagemap.h>
>  #include <linux/rwsem.h>
>  #include <linux/slab.h>
>  #include <linux/spinlock.h>
> diff --git a/fs/coda/file.c b/fs/coda/file.c
> index 29dd87be2fb8..3f3c81e6b1ab 100644
> --- a/fs/coda/file.c
> +++ b/fs/coda/file.c
> @@ -14,6 +14,7 @@
>  #include <linux/time.h>
>  #include <linux/file.h>
>  #include <linux/fs.h>
> +#include <linux/pagemap.h>
>  #include <linux/stat.h>
>  #include <linux/cred.h>
>  #include <linux/errno.h>
> diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
> index 66cf267c68ae..610ca6f1ec9b 100644
> --- a/fs/iomap/fiemap.c
> +++ b/fs/iomap/fiemap.c
> @@ -7,6 +7,7 @@
>  #include <linux/fs.h>
>  #include <linux/iomap.h>
>  #include <linux/fiemap.h>
> +#include <linux/pagemap.h>
>  
>  static int iomap_to_fiemap(struct fiemap_extent_info *fi,
>  		const struct iomap *iomap, u32 flags)
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 8bc807c5fea4..47f804e0ec93 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -7,6 +7,7 @@
>  #include <linux/hash.h>
>  #include <linux/slab.h>
>  #include <linux/file.h>
> +#include <linux/pagemap.h>
>  #include <linux/sched.h>
>  #include <linux/list_lru.h>
>  #include <linux/fsnotify_backend.h>
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 91600e71be19..fe0d7abbc1b1 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -26,6 +26,7 @@
>  #include <linux/xattr.h>
>  #include <linux/jhash.h>
>  #include <linux/ima.h>
> +#include <linux/pagemap.h>
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
>  #include <linux/exportfs.h>
> diff --git a/fs/vboxsf/utils.c b/fs/vboxsf/utils.c
> index aec2ebf7d25a..e1db0f3f7e5e 100644
> --- a/fs/vboxsf/utils.c
> +++ b/fs/vboxsf/utils.c
> @@ -9,6 +9,7 @@
>  #include <linux/namei.h>
>  #include <linux/nls.h>
>  #include <linux/sizes.h>
> +#include <linux/pagemap.h>
>  #include <linux/vfs.h>
>  #include "vfsmod.h"
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 85c584c5c623..0961c979e949 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2746,50 +2746,6 @@ extern void init_special_inode(struct inode *, umode_t, dev_t);
>  extern void make_bad_inode(struct inode *);
>  extern bool is_bad_inode(struct inode *);
>  
> -unsigned long invalidate_mapping_pages(struct address_space *mapping,
> -					pgoff_t start, pgoff_t end);
> -
> -static inline void invalidate_remote_inode(struct inode *inode)
> -{
> -	if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
> -	    S_ISLNK(inode->i_mode))
> -		invalidate_mapping_pages(inode->i_mapping, 0, -1);
> -}
> -extern int invalidate_inode_pages2(struct address_space *mapping);
> -extern int invalidate_inode_pages2_range(struct address_space *mapping,
> -					 pgoff_t start, pgoff_t end);
> -extern int write_inode_now(struct inode *, int);
> -extern int filemap_fdatawrite(struct address_space *);
> -extern int filemap_flush(struct address_space *);
> -extern int filemap_fdatawait_keep_errors(struct address_space *mapping);
> -extern int filemap_fdatawait_range(struct address_space *, loff_t lstart,
> -				   loff_t lend);
> -extern int filemap_fdatawait_range_keep_errors(struct address_space *mapping,
> -		loff_t start_byte, loff_t end_byte);
> -
> -static inline int filemap_fdatawait(struct address_space *mapping)
> -{
> -	return filemap_fdatawait_range(mapping, 0, LLONG_MAX);
> -}
> -
> -extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
> -				  loff_t lend);
> -extern int filemap_write_and_wait_range(struct address_space *mapping,
> -				        loff_t lstart, loff_t lend);
> -extern int __filemap_fdatawrite_range(struct address_space *mapping,
> -				loff_t start, loff_t end, int sync_mode);
> -extern int filemap_fdatawrite_range(struct address_space *mapping,
> -				loff_t start, loff_t end);
> -extern int filemap_check_errors(struct address_space *mapping);
> -extern void __filemap_set_wb_err(struct address_space *mapping, int err);
> -int filemap_fdatawrite_wbc(struct address_space *mapping,
> -			   struct writeback_control *wbc);
> -
> -static inline int filemap_write_and_wait(struct address_space *mapping)
> -{
> -	return filemap_write_and_wait_range(mapping, 0, LLONG_MAX);
> -}
> -
>  extern int __must_check file_fdatawait_range(struct file *file, loff_t lstart,
>  						loff_t lend);
>  extern int __must_check file_check_and_advance_wb_err(struct file *file);
> @@ -2801,67 +2757,6 @@ static inline int file_write_and_wait(struct file *file)
>  	return file_write_and_wait_range(file, 0, LLONG_MAX);
>  }
>  
> -/**
> - * filemap_set_wb_err - set a writeback error on an address_space
> - * @mapping: mapping in which to set writeback error
> - * @err: error to be set in mapping
> - *
> - * When writeback fails in some way, we must record that error so that
> - * userspace can be informed when fsync and the like are called.  We endeavor
> - * to report errors on any file that was open at the time of the error.  Some
> - * internal callers also need to know when writeback errors have occurred.
> - *
> - * When a writeback error occurs, most filesystems will want to call
> - * filemap_set_wb_err to record the error in the mapping so that it will be
> - * automatically reported whenever fsync is called on the file.
> - */
> -static inline void filemap_set_wb_err(struct address_space *mapping, int err)
> -{
> -	/* Fastpath for common case of no error */
> -	if (unlikely(err))
> -		__filemap_set_wb_err(mapping, err);
> -}
> -
> -/**
> - * filemap_check_wb_err - has an error occurred since the mark was sampled?
> - * @mapping: mapping to check for writeback errors
> - * @since: previously-sampled errseq_t
> - *
> - * Grab the errseq_t value from the mapping, and see if it has changed "since"
> - * the given value was sampled.
> - *
> - * If it has then report the latest error set, otherwise return 0.
> - */
> -static inline int filemap_check_wb_err(struct address_space *mapping,
> -					errseq_t since)
> -{
> -	return errseq_check(&mapping->wb_err, since);
> -}
> -
> -/**
> - * filemap_sample_wb_err - sample the current errseq_t to test for later errors
> - * @mapping: mapping to be sampled
> - *
> - * Writeback errors are always reported relative to a particular sample point
> - * in the past. This function provides those sample points.
> - */
> -static inline errseq_t filemap_sample_wb_err(struct address_space *mapping)
> -{
> -	return errseq_sample(&mapping->wb_err);
> -}
> -
> -/**
> - * file_sample_sb_err - sample the current errseq_t to test for later errors
> - * @file: file pointer to be sampled
> - *
> - * Grab the most current superblock-level errseq_t value for the given
> - * struct file.
> - */
> -static inline errseq_t file_sample_sb_err(struct file *file)
> -{
> -	return errseq_sample(&file->f_path.dentry->d_sb->s_wb_err);
> -}
> -
>  extern int vfs_fsync_range(struct file *file, loff_t start, loff_t end,
>  			   int datasync);
>  extern int vfs_fsync(struct file *file, int datasync);
> @@ -3604,15 +3499,4 @@ extern int vfs_fadvise(struct file *file, loff_t offset, loff_t len,
>  extern int generic_fadvise(struct file *file, loff_t offset, loff_t len,
>  			   int advice);
>  
> -/*
> - * Flush file data before changing attributes.  Caller must hold any locks
> - * required to prevent further writes to this file until we're done setting
> - * flags.
> - */
> -static inline int inode_drain_writes(struct inode *inode)
> -{
> -	inode_dio_wait(inode);
> -	return filemap_write_and_wait(inode->i_mapping);
> -}
> -
>  #endif /* _LINUX_FS_H */
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index cdb3f118603a..f968b36ad771 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -18,6 +18,120 @@
>  
>  struct folio_batch;
>  
> +unsigned long invalidate_mapping_pages(struct address_space *mapping,
> +					pgoff_t start, pgoff_t end);
> +
> +static inline void invalidate_remote_inode(struct inode *inode)
> +{
> +	if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
> +	    S_ISLNK(inode->i_mode))
> +		invalidate_mapping_pages(inode->i_mapping, 0, -1);
> +}
> +int invalidate_inode_pages2(struct address_space *mapping);
> +int invalidate_inode_pages2_range(struct address_space *mapping,
> +		pgoff_t start, pgoff_t end);
> +int write_inode_now(struct inode *, int sync);
> +int filemap_fdatawrite(struct address_space *);
> +int filemap_flush(struct address_space *);
> +int filemap_fdatawait_keep_errors(struct address_space *mapping);
> +int filemap_fdatawait_range(struct address_space *, loff_t lstart, loff_t lend);
> +int filemap_fdatawait_range_keep_errors(struct address_space *mapping,
> +		loff_t start_byte, loff_t end_byte);
> +
> +static inline int filemap_fdatawait(struct address_space *mapping)
> +{
> +	return filemap_fdatawait_range(mapping, 0, LLONG_MAX);
> +}
> +
> +bool filemap_range_has_page(struct address_space *, loff_t lstart, loff_t lend);
> +int filemap_write_and_wait_range(struct address_space *mapping,
> +		loff_t lstart, loff_t lend);
> +int __filemap_fdatawrite_range(struct address_space *mapping,
> +		loff_t start, loff_t end, int sync_mode);
> +int filemap_fdatawrite_range(struct address_space *mapping,
> +		loff_t start, loff_t end);
> +int filemap_check_errors(struct address_space *mapping);
> +void __filemap_set_wb_err(struct address_space *mapping, int err);
> +int filemap_fdatawrite_wbc(struct address_space *mapping,
> +			   struct writeback_control *wbc);
> +
> +static inline int filemap_write_and_wait(struct address_space *mapping)
> +{
> +	return filemap_write_and_wait_range(mapping, 0, LLONG_MAX);
> +}
> +
> +/**
> + * filemap_set_wb_err - set a writeback error on an address_space
> + * @mapping: mapping in which to set writeback error
> + * @err: error to be set in mapping
> + *
> + * When writeback fails in some way, we must record that error so that
> + * userspace can be informed when fsync and the like are called.  We endeavor
> + * to report errors on any file that was open at the time of the error.  Some
> + * internal callers also need to know when writeback errors have occurred.
> + *
> + * When a writeback error occurs, most filesystems will want to call
> + * filemap_set_wb_err to record the error in the mapping so that it will be
> + * automatically reported whenever fsync is called on the file.
> + */
> +static inline void filemap_set_wb_err(struct address_space *mapping, int err)
> +{
> +	/* Fastpath for common case of no error */
> +	if (unlikely(err))
> +		__filemap_set_wb_err(mapping, err);
> +}
> +
> +/**
> + * filemap_check_wb_err - has an error occurred since the mark was sampled?
> + * @mapping: mapping to check for writeback errors
> + * @since: previously-sampled errseq_t
> + *
> + * Grab the errseq_t value from the mapping, and see if it has changed "since"
> + * the given value was sampled.
> + *
> + * If it has then report the latest error set, otherwise return 0.
> + */
> +static inline int filemap_check_wb_err(struct address_space *mapping,
> +					errseq_t since)
> +{
> +	return errseq_check(&mapping->wb_err, since);
> +}
> +
> +/**
> + * filemap_sample_wb_err - sample the current errseq_t to test for later errors
> + * @mapping: mapping to be sampled
> + *
> + * Writeback errors are always reported relative to a particular sample point
> + * in the past. This function provides those sample points.
> + */
> +static inline errseq_t filemap_sample_wb_err(struct address_space *mapping)
> +{
> +	return errseq_sample(&mapping->wb_err);
> +}
> +
> +/**
> + * file_sample_sb_err - sample the current errseq_t to test for later errors
> + * @file: file pointer to be sampled
> + *
> + * Grab the most current superblock-level errseq_t value for the given
> + * struct file.
> + */
> +static inline errseq_t file_sample_sb_err(struct file *file)
> +{
> +	return errseq_sample(&file->f_path.dentry->d_sb->s_wb_err);
> +}
> +
> +/*
> + * Flush file data before changing attributes.  Caller must hold any locks
> + * required to prevent further writes to this file until we're done setting
> + * flags.
> + */
> +static inline int inode_drain_writes(struct inode *inode)
> +{
> +	inode_dio_wait(inode);
> +	return filemap_write_and_wait(inode->i_mapping);
> +}
> +
>  static inline bool mapping_empty(struct address_space *mapping)
>  {
>  	return xa_empty(&mapping->i_pages);
> 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-15  7:45   ` Miaohe Lin
@ 2022-02-15 20:09     ` Matthew Wilcox
  2022-02-16  2:36       ` Miaohe Lin
  0 siblings, 1 reply; 42+ messages in thread
From: Matthew Wilcox @ 2022-02-15 20:09 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: Linux FS-devel Mailing List, Linux-MM

On Tue, Feb 15, 2022 at 03:45:34PM +0800, Miaohe Lin wrote:
> > @@ -309,7 +288,10 @@ int invalidate_inode_page(struct page *page)
> >  		return 0;
> >  	if (page_mapped(page))
> >  		return 0;
> > -	return invalidate_complete_page(mapping, page);
> 
> It seems the check of page->mapping != mapping is removed here.
> IIUC, this could cause an unexpected side effect because a
> swapcache page can be invalidated now. I think this function is
> not intended to deal with swapcache, though it could do so.

You're right that it might now pass instead of being skipped.
But it's not currently called for swapcache pages.  If we did want
to prohibit swapcache pages explicitly, I'd rather we checked the
flag instead of relying on page->mapping != page_mapping(page).
The intent of that check was "has it been truncated", not "is it
swapcache".



* Re: [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page()
  2022-02-15  7:19   ` Christoph Hellwig
@ 2022-02-15 20:12     ` Matthew Wilcox
  0 siblings, 0 replies; 42+ messages in thread
From: Matthew Wilcox @ 2022-02-15 20:12 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 11:19:22PM -0800, Christoph Hellwig wrote:
> On Mon, Feb 14, 2022 at 08:00:11PM +0000, Matthew Wilcox (Oracle) wrote:
> > folio_mapped() is expensive because it has to check each page's mapcount
> > field.  A cheaper check is whether there are any extra references to
> > the page, other than the one we own and the ones held by the page cache.
> > The call to remove_mapping() will fail in any case if it cannot freeze
> > the refcount, but failing here avoids cycling the i_pages spinlock.
> 
> I wonder if something like this should also be in a comment near
> the check in the code.

        /* The refcount will be elevated if any page in the folio is mapped */

is what I've added for now.


* Re: [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio()
  2022-02-15  8:26   ` Miaohe Lin
@ 2022-02-15 20:33     ` Matthew Wilcox
  2022-02-16  2:45       ` Miaohe Lin
  0 siblings, 1 reply; 42+ messages in thread
From: Matthew Wilcox @ 2022-02-15 20:33 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: linux-fsdevel, linux-mm

On Tue, Feb 15, 2022 at 04:26:22PM +0800, Miaohe Lin wrote:
> > +	folio_get(folio);
> 
> Should we comment the assumption that the caller already holds the refcnt?

Added to the kernel-doc:
+ * Context: Caller holds a reference on the page.

> Anyway, this patch looks good to me. Thanks.
> 
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> 
> > +	local_lock(&lru_pvecs.lock);
> > +	pvec = this_cpu_ptr(&lru_pvecs.lru_deactivate_file);
> >  
> > -		if (pagevec_add_and_need_flush(pvec, page))
> > -			pagevec_lru_move_fn(pvec, lru_deactivate_file_fn);
> > -		local_unlock(&lru_pvecs.lock);
> > -	}
> > +	if (pagevec_add_and_need_flush(pvec, &folio->page))
> > +		pagevec_lru_move_fn(pvec, lru_deactivate_file_fn);
> > +	local_unlock(&lru_pvecs.lock);
> >  }
> >  
> >  /*
> > diff --git a/mm/truncate.c b/mm/truncate.c
> > index 567557c36d45..14486e75ec28 100644
> > --- a/mm/truncate.c
> > +++ b/mm/truncate.c
> > @@ -525,7 +525,7 @@ static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
> >  			 * of interest and try to speed up its reclaim.
> >  			 */
> >  			if (!ret) {
> > -				deactivate_file_page(&folio->page);
> > +				deactivate_file_folio(folio);
> >  				/* It is likely on the pagevec of a remote CPU */
> >  				if (nr_pagevec)
> >  					(*nr_pagevec)++;
> > 
> 
> 


* Re: [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-15 20:09     ` Matthew Wilcox
@ 2022-02-16  2:36       ` Miaohe Lin
  0 siblings, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-16  2:36 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Linux FS-devel Mailing List, Linux-MM, HORIGUCHI NAOYA

On 2022/2/16 4:09, Matthew Wilcox wrote:
> On Tue, Feb 15, 2022 at 03:45:34PM +0800, Miaohe Lin wrote:
>>> @@ -309,7 +288,10 @@ int invalidate_inode_page(struct page *page)
>>>  		return 0;
>>>  	if (page_mapped(page))
>>>  		return 0;
>>> -	return invalidate_complete_page(mapping, page);
>>
>> It seems the check of page->mapping != mapping is removed here.
>> IIUC, this could cause an unexpected side effect because a
>> swapcache page can be invalidated now. I think this function is
>> not intended to deal with swapcache, though it could do so.
> 
> You're right that it might now pass instead of being skipped.
> But it's not currently called for swapcache pages.  If we did want

AFAICS, __soft_offline_page might call invalidate_inode_page for a swapcache page.
It only checks !PageHuge(page). Maybe __soft_offline_page should be changed to check
the flag, or maybe it's fine to invalidate a swapcache page there. I'm not sure...

> to prohibit swapcache pages explicitly, I'd rather we checked the
> flag instead of relying on page->mapping != page_mapping(page).

Agree.

> The intent of that check was "has it been truncated", not "is it
> swapcache".

Many thanks for clarifying this.

> 
> 
> .
> 


* Re: [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller
  2022-02-14 20:00 ` [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller Matthew Wilcox (Oracle)
                     ` (2 preceding siblings ...)
  2022-02-15  7:45   ` Miaohe Lin
@ 2022-02-16  2:45   ` Miaohe Lin
  3 siblings, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-16  2:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm

On 2022/2/15 4:00, Matthew Wilcox (Oracle) wrote:
> invalidate_inode_page() is the only caller of invalidate_complete_page()
> and inlining it reveals that the first check is unnecessary (because we
> hold the page locked, and we just retrieved the mapping from the page).
> Actually, it does make a difference, in that tail pages no longer fail
> at this check, so it's now possible to remove a tail page from a mapping.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---

LGTM. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

>  mm/truncate.c | 28 +++++-----------------------
>  1 file changed, 5 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 9dbf0b75da5d..e5e2edaa0b76 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -193,27 +193,6 @@ static void truncate_cleanup_folio(struct folio *folio)
>  	folio_clear_mappedtodisk(folio);
>  }
>  
> -/*
> - * This is for invalidate_mapping_pages().  That function can be called at
> - * any time, and is not supposed to throw away dirty pages.  But pages can
> - * be marked dirty at any time too, so use remove_mapping which safely
> - * discards clean, unused pages.
> - *
> - * Returns non-zero if the page was successfully invalidated.
> - */
> -static int
> -invalidate_complete_page(struct address_space *mapping, struct page *page)
> -{
> -
> -	if (page->mapping != mapping)
> -		return 0;
> -
> -	if (page_has_private(page) && !try_to_release_page(page, 0))
> -		return 0;
> -
> -	return remove_mapping(mapping, page);
> -}
> -
>  int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
>  {
>  	if (folio->mapping != mapping)
> @@ -309,7 +288,10 @@ int invalidate_inode_page(struct page *page)
>  		return 0;
>  	if (page_mapped(page))
>  		return 0;
> -	return invalidate_complete_page(mapping, page);
> +	if (page_has_private(page) && !try_to_release_page(page, 0))
> +		return 0;
> +
> +	return remove_mapping(mapping, page);
>  }
>  
>  /**
> @@ -584,7 +566,7 @@ void invalidate_mapping_pagevec(struct address_space *mapping,
>  }
>  
>  /*
> - * This is like invalidate_complete_page(), except it ignores the page's
> + * This is like invalidate_inode_page(), except it ignores the page's
>   * refcount.  We do this because invalidate_inode_pages2() needs stronger
>   * invalidation guarantees, and cannot afford to leave pages behind because
>   * shrink_page_list() has a temp ref on them, or because they're transiently
> 



* Re: [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio()
  2022-02-15 20:33     ` Matthew Wilcox
@ 2022-02-16  2:45       ` Miaohe Lin
  0 siblings, 0 replies; 42+ messages in thread
From: Miaohe Lin @ 2022-02-16  2:45 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel, linux-mm

On 2022/2/16 4:33, Matthew Wilcox wrote:
> On Tue, Feb 15, 2022 at 04:26:22PM +0800, Miaohe Lin wrote:
>>> +	folio_get(folio);
>>
>> Should we comment the assumption that the caller already holds the refcnt?
> 
> Added to the kernel-doc:
> + * Context: Caller holds a reference on the page.
> 

I see. Thanks.

>> Anyway, this patch looks good to me. Thanks.
>>
>> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
>>
>>> +	local_lock(&lru_pvecs.lock);
>>> +	pvec = this_cpu_ptr(&lru_pvecs.lru_deactivate_file);
>>>  
>>> -		if (pagevec_add_and_need_flush(pvec, page))
>>> -			pagevec_lru_move_fn(pvec, lru_deactivate_file_fn);
>>> -		local_unlock(&lru_pvecs.lock);
>>> -	}
>>> +	if (pagevec_add_and_need_flush(pvec, &folio->page))
>>> +		pagevec_lru_move_fn(pvec, lru_deactivate_file_fn);
>>> +	local_unlock(&lru_pvecs.lock);
>>>  }
>>>  
>>>  /*
>>> diff --git a/mm/truncate.c b/mm/truncate.c
>>> index 567557c36d45..14486e75ec28 100644
>>> --- a/mm/truncate.c
>>> +++ b/mm/truncate.c
>>> @@ -525,7 +525,7 @@ static unsigned long __invalidate_mapping_pages(struct address_space *mapping,
>>>  			 * of interest and try to speed up its reclaim.
>>>  			 */
>>>  			if (!ret) {
>>> -				deactivate_file_page(&folio->page);
>>> +				deactivate_file_folio(folio);
>>>  				/* It is likely on the pagevec of a remote CPU */
>>>  				if (nr_pagevec)
>>>  					(*nr_pagevec)++;
>>>
>>
>>
> .
> 



* Re: [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page()
  2022-02-14 20:00 ` [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page() Matthew Wilcox (Oracle)
  2022-02-15  7:19   ` Christoph Hellwig
  2022-02-15  8:32   ` Miaohe Lin
@ 2022-02-25  1:31   ` Matthew Wilcox
  2022-02-25  3:27     ` Matthew Wilcox
  2 siblings, 1 reply; 42+ messages in thread
From: Matthew Wilcox @ 2022-02-25  1:31 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm

On Mon, Feb 14, 2022 at 08:00:11PM +0000, Matthew Wilcox (Oracle) wrote:
> folio_mapped() is expensive because it has to check each page's mapcount
> field.  A cheaper check is whether there are any extra references to
> the page, other than the one we own and the ones held by the page cache.
> The call to remove_mapping() will fail in any case if it cannot freeze
> the refcount, but failing here avoids cycling the i_pages spinlock.

This is the patch that's causing ltp's readahead02 test to break.
Haven't dug into why yet, but it happens without large folios, so
I got something wrong.

> diff --git a/mm/truncate.c b/mm/truncate.c
> index b73c30c95cd0..d67fa8871b75 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -287,7 +287,7 @@ int invalidate_inode_page(struct page *page)
>  		return 0;
>  	if (folio_test_dirty(folio) || folio_test_writeback(folio))
>  		return 0;
> -	if (page_mapped(page))
> +	if (folio_ref_count(folio) > folio_nr_pages(folio) + 1)
>  		return 0;
>  	if (folio_has_private(folio) && !filemap_release_folio(folio, 0))
>  		return 0;
> -- 
> 2.34.1
> 


* Re: [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page()
  2022-02-25  1:31   ` Matthew Wilcox
@ 2022-02-25  3:27     ` Matthew Wilcox
  0 siblings, 0 replies; 42+ messages in thread
From: Matthew Wilcox @ 2022-02-25  3:27 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm

On Fri, Feb 25, 2022 at 01:31:09AM +0000, Matthew Wilcox wrote:
> On Mon, Feb 14, 2022 at 08:00:11PM +0000, Matthew Wilcox (Oracle) wrote:
> > folio_mapped() is expensive because it has to check each page's mapcount
> > field.  A cheaper check is whether there are any extra references to
> > the page, other than the one we own and the ones held by the page cache.
> > The call to remove_mapping() will fail in any case if it cannot freeze
> > the refcount, but failing here avoids cycling the i_pages spinlock.
> 
> This is the patch that's causing ltp's readahead02 test to break.
> Haven't dug into why yet, but it happens without large folios, so
> I got something wrong.

This fixes it:

+++ b/mm/truncate.c
@@ -288,7 +288,8 @@ int invalidate_inode_page(struct page *page)
        if (folio_test_dirty(folio) || folio_test_writeback(folio))
                return 0;
        /* The refcount will be elevated if any page in the folio is mapped */
-       if (folio_ref_count(folio) > folio_nr_pages(folio) + 1)
+       if (folio_ref_count(folio) >
+                       folio_nr_pages(folio) + 1 + folio_has_private(folio))
                return 0;
        if (folio_has_private(folio) && !filemap_release_folio(folio, 0))
                return 0;

Too late for today's -next, but I'll push it out tomorrow.


end of thread, other threads:[~2022-02-25  3:27 UTC | newest]

2022-02-14 20:00 [PATCH 00/10] Various fixes around invalidate_page() Matthew Wilcox (Oracle)
2022-02-14 20:00 ` [PATCH 01/10] splice: Use a folio in page_cache_pipe_buf_try_steal() Matthew Wilcox (Oracle)
2022-02-15  7:15   ` Christoph Hellwig
2022-02-15  8:32   ` Miaohe Lin
2022-02-14 20:00 ` [PATCH 02/10] mm/truncate: Inline invalidate_complete_page() into its one caller Matthew Wilcox (Oracle)
2022-02-14 23:09   ` John Hubbard
2022-02-14 23:32     ` Matthew Wilcox
2022-02-14 23:51       ` John Hubbard
2022-02-15  7:17   ` Christoph Hellwig
2022-02-15  7:45   ` Miaohe Lin
2022-02-15 20:09     ` Matthew Wilcox
2022-02-16  2:36       ` Miaohe Lin
2022-02-16  2:45   ` Miaohe Lin
2022-02-14 20:00 ` [PATCH 03/10] mm/truncate: Convert invalidate_inode_page() to use a folio Matthew Wilcox (Oracle)
2022-02-15  7:18   ` Christoph Hellwig
2022-02-15  8:32   ` Miaohe Lin
2022-02-14 20:00 ` [PATCH 04/10] mm/truncate: Replace page_mapped() call in invalidate_inode_page() Matthew Wilcox (Oracle)
2022-02-15  7:19   ` Christoph Hellwig
2022-02-15 20:12     ` Matthew Wilcox
2022-02-15  8:32   ` Miaohe Lin
2022-02-25  1:31   ` Matthew Wilcox
2022-02-25  3:27     ` Matthew Wilcox
2022-02-14 20:00 ` [PATCH 05/10] mm: Convert remove_mapping() to take a folio Matthew Wilcox (Oracle)
2022-02-15  7:21   ` Christoph Hellwig
2022-02-15  8:33   ` Miaohe Lin
2022-02-14 20:00 ` [PATCH 06/10] mm/truncate: Split invalidate_inode_page() into mapping_shrink_folio() Matthew Wilcox (Oracle)
2022-02-15  7:23   ` Christoph Hellwig
2022-02-15  9:37   ` Miaohe Lin
2022-02-14 20:00 ` [PATCH 07/10] mm/truncate: Convert __invalidate_mapping_pages() to use a folio Matthew Wilcox (Oracle)
2022-02-15  7:24   ` Christoph Hellwig
2022-02-15  9:37   ` Miaohe Lin
2022-02-14 20:00 ` [PATCH 08/10] mm: Turn deactivate_file_page() into deactivate_file_folio() Matthew Wilcox (Oracle)
2022-02-15  7:25   ` Christoph Hellwig
2022-02-15  8:26   ` Miaohe Lin
2022-02-15 20:33     ` Matthew Wilcox
2022-02-16  2:45       ` Miaohe Lin
2022-02-14 20:00 ` [PATCH 09/10] mm/truncate: Combine invalidate_mapping_pagevec() and __invalidate_mapping_pages() Matthew Wilcox (Oracle)
2022-02-15  7:26   ` Christoph Hellwig
2022-02-15  9:37   ` Miaohe Lin
2022-02-14 20:00 ` [PATCH 10/10] fs: Move many prototypes to pagemap.h Matthew Wilcox (Oracle)
2022-02-15  7:27   ` Christoph Hellwig
2022-02-15  9:38   ` Miaohe Lin
