* [PATCH v3 00/25] Page folios
@ 2021-01-28  7:03 Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 01/25] mm: Introduce struct folio Matthew Wilcox (Oracle)
                   ` (24 more replies)
  0 siblings, 25 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Some functions which take a struct page as an argument operate on
PAGE_SIZE bytes.  Others operate on the entire compound page if
passed either a head or tail page.  Others operate on the compound
page if passed a head page, but PAGE_SIZE bytes if passed a tail page.
Yet others either BUG or do the wrong thing if passed a tail page.

This patch series starts to resolve this ambiguity by introducing a new
type, the struct folio.  A function which takes a struct folio argument
declares that it will operate on the entire page.  In return, the caller
guarantees that the pointer it is passing does not point to a tail page.

This allows us to do less work.  Now that we have a type that is guaranteed
not to be a tail page, we can avoid calling compound_head().  That saves
us hundreds of bytes of text and even manages to reduce the amount of
data in the kernel image somehow.
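
As a simplified sketch of what this buys us (put_folio() is added in patch
5 below; devmap handling is omitted here for brevity):

static inline void put_page(struct page *page)
{
        page = compound_head(page);     /* @page may be a tail page */
        if (put_page_testzero(page))
                __put_page(page);
}

static inline void put_folio(struct folio *folio)
{
        /* Never a tail page, so no compound_head() call is needed. */
        if (put_page_testzero(&folio->page))
                __put_page(&folio->page);
}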

The focus for this patch series is on introducing infrastructure.
The big correctness proof that exists in this patch series is to make
it clear that one cannot wait (for the page lock or writeback) on a
tail page.  I don't believe there were any places which could miss a
wakeup due to this, but it's hard to prove that without struct folio.
Now the compiler proves it for us.

v3:
 - Rebase on next-20210127.  Two major sources of conflict, the
   generic_file_buffered_read refactoring (in akpm tree) and the
   fscache work (in dhowells tree).  I'm not sure how this patch series
   can be merged given these two sources of conflict.
v2:
 - Pare patch series back to just infrastructure and the page waiting
   parts.

Matthew Wilcox (Oracle) (25):
  mm: Introduce struct folio
  mm: Add folio_pgdat
  mm/vmstat: Add folio stat wrappers
  mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  mm: Add put_folio
  mm: Add get_folio
  mm: Create FolioFlags
  mm: Handle per-folio private data
  mm: Add folio_index, folio_page and folio_contains
  mm/util: Add folio_mapping and folio_file_mapping
  mm/memcg: Add folio_memcg, lock_folio_memcg and unlock_folio_memcg
  mm/memcg: Add mem_cgroup_folio_lruvec
  mm: Add unlock_folio
  mm: Add lock_folio
  mm: Add lock_folio_killable
  mm: Convert lock_page_async to lock_folio_async
  mm/filemap: Convert end_page_writeback to end_folio_writeback
  mm: Convert wait_on_page_bit to wait_on_folio_bit
  mm: Add wait_for_stable_folio and wait_on_folio_writeback
  mm: Add wait_on_folio_locked & wait_on_folio_locked_killable
  mm: Convert lock_page_or_retry to lock_folio_or_retry
  mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit
  mm: Convert test_clear_page_writeback to test_clear_folio_writeback
  mm/filemap: Convert page wait queues to be folios
  cachefiles: Switch to wait_page_key

 fs/afs/write.c             |  31 +++---
 fs/cachefiles/rdwr.c       |  13 ++-
 fs/io_uring.c              |   2 +-
 include/linux/memcontrol.h |  22 ++++
 include/linux/mm.h         |  88 ++++++++++++----
 include/linux/mm_types.h   |  33 ++++++
 include/linux/mmdebug.h    |  20 ++++
 include/linux/netfs.h      |   5 +
 include/linux/page-flags.h | 106 +++++++++++++++----
 include/linux/pagemap.h    | 201 ++++++++++++++++++++++++-----------
 include/linux/vmstat.h     |  60 +++++++++++
 mm/filemap.c               | 207 ++++++++++++++++++-------------------
 mm/memcontrol.c            |  36 ++++---
 mm/memory.c                |  10 +-
 mm/page-writeback.c        |  48 ++++-----
 mm/swapfile.c              |   6 +-
 mm/util.c                  |  20 ++--
 17 files changed, 621 insertions(+), 287 deletions(-)

-- 
2.29.2



* [PATCH v3 01/25] mm: Introduce struct folio
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-03-01 20:26   ` Zi Yan
  2021-01-28  7:03 ` [PATCH v3 02/25] mm: Add folio_pgdat Matthew Wilcox (Oracle)
                   ` (23 subsequent siblings)
  24 siblings, 1 reply; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

We have trouble keeping track of whether we've already called
compound_head() to ensure we're not operating on a tail page.  Further,
it's never clear whether we intend a struct page to refer to PAGE_SIZE
bytes or page_size(compound_head(page)).

Introduce a new type 'struct folio' that always refers to an entire
(possibly compound) page, and points to the head page (or base page).
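
As a usage sketch (every helper below is added by this patch; "page" and
"addr" are only illustrative names):

	struct folio *folio = page_folio(page);

	/* All of these describe the whole folio, never PAGE_SIZE of a tail: */
	unsigned int order = folio_order(folio);
	unsigned long nr = folio_nr_pages(folio);
	size_t bytes = folio_size(folio);
	unsigned long offset = offset_in_folio(folio, addr);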

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h       | 26 ++++++++++++++++++++++++++
 include/linux/mm_types.h | 17 +++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2d6e715ab8ea..f20504017adf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -924,6 +924,11 @@ static inline unsigned int compound_order(struct page *page)
 	return page[1].compound_order;
 }
 
+static inline unsigned int folio_order(struct folio *folio)
+{
+	return compound_order(&folio->page);
+}
+
 static inline bool hpage_pincount_available(struct page *page)
 {
 	/*
@@ -975,6 +980,26 @@ static inline unsigned int page_shift(struct page *page)
 
 void free_compound_page(struct page *page);
 
+static inline unsigned long folio_nr_pages(struct folio *folio)
+{
+	return compound_nr(&folio->page);
+}
+
+static inline struct folio *next_folio(struct folio *folio)
+{
+	return folio + folio_nr_pages(folio);
+}
+
+static inline unsigned int folio_shift(struct folio *folio)
+{
+	return PAGE_SHIFT + folio_order(folio);
+}
+
+static inline size_t folio_size(struct folio *folio)
+{
+	return PAGE_SIZE << folio_order(folio);
+}
+
 #ifdef CONFIG_MMU
 /*
  * Do pte_mkwrite, but only if the vma says VM_WRITE.  We do this when
@@ -1618,6 +1643,7 @@ extern void pagefault_out_of_memory(void);
 
 #define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)
 #define offset_in_thp(page, p)	((unsigned long)(p) & (thp_size(page) - 1))
+#define offset_in_folio(folio, p) ((unsigned long)(p) & (folio_size(folio) - 1))
 
 /*
  * Flags passed to show_mem() and show_free_areas() to suppress output in
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 07d9acb5b19c..875dc6cd6ad2 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -223,6 +223,23 @@ struct page {
 #endif
 } _struct_page_alignment;
 
+/*
+ * A struct folio is either a base (order-0) page or the head page of
+ * a compound page.
+ */
+struct folio {
+	struct page page;
+};
+
+static inline struct folio *page_folio(struct page *page)
+{
+	unsigned long head = READ_ONCE(page->compound_head);
+
+	if (unlikely(head & 1))
+		return (struct folio *)(head - 1);
+	return (struct folio *)page;
+}
+
 static inline atomic_t *compound_mapcount_ptr(struct page *page)
 {
 	return &page[1].compound_mapcount;
-- 
2.29.2



* [PATCH v3 02/25] mm: Add folio_pgdat
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 01/25] mm: Introduce struct folio Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-03-01 21:05   ` Zi Yan
  2021-01-28  7:03 ` [PATCH v3 03/25] mm/vmstat: Add folio stat wrappers Matthew Wilcox (Oracle)
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is just a convenience wrapper for callers with folios; pgdat can
be reached from tail pages as well as head pages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f20504017adf..7d787229dd40 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1503,6 +1503,11 @@ static inline pg_data_t *page_pgdat(const struct page *page)
 	return NODE_DATA(page_to_nid(page));
 }
 
+static inline pg_data_t *folio_pgdat(const struct folio *folio)
+{
+	return page_pgdat(&folio->page);
+}
+
 #ifdef SECTION_IN_PAGE_FLAGS
 static inline void set_page_section(struct page *page, unsigned long section)
 {
-- 
2.29.2



* [PATCH v3 03/25] mm/vmstat: Add folio stat wrappers
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 01/25] mm: Introduce struct folio Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 02/25] mm: Add folio_pgdat Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-03-01 21:17   ` Zi Yan
  2021-01-28  7:03 ` [PATCH v3 04/25] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Allow page counters to be more readily modified by callers which have
a folio.  Name these wrappers with 'stat' instead of 'state' as requested
by Linus here:
https://lore.kernel.org/linux-mm/CAHk-=wj847SudR-kt+46fT3+xFFgiwpgThvm7DJWGdi4cVrbnQ@mail.gmail.com/
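
Callers substitute these one-for-one for the page versions, e.g. (an
illustrative sketch; NR_FILE_DIRTY is just an example counter):

	/* before */
	__inc_node_page_state(&folio->page, NR_FILE_DIRTY);
	/* after */
	__inc_node_folio_stat(folio, NR_FILE_DIRTY);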

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/vmstat.h | 60 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 773135fc6e19..3c3373c2c3c2 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -396,6 +396,54 @@ static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
 #endif		/* CONFIG_SMP */
 
+static inline
+void __inc_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
+{
+	__inc_zone_page_state(&folio->page, item);
+}
+
+static inline
+void __dec_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
+{
+	__dec_zone_page_state(&folio->page, item);
+}
+
+static inline
+void inc_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
+{
+	inc_zone_page_state(&folio->page, item);
+}
+
+static inline
+void dec_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
+{
+	dec_zone_page_state(&folio->page, item);
+}
+
+static inline
+void __inc_node_folio_stat(struct folio *folio, enum node_stat_item item)
+{
+	__inc_node_page_state(&folio->page, item);
+}
+
+static inline
+void __dec_node_folio_stat(struct folio *folio, enum node_stat_item item)
+{
+	__dec_node_page_state(&folio->page, item);
+}
+
+static inline
+void inc_node_folio_stat(struct folio *folio, enum node_stat_item item)
+{
+	inc_node_page_state(&folio->page, item);
+}
+
+static inline
+void dec_node_folio_stat(struct folio *folio, enum node_stat_item item)
+{
+	dec_node_page_state(&folio->page, item);
+}
+
 static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pages,
 					     int migratetype)
 {
@@ -530,6 +578,18 @@ static inline void __dec_lruvec_page_state(struct page *page,
 	__mod_lruvec_page_state(page, idx, -1);
 }
 
+static inline void __inc_lruvec_folio_stat(struct folio *folio,
+					   enum node_stat_item idx)
+{
+	__mod_lruvec_page_state(&folio->page, idx, 1);
+}
+
+static inline void __dec_lruvec_folio_stat(struct folio *folio,
+					   enum node_stat_item idx)
+{
+	__mod_lruvec_page_state(&folio->page, idx, -1);
+}
+
 static inline void inc_lruvec_state(struct lruvec *lruvec,
 				    enum node_stat_item idx)
 {
-- 
2.29.2



* [PATCH v3 04/25] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 03/25] mm/vmstat: Add folio stat wrappers Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-03-01 21:25   ` Zi Yan
  2021-01-28  7:03 ` [PATCH v3 05/25] mm: Add put_folio Matthew Wilcox (Oracle)
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

These are the folio equivalents of VM_BUG_ON_PAGE and VM_WARN_ON_ONCE_PAGE.
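
They are used exactly like their page counterparts, e.g. (sketch; this
particular assertion appears later in the series in unlock_folio()):

	VM_BUG_ON_FOLIO(!FolioLocked(folio), folio);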

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mmdebug.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
index 5d0767cb424a..77d24e1dcaec 100644
--- a/include/linux/mmdebug.h
+++ b/include/linux/mmdebug.h
@@ -23,6 +23,13 @@ void dump_mm(const struct mm_struct *mm);
 			BUG();						\
 		}							\
 	} while (0)
+#define VM_BUG_ON_FOLIO(cond, folio)					\
+	do {								\
+		if (unlikely(cond)) {					\
+			dump_page(&folio->page, "VM_BUG_ON_FOLIO(" __stringify(cond)")");\
+			BUG();						\
+		}							\
+	} while (0)
 #define VM_BUG_ON_VMA(cond, vma)					\
 	do {								\
 		if (unlikely(cond)) {					\
@@ -48,6 +55,17 @@ void dump_mm(const struct mm_struct *mm);
 	}								\
 	unlikely(__ret_warn_once);					\
 })
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio)	({			\
+	static bool __section(".data.once") __warned;			\
+	int __ret_warn_once = !!(cond);					\
+									\
+	if (unlikely(__ret_warn_once && !__warned)) {			\
+		dump_page(&folio->page, "VM_WARN_ON_ONCE_FOLIO(" __stringify(cond)")");\
+		__warned = true;					\
+		WARN_ON(1);						\
+	}								\
+	unlikely(__ret_warn_once);					\
+})
 
 #define VM_WARN_ON(cond) (void)WARN_ON(cond)
 #define VM_WARN_ON_ONCE(cond) (void)WARN_ON_ONCE(cond)
@@ -56,11 +74,13 @@ void dump_mm(const struct mm_struct *mm);
 #else
 #define VM_BUG_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_BUG_ON_PAGE(cond, page) VM_BUG_ON(cond)
+#define VM_BUG_ON_FOLIO(cond, folio) VM_BUG_ON(cond)
 #define VM_BUG_ON_VMA(cond, vma) VM_BUG_ON(cond)
 #define VM_BUG_ON_MM(cond, mm) VM_BUG_ON(cond)
 #define VM_WARN_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ONCE(cond, format...) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN(cond, format...) BUILD_BUG_ON_INVALID(cond)
 #endif
-- 
2.29.2



* [PATCH v3 05/25] mm: Add put_folio
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 04/25] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-03-01 21:41   ` Zi Yan
  2021-01-28  7:03 ` [PATCH v3 06/25] mm: Add get_folio Matthew Wilcox (Oracle)
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

If we know we have a folio, we can call put_folio() instead of put_page()
and save the overhead of calling compound_head().  It also skips the
devmap checks.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7d787229dd40..873d649107ba 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1220,9 +1220,15 @@ static inline __must_check bool try_get_page(struct page *page)
 	return true;
 }
 
+static inline void put_folio(struct folio *folio)
+{
+	if (put_page_testzero(&folio->page))
+		__put_page(&folio->page);
+}
+
 static inline void put_page(struct page *page)
 {
-	page = compound_head(page);
+	struct folio *folio = page_folio(page);
 
 	/*
 	 * For devmap managed pages we need to catch refcount transition from
@@ -1230,13 +1236,12 @@ static inline void put_page(struct page *page)
 	 * need to inform the device driver through callback. See
 	 * include/linux/memremap.h and HMM for details.
 	 */
-	if (page_is_devmap_managed(page)) {
-		put_devmap_managed_page(page);
+	if (page_is_devmap_managed(&folio->page)) {
+		put_devmap_managed_page(&folio->page);
 		return;
 	}
 
-	if (put_page_testzero(page))
-		__put_page(page);
+	put_folio(folio);
 }
 
 /*
-- 
2.29.2



* [PATCH v3 06/25] mm: Add get_folio
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 05/25] mm: Add put_folio Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-03-01 21:45   ` Zi Yan
  2021-01-28  7:03 ` [PATCH v3 07/25] mm: Create FolioFlags Matthew Wilcox (Oracle)
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

If we know we have a folio, we can call get_folio() instead of get_page()
and save the overhead of calling compound_head().
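
Together with put_folio() from the previous patch, the usual reference
pattern becomes (sketch):

	struct folio *folio = page_folio(page);

	get_folio(folio);	/* one reference covers the whole folio */
	/* ... operate on the folio ... */
	put_folio(folio);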

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 873d649107ba..d71c5776b571 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1192,18 +1192,19 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
 }
 
 /* 127: arbitrary random number, small enough to assemble well */
-#define page_ref_zero_or_close_to_overflow(page) \
-	((unsigned int) page_ref_count(page) + 127u <= 127u)
+#define folio_ref_zero_or_close_to_overflow(folio) \
+	((unsigned int) page_ref_count(&folio->page) + 127u <= 127u)
+
+static inline void get_folio(struct folio *folio)
+{
+	/* Getting a page requires an already elevated page->_refcount. */
+	VM_BUG_ON_FOLIO(folio_ref_zero_or_close_to_overflow(folio), folio);
+	page_ref_inc(&folio->page);
+}
 
 static inline void get_page(struct page *page)
 {
-	page = compound_head(page);
-	/*
-	 * Getting a normal page or the head of a compound page
-	 * requires to already have an elevated page->_refcount.
-	 */
-	VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page);
-	page_ref_inc(page);
+	get_folio(page_folio(page));
 }
 
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
-- 
2.29.2



* [PATCH v3 07/25] mm: Create FolioFlags
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 06/25] mm: Add get_folio Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 08/25] mm: Handle per-folio private data Matthew Wilcox (Oracle)
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

These new functions are the folio analogues of the PageFlags functions.
If CONFIG_DEBUG_VM_PGFLAGS is enabled, we check the folio is not a
tail page at every invocation.  Note that this will also catch the
PagePoisoned case as a poisoned page has every bit set, which would
include PageTail.
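
For example, a flag declared as PAGEFLAG(Referenced, referenced, PF_HEAD)
now also generates roughly this (sketch of the macro expansion):

static __always_inline int FolioReferenced(struct folio *folio)
	{ return test_bit(PG_referenced, folio_flags(folio)); }
static __always_inline void SetFolioReferenced(struct folio *folio)
	{ set_bit(PG_referenced, folio_flags(folio)); }
static __always_inline void ClearFolioReferenced(struct folio *folio)
	{ clear_bit(PG_referenced, folio_flags(folio)); }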

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/netfs.h      |   5 ++
 include/linux/page-flags.h | 104 ++++++++++++++++++++++++++++++-------
 2 files changed, 89 insertions(+), 20 deletions(-)

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 2ffdef1ded91..887d259eb384 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -24,6 +24,11 @@
 #define ClearPageFsCache(page)		ClearPagePrivate2((page))
 #define TestSetPageFsCache(page)	TestSetPagePrivate2((page))
 #define TestClearPageFsCache(page)	TestClearPagePrivate2((page))
+#define FolioFsCache(page)		FolioPrivate2((page))
+#define SetFolioFsCache(page)		SetFolioPrivate2((page))
+#define ClearFolioFsCache(page)		ClearFolioPrivate2((page))
+#define TestSetFolioFsCache(page)	TestSetFolioPrivate2((page))
+#define TestClearFolioFsCache(page)	TestClearFolioPrivate2((page))
 
 enum netfs_read_source {
 	NETFS_FILL_WITH_ZEROES,
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..90381858d901 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -212,6 +212,12 @@ static inline void page_init_poison(struct page *page, size_t size)
 }
 #endif
 
+static unsigned long *folio_flags(struct folio *folio)
+{
+	VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
+	return &folio->page.flags;
+}
+
 /*
  * Page flags policies wrt compound pages
  *
@@ -260,30 +266,44 @@ static inline void page_init_poison(struct page *page, size_t size)
  * Macros to create function definitions for page flags
  */
 #define TESTPAGEFLAG(uname, lname, policy)				\
+static __always_inline int Folio##uname(struct folio *folio)		\
+	{ return test_bit(PG_##lname, folio_flags(folio)); }		\
 static __always_inline int Page##uname(struct page *page)		\
 	{ return test_bit(PG_##lname, &policy(page, 0)->flags); }
 
 #define SETPAGEFLAG(uname, lname, policy)				\
+static __always_inline void SetFolio##uname(struct folio *folio)	\
+	{ set_bit(PG_##lname, folio_flags(folio)); }			\
 static __always_inline void SetPage##uname(struct page *page)		\
 	{ set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define CLEARPAGEFLAG(uname, lname, policy)				\
+static __always_inline void ClearFolio##uname(struct folio *folio)	\
+	{ clear_bit(PG_##lname, folio_flags(folio)); }			\
 static __always_inline void ClearPage##uname(struct page *page)		\
 	{ clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define __SETPAGEFLAG(uname, lname, policy)				\
+static __always_inline void __SetFolio##uname(struct folio *folio)	\
+	{ __set_bit(PG_##lname, folio_flags(folio)); }			\
 static __always_inline void __SetPage##uname(struct page *page)		\
 	{ __set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define __CLEARPAGEFLAG(uname, lname, policy)				\
+static __always_inline void __ClearFolio##uname(struct folio *folio)	\
+	{ __clear_bit(PG_##lname, folio_flags(folio)); }		\
 static __always_inline void __ClearPage##uname(struct page *page)	\
 	{ __clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define TESTSETFLAG(uname, lname, policy)				\
+static __always_inline int TestSetFolio##uname(struct folio *folio)	\
+	{ return test_and_set_bit(PG_##lname, folio_flags(folio)); }	\
 static __always_inline int TestSetPage##uname(struct page *page)	\
 	{ return test_and_set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define TESTCLEARFLAG(uname, lname, policy)				\
+static __always_inline int TestClearFolio##uname(struct folio *folio)	\
+	{ return test_and_clear_bit(PG_##lname, folio_flags(folio)); }	\
 static __always_inline int TestClearPage##uname(struct page *page)	\
 	{ return test_and_clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
@@ -302,21 +322,27 @@ static __always_inline int TestClearPage##uname(struct page *page)	\
 	TESTCLEARFLAG(uname, lname, policy)
 
 #define TESTPAGEFLAG_FALSE(uname)					\
+static inline int Folio##uname(const struct folio *folio) { return 0; }	\
 static inline int Page##uname(const struct page *page) { return 0; }
 
 #define SETPAGEFLAG_NOOP(uname)						\
+static inline void SetFolio##uname(struct folio *folio) { }		\
 static inline void SetPage##uname(struct page *page) {  }
 
 #define CLEARPAGEFLAG_NOOP(uname)					\
+static inline void ClearFolio##uname(struct folio *folio) { }		\
 static inline void ClearPage##uname(struct page *page) {  }
 
 #define __CLEARPAGEFLAG_NOOP(uname)					\
+static inline void __ClearFolio##uname(struct folio *folio) { }		\
 static inline void __ClearPage##uname(struct page *page) {  }
 
 #define TESTSETFLAG_FALSE(uname)					\
+static inline int TestSetFolio##uname(struct folio *folio) { return 0; } \
 static inline int TestSetPage##uname(struct page *page) { return 0; }
 
 #define TESTCLEARFLAG_FALSE(uname)					\
+static inline int TestClearFolio##uname(struct folio *folio) { return 0; } \
 static inline int TestClearPage##uname(struct page *page) { return 0; }
 
 #define PAGEFLAG_FALSE(uname) TESTPAGEFLAG_FALSE(uname)			\
@@ -393,14 +419,18 @@ PAGEFLAG_FALSE(HighMem)
 #endif
 
 #ifdef CONFIG_SWAP
-static __always_inline int PageSwapCache(struct page *page)
+static __always_inline bool FolioSwapCache(struct folio *folio)
 {
-#ifdef CONFIG_THP_SWAP
-	page = compound_head(page);
-#endif
-	return PageSwapBacked(page) && test_bit(PG_swapcache, &page->flags);
+	return FolioSwapBacked(folio) &&
+			test_bit(PG_swapcache, folio_flags(folio));
 
 }
+
+static __always_inline bool PageSwapCache(struct page *page)
+{
+	return FolioSwapCache(page_folio(page));
+}
+
 SETPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 #else
@@ -478,10 +508,14 @@ static __always_inline int PageMappingFlags(struct page *page)
 	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) != 0;
 }
 
-static __always_inline int PageAnon(struct page *page)
+static __always_inline bool FolioAnon(struct folio *folio)
 {
-	page = compound_head(page);
-	return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+	return ((unsigned long)folio->page.mapping & PAGE_MAPPING_ANON) != 0;
+}
+
+static __always_inline bool PageAnon(struct page *page)
+{
+	return FolioAnon(page_folio(page));
 }
 
 static __always_inline int __PageMovable(struct page *page)
@@ -509,18 +543,16 @@ TESTPAGEFLAG_FALSE(Ksm)
 
 u64 stable_page_flags(struct page *page);
 
-static inline int PageUptodate(struct page *page)
+static inline int FolioUptodate(struct folio *folio)
 {
-	int ret;
-	page = compound_head(page);
-	ret = test_bit(PG_uptodate, &(page)->flags);
+	int ret = test_bit(PG_uptodate, folio_flags(folio));
 	/*
 	 * Must ensure that the data we read out of the page is loaded
 	 * _after_ we've loaded page->flags to check for PageUptodate.
 	 * We can skip the barrier if the page is not uptodate, because
 	 * we wouldn't be reading anything from it.
 	 *
-	 * See SetPageUptodate() for the other side of the story.
+	 * See SetFolioUptodate() for the other side of the story.
 	 */
 	if (ret)
 		smp_rmb();
@@ -528,23 +560,36 @@ static inline int PageUptodate(struct page *page)
 	return ret;
 }
 
-static __always_inline void __SetPageUptodate(struct page *page)
+static inline int PageUptodate(struct page *page)
+{
+	return FolioUptodate(page_folio(page));
+}
+
+static __always_inline void __SetFolioUptodate(struct folio *folio)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	smp_wmb();
-	__set_bit(PG_uptodate, &page->flags);
+	__set_bit(PG_uptodate, folio_flags(folio));
 }
 
-static __always_inline void SetPageUptodate(struct page *page)
+static __always_inline void SetFolioUptodate(struct folio *folio)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	/*
 	 * Memory barrier must be issued before setting the PG_uptodate bit,
 	 * so that all previous stores issued in order to bring the page
 	 * uptodate are actually visible before PageUptodate becomes true.
 	 */
 	smp_wmb();
-	set_bit(PG_uptodate, &page->flags);
+	set_bit(PG_uptodate, folio_flags(folio));
+}
+
+static __always_inline void __SetPageUptodate(struct page *page)
+{
+	__SetFolioUptodate((struct folio *)page);
+}
+
+static __always_inline void SetPageUptodate(struct page *page)
+{
+	SetFolioUptodate((struct folio *)page);
 }
 
 CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
@@ -569,6 +614,17 @@ static inline void set_page_writeback_keepwrite(struct page *page)
 
 __PAGEFLAG(Head, head, PF_ANY) CLEARPAGEFLAG(Head, head, PF_ANY)
 
+/* Whether there are one or multiple pages in a folio */
+static inline bool FolioSingle(struct folio *folio)
+{
+	return !FolioHead(folio);
+}
+
+static inline bool FolioMulti(struct folio *folio)
+{
+	return FolioHead(folio);
+}
+
 static __always_inline void set_compound_head(struct page *page, struct page *head)
 {
 	WRITE_ONCE(page->compound_head, (unsigned long)head + 1);
@@ -592,12 +648,15 @@ static inline void ClearPageCompound(struct page *page)
 #ifdef CONFIG_HUGETLB_PAGE
 int PageHuge(struct page *page);
 int PageHeadHuge(struct page *page);
+static inline bool FolioHuge(struct folio *folio)
+{
+	return PageHeadHuge(&folio->page);
+}
 #else
 TESTPAGEFLAG_FALSE(Huge)
 TESTPAGEFLAG_FALSE(HeadHuge)
 #endif
 
-
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 /*
  * PageHuge() only returns true for hugetlbfs pages, but not for
@@ -844,6 +903,11 @@ static inline int page_has_private(struct page *page)
 	return !!(page->flags & PAGE_FLAGS_PRIVATE);
 }
 
+static inline bool folio_has_private(struct folio *folio)
+{
+	return page_has_private(&folio->page);
+}
+
 #undef PF_ANY
 #undef PF_HEAD
 #undef PF_ONLY_HEAD
-- 
2.29.2



* [PATCH v3 08/25] mm: Handle per-folio private data
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 07/25] mm: Create FolioFlags Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 09/25] mm: Add folio_index, folio_page and folio_contains Matthew Wilcox (Oracle)
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add folio_private() and set_folio_private() which mirror page_private()
and set_page_private() -- ie folio private data is the same as page
private data.

Turn attach_page_private() into attach_folio_private() and reimplement
attach_page_private() as a wrapper.  No filesystem which uses page private
data currently supports compound pages, so we're free to define the rules.
attach_page_private() may only be called on a head page; if you want
to add private data to a tail page, you can call set_page_private()
directly (and shouldn't increment the page refcount!  That should be
done when adding private data to the head page / folio).
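
An illustrative caller (where "data" stands for whatever the filesystem
attaches, e.g. buffer heads):

	/* when setting up the folio: */
	attach_folio_private(folio, data);	/* takes a folio reference */

	/* when tearing it down: */
	data = detach_folio_private(folio);	/* NULL if nothing attached */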

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm_types.h | 16 ++++++++++++++
 include/linux/pagemap.h  | 48 ++++++++++++++++++++++++----------------
 2 files changed, 45 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 875dc6cd6ad2..750184130074 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -258,6 +258,12 @@ static inline atomic_t *compound_pincount_ptr(struct page *page)
 #define PAGE_FRAG_CACHE_MAX_SIZE	__ALIGN_MASK(32768, ~PAGE_MASK)
 #define PAGE_FRAG_CACHE_MAX_ORDER	get_order(PAGE_FRAG_CACHE_MAX_SIZE)
 
+/*
+ * page_private can be used on tail pages.  However, PagePrivate is only
+ * checked by the VM on the head page.  So page_private on the tail pages
+ * should be used for data that's ancillary to the head page (eg attaching
+ * buffer heads to tail pages after attaching buffer heads to the head page)
+ */
 #define page_private(page)		((page)->private)
 
 static inline void set_page_private(struct page *page, unsigned long private)
@@ -265,6 +271,16 @@ static inline void set_page_private(struct page *page, unsigned long private)
 	page->private = private;
 }
 
+static inline unsigned long folio_private(struct folio *folio)
+{
+	return folio->page.private;
+}
+
+static inline void set_folio_private(struct folio *folio, unsigned long v)
+{
+	folio->page.private = v;
+}
+
 struct page_frag_cache {
 	void * va;
 #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index fda84e88b2ba..83d24b41fb04 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -245,42 +245,52 @@ static inline int page_cache_add_speculative(struct page *page, int count)
 }
 
 /**
- * attach_page_private - Attach private data to a page.
- * @page: Page to attach data to.
- * @data: Data to attach to page.
+ * attach_folio_private - Attach private data to a folio.
+ * @folio: Folio to attach data to.
+ * @data: Data to attach to folio.
  *
- * Attaching private data to a page increments the page's reference count.
- * The data must be detached before the page will be freed.
+ * Attaching private data to a folio increments the page's reference count.
+ * The data must be detached before the folio will be freed.
  */
-static inline void attach_page_private(struct page *page, void *data)
+static inline void attach_folio_private(struct folio *folio, void *data)
 {
-	get_page(page);
-	set_page_private(page, (unsigned long)data);
-	SetPagePrivate(page);
+	get_folio(folio);
+	set_folio_private(folio, (unsigned long)data);
+	SetFolioPrivate(folio);
 }
 
 /**
- * detach_page_private - Detach private data from a page.
- * @page: Page to detach data from.
+ * detach_folio_private - Detach private data from a folio.
+ * @folio: Folio to detach data from.
  *
- * Removes the data that was previously attached to the page and decrements
+ * Removes the data that was previously attached to the folio and decrements
  * the refcount on the page.
  *
- * Return: Data that was attached to the page.
+ * Return: Data that was attached to the folio.
  */
-static inline void *detach_page_private(struct page *page)
+static inline void *detach_folio_private(struct folio *folio)
 {
-	void *data = (void *)page_private(page);
+	void *data = (void *)folio_private(folio);
 
-	if (!PagePrivate(page))
+	if (!FolioPrivate(folio))
 		return NULL;
-	ClearPagePrivate(page);
-	set_page_private(page, 0);
-	put_page(page);
+	ClearFolioPrivate(folio);
+	set_folio_private(folio, 0);
+	put_folio(folio);
 
 	return data;
 }
 
+static inline void attach_page_private(struct page *page, void *data)
+{
+	attach_folio_private((struct folio *)page, data);
+}
+
+static inline void *detach_page_private(struct page *page)
+{
+	return detach_folio_private((struct folio *)page);
+}
+
 #ifdef CONFIG_NUMA
 extern struct page *__page_cache_alloc(gfp_t gfp);
 #else
-- 
2.29.2



* [PATCH v3 09/25] mm: Add folio_index, folio_page and folio_contains
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (7 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 08/25] mm: Handle per-folio private data Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 10/25] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

folio_index() is the equivalent of page_index() for folios.  folio_page()
finds the page in a folio for a page cache index.  folio_contains()
tells you whether a folio contains a particular page cache index.
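
For example (sketch; "index" is a page cache index):

	if (folio_contains(folio, index)) {
		struct page *page = folio_page(folio, index);
		/* page is the exact page backing "index" within the folio */
	}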

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 83d24b41fb04..86956e97cd5e 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -447,6 +447,29 @@ static inline bool thp_contains(struct page *head, pgoff_t index)
 	return page_index(head) == (index & ~(thp_nr_pages(head) - 1UL));
 }
 
+static inline pgoff_t folio_index(struct folio *folio)
+{
+        if (unlikely(FolioSwapCache(folio)))
+                return __page_file_index(&folio->page);
+        return folio->page.index;
+}
+
+static inline struct page *folio_page(struct folio *folio, pgoff_t index)
+{
+	index -= folio_index(folio);
+	VM_BUG_ON_FOLIO(index >= folio_nr_pages(folio), folio);
+	return &folio->page + index;
+}
+
+/* Does this folio contain this index? */
+static inline bool folio_contains(struct folio *folio, pgoff_t index)
+{
+	/* HugeTLBfs indexes the page cache in units of hpage_size */
+	if (PageHuge(&folio->page))
+		return folio->page.index == index;
+	return index - folio_index(folio) < folio_nr_pages(folio);
+}
+
 /*
  * Given the page we found in the page cache, return the page corresponding
  * to this index in the file
-- 
2.29.2



* [PATCH v3 10/25] mm/util: Add folio_mapping and folio_file_mapping
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (8 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 09/25] mm: Add folio_index, folio_page and folio_contains Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 11/25] mm/memcg: Add folio_memcg, lock_folio_memcg and unlock_folio_memcg Matthew Wilcox (Oracle)
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

These are the folio equivalent of page_mapping() and page_file_mapping().
Adjust page_file_mapping() and page_mapping_file() to use folios
internally.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 23 +++++++++++++++--------
 mm/swapfile.c      |  6 +++---
 mm/util.c          | 20 ++++++++++----------
 3 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d71c5776b571..c6b708007018 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1589,17 +1589,25 @@ void page_address_init(void);
 
 extern void *page_rmapping(struct page *page);
 extern struct anon_vma *page_anon_vma(struct page *page);
-extern struct address_space *page_mapping(struct page *page);
+struct address_space *folio_mapping(struct folio *);
+struct address_space *__folio_file_mapping(struct folio *);
 
-extern struct address_space *__page_file_mapping(struct page *);
+static inline struct address_space *page_mapping(struct page *page)
+{
+	return folio_mapping(page_folio(page));
+}
 
-static inline
-struct address_space *page_file_mapping(struct page *page)
+static inline struct address_space *folio_file_mapping(struct folio *folio)
 {
-	if (unlikely(PageSwapCache(page)))
-		return __page_file_mapping(page);
+	if (unlikely(FolioSwapCache(folio)))
+		return __folio_file_mapping(folio);
 
-	return page->mapping;
+	return folio->page.mapping;
+}
+
+static inline struct address_space *page_file_mapping(struct page *page)
+{
+	return folio_file_mapping(page_folio(page));
 }
 
 extern pgoff_t __page_file_index(struct page *page);
@@ -1616,7 +1624,6 @@ static inline pgoff_t page_index(struct page *page)
 }
 
 bool page_mapped(struct page *page);
-struct address_space *page_mapping(struct page *page);
 struct address_space *page_mapping_file(struct page *page);
 
 /*
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 12a18b896fce..b68e94d5b112 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3551,11 +3551,11 @@ struct swap_info_struct *page_swap_info(struct page *page)
 /*
  * out-of-line __page_file_ methods to avoid include hell.
  */
-struct address_space *__page_file_mapping(struct page *page)
+struct address_space *__folio_file_mapping(struct folio *folio)
 {
-	return page_swap_info(page)->swap_file->f_mapping;
+	return page_swap_info(&folio->page)->swap_file->f_mapping;
 }
-EXPORT_SYMBOL_GPL(__page_file_mapping);
+EXPORT_SYMBOL_GPL(__folio_file_mapping);
 
 pgoff_t __page_file_index(struct page *page)
 {
diff --git a/mm/util.c b/mm/util.c
index c37e24d5fa43..c052c39b9f1c 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -686,39 +686,39 @@ struct anon_vma *page_anon_vma(struct page *page)
 	return __page_rmapping(page);
 }
 
-struct address_space *page_mapping(struct page *page)
+struct address_space *folio_mapping(struct folio *folio)
 {
 	struct address_space *mapping;
 
-	page = compound_head(page);
-
 	/* This happens if someone calls flush_dcache_page on slab page */
-	if (unlikely(PageSlab(page)))
+	if (unlikely(FolioSlab(folio)))
 		return NULL;
 
-	if (unlikely(PageSwapCache(page))) {
+	if (unlikely(FolioSwapCache(folio))) {
 		swp_entry_t entry;
 
-		entry.val = page_private(page);
+		entry.val = folio_private(folio);
 		return swap_address_space(entry);
 	}
 
-	mapping = page->mapping;
+	mapping = folio->page.mapping;
 	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
 		return NULL;
 
 	return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
 }
-EXPORT_SYMBOL(page_mapping);
+EXPORT_SYMBOL(folio_mapping);
 
 /*
  * For file cache pages, return the address_space, otherwise return NULL
  */
 struct address_space *page_mapping_file(struct page *page)
 {
-	if (unlikely(PageSwapCache(page)))
+	struct folio *folio = page_folio(page);
+
+	if (unlikely(FolioSwapCache(folio)))
 		return NULL;
-	return page_mapping(page);
+	return folio_mapping(folio);
 }
 
 /* Slow path of page_mapcount() for compound pages */
-- 
2.29.2



* [PATCH v3 11/25] mm/memcg: Add folio_memcg, lock_folio_memcg and unlock_folio_memcg
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (9 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 10/25] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 12/25] mm/memcg: Add mem_cgroup_folio_lruvec Matthew Wilcox (Oracle)
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The memcontrol code already assumes that page_memcg() will be called
with a non-tail page, so make that more natural by wrapping it with a
folio API.
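
Usage mirrors the page API (sketch):

	struct mem_cgroup *memcg;

	memcg = lock_folio_memcg(folio);
	/* the folio's memcg binding is stable until we unlock */
	unlock_folio_memcg(folio);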

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/memcontrol.h | 16 ++++++++++++++++
 mm/memcontrol.c            | 36 ++++++++++++++++++++++++------------
 2 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 7a38a1517a05..89aaa22506e6 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -383,6 +383,11 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
 	return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
 }
 
+static inline struct mem_cgroup *folio_memcg(struct folio *folio)
+{
+	return page_memcg(&folio->page);
+}
+
 /*
  * page_memcg_rcu - locklessly get the memory cgroup associated with a page
  * @page: a pointer to the page struct
@@ -869,8 +874,10 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *memcg);
 extern bool cgroup_memory_noswap;
 #endif
 
+struct mem_cgroup *lock_folio_memcg(struct folio *folio);
 struct mem_cgroup *lock_page_memcg(struct page *page);
 void __unlock_page_memcg(struct mem_cgroup *memcg);
+void unlock_folio_memcg(struct folio *folio);
 void unlock_page_memcg(struct page *page);
 
 /*
@@ -1298,6 +1305,11 @@ mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
 {
 }
 
+static inline struct mem_cgroup *lock_folio_memcg(struct folio *folio)
+{
+	return NULL;
+}
+
 static inline struct mem_cgroup *lock_page_memcg(struct page *page)
 {
 	return NULL;
@@ -1307,6 +1319,10 @@ static inline void __unlock_page_memcg(struct mem_cgroup *memcg)
 {
 }
 
+static inline void unlock_folio_memcg(struct folio *folio)
+{
+}
+
 static inline void unlock_page_memcg(struct page *page)
 {
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ed5cc78a8dbf..c3c0c8124b09 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2139,19 +2139,18 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *memcg)
 }
 
 /**
- * lock_page_memcg - lock a page and memcg binding
- * @page: the page
+ * lock_folio_memcg - lock a folio and memcg binding
+ * @folio: the folio
  *
- * This function protects unlocked LRU pages from being moved to
+ * This function protects unlocked LRU folios from being moved to
  * another cgroup.
  *
  * It ensures lifetime of the returned memcg. Caller is responsible
- * for the lifetime of the page; __unlock_page_memcg() is available
- * when @page might get freed inside the locked section.
+ * for the lifetime of the folio; __unlock_folio_memcg() is available
+ * when @folio might get freed inside the locked section.
  */
-struct mem_cgroup *lock_page_memcg(struct page *page)
+struct mem_cgroup *lock_folio_memcg(struct folio *folio)
 {
-	struct page *head = compound_head(page); /* rmap on tail pages */
 	struct mem_cgroup *memcg;
 	unsigned long flags;
 
@@ -2171,7 +2170,7 @@ struct mem_cgroup *lock_page_memcg(struct page *page)
 	if (mem_cgroup_disabled())
 		return NULL;
 again:
-	memcg = page_memcg(head);
+	memcg = folio_memcg(folio);
 	if (unlikely(!memcg))
 		return NULL;
 
@@ -2185,7 +2184,7 @@ struct mem_cgroup *lock_page_memcg(struct page *page)
 		return memcg;
 
 	spin_lock_irqsave(&memcg->move_lock, flags);
-	if (memcg != page_memcg(head)) {
+	if (memcg != folio_memcg(folio)) {
 		spin_unlock_irqrestore(&memcg->move_lock, flags);
 		goto again;
 	}
@@ -2200,6 +2199,12 @@ struct mem_cgroup *lock_page_memcg(struct page *page)
 
 	return memcg;
 }
+EXPORT_SYMBOL(lock_folio_memcg);
+
+struct mem_cgroup *lock_page_memcg(struct page *page)
+{
+	return lock_folio_memcg(page_folio(page));
+}
 EXPORT_SYMBOL(lock_page_memcg);
 
 /**
@@ -2222,15 +2227,22 @@ void __unlock_page_memcg(struct mem_cgroup *memcg)
 	rcu_read_unlock();
 }
 
+/**
+ * unlock_folio_memcg - unlock a folio and memcg binding
+ * @folio: the folio
+ */
+void unlock_folio_memcg(struct folio *folio)
+{
+	__unlock_page_memcg(folio_memcg(folio));
+}
+
 /**
  * unlock_page_memcg - unlock a page and memcg binding
  * @page: the page
  */
 void unlock_page_memcg(struct page *page)
 {
-	struct page *head = compound_head(page);
-
-	__unlock_page_memcg(page_memcg(head));
+	unlock_folio_memcg(page_folio(page));
 }
 EXPORT_SYMBOL(unlock_page_memcg);
 
-- 
2.29.2



* [PATCH v3 12/25] mm/memcg: Add mem_cgroup_folio_lruvec
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (10 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 11/25] mm/memcg: Add folio_memcg, lock_folio_memcg and unlock_folio_memcg Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 13/25] mm: Add unlock_folio Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

mem_cgroup_page_lruvec() already expects a head page, so this will add some
type safety once we can remove mem_cgroup_page_lruvec().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/memcontrol.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 89aaa22506e6..ec7ecfc0e47b 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1454,6 +1454,12 @@ static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
 }
 #endif /* CONFIG_MEMCG */
 
+static inline struct lruvec *mem_cgroup_folio_lruvec(struct folio *folio,
+						    struct pglist_data *pgdat)
+{
+	return mem_cgroup_page_lruvec(&folio->page, pgdat);
+}
+
 static inline void __inc_lruvec_kmem_state(void *p, enum node_stat_item idx)
 {
 	__mod_lruvec_kmem_state(p, idx, 1);
-- 
2.29.2



* [PATCH v3 13/25] mm: Add unlock_folio
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (11 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 12/25] mm/memcg: Add mem_cgroup_folio_lruvec Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 14/25] mm: Add lock_folio Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Convert unlock_page() to call unlock_folio().  By using a folio we avoid
a call to compound_head().  This shortens the function from 39 bytes to
25 and removes 4 instructions on x86-64.  Those instructions are currently
pushed into each caller, but subsequent patches will convert many of the
callers to operate on folios.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 16 +++++++++++++++-
 mm/filemap.c            | 27 ++++++++++-----------------
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 86956e97cd5e..5fcab5e1787c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -623,9 +623,23 @@ extern int __lock_page_killable(struct page *page);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
-extern void unlock_page(struct page *page);
+void unlock_folio(struct folio *folio);
 extern void unlock_page_fscache(struct page *page);
 
+/**
+ * unlock_page - Unlock a locked page.
+ * @page: The page.
+ *
+ * Unlocks the page and wakes up any thread sleeping on the page lock.
+ *
+ * Context: May be called from interrupt or process context.  May not be
+ * called from NMI context.
+ */
+static inline void unlock_page(struct page *page)
+{
+	return unlock_folio(page_folio(page));
+}
+
 /*
  * Return true if the page was successfully locked
  */
diff --git a/mm/filemap.c b/mm/filemap.c
index 4417fd15d633..b639651d1573 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1407,29 +1407,22 @@ static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile void *mem
 #endif
 
 /**
- * unlock_page - unlock a locked page
- * @page: the page
+ * unlock_folio - Unlock a locked folio.
+ * @folio: The folio.
  *
- * Unlocks the page and wakes up sleepers in wait_on_page_locked().
- * Also wakes sleepers in wait_on_page_writeback() because the wakeup
- * mechanism between PageLocked pages and PageWriteback pages is shared.
- * But that's OK - sleepers in wait_on_page_writeback() just go back to sleep.
+ * Unlocks the folio and wakes up any thread sleeping on the page lock.
  *
- * Note that this depends on PG_waiters being the sign bit in the byte
- * that contains PG_locked - thus the BUILD_BUG_ON(). That allows us to
- * clear the PG_locked bit and test PG_waiters at the same time fairly
- * portably (architectures that do LL/SC can test any bit, while x86 can
- * test the sign bit).
+ * Context: May be called from interrupt or process context.  May not be
+ * called from NMI context.
  */
-void unlock_page(struct page *page)
+void unlock_folio(struct folio *folio)
 {
 	BUILD_BUG_ON(PG_waiters != 7);
-	page = compound_head(page);
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
-		wake_up_page_bit(page, PG_locked);
+	VM_BUG_ON_FOLIO(!FolioLocked(folio), folio);
+	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio)))
+		wake_up_page_bit(&folio->page, PG_locked);
 }
-EXPORT_SYMBOL(unlock_page);
+EXPORT_SYMBOL(unlock_folio);
 
 /**
  * unlock_page_fscache - Unlock a page pinned with PG_fscache
-- 
2.29.2



* [PATCH v3 14/25] mm: Add lock_folio
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (12 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 13/25] mm: Add unlock_folio Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 15/25] mm: Add lock_folio_killable Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is like lock_page() but for use by callers who know they have a folio.
Convert __lock_page() to be __lock_folio().  This saves one call to
compound_head() per contended call to lock_page().
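
With unlock_folio() from the previous patch, a folio-aware caller looks
like this (sketch):

	struct folio *folio = page_folio(page);

	lock_folio(folio);
	/* the whole folio is locked; no compound_head() on this path */
	unlock_folio(folio);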

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 21 +++++++++++++++------
 mm/filemap.c            | 29 +++++++++++++++--------------
 2 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 5fcab5e1787c..0e9ad46e8d55 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -618,7 +618,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 	return true;
 }
 
-extern void __lock_page(struct page *page);
+void __lock_folio(struct folio *folio);
 extern int __lock_page_killable(struct page *page);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
@@ -640,13 +640,24 @@ static inline void unlock_page(struct page *page)
 	return unlock_folio(page_folio(page));
 }
 
+static inline bool trylock_folio(struct folio *folio)
+{
+	return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio)));
+}
+
 /*
  * Return true if the page was successfully locked
  */
 static inline int trylock_page(struct page *page)
 {
-	page = compound_head(page);
-	return (likely(!test_and_set_bit_lock(PG_locked, &page->flags)));
+	return trylock_folio(page_folio(page));
+}
+
+static inline void lock_folio(struct folio *folio)
+{
+	might_sleep();
+	if (!trylock_folio(folio))
+		__lock_folio(folio);
 }
 
 /*
@@ -654,9 +665,7 @@ static inline int trylock_page(struct page *page)
  */
 static inline void lock_page(struct page *page)
 {
-	might_sleep();
-	if (!trylock_page(page))
-		__lock_page(page);
+	lock_folio(page_folio(page));
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index b639651d1573..f95967ef16da 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1159,7 +1159,7 @@ static void wake_up_page(struct page *page, int bit)
  */
 enum behavior {
 	EXCLUSIVE,	/* Hold ref to page and take the bit when woken, like
-			 * __lock_page() waiting on then setting PG_locked.
+			 * __lock_folio() waiting on then setting PG_locked.
 			 */
 	SHARED,		/* Hold ref to page and check the bit when woken, like
 			 * wait_on_page_writeback() waiting on PG_writeback.
@@ -1505,17 +1505,16 @@ void page_endio(struct page *page, bool is_write, int err)
 EXPORT_SYMBOL_GPL(page_endio);
 
 /**
- * __lock_page - get a lock on the page, assuming we need to sleep to get it
- * @__page: the page to lock
+ * __lock_folio - Get a lock on the folio, assuming we need to sleep to get it.
+ * @folio: The folio to lock
  */
-void __lock_page(struct page *__page)
+void __lock_folio(struct folio *folio)
 {
-	struct page *page = compound_head(__page);
-	wait_queue_head_t *q = page_waitqueue(page);
-	wait_on_page_bit_common(q, page, PG_locked, TASK_UNINTERRUPTIBLE,
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
-EXPORT_SYMBOL(__lock_page);
+EXPORT_SYMBOL(__lock_folio);
 
 int __lock_page_killable(struct page *__page)
 {
@@ -1590,10 +1589,10 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			return 0;
 		}
 	} else {
-		__lock_page(page);
+		__lock_folio(page_folio(page));
 	}
-	return 1;
 
+	return 1;
 }
 
 /**
@@ -2738,7 +2737,9 @@ loff_t mapping_seek_hole_data(struct address_space *mapping, loff_t start,
 static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 				     struct file **fpin)
 {
-	if (trylock_page(page))
+	struct folio *folio = page_folio(page);
+
+	if (trylock_folio(folio))
 		return 1;
 
 	/*
@@ -2751,7 +2752,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
-		if (__lock_page_killable(page)) {
+		if (__lock_page_killable(&folio->page)) {
 			/*
 			 * We didn't have the right flags to drop the mmap_lock,
 			 * but all fault_handlers only check for fatal signals
@@ -2763,11 +2764,11 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 			return 0;
 		}
 	} else
-		__lock_page(page);
+		__lock_folio(folio);
+
 	return 1;
 }
 
-
 /*
  * Synchronous readahead happens when we don't even find a page in the page
  * cache at all.  We don't want to perform IO under the mmap sem, so if we have
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 15/25] mm: Add lock_folio_killable
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (13 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 14/25] mm: Add lock_folio Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 16/25] mm: Convert lock_page_async to lock_folio_async Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is like lock_page_killable() but for use by callers who
know they have a folio.  Convert __lock_page_killable() to be
__lock_folio_killable().  This saves one call to compound_head() per
contended call to lock_page_killable().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 15 ++++++++++-----
 mm/filemap.c            | 17 +++++++++--------
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 0e9ad46e8d55..93a4ab9feaa8 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -619,7 +619,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 }
 
 void __lock_folio(struct folio *folio);
-extern int __lock_page_killable(struct page *page);
+int __lock_folio_killable(struct folio *folio);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
@@ -668,6 +668,14 @@ static inline void lock_page(struct page *page)
 	lock_folio(page_folio(page));
 }
 
+static inline int lock_folio_killable(struct folio *folio)
+{
+	might_sleep();
+	if (!trylock_folio(folio))
+		return __lock_folio_killable(folio);
+	return 0;
+}
+
 /*
  * lock_page_killable is like lock_page but can be interrupted by fatal
  * signals.  It returns 0 if it locked the page and -EINTR if it was
@@ -675,10 +683,7 @@ static inline void lock_page(struct page *page)
  */
 static inline int lock_page_killable(struct page *page)
 {
-	might_sleep();
-	if (!trylock_page(page))
-		return __lock_page_killable(page);
-	return 0;
+	return lock_folio_killable(page_folio(page));
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index f95967ef16da..c378b28c2bdc 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1516,14 +1516,13 @@ void __lock_folio(struct folio *folio)
 }
 EXPORT_SYMBOL(__lock_folio);
 
-int __lock_page_killable(struct page *__page)
+int __lock_folio_killable(struct folio *folio)
 {
-	struct page *page = compound_head(__page);
-	wait_queue_head_t *q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, PG_locked, TASK_KILLABLE,
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
-EXPORT_SYMBOL_GPL(__lock_page_killable);
+EXPORT_SYMBOL_GPL(__lock_folio_killable);
 
 int __lock_page_async(struct page *page, struct wait_page_queue *wait)
 {
@@ -1565,6 +1564,8 @@ int __lock_page_async(struct page *page, struct wait_page_queue *wait)
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			 unsigned int flags)
 {
+	struct folio *folio = page_folio(page);
+
 	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
@@ -1583,13 +1584,13 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 	if (flags & FAULT_FLAG_KILLABLE) {
 		int ret;
 
-		ret = __lock_page_killable(page);
+		ret = __lock_folio_killable(folio);
 		if (ret) {
 			mmap_read_unlock(mm);
 			return 0;
 		}
 	} else {
-		__lock_folio(page_folio(page));
+		__lock_folio(folio);
 	}
 
 	return 1;
@@ -2752,7 +2753,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
-		if (__lock_page_killable(&folio->page)) {
+		if (__lock_folio_killable(folio)) {
 			/*
 			 * We didn't have the right flags to drop the mmap_lock,
 			 * but all fault_handlers only check for fatal signals
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 16/25] mm: Convert lock_page_async to lock_folio_async
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (14 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 15/25] mm: Add lock_folio_killable Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 17/25] mm/filemap: Convert end_page_writeback to end_folio_writeback Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

When the caller already has a folio, this saves a call to compound_head().
If not, the call to compound_head() is merely moved.
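
A minimal usage sketch (the caller shape is assumed, not taken from this
patch): with IOCB_WAITQ, a read path can queue the wait callback instead
of sleeping when the folio is already locked:

	error = lock_folio_async(folio, iocb->ki_waitq);
	if (error == -EIOCBQUEUED)
		return error;	/* the callback in ki_waitq reruns the read later */
	/* error == 0: the folio is now locked */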

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/io_uring.c           |  2 +-
 include/linux/pagemap.h | 14 +++++++-------
 mm/filemap.c            | 12 ++++++------
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 03748faa5295..2627160ffd4c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3398,7 +3398,7 @@ static int io_read_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 }
 
 /*
- * This is our waitqueue callback handler, registered through lock_page_async()
+ * This is our waitqueue callback handler, registered through lock_folio_async()
  * when we initially tried to do the IO with the iocb armed our waitqueue.
  * This gets called when the page is unlocked, and we generally expect that to
  * happen when the page IO is completed and the page is now uptodate. This will
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 93a4ab9feaa8..131d1aa2af61 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -620,7 +620,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 
 void __lock_folio(struct folio *folio);
 int __lock_folio_killable(struct folio *folio);
-extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
+int __lock_folio_async(struct folio *folio, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
 void unlock_folio(struct folio *folio);
@@ -687,18 +687,18 @@ static inline int lock_page_killable(struct page *page)
 }
 
 /*
- * lock_page_async - Lock the page, unless this would block. If the page
- * is already locked, then queue a callback when the page becomes unlocked.
+ * lock_folio_async - Lock the folio, unless this would block. If the folio
+ * is already locked, then queue a callback when the folio becomes unlocked.
  * This callback can then retry the operation.
  *
- * Returns 0 if the page is locked successfully, or -EIOCBQUEUED if the page
+ * Returns 0 if the folio is locked successfully, or -EIOCBQUEUED if the folio
  * was already locked and the callback defined in 'wait' was queued.
  */
-static inline int lock_page_async(struct page *page,
+static inline int lock_folio_async(struct folio *folio,
 				  struct wait_page_queue *wait)
 {
-	if (!trylock_page(page))
-		return __lock_page_async(page, wait);
+	if (!trylock_folio(folio))
+		return __lock_folio_async(folio, wait);
 	return 0;
 }
 
diff --git a/mm/filemap.c b/mm/filemap.c
index c378b28c2bdc..a54eb4641385 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1524,18 +1524,18 @@ int __lock_folio_killable(struct folio *folio)
 }
 EXPORT_SYMBOL_GPL(__lock_folio_killable);
 
-int __lock_page_async(struct page *page, struct wait_page_queue *wait)
+int __lock_folio_async(struct folio *folio, struct wait_page_queue *wait)
 {
-	struct wait_queue_head *q = page_waitqueue(page);
+	struct wait_queue_head *q = page_waitqueue(&folio->page);
 	int ret = 0;
 
-	wait->page = page;
+	wait->page = &folio->page;
 	wait->bit_nr = PG_locked;
 
 	spin_lock_irq(&q->lock);
 	__add_wait_queue_entry_tail(q, &wait->wait);
-	SetPageWaiters(page);
-	ret = !trylock_page(page);
+	SetFolioWaiters(folio);
+	ret = !trylock_folio(folio);
 	/*
 	 * If we were successful now, we know we're still on the
 	 * waitqueue as we're still under the lock. This means it's
@@ -2293,7 +2293,7 @@ static int filemap_update_page(struct kiocb *iocb,
 			put_and_wait_on_page_locked(page, TASK_KILLABLE);
 			return AOP_TRUNCATED_PAGE;
 		}
-		error = __lock_page_async(page, iocb->ki_waitq);
+		error = __lock_folio_async(page_folio(page), iocb->ki_waitq);
 		if (error)
 			return error;
 	}
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 17/25] mm/filemap: Convert end_page_writeback to end_folio_writeback
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 16/25] mm: Convert lock_page_async to lock_folio_async Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 18/25] mm: Convert wait_on_page_bit to wait_on_folio_bit Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add a wrapper function for users that are not yet converted to folios.
With a distro config, this function shrinks from 213 bytes to 105 bytes
due to elimination of repeated calls to compound_head().
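
The saving comes from the flag accessors: each Page/ClearPage call on a
struct page hides a compound_head() lookup, while the Folio variants test
the flags word directly.  A simplified sketch of the difference (not the
exact macro expansions):

	/* struct page accessors: each call re-derives the head page */
	if (PageReclaim(page))			/* compound_head(page) inside */
		ClearPageReclaim(page);		/* and again in here */

	/* struct folio accessors: the pointer already is the head page */
	if (FolioReclaim(folio))		/* tests folio_flags(folio) */
		ClearFolioReclaim(folio);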

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h |  6 +++++-
 mm/filemap.c            | 30 +++++++++++++++---------------
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 131d1aa2af61..67d3badc9fe0 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -758,7 +758,11 @@ static inline void wait_on_page_fscache(struct page *page)
 
 int put_and_wait_on_page_locked(struct page *page, int state);
 void wait_on_page_writeback(struct page *page);
-extern void end_page_writeback(struct page *page);
+void end_folio_writeback(struct folio *folio);
+static inline void end_page_writeback(struct page *page)
+{
+	return end_folio_writeback(page_folio(page));
+}
 void wait_for_stable_page(struct page *page);
 
 void page_endio(struct page *page, bool is_write, int err);
diff --git a/mm/filemap.c b/mm/filemap.c
index a54eb4641385..65008c42e47d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1147,11 +1147,11 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 	spin_unlock_irqrestore(&q->lock, flags);
 }
 
-static void wake_up_page(struct page *page, int bit)
+static void wake_up_folio(struct folio *folio, int bit)
 {
-	if (!PageWaiters(page))
+	if (!FolioWaiters(folio))
 		return;
-	wake_up_page_bit(page, bit);
+	wake_up_page_bit(&folio->page, bit);
 }
 
 /*
@@ -1443,10 +1443,10 @@ void unlock_page_fscache(struct page *page)
 EXPORT_SYMBOL(unlock_page_fscache);
 
 /**
- * end_page_writeback - end writeback against a page
- * @page: the page
+ * end_folio_writeback - End writeback against a folio.
+ * @folio: The folio.
  */
-void end_page_writeback(struct page *page)
+void end_folio_writeback(struct folio *folio)
 {
 	/*
 	 * TestClearPageReclaim could be used here but it is an atomic
@@ -1455,26 +1455,26 @@ void end_page_writeback(struct page *page)
 	 * justify taking an atomic operation penalty at the end of
 	 * ever page writeback.
 	 */
-	if (PageReclaim(page)) {
-		ClearPageReclaim(page);
-		rotate_reclaimable_page(page);
+	if (FolioReclaim(folio)) {
+		ClearFolioReclaim(folio);
+		rotate_reclaimable_page(&folio->page);
 	}
 
 	/*
 	 * Writeback does not hold a page reference of its own, relying
 	 * on truncation to wait for the clearing of PG_writeback.
 	 * But here we must make sure that the page is not freed and
-	 * reused before the wake_up_page().
+	 * reused before the wake_up_folio().
 	 */
-	get_page(page);
-	if (!test_clear_page_writeback(page))
+	get_folio(folio);
+	if (!test_clear_page_writeback(&folio->page))
 		BUG();
 
 	smp_mb__after_atomic();
-	wake_up_page(page, PG_writeback);
-	put_page(page);
+	wake_up_folio(folio, PG_writeback);
+	put_folio(folio);
 }
-EXPORT_SYMBOL(end_page_writeback);
+EXPORT_SYMBOL(end_folio_writeback);
 
 /*
  * After completing I/O on a page, call this routine to update the page
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 18/25] mm: Convert wait_on_page_bit to wait_on_folio_bit
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (16 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 17/25] mm/filemap: Convert end_page_writeback to end_folio_writeback Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 19/25] mm: Add wait_for_stable_folio and wait_on_folio_writeback Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

We must deal with folios here, otherwise we'll look up the wrong
waitqueue and fail to receive wakeups.
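
The waitqueue is chosen by hashing the struct page pointer, so a waiter
keyed on a tail page and a waker keyed on the head page would normally
land in different hash buckets.  A minimal sketch of the mismatch, using
page_waitqueue() as it exists at this point in the series:

	struct page *head = compound_head(page);	/* page is a tail page */
	wait_queue_head_t *q_wait = page_waitqueue(page);
	wait_queue_head_t *q_wake = page_waitqueue(head);

	/* q_wait != q_wake in general, so a waiter queued on q_wait never
	 * sees a wake_up() issued on q_wake and sleeps forever. */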

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/afs/write.c          | 31 ++++++++++++-----------
 include/linux/pagemap.h | 19 +++++++++------
 mm/filemap.c            | 54 ++++++++++++++++++-----------------------
 mm/page-writeback.c     |  7 +++---
 4 files changed, 56 insertions(+), 55 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index e672833c99bc..b3dac7afd123 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -915,13 +915,14 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  */
 vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 {
-	struct page *page = thp_head(vmf->page);
+	struct folio *folio = page_folio(vmf->page);
 	struct file *file = vmf->vma->vm_file;
 	struct inode *inode = file_inode(file);
 	struct afs_vnode *vnode = AFS_FS_I(inode);
 	unsigned long priv;
 
-	_enter("{{%llx:%llu}},{%lx}", vnode->fid.vid, vnode->fid.vnode, page->index);
+	_enter("{{%llx:%llu}},{%lx}", vnode->fid.vid, vnode->fid.vnode,
+			folio->page.index);
 
 	sb_start_pagefault(inode->i_sb);
 
@@ -929,32 +930,34 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 	 * be modified.  We then assume the entire page will need writing back.
 	 */
 #ifdef CONFIG_AFS_FSCACHE
-	if (PageFsCache(page) &&
-	    wait_on_page_bit_killable(page, PG_fscache) < 0)
+	if (FolioFsCache(folio) &&
+	    wait_on_folio_bit_killable(folio, PG_fscache) < 0)
 		return VM_FAULT_RETRY;
 #endif
 
-	if (PageWriteback(page) &&
-	    wait_on_page_bit_killable(page, PG_writeback) < 0)
+	if (FolioWriteback(folio) &&
+	    wait_on_folio_bit_killable(folio, PG_writeback) < 0)
 		return VM_FAULT_RETRY;
 
-	if (lock_page_killable(page) < 0)
+	if (lock_folio_killable(folio) < 0)
 		return VM_FAULT_RETRY;
 
 	/* We mustn't change page->private until writeback is complete as that
 	 * details the portion of the page we need to write back and we might
 	 * need to redirty the page if there's a problem.
 	 */
-	wait_on_page_writeback(page);
+	wait_on_page_writeback(&folio->page);
 
-	priv = afs_page_dirty(page, 0, thp_size(page));
+	priv = afs_page_dirty(&folio->page, 0, folio_size(folio));
 	priv = afs_page_dirty_mmapped(priv);
-	if (PagePrivate(page)) {
-		set_page_private(page, priv);
-		trace_afs_page_dirty(vnode, tracepoint_string("mkwrite+"), page);
+	if (FolioPrivate(folio)) {
+		set_folio_private(folio, priv);
+		trace_afs_page_dirty(vnode, tracepoint_string("mkwrite+"),
+				&folio->page);
 	} else {
-		attach_page_private(page, (void *)priv);
-		trace_afs_page_dirty(vnode, tracepoint_string("mkwrite"), page);
+		attach_folio_private(folio, (void *)priv);
+		trace_afs_page_dirty(vnode, tracepoint_string("mkwrite"),
+				&folio->page);
 	}
 	file_update_time(file);
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 67d3badc9fe0..55f3c1a8be3c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -720,8 +720,8 @@ static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
  * This is exported only for wait_on_page_locked/wait_on_page_writeback, etc.,
  * and should not be used directly.
  */
-extern void wait_on_page_bit(struct page *page, int bit_nr);
-extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
+extern void wait_on_folio_bit(struct folio *folio, int bit_nr);
+extern int wait_on_folio_bit_killable(struct folio *folio, int bit_nr);
 
 /* 
  * Wait for a page to be unlocked.
@@ -732,15 +732,17 @@ extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
  */
 static inline void wait_on_page_locked(struct page *page)
 {
-	if (PageLocked(page))
-		wait_on_page_bit(compound_head(page), PG_locked);
+	struct folio *folio = page_folio(page);
+	if (FolioLocked(folio))
+		wait_on_folio_bit(folio, PG_locked);
 }
 
 static inline int wait_on_page_locked_killable(struct page *page)
 {
-	if (!PageLocked(page))
+	struct folio *folio = page_folio(page);
+	if (!FolioLocked(folio))
 		return 0;
-	return wait_on_page_bit_killable(compound_head(page), PG_locked);
+	return wait_on_folio_bit_killable(folio, PG_locked);
 }
 
 /**
@@ -752,8 +754,9 @@ static inline int wait_on_page_locked_killable(struct page *page)
  */
 static inline void wait_on_page_fscache(struct page *page)
 {
-	if (PagePrivate2(page))
-		wait_on_page_bit(compound_head(page), PG_fscache);
+	struct folio *folio = page_folio(page);
+	if (FolioPrivate2(folio))
+		wait_on_folio_bit(folio, PG_fscache);
 }
 
 int put_and_wait_on_page_locked(struct page *page, int state);
diff --git a/mm/filemap.c b/mm/filemap.c
index 65008c42e47d..f68bf0129458 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1074,7 +1074,7 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	 *
 	 * So update the flags atomically, and wake up the waiter
 	 * afterwards to avoid any races. This store-release pairs
-	 * with the load-acquire in wait_on_page_bit_common().
+	 * with the load-acquire in wait_on_folio_bit_common().
 	 */
 	smp_store_release(&wait->flags, flags | WQ_FLAG_WOKEN);
 	wake_up_state(wait->private, mode);
@@ -1155,7 +1155,7 @@ static void wake_up_folio(struct folio *folio, int bit)
 }
 
 /*
- * A choice of three behaviors for wait_on_page_bit_common():
+ * A choice of three behaviors for wait_on_folio_bit_common():
  */
 enum behavior {
 	EXCLUSIVE,	/* Hold ref to page and take the bit when woken, like
@@ -1189,9 +1189,10 @@ static inline bool trylock_page_bit_common(struct page *page, int bit_nr,
 /* How many times do we accept lock stealing from under a waiter? */
 int sysctl_page_lock_unfairness = 5;
 
-static inline int wait_on_page_bit_common(wait_queue_head_t *q,
-	struct page *page, int bit_nr, int state, enum behavior behavior)
+static inline int wait_on_folio_bit_common(struct folio *folio, int bit_nr,
+		int state, enum behavior behavior)
 {
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
 	int unfairness = sysctl_page_lock_unfairness;
 	struct wait_page_queue wait_page;
 	wait_queue_entry_t *wait = &wait_page.wait;
@@ -1200,8 +1201,8 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	unsigned long pflags;
 
 	if (bit_nr == PG_locked &&
-	    !PageUptodate(page) && PageWorkingset(page)) {
-		if (!PageSwapBacked(page)) {
+	    !FolioUptodate(folio) && FolioWorkingset(folio)) {
+		if (!FolioSwapBacked(folio)) {
 			delayacct_thrashing_start();
 			delayacct = true;
 		}
@@ -1211,7 +1212,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 
 	init_wait(wait);
 	wait->func = wake_page_function;
-	wait_page.page = page;
+	wait_page.page = &folio->page;
 	wait_page.bit_nr = bit_nr;
 
 repeat:
@@ -1226,7 +1227,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * Do one last check whether we can get the
 	 * page bit synchronously.
 	 *
-	 * Do the SetPageWaiters() marking before that
+	 * Do the SetFolioWaiters() marking before that
 	 * to let any waker we _just_ missed know they
 	 * need to wake us up (otherwise they'll never
 	 * even go to the slow case that looks at the
@@ -1237,8 +1238,8 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * lock to avoid races.
 	 */
 	spin_lock_irq(&q->lock);
-	SetPageWaiters(page);
-	if (!trylock_page_bit_common(page, bit_nr, wait))
+	SetFolioWaiters(folio);
+	if (!trylock_page_bit_common(&folio->page, bit_nr, wait))
 		__add_wait_queue_entry_tail(q, wait);
 	spin_unlock_irq(&q->lock);
 
@@ -1248,10 +1249,10 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * see whether the page bit testing has already
 	 * been done by the wake function.
 	 *
-	 * We can drop our reference to the page.
+	 * We can drop our reference to the folio.
 	 */
 	if (behavior == DROP)
-		put_page(page);
+		put_folio(folio);
 
 	/*
 	 * Note that until the "finish_wait()", or until
@@ -1288,7 +1289,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 		 *
 		 * And if that fails, we'll have to retry this all.
 		 */
-		if (unlikely(test_and_set_bit(bit_nr, &page->flags)))
+		if (unlikely(test_and_set_bit(bit_nr, folio_flags(folio))))
 			goto repeat;
 
 		wait->flags |= WQ_FLAG_DONE;
@@ -1328,19 +1329,17 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR;
 }
 
-void wait_on_page_bit(struct page *page, int bit_nr)
+void wait_on_folio_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
-	wait_on_page_bit_common(q, page, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
+	wait_on_folio_bit_common(folio, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
 }
-EXPORT_SYMBOL(wait_on_page_bit);
+EXPORT_SYMBOL(wait_on_folio_bit);
 
-int wait_on_page_bit_killable(struct page *page, int bit_nr)
+int wait_on_folio_bit_killable(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, bit_nr, TASK_KILLABLE, SHARED);
+	return wait_on_folio_bit_common(folio, bit_nr, TASK_KILLABLE, SHARED);
 }
-EXPORT_SYMBOL(wait_on_page_bit_killable);
+EXPORT_SYMBOL(wait_on_folio_bit_killable);
 
 /**
  * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
@@ -1357,11 +1356,8 @@ EXPORT_SYMBOL(wait_on_page_bit_killable);
  */
 int put_and_wait_on_page_locked(struct page *page, int state)
 {
-	wait_queue_head_t *q;
-
-	page = compound_head(page);
-	q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, PG_locked, state, DROP);
+	wait_on_folio_bit_common(page_folio(page), PG_locked,
+				state, DROP);
 }
 
 /**
@@ -1510,16 +1506,14 @@ EXPORT_SYMBOL_GPL(page_endio);
  */
 void __lock_folio(struct folio *folio)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
-	wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
+	wait_on_folio_bit_common(folio, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
 EXPORT_SYMBOL(__lock_folio);
 
 int __lock_folio_killable(struct folio *folio)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
-	return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
+	return wait_on_folio_bit_common(folio, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
 EXPORT_SYMBOL_GPL(__lock_folio_killable);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index eb34d204d4ee..51b4326f0aaa 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2826,9 +2826,10 @@ EXPORT_SYMBOL(__test_set_page_writeback);
  */
 void wait_on_page_writeback(struct page *page)
 {
-	while (PageWriteback(page)) {
-		trace_wait_on_page_writeback(page, page_mapping(page));
-		wait_on_page_bit(page, PG_writeback);
+	struct folio *folio = page_folio(page);
+	while (FolioWriteback(folio)) {
+		trace_wait_on_page_writeback(page, folio_mapping(folio));
+		wait_on_folio_bit(folio, PG_writeback);
 	}
 }
 EXPORT_SYMBOL_GPL(wait_on_page_writeback);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 19/25] mm: Add wait_for_stable_folio and wait_on_folio_writeback
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (17 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 18/25] mm: Convert wait_on_page_bit to wait_on_folio_bit Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:03 ` [PATCH v3 20/25] mm: Add wait_on_folio_locked & wait_on_folio_locked_killable Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Add compatibility wrappers for code which has not yet been converted
to use folios.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 12 ++++++++++--
 mm/page-writeback.c     | 27 +++++++++++++--------------
 2 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 55f3c1a8be3c..757e437e7f09 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -760,13 +760,21 @@ static inline void wait_on_page_fscache(struct page *page)
 }
 
 int put_and_wait_on_page_locked(struct page *page, int state);
-void wait_on_page_writeback(struct page *page);
+void wait_on_folio_writeback(struct folio *folio);
+static inline void wait_on_page_writeback(struct page *page)
+{
+	return wait_on_folio_writeback(page_folio(page));
+}
 void end_folio_writeback(struct folio *folio);
 static inline void end_page_writeback(struct page *page)
 {
 	return end_folio_writeback(page_folio(page));
 }
-void wait_for_stable_page(struct page *page);
+void wait_for_stable_folio(struct folio *folio);
+static inline void wait_for_stable_page(struct page *page)
+{
+	return wait_for_stable_folio(page_folio(page));
+}
 
 void page_endio(struct page *page, bool is_write, int err);
 
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 51b4326f0aaa..908fc7f60ae7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2822,30 +2822,29 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 EXPORT_SYMBOL(__test_set_page_writeback);
 
 /*
- * Wait for a page to complete writeback
+ * Wait for a folio to complete writeback
  */
-void wait_on_page_writeback(struct page *page)
+void wait_on_folio_writeback(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	while (FolioWriteback(folio)) {
-		trace_wait_on_page_writeback(page, folio_mapping(folio));
+		trace_wait_on_page_writeback(&folio->page,
+						folio_mapping(folio));
 		wait_on_folio_bit(folio, PG_writeback);
 	}
 }
-EXPORT_SYMBOL_GPL(wait_on_page_writeback);
+EXPORT_SYMBOL_GPL(wait_on_folio_writeback);
 
 /**
- * wait_for_stable_page() - wait for writeback to finish, if necessary.
- * @page:	The page to wait on.
+ * wait_for_stable_folio() - wait for writeback to finish, if necessary.
+ * @folio: The folio to wait on.
  *
- * This function determines if the given page is related to a backing device
- * that requires page contents to be held stable during writeback.  If so, then
+ * This function determines if the given folio is related to a backing device
+ * that requires folio contents to be held stable during writeback.  If so, then
  * it will wait for any pending writeback to complete.
  */
-void wait_for_stable_page(struct page *page)
+void wait_for_stable_folio(struct folio *folio)
 {
-	page = thp_head(page);
-	if (page->mapping->host->i_sb->s_iflags & SB_I_STABLE_WRITES)
-		wait_on_page_writeback(page);
+	if (folio->page.mapping->host->i_sb->s_iflags & SB_I_STABLE_WRITES)
+		wait_on_folio_writeback(folio);
 }
-EXPORT_SYMBOL_GPL(wait_for_stable_page);
+EXPORT_SYMBOL_GPL(wait_for_stable_folio);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 20/25] mm: Add wait_on_folio_locked & wait_on_folio_locked_killable
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (18 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 19/25] mm: Add wait_for_stable_folio and wait_on_folio_writeback Matthew Wilcox (Oracle)
@ 2021-01-28  7:03 ` Matthew Wilcox (Oracle)
  2021-01-28  7:04 ` [PATCH v3 21/25] mm: Convert lock_page_or_retry to lock_folio_or_retry Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Turn wait_on_page_locked() and wait_on_page_locked_killable() into
wrappers.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 757e437e7f09..546565a7907c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -730,16 +730,14 @@ extern int wait_on_folio_bit_killable(struct folio *folio, int bit_nr);
  * ie with increased "page->count" so that the page won't
  * go away during the wait..
  */
-static inline void wait_on_page_locked(struct page *page)
+static inline void wait_on_folio_locked(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	if (FolioLocked(folio))
 		wait_on_folio_bit(folio, PG_locked);
 }
 
-static inline int wait_on_page_locked_killable(struct page *page)
+static inline int wait_on_folio_locked_killable(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	if (!FolioLocked(folio))
 		return 0;
 	return wait_on_folio_bit_killable(folio, PG_locked);
@@ -759,6 +757,16 @@ static inline void wait_on_page_fscache(struct page *page)
 		wait_on_folio_bit(folio, PG_fscache);
 }
 
+static inline void wait_on_page_locked(struct page *page)
+{
+	wait_on_folio_locked(page_folio(page));
+}
+
+static inline int wait_on_page_locked_killable(struct page *page)
+{
+	return wait_on_folio_locked_killable(page_folio(page));
+}
+
 int put_and_wait_on_page_locked(struct page *page, int state);
 void wait_on_folio_writeback(struct folio *folio);
 static inline void wait_on_page_writeback(struct page *page)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 21/25] mm: Convert lock_page_or_retry to lock_folio_or_retry
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (19 preceding siblings ...)
  2021-01-28  7:03 ` [PATCH v3 20/25] mm: Add wait_on_folio_locked & wait_on_folio_locked_killable Matthew Wilcox (Oracle)
@ 2021-01-28  7:04 ` Matthew Wilcox (Oracle)
  2021-01-28  7:04 ` [PATCH v3 22/25] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

There's already a hidden compound_head() call in trylock_page(), so
just make it explicit in the caller, which may later have a folio
for its own reasons.  This saves a call to compound_head() inside
__lock_page_or_retry().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 10 +++++-----
 mm/filemap.c            | 16 +++++++---------
 mm/memory.c             | 10 +++++-----
 3 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 546565a7907c..f59af1547e7b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -621,7 +621,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 void __lock_folio(struct folio *folio);
 int __lock_folio_killable(struct folio *folio);
 int __lock_folio_async(struct folio *folio, struct wait_page_queue *wait);
-extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
+int __lock_folio_or_retry(struct folio *folio, struct mm_struct *mm,
 				unsigned int flags);
 void unlock_folio(struct folio *folio);
 extern void unlock_page_fscache(struct page *page);
@@ -703,17 +703,17 @@ static inline int lock_folio_async(struct folio *folio,
 }
 
 /*
- * lock_page_or_retry - Lock the page, unless this would block and the
+ * lock_folio_or_retry - Lock the folio, unless this would block and the
  * caller indicated that it can handle a retry.
  *
  * Return value and mmap_lock implications depend on flags; see
- * __lock_page_or_retry().
+ * __lock_folio_or_retry().
  */
-static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
+static inline int lock_folio_or_retry(struct folio *folio, struct mm_struct *mm,
 				     unsigned int flags)
 {
 	might_sleep();
-	return trylock_page(page) || __lock_page_or_retry(page, mm, flags);
+	return trylock_folio(folio) || __lock_folio_or_retry(folio, mm, flags);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index f68bf0129458..f0a76258de97 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1546,20 +1546,18 @@ int __lock_folio_async(struct folio *folio, struct wait_page_queue *wait)
 
 /*
  * Return values:
- * 1 - page is locked; mmap_lock is still held.
- * 0 - page is not locked.
+ * 1 - folio is locked; mmap_lock is still held.
+ * 0 - folio is not locked.
  *     mmap_lock has been released (mmap_read_unlock(), unless flags had both
  *     FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT set, in
  *     which case mmap_lock is still held.
  *
  * If neither ALLOW_RETRY nor KILLABLE are set, will always return 1
- * with the page locked and the mmap_lock unperturbed.
+ * with the folio locked and the mmap_lock unperturbed.
  */
-int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
+int __lock_folio_or_retry(struct folio *folio, struct mm_struct *mm,
 			 unsigned int flags)
 {
-	struct folio *folio = page_folio(page);
-
 	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
@@ -1570,9 +1568,9 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 
 		mmap_read_unlock(mm);
 		if (flags & FAULT_FLAG_KILLABLE)
-			wait_on_page_locked_killable(page);
+			wait_on_folio_locked_killable(folio);
 		else
-			wait_on_page_locked(page);
+			wait_on_folio_locked(folio);
 		return 0;
 	}
 	if (flags & FAULT_FLAG_KILLABLE) {
@@ -2724,7 +2722,7 @@ loff_t mapping_seek_hole_data(struct address_space *mapping, loff_t start,
  * @page - the page to lock.
  * @fpin - the pointer to the file we may pin (or is already pinned).
  *
- * This works similar to lock_page_or_retry in that it can drop the mmap_lock.
+ * This works similar to lock_folio_or_retry in that it can drop the mmap_lock.
  * It differs in that it actually returns the page locked if it returns 1 and 0
  * if it couldn't lock the page.  If we did have to drop the mmap_lock then fpin
  * will point to the pinned file and needs to be fput()'ed at a later point.
diff --git a/mm/memory.c b/mm/memory.c
index 06992770f23e..bb15abef559b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3352,7 +3352,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		goto out_release;
 	}
 
-	locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
+	locked = lock_folio_or_retry(page_folio(page), vma->vm_mm, vmf->flags);
 
 	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
 	if (!locked) {
@@ -4104,7 +4104,7 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults).
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __lock_folio_or_retry().
  * If mmap_lock is released, vma may become invalid (for example
  * by other thread calling munmap()).
  */
@@ -4338,7 +4338,7 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
  * concurrent faults).
  *
  * The mmap_lock may have been released depending on flags and our return value.
- * See filemap_fault() and __lock_page_or_retry().
+ * See filemap_fault() and __lock_folio_or_retry().
  */
 static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 {
@@ -4431,7 +4431,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
  * By the time we get here, we already hold the mm semaphore
  *
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __lock_folio_or_retry().
  */
 static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 		unsigned long address, unsigned int flags)
@@ -4587,7 +4587,7 @@ static inline void mm_account_fault(struct pt_regs *regs,
  * By the time we get here, we already hold the mm semaphore
  *
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __lock_folio_or_retry().
  */
 vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 			   unsigned int flags, struct pt_regs *regs)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 22/25] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (20 preceding siblings ...)
  2021-01-28  7:04 ` [PATCH v3 21/25] mm: Convert lock_page_or_retry to lock_folio_or_retry Matthew Wilcox (Oracle)
@ 2021-01-28  7:04 ` Matthew Wilcox (Oracle)
  2021-01-28  7:04 ` [PATCH v3 23/25] mm: Convert test_clear_page_writeback to test_clear_folio_writeback Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

All callers have a folio, so use it directly.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/filemap.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index f0a76258de97..906b29c3e1fb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1093,14 +1093,14 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	return (flags & WQ_FLAG_EXCLUSIVE) != 0;
 }
 
-static void wake_up_page_bit(struct page *page, int bit_nr)
+static void wake_up_folio_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
 	struct wait_page_key key;
 	unsigned long flags;
 	wait_queue_entry_t bookmark;
 
-	key.page = page;
+	key.page = &folio->page;
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
@@ -1135,7 +1135,7 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 	 * page waiters.
 	 */
 	if (!waitqueue_active(q) || !key.page_match) {
-		ClearPageWaiters(page);
+		ClearFolioWaiters(folio);
 		/*
 		 * It's possible to miss clearing Waiters here, when we woke
 		 * our page waiters, but the hashed waitqueue has waiters for
@@ -1151,7 +1151,7 @@ static void wake_up_folio(struct folio *folio, int bit)
 {
 	if (!FolioWaiters(folio))
 		return;
-	wake_up_page_bit(&folio->page, bit);
+	wake_up_folio_bit(folio, bit);
 }
 
 /*
@@ -1416,7 +1416,7 @@ void unlock_folio(struct folio *folio)
 	BUILD_BUG_ON(PG_waiters != 7);
 	VM_BUG_ON_FOLIO(!FolioLocked(folio), folio);
 	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio)))
-		wake_up_page_bit(&folio->page, PG_locked);
+		wake_up_folio_bit(folio, PG_locked);
 }
 EXPORT_SYMBOL(unlock_folio);
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 23/25] mm: Convert test_clear_page_writeback to test_clear_folio_writeback
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (21 preceding siblings ...)
  2021-01-28  7:04 ` [PATCH v3 22/25] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit Matthew Wilcox (Oracle)
@ 2021-01-28  7:04 ` Matthew Wilcox (Oracle)
  2021-01-28  7:04 ` [PATCH v3 24/25] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
  2021-01-28  7:04 ` [PATCH v3 25/25] cachefiles: Switch to wait_page_key Matthew Wilcox (Oracle)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The one caller of test_clear_page_writeback() already has a folio, so make
it clear that test_clear_page_writeback() operates on the entire folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/page-flags.h |  2 +-
 mm/filemap.c               |  2 +-
 mm/page-writeback.c        | 18 +++++++++---------
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 90381858d901..01aa4a71bf14 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -594,7 +594,7 @@ static __always_inline void SetPageUptodate(struct page *page)
 
 CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
 
-int test_clear_page_writeback(struct page *page);
+int test_clear_folio_writeback(struct folio *folio);
 int __test_set_page_writeback(struct page *page, bool keep_write);
 
 #define test_set_page_writeback(page)			\
diff --git a/mm/filemap.c b/mm/filemap.c
index 906b29c3e1fb..a00030b2ef71 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1463,7 +1463,7 @@ void end_folio_writeback(struct folio *folio)
 	 * reused before the wake_up_folio().
 	 */
 	get_folio(folio);
-	if (!test_clear_page_writeback(&folio->page))
+	if (!test_clear_folio_writeback(folio))
 		BUG();
 
 	smp_mb__after_atomic();
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 908fc7f60ae7..db8a99e4a3d2 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2719,24 +2719,24 @@ int clear_page_dirty_for_io(struct page *page)
 }
 EXPORT_SYMBOL(clear_page_dirty_for_io);
 
-int test_clear_page_writeback(struct page *page)
+int test_clear_folio_writeback(struct folio *folio)
 {
-	struct address_space *mapping = page_mapping(page);
+	struct address_space *mapping = folio_mapping(folio);
 	struct mem_cgroup *memcg;
 	struct lruvec *lruvec;
 	int ret;
 
-	memcg = lock_page_memcg(page);
-	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+	memcg = lock_folio_memcg(folio);
+	lruvec = mem_cgroup_folio_lruvec(folio, folio_pgdat(folio));
 	if (mapping && mapping_use_writeback_tags(mapping)) {
 		struct inode *inode = mapping->host;
 		struct backing_dev_info *bdi = inode_to_bdi(inode);
 		unsigned long flags;
 
 		xa_lock_irqsave(&mapping->i_pages, flags);
-		ret = TestClearPageWriteback(page);
+		ret = TestClearFolioWriteback(folio);
 		if (ret) {
-			__xa_clear_mark(&mapping->i_pages, page_index(page),
+			__xa_clear_mark(&mapping->i_pages, folio_index(folio),
 						PAGECACHE_TAG_WRITEBACK);
 			if (bdi->capabilities & BDI_CAP_WRITEBACK_ACCT) {
 				struct bdi_writeback *wb = inode_to_wb(inode);
@@ -2752,12 +2752,12 @@ int test_clear_page_writeback(struct page *page)
 
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 	} else {
-		ret = TestClearPageWriteback(page);
+		ret = TestClearFolioWriteback(folio);
 	}
 	if (ret) {
 		dec_lruvec_state(lruvec, NR_WRITEBACK);
-		dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-		inc_node_page_state(page, NR_WRITTEN);
+		dec_zone_folio_stat(folio, NR_ZONE_WRITE_PENDING);
+		inc_node_folio_stat(folio, NR_WRITTEN);
 	}
 	__unlock_page_memcg(memcg);
 	return ret;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 24/25] mm/filemap: Convert page wait queues to be folios
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (22 preceding siblings ...)
  2021-01-28  7:04 ` [PATCH v3 23/25] mm: Convert test_clear_page_writeback to test_clear_folio_writeback Matthew Wilcox (Oracle)
@ 2021-01-28  7:04 ` Matthew Wilcox (Oracle)
  2021-01-28  7:04 ` [PATCH v3 25/25] cachefiles: Switch to wait_page_key Matthew Wilcox (Oracle)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Reinforce that when we wait for a bit in a struct page, that bit is
actually in the head page, by changing the type used by the page wait
queues from struct page to struct folio.
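
With the wait entry and wake key both holding a struct folio, the waiter
and the waker normalise to the head page before comparing, and passing a
tail page to either side becomes a type error.  A brief sketch (variable
names are illustrative):

	/* waiter side */
	wait->folio = page_folio(page);		/* always the head page */
	wait->bit_nr = PG_locked;

	/* waker side */
	key.folio = page_folio(page);
	key.bit_nr = PG_locked;

	/* wake_page_match() now simply compares wait->folio with key->folio. */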

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h |  6 +++---
 mm/filemap.c            | 40 +++++++++++++++++++++-------------------
 2 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index f59af1547e7b..f0a601f6d68c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -594,13 +594,13 @@ static inline pgoff_t linear_page_index(struct vm_area_struct *vma,
 
 /* This has the same layout as wait_bit_key - see fs/cachefiles/rdwr.c */
 struct wait_page_key {
-	struct page *page;
+	struct folio *folio;
 	int bit_nr;
 	int page_match;
 };
 
 struct wait_page_queue {
-	struct page *page;
+	struct folio *folio;
 	int bit_nr;
 	wait_queue_entry_t wait;
 };
@@ -608,7 +608,7 @@ struct wait_page_queue {
 static inline bool wake_page_match(struct wait_page_queue *wait_page,
 				  struct wait_page_key *key)
 {
-	if (wait_page->page != key->page)
+	if (wait_page->folio != key->folio)
 	       return false;
 	key->page_match = 1;
 
diff --git a/mm/filemap.c b/mm/filemap.c
index a00030b2ef71..ff61f1f2ce2c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -991,11 +991,11 @@ EXPORT_SYMBOL(__page_cache_alloc);
  */
 #define PAGE_WAIT_TABLE_BITS 8
 #define PAGE_WAIT_TABLE_SIZE (1 << PAGE_WAIT_TABLE_BITS)
-static wait_queue_head_t page_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
+static wait_queue_head_t folio_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
 
-static wait_queue_head_t *page_waitqueue(struct page *page)
+static wait_queue_head_t *folio_waitqueue(struct folio *folio)
 {
-	return &page_wait_table[hash_ptr(page, PAGE_WAIT_TABLE_BITS)];
+	return &folio_wait_table[hash_ptr(folio, PAGE_WAIT_TABLE_BITS)];
 }
 
 void __init pagecache_init(void)
@@ -1003,7 +1003,7 @@ void __init pagecache_init(void)
 	int i;
 
 	for (i = 0; i < PAGE_WAIT_TABLE_SIZE; i++)
-		init_waitqueue_head(&page_wait_table[i]);
+		init_waitqueue_head(&folio_wait_table[i]);
 
 	page_writeback_init();
 }
@@ -1058,10 +1058,11 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	 */
 	flags = wait->flags;
 	if (flags & WQ_FLAG_EXCLUSIVE) {
-		if (test_bit(key->bit_nr, &key->page->flags))
+		if (test_bit(key->bit_nr, &key->folio->page.flags))
 			return -1;
 		if (flags & WQ_FLAG_CUSTOM) {
-			if (test_and_set_bit(key->bit_nr, &key->page->flags))
+			if (test_and_set_bit(key->bit_nr,
+						&key->folio->page.flags))
 				return -1;
 			flags |= WQ_FLAG_DONE;
 		}
@@ -1095,12 +1096,12 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 
 static void wake_up_folio_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	struct wait_page_key key;
 	unsigned long flags;
 	wait_queue_entry_t bookmark;
 
-	key.page = &folio->page;
+	key.folio = folio;
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
@@ -1192,7 +1193,7 @@ int sysctl_page_lock_unfairness = 5;
 static inline int wait_on_folio_bit_common(struct folio *folio, int bit_nr,
 		int state, enum behavior behavior)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	int unfairness = sysctl_page_lock_unfairness;
 	struct wait_page_queue wait_page;
 	wait_queue_entry_t *wait = &wait_page.wait;
@@ -1212,7 +1213,7 @@ static inline int wait_on_folio_bit_common(struct folio *folio, int bit_nr,
 
 	init_wait(wait);
 	wait->func = wake_page_function;
-	wait_page.page = &folio->page;
+	wait_page.folio = folio;
 	wait_page.bit_nr = bit_nr;
 
 repeat:
@@ -1356,7 +1357,7 @@ EXPORT_SYMBOL(wait_on_folio_bit_killable);
  */
 int put_and_wait_on_page_locked(struct page *page, int state)
 {
-	wait_on_folio_bit_common(page_folio(page), PG_locked,
+	return wait_on_folio_bit_common(page_folio(page), PG_locked,
 				state, DROP);
 }
 
@@ -1369,12 +1370,13 @@ int put_and_wait_on_page_locked(struct page *page, int state)
  */
 void add_page_wait_queue(struct page *page, wait_queue_entry_t *waiter)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
+	struct folio *folio = page_folio(page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	unsigned long flags;
 
 	spin_lock_irqsave(&q->lock, flags);
 	__add_wait_queue_entry_tail(q, waiter);
-	SetPageWaiters(page);
+	SetFolioWaiters(folio);
 	spin_unlock_irqrestore(&q->lock, flags);
 }
 EXPORT_SYMBOL_GPL(add_page_wait_queue);
@@ -1431,10 +1433,10 @@ EXPORT_SYMBOL(unlock_folio);
  */
 void unlock_page_fscache(struct page *page)
 {
-	page = compound_head(page);
-	VM_BUG_ON_PAGE(!PagePrivate2(page), page);
-	clear_bit_unlock(PG_fscache, &page->flags);
-	wake_up_page_bit(page, PG_fscache);
+	struct folio *folio = page_folio(page);
+	VM_BUG_ON_FOLIO(!FolioPrivate2(folio), folio);
+	clear_bit_unlock(PG_fscache, &folio->page.flags);
+	wake_up_folio_bit(folio, PG_fscache);
 }
 EXPORT_SYMBOL(unlock_page_fscache);
 
@@ -1520,10 +1522,10 @@ EXPORT_SYMBOL_GPL(__lock_folio_killable);
 
 int __lock_folio_async(struct folio *folio, struct wait_page_queue *wait)
 {
-	struct wait_queue_head *q = page_waitqueue(&folio->page);
+	struct wait_queue_head *q = folio_waitqueue(folio);
 	int ret = 0;
 
-	wait->page = &folio->page;
+	wait->folio = folio;
 	wait->bit_nr = PG_locked;
 
 	spin_lock_irq(&q->lock);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 25/25] cachefiles: Switch to wait_page_key
  2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
                   ` (23 preceding siblings ...)
  2021-01-28  7:04 ` [PATCH v3 24/25] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
@ 2021-01-28  7:04 ` Matthew Wilcox (Oracle)
  24 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-01-28  7:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Cachefiles was relying on wait_page_key and wait_bit_key having the
same layout, which is fragile.  Now that wait_page_key is exposed in
the pagemap.h header, we can remove that fragility.  Also switch it
to use the folio directly instead of the page.
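
For reference, the aliasing that the old code depended on looked roughly
like this (the wait_bit_key layout is quoted from memory of wait_bit.h
and should be treated as an approximation):

	struct wait_bit_key {			/* what cachefiles_read_waiter() cast to */
		void		*flags;
		int		bit_nr;
		unsigned long	timeout;
	};

	struct wait_page_key {			/* what the waker really passed */
		struct page	*page;
		int		bit_nr;
		int		page_match;
	};

	/* Only the first two members line up, and the page/flags aliasing
	 * works only because flags is the first field of struct page, so
	 * (void *)page == &page->flags.  Nothing enforces any of this. */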

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/cachefiles/rdwr.c    | 13 ++++++-------
 include/linux/pagemap.h |  1 -
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index e027c718ca01..b1dbc484a9c7 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -24,22 +24,21 @@ static int cachefiles_read_waiter(wait_queue_entry_t *wait, unsigned mode,
 		container_of(wait, struct cachefiles_one_read, monitor);
 	struct cachefiles_object *object;
 	struct fscache_retrieval *op = monitor->op;
-	struct wait_bit_key *key = _key;
-	struct page *page = wait->private;
+	struct wait_page_key *key = _key;
+	struct folio *folio = wait->private;
 
 	ASSERT(key);
 
 	_enter("{%lu},%u,%d,{%p,%u}",
 	       monitor->netfs_page->index, mode, sync,
-	       key->flags, key->bit_nr);
+	       key->folio, key->bit_nr);
 
-	if (key->flags != &page->flags ||
-	    key->bit_nr != PG_locked)
+	if (key->folio != folio || key->bit_nr != PG_locked)
 		return 0;
 
-	_debug("--- monitor %p %lx ---", page, page->flags);
+	_debug("--- monitor %p %lx ---", folio, folio->page.flags);
 
-	if (!PageUptodate(page) && !PageError(page)) {
+	if (!FolioUptodate(folio) && !FolioError(folio)) {
 		/* unlocked, not uptodate and not erronous? */
 		_debug("page probably truncated");
 	}
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index f0a601f6d68c..e8d8c66b027e 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -592,7 +592,6 @@ static inline pgoff_t linear_page_index(struct vm_area_struct *vma,
 	return pgoff;
 }
 
-/* This has the same layout as wait_bit_key - see fs/cachefiles/rdwr.c */
 struct wait_page_key {
 	struct folio *folio;
 	int bit_nr;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 01/25] mm: Introduce struct folio
  2021-01-28  7:03 ` [PATCH v3 01/25] mm: Introduce struct folio Matthew Wilcox (Oracle)
@ 2021-03-01 20:26   ` Zi Yan
  2021-03-01 20:53     ` Matthew Wilcox
  2021-03-02 13:22     ` Matthew Wilcox
  0 siblings, 2 replies; 37+ messages in thread
From: Zi Yan @ 2021-03-01 20:26 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-fsdevel, linux-mm, linux-kernel, Mike Kravetz

[-- Attachment #1: Type: text/plain, Size: 1939 bytes --]

On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:

> We have trouble keeping track of whether we've already called
> compound_head() to ensure we're not operating on a tail page.  Further,
> it's never clear whether we intend a struct page to refer to PAGE_SIZE
> bytes or page_size(compound_head(page)).
>
> Introduce a new type 'struct folio' that always refers to an entire
> (possibly compound) page, and points to the head page (or base page).
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h       | 26 ++++++++++++++++++++++++++
>  include/linux/mm_types.h | 17 +++++++++++++++++
>  2 files changed, 43 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2d6e715ab8ea..f20504017adf 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -924,6 +924,11 @@ static inline unsigned int compound_order(struct page *page)
>  	return page[1].compound_order;
>  }
>
> +static inline unsigned int folio_order(struct folio *folio)
> +{
> +	return compound_order(&folio->page);
> +}
> +
>  static inline bool hpage_pincount_available(struct page *page)
>  {
>  	/*
> @@ -975,6 +980,26 @@ static inline unsigned int page_shift(struct page *page)
>
>  void free_compound_page(struct page *page);
>
> +static inline unsigned long folio_nr_pages(struct folio *folio)
> +{
> +	return compound_nr(&folio->page);
> +}
> +
> +static inline struct folio *next_folio(struct folio *folio)
> +{
> +	return folio + folio_nr_pages(folio);

Are you planning to make hugetlb use folio too?

If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
See the experiment I did in [1].


[1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/


—
Best Regards,
Yan Zi

^ permalink raw reply	[flat|nested] 37+ messages in thread
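
For readers following along, a minimal sketch of how the helpers quoted
above hang together, assuming struct folio is simply a typed wrapper
around the head struct page (the mm_types.h hunk with the real definition
is not quoted in this reply):

struct folio {
	/* Always refers to the head (or only) page of a compound page. */
	struct page page;
};

static inline struct folio *page_folio(struct page *page)
{
	/* Typed compound_head(): any page, head or tail, to its folio. */
	return (struct folio *)compound_head(page);
}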

* Re: [PATCH v3 01/25] mm: Introduce struct folio
  2021-03-01 20:26   ` Zi Yan
@ 2021-03-01 20:53     ` Matthew Wilcox
  2021-03-01 21:03       ` Zi Yan
  2021-03-02 13:22     ` Matthew Wilcox
  1 sibling, 1 reply; 37+ messages in thread
From: Matthew Wilcox @ 2021-03-01 20:53 UTC (permalink / raw)
  To: Zi Yan; +Cc: linux-fsdevel, linux-mm, linux-kernel, Mike Kravetz

On Mon, Mar 01, 2021 at 03:26:11PM -0500, Zi Yan wrote:
> > +static inline struct folio *next_folio(struct folio *folio)
> > +{
> > +	return folio + folio_nr_pages(folio);
> 
> Are you planning to make hugetlb use folio too?

Eventually, probably.  It's not my focus.

> If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
> with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
> See the experiment I did in [1].
> 
> [1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/

I thought we were going to forbid that configuration?  ie no pages
larger than MAX_ORDER with (SPARSEMEM && !SPARSEMEM_VMEMMAP)

https://lore.kernel.org/linux-mm/312AECBD-CA6D-4E93-A6C1-1DF87BABD92D@nvidia.com/

is somewhere else we were discussing this.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 01/25] mm: Introduce struct folio
  2021-03-01 20:53     ` Matthew Wilcox
@ 2021-03-01 21:03       ` Zi Yan
  0 siblings, 0 replies; 37+ messages in thread
From: Zi Yan @ 2021-03-01 21:03 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel, linux-mm, linux-kernel, Mike Kravetz

On 1 Mar 2021, at 15:53, Matthew Wilcox wrote:

> On Mon, Mar 01, 2021 at 03:26:11PM -0500, Zi Yan wrote:
>>> +static inline struct folio *next_folio(struct folio *folio)
>>> +{
>>> +	return folio + folio_nr_pages(folio);
>>
>> Are you planning to make hugetlb use folio too?
>
> Eventually, probably.  It's not my focus.
>
>> If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
>> with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
>> See the experiment I did in [1].
>>
>> [1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/
>
> I thought we were going to forbid that configuration?  ie no pages
> larger than MAX_ORDER with (SPARSEMEM && !SPARSEMEM_VMEMMAP)
>
> https://lore.kernel.org/linux-mm/312AECBD-CA6D-4E93-A6C1-1DF87BABD92D@nvidia.com/
>
> is somewhere else we were discussing this.

That is my plan for 1GB THP, making it depend on SPARSEMEM_VMEMMAP,
otherwise the THP code will be too complicated to read. My concern
was only about using folio in hugetlb, since struct page might not be
virtually contiguous there.

If hugetlb is not using folio soon, the patch looks good to me.

Reviewed-by: Zi Yan <ziy@nvidia.com>


—
Best Regards,
Yan Zi

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 02/25] mm: Add folio_pgdat
  2021-01-28  7:03 ` [PATCH v3 02/25] mm: Add folio_pgdat Matthew Wilcox (Oracle)
@ 2021-03-01 21:05   ` Zi Yan
  0 siblings, 0 replies; 37+ messages in thread
From: Zi Yan @ 2021-03-01 21:05 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm, linux-kernel

On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:

> This is just a convenience wrapper for callers with folios; pgdat can
> be reached from tail pages as well as head pages.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index f20504017adf..7d787229dd40 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1503,6 +1503,11 @@ static inline pg_data_t *page_pgdat(const struct page *page)
>  	return NODE_DATA(page_to_nid(page));
>  }
>
> +static inline pg_data_t *folio_pgdat(const struct folio *folio)
> +{
> +	return page_pgdat(&folio->page);
> +}
> +
>  #ifdef SECTION_IN_PAGE_FLAGS
>  static inline void set_page_section(struct page *page, unsigned long section)
>  {
> -- 
> 2.29.2

LGTM.

Reviewed-by: Zi Yan <ziy@nvidia.com>

—
Best Regards,
Yan Zi

^ permalink raw reply	[flat|nested] 37+ messages in thread
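
A hedged sketch of the kind of caller this wrapper is aimed at, mirroring
the NR_FILE_THPS accounting that comes up later in this thread;
folio_account_file_thp() is an illustrative name, not part of the series:

static void folio_account_file_thp(struct folio *folio)
{
	long nr = folio_nr_pages(folio);

	/* Node-level counter update without first finding the head page. */
	if (nr > 1)
		__mod_node_page_state(folio_pgdat(folio), NR_FILE_THPS, nr);
}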

* Re: [PATCH v3 03/25] mm/vmstat: Add folio stat wrappers
  2021-01-28  7:03 ` [PATCH v3 03/25] mm/vmstat: Add folio stat wrappers Matthew Wilcox (Oracle)
@ 2021-03-01 21:17   ` Zi Yan
  2021-03-01 22:15     ` Matthew Wilcox
  0 siblings, 1 reply; 37+ messages in thread
From: Zi Yan @ 2021-03-01 21:17 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm, linux-kernel

On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:

> Allow page counters to be more readily modified by callers which have
> a folio.  Name these wrappers with 'stat' instead of 'state' as requested
> by Linus here:
> https://lore.kernel.org/linux-mm/CAHk-=wj847SudR-kt+46fT3+xFFgiwpgThvm7DJWGdi4cVrbnQ@mail.gmail.com/
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/vmstat.h | 60 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 60 insertions(+)
>
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index 773135fc6e19..3c3373c2c3c2 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -396,6 +396,54 @@ static inline void drain_zonestat(struct zone *zone,
>  			struct per_cpu_pageset *pset) { }
>  #endif		/* CONFIG_SMP */
>
> +static inline
> +void __inc_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
> +{
> +	__inc_zone_page_state(&folio->page, item);

Shouldn’t we change the stats by folio_nr_pages(folio) here, and in all
the changes below? Otherwise one folio is always counted as a single page.

> +}
> +
> +static inline
> +void __dec_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
> +{
> +	__dec_zone_page_state(&folio->page, item);
> +}
> +
> +static inline
> +void inc_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
> +{
> +	inc_zone_page_state(&folio->page, item);
> +}
> +
> +static inline
> +void dec_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
> +{
> +	dec_zone_page_state(&folio->page, item);
> +}
> +
> +static inline
> +void __inc_node_folio_stat(struct folio *folio, enum node_stat_item item)
> +{
> +	__inc_node_page_state(&folio->page, item);
> +}
> +
> +static inline
> +void __dec_node_folio_stat(struct folio *folio, enum node_stat_item item)
> +{
> +	__dec_node_page_state(&folio->page, item);
> +}
> +
> +static inline
> +void inc_node_folio_stat(struct folio *folio, enum node_stat_item item)
> +{
> +	inc_node_page_state(&folio->page, item);
> +}
> +
> +static inline
> +void dec_node_folio_stat(struct folio *folio, enum node_stat_item item)
> +{
> +	dec_node_page_state(&folio->page, item);
> +}
> +
>  static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pages,
>  					     int migratetype)
>  {
> @@ -530,6 +578,18 @@ static inline void __dec_lruvec_page_state(struct page *page,
>  	__mod_lruvec_page_state(page, idx, -1);
>  }
>
> +static inline void __inc_lruvec_folio_stat(struct folio *folio,
> +					   enum node_stat_item idx)
> +{
> +	__mod_lruvec_page_state(&folio->page, idx, 1);
> +}
> +
> +static inline void __dec_lruvec_folio_stat(struct folio *folio,
> +					   enum node_stat_item idx)
> +{
> +	__mod_lruvec_page_state(&folio->page, idx, -1);
> +}
> +
>  static inline void inc_lruvec_state(struct lruvec *lruvec,
>  				    enum node_stat_item idx)
>  {
> -- 
> 2.29.2


—
Best Regards,
Yan Zi

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 04/25] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  2021-01-28  7:03 ` [PATCH v3 04/25] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
@ 2021-03-01 21:25   ` Zi Yan
  0 siblings, 0 replies; 37+ messages in thread
From: Zi Yan @ 2021-03-01 21:25 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm, linux-kernel

On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:

> These are the folio equivalents of VM_BUG_ON_PAGE and VM_WARN_ON_ONCE_PAGE.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mmdebug.h | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
> index 5d0767cb424a..77d24e1dcaec 100644
> --- a/include/linux/mmdebug.h
> +++ b/include/linux/mmdebug.h
> @@ -23,6 +23,13 @@ void dump_mm(const struct mm_struct *mm);
>  			BUG();						\
>  		}							\
>  	} while (0)
> +#define VM_BUG_ON_FOLIO(cond, folio)					\
> +	do {								\
> +		if (unlikely(cond)) {					\
> +			dump_page(&folio->page, "VM_BUG_ON_FOLIO(" __stringify(cond)")");\
> +			BUG();						\
> +		}							\
> +	} while (0)
>  #define VM_BUG_ON_VMA(cond, vma)					\
>  	do {								\
>  		if (unlikely(cond)) {					\
> @@ -48,6 +55,17 @@ void dump_mm(const struct mm_struct *mm);
>  	}								\
>  	unlikely(__ret_warn_once);					\
>  })
> +#define VM_WARN_ON_ONCE_FOLIO(cond, folio)	({			\
> +	static bool __section(".data.once") __warned;			\
> +	int __ret_warn_once = !!(cond);					\
> +									\
> +	if (unlikely(__ret_warn_once && !__warned)) {			\
> +		dump_page(&folio->page, "VM_WARN_ON_ONCE_FOLIO(" __stringify(cond)")");\
> +		__warned = true;					\
> +		WARN_ON(1);						\
> +	}								\
> +	unlikely(__ret_warn_once);					\
> +})
>
>  #define VM_WARN_ON(cond) (void)WARN_ON(cond)
>  #define VM_WARN_ON_ONCE(cond) (void)WARN_ON_ONCE(cond)
> @@ -56,11 +74,13 @@ void dump_mm(const struct mm_struct *mm);
>  #else
>  #define VM_BUG_ON(cond) BUILD_BUG_ON_INVALID(cond)
>  #define VM_BUG_ON_PAGE(cond, page) VM_BUG_ON(cond)
> +#define VM_BUG_ON_FOLIO(cond, folio) VM_BUG_ON(cond)
>  #define VM_BUG_ON_VMA(cond, vma) VM_BUG_ON(cond)
>  #define VM_BUG_ON_MM(cond, mm) VM_BUG_ON(cond)
>  #define VM_WARN_ON(cond) BUILD_BUG_ON_INVALID(cond)
>  #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
>  #define VM_WARN_ON_ONCE_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
> +#define VM_WARN_ON_ONCE_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
>  #define VM_WARN_ONCE(cond, format...) BUILD_BUG_ON_INVALID(cond)
>  #define VM_WARN(cond, format...) BUILD_BUG_ON_INVALID(cond)
>  #endif
> -- 
> 2.29.2

LGTM.

Reviewed-by: Zi Yan <ziy@nvidia.com>

—
Best Regards,
Yan Zi

^ permalink raw reply	[flat|nested] 37+ messages in thread
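
An illustrative use of the new assertion, assuming the FolioLocked() test
added by the later FolioFlags patch; folio_must_be_locked() is only an
example name:

static void folio_must_be_locked(struct folio *folio)
{
	/* Dumps &folio->page and BUGs if the folio is not locked. */
	VM_BUG_ON_FOLIO(!FolioLocked(folio), folio);
}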

* Re: [PATCH v3 05/25] mm: Add put_folio
  2021-01-28  7:03 ` [PATCH v3 05/25] mm: Add put_folio Matthew Wilcox (Oracle)
@ 2021-03-01 21:41   ` Zi Yan
  0 siblings, 0 replies; 37+ messages in thread
From: Zi Yan @ 2021-03-01 21:41 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm, linux-kernel

On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:

> If we know we have a folio, we can call put_folio() instead of put_page()
> and save the overhead of calling compound_head().  Also skips the
> devmap checks.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 7d787229dd40..873d649107ba 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1220,9 +1220,15 @@ static inline __must_check bool try_get_page(struct page *page)
>  	return true;
>  }
>
> +static inline void put_folio(struct folio *folio)
> +{
> +	if (put_page_testzero(&folio->page))
> +		__put_page(&folio->page);
> +}
> +
>  static inline void put_page(struct page *page)
>  {
> -	page = compound_head(page);
> +	struct folio *folio = page_folio(page);
>
>  	/*
>  	 * For devmap managed pages we need to catch refcount transition from
> @@ -1230,13 +1236,12 @@ static inline void put_page(struct page *page)
>  	 * need to inform the device driver through callback. See
>  	 * include/linux/memremap.h and HMM for details.
>  	 */
> -	if (page_is_devmap_managed(page)) {
> -		put_devmap_managed_page(page);
> +	if (page_is_devmap_managed(&folio->page)) {
> +		put_devmap_managed_page(&folio->page);
>  		return;
>  	}
>
> -	if (put_page_testzero(page))
> -		__put_page(page);
> +	put_folio(folio);
>  }
>
>  /*
> -- 
> 2.29.2

LGTM.

Reviewed-by: Zi Yan <ziy@nvidia.com>


—
Best Regards,
Yan Zi

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v3 06/25] mm: Add get_folio
  2021-01-28  7:03 ` [PATCH v3 06/25] mm: Add get_folio Matthew Wilcox (Oracle)
@ 2021-03-01 21:45   ` Zi Yan
  0 siblings, 0 replies; 37+ messages in thread
From: Zi Yan @ 2021-03-01 21:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: linux-fsdevel, linux-mm, linux-kernel

On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:

> If we know we have a folio, we can call get_folio() instead of get_page()
> and save the overhead of calling compound_head().
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 873d649107ba..d71c5776b571 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1192,18 +1192,19 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
>  }
>
>  /* 127: arbitrary random number, small enough to assemble well */
> -#define page_ref_zero_or_close_to_overflow(page) \
> -	((unsigned int) page_ref_count(page) + 127u <= 127u)
> +#define folio_ref_zero_or_close_to_overflow(folio) \
> +	((unsigned int) page_ref_count(&folio->page) + 127u <= 127u)
> +
> +static inline void get_folio(struct folio *folio)
> +{
> +	/* Getting a page requires an already elevated page->_refcount. */
> +	VM_BUG_ON_FOLIO(folio_ref_zero_or_close_to_overflow(folio), folio);
> +	page_ref_inc(&folio->page);
> +}
>
>  static inline void get_page(struct page *page)
>  {
> -	page = compound_head(page);
> -	/*
> -	 * Getting a normal page or the head of a compound page
> -	 * requires to already have an elevated page->_refcount.
> -	 */
> -	VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page);
> -	page_ref_inc(page);
> +	get_folio(page_folio(page));
>  }
>
>  bool __must_check try_grab_page(struct page *page, unsigned int flags);
> -- 
> 2.29.2

LGTM.

Reviewed-by: Zi Yan <ziy@nvidia.com>

—
Best Regards,
Yan Zi

^ permalink raw reply	[flat|nested] 37+ messages in thread
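
A small sketch of the calling pattern these two patches enable: convert
once with page_folio(), then take and drop references without paying for
compound_head() on every call (the function name is illustrative only):

static void folio_ref_example(struct page *page)
{
	struct folio *folio = page_folio(page);	/* one compound_head() */

	get_folio(folio);	/* plain refcount increment */
	/* ... use the folio ... */
	put_folio(folio);	/* drop it, skipping put_page()'s devmap check */
}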

* Re: [PATCH v3 03/25] mm/vmstat: Add folio stat wrappers
  2021-03-01 21:17   ` Zi Yan
@ 2021-03-01 22:15     ` Matthew Wilcox
  0 siblings, 0 replies; 37+ messages in thread
From: Matthew Wilcox @ 2021-03-01 22:15 UTC (permalink / raw)
  To: Zi Yan; +Cc: linux-fsdevel, linux-mm, linux-kernel

On Mon, Mar 01, 2021 at 04:17:39PM -0500, Zi Yan wrote:
> On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:
> > Allow page counters to be more readily modified by callers which have
> > a folio.  Name these wrappers with 'stat' instead of 'state' as requested
>
> Shouldn’t we change the stats with folio_nr_pages(folio) here? And all
> changes below. Otherwise one folio is always counted as a single page.

That's a good point.  Looking through the changes in my current folio
tree (which doesn't get as far as the thp tree did; ie doesn't yet allocate
multi-page folios, so hasn't been tested with anything larger than a
single page), the callers are ...

@@ -2698,3 +2698,3 @@ int clear_page_dirty_for_io(struct page *page)
-               if (TestClearPageDirty(page)) {
-                       dec_lruvec_page_state(page, NR_FILE_DIRTY);
-                       dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+               if (TestClearFolioDirty(folio)) {
+                       dec_lruvec_folio_stat(folio, NR_FILE_DIRTY);
+                       dec_zone_folio_stat(folio, NR_ZONE_WRITE_PENDING);
@@ -2432,3 +2433,3 @@ void account_page_dirtied(struct page *page, struct addres
s_space *mapping)
-               __inc_lruvec_page_state(page, NR_FILE_DIRTY);
-               __inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-               __inc_node_page_state(page, NR_DIRTIED);
+               __inc_lruvec_folio_stat(folio, NR_FILE_DIRTY);
+               __inc_zone_folio_stat(folio, NR_ZONE_WRITE_PENDING);
+               __inc_node_folio_stat(folio, NR_DIRTIED);
@@ -891 +890 @@ noinline int __add_to_page_cache_locked(struct page *page,
-                       __inc_lruvec_page_state(page, NR_FILE_PAGES);
+                       __inc_lruvec_folio_stat(folio, NR_FILE_PAGES);
@@ -2759,2 +2759,2 @@ int test_clear_page_writeback(struct page *page)
-               dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-               inc_node_page_state(page, NR_WRITTEN);
+               dec_zone_folio_stat(folio, NR_ZONE_WRITE_PENDING);
+               inc_node_folio_stat(folio, NR_WRITTEN);

I think it's clear from this that I haven't found all the places
that I need to change yet ;-)

Looking at the places I did change in the thp tree, there are changes
like this:

@@ -860,27 +864,30 @@ noinline int __add_to_page_cache_locked(struct page *page,
-               if (!huge)
-                       __inc_lruvec_page_state(page, NR_FILE_PAGES);
+               if (!huge) {
+                       __mod_lruvec_page_state(page, NR_FILE_PAGES, nr);
+                       if (nr > 1)
+                               __mod_node_page_state(page_pgdat(page),
+                                               NR_FILE_THPS, nr);
+               }

... but I never made some of the changes that the hunks above imply
are needed.  So the thp tree probably had all kinds of bad statistics
that I never noticed.

So ... at least some of the users are definitely going to want to
cache the 'nr_pages' and use it multiple times, including calling
__mod_node_folio_state(), but others should do what you suggested.
Thanks!  I'll make that change.

^ permalink raw reply	[flat|nested] 37+ messages in thread
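
A sketch of what the reworked wrappers might look like after the change
Matthew mentions, using the existing __mod_* helpers which already take a
delta (untested, and only two of the eight wrappers are shown):

static inline
void __inc_zone_folio_stat(struct folio *folio, enum zone_stat_item item)
{
	/* Account all pages in the folio, not just the head page. */
	__mod_zone_page_state(page_zone(&folio->page), item,
			      folio_nr_pages(folio));
}

static inline void __inc_lruvec_folio_stat(struct folio *folio,
					   enum node_stat_item idx)
{
	__mod_lruvec_page_state(&folio->page, idx, folio_nr_pages(folio));
}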

* Re: [PATCH v3 01/25] mm: Introduce struct folio
  2021-03-01 20:26   ` Zi Yan
  2021-03-01 20:53     ` Matthew Wilcox
@ 2021-03-02 13:22     ` Matthew Wilcox
  2021-03-02 17:47       ` Zi Yan
  1 sibling, 1 reply; 37+ messages in thread
From: Matthew Wilcox @ 2021-03-02 13:22 UTC (permalink / raw)
  To: Zi Yan; +Cc: linux-fsdevel, linux-mm, linux-kernel, Mike Kravetz

On Mon, Mar 01, 2021 at 03:26:11PM -0500, Zi Yan wrote:
> > +static inline struct folio *next_folio(struct folio *folio)
> > +{
> > +	return folio + folio_nr_pages(folio);
> 
> Are you planning to make hugetlb use folio too?
> 
> If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
> with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
> See the experiment I did in [1].

Actually, how about proofing this against a future change?

static inline struct folio *next_folio(struct folio *folio)
{
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
	pfn_t next_pfn = page_to_pfn(&folio->page) + folio_nr_pages(folio);
	return (struct folio *)pfn_to_page(next_pfn);
#else
	return folio + folio_nr_pages(folio);
#endif
}

(not compiled)


^ permalink raw reply	[flat|nested] 37+ messages in thread
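
One nit on the sketch above: page_to_pfn() yields a plain unsigned long,
not a pfn_t, so a version with the types fixed up (still uncompiled and
untested) would be:

static inline struct folio *next_folio(struct folio *folio)
{
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
	unsigned long next_pfn = page_to_pfn(&folio->page) +
				 folio_nr_pages(folio);

	return (struct folio *)pfn_to_page(next_pfn);
#else
	return folio + folio_nr_pages(folio);
#endif
}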

* Re: [PATCH v3 01/25] mm: Introduce struct folio
  2021-03-02 13:22     ` Matthew Wilcox
@ 2021-03-02 17:47       ` Zi Yan
  0 siblings, 0 replies; 37+ messages in thread
From: Zi Yan @ 2021-03-02 17:47 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-fsdevel, linux-mm, linux-kernel, Mike Kravetz

On 2 Mar 2021, at 8:22, Matthew Wilcox wrote:

> On Mon, Mar 01, 2021 at 03:26:11PM -0500, Zi Yan wrote:
>>> +static inline struct folio *next_folio(struct folio *folio)
>>> +{
>>> +	return folio + folio_nr_pages(folio);
>>
>> Are you planning to make hugetlb use folio too?
>>
>> If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
>> with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
>> See the experiment I did in [1].
>
> Actually, how about proofing this against a future change?
>
> static inline struct folio *next_folio(struct folio *folio)
> {
> #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> 	pfn_t next_pfn = page_to_pfn(&folio->page) + folio_nr_pages(folio);
> 	return (struct folio *)pfn_to_page(next_pfn);
> #else
> 	return folio + folio_nr_pages(folio);
> #endif
> }
>
> (not compiled)

Yes, it should work. A better version might check the folio order first
in the top half: if the order is >= MAX_ORDER, use the complicated code,
otherwise just folio + folio_nr_pages(folio).

This CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP is really not friendly
to >=MAX_ORDER pages. Most likely I am going to make 1GB THP
rely on CONFIG_SPARSEMEM_VMEMMAP to avoid complicated code.

—
Best Regards,
Yan Zi

^ permalink raw reply	[flat|nested] 37+ messages in thread
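
A sketch of the variant Zi Yan describes, taking the pfn-based path only
for folios of order >= MAX_ORDER (untested; it reuses the corrected pfn
arithmetic from the note above):

static inline struct folio *next_folio(struct folio *folio)
{
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
	/* With classic sparsemem the memmap is only virtually contiguous
	 * within a section, so cross-section folios must go via the pfn.
	 */
	if (folio_order(folio) >= MAX_ORDER) {
		unsigned long next_pfn = page_to_pfn(&folio->page) +
					 folio_nr_pages(folio);

		return (struct folio *)pfn_to_page(next_pfn);
	}
#endif
	return folio + folio_nr_pages(folio);
}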

end of thread, other threads:[~2021-03-02 20:41 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-28  7:03 [PATCH v3 00/25] Page folios Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 01/25] mm: Introduce struct folio Matthew Wilcox (Oracle)
2021-03-01 20:26   ` Zi Yan
2021-03-01 20:53     ` Matthew Wilcox
2021-03-01 21:03       ` Zi Yan
2021-03-02 13:22     ` Matthew Wilcox
2021-03-02 17:47       ` Zi Yan
2021-01-28  7:03 ` [PATCH v3 02/25] mm: Add folio_pgdat Matthew Wilcox (Oracle)
2021-03-01 21:05   ` Zi Yan
2021-01-28  7:03 ` [PATCH v3 03/25] mm/vmstat: Add folio stat wrappers Matthew Wilcox (Oracle)
2021-03-01 21:17   ` Zi Yan
2021-03-01 22:15     ` Matthew Wilcox
2021-01-28  7:03 ` [PATCH v3 04/25] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
2021-03-01 21:25   ` Zi Yan
2021-01-28  7:03 ` [PATCH v3 05/25] mm: Add put_folio Matthew Wilcox (Oracle)
2021-03-01 21:41   ` Zi Yan
2021-01-28  7:03 ` [PATCH v3 06/25] mm: Add get_folio Matthew Wilcox (Oracle)
2021-03-01 21:45   ` Zi Yan
2021-01-28  7:03 ` [PATCH v3 07/25] mm: Create FolioFlags Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 08/25] mm: Handle per-folio private data Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 09/25] mm: Add folio_index, folio_page and folio_contains Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 10/25] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 11/25] mm/memcg: Add folio_memcg, lock_folio_memcg and unlock_folio_memcg Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 12/25] mm/memcg: Add mem_cgroup_folio_lruvec Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 13/25] mm: Add unlock_folio Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 14/25] mm: Add lock_folio Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 15/25] mm: Add lock_folio_killable Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 16/25] mm: Convert lock_page_async to lock_folio_async Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 17/25] mm/filemap: Convert end_page_writeback to end_folio_writeback Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 18/25] mm: Convert wait_on_page_bit to wait_on_folio_bit Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 19/25] mm: Add wait_for_stable_folio and wait_on_folio_writeback Matthew Wilcox (Oracle)
2021-01-28  7:03 ` [PATCH v3 20/25] mm: Add wait_on_folio_locked & wait_on_folio_locked_killable Matthew Wilcox (Oracle)
2021-01-28  7:04 ` [PATCH v3 21/25] mm: Convert lock_page_or_retry to lock_folio_or_retry Matthew Wilcox (Oracle)
2021-01-28  7:04 ` [PATCH v3 22/25] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit Matthew Wilcox (Oracle)
2021-01-28  7:04 ` [PATCH v3 23/25] mm: Convert test_clear_page_writeback to test_clear_folio_writeback Matthew Wilcox (Oracle)
2021-01-28  7:04 ` [PATCH v3 24/25] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
2021-01-28  7:04 ` [PATCH v3 25/25] cachefiles: Switch to wait_page_key Matthew Wilcox (Oracle)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).