linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8.1 00/31] Memory Folios
@ 2021-04-30 18:07 Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 01/31] mm: Introduce struct folio Matthew Wilcox (Oracle)
                   ` (31 more replies)
  0 siblings, 32 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm; +Cc: Matthew Wilcox (Oracle)

Managing memory in 4KiB pages is a serious overhead.  Many benchmarks
benefit from a larger "page size".  As an example, an earlier iteration
of this idea which used compound pages (and wasn't particularly tuned)
got a 7% performance boost when compiling the kernel.

Using compound pages or THPs exposes a serious weakness in our type
system.  Functions are often unprepared for compound pages to be passed
to them, and may only act on PAGE_SIZE chunks.  Even functions which are
aware of compound pages may expect a head page, and do the wrong thing
if passed a tail page.

There have been efforts to label function parameters as 'head' instead
of 'page' to indicate that the function expects a head page, but this
leaves us with runtime assertions instead of using the compiler to prove
that nobody has mistakenly passed a tail page.  Calling a struct page
'head' is also inaccurate as they will work perfectly well on base pages.

We also waste a lot of instructions ensuring that we're not looking at
a tail page.  Almost every call to PageFoo() contains one or more hidden
calls to compound_head().  This also happens for get_page(), put_page()
and many more functions.  There does not appear to be a way to tell gcc
that it can cache the result of compound_head(), nor is there a way to
tell it that compound_head() is idempotent.

This series introduces the 'struct folio' as a replacement for
head-or-base pages.  This initial set reduces the kernel size by
approximately 6kB by removing conversions from tail pages to head pages.
The real purpose of this series is adding infrastructure to enable
further use of the folio.

The medium-term goal is to convert all filesystems and some device
drivers to work in terms of folios.  This series contains a lot of
explicit conversions, but it's important to realise it's removing a lot
of implicit conversions in some relatively hot paths.  There will be very
few conversions from folios when this work is completed; filesystems,
the page cache, the LRU and so on will generally only deal with folios.

The text size reduces by between 6kB (a config based on Oracle UEK)
and 1.2kB (allnoconfig).  Performance seems almost unaffected based
on kernbench.

Current tree at:
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio

(contains another ~120 patches on top of this batch, not all of which are
in good shape for submission)

v8.1:
 - Rebase on next-20210430
 - You need https://lore.kernel.org/linux-mm/20210430145549.2662354-1-willy@infradead.org/ first
 - Big renaming (thanks to peterz):
   - PageFoo() becomes folio_foo()
   - SetFolioFoo() becomes folio_set_foo()
   - ClearFolioFoo() becomes folio_clear_foo()
   - __SetFolioFoo() becomes __folio_set_foo()
   - __ClearFolioFoo() becomes __folio_clear_foo()
   - TestSetPageFoo() becomes folio_test_set_foo()
   - TestClearPageFoo() becomes folio_test_clear_foo()
   - PageHuge() is now folio_hugetlb()
   - put_folio() becomes folio_put()
   - get_folio() becomes folio_get()
   - put_folio_testzero() becomes folio_put_testzero()
   - set_folio_count() becomes folio_set_count()
   - attach_folio_private() becomes folio_attach_private()
   - detach_folio_private() becomes folio_detach_private()
   - lock_folio() becomes folio_lock()
   - unlock_folio() becomes folio_unlock()
   - trylock_folio() becomes folio_trylock()
   - __lock_folio_or_retry becomes __folio_lock_or_retry()
   - __lock_folio_async() becomes __folio_lock_async()
   - wake_up_folio_bit() becomes folio_wake_bit()
   - wake_up_folio() becomes folio_wake()
   - wait_on_folio_bit() becomes folio_wait_bit()
   - wait_for_stable_folio() becomes folio_wait_stable()
   - wait_on_folio() becomes folio_wait()
   - wait_on_folio_locked() becomes folio_wait_locked()
   - wait_on_folio_writeback() becomes folio_wait_writeback()
   - end_folio_writeback() becomes folio_end_writeback()
   - add_folio_wait_queue() becomes folio_add_wait_queue()
 - Add folio_young() and folio_idle() family of functions
 - Move page_folio() to page-flags.h and use _compound_head()
 - Make page_folio() const-preserving
 - Add folio_page() to get the nth page from a folio
 - Improve struct folio kernel-doc
 - Convert folio flag tests to return bool instead of int
 - Eliminate set_folio_private()
 - folio_get_private() is the equivalent of page_private() (as folio_private()
   is now a test for whether the private flag is set on the folio)
 - Move folio_rotate_reclaimable() into this patchset
 - Add page-flags.h to the kernel-doc
 - Add netfs.h to the kernel-doc
 - Add a family of folio_lock_lruvec() wrappers
 - Add a family of folio_relock_lruvec() wrappers

v7:
https://lore.kernel.org/linux-mm/20210409185105.188284-1-willy@infradead.org/

Matthew Wilcox (Oracle) (31):
  mm: Introduce struct folio
  mm: Add folio_pgdat and folio_zone
  mm/vmstat: Add functions to account folio statistics
  mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  mm: Add folio reference count functions
  mm: Add folio_put
  mm: Add folio_get
  mm: Add folio flag manipulation functions
  mm: Add folio_young() and folio_idle()
  mm: Handle per-folio private data
  mm/filemap: Add folio_index, folio_file_page and folio_contains
  mm/filemap: Add folio_next_index
  mm/filemap: Add folio_offset and folio_file_offset
  mm/util: Add folio_mapping and folio_file_mapping
  mm: Add folio_mapcount
  mm/memcg: Add folio wrappers for various functions
  mm/filemap: Add folio_unlock
  mm/filemap: Add folio_lock
  mm/filemap: Add folio_lock_killable
  mm/filemap: Add __folio_lock_async
  mm/filemap: Add __folio_lock_or_retry
  mm/filemap: Add folio_wait_locked
  mm/swap: Add folio_rotate_reclaimable
  mm/filemap: Add folio_end_writeback
  mm/writeback: Add folio_wait_writeback
  mm/writeback: Add folio_wait_stable
  mm/filemap: Add folio_wait_bit
  mm/filemap: Add folio_wake_bit
  mm/filemap: Convert page wait queues to be folios
  mm/filemap: Add folio private_2 functions
  fs/netfs: Add folio fscache functions

 Documentation/core-api/mm-api.rst           |   4 +
 Documentation/filesystems/netfs_library.rst |   2 +
 fs/afs/write.c                              |   9 +-
 fs/cachefiles/rdwr.c                        |  16 +-
 fs/io_uring.c                               |   2 +-
 include/linux/memcontrol.h                  |  58 ++++
 include/linux/mm.h                          | 173 ++++++++++--
 include/linux/mm_types.h                    |  71 +++++
 include/linux/mmdebug.h                     |  20 ++
 include/linux/netfs.h                       |  77 +++--
 include/linux/page-flags.h                  | 222 +++++++++++----
 include/linux/page_idle.h                   |  99 ++++---
 include/linux/page_ref.h                    |  88 +++++-
 include/linux/pagemap.h                     | 276 +++++++++++++-----
 include/linux/swap.h                        |   7 +-
 include/linux/vmstat.h                      | 107 +++++++
 mm/Makefile                                 |   2 +-
 mm/filemap.c                                | 295 ++++++++++----------
 mm/folio-compat.c                           |  37 +++
 mm/internal.h                               |   1 +
 mm/memory.c                                 |   8 +-
 mm/page-writeback.c                         |  72 +++--
 mm/page_io.c                                |   4 +-
 mm/swap.c                                   |  18 +-
 mm/swapfile.c                               |   8 +-
 mm/util.c                                   |  30 +-
 26 files changed, 1247 insertions(+), 459 deletions(-)
 create mode 100644 mm/folio-compat.c

-- 
2.30.2

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v8 01/31] mm: Introduce struct folio
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 02/31] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm; +Cc: Matthew Wilcox (Oracle), Jeff Layton

A struct folio is a new abstraction to replace the venerable struct page.
A function which takes a struct folio argument declares that it will
operate on the entire (possibly compound) page, not just PAGE_SIZE bytes.
In return, the caller guarantees that the pointer it is passing does
not point to a tail page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/core-api/mm-api.rst |  1 +
 include/linux/mm.h                | 74 +++++++++++++++++++++++++++++++
 include/linux/mm_types.h          | 60 +++++++++++++++++++++++++
 include/linux/page-flags.h        | 27 +++++++++++
 4 files changed, 162 insertions(+)

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index 34f46df91a8b..cbf5858dadac 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -95,5 +95,6 @@ More Memory Management Functions
 .. kernel-doc:: mm/mempolicy.c
 .. kernel-doc:: include/linux/mm_types.h
    :internal:
+.. kernel-doc:: include/linux/page-flags.h
 .. kernel-doc:: include/linux/mm.h
    :internal:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2327f99b121f..b29c86824e6b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -950,6 +950,20 @@ static inline unsigned int compound_order(struct page *page)
 	return page[1].compound_order;
 }
 
+/**
+ * folio_order - The allocation order of a folio.
+ * @folio: The folio.
+ *
+ * A folio is composed of 2^order pages.  See get_order() for the definition
+ * of order.
+ *
+ * Return: The order of the folio.
+ */
+static inline unsigned int folio_order(struct folio *folio)
+{
+	return compound_order(&folio->page);
+}
+
 static inline bool hpage_pincount_available(struct page *page)
 {
 	/*
@@ -1595,6 +1609,65 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
 #endif
 }
 
+/**
+ * folio_nr_pages - The number of pages in the folio.
+ * @folio: The folio.
+ *
+ * Return: A number which is a power of two.
+ */
+static inline unsigned long folio_nr_pages(struct folio *folio)
+{
+	return compound_nr(&folio->page);
+}
+
+/**
+ * folio_next - Move to the next physical folio.
+ * @folio: The folio we're currently operating on.
+ *
+ * If you have physically contiguous memory which may span more than
+ * one folio (eg a &struct bio_vec), use this function to move from one
+ * folio to the next.  Do not use it if the memory is only virtually
+ * contiguous as the folios are almost certainly not adjacent to each
+ * other.  This is the folio equivalent to writing ``page++``.
+ *
+ * Context: We assume that the folios are refcounted and/or locked at a
+ * higher level and do not adjust the reference counts.
+ * Return: The next struct folio.
+ */
+static inline struct folio *folio_next(struct folio *folio)
+{
+	return (struct folio *)folio_page(folio, folio_nr_pages(folio));
+}
+
+/**
+ * folio_shift - The number of bits covered by this folio.
+ * @folio: The folio.
+ *
+ * A folio contains a number of bytes which is a power-of-two in size.
+ * This function tells you which power-of-two the folio is.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split.  It is not necessary for the folio to be locked.
+ * Return: The base-2 logarithm of the size of this folio.
+ */
+static inline unsigned int folio_shift(struct folio *folio)
+{
+	return PAGE_SHIFT + folio_order(folio);
+}
+
+/**
+ * folio_size - The number of bytes in a folio.
+ * @folio: The folio.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split.  It is not necessary for the folio to be locked.
+ * Return: The number of bytes in this folio.
+ */
+static inline size_t folio_size(struct folio *folio)
+{
+	return PAGE_SIZE << folio_order(folio);
+}
+
 /*
  * Some inline functions in vmstat.h depend on page_zone()
  */
@@ -1699,6 +1772,7 @@ extern void pagefault_out_of_memory(void);
 
 #define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)
 #define offset_in_thp(page, p)	((unsigned long)(p) & (thp_size(page) - 1))
+#define offset_in_folio(folio, p) ((unsigned long)(p) & (folio_size(folio) - 1))
 
 /*
  * Flags passed to show_mem() and show_free_areas() to suppress output in
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5aacc1c10a45..276e358c75d3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -224,6 +224,66 @@ struct page {
 #endif
 } _struct_page_alignment;
 
+/**
+ * struct folio - Represents a contiguous set of bytes.
+ * @flags: Identical to the page flags.
+ * @lru: Least Recently Used list; tracks how recently this folio was used.
+ * @mapping: The file this page belongs to, or refers to the anon_vma for
+ *    anonymous pages.
+ * @index: Offset within the file, in units of pages.  For anonymous pages,
+ *    this is the index from the beginning of the mmap.
+ * @private: Filesystem per-folio data (see folio_attach_private()).
+ *    Used for swp_entry_t if folio_swapcache().
+ * @_mapcount: Do not access this member directly.  Use folio_mapcount() to
+ *    find out how many times this folio is mapped by userspace.
+ * @_refcount: Do not access this member directly.  Use folio_ref_count()
+ *    to find how many references there are to this folio.
+ * @memcg_data: Memory Control Group data.
+ *
+ * A folio is a physically, virtually and logically contiguous set
+ * of bytes.  It is a power-of-two in size, and it is aligned to that
+ * same power-of-two.  It is at least as large as %PAGE_SIZE.  If it is
+ * in the page cache, it is at a file offset which is a multiple of that
+ * power-of-two.  It may be mapped into userspace at an address which is
+ * at an arbitrary page offset, but its kernel virtual address is aligned
+ * to its size.
+ */
+struct folio {
+	/* private: don't document the anon union */
+	union {
+		struct {
+	/* public: */
+			unsigned long flags;
+			struct list_head lru;
+			struct address_space *mapping;
+			pgoff_t index;
+			unsigned long private;
+			atomic_t _mapcount;
+			atomic_t _refcount;
+#ifdef CONFIG_MEMCG
+			unsigned long memcg_data;
+#endif
+	/* private: the union with struct page is transitional */
+		};
+		struct page page;
+	};
+};
+
+static_assert(sizeof(struct page) == sizeof(struct folio));
+#define FOLIO_MATCH(pg, fl)						\
+	static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
+FOLIO_MATCH(flags, flags);
+FOLIO_MATCH(lru, lru);
+FOLIO_MATCH(compound_head, lru);
+FOLIO_MATCH(index, index);
+FOLIO_MATCH(private, private);
+FOLIO_MATCH(_mapcount, _mapcount);
+FOLIO_MATCH(_refcount, _refcount);
+#ifdef CONFIG_MEMCG
+FOLIO_MATCH(memcg_data, memcg_data);
+#endif
+#undef FOLIO_MATCH
+
 static inline atomic_t *compound_mapcount_ptr(struct page *page)
 {
 	return &page[1].compound_mapcount;
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index d8e26243db25..e069aa8b11b7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -188,6 +188,33 @@ static inline unsigned long _compound_head(const struct page *page)
 
 #define compound_head(page)	((typeof(page))_compound_head(page))
 
+/**
+ * page_folio - Converts from page to folio.
+ * @p: The page.
+ *
+ * Every page is part of a folio.  This function cannot be called on a
+ * NULL pointer.
+ *
+ * Context: No reference, nor lock is required on @page.  If the caller
+ * does not hold a reference, this call may race with a folio split, so
+ * it should re-check the folio still contains this page after gaining
+ * a reference on the folio.
+ * Return: The folio which contains this page.
+ */
+#define page_folio(p)		(_Generic((p),				\
+	const struct page *:	(const struct folio *)_compound_head(p), \
+	struct page *:		(struct folio *)_compound_head(p)))
+
+/**
+ * folio_page - Return a page from a folio.
+ * @folio: The folio.
+ * @n: The page number to return.
+ *
+ * @n is relative to the start of the folio.  It should be between
+ * 0 and folio_nr_pages(@folio) - 1, but this is not checked for.
+ */
+#define folio_page(folio, n)	nth_page(&(folio)->page, n)
+
 static __always_inline int PageTail(struct page *page)
 {
 	return READ_ONCE(page->compound_head) & 1;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 02/31] mm: Add folio_pgdat and folio_zone
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 01/31] mm: Introduce struct folio Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 03/31] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Zi Yan, Christoph Hellwig, Jeff Layton

These are just convenience wrappers for callers with folios; pgdat and
zone can be reached from tail pages as well as head pages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b29c86824e6b..a55c2c0628b6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1560,6 +1560,16 @@ static inline pg_data_t *page_pgdat(const struct page *page)
 	return NODE_DATA(page_to_nid(page));
 }
 
+static inline struct zone *folio_zone(const struct folio *folio)
+{
+	return page_zone(&folio->page);
+}
+
+static inline pg_data_t *folio_pgdat(const struct folio *folio)
+{
+	return page_pgdat(&folio->page);
+}
+
 #ifdef SECTION_IN_PAGE_FLAGS
 static inline void set_page_section(struct page *page, unsigned long section)
 {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 03/31] mm/vmstat: Add functions to account folio statistics
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 01/31] mm: Introduce struct folio Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 02/31] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 04/31] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Allow page counters to be more readily modified by callers which have
a folio.  Name these wrappers with 'stat' instead of 'state' as requested
by Linus here:
https://lore.kernel.org/linux-mm/CAHk-=wj847SudR-kt+46fT3+xFFgiwpgThvm7DJWGdi4cVrbnQ@mail.gmail.com/

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/vmstat.h | 107 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 3299cd69e4ca..d287d7c31b8f 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -402,6 +402,78 @@ static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
 #endif		/* CONFIG_SMP */
 
+static inline void __zone_stat_mod_folio(struct folio *folio,
+		enum zone_stat_item item, long nr)
+{
+	__mod_zone_page_state(folio_zone(folio), item, nr);
+}
+
+static inline void __zone_stat_add_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	__mod_zone_page_state(folio_zone(folio), item, folio_nr_pages(folio));
+}
+
+static inline void __zone_stat_sub_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	__mod_zone_page_state(folio_zone(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void zone_stat_mod_folio(struct folio *folio,
+		enum zone_stat_item item, long nr)
+{
+	mod_zone_page_state(folio_zone(folio), item, nr);
+}
+
+static inline void zone_stat_add_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	mod_zone_page_state(folio_zone(folio), item, folio_nr_pages(folio));
+}
+
+static inline void zone_stat_sub_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	mod_zone_page_state(folio_zone(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void __node_stat_mod_folio(struct folio *folio,
+		enum node_stat_item item, long nr)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, nr);
+}
+
+static inline void __node_stat_add_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, folio_nr_pages(folio));
+}
+
+static inline void __node_stat_sub_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void node_stat_mod_folio(struct folio *folio,
+		enum node_stat_item item, long nr)
+{
+	mod_node_page_state(folio_pgdat(folio), item, nr);
+}
+
+static inline void node_stat_add_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	mod_node_page_state(folio_pgdat(folio), item, folio_nr_pages(folio));
+}
+
+static inline void node_stat_sub_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
+}
+
 static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pages,
 					     int migratetype)
 {
@@ -530,6 +602,24 @@ static inline void __dec_lruvec_page_state(struct page *page,
 	__mod_lruvec_page_state(page, idx, -1);
 }
 
+static inline void __lruvec_stat_mod_folio(struct folio *folio,
+					   enum node_stat_item idx, int val)
+{
+	__mod_lruvec_page_state(&folio->page, idx, val);
+}
+
+static inline void __lruvec_stat_add_folio(struct folio *folio,
+					   enum node_stat_item idx)
+{
+	__lruvec_stat_mod_folio(folio, idx, folio_nr_pages(folio));
+}
+
+static inline void __lruvec_stat_sub_folio(struct folio *folio,
+					   enum node_stat_item idx)
+{
+	__lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
+}
+
 static inline void inc_lruvec_page_state(struct page *page,
 					 enum node_stat_item idx)
 {
@@ -542,4 +632,21 @@ static inline void dec_lruvec_page_state(struct page *page,
 	mod_lruvec_page_state(page, idx, -1);
 }
 
+static inline void lruvec_stat_mod_folio(struct folio *folio,
+					 enum node_stat_item idx, int val)
+{
+	mod_lruvec_page_state(&folio->page, idx, val);
+}
+
+static inline void lruvec_stat_add_folio(struct folio *folio,
+					 enum node_stat_item idx)
+{
+	lruvec_stat_mod_folio(folio, idx, folio_nr_pages(folio));
+}
+
+static inline void lruvec_stat_sub_folio(struct folio *folio,
+					 enum node_stat_item idx)
+{
+	lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
+}
 #endif /* _LINUX_VMSTAT_H */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 04/31] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 03/31] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 05/31] mm: Add folio reference count functions Matthew Wilcox (Oracle)
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Zi Yan, Christoph Hellwig, Jeff Layton

These are the folio equivalents of VM_BUG_ON_PAGE and VM_WARN_ON_ONCE_PAGE.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mmdebug.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
index 1935d4c72d10..d7285f8148a3 100644
--- a/include/linux/mmdebug.h
+++ b/include/linux/mmdebug.h
@@ -22,6 +22,13 @@ void dump_mm(const struct mm_struct *mm);
 			BUG();						\
 		}							\
 	} while (0)
+#define VM_BUG_ON_FOLIO(cond, folio)					\
+	do {								\
+		if (unlikely(cond)) {					\
+			dump_page(&folio->page, "VM_BUG_ON_FOLIO(" __stringify(cond)")");\
+			BUG();						\
+		}							\
+	} while (0)
 #define VM_BUG_ON_VMA(cond, vma)					\
 	do {								\
 		if (unlikely(cond)) {					\
@@ -47,6 +54,17 @@ void dump_mm(const struct mm_struct *mm);
 	}								\
 	unlikely(__ret_warn_once);					\
 })
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio)	({			\
+	static bool __section(".data.once") __warned;			\
+	int __ret_warn_once = !!(cond);					\
+									\
+	if (unlikely(__ret_warn_once && !__warned)) {			\
+		dump_page(&folio->page, "VM_WARN_ON_ONCE_FOLIO(" __stringify(cond)")");\
+		__warned = true;					\
+		WARN_ON(1);						\
+	}								\
+	unlikely(__ret_warn_once);					\
+})
 
 #define VM_WARN_ON(cond) (void)WARN_ON(cond)
 #define VM_WARN_ON_ONCE(cond) (void)WARN_ON_ONCE(cond)
@@ -55,11 +73,13 @@ void dump_mm(const struct mm_struct *mm);
 #else
 #define VM_BUG_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_BUG_ON_PAGE(cond, page) VM_BUG_ON(cond)
+#define VM_BUG_ON_FOLIO(cond, folio) VM_BUG_ON(cond)
 #define VM_BUG_ON_VMA(cond, vma) VM_BUG_ON(cond)
 #define VM_BUG_ON_MM(cond, mm) VM_BUG_ON(cond)
 #define VM_WARN_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ONCE(cond, format...) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN(cond, format...) BUILD_BUG_ON_INVALID(cond)
 #endif
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 05/31] mm: Add folio reference count functions
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 04/31] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 06/31] mm: Add folio_put Matthew Wilcox (Oracle)
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

These functions mirror their page reference counterparts.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/core-api/mm-api.rst |  1 +
 include/linux/page_ref.h          | 88 ++++++++++++++++++++++++++++++-
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index cbf5858dadac..88545b2dce98 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -98,3 +98,4 @@ More Memory Management Functions
 .. kernel-doc:: include/linux/page-flags.h
 .. kernel-doc:: include/linux/mm.h
    :internal:
+.. kernel-doc:: include/linux/page_ref.h
diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 7ad46f45df39..85816b2c0496 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -67,9 +67,31 @@ static inline int page_ref_count(const struct page *page)
 	return atomic_read(&page->_refcount);
 }
 
+/**
+ * folio_ref_count - The reference count on this folio.
+ * @folio: The folio.
+ *
+ * The refcount is usually incremented by calls to folio_get() and
+ * decremented by calls to folio_put().  Some typical users of the
+ * folio refcount:
+ *
+ * - Each reference from a page table
+ * - The page cache
+ * - Filesystem private data
+ * - The LRU list
+ * - Pipes
+ * - Direct IO which references this page in the process address space
+ *
+ * Return: The number of references to this folio.
+ */
+static inline int folio_ref_count(const struct folio *folio)
+{
+	return page_ref_count(&folio->page);
+}
+
 static inline int page_count(const struct page *page)
 {
-	return atomic_read(&compound_head(page)->_refcount);
+	return folio_ref_count(page_folio(page));
 }
 
 static inline void set_page_count(struct page *page, int v)
@@ -79,6 +101,11 @@ static inline void set_page_count(struct page *page, int v)
 		__page_ref_set(page, v);
 }
 
+static inline void folio_set_count(struct folio *folio, int v)
+{
+	set_page_count(&folio->page, v);
+}
+
 /*
  * Setup the page count before being freed into the page allocator for
  * the first time (boot or memory hotplug)
@@ -95,6 +122,11 @@ static inline void page_ref_add(struct page *page, int nr)
 		__page_ref_mod(page, nr);
 }
 
+static inline void folio_ref_add(struct folio *folio, int nr)
+{
+	page_ref_add(&folio->page, nr);
+}
+
 static inline void page_ref_sub(struct page *page, int nr)
 {
 	atomic_sub(nr, &page->_refcount);
@@ -102,6 +134,11 @@ static inline void page_ref_sub(struct page *page, int nr)
 		__page_ref_mod(page, -nr);
 }
 
+static inline void folio_ref_sub(struct folio *folio, int nr)
+{
+	page_ref_sub(&folio->page, nr);
+}
+
 static inline int page_ref_sub_return(struct page *page, int nr)
 {
 	int ret = atomic_sub_return(nr, &page->_refcount);
@@ -111,6 +148,11 @@ static inline int page_ref_sub_return(struct page *page, int nr)
 	return ret;
 }
 
+static inline int folio_ref_sub_return(struct folio *folio, int nr)
+{
+	return page_ref_sub_return(&folio->page, nr);
+}
+
 static inline void page_ref_inc(struct page *page)
 {
 	atomic_inc(&page->_refcount);
@@ -118,6 +160,11 @@ static inline void page_ref_inc(struct page *page)
 		__page_ref_mod(page, 1);
 }
 
+static inline void folio_ref_inc(struct folio *folio)
+{
+	page_ref_inc(&folio->page);
+}
+
 static inline void page_ref_dec(struct page *page)
 {
 	atomic_dec(&page->_refcount);
@@ -125,6 +172,11 @@ static inline void page_ref_dec(struct page *page)
 		__page_ref_mod(page, -1);
 }
 
+static inline void folio_ref_dec(struct folio *folio)
+{
+	page_ref_dec(&folio->page);
+}
+
 static inline int page_ref_sub_and_test(struct page *page, int nr)
 {
 	int ret = atomic_sub_and_test(nr, &page->_refcount);
@@ -134,6 +186,11 @@ static inline int page_ref_sub_and_test(struct page *page, int nr)
 	return ret;
 }
 
+static inline int folio_ref_sub_and_test(struct folio *folio, int nr)
+{
+	return page_ref_sub_and_test(&folio->page, nr);
+}
+
 static inline int page_ref_inc_return(struct page *page)
 {
 	int ret = atomic_inc_return(&page->_refcount);
@@ -143,6 +200,11 @@ static inline int page_ref_inc_return(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_inc_return(struct folio *folio)
+{
+	return page_ref_inc_return(&folio->page);
+}
+
 static inline int page_ref_dec_and_test(struct page *page)
 {
 	int ret = atomic_dec_and_test(&page->_refcount);
@@ -152,6 +214,11 @@ static inline int page_ref_dec_and_test(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_dec_and_test(struct folio *folio)
+{
+	return page_ref_dec_and_test(&folio->page);
+}
+
 static inline int page_ref_dec_return(struct page *page)
 {
 	int ret = atomic_dec_return(&page->_refcount);
@@ -161,6 +228,11 @@ static inline int page_ref_dec_return(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_dec_return(struct folio *folio)
+{
+	return page_ref_dec_return(&folio->page);
+}
+
 static inline int page_ref_add_unless(struct page *page, int nr, int u)
 {
 	int ret = atomic_add_unless(&page->_refcount, nr, u);
@@ -170,6 +242,11 @@ static inline int page_ref_add_unless(struct page *page, int nr, int u)
 	return ret;
 }
 
+static inline int folio_ref_add_unless(struct folio *folio, int nr, int u)
+{
+	return page_ref_add_unless(&folio->page, nr, u);
+}
+
 static inline int page_ref_freeze(struct page *page, int count)
 {
 	int ret = likely(atomic_cmpxchg(&page->_refcount, count, 0) == count);
@@ -179,6 +256,11 @@ static inline int page_ref_freeze(struct page *page, int count)
 	return ret;
 }
 
+static inline int folio_ref_freeze(struct folio *folio, int count)
+{
+	return page_ref_freeze(&folio->page, count);
+}
+
 static inline void page_ref_unfreeze(struct page *page, int count)
 {
 	VM_BUG_ON_PAGE(page_count(page) != 0, page);
@@ -189,4 +271,8 @@ static inline void page_ref_unfreeze(struct page *page, int count)
 		__page_ref_unfreeze(page, count);
 }
 
+static inline void folio_ref_unfreeze(struct folio *folio, int count)
+{
+	page_ref_unfreeze(&folio->page, count);
+}
 #endif
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 06/31] mm: Add folio_put
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 05/31] mm: Add folio reference count functions Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 07/31] mm: Add folio_get Matthew Wilcox (Oracle)
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Zi Yan, Christoph Hellwig, Jeff Layton

If we know we have a folio, we can call folio_put() instead of put_page()
and save the overhead of calling compound_head().  Also skips the
devmap checks.

This commit looks like it should be a no-op, but actually saves 1312 bytes
of text with the distro-derived config that I'm testing.  Some functions
grow a little while others shrink.  I presume the compiler is making
different inlining decisions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a55c2c0628b6..610948f0cb43 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -751,6 +751,11 @@ static inline int put_page_testzero(struct page *page)
 	return page_ref_dec_and_test(page);
 }
 
+static inline int folio_put_testzero(struct folio *folio)
+{
+	return put_page_testzero(&folio->page);
+}
+
 /*
  * Try to grab a ref unless the page has a refcount of zero, return false if
  * that is the case.
@@ -1242,9 +1247,28 @@ static inline __must_check bool try_get_page(struct page *page)
 	return true;
 }
 
+/**
+ * folio_put - Decrement the reference count on a folio.
+ * @folio: The folio.
+ *
+ * If the folio's reference count reaches zero, the memory will be
+ * released back to the page allocator and may be used by another
+ * allocation immediately.  Do not access the memory or the struct folio
+ * after calling folio_put() unless you can be sure that it wasn't the
+ * last reference.
+ *
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context.  May be called while holding a spinlock.
+ */
+static inline void folio_put(struct folio *folio)
+{
+	if (folio_put_testzero(folio))
+		__put_page(&folio->page);
+}
+
 static inline void put_page(struct page *page)
 {
-	page = compound_head(page);
+	struct folio *folio = page_folio(page);
 
 	/*
 	 * For devmap managed pages we need to catch refcount transition from
@@ -1252,13 +1276,12 @@ static inline void put_page(struct page *page)
 	 * need to inform the device driver through callback. See
 	 * include/linux/memremap.h and HMM for details.
 	 */
-	if (page_is_devmap_managed(page)) {
-		put_devmap_managed_page(page);
+	if (page_is_devmap_managed(&folio->page)) {
+		put_devmap_managed_page(&folio->page);
 		return;
 	}
 
-	if (put_page_testzero(page))
-		__put_page(page);
+	folio_put(folio);
 }
 
 /*
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 07/31] mm: Add folio_get
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 06/31] mm: Add folio_put Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 08/31] mm: Add folio flag manipulation functions Matthew Wilcox (Oracle)
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Zi Yan, Christoph Hellwig, Jeff Layton

If we know we have a folio, we can call folio_get() instead
of get_page() and save the overhead of calling compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 610948f0cb43..b133734a7530 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1219,18 +1219,26 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
 }
 
 /* 127: arbitrary random number, small enough to assemble well */
-#define page_ref_zero_or_close_to_overflow(page) \
-	((unsigned int) page_ref_count(page) + 127u <= 127u)
+#define folio_ref_zero_or_close_to_overflow(folio) \
+	((unsigned int) folio_ref_count(folio) + 127u <= 127u)
+
+/**
+ * folio_get - Increment the reference count on a folio.
+ * @folio: The folio.
+ *
+ * Context: May be called in any context, as long as you know that
+ * you have a refcount on the folio.  If you do not already have one,
+ * try_grab_page() may be the right interface for you to use.
+ */
+static inline void folio_get(struct folio *folio)
+{
+	VM_BUG_ON_FOLIO(folio_ref_zero_or_close_to_overflow(folio), folio);
+	folio_ref_inc(folio);
+}
 
 static inline void get_page(struct page *page)
 {
-	page = compound_head(page);
-	/*
-	 * Getting a normal page or the head of a compound page
-	 * requires to already have an elevated page->_refcount.
-	 */
-	VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page);
-	page_ref_inc(page);
+	folio_get(page_folio(page));
 }
 
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 08/31] mm: Add folio flag manipulation functions
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 07/31] mm: Add folio_get Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 09/31] mm: Add folio_young() and folio_idle() Matthew Wilcox (Oracle)
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

These new functions are the folio analogues of the various PageFlags
functions.  If CONFIG_DEBUG_VM_PGFLAGS is enabled, we check the folio
is not a tail page at every invocation.  This will also catch the
PagePoisoned case as a poisoned page has every bit set, which would
include PageTail.

This saves 1727 bytes of text with the distro-derived config that
I'm testing due to removing a double call to compound_head() in
PageSwapCache().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/page-flags.h | 195 ++++++++++++++++++++++++++-----------
 1 file changed, 140 insertions(+), 55 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e069aa8b11b7..3f721431485e 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -140,6 +140,8 @@ enum pageflags {
 #endif
 	__NR_PAGEFLAGS,
 
+	PG_readahead = PG_reclaim,
+
 	/* Filesystems */
 	PG_checked = PG_owner_priv_1,
 
@@ -239,6 +241,15 @@ static inline void page_init_poison(struct page *page, size_t size)
 }
 #endif
 
+static unsigned long *folio_flags(struct folio *folio, unsigned n)
+{
+	struct page *page = &folio->page;
+
+	VM_BUG_ON_PGFLAGS(PageTail(page), page);
+	VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
+	return &page[n].flags;
+}
+
 /*
  * Page flags policies wrt compound pages
  *
@@ -283,34 +294,56 @@ static inline void page_init_poison(struct page *page, size_t size)
 		VM_BUG_ON_PGFLAGS(!PageHead(page), page);		\
 		PF_POISONED_CHECK(&page[1]); })
 
+/* Which page is the flag stored in */
+#define FOLIO_PF_ANY		0
+#define FOLIO_PF_HEAD		0
+#define FOLIO_PF_ONLY_HEAD	0
+#define FOLIO_PF_NO_TAIL	0
+#define FOLIO_PF_NO_COMPOUND	0
+#define FOLIO_PF_SECOND		1
+
 /*
  * Macros to create function definitions for page flags
  */
 #define TESTPAGEFLAG(uname, lname, policy)				\
+static __always_inline bool folio_##lname(struct folio *folio)		\
+	{ return test_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline int Page##uname(struct page *page)		\
 	{ return test_bit(PG_##lname, &policy(page, 0)->flags); }
 
 #define SETPAGEFLAG(uname, lname, policy)				\
+static __always_inline void folio_set_##lname(struct folio *folio)	\
+	{ set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
 static __always_inline void SetPage##uname(struct page *page)		\
 	{ set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define CLEARPAGEFLAG(uname, lname, policy)				\
+static __always_inline void folio_clear_##lname(struct folio *folio)	\
+	{ clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
 static __always_inline void ClearPage##uname(struct page *page)		\
 	{ clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define __SETPAGEFLAG(uname, lname, policy)				\
+static __always_inline void __folio_set_##lname(struct folio *folio)	\
+	{ __set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
 static __always_inline void __SetPage##uname(struct page *page)		\
 	{ __set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define __CLEARPAGEFLAG(uname, lname, policy)				\
+static __always_inline void __folio_clear_##lname(struct folio *folio)	\
+	{ __clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline void __ClearPage##uname(struct page *page)	\
 	{ __clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define TESTSETFLAG(uname, lname, policy)				\
+static __always_inline bool folio_test_set_##lname(struct folio *folio)	\
+	{ return test_and_set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline int TestSetPage##uname(struct page *page)	\
 	{ return test_and_set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define TESTCLEARFLAG(uname, lname, policy)				\
+static __always_inline bool folio_test_clear_##lname(struct folio *folio) \
+{ return test_and_clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline int TestClearPage##uname(struct page *page)	\
 	{ return test_and_clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
@@ -328,29 +361,35 @@ static __always_inline int TestClearPage##uname(struct page *page)	\
 	TESTSETFLAG(uname, lname, policy)				\
 	TESTCLEARFLAG(uname, lname, policy)
 
-#define TESTPAGEFLAG_FALSE(uname)					\
+#define TESTPAGEFLAG_FALSE(uname, lname)				\
+static inline bool folio_##lname(const struct folio *folio) { return 0; } \
 static inline int Page##uname(const struct page *page) { return 0; }
 
-#define SETPAGEFLAG_NOOP(uname)						\
+#define SETPAGEFLAG_NOOP(uname, lname)					\
+static inline void folio_set_##lname(struct folio *folio) { }		\
 static inline void SetPage##uname(struct page *page) {  }
 
-#define CLEARPAGEFLAG_NOOP(uname)					\
+#define CLEARPAGEFLAG_NOOP(uname, lname)				\
+static inline void folio_clear_##lname(struct folio *folio) { }		\
 static inline void ClearPage##uname(struct page *page) {  }
 
-#define __CLEARPAGEFLAG_NOOP(uname)					\
+#define __CLEARPAGEFLAG_NOOP(uname, lname)				\
+static inline void __folio_clear_##lname(struct folio *folio) { }	\
 static inline void __ClearPage##uname(struct page *page) {  }
 
-#define TESTSETFLAG_FALSE(uname)					\
+#define TESTSETFLAG_FALSE(uname, lname)					\
+static inline bool folio_test_set_##lname(struct folio *folio) { return 0; } \
 static inline int TestSetPage##uname(struct page *page) { return 0; }
 
-#define TESTCLEARFLAG_FALSE(uname)					\
+#define TESTCLEARFLAG_FALSE(uname, lname)				\
+static inline bool folio_test_clear_##lname(struct folio *folio) { return 0; } \
 static inline int TestClearPage##uname(struct page *page) { return 0; }
 
-#define PAGEFLAG_FALSE(uname) TESTPAGEFLAG_FALSE(uname)			\
-	SETPAGEFLAG_NOOP(uname) CLEARPAGEFLAG_NOOP(uname)
+#define PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname)	\
+	SETPAGEFLAG_NOOP(uname, lname) CLEARPAGEFLAG_NOOP(uname, lname)
 
-#define TESTSCFLAG_FALSE(uname)						\
-	TESTSETFLAG_FALSE(uname) TESTCLEARFLAG_FALSE(uname)
+#define TESTSCFLAG_FALSE(uname, lname)					\
+	TESTSETFLAG_FALSE(uname, lname) TESTCLEARFLAG_FALSE(uname, lname)
 
 __PAGEFLAG(Locked, locked, PF_NO_TAIL)
 PAGEFLAG(Waiters, waiters, PF_ONLY_HEAD) __CLEARPAGEFLAG(Waiters, waiters, PF_ONLY_HEAD)
@@ -406,8 +445,8 @@ PAGEFLAG(MappedToDisk, mappedtodisk, PF_NO_TAIL)
 /* PG_readahead is only used for reads; PG_reclaim is only for writes */
 PAGEFLAG(Reclaim, reclaim, PF_NO_TAIL)
 	TESTCLEARFLAG(Reclaim, reclaim, PF_NO_TAIL)
-PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
-	TESTCLEARFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+PAGEFLAG(Readahead, readahead, PF_NO_COMPOUND)
+	TESTCLEARFLAG(Readahead, readahead, PF_NO_COMPOUND)
 
 #ifdef CONFIG_HIGHMEM
 /*
@@ -416,22 +455,25 @@ PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
  */
 #define PageHighMem(__p) is_highmem_idx(page_zonenum(__p))
 #else
-PAGEFLAG_FALSE(HighMem)
+PAGEFLAG_FALSE(HighMem, highmem)
 #endif
 
 #ifdef CONFIG_SWAP
-static __always_inline int PageSwapCache(struct page *page)
+static __always_inline bool folio_swapcache(struct folio *folio)
 {
-#ifdef CONFIG_THP_SWAP
-	page = compound_head(page);
-#endif
-	return PageSwapBacked(page) && test_bit(PG_swapcache, &page->flags);
+	return folio_swapbacked(folio) &&
+			test_bit(PG_swapcache, folio_flags(folio, 0));
+}
 
+static __always_inline bool PageSwapCache(struct page *page)
+{
+	return folio_swapcache(page_folio(page));
 }
+
 SETPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 #else
-PAGEFLAG_FALSE(SwapCache)
+PAGEFLAG_FALSE(SwapCache, swapcache)
 #endif
 
 PAGEFLAG(Unevictable, unevictable, PF_HEAD)
@@ -443,14 +485,14 @@ PAGEFLAG(Mlocked, mlocked, PF_NO_TAIL)
 	__CLEARPAGEFLAG(Mlocked, mlocked, PF_NO_TAIL)
 	TESTSCFLAG(Mlocked, mlocked, PF_NO_TAIL)
 #else
-PAGEFLAG_FALSE(Mlocked) __CLEARPAGEFLAG_NOOP(Mlocked)
-	TESTSCFLAG_FALSE(Mlocked)
+PAGEFLAG_FALSE(Mlocked, mlocked) __CLEARPAGEFLAG_NOOP(Mlocked, mlocked)
+	TESTSCFLAG_FALSE(Mlocked, mlocked)
 #endif
 
 #ifdef CONFIG_ARCH_USES_PG_UNCACHED
 PAGEFLAG(Uncached, uncached, PF_NO_COMPOUND)
 #else
-PAGEFLAG_FALSE(Uncached)
+PAGEFLAG_FALSE(Uncached, uncached)
 #endif
 
 #ifdef CONFIG_MEMORY_FAILURE
@@ -459,7 +501,7 @@ TESTSCFLAG(HWPoison, hwpoison, PF_ANY)
 #define __PG_HWPOISON (1UL << PG_hwpoison)
 extern bool take_page_off_buddy(struct page *page);
 #else
-PAGEFLAG_FALSE(HWPoison)
+PAGEFLAG_FALSE(HWPoison, hwpoison)
 #define __PG_HWPOISON 0
 #endif
 
@@ -505,10 +547,14 @@ static __always_inline int PageMappingFlags(struct page *page)
 	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) != 0;
 }
 
-static __always_inline int PageAnon(struct page *page)
+static __always_inline bool folio_anon(struct folio *folio)
+{
+	return ((unsigned long)folio->mapping & PAGE_MAPPING_ANON) != 0;
+}
+
+static __always_inline bool PageAnon(struct page *page)
 {
-	page = compound_head(page);
-	return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+	return folio_anon(page_folio(page));
 }
 
 static __always_inline int __PageMovable(struct page *page)
@@ -524,30 +570,32 @@ static __always_inline int __PageMovable(struct page *page)
  * is found in VM_MERGEABLE vmas.  It's a PageAnon page, pointing not to any
  * anon_vma, but to that page's node of the stable tree.
  */
-static __always_inline int PageKsm(struct page *page)
+static __always_inline bool folio_ksm(struct folio *folio)
 {
-	page = compound_head(page);
-	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
+	return ((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) ==
 				PAGE_MAPPING_KSM;
 }
+
+static __always_inline bool PageKsm(struct page *page)
+{
+	return folio_ksm(page_folio(page));
+}
 #else
-TESTPAGEFLAG_FALSE(Ksm)
+TESTPAGEFLAG_FALSE(Ksm, ksm)
 #endif
 
 u64 stable_page_flags(struct page *page);
 
-static inline int PageUptodate(struct page *page)
+static inline bool folio_uptodate(struct folio *folio)
 {
-	int ret;
-	page = compound_head(page);
-	ret = test_bit(PG_uptodate, &(page)->flags);
+	bool ret = test_bit(PG_uptodate, folio_flags(folio, 0));
 	/*
-	 * Must ensure that the data we read out of the page is loaded
-	 * _after_ we've loaded page->flags to check for PageUptodate.
-	 * We can skip the barrier if the page is not uptodate, because
+	 * Must ensure that the data we read out of the folio is loaded
+	 * _after_ we've loaded folio->flags to check the uptodate bit.
+	 * We can skip the barrier if the folio is not uptodate, because
 	 * we wouldn't be reading anything from it.
 	 *
-	 * See SetPageUptodate() for the other side of the story.
+	 * See folio_set_uptodate() for the other side of the story.
 	 */
 	if (ret)
 		smp_rmb();
@@ -555,23 +603,36 @@ static inline int PageUptodate(struct page *page)
 	return ret;
 }
 
-static __always_inline void __SetPageUptodate(struct page *page)
+static inline int PageUptodate(struct page *page)
+{
+	return folio_uptodate(page_folio(page));
+}
+
+static __always_inline void __folio_set_uptodate(struct folio *folio)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	smp_wmb();
-	__set_bit(PG_uptodate, &page->flags);
+	__set_bit(PG_uptodate, folio_flags(folio, 0));
 }
 
-static __always_inline void SetPageUptodate(struct page *page)
+static __always_inline void folio_set_uptodate(struct folio *folio)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	/*
 	 * Memory barrier must be issued before setting the PG_uptodate bit,
-	 * so that all previous stores issued in order to bring the page
-	 * uptodate are actually visible before PageUptodate becomes true.
+	 * so that all previous stores issued in order to bring the folio
+	 * uptodate are actually visible before folio_uptodate becomes true.
 	 */
 	smp_wmb();
-	set_bit(PG_uptodate, &page->flags);
+	set_bit(PG_uptodate, folio_flags(folio, 0));
+}
+
+static __always_inline void __SetPageUptodate(struct page *page)
+{
+	__folio_set_uptodate((struct folio *)page);
+}
+
+static __always_inline void SetPageUptodate(struct page *page)
+{
+	folio_set_uptodate((struct folio *)page);
 }
 
 CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
@@ -596,6 +657,17 @@ static inline void set_page_writeback_keepwrite(struct page *page)
 
 __PAGEFLAG(Head, head, PF_ANY) CLEARPAGEFLAG(Head, head, PF_ANY)
 
+/* Whether there are one or multiple pages in a folio */
+static inline bool folio_single(struct folio *folio)
+{
+	return !folio_head(folio);
+}
+
+static inline bool folio_multi(struct folio *folio)
+{
+	return folio_head(folio);
+}
+
 static __always_inline void set_compound_head(struct page *page, struct page *head)
 {
 	WRITE_ONCE(page->compound_head, (unsigned long)head + 1);
@@ -619,12 +691,15 @@ static inline void ClearPageCompound(struct page *page)
 #ifdef CONFIG_HUGETLB_PAGE
 int PageHuge(struct page *page);
 int PageHeadHuge(struct page *page);
+static inline bool folio_hugetlb(struct folio *folio)
+{
+	return PageHeadHuge(&folio->page);
+}
 #else
-TESTPAGEFLAG_FALSE(Huge)
-TESTPAGEFLAG_FALSE(HeadHuge)
+TESTPAGEFLAG_FALSE(Huge, hugetlb)
+TESTPAGEFLAG_FALSE(HeadHuge, headhuge)
 #endif
 
-
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 /*
  * PageHuge() only returns true for hugetlbfs pages, but not for
@@ -640,6 +715,11 @@ static inline int PageTransHuge(struct page *page)
 	return PageHead(page);
 }
 
+static inline bool folio_transhuge(struct folio *folio)
+{
+	return folio_head(folio);
+}
+
 /*
  * PageTransCompound returns true for both transparent huge pages
  * and hugetlbfs pages, so it should only be called when it's known
@@ -713,12 +793,12 @@ static inline int PageTransTail(struct page *page)
 PAGEFLAG(DoubleMap, double_map, PF_SECOND)
 	TESTSCFLAG(DoubleMap, double_map, PF_SECOND)
 #else
-TESTPAGEFLAG_FALSE(TransHuge)
-TESTPAGEFLAG_FALSE(TransCompound)
-TESTPAGEFLAG_FALSE(TransCompoundMap)
-TESTPAGEFLAG_FALSE(TransTail)
-PAGEFLAG_FALSE(DoubleMap)
-	TESTSCFLAG_FALSE(DoubleMap)
+TESTPAGEFLAG_FALSE(TransHuge, transhuge)
+TESTPAGEFLAG_FALSE(TransCompound, transcompound)
+TESTPAGEFLAG_FALSE(TransCompoundMap, transcompoundmap)
+TESTPAGEFLAG_FALSE(TransTail, transtail)
+PAGEFLAG_FALSE(DoubleMap, double_map)
+	TESTSCFLAG_FALSE(DoubleMap, double_map)
 #endif
 
 /*
@@ -871,6 +951,11 @@ static inline int page_has_private(struct page *page)
 	return !!(page->flags & PAGE_FLAGS_PRIVATE);
 }
 
+static inline bool folio_has_private(struct folio *folio)
+{
+	return page_has_private(&folio->page);
+}
+
 #undef PF_ANY
 #undef PF_HEAD
 #undef PF_ONLY_HEAD
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 09/31] mm: Add folio_young() and folio_idle()
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (7 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 08/31] mm: Add folio flag manipulation functions Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 10/31] mm: Handle per-folio private data Matthew Wilcox (Oracle)
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm; +Cc: Matthew Wilcox (Oracle)

Idle page tracking is handled through page_ext for 32-bit.  Add folio
equivalents for 32 bit and move all the page compatibility parts to
common code.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/page_idle.h | 99 +++++++++++++++++++--------------------
 1 file changed, 49 insertions(+), 50 deletions(-)

diff --git a/include/linux/page_idle.h b/include/linux/page_idle.h
index 1e894d34bdce..9693d1a93781 100644
--- a/include/linux/page_idle.h
+++ b/include/linux/page_idle.h
@@ -8,46 +8,16 @@
 
 #ifdef CONFIG_IDLE_PAGE_TRACKING
 
-#ifdef CONFIG_64BIT
-static inline bool page_is_young(struct page *page)
-{
-	return PageYoung(page);
-}
-
-static inline void set_page_young(struct page *page)
-{
-	SetPageYoung(page);
-}
-
-static inline bool test_and_clear_page_young(struct page *page)
-{
-	return TestClearPageYoung(page);
-}
-
-static inline bool page_is_idle(struct page *page)
-{
-	return PageIdle(page);
-}
-
-static inline void set_page_idle(struct page *page)
-{
-	SetPageIdle(page);
-}
-
-static inline void clear_page_idle(struct page *page)
-{
-	ClearPageIdle(page);
-}
-#else /* !CONFIG_64BIT */
+#ifndef CONFIG_64BIT
 /*
  * If there is not enough space to store Idle and Young bits in page flags, use
  * page ext flags instead.
  */
 extern struct page_ext_operations page_idle_ops;
 
-static inline bool page_is_young(struct page *page)
+static inline bool folio_young(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return false;
@@ -55,9 +25,9 @@ static inline bool page_is_young(struct page *page)
 	return test_bit(PAGE_EXT_YOUNG, &page_ext->flags);
 }
 
-static inline void set_page_young(struct page *page)
+static inline void folio_set_young(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return;
@@ -65,9 +35,9 @@ static inline void set_page_young(struct page *page)
 	set_bit(PAGE_EXT_YOUNG, &page_ext->flags);
 }
 
-static inline bool test_and_clear_page_young(struct page *page)
+static inline bool folio_test_clear_young(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return false;
@@ -75,9 +45,9 @@ static inline bool test_and_clear_page_young(struct page *page)
 	return test_and_clear_bit(PAGE_EXT_YOUNG, &page_ext->flags);
 }
 
-static inline bool page_is_idle(struct page *page)
+static inline bool folio_idle(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return false;
@@ -85,9 +55,9 @@ static inline bool page_is_idle(struct page *page)
 	return test_bit(PAGE_EXT_IDLE, &page_ext->flags);
 }
 
-static inline void set_page_idle(struct page *page)
+static inline void folio_set_idle(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return;
@@ -95,46 +65,75 @@ static inline void set_page_idle(struct page *page)
 	set_bit(PAGE_EXT_IDLE, &page_ext->flags);
 }
 
-static inline void clear_page_idle(struct page *page)
+static inline void folio_clear_idle(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return;
 
 	clear_bit(PAGE_EXT_IDLE, &page_ext->flags);
 }
-#endif /* CONFIG_64BIT */
+#endif /* !CONFIG_64BIT */
 
 #else /* !CONFIG_IDLE_PAGE_TRACKING */
 
-static inline bool page_is_young(struct page *page)
+static inline bool folio_young(struct folio *folio)
 {
 	return false;
 }
 
-static inline void set_page_young(struct page *page)
+static inline void folio_set_young(struct folio *folio)
 {
 }
 
-static inline bool test_and_clear_page_young(struct page *page)
+static inline bool folio_test_clear_young(struct folio *folio)
 {
 	return false;
 }
 
-static inline bool page_is_idle(struct page *page)
+static inline bool folio_idle(struct folio *folio)
 {
 	return false;
 }
 
-static inline void set_page_idle(struct page *page)
+static inline void folio_set_idle(struct folio *folio)
 {
 }
 
-static inline void clear_page_idle(struct page *page)
+static inline void folio_clear_idle(struct folio *folio)
 {
 }
 
 #endif /* CONFIG_IDLE_PAGE_TRACKING */
 
+static inline bool page_is_young(struct page *page)
+{
+	return folio_young(page_folio(page));
+}
+
+static inline void set_page_young(struct page *page)
+{
+	folio_set_young(page_folio(page));
+}
+
+static inline bool test_and_clear_page_young(struct page *page)
+{
+	return folio_test_clear_young(page_folio(page));
+}
+
+static inline bool page_is_idle(struct page *page)
+{
+	return folio_idle(page_folio(page));
+}
+
+static inline void set_page_idle(struct page *page)
+{
+	folio_set_idle(page_folio(page));
+}
+
+static inline void clear_page_idle(struct page *page)
+{
+	folio_clear_idle(page_folio(page));
+}
 #endif /* _LINUX_MM_PAGE_IDLE_H */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 10/31] mm: Handle per-folio private data
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (8 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 09/31] mm: Add folio_young() and folio_idle() Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 11/31] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Add folio_get_private() and folio_set_private() which mirror page_private()
and set_page_private() -- ie folio private data is the same as page
private data.  The only difference is that these return a void *
instead of an unsigned long, which matches the majority of users.

Turn attach_page_private() into folio_attach_private() and reimplement
attach_page_private() as a wrapper.  No filesystem which uses page private
data currently supports compound pages, so we're free to define the rules.
attach_page_private() may only be called on a head page; if you want
to add private data to a tail page, you can call set_page_private()
directly (and shouldn't increment the page refcount!  That should be
done when adding private data to the head page / folio).

This saves 597 bytes of text with the distro-derived config that I'm
testing due to removing the calls to compound_head() in get_page()
& put_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm_types.h | 11 +++++++++
 include/linux/pagemap.h  | 48 ++++++++++++++++++++++++----------------
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 276e358c75d3..111c304b7d13 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -302,6 +302,12 @@ static inline atomic_t *compound_pincount_ptr(struct page *page)
 #define PAGE_FRAG_CACHE_MAX_SIZE	__ALIGN_MASK(32768, ~PAGE_MASK)
 #define PAGE_FRAG_CACHE_MAX_ORDER	get_order(PAGE_FRAG_CACHE_MAX_SIZE)
 
+/*
+ * page_private can be used on tail pages.  However, PagePrivate is only
+ * checked by the VM on the head page.  So page_private on the tail pages
+ * should be used for data that's ancillary to the head page (eg attaching
+ * buffer heads to tail pages after attaching buffer heads to the head page)
+ */
 #define page_private(page)		((page)->private)
 
 static inline void set_page_private(struct page *page, unsigned long private)
@@ -309,6 +315,11 @@ static inline void set_page_private(struct page *page, unsigned long private)
 	page->private = private;
 }
 
+static inline void *folio_get_private(struct folio *folio)
+{
+	return (void *)folio->private;
+}
+
 struct page_frag_cache {
 	void * va;
 #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index a4bd41128bf3..c58587225fe5 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -260,42 +260,52 @@ static inline int page_cache_add_speculative(struct page *page, int count)
 }
 
 /**
- * attach_page_private - Attach private data to a page.
- * @page: Page to attach data to.
- * @data: Data to attach to page.
+ * folio_attach_private - Attach private data to a folio.
+ * @folio: Folio to attach data to.
+ * @data: Data to attach to folio.
  *
- * Attaching private data to a page increments the page's reference count.
- * The data must be detached before the page will be freed.
+ * Attaching private data to a folio increments the page's reference count.
+ * The data must be detached before the folio will be freed.
  */
-static inline void attach_page_private(struct page *page, void *data)
+static inline void folio_attach_private(struct folio *folio, void *data)
 {
-	get_page(page);
-	set_page_private(page, (unsigned long)data);
-	SetPagePrivate(page);
+	folio_get(folio);
+	folio->private = (unsigned long)data;
+	folio_set_private(folio);
 }
 
 /**
- * detach_page_private - Detach private data from a page.
- * @page: Page to detach data from.
+ * folio_detach_private - Detach private data from a folio.
+ * @folio: Folio to detach data from.
  *
- * Removes the data that was previously attached to the page and decrements
+ * Removes the data that was previously attached to the folio and decrements
  * the refcount on the page.
  *
- * Return: Data that was attached to the page.
+ * Return: Data that was attached to the folio.
  */
-static inline void *detach_page_private(struct page *page)
+static inline void *folio_detach_private(struct folio *folio)
 {
-	void *data = (void *)page_private(page);
+	void *data = folio_get_private(folio);
 
-	if (!PagePrivate(page))
+	if (!folio_private(folio))
 		return NULL;
-	ClearPagePrivate(page);
-	set_page_private(page, 0);
-	put_page(page);
+	folio_clear_private(folio);
+	folio->private = 0;
+	folio_put(folio);
 
 	return data;
 }
 
+static inline void attach_page_private(struct page *page, void *data)
+{
+	folio_attach_private(page_folio(page), data);
+}
+
+static inline void *detach_page_private(struct page *page)
+{
+	return folio_detach_private(page_folio(page));
+}
+
 #ifdef CONFIG_NUMA
 extern struct page *__page_cache_alloc(gfp_t gfp);
 #else
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 11/31] mm/filemap: Add folio_index, folio_file_page and folio_contains
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (9 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 10/31] mm: Handle per-folio private data Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 12/31] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

folio_index() is the equivalent of page_index() for folios.
folio_file_page() is the equivalent of find_subpage().
folio_contains() is the equivalent of thp_contains().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 53 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c58587225fe5..58c9b98604d0 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -462,6 +462,59 @@ static inline bool thp_contains(struct page *head, pgoff_t index)
 	return page_index(head) == (index & ~(thp_nr_pages(head) - 1UL));
 }
 
+#define swapcache_index(folio)	__page_file_index(&(folio)->page)
+
+/**
+ * folio_index - File index of a folio.
+ * @folio: The folio.
+ *
+ * For a folio which is either in the page cache or the swap cache,
+ * return its index within the address_space it belongs to.  If you know
+ * the page is definitely in the page cache, you can look at the folio's
+ * index directly.
+ *
+ * Return: The index (offset in units of pages) of a folio in its file.
+ */
+static inline pgoff_t folio_index(struct folio *folio)
+{
+        if (unlikely(folio_swapcache(folio)))
+                return swapcache_index(folio);
+        return folio->index;
+}
+
+/**
+ * folio_file_page - The page for a particular index.
+ * @folio: The folio which contains this index.
+ * @index: The index we want to look up.
+ *
+ * Sometimes after looking up a folio in the page cache, we need to
+ * obtain the specific page for an index (eg a page fault).
+ *
+ * Return: The page containing the file data for this index.
+ */
+static inline struct page *folio_file_page(struct folio *folio, pgoff_t index)
+{
+	return nth_page(&folio->page, index & (folio_nr_pages(folio) - 1));
+}
+
+/**
+ * folio_contains - Does this folio contain this index?
+ * @folio: The folio.
+ * @index: The page index within the file.
+ *
+ * Context: The caller should have the page locked in order to prevent
+ * (eg) shmem from moving the page between the page cache and swap cache
+ * and changing its index in the middle of the operation.
+ * Return: true or false.
+ */
+static inline bool folio_contains(struct folio *folio, pgoff_t index)
+{
+	/* HugeTLBfs indexes the page cache in units of hpage_size */
+	if (folio_hugetlb(folio))
+		return folio->index == index;
+	return index - folio_index(folio) < folio_nr_pages(folio);
+}
+
 /*
  * Given the page we found in the page cache, return the page corresponding
  * to this index in the file
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 12/31] mm/filemap: Add folio_next_index
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (10 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 11/31] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 13/31] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

This helper returns the page index of the next folio in the file (ie
the end of this folio, plus one).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 58c9b98604d0..5301131cc5b3 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -482,6 +482,17 @@ static inline pgoff_t folio_index(struct folio *folio)
         return folio->index;
 }
 
+/**
+ * folio_next_index - Get the index of the next folio.
+ * @folio: The current folio.
+ *
+ * Return: The index of the folio which follows this folio in the file.
+ */
+static inline pgoff_t folio_next_index(struct folio *folio)
+{
+	return folio->index + folio_nr_pages(folio);
+}
+
 /**
  * folio_file_page - The page for a particular index.
  * @folio: The folio which contains this index.
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 13/31] mm/filemap: Add folio_offset and folio_file_offset
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (11 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 12/31] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 14/31] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

These are just wrappers around their page counterpart.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 5301131cc5b3..3d4ca4fd7b96 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -634,6 +634,16 @@ static inline loff_t page_file_offset(struct page *page)
 	return ((loff_t)page_index(page)) << PAGE_SHIFT;
 }
 
+static inline loff_t folio_offset(struct folio *folio)
+{
+	return page_offset(&folio->page);
+}
+
+static inline loff_t folio_file_offset(struct folio *folio)
+{
+	return page_file_offset(&folio->page);
+}
+
 extern pgoff_t linear_hugepage_index(struct vm_area_struct *vma,
 				     unsigned long address);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 14/31] mm/util: Add folio_mapping and folio_file_mapping
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (12 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 13/31] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 15/31] mm: Add folio_mapcount Matthew Wilcox (Oracle)
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

These are the folio equivalent of page_mapping() and page_file_mapping().
Add an out-of-line page_mapping() wrapper around folio_mapping()
in order to prevent the page_folio() call from bloating every caller
of page_mapping().  Adjust page_file_mapping() and page_mapping_file()
to use folios internally.  Rename __page_file_mapping() to
swapcache_mapping() and change it to take a folio.

This ends up saving 186 bytes of text overall.  folio_mapping() is
45 bytes shorter than page_mapping() was, but the new page_mapping()
wrapper is 30 bytes.  The major reduction is a few bytes less in dozens
of nfs functions (which call page_file_mapping()).  Most of these appear
to be a slight change in gcc's register allocation decisions, which allow:

   48 8b 56 08         mov    0x8(%rsi),%rdx
   48 8d 42 ff         lea    -0x1(%rdx),%rax
   83 e2 01            and    $0x1,%edx
   48 0f 44 c6         cmove  %rsi,%rax

to become:

   48 8b 46 08         mov    0x8(%rsi),%rax
   48 8d 78 ff         lea    -0x1(%rax),%rdi
   a8 01               test   $0x1,%al
   48 0f 44 fe         cmove  %rsi,%rdi

for a reduction of a single byte.  Once the NFS client is converted to
use folios, this entire sequence will disappear.

Also add folio_mapping() documentation.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/core-api/mm-api.rst |  2 ++
 include/linux/mm.h                | 14 -------------
 include/linux/pagemap.h           | 35 +++++++++++++++++++++++++++++--
 include/linux/swap.h              |  6 ++++++
 mm/Makefile                       |  2 +-
 mm/folio-compat.c                 | 13 ++++++++++++
 mm/swapfile.c                     |  8 +++----
 mm/util.c                         | 30 +++++++++++++++-----------
 8 files changed, 77 insertions(+), 33 deletions(-)
 create mode 100644 mm/folio-compat.c

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index 88545b2dce98..e7e0f8a70794 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -99,3 +99,5 @@ More Memory Management Functions
 .. kernel-doc:: include/linux/mm.h
    :internal:
 .. kernel-doc:: include/linux/page_ref.h
+.. kernel-doc:: mm/util.c
+   :functions: folio_mapping
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b133734a7530..fb779dca5ee8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1749,19 +1749,6 @@ void page_address_init(void);
 
 extern void *page_rmapping(struct page *page);
 extern struct anon_vma *page_anon_vma(struct page *page);
-extern struct address_space *page_mapping(struct page *page);
-
-extern struct address_space *__page_file_mapping(struct page *);
-
-static inline
-struct address_space *page_file_mapping(struct page *page)
-{
-	if (unlikely(PageSwapCache(page)))
-		return __page_file_mapping(page);
-
-	return page->mapping;
-}
-
 extern pgoff_t __page_file_index(struct page *page);
 
 /*
@@ -1776,7 +1763,6 @@ static inline pgoff_t page_index(struct page *page)
 }
 
 bool page_mapped(struct page *page);
-struct address_space *page_mapping(struct page *page);
 
 /*
  * Return true only if the page has been allocated with
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 3d4ca4fd7b96..b6bd60bbdee2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -162,14 +162,45 @@ static inline void filemap_nr_thps_dec(struct address_space *mapping)
 
 void release_pages(struct page **pages, int nr);
 
+struct address_space *page_mapping(struct page *);
+struct address_space *folio_mapping(struct folio *);
+struct address_space *swapcache_mapping(struct folio *);
+
+/**
+ * folio_file_mapping - Find the mapping this folio belongs to.
+ * @folio: The folio.
+ *
+ * For folios which are in the page cache, return the mapping that this
+ * page belongs to.  Folios in the swap cache return the mapping of the
+ * swap file or swap device where the data is stored.  This is different
+ * from the mapping returned by folio_mapping().  The only reason to
+ * use it is if, like NFS, you return 0 from ->activate_swapfile.
+ *
+ * Do not call this for folios which aren't in the page cache or swap cache.
+ */
+static inline struct address_space *folio_file_mapping(struct folio *folio)
+{
+	if (unlikely(folio_swapcache(folio)))
+		return swapcache_mapping(folio);
+
+	return folio->mapping;
+}
+
+static inline struct address_space *page_file_mapping(struct page *page)
+{
+	return folio_file_mapping(page_folio(page));
+}
+
 /*
  * For file cache pages, return the address_space, otherwise return NULL
  */
 static inline struct address_space *page_mapping_file(struct page *page)
 {
-	if (unlikely(PageSwapCache(page)))
+	struct folio *folio = page_folio(page);
+
+	if (unlikely(folio_swapcache(folio)))
 		return NULL;
-	return page_mapping(page);
+	return folio_mapping(folio);
 }
 
 /*
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 144727041e78..20766342845b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -314,6 +314,12 @@ struct vma_swap_readahead {
 #endif
 };
 
+static inline swp_entry_t folio_swap_entry(struct folio *folio)
+{
+	swp_entry_t entry = { .val = page_private(&folio->page) };
+	return entry;
+}
+
 /* linux/mm/workingset.c */
 void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
 void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
diff --git a/mm/Makefile b/mm/Makefile
index a9ad6122d468..434c2a46b6c5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -46,7 +46,7 @@ mmu-$(CONFIG_MMU)	+= process_vm_access.o
 endif
 
 obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
-			   maccess.o page-writeback.o \
+			   maccess.o page-writeback.o folio-compat.o \
 			   readahead.o swap.o truncate.o vmscan.o shmem.o \
 			   util.o mmzone.o vmstat.o backing-dev.o \
 			   mm_init.o percpu.o slab_common.o \
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
new file mode 100644
index 000000000000..5e107aa30a62
--- /dev/null
+++ b/mm/folio-compat.c
@@ -0,0 +1,13 @@
+/*
+ * Compatibility functions which bloat the callers too much to make inline.
+ * All of the callers of these functions should be converted to use folios
+ * eventually.
+ */
+
+#include <linux/pagemap.h>
+
+struct address_space *page_mapping(struct page *page)
+{
+	return folio_mapping(page_folio(page));
+}
+EXPORT_SYMBOL(page_mapping);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 149e77454e3c..d0ee24239a83 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3533,13 +3533,13 @@ struct swap_info_struct *page_swap_info(struct page *page)
 }
 
 /*
- * out-of-line __page_file_ methods to avoid include hell.
+ * out-of-line methods to avoid include hell.
  */
-struct address_space *__page_file_mapping(struct page *page)
+struct address_space *swapcache_mapping(struct folio *folio)
 {
-	return page_swap_info(page)->swap_file->f_mapping;
+	return page_swap_info(&folio->page)->swap_file->f_mapping;
 }
-EXPORT_SYMBOL_GPL(__page_file_mapping);
+EXPORT_SYMBOL_GPL(swapcache_mapping);
 
 pgoff_t __page_file_index(struct page *page)
 {
diff --git a/mm/util.c b/mm/util.c
index a8bf17f18a81..afd99591cb81 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -686,30 +686,36 @@ struct anon_vma *page_anon_vma(struct page *page)
 	return __page_rmapping(page);
 }
 
-struct address_space *page_mapping(struct page *page)
+/**
+ * folio_mapping - Find the mapping where this folio is stored.
+ * @folio: The folio.
+ *
+ * For folios which are in the page cache, return the mapping that this
+ * page belongs to.  Folios in the swap cache return the swap mapping
+ * this page is stored in (which is different from the mapping for the
+ * swap file or swap device where the data is stored).
+ *
+ * You can call this for folios which aren't in the swap cache or page
+ * cache and it will return NULL.
+ */
+struct address_space *folio_mapping(struct folio *folio)
 {
 	struct address_space *mapping;
 
-	page = compound_head(page);
-
 	/* This happens if someone calls flush_dcache_page on slab page */
-	if (unlikely(PageSlab(page)))
+	if (unlikely(folio_slab(folio)))
 		return NULL;
 
-	if (unlikely(PageSwapCache(page))) {
-		swp_entry_t entry;
-
-		entry.val = page_private(page);
-		return swap_address_space(entry);
-	}
+	if (unlikely(folio_swapcache(folio)))
+		return swap_address_space(folio_swap_entry(folio));
 
-	mapping = page->mapping;
+	mapping = folio->mapping;
 	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
 		return NULL;
 
 	return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
 }
-EXPORT_SYMBOL(page_mapping);
+EXPORT_SYMBOL(folio_mapping);
 
 /* Slow path of page_mapcount() for compound pages */
 int __page_mapcount(struct page *page)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 15/31] mm: Add folio_mapcount
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (13 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 14/31] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 16/31] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

This is the folio equivalent of page_mapcount().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fb779dca5ee8..bca3e2518e5e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -883,6 +883,22 @@ static inline int page_mapcount(struct page *page)
 	return atomic_read(&page->_mapcount) + 1;
 }
 
+/**
+ * folio_mapcount - The number of mappings of this folio.
+ * @folio: The folio.
+ *
+ * The result includes the number of times any of the pages in the
+ * folio are mapped to userspace.
+ *
+ * Return: The number of page table entries which refer to this folio.
+ */
+static inline int folio_mapcount(struct folio *folio)
+{
+	if (unlikely(folio_multi(folio)))
+		return __page_mapcount(&folio->page);
+	return atomic_read(&folio->_mapcount) + 1;
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 int total_mapcount(struct page *page);
 int page_trans_huge_mapcount(struct page *page, int *total_mapcount);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 16/31] mm/memcg: Add folio wrappers for various functions
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (14 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 15/31] mm: Add folio_mapcount Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 17/31] mm/filemap: Add folio_unlock Matthew Wilcox (Oracle)
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Add new wrapper functions folio_memcg(), lock_folio_memcg(),
unlock_folio_memcg(), mem_cgroup_folio_lruvec() and
count_memcg_folio_event()

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/memcontrol.h | 58 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c193be760709..b45b505be0ec 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -456,6 +456,11 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
 		return __page_memcg(page);
 }
 
+static inline struct mem_cgroup *folio_memcg(struct folio *folio)
+{
+	return page_memcg(&folio->page);
+}
+
 /*
  * page_memcg_rcu - locklessly get the memory cgroup associated with a page
  * @page: a pointer to the page struct
@@ -1058,6 +1063,15 @@ static inline void count_memcg_page_event(struct page *page,
 		count_memcg_events(memcg, idx, 1);
 }
 
+static inline void count_memcg_folio_event(struct folio *folio,
+					  enum vm_event_item idx)
+{
+	struct mem_cgroup *memcg = folio_memcg(folio);
+
+	if (memcg)
+		count_memcg_events(memcg, idx, folio_nr_pages(folio));
+}
+
 static inline void count_memcg_event_mm(struct mm_struct *mm,
 					enum vm_event_item idx)
 {
@@ -1477,6 +1491,22 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 }
 #endif /* CONFIG_MEMCG */
 
+static inline void lock_folio_memcg(struct folio *folio)
+{
+	lock_page_memcg(&folio->page);
+}
+
+static inline void unlock_folio_memcg(struct folio *folio)
+{
+	unlock_page_memcg(&folio->page);
+}
+
+static inline struct lruvec *mem_cgroup_folio_lruvec(struct folio *folio,
+						    struct pglist_data *pgdat)
+{
+	return mem_cgroup_page_lruvec(&folio->page, pgdat);
+}
+
 static inline void __inc_lruvec_kmem_state(void *p, enum node_stat_item idx)
 {
 	__mod_lruvec_kmem_state(p, idx, 1);
@@ -1544,6 +1574,34 @@ static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
 	return lock_page_lruvec_irqsave(page, flags);
 }
 
+static inline struct lruvec *folio_lock_lruvec(struct folio *folio)
+{
+	return lock_page_lruvec(&folio->page);
+}
+
+static inline struct lruvec *folio_lock_lruvec_irq(struct folio *folio)
+{
+	return lock_page_lruvec_irq(&folio->page);
+}
+
+static inline struct lruvec *folio_lock_lruvec_irqsave(struct folio *folio,
+		unsigned long *flagsp)
+{
+	return lock_page_lruvec_irqsave(&folio->page, flagsp);
+}
+
+static inline struct lruvec *folio_relock_lruvec_irq(struct folio *folio,
+		struct lruvec *locked_lruvec)
+{
+	return relock_page_lruvec_irq(&folio->page, locked_lruvec);
+}
+
+static inline struct lruvec *folio_relock_lruvec_irqsave(struct folio *folio,
+		struct lruvec *locked_lruvec, unsigned long *flagsp)
+{
+	return relock_page_lruvec_irqsave(&folio->page, locked_lruvec, flagsp);
+}
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 
 struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 17/31] mm/filemap: Add folio_unlock
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 16/31] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 18/31] mm/filemap: Add folio_lock Matthew Wilcox (Oracle)
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Convert unlock_page() to call folio_unlock().  By using a folio we
avoid a call to compound_head().  This shortens the function from 39
bytes to 25 and removes 4 instructions on x86-64.  Because we still
have unlock_page(), it's a net increase of 24 bytes of text for the
kernel as a whole, but any path that uses folio_unlock() will execute
4 fewer instructions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  3 ++-
 mm/filemap.c            | 27 ++++++++++-----------------
 mm/folio-compat.c       |  6 ++++++
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index b6bd60bbdee2..9126c0db3a60 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -719,7 +719,8 @@ extern int __lock_page_killable(struct page *page);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
-extern void unlock_page(struct page *page);
+void unlock_page(struct page *page);
+void folio_unlock(struct folio *folio);
 
 /*
  * Return true if the page was successfully locked
diff --git a/mm/filemap.c b/mm/filemap.c
index 66f7e9fdfbc4..090b303bcd45 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1435,29 +1435,22 @@ static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile void *mem
 #endif
 
 /**
- * unlock_page - unlock a locked page
- * @page: the page
+ * folio_unlock - Unlock a locked folio.
+ * @folio: The folio.
  *
- * Unlocks the page and wakes up sleepers in wait_on_page_locked().
- * Also wakes sleepers in wait_on_page_writeback() because the wakeup
- * mechanism between PageLocked pages and PageWriteback pages is shared.
- * But that's OK - sleepers in wait_on_page_writeback() just go back to sleep.
+ * Unlocks the folio and wakes up any thread sleeping on the page lock.
  *
- * Note that this depends on PG_waiters being the sign bit in the byte
- * that contains PG_locked - thus the BUILD_BUG_ON(). That allows us to
- * clear the PG_locked bit and test PG_waiters at the same time fairly
- * portably (architectures that do LL/SC can test any bit, while x86 can
- * test the sign bit).
+ * Context: May be called from interrupt or process context.  May not be
+ * called from NMI context.
  */
-void unlock_page(struct page *page)
+void folio_unlock(struct folio *folio)
 {
 	BUILD_BUG_ON(PG_waiters != 7);
-	page = compound_head(page);
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
-		wake_up_page_bit(page, PG_locked);
+	VM_BUG_ON_FOLIO(!folio_locked(folio), folio);
+	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio, 0)))
+		wake_up_page_bit(&folio->page, PG_locked);
 }
-EXPORT_SYMBOL(unlock_page);
+EXPORT_SYMBOL(folio_unlock);
 
 /**
  * end_page_private_2 - Clear PG_private_2 and release any waiters
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 5e107aa30a62..91b3d00a92f7 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -11,3 +11,9 @@ struct address_space *page_mapping(struct page *page)
 	return folio_mapping(page_folio(page));
 }
 EXPORT_SYMBOL(page_mapping);
+
+void unlock_page(struct page *page)
+{
+	return folio_unlock(page_folio(page));
+}
+EXPORT_SYMBOL(unlock_page);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 18/31] mm/filemap: Add folio_lock
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (16 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 17/31] mm/filemap: Add folio_unlock Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 19/31] mm/filemap: Add folio_lock_killable Matthew Wilcox (Oracle)
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

This is like lock_page() but for use by callers who know they have a folio.
Convert __lock_page() to be __folio_lock().  This saves one call to
compound_head() per contended call to lock_page().

Saves 362 bytes of text; mostly from improved register allocation and
inlining decisions.  __folio_lock is 59 bytes while __lock_page was 79.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 24 +++++++++++++++++++-----
 mm/filemap.c            | 29 +++++++++++++++--------------
 2 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9126c0db3a60..ff81be103539 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -714,7 +714,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 	return true;
 }
 
-extern void __lock_page(struct page *page);
+void __folio_lock(struct folio *folio);
 extern int __lock_page_killable(struct page *page);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
@@ -722,13 +722,24 @@ extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 void unlock_page(struct page *page);
 void folio_unlock(struct folio *folio);
 
+static inline bool folio_trylock(struct folio *folio)
+{
+	return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio, 0)));
+}
+
 /*
  * Return true if the page was successfully locked
  */
 static inline int trylock_page(struct page *page)
 {
-	page = compound_head(page);
-	return (likely(!test_and_set_bit_lock(PG_locked, &page->flags)));
+	return folio_trylock(page_folio(page));
+}
+
+static inline void folio_lock(struct folio *folio)
+{
+	might_sleep();
+	if (!folio_trylock(folio))
+		__folio_lock(folio);
 }
 
 /*
@@ -736,9 +747,12 @@ static inline int trylock_page(struct page *page)
  */
 static inline void lock_page(struct page *page)
 {
+	struct folio *folio;
 	might_sleep();
-	if (!trylock_page(page))
-		__lock_page(page);
+
+	folio = page_folio(page);
+	if (!folio_trylock(folio))
+		__folio_lock(folio);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 090b303bcd45..6935b068856f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1187,7 +1187,7 @@ static void wake_up_page(struct page *page, int bit)
  */
 enum behavior {
 	EXCLUSIVE,	/* Hold ref to page and take the bit when woken, like
-			 * __lock_page() waiting on then setting PG_locked.
+			 * __folio_lock() waiting on then setting PG_locked.
 			 */
 	SHARED,		/* Hold ref to page and check the bit when woken, like
 			 * wait_on_page_writeback() waiting on PG_writeback.
@@ -1576,17 +1576,16 @@ void page_endio(struct page *page, bool is_write, int err)
 EXPORT_SYMBOL_GPL(page_endio);
 
 /**
- * __lock_page - get a lock on the page, assuming we need to sleep to get it
- * @__page: the page to lock
+ * __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
+ * @folio: The folio to lock
  */
-void __lock_page(struct page *__page)
+void __folio_lock(struct folio *folio)
 {
-	struct page *page = compound_head(__page);
-	wait_queue_head_t *q = page_waitqueue(page);
-	wait_on_page_bit_common(q, page, PG_locked, TASK_UNINTERRUPTIBLE,
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
-EXPORT_SYMBOL(__lock_page);
+EXPORT_SYMBOL(__folio_lock);
 
 int __lock_page_killable(struct page *__page)
 {
@@ -1661,10 +1660,10 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			return 0;
 		}
 	} else {
-		__lock_page(page);
+		__folio_lock(page_folio(page));
 	}
-	return 1;
 
+	return 1;
 }
 
 /**
@@ -2815,7 +2814,9 @@ loff_t mapping_seek_hole_data(struct address_space *mapping, loff_t start,
 static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 				     struct file **fpin)
 {
-	if (trylock_page(page))
+	struct folio *folio = page_folio(page);
+
+	if (folio_trylock(folio))
 		return 1;
 
 	/*
@@ -2828,7 +2829,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
-		if (__lock_page_killable(page)) {
+		if (__lock_page_killable(&folio->page)) {
 			/*
 			 * We didn't have the right flags to drop the mmap_lock,
 			 * but all fault_handlers only check for fatal signals
@@ -2840,11 +2841,11 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 			return 0;
 		}
 	} else
-		__lock_page(page);
+		__folio_lock(folio);
+
 	return 1;
 }
 
-
 /*
  * Synchronous readahead happens when we don't even find a page in the page
  * cache at all.  We don't want to perform IO under the mmap sem, so if we have
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 19/31] mm/filemap: Add folio_lock_killable
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (17 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 18/31] mm/filemap: Add folio_lock Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 20/31] mm/filemap: Add __folio_lock_async Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

This is like lock_page_killable() but for use by callers who
know they have a folio.  Convert __lock_page_killable() to be
__folio_lock_killable().  This saves one call to compound_head() per
contended call to lock_page_killable().

__folio_lock_killable() is 20 bytes smaller than __lock_page_killable()
was.  lock_page_maybe_drop_mmap() shrinks by 68 bytes and
__lock_page_or_retry() shrinks by 66 bytes.  That's a total of 154 bytes
of text saved.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 15 ++++++++++-----
 mm/filemap.c            | 17 +++++++++--------
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ff81be103539..332731ee541a 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -715,7 +715,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 }
 
 void __folio_lock(struct folio *folio);
-extern int __lock_page_killable(struct page *page);
+int __folio_lock_killable(struct folio *folio);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
@@ -755,6 +755,14 @@ static inline void lock_page(struct page *page)
 		__folio_lock(folio);
 }
 
+static inline int folio_lock_killable(struct folio *folio)
+{
+	might_sleep();
+	if (!folio_trylock(folio))
+		return __folio_lock_killable(folio);
+	return 0;
+}
+
 /*
  * lock_page_killable is like lock_page but can be interrupted by fatal
  * signals.  It returns 0 if it locked the page and -EINTR if it was
@@ -762,10 +770,7 @@ static inline void lock_page(struct page *page)
  */
 static inline int lock_page_killable(struct page *page)
 {
-	might_sleep();
-	if (!trylock_page(page))
-		return __lock_page_killable(page);
-	return 0;
+	return folio_lock_killable(page_folio(page));
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 6935b068856f..27a86d53dd89 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1587,14 +1587,13 @@ void __folio_lock(struct folio *folio)
 }
 EXPORT_SYMBOL(__folio_lock);
 
-int __lock_page_killable(struct page *__page)
+int __folio_lock_killable(struct folio *folio)
 {
-	struct page *page = compound_head(__page);
-	wait_queue_head_t *q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, PG_locked, TASK_KILLABLE,
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
-EXPORT_SYMBOL_GPL(__lock_page_killable);
+EXPORT_SYMBOL_GPL(__folio_lock_killable);
 
 int __lock_page_async(struct page *page, struct wait_page_queue *wait)
 {
@@ -1636,6 +1635,8 @@ int __lock_page_async(struct page *page, struct wait_page_queue *wait)
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			 unsigned int flags)
 {
+	struct folio *folio = page_folio(page);
+
 	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
@@ -1654,13 +1655,13 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 	if (flags & FAULT_FLAG_KILLABLE) {
 		int ret;
 
-		ret = __lock_page_killable(page);
+		ret = __folio_lock_killable(folio);
 		if (ret) {
 			mmap_read_unlock(mm);
 			return 0;
 		}
 	} else {
-		__folio_lock(page_folio(page));
+		__folio_lock(folio);
 	}
 
 	return 1;
@@ -2829,7 +2830,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
-		if (__lock_page_killable(&folio->page)) {
+		if (__folio_lock_killable(folio)) {
 			/*
 			 * We didn't have the right flags to drop the mmap_lock,
 			 * but all fault_handlers only check for fatal signals
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 20/31] mm/filemap: Add __folio_lock_async
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (18 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 19/31] mm/filemap: Add folio_lock_killable Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 21/31] mm/filemap: Add __folio_lock_or_retry Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

There aren't any actual callers of lock_page_async(), so remove it.
Convert filemap_update_page() to call __folio_lock_async().

__folio_lock_async() is 21 bytes smaller than __lock_page_async(),
but the real savings come from using a folio in filemap_update_page(),
shrinking it from 514 bytes to 403 bytes, saving 111 bytes.  The text
shrinks by 132 bytes in total.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/io_uring.c           |  2 +-
 include/linux/pagemap.h | 17 -----------------
 mm/filemap.c            | 31 ++++++++++++++++---------------
 3 files changed, 17 insertions(+), 33 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index a880edb90d0c..83171c138310 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3158,7 +3158,7 @@ static int io_read_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 }
 
 /*
- * This is our waitqueue callback handler, registered through lock_page_async()
+ * This is our waitqueue callback handler, registered through __folio_lock_async()
  * when we initially tried to do the IO with the iocb armed our waitqueue.
  * This gets called when the page is unlocked, and we generally expect that to
  * happen when the page IO is completed and the page is now uptodate. This will
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 332731ee541a..69c4de9224cd 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -716,7 +716,6 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 
 void __folio_lock(struct folio *folio);
 int __folio_lock_killable(struct folio *folio);
-extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
 void unlock_page(struct page *page);
@@ -773,22 +772,6 @@ static inline int lock_page_killable(struct page *page)
 	return folio_lock_killable(page_folio(page));
 }
 
-/*
- * lock_page_async - Lock the page, unless this would block. If the page
- * is already locked, then queue a callback when the page becomes unlocked.
- * This callback can then retry the operation.
- *
- * Returns 0 if the page is locked successfully, or -EIOCBQUEUED if the page
- * was already locked and the callback defined in 'wait' was queued.
- */
-static inline int lock_page_async(struct page *page,
-				  struct wait_page_queue *wait)
-{
-	if (!trylock_page(page))
-		return __lock_page_async(page, wait);
-	return 0;
-}
-
 /*
  * lock_page_or_retry - Lock the page, unless this would block and the
  * caller indicated that it can handle a retry.
diff --git a/mm/filemap.c b/mm/filemap.c
index 27a86d53dd89..20e1d2c9f3ca 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1595,18 +1595,18 @@ int __folio_lock_killable(struct folio *folio)
 }
 EXPORT_SYMBOL_GPL(__folio_lock_killable);
 
-int __lock_page_async(struct page *page, struct wait_page_queue *wait)
+static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
 {
-	struct wait_queue_head *q = page_waitqueue(page);
+	struct wait_queue_head *q = page_waitqueue(&folio->page);
 	int ret = 0;
 
-	wait->page = page;
+	wait->page = &folio->page;
 	wait->bit_nr = PG_locked;
 
 	spin_lock_irq(&q->lock);
 	__add_wait_queue_entry_tail(q, &wait->wait);
-	SetPageWaiters(page);
-	ret = !trylock_page(page);
+	folio_set_waiters(folio);
+	ret = !folio_trylock(folio);
 	/*
 	 * If we were successful now, we know we're still on the
 	 * waitqueue as we're still under the lock. This means it's
@@ -2359,41 +2359,42 @@ static int filemap_update_page(struct kiocb *iocb,
 		struct address_space *mapping, struct iov_iter *iter,
 		struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	int error;
 
-	if (!trylock_page(page)) {
+	if (!folio_trylock(folio)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_NOIO))
 			return -EAGAIN;
 		if (!(iocb->ki_flags & IOCB_WAITQ)) {
-			put_and_wait_on_page_locked(page, TASK_KILLABLE);
+			put_and_wait_on_page_locked(&folio->page, TASK_KILLABLE);
 			return AOP_TRUNCATED_PAGE;
 		}
-		error = __lock_page_async(page, iocb->ki_waitq);
+		error = __folio_lock_async(folio, iocb->ki_waitq);
 		if (error)
 			return error;
 	}
 
-	if (!page->mapping)
+	if (!folio->mapping)
 		goto truncated;
 
 	error = 0;
-	if (filemap_range_uptodate(mapping, iocb->ki_pos, iter, page))
+	if (filemap_range_uptodate(mapping, iocb->ki_pos, iter, &folio->page))
 		goto unlock;
 
 	error = -EAGAIN;
 	if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT | IOCB_WAITQ))
 		goto unlock;
 
-	error = filemap_read_page(iocb->ki_filp, mapping, page);
+	error = filemap_read_page(iocb->ki_filp, mapping, &folio->page);
 	if (error == AOP_TRUNCATED_PAGE)
-		put_page(page);
+		folio_put(folio);
 	return error;
 truncated:
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 	return AOP_TRUNCATED_PAGE;
 unlock:
-	unlock_page(page);
+	folio_unlock(folio);
 	return error;
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 21/31] mm/filemap: Add __folio_lock_or_retry
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (19 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 20/31] mm/filemap: Add __folio_lock_async Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 22/31] mm/filemap: Add folio_wait_locked Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Convert __lock_page_or_retry() to __folio_lock_or_retry().  This actually
saves 4 bytes in the only caller of lock_page_or_retry() (due to better
register allocation) and saves the 20 byte cost of calling page_folio()
in __folio_lock_or_retry() for a total saving of 24 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  9 ++++++---
 mm/filemap.c            | 10 ++++------
 mm/memory.c             |  8 ++++----
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 69c4de9224cd..ad554aef9cfb 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -716,7 +716,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 
 void __folio_lock(struct folio *folio);
 int __folio_lock_killable(struct folio *folio);
-extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
+int __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
 				unsigned int flags);
 void unlock_page(struct page *page);
 void folio_unlock(struct folio *folio);
@@ -777,13 +777,16 @@ static inline int lock_page_killable(struct page *page)
  * caller indicated that it can handle a retry.
  *
  * Return value and mmap_lock implications depend on flags; see
- * __lock_page_or_retry().
+ * __folio_lock_or_retry().
  */
 static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				     unsigned int flags)
 {
+	struct folio *folio;
 	might_sleep();
-	return trylock_page(page) || __lock_page_or_retry(page, mm, flags);
+
+	folio = page_folio(page);
+	return folio_trylock(folio) || __folio_lock_or_retry(folio, mm, flags);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 20e1d2c9f3ca..c70d3da2b7b2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1623,20 +1623,18 @@ static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
 
 /*
  * Return values:
- * 1 - page is locked; mmap_lock is still held.
- * 0 - page is not locked.
+ * 1 - folio is locked; mmap_lock is still held.
+ * 0 - folio is not locked.
  *     mmap_lock has been released (mmap_read_unlock(), unless flags had both
  *     FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT set, in
  *     which case mmap_lock is still held.
  *
  * If neither ALLOW_RETRY nor KILLABLE are set, will always return 1
- * with the page locked and the mmap_lock unperturbed.
+ * with the folio locked and the mmap_lock unperturbed.
  */
-int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
+int __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
 			 unsigned int flags)
 {
-	struct folio *folio = page_folio(page);
-
 	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
diff --git a/mm/memory.c b/mm/memory.c
index 86ba6c1f6821..fc3f50d0702c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4065,7 +4065,7 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults).
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __folio_lock_or_retry().
  * If mmap_lock is released, vma may become invalid (for example
  * by other thread calling munmap()).
  */
@@ -4307,7 +4307,7 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
  * concurrent faults).
  *
  * The mmap_lock may have been released depending on flags and our return value.
- * See filemap_fault() and __lock_page_or_retry().
+ * See filemap_fault() and __folio_lock_or_retry().
  */
 static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 {
@@ -4411,7 +4411,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
  * By the time we get here, we already hold the mm semaphore
  *
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __folio_lock_or_retry().
  */
 static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 		unsigned long address, unsigned int flags)
@@ -4567,7 +4567,7 @@ static inline void mm_account_fault(struct pt_regs *regs,
  * By the time we get here, we already hold the mm semaphore
  *
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __folio_lock_or_retry().
  */
 vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 			   unsigned int flags, struct pt_regs *regs)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 22/31] mm/filemap: Add folio_wait_locked
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (20 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 21/31] mm/filemap: Add __folio_lock_or_retry Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 23/31] mm/swap: Add folio_rotate_reclaimable Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Also add folio_wait_locked_killable().  Turn wait_on_page_locked()
and wait_on_page_locked_killable() into wrappers.  This eliminates a
call to compound_head() from each call-site, reducing text size by 200
bytes for me.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 26 ++++++++++++++++++--------
 mm/filemap.c            |  4 ++--
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ad554aef9cfb..87002bc2fdce 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -797,23 +797,33 @@ extern void wait_on_page_bit(struct page *page, int bit_nr);
 extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
 
 /* 
- * Wait for a page to be unlocked.
+ * Wait for a folio to be unlocked.
  *
- * This must be called with the caller "holding" the page,
- * ie with increased "page->count" so that the page won't
+ * This must be called with the caller "holding" the folio,
+ * ie with increased "page->count" so that the folio won't
  * go away during the wait..
  */
+static inline void folio_wait_locked(struct folio *folio)
+{
+	if (folio_locked(folio))
+		wait_on_page_bit(&folio->page, PG_locked);
+}
+
+static inline int folio_wait_locked_killable(struct folio *folio)
+{
+	if (!folio_locked(folio))
+		return 0;
+	return wait_on_page_bit_killable(&folio->page, PG_locked);
+}
+
 static inline void wait_on_page_locked(struct page *page)
 {
-	if (PageLocked(page))
-		wait_on_page_bit(compound_head(page), PG_locked);
+	folio_wait_locked(page_folio(page));
 }
 
 static inline int wait_on_page_locked_killable(struct page *page)
 {
-	if (!PageLocked(page))
-		return 0;
-	return wait_on_page_bit_killable(compound_head(page), PG_locked);
+	return folio_wait_locked_killable(page_folio(page));
 }
 
 int put_and_wait_on_page_locked(struct page *page, int state);
diff --git a/mm/filemap.c b/mm/filemap.c
index c70d3da2b7b2..3eb0447604b4 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1645,9 +1645,9 @@ int __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
 
 		mmap_read_unlock(mm);
 		if (flags & FAULT_FLAG_KILLABLE)
-			wait_on_page_locked_killable(page);
+			folio_wait_locked_killable(folio);
 		else
-			wait_on_page_locked(page);
+			folio_wait_locked(folio);
 		return 0;
 	}
 	if (flags & FAULT_FLAG_KILLABLE) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 23/31] mm/swap: Add folio_rotate_reclaimable
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (21 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 22/31] mm/filemap: Add folio_wait_locked Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 24/31] mm/filemap: Add folio_end_writeback Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm; +Cc: Matthew Wilcox (Oracle)

Move the declaration into mm/internal.h and rename
rotate_reclaimable_page() to folio_rotate_reclaimable().  This eliminates
all five of the calls to compound_head() in this function, saving 75 bytes
at the cost of adding 14 bytes to its one caller, end_page_writeback().
Net 61 bytes savings.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  1 -
 mm/filemap.c         |  2 +-
 mm/internal.h        |  1 +
 mm/page_io.c         |  4 ++--
 mm/swap.c            | 18 +++++++++---------
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 20766342845b..76b2338ef24d 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -365,7 +365,6 @@ extern void lru_add_drain(void);
 extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_cpu_zone(struct zone *zone);
 extern void lru_add_drain_all(void);
-extern void rotate_reclaimable_page(struct page *page);
 extern void deactivate_file_page(struct page *page);
 extern void deactivate_page(struct page *page);
 extern void mark_page_lazyfree(struct page *page);
diff --git a/mm/filemap.c b/mm/filemap.c
index 3eb0447604b4..e15c0a273362 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1528,7 +1528,7 @@ void end_page_writeback(struct page *page)
 	 */
 	if (PageReclaim(page)) {
 		ClearPageReclaim(page);
-		rotate_reclaimable_page(page);
+		folio_rotate_reclaimable(page_folio(page));
 	}
 
 	/*
diff --git a/mm/internal.h b/mm/internal.h
index 46eb82eaa195..68d363a3a1f3 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -35,6 +35,7 @@
 void page_writeback_init(void);
 
 vm_fault_t do_swap_page(struct vm_fault *vmf);
+void folio_rotate_reclaimable(struct folio *folio);
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 		unsigned long floor, unsigned long ceiling);
diff --git a/mm/page_io.c b/mm/page_io.c
index c493ce9ebcf5..d597bc6e6e45 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -38,7 +38,7 @@ void end_swap_bio_write(struct bio *bio)
 		 * Also print a dire warning that things will go BAD (tm)
 		 * very quickly.
 		 *
-		 * Also clear PG_reclaim to avoid rotate_reclaimable_page()
+		 * Also clear PG_reclaim to avoid folio_rotate_reclaimable()
 		 */
 		set_page_dirty(page);
 		pr_alert_ratelimited("Write-error on swap-device (%u:%u:%llu)\n",
@@ -317,7 +317,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 			 * temporary failure if the system has limited
 			 * memory for allocating transmit buffers.
 			 * Mark the page dirty and avoid
-			 * rotate_reclaimable_page but rate-limit the
+			 * folio_rotate_reclaimable but rate-limit the
 			 * messages but do not flag PageError like
 			 * the normal direct-to-bio case as it could
 			 * be temporary.
diff --git a/mm/swap.c b/mm/swap.c
index dfb48cf9c2c9..6caca11cd2ec 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -249,23 +249,23 @@ static bool pagevec_add_and_need_flush(struct pagevec *pvec, struct page *page)
 }
 
 /*
- * Writeback is about to end against a page which has been marked for immediate
- * reclaim.  If it still appears to be reclaimable, move it to the tail of the
- * inactive list.
+ * Writeback is about to end against a folio which has been marked for
+ * immediate reclaim.  If it still appears to be reclaimable, move it
+ * to the tail of the inactive list.
  *
- * rotate_reclaimable_page() must disable IRQs, to prevent nasty races.
+ * folio_rotate_reclaimable() must disable IRQs, to prevent nasty races.
  */
-void rotate_reclaimable_page(struct page *page)
+void folio_rotate_reclaimable(struct folio *folio)
 {
-	if (!PageLocked(page) && !PageDirty(page) &&
-	    !PageUnevictable(page) && PageLRU(page)) {
+	if (!folio_locked(folio) && !folio_dirty(folio) &&
+	    !folio_unevictable(folio) && folio_lru(folio)) {
 		struct pagevec *pvec;
 		unsigned long flags;
 
-		get_page(page);
+		folio_get(folio);
 		local_lock_irqsave(&lru_rotate.lock, flags);
 		pvec = this_cpu_ptr(&lru_rotate.pvec);
-		if (pagevec_add_and_need_flush(pvec, page))
+		if (pagevec_add_and_need_flush(pvec, &folio->page))
 			pagevec_lru_move_fn(pvec, pagevec_move_tail_fn);
 		local_unlock_irqrestore(&lru_rotate.lock, flags);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 24/31] mm/filemap: Add folio_end_writeback
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (22 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 23/31] mm/swap: Add folio_rotate_reclaimable Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 25/31] mm/writeback: Add folio_wait_writeback Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Add an end_page_writeback() wrapper function for users that are not yet
converted to folios.

folio_end_writeback() is less than half the size of end_page_writeback()
at just 105 bytes compared to 213 bytes, due to removing all the
compound_head() calls.  The 30 byte wrapper function makes this a net
saving of 70 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  3 ++-
 mm/filemap.c            | 40 ++++++++++++++++++++--------------------
 mm/folio-compat.c       |  6 ++++++
 3 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 87002bc2fdce..f35cd7002107 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -829,7 +829,8 @@ static inline int wait_on_page_locked_killable(struct page *page)
 int put_and_wait_on_page_locked(struct page *page, int state);
 void wait_on_page_writeback(struct page *page);
 int wait_on_page_writeback_killable(struct page *page);
-extern void end_page_writeback(struct page *page);
+void end_page_writeback(struct page *page);
+void folio_end_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
 
 void page_endio(struct page *page, bool is_write, int err);
diff --git a/mm/filemap.c b/mm/filemap.c
index e15c0a273362..3990ec4040ec 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1175,11 +1175,11 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 	spin_unlock_irqrestore(&q->lock, flags);
 }
 
-static void wake_up_page(struct page *page, int bit)
+static void folio_wake(struct folio *folio, int bit)
 {
-	if (!PageWaiters(page))
+	if (!folio_waiters(folio))
 		return;
-	wake_up_page_bit(page, bit);
+	wake_up_page_bit(&folio->page, bit);
 }
 
 /*
@@ -1514,38 +1514,38 @@ int wait_on_page_private_2_killable(struct page *page)
 EXPORT_SYMBOL(wait_on_page_private_2_killable);
 
 /**
- * end_page_writeback - end writeback against a page
- * @page: the page
+ * folio_end_writeback - End writeback against a folio.
+ * @folio: The folio.
  */
-void end_page_writeback(struct page *page)
+void folio_end_writeback(struct folio *folio)
 {
 	/*
-	 * TestClearPageReclaim could be used here but it is an atomic
+	 * folio_test_clear_reclaim() could be used here but it is an atomic
 	 * operation and overkill in this particular case. Failing to
-	 * shuffle a page marked for immediate reclaim is too mild to
+	 * shuffle a folio marked for immediate reclaim is too mild to
 	 * justify taking an atomic operation penalty at the end of
-	 * ever page writeback.
+	 * every folio writeback.
 	 */
-	if (PageReclaim(page)) {
-		ClearPageReclaim(page);
-		folio_rotate_reclaimable(page_folio(page));
+	if (folio_reclaim(folio)) {
+		folio_clear_reclaim(folio);
+		folio_rotate_reclaimable(folio);
 	}
 
 	/*
-	 * Writeback does not hold a page reference of its own, relying
+	 * Writeback does not hold a folio reference of its own, relying
 	 * on truncation to wait for the clearing of PG_writeback.
-	 * But here we must make sure that the page is not freed and
-	 * reused before the wake_up_page().
+	 * But here we must make sure that the folio is not freed and
+	 * reused before the folio_wake().
 	 */
-	get_page(page);
-	if (!test_clear_page_writeback(page))
+	folio_get(folio);
+	if (!test_clear_page_writeback(&folio->page))
 		BUG();
 
 	smp_mb__after_atomic();
-	wake_up_page(page, PG_writeback);
-	put_page(page);
+	folio_wake(folio, PG_writeback);
+	folio_put(folio);
 }
-EXPORT_SYMBOL(end_page_writeback);
+EXPORT_SYMBOL(folio_end_writeback);
 
 /*
  * After completing I/O on a page, call this routine to update the page
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 91b3d00a92f7..526843d03d58 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -17,3 +17,9 @@ void unlock_page(struct page *page)
 	return folio_unlock(page_folio(page));
 }
 EXPORT_SYMBOL(unlock_page);
+
+void end_page_writeback(struct page *page)
+{
+	return folio_end_writeback(page_folio(page));
+}
+EXPORT_SYMBOL(end_page_writeback);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 25/31] mm/writeback: Add folio_wait_writeback
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (23 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 24/31] mm/filemap: Add folio_end_writeback Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 26/31] mm/writeback: Add folio_wait_stable Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

wait_on_page_writeback_killable() only has one caller, so convert it to
call folio_wait_writeback_killable().  For the wait_on_page_writeback()
callers, add a compatibility wrapper around folio_wait_writeback().

Turning PageWriteback() into folio_writeback() eliminates a call to
compound_head() which saves 8 bytes and 15 bytes in the two functions.
That is more than offset by adding the wait_on_page_writeback
compatibility wrapper for a net increase in text of 15 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/afs/write.c          |  9 ++++----
 include/linux/pagemap.h |  3 ++-
 mm/folio-compat.c       |  6 ++++++
 mm/page-writeback.c     | 48 ++++++++++++++++++++++++++++-------------
 4 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index dc66ff15dd16..84f514f6548b 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -831,7 +831,8 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  */
 vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 {
-	struct page *page = thp_head(vmf->page);
+	struct folio *folio = page_folio(vmf->page);
+	struct page *page = &folio->page;
 	struct file *file = vmf->vma->vm_file;
 	struct inode *inode = file_inode(file);
 	struct afs_vnode *vnode = AFS_FS_I(inode);
@@ -850,7 +851,7 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 		return VM_FAULT_RETRY;
 #endif
 
-	if (wait_on_page_writeback_killable(page))
+	if (folio_wait_writeback_killable(folio))
 		return VM_FAULT_RETRY;
 
 	if (lock_page_killable(page) < 0)
@@ -860,8 +861,8 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 	 * details the portion of the page we need to write back and we might
 	 * need to redirty the page if there's a problem.
 	 */
-	if (wait_on_page_writeback_killable(page) < 0) {
-		unlock_page(page);
+	if (folio_wait_writeback_killable(folio) < 0) {
+		folio_unlock(folio);
 		return VM_FAULT_RETRY;
 	}
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index f35cd7002107..814c488926d2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -828,7 +828,8 @@ static inline int wait_on_page_locked_killable(struct page *page)
 
 int put_and_wait_on_page_locked(struct page *page, int state);
 void wait_on_page_writeback(struct page *page);
-int wait_on_page_writeback_killable(struct page *page);
+void folio_wait_writeback(struct folio *folio);
+int folio_wait_writeback_killable(struct folio *folio);
 void end_page_writeback(struct page *page);
 void folio_end_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 526843d03d58..41275dac7a92 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -23,3 +23,9 @@ void end_page_writeback(struct page *page)
 	return folio_end_writeback(page_folio(page));
 }
 EXPORT_SYMBOL(end_page_writeback);
+
+void wait_on_page_writeback(struct page *page)
+{
+	return folio_wait_writeback(page_folio(page));
+}
+EXPORT_SYMBOL_GPL(wait_on_page_writeback);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0062d5c57d41..c8bc78cd0f2b 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2818,33 +2818,51 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 }
 EXPORT_SYMBOL(__test_set_page_writeback);
 
-/*
- * Wait for a page to complete writeback
+/**
+ * folio_wait_writeback - Wait for a folio to finish writeback.
+ * @folio: The folio to wait for.
+ *
+ * If the folio is currently being written back to storage, wait for the
+ * I/O to complete.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
  */
-void wait_on_page_writeback(struct page *page)
+void folio_wait_writeback(struct folio *folio)
 {
-	while (PageWriteback(page)) {
-		trace_wait_on_page_writeback(page, page_mapping(page));
-		wait_on_page_bit(page, PG_writeback);
+	while (folio_writeback(folio)) {
+		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
+		wait_on_page_bit(&folio->page, PG_writeback);
 	}
 }
-EXPORT_SYMBOL_GPL(wait_on_page_writeback);
+EXPORT_SYMBOL_GPL(folio_wait_writeback);
 
-/*
- * Wait for a page to complete writeback.  Returns -EINTR if we get a
- * fatal signal while waiting.
+/**
+ * folio_wait_writeback_killable - Wait for a folio to finish writeback.
+ * @folio: The folio to wait for.
+ *
+ * If the folio is currently being written back to storage, wait for the
+ * I/O to complete or a fatal signal to arrive.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
+ * Return: 0 on success, -EINTR if we get a fatal signal while waiting.
  */
-int wait_on_page_writeback_killable(struct page *page)
+int folio_wait_writeback_killable(struct folio *folio)
 {
-	while (PageWriteback(page)) {
-		trace_wait_on_page_writeback(page, page_mapping(page));
-		if (wait_on_page_bit_killable(page, PG_writeback))
+	while (folio_writeback(folio)) {
+		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
+		if (wait_on_page_bit_killable(&folio->page, PG_writeback))
 			return -EINTR;
 	}
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(wait_on_page_writeback_killable);
+EXPORT_SYMBOL_GPL(folio_wait_writeback_killable);
 
 /**
  * wait_for_stable_page() - wait for writeback to finish, if necessary.
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 26/31] mm/writeback: Add folio_wait_stable
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (24 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 25/31] mm/writeback: Add folio_wait_writeback Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 27/31] mm/filemap: Add folio_wait_bit Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Move wait_for_stable_page() into the folio compatibility file.
folio_wait_stable() avoids a call to compound_head() and is 14 bytes
smaller than wait_for_stable_page() was.  The net text size grows by 24
bytes as a result of this patch.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  1 +
 mm/folio-compat.c       |  6 ++++++
 mm/page-writeback.c     | 24 ++++++++++++++----------
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 814c488926d2..b11482e1ef18 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -833,6 +833,7 @@ int folio_wait_writeback_killable(struct folio *folio);
 void end_page_writeback(struct page *page);
 void folio_end_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
+void folio_wait_stable(struct folio *folio);
 
 void page_endio(struct page *page, bool is_write, int err);
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 41275dac7a92..3c83f03b80d7 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -29,3 +29,9 @@ void wait_on_page_writeback(struct page *page)
 	return folio_wait_writeback(page_folio(page));
 }
 EXPORT_SYMBOL_GPL(wait_on_page_writeback);
+
+void wait_for_stable_page(struct page *page)
+{
+	return folio_wait_stable(page_folio(page));
+}
+EXPORT_SYMBOL_GPL(wait_for_stable_page);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index c8bc78cd0f2b..8b15c44552f7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2865,17 +2865,21 @@ int folio_wait_writeback_killable(struct folio *folio)
 EXPORT_SYMBOL_GPL(folio_wait_writeback_killable);
 
 /**
- * wait_for_stable_page() - wait for writeback to finish, if necessary.
- * @page:	The page to wait on.
+ * folio_wait_stable() - wait for writeback to finish, if necessary.
+ * @folio: The folio to wait on.
  *
- * This function determines if the given page is related to a backing device
- * that requires page contents to be held stable during writeback.  If so, then
- * it will wait for any pending writeback to complete.
+ * This function determines if the given folio is related to a backing
+ * device that requires folio contents to be held stable during writeback.
+ * If so, then it will wait for any pending writeback to complete.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
  */
-void wait_for_stable_page(struct page *page)
+void folio_wait_stable(struct folio *folio)
 {
-	page = thp_head(page);
-	if (page->mapping->host->i_sb->s_iflags & SB_I_STABLE_WRITES)
-		wait_on_page_writeback(page);
+	if (folio->mapping->host->i_sb->s_iflags & SB_I_STABLE_WRITES)
+		folio_wait_writeback(folio);
 }
-EXPORT_SYMBOL_GPL(wait_for_stable_page);
+EXPORT_SYMBOL_GPL(folio_wait_stable);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 27/31] mm/filemap: Add folio_wait_bit
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (25 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 26/31] mm/writeback: Add folio_wait_stable Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 28/31] mm/filemap: Add folio_wake_bit Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Rename wait_on_page_bit() to folio_wait_bit().  We must always wait on
the folio, otherwise we won't be woken up due to the tail page hashing
to a different bucket from the head page.

This commit shrinks the kernel by 691 bytes, mostly due to moving
the page waitqueue lookup into folio_wait_bit_common().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 10 +++---
 mm/filemap.c            | 77 +++++++++++++++++++----------------------
 mm/page-writeback.c     |  4 +--
 3 files changed, 43 insertions(+), 48 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index b11482e1ef18..9310561f1b1e 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -790,11 +790,11 @@ static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
 }
 
 /*
- * This is exported only for wait_on_page_locked/wait_on_page_writeback, etc.,
+ * This is exported only for folio_wait_locked/folio_wait_writeback, etc.,
  * and should not be used directly.
  */
-extern void wait_on_page_bit(struct page *page, int bit_nr);
-extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
+extern void folio_wait_bit(struct folio *folio, int bit_nr);
+extern int folio_wait_bit_killable(struct folio *folio, int bit_nr);
 
 /* 
  * Wait for a folio to be unlocked.
@@ -806,14 +806,14 @@ extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
 static inline void folio_wait_locked(struct folio *folio)
 {
 	if (folio_locked(folio))
-		wait_on_page_bit(&folio->page, PG_locked);
+		folio_wait_bit(folio, PG_locked);
 }
 
 static inline int folio_wait_locked_killable(struct folio *folio)
 {
 	if (!folio_locked(folio))
 		return 0;
-	return wait_on_page_bit_killable(&folio->page, PG_locked);
+	return folio_wait_bit_killable(folio, PG_locked);
 }
 
 static inline void wait_on_page_locked(struct page *page)
diff --git a/mm/filemap.c b/mm/filemap.c
index 3990ec4040ec..907a947ca21f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1102,7 +1102,7 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	 *
 	 * So update the flags atomically, and wake up the waiter
 	 * afterwards to avoid any races. This store-release pairs
-	 * with the load-acquire in wait_on_page_bit_common().
+	 * with the load-acquire in folio_wait_bit_common().
 	 */
 	smp_store_release(&wait->flags, flags | WQ_FLAG_WOKEN);
 	wake_up_state(wait->private, mode);
@@ -1183,7 +1183,7 @@ static void folio_wake(struct folio *folio, int bit)
 }
 
 /*
- * A choice of three behaviors for wait_on_page_bit_common():
+ * A choice of three behaviors for folio_wait_bit_common():
  */
 enum behavior {
 	EXCLUSIVE,	/* Hold ref to page and take the bit when woken, like
@@ -1198,16 +1198,16 @@ enum behavior {
 };
 
 /*
- * Attempt to check (or get) the page bit, and mark us done
+ * Attempt to check (or get) the folio flag, and mark us done
  * if successful.
  */
-static inline bool trylock_page_bit_common(struct page *page, int bit_nr,
+static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 					struct wait_queue_entry *wait)
 {
 	if (wait->flags & WQ_FLAG_EXCLUSIVE) {
-		if (test_and_set_bit(bit_nr, &page->flags))
+		if (test_and_set_bit(bit_nr, &folio->flags))
 			return false;
-	} else if (test_bit(bit_nr, &page->flags))
+	} else if (test_bit(bit_nr, &folio->flags))
 		return false;
 
 	wait->flags |= WQ_FLAG_WOKEN | WQ_FLAG_DONE;
@@ -1217,9 +1217,10 @@ static inline bool trylock_page_bit_common(struct page *page, int bit_nr,
 /* How many times do we accept lock stealing from under a waiter? */
 int sysctl_page_lock_unfairness = 5;
 
-static inline int wait_on_page_bit_common(wait_queue_head_t *q,
-	struct page *page, int bit_nr, int state, enum behavior behavior)
+static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
+		int state, enum behavior behavior)
 {
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
 	int unfairness = sysctl_page_lock_unfairness;
 	struct wait_page_queue wait_page;
 	wait_queue_entry_t *wait = &wait_page.wait;
@@ -1228,8 +1229,8 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	unsigned long pflags;
 
 	if (bit_nr == PG_locked &&
-	    !PageUptodate(page) && PageWorkingset(page)) {
-		if (!PageSwapBacked(page)) {
+	    !folio_uptodate(folio) && folio_workingset(folio)) {
+		if (!folio_swapbacked(folio)) {
 			delayacct_thrashing_start();
 			delayacct = true;
 		}
@@ -1239,7 +1240,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 
 	init_wait(wait);
 	wait->func = wake_page_function;
-	wait_page.page = page;
+	wait_page.page = &folio->page;
 	wait_page.bit_nr = bit_nr;
 
 repeat:
@@ -1254,7 +1255,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * Do one last check whether we can get the
 	 * page bit synchronously.
 	 *
-	 * Do the SetPageWaiters() marking before that
+	 * Do the folio_set_waiters() marking before that
 	 * to let any waker we _just_ missed know they
 	 * need to wake us up (otherwise they'll never
 	 * even go to the slow case that looks at the
@@ -1265,8 +1266,8 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * lock to avoid races.
 	 */
 	spin_lock_irq(&q->lock);
-	SetPageWaiters(page);
-	if (!trylock_page_bit_common(page, bit_nr, wait))
+	folio_set_waiters(folio);
+	if (!folio_trylock_flag(folio, bit_nr, wait))
 		__add_wait_queue_entry_tail(q, wait);
 	spin_unlock_irq(&q->lock);
 
@@ -1276,10 +1277,10 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * see whether the page bit testing has already
 	 * been done by the wake function.
 	 *
-	 * We can drop our reference to the page.
+	 * We can drop our reference to the folio.
 	 */
 	if (behavior == DROP)
-		put_page(page);
+		folio_put(folio);
 
 	/*
 	 * Note that until the "finish_wait()", or until
@@ -1316,7 +1317,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 		 *
 		 * And if that fails, we'll have to retry this all.
 		 */
-		if (unlikely(test_and_set_bit(bit_nr, &page->flags)))
+		if (unlikely(test_and_set_bit(bit_nr, folio_flags(folio, 0))))
 			goto repeat;
 
 		wait->flags |= WQ_FLAG_DONE;
@@ -1325,7 +1326,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 
 	/*
 	 * If a signal happened, this 'finish_wait()' may remove the last
-	 * waiter from the wait-queues, but the PageWaiters bit will remain
+	 * waiter from the wait-queues, but the folio_waiters bit will remain
 	 * set. That's ok. The next wakeup will take care of it, and trying
 	 * to do it here would be difficult and prone to races.
 	 */
@@ -1356,19 +1357,17 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR;
 }
 
-void wait_on_page_bit(struct page *page, int bit_nr)
+void folio_wait_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
-	wait_on_page_bit_common(q, page, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
+	folio_wait_bit_common(folio, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
 }
-EXPORT_SYMBOL(wait_on_page_bit);
+EXPORT_SYMBOL(folio_wait_bit);
 
-int wait_on_page_bit_killable(struct page *page, int bit_nr)
+int folio_wait_bit_killable(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, bit_nr, TASK_KILLABLE, SHARED);
+	return folio_wait_bit_common(folio, bit_nr, TASK_KILLABLE, SHARED);
 }
-EXPORT_SYMBOL(wait_on_page_bit_killable);
+EXPORT_SYMBOL(folio_wait_bit_killable);
 
 /**
  * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
@@ -1385,11 +1384,8 @@ EXPORT_SYMBOL(wait_on_page_bit_killable);
  */
 int put_and_wait_on_page_locked(struct page *page, int state)
 {
-	wait_queue_head_t *q;
-
-	page = compound_head(page);
-	q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, PG_locked, state, DROP);
+	return folio_wait_bit_common(page_folio(page), PG_locked, state,
+			DROP);
 }
 
 /**
@@ -1481,9 +1477,10 @@ EXPORT_SYMBOL(end_page_private_2);
  */
 void wait_on_page_private_2(struct page *page)
 {
-	page = compound_head(page);
-	while (PagePrivate2(page))
-		wait_on_page_bit(page, PG_private_2);
+	struct folio *folio = page_folio(page);
+
+	while (folio_private_2(folio))
+		folio_wait_bit(folio, PG_private_2);
 }
 EXPORT_SYMBOL(wait_on_page_private_2);
 
@@ -1500,11 +1497,11 @@ EXPORT_SYMBOL(wait_on_page_private_2);
  */
 int wait_on_page_private_2_killable(struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	int ret = 0;
 
-	page = compound_head(page);
-	while (PagePrivate2(page)) {
-		ret = wait_on_page_bit_killable(page, PG_private_2);
+	while (folio_private_2(folio)) {
+		ret = folio_wait_bit_killable(folio, PG_private_2);
 		if (ret < 0)
 			break;
 	}
@@ -1581,16 +1578,14 @@ EXPORT_SYMBOL_GPL(page_endio);
  */
 void __folio_lock(struct folio *folio)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
-	wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
+	folio_wait_bit_common(folio, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
 EXPORT_SYMBOL(__folio_lock);
 
 int __folio_lock_killable(struct folio *folio)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
-	return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
+	return folio_wait_bit_common(folio, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
 EXPORT_SYMBOL_GPL(__folio_lock_killable);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 8b15c44552f7..157ed52e4b7d 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2834,7 +2834,7 @@ void folio_wait_writeback(struct folio *folio)
 {
 	while (folio_writeback(folio)) {
 		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
-		wait_on_page_bit(&folio->page, PG_writeback);
+		folio_wait_bit(folio, PG_writeback);
 	}
 }
 EXPORT_SYMBOL_GPL(folio_wait_writeback);
@@ -2856,7 +2856,7 @@ int folio_wait_writeback_killable(struct folio *folio)
 {
 	while (folio_writeback(folio)) {
 		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
-		if (wait_on_page_bit_killable(&folio->page, PG_writeback))
+		if (folio_wait_bit_killable(folio, PG_writeback))
 			return -EINTR;
 	}
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 28/31] mm/filemap: Add folio_wake_bit
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (26 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 27/31] mm/filemap: Add folio_wait_bit Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 29/31] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Convert wake_up_page_bit() to folio_wake_bit().  All callers have a folio,
so use it directly.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 mm/filemap.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 907a947ca21f..8b569a3d2d2a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1121,14 +1121,14 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	return (flags & WQ_FLAG_EXCLUSIVE) != 0;
 }
 
-static void wake_up_page_bit(struct page *page, int bit_nr)
+static void folio_wake_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
 	struct wait_page_key key;
 	unsigned long flags;
 	wait_queue_entry_t bookmark;
 
-	key.page = page;
+	key.page = &folio->page;
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
@@ -1163,7 +1163,7 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 	 * page waiters.
 	 */
 	if (!waitqueue_active(q) || !key.page_match) {
-		ClearPageWaiters(page);
+		folio_clear_waiters(folio);
 		/*
 		 * It's possible to miss clearing Waiters here, when we woke
 		 * our page waiters, but the hashed waitqueue has waiters for
@@ -1179,7 +1179,7 @@ static void folio_wake(struct folio *folio, int bit)
 {
 	if (!folio_waiters(folio))
 		return;
-	wake_up_page_bit(&folio->page, bit);
+	folio_wake_bit(folio, bit);
 }
 
 /*
@@ -1444,7 +1444,7 @@ void folio_unlock(struct folio *folio)
 	BUILD_BUG_ON(PG_waiters != 7);
 	VM_BUG_ON_FOLIO(!folio_locked(folio), folio);
 	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio, 0)))
-		wake_up_page_bit(&folio->page, PG_locked);
+		folio_wake_bit(folio, PG_locked);
 }
 EXPORT_SYMBOL(folio_unlock);
 
@@ -1461,11 +1461,12 @@ EXPORT_SYMBOL(folio_unlock);
  */
 void end_page_private_2(struct page *page)
 {
-	page = compound_head(page);
-	VM_BUG_ON_PAGE(!PagePrivate2(page), page);
-	clear_bit_unlock(PG_private_2, &page->flags);
-	wake_up_page_bit(page, PG_private_2);
-	put_page(page);
+	struct folio *folio = page_folio(page);
+
+	VM_BUG_ON_FOLIO(!folio_private_2(folio), folio);
+	clear_bit_unlock(PG_private_2, folio_flags(folio, 0));
+	folio_wake_bit(folio, PG_private_2);
+	folio_put(folio);
 }
 EXPORT_SYMBOL(end_page_private_2);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 29/31] mm/filemap: Convert page wait queues to be folios
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (27 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 28/31] mm/filemap: Add folio_wake_bit Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 30/31] mm/filemap: Add folio private_2 functions Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm
  Cc: Matthew Wilcox (Oracle), Christoph Hellwig, Jeff Layton

Reinforce that page flags are actually in the head page by changing the
type from page to folio.  Increases the size of cachefiles by two bytes,
but the kernel core is unchanged in size.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/cachefiles/rdwr.c    | 16 ++++++++--------
 include/linux/pagemap.h |  8 ++++----
 mm/filemap.c            | 38 +++++++++++++++++++-------------------
 3 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 8ffc40e84a59..e211a3d5ba44 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -25,20 +25,20 @@ static int cachefiles_read_waiter(wait_queue_entry_t *wait, unsigned mode,
 	struct cachefiles_object *object;
 	struct fscache_retrieval *op = monitor->op;
 	struct wait_page_key *key = _key;
-	struct page *page = wait->private;
+	struct folio *folio = wait->private;
 
 	ASSERT(key);
 
 	_enter("{%lu},%u,%d,{%p,%u}",
 	       monitor->netfs_page->index, mode, sync,
-	       key->page, key->bit_nr);
+	       key->folio, key->bit_nr);
 
-	if (key->page != page || key->bit_nr != PG_locked)
+	if (key->folio != folio || key->bit_nr != PG_locked)
 		return 0;
 
-	_debug("--- monitor %p %lx ---", page, page->flags);
+	_debug("--- monitor %p %lx ---", folio, folio->flags);
 
-	if (!PageUptodate(page) && !PageError(page)) {
+	if (!folio_uptodate(folio) && !folio_error(folio)) {
 		/* unlocked, not uptodate and not erronous? */
 		_debug("page probably truncated");
 	}
@@ -107,7 +107,7 @@ static int cachefiles_read_reissue(struct cachefiles_object *object,
 	put_page(backpage2);
 
 	INIT_LIST_HEAD(&monitor->op_link);
-	add_page_wait_queue(backpage, &monitor->monitor);
+	folio_add_wait_queue(page_folio(backpage), &monitor->monitor);
 
 	if (trylock_page(backpage)) {
 		ret = -EIO;
@@ -294,7 +294,7 @@ static int cachefiles_read_backing_file_one(struct cachefiles_object *object,
 	get_page(backpage);
 	monitor->back_page = backpage;
 	monitor->monitor.private = backpage;
-	add_page_wait_queue(backpage, &monitor->monitor);
+	folio_add_wait_queue(page_folio(backpage), &monitor->monitor);
 	monitor = NULL;
 
 	/* but the page may have been read before the monitor was installed, so
@@ -548,7 +548,7 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 		get_page(backpage);
 		monitor->back_page = backpage;
 		monitor->monitor.private = backpage;
-		add_page_wait_queue(backpage, &monitor->monitor);
+		folio_add_wait_queue(page_folio(backpage), &monitor->monitor);
 		monitor = NULL;
 
 		/* but the page may have been read before the monitor was
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9310561f1b1e..d8204a3d6734 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -690,13 +690,13 @@ static inline pgoff_t linear_page_index(struct vm_area_struct *vma,
 }
 
 struct wait_page_key {
-	struct page *page;
+	struct folio *folio;
 	int bit_nr;
 	int page_match;
 };
 
 struct wait_page_queue {
-	struct page *page;
+	struct folio *folio;
 	int bit_nr;
 	wait_queue_entry_t wait;
 };
@@ -704,7 +704,7 @@ struct wait_page_queue {
 static inline bool wake_page_match(struct wait_page_queue *wait_page,
 				  struct wait_page_key *key)
 {
-	if (wait_page->page != key->page)
+	if (wait_page->folio != key->folio)
 	       return false;
 	key->page_match = 1;
 
@@ -860,7 +860,7 @@ int wait_on_page_private_2_killable(struct page *page);
 /*
  * Add an arbitrary waiter to a page's wait queue
  */
-extern void add_page_wait_queue(struct page *page, wait_queue_entry_t *waiter);
+void folio_add_wait_queue(struct folio *folio, wait_queue_entry_t *waiter);
 
 /*
  * Fault everything in given userspace address range in.
diff --git a/mm/filemap.c b/mm/filemap.c
index 8b569a3d2d2a..94d4ad7432d3 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1019,11 +1019,11 @@ EXPORT_SYMBOL(__page_cache_alloc);
  */
 #define PAGE_WAIT_TABLE_BITS 8
 #define PAGE_WAIT_TABLE_SIZE (1 << PAGE_WAIT_TABLE_BITS)
-static wait_queue_head_t page_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
+static wait_queue_head_t folio_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
 
-static wait_queue_head_t *page_waitqueue(struct page *page)
+static wait_queue_head_t *folio_waitqueue(struct folio *folio)
 {
-	return &page_wait_table[hash_ptr(page, PAGE_WAIT_TABLE_BITS)];
+	return &folio_wait_table[hash_ptr(folio, PAGE_WAIT_TABLE_BITS)];
 }
 
 void __init pagecache_init(void)
@@ -1031,7 +1031,7 @@ void __init pagecache_init(void)
 	int i;
 
 	for (i = 0; i < PAGE_WAIT_TABLE_SIZE; i++)
-		init_waitqueue_head(&page_wait_table[i]);
+		init_waitqueue_head(&folio_wait_table[i]);
 
 	page_writeback_init();
 }
@@ -1086,10 +1086,10 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	 */
 	flags = wait->flags;
 	if (flags & WQ_FLAG_EXCLUSIVE) {
-		if (test_bit(key->bit_nr, &key->page->flags))
+		if (test_bit(key->bit_nr, &key->folio->flags))
 			return -1;
 		if (flags & WQ_FLAG_CUSTOM) {
-			if (test_and_set_bit(key->bit_nr, &key->page->flags))
+			if (test_and_set_bit(key->bit_nr, &key->folio->flags))
 				return -1;
 			flags |= WQ_FLAG_DONE;
 		}
@@ -1123,12 +1123,12 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 
 static void folio_wake_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	struct wait_page_key key;
 	unsigned long flags;
 	wait_queue_entry_t bookmark;
 
-	key.page = &folio->page;
+	key.folio = folio;
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
@@ -1220,7 +1220,7 @@ int sysctl_page_lock_unfairness = 5;
 static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 		int state, enum behavior behavior)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	int unfairness = sysctl_page_lock_unfairness;
 	struct wait_page_queue wait_page;
 	wait_queue_entry_t *wait = &wait_page.wait;
@@ -1240,7 +1240,7 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 
 	init_wait(wait);
 	wait->func = wake_page_function;
-	wait_page.page = &folio->page;
+	wait_page.folio = folio;
 	wait_page.bit_nr = bit_nr;
 
 repeat:
@@ -1389,23 +1389,23 @@ int put_and_wait_on_page_locked(struct page *page, int state)
 }
 
 /**
- * add_page_wait_queue - Add an arbitrary waiter to a page's wait queue
- * @page: Page defining the wait queue of interest
+ * folio_add_wait_queue - Add an arbitrary waiter to a folio's wait queue
+ * @folio: Folio defining the wait queue of interest
  * @waiter: Waiter to add to the queue
  *
- * Add an arbitrary @waiter to the wait queue for the nominated @page.
+ * Add an arbitrary @waiter to the wait queue for the nominated @folio.
  */
-void add_page_wait_queue(struct page *page, wait_queue_entry_t *waiter)
+void folio_add_wait_queue(struct folio *folio, wait_queue_entry_t *waiter)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	unsigned long flags;
 
 	spin_lock_irqsave(&q->lock, flags);
 	__add_wait_queue_entry_tail(q, waiter);
-	SetPageWaiters(page);
+	folio_set_waiters(folio);
 	spin_unlock_irqrestore(&q->lock, flags);
 }
-EXPORT_SYMBOL_GPL(add_page_wait_queue);
+EXPORT_SYMBOL_GPL(folio_add_wait_queue);
 
 #ifndef clear_bit_unlock_is_negative_byte
 
@@ -1593,10 +1593,10 @@ EXPORT_SYMBOL_GPL(__folio_lock_killable);
 
 static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
 {
-	struct wait_queue_head *q = page_waitqueue(&folio->page);
+	struct wait_queue_head *q = folio_waitqueue(folio);
 	int ret = 0;
 
-	wait->page = &folio->page;
+	wait->folio = folio;
 	wait->bit_nr = PG_locked;
 
 	spin_lock_irq(&q->lock);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 30/31] mm/filemap: Add folio private_2 functions
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (28 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 29/31] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:07 ` [PATCH v8 31/31] fs/netfs: Add folio fscache functions Matthew Wilcox (Oracle)
  2021-04-30 18:47 ` [PATCH v8.1 00/31] Memory Folios Hugh Dickins
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm; +Cc: Matthew Wilcox (Oracle)

end_page_private_2() becomes folio_end_private_2(),
wait_on_page_private_2() becomes folio_wait_private_2() and
wait_on_page_private_2_killable() becomes folio_wait_private_2_killable().

Adjust the fscache equivalents to call page_folio() before calling these
functions to avoid adding wrappers.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/netfs.h   |  6 +++---
 include/linux/pagemap.h |  6 +++---
 mm/filemap.c            | 37 ++++++++++++++++---------------------
 3 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 9062adfa2fb9..fad8c6209edd 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -55,7 +55,7 @@ static inline void set_page_fscache(struct page *page)
  */
 static inline void end_page_fscache(struct page *page)
 {
-	end_page_private_2(page);
+	folio_end_private_2(page_folio(page));
 }
 
 /**
@@ -66,7 +66,7 @@ static inline void end_page_fscache(struct page *page)
  */
 static inline void wait_on_page_fscache(struct page *page)
 {
-	wait_on_page_private_2(page);
+	folio_wait_private_2(page_folio(page));
 }
 
 /**
@@ -82,7 +82,7 @@ static inline void wait_on_page_fscache(struct page *page)
  */
 static inline int wait_on_page_fscache_killable(struct page *page)
 {
-	return wait_on_page_private_2_killable(page);
+	return folio_wait_private_2_killable(page_folio(page));
 }
 
 enum netfs_read_source {
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index d8204a3d6734..f68bd82837a4 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -853,9 +853,9 @@ static inline void set_page_private_2(struct page *page)
 	SetPagePrivate2(page);
 }
 
-void end_page_private_2(struct page *page);
-void wait_on_page_private_2(struct page *page);
-int wait_on_page_private_2_killable(struct page *page);
+void folio_end_private_2(struct folio *folio);
+void folio_wait_private_2(struct folio *folio);
+int folio_wait_private_2_killable(struct folio *folio);
 
 /*
  * Add an arbitrary waiter to a page's wait queue
diff --git a/mm/filemap.c b/mm/filemap.c
index 94d4ad7432d3..3cf1d7fbd0b1 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1449,56 +1449,51 @@ void folio_unlock(struct folio *folio)
 EXPORT_SYMBOL(folio_unlock);
 
 /**
- * end_page_private_2 - Clear PG_private_2 and release any waiters
- * @page: The page
+ * folio_end_private_2 - Clear PG_private_2 and wake any waiters.
+ * @folio: The folio.
  *
- * Clear the PG_private_2 bit on a page and wake up any sleepers waiting for
- * this.  The page ref held for PG_private_2 being set is released.
+ * Clear the PG_private_2 bit on a folio and wake up any sleepers waiting for
+ * it.  The page ref held for PG_private_2 being set is released.
  *
  * This is, for example, used when a netfs page is being written to a local
  * disk cache, thereby allowing writes to the cache for the same page to be
  * serialised.
  */
-void end_page_private_2(struct page *page)
+void folio_end_private_2(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
-
 	VM_BUG_ON_FOLIO(!folio_private_2(folio), folio);
 	clear_bit_unlock(PG_private_2, folio_flags(folio, 0));
 	folio_wake_bit(folio, PG_private_2);
 	folio_put(folio);
 }
-EXPORT_SYMBOL(end_page_private_2);
+EXPORT_SYMBOL(folio_end_private_2);
 
 /**
- * wait_on_page_private_2 - Wait for PG_private_2 to be cleared on a page
- * @page: The page to wait on
+ * folio_wait_private_2 - Wait for PG_private_2 to be cleared on a page.
+ * @folio: The folio to wait on.
  *
- * Wait for PG_private_2 (aka PG_fscache) to be cleared on a page.
+ * Wait for PG_private_2 (aka PG_fscache) to be cleared on a folio.
  */
-void wait_on_page_private_2(struct page *page)
+void folio_wait_private_2(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
-
 	while (folio_private_2(folio))
 		folio_wait_bit(folio, PG_private_2);
 }
-EXPORT_SYMBOL(wait_on_page_private_2);
+EXPORT_SYMBOL(folio_wait_private_2);
 
 /**
- * wait_on_page_private_2_killable - Wait for PG_private_2 to be cleared on a page
- * @page: The page to wait on
+ * folio_wait_private_2_killable - Wait for PG_private_2 to be cleared on a folio.
+ * @folio: The folio to wait on.
  *
- * Wait for PG_private_2 (aka PG_fscache) to be cleared on a page or until a
+ * Wait for PG_private_2 (aka PG_fscache) to be cleared on a folio or until a
  * fatal signal is received by the calling task.
  *
  * Return:
  * - 0 if successful.
  * - -EINTR if a fatal signal was encountered.
  */
-int wait_on_page_private_2_killable(struct page *page)
+int folio_wait_private_2_killable(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	int ret = 0;
 
 	while (folio_private_2(folio)) {
@@ -1509,7 +1504,7 @@ int wait_on_page_private_2_killable(struct page *page)
 
 	return ret;
 }
-EXPORT_SYMBOL(wait_on_page_private_2_killable);
+EXPORT_SYMBOL(folio_wait_private_2_killable);
 
 /**
  * folio_end_writeback - End writeback against a folio.
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 31/31] fs/netfs: Add folio fscache functions
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (29 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 30/31] mm/filemap: Add folio private_2 functions Matthew Wilcox (Oracle)
@ 2021-04-30 18:07 ` Matthew Wilcox (Oracle)
  2021-04-30 18:47 ` [PATCH v8.1 00/31] Memory Folios Hugh Dickins
  31 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-30 18:07 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, akpm; +Cc: Matthew Wilcox (Oracle)

Match the page writeback functions by adding
folio_start_fscache(), folio_end_fscache(), folio_wait_fscache() and
folio_wait_fscache_killable().  Also rewrite the kernel-doc to describe
when to use the function rather than what the function does, and include
the kernel-doc in the appropriate rst file.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 Documentation/filesystems/netfs_library.rst |  2 +
 include/linux/netfs.h                       | 75 +++++++++++++--------
 2 files changed, 50 insertions(+), 27 deletions(-)

diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
index 57a641847818..bb68d39f03b7 100644
--- a/Documentation/filesystems/netfs_library.rst
+++ b/Documentation/filesystems/netfs_library.rst
@@ -524,3 +524,5 @@ Note that these methods are passed a pointer to the cache resource structure,
 not the read request structure as they could be used in other situations where
 there isn't a read request structure as well, such as writing dirty data to the
 cache.
+
+.. kernel-doc:: include/linux/netfs.h
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index fad8c6209edd..ea0bb793ac98 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -22,6 +22,7 @@
  * Overload PG_private_2 to give us PG_fscache - this is used to indicate that
  * a page is currently backed by a local disk cache
  */
+#define folio_fscache(folio)		folio_private_2(folio)
 #define PageFsCache(page)		PagePrivate2((page))
 #define SetPageFsCache(page)		SetPagePrivate2((page))
 #define ClearPageFsCache(page)		ClearPagePrivate2((page))
@@ -29,57 +30,77 @@
 #define TestClearPageFsCache(page)	TestClearPagePrivate2((page))
 
 /**
- * set_page_fscache - Set PG_fscache on a page and take a ref
- * @page: The page.
+ * folio_start_fscache - Start an fscache operation on a folio.
+ * @folio: The folio.
  *
- * Set the PG_fscache (PG_private_2) flag on a page and take the reference
- * needed for the VM to handle its lifetime correctly.  This sets the flag and
- * takes the reference unconditionally, so care must be taken not to set the
- * flag again if it's already set.
+ * Call this function before an fscache operation starts on a folio.
+ * Starting a second fscache operation before the first one finishes is
+ * not allowed.
  */
-static inline void set_page_fscache(struct page *page)
+static inline void folio_start_fscache(struct folio *folio)
 {
-	set_page_private_2(page);
+	VM_BUG_ON_FOLIO(folio_private_2(folio), folio);
+	folio_get(folio);
+	folio_set_private_2(folio);
 }
 
 /**
- * end_page_fscache - Clear PG_fscache and release any waiters
- * @page: The page
- *
- * Clear the PG_fscache (PG_private_2) bit on a page and wake up any sleepers
- * waiting for this.  The page ref held for PG_private_2 being set is released.
+ * folio_end_fscache - End an fscache operation on a folio.
+ * @folio: The folio.
  *
- * This is, for example, used when a netfs page is being written to a local
- * disk cache, thereby allowing writes to the cache for the same page to be
- * serialised.
+ * Call this function after an fscache operation has finished.  This will
+ * wake any sleepers waiting on this folio.
  */
-static inline void end_page_fscache(struct page *page)
+static inline void folio_end_fscache(struct folio *folio)
 {
-	folio_end_private_2(page_folio(page));
+	folio_end_private_2(folio);
 }
 
 /**
- * wait_on_page_fscache - Wait for PG_fscache to be cleared on a page
- * @page: The page to wait on
+ * folio_wait_fscache - Wait for an fscache operation on this folio to end.
+ * @folio: The folio.
  *
- * Wait for PG_fscache (aka PG_private_2) to be cleared on a page.
+ * If an fscache operation is in progress on this folio, wait for it to
+ * finish.  Another fscache operation may start after this one finishes,
+ * unless the caller holds the folio lock.
  */
-static inline void wait_on_page_fscache(struct page *page)
+static inline void folio_wait_fscache(struct folio *folio)
 {
-	folio_wait_private_2(page_folio(page));
+	folio_wait_private_2(folio);
 }
 
 /**
- * wait_on_page_fscache_killable - Wait for PG_fscache to be cleared on a page
- * @page: The page to wait on
+ * folio_wait_fscache_killable - Wait for an fscache operation on this folio to end.
+ * @folio: The folio.
  *
- * Wait for PG_fscache (aka PG_private_2) to be cleared on a page or until a
- * fatal signal is received by the calling task.
+ * If an fscache operation is in progress on this folio, wait for it to
+ * finish or for a fatal signal to be received.  Another fscache operation
+ * may start after this one finishes, unless the caller holds the folio lock.
  *
  * Return:
  * - 0 if successful.
  * - -EINTR if a fatal signal was encountered.
  */
+static inline int folio_wait_fscache_killable(struct folio *folio)
+{
+	return folio_wait_private_2_killable(folio);
+}
+
+static inline void set_page_fscache(struct page *page)
+{
+	folio_start_fscache(page_folio(page));
+}
+
+static inline void end_page_fscache(struct page *page)
+{
+	folio_end_private_2(page_folio(page));
+}
+
+static inline void wait_on_page_fscache(struct page *page)
+{
+	folio_wait_private_2(page_folio(page));
+}
+
 static inline int wait_on_page_fscache_killable(struct page *page)
 {
 	return folio_wait_private_2_killable(page_folio(page));
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
                   ` (30 preceding siblings ...)
  2021-04-30 18:07 ` [PATCH v8 31/31] fs/netfs: Add folio fscache functions Matthew Wilcox (Oracle)
@ 2021-04-30 18:47 ` Hugh Dickins
  2021-05-01  1:32   ` Nicholas Piggin
  31 siblings, 1 reply; 41+ messages in thread
From: Hugh Dickins @ 2021-04-30 18:47 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Linus Torvalds, linux-mm, linux-fsdevel, akpm

Adding Linus to the Cc (of this one only): he surely has an interest.

On Fri, 30 Apr 2021, Matthew Wilcox (Oracle) wrote:

> Managing memory in 4KiB pages is a serious overhead.  Many benchmarks
> benefit from a larger "page size".  As an example, an earlier iteration
> of this idea which used compound pages (and wasn't particularly tuned)
> got a 7% performance boost when compiling the kernel.
> 
> Using compound pages or THPs exposes a serious weakness in our type
> system.  Functions are often unprepared for compound pages to be passed
> to them, and may only act on PAGE_SIZE chunks.  Even functions which are
> aware of compound pages may expect a head page, and do the wrong thing
> if passed a tail page.
> 
> There have been efforts to label function parameters as 'head' instead
> of 'page' to indicate that the function expects a head page, but this
> leaves us with runtime assertions instead of using the compiler to prove
> that nobody has mistakenly passed a tail page.  Calling a struct page
> 'head' is also inaccurate as they will work perfectly well on base pages.
> 
> We also waste a lot of instructions ensuring that we're not looking at
> a tail page.  Almost every call to PageFoo() contains one or more hidden
> calls to compound_head().  This also happens for get_page(), put_page()
> and many more functions.  There does not appear to be a way to tell gcc
> that it can cache the result of compound_head(), nor is there a way to
> tell it that compound_head() is idempotent.
> 
> This series introduces the 'struct folio' as a replacement for
> head-or-base pages.  This initial set reduces the kernel size by
> approximately 6kB by removing conversions from tail pages to head pages.
> The real purpose of this series is adding infrastructure to enable
> further use of the folio.
> 
> The medium-term goal is to convert all filesystems and some device
> drivers to work in terms of folios.  This series contains a lot of
> explicit conversions, but it's important to realise it's removing a lot
> of implicit conversions in some relatively hot paths.  There will be very
> few conversions from folios when this work is completed; filesystems,
> the page cache, the LRU and so on will generally only deal with folios.
> 
> The text size reduces by between 6kB (a config based on Oracle UEK)
> and 1.2kB (allnoconfig).  Performance seems almost unaffected based
> on kernbench.
> 
> Current tree at:
> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio
> 
> (contains another ~120 patches on top of this batch, not all of which are
> in good shape for submission)
> 
> v8.1:
>  - Rebase on next-20210430
>  - You need https://lore.kernel.org/linux-mm/20210430145549.2662354-1-willy@infradead.org/ first
>  - Big renaming (thanks to peterz):
>    - PageFoo() becomes folio_foo()
>    - SetFolioFoo() becomes folio_set_foo()
>    - ClearFolioFoo() becomes folio_clear_foo()
>    - __SetFolioFoo() becomes __folio_set_foo()
>    - __ClearFolioFoo() becomes __folio_clear_foo()
>    - TestSetPageFoo() becomes folio_test_set_foo()
>    - TestClearPageFoo() becomes folio_test_clear_foo()
>    - PageHuge() is now folio_hugetlb()
>    - put_folio() becomes folio_put()
>    - get_folio() becomes folio_get()
>    - put_folio_testzero() becomes folio_put_testzero()
>    - set_folio_count() becomes folio_set_count()
>    - attach_folio_private() becomes folio_attach_private()
>    - detach_folio_private() becomes folio_detach_private()
>    - lock_folio() becomes folio_lock()
>    - unlock_folio() becomes folio_unlock()
>    - trylock_folio() becomes folio_trylock()
>    - __lock_folio_or_retry becomes __folio_lock_or_retry()
>    - __lock_folio_async() becomes __folio_lock_async()
>    - wake_up_folio_bit() becomes folio_wake_bit()
>    - wake_up_folio() becomes folio_wake()
>    - wait_on_folio_bit() becomes folio_wait_bit()
>    - wait_for_stable_folio() becomes folio_wait_stable()
>    - wait_on_folio() becomes folio_wait()
>    - wait_on_folio_locked() becomes folio_wait_locked()
>    - wait_on_folio_writeback() becomes folio_wait_writeback()
>    - end_folio_writeback() becomes folio_end_writeback()
>    - add_folio_wait_queue() becomes folio_add_wait_queue()
>  - Add folio_young() and folio_idle() family of functions
>  - Move page_folio() to page-flags.h and use _compound_head()
>  - Make page_folio() const-preserving
>  - Add folio_page() to get the nth page from a folio
>  - Improve struct folio kernel-doc
>  - Convert folio flag tests to return bool instead of int
>  - Eliminate set_folio_private()
>  - folio_get_private() is the equivalent of page_private() (as folio_private()
>    is now a test for whether the private flag is set on the folio)
>  - Move folio_rotate_reclaimable() into this patchset
>  - Add page-flags.h to the kernel-doc
>  - Add netfs.h to the kernel-doc
>  - Add a family of folio_lock_lruvec() wrappers
>  - Add a family of folio_relock_lruvec() wrappers
> 
> v7:
> https://lore.kernel.org/linux-mm/20210409185105.188284-1-willy@infradead.org/
> 
> Matthew Wilcox (Oracle) (31):
>   mm: Introduce struct folio
>   mm: Add folio_pgdat and folio_zone
>   mm/vmstat: Add functions to account folio statistics
>   mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
>   mm: Add folio reference count functions
>   mm: Add folio_put
>   mm: Add folio_get
>   mm: Add folio flag manipulation functions
>   mm: Add folio_young() and folio_idle()
>   mm: Handle per-folio private data
>   mm/filemap: Add folio_index, folio_file_page and folio_contains
>   mm/filemap: Add folio_next_index
>   mm/filemap: Add folio_offset and folio_file_offset
>   mm/util: Add folio_mapping and folio_file_mapping
>   mm: Add folio_mapcount
>   mm/memcg: Add folio wrappers for various functions
>   mm/filemap: Add folio_unlock
>   mm/filemap: Add folio_lock
>   mm/filemap: Add folio_lock_killable
>   mm/filemap: Add __folio_lock_async
>   mm/filemap: Add __folio_lock_or_retry
>   mm/filemap: Add folio_wait_locked
>   mm/swap: Add folio_rotate_reclaimable
>   mm/filemap: Add folio_end_writeback
>   mm/writeback: Add folio_wait_writeback
>   mm/writeback: Add folio_wait_stable
>   mm/filemap: Add folio_wait_bit
>   mm/filemap: Add folio_wake_bit
>   mm/filemap: Convert page wait queues to be folios
>   mm/filemap: Add folio private_2 functions
>   fs/netfs: Add folio fscache functions
> 
>  Documentation/core-api/mm-api.rst           |   4 +
>  Documentation/filesystems/netfs_library.rst |   2 +
>  fs/afs/write.c                              |   9 +-
>  fs/cachefiles/rdwr.c                        |  16 +-
>  fs/io_uring.c                               |   2 +-
>  include/linux/memcontrol.h                  |  58 ++++
>  include/linux/mm.h                          | 173 ++++++++++--
>  include/linux/mm_types.h                    |  71 +++++
>  include/linux/mmdebug.h                     |  20 ++
>  include/linux/netfs.h                       |  77 +++--
>  include/linux/page-flags.h                  | 222 +++++++++++----
>  include/linux/page_idle.h                   |  99 ++++---
>  include/linux/page_ref.h                    |  88 +++++-
>  include/linux/pagemap.h                     | 276 +++++++++++++-----
>  include/linux/swap.h                        |   7 +-
>  include/linux/vmstat.h                      | 107 +++++++
>  mm/Makefile                                 |   2 +-
>  mm/filemap.c                                | 295 ++++++++++----------
>  mm/folio-compat.c                           |  37 +++
>  mm/internal.h                               |   1 +
>  mm/memory.c                                 |   8 +-
>  mm/page-writeback.c                         |  72 +++--
>  mm/page_io.c                                |   4 +-
>  mm/swap.c                                   |  18 +-
>  mm/swapfile.c                               |   8 +-
>  mm/util.c                                   |  30 +-
>  26 files changed, 1247 insertions(+), 459 deletions(-)
>  create mode 100644 mm/folio-compat.c
> 
> -- 
> 2.30.2

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-04-30 18:47 ` [PATCH v8.1 00/31] Memory Folios Hugh Dickins
@ 2021-05-01  1:32   ` Nicholas Piggin
  2021-05-01  2:37     ` Matthew Wilcox
  2021-05-01 21:38     ` John Hubbard
  0 siblings, 2 replies; 41+ messages in thread
From: Nicholas Piggin @ 2021-05-01  1:32 UTC (permalink / raw)
  To: Hugh Dickins, Matthew Wilcox (Oracle)
  Cc: akpm, linux-fsdevel, linux-mm, Linus Torvalds

Excerpts from Hugh Dickins's message of May 1, 2021 4:47 am:
> Adding Linus to the Cc (of this one only): he surely has an interest.
> 
> On Fri, 30 Apr 2021, Matthew Wilcox (Oracle) wrote:
> 
>> Managing memory in 4KiB pages is a serious overhead.  Many benchmarks
>> benefit from a larger "page size".  As an example, an earlier iteration
>> of this idea which used compound pages (and wasn't particularly tuned)
>> got a 7% performance boost when compiling the kernel.
>> 
>> Using compound pages or THPs exposes a serious weakness in our type
>> system.  Functions are often unprepared for compound pages to be passed
>> to them, and may only act on PAGE_SIZE chunks.  Even functions which are
>> aware of compound pages may expect a head page, and do the wrong thing
>> if passed a tail page.
>> 
>> There have been efforts to label function parameters as 'head' instead
>> of 'page' to indicate that the function expects a head page, but this
>> leaves us with runtime assertions instead of using the compiler to prove
>> that nobody has mistakenly passed a tail page.  Calling a struct page
>> 'head' is also inaccurate as they will work perfectly well on base pages.
>> 
>> We also waste a lot of instructions ensuring that we're not looking at
>> a tail page.  Almost every call to PageFoo() contains one or more hidden
>> calls to compound_head().  This also happens for get_page(), put_page()
>> and many more functions.  There does not appear to be a way to tell gcc
>> that it can cache the result of compound_head(), nor is there a way to
>> tell it that compound_head() is idempotent.
>> 
>> This series introduces the 'struct folio' as a replacement for
>> head-or-base pages.  This initial set reduces the kernel size by
>> approximately 6kB by removing conversions from tail pages to head pages.
>> The real purpose of this series is adding infrastructure to enable
>> further use of the folio.
>> 
>> The medium-term goal is to convert all filesystems and some device
>> drivers to work in terms of folios.  This series contains a lot of
>> explicit conversions, but it's important to realise it's removing a lot
>> of implicit conversions in some relatively hot paths.  There will be very
>> few conversions from folios when this work is completed; filesystems,
>> the page cache, the LRU and so on will generally only deal with folios.
>> 
>> The text size reduces by between 6kB (a config based on Oracle UEK)
>> and 1.2kB (allnoconfig).  Performance seems almost unaffected based
>> on kernbench.
>> 
>> Current tree at:
>> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio
>> 
>> (contains another ~120 patches on top of this batch, not all of which are
>> in good shape for submission)
>> 
>> v8.1:
>>  - Rebase on next-20210430
>>  - You need https://lore.kernel.org/linux-mm/20210430145549.2662354-1-willy@infradead.org/ first
>>  - Big renaming (thanks to peterz):
>>    - PageFoo() becomes folio_foo()
>>    - SetFolioFoo() becomes folio_set_foo()
>>    - ClearFolioFoo() becomes folio_clear_foo()
>>    - __SetFolioFoo() becomes __folio_set_foo()
>>    - __ClearFolioFoo() becomes __folio_clear_foo()
>>    - TestSetPageFoo() becomes folio_test_set_foo()
>>    - TestClearPageFoo() becomes folio_test_clear_foo()
>>    - PageHuge() is now folio_hugetlb()

If you rename these things at the same time, can you make it clear 
they're flags (folio_flag_set_foo())? The weird camel case accessors at 
least make that clear (after you get to know them).

We have a set_page_dirty(), so page_set_dirty() would be annoying.
page_flag_set_dirty() keeps the easy distinction that SetPageDirty()
provides.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-05-01  1:32   ` Nicholas Piggin
@ 2021-05-01  2:37     ` Matthew Wilcox
  2021-05-01 14:31       ` Matthew Wilcox
  2021-05-01 21:38     ` John Hubbard
  1 sibling, 1 reply; 41+ messages in thread
From: Matthew Wilcox @ 2021-05-01  2:37 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Hugh Dickins, akpm, linux-fsdevel, linux-mm, Linus Torvalds

On Sat, May 01, 2021 at 11:32:20AM +1000, Nicholas Piggin wrote:
> Excerpts from Hugh Dickins's message of May 1, 2021 4:47 am:
> > On Fri, 30 Apr 2021, Matthew Wilcox (Oracle) wrote:
> >>  - Big renaming (thanks to peterz):
> >>    - PageFoo() becomes folio_foo()
> >>    - SetFolioFoo() becomes folio_set_foo()
> >>    - ClearFolioFoo() becomes folio_clear_foo()
> >>    - __SetFolioFoo() becomes __folio_set_foo()
> >>    - __ClearFolioFoo() becomes __folio_clear_foo()
> >>    - TestSetPageFoo() becomes folio_test_set_foo()
> >>    - TestClearPageFoo() becomes folio_test_clear_foo()
> >>    - PageHuge() is now folio_hugetlb()
> 
> If you rename these things at the same time, can you make it clear 
> they're flags (folio_flag_set_foo())? The weird camel case accessors at 
> least make that clear (after you get to know them).
> 
> We have a set_page_dirty(), so page_set_dirty() would be annoying.
> page_flag_set_dirty() keeps the easy distinction that SetPageDirty()
> provides.

Maybe I should have sent more of the patches in this batch ...

mark_page_accessed() becomes folio_mark_accessed()
set_page_dirty() becomes folio_mark_dirty()
set_page_writeback() becomes folio_start_writeback()
test_clear_page_writeback() becomes __folio_end_writeback()
cancel_dirty_page() becomes folio_cancel_dirty()
clear_page_dirty_for_io() becomes folio_clear_dirty_for_io()
lru_cache_add() becomes folio_add_lru()
add_to_page_cache_lru() becomes folio_add_to_page_cache()
write_one_page() becomes folio_write_one()
account_page_redirty() becomes folio_account_redirty()
account_page_cleaned() becomes folio_account_cleaned()

So the general pattern is that folio_set_foo() and folio_clear_foo()
works on the flag directly.  If we do anything fancy to it, it's
folio_verb_foo() where verb depends on foo.

I'm not entirely comfortable with this.  I'd like to stop modules
from accessing folio_set_dirty() because it's just going to mess
up filesystems.  I just haven't thought of a good way to expose
some flags and not others.

Actually, looking at what filesystems actually use at the moment, it's
quite a small subset:

ClearPageChecked
ClearPageDirty
ClearPageError
ClearPageFsCache
__ClearPageLocked
ClearPageMappedToDisk
ClearPagePrivate2
ClearPagePrivate
ClearPageReferenced
ClearPageUptodate
TestClearPageError
TestClearPageFsCache
TestClearPagePrivate2
TestClearPageDirty

SetPageError
__SetPageLocked
SetPageMappedToDisk
SetPagePrivate2
SetPagePrivate
SetPageUptodate
__SetPageUptodate
TestSetPageDirty
TestSetPageFsCache

several of those are ... confused ... but the vast majority of page flags
don't need to be exposed to filesystems.  Does it make you feel better if
folio_set_dirty() doesn't get exposed outside the VFS?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-05-01  2:37     ` Matthew Wilcox
@ 2021-05-01 14:31       ` Matthew Wilcox
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Wilcox @ 2021-05-01 14:31 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Hugh Dickins, akpm, linux-fsdevel, linux-mm, Linus Torvalds

On Sat, May 01, 2021 at 03:37:11AM +0100, Matthew Wilcox wrote:
> On Sat, May 01, 2021 at 11:32:20AM +1000, Nicholas Piggin wrote:
> > Excerpts from Hugh Dickins's message of May 1, 2021 4:47 am:
> > > On Fri, 30 Apr 2021, Matthew Wilcox (Oracle) wrote:
> > >>  - Big renaming (thanks to peterz):
> > >>    - PageFoo() becomes folio_foo()
> > >>    - SetFolioFoo() becomes folio_set_foo()
> > >>    - ClearFolioFoo() becomes folio_clear_foo()
> > >>    - __SetFolioFoo() becomes __folio_set_foo()
> > >>    - __ClearFolioFoo() becomes __folio_clear_foo()
> > >>    - TestSetPageFoo() becomes folio_test_set_foo()
> > >>    - TestClearPageFoo() becomes folio_test_clear_foo()
> > >>    - PageHuge() is now folio_hugetlb()
> > 
> > If you rename these things at the same time, can you make it clear 
> > they're flags (folio_flag_set_foo())? The weird camel case accessors at 
> > least make that clear (after you get to know them).
> > 
> > We have a set_page_dirty(), so page_set_dirty() would be annoying.
> > page_flag_set_dirty() keeps the easy distinction that SetPageDirty()
> > provides.
> 
> Maybe I should have sent more of the patches in this batch ...
> 
> mark_page_accessed() becomes folio_mark_accessed()
> set_page_dirty() becomes folio_mark_dirty()
> set_page_writeback() becomes folio_start_writeback()
> test_clear_page_writeback() becomes __folio_end_writeback()
> cancel_dirty_page() becomes folio_cancel_dirty()
> clear_page_dirty_for_io() becomes folio_clear_dirty_for_io()
> lru_cache_add() becomes folio_add_lru()
> add_to_page_cache_lru() becomes folio_add_to_page_cache()
> write_one_page() becomes folio_write_one()
> account_page_redirty() becomes folio_account_redirty()
> account_page_cleaned() becomes folio_account_cleaned()
> 
> So the general pattern is that folio_set_foo() and folio_clear_foo()
> works on the flag directly.  If we do anything fancy to it, it's
> folio_verb_foo() where verb depends on foo.

After sleeping on this, I now think "Why not both?"

folio_dirty() -- defined in page-flags.h

folio_test_set_dirty_flag()
folio_test_clear_dirty_flag()
__folio_clear_dirty_flag()
__folio_set_dirty_flag()
folio_clear_dirty_flag()
folio_set_dirty_flag() -- generated in filemap.h under #ifndef MODULE

folio_mark_dirty() -- declared in mm.h (this is rare; turns out all kinds of
			crap wants to mark pages as being dirty)
folio_clear_dirty_for_io() -- declared in filemap.h


the other flags would mostly follow this pattern.  i'd also change
folio_set_uptodate() to folio_mark_uptodate().

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-05-01  1:32   ` Nicholas Piggin
  2021-05-01  2:37     ` Matthew Wilcox
@ 2021-05-01 21:38     ` John Hubbard
  2021-05-02  0:17       ` Matthew Wilcox
  1 sibling, 1 reply; 41+ messages in thread
From: John Hubbard @ 2021-05-01 21:38 UTC (permalink / raw)
  To: Nicholas Piggin, Hugh Dickins, Matthew Wilcox (Oracle)
  Cc: akpm, linux-fsdevel, linux-mm, Linus Torvalds

On 4/30/21 6:32 PM, Nicholas Piggin wrote:
...
>>>   - Big renaming (thanks to peterz):
>>>     - PageFoo() becomes folio_foo()
>>>     - SetFolioFoo() becomes folio_set_foo()
>>>     - ClearFolioFoo() becomes folio_clear_foo()
>>>     - __SetFolioFoo() becomes __folio_set_foo()
>>>     - __ClearFolioFoo() becomes __folio_clear_foo()
>>>     - TestSetPageFoo() becomes folio_test_set_foo()
>>>     - TestClearPageFoo() becomes folio_test_clear_foo()
>>>     - PageHuge() is now folio_hugetlb()
> 
> If you rename these things at the same time, can you make it clear
> they're flags (folio_flag_set_foo())? The weird camel case accessors at
> least make that clear (after you get to know them).
> 

In addition to pointing out that the name was a page flag, the weird
camel case also meant, "if you try to search for this symbol, you will
be defeated", because the darn thing is constructed via macro
concatenation. Which was a reasonable tradeoff of "cannot find it" vs.
"it's extremely simple, and a macro keeps out the bugs when making lots
of these".

Except that over time, it turned out to be not quite that simple, and
people started adding functionality. So now it's "cannot find it, and
it's also got little goodies hiding in there--maybe!".

And now with this change, we have for example:

     #define CLEARPAGEFLAG_NOOP(uname, lname)				\
     static inline void folio_clear_##lname(struct folio *folio) { }	\
     static inline void ClearPage##uname(struct page *page) {  }

...which is both inconsistent, and as Nicholas points out, no longer
obviously a macro-concatenated name.

Given all that, I'd argue for either:

     a) sticking with the camel case ("ClearFolioFoo), or

     b) changing a bunch of the items to actual written-out names. What's
        the harm? We'd end up with a longer file, but one could grep or
        cscope for the names.

thanks,
-- 
John Hubbard
NVIDIA

> We have a set_page_dirty(), so page_set_dirty() would be annoying.
> page_flag_set_dirty() keeps the easy distinction that SetPageDirty()
> provides.
> 
> Thanks,
> Nick
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-05-01 21:38     ` John Hubbard
@ 2021-05-02  0:17       ` Matthew Wilcox
  2021-05-02  0:42         ` John Hubbard
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Wilcox @ 2021-05-02  0:17 UTC (permalink / raw)
  To: John Hubbard
  Cc: Nicholas Piggin, Hugh Dickins, akpm, linux-fsdevel, linux-mm,
	Linus Torvalds

On Sat, May 01, 2021 at 02:38:50PM -0700, John Hubbard wrote:
> On 4/30/21 6:32 PM, Nicholas Piggin wrote:
> ...
> > > >   - Big renaming (thanks to peterz):
> > > >     - PageFoo() becomes folio_foo()
> > > >     - SetFolioFoo() becomes folio_set_foo()
> > > >     - ClearFolioFoo() becomes folio_clear_foo()
> > > >     - __SetFolioFoo() becomes __folio_set_foo()
> > > >     - __ClearFolioFoo() becomes __folio_clear_foo()
> > > >     - TestSetPageFoo() becomes folio_test_set_foo()
> > > >     - TestClearPageFoo() becomes folio_test_clear_foo()
> > > >     - PageHuge() is now folio_hugetlb()
> > 
> > If you rename these things at the same time, can you make it clear
> > they're flags (folio_flag_set_foo())? The weird camel case accessors at
> > least make that clear (after you get to know them).
> 
> In addition to pointing out that the name was a page flag, the weird
> camel case also meant, "if you try to search for this symbol, you will
> be defeated", because the darn thing is constructed via macro
> concatenation.

I've always hated that, FWIW.  And you can't add kernel-doc for them
because kernel-doc doesn't understand cpp.  So my current plan (quoting
my other email):

folio_dirty() -- defined in page-flags.h
would have kernel-doc, would be greppable

folio_test_set_dirty_flag()
folio_test_clear_dirty_flag()
__folio_clear_dirty_flag()
__folio_set_dirty_flag()
folio_clear_dirty_flag()
folio_set_dirty_flag() -- generated in filemap.h under #ifndef MODULE
would not have kernel-doc, would not be greppable, would only be used
in core vfs and core mm.

folio_mark_dirty() -- declared in mm.h (this is rare; turns out all kinds of
			crap wants to mark pages as being dirty)
folio_clear_dirty_for_io() -- declared in filemap.h
already have kernel-doc, are greppable, used by filesystems and sometimes
other random code.

> Except that over time, it turned out to be not quite that simple, and
> people started adding functionality. So now it's "cannot find it, and
> it's also got little goodies hiding in there--maybe!".

I also don't like that.  With what I'm thinking, there are no special
cases hidden in the autogenerated names.  Special things like the current
SetPageUptodate would be in folio_mark_uptodate() and filesystems couldn't
even call folio_set_uptodate().

> Given all that, I'd argue for either:
>     b) changing a bunch of the items to actual written-out names. What's
>        the harm? We'd end up with a longer file, but one could grep or
>        cscope for the names.

I hope the above makes you happy -- everything a filesystem author needs
gets kernel-doc.  People working inside the VM/VFS still get exposed
to undocumented folio_test_set_foo_flag(), but it's all regular and
autogenerated.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-05-02  0:17       ` Matthew Wilcox
@ 2021-05-02  0:42         ` John Hubbard
  2021-05-02  0:45           ` John Hubbard
  2021-05-02  2:31           ` Matthew Wilcox
  0 siblings, 2 replies; 41+ messages in thread
From: John Hubbard @ 2021-05-02  0:42 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Nicholas Piggin, Hugh Dickins, akpm, linux-fsdevel, linux-mm,
	Linus Torvalds

On 5/1/21 5:17 PM, Matthew Wilcox wrote:
...
>> In addition to pointing out that the name was a page flag, the weird
>> camel case also meant, "if you try to search for this symbol, you will
>> be defeated", because the darn thing is constructed via macro
>> concatenation.
> 
> I've always hated that, FWIW.  And you can't add kernel-doc for them
> because kernel-doc doesn't understand cpp.  So my current plan (quoting
> my other email):
> 
> folio_dirty() -- defined in page-flags.h
> would have kernel-doc, would be greppable
> 
> folio_test_set_dirty_flag()
> folio_test_clear_dirty_flag()
> __folio_clear_dirty_flag()
> __folio_set_dirty_flag()
> folio_clear_dirty_flag()
> folio_set_dirty_flag() -- generated in filemap.h under #ifndef MODULE
> would not have kernel-doc, would not be greppable, would only be used
> in core vfs and core mm.
> 
> folio_mark_dirty() -- declared in mm.h (this is rare; turns out all kinds of
> 			crap wants to mark pages as being dirty)
> folio_clear_dirty_for_io() -- declared in filemap.h
> already have kernel-doc, are greppable, used by filesystems and sometimes
> other random code.

Yes, the page dirty stuff is definitely not simple, so it's very good to
move away from the auto-generated names there. Looks like you are down to
just a couple of generated names now, if I'm reading this correctly.

> 
>> Except that over time, it turned out to be not quite that simple, and
>> people started adding functionality. So now it's "cannot find it, and
>> it's also got little goodies hiding in there--maybe!".
> 
> I also don't like that.  With what I'm thinking, there are no special
> cases hidden in the autogenerated names.  Special things like the current

Yes, that's a good guideline: no special cases in the auto-generated function
names. Because, either it is a simple, standard auto-generated thing, or
it is something that one needs to read, in order to see exactly what it does.


> SetPageUptodate would be in folio_mark_uptodate() and filesystems couldn't
> even call folio_set_uptodate().
> 
>> Given all that, I'd argue for either:
>>      b) changing a bunch of the items to actual written-out names. What's
>>         the harm? We'd end up with a longer file, but one could grep or
>>         cscope for the names.
> 
> I hope the above makes you happy -- everything a filesystem author needs
> gets kernel-doc.  People working inside the VM/VFS still get exposed

If "kernel-doc" is effectively a proxy for "file names are directly visible
in the source code, then I'm a lot happier than I was, yes. :)

> to undocumented folio_test_set_foo_flag(), but it's all regular and
> autogenerated.
> 

Sounds pretty good.


thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-05-02  0:42         ` John Hubbard
@ 2021-05-02  0:45           ` John Hubbard
  2021-05-02  2:31           ` Matthew Wilcox
  1 sibling, 0 replies; 41+ messages in thread
From: John Hubbard @ 2021-05-02  0:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Nicholas Piggin, Hugh Dickins, akpm, linux-fsdevel, linux-mm,
	Linus Torvalds

On 5/1/21 5:42 PM, John Hubbard wrote:
> If "kernel-doc" is effectively a proxy for "file names are directly visible
> in the source code, then I'm a lot happier than I was, yes. :)

umm, s/file names/function names/ , please.

  thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8.1 00/31] Memory Folios
  2021-05-02  0:42         ` John Hubbard
  2021-05-02  0:45           ` John Hubbard
@ 2021-05-02  2:31           ` Matthew Wilcox
  1 sibling, 0 replies; 41+ messages in thread
From: Matthew Wilcox @ 2021-05-02  2:31 UTC (permalink / raw)
  To: John Hubbard
  Cc: Nicholas Piggin, Hugh Dickins, akpm, linux-fsdevel, linux-mm,
	Linus Torvalds

On Sat, May 01, 2021 at 05:42:21PM -0700, John Hubbard wrote:
> On 5/1/21 5:17 PM, Matthew Wilcox wrote:
> > folio_dirty() -- defined in page-flags.h
> > would have kernel-doc, would be greppable
> > 
> > folio_test_set_dirty_flag()
> > folio_test_clear_dirty_flag()
> > __folio_clear_dirty_flag()
> > __folio_set_dirty_flag()
> > folio_clear_dirty_flag()
> > folio_set_dirty_flag() -- generated in filemap.h under #ifndef MODULE
> > would not have kernel-doc, would not be greppable, would only be used
> > in core vfs and core mm.
> > 
> > folio_mark_dirty() -- declared in mm.h (this is rare; turns out all kinds of
> > 			crap wants to mark pages as being dirty)
> > folio_clear_dirty_for_io() -- declared in filemap.h
> > already have kernel-doc, are greppable, used by filesystems and sometimes
> > other random code.
> 
> Yes, the page dirty stuff is definitely not simple, so it's very good to
> move away from the auto-generated names there. Looks like you are down to
> just a couple of generated names now, if I'm reading this correctly.

Six -- test_set, test_clear, __set, __clear, set, clear.

> > I hope the above makes you happy -- everything a filesystem author needs
> > gets kernel-doc.  People working inside the VM/VFS still get exposed
> 
> If "kernel-doc" is effectively a proxy for "file names are directly visible
> in the source code, then I'm a lot happier than I was, yes. :)

I'm thinking about this kind of thing for each flag (uptodate was the
easiest one to start with because it's already not autogenerated):

/**
 * folio_uptodate - Is this folio up to date?
 * @folio: The folio.
 *
 * The uptodate flag is set on a folio when every byte in the folio is at
 * least as new as the corresponding bytes on storage.  Anonymous folios
 * are always uptodate.  If the folio is not uptodate, some of the bytes
 * in it may be; see the is_partially_uptodate() address_space operation.
 */
static inline bool folio_uptodate(struct folio *folio)
{
...

(um, this is going to increase the patch series significantly.  i may
not do this until later.  there's more important things to get in that
are already done and waiting on this initial patch series.)

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2021-05-02  2:31 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-30 18:07 [PATCH v8.1 00/31] Memory Folios Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 01/31] mm: Introduce struct folio Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 02/31] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 03/31] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 04/31] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 05/31] mm: Add folio reference count functions Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 06/31] mm: Add folio_put Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 07/31] mm: Add folio_get Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 08/31] mm: Add folio flag manipulation functions Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 09/31] mm: Add folio_young() and folio_idle() Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 10/31] mm: Handle per-folio private data Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 11/31] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 12/31] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 13/31] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 14/31] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 15/31] mm: Add folio_mapcount Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 16/31] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 17/31] mm/filemap: Add folio_unlock Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 18/31] mm/filemap: Add folio_lock Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 19/31] mm/filemap: Add folio_lock_killable Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 20/31] mm/filemap: Add __folio_lock_async Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 21/31] mm/filemap: Add __folio_lock_or_retry Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 22/31] mm/filemap: Add folio_wait_locked Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 23/31] mm/swap: Add folio_rotate_reclaimable Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 24/31] mm/filemap: Add folio_end_writeback Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 25/31] mm/writeback: Add folio_wait_writeback Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 26/31] mm/writeback: Add folio_wait_stable Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 27/31] mm/filemap: Add folio_wait_bit Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 28/31] mm/filemap: Add folio_wake_bit Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 29/31] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 30/31] mm/filemap: Add folio private_2 functions Matthew Wilcox (Oracle)
2021-04-30 18:07 ` [PATCH v8 31/31] fs/netfs: Add folio fscache functions Matthew Wilcox (Oracle)
2021-04-30 18:47 ` [PATCH v8.1 00/31] Memory Folios Hugh Dickins
2021-05-01  1:32   ` Nicholas Piggin
2021-05-01  2:37     ` Matthew Wilcox
2021-05-01 14:31       ` Matthew Wilcox
2021-05-01 21:38     ` John Hubbard
2021-05-02  0:17       ` Matthew Wilcox
2021-05-02  0:42         ` John Hubbard
2021-05-02  0:45           ` John Hubbard
2021-05-02  2:31           ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).