linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 00/96] Memory folios
@ 2021-05-05 15:04 Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 01/96] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
                   ` (95 more replies)
  0 siblings, 96 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Managing memory in 4KiB pages is a serious overhead.  Many benchmarks
benefit from a larger "page size".  As an example, an earlier iteration
of this idea which used compound pages (and wasn't particularly tuned)
got a 7% performance boost when compiling the kernel.

Using compound pages or THPs exposes a weakness of our type system.
Functions are often unprepared for compound pages to be passed to them,
and may only act on PAGE_SIZE chunks.  Even functions which are aware of
compound pages may expect a head page, and do the wrong thing if passed
a tail page.

We also waste a lot of instructions ensuring that we're not looking at
a tail page.  Almost every call to PageFoo() contains one or more hidden
calls to compound_head().  This also happens for get_page(), put_page()
and many more functions.  There does not appear to be a way to tell gcc
that it can cache the result of compound_head(), nor is there a way to
tell it that compound_head() is idempotent.

This patch series uses a new type, the struct folio, to manage memory.
The first 8 patches are prep work that don't involve the folio at all
but fix problems I found while working on this.  Patches 9-81 introduce
infrastructure (that more than pays for itself, shrinking the kernel by
over 4kB of text).  Patches 82-96 convert iomap (ie xfs) to use folios
as an example.

Git: https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/tags/folio_9
v8: https://lore.kernel.org/linux-mm/20210430180740.2707166-1-willy@infradead.org/
Even more work that's not being submitted:
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio

v9:
 - Rebase onto next-20210505
 - Rename folio_test_set_foo() to folio_test_set_foo_flag() (Nick Piggin)
 - Rename folio_set_uptodate() to folio_mark_uptodate()
 - Rename trylock_folio_flag() to folio_trylock_flag()
 - Add all remaining supporting patches for iomap
 - Add iomap conversion patches

Matthew Wilcox (Oracle) (96):
  mm: Optimise nth_page for contiguous memmap
  mm: Make __dump_page static
  mm/debug: Factor PagePoisoned out of __dump_page
  mm/page_owner: Constify dump_page_owner
  mm: Make compound_head const-preserving
  mm: Constify get_pfnblock_flags_mask and get_pfnblock_migratetype
  mm: Constify page_count and page_ref_count
  mm: Fix struct page layout on 32-bit systems
  mm: Introduce struct folio
  mm: Add folio_pgdat and folio_zone
  mm/vmstat: Add functions to account folio statistics
  mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  mm: Add folio reference count functions
  mm: Add folio_put
  mm: Add folio_get
  mm: Add folio flag manipulation functions
  mm: Add folio_young() and folio_idle()
  mm: Handle per-folio private data
  mm/filemap: Add folio_index, folio_file_page and folio_contains
  mm/filemap: Add folio_next_index
  mm/filemap: Add folio_offset and folio_file_offset
  mm/util: Add folio_mapping and folio_file_mapping
  mm: Add folio_mapcount
  mm/memcg: Add folio wrappers for various functions
  mm/filemap: Add folio_unlock
  mm/filemap: Add folio_lock
  mm/filemap: Add folio_lock_killable
  mm/filemap: Add __folio_lock_async
  mm/filemap: Add __folio_lock_or_retry
  mm/filemap: Add folio_wait_locked
  mm/swap: Add folio_rotate_reclaimable
  mm/filemap: Add folio_end_writeback
  mm/writeback: Add folio_wait_writeback
  mm/writeback: Add folio_wait_stable
  mm/filemap: Add folio_wait_bit
  mm/filemap: Add folio_wake_bit
  mm/filemap: Convert page wait queues to be folios
  mm/filemap: Add folio private_2 functions
  fs/netfs: Add folio fscache functions
  mm: Add folio_mapped
  mm/workingset: Convert workingset_activation to take a folio
  mm/swap: Add folio_activate
  mm/swap: Add folio_mark_accessed
  mm/rmap: Add folio_mkclean
  mm: Add kmap_local_folio
  mm: Add flush_dcache_folio
  mm: Add arch_make_folio_accessible
  mm/memcg: Remove 'page' parameter to mem_cgroup_charge_statistics
  mm/memcg: Use the node id in mem_cgroup_update_tree
  mm/memcg: Convert commit_charge to take a folio
  mm/memcg: Add folio_charge_cgroup
  mm/memcg: Add folio_uncharge_cgroup
  mm/memcg: Convert mem_cgroup_track_foreign_dirty_slowpath to folio
  mm/writeback: Rename __add_wb_stat to wb_stat_mod
  flex_proportions: Allow N events instead of 1
  mm/writeback: Change __wb_writeout_inc to __wb_writeout_add
  mm/writeback: Convert test_clear_page_writeback to
    __folio_end_writeback
  mm/writeback: Add folio_start_writeback
  mm/writeback: Add folio_mark_dirty
  mm/writeback: Use __set_page_dirty in __set_page_dirty_nobuffers
  mm/writeback: Add __folio_mark_dirty
  mm/writeback: Add filemap_dirty_folio
  mm/writeback: Add folio_account_cleaned
  mm/writeback: Add folio_cancel_dirty
  mm/writeback: Add folio_clear_dirty_for_io
  mm/writeback: Add folio_account_redirty
  mm/writeback: Add folio_redirty_for_writepage
  mm/filemap: Add i_blocks_per_folio
  mm/filemap: Add folio_mkwrite_check_truncate
  mm/filemap: Add readahead_folio
  block: Add bio_add_folio
  block: Add bio_for_each_folio_all
  mm/lru: Add folio_lru and folio_is_file_lru
  mm/workingset: Convert workingset_refault to take a folio
  mm/lru: Add folio_add_lru
  mm/page_alloc: Add __alloc_folio, __alloc_folio_node and alloc_folio
  mm/filemap: Add filemap_alloc_folio
  mm/filemap: Add folio_add_to_page_cache
  mm/filemap: Convert mapping_get_entry to return a folio
  mm/filemap: Add filemap_get_folio and find_get_folio
  mm/filemap: Add filemap_get_stable_folio
  iomap: Convert to_iomap_page to take a folio
  iomap: Convert iomap_page_create to take a folio
  iomap: Convert iomap_page_release to take a folio
  iomap: Convert iomap_releasepage to use a folio
  iomap: Convert iomap_invalidatepage to use a folio
  iomap: Pass the iomap_page into iomap_set_range_uptodate
  iomap: Use folio offsets instead of page offsets
  iomap: Convert bio completions to use folios
  iomap: Convert readahead and readpage to use a folio
  iomap: Convert iomap_page_mkwrite to use a folio
  iomap: Convert iomap_write_begin and iomap_write_end to folios
  iomap: Convert iomap_read_inline_data to take a folio
  iomap: Convert iomap_write_end_inline to take a folio
  iomap: Convert iomap_add_to_ioend to take a folio
  iomap: Convert iomap_do_writepage to use a folio

 Documentation/core-api/cachetlb.rst         |   6 +
 Documentation/core-api/mm-api.rst           |   4 +
 Documentation/filesystems/netfs_library.rst |   2 +
 block/bio.c                                 |  21 +
 fs/afs/write.c                              |   9 +-
 fs/buffer.c                                 |  25 -
 fs/cachefiles/rdwr.c                        |  16 +-
 fs/io_uring.c                               |   2 +-
 fs/iomap/buffered-io.c                      | 524 +++++++++-----------
 fs/jfs/jfs_metapage.c                       |   1 +
 include/asm-generic/cacheflush.h            |  14 +
 include/linux/backing-dev.h                 |   6 +-
 include/linux/bio.h                         |  46 +-
 include/linux/flex_proportions.h            |   9 +-
 include/linux/gfp.h                         |  22 +-
 include/linux/highmem-internal.h            |  11 +
 include/linux/highmem.h                     |  38 ++
 include/linux/iomap.h                       |   2 +-
 include/linux/memcontrol.h                  |  81 ++-
 include/linux/mm.h                          | 226 +++++++--
 include/linux/mm_inline.h                   |  44 +-
 include/linux/mm_types.h                    |  75 ++-
 include/linux/mmdebug.h                     |  23 +-
 include/linux/netfs.h                       |  77 +--
 include/linux/page-flags.h                  | 260 +++++++---
 include/linux/page_idle.h                   |  99 ++--
 include/linux/page_owner.h                  |   6 +-
 include/linux/page_ref.h                    |  92 +++-
 include/linux/pageblock-flags.h             |   2 +-
 include/linux/pagemap.h                     | 467 ++++++++++++-----
 include/linux/rmap.h                        |  10 +-
 include/linux/swap.h                        |  17 +-
 include/linux/vmstat.h                      | 107 ++++
 include/linux/writeback.h                   |   9 +-
 include/net/page_pool.h                     |  12 +-
 include/trace/events/writeback.h            |   8 +-
 lib/flex_proportions.c                      |  28 +-
 mm/Makefile                                 |   2 +-
 mm/debug.c                                  |  25 +-
 mm/filemap.c                                | 509 +++++++++----------
 mm/folio-compat.c                           | 107 ++++
 mm/internal.h                               |   2 +
 mm/khugepaged.c                             |  32 +-
 mm/memcontrol.c                             | 100 ++--
 mm/memory.c                                 |  11 +-
 mm/mempolicy.c                              |  10 +
 mm/migrate.c                                |  56 +--
 mm/page-writeback.c                         | 461 ++++++++++-------
 mm/page_alloc.c                             |  28 +-
 mm/page_io.c                                |   4 +-
 mm/page_owner.c                             |   2 +-
 mm/rmap.c                                   |  12 +-
 mm/swap.c                                   | 101 ++--
 mm/swap_state.c                             |   2 +-
 mm/swapfile.c                               |   8 +-
 mm/util.c                                   |  60 ++-
 mm/workingset.c                             |  44 +-
 net/core/page_pool.c                        |  12 +-
 58 files changed, 2568 insertions(+), 1421 deletions(-)
 create mode 100644 mm/folio-compat.c

-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 01/96] mm: Optimise nth_page for contiguous memmap
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
@ 2021-05-05 15:04 ` Matthew Wilcox (Oracle)
  2021-05-05 17:24   ` Vlastimil Babka
  2021-05-05 15:04 ` [PATCH v9 02/96] mm: Make __dump_page static Matthew Wilcox (Oracle)
                   ` (94 subsequent siblings)
  95 siblings, 1 reply; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Christoph Hellwig, David Hildenbrand, Zi Yan

If the memmap is virtually contiguous (either because we're using
a virtually mapped memmap or because we don't support a discontig
memmap at all), then we can implement nth_page() by simple addition.
Contrary to popular belief, the compiler is not able to optimise this
itself for a vmemmap configuration.  This reduces one example user (sg.c)
by four instructions:

        struct page *page = nth_page(rsv_schp->pages[k], offset >> PAGE_SHIFT);

before:
   49 8b 45 70             mov    0x70(%r13),%rax
   48 63 c9                movslq %ecx,%rcx
   48 c1 eb 0c             shr    $0xc,%rbx
   48 8b 04 c8             mov    (%rax,%rcx,8),%rax
   48 2b 05 00 00 00 00    sub    0x0(%rip),%rax
           R_X86_64_PC32      vmemmap_base-0x4
   48 c1 f8 06             sar    $0x6,%rax
   48 01 d8                add    %rbx,%rax
   48 c1 e0 06             shl    $0x6,%rax
   48 03 05 00 00 00 00    add    0x0(%rip),%rax
           R_X86_64_PC32      vmemmap_base-0x4

after:
   49 8b 45 70             mov    0x70(%r13),%rax
   48 63 c9                movslq %ecx,%rcx
   48 c1 eb 0c             shr    $0xc,%rbx
   48 c1 e3 06             shl    $0x6,%rbx
   48 03 1c c8             add    (%rax,%rcx,8),%rbx

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/mm.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 25b9041f9925..2327f99b121f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -234,7 +234,11 @@ int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
 int __add_to_page_cache_locked(struct page *page, struct address_space *mapping,
 		pgoff_t index, gfp_t gfp, void **shadowp);
 
+#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
+#else
+#define nth_page(page,n) ((page) + (n))
+#endif
 
 /* to align the pointer to the (next) page boundary */
 #define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 02/96] mm: Make __dump_page static
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 01/96] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
@ 2021-05-05 15:04 ` Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 03/96] mm/debug: Factor PagePoisoned out of __dump_page Matthew Wilcox (Oracle)
                   ` (93 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, William Kucharski, Vlastimil Babka,
	Anshuman Khandual

The only caller of __dump_page() now opencodes dump_page(), so
remove it as an externally visible symbol.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/mmdebug.h | 3 +--
 mm/debug.c              | 2 +-
 mm/page_alloc.c         | 3 +--
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
index 5d0767cb424a..1935d4c72d10 100644
--- a/include/linux/mmdebug.h
+++ b/include/linux/mmdebug.h
@@ -9,8 +9,7 @@ struct page;
 struct vm_area_struct;
 struct mm_struct;
 
-extern void dump_page(struct page *page, const char *reason);
-extern void __dump_page(struct page *page, const char *reason);
+void dump_page(struct page *page, const char *reason);
 void dump_vma(const struct vm_area_struct *vma);
 void dump_mm(const struct mm_struct *mm);
 
diff --git a/mm/debug.c b/mm/debug.c
index 0bdda8407f71..84cdcd0f7bd3 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -42,7 +42,7 @@ const struct trace_print_flags vmaflag_names[] = {
 	{0, NULL}
 };
 
-void __dump_page(struct page *page, const char *reason)
+static void __dump_page(struct page *page, const char *reason)
 {
 	struct page *head = compound_head(page);
 	struct address_space *mapping;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a2fe714aed93..f23702e7c564 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -658,8 +658,7 @@ static void bad_page(struct page *page, const char *reason)
 
 	pr_alert("BUG: Bad page state in process %s  pfn:%05lx\n",
 		current->comm, page_to_pfn(page));
-	__dump_page(page, reason);
-	dump_page_owner(page);
+	dump_page(page, reason);
 
 	print_modules();
 	dump_stack();
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 03/96] mm/debug: Factor PagePoisoned out of __dump_page
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 01/96] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 02/96] mm: Make __dump_page static Matthew Wilcox (Oracle)
@ 2021-05-05 15:04 ` Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 04/96] mm/page_owner: Constify dump_page_owner Matthew Wilcox (Oracle)
                   ` (92 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, William Kucharski, Vlastimil Babka,
	Anshuman Khandual

Move the PagePoisoned test into dump_page().  Skip the hex print
for poisoned pages -- we know they're full of ffffffff.  Move the
reason printing from __dump_page() to dump_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 mm/debug.c | 25 +++++++------------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/mm/debug.c b/mm/debug.c
index 84cdcd0f7bd3..e73fe0a8ec3d 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -42,11 +42,10 @@ const struct trace_print_flags vmaflag_names[] = {
 	{0, NULL}
 };
 
-static void __dump_page(struct page *page, const char *reason)
+static void __dump_page(struct page *page)
 {
 	struct page *head = compound_head(page);
 	struct address_space *mapping;
-	bool page_poisoned = PagePoisoned(page);
 	bool compound = PageCompound(page);
 	/*
 	 * Accessing the pageblock without the zone lock. It could change to
@@ -58,16 +57,6 @@ static void __dump_page(struct page *page, const char *reason)
 	int mapcount;
 	char *type = "";
 
-	/*
-	 * If struct page is poisoned don't access Page*() functions as that
-	 * leads to recursive loop. Page*() check for poisoned pages, and calls
-	 * dump_page() when detected.
-	 */
-	if (page_poisoned) {
-		pr_warn("page:%px is uninitialized and poisoned", page);
-		goto hex_only;
-	}
-
 	if (page < head || (page >= head + MAX_ORDER_NR_PAGES)) {
 		/*
 		 * Corrupt page, so we cannot call page_mapping. Instead, do a
@@ -173,8 +162,6 @@ static void __dump_page(struct page *page, const char *reason)
 
 	pr_warn("%sflags: %#lx(%pGp)%s\n", type, head->flags, &head->flags,
 		page_cma ? " CMA" : "");
-
-hex_only:
 	print_hex_dump(KERN_WARNING, "raw: ", DUMP_PREFIX_NONE, 32,
 			sizeof(unsigned long), page,
 			sizeof(struct page), false);
@@ -182,14 +169,16 @@ static void __dump_page(struct page *page, const char *reason)
 		print_hex_dump(KERN_WARNING, "head: ", DUMP_PREFIX_NONE, 32,
 			sizeof(unsigned long), head,
 			sizeof(struct page), false);
-
-	if (reason)
-		pr_warn("page dumped because: %s\n", reason);
 }
 
 void dump_page(struct page *page, const char *reason)
 {
-	__dump_page(page, reason);
+	if (PagePoisoned(page))
+		pr_warn("page:%p is uninitialized and poisoned", page);
+	else
+		__dump_page(page);
+	if (reason)
+		pr_warn("page dumped because: %s\n", reason);
 	dump_page_owner(page);
 }
 EXPORT_SYMBOL(dump_page);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 04/96] mm/page_owner: Constify dump_page_owner
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2021-05-05 15:04 ` [PATCH v9 03/96] mm/debug: Factor PagePoisoned out of __dump_page Matthew Wilcox (Oracle)
@ 2021-05-05 15:04 ` Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 05/96] mm: Make compound_head const-preserving Matthew Wilcox (Oracle)
                   ` (91 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, William Kucharski, Vlastimil Babka,
	Anshuman Khandual

dump_page_owner() only uses struct page to find the page_ext, and
lookup_page_ext() already takes a const argument.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/page_owner.h | 6 +++---
 mm/page_owner.c            | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/page_owner.h b/include/linux/page_owner.h
index 3468794f83d2..719bfe5108c5 100644
--- a/include/linux/page_owner.h
+++ b/include/linux/page_owner.h
@@ -14,7 +14,7 @@ extern void __set_page_owner(struct page *page,
 extern void __split_page_owner(struct page *page, unsigned int nr);
 extern void __copy_page_owner(struct page *oldpage, struct page *newpage);
 extern void __set_page_owner_migrate_reason(struct page *page, int reason);
-extern void __dump_page_owner(struct page *page);
+extern void __dump_page_owner(const struct page *page);
 extern void pagetypeinfo_showmixedcount_print(struct seq_file *m,
 					pg_data_t *pgdat, struct zone *zone);
 
@@ -46,7 +46,7 @@ static inline void set_page_owner_migrate_reason(struct page *page, int reason)
 	if (static_branch_unlikely(&page_owner_inited))
 		__set_page_owner_migrate_reason(page, reason);
 }
-static inline void dump_page_owner(struct page *page)
+static inline void dump_page_owner(const struct page *page)
 {
 	if (static_branch_unlikely(&page_owner_inited))
 		__dump_page_owner(page);
@@ -69,7 +69,7 @@ static inline void copy_page_owner(struct page *oldpage, struct page *newpage)
 static inline void set_page_owner_migrate_reason(struct page *page, int reason)
 {
 }
-static inline void dump_page_owner(struct page *page)
+static inline void dump_page_owner(const struct page *page)
 {
 }
 #endif /* CONFIG_PAGE_OWNER */
diff --git a/mm/page_owner.c b/mm/page_owner.c
index adfabb560eb9..f51a57e92aa3 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -392,7 +392,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 	return -ENOMEM;
 }
 
-void __dump_page_owner(struct page *page)
+void __dump_page_owner(const struct page *page)
 {
 	struct page_ext *page_ext = lookup_page_ext(page);
 	struct page_owner *page_owner;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 05/96] mm: Make compound_head const-preserving
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2021-05-05 15:04 ` [PATCH v9 04/96] mm/page_owner: Constify dump_page_owner Matthew Wilcox (Oracle)
@ 2021-05-05 15:04 ` Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 06/96] mm: Constify get_pfnblock_flags_mask and get_pfnblock_migratetype Matthew Wilcox (Oracle)
                   ` (90 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, William Kucharski, Vlastimil Babka,
	Anshuman Khandual

If you pass a const pointer to compound_head(), you get a const pointer
back; if you pass a mutable pointer, you get a mutable pointer back.
Also remove an unnecessary forward definition of struct page; we're about
to dereference page->compound_head, so it must already have been defined.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/page-flags.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..d8e26243db25 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -177,17 +177,17 @@ enum pageflags {
 
 #ifndef __GENERATING_BOUNDS_H
 
-struct page;	/* forward declaration */
-
-static inline struct page *compound_head(struct page *page)
+static inline unsigned long _compound_head(const struct page *page)
 {
 	unsigned long head = READ_ONCE(page->compound_head);
 
 	if (unlikely(head & 1))
-		return (struct page *) (head - 1);
-	return page;
+		return head - 1;
+	return (unsigned long)page;
 }
 
+#define compound_head(page)	((typeof(page))_compound_head(page))
+
 static __always_inline int PageTail(struct page *page)
 {
 	return READ_ONCE(page->compound_head) & 1;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 06/96] mm: Constify get_pfnblock_flags_mask and get_pfnblock_migratetype
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2021-05-05 15:04 ` [PATCH v9 05/96] mm: Make compound_head const-preserving Matthew Wilcox (Oracle)
@ 2021-05-05 15:04 ` Matthew Wilcox (Oracle)
  2021-05-05 15:04 ` [PATCH v9 07/96] mm: Constify page_count and page_ref_count Matthew Wilcox (Oracle)
                   ` (89 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, William Kucharski, Vlastimil Babka,
	Anshuman Khandual

The struct page is not modified by these routines, so it can be marked
const.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/pageblock-flags.h |  2 +-
 mm/page_alloc.c                 | 13 +++++++------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
index fff52ad370c1..973fd731a520 100644
--- a/include/linux/pageblock-flags.h
+++ b/include/linux/pageblock-flags.h
@@ -54,7 +54,7 @@ extern unsigned int pageblock_order;
 /* Forward declaration */
 struct page;
 
-unsigned long get_pfnblock_flags_mask(struct page *page,
+unsigned long get_pfnblock_flags_mask(const struct page *page,
 				unsigned long pfn,
 				unsigned long mask);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f23702e7c564..5a1e5b624594 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -474,7 +474,7 @@ static inline bool defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 #endif
 
 /* Return a pointer to the bitmap storing bits affecting a block of pages */
-static inline unsigned long *get_pageblock_bitmap(struct page *page,
+static inline unsigned long *get_pageblock_bitmap(const struct page *page,
 							unsigned long pfn)
 {
 #ifdef CONFIG_SPARSEMEM
@@ -484,7 +484,7 @@ static inline unsigned long *get_pageblock_bitmap(struct page *page,
 #endif /* CONFIG_SPARSEMEM */
 }
 
-static inline int pfn_to_bitidx(struct page *page, unsigned long pfn)
+static inline int pfn_to_bitidx(const struct page *page, unsigned long pfn)
 {
 #ifdef CONFIG_SPARSEMEM
 	pfn &= (PAGES_PER_SECTION-1);
@@ -495,7 +495,7 @@ static inline int pfn_to_bitidx(struct page *page, unsigned long pfn)
 }
 
 static __always_inline
-unsigned long __get_pfnblock_flags_mask(struct page *page,
+unsigned long __get_pfnblock_flags_mask(const struct page *page,
 					unsigned long pfn,
 					unsigned long mask)
 {
@@ -520,13 +520,14 @@ unsigned long __get_pfnblock_flags_mask(struct page *page,
  *
  * Return: pageblock_bits flags
  */
-unsigned long get_pfnblock_flags_mask(struct page *page, unsigned long pfn,
-					unsigned long mask)
+unsigned long get_pfnblock_flags_mask(const struct page *page,
+					unsigned long pfn, unsigned long mask)
 {
 	return __get_pfnblock_flags_mask(page, pfn, mask);
 }
 
-static __always_inline int get_pfnblock_migratetype(struct page *page, unsigned long pfn)
+static __always_inline int get_pfnblock_migratetype(const struct page *page,
+					unsigned long pfn)
 {
 	return __get_pfnblock_flags_mask(page, pfn, MIGRATETYPE_MASK);
 }
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 07/96] mm: Constify page_count and page_ref_count
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2021-05-05 15:04 ` [PATCH v9 06/96] mm: Constify get_pfnblock_flags_mask and get_pfnblock_migratetype Matthew Wilcox (Oracle)
@ 2021-05-05 15:04 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 08/96] mm: Fix struct page layout on 32-bit systems Matthew Wilcox (Oracle)
                   ` (88 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:04 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, William Kucharski, Vlastimil Babka,
	Anshuman Khandual

Now that compound_head() accepts a const struct page pointer, these two
functions can be marked as not modifying the page pointer they are passed.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/page_ref.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index f3318f34fc54..7ad46f45df39 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -62,12 +62,12 @@ static inline void __page_ref_unfreeze(struct page *page, int v)
 
 #endif
 
-static inline int page_ref_count(struct page *page)
+static inline int page_ref_count(const struct page *page)
 {
 	return atomic_read(&page->_refcount);
 }
 
-static inline int page_count(struct page *page)
+static inline int page_count(const struct page *page)
 {
 	return atomic_read(&compound_head(page)->_refcount);
 }
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 08/96] mm: Fix struct page layout on 32-bit systems
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2021-05-05 15:04 ` [PATCH v9 07/96] mm: Constify page_count and page_ref_count Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 17:33   ` Vlastimil Babka
  2021-05-05 15:05 ` [PATCH v9 09/96] mm: Introduce struct folio Matthew Wilcox (Oracle)
                   ` (87 subsequent siblings)
  95 siblings, 1 reply; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Ilias Apalodimas, Jesper Dangaard Brouer

32-bit architectures which expect 8-byte alignment for 8-byte integers
and need 64-bit DMA addresses (arm, mips, ppc) had their struct page
inadvertently expanded in 2019.  When the dma_addr_t was added, it forced
the alignment of the union to 8 bytes, which inserted a 4 byte gap between
'flags' and the union.

Fix this by storing the dma_addr_t in one or two adjacent unsigned longs.
This restores the alignment to that of an unsigned long.  We always
store the low bits in the first word to prevent the PageTail bit from
being inadvertently set on a big endian platform.  If that happened,
get_user_pages_fast() racing against a page which was freed and
reallocated to the page_pool could dereference a bogus compound_head(),
which would be hard to trace back to this cause.

Fixes: c25fff7171be ("mm: add dma_addr_t to struct page")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/mm_types.h |  4 ++--
 include/net/page_pool.h  | 12 +++++++++++-
 net/core/page_pool.c     | 12 +++++++-----
 3 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6613b26a8894..5aacc1c10a45 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -97,10 +97,10 @@ struct page {
 		};
 		struct {	/* page_pool used by netstack */
 			/**
-			 * @dma_addr: might require a 64-bit value even on
+			 * @dma_addr: might require a 64-bit value on
 			 * 32-bit architectures.
 			 */
-			dma_addr_t dma_addr;
+			unsigned long dma_addr[2];
 		};
 		struct {	/* slab, slob and slub */
 			union {
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 6d517a37c18b..b4b6de909c93 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -198,7 +198,17 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
 
 static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
 {
-	return page->dma_addr;
+	dma_addr_t ret = page->dma_addr[0];
+	if (sizeof(dma_addr_t) > sizeof(unsigned long))
+		ret |= (dma_addr_t)page->dma_addr[1] << 16 << 16;
+	return ret;
+}
+
+static inline void page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
+{
+	page->dma_addr[0] = addr;
+	if (sizeof(dma_addr_t) > sizeof(unsigned long))
+		page->dma_addr[1] = upper_32_bits(addr);
 }
 
 static inline bool is_page_pool_compiled_in(void)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 9ec1aa9640ad..3c4c4c7a0402 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -174,8 +174,10 @@ static void page_pool_dma_sync_for_device(struct page_pool *pool,
 					  struct page *page,
 					  unsigned int dma_sync_size)
 {
+	dma_addr_t dma_addr = page_pool_get_dma_addr(page);
+
 	dma_sync_size = min(dma_sync_size, pool->p.max_len);
-	dma_sync_single_range_for_device(pool->p.dev, page->dma_addr,
+	dma_sync_single_range_for_device(pool->p.dev, dma_addr,
 					 pool->p.offset, dma_sync_size,
 					 pool->p.dma_dir);
 }
@@ -195,7 +197,7 @@ static bool page_pool_dma_map(struct page_pool *pool, struct page *page)
 	if (dma_mapping_error(pool->p.dev, dma))
 		return false;
 
-	page->dma_addr = dma;
+	page_pool_set_dma_addr(page, dma);
 
 	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
 		page_pool_dma_sync_for_device(pool, page, pool->p.max_len);
@@ -331,13 +333,13 @@ void page_pool_release_page(struct page_pool *pool, struct page *page)
 		 */
 		goto skip_dma_unmap;
 
-	dma = page->dma_addr;
+	dma = page_pool_get_dma_addr(page);
 
-	/* When page is unmapped, it cannot be returned our pool */
+	/* When page is unmapped, it cannot be returned to our pool */
 	dma_unmap_page_attrs(pool->p.dev, dma,
 			     PAGE_SIZE << pool->p.order, pool->p.dma_dir,
 			     DMA_ATTR_SKIP_CPU_SYNC);
-	page->dma_addr = 0;
+	page_pool_set_dma_addr(page, 0);
 skip_dma_unmap:
 	/* This may be the last page returned, releasing the pool, so
 	 * it is not safe to reference pool afterwards.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 09/96] mm: Introduce struct folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (7 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 08/96] mm: Fix struct page layout on 32-bit systems Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 10/96] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
                   ` (86 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Jeff Layton

A struct folio is a new abstraction to replace the venerable struct page.
A function which takes a struct folio argument declares that it will
operate on the entire (possibly compound) page, not just PAGE_SIZE bytes.
In return, the caller guarantees that the pointer it is passing does
not point to a tail page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/core-api/mm-api.rst |  1 +
 include/linux/mm.h                | 74 +++++++++++++++++++++++++++++++
 include/linux/mm_types.h          | 60 +++++++++++++++++++++++++
 include/linux/page-flags.h        | 27 +++++++++++
 4 files changed, 162 insertions(+)

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index a42f9baddfbf..2a94e6164f80 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -95,6 +95,7 @@ More Memory Management Functions
 .. kernel-doc:: mm/mempolicy.c
 .. kernel-doc:: include/linux/mm_types.h
    :internal:
+.. kernel-doc:: include/linux/page-flags.h
 .. kernel-doc:: include/linux/mm.h
    :internal:
 .. kernel-doc:: include/linux/mmzone.h
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2327f99b121f..b29c86824e6b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -950,6 +950,20 @@ static inline unsigned int compound_order(struct page *page)
 	return page[1].compound_order;
 }
 
+/**
+ * folio_order - The allocation order of a folio.
+ * @folio: The folio.
+ *
+ * A folio is composed of 2^order pages.  See get_order() for the definition
+ * of order.
+ *
+ * Return: The order of the folio.
+ */
+static inline unsigned int folio_order(struct folio *folio)
+{
+	return compound_order(&folio->page);
+}
+
 static inline bool hpage_pincount_available(struct page *page)
 {
 	/*
@@ -1595,6 +1609,65 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
 #endif
 }
 
+/**
+ * folio_nr_pages - The number of pages in the folio.
+ * @folio: The folio.
+ *
+ * Return: A number which is a power of two.
+ */
+static inline unsigned long folio_nr_pages(struct folio *folio)
+{
+	return compound_nr(&folio->page);
+}
+
+/**
+ * folio_next - Move to the next physical folio.
+ * @folio: The folio we're currently operating on.
+ *
+ * If you have physically contiguous memory which may span more than
+ * one folio (eg a &struct bio_vec), use this function to move from one
+ * folio to the next.  Do not use it if the memory is only virtually
+ * contiguous as the folios are almost certainly not adjacent to each
+ * other.  This is the folio equivalent to writing ``page++``.
+ *
+ * Context: We assume that the folios are refcounted and/or locked at a
+ * higher level and do not adjust the reference counts.
+ * Return: The next struct folio.
+ */
+static inline struct folio *folio_next(struct folio *folio)
+{
+	return (struct folio *)folio_page(folio, folio_nr_pages(folio));
+}
+
+/**
+ * folio_shift - The number of bits covered by this folio.
+ * @folio: The folio.
+ *
+ * A folio contains a number of bytes which is a power-of-two in size.
+ * This function tells you which power-of-two the folio is.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split.  It is not necessary for the folio to be locked.
+ * Return: The base-2 logarithm of the size of this folio.
+ */
+static inline unsigned int folio_shift(struct folio *folio)
+{
+	return PAGE_SHIFT + folio_order(folio);
+}
+
+/**
+ * folio_size - The number of bytes in a folio.
+ * @folio: The folio.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split.  It is not necessary for the folio to be locked.
+ * Return: The number of bytes in this folio.
+ */
+static inline size_t folio_size(struct folio *folio)
+{
+	return PAGE_SIZE << folio_order(folio);
+}
+
 /*
  * Some inline functions in vmstat.h depend on page_zone()
  */
@@ -1699,6 +1772,7 @@ extern void pagefault_out_of_memory(void);
 
 #define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)
 #define offset_in_thp(page, p)	((unsigned long)(p) & (thp_size(page) - 1))
+#define offset_in_folio(folio, p) ((unsigned long)(p) & (folio_size(folio) - 1))
 
 /*
  * Flags passed to show_mem() and show_free_areas() to suppress output in
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5aacc1c10a45..276e358c75d3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -224,6 +224,66 @@ struct page {
 #endif
 } _struct_page_alignment;
 
+/**
+ * struct folio - Represents a contiguous set of bytes.
+ * @flags: Identical to the page flags.
+ * @lru: Least Recently Used list; tracks how recently this folio was used.
+ * @mapping: The file this page belongs to, or refers to the anon_vma for
+ *    anonymous pages.
+ * @index: Offset within the file, in units of pages.  For anonymous pages,
+ *    this is the index from the beginning of the mmap.
+ * @private: Filesystem per-folio data (see folio_attach_private()).
+ *    Used for swp_entry_t if folio_swapcache().
+ * @_mapcount: Do not access this member directly.  Use folio_mapcount() to
+ *    find out how many times this folio is mapped by userspace.
+ * @_refcount: Do not access this member directly.  Use folio_ref_count()
+ *    to find how many references there are to this folio.
+ * @memcg_data: Memory Control Group data.
+ *
+ * A folio is a physically, virtually and logically contiguous set
+ * of bytes.  It is a power-of-two in size, and it is aligned to that
+ * same power-of-two.  It is at least as large as %PAGE_SIZE.  If it is
+ * in the page cache, it is at a file offset which is a multiple of that
+ * power-of-two.  It may be mapped into userspace at an address which is
+ * at an arbitrary page offset, but its kernel virtual address is aligned
+ * to its size.
+ */
+struct folio {
+	/* private: don't document the anon union */
+	union {
+		struct {
+	/* public: */
+			unsigned long flags;
+			struct list_head lru;
+			struct address_space *mapping;
+			pgoff_t index;
+			unsigned long private;
+			atomic_t _mapcount;
+			atomic_t _refcount;
+#ifdef CONFIG_MEMCG
+			unsigned long memcg_data;
+#endif
+	/* private: the union with struct page is transitional */
+		};
+		struct page page;
+	};
+};
+
+static_assert(sizeof(struct page) == sizeof(struct folio));
+#define FOLIO_MATCH(pg, fl)						\
+	static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
+FOLIO_MATCH(flags, flags);
+FOLIO_MATCH(lru, lru);
+FOLIO_MATCH(compound_head, lru);
+FOLIO_MATCH(index, index);
+FOLIO_MATCH(private, private);
+FOLIO_MATCH(_mapcount, _mapcount);
+FOLIO_MATCH(_refcount, _refcount);
+#ifdef CONFIG_MEMCG
+FOLIO_MATCH(memcg_data, memcg_data);
+#endif
+#undef FOLIO_MATCH
+
 static inline atomic_t *compound_mapcount_ptr(struct page *page)
 {
 	return &page[1].compound_mapcount;
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index d8e26243db25..e069aa8b11b7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -188,6 +188,33 @@ static inline unsigned long _compound_head(const struct page *page)
 
 #define compound_head(page)	((typeof(page))_compound_head(page))
 
+/**
+ * page_folio - Converts from page to folio.
+ * @p: The page.
+ *
+ * Every page is part of a folio.  This function cannot be called on a
+ * NULL pointer.
+ *
+ * Context: No reference, nor lock is required on @page.  If the caller
+ * does not hold a reference, this call may race with a folio split, so
+ * it should re-check the folio still contains this page after gaining
+ * a reference on the folio.
+ * Return: The folio which contains this page.
+ */
+#define page_folio(p)		(_Generic((p),				\
+	const struct page *:	(const struct folio *)_compound_head(p), \
+	struct page *:		(struct folio *)_compound_head(p)))
+
+/**
+ * folio_page - Return a page from a folio.
+ * @folio: The folio.
+ * @n: The page number to return.
+ *
+ * @n is relative to the start of the folio.  It should be between
+ * 0 and folio_nr_pages(@folio) - 1, but this is not checked for.
+ */
+#define folio_page(folio, n)	nth_page(&(folio)->page, n)
+
 static __always_inline int PageTail(struct page *page)
 {
 	return READ_ONCE(page->compound_head) & 1;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 10/96] mm: Add folio_pgdat and folio_zone
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (8 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 09/96] mm: Introduce struct folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 11/96] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
                   ` (85 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Zi Yan, Christoph Hellwig, Jeff Layton

These are just convenience wrappers for callers with folios; pgdat and
zone can be reached from tail pages as well as head pages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b29c86824e6b..a55c2c0628b6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1560,6 +1560,16 @@ static inline pg_data_t *page_pgdat(const struct page *page)
 	return NODE_DATA(page_to_nid(page));
 }
 
+static inline struct zone *folio_zone(const struct folio *folio)
+{
+	return page_zone(&folio->page);
+}
+
+static inline pg_data_t *folio_pgdat(const struct folio *folio)
+{
+	return page_pgdat(&folio->page);
+}
+
 #ifdef SECTION_IN_PAGE_FLAGS
 static inline void set_page_section(struct page *page, unsigned long section)
 {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 11/96] mm/vmstat: Add functions to account folio statistics
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (9 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 10/96] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 12/96] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
                   ` (84 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Allow page counters to be more readily modified by callers which have
a folio.  Name these wrappers with 'stat' instead of 'state' as requested
by Linus here:
https://lore.kernel.org/linux-mm/CAHk-=wj847SudR-kt+46fT3+xFFgiwpgThvm7DJWGdi4cVrbnQ@mail.gmail.com/

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/vmstat.h | 107 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 3299cd69e4ca..d287d7c31b8f 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -402,6 +402,78 @@ static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
 #endif		/* CONFIG_SMP */
 
+static inline void __zone_stat_mod_folio(struct folio *folio,
+		enum zone_stat_item item, long nr)
+{
+	__mod_zone_page_state(folio_zone(folio), item, nr);
+}
+
+static inline void __zone_stat_add_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	__mod_zone_page_state(folio_zone(folio), item, folio_nr_pages(folio));
+}
+
+static inline void __zone_stat_sub_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	__mod_zone_page_state(folio_zone(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void zone_stat_mod_folio(struct folio *folio,
+		enum zone_stat_item item, long nr)
+{
+	mod_zone_page_state(folio_zone(folio), item, nr);
+}
+
+static inline void zone_stat_add_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	mod_zone_page_state(folio_zone(folio), item, folio_nr_pages(folio));
+}
+
+static inline void zone_stat_sub_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	mod_zone_page_state(folio_zone(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void __node_stat_mod_folio(struct folio *folio,
+		enum node_stat_item item, long nr)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, nr);
+}
+
+static inline void __node_stat_add_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, folio_nr_pages(folio));
+}
+
+static inline void __node_stat_sub_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void node_stat_mod_folio(struct folio *folio,
+		enum node_stat_item item, long nr)
+{
+	mod_node_page_state(folio_pgdat(folio), item, nr);
+}
+
+static inline void node_stat_add_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	mod_node_page_state(folio_pgdat(folio), item, folio_nr_pages(folio));
+}
+
+static inline void node_stat_sub_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
+}
+
 static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pages,
 					     int migratetype)
 {
@@ -530,6 +602,24 @@ static inline void __dec_lruvec_page_state(struct page *page,
 	__mod_lruvec_page_state(page, idx, -1);
 }
 
+static inline void __lruvec_stat_mod_folio(struct folio *folio,
+					   enum node_stat_item idx, int val)
+{
+	__mod_lruvec_page_state(&folio->page, idx, val);
+}
+
+static inline void __lruvec_stat_add_folio(struct folio *folio,
+					   enum node_stat_item idx)
+{
+	__lruvec_stat_mod_folio(folio, idx, folio_nr_pages(folio));
+}
+
+static inline void __lruvec_stat_sub_folio(struct folio *folio,
+					   enum node_stat_item idx)
+{
+	__lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
+}
+
 static inline void inc_lruvec_page_state(struct page *page,
 					 enum node_stat_item idx)
 {
@@ -542,4 +632,21 @@ static inline void dec_lruvec_page_state(struct page *page,
 	mod_lruvec_page_state(page, idx, -1);
 }
 
+static inline void lruvec_stat_mod_folio(struct folio *folio,
+					 enum node_stat_item idx, int val)
+{
+	mod_lruvec_page_state(&folio->page, idx, val);
+}
+
+static inline void lruvec_stat_add_folio(struct folio *folio,
+					 enum node_stat_item idx)
+{
+	lruvec_stat_mod_folio(folio, idx, folio_nr_pages(folio));
+}
+
+static inline void lruvec_stat_sub_folio(struct folio *folio,
+					 enum node_stat_item idx)
+{
+	lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
+}
 #endif /* _LINUX_VMSTAT_H */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 12/96] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (10 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 11/96] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 13/96] mm: Add folio reference count functions Matthew Wilcox (Oracle)
                   ` (83 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Zi Yan, Christoph Hellwig, Jeff Layton

These are the folio equivalents of VM_BUG_ON_PAGE and VM_WARN_ON_ONCE_PAGE.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mmdebug.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
index 1935d4c72d10..d7285f8148a3 100644
--- a/include/linux/mmdebug.h
+++ b/include/linux/mmdebug.h
@@ -22,6 +22,13 @@ void dump_mm(const struct mm_struct *mm);
 			BUG();						\
 		}							\
 	} while (0)
+#define VM_BUG_ON_FOLIO(cond, folio)					\
+	do {								\
+		if (unlikely(cond)) {					\
+			dump_page(&folio->page, "VM_BUG_ON_FOLIO(" __stringify(cond)")");\
+			BUG();						\
+		}							\
+	} while (0)
 #define VM_BUG_ON_VMA(cond, vma)					\
 	do {								\
 		if (unlikely(cond)) {					\
@@ -47,6 +54,17 @@ void dump_mm(const struct mm_struct *mm);
 	}								\
 	unlikely(__ret_warn_once);					\
 })
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio)	({			\
+	static bool __section(".data.once") __warned;			\
+	int __ret_warn_once = !!(cond);					\
+									\
+	if (unlikely(__ret_warn_once && !__warned)) {			\
+		dump_page(&folio->page, "VM_WARN_ON_ONCE_FOLIO(" __stringify(cond)")");\
+		__warned = true;					\
+		WARN_ON(1);						\
+	}								\
+	unlikely(__ret_warn_once);					\
+})
 
 #define VM_WARN_ON(cond) (void)WARN_ON(cond)
 #define VM_WARN_ON_ONCE(cond) (void)WARN_ON_ONCE(cond)
@@ -55,11 +73,13 @@ void dump_mm(const struct mm_struct *mm);
 #else
 #define VM_BUG_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_BUG_ON_PAGE(cond, page) VM_BUG_ON(cond)
+#define VM_BUG_ON_FOLIO(cond, folio) VM_BUG_ON(cond)
 #define VM_BUG_ON_VMA(cond, vma) VM_BUG_ON(cond)
 #define VM_BUG_ON_MM(cond, mm) VM_BUG_ON(cond)
 #define VM_WARN_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ONCE(cond, format...) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN(cond, format...) BUILD_BUG_ON_INVALID(cond)
 #endif
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 13/96] mm: Add folio reference count functions
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (11 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 12/96] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 14/96] mm: Add folio_put Matthew Wilcox (Oracle)
                   ` (82 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

These functions mirror their page reference counterparts.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/core-api/mm-api.rst |  1 +
 include/linux/page_ref.h          | 88 ++++++++++++++++++++++++++++++-
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index 2a94e6164f80..5c459ee2acce 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -98,4 +98,5 @@ More Memory Management Functions
 .. kernel-doc:: include/linux/page-flags.h
 .. kernel-doc:: include/linux/mm.h
    :internal:
+.. kernel-doc:: include/linux/page_ref.h
 .. kernel-doc:: include/linux/mmzone.h
diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 7ad46f45df39..85816b2c0496 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -67,9 +67,31 @@ static inline int page_ref_count(const struct page *page)
 	return atomic_read(&page->_refcount);
 }
 
+/**
+ * folio_ref_count - The reference count on this folio.
+ * @folio: The folio.
+ *
+ * The refcount is usually incremented by calls to folio_get() and
+ * decremented by calls to folio_put().  Some typical users of the
+ * folio refcount:
+ *
+ * - Each reference from a page table
+ * - The page cache
+ * - Filesystem private data
+ * - The LRU list
+ * - Pipes
+ * - Direct IO which references this page in the process address space
+ *
+ * Return: The number of references to this folio.
+ */
+static inline int folio_ref_count(const struct folio *folio)
+{
+	return page_ref_count(&folio->page);
+}
+
 static inline int page_count(const struct page *page)
 {
-	return atomic_read(&compound_head(page)->_refcount);
+	return folio_ref_count(page_folio(page));
 }
 
 static inline void set_page_count(struct page *page, int v)
@@ -79,6 +101,11 @@ static inline void set_page_count(struct page *page, int v)
 		__page_ref_set(page, v);
 }
 
+static inline void folio_set_count(struct folio *folio, int v)
+{
+	set_page_count(&folio->page, v);
+}
+
 /*
  * Setup the page count before being freed into the page allocator for
  * the first time (boot or memory hotplug)
@@ -95,6 +122,11 @@ static inline void page_ref_add(struct page *page, int nr)
 		__page_ref_mod(page, nr);
 }
 
+static inline void folio_ref_add(struct folio *folio, int nr)
+{
+	page_ref_add(&folio->page, nr);
+}
+
 static inline void page_ref_sub(struct page *page, int nr)
 {
 	atomic_sub(nr, &page->_refcount);
@@ -102,6 +134,11 @@ static inline void page_ref_sub(struct page *page, int nr)
 		__page_ref_mod(page, -nr);
 }
 
+static inline void folio_ref_sub(struct folio *folio, int nr)
+{
+	page_ref_sub(&folio->page, nr);
+}
+
 static inline int page_ref_sub_return(struct page *page, int nr)
 {
 	int ret = atomic_sub_return(nr, &page->_refcount);
@@ -111,6 +148,11 @@ static inline int page_ref_sub_return(struct page *page, int nr)
 	return ret;
 }
 
+static inline int folio_ref_sub_return(struct folio *folio, int nr)
+{
+	return page_ref_sub_return(&folio->page, nr);
+}
+
 static inline void page_ref_inc(struct page *page)
 {
 	atomic_inc(&page->_refcount);
@@ -118,6 +160,11 @@ static inline void page_ref_inc(struct page *page)
 		__page_ref_mod(page, 1);
 }
 
+static inline void folio_ref_inc(struct folio *folio)
+{
+	page_ref_inc(&folio->page);
+}
+
 static inline void page_ref_dec(struct page *page)
 {
 	atomic_dec(&page->_refcount);
@@ -125,6 +172,11 @@ static inline void page_ref_dec(struct page *page)
 		__page_ref_mod(page, -1);
 }
 
+static inline void folio_ref_dec(struct folio *folio)
+{
+	page_ref_dec(&folio->page);
+}
+
 static inline int page_ref_sub_and_test(struct page *page, int nr)
 {
 	int ret = atomic_sub_and_test(nr, &page->_refcount);
@@ -134,6 +186,11 @@ static inline int page_ref_sub_and_test(struct page *page, int nr)
 	return ret;
 }
 
+static inline int folio_ref_sub_and_test(struct folio *folio, int nr)
+{
+	return page_ref_sub_and_test(&folio->page, nr);
+}
+
 static inline int page_ref_inc_return(struct page *page)
 {
 	int ret = atomic_inc_return(&page->_refcount);
@@ -143,6 +200,11 @@ static inline int page_ref_inc_return(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_inc_return(struct folio *folio)
+{
+	return page_ref_inc_return(&folio->page);
+}
+
 static inline int page_ref_dec_and_test(struct page *page)
 {
 	int ret = atomic_dec_and_test(&page->_refcount);
@@ -152,6 +214,11 @@ static inline int page_ref_dec_and_test(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_dec_and_test(struct folio *folio)
+{
+	return page_ref_dec_and_test(&folio->page);
+}
+
 static inline int page_ref_dec_return(struct page *page)
 {
 	int ret = atomic_dec_return(&page->_refcount);
@@ -161,6 +228,11 @@ static inline int page_ref_dec_return(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_dec_return(struct folio *folio)
+{
+	return page_ref_dec_return(&folio->page);
+}
+
 static inline int page_ref_add_unless(struct page *page, int nr, int u)
 {
 	int ret = atomic_add_unless(&page->_refcount, nr, u);
@@ -170,6 +242,11 @@ static inline int page_ref_add_unless(struct page *page, int nr, int u)
 	return ret;
 }
 
+static inline int folio_ref_add_unless(struct folio *folio, int nr, int u)
+{
+	return page_ref_add_unless(&folio->page, nr, u);
+}
+
 static inline int page_ref_freeze(struct page *page, int count)
 {
 	int ret = likely(atomic_cmpxchg(&page->_refcount, count, 0) == count);
@@ -179,6 +256,11 @@ static inline int page_ref_freeze(struct page *page, int count)
 	return ret;
 }
 
+static inline int folio_ref_freeze(struct folio *folio, int count)
+{
+	return page_ref_freeze(&folio->page, count);
+}
+
 static inline void page_ref_unfreeze(struct page *page, int count)
 {
 	VM_BUG_ON_PAGE(page_count(page) != 0, page);
@@ -189,4 +271,8 @@ static inline void page_ref_unfreeze(struct page *page, int count)
 		__page_ref_unfreeze(page, count);
 }
 
+static inline void folio_ref_unfreeze(struct folio *folio, int count)
+{
+	page_ref_unfreeze(&folio->page, count);
+}
 #endif
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 14/96] mm: Add folio_put
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (12 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 13/96] mm: Add folio reference count functions Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 15/96] mm: Add folio_get Matthew Wilcox (Oracle)
                   ` (81 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Zi Yan, Christoph Hellwig, Jeff Layton

If we know we have a folio, we can call folio_put() instead of put_page()
and save the overhead of calling compound_head().  Also skips the
devmap checks.

This commit looks like it should be a no-op, but actually saves 1312 bytes
of text with the distro-derived config that I'm testing.  Some functions
grow a little while others shrink.  I presume the compiler is making
different inlining decisions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a55c2c0628b6..610948f0cb43 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -751,6 +751,11 @@ static inline int put_page_testzero(struct page *page)
 	return page_ref_dec_and_test(page);
 }
 
+static inline int folio_put_testzero(struct folio *folio)
+{
+	return put_page_testzero(&folio->page);
+}
+
 /*
  * Try to grab a ref unless the page has a refcount of zero, return false if
  * that is the case.
@@ -1242,9 +1247,28 @@ static inline __must_check bool try_get_page(struct page *page)
 	return true;
 }
 
+/**
+ * folio_put - Decrement the reference count on a folio.
+ * @folio: The folio.
+ *
+ * If the folio's reference count reaches zero, the memory will be
+ * released back to the page allocator and may be used by another
+ * allocation immediately.  Do not access the memory or the struct folio
+ * after calling folio_put() unless you can be sure that it wasn't the
+ * last reference.
+ *
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context.  May be called while holding a spinlock.
+ */
+static inline void folio_put(struct folio *folio)
+{
+	if (folio_put_testzero(folio))
+		__put_page(&folio->page);
+}
+
 static inline void put_page(struct page *page)
 {
-	page = compound_head(page);
+	struct folio *folio = page_folio(page);
 
 	/*
 	 * For devmap managed pages we need to catch refcount transition from
@@ -1252,13 +1276,12 @@ static inline void put_page(struct page *page)
 	 * need to inform the device driver through callback. See
 	 * include/linux/memremap.h and HMM for details.
 	 */
-	if (page_is_devmap_managed(page)) {
-		put_devmap_managed_page(page);
+	if (page_is_devmap_managed(&folio->page)) {
+		put_devmap_managed_page(&folio->page);
 		return;
 	}
 
-	if (put_page_testzero(page))
-		__put_page(page);
+	folio_put(folio);
 }
 
 /*
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 15/96] mm: Add folio_get
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (13 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 14/96] mm: Add folio_put Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 16/96] mm: Add folio flag manipulation functions Matthew Wilcox (Oracle)
                   ` (80 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, Zi Yan, Christoph Hellwig, Jeff Layton

If we know we have a folio, we can call folio_get() instead
of get_page() and save the overhead of calling compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 610948f0cb43..b133734a7530 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1219,18 +1219,26 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
 }
 
 /* 127: arbitrary random number, small enough to assemble well */
-#define page_ref_zero_or_close_to_overflow(page) \
-	((unsigned int) page_ref_count(page) + 127u <= 127u)
+#define folio_ref_zero_or_close_to_overflow(folio) \
+	((unsigned int) folio_ref_count(folio) + 127u <= 127u)
+
+/**
+ * folio_get - Increment the reference count on a folio.
+ * @folio: The folio.
+ *
+ * Context: May be called in any context, as long as you know that
+ * you have a refcount on the folio.  If you do not already have one,
+ * try_grab_page() may be the right interface for you to use.
+ */
+static inline void folio_get(struct folio *folio)
+{
+	VM_BUG_ON_FOLIO(folio_ref_zero_or_close_to_overflow(folio), folio);
+	folio_ref_inc(folio);
+}
 
 static inline void get_page(struct page *page)
 {
-	page = compound_head(page);
-	/*
-	 * Getting a normal page or the head of a compound page
-	 * requires to already have an elevated page->_refcount.
-	 */
-	VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page);
-	page_ref_inc(page);
+	folio_get(page_folio(page));
 }
 
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 16/96] mm: Add folio flag manipulation functions
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (14 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 15/96] mm: Add folio_get Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 17/96] mm: Add folio_young() and folio_idle() Matthew Wilcox (Oracle)
                   ` (79 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

These new functions are the folio analogues of the various PageFlags
functions.  If CONFIG_DEBUG_VM_PGFLAGS is enabled, we check the folio
is not a tail page at every invocation.  This will also catch the
PagePoisoned case as a poisoned page has every bit set, which would
include PageTail.

This saves 1727 bytes of text with the distro-derived config that
I'm testing due to removing a double call to compound_head() in
PageSwapCache().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/page-flags.h | 203 +++++++++++++++++++++++++++----------
 1 file changed, 148 insertions(+), 55 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e069aa8b11b7..ef8b7c6dc91c 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -140,6 +140,8 @@ enum pageflags {
 #endif
 	__NR_PAGEFLAGS,
 
+	PG_readahead = PG_reclaim,
+
 	/* Filesystems */
 	PG_checked = PG_owner_priv_1,
 
@@ -239,6 +241,15 @@ static inline void page_init_poison(struct page *page, size_t size)
 }
 #endif
 
+static unsigned long *folio_flags(struct folio *folio, unsigned n)
+{
+	struct page *page = &folio->page;
+
+	VM_BUG_ON_PGFLAGS(PageTail(page), page);
+	VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
+	return &page[n].flags;
+}
+
 /*
  * Page flags policies wrt compound pages
  *
@@ -283,34 +294,62 @@ static inline void page_init_poison(struct page *page, size_t size)
 		VM_BUG_ON_PGFLAGS(!PageHead(page), page);		\
 		PF_POISONED_CHECK(&page[1]); })
 
+/* Which page is the flag stored in */
+#define FOLIO_PF_ANY		0
+#define FOLIO_PF_HEAD		0
+#define FOLIO_PF_ONLY_HEAD	0
+#define FOLIO_PF_NO_TAIL	0
+#define FOLIO_PF_NO_COMPOUND	0
+#define FOLIO_PF_SECOND		1
+
 /*
  * Macros to create function definitions for page flags
  */
 #define TESTPAGEFLAG(uname, lname, policy)				\
+static __always_inline bool folio_##lname(struct folio *folio)		\
+{ return test_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
 static __always_inline int Page##uname(struct page *page)		\
 	{ return test_bit(PG_##lname, &policy(page, 0)->flags); }
 
 #define SETPAGEFLAG(uname, lname, policy)				\
+static __always_inline							\
+void folio_set_##lname##_flag(struct folio *folio)			\
+{ set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }		\
 static __always_inline void SetPage##uname(struct page *page)		\
 	{ set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define CLEARPAGEFLAG(uname, lname, policy)				\
+static __always_inline							\
+void folio_clear_##lname##_flag(struct folio *folio)			\
+{ clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }		\
 static __always_inline void ClearPage##uname(struct page *page)		\
 	{ clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define __SETPAGEFLAG(uname, lname, policy)				\
+static __always_inline							\
+void __folio_set_##lname##_flag(struct folio *folio)			\
+{ __set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }		\
 static __always_inline void __SetPage##uname(struct page *page)		\
 	{ __set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define __CLEARPAGEFLAG(uname, lname, policy)				\
+static __always_inline							\
+void __folio_clear_##lname##_flag(struct folio *folio)			\
+{ __clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
 static __always_inline void __ClearPage##uname(struct page *page)	\
 	{ __clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define TESTSETFLAG(uname, lname, policy)				\
+static __always_inline							\
+bool folio_test_set_##lname##_flag(struct folio *folio)		\
+{ return test_and_set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline int TestSetPage##uname(struct page *page)	\
 	{ return test_and_set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define TESTCLEARFLAG(uname, lname, policy)				\
+static __always_inline							\
+bool folio_test_clear_##lname##_flag(struct folio *folio)		\
+{ return test_and_clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline int TestClearPage##uname(struct page *page)	\
 	{ return test_and_clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
@@ -328,29 +367,37 @@ static __always_inline int TestClearPage##uname(struct page *page)	\
 	TESTSETFLAG(uname, lname, policy)				\
 	TESTCLEARFLAG(uname, lname, policy)
 
-#define TESTPAGEFLAG_FALSE(uname)					\
+#define TESTPAGEFLAG_FALSE(uname, lname)				\
+static inline bool folio_##lname(const struct folio *folio) { return 0; } \
 static inline int Page##uname(const struct page *page) { return 0; }
 
-#define SETPAGEFLAG_NOOP(uname)						\
+#define SETPAGEFLAG_NOOP(uname, lname)					\
+static inline void folio_set_##lname##_flag(struct folio *folio) { }	\
 static inline void SetPage##uname(struct page *page) {  }
 
-#define CLEARPAGEFLAG_NOOP(uname)					\
+#define CLEARPAGEFLAG_NOOP(uname, lname)				\
+static inline void folio_clear_##lname##_flag(struct folio *folio) { }	\
 static inline void ClearPage##uname(struct page *page) {  }
 
-#define __CLEARPAGEFLAG_NOOP(uname)					\
+#define __CLEARPAGEFLAG_NOOP(uname, lname)				\
+static inline void __folio_clear_##lname_flags(struct folio *folio) { }	\
 static inline void __ClearPage##uname(struct page *page) {  }
 
-#define TESTSETFLAG_FALSE(uname)					\
+#define TESTSETFLAG_FALSE(uname, lname)					\
+static inline bool folio_test_set_##lname##_flag(struct folio *folio)	\
+{ return 0; }								\
 static inline int TestSetPage##uname(struct page *page) { return 0; }
 
-#define TESTCLEARFLAG_FALSE(uname)					\
+#define TESTCLEARFLAG_FALSE(uname, lname)				\
+static inline bool folio_test_clear_##lname##_flag(struct folio *folio) \
+{ return 0; }								\
 static inline int TestClearPage##uname(struct page *page) { return 0; }
 
-#define PAGEFLAG_FALSE(uname) TESTPAGEFLAG_FALSE(uname)			\
-	SETPAGEFLAG_NOOP(uname) CLEARPAGEFLAG_NOOP(uname)
+#define PAGEFLAG_FALSE(uname, lname) TESTPAGEFLAG_FALSE(uname, lname)	\
+	SETPAGEFLAG_NOOP(uname, lname) CLEARPAGEFLAG_NOOP(uname, lname)
 
-#define TESTSCFLAG_FALSE(uname)						\
-	TESTSETFLAG_FALSE(uname) TESTCLEARFLAG_FALSE(uname)
+#define TESTSCFLAG_FALSE(uname, lname)					\
+	TESTSETFLAG_FALSE(uname, lname) TESTCLEARFLAG_FALSE(uname, lname)
 
 __PAGEFLAG(Locked, locked, PF_NO_TAIL)
 PAGEFLAG(Waiters, waiters, PF_ONLY_HEAD) __CLEARPAGEFLAG(Waiters, waiters, PF_ONLY_HEAD)
@@ -406,8 +453,8 @@ PAGEFLAG(MappedToDisk, mappedtodisk, PF_NO_TAIL)
 /* PG_readahead is only used for reads; PG_reclaim is only for writes */
 PAGEFLAG(Reclaim, reclaim, PF_NO_TAIL)
 	TESTCLEARFLAG(Reclaim, reclaim, PF_NO_TAIL)
-PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
-	TESTCLEARFLAG(Readahead, reclaim, PF_NO_COMPOUND)
+PAGEFLAG(Readahead, readahead, PF_NO_COMPOUND)
+	TESTCLEARFLAG(Readahead, readahead, PF_NO_COMPOUND)
 
 #ifdef CONFIG_HIGHMEM
 /*
@@ -416,22 +463,25 @@ PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND)
  */
 #define PageHighMem(__p) is_highmem_idx(page_zonenum(__p))
 #else
-PAGEFLAG_FALSE(HighMem)
+PAGEFLAG_FALSE(HighMem, highmem)
 #endif
 
 #ifdef CONFIG_SWAP
-static __always_inline int PageSwapCache(struct page *page)
+static __always_inline bool folio_swapcache(struct folio *folio)
 {
-#ifdef CONFIG_THP_SWAP
-	page = compound_head(page);
-#endif
-	return PageSwapBacked(page) && test_bit(PG_swapcache, &page->flags);
+	return folio_swapbacked(folio) &&
+			test_bit(PG_swapcache, folio_flags(folio, 0));
+}
 
+static __always_inline bool PageSwapCache(struct page *page)
+{
+	return folio_swapcache(page_folio(page));
 }
+
 SETPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 #else
-PAGEFLAG_FALSE(SwapCache)
+PAGEFLAG_FALSE(SwapCache, swapcache)
 #endif
 
 PAGEFLAG(Unevictable, unevictable, PF_HEAD)
@@ -443,14 +493,14 @@ PAGEFLAG(Mlocked, mlocked, PF_NO_TAIL)
 	__CLEARPAGEFLAG(Mlocked, mlocked, PF_NO_TAIL)
 	TESTSCFLAG(Mlocked, mlocked, PF_NO_TAIL)
 #else
-PAGEFLAG_FALSE(Mlocked) __CLEARPAGEFLAG_NOOP(Mlocked)
-	TESTSCFLAG_FALSE(Mlocked)
+PAGEFLAG_FALSE(Mlocked, mlocked) __CLEARPAGEFLAG_NOOP(Mlocked, mlocked)
+	TESTSCFLAG_FALSE(Mlocked, mlocked)
 #endif
 
 #ifdef CONFIG_ARCH_USES_PG_UNCACHED
 PAGEFLAG(Uncached, uncached, PF_NO_COMPOUND)
 #else
-PAGEFLAG_FALSE(Uncached)
+PAGEFLAG_FALSE(Uncached, uncached)
 #endif
 
 #ifdef CONFIG_MEMORY_FAILURE
@@ -459,7 +509,7 @@ TESTSCFLAG(HWPoison, hwpoison, PF_ANY)
 #define __PG_HWPOISON (1UL << PG_hwpoison)
 extern bool take_page_off_buddy(struct page *page);
 #else
-PAGEFLAG_FALSE(HWPoison)
+PAGEFLAG_FALSE(HWPoison, hwpoison)
 #define __PG_HWPOISON 0
 #endif
 
@@ -505,10 +555,14 @@ static __always_inline int PageMappingFlags(struct page *page)
 	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) != 0;
 }
 
-static __always_inline int PageAnon(struct page *page)
+static __always_inline bool folio_anon(struct folio *folio)
+{
+	return ((unsigned long)folio->mapping & PAGE_MAPPING_ANON) != 0;
+}
+
+static __always_inline bool PageAnon(struct page *page)
 {
-	page = compound_head(page);
-	return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+	return folio_anon(page_folio(page));
 }
 
 static __always_inline int __PageMovable(struct page *page)
@@ -524,30 +578,32 @@ static __always_inline int __PageMovable(struct page *page)
  * is found in VM_MERGEABLE vmas.  It's a PageAnon page, pointing not to any
  * anon_vma, but to that page's node of the stable tree.
  */
-static __always_inline int PageKsm(struct page *page)
+static __always_inline bool folio_ksm(struct folio *folio)
 {
-	page = compound_head(page);
-	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
+	return ((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) ==
 				PAGE_MAPPING_KSM;
 }
+
+static __always_inline bool PageKsm(struct page *page)
+{
+	return folio_ksm(page_folio(page));
+}
 #else
-TESTPAGEFLAG_FALSE(Ksm)
+TESTPAGEFLAG_FALSE(Ksm, ksm)
 #endif
 
 u64 stable_page_flags(struct page *page);
 
-static inline int PageUptodate(struct page *page)
+static inline bool folio_uptodate(struct folio *folio)
 {
-	int ret;
-	page = compound_head(page);
-	ret = test_bit(PG_uptodate, &(page)->flags);
+	bool ret = test_bit(PG_uptodate, folio_flags(folio, 0));
 	/*
-	 * Must ensure that the data we read out of the page is loaded
-	 * _after_ we've loaded page->flags to check for PageUptodate.
-	 * We can skip the barrier if the page is not uptodate, because
+	 * Must ensure that the data we read out of the folio is loaded
+	 * _after_ we've loaded folio->flags to check the uptodate bit.
+	 * We can skip the barrier if the folio is not uptodate, because
 	 * we wouldn't be reading anything from it.
 	 *
-	 * See SetPageUptodate() for the other side of the story.
+	 * See folio_mark_uptodate() for the other side of the story.
 	 */
 	if (ret)
 		smp_rmb();
@@ -555,23 +611,36 @@ static inline int PageUptodate(struct page *page)
 	return ret;
 }
 
-static __always_inline void __SetPageUptodate(struct page *page)
+static inline int PageUptodate(struct page *page)
+{
+	return folio_uptodate(page_folio(page));
+}
+
+static __always_inline void __folio_mark_uptodate(struct folio *folio)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	smp_wmb();
-	__set_bit(PG_uptodate, &page->flags);
+	__set_bit(PG_uptodate, folio_flags(folio, 0));
 }
 
-static __always_inline void SetPageUptodate(struct page *page)
+static __always_inline void folio_mark_uptodate(struct folio *folio)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	/*
 	 * Memory barrier must be issued before setting the PG_uptodate bit,
-	 * so that all previous stores issued in order to bring the page
-	 * uptodate are actually visible before PageUptodate becomes true.
+	 * so that all previous stores issued in order to bring the folio
+	 * uptodate are actually visible before folio_uptodate becomes true.
 	 */
 	smp_wmb();
-	set_bit(PG_uptodate, &page->flags);
+	set_bit(PG_uptodate, folio_flags(folio, 0));
+}
+
+static __always_inline void __SetPageUptodate(struct page *page)
+{
+	__folio_mark_uptodate((struct folio *)page);
+}
+
+static __always_inline void SetPageUptodate(struct page *page)
+{
+	folio_mark_uptodate((struct folio *)page);
 }
 
 CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
@@ -596,6 +665,17 @@ static inline void set_page_writeback_keepwrite(struct page *page)
 
 __PAGEFLAG(Head, head, PF_ANY) CLEARPAGEFLAG(Head, head, PF_ANY)
 
+/* Whether there are one or multiple pages in a folio */
+static inline bool folio_single(struct folio *folio)
+{
+	return !folio_head(folio);
+}
+
+static inline bool folio_multi(struct folio *folio)
+{
+	return folio_head(folio);
+}
+
 static __always_inline void set_compound_head(struct page *page, struct page *head)
 {
 	WRITE_ONCE(page->compound_head, (unsigned long)head + 1);
@@ -619,12 +699,15 @@ static inline void ClearPageCompound(struct page *page)
 #ifdef CONFIG_HUGETLB_PAGE
 int PageHuge(struct page *page);
 int PageHeadHuge(struct page *page);
+static inline bool folio_hugetlb(struct folio *folio)
+{
+	return PageHeadHuge(&folio->page);
+}
 #else
-TESTPAGEFLAG_FALSE(Huge)
-TESTPAGEFLAG_FALSE(HeadHuge)
+TESTPAGEFLAG_FALSE(Huge, hugetlb)
+TESTPAGEFLAG_FALSE(HeadHuge, headhuge)
 #endif
 
-
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 /*
  * PageHuge() only returns true for hugetlbfs pages, but not for
@@ -640,6 +723,11 @@ static inline int PageTransHuge(struct page *page)
 	return PageHead(page);
 }
 
+static inline bool folio_transhuge(struct folio *folio)
+{
+	return folio_head(folio);
+}
+
 /*
  * PageTransCompound returns true for both transparent huge pages
  * and hugetlbfs pages, so it should only be called when it's known
@@ -713,12 +801,12 @@ static inline int PageTransTail(struct page *page)
 PAGEFLAG(DoubleMap, double_map, PF_SECOND)
 	TESTSCFLAG(DoubleMap, double_map, PF_SECOND)
 #else
-TESTPAGEFLAG_FALSE(TransHuge)
-TESTPAGEFLAG_FALSE(TransCompound)
-TESTPAGEFLAG_FALSE(TransCompoundMap)
-TESTPAGEFLAG_FALSE(TransTail)
-PAGEFLAG_FALSE(DoubleMap)
-	TESTSCFLAG_FALSE(DoubleMap)
+TESTPAGEFLAG_FALSE(TransHuge, transhuge)
+TESTPAGEFLAG_FALSE(TransCompound, transcompound)
+TESTPAGEFLAG_FALSE(TransCompoundMap, transcompoundmap)
+TESTPAGEFLAG_FALSE(TransTail, transtail)
+PAGEFLAG_FALSE(DoubleMap, double_map)
+	TESTSCFLAG_FALSE(DoubleMap, double_map)
 #endif
 
 /*
@@ -871,6 +959,11 @@ static inline int page_has_private(struct page *page)
 	return !!(page->flags & PAGE_FLAGS_PRIVATE);
 }
 
+static inline bool folio_has_private(struct folio *folio)
+{
+	return page_has_private(&folio->page);
+}
+
 #undef PF_ANY
 #undef PF_HEAD
 #undef PF_ONLY_HEAD
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 17/96] mm: Add folio_young() and folio_idle()
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 16/96] mm: Add folio flag manipulation functions Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 18/96] mm: Handle per-folio private data Matthew Wilcox (Oracle)
                   ` (78 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Idle page tracking is handled through page_ext for 32-bit.  Add folio
equivalents for 32 bit and move all the page compatibility parts to
common code.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/page_idle.h | 99 +++++++++++++++++++--------------------
 1 file changed, 49 insertions(+), 50 deletions(-)

diff --git a/include/linux/page_idle.h b/include/linux/page_idle.h
index 1e894d34bdce..bd957e818558 100644
--- a/include/linux/page_idle.h
+++ b/include/linux/page_idle.h
@@ -8,46 +8,16 @@
 
 #ifdef CONFIG_IDLE_PAGE_TRACKING
 
-#ifdef CONFIG_64BIT
-static inline bool page_is_young(struct page *page)
-{
-	return PageYoung(page);
-}
-
-static inline void set_page_young(struct page *page)
-{
-	SetPageYoung(page);
-}
-
-static inline bool test_and_clear_page_young(struct page *page)
-{
-	return TestClearPageYoung(page);
-}
-
-static inline bool page_is_idle(struct page *page)
-{
-	return PageIdle(page);
-}
-
-static inline void set_page_idle(struct page *page)
-{
-	SetPageIdle(page);
-}
-
-static inline void clear_page_idle(struct page *page)
-{
-	ClearPageIdle(page);
-}
-#else /* !CONFIG_64BIT */
+#ifndef CONFIG_64BIT
 /*
  * If there is not enough space to store Idle and Young bits in page flags, use
  * page ext flags instead.
  */
 extern struct page_ext_operations page_idle_ops;
 
-static inline bool page_is_young(struct page *page)
+static inline bool folio_young(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return false;
@@ -55,9 +25,9 @@ static inline bool page_is_young(struct page *page)
 	return test_bit(PAGE_EXT_YOUNG, &page_ext->flags);
 }
 
-static inline void set_page_young(struct page *page)
+static inline void folio_set_young_flag(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return;
@@ -65,9 +35,9 @@ static inline void set_page_young(struct page *page)
 	set_bit(PAGE_EXT_YOUNG, &page_ext->flags);
 }
 
-static inline bool test_and_clear_page_young(struct page *page)
+static inline bool folio_test_clear_young_flag(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return false;
@@ -75,9 +45,9 @@ static inline bool test_and_clear_page_young(struct page *page)
 	return test_and_clear_bit(PAGE_EXT_YOUNG, &page_ext->flags);
 }
 
-static inline bool page_is_idle(struct page *page)
+static inline bool folio_idle(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return false;
@@ -85,9 +55,9 @@ static inline bool page_is_idle(struct page *page)
 	return test_bit(PAGE_EXT_IDLE, &page_ext->flags);
 }
 
-static inline void set_page_idle(struct page *page)
+static inline void folio_set_idle_flag(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return;
@@ -95,46 +65,75 @@ static inline void set_page_idle(struct page *page)
 	set_bit(PAGE_EXT_IDLE, &page_ext->flags);
 }
 
-static inline void clear_page_idle(struct page *page)
+static inline void folio_clear_idle_flag(struct folio *folio)
 {
-	struct page_ext *page_ext = lookup_page_ext(page);
+	struct page_ext *page_ext = lookup_page_ext(&folio->page);
 
 	if (unlikely(!page_ext))
 		return;
 
 	clear_bit(PAGE_EXT_IDLE, &page_ext->flags);
 }
-#endif /* CONFIG_64BIT */
+#endif /* !CONFIG_64BIT */
 
 #else /* !CONFIG_IDLE_PAGE_TRACKING */
 
-static inline bool page_is_young(struct page *page)
+static inline bool folio_young(struct folio *folio)
 {
 	return false;
 }
 
-static inline void set_page_young(struct page *page)
+static inline void folio_set_young_flag(struct folio *folio)
 {
 }
 
-static inline bool test_and_clear_page_young(struct page *page)
+static inline bool folio_test_clear_young_flag(struct folio *folio)
 {
 	return false;
 }
 
-static inline bool page_is_idle(struct page *page)
+static inline bool folio_idle(struct folio *folio)
 {
 	return false;
 }
 
-static inline void set_page_idle(struct page *page)
+static inline void folio_set_idle_flag(struct folio *folio)
 {
 }
 
-static inline void clear_page_idle(struct page *page)
+static inline void folio_clear_idle_flag(struct folio *folio)
 {
 }
 
 #endif /* CONFIG_IDLE_PAGE_TRACKING */
 
+static inline bool page_is_young(struct page *page)
+{
+	return folio_young(page_folio(page));
+}
+
+static inline void set_page_young(struct page *page)
+{
+	folio_set_young_flag(page_folio(page));
+}
+
+static inline bool test_and_clear_page_young(struct page *page)
+{
+	return folio_test_clear_young_flag(page_folio(page));
+}
+
+static inline bool page_is_idle(struct page *page)
+{
+	return folio_idle(page_folio(page));
+}
+
+static inline void set_page_idle(struct page *page)
+{
+	folio_set_idle_flag(page_folio(page));
+}
+
+static inline void clear_page_idle(struct page *page)
+{
+	folio_clear_idle_flag(page_folio(page));
+}
 #endif /* _LINUX_MM_PAGE_IDLE_H */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 18/96] mm: Handle per-folio private data
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (16 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 17/96] mm: Add folio_young() and folio_idle() Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 19/96] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
                   ` (77 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Add folio_get_private() which mirrors page_private() -- ie folio private
data is the same as page private data.  The only difference is that these
return a void * instead of an unsigned long, which matches the majority
of users.

Turn attach_page_private() into folio_attach_private() and reimplement
attach_page_private() as a wrapper.  No filesystem which uses page private
data currently supports compound pages, so we're free to define the rules.
attach_page_private() may only be called on a head page; if you want
to add private data to a tail page, you can call set_page_private()
directly (and shouldn't increment the page refcount!  That should be
done when adding private data to the head page / folio).

This saves 597 bytes of text with the distro-derived config that I'm
testing due to removing the calls to compound_head() in get_page()
& put_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm_types.h | 11 +++++++++
 include/linux/pagemap.h  | 48 ++++++++++++++++++++++++----------------
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 276e358c75d3..111c304b7d13 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -302,6 +302,12 @@ static inline atomic_t *compound_pincount_ptr(struct page *page)
 #define PAGE_FRAG_CACHE_MAX_SIZE	__ALIGN_MASK(32768, ~PAGE_MASK)
 #define PAGE_FRAG_CACHE_MAX_ORDER	get_order(PAGE_FRAG_CACHE_MAX_SIZE)
 
+/*
+ * page_private can be used on tail pages.  However, PagePrivate is only
+ * checked by the VM on the head page.  So page_private on the tail pages
+ * should be used for data that's ancillary to the head page (eg attaching
+ * buffer heads to tail pages after attaching buffer heads to the head page)
+ */
 #define page_private(page)		((page)->private)
 
 static inline void set_page_private(struct page *page, unsigned long private)
@@ -309,6 +315,11 @@ static inline void set_page_private(struct page *page, unsigned long private)
 	page->private = private;
 }
 
+static inline void *folio_get_private(struct folio *folio)
+{
+	return (void *)folio->private;
+}
+
 struct page_frag_cache {
 	void * va;
 #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index a4bd41128bf3..e2a66ada9d69 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -260,42 +260,52 @@ static inline int page_cache_add_speculative(struct page *page, int count)
 }
 
 /**
- * attach_page_private - Attach private data to a page.
- * @page: Page to attach data to.
- * @data: Data to attach to page.
+ * folio_attach_private - Attach private data to a folio.
+ * @folio: Folio to attach data to.
+ * @data: Data to attach to folio.
  *
- * Attaching private data to a page increments the page's reference count.
- * The data must be detached before the page will be freed.
+ * Attaching private data to a folio increments the page's reference count.
+ * The data must be detached before the folio will be freed.
  */
-static inline void attach_page_private(struct page *page, void *data)
+static inline void folio_attach_private(struct folio *folio, void *data)
 {
-	get_page(page);
-	set_page_private(page, (unsigned long)data);
-	SetPagePrivate(page);
+	folio_get(folio);
+	folio->private = (unsigned long)data;
+	folio_set_private_flag(folio);
 }
 
 /**
- * detach_page_private - Detach private data from a page.
- * @page: Page to detach data from.
+ * folio_detach_private - Detach private data from a folio.
+ * @folio: Folio to detach data from.
  *
- * Removes the data that was previously attached to the page and decrements
+ * Removes the data that was previously attached to the folio and decrements
  * the refcount on the page.
  *
- * Return: Data that was attached to the page.
+ * Return: Data that was attached to the folio.
  */
-static inline void *detach_page_private(struct page *page)
+static inline void *folio_detach_private(struct folio *folio)
 {
-	void *data = (void *)page_private(page);
+	void *data = folio_get_private(folio);
 
-	if (!PagePrivate(page))
+	if (!folio_private(folio))
 		return NULL;
-	ClearPagePrivate(page);
-	set_page_private(page, 0);
-	put_page(page);
+	folio_clear_private_flag(folio);
+	folio->private = 0;
+	folio_put(folio);
 
 	return data;
 }
 
+static inline void attach_page_private(struct page *page, void *data)
+{
+	folio_attach_private(page_folio(page), data);
+}
+
+static inline void *detach_page_private(struct page *page)
+{
+	return folio_detach_private(page_folio(page));
+}
+
 #ifdef CONFIG_NUMA
 extern struct page *__page_cache_alloc(gfp_t gfp);
 #else
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 19/96] mm/filemap: Add folio_index, folio_file_page and folio_contains
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (17 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 18/96] mm: Handle per-folio private data Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 20/96] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
                   ` (76 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

folio_index() is the equivalent of page_index() for folios.
folio_file_page() is the equivalent of find_subpage().
folio_contains() is the equivalent of thp_contains().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 53 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index e2a66ada9d69..fdb319928781 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -462,6 +462,59 @@ static inline bool thp_contains(struct page *head, pgoff_t index)
 	return page_index(head) == (index & ~(thp_nr_pages(head) - 1UL));
 }
 
+#define swapcache_index(folio)	__page_file_index(&(folio)->page)
+
+/**
+ * folio_index - File index of a folio.
+ * @folio: The folio.
+ *
+ * For a folio which is either in the page cache or the swap cache,
+ * return its index within the address_space it belongs to.  If you know
+ * the page is definitely in the page cache, you can look at the folio's
+ * index directly.
+ *
+ * Return: The index (offset in units of pages) of a folio in its file.
+ */
+static inline pgoff_t folio_index(struct folio *folio)
+{
+        if (unlikely(folio_swapcache(folio)))
+                return swapcache_index(folio);
+        return folio->index;
+}
+
+/**
+ * folio_file_page - The page for a particular index.
+ * @folio: The folio which contains this index.
+ * @index: The index we want to look up.
+ *
+ * Sometimes after looking up a folio in the page cache, we need to
+ * obtain the specific page for an index (eg a page fault).
+ *
+ * Return: The page containing the file data for this index.
+ */
+static inline struct page *folio_file_page(struct folio *folio, pgoff_t index)
+{
+	return nth_page(&folio->page, index & (folio_nr_pages(folio) - 1));
+}
+
+/**
+ * folio_contains - Does this folio contain this index?
+ * @folio: The folio.
+ * @index: The page index within the file.
+ *
+ * Context: The caller should have the page locked in order to prevent
+ * (eg) shmem from moving the page between the page cache and swap cache
+ * and changing its index in the middle of the operation.
+ * Return: true or false.
+ */
+static inline bool folio_contains(struct folio *folio, pgoff_t index)
+{
+	/* HugeTLBfs indexes the page cache in units of hpage_size */
+	if (folio_hugetlb(folio))
+		return folio->index == index;
+	return index - folio_index(folio) < folio_nr_pages(folio);
+}
+
 /*
  * Given the page we found in the page cache, return the page corresponding
  * to this index in the file
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 20/96] mm/filemap: Add folio_next_index
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (18 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 19/96] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 21/96] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
                   ` (75 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

This helper returns the page index of the next folio in the file (ie
the end of this folio, plus one).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index fdb319928781..5ccc73c760b1 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -482,6 +482,17 @@ static inline pgoff_t folio_index(struct folio *folio)
         return folio->index;
 }
 
+/**
+ * folio_next_index - Get the index of the next folio.
+ * @folio: The current folio.
+ *
+ * Return: The index of the folio which follows this folio in the file.
+ */
+static inline pgoff_t folio_next_index(struct folio *folio)
+{
+	return folio->index + folio_nr_pages(folio);
+}
+
 /**
  * folio_file_page - The page for a particular index.
  * @folio: The folio which contains this index.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 21/96] mm/filemap: Add folio_offset and folio_file_offset
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (19 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 20/96] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 22/96] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
                   ` (74 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

These are just wrappers around their page counterpart.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 5ccc73c760b1..2df82e356821 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -634,6 +634,16 @@ static inline loff_t page_file_offset(struct page *page)
 	return ((loff_t)page_index(page)) << PAGE_SHIFT;
 }
 
+static inline loff_t folio_offset(struct folio *folio)
+{
+	return page_offset(&folio->page);
+}
+
+static inline loff_t folio_file_offset(struct folio *folio)
+{
+	return page_file_offset(&folio->page);
+}
+
 extern pgoff_t linear_hugepage_index(struct vm_area_struct *vma,
 				     unsigned long address);
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 22/96] mm/util: Add folio_mapping and folio_file_mapping
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (20 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 21/96] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 23/96] mm: Add folio_mapcount Matthew Wilcox (Oracle)
                   ` (73 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

These are the folio equivalent of page_mapping() and page_file_mapping().
Add an out-of-line page_mapping() wrapper around folio_mapping()
in order to prevent the page_folio() call from bloating every caller
of page_mapping().  Adjust page_file_mapping() and page_mapping_file()
to use folios internally.  Rename __page_file_mapping() to
swapcache_mapping() and change it to take a folio.

This ends up saving 186 bytes of text overall.  folio_mapping() is
45 bytes shorter than page_mapping() was, but the new page_mapping()
wrapper is 30 bytes.  The major reduction is a few bytes less in dozens
of nfs functions (which call page_file_mapping()).  Most of these appear
to be a slight change in gcc's register allocation decisions, which allow:

   48 8b 56 08         mov    0x8(%rsi),%rdx
   48 8d 42 ff         lea    -0x1(%rdx),%rax
   83 e2 01            and    $0x1,%edx
   48 0f 44 c6         cmove  %rsi,%rax

to become:

   48 8b 46 08         mov    0x8(%rsi),%rax
   48 8d 78 ff         lea    -0x1(%rax),%rdi
   a8 01               test   $0x1,%al
   48 0f 44 fe         cmove  %rsi,%rdi

for a reduction of a single byte.  Once the NFS client is converted to
use folios, this entire sequence will disappear.

Also add folio_mapping() documentation.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/core-api/mm-api.rst |  2 ++
 include/linux/mm.h                | 14 -------------
 include/linux/pagemap.h           | 35 +++++++++++++++++++++++++++++--
 include/linux/swap.h              |  6 ++++++
 mm/Makefile                       |  2 +-
 mm/folio-compat.c                 | 13 ++++++++++++
 mm/swapfile.c                     |  8 +++----
 mm/util.c                         | 30 +++++++++++++++-----------
 8 files changed, 77 insertions(+), 33 deletions(-)
 create mode 100644 mm/folio-compat.c

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index 5c459ee2acce..dcce6605947a 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -100,3 +100,5 @@ More Memory Management Functions
    :internal:
 .. kernel-doc:: include/linux/page_ref.h
 .. kernel-doc:: include/linux/mmzone.h
+.. kernel-doc:: mm/util.c
+   :functions: folio_mapping
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b133734a7530..fb779dca5ee8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1749,19 +1749,6 @@ void page_address_init(void);
 
 extern void *page_rmapping(struct page *page);
 extern struct anon_vma *page_anon_vma(struct page *page);
-extern struct address_space *page_mapping(struct page *page);
-
-extern struct address_space *__page_file_mapping(struct page *);
-
-static inline
-struct address_space *page_file_mapping(struct page *page)
-{
-	if (unlikely(PageSwapCache(page)))
-		return __page_file_mapping(page);
-
-	return page->mapping;
-}
-
 extern pgoff_t __page_file_index(struct page *page);
 
 /*
@@ -1776,7 +1763,6 @@ static inline pgoff_t page_index(struct page *page)
 }
 
 bool page_mapped(struct page *page);
-struct address_space *page_mapping(struct page *page);
 
 /*
  * Return true only if the page has been allocated with
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2df82e356821..9aee462639b0 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -162,14 +162,45 @@ static inline void filemap_nr_thps_dec(struct address_space *mapping)
 
 void release_pages(struct page **pages, int nr);
 
+struct address_space *page_mapping(struct page *);
+struct address_space *folio_mapping(struct folio *);
+struct address_space *swapcache_mapping(struct folio *);
+
+/**
+ * folio_file_mapping - Find the mapping this folio belongs to.
+ * @folio: The folio.
+ *
+ * For folios which are in the page cache, return the mapping that this
+ * page belongs to.  Folios in the swap cache return the mapping of the
+ * swap file or swap device where the data is stored.  This is different
+ * from the mapping returned by folio_mapping().  The only reason to
+ * use it is if, like NFS, you return 0 from ->activate_swapfile.
+ *
+ * Do not call this for folios which aren't in the page cache or swap cache.
+ */
+static inline struct address_space *folio_file_mapping(struct folio *folio)
+{
+	if (unlikely(folio_swapcache(folio)))
+		return swapcache_mapping(folio);
+
+	return folio->mapping;
+}
+
+static inline struct address_space *page_file_mapping(struct page *page)
+{
+	return folio_file_mapping(page_folio(page));
+}
+
 /*
  * For file cache pages, return the address_space, otherwise return NULL
  */
 static inline struct address_space *page_mapping_file(struct page *page)
 {
-	if (unlikely(PageSwapCache(page)))
+	struct folio *folio = page_folio(page);
+
+	if (unlikely(folio_swapcache(folio)))
 		return NULL;
-	return page_mapping(page);
+	return folio_mapping(folio);
 }
 
 /*
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 144727041e78..20766342845b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -314,6 +314,12 @@ struct vma_swap_readahead {
 #endif
 };
 
+static inline swp_entry_t folio_swap_entry(struct folio *folio)
+{
+	swp_entry_t entry = { .val = page_private(&folio->page) };
+	return entry;
+}
+
 /* linux/mm/workingset.c */
 void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
 void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
diff --git a/mm/Makefile b/mm/Makefile
index a9ad6122d468..434c2a46b6c5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -46,7 +46,7 @@ mmu-$(CONFIG_MMU)	+= process_vm_access.o
 endif
 
 obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
-			   maccess.o page-writeback.o \
+			   maccess.o page-writeback.o folio-compat.o \
 			   readahead.o swap.o truncate.o vmscan.o shmem.o \
 			   util.o mmzone.o vmstat.o backing-dev.o \
 			   mm_init.o percpu.o slab_common.o \
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
new file mode 100644
index 000000000000..5e107aa30a62
--- /dev/null
+++ b/mm/folio-compat.c
@@ -0,0 +1,13 @@
+/*
+ * Compatibility functions which bloat the callers too much to make inline.
+ * All of the callers of these functions should be converted to use folios
+ * eventually.
+ */
+
+#include <linux/pagemap.h>
+
+struct address_space *page_mapping(struct page *page)
+{
+	return folio_mapping(page_folio(page));
+}
+EXPORT_SYMBOL(page_mapping);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 149e77454e3c..d0ee24239a83 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3533,13 +3533,13 @@ struct swap_info_struct *page_swap_info(struct page *page)
 }
 
 /*
- * out-of-line __page_file_ methods to avoid include hell.
+ * out-of-line methods to avoid include hell.
  */
-struct address_space *__page_file_mapping(struct page *page)
+struct address_space *swapcache_mapping(struct folio *folio)
 {
-	return page_swap_info(page)->swap_file->f_mapping;
+	return page_swap_info(&folio->page)->swap_file->f_mapping;
 }
-EXPORT_SYMBOL_GPL(__page_file_mapping);
+EXPORT_SYMBOL_GPL(swapcache_mapping);
 
 pgoff_t __page_file_index(struct page *page)
 {
diff --git a/mm/util.c b/mm/util.c
index a8bf17f18a81..afd99591cb81 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -686,30 +686,36 @@ struct anon_vma *page_anon_vma(struct page *page)
 	return __page_rmapping(page);
 }
 
-struct address_space *page_mapping(struct page *page)
+/**
+ * folio_mapping - Find the mapping where this folio is stored.
+ * @folio: The folio.
+ *
+ * For folios which are in the page cache, return the mapping that this
+ * page belongs to.  Folios in the swap cache return the swap mapping
+ * this page is stored in (which is different from the mapping for the
+ * swap file or swap device where the data is stored).
+ *
+ * You can call this for folios which aren't in the swap cache or page
+ * cache and it will return NULL.
+ */
+struct address_space *folio_mapping(struct folio *folio)
 {
 	struct address_space *mapping;
 
-	page = compound_head(page);
-
 	/* This happens if someone calls flush_dcache_page on slab page */
-	if (unlikely(PageSlab(page)))
+	if (unlikely(folio_slab(folio)))
 		return NULL;
 
-	if (unlikely(PageSwapCache(page))) {
-		swp_entry_t entry;
-
-		entry.val = page_private(page);
-		return swap_address_space(entry);
-	}
+	if (unlikely(folio_swapcache(folio)))
+		return swap_address_space(folio_swap_entry(folio));
 
-	mapping = page->mapping;
+	mapping = folio->mapping;
 	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
 		return NULL;
 
 	return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
 }
-EXPORT_SYMBOL(page_mapping);
+EXPORT_SYMBOL(folio_mapping);
 
 /* Slow path of page_mapcount() for compound pages */
 int __page_mapcount(struct page *page)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 23/96] mm: Add folio_mapcount
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (21 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 22/96] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 24/96] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
                   ` (72 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

This is the folio equivalent of page_mapcount().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fb779dca5ee8..bca3e2518e5e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -883,6 +883,22 @@ static inline int page_mapcount(struct page *page)
 	return atomic_read(&page->_mapcount) + 1;
 }
 
+/**
+ * folio_mapcount - The number of mappings of this folio.
+ * @folio: The folio.
+ *
+ * The result includes the number of times any of the pages in the
+ * folio are mapped to userspace.
+ *
+ * Return: The number of page table entries which refer to this folio.
+ */
+static inline int folio_mapcount(struct folio *folio)
+{
+	if (unlikely(folio_multi(folio)))
+		return __page_mapcount(&folio->page);
+	return atomic_read(&folio->_mapcount) + 1;
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 int total_mapcount(struct page *page);
 int page_trans_huge_mapcount(struct page *page, int *total_mapcount);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 24/96] mm/memcg: Add folio wrappers for various functions
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (22 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 23/96] mm: Add folio_mapcount Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 25/96] mm/filemap: Add folio_unlock Matthew Wilcox (Oracle)
                   ` (71 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Add new wrapper functions folio_memcg(), lock_folio_memcg(),
unlock_folio_memcg(), mem_cgroup_folio_lruvec() and
count_memcg_folio_event()

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/memcontrol.h | 58 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c193be760709..b45b505be0ec 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -456,6 +456,11 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
 		return __page_memcg(page);
 }
 
+static inline struct mem_cgroup *folio_memcg(struct folio *folio)
+{
+	return page_memcg(&folio->page);
+}
+
 /*
  * page_memcg_rcu - locklessly get the memory cgroup associated with a page
  * @page: a pointer to the page struct
@@ -1058,6 +1063,15 @@ static inline void count_memcg_page_event(struct page *page,
 		count_memcg_events(memcg, idx, 1);
 }
 
+static inline void count_memcg_folio_event(struct folio *folio,
+					  enum vm_event_item idx)
+{
+	struct mem_cgroup *memcg = folio_memcg(folio);
+
+	if (memcg)
+		count_memcg_events(memcg, idx, folio_nr_pages(folio));
+}
+
 static inline void count_memcg_event_mm(struct mm_struct *mm,
 					enum vm_event_item idx)
 {
@@ -1477,6 +1491,22 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 }
 #endif /* CONFIG_MEMCG */
 
+static inline void lock_folio_memcg(struct folio *folio)
+{
+	lock_page_memcg(&folio->page);
+}
+
+static inline void unlock_folio_memcg(struct folio *folio)
+{
+	unlock_page_memcg(&folio->page);
+}
+
+static inline struct lruvec *mem_cgroup_folio_lruvec(struct folio *folio,
+						    struct pglist_data *pgdat)
+{
+	return mem_cgroup_page_lruvec(&folio->page, pgdat);
+}
+
 static inline void __inc_lruvec_kmem_state(void *p, enum node_stat_item idx)
 {
 	__mod_lruvec_kmem_state(p, idx, 1);
@@ -1544,6 +1574,34 @@ static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
 	return lock_page_lruvec_irqsave(page, flags);
 }
 
+static inline struct lruvec *folio_lock_lruvec(struct folio *folio)
+{
+	return lock_page_lruvec(&folio->page);
+}
+
+static inline struct lruvec *folio_lock_lruvec_irq(struct folio *folio)
+{
+	return lock_page_lruvec_irq(&folio->page);
+}
+
+static inline struct lruvec *folio_lock_lruvec_irqsave(struct folio *folio,
+		unsigned long *flagsp)
+{
+	return lock_page_lruvec_irqsave(&folio->page, flagsp);
+}
+
+static inline struct lruvec *folio_relock_lruvec_irq(struct folio *folio,
+		struct lruvec *locked_lruvec)
+{
+	return relock_page_lruvec_irq(&folio->page, locked_lruvec);
+}
+
+static inline struct lruvec *folio_relock_lruvec_irqsave(struct folio *folio,
+		struct lruvec *locked_lruvec, unsigned long *flagsp)
+{
+	return relock_page_lruvec_irqsave(&folio->page, locked_lruvec, flagsp);
+}
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 
 struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 25/96] mm/filemap: Add folio_unlock
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (23 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 24/96] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 26/96] mm/filemap: Add folio_lock Matthew Wilcox (Oracle)
                   ` (70 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Convert unlock_page() to call folio_unlock().  By using a folio we
avoid a call to compound_head().  This shortens the function from 39
bytes to 25 and removes 4 instructions on x86-64.  Because we still
have unlock_page(), it's a net increase of 24 bytes of text for the
kernel as a whole, but any path that uses folio_unlock() will execute
4 fewer instructions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  3 ++-
 mm/filemap.c            | 27 ++++++++++-----------------
 mm/folio-compat.c       |  6 ++++++
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9aee462639b0..005c978393eb 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -719,7 +719,8 @@ extern int __lock_page_killable(struct page *page);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
-extern void unlock_page(struct page *page);
+void unlock_page(struct page *page);
+void folio_unlock(struct folio *folio);
 
 /*
  * Return true if the page was successfully locked
diff --git a/mm/filemap.c b/mm/filemap.c
index 66f7e9fdfbc4..090b303bcd45 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1435,29 +1435,22 @@ static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile void *mem
 #endif
 
 /**
- * unlock_page - unlock a locked page
- * @page: the page
+ * folio_unlock - Unlock a locked folio.
+ * @folio: The folio.
  *
- * Unlocks the page and wakes up sleepers in wait_on_page_locked().
- * Also wakes sleepers in wait_on_page_writeback() because the wakeup
- * mechanism between PageLocked pages and PageWriteback pages is shared.
- * But that's OK - sleepers in wait_on_page_writeback() just go back to sleep.
+ * Unlocks the folio and wakes up any thread sleeping on the page lock.
  *
- * Note that this depends on PG_waiters being the sign bit in the byte
- * that contains PG_locked - thus the BUILD_BUG_ON(). That allows us to
- * clear the PG_locked bit and test PG_waiters at the same time fairly
- * portably (architectures that do LL/SC can test any bit, while x86 can
- * test the sign bit).
+ * Context: May be called from interrupt or process context.  May not be
+ * called from NMI context.
  */
-void unlock_page(struct page *page)
+void folio_unlock(struct folio *folio)
 {
 	BUILD_BUG_ON(PG_waiters != 7);
-	page = compound_head(page);
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
-		wake_up_page_bit(page, PG_locked);
+	VM_BUG_ON_FOLIO(!folio_locked(folio), folio);
+	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio, 0)))
+		wake_up_page_bit(&folio->page, PG_locked);
 }
-EXPORT_SYMBOL(unlock_page);
+EXPORT_SYMBOL(folio_unlock);
 
 /**
  * end_page_private_2 - Clear PG_private_2 and release any waiters
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 5e107aa30a62..91b3d00a92f7 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -11,3 +11,9 @@ struct address_space *page_mapping(struct page *page)
 	return folio_mapping(page_folio(page));
 }
 EXPORT_SYMBOL(page_mapping);
+
+void unlock_page(struct page *page)
+{
+	return folio_unlock(page_folio(page));
+}
+EXPORT_SYMBOL(unlock_page);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 26/96] mm/filemap: Add folio_lock
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (24 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 25/96] mm/filemap: Add folio_unlock Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 27/96] mm/filemap: Add folio_lock_killable Matthew Wilcox (Oracle)
                   ` (69 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

This is like lock_page() but for use by callers who know they have a folio.
Convert __lock_page() to be __folio_lock().  This saves one call to
compound_head() per contended call to lock_page().

Saves 362 bytes of text; mostly from improved register allocation and
inlining decisions.  __folio_lock is 59 bytes while __lock_page was 79.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 24 +++++++++++++++++++-----
 mm/filemap.c            | 29 +++++++++++++++--------------
 2 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 005c978393eb..c366a415b8ac 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -714,7 +714,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 	return true;
 }
 
-extern void __lock_page(struct page *page);
+void __folio_lock(struct folio *folio);
 extern int __lock_page_killable(struct page *page);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
@@ -722,13 +722,24 @@ extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 void unlock_page(struct page *page);
 void folio_unlock(struct folio *folio);
 
+static inline bool folio_trylock(struct folio *folio)
+{
+	return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio, 0)));
+}
+
 /*
  * Return true if the page was successfully locked
  */
 static inline int trylock_page(struct page *page)
 {
-	page = compound_head(page);
-	return (likely(!test_and_set_bit_lock(PG_locked, &page->flags)));
+	return folio_trylock(page_folio(page));
+}
+
+static inline void folio_lock(struct folio *folio)
+{
+	might_sleep();
+	if (!folio_trylock(folio))
+		__folio_lock(folio);
 }
 
 /*
@@ -736,9 +747,12 @@ static inline int trylock_page(struct page *page)
  */
 static inline void lock_page(struct page *page)
 {
+	struct folio *folio;
 	might_sleep();
-	if (!trylock_page(page))
-		__lock_page(page);
+
+	folio = page_folio(page);
+	if (!folio_trylock(folio))
+		__folio_lock(folio);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 090b303bcd45..6935b068856f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1187,7 +1187,7 @@ static void wake_up_page(struct page *page, int bit)
  */
 enum behavior {
 	EXCLUSIVE,	/* Hold ref to page and take the bit when woken, like
-			 * __lock_page() waiting on then setting PG_locked.
+			 * __folio_lock() waiting on then setting PG_locked.
 			 */
 	SHARED,		/* Hold ref to page and check the bit when woken, like
 			 * wait_on_page_writeback() waiting on PG_writeback.
@@ -1576,17 +1576,16 @@ void page_endio(struct page *page, bool is_write, int err)
 EXPORT_SYMBOL_GPL(page_endio);
 
 /**
- * __lock_page - get a lock on the page, assuming we need to sleep to get it
- * @__page: the page to lock
+ * __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
+ * @folio: The folio to lock
  */
-void __lock_page(struct page *__page)
+void __folio_lock(struct folio *folio)
 {
-	struct page *page = compound_head(__page);
-	wait_queue_head_t *q = page_waitqueue(page);
-	wait_on_page_bit_common(q, page, PG_locked, TASK_UNINTERRUPTIBLE,
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
-EXPORT_SYMBOL(__lock_page);
+EXPORT_SYMBOL(__folio_lock);
 
 int __lock_page_killable(struct page *__page)
 {
@@ -1661,10 +1660,10 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			return 0;
 		}
 	} else {
-		__lock_page(page);
+		__folio_lock(page_folio(page));
 	}
-	return 1;
 
+	return 1;
 }
 
 /**
@@ -2815,7 +2814,9 @@ loff_t mapping_seek_hole_data(struct address_space *mapping, loff_t start,
 static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 				     struct file **fpin)
 {
-	if (trylock_page(page))
+	struct folio *folio = page_folio(page);
+
+	if (folio_trylock(folio))
 		return 1;
 
 	/*
@@ -2828,7 +2829,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
-		if (__lock_page_killable(page)) {
+		if (__lock_page_killable(&folio->page)) {
 			/*
 			 * We didn't have the right flags to drop the mmap_lock,
 			 * but all fault_handlers only check for fatal signals
@@ -2840,11 +2841,11 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 			return 0;
 		}
 	} else
-		__lock_page(page);
+		__folio_lock(folio);
+
 	return 1;
 }
 
-
 /*
  * Synchronous readahead happens when we don't even find a page in the page
  * cache at all.  We don't want to perform IO under the mmap sem, so if we have
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 27/96] mm/filemap: Add folio_lock_killable
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (25 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 26/96] mm/filemap: Add folio_lock Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 28/96] mm/filemap: Add __folio_lock_async Matthew Wilcox (Oracle)
                   ` (68 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

This is like lock_page_killable() but for use by callers who
know they have a folio.  Convert __lock_page_killable() to be
__folio_lock_killable().  This saves one call to compound_head() per
contended call to lock_page_killable().

__folio_lock_killable() is 20 bytes smaller than __lock_page_killable()
was.  lock_page_maybe_drop_mmap() shrinks by 68 bytes and
__lock_page_or_retry() shrinks by 66 bytes.  That's a total of 154 bytes
of text saved.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 15 ++++++++++-----
 mm/filemap.c            | 17 +++++++++--------
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c366a415b8ac..b7cec3948541 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -715,7 +715,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 }
 
 void __folio_lock(struct folio *folio);
-extern int __lock_page_killable(struct page *page);
+int __folio_lock_killable(struct folio *folio);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
@@ -755,6 +755,14 @@ static inline void lock_page(struct page *page)
 		__folio_lock(folio);
 }
 
+static inline int folio_lock_killable(struct folio *folio)
+{
+	might_sleep();
+	if (!folio_trylock(folio))
+		return __folio_lock_killable(folio);
+	return 0;
+}
+
 /*
  * lock_page_killable is like lock_page but can be interrupted by fatal
  * signals.  It returns 0 if it locked the page and -EINTR if it was
@@ -762,10 +770,7 @@ static inline void lock_page(struct page *page)
  */
 static inline int lock_page_killable(struct page *page)
 {
-	might_sleep();
-	if (!trylock_page(page))
-		return __lock_page_killable(page);
-	return 0;
+	return folio_lock_killable(page_folio(page));
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 6935b068856f..27a86d53dd89 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1587,14 +1587,13 @@ void __folio_lock(struct folio *folio)
 }
 EXPORT_SYMBOL(__folio_lock);
 
-int __lock_page_killable(struct page *__page)
+int __folio_lock_killable(struct folio *folio)
 {
-	struct page *page = compound_head(__page);
-	wait_queue_head_t *q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, PG_locked, TASK_KILLABLE,
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
-EXPORT_SYMBOL_GPL(__lock_page_killable);
+EXPORT_SYMBOL_GPL(__folio_lock_killable);
 
 int __lock_page_async(struct page *page, struct wait_page_queue *wait)
 {
@@ -1636,6 +1635,8 @@ int __lock_page_async(struct page *page, struct wait_page_queue *wait)
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			 unsigned int flags)
 {
+	struct folio *folio = page_folio(page);
+
 	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
@@ -1654,13 +1655,13 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 	if (flags & FAULT_FLAG_KILLABLE) {
 		int ret;
 
-		ret = __lock_page_killable(page);
+		ret = __folio_lock_killable(folio);
 		if (ret) {
 			mmap_read_unlock(mm);
 			return 0;
 		}
 	} else {
-		__folio_lock(page_folio(page));
+		__folio_lock(folio);
 	}
 
 	return 1;
@@ -2829,7 +2830,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
-		if (__lock_page_killable(&folio->page)) {
+		if (__folio_lock_killable(folio)) {
 			/*
 			 * We didn't have the right flags to drop the mmap_lock,
 			 * but all fault_handlers only check for fatal signals
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 28/96] mm/filemap: Add __folio_lock_async
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (26 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 27/96] mm/filemap: Add folio_lock_killable Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 29/96] mm/filemap: Add __folio_lock_or_retry Matthew Wilcox (Oracle)
                   ` (67 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

There aren't any actual callers of lock_page_async(), so remove it.
Convert filemap_update_page() to call __folio_lock_async().

__folio_lock_async() is 21 bytes smaller than __lock_page_async(),
but the real savings come from using a folio in filemap_update_page(),
shrinking it from 514 bytes to 403 bytes, saving 111 bytes.  The text
shrinks by 132 bytes in total.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/io_uring.c           |  2 +-
 include/linux/pagemap.h | 17 -----------------
 mm/filemap.c            | 31 ++++++++++++++++---------------
 3 files changed, 17 insertions(+), 33 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7a2e83bc005d..c85f7b90b51e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3158,7 +3158,7 @@ static int io_read_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 }
 
 /*
- * This is our waitqueue callback handler, registered through lock_page_async()
+ * This is our waitqueue callback handler, registered through __folio_lock_async()
  * when we initially tried to do the IO with the iocb armed our waitqueue.
  * This gets called when the page is unlocked, and we generally expect that to
  * happen when the page IO is completed and the page is now uptodate. This will
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index b7cec3948541..eb3b6ab39a0b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -716,7 +716,6 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 
 void __folio_lock(struct folio *folio);
 int __folio_lock_killable(struct folio *folio);
-extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
 void unlock_page(struct page *page);
@@ -773,22 +772,6 @@ static inline int lock_page_killable(struct page *page)
 	return folio_lock_killable(page_folio(page));
 }
 
-/*
- * lock_page_async - Lock the page, unless this would block. If the page
- * is already locked, then queue a callback when the page becomes unlocked.
- * This callback can then retry the operation.
- *
- * Returns 0 if the page is locked successfully, or -EIOCBQUEUED if the page
- * was already locked and the callback defined in 'wait' was queued.
- */
-static inline int lock_page_async(struct page *page,
-				  struct wait_page_queue *wait)
-{
-	if (!trylock_page(page))
-		return __lock_page_async(page, wait);
-	return 0;
-}
-
 /*
  * lock_page_or_retry - Lock the page, unless this would block and the
  * caller indicated that it can handle a retry.
diff --git a/mm/filemap.c b/mm/filemap.c
index 27a86d53dd89..0033dc6c11e8 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1595,18 +1595,18 @@ int __folio_lock_killable(struct folio *folio)
 }
 EXPORT_SYMBOL_GPL(__folio_lock_killable);
 
-int __lock_page_async(struct page *page, struct wait_page_queue *wait)
+static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
 {
-	struct wait_queue_head *q = page_waitqueue(page);
+	struct wait_queue_head *q = page_waitqueue(&folio->page);
 	int ret = 0;
 
-	wait->page = page;
+	wait->page = &folio->page;
 	wait->bit_nr = PG_locked;
 
 	spin_lock_irq(&q->lock);
 	__add_wait_queue_entry_tail(q, &wait->wait);
-	SetPageWaiters(page);
-	ret = !trylock_page(page);
+	folio_set_waiters_flag(folio);
+	ret = !folio_trylock(folio);
 	/*
 	 * If we were successful now, we know we're still on the
 	 * waitqueue as we're still under the lock. This means it's
@@ -2359,41 +2359,42 @@ static int filemap_update_page(struct kiocb *iocb,
 		struct address_space *mapping, struct iov_iter *iter,
 		struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	int error;
 
-	if (!trylock_page(page)) {
+	if (!folio_trylock(folio)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_NOIO))
 			return -EAGAIN;
 		if (!(iocb->ki_flags & IOCB_WAITQ)) {
-			put_and_wait_on_page_locked(page, TASK_KILLABLE);
+			put_and_wait_on_page_locked(&folio->page, TASK_KILLABLE);
 			return AOP_TRUNCATED_PAGE;
 		}
-		error = __lock_page_async(page, iocb->ki_waitq);
+		error = __folio_lock_async(folio, iocb->ki_waitq);
 		if (error)
 			return error;
 	}
 
-	if (!page->mapping)
+	if (!folio->mapping)
 		goto truncated;
 
 	error = 0;
-	if (filemap_range_uptodate(mapping, iocb->ki_pos, iter, page))
+	if (filemap_range_uptodate(mapping, iocb->ki_pos, iter, &folio->page))
 		goto unlock;
 
 	error = -EAGAIN;
 	if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT | IOCB_WAITQ))
 		goto unlock;
 
-	error = filemap_read_page(iocb->ki_filp, mapping, page);
+	error = filemap_read_page(iocb->ki_filp, mapping, &folio->page);
 	if (error == AOP_TRUNCATED_PAGE)
-		put_page(page);
+		folio_put(folio);
 	return error;
 truncated:
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 	return AOP_TRUNCATED_PAGE;
 unlock:
-	unlock_page(page);
+	folio_unlock(folio);
 	return error;
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 29/96] mm/filemap: Add __folio_lock_or_retry
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (27 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 28/96] mm/filemap: Add __folio_lock_async Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 30/96] mm/filemap: Add folio_wait_locked Matthew Wilcox (Oracle)
                   ` (66 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Convert __lock_page_or_retry() to __folio_lock_or_retry().  This actually
saves 4 bytes in the only caller of lock_page_or_retry() (due to better
register allocation) and saves the 20 byte cost of calling page_folio()
in __folio_lock_or_retry() for a total saving of 24 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  9 ++++++---
 mm/filemap.c            | 10 ++++------
 mm/memory.c             |  8 ++++----
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index eb3b6ab39a0b..dd1ca33896b4 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -716,7 +716,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 
 void __folio_lock(struct folio *folio);
 int __folio_lock_killable(struct folio *folio);
-extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
+int __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
 				unsigned int flags);
 void unlock_page(struct page *page);
 void folio_unlock(struct folio *folio);
@@ -777,13 +777,16 @@ static inline int lock_page_killable(struct page *page)
  * caller indicated that it can handle a retry.
  *
  * Return value and mmap_lock implications depend on flags; see
- * __lock_page_or_retry().
+ * __folio_lock_or_retry().
  */
 static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				     unsigned int flags)
 {
+	struct folio *folio;
 	might_sleep();
-	return trylock_page(page) || __lock_page_or_retry(page, mm, flags);
+
+	folio = page_folio(page);
+	return folio_trylock(folio) || __folio_lock_or_retry(folio, mm, flags);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 0033dc6c11e8..3c159e5f2dcd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1623,20 +1623,18 @@ static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
 
 /*
  * Return values:
- * 1 - page is locked; mmap_lock is still held.
- * 0 - page is not locked.
+ * 1 - folio is locked; mmap_lock is still held.
+ * 0 - folio is not locked.
  *     mmap_lock has been released (mmap_read_unlock(), unless flags had both
  *     FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT set, in
  *     which case mmap_lock is still held.
  *
  * If neither ALLOW_RETRY nor KILLABLE are set, will always return 1
- * with the page locked and the mmap_lock unperturbed.
+ * with the folio locked and the mmap_lock unperturbed.
  */
-int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
+int __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
 			 unsigned int flags)
 {
-	struct folio *folio = page_folio(page);
-
 	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
diff --git a/mm/memory.c b/mm/memory.c
index 86ba6c1f6821..fc3f50d0702c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4065,7 +4065,7 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults).
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __folio_lock_or_retry().
  * If mmap_lock is released, vma may become invalid (for example
  * by other thread calling munmap()).
  */
@@ -4307,7 +4307,7 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
  * concurrent faults).
  *
  * The mmap_lock may have been released depending on flags and our return value.
- * See filemap_fault() and __lock_page_or_retry().
+ * See filemap_fault() and __folio_lock_or_retry().
  */
 static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 {
@@ -4411,7 +4411,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
  * By the time we get here, we already hold the mm semaphore
  *
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __folio_lock_or_retry().
  */
 static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 		unsigned long address, unsigned int flags)
@@ -4567,7 +4567,7 @@ static inline void mm_account_fault(struct pt_regs *regs,
  * By the time we get here, we already hold the mm semaphore
  *
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __folio_lock_or_retry().
  */
 vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 			   unsigned int flags, struct pt_regs *regs)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 30/96] mm/filemap: Add folio_wait_locked
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (28 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 29/96] mm/filemap: Add __folio_lock_or_retry Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 31/96] mm/swap: Add folio_rotate_reclaimable Matthew Wilcox (Oracle)
                   ` (65 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Also add folio_wait_locked_killable().  Turn wait_on_page_locked()
and wait_on_page_locked_killable() into wrappers.  This eliminates a
call to compound_head() from each call-site, reducing text size by 200
bytes for me.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 26 ++++++++++++++++++--------
 mm/filemap.c            |  4 ++--
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index dd1ca33896b4..615f5b3e65c4 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -797,23 +797,33 @@ extern void wait_on_page_bit(struct page *page, int bit_nr);
 extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
 
 /* 
- * Wait for a page to be unlocked.
+ * Wait for a folio to be unlocked.
  *
- * This must be called with the caller "holding" the page,
- * ie with increased "page->count" so that the page won't
+ * This must be called with the caller "holding" the folio,
+ * ie with increased "page->count" so that the folio won't
  * go away during the wait..
  */
+static inline void folio_wait_locked(struct folio *folio)
+{
+	if (folio_locked(folio))
+		wait_on_page_bit(&folio->page, PG_locked);
+}
+
+static inline int folio_wait_locked_killable(struct folio *folio)
+{
+	if (!folio_locked(folio))
+		return 0;
+	return wait_on_page_bit_killable(&folio->page, PG_locked);
+}
+
 static inline void wait_on_page_locked(struct page *page)
 {
-	if (PageLocked(page))
-		wait_on_page_bit(compound_head(page), PG_locked);
+	folio_wait_locked(page_folio(page));
 }
 
 static inline int wait_on_page_locked_killable(struct page *page)
 {
-	if (!PageLocked(page))
-		return 0;
-	return wait_on_page_bit_killable(compound_head(page), PG_locked);
+	return folio_wait_locked_killable(page_folio(page));
 }
 
 int put_and_wait_on_page_locked(struct page *page, int state);
diff --git a/mm/filemap.c b/mm/filemap.c
index 3c159e5f2dcd..73e078c40bd7 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1645,9 +1645,9 @@ int __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
 
 		mmap_read_unlock(mm);
 		if (flags & FAULT_FLAG_KILLABLE)
-			wait_on_page_locked_killable(page);
+			folio_wait_locked_killable(folio);
 		else
-			wait_on_page_locked(page);
+			folio_wait_locked(folio);
 		return 0;
 	}
 	if (flags & FAULT_FLAG_KILLABLE) {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 31/96] mm/swap: Add folio_rotate_reclaimable
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (29 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 30/96] mm/filemap: Add folio_wait_locked Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 32/96] mm/filemap: Add folio_end_writeback Matthew Wilcox (Oracle)
                   ` (64 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Move the declaration into mm/internal.h and rename
rotate_reclaimable_page() to folio_rotate_reclaimable().  This eliminates
all five of the calls to compound_head() in this function, saving 75 bytes
at the cost of adding 14 bytes to its one caller, end_page_writeback().
Net 61 bytes savings.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  1 -
 mm/filemap.c         |  2 +-
 mm/internal.h        |  1 +
 mm/page_io.c         |  4 ++--
 mm/swap.c            | 18 +++++++++---------
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 20766342845b..76b2338ef24d 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -365,7 +365,6 @@ extern void lru_add_drain(void);
 extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_cpu_zone(struct zone *zone);
 extern void lru_add_drain_all(void);
-extern void rotate_reclaimable_page(struct page *page);
 extern void deactivate_file_page(struct page *page);
 extern void deactivate_page(struct page *page);
 extern void mark_page_lazyfree(struct page *page);
diff --git a/mm/filemap.c b/mm/filemap.c
index 73e078c40bd7..06cb717c7c60 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1528,7 +1528,7 @@ void end_page_writeback(struct page *page)
 	 */
 	if (PageReclaim(page)) {
 		ClearPageReclaim(page);
-		rotate_reclaimable_page(page);
+		folio_rotate_reclaimable(page_folio(page));
 	}
 
 	/*
diff --git a/mm/internal.h b/mm/internal.h
index 46eb82eaa195..68d363a3a1f3 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -35,6 +35,7 @@
 void page_writeback_init(void);
 
 vm_fault_t do_swap_page(struct vm_fault *vmf);
+void folio_rotate_reclaimable(struct folio *folio);
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 		unsigned long floor, unsigned long ceiling);
diff --git a/mm/page_io.c b/mm/page_io.c
index c493ce9ebcf5..d597bc6e6e45 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -38,7 +38,7 @@ void end_swap_bio_write(struct bio *bio)
 		 * Also print a dire warning that things will go BAD (tm)
 		 * very quickly.
 		 *
-		 * Also clear PG_reclaim to avoid rotate_reclaimable_page()
+		 * Also clear PG_reclaim to avoid folio_rotate_reclaimable()
 		 */
 		set_page_dirty(page);
 		pr_alert_ratelimited("Write-error on swap-device (%u:%u:%llu)\n",
@@ -317,7 +317,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 			 * temporary failure if the system has limited
 			 * memory for allocating transmit buffers.
 			 * Mark the page dirty and avoid
-			 * rotate_reclaimable_page but rate-limit the
+			 * folio_rotate_reclaimable but rate-limit the
 			 * messages but do not flag PageError like
 			 * the normal direct-to-bio case as it could
 			 * be temporary.
diff --git a/mm/swap.c b/mm/swap.c
index dfb48cf9c2c9..6caca11cd2ec 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -249,23 +249,23 @@ static bool pagevec_add_and_need_flush(struct pagevec *pvec, struct page *page)
 }
 
 /*
- * Writeback is about to end against a page which has been marked for immediate
- * reclaim.  If it still appears to be reclaimable, move it to the tail of the
- * inactive list.
+ * Writeback is about to end against a folio which has been marked for
+ * immediate reclaim.  If it still appears to be reclaimable, move it
+ * to the tail of the inactive list.
  *
- * rotate_reclaimable_page() must disable IRQs, to prevent nasty races.
+ * folio_rotate_reclaimable() must disable IRQs, to prevent nasty races.
  */
-void rotate_reclaimable_page(struct page *page)
+void folio_rotate_reclaimable(struct folio *folio)
 {
-	if (!PageLocked(page) && !PageDirty(page) &&
-	    !PageUnevictable(page) && PageLRU(page)) {
+	if (!folio_locked(folio) && !folio_dirty(folio) &&
+	    !folio_unevictable(folio) && folio_lru(folio)) {
 		struct pagevec *pvec;
 		unsigned long flags;
 
-		get_page(page);
+		folio_get(folio);
 		local_lock_irqsave(&lru_rotate.lock, flags);
 		pvec = this_cpu_ptr(&lru_rotate.pvec);
-		if (pagevec_add_and_need_flush(pvec, page))
+		if (pagevec_add_and_need_flush(pvec, &folio->page))
 			pagevec_lru_move_fn(pvec, pagevec_move_tail_fn);
 		local_unlock_irqrestore(&lru_rotate.lock, flags);
 	}
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 32/96] mm/filemap: Add folio_end_writeback
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (30 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 31/96] mm/swap: Add folio_rotate_reclaimable Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 33/96] mm/writeback: Add folio_wait_writeback Matthew Wilcox (Oracle)
                   ` (63 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Add an end_page_writeback() wrapper function for users that are not yet
converted to folios.

folio_end_writeback() is less than half the size of end_page_writeback()
at just 105 bytes compared to 213 bytes, due to removing all the
compound_head() calls.  The 30 byte wrapper function makes this a net
saving of 70 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  3 ++-
 mm/filemap.c            | 40 ++++++++++++++++++++--------------------
 mm/folio-compat.c       |  6 ++++++
 3 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 615f5b3e65c4..f1078272fb26 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -829,7 +829,8 @@ static inline int wait_on_page_locked_killable(struct page *page)
 int put_and_wait_on_page_locked(struct page *page, int state);
 void wait_on_page_writeback(struct page *page);
 int wait_on_page_writeback_killable(struct page *page);
-extern void end_page_writeback(struct page *page);
+void end_page_writeback(struct page *page);
+void folio_end_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
 
 void page_endio(struct page *page, bool is_write, int err);
diff --git a/mm/filemap.c b/mm/filemap.c
index 06cb717c7c60..9d2cfa5d3a40 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1175,11 +1175,11 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 	spin_unlock_irqrestore(&q->lock, flags);
 }
 
-static void wake_up_page(struct page *page, int bit)
+static void folio_wake(struct folio *folio, int bit)
 {
-	if (!PageWaiters(page))
+	if (!folio_waiters(folio))
 		return;
-	wake_up_page_bit(page, bit);
+	wake_up_page_bit(&folio->page, bit);
 }
 
 /*
@@ -1514,38 +1514,38 @@ int wait_on_page_private_2_killable(struct page *page)
 EXPORT_SYMBOL(wait_on_page_private_2_killable);
 
 /**
- * end_page_writeback - end writeback against a page
- * @page: the page
+ * folio_end_writeback - End writeback against a folio.
+ * @folio: The folio.
  */
-void end_page_writeback(struct page *page)
+void folio_end_writeback(struct folio *folio)
 {
 	/*
-	 * TestClearPageReclaim could be used here but it is an atomic
+	 * folio_test_clear_reclaim_flag() could be used here but it is an atomic
 	 * operation and overkill in this particular case. Failing to
-	 * shuffle a page marked for immediate reclaim is too mild to
+	 * shuffle a folio marked for immediate reclaim is too mild to
 	 * justify taking an atomic operation penalty at the end of
-	 * ever page writeback.
+	 * every folio writeback.
 	 */
-	if (PageReclaim(page)) {
-		ClearPageReclaim(page);
-		folio_rotate_reclaimable(page_folio(page));
+	if (folio_reclaim(folio)) {
+		folio_clear_reclaim_flag(folio);
+		folio_rotate_reclaimable(folio);
 	}
 
 	/*
-	 * Writeback does not hold a page reference of its own, relying
+	 * Writeback does not hold a folio reference of its own, relying
 	 * on truncation to wait for the clearing of PG_writeback.
-	 * But here we must make sure that the page is not freed and
-	 * reused before the wake_up_page().
+	 * But here we must make sure that the folio is not freed and
+	 * reused before the folio_wake().
 	 */
-	get_page(page);
-	if (!test_clear_page_writeback(page))
+	folio_get(folio);
+	if (!test_clear_page_writeback(&folio->page))
 		BUG();
 
 	smp_mb__after_atomic();
-	wake_up_page(page, PG_writeback);
-	put_page(page);
+	folio_wake(folio, PG_writeback);
+	folio_put(folio);
 }
-EXPORT_SYMBOL(end_page_writeback);
+EXPORT_SYMBOL(folio_end_writeback);
 
 /*
  * After completing I/O on a page, call this routine to update the page
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 91b3d00a92f7..526843d03d58 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -17,3 +17,9 @@ void unlock_page(struct page *page)
 	return folio_unlock(page_folio(page));
 }
 EXPORT_SYMBOL(unlock_page);
+
+void end_page_writeback(struct page *page)
+{
+	return folio_end_writeback(page_folio(page));
+}
+EXPORT_SYMBOL(end_page_writeback);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 33/96] mm/writeback: Add folio_wait_writeback
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (31 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 32/96] mm/filemap: Add folio_end_writeback Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 34/96] mm/writeback: Add folio_wait_stable Matthew Wilcox (Oracle)
                   ` (62 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

wait_on_page_writeback_killable() only has one caller, so convert it to
call folio_wait_writeback_killable().  For the wait_on_page_writeback()
callers, add a compatibility wrapper around folio_wait_writeback().

Turning PageWriteback() into folio_writeback() eliminates a call to
compound_head() which saves 8 bytes and 15 bytes in the two functions.
That is more than offset by adding the wait_on_page_writeback
compatibility wrapper for a net increase in text of 15 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/afs/write.c          |  9 ++++----
 include/linux/pagemap.h |  3 ++-
 mm/folio-compat.c       |  6 ++++++
 mm/page-writeback.c     | 48 ++++++++++++++++++++++++++++-------------
 4 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index 3edb6204b937..22b1c4d43687 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -832,7 +832,8 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  */
 vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 {
-	struct page *page = thp_head(vmf->page);
+	struct folio *folio = page_folio(vmf->page);
+	struct page *page = &folio->page;
 	struct file *file = vmf->vma->vm_file;
 	struct inode *inode = file_inode(file);
 	struct afs_vnode *vnode = AFS_FS_I(inode);
@@ -851,7 +852,7 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 		return VM_FAULT_RETRY;
 #endif
 
-	if (wait_on_page_writeback_killable(page))
+	if (folio_wait_writeback_killable(folio))
 		return VM_FAULT_RETRY;
 
 	if (lock_page_killable(page) < 0)
@@ -861,8 +862,8 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 	 * details the portion of the page we need to write back and we might
 	 * need to redirty the page if there's a problem.
 	 */
-	if (wait_on_page_writeback_killable(page) < 0) {
-		unlock_page(page);
+	if (folio_wait_writeback_killable(folio) < 0) {
+		folio_unlock(folio);
 		return VM_FAULT_RETRY;
 	}
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index f1078272fb26..8849180c783c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -828,7 +828,8 @@ static inline int wait_on_page_locked_killable(struct page *page)
 
 int put_and_wait_on_page_locked(struct page *page, int state);
 void wait_on_page_writeback(struct page *page);
-int wait_on_page_writeback_killable(struct page *page);
+void folio_wait_writeback(struct folio *folio);
+int folio_wait_writeback_killable(struct folio *folio);
 void end_page_writeback(struct page *page);
 void folio_end_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 526843d03d58..41275dac7a92 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -23,3 +23,9 @@ void end_page_writeback(struct page *page)
 	return folio_end_writeback(page_folio(page));
 }
 EXPORT_SYMBOL(end_page_writeback);
+
+void wait_on_page_writeback(struct page *page)
+{
+	return folio_wait_writeback(page_folio(page));
+}
+EXPORT_SYMBOL_GPL(wait_on_page_writeback);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0062d5c57d41..c8bc78cd0f2b 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2818,33 +2818,51 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 }
 EXPORT_SYMBOL(__test_set_page_writeback);
 
-/*
- * Wait for a page to complete writeback
+/**
+ * folio_wait_writeback - Wait for a folio to finish writeback.
+ * @folio: The folio to wait for.
+ *
+ * If the folio is currently being written back to storage, wait for the
+ * I/O to complete.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
  */
-void wait_on_page_writeback(struct page *page)
+void folio_wait_writeback(struct folio *folio)
 {
-	while (PageWriteback(page)) {
-		trace_wait_on_page_writeback(page, page_mapping(page));
-		wait_on_page_bit(page, PG_writeback);
+	while (folio_writeback(folio)) {
+		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
+		wait_on_page_bit(&folio->page, PG_writeback);
 	}
 }
-EXPORT_SYMBOL_GPL(wait_on_page_writeback);
+EXPORT_SYMBOL_GPL(folio_wait_writeback);
 
-/*
- * Wait for a page to complete writeback.  Returns -EINTR if we get a
- * fatal signal while waiting.
+/**
+ * folio_wait_writeback_killable - Wait for a folio to finish writeback.
+ * @folio: The folio to wait for.
+ *
+ * If the folio is currently being written back to storage, wait for the
+ * I/O to complete or a fatal signal to arrive.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
+ * Return: 0 on success, -EINTR if we get a fatal signal while waiting.
  */
-int wait_on_page_writeback_killable(struct page *page)
+int folio_wait_writeback_killable(struct folio *folio)
 {
-	while (PageWriteback(page)) {
-		trace_wait_on_page_writeback(page, page_mapping(page));
-		if (wait_on_page_bit_killable(page, PG_writeback))
+	while (folio_writeback(folio)) {
+		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
+		if (wait_on_page_bit_killable(&folio->page, PG_writeback))
 			return -EINTR;
 	}
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(wait_on_page_writeback_killable);
+EXPORT_SYMBOL_GPL(folio_wait_writeback_killable);
 
 /**
  * wait_for_stable_page() - wait for writeback to finish, if necessary.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 34/96] mm/writeback: Add folio_wait_stable
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (32 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 33/96] mm/writeback: Add folio_wait_writeback Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 35/96] mm/filemap: Add folio_wait_bit Matthew Wilcox (Oracle)
                   ` (61 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Move wait_for_stable_page() into the folio compatibility file.
folio_wait_stable() avoids a call to compound_head() and is 14 bytes
smaller than wait_for_stable_page() was.  The net text size grows by 24
bytes as a result of this patch.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  1 +
 mm/folio-compat.c       |  6 ++++++
 mm/page-writeback.c     | 24 ++++++++++++++----------
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 8849180c783c..dd98b25fec49 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -833,6 +833,7 @@ int folio_wait_writeback_killable(struct folio *folio);
 void end_page_writeback(struct page *page);
 void folio_end_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
+void folio_wait_stable(struct folio *folio);
 
 void page_endio(struct page *page, bool is_write, int err);
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 41275dac7a92..3c83f03b80d7 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -29,3 +29,9 @@ void wait_on_page_writeback(struct page *page)
 	return folio_wait_writeback(page_folio(page));
 }
 EXPORT_SYMBOL_GPL(wait_on_page_writeback);
+
+void wait_for_stable_page(struct page *page)
+{
+	return folio_wait_stable(page_folio(page));
+}
+EXPORT_SYMBOL_GPL(wait_for_stable_page);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index c8bc78cd0f2b..8b15c44552f7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2865,17 +2865,21 @@ int folio_wait_writeback_killable(struct folio *folio)
 EXPORT_SYMBOL_GPL(folio_wait_writeback_killable);
 
 /**
- * wait_for_stable_page() - wait for writeback to finish, if necessary.
- * @page:	The page to wait on.
+ * folio_wait_stable() - wait for writeback to finish, if necessary.
+ * @folio: The folio to wait on.
  *
- * This function determines if the given page is related to a backing device
- * that requires page contents to be held stable during writeback.  If so, then
- * it will wait for any pending writeback to complete.
+ * This function determines if the given folio is related to a backing
+ * device that requires folio contents to be held stable during writeback.
+ * If so, then it will wait for any pending writeback to complete.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
  */
-void wait_for_stable_page(struct page *page)
+void folio_wait_stable(struct folio *folio)
 {
-	page = thp_head(page);
-	if (page->mapping->host->i_sb->s_iflags & SB_I_STABLE_WRITES)
-		wait_on_page_writeback(page);
+	if (folio->mapping->host->i_sb->s_iflags & SB_I_STABLE_WRITES)
+		folio_wait_writeback(folio);
 }
-EXPORT_SYMBOL_GPL(wait_for_stable_page);
+EXPORT_SYMBOL_GPL(folio_wait_stable);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 35/96] mm/filemap: Add folio_wait_bit
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (33 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 34/96] mm/writeback: Add folio_wait_stable Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 36/96] mm/filemap: Add folio_wake_bit Matthew Wilcox (Oracle)
                   ` (60 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Rename wait_on_page_bit() to folio_wait_bit().  We must always wait on
the folio, otherwise we won't be woken up due to the tail page hashing
to a different bucket from the head page.

This commit shrinks the kernel by 691 bytes, mostly due to moving
the page waitqueue lookup into folio_wait_bit_common().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 10 +++---
 mm/filemap.c            | 77 +++++++++++++++++++----------------------
 mm/page-writeback.c     |  4 +--
 3 files changed, 43 insertions(+), 48 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index dd98b25fec49..6c067ce340b5 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -790,11 +790,11 @@ static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
 }
 
 /*
- * This is exported only for wait_on_page_locked/wait_on_page_writeback, etc.,
+ * This is exported only for folio_wait_locked/folio_wait_writeback, etc.,
  * and should not be used directly.
  */
-extern void wait_on_page_bit(struct page *page, int bit_nr);
-extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
+extern void folio_wait_bit(struct folio *folio, int bit_nr);
+extern int folio_wait_bit_killable(struct folio *folio, int bit_nr);
 
 /* 
  * Wait for a folio to be unlocked.
@@ -806,14 +806,14 @@ extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
 static inline void folio_wait_locked(struct folio *folio)
 {
 	if (folio_locked(folio))
-		wait_on_page_bit(&folio->page, PG_locked);
+		folio_wait_bit(folio, PG_locked);
 }
 
 static inline int folio_wait_locked_killable(struct folio *folio)
 {
 	if (!folio_locked(folio))
 		return 0;
-	return wait_on_page_bit_killable(&folio->page, PG_locked);
+	return folio_wait_bit_killable(folio, PG_locked);
 }
 
 static inline void wait_on_page_locked(struct page *page)
diff --git a/mm/filemap.c b/mm/filemap.c
index 9d2cfa5d3a40..ab66e637eb5a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1102,7 +1102,7 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	 *
 	 * So update the flags atomically, and wake up the waiter
 	 * afterwards to avoid any races. This store-release pairs
-	 * with the load-acquire in wait_on_page_bit_common().
+	 * with the load-acquire in folio_wait_bit_common().
 	 */
 	smp_store_release(&wait->flags, flags | WQ_FLAG_WOKEN);
 	wake_up_state(wait->private, mode);
@@ -1183,7 +1183,7 @@ static void folio_wake(struct folio *folio, int bit)
 }
 
 /*
- * A choice of three behaviors for wait_on_page_bit_common():
+ * A choice of three behaviors for folio_wait_bit_common():
  */
 enum behavior {
 	EXCLUSIVE,	/* Hold ref to page and take the bit when woken, like
@@ -1198,16 +1198,16 @@ enum behavior {
 };
 
 /*
- * Attempt to check (or get) the page bit, and mark us done
+ * Attempt to check (or get) the folio flag, and mark us done
  * if successful.
  */
-static inline bool trylock_page_bit_common(struct page *page, int bit_nr,
+static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 					struct wait_queue_entry *wait)
 {
 	if (wait->flags & WQ_FLAG_EXCLUSIVE) {
-		if (test_and_set_bit(bit_nr, &page->flags))
+		if (test_and_set_bit(bit_nr, &folio->flags))
 			return false;
-	} else if (test_bit(bit_nr, &page->flags))
+	} else if (test_bit(bit_nr, &folio->flags))
 		return false;
 
 	wait->flags |= WQ_FLAG_WOKEN | WQ_FLAG_DONE;
@@ -1217,9 +1217,10 @@ static inline bool trylock_page_bit_common(struct page *page, int bit_nr,
 /* How many times do we accept lock stealing from under a waiter? */
 int sysctl_page_lock_unfairness = 5;
 
-static inline int wait_on_page_bit_common(wait_queue_head_t *q,
-	struct page *page, int bit_nr, int state, enum behavior behavior)
+static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
+		int state, enum behavior behavior)
 {
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
 	int unfairness = sysctl_page_lock_unfairness;
 	struct wait_page_queue wait_page;
 	wait_queue_entry_t *wait = &wait_page.wait;
@@ -1228,8 +1229,8 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	unsigned long pflags;
 
 	if (bit_nr == PG_locked &&
-	    !PageUptodate(page) && PageWorkingset(page)) {
-		if (!PageSwapBacked(page)) {
+	    !folio_uptodate(folio) && folio_workingset(folio)) {
+		if (!folio_swapbacked(folio)) {
 			delayacct_thrashing_start();
 			delayacct = true;
 		}
@@ -1239,7 +1240,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 
 	init_wait(wait);
 	wait->func = wake_page_function;
-	wait_page.page = page;
+	wait_page.page = &folio->page;
 	wait_page.bit_nr = bit_nr;
 
 repeat:
@@ -1254,7 +1255,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * Do one last check whether we can get the
 	 * page bit synchronously.
 	 *
-	 * Do the SetPageWaiters() marking before that
+	 * Do the folio_set_waiters_flag() marking before that
 	 * to let any waker we _just_ missed know they
 	 * need to wake us up (otherwise they'll never
 	 * even go to the slow case that looks at the
@@ -1265,8 +1266,8 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * lock to avoid races.
 	 */
 	spin_lock_irq(&q->lock);
-	SetPageWaiters(page);
-	if (!trylock_page_bit_common(page, bit_nr, wait))
+	folio_set_waiters_flag(folio);
+	if (!folio_trylock_flag(folio, bit_nr, wait))
 		__add_wait_queue_entry_tail(q, wait);
 	spin_unlock_irq(&q->lock);
 
@@ -1276,10 +1277,10 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * see whether the page bit testing has already
 	 * been done by the wake function.
 	 *
-	 * We can drop our reference to the page.
+	 * We can drop our reference to the folio.
 	 */
 	if (behavior == DROP)
-		put_page(page);
+		folio_put(folio);
 
 	/*
 	 * Note that until the "finish_wait()", or until
@@ -1316,7 +1317,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 		 *
 		 * And if that fails, we'll have to retry this all.
 		 */
-		if (unlikely(test_and_set_bit(bit_nr, &page->flags)))
+		if (unlikely(test_and_set_bit(bit_nr, folio_flags(folio, 0))))
 			goto repeat;
 
 		wait->flags |= WQ_FLAG_DONE;
@@ -1325,7 +1326,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 
 	/*
 	 * If a signal happened, this 'finish_wait()' may remove the last
-	 * waiter from the wait-queues, but the PageWaiters bit will remain
+	 * waiter from the wait-queues, but the folio_waiters bit will remain
 	 * set. That's ok. The next wakeup will take care of it, and trying
 	 * to do it here would be difficult and prone to races.
 	 */
@@ -1356,19 +1357,17 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR;
 }
 
-void wait_on_page_bit(struct page *page, int bit_nr)
+void folio_wait_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
-	wait_on_page_bit_common(q, page, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
+	folio_wait_bit_common(folio, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
 }
-EXPORT_SYMBOL(wait_on_page_bit);
+EXPORT_SYMBOL(folio_wait_bit);
 
-int wait_on_page_bit_killable(struct page *page, int bit_nr)
+int folio_wait_bit_killable(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, bit_nr, TASK_KILLABLE, SHARED);
+	return folio_wait_bit_common(folio, bit_nr, TASK_KILLABLE, SHARED);
 }
-EXPORT_SYMBOL(wait_on_page_bit_killable);
+EXPORT_SYMBOL(folio_wait_bit_killable);
 
 /**
  * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
@@ -1385,11 +1384,8 @@ EXPORT_SYMBOL(wait_on_page_bit_killable);
  */
 int put_and_wait_on_page_locked(struct page *page, int state)
 {
-	wait_queue_head_t *q;
-
-	page = compound_head(page);
-	q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, PG_locked, state, DROP);
+	return folio_wait_bit_common(page_folio(page), PG_locked, state,
+			DROP);
 }
 
 /**
@@ -1481,9 +1477,10 @@ EXPORT_SYMBOL(end_page_private_2);
  */
 void wait_on_page_private_2(struct page *page)
 {
-	page = compound_head(page);
-	while (PagePrivate2(page))
-		wait_on_page_bit(page, PG_private_2);
+	struct folio *folio = page_folio(page);
+
+	while (folio_private_2(folio))
+		folio_wait_bit(folio, PG_private_2);
 }
 EXPORT_SYMBOL(wait_on_page_private_2);
 
@@ -1500,11 +1497,11 @@ EXPORT_SYMBOL(wait_on_page_private_2);
  */
 int wait_on_page_private_2_killable(struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	int ret = 0;
 
-	page = compound_head(page);
-	while (PagePrivate2(page)) {
-		ret = wait_on_page_bit_killable(page, PG_private_2);
+	while (folio_private_2(folio)) {
+		ret = folio_wait_bit_killable(folio, PG_private_2);
 		if (ret < 0)
 			break;
 	}
@@ -1581,16 +1578,14 @@ EXPORT_SYMBOL_GPL(page_endio);
  */
 void __folio_lock(struct folio *folio)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
-	wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
+	folio_wait_bit_common(folio, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
 EXPORT_SYMBOL(__folio_lock);
 
 int __folio_lock_killable(struct folio *folio)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
-	return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
+	return folio_wait_bit_common(folio, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
 EXPORT_SYMBOL_GPL(__folio_lock_killable);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 8b15c44552f7..157ed52e4b7d 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2834,7 +2834,7 @@ void folio_wait_writeback(struct folio *folio)
 {
 	while (folio_writeback(folio)) {
 		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
-		wait_on_page_bit(&folio->page, PG_writeback);
+		folio_wait_bit(folio, PG_writeback);
 	}
 }
 EXPORT_SYMBOL_GPL(folio_wait_writeback);
@@ -2856,7 +2856,7 @@ int folio_wait_writeback_killable(struct folio *folio)
 {
 	while (folio_writeback(folio)) {
 		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
-		if (wait_on_page_bit_killable(&folio->page, PG_writeback))
+		if (folio_wait_bit_killable(folio, PG_writeback))
 			return -EINTR;
 	}
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 36/96] mm/filemap: Add folio_wake_bit
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (34 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 35/96] mm/filemap: Add folio_wait_bit Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 37/96] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
                   ` (59 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Convert wake_up_page_bit() to folio_wake_bit().  All callers have a folio,
so use it directly.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 mm/filemap.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index ab66e637eb5a..f5cb6464fe37 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1121,14 +1121,14 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	return (flags & WQ_FLAG_EXCLUSIVE) != 0;
 }
 
-static void wake_up_page_bit(struct page *page, int bit_nr)
+static void folio_wake_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
 	struct wait_page_key key;
 	unsigned long flags;
 	wait_queue_entry_t bookmark;
 
-	key.page = page;
+	key.page = &folio->page;
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
@@ -1163,7 +1163,7 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 	 * page waiters.
 	 */
 	if (!waitqueue_active(q) || !key.page_match) {
-		ClearPageWaiters(page);
+		folio_clear_waiters_flag(folio);
 		/*
 		 * It's possible to miss clearing Waiters here, when we woke
 		 * our page waiters, but the hashed waitqueue has waiters for
@@ -1179,7 +1179,7 @@ static void folio_wake(struct folio *folio, int bit)
 {
 	if (!folio_waiters(folio))
 		return;
-	wake_up_page_bit(&folio->page, bit);
+	folio_wake_bit(folio, bit);
 }
 
 /*
@@ -1444,7 +1444,7 @@ void folio_unlock(struct folio *folio)
 	BUILD_BUG_ON(PG_waiters != 7);
 	VM_BUG_ON_FOLIO(!folio_locked(folio), folio);
 	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio, 0)))
-		wake_up_page_bit(&folio->page, PG_locked);
+		folio_wake_bit(folio, PG_locked);
 }
 EXPORT_SYMBOL(folio_unlock);
 
@@ -1461,11 +1461,12 @@ EXPORT_SYMBOL(folio_unlock);
  */
 void end_page_private_2(struct page *page)
 {
-	page = compound_head(page);
-	VM_BUG_ON_PAGE(!PagePrivate2(page), page);
-	clear_bit_unlock(PG_private_2, &page->flags);
-	wake_up_page_bit(page, PG_private_2);
-	put_page(page);
+	struct folio *folio = page_folio(page);
+
+	VM_BUG_ON_FOLIO(!folio_private_2(folio), folio);
+	clear_bit_unlock(PG_private_2, folio_flags(folio, 0));
+	folio_wake_bit(folio, PG_private_2);
+	folio_put(folio);
 }
 EXPORT_SYMBOL(end_page_private_2);
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 37/96] mm/filemap: Convert page wait queues to be folios
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (35 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 36/96] mm/filemap: Add folio_wake_bit Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 38/96] mm/filemap: Add folio private_2 functions Matthew Wilcox (Oracle)
                   ` (58 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm
  Cc: Matthew Wilcox (Oracle), linux-kernel, Christoph Hellwig, Jeff Layton

Reinforce that page flags are actually in the head page by changing the
type from page to folio.  Increases the size of cachefiles by two bytes,
but the kernel core is unchanged in size.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/cachefiles/rdwr.c    | 16 ++++++++--------
 include/linux/pagemap.h |  8 ++++----
 mm/filemap.c            | 38 +++++++++++++++++++-------------------
 3 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 8ffc40e84a59..e211a3d5ba44 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -25,20 +25,20 @@ static int cachefiles_read_waiter(wait_queue_entry_t *wait, unsigned mode,
 	struct cachefiles_object *object;
 	struct fscache_retrieval *op = monitor->op;
 	struct wait_page_key *key = _key;
-	struct page *page = wait->private;
+	struct folio *folio = wait->private;
 
 	ASSERT(key);
 
 	_enter("{%lu},%u,%d,{%p,%u}",
 	       monitor->netfs_page->index, mode, sync,
-	       key->page, key->bit_nr);
+	       key->folio, key->bit_nr);
 
-	if (key->page != page || key->bit_nr != PG_locked)
+	if (key->folio != folio || key->bit_nr != PG_locked)
 		return 0;
 
-	_debug("--- monitor %p %lx ---", page, page->flags);
+	_debug("--- monitor %p %lx ---", folio, folio->flags);
 
-	if (!PageUptodate(page) && !PageError(page)) {
+	if (!folio_uptodate(folio) && !folio_error(folio)) {
 		/* unlocked, not uptodate and not erronous? */
 		_debug("page probably truncated");
 	}
@@ -107,7 +107,7 @@ static int cachefiles_read_reissue(struct cachefiles_object *object,
 	put_page(backpage2);
 
 	INIT_LIST_HEAD(&monitor->op_link);
-	add_page_wait_queue(backpage, &monitor->monitor);
+	folio_add_wait_queue(page_folio(backpage), &monitor->monitor);
 
 	if (trylock_page(backpage)) {
 		ret = -EIO;
@@ -294,7 +294,7 @@ static int cachefiles_read_backing_file_one(struct cachefiles_object *object,
 	get_page(backpage);
 	monitor->back_page = backpage;
 	monitor->monitor.private = backpage;
-	add_page_wait_queue(backpage, &monitor->monitor);
+	folio_add_wait_queue(page_folio(backpage), &monitor->monitor);
 	monitor = NULL;
 
 	/* but the page may have been read before the monitor was installed, so
@@ -548,7 +548,7 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 		get_page(backpage);
 		monitor->back_page = backpage;
 		monitor->monitor.private = backpage;
-		add_page_wait_queue(backpage, &monitor->monitor);
+		folio_add_wait_queue(page_folio(backpage), &monitor->monitor);
 		monitor = NULL;
 
 		/* but the page may have been read before the monitor was
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 6c067ce340b5..7e00cde24de7 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -690,13 +690,13 @@ static inline pgoff_t linear_page_index(struct vm_area_struct *vma,
 }
 
 struct wait_page_key {
-	struct page *page;
+	struct folio *folio;
 	int bit_nr;
 	int page_match;
 };
 
 struct wait_page_queue {
-	struct page *page;
+	struct folio *folio;
 	int bit_nr;
 	wait_queue_entry_t wait;
 };
@@ -704,7 +704,7 @@ struct wait_page_queue {
 static inline bool wake_page_match(struct wait_page_queue *wait_page,
 				  struct wait_page_key *key)
 {
-	if (wait_page->page != key->page)
+	if (wait_page->folio != key->folio)
 	       return false;
 	key->page_match = 1;
 
@@ -860,7 +860,7 @@ int wait_on_page_private_2_killable(struct page *page);
 /*
  * Add an arbitrary waiter to a page's wait queue
  */
-extern void add_page_wait_queue(struct page *page, wait_queue_entry_t *waiter);
+void folio_add_wait_queue(struct folio *folio, wait_queue_entry_t *waiter);
 
 /*
  * Fault everything in given userspace address range in.
diff --git a/mm/filemap.c b/mm/filemap.c
index f5cb6464fe37..c4a53cffffb0 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1019,11 +1019,11 @@ EXPORT_SYMBOL(__page_cache_alloc);
  */
 #define PAGE_WAIT_TABLE_BITS 8
 #define PAGE_WAIT_TABLE_SIZE (1 << PAGE_WAIT_TABLE_BITS)
-static wait_queue_head_t page_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
+static wait_queue_head_t folio_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
 
-static wait_queue_head_t *page_waitqueue(struct page *page)
+static wait_queue_head_t *folio_waitqueue(struct folio *folio)
 {
-	return &page_wait_table[hash_ptr(page, PAGE_WAIT_TABLE_BITS)];
+	return &folio_wait_table[hash_ptr(folio, PAGE_WAIT_TABLE_BITS)];
 }
 
 void __init pagecache_init(void)
@@ -1031,7 +1031,7 @@ void __init pagecache_init(void)
 	int i;
 
 	for (i = 0; i < PAGE_WAIT_TABLE_SIZE; i++)
-		init_waitqueue_head(&page_wait_table[i]);
+		init_waitqueue_head(&folio_wait_table[i]);
 
 	page_writeback_init();
 }
@@ -1086,10 +1086,10 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	 */
 	flags = wait->flags;
 	if (flags & WQ_FLAG_EXCLUSIVE) {
-		if (test_bit(key->bit_nr, &key->page->flags))
+		if (test_bit(key->bit_nr, &key->folio->flags))
 			return -1;
 		if (flags & WQ_FLAG_CUSTOM) {
-			if (test_and_set_bit(key->bit_nr, &key->page->flags))
+			if (test_and_set_bit(key->bit_nr, &key->folio->flags))
 				return -1;
 			flags |= WQ_FLAG_DONE;
 		}
@@ -1123,12 +1123,12 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 
 static void folio_wake_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	struct wait_page_key key;
 	unsigned long flags;
 	wait_queue_entry_t bookmark;
 
-	key.page = &folio->page;
+	key.folio = folio;
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
@@ -1220,7 +1220,7 @@ int sysctl_page_lock_unfairness = 5;
 static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 		int state, enum behavior behavior)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	int unfairness = sysctl_page_lock_unfairness;
 	struct wait_page_queue wait_page;
 	wait_queue_entry_t *wait = &wait_page.wait;
@@ -1240,7 +1240,7 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 
 	init_wait(wait);
 	wait->func = wake_page_function;
-	wait_page.page = &folio->page;
+	wait_page.folio = folio;
 	wait_page.bit_nr = bit_nr;
 
 repeat:
@@ -1389,23 +1389,23 @@ int put_and_wait_on_page_locked(struct page *page, int state)
 }
 
 /**
- * add_page_wait_queue - Add an arbitrary waiter to a page's wait queue
- * @page: Page defining the wait queue of interest
+ * folio_add_wait_queue - Add an arbitrary waiter to a folio's wait queue
+ * @folio: Folio defining the wait queue of interest
  * @waiter: Waiter to add to the queue
  *
- * Add an arbitrary @waiter to the wait queue for the nominated @page.
+ * Add an arbitrary @waiter to the wait queue for the nominated @folio.
  */
-void add_page_wait_queue(struct page *page, wait_queue_entry_t *waiter)
+void folio_add_wait_queue(struct folio *folio, wait_queue_entry_t *waiter)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	unsigned long flags;
 
 	spin_lock_irqsave(&q->lock, flags);
 	__add_wait_queue_entry_tail(q, waiter);
-	SetPageWaiters(page);
+	folio_set_waiters_flag(folio);
 	spin_unlock_irqrestore(&q->lock, flags);
 }
-EXPORT_SYMBOL_GPL(add_page_wait_queue);
+EXPORT_SYMBOL_GPL(folio_add_wait_queue);
 
 #ifndef clear_bit_unlock_is_negative_byte
 
@@ -1593,10 +1593,10 @@ EXPORT_SYMBOL_GPL(__folio_lock_killable);
 
 static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
 {
-	struct wait_queue_head *q = page_waitqueue(&folio->page);
+	struct wait_queue_head *q = folio_waitqueue(folio);
 	int ret = 0;
 
-	wait->page = &folio->page;
+	wait->folio = folio;
 	wait->bit_nr = PG_locked;
 
 	spin_lock_irq(&q->lock);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 38/96] mm/filemap: Add folio private_2 functions
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (36 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 37/96] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 39/96] fs/netfs: Add folio fscache functions Matthew Wilcox (Oracle)
                   ` (57 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

end_page_private_2() becomes folio_end_private_2(),
wait_on_page_private_2() becomes folio_wait_private_2() and
wait_on_page_private_2_killable() becomes folio_wait_private_2_killable().

Adjust the fscache equivalents to call page_folio() before calling these
functions to avoid adding wrappers.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/netfs.h   |  6 +++---
 include/linux/pagemap.h |  6 +++---
 mm/filemap.c            | 37 ++++++++++++++++---------------------
 3 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 9062adfa2fb9..fad8c6209edd 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -55,7 +55,7 @@ static inline void set_page_fscache(struct page *page)
  */
 static inline void end_page_fscache(struct page *page)
 {
-	end_page_private_2(page);
+	folio_end_private_2(page_folio(page));
 }
 
 /**
@@ -66,7 +66,7 @@ static inline void end_page_fscache(struct page *page)
  */
 static inline void wait_on_page_fscache(struct page *page)
 {
-	wait_on_page_private_2(page);
+	folio_wait_private_2(page_folio(page));
 }
 
 /**
@@ -82,7 +82,7 @@ static inline void wait_on_page_fscache(struct page *page)
  */
 static inline int wait_on_page_fscache_killable(struct page *page)
 {
-	return wait_on_page_private_2_killable(page);
+	return folio_wait_private_2_killable(page_folio(page));
 }
 
 enum netfs_read_source {
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 7e00cde24de7..f6a03fd68ac8 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -853,9 +853,9 @@ static inline void set_page_private_2(struct page *page)
 	SetPagePrivate2(page);
 }
 
-void end_page_private_2(struct page *page);
-void wait_on_page_private_2(struct page *page);
-int wait_on_page_private_2_killable(struct page *page);
+void folio_end_private_2(struct folio *folio);
+void folio_wait_private_2(struct folio *folio);
+int folio_wait_private_2_killable(struct folio *folio);
 
 /*
  * Add an arbitrary waiter to a page's wait queue
diff --git a/mm/filemap.c b/mm/filemap.c
index c4a53cffffb0..c77e0ba9098a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1449,56 +1449,51 @@ void folio_unlock(struct folio *folio)
 EXPORT_SYMBOL(folio_unlock);
 
 /**
- * end_page_private_2 - Clear PG_private_2 and release any waiters
- * @page: The page
+ * folio_end_private_2 - Clear PG_private_2 and wake any waiters.
+ * @folio: The folio.
  *
- * Clear the PG_private_2 bit on a page and wake up any sleepers waiting for
- * this.  The page ref held for PG_private_2 being set is released.
+ * Clear the PG_private_2 bit on a folio and wake up any sleepers waiting for
+ * it.  The page ref held for PG_private_2 being set is released.
  *
  * This is, for example, used when a netfs page is being written to a local
  * disk cache, thereby allowing writes to the cache for the same page to be
  * serialised.
  */
-void end_page_private_2(struct page *page)
+void folio_end_private_2(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
-
 	VM_BUG_ON_FOLIO(!folio_private_2(folio), folio);
 	clear_bit_unlock(PG_private_2, folio_flags(folio, 0));
 	folio_wake_bit(folio, PG_private_2);
 	folio_put(folio);
 }
-EXPORT_SYMBOL(end_page_private_2);
+EXPORT_SYMBOL(folio_end_private_2);
 
 /**
- * wait_on_page_private_2 - Wait for PG_private_2 to be cleared on a page
- * @page: The page to wait on
+ * folio_wait_private_2 - Wait for PG_private_2 to be cleared on a page.
+ * @folio: The folio to wait on.
  *
- * Wait for PG_private_2 (aka PG_fscache) to be cleared on a page.
+ * Wait for PG_private_2 (aka PG_fscache) to be cleared on a folio.
  */
-void wait_on_page_private_2(struct page *page)
+void folio_wait_private_2(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
-
 	while (folio_private_2(folio))
 		folio_wait_bit(folio, PG_private_2);
 }
-EXPORT_SYMBOL(wait_on_page_private_2);
+EXPORT_SYMBOL(folio_wait_private_2);
 
 /**
- * wait_on_page_private_2_killable - Wait for PG_private_2 to be cleared on a page
- * @page: The page to wait on
+ * folio_wait_private_2_killable - Wait for PG_private_2 to be cleared on a folio.
+ * @folio: The folio to wait on.
  *
- * Wait for PG_private_2 (aka PG_fscache) to be cleared on a page or until a
+ * Wait for PG_private_2 (aka PG_fscache) to be cleared on a folio or until a
  * fatal signal is received by the calling task.
  *
  * Return:
  * - 0 if successful.
  * - -EINTR if a fatal signal was encountered.
  */
-int wait_on_page_private_2_killable(struct page *page)
+int folio_wait_private_2_killable(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	int ret = 0;
 
 	while (folio_private_2(folio)) {
@@ -1509,7 +1504,7 @@ int wait_on_page_private_2_killable(struct page *page)
 
 	return ret;
 }
-EXPORT_SYMBOL(wait_on_page_private_2_killable);
+EXPORT_SYMBOL(folio_wait_private_2_killable);
 
 /**
  * folio_end_writeback - End writeback against a folio.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 39/96] fs/netfs: Add folio fscache functions
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (37 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 38/96] mm/filemap: Add folio private_2 functions Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 40/96] mm: Add folio_mapped Matthew Wilcox (Oracle)
                   ` (56 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Match the page writeback functions by adding
folio_start_fscache(), folio_end_fscache(), folio_wait_fscache() and
folio_wait_fscache_killable().  Also rewrite the kernel-doc to describe
when to use the function rather than what the function does, and include
the kernel-doc in the appropriate rst file.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 Documentation/filesystems/netfs_library.rst |  2 +
 include/linux/netfs.h                       | 75 +++++++++++++--------
 2 files changed, 50 insertions(+), 27 deletions(-)

diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
index 57a641847818..bb68d39f03b7 100644
--- a/Documentation/filesystems/netfs_library.rst
+++ b/Documentation/filesystems/netfs_library.rst
@@ -524,3 +524,5 @@ Note that these methods are passed a pointer to the cache resource structure,
 not the read request structure as they could be used in other situations where
 there isn't a read request structure as well, such as writing dirty data to the
 cache.
+
+.. kernel-doc:: include/linux/netfs.h
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index fad8c6209edd..b0bbd343fc98 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -22,6 +22,7 @@
  * Overload PG_private_2 to give us PG_fscache - this is used to indicate that
  * a page is currently backed by a local disk cache
  */
+#define folio_fscache(folio)		folio_private_2(folio)
 #define PageFsCache(page)		PagePrivate2((page))
 #define SetPageFsCache(page)		SetPagePrivate2((page))
 #define ClearPageFsCache(page)		ClearPagePrivate2((page))
@@ -29,57 +30,77 @@
 #define TestClearPageFsCache(page)	TestClearPagePrivate2((page))
 
 /**
- * set_page_fscache - Set PG_fscache on a page and take a ref
- * @page: The page.
+ * folio_start_fscache - Start an fscache operation on a folio.
+ * @folio: The folio.
  *
- * Set the PG_fscache (PG_private_2) flag on a page and take the reference
- * needed for the VM to handle its lifetime correctly.  This sets the flag and
- * takes the reference unconditionally, so care must be taken not to set the
- * flag again if it's already set.
+ * Call this function before an fscache operation starts on a folio.
+ * Starting a second fscache operation before the first one finishes is
+ * not allowed.
  */
-static inline void set_page_fscache(struct page *page)
+static inline void folio_start_fscache(struct folio *folio)
 {
-	set_page_private_2(page);
+	VM_BUG_ON_FOLIO(folio_private_2(folio), folio);
+	folio_get(folio);
+	folio_set_private_2_flag(folio);
 }
 
 /**
- * end_page_fscache - Clear PG_fscache and release any waiters
- * @page: The page
- *
- * Clear the PG_fscache (PG_private_2) bit on a page and wake up any sleepers
- * waiting for this.  The page ref held for PG_private_2 being set is released.
+ * folio_end_fscache - End an fscache operation on a folio.
+ * @folio: The folio.
  *
- * This is, for example, used when a netfs page is being written to a local
- * disk cache, thereby allowing writes to the cache for the same page to be
- * serialised.
+ * Call this function after an fscache operation has finished.  This will
+ * wake any sleepers waiting on this folio.
  */
-static inline void end_page_fscache(struct page *page)
+static inline void folio_end_fscache(struct folio *folio)
 {
-	folio_end_private_2(page_folio(page));
+	folio_end_private_2(folio);
 }
 
 /**
- * wait_on_page_fscache - Wait for PG_fscache to be cleared on a page
- * @page: The page to wait on
+ * folio_wait_fscache - Wait for an fscache operation on this folio to end.
+ * @folio: The folio.
  *
- * Wait for PG_fscache (aka PG_private_2) to be cleared on a page.
+ * If an fscache operation is in progress on this folio, wait for it to
+ * finish.  Another fscache operation may start after this one finishes,
+ * unless the caller holds the folio lock.
  */
-static inline void wait_on_page_fscache(struct page *page)
+static inline void folio_wait_fscache(struct folio *folio)
 {
-	folio_wait_private_2(page_folio(page));
+	folio_wait_private_2(folio);
 }
 
 /**
- * wait_on_page_fscache_killable - Wait for PG_fscache to be cleared on a page
- * @page: The page to wait on
+ * folio_wait_fscache_killable - Wait for an fscache operation on this folio to end.
+ * @folio: The folio.
  *
- * Wait for PG_fscache (aka PG_private_2) to be cleared on a page or until a
- * fatal signal is received by the calling task.
+ * If an fscache operation is in progress on this folio, wait for it to
+ * finish or for a fatal signal to be received.  Another fscache operation
+ * may start after this one finishes, unless the caller holds the folio lock.
  *
  * Return:
  * - 0 if successful.
  * - -EINTR if a fatal signal was encountered.
  */
+static inline int folio_wait_fscache_killable(struct folio *folio)
+{
+	return folio_wait_private_2_killable(folio);
+}
+
+static inline void set_page_fscache(struct page *page)
+{
+	folio_start_fscache(page_folio(page));
+}
+
+static inline void end_page_fscache(struct page *page)
+{
+	folio_end_private_2(page_folio(page));
+}
+
+static inline void wait_on_page_fscache(struct page *page)
+{
+	folio_wait_private_2(page_folio(page));
+}
+
 static inline int wait_on_page_fscache_killable(struct page *page)
 {
 	return folio_wait_private_2_killable(page_folio(page));
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 40/96] mm: Add folio_mapped
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (38 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 39/96] fs/netfs: Add folio fscache functions Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 41/96] mm/workingset: Convert workingset_activation to take a folio Matthew Wilcox (Oracle)
                   ` (55 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This function is the equivalent of page_mapped().  It is slightly
shorter as we do not need to handle the PageTail() case.  Reimplement
page_mapped() as a wrapper around folio_mapped().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h |  1 +
 mm/folio-compat.c  |  6 ++++++
 mm/util.c          | 30 +++++++++++++++++-------------
 3 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bca3e2518e5e..6bfc43309e4b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1779,6 +1779,7 @@ static inline pgoff_t page_index(struct page *page)
 }
 
 bool page_mapped(struct page *page);
+bool folio_mapped(struct folio *folio);
 
 /*
  * Return true only if the page has been allocated with
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 3c83f03b80d7..7044fcc8a8aa 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -35,3 +35,9 @@ void wait_for_stable_page(struct page *page)
 	return folio_wait_stable(page_folio(page));
 }
 EXPORT_SYMBOL_GPL(wait_for_stable_page);
+
+bool page_mapped(struct page *page)
+{
+	return folio_mapped(page_folio(page));
+}
+EXPORT_SYMBOL(page_mapped);
diff --git a/mm/util.c b/mm/util.c
index afd99591cb81..e60b0920f9a7 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -652,28 +652,32 @@ void *page_rmapping(struct page *page)
 	return __page_rmapping(page);
 }
 
-/*
- * Return true if this page is mapped into pagetables.
- * For compound page it returns true if any subpage of compound page is mapped.
+/**
+ * folio_mapped - Is this folio mapped into userspace?
+ * @folio: The folio.
+ *
+ * Return: true if any page in this folio is mapped into pagetables.
  */
-bool page_mapped(struct page *page)
+bool folio_mapped(struct folio *folio)
 {
-	int i;
+	int i, nr;
 
-	if (likely(!PageCompound(page)))
-		return atomic_read(&page->_mapcount) >= 0;
-	page = compound_head(page);
-	if (atomic_read(compound_mapcount_ptr(page)) >= 0)
+	if (folio_single(folio))
+		return atomic_read(&folio->_mapcount) >= 0;
+	if (atomic_read(compound_mapcount_ptr(&folio->page)) >= 0)
 		return true;
-	if (PageHuge(page))
+	if (folio_hugetlb(folio))
 		return false;
-	for (i = 0; i < compound_nr(page); i++) {
-		if (atomic_read(&page[i]._mapcount) >= 0)
+
+	nr = folio_nr_pages(folio);
+	for (i = 0; i < nr; i++) {
+		struct page *page = nth_page(&folio->page, i);
+		if (atomic_read(&page->_mapcount) >= 0)
 			return true;
 	}
 	return false;
 }
-EXPORT_SYMBOL(page_mapped);
+EXPORT_SYMBOL(folio_mapped);
 
 struct anon_vma *page_anon_vma(struct page *page)
 {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 41/96] mm/workingset: Convert workingset_activation to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (39 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 40/96] mm: Add folio_mapped Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 42/96] mm/swap: Add folio_activate Matthew Wilcox (Oracle)
                   ` (54 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This function already assumed it was being passed a head page.  No real
change here, except that thp_nr_pages() compiles away on kernels with
THP compiled out while folio_nr_pages() is always present.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  2 +-
 mm/swap.c            |  2 +-
 mm/workingset.c      | 10 +++++-----
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 76b2338ef24d..8e0118b25bdc 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -324,7 +324,7 @@ static inline swp_entry_t folio_swap_entry(struct folio *folio)
 void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
 void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
 void workingset_refault(struct page *page, void *shadow);
-void workingset_activation(struct page *page);
+void workingset_activation(struct folio *folio);
 
 /* Only track the nodes of mappings with shadow entries */
 void workingset_update_node(struct xa_node *node);
diff --git a/mm/swap.c b/mm/swap.c
index 6caca11cd2ec..828889349b0b 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -445,7 +445,7 @@ void mark_page_accessed(struct page *page)
 		else
 			__lru_cache_activate_page(page);
 		ClearPageReferenced(page);
-		workingset_activation(page);
+		workingset_activation(page_folio(page));
 	}
 	if (page_is_idle(page))
 		clear_page_idle(page);
diff --git a/mm/workingset.c b/mm/workingset.c
index b7cdeca5a76d..d969403f2b2a 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -390,9 +390,9 @@ void workingset_refault(struct page *page, void *shadow)
 
 /**
  * workingset_activation - note a page activation
- * @page: page that is being activated
+ * @folio: Folio that is being activated.
  */
-void workingset_activation(struct page *page)
+void workingset_activation(struct folio *folio)
 {
 	struct mem_cgroup *memcg;
 	struct lruvec *lruvec;
@@ -405,11 +405,11 @@ void workingset_activation(struct page *page)
 	 * XXX: See workingset_refault() - this should return
 	 * root_mem_cgroup even for !CONFIG_MEMCG.
 	 */
-	memcg = page_memcg_rcu(page);
+	memcg = page_memcg_rcu(&folio->page);
 	if (!mem_cgroup_disabled() && !memcg)
 		goto out;
-	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
-	workingset_age_nonresident(lruvec, thp_nr_pages(page));
+	lruvec = mem_cgroup_folio_lruvec(folio, folio_pgdat(folio));
+	workingset_age_nonresident(lruvec, folio_nr_pages(folio));
 out:
 	rcu_read_unlock();
 }
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 42/96] mm/swap: Add folio_activate
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (40 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 41/96] mm/workingset: Convert workingset_activation to take a folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 43/96] mm/swap: Add folio_mark_accessed Matthew Wilcox (Oracle)
                   ` (53 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This ends up being inlined into mark_page_accessed(), saving 70 bytes
of text.  It eliminates a lot of calls to compound_head(), but we still
have to pass a page to __activate_page() because pagevec_lru_move_fn()
takes a struct page.  That should be cleaned up eventually.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/swap.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 828889349b0b..07244999bcc6 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -347,16 +347,16 @@ static bool need_activate_page_drain(int cpu)
 	return pagevec_count(&per_cpu(lru_pvecs.activate_page, cpu)) != 0;
 }
 
-static void activate_page(struct page *page)
+static void folio_activate(struct folio *folio)
 {
-	page = compound_head(page);
-	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
+	if (folio_lru(folio) && !folio_active(folio) &&
+	    !folio_unevictable(folio)) {
 		struct pagevec *pvec;
 
 		local_lock(&lru_pvecs.lock);
 		pvec = this_cpu_ptr(&lru_pvecs.activate_page);
-		get_page(page);
-		if (pagevec_add_and_need_flush(pvec, page))
+		folio_get(folio);
+		if (pagevec_add_and_need_flush(pvec, &folio->page))
 			pagevec_lru_move_fn(pvec, __activate_page);
 		local_unlock(&lru_pvecs.lock);
 	}
@@ -367,16 +367,15 @@ static inline void activate_page_drain(int cpu)
 {
 }
 
-static void activate_page(struct page *page)
+static void folio_activate(struct folio *folio)
 {
 	struct lruvec *lruvec;
 
-	page = compound_head(page);
-	if (TestClearPageLRU(page)) {
-		lruvec = lock_page_lruvec_irq(page);
-		__activate_page(page, lruvec);
+	if (folio_test_clear_lru_flag(folio)) {
+		lruvec = folio_lock_lruvec_irq(folio);
+		__activate_page(&folio->page, lruvec);
 		unlock_page_lruvec_irq(lruvec);
-		SetPageLRU(page);
+		folio_set_lru_flag(folio);
 	}
 }
 #endif
@@ -441,7 +440,7 @@ void mark_page_accessed(struct page *page)
 		 * LRU on the next drain.
 		 */
 		if (PageLRU(page))
-			activate_page(page);
+			folio_activate(page_folio(page));
 		else
 			__lru_cache_activate_page(page);
 		ClearPageReferenced(page);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 43/96] mm/swap: Add folio_mark_accessed
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (41 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 42/96] mm/swap: Add folio_activate Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 44/96] mm/rmap: Add folio_mkclean Matthew Wilcox (Oracle)
                   ` (52 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Convert mark_page_accessed() to folio_mark_accessed().  It already
operated on the entire compound page, but now we can avoid calling
compound_head quite so many times.  Shrinks the function from 424 bytes
to 295 bytes (shrinking by 129 bytes).  The compatibility wrapper is 30
bytes, plus the 8 bytes for the exported symbol means the kernel shrinks
by 91 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  3 ++-
 mm/folio-compat.c    |  7 +++++++
 mm/swap.c            | 34 ++++++++++++++++------------------
 3 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 8e0118b25bdc..d1cb67cdb476 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -346,7 +346,8 @@ extern void lru_note_cost(struct lruvec *lruvec, bool file,
 			  unsigned int nr_pages);
 extern void lru_note_cost_page(struct page *);
 extern void lru_cache_add(struct page *);
-extern void mark_page_accessed(struct page *);
+void mark_page_accessed(struct page *);
+void folio_mark_accessed(struct folio *);
 
 extern atomic_t lru_disable_count;
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 7044fcc8a8aa..a374747ae1c6 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -5,6 +5,7 @@
  */
 
 #include <linux/pagemap.h>
+#include <linux/swap.h>
 
 struct address_space *page_mapping(struct page *page)
 {
@@ -41,3 +42,9 @@ bool page_mapped(struct page *page)
 	return folio_mapped(page_folio(page));
 }
 EXPORT_SYMBOL(page_mapped);
+
+void mark_page_accessed(struct page *page)
+{
+	folio_mark_accessed(page_folio(page));
+}
+EXPORT_SYMBOL(mark_page_accessed);
diff --git a/mm/swap.c b/mm/swap.c
index 07244999bcc6..8e7f92be2f6f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -380,7 +380,7 @@ static void folio_activate(struct folio *folio)
 }
 #endif
 
-static void __lru_cache_activate_page(struct page *page)
+static void __lru_cache_activate_folio(struct folio *folio)
 {
 	struct pagevec *pvec;
 	int i;
@@ -401,8 +401,8 @@ static void __lru_cache_activate_page(struct page *page)
 	for (i = pagevec_count(pvec) - 1; i >= 0; i--) {
 		struct page *pagevec_page = pvec->pages[i];
 
-		if (pagevec_page == page) {
-			SetPageActive(page);
+		if (pagevec_page == &folio->page) {
+			folio_set_active_flag(folio);
 			break;
 		}
 	}
@@ -420,36 +420,34 @@ static void __lru_cache_activate_page(struct page *page)
  * When a newly allocated page is not yet visible, so safe for non-atomic ops,
  * __SetPageReferenced(page) may be substituted for mark_page_accessed(page).
  */
-void mark_page_accessed(struct page *page)
+void folio_mark_accessed(struct folio *folio)
 {
-	page = compound_head(page);
-
-	if (!PageReferenced(page)) {
-		SetPageReferenced(page);
-	} else if (PageUnevictable(page)) {
+	if (!folio_referenced(folio)) {
+		folio_set_referenced_flag(folio);
+	} else if (folio_unevictable(folio)) {
 		/*
 		 * Unevictable pages are on the "LRU_UNEVICTABLE" list. But,
 		 * this list is never rotated or maintained, so marking an
 		 * evictable page accessed has no effect.
 		 */
-	} else if (!PageActive(page)) {
+	} else if (!folio_active(folio)) {
 		/*
 		 * If the page is on the LRU, queue it for activation via
 		 * lru_pvecs.activate_page. Otherwise, assume the page is on a
 		 * pagevec, mark it active and it'll be moved to the active
 		 * LRU on the next drain.
 		 */
-		if (PageLRU(page))
-			folio_activate(page_folio(page));
+		if (folio_lru(folio))
+			folio_activate(folio);
 		else
-			__lru_cache_activate_page(page);
-		ClearPageReferenced(page);
-		workingset_activation(page_folio(page));
+			__lru_cache_activate_folio(folio);
+		folio_clear_referenced_flag(folio);
+		workingset_activation(folio);
 	}
-	if (page_is_idle(page))
-		clear_page_idle(page);
+	if (folio_idle(folio))
+		folio_clear_idle_flag(folio);
 }
-EXPORT_SYMBOL(mark_page_accessed);
+EXPORT_SYMBOL(folio_mark_accessed);
 
 /**
  * lru_cache_add - add a page to a page list
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 44/96] mm/rmap: Add folio_mkclean
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (42 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 43/96] mm/swap: Add folio_mark_accessed Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 45/96] mm: Add kmap_local_folio Matthew Wilcox (Oracle)
                   ` (51 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Transform page_mkclean() into folio_mkclean() and add a page_mkclean()
wrapper around folio_mkclean().

folio_mkclean is 15 bytes smaller than page_mkclean, but the kernel
is enlarged by 33 bytes due to inlining page_folio() into each caller.
This will go away once the callers are converted to use folio_mkclean().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/rmap.h | 10 ++++++----
 mm/rmap.c            | 12 ++++++------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index def5c62c93b3..edb006bc4159 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -233,7 +233,7 @@ unsigned long page_address_in_vma(struct page *, struct vm_area_struct *);
  *
  * returns the number of cleaned PTEs.
  */
-int page_mkclean(struct page *);
+int folio_mkclean(struct folio *);
 
 /*
  * called in munlock()/munmap() path to check for other vmas holding
@@ -291,12 +291,14 @@ static inline int page_referenced(struct page *page, int is_locked,
 
 #define try_to_unmap(page, refs) false
 
-static inline int page_mkclean(struct page *page)
+static inline int folio_mkclean(struct folio *folio)
 {
 	return 0;
 }
-
-
 #endif	/* CONFIG_MMU */
 
+static inline int page_mkclean(struct page *page)
+{
+	return folio_mkclean(page_folio(page));
+}
 #endif	/* _LINUX_RMAP_H */
diff --git a/mm/rmap.c b/mm/rmap.c
index 693a610e181d..e29dbbc880d7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -983,7 +983,7 @@ static bool invalid_mkclean_vma(struct vm_area_struct *vma, void *arg)
 	return true;
 }
 
-int page_mkclean(struct page *page)
+int folio_mkclean(struct folio *folio)
 {
 	int cleaned = 0;
 	struct address_space *mapping;
@@ -993,20 +993,20 @@ int page_mkclean(struct page *page)
 		.invalid_vma = invalid_mkclean_vma,
 	};
 
-	BUG_ON(!PageLocked(page));
+	BUG_ON(!folio_locked(folio));
 
-	if (!page_mapped(page))
+	if (!folio_mapped(folio))
 		return 0;
 
-	mapping = page_mapping(page);
+	mapping = folio_mapping(folio);
 	if (!mapping)
 		return 0;
 
-	rmap_walk(page, &rwc);
+	rmap_walk(&folio->page, &rwc);
 
 	return cleaned;
 }
-EXPORT_SYMBOL_GPL(page_mkclean);
+EXPORT_SYMBOL_GPL(folio_mkclean);
 
 /**
  * page_move_anon_rmap - move a page to our anon_vma
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 45/96] mm: Add kmap_local_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (43 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 44/96] mm/rmap: Add folio_mkclean Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 46/96] mm: Add flush_dcache_folio Matthew Wilcox (Oracle)
                   ` (50 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This allows us to map a portion of a folio.  Callers can only expect
to access up to the next page boundary.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/highmem-internal.h | 11 +++++++++
 include/linux/highmem.h          | 38 ++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h
index 7902c7d8b55f..39cdcfdbc23a 100644
--- a/include/linux/highmem-internal.h
+++ b/include/linux/highmem-internal.h
@@ -73,6 +73,12 @@ static inline void *kmap_local_page(struct page *page)
 	return __kmap_local_page_prot(page, kmap_prot);
 }
 
+static inline void *kmap_local_folio(struct folio *folio, size_t offset)
+{
+	struct page *page = nth_page(&folio->page, offset / PAGE_SIZE);
+	return __kmap_local_page_prot(page, kmap_prot) + offset % PAGE_SIZE;
+}
+
 static inline void *kmap_local_page_prot(struct page *page, pgprot_t prot)
 {
 	return __kmap_local_page_prot(page, prot);
@@ -160,6 +166,11 @@ static inline void *kmap_local_page(struct page *page)
 	return page_address(page);
 }
 
+static inline void *kmap_local_folio(struct folio *folio, size_t offset)
+{
+	return page_address(&folio->page) + offset;
+}
+
 static inline void *kmap_local_page_prot(struct page *page, pgprot_t prot)
 {
 	return kmap_local_page(page);
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index b06eeeaee975..b01deb45d265 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -96,6 +96,44 @@ static inline void kmap_flush_unused(void);
  */
 static inline void *kmap_local_page(struct page *page);
 
+/**
+ * kmap_local_folio - Map a page in this folio for temporary usage
+ * @folio:	The folio to be mapped.
+ * @offset:	The byte offset within the folio.
+ *
+ * Returns: The virtual address of the mapping
+ *
+ * Can be invoked from any context.
+ *
+ * Requires careful handling when nesting multiple mappings because the map
+ * management is stack based. The unmap has to be in the reverse order of
+ * the map operation:
+ *
+ * addr1 = kmap_local_folio(page1, offset1);
+ * addr2 = kmap_local_folio(page2, offset2);
+ * ...
+ * kunmap_local(addr2);
+ * kunmap_local(addr1);
+ *
+ * Unmapping addr1 before addr2 is invalid and causes malfunction.
+ *
+ * Contrary to kmap() mappings the mapping is only valid in the context of
+ * the caller and cannot be handed to other contexts.
+ *
+ * On CONFIG_HIGHMEM=n kernels and for low memory pages this returns the
+ * virtual address of the direct mapping. Only real highmem pages are
+ * temporarily mapped.
+ *
+ * While it is significantly faster than kmap() for the higmem case it
+ * comes with restrictions about the pointer validity. Only use when really
+ * necessary.
+ *
+ * On HIGHMEM enabled systems mapping a highmem page has the side effect of
+ * disabling migration in order to keep the virtual address stable across
+ * preemption. No caller of kmap_local_folio() can rely on this side effect.
+ */
+static inline void *kmap_local_folio(struct folio *folio, size_t offset);
+
 /**
  * kmap_atomic - Atomically map a page for temporary usage - Deprecated!
  * @page:	Pointer to the page to be mapped
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 46/96] mm: Add flush_dcache_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (44 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 45/96] mm: Add kmap_local_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 23:35   ` kernel test robot
  2021-05-05 15:05 ` [PATCH v9 47/96] mm: Add arch_make_folio_accessible Matthew Wilcox (Oracle)
                   ` (49 subsequent siblings)
  95 siblings, 1 reply; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is a default implementation which calls flush_dcache_page() on
each page in the folio.  If architectures can do better, they should
implement their own version of it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 Documentation/core-api/cachetlb.rst |  6 ++++++
 include/asm-generic/cacheflush.h    | 14 ++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/Documentation/core-api/cachetlb.rst b/Documentation/core-api/cachetlb.rst
index fe4290e26729..29682f69a915 100644
--- a/Documentation/core-api/cachetlb.rst
+++ b/Documentation/core-api/cachetlb.rst
@@ -325,6 +325,12 @@ maps this page at its virtual address.
 			dirty.  Again, see sparc64 for examples of how
 			to deal with this.
 
+  ``void flush_dcache_folio(struct folio *folio)``
+	This function is called under the same circumstances as
+	flush_dcache_page().  It allows the architecture to
+	optimise for flushing the entire folio of pages instead
+	of flushing one page at a time.
+
   ``void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
   unsigned long user_vaddr, void *dst, void *src, int len)``
   ``void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
diff --git a/include/asm-generic/cacheflush.h b/include/asm-generic/cacheflush.h
index 4a674db4e1fa..0155357b840f 100644
--- a/include/asm-generic/cacheflush.h
+++ b/include/asm-generic/cacheflush.h
@@ -49,9 +49,23 @@ static inline void flush_cache_page(struct vm_area_struct *vma,
 static inline void flush_dcache_page(struct page *page)
 {
 }
+
+static inline void flush_dcache_folio(struct folio *folio) { }
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
+#define ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
 #endif
 
+#ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
+static inline void flush_dcache_folio(struct folio *folio)
+{
+	unsigned int n = folio_nr_pages(folio);
+
+	do {
+		n--;
+		flush_dcache_page(nth_page(&folio->page, n));
+	} while (n);
+}
+#endif
 
 #ifndef flush_dcache_mmap_lock
 static inline void flush_dcache_mmap_lock(struct address_space *mapping)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 47/96] mm: Add arch_make_folio_accessible
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (45 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 46/96] mm: Add flush_dcache_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 48/96] mm/memcg: Remove 'page' parameter to mem_cgroup_charge_statistics Matthew Wilcox (Oracle)
                   ` (48 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

As a default implementation, call arch_make_page_accessible n times.
If an architecture can do better, it can override this.

Also move the default implementation of arch_make_page_accessible()
from gfp.h to mm.h.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/gfp.h |  6 ------
 include/linux/mm.h  | 21 +++++++++++++++++++++
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 11da8af06704..a503d928e684 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -508,12 +508,6 @@ static inline void arch_free_page(struct page *page, int order) { }
 #ifndef HAVE_ARCH_ALLOC_PAGE
 static inline void arch_alloc_page(struct page *page, int order) { }
 #endif
-#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
-static inline int arch_make_page_accessible(struct page *page)
-{
-	return 0;
-}
-#endif
 
 struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 		nodemask_t *nodemask);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6bfc43309e4b..75279db82040 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1725,6 +1725,27 @@ static inline size_t folio_size(struct folio *folio)
 	return PAGE_SIZE << folio_order(folio);
 }
 
+#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
+static inline int arch_make_page_accessible(struct page *page)
+{
+	return 0;
+}
+#endif
+
+#ifndef HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE
+static inline int arch_make_folio_accessible(struct folio *folio)
+{
+	int ret, i;
+	for (i = 0; i < folio_nr_pages(folio); i++) {
+		ret = arch_make_page_accessible(folio_page(folio, i));
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+#endif
+
 /*
  * Some inline functions in vmstat.h depend on page_zone()
  */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 48/96] mm/memcg: Remove 'page' parameter to mem_cgroup_charge_statistics
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (46 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 47/96] mm: Add arch_make_folio_accessible Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 49/96] mm/memcg: Use the node id in mem_cgroup_update_tree Matthew Wilcox (Oracle)
                   ` (47 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The last use of 'page' was removed by commit 468c398233da ("mm:
memcontrol: switch to native NR_ANON_THPS counter"), so we can now remove
the parameter from the function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memcontrol.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 64ada9e650a5..1204c6a0c671 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -814,7 +814,6 @@ static unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
 }
 
 static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
-					 struct page *page,
 					 int nr_pages)
 {
 	/* pagein of a big page is an event. So, ignore page size */
@@ -5504,9 +5503,9 @@ static int mem_cgroup_move_account(struct page *page,
 	ret = 0;
 
 	local_irq_disable();
-	mem_cgroup_charge_statistics(to, page, nr_pages);
+	mem_cgroup_charge_statistics(to, nr_pages);
 	memcg_check_events(to, page);
-	mem_cgroup_charge_statistics(from, page, -nr_pages);
+	mem_cgroup_charge_statistics(from, -nr_pages);
 	memcg_check_events(from, page);
 	local_irq_enable();
 out_unlock:
@@ -6527,7 +6526,7 @@ static int __mem_cgroup_charge(struct page *page, struct mem_cgroup *memcg,
 	commit_charge(page, memcg);
 
 	local_irq_disable();
-	mem_cgroup_charge_statistics(memcg, page, nr_pages);
+	mem_cgroup_charge_statistics(memcg, nr_pages);
 	memcg_check_events(memcg, page);
 	local_irq_enable();
 out:
@@ -6814,7 +6813,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
 	commit_charge(newpage, memcg);
 
 	local_irq_save(flags);
-	mem_cgroup_charge_statistics(memcg, newpage, nr_pages);
+	mem_cgroup_charge_statistics(memcg, nr_pages);
 	memcg_check_events(memcg, newpage);
 	local_irq_restore(flags);
 }
@@ -7044,7 +7043,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	 * only synchronisation we have for updating the per-CPU variables.
 	 */
 	VM_BUG_ON(!irqs_disabled());
-	mem_cgroup_charge_statistics(memcg, page, -nr_entries);
+	mem_cgroup_charge_statistics(memcg, -nr_entries);
 	memcg_check_events(memcg, page);
 
 	css_put(&memcg->css);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 49/96] mm/memcg: Use the node id in mem_cgroup_update_tree
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (47 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 48/96] mm/memcg: Remove 'page' parameter to mem_cgroup_charge_statistics Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 50/96] mm/memcg: Convert commit_charge to take a folio Matthew Wilcox (Oracle)
                   ` (46 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Hoist the page_to_nid() call from mem_cgroup_page_nodeinfo() into
mem_cgroup_update_tree().  That lets us call soft_limit_tree_node()
and delete soft_limit_tree_from_page() altogether.  Saves 42
bytes of kernel text on my config.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memcontrol.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1204c6a0c671..7423cb11eb88 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -453,10 +453,8 @@ ino_t page_cgroup_ino(struct page *page)
 }
 
 static struct mem_cgroup_per_node *
-mem_cgroup_page_nodeinfo(struct mem_cgroup *memcg, struct page *page)
+mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
 {
-	int nid = page_to_nid(page);
-
 	return memcg->nodeinfo[nid];
 }
 
@@ -466,14 +464,6 @@ soft_limit_tree_node(int nid)
 	return soft_limit_tree.rb_tree_per_node[nid];
 }
 
-static struct mem_cgroup_tree_per_node *
-soft_limit_tree_from_page(struct page *page)
-{
-	int nid = page_to_nid(page);
-
-	return soft_limit_tree.rb_tree_per_node[nid];
-}
-
 static void __mem_cgroup_insert_exceeded(struct mem_cgroup_per_node *mz,
 					 struct mem_cgroup_tree_per_node *mctz,
 					 unsigned long new_usage_in_excess)
@@ -549,8 +539,9 @@ static void mem_cgroup_update_tree(struct mem_cgroup *memcg, struct page *page)
 	unsigned long excess;
 	struct mem_cgroup_per_node *mz;
 	struct mem_cgroup_tree_per_node *mctz;
+	int nid = page_to_nid(page);
 
-	mctz = soft_limit_tree_from_page(page);
+	mctz = soft_limit_tree_node(nid);
 	if (!mctz)
 		return;
 	/*
@@ -558,7 +549,7 @@ static void mem_cgroup_update_tree(struct mem_cgroup *memcg, struct page *page)
 	 * because their event counter is not touched.
 	 */
 	for (; memcg; memcg = parent_mem_cgroup(memcg)) {
-		mz = mem_cgroup_page_nodeinfo(memcg, page);
+		mz = mem_cgroup_nodeinfo(memcg, nid);
 		excess = soft_limit_excess(memcg);
 		/*
 		 * We have to update the tree if mz is on RB-tree or
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 50/96] mm/memcg: Convert commit_charge to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (48 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 49/96] mm/memcg: Use the node id in mem_cgroup_update_tree Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 51/96] mm/memcg: Add folio_charge_cgroup Matthew Wilcox (Oracle)
                   ` (45 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The memcg_data is only set on the head page, so enforce that by
typing it as a folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memcontrol.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7423cb11eb88..5493e094e9e9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2700,9 +2700,9 @@ static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
 }
 #endif
 
-static void commit_charge(struct page *page, struct mem_cgroup *memcg)
+static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
 {
-	VM_BUG_ON_PAGE(page_memcg(page), page);
+	VM_BUG_ON_FOLIO(folio_memcg(folio), folio);
 	/*
 	 * Any of the following ensures page's memcg stability:
 	 *
@@ -2711,7 +2711,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg)
 	 * - lock_page_memcg()
 	 * - exclusive reference
 	 */
-	page->memcg_data = (unsigned long)memcg;
+	folio->memcg_data = (unsigned long)memcg;
 }
 
 static struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *objcg)
@@ -6506,7 +6506,8 @@ void mem_cgroup_calculate_protection(struct mem_cgroup *root,
 static int __mem_cgroup_charge(struct page *page, struct mem_cgroup *memcg,
 			       gfp_t gfp)
 {
-	unsigned int nr_pages = thp_nr_pages(page);
+	struct folio *folio = page_folio(page);
+	unsigned int nr_pages = folio_nr_pages(folio);
 	int ret;
 
 	ret = try_charge(memcg, gfp, nr_pages);
@@ -6514,7 +6515,7 @@ static int __mem_cgroup_charge(struct page *page, struct mem_cgroup *memcg,
 		goto out;
 
 	css_get(&memcg->css);
-	commit_charge(page, memcg);
+	commit_charge(folio, memcg);
 
 	local_irq_disable();
 	mem_cgroup_charge_statistics(memcg, nr_pages);
@@ -6771,21 +6772,22 @@ void mem_cgroup_uncharge_list(struct list_head *page_list)
  */
 void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
 {
+	struct folio *newfolio = page_folio(newpage);
 	struct mem_cgroup *memcg;
-	unsigned int nr_pages;
+	unsigned int nr_pages = folio_nr_pages(newfolio);
 	unsigned long flags;
 
 	VM_BUG_ON_PAGE(!PageLocked(oldpage), oldpage);
-	VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
-	VM_BUG_ON_PAGE(PageAnon(oldpage) != PageAnon(newpage), newpage);
-	VM_BUG_ON_PAGE(PageTransHuge(oldpage) != PageTransHuge(newpage),
-		       newpage);
+	VM_BUG_ON_FOLIO(!folio_locked(newfolio), newfolio);
+	VM_BUG_ON_FOLIO(PageAnon(oldpage) != folio_anon(newfolio), newfolio);
+	VM_BUG_ON_FOLIO(compound_nr(oldpage) != folio_nr_pages(newfolio),
+		       newfolio);
 
 	if (mem_cgroup_disabled())
 		return;
 
 	/* Page cache replacement: new page already charged? */
-	if (page_memcg(newpage))
+	if (folio_memcg(newfolio))
 		return;
 
 	memcg = page_memcg(oldpage);
@@ -6794,14 +6796,12 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
 		return;
 
 	/* Force-charge the new page. The old one will be freed soon */
-	nr_pages = thp_nr_pages(newpage);
-
 	page_counter_charge(&memcg->memory, nr_pages);
 	if (do_memsw_account())
 		page_counter_charge(&memcg->memsw, nr_pages);
 
 	css_get(&memcg->css);
-	commit_charge(newpage, memcg);
+	commit_charge(newfolio, memcg);
 
 	local_irq_save(flags);
 	mem_cgroup_charge_statistics(memcg, nr_pages);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 51/96] mm/memcg: Add folio_charge_cgroup
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (49 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 50/96] mm/memcg: Convert commit_charge to take a folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 52/96] mm/memcg: Add folio_uncharge_cgroup Matthew Wilcox (Oracle)
                   ` (44 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

mem_cgroup_charge() already assumed it was being passed a non-tail
page (and looking at the callers, that's true; it's called for freshly
allocated pages).  The only real change here is that folio_nr_pages()
doesn't compile away like thp_nr_pages() does as folio support
is not conditional on transparent hugepage support.  Reimplement
mem_cgroup_charge() as a wrapper around folio_charge_cgroup().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/memcontrol.h |  8 ++++++++
 mm/folio-compat.c          |  7 +++++++
 mm/memcontrol.c            | 26 +++++++++++++-------------
 3 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index b45b505be0ec..d0798d54f637 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -699,6 +699,8 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
 		page_counter_read(&memcg->memory);
 }
 
+int folio_charge_cgroup(struct folio *, struct mm_struct *, gfp_t);
+
 int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask);
 int mem_cgroup_swapin_charge_page(struct page *page, struct mm_struct *mm,
 				  gfp_t gfp, swp_entry_t entry);
@@ -1201,6 +1203,12 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
 	return false;
 }
 
+static inline int folio_charge_cgroup(struct folio *folio,
+		struct mm_struct *mm, gfp_t gfp)
+{
+	return 0;
+}
+
 static inline int mem_cgroup_charge(struct page *page, struct mm_struct *mm,
 				    gfp_t gfp_mask)
 {
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index a374747ae1c6..1d71b8b587f8 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -48,3 +48,10 @@ void mark_page_accessed(struct page *page)
 	folio_mark_accessed(page_folio(page));
 }
 EXPORT_SYMBOL(mark_page_accessed);
+
+#ifdef CONFIG_MEMCG
+int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp)
+{
+	return folio_charge_cgroup(page_folio(page), mm, gfp);
+}
+#endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5493e094e9e9..529b10e72d5a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6503,10 +6503,9 @@ void mem_cgroup_calculate_protection(struct mem_cgroup *root,
 			atomic_long_read(&parent->memory.children_low_usage)));
 }
 
-static int __mem_cgroup_charge(struct page *page, struct mem_cgroup *memcg,
+static int __mem_cgroup_charge(struct folio *folio, struct mem_cgroup *memcg,
 			       gfp_t gfp)
 {
-	struct folio *folio = page_folio(page);
 	unsigned int nr_pages = folio_nr_pages(folio);
 	int ret;
 
@@ -6519,26 +6518,26 @@ static int __mem_cgroup_charge(struct page *page, struct mem_cgroup *memcg,
 
 	local_irq_disable();
 	mem_cgroup_charge_statistics(memcg, nr_pages);
-	memcg_check_events(memcg, page);
+	memcg_check_events(memcg, &folio->page);
 	local_irq_enable();
 out:
 	return ret;
 }
 
 /**
- * mem_cgroup_charge - charge a newly allocated page to a cgroup
- * @page: page to charge
- * @mm: mm context of the victim
- * @gfp_mask: reclaim mode
+ * folio_charge_cgroup - Charge a newly allocated folio to a cgroup.
+ * @folio: Folio to charge.
+ * @mm: mm context of the allocating task.
+ * @gfp: reclaim mode
  *
- * Try to charge @page to the memcg that @mm belongs to, reclaiming
- * pages according to @gfp_mask if necessary.
+ * Try to charge @folio to the memcg that @mm belongs to, reclaiming
+ * pages according to @gfp if necessary.
  *
- * Do not use this for pages allocated for swapin.
+ * Do not use this for folios allocated for swapin.
  *
  * Returns 0 on success. Otherwise, an error code is returned.
  */
-int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask)
+int folio_charge_cgroup(struct folio *folio, struct mm_struct *mm, gfp_t gfp)
 {
 	struct mem_cgroup *memcg;
 	int ret;
@@ -6547,7 +6546,7 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask)
 		return 0;
 
 	memcg = get_mem_cgroup_from_mm(mm);
-	ret = __mem_cgroup_charge(page, memcg, gfp_mask);
+	ret = __mem_cgroup_charge(folio, memcg, gfp);
 	css_put(&memcg->css);
 
 	return ret;
@@ -6568,6 +6567,7 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask)
 int mem_cgroup_swapin_charge_page(struct page *page, struct mm_struct *mm,
 				  gfp_t gfp, swp_entry_t entry)
 {
+	struct folio *folio = page_folio(page);
 	struct mem_cgroup *memcg;
 	unsigned short id;
 	int ret;
@@ -6582,7 +6582,7 @@ int mem_cgroup_swapin_charge_page(struct page *page, struct mm_struct *mm,
 		memcg = get_mem_cgroup_from_mm(mm);
 	rcu_read_unlock();
 
-	ret = __mem_cgroup_charge(page, memcg, gfp);
+	ret = __mem_cgroup_charge(folio, memcg, gfp);
 
 	css_put(&memcg->css);
 	return ret;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 52/96] mm/memcg: Add folio_uncharge_cgroup
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (50 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 51/96] mm/memcg: Add folio_charge_cgroup Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 20:24   ` kernel test robot
  2021-05-05 15:05 ` [PATCH v9 53/96] mm/memcg: Convert mem_cgroup_track_foreign_dirty_slowpath to folio Matthew Wilcox (Oracle)
                   ` (43 subsequent siblings)
  95 siblings, 1 reply; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Reimplement mem_cgroup_uncharge() as a wrapper around
folio_uncharge_cgroup().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/memcontrol.h |  5 +++++
 mm/folio-compat.c          |  5 +++++
 mm/memcontrol.c            | 14 +++++++-------
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d0798d54f637..2e68c9848432 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -700,6 +700,7 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
 }
 
 int folio_charge_cgroup(struct folio *, struct mm_struct *, gfp_t);
+void folio_uncharge_cgroup(struct folio *);
 
 int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask);
 int mem_cgroup_swapin_charge_page(struct page *page, struct mm_struct *mm,
@@ -1209,6 +1210,10 @@ static inline int folio_charge_cgroup(struct folio *folio,
 	return 0;
 }
 
+static inline void folio_uncharge_cgroup(struct folio *)
+{
+}
+
 static inline int mem_cgroup_charge(struct page *page, struct mm_struct *mm,
 				    gfp_t gfp_mask)
 {
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 1d71b8b587f8..d229b979b00d 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -54,4 +54,9 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp)
 {
 	return folio_charge_cgroup(page_folio(page), mm, gfp);
 }
+
+void mem_cgroup_uncharge(struct page *page)
+{
+	folio_uncharge_cgroup(page_folio(page));
+}
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 529b10e72d5a..4ca2661cf891 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6717,24 +6717,24 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
 }
 
 /**
- * mem_cgroup_uncharge - uncharge a page
- * @page: page to uncharge
+ * folio_uncharge_cgroup - Uncharge a folio.
+ * @folio: Folio to uncharge.
  *
- * Uncharge a page previously charged with mem_cgroup_charge().
+ * Uncharge a folio previously charged with folio_charge_cgroup().
  */
-void mem_cgroup_uncharge(struct page *page)
+void folio_uncharge_cgroup(struct folio *folio)
 {
 	struct uncharge_gather ug;
 
 	if (mem_cgroup_disabled())
 		return;
 
-	/* Don't touch page->lru of any random page, pre-check: */
-	if (!page_memcg(page))
+	/* Don't touch folio->lru of any random page, pre-check: */
+	if (!folio_memcg(folio))
 		return;
 
 	uncharge_gather_clear(&ug);
-	uncharge_page(page, &ug);
+	uncharge_page(&folio->page, &ug);
 	uncharge_batch(&ug);
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 53/96] mm/memcg: Convert mem_cgroup_track_foreign_dirty_slowpath to folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (51 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 52/96] mm/memcg: Add folio_uncharge_cgroup Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 54/96] mm/writeback: Rename __add_wb_stat to wb_stat_mod Matthew Wilcox (Oracle)
                   ` (42 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The page was only being used for the memcg and to gather trace
information, so this is a simple conversion.  The only caller of
mem_cgroup_track_foreign_dirty() will be converted to folios in a later
patch, so doing this now makes that patch simpler.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/memcontrol.h       | 7 ++++---
 include/trace/events/writeback.h | 8 ++++----
 mm/memcontrol.c                  | 6 +++---
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 2e68c9848432..c77561311199 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1622,17 +1622,18 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
 			 unsigned long *pheadroom, unsigned long *pdirty,
 			 unsigned long *pwriteback);
 
-void mem_cgroup_track_foreign_dirty_slowpath(struct page *page,
+void mem_cgroup_track_foreign_dirty_slowpath(struct folio *folio,
 					     struct bdi_writeback *wb);
 
 static inline void mem_cgroup_track_foreign_dirty(struct page *page,
 						  struct bdi_writeback *wb)
 {
+	struct folio *folio = page_folio(page);
 	if (mem_cgroup_disabled())
 		return;
 
-	if (unlikely(&page_memcg(page)->css != wb->memcg_css))
-		mem_cgroup_track_foreign_dirty_slowpath(page, wb);
+	if (unlikely(&folio_memcg(folio)->css != wb->memcg_css))
+		mem_cgroup_track_foreign_dirty_slowpath(folio, wb);
 }
 
 void mem_cgroup_flush_foreign(struct bdi_writeback *wb);
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 1efa463c4979..80b24801bbf7 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -235,9 +235,9 @@ TRACE_EVENT(inode_switch_wbs,
 
 TRACE_EVENT(track_foreign_dirty,
 
-	TP_PROTO(struct page *page, struct bdi_writeback *wb),
+	TP_PROTO(struct folio *folio, struct bdi_writeback *wb),
 
-	TP_ARGS(page, wb),
+	TP_ARGS(folio, wb),
 
 	TP_STRUCT__entry(
 		__array(char,		name, 32)
@@ -249,7 +249,7 @@ TRACE_EVENT(track_foreign_dirty,
 	),
 
 	TP_fast_assign(
-		struct address_space *mapping = page_mapping(page);
+		struct address_space *mapping = folio_mapping(folio);
 		struct inode *inode = mapping ? mapping->host : NULL;
 
 		strscpy_pad(__entry->name, bdi_dev_name(wb->bdi), 32);
@@ -257,7 +257,7 @@ TRACE_EVENT(track_foreign_dirty,
 		__entry->ino		= inode ? inode->i_ino : 0;
 		__entry->memcg_id	= wb->memcg_css->id;
 		__entry->cgroup_ino	= __trace_wb_assign_cgroup(wb);
-		__entry->page_cgroup_ino = cgroup_ino(page_memcg(page)->css.cgroup);
+		__entry->page_cgroup_ino = cgroup_ino(folio_memcg(folio)->css.cgroup);
 	),
 
 	TP_printk("bdi %s[%llu]: ino=%lu memcg_id=%u cgroup_ino=%lu page_cgroup_ino=%lu",
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4ca2661cf891..2bf6accfc1e6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4394,17 +4394,17 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
  * As being wrong occasionally doesn't matter, updates and accesses to the
  * records are lockless and racy.
  */
-void mem_cgroup_track_foreign_dirty_slowpath(struct page *page,
+void mem_cgroup_track_foreign_dirty_slowpath(struct folio *folio,
 					     struct bdi_writeback *wb)
 {
-	struct mem_cgroup *memcg = page_memcg(page);
+	struct mem_cgroup *memcg = folio_memcg(folio);
 	struct memcg_cgwb_frn *frn;
 	u64 now = get_jiffies_64();
 	u64 oldest_at = now;
 	int oldest = -1;
 	int i;
 
-	trace_track_foreign_dirty(page, wb);
+	trace_track_foreign_dirty(folio, wb);
 
 	/*
 	 * Pick the slot to use.  If there is already a slot for @wb, keep
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 54/96] mm/writeback: Rename __add_wb_stat to wb_stat_mod
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (52 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 53/96] mm/memcg: Convert mem_cgroup_track_foreign_dirty_slowpath to folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 55/96] flex_proportions: Allow N events instead of 1 Matthew Wilcox (Oracle)
                   ` (41 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Make this look like the newly renamed vmstat functions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/backing-dev.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 44df4fcef65c..a852876bb6e2 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -64,7 +64,7 @@ static inline bool bdi_has_dirty_io(struct backing_dev_info *bdi)
 	return atomic_long_read(&bdi->tot_write_bandwidth);
 }
 
-static inline void __add_wb_stat(struct bdi_writeback *wb,
+static inline void wb_stat_mod(struct bdi_writeback *wb,
 				 enum wb_stat_item item, s64 amount)
 {
 	percpu_counter_add_batch(&wb->stat[item], amount, WB_STAT_BATCH);
@@ -72,12 +72,12 @@ static inline void __add_wb_stat(struct bdi_writeback *wb,
 
 static inline void inc_wb_stat(struct bdi_writeback *wb, enum wb_stat_item item)
 {
-	__add_wb_stat(wb, item, 1);
+	wb_stat_mod(wb, item, 1);
 }
 
 static inline void dec_wb_stat(struct bdi_writeback *wb, enum wb_stat_item item)
 {
-	__add_wb_stat(wb, item, -1);
+	wb_stat_mod(wb, item, -1);
 }
 
 static inline s64 wb_stat(struct bdi_writeback *wb, enum wb_stat_item item)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 55/96] flex_proportions: Allow N events instead of 1
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (53 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 54/96] mm/writeback: Rename __add_wb_stat to wb_stat_mod Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 56/96] mm/writeback: Change __wb_writeout_inc to __wb_writeout_add Matthew Wilcox (Oracle)
                   ` (40 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

When batching events (such as writing back N pages in a single I/O), it
is better to do one flex_proportion operation instead of N.  There is
only one caller of __fprop_inc_percpu_max(), and it's the one we're
going to change in the next patch, so rename it instead of adding a
compatibility wrapper.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/flex_proportions.h |  9 +++++----
 lib/flex_proportions.c           | 28 +++++++++++++++++++---------
 mm/page-writeback.c              |  4 ++--
 3 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/include/linux/flex_proportions.h b/include/linux/flex_proportions.h
index c12df59d3f5f..3e378b1fb0bc 100644
--- a/include/linux/flex_proportions.h
+++ b/include/linux/flex_proportions.h
@@ -83,9 +83,10 @@ struct fprop_local_percpu {
 
 int fprop_local_init_percpu(struct fprop_local_percpu *pl, gfp_t gfp);
 void fprop_local_destroy_percpu(struct fprop_local_percpu *pl);
-void __fprop_inc_percpu(struct fprop_global *p, struct fprop_local_percpu *pl);
-void __fprop_inc_percpu_max(struct fprop_global *p, struct fprop_local_percpu *pl,
-			    int max_frac);
+void __fprop_add_percpu(struct fprop_global *p, struct fprop_local_percpu *pl,
+		long nr);
+void __fprop_add_percpu_max(struct fprop_global *p,
+		struct fprop_local_percpu *pl, int max_frac, long nr);
 void fprop_fraction_percpu(struct fprop_global *p,
 	struct fprop_local_percpu *pl, unsigned long *numerator,
 	unsigned long *denominator);
@@ -96,7 +97,7 @@ void fprop_inc_percpu(struct fprop_global *p, struct fprop_local_percpu *pl)
 	unsigned long flags;
 
 	local_irq_save(flags);
-	__fprop_inc_percpu(p, pl);
+	__fprop_add_percpu(p, pl, 1);
 	local_irq_restore(flags);
 }
 
diff --git a/lib/flex_proportions.c b/lib/flex_proportions.c
index 451543937524..53e7eb1dd76c 100644
--- a/lib/flex_proportions.c
+++ b/lib/flex_proportions.c
@@ -217,11 +217,12 @@ static void fprop_reflect_period_percpu(struct fprop_global *p,
 }
 
 /* Event of type pl happened */
-void __fprop_inc_percpu(struct fprop_global *p, struct fprop_local_percpu *pl)
+void __fprop_add_percpu(struct fprop_global *p, struct fprop_local_percpu *pl,
+		long nr)
 {
 	fprop_reflect_period_percpu(p, pl);
-	percpu_counter_add_batch(&pl->events, 1, PROP_BATCH);
-	percpu_counter_add(&p->events, 1);
+	percpu_counter_add_batch(&pl->events, nr, PROP_BATCH);
+	percpu_counter_add(&p->events, nr);
 }
 
 void fprop_fraction_percpu(struct fprop_global *p,
@@ -253,20 +254,29 @@ void fprop_fraction_percpu(struct fprop_global *p,
 }
 
 /*
- * Like __fprop_inc_percpu() except that event is counted only if the given
+ * Like __fprop_add_percpu() except that event is counted only if the given
  * type has fraction smaller than @max_frac/FPROP_FRAC_BASE
  */
-void __fprop_inc_percpu_max(struct fprop_global *p,
-			    struct fprop_local_percpu *pl, int max_frac)
+void __fprop_add_percpu_max(struct fprop_global *p,
+		struct fprop_local_percpu *pl, int max_frac, long nr)
 {
 	if (unlikely(max_frac < FPROP_FRAC_BASE)) {
 		unsigned long numerator, denominator;
+		s64 tmp;
 
 		fprop_fraction_percpu(p, pl, &numerator, &denominator);
-		if (numerator >
-		    (((u64)denominator) * max_frac) >> FPROP_FRAC_SHIFT)
+		/* Adding 'nr' to fraction exceeds max_frac/FPROP_FRAC_BASE? */
+		tmp = (u64)denominator * max_frac -
+					((u64)numerator << FPROP_FRAC_SHIFT);
+		if (tmp < 0) {
+			/* Maximum fraction already exceeded? */
 			return;
+		} else if (tmp < nr * (FPROP_FRAC_BASE - max_frac)) {
+			/* Add just enough for the fraction to saturate */
+			nr = div_u64(tmp + FPROP_FRAC_BASE - max_frac - 1,
+					FPROP_FRAC_BASE - max_frac);
+		}
 	}
 
-	__fprop_inc_percpu(p, pl);
+	__fprop_add_percpu(p, pl, nr);
 }
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 157ed52e4b7d..764e92bd9014 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -572,8 +572,8 @@ static void wb_domain_writeout_inc(struct wb_domain *dom,
 				   struct fprop_local_percpu *completions,
 				   unsigned int max_prop_frac)
 {
-	__fprop_inc_percpu_max(&dom->completions, completions,
-			       max_prop_frac);
+	__fprop_add_percpu_max(&dom->completions, completions,
+			       max_prop_frac, 1);
 	/* First event after period switching was turned off? */
 	if (unlikely(!dom->period_time)) {
 		/*
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 56/96] mm/writeback: Change __wb_writeout_inc to __wb_writeout_add
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (54 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 55/96] flex_proportions: Allow N events instead of 1 Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 57/96] mm/writeback: Convert test_clear_page_writeback to __folio_end_writeback Matthew Wilcox (Oracle)
                   ` (39 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Allow for accounting N pages at once instead of one page at a time.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page-writeback.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 764e92bd9014..98efb3fc6466 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -568,12 +568,12 @@ static unsigned long wp_next_time(unsigned long cur_time)
 	return cur_time;
 }
 
-static void wb_domain_writeout_inc(struct wb_domain *dom,
+static void wb_domain_writeout_add(struct wb_domain *dom,
 				   struct fprop_local_percpu *completions,
-				   unsigned int max_prop_frac)
+				   unsigned int max_prop_frac, long nr)
 {
 	__fprop_add_percpu_max(&dom->completions, completions,
-			       max_prop_frac, 1);
+			       max_prop_frac, nr);
 	/* First event after period switching was turned off? */
 	if (unlikely(!dom->period_time)) {
 		/*
@@ -591,18 +591,18 @@ static void wb_domain_writeout_inc(struct wb_domain *dom,
  * Increment @wb's writeout completion count and the global writeout
  * completion count. Called from test_clear_page_writeback().
  */
-static inline void __wb_writeout_inc(struct bdi_writeback *wb)
+static inline void __wb_writeout_add(struct bdi_writeback *wb, long nr)
 {
 	struct wb_domain *cgdom;
 
-	inc_wb_stat(wb, WB_WRITTEN);
-	wb_domain_writeout_inc(&global_wb_domain, &wb->completions,
-			       wb->bdi->max_prop_frac);
+	wb_stat_mod(wb, WB_WRITTEN, nr);
+	wb_domain_writeout_add(&global_wb_domain, &wb->completions,
+			       wb->bdi->max_prop_frac, nr);
 
 	cgdom = mem_cgroup_wb_domain(wb);
 	if (cgdom)
-		wb_domain_writeout_inc(cgdom, wb_memcg_completions(wb),
-				       wb->bdi->max_prop_frac);
+		wb_domain_writeout_add(cgdom, wb_memcg_completions(wb),
+				       wb->bdi->max_prop_frac, nr);
 }
 
 void wb_writeout_inc(struct bdi_writeback *wb)
@@ -610,7 +610,7 @@ void wb_writeout_inc(struct bdi_writeback *wb)
 	unsigned long flags;
 
 	local_irq_save(flags);
-	__wb_writeout_inc(wb);
+	__wb_writeout_add(wb, 1);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(wb_writeout_inc);
@@ -2739,7 +2739,7 @@ int test_clear_page_writeback(struct page *page)
 				struct bdi_writeback *wb = inode_to_wb(inode);
 
 				dec_wb_stat(wb, WB_WRITEBACK);
-				__wb_writeout_inc(wb);
+				__wb_writeout_add(wb, 1);
 			}
 		}
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 57/96] mm/writeback: Convert test_clear_page_writeback to __folio_end_writeback
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (55 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 56/96] mm/writeback: Change __wb_writeout_inc to __wb_writeout_add Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 58/96] mm/writeback: Add folio_start_writeback Matthew Wilcox (Oracle)
                   ` (38 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

test_clear_page_writeback() is actually an mm-internal function, although
it's named as if it's a pagecache function.  Move it to mm/internal.h,
rename it to __folio_end_writeback() and change the return type to bool.

The conversion from page to folio is mostly about accounting the number
of pages being written back, although it does eliminate a couple of
calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/page-flags.h |  1 -
 mm/filemap.c               |  2 +-
 mm/internal.h              |  1 +
 mm/page-writeback.c        | 29 +++++++++++++++--------------
 4 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index ef8b7c6dc91c..a2e203b9f677 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -645,7 +645,6 @@ static __always_inline void SetPageUptodate(struct page *page)
 
 CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
 
-int test_clear_page_writeback(struct page *page);
 int __test_set_page_writeback(struct page *page, bool keep_write);
 
 #define test_set_page_writeback(page)			\
diff --git a/mm/filemap.c b/mm/filemap.c
index c77e0ba9098a..e6aa49e32255 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1531,7 +1531,7 @@ void folio_end_writeback(struct folio *folio)
 	 * reused before the folio_wake().
 	 */
 	folio_get(folio);
-	if (!test_clear_page_writeback(&folio->page))
+	if (!__folio_end_writeback(folio))
 		BUG();
 
 	smp_mb__after_atomic();
diff --git a/mm/internal.h b/mm/internal.h
index 68d363a3a1f3..91c607b5c1af 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -36,6 +36,7 @@ void page_writeback_init(void);
 
 vm_fault_t do_swap_page(struct vm_fault *vmf);
 void folio_rotate_reclaimable(struct folio *folio);
+bool __folio_end_writeback(struct folio *folio);
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 		unsigned long floor, unsigned long ceiling);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 98efb3fc6466..9b8f39d124e7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -589,7 +589,7 @@ static void wb_domain_writeout_add(struct wb_domain *dom,
 
 /*
  * Increment @wb's writeout completion count and the global writeout
- * completion count. Called from test_clear_page_writeback().
+ * completion count. Called from __folio_end_writeback().
  */
 static inline void __wb_writeout_add(struct bdi_writeback *wb, long nr)
 {
@@ -2719,27 +2719,28 @@ int clear_page_dirty_for_io(struct page *page)
 }
 EXPORT_SYMBOL(clear_page_dirty_for_io);
 
-int test_clear_page_writeback(struct page *page)
+bool __folio_end_writeback(struct folio *folio)
 {
-	struct address_space *mapping = page_mapping(page);
-	int ret;
+	long nr = folio_nr_pages(folio);
+	struct address_space *mapping = folio_mapping(folio);
+	bool ret;
 
-	lock_page_memcg(page);
+	lock_folio_memcg(folio);
 	if (mapping && mapping_use_writeback_tags(mapping)) {
 		struct inode *inode = mapping->host;
 		struct backing_dev_info *bdi = inode_to_bdi(inode);
 		unsigned long flags;
 
 		xa_lock_irqsave(&mapping->i_pages, flags);
-		ret = TestClearPageWriteback(page);
+		ret = folio_test_clear_writeback_flag(folio);
 		if (ret) {
-			__xa_clear_mark(&mapping->i_pages, page_index(page),
+			__xa_clear_mark(&mapping->i_pages, folio_index(folio),
 						PAGECACHE_TAG_WRITEBACK);
 			if (bdi->capabilities & BDI_CAP_WRITEBACK_ACCT) {
 				struct bdi_writeback *wb = inode_to_wb(inode);
 
-				dec_wb_stat(wb, WB_WRITEBACK);
-				__wb_writeout_add(wb, 1);
+				wb_stat_mod(wb, WB_WRITEBACK, -nr);
+				__wb_writeout_add(wb, nr);
 			}
 		}
 
@@ -2749,14 +2750,14 @@ int test_clear_page_writeback(struct page *page)
 
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 	} else {
-		ret = TestClearPageWriteback(page);
+		ret = folio_test_clear_writeback_flag(folio);
 	}
 	if (ret) {
-		dec_lruvec_page_state(page, NR_WRITEBACK);
-		dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-		inc_node_page_state(page, NR_WRITTEN);
+		lruvec_stat_mod_folio(folio, NR_WRITEBACK, -nr);
+		zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, -nr);
+		node_stat_mod_folio(folio, NR_WRITTEN, nr);
 	}
-	unlock_page_memcg(page);
+	unlock_folio_memcg(folio);
 	return ret;
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 58/96] mm/writeback: Add folio_start_writeback
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (56 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 57/96] mm/writeback: Convert test_clear_page_writeback to __folio_end_writeback Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 59/96] mm/writeback: Add folio_mark_dirty Matthew Wilcox (Oracle)
                   ` (37 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Rename set_page_writeback() to folio_start_writeback() to match
folio_end_writeback().  Do not bother with wrappers that return void;
callers are perfectly capable of ignoring return values.

Add wrappers for set_page_writeback(), set_page_writeback_keepwrite() and
test_set_page_writeback() for compatibililty with existing filesystems.
The main advantage of this patch is getting the statistics right,
although it does eliminate a couple of calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/page-flags.h | 19 +++++++++++-------
 mm/page-writeback.c        | 40 ++++++++++++++++++++------------------
 2 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index a2e203b9f677..b685ef9b41a3 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -645,21 +645,26 @@ static __always_inline void SetPageUptodate(struct page *page)
 
 CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
 
-int __test_set_page_writeback(struct page *page, bool keep_write);
+bool __folio_start_writeback(struct folio *folio, bool keep_write);
 
-#define test_set_page_writeback(page)			\
-	__test_set_page_writeback(page, false)
-#define test_set_page_writeback_keepwrite(page)	\
-	__test_set_page_writeback(page, true)
+#define folio_start_writeback(folio)			\
+	__folio_start_writeback(folio, false)
+#define folio_start_writeback_keepwrite(folio)	\
+	__folio_start_writeback(folio, true)
 
 static inline void set_page_writeback(struct page *page)
 {
-	test_set_page_writeback(page);
+	folio_start_writeback(page_folio(page));
 }
 
 static inline void set_page_writeback_keepwrite(struct page *page)
 {
-	test_set_page_writeback_keepwrite(page);
+	folio_start_writeback_keepwrite(page_folio(page));
+}
+
+static inline bool test_set_page_writeback(struct page *page)
+{
+	return folio_start_writeback(page_folio(page));
 }
 
 __PAGEFLAG(Head, head, PF_ANY) CLEARPAGEFLAG(Head, head, PF_ANY)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 9b8f39d124e7..4d36f4d3037f 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2761,21 +2761,23 @@ bool __folio_end_writeback(struct folio *folio)
 	return ret;
 }
 
-int __test_set_page_writeback(struct page *page, bool keep_write)
+bool __folio_start_writeback(struct folio *folio, bool keep_write)
 {
-	struct address_space *mapping = page_mapping(page);
-	int ret, access_ret;
+	long nr = folio_nr_pages(folio);
+	struct address_space *mapping = folio_mapping(folio);
+	bool ret;
+	int access_ret;
 
-	lock_page_memcg(page);
+	lock_folio_memcg(folio);
 	if (mapping && mapping_use_writeback_tags(mapping)) {
-		XA_STATE(xas, &mapping->i_pages, page_index(page));
+		XA_STATE(xas, &mapping->i_pages, folio_index(folio));
 		struct inode *inode = mapping->host;
 		struct backing_dev_info *bdi = inode_to_bdi(inode);
 		unsigned long flags;
 
 		xas_lock_irqsave(&xas, flags);
 		xas_load(&xas);
-		ret = TestSetPageWriteback(page);
+		ret = folio_test_set_writeback_flag(folio);
 		if (!ret) {
 			bool on_wblist;
 
@@ -2784,40 +2786,40 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 
 			xas_set_mark(&xas, PAGECACHE_TAG_WRITEBACK);
 			if (bdi->capabilities & BDI_CAP_WRITEBACK_ACCT)
-				inc_wb_stat(inode_to_wb(inode), WB_WRITEBACK);
+				wb_stat_mod(inode_to_wb(inode), WB_WRITEBACK,
+						nr);
 
 			/*
-			 * We can come through here when swapping anonymous
-			 * pages, so we don't necessarily have an inode to track
-			 * for sync.
+			 * We can come through here when swapping
+			 * anonymous folios, so we don't necessarily
+			 * have an inode to track for sync.
 			 */
 			if (mapping->host && !on_wblist)
 				sb_mark_inode_writeback(mapping->host);
 		}
-		if (!PageDirty(page))
+		if (!folio_dirty(folio))
 			xas_clear_mark(&xas, PAGECACHE_TAG_DIRTY);
 		if (!keep_write)
 			xas_clear_mark(&xas, PAGECACHE_TAG_TOWRITE);
 		xas_unlock_irqrestore(&xas, flags);
 	} else {
-		ret = TestSetPageWriteback(page);
+		ret = folio_test_set_writeback_flag(folio);
 	}
 	if (!ret) {
-		inc_lruvec_page_state(page, NR_WRITEBACK);
-		inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+		lruvec_stat_mod_folio(folio, NR_WRITEBACK, nr);
+		zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, nr);
 	}
-	unlock_page_memcg(page);
-	access_ret = arch_make_page_accessible(page);
+	unlock_folio_memcg(folio);
+	access_ret = arch_make_folio_accessible(folio);
 	/*
 	 * If writeback has been triggered on a page that cannot be made
 	 * accessible, it is too late to recover here.
 	 */
-	VM_BUG_ON_PAGE(access_ret != 0, page);
+	VM_BUG_ON_FOLIO(access_ret != 0, folio);
 
 	return ret;
-
 }
-EXPORT_SYMBOL(__test_set_page_writeback);
+EXPORT_SYMBOL(__folio_start_writeback);
 
 /**
  * folio_wait_writeback - Wait for a folio to finish writeback.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 59/96] mm/writeback: Add folio_mark_dirty
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (57 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 58/96] mm/writeback: Add folio_start_writeback Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 60/96] mm/writeback: Use __set_page_dirty in __set_page_dirty_nobuffers Matthew Wilcox (Oracle)
                   ` (36 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Reimplement set_page_dirty() as a wrapper around folio_mark_dirty().
There is no change to filesystems as they were already being called
with the compound_head of the page being marked dirty.  We avoid
several calls to compound_head(), both statically (through
using folio_dirty() instead of PageDirty() and dynamically by
calling folio_mapping() instead of page_mapping().

Also return bool instead of int to show the range of values actually
returned, and add kernel-doc.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h  |  3 ++-
 mm/folio-compat.c   |  6 ++++++
 mm/page-writeback.c | 28 +++++++++++++++-------------
 3 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 75279db82040..c72cecbfe00d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1998,7 +1998,8 @@ int redirty_page_for_writepage(struct writeback_control *wbc,
 void account_page_dirtied(struct page *page, struct address_space *mapping);
 void account_page_cleaned(struct page *page, struct address_space *mapping,
 			  struct bdi_writeback *wb);
-int set_page_dirty(struct page *page);
+bool folio_mark_dirty(struct folio *folio);
+bool set_page_dirty(struct page *page);
 int set_page_dirty_lock(struct page *page);
 void __cancel_dirty_page(struct page *page);
 static inline void cancel_dirty_page(struct page *page)
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index d229b979b00d..a504ecf1d695 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -60,3 +60,9 @@ void mem_cgroup_uncharge(struct page *page)
 	folio_uncharge_cgroup(page_folio(page));
 }
 #endif
+
+bool set_page_dirty(struct page *page)
+{
+	return folio_mark_dirty(page_folio(page));
+}
+EXPORT_SYMBOL(set_page_dirty);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 4d36f4d3037f..d810841ed03a 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2543,8 +2543,9 @@ int redirty_page_for_writepage(struct writeback_control *wbc, struct page *page)
 }
 EXPORT_SYMBOL(redirty_page_for_writepage);
 
-/*
- * Dirty a page.
+/**
+ * folio_mark_dirty - Mark a folio as being modified.
+ * @folio: The folio.
  *
  * For pages with a mapping this should be done under the page lock
  * for the benefit of asynchronous memory errors who prefer a consistent
@@ -2553,12 +2554,13 @@ EXPORT_SYMBOL(redirty_page_for_writepage);
  *
  * If the mapping doesn't provide a set_page_dirty a_op, then
  * just fall through and assume that it wants buffer_heads.
+ *
+ * Return: True if the folio was newly dirtied, false if it was already dirty.
  */
-int set_page_dirty(struct page *page)
+bool folio_mark_dirty(struct folio *folio)
 {
-	struct address_space *mapping = page_mapping(page);
+	struct address_space *mapping = folio_mapping(folio);
 
-	page = compound_head(page);
 	if (likely(mapping)) {
 		int (*spd)(struct page *) = mapping->a_ops->set_page_dirty;
 		/*
@@ -2571,21 +2573,21 @@ int set_page_dirty(struct page *page)
 		 * it will confuse readahead and make it restart the size rampup
 		 * process. But it's a trivial problem.
 		 */
-		if (PageReclaim(page))
-			ClearPageReclaim(page);
+		if (folio_reclaim(folio))
+			folio_clear_reclaim_flag(folio);
 #ifdef CONFIG_BLOCK
 		if (!spd)
 			spd = __set_page_dirty_buffers;
 #endif
-		return (*spd)(page);
+		return (*spd)(&folio->page);
 	}
-	if (!PageDirty(page)) {
-		if (!TestSetPageDirty(page))
-			return 1;
+	if (!folio_dirty(folio)) {
+		if (!folio_test_set_dirty_flag(folio))
+			return true;
 	}
-	return 0;
+	return false;
 }
-EXPORT_SYMBOL(set_page_dirty);
+EXPORT_SYMBOL(folio_mark_dirty);
 
 /*
  * set_page_dirty() is racy if the caller has no reference against
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 60/96] mm/writeback: Use __set_page_dirty in __set_page_dirty_nobuffers
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (58 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 59/96] mm/writeback: Add folio_mark_dirty Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 61/96] mm/writeback: Add __folio_mark_dirty Matthew Wilcox (Oracle)
                   ` (35 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Move __set_page_dirty() from buffer.c (which is not compiled if
CONFIG_BLOCK is deselected) to writeback.c (which is always compiled
in).  This code was repeated almost verbatim, although the BUG_ON()
in __set_page_dirty_nobuffers() is removed because I can't prove to my
satisfaction that it never happens.

This means that account_page_dirtied() can now be static.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/buffer.c         | 25 -------------------------
 include/linux/mm.h  |  1 -
 mm/page-writeback.c | 42 ++++++++++++++++++++++++++++++++----------
 3 files changed, 32 insertions(+), 36 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 673cfbef9eec..f5384cff7e0c 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -588,31 +588,6 @@ void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode)
 }
 EXPORT_SYMBOL(mark_buffer_dirty_inode);
 
-/*
- * Mark the page dirty, and set it dirty in the page cache, and mark the inode
- * dirty.
- *
- * If warn is true, then emit a warning if the page is not uptodate and has
- * not been truncated.
- *
- * The caller must hold lock_page_memcg().
- */
-void __set_page_dirty(struct page *page, struct address_space *mapping,
-			     int warn)
-{
-	unsigned long flags;
-
-	xa_lock_irqsave(&mapping->i_pages, flags);
-	if (page->mapping) {	/* Race with truncate? */
-		WARN_ON_ONCE(warn && !PageUptodate(page));
-		account_page_dirtied(page, mapping);
-		__xa_set_mark(&mapping->i_pages, page_index(page),
-				PAGECACHE_TAG_DIRTY);
-	}
-	xa_unlock_irqrestore(&mapping->i_pages, flags);
-}
-EXPORT_SYMBOL_GPL(__set_page_dirty);
-
 /*
  * Add a page to the dirty page list.
  *
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c72cecbfe00d..8970ea86a5e2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1995,7 +1995,6 @@ int __set_page_dirty_nobuffers(struct page *page);
 int __set_page_dirty_no_writeback(struct page *page);
 int redirty_page_for_writepage(struct writeback_control *wbc,
 				struct page *page);
-void account_page_dirtied(struct page *page, struct address_space *mapping);
 void account_page_cleaned(struct page *page, struct address_space *mapping,
 			  struct bdi_writeback *wb);
 bool folio_mark_dirty(struct folio *folio);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index d810841ed03a..534b9ef5dcd7 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2417,7 +2417,8 @@ int __set_page_dirty_no_writeback(struct page *page)
  *
  * NOTE: This relies on being atomic wrt interrupts.
  */
-void account_page_dirtied(struct page *page, struct address_space *mapping)
+static void account_page_dirtied(struct page *page,
+		struct address_space *mapping)
 {
 	struct inode *inode = mapping->host;
 
@@ -2458,6 +2459,35 @@ void account_page_cleaned(struct page *page, struct address_space *mapping,
 	}
 }
 
+/*
+ * Mark the page dirty, and set it dirty in the page cache, and mark the inode
+ * dirty.
+ *
+ * If warn is true, then emit a warning if the page is not uptodate and has
+ * not been truncated.
+ *
+ * The caller must hold lock_page_memcg().  Most callers have the page
+ * locked.  A few have the page blocked from truncation through other
+ * means (eg zap_page_range() has it mapped and is holding the page table
+ * lock).  This can also be called from mark_buffer_dirty(), which I
+ * cannot prove is always protected against truncate.
+ */
+void __set_page_dirty(struct page *page, struct address_space *mapping,
+			     int warn)
+{
+	unsigned long flags;
+
+	xa_lock_irqsave(&mapping->i_pages, flags);
+	if (page->mapping) {	/* Race with truncate? */
+		WARN_ON_ONCE(warn && !PageUptodate(page));
+		account_page_dirtied(page, mapping);
+		__xa_set_mark(&mapping->i_pages, page_index(page),
+				PAGECACHE_TAG_DIRTY);
+	}
+	xa_unlock_irqrestore(&mapping->i_pages, flags);
+}
+EXPORT_SYMBOL_GPL(__set_page_dirty);
+
 /*
  * For address_spaces which do not use buffers.  Just tag the page as dirty in
  * the xarray.
@@ -2475,20 +2505,12 @@ int __set_page_dirty_nobuffers(struct page *page)
 	lock_page_memcg(page);
 	if (!TestSetPageDirty(page)) {
 		struct address_space *mapping = page_mapping(page);
-		unsigned long flags;
-
 		if (!mapping) {
 			unlock_page_memcg(page);
 			return 1;
 		}
 
-		xa_lock_irqsave(&mapping->i_pages, flags);
-		BUG_ON(page_mapping(page) != mapping);
-		WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
-		account_page_dirtied(page, mapping);
-		__xa_set_mark(&mapping->i_pages, page_index(page),
-				   PAGECACHE_TAG_DIRTY);
-		xa_unlock_irqrestore(&mapping->i_pages, flags);
+		__set_page_dirty(page, mapping, !PagePrivate(page));
 		unlock_page_memcg(page);
 
 		if (mapping->host) {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 61/96] mm/writeback: Add __folio_mark_dirty
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (59 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 60/96] mm/writeback: Use __set_page_dirty in __set_page_dirty_nobuffers Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 62/96] mm/writeback: Add filemap_dirty_folio Matthew Wilcox (Oracle)
                   ` (34 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Turn __set_page_dirty() into a wrapper around __folio_mark_dirty() (which
can directly cast from page to folio because we know that set_page_dirty()
calls filesystems with the head page).  Convert account_page_dirtied()
into folio_account_dirtied() and account the number of pages in the folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/memcontrol.h |  5 ++--
 include/linux/mm.h         |  1 -
 include/linux/pagemap.h    |  6 +++++
 mm/page-writeback.c        | 47 +++++++++++++++++++-------------------
 4 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c77561311199..c690a1f19d20 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1625,10 +1625,9 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
 void mem_cgroup_track_foreign_dirty_slowpath(struct folio *folio,
 					     struct bdi_writeback *wb);
 
-static inline void mem_cgroup_track_foreign_dirty(struct page *page,
+static inline void mem_cgroup_track_foreign_dirty(struct folio *folio,
 						  struct bdi_writeback *wb)
 {
-	struct folio *folio = page_folio(page);
 	if (mem_cgroup_disabled())
 		return;
 
@@ -1653,7 +1652,7 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
 {
 }
 
-static inline void mem_cgroup_track_foreign_dirty(struct page *page,
+static inline void mem_cgroup_track_foreign_dirty(struct folio *folio,
 						  struct bdi_writeback *wb)
 {
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8970ea86a5e2..59655bde6c32 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1990,7 +1990,6 @@ extern int try_to_release_page(struct page * page, gfp_t gfp_mask);
 extern void do_invalidatepage(struct page *page, unsigned int offset,
 			      unsigned int length);
 
-void __set_page_dirty(struct page *, struct address_space *, int warn);
 int __set_page_dirty_nobuffers(struct page *page);
 int __set_page_dirty_no_writeback(struct page *page);
 int redirty_page_for_writepage(struct writeback_control *wbc,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index f6a03fd68ac8..84f0a823820b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -834,6 +834,12 @@ void end_page_writeback(struct page *page);
 void folio_end_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
 void folio_wait_stable(struct folio *folio);
+void __folio_mark_dirty(struct folio *folio, struct address_space *, int warn);
+static inline void __set_page_dirty(struct page *page,
+		struct address_space *mapping, int warn)
+{
+	__folio_mark_dirty((struct folio *)page, mapping, warn);
+}
 
 void page_endio(struct page *page, bool is_write, int err);
 
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 534b9ef5dcd7..88f6734706c9 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2417,29 +2417,30 @@ int __set_page_dirty_no_writeback(struct page *page)
  *
  * NOTE: This relies on being atomic wrt interrupts.
  */
-static void account_page_dirtied(struct page *page,
+static void folio_account_dirtied(struct folio *folio,
 		struct address_space *mapping)
 {
 	struct inode *inode = mapping->host;
 
-	trace_writeback_dirty_page(page, mapping);
+	trace_writeback_dirty_page(&folio->page, mapping);
 
 	if (mapping_can_writeback(mapping)) {
 		struct bdi_writeback *wb;
+		long nr = folio_nr_pages(folio);
 
-		inode_attach_wb(inode, page);
+		inode_attach_wb(inode, &folio->page);
 		wb = inode_to_wb(inode);
 
-		__inc_lruvec_page_state(page, NR_FILE_DIRTY);
-		__inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-		__inc_node_page_state(page, NR_DIRTIED);
-		inc_wb_stat(wb, WB_RECLAIMABLE);
-		inc_wb_stat(wb, WB_DIRTIED);
-		task_io_account_write(PAGE_SIZE);
-		current->nr_dirtied++;
-		this_cpu_inc(bdp_ratelimits);
+		__lruvec_stat_mod_folio(folio, NR_FILE_DIRTY, nr);
+		__zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, nr);
+		__node_stat_mod_folio(folio, NR_DIRTIED, nr);
+		wb_stat_mod(wb, WB_RECLAIMABLE, nr);
+		wb_stat_mod(wb, WB_DIRTIED, nr);
+		task_io_account_write(nr * PAGE_SIZE);
+		current->nr_dirtied += nr;
+		this_cpu_add(bdp_ratelimits, nr);
 
-		mem_cgroup_track_foreign_dirty(page, wb);
+		mem_cgroup_track_foreign_dirty(folio, wb);
 	}
 }
 
@@ -2460,33 +2461,33 @@ void account_page_cleaned(struct page *page, struct address_space *mapping,
 }
 
 /*
- * Mark the page dirty, and set it dirty in the page cache, and mark the inode
- * dirty.
+ * Mark the folio dirty, and set it dirty in the page cache, and mark
+ * the inode dirty.
  *
- * If warn is true, then emit a warning if the page is not uptodate and has
+ * If warn is true, then emit a warning if the folio is not uptodate and has
  * not been truncated.
  *
- * The caller must hold lock_page_memcg().  Most callers have the page
- * locked.  A few have the page blocked from truncation through other
+ * The caller must hold lock_page_memcg().  Most callers have the folio
+ * locked.  A few have the folio blocked from truncation through other
  * means (eg zap_page_range() has it mapped and is holding the page table
  * lock).  This can also be called from mark_buffer_dirty(), which I
  * cannot prove is always protected against truncate.
  */
-void __set_page_dirty(struct page *page, struct address_space *mapping,
+void __folio_mark_dirty(struct folio *folio, struct address_space *mapping,
 			     int warn)
 {
 	unsigned long flags;
 
 	xa_lock_irqsave(&mapping->i_pages, flags);
-	if (page->mapping) {	/* Race with truncate? */
-		WARN_ON_ONCE(warn && !PageUptodate(page));
-		account_page_dirtied(page, mapping);
-		__xa_set_mark(&mapping->i_pages, page_index(page),
+	if (folio->mapping) {	/* Race with truncate? */
+		WARN_ON_ONCE(warn && !folio_uptodate(folio));
+		folio_account_dirtied(folio, mapping);
+		__xa_set_mark(&mapping->i_pages, folio_index(folio),
 				PAGECACHE_TAG_DIRTY);
 	}
 	xa_unlock_irqrestore(&mapping->i_pages, flags);
 }
-EXPORT_SYMBOL_GPL(__set_page_dirty);
+EXPORT_SYMBOL_GPL(__folio_mark_dirty);
 
 /*
  * For address_spaces which do not use buffers.  Just tag the page as dirty in
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 62/96] mm/writeback: Add filemap_dirty_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (60 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 61/96] mm/writeback: Add __folio_mark_dirty Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 63/96] mm/writeback: Add folio_account_cleaned Matthew Wilcox (Oracle)
                   ` (33 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Reimplement __set_page_dirty_nobuffers() as a wrapper around
filemap_dirty_folio().  This can use a cast to struct folio
because we know that the ->set_page_dirty address space op
is always called with a page pointer that happens to also be
a folio pointer.  Saves 7 bytes of kernel text.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/writeback.h |  1 +
 mm/page-writeback.c       | 64 ++++++++++++++++++++++-----------------
 2 files changed, 37 insertions(+), 28 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 8e5c5bb16e2d..aa372f6d2b55 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -398,6 +398,7 @@ void writeback_set_ratelimit(void);
 void tag_pages_for_writeback(struct address_space *mapping,
 			     pgoff_t start, pgoff_t end);
 
+bool filemap_dirty_folio(struct address_space *mapping, struct folio *folio);
 void account_page_redirty(struct page *page);
 
 void sb_mark_inode_writeback(struct inode *inode);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 88f6734706c9..93a00d3efa55 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2489,39 +2489,47 @@ void __folio_mark_dirty(struct folio *folio, struct address_space *mapping,
 }
 EXPORT_SYMBOL_GPL(__folio_mark_dirty);
 
-/*
- * For address_spaces which do not use buffers.  Just tag the page as dirty in
- * the xarray.
- *
- * This is also used when a single buffer is being dirtied: we want to set the
- * page dirty in that case, but not all the buffers.  This is a "bottom-up"
- * dirtying, whereas __set_page_dirty_buffers() is a "top-down" dirtying.
- *
- * The caller must ensure this doesn't race with truncation.  Most will simply
- * hold the page lock, but e.g. zap_pte_range() calls with the page mapped and
- * the pte lock held, which also locks out truncation.
+/**
+ * filemap_dirty_folio - Mark a folio dirty for filesystems which do not use buffer_heads.
+ * @mapping: Address space this folio belongs to.
+ * @folio: Folio to be marked as dirty.
+ *
+ * Filesystems which do not use buffer heads should call this function
+ * from their set_page_dirty address space operation.  It ignores the
+ * contents of folio_private(), so if the filesystem marks individual
+ * blocks as dirty, the filesystem should handle that itself.
+ *
+ * This is also sometimes used by filesystems which use buffer_heads when
+ * a single buffer is being dirtied: we want to set the folio dirty in
+ * that case, but not all the buffers.  This is a "bottom-up" dirtying,
+ * whereas __set_page_dirty_buffers() is a "top-down" dirtying.
+ *
+ * The caller must ensure this doesn't race with truncation.  Most will
+ * simply hold the folio lock, but e.g. zap_pte_range() calls with the
+ * folio mapped and the pte lock held, which also locks out truncation.
  */
-int __set_page_dirty_nobuffers(struct page *page)
+bool filemap_dirty_folio(struct address_space *mapping, struct folio *folio)
 {
-	lock_page_memcg(page);
-	if (!TestSetPageDirty(page)) {
-		struct address_space *mapping = page_mapping(page);
-		if (!mapping) {
-			unlock_page_memcg(page);
-			return 1;
-		}
+	lock_folio_memcg(folio);
+	if (folio_test_set_dirty_flag(folio)) {
+		unlock_folio_memcg(folio);
+		return false;
+	}
 
-		__set_page_dirty(page, mapping, !PagePrivate(page));
-		unlock_page_memcg(page);
+	__folio_mark_dirty(folio, mapping, !folio_private(folio));
+	unlock_folio_memcg(folio);
 
-		if (mapping->host) {
-			/* !PageAnon && !swapper_space */
-			__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
-		}
-		return 1;
+	if (mapping->host) {
+		/* !PageAnon && !swapper_space */
+		__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
 	}
-	unlock_page_memcg(page);
-	return 0;
+	return true;
+}
+EXPORT_SYMBOL(filemap_dirty_folio);
+
+int __set_page_dirty_nobuffers(struct page *page)
+{
+	return filemap_dirty_folio(page_mapping(page), (struct folio *)page);
 }
 EXPORT_SYMBOL(__set_page_dirty_nobuffers);
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 63/96] mm/writeback: Add folio_account_cleaned
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (61 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 62/96] mm/writeback: Add filemap_dirty_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 64/96] mm/writeback: Add folio_cancel_dirty Matthew Wilcox (Oracle)
                   ` (32 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Get the statistics right; compound pages were being accounted as a
single page.  Also move the declaration to filemap.h since this is
part of the page cache.  Add a wrapper for account_page_cleaned().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h      |  3 ---
 include/linux/pagemap.h |  7 +++++++
 mm/page-writeback.c     | 11 ++++++-----
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 59655bde6c32..0beb94071a15 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -39,7 +39,6 @@ struct anon_vma_chain;
 struct file_ra_state;
 struct user_struct;
 struct writeback_control;
-struct bdi_writeback;
 struct pt_regs;
 
 extern int sysctl_page_lock_unfairness;
@@ -1994,8 +1993,6 @@ int __set_page_dirty_nobuffers(struct page *page);
 int __set_page_dirty_no_writeback(struct page *page);
 int redirty_page_for_writepage(struct writeback_control *wbc,
 				struct page *page);
-void account_page_cleaned(struct page *page, struct address_space *mapping,
-			  struct bdi_writeback *wb);
 bool folio_mark_dirty(struct folio *folio);
 bool set_page_dirty(struct page *page);
 int set_page_dirty_lock(struct page *page);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 84f0a823820b..a5933bcb5f00 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -840,6 +840,13 @@ static inline void __set_page_dirty(struct page *page,
 {
 	__folio_mark_dirty((struct folio *)page, mapping, warn);
 }
+void folio_account_cleaned(struct folio *folio, struct address_space *mapping,
+			  struct bdi_writeback *wb);
+static inline void account_page_cleaned(struct page *page,
+		struct address_space *mapping, struct bdi_writeback *wb)
+{
+	return folio_account_cleaned(page_folio(page), mapping, wb);
+}
 
 void page_endio(struct page *page, bool is_write, int err);
 
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 93a00d3efa55..261eb64387a9 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2449,14 +2449,15 @@ static void folio_account_dirtied(struct folio *folio,
  *
  * Caller must hold lock_page_memcg().
  */
-void account_page_cleaned(struct page *page, struct address_space *mapping,
+void folio_account_cleaned(struct folio *folio, struct address_space *mapping,
 			  struct bdi_writeback *wb)
 {
 	if (mapping_can_writeback(mapping)) {
-		dec_lruvec_page_state(page, NR_FILE_DIRTY);
-		dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-		dec_wb_stat(wb, WB_RECLAIMABLE);
-		task_io_account_cancelled_write(PAGE_SIZE);
+		long nr = folio_nr_pages(folio);
+		lruvec_stat_mod_folio(folio, NR_FILE_DIRTY, -nr);
+		zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, -nr);
+		wb_stat_mod(wb, WB_RECLAIMABLE, -nr);
+		task_io_account_cancelled_write(folio_size(folio));
 	}
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 64/96] mm/writeback: Add folio_cancel_dirty
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (62 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 63/96] mm/writeback: Add folio_account_cleaned Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 65/96] mm/writeback: Add folio_clear_dirty_for_io Matthew Wilcox (Oracle)
                   ` (31 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Turn __cancel_dirty_page() into __folio_cancel_dirty() and add wrappers.
Move the prototypes into pagemap.h since this is page cache functionality.
Saves 44 bytes of kernel text in total; 33 bytes from __folio_cancel_dirty
and 11 from two callers of cancel_dirty_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h      |  7 -------
 include/linux/pagemap.h | 11 +++++++++++
 mm/page-writeback.c     | 16 ++++++++--------
 3 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0beb94071a15..5ed887d51d07 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1996,13 +1996,6 @@ int redirty_page_for_writepage(struct writeback_control *wbc,
 bool folio_mark_dirty(struct folio *folio);
 bool set_page_dirty(struct page *page);
 int set_page_dirty_lock(struct page *page);
-void __cancel_dirty_page(struct page *page);
-static inline void cancel_dirty_page(struct page *page)
-{
-	/* Avoid atomic ops, locking, etc. when not actually needed. */
-	if (PageDirty(page))
-		__cancel_dirty_page(page);
-}
 int clear_page_dirty_for_io(struct page *page);
 
 int get_cmdline(struct task_struct *task, char *buffer, int buflen);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index a5933bcb5f00..53a1b925f54e 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -847,6 +847,17 @@ static inline void account_page_cleaned(struct page *page,
 {
 	return folio_account_cleaned(page_folio(page), mapping, wb);
 }
+void __folio_cancel_dirty(struct folio *folio);
+static inline void folio_cancel_dirty(struct folio *folio)
+{
+	/* Avoid atomic ops, locking, etc. when not actually needed. */
+	if (folio_dirty(folio))
+		__folio_cancel_dirty(folio);
+}
+static inline void cancel_dirty_page(struct page *page)
+{
+	folio_cancel_dirty(page_folio(page));
+}
 
 void page_endio(struct page *page, bool is_write, int err);
 
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 261eb64387a9..57b39e2d46ac 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2655,28 +2655,28 @@ EXPORT_SYMBOL(set_page_dirty_lock);
  * page without actually doing it through the VM. Can you say "ext3 is
  * horribly ugly"? Thought you could.
  */
-void __cancel_dirty_page(struct page *page)
+void __folio_cancel_dirty(struct folio *folio)
 {
-	struct address_space *mapping = page_mapping(page);
+	struct address_space *mapping = folio_mapping(folio);
 
 	if (mapping_can_writeback(mapping)) {
 		struct inode *inode = mapping->host;
 		struct bdi_writeback *wb;
 		struct wb_lock_cookie cookie = {};
 
-		lock_page_memcg(page);
+		lock_folio_memcg(folio);
 		wb = unlocked_inode_to_wb_begin(inode, &cookie);
 
-		if (TestClearPageDirty(page))
-			account_page_cleaned(page, mapping, wb);
+		if (folio_test_clear_dirty_flag(folio))
+			folio_account_cleaned(folio, mapping, wb);
 
 		unlocked_inode_to_wb_end(inode, &cookie);
-		unlock_page_memcg(page);
+		unlock_folio_memcg(folio);
 	} else {
-		ClearPageDirty(page);
+		folio_clear_dirty_flag(folio);
 	}
 }
-EXPORT_SYMBOL(__cancel_dirty_page);
+EXPORT_SYMBOL(__folio_cancel_dirty);
 
 /*
  * Clear a page's dirty flag, while caring for dirty memory accounting.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 65/96] mm/writeback: Add folio_clear_dirty_for_io
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (63 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 64/96] mm/writeback: Add folio_cancel_dirty Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 66/96] mm/writeback: Add folio_account_redirty Matthew Wilcox (Oracle)
                   ` (30 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Transform clear_page_dirty_for_io() into folio_clear_dirty_for_io()
and add a compatibility wrapper.  Also move the declaration to pagemap.h
as this is page cache functionality that doesn't need to be used by the
rest of the kernel.

Increases the size of the kernel by 79 bytes.  While we remove a few
calls to compound_head(), we add a call to folio_nr_pages() to get the
stats correct.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h      |  1 -
 include/linux/pagemap.h |  2 ++
 mm/folio-compat.c       |  6 ++++
 mm/page-writeback.c     | 63 +++++++++++++++++++++--------------------
 4 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5ed887d51d07..36e9ae216df3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1996,7 +1996,6 @@ int redirty_page_for_writepage(struct writeback_control *wbc,
 bool folio_mark_dirty(struct folio *folio);
 bool set_page_dirty(struct page *page);
 int set_page_dirty_lock(struct page *page);
-int clear_page_dirty_for_io(struct page *page);
 
 int get_cmdline(struct task_struct *task, char *buffer, int buflen);
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 53a1b925f54e..fa24217e305d 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -858,6 +858,8 @@ static inline void cancel_dirty_page(struct page *page)
 {
 	folio_cancel_dirty(page_folio(page));
 }
+bool folio_clear_dirty_for_io(struct folio *folio);
+bool clear_page_dirty_for_io(struct page *page);
 
 void page_endio(struct page *page, bool is_write, int err);
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index a504ecf1d695..76262bcf858c 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -66,3 +66,9 @@ bool set_page_dirty(struct page *page)
 	return folio_mark_dirty(page_folio(page));
 }
 EXPORT_SYMBOL(set_page_dirty);
+
+bool clear_page_dirty_for_io(struct page *page)
+{
+	return folio_clear_dirty_for_io(page_folio(page));
+}
+EXPORT_SYMBOL(clear_page_dirty_for_io);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 57b39e2d46ac..c075ed60de0f 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2679,25 +2679,25 @@ void __folio_cancel_dirty(struct folio *folio)
 EXPORT_SYMBOL(__folio_cancel_dirty);
 
 /*
- * Clear a page's dirty flag, while caring for dirty memory accounting.
- * Returns true if the page was previously dirty.
- *
- * This is for preparing to put the page under writeout.  We leave the page
- * tagged as dirty in the xarray so that a concurrent write-for-sync
- * can discover it via a PAGECACHE_TAG_DIRTY walk.  The ->writepage
- * implementation will run either set_page_writeback() or set_page_dirty(),
- * at which stage we bring the page's dirty flag and xarray dirty tag
- * back into sync.
- *
- * This incoherency between the page's dirty flag and xarray tag is
- * unfortunate, but it only exists while the page is locked.
+ * Clear a folio's dirty flag, while caring for dirty memory accounting.
+ * Returns true if the folio was previously dirty.
+ *
+ * This is for preparing to put the folio under writeout.  We leave
+ * the folio tagged as dirty in the xarray so that a concurrent
+ * write-for-sync can discover it via a PAGECACHE_TAG_DIRTY walk.
+ * The ->writepage implementation will run either folio_start_writeback()
+ * or folio_mark_dirty(), at which stage we bring the folio's dirty flag
+ * and xarray dirty tag back into sync.
+ *
+ * This incoherency between the folio's dirty flag and xarray tag is
+ * unfortunate, but it only exists while the folio is locked.
  */
-int clear_page_dirty_for_io(struct page *page)
+bool folio_clear_dirty_for_io(struct folio *folio)
 {
-	struct address_space *mapping = page_mapping(page);
-	int ret = 0;
+	struct address_space *mapping = folio_mapping(folio);
+	bool ret = false;
 
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_FOLIO(!folio_locked(folio), folio);
 
 	if (mapping && mapping_can_writeback(mapping)) {
 		struct inode *inode = mapping->host;
@@ -2710,48 +2710,49 @@ int clear_page_dirty_for_io(struct page *page)
 		 * We use this sequence to make sure that
 		 *  (a) we account for dirty stats properly
 		 *  (b) we tell the low-level filesystem to
-		 *      mark the whole page dirty if it was
+		 *      mark the whole folio dirty if it was
 		 *      dirty in a pagetable. Only to then
-		 *  (c) clean the page again and return 1 to
+		 *  (c) clean the folio again and return 1 to
 		 *      cause the writeback.
 		 *
 		 * This way we avoid all nasty races with the
 		 * dirty bit in multiple places and clearing
 		 * them concurrently from different threads.
 		 *
-		 * Note! Normally the "set_page_dirty(page)"
+		 * Note! Normally the "folio_mark_dirty(folio)"
 		 * has no effect on the actual dirty bit - since
 		 * that will already usually be set. But we
 		 * need the side effects, and it can help us
 		 * avoid races.
 		 *
-		 * We basically use the page "master dirty bit"
+		 * We basically use the folio "master dirty bit"
 		 * as a serialization point for all the different
 		 * threads doing their things.
 		 */
-		if (page_mkclean(page))
-			set_page_dirty(page);
+		if (folio_mkclean(folio))
+			folio_mark_dirty(folio);
 		/*
 		 * We carefully synchronise fault handlers against
-		 * installing a dirty pte and marking the page dirty
+		 * installing a dirty pte and marking the folio dirty
 		 * at this point.  We do this by having them hold the
-		 * page lock while dirtying the page, and pages are
+		 * page lock while dirtying the folio, and folios are
 		 * always locked coming in here, so we get the desired
 		 * exclusion.
 		 */
 		wb = unlocked_inode_to_wb_begin(inode, &cookie);
-		if (TestClearPageDirty(page)) {
-			dec_lruvec_page_state(page, NR_FILE_DIRTY);
-			dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-			dec_wb_stat(wb, WB_RECLAIMABLE);
-			ret = 1;
+		if (folio_test_clear_dirty_flag(folio)) {
+			long nr = folio_nr_pages(folio);
+			lruvec_stat_mod_folio(folio, NR_FILE_DIRTY, -nr);
+			zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, -nr);
+			wb_stat_mod(wb, WB_RECLAIMABLE, -nr);
+			ret = true;
 		}
 		unlocked_inode_to_wb_end(inode, &cookie);
 		return ret;
 	}
-	return TestClearPageDirty(page);
+	return folio_test_clear_dirty_flag(folio);
 }
-EXPORT_SYMBOL(clear_page_dirty_for_io);
+EXPORT_SYMBOL(folio_clear_dirty_for_io);
 
 bool __folio_end_writeback(struct folio *folio)
 {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 66/96] mm/writeback: Add folio_account_redirty
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (64 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 65/96] mm/writeback: Add folio_clear_dirty_for_io Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:05 ` [PATCH v9 67/96] mm/writeback: Add folio_redirty_for_writepage Matthew Wilcox (Oracle)
                   ` (29 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Account the number of pages in the folio that we're redirtying.
Turn account_page_dirty() into a wrapper around it.  Also turn
the comment on folio_account_redirty() into kernel-doc and
edit it slightly so it makes sense to its potential callers.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/writeback.h |  6 +++++-
 mm/page-writeback.c       | 32 +++++++++++++++++++-------------
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index aa372f6d2b55..9883dbf8b763 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -399,7 +399,11 @@ void tag_pages_for_writeback(struct address_space *mapping,
 			     pgoff_t start, pgoff_t end);
 
 bool filemap_dirty_folio(struct address_space *mapping, struct folio *folio);
-void account_page_redirty(struct page *page);
+void folio_account_redirty(struct folio *folio);
+static inline void account_page_redirty(struct page *page)
+{
+	folio_account_redirty(page_folio(page));
+}
 
 void sb_mark_inode_writeback(struct inode *inode);
 void sb_clear_inode_writeback(struct inode *inode);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index c075ed60de0f..1ff3e7dec0bd 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1090,7 +1090,7 @@ static void wb_update_write_bandwidth(struct bdi_writeback *wb,
 	 * write_bandwidth = ---------------------------------------------------
 	 *                                          period
 	 *
-	 * @written may have decreased due to account_page_redirty().
+	 * @written may have decreased due to folio_account_redirty().
 	 * Avoid underflowing @bw calculation.
 	 */
 	bw = written - min(written, wb->written_stamp);
@@ -2534,30 +2534,36 @@ int __set_page_dirty_nobuffers(struct page *page)
 }
 EXPORT_SYMBOL(__set_page_dirty_nobuffers);
 
-/*
- * Call this whenever redirtying a page, to de-account the dirty counters
- * (NR_DIRTIED, WB_DIRTIED, tsk->nr_dirtied), so that they match the written
- * counters (NR_WRITTEN, WB_WRITTEN) in long term. The mismatches will lead to
- * systematic errors in balanced_dirty_ratelimit and the dirty pages position
- * control.
+/**
+ * folio_account_redirty - Manually account for redirtying a page.
+ * @folio: The folio which is being redirtied.
+ *
+ * Most filesystems should call folio_redirty_for_writepage() instead
+ * of this fuction.  If your filesystem is doing writeback outside the
+ * context of a writeback_control(), it can call this when redirtying
+ * a folio, to de-account the dirty counters (NR_DIRTIED, WB_DIRTIED,
+ * tsk->nr_dirtied), so that they match the written counters (NR_WRITTEN,
+ * WB_WRITTEN) in long term. The mismatches will lead to systematic errors
+ * in balanced_dirty_ratelimit and the dirty pages position control.
  */
-void account_page_redirty(struct page *page)
+void folio_account_redirty(struct folio *folio)
 {
-	struct address_space *mapping = page->mapping;
+	struct address_space *mapping = folio->mapping;
 
 	if (mapping && mapping_can_writeback(mapping)) {
 		struct inode *inode = mapping->host;
 		struct bdi_writeback *wb;
 		struct wb_lock_cookie cookie = {};
+		unsigned nr = folio_nr_pages(folio);
 
 		wb = unlocked_inode_to_wb_begin(inode, &cookie);
-		current->nr_dirtied--;
-		dec_node_page_state(page, NR_DIRTIED);
-		dec_wb_stat(wb, WB_DIRTIED);
+		current->nr_dirtied -= nr;
+		node_stat_mod_folio(folio, NR_DIRTIED, -nr);
+		wb_stat_mod(wb, WB_DIRTIED, -nr);
 		unlocked_inode_to_wb_end(inode, &cookie);
 	}
 }
-EXPORT_SYMBOL(account_page_redirty);
+EXPORT_SYMBOL(folio_account_redirty);
 
 /*
  * When a writepage implementation decides that it doesn't want to write this
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 67/96] mm/writeback: Add folio_redirty_for_writepage
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (65 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 66/96] mm/writeback: Add folio_account_redirty Matthew Wilcox (Oracle)
@ 2021-05-05 15:05 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 68/96] mm/filemap: Add i_blocks_per_folio Matthew Wilcox (Oracle)
                   ` (28 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:05 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Reimplement redirty_page_for_writepage() as a wrapper around
folio_redirty_for_writepage().  Account the number of pages in the
folio, add kernel-doc and move the prototype to writeback.h.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/jfs/jfs_metapage.c     |  1 +
 include/linux/mm.h        |  4 ----
 include/linux/writeback.h |  2 ++
 mm/folio-compat.c         |  7 +++++++
 mm/page-writeback.c       | 30 ++++++++++++++++++++----------
 5 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/fs/jfs/jfs_metapage.c b/fs/jfs/jfs_metapage.c
index 176580f54af9..104ae698443e 100644
--- a/fs/jfs/jfs_metapage.c
+++ b/fs/jfs/jfs_metapage.c
@@ -13,6 +13,7 @@
 #include <linux/buffer_head.h>
 #include <linux/mempool.h>
 #include <linux/seq_file.h>
+#include <linux/writeback.h>
 #include "jfs_incore.h"
 #include "jfs_superblock.h"
 #include "jfs_filsys.h"
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 36e9ae216df3..3f91c9971b32 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -36,9 +36,7 @@
 struct mempolicy;
 struct anon_vma;
 struct anon_vma_chain;
-struct file_ra_state;
 struct user_struct;
-struct writeback_control;
 struct pt_regs;
 
 extern int sysctl_page_lock_unfairness;
@@ -1991,8 +1989,6 @@ extern void do_invalidatepage(struct page *page, unsigned int offset,
 
 int __set_page_dirty_nobuffers(struct page *page);
 int __set_page_dirty_no_writeback(struct page *page);
-int redirty_page_for_writepage(struct writeback_control *wbc,
-				struct page *page);
 bool folio_mark_dirty(struct folio *folio);
 bool set_page_dirty(struct page *page);
 int set_page_dirty_lock(struct page *page);
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 9883dbf8b763..bdf933f74a1d 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -404,6 +404,8 @@ static inline void account_page_redirty(struct page *page)
 {
 	folio_account_redirty(page_folio(page));
 }
+bool folio_redirty_for_writepage(struct writeback_control *, struct folio *);
+bool redirty_page_for_writepage(struct writeback_control *, struct page *);
 
 void sb_mark_inode_writeback(struct inode *inode);
 void sb_clear_inode_writeback(struct inode *inode);
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 76262bcf858c..a57a1f53512c 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -72,3 +72,10 @@ bool clear_page_dirty_for_io(struct page *page)
 	return folio_clear_dirty_for_io(page_folio(page));
 }
 EXPORT_SYMBOL(clear_page_dirty_for_io);
+
+bool redirty_page_for_writepage(struct writeback_control *wbc,
+		struct page *page)
+{
+	return folio_redirty_for_writepage(wbc, page_folio(page));
+}
+EXPORT_SYMBOL(redirty_page_for_writepage);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 1ff3e7dec0bd..ac86f3cbba1c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2565,21 +2565,31 @@ void folio_account_redirty(struct folio *folio)
 }
 EXPORT_SYMBOL(folio_account_redirty);
 
-/*
- * When a writepage implementation decides that it doesn't want to write this
- * page for some reason, it should redirty the locked page via
- * redirty_page_for_writepage() and it should then unlock the page and return 0
+/**
+ * folio_redirty_for_writepage - Decline to write a dirty folio.
+ * @wbc: The writeback control.
+ * @folio: The folio.
+ *
+ * When a writepage implementation decides that it doesn't want to write
+ * @folio for some reason, it should call this function, unlock @folio and
+ * return 0.
+ *
+ * Return: True if we redirtied the folio.  False if someone else dirtied
+ * it first.
  */
-int redirty_page_for_writepage(struct writeback_control *wbc, struct page *page)
+bool folio_redirty_for_writepage(struct writeback_control *wbc,
+		struct folio *folio)
 {
-	int ret;
+	bool ret;
+	unsigned nr = folio_nr_pages(folio);
+
+	wbc->pages_skipped += nr;
+	ret = filemap_dirty_folio(folio->mapping, folio);
+	folio_account_redirty(folio);
 
-	wbc->pages_skipped++;
-	ret = __set_page_dirty_nobuffers(page);
-	account_page_redirty(page);
 	return ret;
 }
-EXPORT_SYMBOL(redirty_page_for_writepage);
+EXPORT_SYMBOL(folio_redirty_for_writepage);
 
 /**
  * folio_mark_dirty - Mark a folio as being modified.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 68/96] mm/filemap: Add i_blocks_per_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (66 preceding siblings ...)
  2021-05-05 15:05 ` [PATCH v9 67/96] mm/writeback: Add folio_redirty_for_writepage Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 69/96] mm/filemap: Add folio_mkwrite_check_truncate Matthew Wilcox (Oracle)
                   ` (27 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Reimplement i_blocks_per_page() as a wrapper around i_blocks_per_folio().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index fa24217e305d..2f896574aad7 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1224,19 +1224,25 @@ static inline int page_mkwrite_check_truncate(struct page *page,
 }
 
 /**
- * i_blocks_per_page - How many blocks fit in this page.
+ * i_blocks_per_folio - How many blocks fit in this folio.
  * @inode: The inode which contains the blocks.
- * @page: The page (head page if the page is a THP).
+ * @folio: The folio.
  *
- * If the block size is larger than the size of this page, return zero.
+ * If the block size is larger than the size of this folio, return zero.
  *
- * Context: The caller should hold a refcount on the page to prevent it
+ * Context: The caller should hold a refcount on the folio to prevent it
  * from being split.
- * Return: The number of filesystem blocks covered by this page.
+ * Return: The number of filesystem blocks covered by this folio.
  */
+static inline
+unsigned int i_blocks_per_folio(struct inode *inode, struct folio *folio)
+{
+	return folio_size(folio) >> inode->i_blkbits;
+}
+
 static inline
 unsigned int i_blocks_per_page(struct inode *inode, struct page *page)
 {
-	return thp_size(page) >> inode->i_blkbits;
+	return i_blocks_per_folio(inode, page_folio(page));
 }
 #endif /* _LINUX_PAGEMAP_H */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 69/96] mm/filemap: Add folio_mkwrite_check_truncate
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (67 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 68/96] mm/filemap: Add i_blocks_per_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 70/96] mm/filemap: Add readahead_folio Matthew Wilcox (Oracle)
                   ` (26 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is the folio equivalent of page_mkwrite_check_truncate().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2f896574aad7..8fd00dc5ebd5 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1195,6 +1195,34 @@ static inline unsigned long dir_pages(struct inode *inode)
 			       PAGE_SHIFT;
 }
 
+/**
+ * folio_mkwrite_check_truncate - check if folio was truncated
+ * @folio: the folio to check
+ * @inode: the inode to check the folio against
+ *
+ * Return: the number of bytes in the folio up to EOF,
+ * or -EFAULT if the folio was truncated.
+ */
+static inline ssize_t folio_mkwrite_check_truncate(struct folio *folio,
+					      struct inode *inode)
+{
+	loff_t size = i_size_read(inode);
+	pgoff_t index = size >> PAGE_SHIFT;
+	size_t offset = offset_in_folio(folio, size);
+
+	if (!folio->mapping)
+		return -EFAULT;
+
+	/* folio is wholly inside EOF */
+	if (folio_next_index(folio) - 1 < index)
+		return folio_size(folio);
+	/* folio is wholly past EOF */
+	if (folio->index > index || !offset)
+		return -EFAULT;
+	/* folio is partially inside EOF */
+	return offset;
+}
+
 /**
  * page_mkwrite_check_truncate - check if page was truncated
  * @page: the page to check
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 70/96] mm/filemap: Add readahead_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (68 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 69/96] mm/filemap: Add folio_mkwrite_check_truncate Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 71/96] block: Add bio_add_folio Matthew Wilcox (Oracle)
                   ` (25 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The pointers stored in the page cache are folios, by definition.
This change comes with a behaviour change -- callers of readahead_folio()
are no longer required to put the page reference themselves.  This matches
how readpage works, rather than matching how readpages used to work.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 53 +++++++++++++++++++++++++++++------------
 1 file changed, 38 insertions(+), 15 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 8fd00dc5ebd5..d54772aa7a3a 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1062,33 +1062,56 @@ void page_cache_async_readahead(struct address_space *mapping,
 	page_cache_async_ra(&ractl, page, req_count);
 }
 
+static inline struct folio *__readahead_folio(struct readahead_control *ractl)
+{
+	struct folio *folio;
+
+	BUG_ON(ractl->_batch_count > ractl->_nr_pages);
+	ractl->_nr_pages -= ractl->_batch_count;
+	ractl->_index += ractl->_batch_count;
+
+	if (!ractl->_nr_pages) {
+		ractl->_batch_count = 0;
+		return NULL;
+	}
+
+	folio = xa_load(&ractl->mapping->i_pages, ractl->_index);
+	VM_BUG_ON_FOLIO(!folio_locked(folio), folio);
+	ractl->_batch_count = folio_nr_pages(folio);
+
+	return folio;
+}
+
 /**
  * readahead_page - Get the next page to read.
- * @rac: The current readahead request.
+ * @ractl: The current readahead request.
  *
  * Context: The page is locked and has an elevated refcount.  The caller
  * should decreases the refcount once the page has been submitted for I/O
  * and unlock the page once all I/O to that page has completed.
  * Return: A pointer to the next page, or %NULL if we are done.
  */
-static inline struct page *readahead_page(struct readahead_control *rac)
+static inline struct page *readahead_page(struct readahead_control *ractl)
 {
-	struct page *page;
+	struct folio *folio = __readahead_folio(ractl);
 
-	BUG_ON(rac->_batch_count > rac->_nr_pages);
-	rac->_nr_pages -= rac->_batch_count;
-	rac->_index += rac->_batch_count;
-
-	if (!rac->_nr_pages) {
-		rac->_batch_count = 0;
-		return NULL;
-	}
+	return &folio->page;
+}
 
-	page = xa_load(&rac->mapping->i_pages, rac->_index);
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	rac->_batch_count = thp_nr_pages(page);
+/**
+ * readahead_folio - Get the next folio to read.
+ * @ractl: The current readahead request.
+ *
+ * Context: The folio is locked.  The caller should unlock the folio once
+ * all I/O to that folio has completed.
+ * Return: A pointer to the next folio, or %NULL if we are done.
+ */
+static inline struct folio *readahead_folio(struct readahead_control *ractl)
+{
+	struct folio *folio = __readahead_folio(ractl);
 
-	return page;
+	folio_put(folio);
+	return folio;
 }
 
 static inline unsigned int __readahead_batch(struct readahead_control *rac,
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 71/96] block: Add bio_add_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (69 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 70/96] mm/filemap: Add readahead_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 72/96] block: Add bio_for_each_folio_all Matthew Wilcox (Oracle)
                   ` (24 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is a thin wrapper around bio_add_page().  The main advantage here
is the documentation that the submitter can expect to see folios in the
completion handler, and that stupidly large folios are not supported.
It's not currently possible to allocate stupidly large folios, but if
it ever becomes possible, this function will fail gracefully instead of
doing I/O to the wrong bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 block/bio.c         | 21 +++++++++++++++++++++
 include/linux/bio.h |  3 ++-
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index 221dc56ba22f..f84defe8d84a 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -940,6 +940,27 @@ int bio_add_page(struct bio *bio, struct page *page,
 }
 EXPORT_SYMBOL(bio_add_page);
 
+/**
+ * bio_add_folio - Attempt to add part of a folio to a bio.
+ * @bio: Bio to add to.
+ * @folio: Folio to add.
+ * @len: How many bytes from the folio to add.
+ * @off: First byte in this folio to add.
+ *
+ * Always uses the head page of the folio in the bio.  If a submitter
+ * only uses bio_add_folio(), it can count on never seeing tail pages
+ * in the completion routine.  BIOs do not support folios larger than 2GiB.
+ *
+ * Return: The number of bytes from this folio added to the bio.
+ */
+size_t bio_add_folio(struct bio *bio, struct folio *folio, size_t len,
+		size_t off)
+{
+	if (len > UINT_MAX || off > UINT_MAX)
+		return 0;
+	return bio_add_page(bio, &folio->page, len, off);
+}
+
 void bio_release_pages(struct bio *bio, bool mark_dirty)
 {
 	struct bvec_iter_all iter_all;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index f1a99f0a240c..f41f20ebc2c6 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -468,7 +468,8 @@ extern void bio_uninit(struct bio *);
 extern void bio_reset(struct bio *);
 void bio_chain(struct bio *, struct bio *);
 
-extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
+int bio_add_page(struct bio *, struct page *, unsigned len, unsigned off);
+size_t bio_add_folio(struct bio *, struct folio *, size_t len, size_t off);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
 			   unsigned int, unsigned int);
 int bio_add_zone_append_page(struct bio *bio, struct page *page,
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 72/96] block: Add bio_for_each_folio_all
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (70 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 71/96] block: Add bio_add_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 73/96] mm/lru: Add folio_lru and folio_is_file_lru Matthew Wilcox (Oracle)
                   ` (23 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Allow callers to iterate over each folio instead of each page.  The
bio need not have been constructed using folios originally.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/bio.h | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index f41f20ebc2c6..8af03cfcfbb6 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -194,7 +194,7 @@ static inline void bio_advance_iter_single(const struct bio *bio,
  */
 #define bio_for_each_bvec_all(bvl, bio, i)		\
 	for (i = 0, bvl = bio_first_bvec_all(bio);	\
-	     i < (bio)->bi_vcnt; i++, bvl++)		\
+	     i < (bio)->bi_vcnt; i++, bvl++)
 
 #define bio_iter_last(bvec, iter) ((iter).bi_size == (bvec).bv_len)
 
@@ -320,6 +320,47 @@ static inline struct bio_vec *bio_last_bvec_all(struct bio *bio)
 	return &bio->bi_io_vec[bio->bi_vcnt - 1];
 }
 
+struct folio_iter {
+	struct folio *folio;
+	size_t offset;
+	size_t length;
+	size_t _seg_count;
+	int _i;
+};
+
+static inline
+void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
+{
+	struct bio_vec *bvec = bio_first_bvec_all(bio) + i;
+
+	fi->folio = page_folio(bvec->bv_page);
+	fi->offset = bvec->bv_offset +
+			PAGE_SIZE * (bvec->bv_page - &fi->folio->page);
+	fi->_seg_count = bvec->bv_len;
+	fi->length = min(folio_size(fi->folio) - fi->offset, fi->_seg_count);
+	fi->_i = i;
+}
+
+static inline void bio_next_folio(struct folio_iter *fi, struct bio *bio)
+{
+	fi->_seg_count -= fi->length;
+	if (fi->_seg_count) {
+		fi->folio = folio_next(fi->folio);
+		fi->offset = 0;
+		fi->length = min(folio_size(fi->folio), fi->_seg_count);
+	} else if (fi->_i + 1 < bio->bi_vcnt) {
+		bio_first_folio(fi, bio, fi->_i + 1);
+	} else {
+		fi->folio = NULL;
+	}
+}
+
+/*
+ * Iterate over each folio in a bio.
+ */
+#define bio_for_each_folio_all(fi, bio)				\
+	for (bio_first_folio(&fi, bio, 0); fi.folio; bio_next_folio(&fi, bio))
+
 enum bip_flags {
 	BIP_BLOCK_INTEGRITY	= 1 << 0, /* block layer owns integrity data */
 	BIP_MAPPED_INTEGRITY	= 1 << 1, /* ref tag has been remapped */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 73/96] mm/lru: Add folio_lru and folio_is_file_lru
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (71 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 72/96] block: Add bio_for_each_folio_all Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 74/96] mm/workingset: Convert workingset_refault to take a folio Matthew Wilcox (Oracle)
                   ` (22 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Convert page_lru() to call folio_lru() and convert page_is_file_lru()
to call folio_is_file_lru().  All pages on the LRUs are folios (because
tail pages use the space for the LRU list as compound_head), so all
callers can be converted.

Saves 637 bytes of kernel text; no functions grow.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm_inline.h | 44 ++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 355ea1ee32bd..c03b12ea0b7b 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -6,22 +6,27 @@
 #include <linux/swap.h>
 
 /**
- * page_is_file_lru - should the page be on a file LRU or anon LRU?
- * @page: the page to test
+ * folio_is_file_lru - should the folio be on a file LRU or anon LRU?
+ * @folio: the folio to test
  *
- * Returns 1 if @page is a regular filesystem backed page cache page or a lazily
- * freed anonymous page (e.g. via MADV_FREE).  Returns 0 if @page is a normal
- * anonymous page, a tmpfs page or otherwise ram or swap backed page.  Used by
- * functions that manipulate the LRU lists, to sort a page onto the right LRU
- * list.
+ * Returns 1 if @folio is a regular filesystem backed page cache folio
+ * or a lazily freed anonymous folio (e.g. via MADV_FREE).  Returns 0 if
+ * @folio is a normal anonymous folio, a tmpfs folio or otherwise ram or
+ * swap backed folio.  Used by functions that manipulate the LRU lists,
+ * to sort a folio onto the right LRU list.
  *
  * We would like to get this info without a page flag, but the state
- * needs to survive until the page is last deleted from the LRU, which
+ * needs to survive until the folio is last deleted from the LRU, which
  * could be as far down as __page_cache_release.
  */
+static inline int folio_is_file_lru(struct folio *folio)
+{
+	return !folio_swapbacked(folio);
+}
+
 static inline int page_is_file_lru(struct page *page)
 {
-	return !PageSwapBacked(page);
+	return folio_is_file_lru(page_folio(page));
 }
 
 static __always_inline void update_lru_size(struct lruvec *lruvec,
@@ -57,28 +62,33 @@ static __always_inline void __clear_page_lru_flags(struct page *page)
 }
 
 /**
- * page_lru - which LRU list should a page be on?
- * @page: the page to test
+ * folio_lru_list - which LRU list should a folio be on?
+ * @folio: the folio to test
  *
- * Returns the LRU list a page should be on, as an index
+ * Returns the LRU list a folio should be on, as an index
  * into the array of LRU lists.
  */
-static __always_inline enum lru_list page_lru(struct page *page)
+static __always_inline enum lru_list folio_lru_list(struct folio *folio)
 {
 	enum lru_list lru;
 
-	VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
+	VM_BUG_ON_FOLIO(folio_active(folio) && folio_unevictable(folio), folio);
 
-	if (PageUnevictable(page))
+	if (folio_unevictable(folio))
 		return LRU_UNEVICTABLE;
 
-	lru = page_is_file_lru(page) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
-	if (PageActive(page))
+	lru = folio_is_file_lru(folio) ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
+	if (folio_active(folio))
 		lru += LRU_ACTIVE;
 
 	return lru;
 }
 
+static __always_inline enum lru_list page_lru(struct page *page)
+{
+	return folio_lru_list(page_folio(page));
+}
+
 static __always_inline void add_page_to_lru_list(struct page *page,
 				struct lruvec *lruvec)
 {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 74/96] mm/workingset: Convert workingset_refault to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (72 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 73/96] mm/lru: Add folio_lru and folio_is_file_lru Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 20:17   ` kernel test robot
  2021-05-05 15:06 ` [PATCH v9 75/96] mm/lru: Add folio_add_lru Matthew Wilcox (Oracle)
                   ` (21 subsequent siblings)
  95 siblings, 1 reply; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This nets us 178 bytes of savings from removing calls to compound_head.
The three callers all grow a little, but each of them will be converted
to use folios soon, so that's fine.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  4 ++--
 mm/filemap.c         |  2 +-
 mm/memory.c          |  3 ++-
 mm/swap.c            |  6 +++---
 mm/swap_state.c      |  2 +-
 mm/workingset.c      | 34 +++++++++++++++++-----------------
 6 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index d1cb67cdb476..35d3dba422a8 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -323,7 +323,7 @@ static inline swp_entry_t folio_swap_entry(struct folio *folio)
 /* linux/mm/workingset.c */
 void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
 void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
-void workingset_refault(struct page *page, void *shadow);
+void workingset_refault(struct folio *folio, void *shadow);
 void workingset_activation(struct folio *folio);
 
 /* Only track the nodes of mappings with shadow entries */
@@ -344,7 +344,7 @@ extern unsigned long nr_free_buffer_pages(void);
 /* linux/mm/swap.c */
 extern void lru_note_cost(struct lruvec *lruvec, bool file,
 			  unsigned int nr_pages);
-extern void lru_note_cost_page(struct page *);
+extern void lru_note_cost_folio(struct folio *);
 extern void lru_cache_add(struct page *);
 void mark_page_accessed(struct page *);
 void folio_mark_accessed(struct folio *);
diff --git a/mm/filemap.c b/mm/filemap.c
index e6aa49e32255..5c130bfcdb1c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -979,7 +979,7 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
 		 */
 		WARN_ON_ONCE(PageActive(page));
 		if (!(gfp_mask & __GFP_WRITE) && shadow)
-			workingset_refault(page, shadow);
+			workingset_refault(page_folio(page), shadow);
 		lru_cache_add(page);
 	}
 	return ret;
diff --git a/mm/memory.c b/mm/memory.c
index fc3f50d0702c..a73da89c36ef 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3364,7 +3364,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 
 				shadow = get_shadow_from_swap_cache(entry);
 				if (shadow)
-					workingset_refault(page, shadow);
+					workingset_refault(page_folio(page),
+								shadow);
 
 				lru_cache_add(page);
 
diff --git a/mm/swap.c b/mm/swap.c
index 8e7f92be2f6f..cd441cdb82fd 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -311,10 +311,10 @@ void lru_note_cost(struct lruvec *lruvec, bool file, unsigned int nr_pages)
 	} while ((lruvec = parent_lruvec(lruvec)));
 }
 
-void lru_note_cost_page(struct page *page)
+void lru_note_cost_folio(struct folio *folio)
 {
-	lru_note_cost(mem_cgroup_page_lruvec(page, page_pgdat(page)),
-		      page_is_file_lru(page), thp_nr_pages(page));
+	lru_note_cost(mem_cgroup_folio_lruvec(folio, folio_pgdat(folio)),
+		      folio_is_file_lru(folio), folio_nr_pages(folio));
 }
 
 static void __activate_page(struct page *page, struct lruvec *lruvec)
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 272ea2108c9d..1c8e8b3aa10b 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -503,7 +503,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	mem_cgroup_swapin_uncharge_swap(entry);
 
 	if (shadow)
-		workingset_refault(page, shadow);
+		workingset_refault(page_folio(page), shadow);
 
 	/* Caller will initiate read into locked page */
 	lru_cache_add(page);
diff --git a/mm/workingset.c b/mm/workingset.c
index d969403f2b2a..93002e4f8f15 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -271,17 +271,17 @@ void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg)
 }
 
 /**
- * workingset_refault - evaluate the refault of a previously evicted page
- * @page: the freshly allocated replacement page
- * @shadow: shadow entry of the evicted page
+ * workingset_refault - evaluate the refault of a previously evicted folio
+ * @page: the freshly allocated replacement folio
+ * @shadow: shadow entry of the evicted folio
  *
  * Calculates and evaluates the refault distance of the previously
- * evicted page in the context of the node and the memcg whose memory
+ * evicted folio in the context of the node and the memcg whose memory
  * pressure caused the eviction.
  */
-void workingset_refault(struct page *page, void *shadow)
+void workingset_refault(struct folio *folio, void *shadow)
 {
-	bool file = page_is_file_lru(page);
+	bool file = folio_is_file_lru(folio);
 	struct mem_cgroup *eviction_memcg;
 	struct lruvec *eviction_lruvec;
 	unsigned long refault_distance;
@@ -299,10 +299,10 @@ void workingset_refault(struct page *page, void *shadow)
 	rcu_read_lock();
 	/*
 	 * Look up the memcg associated with the stored ID. It might
-	 * have been deleted since the page's eviction.
+	 * have been deleted since the folio's eviction.
 	 *
 	 * Note that in rare events the ID could have been recycled
-	 * for a new cgroup that refaults a shared page. This is
+	 * for a new cgroup that refaults a shared folio. This is
 	 * impossible to tell from the available data. However, this
 	 * should be a rare and limited disturbance, and activations
 	 * are always speculative anyway. Ultimately, it's the aging
@@ -338,14 +338,14 @@ void workingset_refault(struct page *page, void *shadow)
 	refault_distance = (refault - eviction) & EVICTION_MASK;
 
 	/*
-	 * The activation decision for this page is made at the level
+	 * The activation decision for this folio is made at the level
 	 * where the eviction occurred, as that is where the LRU order
-	 * during page reclaim is being determined.
+	 * during folio reclaim is being determined.
 	 *
-	 * However, the cgroup that will own the page is the one that
+	 * However, the cgroup that will own the folio is the one that
 	 * is actually experiencing the refault event.
 	 */
-	memcg = page_memcg(page);
+	memcg = folio_memcg(folio);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	inc_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file);
@@ -373,15 +373,15 @@ void workingset_refault(struct page *page, void *shadow)
 	if (refault_distance > workingset_size)
 		goto out;
 
-	SetPageActive(page);
-	workingset_age_nonresident(lruvec, thp_nr_pages(page));
+	folio_set_active_flag(folio);
+	workingset_age_nonresident(lruvec, folio_nr_pages(folio));
 	inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + file);
 
-	/* Page was active prior to eviction */
+	/* Folio was active prior to eviction */
 	if (workingset) {
-		SetPageWorkingset(page);
+		folio_set_workingset_flag(folio);
 		/* XXX: Move to lru_cache_add() when it supports new vs putback */
-		lru_note_cost_page(page);
+		lru_note_cost_folio(folio);
 		inc_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file);
 	}
 out:
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 75/96] mm/lru: Add folio_add_lru
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (73 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 74/96] mm/workingset: Convert workingset_refault to take a folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 76/96] mm/page_alloc: Add __alloc_folio, __alloc_folio_node and alloc_folio Matthew Wilcox (Oracle)
                   ` (20 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Reimplement lru_cache_add() as a wrapper around folio_add_lru().
Saves 159 bytes of kernel text due to removing calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  1 +
 mm/folio-compat.c    |  6 ++++++
 mm/swap.c            | 22 +++++++++++-----------
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 35d3dba422a8..87bfabd69132 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -345,6 +345,7 @@ extern unsigned long nr_free_buffer_pages(void);
 extern void lru_note_cost(struct lruvec *lruvec, bool file,
 			  unsigned int nr_pages);
 extern void lru_note_cost_folio(struct folio *);
+extern void folio_add_lru(struct folio *);
 extern void lru_cache_add(struct page *);
 void mark_page_accessed(struct page *);
 void folio_mark_accessed(struct folio *);
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index a57a1f53512c..7de3839ad072 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -79,3 +79,9 @@ bool redirty_page_for_writepage(struct writeback_control *wbc,
 	return folio_redirty_for_writepage(wbc, page_folio(page));
 }
 EXPORT_SYMBOL(redirty_page_for_writepage);
+
+void lru_cache_add(struct page *page)
+{
+	folio_add_lru(page_folio(page));
+}
+EXPORT_SYMBOL(lru_cache_add);
diff --git a/mm/swap.c b/mm/swap.c
index cd441cdb82fd..1cc6fb618cb5 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -450,29 +450,29 @@ void folio_mark_accessed(struct folio *folio)
 EXPORT_SYMBOL(folio_mark_accessed);
 
 /**
- * lru_cache_add - add a page to a page list
- * @page: the page to be added to the LRU.
+ * folio_add_lru - Add a folio to an LRU list.
+ * @folio: The folio to be added to the LRU.
  *
- * Queue the page for addition to the LRU via pagevec. The decision on whether
+ * Queue the folio for addition to the LRU. The decision on whether
  * to add the page to the [in]active [file|anon] list is deferred until the
- * pagevec is drained. This gives a chance for the caller of lru_cache_add()
- * have the page added to the active list using mark_page_accessed().
+ * pagevec is drained. This gives a chance for the caller of folio_add_lru()
+ * have the folio added to the active list using folio_mark_accessed().
  */
-void lru_cache_add(struct page *page)
+void folio_add_lru(struct folio *folio)
 {
 	struct pagevec *pvec;
 
-	VM_BUG_ON_PAGE(PageActive(page) && PageUnevictable(page), page);
-	VM_BUG_ON_PAGE(PageLRU(page), page);
+	VM_BUG_ON_FOLIO(folio_active(folio) && folio_unevictable(folio), folio);
+	VM_BUG_ON_FOLIO(folio_lru(folio), folio);
 
-	get_page(page);
+	folio_get(folio);
 	local_lock(&lru_pvecs.lock);
 	pvec = this_cpu_ptr(&lru_pvecs.lru_add);
-	if (pagevec_add_and_need_flush(pvec, page))
+	if (pagevec_add_and_need_flush(pvec, &folio->page))
 		__pagevec_lru_add(pvec);
 	local_unlock(&lru_pvecs.lock);
 }
-EXPORT_SYMBOL(lru_cache_add);
+EXPORT_SYMBOL(folio_add_lru);
 
 /**
  * lru_cache_add_inactive_or_unevictable
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 76/96] mm/page_alloc: Add __alloc_folio, __alloc_folio_node and alloc_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (74 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 75/96] mm/lru: Add folio_add_lru Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 77/96] mm/filemap: Add filemap_alloc_folio Matthew Wilcox (Oracle)
                   ` (19 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

These wrappers are mostly for type safety, but they also ensure that the
page allocator allocates a compound page and initialises the deferred
list if the page is large enough to have one.  While the new allocation
functions cost 65 bytes of text, they save dozens of bytes of text in
each of their callers, due to not having to call prep_transhuge_page().
Overall, shrinks the kernel by 238 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/gfp.h | 16 +++++++++++++
 mm/khugepaged.c     | 32 ++++++++++----------------
 mm/mempolicy.c      | 10 ++++++++
 mm/migrate.c        | 56 +++++++++++++++++++++------------------------
 mm/page_alloc.c     | 12 ++++++++++
 5 files changed, 76 insertions(+), 50 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index a503d928e684..76086c798cb1 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -511,6 +511,8 @@ static inline void arch_alloc_page(struct page *page, int order) { }
 
 struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 		nodemask_t *nodemask);
+struct folio *__alloc_folio(gfp_t gfp, unsigned int order, int preferred_nid,
+		nodemask_t *nodemask);
 
 unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 				nodemask_t *nodemask, int nr_pages,
@@ -543,6 +545,15 @@ __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
 	return __alloc_pages(gfp_mask, order, nid, NULL);
 }
 
+static inline
+struct folio *__alloc_folio_node(gfp_t gfp, unsigned int order, int nid)
+{
+	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
+	VM_WARN_ON((gfp & __GFP_THISNODE) && !node_online(nid));
+
+	return __alloc_folio(gfp, order, nid, NULL);
+}
+
 /*
  * Allocate pages, preferring the node given as nid. When nid == NUMA_NO_NODE,
  * prefer the current CPU's closest node. Otherwise node must be valid and
@@ -559,6 +570,7 @@ static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
 
 #ifdef CONFIG_NUMA
 struct page *alloc_pages(gfp_t gfp, unsigned int order);
+struct folio *alloc_folio(gfp_t gfp, unsigned order);
 extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
 			struct vm_area_struct *vma, unsigned long addr,
 			int node, bool hugepage);
@@ -569,6 +581,10 @@ static inline struct page *alloc_pages(gfp_t gfp_mask, unsigned int order)
 {
 	return alloc_pages_node(numa_node_id(), gfp_mask, order);
 }
+static inline struct folio *alloc_folio(gfp_t gfp, unsigned int order)
+{
+	return __alloc_folio_node(gfp, order, numa_node_id());
+}
 #define alloc_pages_vma(gfp_mask, order, vma, addr, node, false)\
 	alloc_pages(gfp_mask, order)
 #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6c0185fdd815..9dde71607f7c 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -877,18 +877,20 @@ static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
 static struct page *
 khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
 {
+	struct folio *folio;
+
 	VM_BUG_ON_PAGE(*hpage, *hpage);
 
-	*hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER);
-	if (unlikely(!*hpage)) {
+	folio = __alloc_folio_node(gfp, HPAGE_PMD_ORDER, node);
+	if (unlikely(!folio)) {
 		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
 		*hpage = ERR_PTR(-ENOMEM);
 		return NULL;
 	}
 
-	prep_transhuge_page(*hpage);
 	count_vm_event(THP_COLLAPSE_ALLOC);
-	return *hpage;
+	*hpage = &folio->page;
+	return &folio->page;
 }
 #else
 static int khugepaged_find_target_node(void)
@@ -896,24 +898,14 @@ static int khugepaged_find_target_node(void)
 	return 0;
 }
 
-static inline struct page *alloc_khugepaged_hugepage(void)
-{
-	struct page *page;
-
-	page = alloc_pages(alloc_hugepage_khugepaged_gfpmask(),
-			   HPAGE_PMD_ORDER);
-	if (page)
-		prep_transhuge_page(page);
-	return page;
-}
-
 static struct page *khugepaged_alloc_hugepage(bool *wait)
 {
-	struct page *hpage;
+	struct folio *folio;
 
 	do {
-		hpage = alloc_khugepaged_hugepage();
-		if (!hpage) {
+		folio = alloc_folio(alloc_hugepage_khugepaged_gfpmask(),
+					HPAGE_PMD_ORDER);
+		if (!folio) {
 			count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
 			if (!*wait)
 				return NULL;
@@ -922,9 +914,9 @@ static struct page *khugepaged_alloc_hugepage(bool *wait)
 			khugepaged_alloc_sleep();
 		} else
 			count_vm_event(THP_COLLAPSE_ALLOC);
-	} while (unlikely(!hpage) && likely(khugepaged_enabled()));
+	} while (unlikely(!folio) && likely(khugepaged_enabled()));
 
-	return hpage;
+	return &folio->page;
 }
 
 static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d79fa299b70c..382fec380f28 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2277,6 +2277,16 @@ struct page *alloc_pages(gfp_t gfp, unsigned order)
 }
 EXPORT_SYMBOL(alloc_pages);
 
+struct folio *alloc_folio(gfp_t gfp, unsigned order)
+{
+	struct page *page = alloc_pages(gfp | __GFP_COMP, order);
+
+	if (page && order > 1)
+		prep_transhuge_page(page);
+	return (struct folio *)page;
+}
+EXPORT_SYMBOL(alloc_folio);
+
 int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst)
 {
 	struct mempolicy *pol = mpol_dup(vma_policy(src));
diff --git a/mm/migrate.c b/mm/migrate.c
index b234c3f3acb7..0b9cadbad900 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1562,7 +1562,7 @@ struct page *alloc_migration_target(struct page *page, unsigned long private)
 	struct migration_target_control *mtc;
 	gfp_t gfp_mask;
 	unsigned int order = 0;
-	struct page *new_page = NULL;
+	struct folio *new_folio = NULL;
 	int nid;
 	int zidx;
 
@@ -1592,12 +1592,9 @@ struct page *alloc_migration_target(struct page *page, unsigned long private)
 	if (is_highmem_idx(zidx) || zidx == ZONE_MOVABLE)
 		gfp_mask |= __GFP_HIGHMEM;
 
-	new_page = __alloc_pages(gfp_mask, order, nid, mtc->nmask);
+	new_folio = __alloc_folio(gfp_mask, order, nid, mtc->nmask);
 
-	if (new_page && PageTransHuge(new_page))
-		prep_transhuge_page(new_page);
-
-	return new_page;
+	return &new_folio->page;
 }
 
 #ifdef CONFIG_NUMA
@@ -2155,35 +2152,34 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	spinlock_t *ptl;
 	pg_data_t *pgdat = NODE_DATA(node);
 	int isolated = 0;
-	struct page *new_page = NULL;
+	struct folio *new_folio = NULL;
 	int page_lru = page_is_file_lru(page);
 	unsigned long start = address & HPAGE_PMD_MASK;
 
-	new_page = alloc_pages_node(node,
-		(GFP_TRANSHUGE_LIGHT | __GFP_THISNODE),
-		HPAGE_PMD_ORDER);
-	if (!new_page)
+	new_folio = __alloc_folio_node(node,
+			(GFP_TRANSHUGE_LIGHT | __GFP_THISNODE),
+			HPAGE_PMD_ORDER);
+	if (!new_folio)
 		goto out_fail;
-	prep_transhuge_page(new_page);
 
 	isolated = numamigrate_isolate_page(pgdat, page);
 	if (!isolated) {
-		put_page(new_page);
+		folio_put(new_folio);
 		goto out_fail;
 	}
 
 	/* Prepare a page as a migration target */
-	__SetPageLocked(new_page);
+	__folio_set_locked_flag(new_folio);
 	if (PageSwapBacked(page))
-		__SetPageSwapBacked(new_page);
+		__folio_set_swapbacked_flag(new_folio);
 
 	/* anon mapping, we can simply copy page->mapping to the new page: */
-	new_page->mapping = page->mapping;
-	new_page->index = page->index;
+	new_folio->mapping = page->mapping;
+	new_folio->index = page->index;
 	/* flush the cache before copying using the kernel virtual address */
 	flush_cache_range(vma, start, start + HPAGE_PMD_SIZE);
-	migrate_page_copy(new_page, page);
-	WARN_ON(PageLRU(new_page));
+	migrate_page_copy(&new_folio->page, page);
+	WARN_ON(folio_lru(new_folio));
 
 	/* Recheck the target PMD */
 	ptl = pmd_lock(mm, pmd);
@@ -2191,13 +2187,13 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 		spin_unlock(ptl);
 
 		/* Reverse changes made by migrate_page_copy() */
-		if (TestClearPageActive(new_page))
+		if (folio_test_clear_active_flag(new_folio))
 			SetPageActive(page);
-		if (TestClearPageUnevictable(new_page))
+		if (folio_test_clear_unevictable_flag(new_folio))
 			SetPageUnevictable(page);
 
-		unlock_page(new_page);
-		put_page(new_page);		/* Free it */
+		folio_unlock(new_folio);
+		folio_put(new_folio);		/* Free it */
 
 		/* Retake the callers reference and putback on LRU */
 		get_page(page);
@@ -2208,7 +2204,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 		goto out_unlock;
 	}
 
-	entry = mk_huge_pmd(new_page, vma->vm_page_prot);
+	entry = mk_huge_pmd(&new_folio->page, vma->vm_page_prot);
 	entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 
 	/*
@@ -2219,7 +2215,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	 * new page and page_add_new_anon_rmap guarantee the copy is
 	 * visible before the pagetable update.
 	 */
-	page_add_anon_rmap(new_page, vma, start, true);
+	page_add_anon_rmap(&new_folio->page, vma, start, true);
 	/*
 	 * At this point the pmd is numa/protnone (i.e. non present) and the TLB
 	 * has already been flushed globally.  So no TLB can be currently
@@ -2235,17 +2231,17 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	update_mmu_cache_pmd(vma, address, &entry);
 
 	page_ref_unfreeze(page, 2);
-	mlock_migrate_page(new_page, page);
+	mlock_migrate_page(&new_folio->page, page);
 	page_remove_rmap(page, true);
-	set_page_owner_migrate_reason(new_page, MR_NUMA_MISPLACED);
+	set_page_owner_migrate_reason(&new_folio->page, MR_NUMA_MISPLACED);
 
 	spin_unlock(ptl);
 
 	/* Take an "isolate" reference and put new page on the LRU. */
-	get_page(new_page);
-	putback_lru_page(new_page);
+	folio_get(new_folio);
+	putback_lru_page(&new_folio->page);
 
-	unlock_page(new_page);
+	folio_unlock(new_folio);
 	unlock_page(page);
 	put_page(page);			/* Drop the rmap reference */
 	put_page(page);			/* Drop the LRU isolation reference */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5a1e5b624594..6b5d3f993a41 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5225,6 +5225,18 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid,
 }
 EXPORT_SYMBOL(__alloc_pages);
 
+struct folio *__alloc_folio(gfp_t gfp, unsigned int order, int preferred_nid,
+		nodemask_t *nodemask)
+{
+	struct page *page = __alloc_pages(gfp | __GFP_COMP, order,
+			preferred_nid, nodemask);
+
+	if (page && order > 1)
+		prep_transhuge_page(page);
+	return (struct folio *)page;
+}
+EXPORT_SYMBOL(__alloc_folio);
+
 /*
  * Common helper functions. Never use with __GFP_HIGHMEM because the returned
  * address cannot represent highmem pages. Use alloc_pages and then kmap if
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 77/96] mm/filemap: Add filemap_alloc_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (75 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 76/96] mm/page_alloc: Add __alloc_folio, __alloc_folio_node and alloc_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-06  0:00   ` kernel test robot
  2021-05-05 15:06 ` [PATCH v9 78/96] mm/filemap: Add folio_add_to_page_cache Matthew Wilcox (Oracle)
                   ` (18 subsequent siblings)
  95 siblings, 1 reply; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Reimplement __page_cache_alloc as a wrapper around filemap_alloc_folio
to allow filesystems to be converted at our leisure.  Increases
kernel text size by 133 bytes, mostly in cachefiles_read_backing_file().
pagecache_get_page() shrinks by 32 bytes, though.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 11 ++++++++---
 mm/filemap.c            | 14 +++++++-------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index d54772aa7a3a..64370f615aba 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -338,14 +338,19 @@ static inline void *detach_page_private(struct page *page)
 }
 
 #ifdef CONFIG_NUMA
-extern struct page *__page_cache_alloc(gfp_t gfp);
+extern struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order);
 #else
-static inline struct page *__page_cache_alloc(gfp_t gfp)
+static inline struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order)
 {
-	return alloc_pages(gfp, 0);
+	return alloc_folio(gfp, order);
 }
 #endif
 
+static inline struct page *__page_cache_alloc(gfp_t gfp)
+{
+	return &filemap_alloc_folio(gfp, 0)->page;
+}
+
 static inline struct page *page_cache_alloc(struct address_space *x)
 {
 	return __page_cache_alloc(mapping_gfp_mask(x));
diff --git a/mm/filemap.c b/mm/filemap.c
index 5c130bfcdb1c..a9c16f05b863 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -987,24 +987,24 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
 EXPORT_SYMBOL_GPL(add_to_page_cache_lru);
 
 #ifdef CONFIG_NUMA
-struct page *__page_cache_alloc(gfp_t gfp)
+struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order)
 {
 	int n;
-	struct page *page;
+	struct folio *folio;
 
 	if (cpuset_do_page_mem_spread()) {
 		unsigned int cpuset_mems_cookie;
 		do {
 			cpuset_mems_cookie = read_mems_allowed_begin();
 			n = cpuset_mem_spread_node();
-			page = __alloc_pages_node(n, gfp, 0);
-		} while (!page && read_mems_allowed_retry(cpuset_mems_cookie));
+			folio = __alloc_folio_node(n, gfp, order);
+		} while (!folio && read_mems_allowed_retry(cpuset_mems_cookie));
 
-		return page;
+		return folio;
 	}
-	return alloc_pages(gfp, 0);
+	return alloc_folio(gfp, order);
 }
-EXPORT_SYMBOL(__page_cache_alloc);
+EXPORT_SYMBOL(filemap_alloc_folio);
 #endif
 
 /*
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 78/96] mm/filemap: Add folio_add_to_page_cache
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (76 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 77/96] mm/filemap: Add filemap_alloc_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 79/96] mm/filemap: Convert mapping_get_entry to return a folio Matthew Wilcox (Oracle)
                   ` (17 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Pages being added to the page cache should already be folios, so
turn add_to_page_cache_lru() into a wrapper.  Saves hundreds of
bytes of text.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h      |  7 -----
 include/linux/pagemap.h | 22 ++++++++++++---
 mm/filemap.c            | 59 ++++++++++++++++++++---------------------
 3 files changed, 48 insertions(+), 40 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3f91c9971b32..a6144d325064 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -223,13 +223,6 @@ int overcommit_kbytes_handler(struct ctl_table *, int, void *, size_t *,
 		loff_t *);
 int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
 		loff_t *);
-/*
- * Any attempt to mark this function as static leads to build failure
- * when CONFIG_DEBUG_INFO_BTF is enabled because __add_to_page_cache_locked()
- * is referred to by BPF code. This must be visible for error injection.
- */
-int __add_to_page_cache_locked(struct page *page, struct address_space *mapping,
-		pgoff_t index, gfp_t gfp, void **shadowp);
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 64370f615aba..8eab3d8400d2 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -951,9 +951,9 @@ static inline int fault_in_pages_readable(const char __user *uaddr, int size)
 }
 
 int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
-				pgoff_t index, gfp_t gfp_mask);
-int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
-				pgoff_t index, gfp_t gfp_mask);
+				pgoff_t index, gfp_t gfp);
+int folio_add_to_page_cache(struct folio *folio, struct address_space *mapping,
+				pgoff_t index, gfp_t gfp);
 extern void delete_from_page_cache(struct page *page);
 extern void __delete_from_page_cache(struct page *page, void *shadow);
 void replace_page_cache_page(struct page *old, struct page *new);
@@ -978,6 +978,22 @@ static inline int add_to_page_cache(struct page *page,
 	return error;
 }
 
+static inline int add_to_page_cache_lru(struct page *page,
+		struct address_space *mapping, pgoff_t index, gfp_t gfp)
+{
+	return folio_add_to_page_cache((struct folio *)page, mapping,
+			index, gfp);
+}
+
+/*
+ * Making this function static leads to build failure when
+ * CONFIG_DEBUG_INFO_BTF is enabled because __add_to_page_cache_locked()
+ * is referred to by BPF code.  This must be visible for error injection.
+ */
+int __add_to_page_cache_locked(struct folio *folio,
+		struct address_space *mapping,
+		pgoff_t index, gfp_t gfp, void **shadowp);
+
 /**
  * struct readahead_control - Describes a readahead request.
  *
diff --git a/mm/filemap.c b/mm/filemap.c
index a9c16f05b863..062610ae95d8 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -853,26 +853,25 @@ void replace_page_cache_page(struct page *old, struct page *new)
 }
 EXPORT_SYMBOL_GPL(replace_page_cache_page);
 
-noinline int __add_to_page_cache_locked(struct page *page,
-					struct address_space *mapping,
-					pgoff_t offset, gfp_t gfp,
-					void **shadowp)
+noinline int __add_to_page_cache_locked(struct folio *folio,
+		struct address_space *mapping, pgoff_t index, gfp_t gfp,
+		void **shadowp)
 {
-	XA_STATE(xas, &mapping->i_pages, offset);
-	int huge = PageHuge(page);
+	XA_STATE(xas, &mapping->i_pages, index);
+	int huge = folio_hugetlb(folio);
 	int error;
 	bool charged = false;
 
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	VM_BUG_ON_PAGE(PageSwapBacked(page), page);
+	VM_BUG_ON_FOLIO(!folio_locked(folio), folio);
+	VM_BUG_ON_FOLIO(folio_swapbacked(folio), folio);
 	mapping_set_update(&xas, mapping);
 
-	get_page(page);
-	page->mapping = mapping;
-	page->index = offset;
+	folio_get(folio);
+	folio->mapping = mapping;
+	folio->index = index;
 
 	if (!huge) {
-		error = mem_cgroup_charge(page, current->mm, gfp);
+		error = folio_charge_cgroup(folio, current->mm, gfp);
 		if (error)
 			goto error;
 		charged = true;
@@ -884,7 +883,7 @@ noinline int __add_to_page_cache_locked(struct page *page,
 		unsigned int order = xa_get_order(xas.xa, xas.xa_index);
 		void *entry, *old = NULL;
 
-		if (order > thp_order(page))
+		if (order > folio_order(folio))
 			xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
 					order, gfp);
 		xas_lock_irq(&xas);
@@ -901,13 +900,13 @@ noinline int __add_to_page_cache_locked(struct page *page,
 				*shadowp = old;
 			/* entry may have been split before we acquired lock */
 			order = xa_get_order(xas.xa, xas.xa_index);
-			if (order > thp_order(page)) {
+			if (order > folio_order(folio)) {
 				xas_split(&xas, old, order);
 				xas_reset(&xas);
 			}
 		}
 
-		xas_store(&xas, page);
+		xas_store(&xas, folio);
 		if (xas_error(&xas))
 			goto unlock;
 
@@ -915,7 +914,7 @@ noinline int __add_to_page_cache_locked(struct page *page,
 
 		/* hugetlb pages do not participate in page cache accounting */
 		if (!huge)
-			__inc_lruvec_page_state(page, NR_FILE_PAGES);
+			__lruvec_stat_add_folio(folio, NR_FILE_PAGES);
 unlock:
 		xas_unlock_irq(&xas);
 	} while (xas_nomem(&xas, gfp));
@@ -923,16 +922,16 @@ noinline int __add_to_page_cache_locked(struct page *page,
 	if (xas_error(&xas)) {
 		error = xas_error(&xas);
 		if (charged)
-			mem_cgroup_uncharge(page);
+			folio_uncharge_cgroup(folio);
 		goto error;
 	}
 
-	trace_mm_filemap_add_to_page_cache(page);
+	trace_mm_filemap_add_to_page_cache(&folio->page);
 	return 0;
 error:
-	page->mapping = NULL;
+	folio->mapping = NULL;
 	/* Leave page->index set: truncation relies upon it */
-	put_page(page);
+	folio_put(folio);
 	return error;
 }
 ALLOW_ERROR_INJECTION(__add_to_page_cache_locked, ERRNO);
@@ -952,22 +951,22 @@ ALLOW_ERROR_INJECTION(__add_to_page_cache_locked, ERRNO);
 int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
 		pgoff_t offset, gfp_t gfp_mask)
 {
-	return __add_to_page_cache_locked(page, mapping, offset,
+	return __add_to_page_cache_locked(page_folio(page), mapping, offset,
 					  gfp_mask, NULL);
 }
 EXPORT_SYMBOL(add_to_page_cache_locked);
 
-int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
-				pgoff_t offset, gfp_t gfp_mask)
+int folio_add_to_page_cache(struct folio *folio, struct address_space *mapping,
+				pgoff_t index, gfp_t gfp_mask)
 {
 	void *shadow = NULL;
 	int ret;
 
-	__SetPageLocked(page);
-	ret = __add_to_page_cache_locked(page, mapping, offset,
+	__folio_set_locked_flag(folio);
+	ret = __add_to_page_cache_locked(folio, mapping, index,
 					 gfp_mask, &shadow);
 	if (unlikely(ret))
-		__ClearPageLocked(page);
+		__folio_clear_locked_flag(folio);
 	else {
 		/*
 		 * The page might have been evicted from cache only
@@ -977,14 +976,14 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
 		 * data from the working set, only to cache data that will
 		 * get overwritten with something else, is a waste of memory.
 		 */
-		WARN_ON_ONCE(PageActive(page));
+		WARN_ON_ONCE(folio_active(folio));
 		if (!(gfp_mask & __GFP_WRITE) && shadow)
-			workingset_refault(page_folio(page), shadow);
-		lru_cache_add(page);
+			workingset_refault(folio, shadow);
+		folio_add_lru(folio);
 	}
 	return ret;
 }
-EXPORT_SYMBOL_GPL(add_to_page_cache_lru);
+EXPORT_SYMBOL_GPL(folio_add_to_page_cache);
 
 #ifdef CONFIG_NUMA
 struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 79/96] mm/filemap: Convert mapping_get_entry to return a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (77 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 78/96] mm/filemap: Add folio_add_to_page_cache Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 80/96] mm/filemap: Add filemap_get_folio and find_get_folio Matthew Wilcox (Oracle)
                   ` (16 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The pagecache only contains folios, so indicate that this is definitely
not a tail page.  Shrinks mapping_get_entry() by 56 bytes, but grows
pagecache_get_page() by 21 bytes as gcc makes slightly different hot/cold
code decisions.  A net reduction of 35 bytes of text.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/filemap.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 062610ae95d8..b3714dddb045 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1732,34 +1732,33 @@ EXPORT_SYMBOL(page_cache_prev_miss);
  * @mapping: the address_space to search
  * @index: The page cache index.
  *
- * Looks up the page cache slot at @mapping & @index.  If there is a
- * page cache page, the head page is returned with an increased refcount.
+ * Looks up the page cache entry at @mapping & @index.  If it is a folio,
+ * it is returned with an increased refcount.  If it is a shadow entry
+ * of a previously evicted folio, or a swap entry from shmem/tmpfs,
+ * it is returned without further action.
  *
- * If the slot holds a shadow entry of a previously evicted page, or a
- * swap entry from shmem/tmpfs, it is returned.
- *
- * Return: The head page or shadow entry, %NULL if nothing is found.
+ * Return: The folio, swap or shadow entry, %NULL if nothing is found.
  */
-static struct page *mapping_get_entry(struct address_space *mapping,
+static struct folio *mapping_get_entry(struct address_space *mapping,
 		pgoff_t index)
 {
 	XA_STATE(xas, &mapping->i_pages, index);
-	struct page *page;
+	struct folio *folio;
 
 	rcu_read_lock();
 repeat:
 	xas_reset(&xas);
-	page = xas_load(&xas);
-	if (xas_retry(&xas, page))
+	folio = xas_load(&xas);
+	if (xas_retry(&xas, folio))
 		goto repeat;
 	/*
 	 * A shadow entry of a recently evicted page, or a swap entry from
 	 * shmem/tmpfs.  Return it without attempting to raise page count.
 	 */
-	if (!page || xa_is_value(page))
+	if (!folio || xa_is_value(folio))
 		goto out;
 
-	if (!page_cache_get_speculative(page))
+	if (!page_cache_get_speculative(&folio->page))
 		goto repeat;
 
 	/*
@@ -1767,14 +1766,14 @@ static struct page *mapping_get_entry(struct address_space *mapping,
 	 * This is part of the lockless pagecache protocol. See
 	 * include/linux/pagemap.h for details.
 	 */
-	if (unlikely(page != xas_reload(&xas))) {
-		put_page(page);
+	if (unlikely(folio != xas_reload(&xas))) {
+		folio_put(folio);
 		goto repeat;
 	}
 out:
 	rcu_read_unlock();
 
-	return page;
+	return folio;
 }
 
 /**
@@ -1814,10 +1813,12 @@ static struct page *mapping_get_entry(struct address_space *mapping,
 struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
 		int fgp_flags, gfp_t gfp_mask)
 {
+	struct folio *folio;
 	struct page *page;
 
 repeat:
-	page = mapping_get_entry(mapping, index);
+	folio = mapping_get_entry(mapping, index);
+	page = &folio->page;
 	if (xa_is_value(page)) {
 		if (fgp_flags & FGP_ENTRY)
 			return page;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 80/96] mm/filemap: Add filemap_get_folio and find_get_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (78 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 79/96] mm/filemap: Convert mapping_get_entry to return a folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 81/96] mm/filemap: Add filemap_get_stable_folio Matthew Wilcox (Oracle)
                   ` (15 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Turn pagecache_get_page() into a wrapper around filemap_get_folio().
Remove find_lock_head() as this use case is now covered by
filemap_get_folio().

Reduces overall kernel size by 209 bytes.  filemap_get_folio() is
316 bytes shorter than pagecache_get_page() was, but the new
pagecache_get_page() is 99 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h | 31 +++++---------
 mm/filemap.c            | 90 +++++++++++++++++++----------------------
 mm/folio-compat.c       | 11 +++++
 3 files changed, 63 insertions(+), 69 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 8eab3d8400d2..03125035077c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -378,8 +378,10 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping,
 #define FGP_HEAD		0x00000080
 #define FGP_ENTRY		0x00000100
 
-struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
-		int fgp_flags, gfp_t cache_gfp_mask);
+struct folio *filemap_get_folio(struct address_space *mapping, pgoff_t index,
+		int fgp_flags, gfp_t gfp);
+struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
+		int fgp_flags, gfp_t gfp);
 
 /**
  * find_get_page - find and get a page reference
@@ -397,6 +399,12 @@ static inline struct page *find_get_page(struct address_space *mapping,
 	return pagecache_get_page(mapping, offset, 0, 0);
 }
 
+static inline struct folio *find_get_folio(struct address_space *mapping,
+					pgoff_t index)
+{
+	return filemap_get_folio(mapping, index, 0, 0);
+}
+
 static inline struct page *find_get_page_flags(struct address_space *mapping,
 					pgoff_t offset, int fgp_flags)
 {
@@ -422,25 +430,6 @@ static inline struct page *find_lock_page(struct address_space *mapping,
 	return pagecache_get_page(mapping, index, FGP_LOCK, 0);
 }
 
-/**
- * find_lock_head - Locate, pin and lock a pagecache page.
- * @mapping: The address_space to search.
- * @index: The page index.
- *
- * Looks up the page cache entry at @mapping & @index.  If there is a
- * page cache page, its head page is returned locked and with an increased
- * refcount.
- *
- * Context: May sleep.
- * Return: A struct page which is !PageTail, or %NULL if there is no page
- * in the cache for this index.
- */
-static inline struct page *find_lock_head(struct address_space *mapping,
-					pgoff_t index)
-{
-	return pagecache_get_page(mapping, index, FGP_LOCK | FGP_HEAD, 0);
-}
-
 /**
  * find_or_create_page - locate or add a pagecache page
  * @mapping: the page's address_space
diff --git a/mm/filemap.c b/mm/filemap.c
index b3714dddb045..3d8715a6dd08 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1777,95 +1777,89 @@ static struct folio *mapping_get_entry(struct address_space *mapping,
 }
 
 /**
- * pagecache_get_page - Find and get a reference to a page.
+ * filemap_get_folio - Find and get a reference to a folio.
  * @mapping: The address_space to search.
  * @index: The page index.
- * @fgp_flags: %FGP flags modify how the page is returned.
- * @gfp_mask: Memory allocation flags to use if %FGP_CREAT is specified.
+ * @fgp_flags: %FGP flags modify how the folio is returned.
+ * @gfp: Memory allocation flags to use if %FGP_CREAT is specified.
  *
  * Looks up the page cache entry at @mapping & @index.
  *
  * @fgp_flags can be zero or more of these flags:
  *
- * * %FGP_ACCESSED - The page will be marked accessed.
- * * %FGP_LOCK - The page is returned locked.
- * * %FGP_HEAD - If the page is present and a THP, return the head page
- *   rather than the exact page specified by the index.
+ * * %FGP_ACCESSED - The folio will be marked accessed.
+ * * %FGP_LOCK - The folio is returned locked.
  * * %FGP_ENTRY - If there is a shadow / swap / DAX entry, return it
- *   instead of allocating a new page to replace it.
+ *   instead of allocating a new folio to replace it.
  * * %FGP_CREAT - If no page is present then a new page is allocated using
- *   @gfp_mask and added to the page cache and the VM's LRU list.
+ *   @gfp and added to the page cache and the VM's LRU list.
  *   The page is returned locked and with an increased refcount.
  * * %FGP_FOR_MMAP - The caller wants to do its own locking dance if the
  *   page is already in cache.  If the page was allocated, unlock it before
  *   returning so the caller can do the same dance.
- * * %FGP_WRITE - The page will be written
- * * %FGP_NOFS - __GFP_FS will get cleared in gfp mask
- * * %FGP_NOWAIT - Don't get blocked by page lock
+ * * %FGP_WRITE - The page will be written to by the caller.
+ * * %FGP_NOFS - __GFP_FS will get cleared in gfp.
+ * * %FGP_NOWAIT - Don't get blocked by page lock.
  *
  * If %FGP_LOCK or %FGP_CREAT are specified then the function may sleep even
  * if the %GFP flags specified for %FGP_CREAT are atomic.
  *
  * If there is a page cache page, it is returned with an increased refcount.
  *
- * Return: The found page or %NULL otherwise.
+ * Return: The found folio or %NULL otherwise.
  */
-struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
-		int fgp_flags, gfp_t gfp_mask)
+struct folio *filemap_get_folio(struct address_space *mapping, pgoff_t index,
+		int fgp_flags, gfp_t gfp)
 {
 	struct folio *folio;
-	struct page *page;
 
 repeat:
 	folio = mapping_get_entry(mapping, index);
-	page = &folio->page;
-	if (xa_is_value(page)) {
+	if (xa_is_value(folio)) {
 		if (fgp_flags & FGP_ENTRY)
-			return page;
-		page = NULL;
+			return folio;
+		folio = NULL;
 	}
-	if (!page)
+	if (!folio)
 		goto no_page;
 
 	if (fgp_flags & FGP_LOCK) {
 		if (fgp_flags & FGP_NOWAIT) {
-			if (!trylock_page(page)) {
-				put_page(page);
+			if (!folio_trylock(folio)) {
+				folio_put(folio);
 				return NULL;
 			}
 		} else {
-			lock_page(page);
+			folio_lock(folio);
 		}
 
 		/* Has the page been truncated? */
-		if (unlikely(page->mapping != mapping)) {
-			unlock_page(page);
-			put_page(page);
+		if (unlikely(folio->mapping != mapping)) {
+			folio_unlock(folio);
+			folio_put(folio);
 			goto repeat;
 		}
-		VM_BUG_ON_PAGE(!thp_contains(page, index), page);
+		VM_BUG_ON_FOLIO(!folio_contains(folio, index), folio);
 	}
 
 	if (fgp_flags & FGP_ACCESSED)
-		mark_page_accessed(page);
+		folio_mark_accessed(folio);
 	else if (fgp_flags & FGP_WRITE) {
 		/* Clear idle flag for buffer write */
-		if (page_is_idle(page))
-			clear_page_idle(page);
+		if (folio_idle(folio))
+			folio_clear_idle_flag(folio);
 	}
-	if (!(fgp_flags & FGP_HEAD))
-		page = find_subpage(page, index);
 
 no_page:
-	if (!page && (fgp_flags & FGP_CREAT)) {
+	if (!folio && (fgp_flags & FGP_CREAT)) {
 		int err;
 		if ((fgp_flags & FGP_WRITE) && mapping_can_writeback(mapping))
-			gfp_mask |= __GFP_WRITE;
+			gfp |= __GFP_WRITE;
 		if (fgp_flags & FGP_NOFS)
-			gfp_mask &= ~__GFP_FS;
+			gfp &= ~__GFP_FS;
 
-		page = __page_cache_alloc(gfp_mask);
-		if (!page)
+		folio = filemap_alloc_folio(gfp, 0);
+		if (!folio)
 			return NULL;
 
 		if (WARN_ON_ONCE(!(fgp_flags & (FGP_LOCK | FGP_FOR_MMAP))))
@@ -1873,27 +1867,27 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
 
 		/* Init accessed so avoid atomic mark_page_accessed later */
 		if (fgp_flags & FGP_ACCESSED)
-			__SetPageReferenced(page);
+			__folio_set_referenced_flag(folio);
 
-		err = add_to_page_cache_lru(page, mapping, index, gfp_mask);
+		err = folio_add_to_page_cache(folio, mapping, index, gfp);
 		if (unlikely(err)) {
-			put_page(page);
-			page = NULL;
+			folio_put(folio);
+			folio = NULL;
 			if (err == -EEXIST)
 				goto repeat;
 		}
 
 		/*
-		 * add_to_page_cache_lru locks the page, and for mmap we expect
-		 * an unlocked page.
+		 * folio_add_to_page_cache locks the page, and for mmap
+		 * we expect an unlocked page.
 		 */
-		if (page && (fgp_flags & FGP_FOR_MMAP))
-			unlock_page(page);
+		if (folio && (fgp_flags & FGP_FOR_MMAP))
+			folio_unlock(folio);
 	}
 
-	return page;
+	return folio;
 }
-EXPORT_SYMBOL(pagecache_get_page);
+EXPORT_SYMBOL(filemap_get_folio);
 
 static inline struct page *find_get_entry(struct xa_state *xas, pgoff_t max,
 		xa_mark_t mark)
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 7de3839ad072..df0038c65da9 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -85,3 +85,14 @@ void lru_cache_add(struct page *page)
 	folio_add_lru(page_folio(page));
 }
 EXPORT_SYMBOL(lru_cache_add);
+
+struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
+		int fgp_flags, gfp_t gfp)
+{
+	struct folio *folio = filemap_get_folio(mapping, index, fgp_flags, gfp);
+
+	if ((fgp_flags & FGP_HEAD) || !folio || xa_is_value(folio))
+		return &folio->page;
+	return folio_file_page(folio, index);
+}
+EXPORT_SYMBOL(pagecache_get_page);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 81/96] mm/filemap: Add filemap_get_stable_folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (79 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 80/96] mm/filemap: Add filemap_get_folio and find_get_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 82/96] iomap: Convert to_iomap_page to take a folio Matthew Wilcox (Oracle)
                   ` (14 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is the folio equivalent of grab_cache_page_write_begin(), which is
reimplemented as a wrapper around it.

Kernel grows by 88 bytes.  filemap_get_stable_folio() is the same
size as the old grab_cache_page_write_begin(), but the wrapper is
80 bytes, plus the 8 bytes for the EXPORT_SYMBOL.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/pagemap.h |  2 ++
 mm/filemap.c            | 20 ++++++++++----------
 mm/folio-compat.c       |  9 +++++++++
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 03125035077c..726cfc61b9e5 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -589,6 +589,8 @@ static inline unsigned find_get_pages_tag(struct address_space *mapping,
 					nr_pages, pages);
 }
 
+struct folio *filemap_get_stable_folio(struct address_space *mapping,
+		pgoff_t index, unsigned flags);
 struct page *grab_cache_page_write_begin(struct address_space *mapping,
 			pgoff_t index, unsigned flags);
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 3d8715a6dd08..8399deb678f6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3574,26 +3574,26 @@ generic_file_direct_write(struct kiocb *iocb, struct iov_iter *from)
 EXPORT_SYMBOL(generic_file_direct_write);
 
 /*
- * Find or create a page at the given pagecache position. Return the locked
- * page. This function is specifically for buffered writes.
+ * Find or create a folio at the given pagecache position. Return the locked
+ * folio once there are no pending writes.
  */
-struct page *grab_cache_page_write_begin(struct address_space *mapping,
+struct folio *filemap_get_stable_folio(struct address_space *mapping,
 					pgoff_t index, unsigned flags)
 {
-	struct page *page;
-	int fgp_flags = FGP_LOCK|FGP_WRITE|FGP_CREAT;
+	struct folio *folio;
+	int fgp_flags = FGP_LOCK | FGP_WRITE | FGP_CREAT;
 
 	if (flags & AOP_FLAG_NOFS)
 		fgp_flags |= FGP_NOFS;
 
-	page = pagecache_get_page(mapping, index, fgp_flags,
+	folio = filemap_get_folio(mapping, index, fgp_flags,
 			mapping_gfp_mask(mapping));
-	if (page)
-		wait_for_stable_page(page);
+	if (folio)
+		folio_wait_stable(folio);
 
-	return page;
+	return folio;
 }
-EXPORT_SYMBOL(grab_cache_page_write_begin);
+EXPORT_SYMBOL(filemap_get_stable_folio);
 
 ssize_t generic_perform_write(struct file *file,
 				struct iov_iter *i, loff_t pos)
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index df0038c65da9..940fe515a3a2 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -96,3 +96,12 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
 	return folio_file_page(folio, index);
 }
 EXPORT_SYMBOL(pagecache_get_page);
+
+struct page *grab_cache_page_write_begin(struct address_space *mapping,
+					pgoff_t index, unsigned flags)
+{
+	struct folio *folio = filemap_get_stable_folio(mapping, index, flags);
+
+	return folio ? folio_file_page(folio, index) : NULL;
+}
+EXPORT_SYMBOL(grab_cache_page_write_begin);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 82/96] iomap: Convert to_iomap_page to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (80 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 81/96] mm/filemap: Add filemap_get_stable_folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 83/96] iomap: Convert iomap_page_create " Matthew Wilcox (Oracle)
                   ` (13 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

The big comment about only using a head page can go away now that
it takes a folio argument.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index f2cd2034a87b..466a3de63497 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -22,8 +22,8 @@
 #include "../internal.h"
 
 /*
- * Structure allocated for each page or THP when block size < page size
- * to track sub-page uptodate status and I/O completions.
+ * Structure allocated for each folio when block size < folio size
+ * to track sub-folio uptodate status and I/O completions.
  */
 struct iomap_page {
 	atomic_t		read_bytes_pending;
@@ -32,17 +32,10 @@ struct iomap_page {
 	unsigned long		uptodate[];
 };
 
-static inline struct iomap_page *to_iomap_page(struct page *page)
+static inline struct iomap_page *to_iomap_page(struct folio *folio)
 {
-	/*
-	 * per-block data is stored in the head page.  Callers should
-	 * not be dealing with tail pages (and if they are, they can
-	 * call thp_head() first.
-	 */
-	VM_BUG_ON_PGFLAGS(PageTail(page), page);
-
-	if (page_has_private(page))
-		return (struct iomap_page *)page_private(page);
+	if (folio_private(folio))
+		return folio_get_private(folio);
 	return NULL;
 }
 
@@ -51,7 +44,8 @@ static struct bio_set iomap_ioend_bioset;
 static struct iomap_page *
 iomap_page_create(struct inode *inode, struct page *page)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	unsigned int nr_blocks = i_blocks_per_page(inode, page);
 
 	if (iop || nr_blocks <= 1)
@@ -144,7 +138,8 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 static void
 iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	struct inode *inode = page->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
@@ -173,7 +168,8 @@ static void
 iomap_read_page_end_io(struct bio_vec *bvec, int error)
 {
 	struct page *page = bvec->bv_page;
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (unlikely(error)) {
 		ClearPageUptodate(page);
@@ -433,7 +429,8 @@ int
 iomap_is_partially_uptodate(struct page *page, unsigned long from,
 		unsigned long count)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	struct inode *inode = page->mapping->host;
 	unsigned len, first, last;
 	unsigned i;
@@ -1041,7 +1038,8 @@ static void
 iomap_finish_page_writeback(struct inode *inode, struct page *page,
 		int error, unsigned int len)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (error) {
 		SetPageError(page);
@@ -1334,7 +1332,8 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
 		struct page *page, u64 end_offset)
 {
-	struct iomap_page *iop = to_iomap_page(page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
 	u64 file_offset; /* file offset of page */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 83/96] iomap: Convert iomap_page_create to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (81 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 82/96] iomap: Convert to_iomap_page to take a folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 84/96] iomap: Convert iomap_page_release " Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This function already assumed it was being passed a head page, so
just formalise that.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 466a3de63497..94d33b0a96ff 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -42,11 +42,10 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio)
 static struct bio_set iomap_ioend_bioset;
 
 static struct iomap_page *
-iomap_page_create(struct inode *inode, struct page *page)
+iomap_page_create(struct inode *inode, struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
-	unsigned int nr_blocks = i_blocks_per_page(inode, page);
+	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
 
 	if (iop || nr_blocks <= 1)
 		return iop;
@@ -54,9 +53,9 @@ iomap_page_create(struct inode *inode, struct page *page)
 	iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
 			GFP_NOFS | __GFP_NOFAIL);
 	spin_lock_init(&iop->uptodate_lock);
-	if (PageUptodate(page))
+	if (folio_uptodate(folio))
 		bitmap_fill(iop->uptodate, nr_blocks);
-	attach_page_private(page, iop);
+	folio_attach_private(folio, iop);
 	return iop;
 }
 
@@ -235,7 +234,8 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 {
 	struct iomap_readpage_ctx *ctx = data;
 	struct page *page = ctx->cur_page;
-	struct iomap_page *iop = iomap_page_create(inode, page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = iomap_page_create(inode, folio);
 	bool same_page = false, is_contig = false;
 	loff_t orig_pos = pos;
 	unsigned poff, plen;
@@ -547,7 +547,8 @@ static int
 __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
 		struct page *page, struct iomap *srcmap)
 {
-	struct iomap_page *iop = iomap_page_create(inode, page);
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = iomap_page_create(inode, folio);
 	loff_t block_size = i_blocksize(inode);
 	loff_t block_start = round_down(pos, block_size);
 	loff_t block_end = round_up(pos + len, block_size);
@@ -985,6 +986,7 @@ iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length,
 		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct page *page = data;
+	struct folio *folio = page_folio(page);
 	int ret;
 
 	if (iomap->flags & IOMAP_F_BUFFER_HEAD) {
@@ -994,7 +996,7 @@ iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length,
 		block_commit_write(page, 0, length);
 	} else {
 		WARN_ON_ONCE(!PageUptodate(page));
-		iomap_page_create(inode, page);
+		iomap_page_create(inode, folio);
 		set_page_dirty(page);
 	}
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 84/96] iomap: Convert iomap_page_release to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (82 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 83/96] iomap: Convert iomap_page_create " Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 85/96] iomap: Convert iomap_releasepage to use " Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

iomap_page_release() was also assuming that it was being passed a
head page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 94d33b0a96ff..9f2d0df0837c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -59,18 +59,18 @@ iomap_page_create(struct inode *inode, struct folio *folio)
 	return iop;
 }
 
-static void
-iomap_page_release(struct page *page)
+static void iomap_page_release(struct folio *folio)
 {
-	struct iomap_page *iop = detach_page_private(page);
-	unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, page);
+	struct iomap_page *iop = folio_detach_private(folio);
+	unsigned int nr_blocks = i_blocks_per_folio(folio->mapping->host,
+							folio);
 
 	if (!iop)
 		return;
 	WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending));
 	WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
 	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
-			PageUptodate(page));
+			folio_uptodate(folio));
 	kfree(iop);
 }
 
@@ -456,6 +456,8 @@ EXPORT_SYMBOL_GPL(iomap_is_partially_uptodate);
 int
 iomap_releasepage(struct page *page, gfp_t gfp_mask)
 {
+	struct folio *folio = page_folio(page);
+
 	trace_iomap_releasepage(page->mapping->host, page_offset(page),
 			PAGE_SIZE);
 
@@ -466,7 +468,7 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
 	 */
 	if (PageDirty(page) || PageWriteback(page))
 		return 0;
-	iomap_page_release(page);
+	iomap_page_release(folio);
 	return 1;
 }
 EXPORT_SYMBOL_GPL(iomap_releasepage);
@@ -474,6 +476,8 @@ EXPORT_SYMBOL_GPL(iomap_releasepage);
 void
 iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
 {
+	struct folio *folio = page_folio(page);
+
 	trace_iomap_invalidatepage(page->mapping->host, offset, len);
 
 	/*
@@ -483,7 +487,7 @@ iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
 	if (offset == 0 && len == PAGE_SIZE) {
 		WARN_ON_ONCE(PageWriteback(page));
 		cancel_dirty_page(page);
-		iomap_page_release(page);
+		iomap_page_release(folio);
 	}
 }
 EXPORT_SYMBOL_GPL(iomap_invalidatepage);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 85/96] iomap: Convert iomap_releasepage to use a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (83 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 84/96] iomap: Convert iomap_page_release " Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 86/96] iomap: Convert iomap_invalidatepage " Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is an address_space operation, so its argument must remain as a
struct page, but we can use a folio internally.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 9f2d0df0837c..33226a32e5c5 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -458,15 +458,15 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
 {
 	struct folio *folio = page_folio(page);
 
-	trace_iomap_releasepage(page->mapping->host, page_offset(page),
-			PAGE_SIZE);
+	trace_iomap_releasepage(folio->mapping->host, folio_offset(folio),
+			folio_size(folio));
 
 	/*
 	 * mm accommodates an old ext3 case where clean pages might not have had
 	 * the dirty bit cleared. Thus, it can send actual dirty pages to
 	 * ->releasepage() via shrink_active_list(), skip those here.
 	 */
-	if (PageDirty(page) || PageWriteback(page))
+	if (folio_dirty(folio) || folio_writeback(folio))
 		return 0;
 	iomap_page_release(folio);
 	return 1;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 86/96] iomap: Convert iomap_invalidatepage to use a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (84 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 85/96] iomap: Convert iomap_releasepage to use " Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 87/96] iomap: Pass the iomap_page into iomap_set_range_uptodate Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

This is an address_space operation, so its argument must remain as a
struct page, but we can use a folio internally.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 33226a32e5c5..c36e16b87c45 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -478,15 +478,15 @@ iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
 {
 	struct folio *folio = page_folio(page);
 
-	trace_iomap_invalidatepage(page->mapping->host, offset, len);
+	trace_iomap_invalidatepage(folio->mapping->host, offset, len);
 
 	/*
 	 * If we are invalidating the entire page, clear the dirty state from it
 	 * and release it to avoid unnecessary buildup of the LRU.
 	 */
 	if (offset == 0 && len == PAGE_SIZE) {
-		WARN_ON_ONCE(PageWriteback(page));
-		cancel_dirty_page(page);
+		WARN_ON_ONCE(folio_writeback(folio));
+		folio_cancel_dirty(folio);
 		iomap_page_release(folio);
 	}
 }
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 87/96] iomap: Pass the iomap_page into iomap_set_range_uptodate
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (85 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 86/96] iomap: Convert iomap_invalidatepage " Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 88/96] iomap: Use folio offsets instead of page offsets Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

All but one caller already has the iomap_page, and we can avoid getting
it again.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c36e16b87c45..f40a22a696c6 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -134,11 +134,9 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 	*lenp = plen;
 }
 
-static void
-iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+static void iomap_iop_set_range_uptodate(struct page *page,
+		struct iomap_page *iop, unsigned off, unsigned len)
 {
-	struct folio *folio = page_folio(page);
-	struct iomap_page *iop = to_iomap_page(folio);
 	struct inode *inode = page->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
@@ -151,14 +149,14 @@ iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
 }
 
-static void
-iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+static void iomap_set_range_uptodate(struct page *page,
+		struct iomap_page *iop, unsigned off, unsigned len)
 {
 	if (PageError(page))
 		return;
 
-	if (page_has_private(page))
-		iomap_iop_set_range_uptodate(page, off, len);
+	if (iop)
+		iomap_iop_set_range_uptodate(page, iop, off, len);
 	else
 		SetPageUptodate(page);
 }
@@ -174,7 +172,8 @@ iomap_read_page_end_io(struct bio_vec *bvec, int error)
 		ClearPageUptodate(page);
 		SetPageError(page);
 	} else {
-		iomap_set_range_uptodate(page, bvec->bv_offset, bvec->bv_len);
+		iomap_set_range_uptodate(page, iop, bvec->bv_offset,
+						bvec->bv_len);
 	}
 
 	if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
@@ -254,7 +253,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 	if (iomap_block_needs_zeroing(inode, iomap, pos)) {
 		zero_user(page, poff, plen);
-		iomap_set_range_uptodate(page, poff, plen);
+		iomap_set_range_uptodate(page, iop, poff, plen);
 		goto done;
 	}
 
@@ -583,7 +582,7 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
 			if (status)
 				return status;
 		}
-		iomap_set_range_uptodate(page, poff, plen);
+		iomap_set_range_uptodate(page, iop, poff, plen);
 	} while ((block_start += plen) < block_end);
 
 	return 0;
@@ -670,6 +669,8 @@ EXPORT_SYMBOL_GPL(iomap_set_page_dirty);
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct page *page)
 {
+	struct folio *folio = page_folio(page);
+	struct iomap_page *iop = to_iomap_page(folio);
 	flush_dcache_page(page);
 
 	/*
@@ -685,7 +686,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 */
 	if (unlikely(copied < len && !PageUptodate(page)))
 		return 0;
-	iomap_set_range_uptodate(page, offset_in_page(pos), len);
+	iomap_set_range_uptodate(page, iop, offset_in_page(pos), len);
 	iomap_set_page_dirty(page);
 	return copied;
 }
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 88/96] iomap: Use folio offsets instead of page offsets
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (86 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 87/96] iomap: Pass the iomap_page into iomap_set_range_uptodate Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 89/96] iomap: Convert bio completions to use folios Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Pass a folio around instead of the page, and make sure the offset
is relative to the start of the folio instead of the start of a page.
Also use size_t for offset & length to make it clear that these are byte
counts, and to support >2GB folios in the future.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 85 ++++++++++++++++++++++--------------------
 1 file changed, 44 insertions(+), 41 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index f40a22a696c6..b61fc7da3d65 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -75,18 +75,18 @@ static void iomap_page_release(struct folio *folio)
 }
 
 /*
- * Calculate the range inside the page that we actually need to read.
+ * Calculate the range inside the folio that we actually need to read.
  */
-static void
-iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
-		loff_t *pos, loff_t length, unsigned *offp, unsigned *lenp)
+static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
+		loff_t *pos, loff_t length, size_t *offp, size_t *lenp)
 {
+	struct iomap_page *iop = to_iomap_page(folio);
 	loff_t orig_pos = *pos;
 	loff_t isize = i_size_read(inode);
 	unsigned block_bits = inode->i_blkbits;
 	unsigned block_size = (1 << block_bits);
-	unsigned poff = offset_in_page(*pos);
-	unsigned plen = min_t(loff_t, PAGE_SIZE - poff, length);
+	size_t poff = offset_in_folio(folio, *pos);
+	size_t plen = min_t(loff_t, folio_size(folio) - poff, length);
 	unsigned first = poff >> block_bits;
 	unsigned last = (poff + plen - 1) >> block_bits;
 
@@ -124,7 +124,7 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 	 * page cache for blocks that are entirely outside of i_size.
 	 */
 	if (orig_pos <= isize && orig_pos + length > isize) {
-		unsigned end = offset_in_page(isize - 1) >> block_bits;
+		unsigned end = offset_in_folio(folio, isize - 1) >> block_bits;
 
 		if (first <= end && last > end)
 			plen -= (last - end) * block_size;
@@ -134,31 +134,31 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 	*lenp = plen;
 }
 
-static void iomap_iop_set_range_uptodate(struct page *page,
-		struct iomap_page *iop, unsigned off, unsigned len)
+static void iomap_iop_set_range_uptodate(struct folio *folio,
+		struct iomap_page *iop, size_t off, size_t len)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
 	unsigned long flags;
 
 	spin_lock_irqsave(&iop->uptodate_lock, flags);
 	bitmap_set(iop->uptodate, first, last - first + 1);
-	if (bitmap_full(iop->uptodate, i_blocks_per_page(inode, page)))
-		SetPageUptodate(page);
+	if (bitmap_full(iop->uptodate, i_blocks_per_folio(inode, folio)))
+		folio_mark_uptodate(folio);
 	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
 }
 
-static void iomap_set_range_uptodate(struct page *page,
-		struct iomap_page *iop, unsigned off, unsigned len)
+static void iomap_set_range_uptodate(struct folio *folio,
+		struct iomap_page *iop, size_t off, size_t len)
 {
-	if (PageError(page))
+	if (folio_error(folio))
 		return;
 
 	if (iop)
-		iomap_iop_set_range_uptodate(page, iop, off, len);
+		iomap_iop_set_range_uptodate(folio, iop, off, len);
 	else
-		SetPageUptodate(page);
+		folio_mark_uptodate(folio);
 }
 
 static void
@@ -169,15 +169,17 @@ iomap_read_page_end_io(struct bio_vec *bvec, int error)
 	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (unlikely(error)) {
-		ClearPageUptodate(page);
-		SetPageError(page);
+		folio_clear_uptodate_flag(folio);
+		folio_set_error_flag(folio);
 	} else {
-		iomap_set_range_uptodate(page, iop, bvec->bv_offset,
-						bvec->bv_len);
+		size_t off = (page - &folio->page) * PAGE_SIZE +
+				bvec->bv_offset;
+
+		iomap_set_range_uptodate(folio, iop, off, bvec->bv_len);
 	}
 
 	if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
-		unlock_page(page);
+		folio_unlock(folio);
 }
 
 static void
@@ -237,7 +239,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	struct iomap_page *iop = iomap_page_create(inode, folio);
 	bool same_page = false, is_contig = false;
 	loff_t orig_pos = pos;
-	unsigned poff, plen;
+	size_t poff, plen;
 	sector_t sector;
 
 	if (iomap->type == IOMAP_INLINE) {
@@ -246,14 +248,14 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		return PAGE_SIZE;
 	}
 
-	/* zero post-eof blocks as the page may be mapped */
-	iomap_adjust_read_range(inode, iop, &pos, length, &poff, &plen);
+	/* zero post-eof blocks as the folio may be mapped */
+	iomap_adjust_read_range(inode, folio, &pos, length, &poff, &plen);
 	if (plen == 0)
 		goto done;
 
 	if (iomap_block_needs_zeroing(inode, iomap, pos)) {
-		zero_user(page, poff, plen);
-		iomap_set_range_uptodate(page, iop, poff, plen);
+		zero_user(&folio->page, poff, plen);
+		iomap_set_range_uptodate(folio, iop, poff, plen);
 		goto done;
 	}
 
@@ -264,7 +266,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	/* Try to merge into a previous segment if we can */
 	sector = iomap_sector(iomap, pos);
 	if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
-		if (__bio_try_merge_page(ctx->bio, page, plen, poff,
+		if (__bio_try_merge_page(ctx->bio, &folio->page, plen, poff,
 				&same_page))
 			goto done;
 		is_contig = true;
@@ -296,7 +298,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		ctx->bio->bi_end_io = iomap_read_end_io;
 	}
 
-	bio_add_page(ctx->bio, page, plen, poff);
+	bio_add_folio(ctx->bio, folio, plen, poff);
 done:
 	/*
 	 * Move the caller beyond our range so that it keeps making progress.
@@ -531,9 +533,8 @@ iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 		truncate_pagecache_range(inode, max(pos, i_size), pos + len);
 }
 
-static int
-iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff,
-		unsigned plen, struct iomap *iomap)
+static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
+		size_t poff, size_t plen, struct iomap *iomap)
 {
 	struct bio_vec bvec;
 	struct bio bio;
@@ -542,7 +543,7 @@ iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff,
 	bio.bi_opf = REQ_OP_READ;
 	bio.bi_iter.bi_sector = iomap_sector(iomap, block_start);
 	bio_set_dev(&bio, iomap->bdev);
-	__bio_add_page(&bio, page, plen, poff);
+	bio_add_folio(&bio, folio, plen, poff);
 	return submit_bio_wait(&bio);
 }
 
@@ -555,14 +556,15 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
 	loff_t block_size = i_blocksize(inode);
 	loff_t block_start = round_down(pos, block_size);
 	loff_t block_end = round_up(pos + len, block_size);
-	unsigned from = offset_in_page(pos), to = from + len, poff, plen;
+	size_t from = offset_in_folio(folio, pos), to = from + len;
+	size_t poff, plen;
 
-	if (PageUptodate(page))
+	if (folio_uptodate(folio))
 		return 0;
-	ClearPageError(page);
+	folio_clear_error_flag(folio);
 
 	do {
-		iomap_adjust_read_range(inode, iop, &block_start,
+		iomap_adjust_read_range(inode, folio, &block_start,
 				block_end - block_start, &poff, &plen);
 		if (plen == 0)
 			break;
@@ -575,14 +577,15 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
 		if (iomap_block_needs_zeroing(inode, srcmap, block_start)) {
 			if (WARN_ON_ONCE(flags & IOMAP_WRITE_F_UNSHARE))
 				return -EIO;
-			zero_user_segments(page, poff, from, to, poff + plen);
+			zero_user_segments(&folio->page, poff, from, to,
+						poff + plen);
 		} else {
-			int status = iomap_read_page_sync(block_start, page,
+			int status = iomap_read_folio_sync(block_start, folio,
 					poff, plen, srcmap);
 			if (status)
 				return status;
 		}
-		iomap_set_range_uptodate(page, iop, poff, plen);
+		iomap_set_range_uptodate(folio, iop, poff, plen);
 	} while ((block_start += plen) < block_end);
 
 	return 0;
@@ -686,7 +689,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 */
 	if (unlikely(copied < len && !PageUptodate(page)))
 		return 0;
-	iomap_set_range_uptodate(page, iop, offset_in_page(pos), len);
+	iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
 	iomap_set_page_dirty(page);
 	return copied;
 }
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 89/96] iomap: Convert bio completions to use folios
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (87 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 88/96] iomap: Use folio offsets instead of page offsets Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 90/96] iomap: Convert readahead and readpage to use a folio Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Use bio_for_each_folio() to iterate over each folio in the bio
instead of iterating over each page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 46 +++++++++++++++++-------------------------
 1 file changed, 18 insertions(+), 28 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index b61fc7da3d65..354d91bc92c5 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -161,36 +161,29 @@ static void iomap_set_range_uptodate(struct folio *folio,
 		folio_mark_uptodate(folio);
 }
 
-static void
-iomap_read_page_end_io(struct bio_vec *bvec, int error)
+static void iomap_finish_folio_read(struct folio *folio, size_t offset,
+		size_t len, int error)
 {
-	struct page *page = bvec->bv_page;
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (unlikely(error)) {
 		folio_clear_uptodate_flag(folio);
 		folio_set_error_flag(folio);
 	} else {
-		size_t off = (page - &folio->page) * PAGE_SIZE +
-				bvec->bv_offset;
-
-		iomap_set_range_uptodate(folio, iop, off, bvec->bv_len);
+		iomap_set_range_uptodate(folio, iop, offset, len);
 	}
 
-	if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
+	if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending))
 		folio_unlock(folio);
 }
 
-static void
-iomap_read_end_io(struct bio *bio)
+static void iomap_read_end_io(struct bio *bio)
 {
 	int error = blk_status_to_errno(bio->bi_status);
-	struct bio_vec *bvec;
-	struct bvec_iter_all iter_all;
+	struct folio_iter fi;
 
-	bio_for_each_segment_all(bvec, bio, iter_all)
-		iomap_read_page_end_io(bvec, error);
+	bio_for_each_folio_all(fi, bio)
+		iomap_finish_folio_read(fi.folio, fi.offset, fi.length, error);
 	bio_put(bio);
 }
 
@@ -1044,23 +1037,21 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
 }
 EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
 
-static void
-iomap_finish_page_writeback(struct inode *inode, struct page *page,
-		int error, unsigned int len)
+static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
+		size_t len, int error)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
 
 	if (error) {
-		SetPageError(page);
+		folio_set_error_flag(folio);
 		mapping_set_error(inode->i_mapping, -EIO);
 	}
 
-	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
+	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !iop);
 	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) <= 0);
 
 	if (!iop || atomic_sub_and_test(len, &iop->write_bytes_pending))
-		end_page_writeback(page);
+		folio_end_writeback(folio);
 }
 
 /*
@@ -1079,8 +1070,7 @@ iomap_finish_ioend(struct iomap_ioend *ioend, int error)
 	bool quiet = bio_flagged(bio, BIO_QUIET);
 
 	for (bio = &ioend->io_inline_bio; bio; bio = next) {
-		struct bio_vec *bv;
-		struct bvec_iter_all iter_all;
+		struct folio_iter fi;
 
 		/*
 		 * For the last bio, bi_private points to the ioend, so we
@@ -1091,10 +1081,10 @@ iomap_finish_ioend(struct iomap_ioend *ioend, int error)
 		else
 			next = bio->bi_private;
 
-		/* walk each page on bio, ending page IO on them */
-		bio_for_each_segment_all(bv, bio, iter_all)
-			iomap_finish_page_writeback(inode, bv->bv_page, error,
-					bv->bv_len);
+		/* walk all folios in bio, ending page IO on them */
+		bio_for_each_folio_all(fi, bio)
+			iomap_finish_folio_write(inode, fi.folio, fi.length,
+					error);
 		bio_put(bio);
 	}
 	/* The ioend has been freed by bio_put() */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 90/96] iomap: Convert readahead and readpage to use a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (88 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 89/96] iomap: Convert bio completions to use folios Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 91/96] iomap: Convert iomap_page_mkwrite " Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Handle folios of arbitrary size instead of working in PAGE_SIZE units.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 60 +++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 354d91bc92c5..1962967dc4b0 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -188,8 +188,8 @@ static void iomap_read_end_io(struct bio *bio)
 }
 
 struct iomap_readpage_ctx {
-	struct page		*cur_page;
-	bool			cur_page_in_bio;
+	struct folio		*cur_folio;
+	bool			cur_folio_in_bio;
 	struct bio		*bio;
 	struct readahead_control *rac;
 };
@@ -227,8 +227,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iomap_readpage_ctx *ctx = data;
-	struct page *page = ctx->cur_page;
-	struct folio *folio = page_folio(page);
+	struct folio *folio = ctx->cur_folio;
 	struct iomap_page *iop = iomap_page_create(inode, folio);
 	bool same_page = false, is_contig = false;
 	loff_t orig_pos = pos;
@@ -237,7 +236,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 	if (iomap->type == IOMAP_INLINE) {
 		WARN_ON_ONCE(pos);
-		iomap_read_inline_data(inode, page, iomap);
+		iomap_read_inline_data(inode, &folio->page, iomap);
 		return PAGE_SIZE;
 	}
 
@@ -252,7 +251,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		goto done;
 	}
 
-	ctx->cur_page_in_bio = true;
+	ctx->cur_folio_in_bio = true;
 	if (iop)
 		atomic_add(plen, &iop->read_bytes_pending);
 
@@ -266,7 +265,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	}
 
 	if (!is_contig || bio_full(ctx->bio, plen)) {
-		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
+		gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL);
 		gfp_t orig_gfp = gfp;
 		unsigned int nr_vecs = DIV_ROUND_UP(length, PAGE_SIZE);
 
@@ -305,30 +304,32 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 int
 iomap_readpage(struct page *page, const struct iomap_ops *ops)
 {
-	struct iomap_readpage_ctx ctx = { .cur_page = page };
-	struct inode *inode = page->mapping->host;
-	unsigned poff;
+	struct folio *folio = page_folio(page);
+	struct iomap_readpage_ctx ctx = { .cur_folio = folio };
+	struct inode *inode = folio->mapping->host;
+	size_t poff;
 	loff_t ret;
+	size_t len = folio_size(folio);
 
-	trace_iomap_readpage(page->mapping->host, 1);
+	trace_iomap_readpage(inode, 1);
 
-	for (poff = 0; poff < PAGE_SIZE; poff += ret) {
-		ret = iomap_apply(inode, page_offset(page) + poff,
-				PAGE_SIZE - poff, 0, ops, &ctx,
+	for (poff = 0; poff < len; poff += ret) {
+		ret = iomap_apply(inode, folio_offset(folio) + poff,
+				len - poff, 0, ops, &ctx,
 				iomap_readpage_actor);
 		if (ret <= 0) {
 			WARN_ON_ONCE(ret == 0);
-			SetPageError(page);
+			folio_set_error_flag(folio);
 			break;
 		}
 	}
 
 	if (ctx.bio) {
 		submit_bio(ctx.bio);
-		WARN_ON_ONCE(!ctx.cur_page_in_bio);
+		WARN_ON_ONCE(!ctx.cur_folio_in_bio);
 	} else {
-		WARN_ON_ONCE(ctx.cur_page_in_bio);
-		unlock_page(page);
+		WARN_ON_ONCE(ctx.cur_folio_in_bio);
+		folio_unlock(folio);
 	}
 
 	/*
@@ -348,15 +349,15 @@ iomap_readahead_actor(struct inode *inode, loff_t pos, loff_t length,
 	loff_t done, ret;
 
 	for (done = 0; done < length; done += ret) {
-		if (ctx->cur_page && offset_in_page(pos + done) == 0) {
-			if (!ctx->cur_page_in_bio)
-				unlock_page(ctx->cur_page);
-			put_page(ctx->cur_page);
-			ctx->cur_page = NULL;
+		if (ctx->cur_folio &&
+		    offset_in_folio(ctx->cur_folio, pos + done) == 0) {
+			if (!ctx->cur_folio_in_bio)
+				folio_unlock(ctx->cur_folio);
+			ctx->cur_folio = NULL;
 		}
-		if (!ctx->cur_page) {
-			ctx->cur_page = readahead_page(ctx->rac);
-			ctx->cur_page_in_bio = false;
+		if (!ctx->cur_folio) {
+			ctx->cur_folio = readahead_folio(ctx->rac);
+			ctx->cur_folio_in_bio = false;
 		}
 		ret = iomap_readpage_actor(inode, pos + done, length - done,
 				ctx, iomap, srcmap);
@@ -404,10 +405,9 @@ void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops)
 
 	if (ctx.bio)
 		submit_bio(ctx.bio);
-	if (ctx.cur_page) {
-		if (!ctx.cur_page_in_bio)
-			unlock_page(ctx.cur_page);
-		put_page(ctx.cur_page);
+	if (ctx.cur_folio) {
+		if (!ctx.cur_folio_in_bio)
+			folio_unlock(ctx.cur_folio);
 	}
 }
 EXPORT_SYMBOL_GPL(iomap_readahead);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 91/96] iomap: Convert iomap_page_mkwrite to use a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (89 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 90/96] iomap: Convert readahead and readpage to use a folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 92/96] iomap: Convert iomap_write_begin and iomap_write_end to folios Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

If we write to any page in a folio, we have to mark the entire
folio as dirty, and potentially COW the entire folio, because it'll
all get written back as one unit.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 67 +++++++++++++-----------------------------
 include/linux/iomap.h  |  2 +-
 mm/page-writeback.c    |  1 -
 3 files changed, 22 insertions(+), 48 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 1962967dc4b0..c52c266d4abe 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -637,31 +637,6 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 	return status;
 }
 
-int
-iomap_set_page_dirty(struct page *page)
-{
-	struct address_space *mapping = page_mapping(page);
-	int newly_dirty;
-
-	if (unlikely(!mapping))
-		return !TestSetPageDirty(page);
-
-	/*
-	 * Lock out page's memcg migration to keep PageDirty
-	 * synchronized with per-memcg dirty page counters.
-	 */
-	lock_page_memcg(page);
-	newly_dirty = !TestSetPageDirty(page);
-	if (newly_dirty)
-		__set_page_dirty(page, mapping, 0);
-	unlock_page_memcg(page);
-
-	if (newly_dirty)
-		__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
-	return newly_dirty;
-}
-EXPORT_SYMBOL_GPL(iomap_set_page_dirty);
-
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct page *page)
 {
@@ -982,23 +957,23 @@ iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
 }
 EXPORT_SYMBOL_GPL(iomap_truncate_page);
 
-static loff_t
-iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap, struct iomap *srcmap)
+static loff_t iomap_folio_mkwrite_actor(struct inode *inode, loff_t pos,
+		loff_t length, void *data, struct iomap *iomap,
+		struct iomap *srcmap)
 {
-	struct page *page = data;
-	struct folio *folio = page_folio(page);
+	struct folio *folio = data;
 	int ret;
 
 	if (iomap->flags & IOMAP_F_BUFFER_HEAD) {
-		ret = __block_write_begin_int(page, pos, length, NULL, iomap);
+		ret = __block_write_begin_int(&folio->page, pos, length, NULL,
+						iomap);
 		if (ret)
 			return ret;
-		block_commit_write(page, 0, length);
+		block_commit_write(&folio->page, 0, length);
 	} else {
-		WARN_ON_ONCE(!PageUptodate(page));
+		WARN_ON_ONCE(!folio_uptodate(folio));
 		iomap_page_create(inode, folio);
-		set_page_dirty(page);
+		folio_mark_dirty(folio);
 	}
 
 	return length;
@@ -1006,33 +981,33 @@ iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length,
 
 vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
 {
-	struct page *page = vmf->page;
+	struct folio *folio = page_folio(vmf->page);
 	struct inode *inode = file_inode(vmf->vma->vm_file);
-	unsigned long length;
-	loff_t offset;
+	size_t length;
+	loff_t pos;
 	ssize_t ret;
 
-	lock_page(page);
-	ret = page_mkwrite_check_truncate(page, inode);
+	folio_lock(folio);
+	ret = folio_mkwrite_check_truncate(folio, inode);
 	if (ret < 0)
 		goto out_unlock;
 	length = ret;
 
-	offset = page_offset(page);
+	pos = folio_offset(folio);
 	while (length > 0) {
-		ret = iomap_apply(inode, offset, length,
-				IOMAP_WRITE | IOMAP_FAULT, ops, page,
-				iomap_page_mkwrite_actor);
+		ret = iomap_apply(inode, pos, length,
+				IOMAP_WRITE | IOMAP_FAULT, ops, folio,
+				iomap_folio_mkwrite_actor);
 		if (unlikely(ret <= 0))
 			goto out_unlock;
-		offset += ret;
+		pos += ret;
 		length -= ret;
 	}
 
-	wait_for_stable_page(page);
+	folio_wait_stable(folio);
 	return VM_FAULT_LOCKED;
 out_unlock:
-	unlock_page(page);
+	folio_unlock(folio);
 	return block_page_mkwrite_return(ret);
 }
 EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index c87d0cb0de6d..bd02a07ddd6e 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -159,7 +159,7 @@ ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
 		const struct iomap_ops *ops);
 int iomap_readpage(struct page *page, const struct iomap_ops *ops);
 void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
-int iomap_set_page_dirty(struct page *page);
+#define iomap_set_page_dirty	__set_page_dirty_nobuffers
 int iomap_is_partially_uptodate(struct page *page, unsigned long from,
 		unsigned long count);
 int iomap_releasepage(struct page *page, gfp_t gfp_mask);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index ac86f3cbba1c..1b5384cf51e1 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2488,7 +2488,6 @@ void __folio_mark_dirty(struct folio *folio, struct address_space *mapping,
 	}
 	xa_unlock_irqrestore(&mapping->i_pages, flags);
 }
-EXPORT_SYMBOL_GPL(__folio_mark_dirty);
 
 /**
  * filemap_dirty_folio - Mark a folio dirty for filesystems which do not use buffer_heads.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 92/96] iomap: Convert iomap_write_begin and iomap_write_end to folios
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (90 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 91/96] iomap: Convert iomap_page_mkwrite " Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 21:36   ` kernel test robot
  2021-05-05 15:06 ` [PATCH v9 93/96] iomap: Convert iomap_read_inline_data to take a folio Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  95 siblings, 1 reply; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

These functions still only work in PAGE_SIZE chunks, but there are
fewer conversions from head to tail pages as a result of this patch.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 65 ++++++++++++++++++++++--------------------
 1 file changed, 34 insertions(+), 31 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c52c266d4abe..1c7e16e6aad1 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -542,9 +542,8 @@ static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
 
 static int
 __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
-		struct page *page, struct iomap *srcmap)
+		struct folio *folio, struct iomap *srcmap)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = iomap_page_create(inode, folio);
 	loff_t block_size = i_blocksize(inode);
 	loff_t block_start = round_down(pos, block_size);
@@ -584,11 +583,12 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
 	return 0;
 }
 
-static int
-iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
-		struct page **pagep, struct iomap *iomap, struct iomap *srcmap)
+static int iomap_write_begin(struct inode *inode, loff_t pos, size_t len,
+		unsigned flags, struct folio **foliop, struct iomap *iomap,
+		struct iomap *srcmap)
 {
 	const struct iomap_page_ops *page_ops = iomap->page_ops;
+	struct folio *folio;
 	struct page *page;
 	int status = 0;
 
@@ -605,30 +605,31 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 			return status;
 	}
 
-	page = grab_cache_page_write_begin(inode->i_mapping, pos >> PAGE_SHIFT,
+	folio = filemap_get_stable_folio(inode->i_mapping, pos >> PAGE_SHIFT,
 			AOP_FLAG_NOFS);
-	if (!page) {
+	if (!folio) {
 		status = -ENOMEM;
 		goto out_no_page;
 	}
 
+	page = folio_file_page(folio, pos >> PAGE_SHIFT);
 	if (srcmap->type == IOMAP_INLINE)
 		iomap_read_inline_data(inode, page, srcmap);
 	else if (iomap->flags & IOMAP_F_BUFFER_HEAD)
 		status = __block_write_begin_int(page, pos, len, NULL, srcmap);
 	else
-		status = __iomap_write_begin(inode, pos, len, flags, page,
+		status = __iomap_write_begin(inode, pos, len, flags, folio,
 				srcmap);
 
 	if (unlikely(status))
 		goto out_unlock;
 
-	*pagep = page;
+	*foliop = folio;
 	return 0;
 
 out_unlock:
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 	iomap_write_failed(inode, pos, len);
 
 out_no_page:
@@ -638,11 +639,10 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 }
 
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
-		size_t copied, struct page *page)
+		size_t copied, struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
-	flush_dcache_page(page);
+	flush_dcache_folio(folio);
 
 	/*
 	 * The blocks that were entirely written will now be uptodate, so we
@@ -655,10 +655,10 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 * uptodate page as a zero-length write, and force the caller to redo
 	 * the whole thing.
 	 */
-	if (unlikely(copied < len && !PageUptodate(page)))
+	if (unlikely(copied < len && !folio_uptodate(folio)))
 		return 0;
 	iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
-	iomap_set_page_dirty(page);
+	filemap_dirty_folio(inode->i_mapping, folio);
 	return copied;
 }
 
@@ -681,9 +681,10 @@ static size_t iomap_write_end_inline(struct inode *inode, struct page *page,
 
 /* Returns the number of bytes copied.  May be 0.  Cannot be an errno. */
 static size_t iomap_write_end(struct inode *inode, loff_t pos, size_t len,
-		size_t copied, struct page *page, struct iomap *iomap,
+		size_t copied, struct folio *folio, struct iomap *iomap,
 		struct iomap *srcmap)
 {
+	struct page *page = folio_file_page(folio, pos / PAGE_SIZE);
 	const struct iomap_page_ops *page_ops = iomap->page_ops;
 	loff_t old_size = inode->i_size;
 	size_t ret;
@@ -694,7 +695,7 @@ static size_t iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		ret = block_write_end(NULL, inode->i_mapping, pos, len, copied,
 				page, NULL);
 	} else {
-		ret = __iomap_write_end(inode, pos, len, copied, page);
+		ret = __iomap_write_end(inode, pos, len, copied, folio);
 	}
 
 	/*
@@ -706,13 +707,13 @@ static size_t iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		i_size_write(inode, pos + ret);
 		iomap->flags |= IOMAP_F_SIZE_CHANGED;
 	}
-	unlock_page(page);
+	folio_unlock(folio);
 
 	if (old_size < pos)
 		pagecache_isize_extended(inode, old_size, pos);
 	if (page_ops && page_ops->page_done)
 		page_ops->page_done(inode, pos, ret, page, iomap);
-	put_page(page);
+	folio_put(folio);
 
 	if (ret < len)
 		iomap_write_failed(inode, pos, len);
@@ -728,6 +729,7 @@ iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	ssize_t written = 0;
 
 	do {
+		struct folio *folio;
 		struct page *page;
 		unsigned long offset;	/* Offset into pagecache page */
 		unsigned long bytes;	/* Bytes to write to page */
@@ -755,18 +757,19 @@ iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 			break;
 		}
 
-		status = iomap_write_begin(inode, pos, bytes, 0, &page, iomap,
+		status = iomap_write_begin(inode, pos, bytes, 0, &folio, iomap,
 				srcmap);
 		if (unlikely(status))
 			break;
 
+		page = folio_file_page(folio, pos / PAGE_SIZE);
 		if (mapping_writably_mapped(inode->i_mapping))
 			flush_dcache_page(page);
 
 		copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
 
-		copied = iomap_write_end(inode, pos, bytes, copied, page, iomap,
-				srcmap);
+		copied = iomap_write_end(inode, pos, bytes, copied, folio,
+				iomap, srcmap);
 
 		cond_resched();
 
@@ -831,14 +834,14 @@ iomap_unshare_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	do {
 		unsigned long offset = offset_in_page(pos);
 		unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length);
-		struct page *page;
+		struct folio *folio;
 
 		status = iomap_write_begin(inode, pos, bytes,
-				IOMAP_WRITE_F_UNSHARE, &page, iomap, srcmap);
+				IOMAP_WRITE_F_UNSHARE, &folio, iomap, srcmap);
 		if (unlikely(status))
 			return status;
 
-		status = iomap_write_end(inode, pos, bytes, bytes, page, iomap,
+		status = iomap_write_end(inode, pos, bytes, bytes, folio, iomap,
 				srcmap);
 		if (WARN_ON_ONCE(status == 0))
 			return -EIO;
@@ -877,19 +880,19 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
 static s64 iomap_zero(struct inode *inode, loff_t pos, u64 length,
 		struct iomap *iomap, struct iomap *srcmap)
 {
-	struct page *page;
+	struct folio *folio;
 	int status;
 	unsigned offset = offset_in_page(pos);
 	unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
 
-	status = iomap_write_begin(inode, pos, bytes, 0, &page, iomap, srcmap);
+	status = iomap_write_begin(inode, pos, bytes, 0, &folio, iomap, srcmap);
 	if (status)
 		return status;
 
-	zero_user(page, offset, bytes);
-	mark_page_accessed(page);
+	zero_user(folio_file_page(folio, pos / PAGE_SIZE), offset, bytes);
+	folio_mark_accessed(folio);
 
-	return iomap_write_end(inode, pos, bytes, bytes, page, iomap, srcmap);
+	return iomap_write_end(inode, pos, bytes, bytes, folio, iomap, srcmap);
 }
 
 static loff_t iomap_zero_range_actor(struct inode *inode, loff_t pos,
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 93/96] iomap: Convert iomap_read_inline_data to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (91 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 92/96] iomap: Convert iomap_write_begin and iomap_write_end to folios Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 94/96] iomap: Convert iomap_write_end_inline " Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Inline data is restricted to being less than a page in size, so we
don't need to handle multi-page folios.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 1c7e16e6aad1..72fe6741a36c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -194,24 +194,24 @@ struct iomap_readpage_ctx {
 	struct readahead_control *rac;
 };
 
-static void
-iomap_read_inline_data(struct inode *inode, struct page *page,
+static void iomap_read_inline_data(struct inode *inode, struct folio *folio,
 		struct iomap *iomap)
 {
 	size_t size = i_size_read(inode);
 	void *addr;
 
-	if (PageUptodate(page))
+	if (folio_uptodate(folio))
 		return;
 
-	BUG_ON(page->index);
+	BUG_ON(folio->index);
+	BUG_ON(folio_multi(folio));
 	BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
 
-	addr = kmap_atomic(page);
+	addr = kmap_local_folio(folio, 0);
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - size);
-	kunmap_atomic(addr);
-	SetPageUptodate(page);
+	kunmap_local(addr);
+	folio_mark_uptodate(folio);
 }
 
 static inline bool iomap_block_needs_zeroing(struct inode *inode,
@@ -236,7 +236,7 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 	if (iomap->type == IOMAP_INLINE) {
 		WARN_ON_ONCE(pos);
-		iomap_read_inline_data(inode, &folio->page, iomap);
+		iomap_read_inline_data(inode, folio, iomap);
 		return PAGE_SIZE;
 	}
 
@@ -614,7 +614,7 @@ static int iomap_write_begin(struct inode *inode, loff_t pos, size_t len,
 
 	page = folio_file_page(folio, pos >> PAGE_SHIFT);
 	if (srcmap->type == IOMAP_INLINE)
-		iomap_read_inline_data(inode, page, srcmap);
+		iomap_read_inline_data(inode, folio, srcmap);
 	else if (iomap->flags & IOMAP_F_BUFFER_HEAD)
 		status = __block_write_begin_int(page, pos, len, NULL, srcmap);
 	else
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 94/96] iomap: Convert iomap_write_end_inline to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (92 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 93/96] iomap: Convert iomap_read_inline_data to take a folio Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 95/96] iomap: Convert iomap_add_to_ioend " Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 96/96] iomap: Convert iomap_do_writepage to use " Matthew Wilcox (Oracle)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Inline data only occupies a single page, but using a folio means that
we don't need to call compound_head() in PageUptodate().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 72fe6741a36c..40934c624f11 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -662,18 +662,18 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	return copied;
 }
 
-static size_t iomap_write_end_inline(struct inode *inode, struct page *page,
+static size_t iomap_write_end_inline(struct inode *inode, struct folio *folio,
 		struct iomap *iomap, loff_t pos, size_t copied)
 {
 	void *addr;
 
-	WARN_ON_ONCE(!PageUptodate(page));
+	WARN_ON_ONCE(!folio_uptodate(folio));
 	BUG_ON(pos + copied > PAGE_SIZE - offset_in_page(iomap->inline_data));
 
-	flush_dcache_page(page);
-	addr = kmap_atomic(page);
+	flush_dcache_folio(folio);
+	addr = kmap_local_folio(folio, 0);
 	memcpy(iomap->inline_data + pos, addr + pos, copied);
-	kunmap_atomic(addr);
+	kunmap_local(addr);
 
 	mark_inode_dirty(inode);
 	return copied;
@@ -690,7 +690,7 @@ static size_t iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	size_t ret;
 
 	if (srcmap->type == IOMAP_INLINE) {
-		ret = iomap_write_end_inline(inode, page, iomap, pos, copied);
+		ret = iomap_write_end_inline(inode, folio, iomap, pos, copied);
 	} else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
 		ret = block_write_end(NULL, inode->i_mapping, pos, len, copied,
 				page, NULL);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 95/96] iomap: Convert iomap_add_to_ioend to take a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (93 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 94/96] iomap: Convert iomap_write_end_inline " Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  2021-05-05 15:06 ` [PATCH v9 96/96] iomap: Convert iomap_do_writepage to use " Matthew Wilcox (Oracle)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

We still iterate one block at a time, but now we call compound_head()
less often.  Rename file_offset to pos to fit the rest of the file.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 66 +++++++++++++++++++-----------------------
 1 file changed, 30 insertions(+), 36 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 40934c624f11..4ea256de9d04 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1257,36 +1257,29 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
  * first, otherwise finish off the current ioend and start another.
  */
 static void
-iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
+iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
 		struct iomap_page *iop, struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct list_head *iolist)
 {
-	sector_t sector = iomap_sector(&wpc->iomap, offset);
+	sector_t sector = iomap_sector(&wpc->iomap, pos);
 	unsigned len = i_blocksize(inode);
-	unsigned poff = offset & (PAGE_SIZE - 1);
-	bool merged, same_page = false;
+	size_t poff = offset_in_folio(folio, pos);
 
-	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, offset, sector)) {
+	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, sector)) {
 		if (wpc->ioend)
 			list_add(&wpc->ioend->io_list, iolist);
-		wpc->ioend = iomap_alloc_ioend(inode, wpc, offset, sector, wbc);
+		wpc->ioend = iomap_alloc_ioend(inode, wpc, pos, sector, wbc);
 	}
 
-	merged = __bio_try_merge_page(wpc->ioend->io_bio, page, len, poff,
-			&same_page);
 	if (iop)
 		atomic_add(len, &iop->write_bytes_pending);
-
-	if (!merged) {
-		if (bio_full(wpc->ioend->io_bio, len)) {
-			wpc->ioend->io_bio =
-				iomap_chain_bio(wpc->ioend->io_bio);
-		}
-		bio_add_page(wpc->ioend->io_bio, page, len, poff);
+	if (!bio_add_folio(wpc->ioend->io_bio, folio, len, poff)) {
+		wpc->ioend->io_bio = iomap_chain_bio(wpc->ioend->io_bio);
+		bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
 	}
 
 	wpc->ioend->io_size += len;
-	wbc_account_cgroup_owner(wbc, page, len);
+	wbc_account_cgroup_owner(wbc, &folio->page, len);
 }
 
 /*
@@ -1314,40 +1307,41 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	struct iomap_page *iop = to_iomap_page(folio);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
-	u64 file_offset; /* file offset of page */
+	unsigned nblocks = i_blocks_per_folio(inode, folio);
+	loff_t pos = folio_offset(folio);
 	int error = 0, count = 0, i;
 	LIST_HEAD(submit_list);
 
-	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
+	WARN_ON_ONCE(nblocks > 1 && !iop);
 	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
 
 	/*
-	 * Walk through the page to find areas to write back. If we run off the
-	 * end of the current map or find the current map invalid, grab a new
-	 * one.
+	 * Walk through the folio to find areas to write back. If we
+	 * run off the end of the current map or find the current map
+	 * invalid, grab a new one.
 	 */
-	for (i = 0, file_offset = page_offset(page);
-	     i < (PAGE_SIZE >> inode->i_blkbits) && file_offset < end_offset;
-	     i++, file_offset += len) {
+	for (i = 0; i < nblocks; i++, pos += len) {
+		if (pos >= end_offset)
+			break;
 		if (iop && !test_bit(i, iop->uptodate))
 			continue;
 
-		error = wpc->ops->map_blocks(wpc, inode, file_offset);
+		error = wpc->ops->map_blocks(wpc, inode, pos);
 		if (error)
 			break;
 		if (WARN_ON_ONCE(wpc->iomap.type == IOMAP_INLINE))
 			continue;
 		if (wpc->iomap.type == IOMAP_HOLE)
 			continue;
-		iomap_add_to_ioend(inode, file_offset, page, iop, wpc, wbc,
+		iomap_add_to_ioend(inode, pos, folio, iop, wpc, wbc,
 				 &submit_list);
 		count++;
 	}
 
 	WARN_ON_ONCE(!wpc->ioend && !list_empty(&submit_list));
-	WARN_ON_ONCE(!PageLocked(page));
-	WARN_ON_ONCE(PageWriteback(page));
-	WARN_ON_ONCE(PageDirty(page));
+	WARN_ON_ONCE(!folio_locked(folio));
+	WARN_ON_ONCE(folio_writeback(folio));
+	WARN_ON_ONCE(folio_dirty(folio));
 
 	/*
 	 * We cannot cancel the ioend directly here on error.  We may have
@@ -1363,16 +1357,16 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		 * now.
 		 */
 		if (wpc->ops->discard_page)
-			wpc->ops->discard_page(page, file_offset);
+			wpc->ops->discard_page(&folio->page, pos);
 		if (!count) {
-			ClearPageUptodate(page);
-			unlock_page(page);
+			folio_clear_uptodate_flag(folio);
+			folio_unlock(folio);
 			goto done;
 		}
 	}
 
-	set_page_writeback(page);
-	unlock_page(page);
+	folio_start_writeback(folio);
+	folio_unlock(folio);
 
 	/*
 	 * Preserve the original error if there was one, otherwise catch
@@ -1393,9 +1387,9 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 * with a partial page truncate on a sub-page block sized filesystem.
 	 */
 	if (!count)
-		end_page_writeback(page);
+		folio_end_writeback(folio);
 done:
-	mapping_set_error(page->mapping, error);
+	mapping_set_error(folio->mapping, error);
 	return error;
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* [PATCH v9 96/96] iomap: Convert iomap_do_writepage to use a folio
  2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
                   ` (94 preceding siblings ...)
  2021-05-05 15:06 ` [PATCH v9 95/96] iomap: Convert iomap_add_to_ioend " Matthew Wilcox (Oracle)
@ 2021-05-05 15:06 ` Matthew Wilcox (Oracle)
  95 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-05-05 15:06 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm; +Cc: Matthew Wilcox (Oracle), linux-kernel

Writeback an entire folio at a time, and adjust some of the variables
to have more familiar names.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 49 +++++++++++++++++++-----------------------
 1 file changed, 22 insertions(+), 27 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 4ea256de9d04..2dfeb7fa5f03 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1301,9 +1301,8 @@ iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
 static int
 iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
-		struct page *page, u64 end_offset)
+		struct folio *folio, loff_t end_pos)
 {
-	struct folio *folio = page_folio(page);
 	struct iomap_page *iop = to_iomap_page(folio);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
@@ -1321,7 +1320,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 * invalid, grab a new one.
 	 */
 	for (i = 0; i < nblocks; i++, pos += len) {
-		if (pos >= end_offset)
+		if (pos >= end_pos)
 			break;
 		if (iop && !test_bit(i, iop->uptodate))
 			continue;
@@ -1403,16 +1402,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 static int
 iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 {
+	struct folio *folio = page_folio(page);
 	struct iomap_writepage_ctx *wpc = data;
-	struct inode *inode = page->mapping->host;
-	pgoff_t end_index;
-	u64 end_offset;
-	loff_t offset;
+	struct inode *inode = folio->mapping->host;
+	loff_t end_pos, isize;
 
-	trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
+	trace_iomap_writepage(inode, folio_offset(folio), folio_size(folio));
 
 	/*
-	 * Refuse to write the page out if we are called from reclaim context.
+	 * Refuse to write the folio out if we are called from reclaim context.
 	 *
 	 * This avoids stack overflows when called from deeply used stacks in
 	 * random callers for direct reclaim or memcg reclaim.  We explicitly
@@ -1426,10 +1424,10 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		goto redirty;
 
 	/*
-	 * Is this page beyond the end of the file?
+	 * Is this folio beyond the end of the file?
 	 *
-	 * The page index is less than the end_index, adjust the end_offset
-	 * to the highest offset that this page should represent.
+	 * The folio index is less than the end_index, adjust the end_pos
+	 * to the highest offset that this folio should represent.
 	 * -----------------------------------------------------
 	 * |			file mapping	       | <EOF> |
 	 * -----------------------------------------------------
@@ -1438,11 +1436,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 	 * |     desired writeback range    |      see else    |
 	 * ---------------------------------^------------------|
 	 */
-	offset = i_size_read(inode);
-	end_index = offset >> PAGE_SHIFT;
-	if (page->index < end_index)
-		end_offset = (loff_t)(page->index + 1) << PAGE_SHIFT;
-	else {
+	isize = i_size_read(inode);
+	end_pos = folio_offset(folio) + folio_size(folio);
+	if (end_pos - 1 >= isize) {
 		/*
 		 * Check whether the page to write out is beyond or straddles
 		 * i_size or not.
@@ -1454,7 +1450,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * |				    |      Straddles     |
 		 * ---------------------------------^-----------|--------|
 		 */
-		unsigned offset_into_page = offset & (PAGE_SIZE - 1);
+		size_t poff = offset_in_folio(folio, isize);
+		pgoff_t end_index = isize >> PAGE_SHIFT;
 
 		/*
 		 * Skip the page if it is fully outside i_size, e.g. due to a
@@ -1473,8 +1470,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * if the page to write is totally beyond the i_size or if it's
 		 * offset is just equal to the EOF.
 		 */
-		if (page->index > end_index ||
-		    (page->index == end_index && offset_into_page == 0))
+		if (folio->index > end_index ||
+		    (folio->index == end_index && poff == 0))
 			goto redirty;
 
 		/*
@@ -1485,17 +1482,15 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
 		 * memory is zeroed when mapped, and writes to that region are
 		 * not written out to the file."
 		 */
-		zero_user_segment(page, offset_into_page, PAGE_SIZE);
-
-		/* Adjust the end_offset to the end of file */
-		end_offset = offset;
+		zero_user_segment(&folio->page, poff, folio_size(folio));
+		end_pos = isize;
 	}
 
-	return iomap_writepage_map(wpc, wbc, inode, page, end_offset);
+	return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
 
 redirty:
-	redirty_page_for_writepage(wbc, page);
-	unlock_page(page);
+	folio_redirty_for_writepage(wbc, folio);
+	folio_unlock(folio);
 	return 0;
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 01/96] mm: Optimise nth_page for contiguous memmap
  2021-05-05 15:04 ` [PATCH v9 01/96] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
@ 2021-05-05 17:24   ` Vlastimil Babka
  0 siblings, 0 replies; 108+ messages in thread
From: Vlastimil Babka @ 2021-05-05 17:24 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm
  Cc: linux-kernel, Christoph Hellwig, David Hildenbrand, Zi Yan

On 5/5/21 5:04 PM, Matthew Wilcox (Oracle) wrote:
> If the memmap is virtually contiguous (either because we're using
> a virtually mapped memmap or because we don't support a discontig
> memmap at all), then we can implement nth_page() by simple addition.
> Contrary to popular belief, the compiler is not able to optimise this
> itself for a vmemmap configuration.  This reduces one example user (sg.c)
> by four instructions:
> 
>         struct page *page = nth_page(rsv_schp->pages[k], offset >> PAGE_SHIFT);
> 
> before:
>    49 8b 45 70             mov    0x70(%r13),%rax
>    48 63 c9                movslq %ecx,%rcx
>    48 c1 eb 0c             shr    $0xc,%rbx
>    48 8b 04 c8             mov    (%rax,%rcx,8),%rax
>    48 2b 05 00 00 00 00    sub    0x0(%rip),%rax
>            R_X86_64_PC32      vmemmap_base-0x4
>    48 c1 f8 06             sar    $0x6,%rax
>    48 01 d8                add    %rbx,%rax
>    48 c1 e0 06             shl    $0x6,%rax
>    48 03 05 00 00 00 00    add    0x0(%rip),%rax
>            R_X86_64_PC32      vmemmap_base-0x4
> 
> after:
>    49 8b 45 70             mov    0x70(%r13),%rax
>    48 63 c9                movslq %ecx,%rcx
>    48 c1 eb 0c             shr    $0xc,%rbx
>    48 c1 e3 06             shl    $0x6,%rbx
>    48 03 1c c8             add    (%rax,%rcx,8),%rbx
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  include/linux/mm.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 25b9041f9925..2327f99b121f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -234,7 +234,11 @@ int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
>  int __add_to_page_cache_locked(struct page *page, struct address_space *mapping,
>  		pgoff_t index, gfp_t gfp, void **shadowp);
>  
> +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
>  #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
> +#else
> +#define nth_page(page,n) ((page) + (n))
> +#endif
>  
>  /* to align the pointer to the (next) page boundary */
>  #define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)
> 


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 08/96] mm: Fix struct page layout on 32-bit systems
  2021-05-05 15:05 ` [PATCH v9 08/96] mm: Fix struct page layout on 32-bit systems Matthew Wilcox (Oracle)
@ 2021-05-05 17:33   ` Vlastimil Babka
  0 siblings, 0 replies; 108+ messages in thread
From: Vlastimil Babka @ 2021-05-05 17:33 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm
  Cc: linux-kernel, Ilias Apalodimas, Jesper Dangaard Brouer

On 5/5/21 5:05 PM, Matthew Wilcox (Oracle) wrote:
> 32-bit architectures which expect 8-byte alignment for 8-byte integers
> and need 64-bit DMA addresses (arm, mips, ppc) had their struct page
> inadvertently expanded in 2019.  When the dma_addr_t was added, it forced
> the alignment of the union to 8 bytes, which inserted a 4 byte gap between
> 'flags' and the union.
> 
> Fix this by storing the dma_addr_t in one or two adjacent unsigned longs.
> This restores the alignment to that of an unsigned long.  We always
> store the low bits in the first word to prevent the PageTail bit from
> being inadvertently set on a big endian platform.  If that happened,
> get_user_pages_fast() racing against a page which was freed and
> reallocated to the page_pool could dereference a bogus compound_head(),
> which would be hard to trace back to this cause.
> 
> Fixes: c25fff7171be ("mm: add dma_addr_t to struct page")
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 74/96] mm/workingset: Convert workingset_refault to take a folio
  2021-05-05 15:06 ` [PATCH v9 74/96] mm/workingset: Convert workingset_refault to take a folio Matthew Wilcox (Oracle)
@ 2021-05-05 20:17   ` kernel test robot
  2021-05-05 20:57     ` Matthew Wilcox
  0 siblings, 1 reply; 108+ messages in thread
From: kernel test robot @ 2021-05-05 20:17 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm
  Cc: kbuild-all, Matthew Wilcox (Oracle), linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7692 bytes --]

Hi "Matthew,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20210505]
[cannot apply to hnaz-linux-mm/master xfs-linux/for-next tip/perf/core shaggy/jfs-next block/for-next linus/master asm-generic/master v5.12 v5.12-rc8 v5.12-rc7 v5.12]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
base:    29955e0289b3255c5f609a7564a0f0bb4ae35c7a
config: nds32-defconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/b1883a3797e1623bf783141c25482fee16e1031c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
        git checkout b1883a3797e1623bf783141c25482fee16e1031c
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from mm/workingset.c:8:
   include/linux/memcontrol.h: In function 'folio_uncharge_cgroup':
   include/linux/memcontrol.h:1213:42: error: parameter name omitted
    1213 | static inline void folio_uncharge_cgroup(struct folio *)
         |                                          ^~~~~~~~~~~~~~
   mm/workingset.c: In function 'unpack_shadow':
   mm/workingset.c:201:15: warning: variable 'nid' set but not used [-Wunused-but-set-variable]
     201 |  int memcgid, nid;
         |               ^~~
   mm/workingset.c: In function 'workingset_refault':
>> mm/workingset.c:348:10: error: implicit declaration of function 'folio_memcg' [-Werror=implicit-function-declaration]
     348 |  memcg = folio_memcg(folio);
         |          ^~~~~~~~~~~
>> mm/workingset.c:348:8: warning: assignment to 'struct mem_cgroup *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
     348 |  memcg = folio_memcg(folio);
         |        ^
   cc1: some warnings being treated as errors


vim +/folio_memcg +348 mm/workingset.c

   272	
   273	/**
   274	 * workingset_refault - evaluate the refault of a previously evicted folio
   275	 * @page: the freshly allocated replacement folio
   276	 * @shadow: shadow entry of the evicted folio
   277	 *
   278	 * Calculates and evaluates the refault distance of the previously
   279	 * evicted folio in the context of the node and the memcg whose memory
   280	 * pressure caused the eviction.
   281	 */
   282	void workingset_refault(struct folio *folio, void *shadow)
   283	{
   284		bool file = folio_is_file_lru(folio);
   285		struct mem_cgroup *eviction_memcg;
   286		struct lruvec *eviction_lruvec;
   287		unsigned long refault_distance;
   288		unsigned long workingset_size;
   289		struct pglist_data *pgdat;
   290		struct mem_cgroup *memcg;
   291		unsigned long eviction;
   292		struct lruvec *lruvec;
   293		unsigned long refault;
   294		bool workingset;
   295		int memcgid;
   296	
   297		unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
   298	
   299		rcu_read_lock();
   300		/*
   301		 * Look up the memcg associated with the stored ID. It might
   302		 * have been deleted since the folio's eviction.
   303		 *
   304		 * Note that in rare events the ID could have been recycled
   305		 * for a new cgroup that refaults a shared folio. This is
   306		 * impossible to tell from the available data. However, this
   307		 * should be a rare and limited disturbance, and activations
   308		 * are always speculative anyway. Ultimately, it's the aging
   309		 * algorithm's job to shake out the minimum access frequency
   310		 * for the active cache.
   311		 *
   312		 * XXX: On !CONFIG_MEMCG, this will always return NULL; it
   313		 * would be better if the root_mem_cgroup existed in all
   314		 * configurations instead.
   315		 */
   316		eviction_memcg = mem_cgroup_from_id(memcgid);
   317		if (!mem_cgroup_disabled() && !eviction_memcg)
   318			goto out;
   319		eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
   320		refault = atomic_long_read(&eviction_lruvec->nonresident_age);
   321	
   322		/*
   323		 * Calculate the refault distance
   324		 *
   325		 * The unsigned subtraction here gives an accurate distance
   326		 * across nonresident_age overflows in most cases. There is a
   327		 * special case: usually, shadow entries have a short lifetime
   328		 * and are either refaulted or reclaimed along with the inode
   329		 * before they get too old.  But it is not impossible for the
   330		 * nonresident_age to lap a shadow entry in the field, which
   331		 * can then result in a false small refault distance, leading
   332		 * to a false activation should this old entry actually
   333		 * refault again.  However, earlier kernels used to deactivate
   334		 * unconditionally with *every* reclaim invocation for the
   335		 * longest time, so the occasional inappropriate activation
   336		 * leading to pressure on the active list is not a problem.
   337		 */
   338		refault_distance = (refault - eviction) & EVICTION_MASK;
   339	
   340		/*
   341		 * The activation decision for this folio is made at the level
   342		 * where the eviction occurred, as that is where the LRU order
   343		 * during folio reclaim is being determined.
   344		 *
   345		 * However, the cgroup that will own the folio is the one that
   346		 * is actually experiencing the refault event.
   347		 */
 > 348		memcg = folio_memcg(folio);
   349		lruvec = mem_cgroup_lruvec(memcg, pgdat);
   350	
   351		inc_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file);
   352	
   353		/*
   354		 * Compare the distance to the existing workingset size. We
   355		 * don't activate pages that couldn't stay resident even if
   356		 * all the memory was available to the workingset. Whether
   357		 * workingset competition needs to consider anon or not depends
   358		 * on having swap.
   359		 */
   360		workingset_size = lruvec_page_state(eviction_lruvec, NR_ACTIVE_FILE);
   361		if (!file) {
   362			workingset_size += lruvec_page_state(eviction_lruvec,
   363							     NR_INACTIVE_FILE);
   364		}
   365		if (mem_cgroup_get_nr_swap_pages(memcg) > 0) {
   366			workingset_size += lruvec_page_state(eviction_lruvec,
   367							     NR_ACTIVE_ANON);
   368			if (file) {
   369				workingset_size += lruvec_page_state(eviction_lruvec,
   370							     NR_INACTIVE_ANON);
   371			}
   372		}
   373		if (refault_distance > workingset_size)
   374			goto out;
   375	
   376		folio_set_active_flag(folio);
   377		workingset_age_nonresident(lruvec, folio_nr_pages(folio));
   378		inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + file);
   379	
   380		/* Folio was active prior to eviction */
   381		if (workingset) {
   382			folio_set_workingset_flag(folio);
   383			/* XXX: Move to lru_cache_add() when it supports new vs putback */
   384			lru_note_cost_folio(folio);
   385			inc_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file);
   386		}
   387	out:
   388		rcu_read_unlock();
   389	}
   390	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 10847 bytes --]

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 52/96] mm/memcg: Add folio_uncharge_cgroup
  2021-05-05 15:05 ` [PATCH v9 52/96] mm/memcg: Add folio_uncharge_cgroup Matthew Wilcox (Oracle)
@ 2021-05-05 20:24   ` kernel test robot
  0 siblings, 0 replies; 108+ messages in thread
From: kernel test robot @ 2021-05-05 20:24 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm
  Cc: kbuild-all, clang-built-linux, Matthew Wilcox (Oracle), linux-kernel

[-- Attachment #1: Type: text/plain, Size: 9221 bytes --]

Hi "Matthew,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on next-20210505]
[cannot apply to hnaz-linux-mm/master xfs-linux/for-next tip/perf/core shaggy/jfs-next block/for-next linus/master asm-generic/master v5.12 v5.12-rc8 v5.12-rc7 v5.12]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
base:    29955e0289b3255c5f609a7564a0f0bb4ae35c7a
config: powerpc-randconfig-r013-20210505 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 8f5a2a5836cc8e4c1def2bdeb022e7b496623439)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # https://github.com/0day-ci/linux/commit/10f8a1d9657c57cdbad467d07ea326e6c831ac81
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
        git checkout 10f8a1d9657c57cdbad467d07ea326e6c831ac81
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from arch/powerpc/kernel/asm-offsets.c:23:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   1 warning generated.
--
   In file included from mm/shmem.c:35:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   mm/shmem.c:1464:20: warning: unused function 'shmem_show_mpol' [-Wunused-function]
   static inline void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
                      ^
   2 warnings generated.
--
   In file included from mm/page_alloc.c:21:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   mm/page_alloc.c:3651:15: warning: no previous prototype for function 'should_fail_alloc_page' [-Wmissing-prototypes]
   noinline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
                 ^
   mm/page_alloc.c:3651:10: note: declare 'static' if the function is not intended to be used outside of this translation unit
   noinline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
            ^
            static 
   2 warnings generated.
--
   In file included from mm/sparse.c:14:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   mm/sparse.c:720:25: warning: no previous prototype for function 'populate_section_memmap' [-Wmissing-prototypes]
   struct page * __meminit populate_section_memmap(unsigned long pfn,
                           ^
   mm/sparse.c:720:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   struct page * __meminit populate_section_memmap(unsigned long pfn,
   ^
   static 
   2 warnings generated.
--
   In file included from mm/slub.c:14:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   mm/slub.c:1534:21: warning: unused function 'kmalloc_large_node_hook' [-Wunused-function]
   static inline void *kmalloc_large_node_hook(void *ptr, size_t size, gfp_t flags)
                       ^
   2 warnings generated.
--
   In file included from kernel/sched/core.c:13:
   In file included from kernel/sched/sched.h:65:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   kernel/sched/core.c:2850:6: warning: no previous prototype for function 'sched_set_stop_task' [-Wmissing-prototypes]
   void sched_set_stop_task(int cpu, struct task_struct *stop)
        ^
   kernel/sched/core.c:2850:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void sched_set_stop_task(int cpu, struct task_struct *stop)
   ^
   static 
   2 warnings generated.
--
   In file included from kernel/sched/cputime.c:5:
   In file included from kernel/sched/sched.h:65:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   kernel/sched/cputime.c:256:19: warning: unused function 'account_other_time' [-Wunused-function]
   static inline u64 account_other_time(u64 max)
                     ^
   kernel/sched/cputime.c:399:20: warning: unused function 'irqtime_account_idle_ticks' [-Wunused-function]
   static inline void irqtime_account_idle_ticks(int ticks) { }
                      ^
   kernel/sched/cputime.c:400:20: warning: unused function 'irqtime_account_process_tick' [-Wunused-function]
   static inline void irqtime_account_process_tick(struct task_struct *p, int user_tick,
                      ^
   4 warnings generated.
--
   In file included from kernel/sched/rt.c:6:
   In file included from kernel/sched/sched.h:65:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   kernel/sched/rt.c:669:6: warning: no previous prototype for function 'sched_rt_bandwidth_account' [-Wmissing-prototypes]
   bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
        ^
   kernel/sched/rt.c:669:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
   ^
   static 
   2 warnings generated.
--
   In file included from kernel/sched/topology.c:5:
   In file included from kernel/sched/sched.h:65:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   kernel/sched/topology.c:157:20: warning: unused function 'sched_debug' [-Wunused-function]
   static inline bool sched_debug(void)
                      ^
   2 warnings generated.
--
   error: no override and no default toolchain set
   init/Kconfig:70:warning: 'RUSTC_VERSION': number is invalid
   In file included from arch/powerpc/kernel/asm-offsets.c:23:
   In file included from include/linux/suspend.h:5:
   In file included from include/linux/swap.h:9:
>> include/linux/memcontrol.h:1213:56: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
   static inline void folio_uncharge_cgroup(struct folio *)
                                                          ^
   1 warning generated.


vim +1213 include/linux/memcontrol.h

  1212	
> 1213	static inline void folio_uncharge_cgroup(struct folio *)
  1214	{
  1215	}
  1216	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26070 bytes --]

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 74/96] mm/workingset: Convert workingset_refault to take a folio
  2021-05-05 20:17   ` kernel test robot
@ 2021-05-05 20:57     ` Matthew Wilcox
  0 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox @ 2021-05-05 20:57 UTC (permalink / raw)
  To: kernel test robot; +Cc: linux-fsdevel, linux-mm, kbuild-all, linux-kernel

On Thu, May 06, 2021 at 04:17:27AM +0800, kernel test robot wrote:
>    In file included from mm/workingset.c:8:
>    include/linux/memcontrol.h: In function 'folio_uncharge_cgroup':
>    include/linux/memcontrol.h:1213:42: error: parameter name omitted
>     1213 | static inline void folio_uncharge_cgroup(struct folio *)
>          |                                          ^~~~~~~~~~~~~~

Fixed (also reported in your other report)

>    mm/workingset.c: In function 'unpack_shadow':
>    mm/workingset.c:201:15: warning: variable 'nid' set but not used [-Wunused-but-set-variable]
>      201 |  int memcgid, nid;
>          |               ^~~

I didn't introduce this one; not trying to fix it ;-)

>    mm/workingset.c: In function 'workingset_refault':
> >> mm/workingset.c:348:10: error: implicit declaration of function 'folio_memcg' [-Werror=implicit-function-declaration]
>      348 |  memcg = folio_memcg(folio);
>          |          ^~~~~~~~~~~
> >> mm/workingset.c:348:8: warning: assignment to 'struct mem_cgroup *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
>      348 |  memcg = folio_memcg(folio);
>          |        ^
>    cc1: some warnings being treated as errors

Fixed.  Thanks!


^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 92/96] iomap: Convert iomap_write_begin and iomap_write_end to folios
  2021-05-05 15:06 ` [PATCH v9 92/96] iomap: Convert iomap_write_begin and iomap_write_end to folios Matthew Wilcox (Oracle)
@ 2021-05-05 21:36   ` kernel test robot
  2021-05-05 22:10     ` Matthew Wilcox
  0 siblings, 1 reply; 108+ messages in thread
From: kernel test robot @ 2021-05-05 21:36 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm
  Cc: kbuild-all, Matthew Wilcox (Oracle), linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3471 bytes --]

Hi "Matthew,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20210505]
[cannot apply to hnaz-linux-mm/master xfs-linux/for-next tip/perf/core shaggy/jfs-next block/for-next linus/master asm-generic/master v5.12 v5.12-rc8 v5.12-rc7 v5.12]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
base:    29955e0289b3255c5f609a7564a0f0bb4ae35c7a
config: nds32-defconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/0780b0addad735d2cceb3680d49f54f8618e1334
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
        git checkout 0780b0addad735d2cceb3680d49f54f8618e1334
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/swap.h:9,
                    from fs/iomap/buffered-io.c:16:
   include/linux/memcontrol.h: In function 'folio_uncharge_cgroup':
   include/linux/memcontrol.h:1213:42: error: parameter name omitted
    1213 | static inline void folio_uncharge_cgroup(struct folio *)
         |                                          ^~~~~~~~~~~~~~
   fs/iomap/buffered-io.c: In function '__iomap_write_end':
>> fs/iomap/buffered-io.c:645:2: error: implicit declaration of function 'flush_dcache_folio'; did you mean 'flush_dcache_page'? [-Werror=implicit-function-declaration]
     645 |  flush_dcache_folio(folio);
         |  ^~~~~~~~~~~~~~~~~~
         |  flush_dcache_page
   cc1: some warnings being treated as errors


vim +645 fs/iomap/buffered-io.c

   640	
   641	static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
   642			size_t copied, struct folio *folio)
   643	{
   644		struct iomap_page *iop = to_iomap_page(folio);
 > 645		flush_dcache_folio(folio);
   646	
   647		/*
   648		 * The blocks that were entirely written will now be uptodate, so we
   649		 * don't have to worry about a readpage reading them and overwriting a
   650		 * partial write.  However if we have encountered a short write and only
   651		 * partially written into a block, it will not be marked uptodate, so a
   652		 * readpage might come in and destroy our partial write.
   653		 *
   654		 * Do the simplest thing, and just treat any short write to a non
   655		 * uptodate page as a zero-length write, and force the caller to redo
   656		 * the whole thing.
   657		 */
   658		if (unlikely(copied < len && !folio_uptodate(folio)))
   659			return 0;
   660		iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
   661		filemap_dirty_folio(inode->i_mapping, folio);
   662		return copied;
   663	}
   664	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 10847 bytes --]

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 92/96] iomap: Convert iomap_write_begin and iomap_write_end to folios
  2021-05-05 21:36   ` kernel test robot
@ 2021-05-05 22:10     ` Matthew Wilcox
  0 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox @ 2021-05-05 22:10 UTC (permalink / raw)
  To: kernel test robot; +Cc: linux-fsdevel, linux-mm, kbuild-all, linux-kernel

On Thu, May 06, 2021 at 05:36:54AM +0800, kernel test robot wrote:
> config: nds32-defconfig (attached as .config)
>    fs/iomap/buffered-io.c: In function '__iomap_write_end':
> >> fs/iomap/buffered-io.c:645:2: error: implicit declaration of function 'flush_dcache_folio'; did you mean 'flush_dcache_page'? [-Werror=implicit-function-declaration]
>      645 |  flush_dcache_folio(folio);
>          |  ^~~~~~~~~~~~~~~~~~
>          |  flush_dcache_page
>    cc1: some warnings being treated as errors

Argh, nds32 doesn't (always) include asm-generic/cacheflush.h, so it
doesn't pick up the generic implementation of flush_dcache_folio().
Copy-and-pasted it to nds32.

Thanks.

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 46/96] mm: Add flush_dcache_folio
  2021-05-05 15:05 ` [PATCH v9 46/96] mm: Add flush_dcache_folio Matthew Wilcox (Oracle)
@ 2021-05-05 23:35   ` kernel test robot
  2021-05-06  2:33     ` Matthew Wilcox
  0 siblings, 1 reply; 108+ messages in thread
From: kernel test robot @ 2021-05-05 23:35 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm
  Cc: kbuild-all, Matthew Wilcox (Oracle), linux-kernel

[-- Attachment #1: Type: text/plain, Size: 9633 bytes --]

Hi "Matthew,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20210505]
[cannot apply to hnaz-linux-mm/master xfs-linux/for-next tip/perf/core shaggy/jfs-next block/for-next linus/master asm-generic/master v5.12 v5.12-rc8 v5.12-rc7 v5.12]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
base:    29955e0289b3255c5f609a7564a0f0bb4ae35c7a
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/2104ef87cf0390e2def04a508c79a664b4a4fcc4
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
        git checkout 2104ef87cf0390e2def04a508c79a664b4a4fcc4
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 ARCH=ia64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from arch/ia64/include/asm/cacheflush.h:31,
                    from arch/ia64/include/asm/pgtable.h:153,
                    from include/linux/pgtable.h:6,
                    from arch/ia64/include/asm/uaccess.h:40,
                    from include/linux/uaccess.h:11,
                    from include/linux/sched/task.h:11,
                    from include/linux/sched/signal.h:9,
                    from arch/ia64/kernel/asm-offsets.c:10:
   include/asm-generic/cacheflush.h: In function 'flush_dcache_folio':
>> include/asm-generic/cacheflush.h:61:19: error: implicit declaration of function 'folio_nr_pages'; did you mean 'folio_page'? [-Werror=implicit-function-declaration]
      61 |  unsigned int n = folio_nr_pages(folio);
         |                   ^~~~~~~~~~~~~~
         |                   folio_page
   In file included from arch/ia64/include/asm/pgtable.h:153,
                    from include/linux/pgtable.h:6,
                    from arch/ia64/include/asm/uaccess.h:40,
                    from include/linux/uaccess.h:11,
                    from include/linux/sched/task.h:11,
                    from include/linux/sched/signal.h:9,
                    from arch/ia64/kernel/asm-offsets.c:10:
>> include/asm-generic/cacheflush.h:65:21: error: implicit declaration of function 'nth_page' [-Werror=implicit-function-declaration]
      65 |   flush_dcache_page(nth_page(&folio->page, n));
         |                     ^~~~~~~~
   arch/ia64/include/asm/cacheflush.h:18:25: note: in definition of macro 'flush_dcache_page'
      18 |  clear_bit(PG_arch_1, &(page)->flags); \
         |                         ^~~~
>> arch/ia64/include/asm/cacheflush.h:18:30: error: invalid type argument of '->' (have 'int')
      18 |  clear_bit(PG_arch_1, &(page)->flags); \
         |                              ^~
   include/asm-generic/cacheflush.h:65:3: note: in expansion of macro 'flush_dcache_page'
      65 |   flush_dcache_page(nth_page(&folio->page, n));
         |   ^~~~~~~~~~~~~~~~~
   arch/ia64/kernel/asm-offsets.c: At top level:
   arch/ia64/kernel/asm-offsets.c:23:6: warning: no previous prototype for 'foo' [-Wmissing-prototypes]
      23 | void foo(void)
         |      ^~~
   cc1: some warnings being treated as errors
--
   In file included from arch/ia64/include/asm/cacheflush.h:31,
                    from arch/ia64/include/asm/pgtable.h:153,
                    from include/linux/pgtable.h:6,
                    from arch/ia64/include/asm/uaccess.h:40,
                    from include/linux/uaccess.h:11,
                    from include/linux/sched/task.h:11,
                    from include/linux/sched/signal.h:9,
                    from arch/ia64/kernel/asm-offsets.c:10:
   include/asm-generic/cacheflush.h: In function 'flush_dcache_folio':
>> include/asm-generic/cacheflush.h:61:19: error: implicit declaration of function 'folio_nr_pages'; did you mean 'folio_page'? [-Werror=implicit-function-declaration]
      61 |  unsigned int n = folio_nr_pages(folio);
         |                   ^~~~~~~~~~~~~~
         |                   folio_page
   In file included from arch/ia64/include/asm/pgtable.h:153,
                    from include/linux/pgtable.h:6,
                    from arch/ia64/include/asm/uaccess.h:40,
                    from include/linux/uaccess.h:11,
                    from include/linux/sched/task.h:11,
                    from include/linux/sched/signal.h:9,
                    from arch/ia64/kernel/asm-offsets.c:10:
>> include/asm-generic/cacheflush.h:65:21: error: implicit declaration of function 'nth_page' [-Werror=implicit-function-declaration]
      65 |   flush_dcache_page(nth_page(&folio->page, n));
         |                     ^~~~~~~~
   arch/ia64/include/asm/cacheflush.h:18:25: note: in definition of macro 'flush_dcache_page'
      18 |  clear_bit(PG_arch_1, &(page)->flags); \
         |                         ^~~~
>> arch/ia64/include/asm/cacheflush.h:18:30: error: invalid type argument of '->' (have 'int')
      18 |  clear_bit(PG_arch_1, &(page)->flags); \
         |                              ^~
   include/asm-generic/cacheflush.h:65:3: note: in expansion of macro 'flush_dcache_page'
      65 |   flush_dcache_page(nth_page(&folio->page, n));
         |   ^~~~~~~~~~~~~~~~~
   arch/ia64/kernel/asm-offsets.c: At top level:
   arch/ia64/kernel/asm-offsets.c:23:6: warning: no previous prototype for 'foo' [-Wmissing-prototypes]
      23 | void foo(void)
         |      ^~~
   cc1: some warnings being treated as errors
   make[2]: *** [scripts/Makefile.build:118: arch/ia64/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [Makefile:1313: prepare0] Error 2
   make[1]: Target 'modules_prepare' not remade because of errors.
   make: *** [Makefile:222: __sub-make] Error 2
   make: Target 'modules_prepare' not remade because of errors.
--
   error: no override and no default toolchain set
   init/Kconfig:70:warning: 'RUSTC_VERSION': number is invalid
   In file included from arch/ia64/include/asm/cacheflush.h:31,
                    from arch/ia64/include/asm/pgtable.h:153,
                    from include/linux/pgtable.h:6,
                    from arch/ia64/include/asm/uaccess.h:40,
                    from include/linux/uaccess.h:11,
                    from include/linux/sched/task.h:11,
                    from include/linux/sched/signal.h:9,
                    from arch/ia64/kernel/asm-offsets.c:10:
   include/asm-generic/cacheflush.h: In function 'flush_dcache_folio':
>> include/asm-generic/cacheflush.h:61:19: error: implicit declaration of function 'folio_nr_pages'; did you mean 'folio_page'? [-Werror=implicit-function-declaration]
      61 |  unsigned int n = folio_nr_pages(folio);
         |                   ^~~~~~~~~~~~~~
         |                   folio_page
   In file included from arch/ia64/include/asm/pgtable.h:153,
                    from include/linux/pgtable.h:6,
                    from arch/ia64/include/asm/uaccess.h:40,
                    from include/linux/uaccess.h:11,
                    from include/linux/sched/task.h:11,
                    from include/linux/sched/signal.h:9,
                    from arch/ia64/kernel/asm-offsets.c:10:
>> include/asm-generic/cacheflush.h:65:21: error: implicit declaration of function 'nth_page' [-Werror=implicit-function-declaration]
      65 |   flush_dcache_page(nth_page(&folio->page, n));
         |                     ^~~~~~~~
   arch/ia64/include/asm/cacheflush.h:18:25: note: in definition of macro 'flush_dcache_page'
      18 |  clear_bit(PG_arch_1, &(page)->flags); \
         |                         ^~~~
>> arch/ia64/include/asm/cacheflush.h:18:30: error: invalid type argument of '->' (have 'int')
      18 |  clear_bit(PG_arch_1, &(page)->flags); \
         |                              ^~
   include/asm-generic/cacheflush.h:65:3: note: in expansion of macro 'flush_dcache_page'
      65 |   flush_dcache_page(nth_page(&folio->page, n));
         |   ^~~~~~~~~~~~~~~~~
   arch/ia64/kernel/asm-offsets.c: At top level:
   arch/ia64/kernel/asm-offsets.c:23:6: warning: no previous prototype for 'foo' [-Wmissing-prototypes]
      23 | void foo(void)
         |      ^~~
   cc1: some warnings being treated as errors
   make[2]: *** [scripts/Makefile.build:118: arch/ia64/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [Makefile:1313: prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:222: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +61 include/asm-generic/cacheflush.h

    57	
    58	#ifndef ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
    59	static inline void flush_dcache_folio(struct folio *folio)
    60	{
  > 61		unsigned int n = folio_nr_pages(folio);
    62	
    63		do {
    64			n--;
  > 65			flush_dcache_page(nth_page(&folio->page, n));
    66		} while (n);
    67	}
    68	#endif
    69	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 64551 bytes --]

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 77/96] mm/filemap: Add filemap_alloc_folio
  2021-05-05 15:06 ` [PATCH v9 77/96] mm/filemap: Add filemap_alloc_folio Matthew Wilcox (Oracle)
@ 2021-05-06  0:00   ` kernel test robot
  2021-05-06  2:28     ` Matthew Wilcox
  0 siblings, 1 reply; 108+ messages in thread
From: kernel test robot @ 2021-05-06  0:00 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-fsdevel, linux-mm
  Cc: kbuild-all, Matthew Wilcox (Oracle), linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2830 bytes --]

Hi "Matthew,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on next-20210505]
[cannot apply to hnaz-linux-mm/master xfs-linux/for-next tip/perf/core shaggy/jfs-next block/for-next linus/master asm-generic/master v5.12 v5.12-rc8 v5.12-rc7 v5.12]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
base:    29955e0289b3255c5f609a7564a0f0bb4ae35c7a
config: x86_64-randconfig-s031-20210505 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-341-g8af24329-dirty
        # https://github.com/0day-ci/linux/commit/f694d0d59ba326a2db7840f6ff0a455620cdcaa1
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/Memory-folios/20210506-014108
        git checkout f694d0d59ba326a2db7840f6ff0a455620cdcaa1
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
>> mm/filemap.c:1000:52: sparse: sparse: incorrect type in argument 1 (different base types) @@     expected restricted gfp_t [usertype] gfp @@     got int [assigned] n @@
   mm/filemap.c:1000:52: sparse:     expected restricted gfp_t [usertype] gfp
   mm/filemap.c:1000:52: sparse:     got int [assigned] n
>> mm/filemap.c:1000:55: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected unsigned int order @@     got restricted gfp_t [usertype] gfp @@
   mm/filemap.c:1000:55: sparse:     expected unsigned int order
   mm/filemap.c:1000:55: sparse:     got restricted gfp_t [usertype] gfp

vim +1000 mm/filemap.c

   988	
   989	#ifdef CONFIG_NUMA
   990	struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order)
   991	{
   992		int n;
   993		struct folio *folio;
   994	
   995		if (cpuset_do_page_mem_spread()) {
   996			unsigned int cpuset_mems_cookie;
   997			do {
   998				cpuset_mems_cookie = read_mems_allowed_begin();
   999				n = cpuset_mem_spread_node();
> 1000				folio = __alloc_folio_node(n, gfp, order);
  1001			} while (!folio && read_mems_allowed_retry(cpuset_mems_cookie));
  1002	
  1003			return folio;
  1004		}
  1005		return alloc_folio(gfp, order);
  1006	}
  1007	EXPORT_SYMBOL(filemap_alloc_folio);
  1008	#endif
  1009	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34210 bytes --]

^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 77/96] mm/filemap: Add filemap_alloc_folio
  2021-05-06  0:00   ` kernel test robot
@ 2021-05-06  2:28     ` Matthew Wilcox
  0 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox @ 2021-05-06  2:28 UTC (permalink / raw)
  To: kernel test robot; +Cc: linux-fsdevel, linux-mm, kbuild-all, linux-kernel

On Thu, May 06, 2021 at 08:00:25AM +0800, kernel test robot wrote:
> sparse warnings: (new ones prefixed by >>)
> >> mm/filemap.c:1000:52: sparse: sparse: incorrect type in argument 1 (different base types) @@     expected restricted gfp_t [usertype] gfp @@     got int [assigned] n @@
>    mm/filemap.c:1000:52: sparse:     expected restricted gfp_t [usertype] gfp
>    mm/filemap.c:1000:52: sparse:     got int [assigned] n
> >> mm/filemap.c:1000:55: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected unsigned int order @@     got restricted gfp_t [usertype] gfp @@
>    mm/filemap.c:1000:55: sparse:     expected unsigned int order
>    mm/filemap.c:1000:55: sparse:     got restricted gfp_t [usertype] gfp

fixed



^ permalink raw reply	[flat|nested] 108+ messages in thread

* Re: [PATCH v9 46/96] mm: Add flush_dcache_folio
  2021-05-05 23:35   ` kernel test robot
@ 2021-05-06  2:33     ` Matthew Wilcox
  0 siblings, 0 replies; 108+ messages in thread
From: Matthew Wilcox @ 2021-05-06  2:33 UTC (permalink / raw)
  To: kernel test robot; +Cc: linux-fsdevel, linux-mm, kbuild-all, linux-kernel

On Thu, May 06, 2021 at 07:35:16AM +0800, kernel test robot wrote:
>    In file included from arch/ia64/include/asm/cacheflush.h:31,
>                     from arch/ia64/include/asm/pgtable.h:153,
>                     from include/linux/pgtable.h:6,
>                     from arch/ia64/include/asm/uaccess.h:40,
>                     from include/linux/uaccess.h:11,
>                     from include/linux/sched/task.h:11,
>                     from include/linux/sched/signal.h:9,
>                     from arch/ia64/kernel/asm-offsets.c:10:
>    include/asm-generic/cacheflush.h: In function 'flush_dcache_folio':
> >> include/asm-generic/cacheflush.h:61:19: error: implicit declaration of function 'folio_nr_pages'; did you mean 'folio_page'? [-Werror=implicit-function-declaration]

Ugh, I can't be bothered to untangle the ia64 header file mess.
I'll just move the function out of line.


^ permalink raw reply	[flat|nested] 108+ messages in thread

end of thread, other threads:[~2021-05-06  2:33 UTC | newest]

Thread overview: 108+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-05 15:04 [PATCH v9 00/96] Memory folios Matthew Wilcox (Oracle)
2021-05-05 15:04 ` [PATCH v9 01/96] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
2021-05-05 17:24   ` Vlastimil Babka
2021-05-05 15:04 ` [PATCH v9 02/96] mm: Make __dump_page static Matthew Wilcox (Oracle)
2021-05-05 15:04 ` [PATCH v9 03/96] mm/debug: Factor PagePoisoned out of __dump_page Matthew Wilcox (Oracle)
2021-05-05 15:04 ` [PATCH v9 04/96] mm/page_owner: Constify dump_page_owner Matthew Wilcox (Oracle)
2021-05-05 15:04 ` [PATCH v9 05/96] mm: Make compound_head const-preserving Matthew Wilcox (Oracle)
2021-05-05 15:04 ` [PATCH v9 06/96] mm: Constify get_pfnblock_flags_mask and get_pfnblock_migratetype Matthew Wilcox (Oracle)
2021-05-05 15:04 ` [PATCH v9 07/96] mm: Constify page_count and page_ref_count Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 08/96] mm: Fix struct page layout on 32-bit systems Matthew Wilcox (Oracle)
2021-05-05 17:33   ` Vlastimil Babka
2021-05-05 15:05 ` [PATCH v9 09/96] mm: Introduce struct folio Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 10/96] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 11/96] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 12/96] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 13/96] mm: Add folio reference count functions Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 14/96] mm: Add folio_put Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 15/96] mm: Add folio_get Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 16/96] mm: Add folio flag manipulation functions Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 17/96] mm: Add folio_young() and folio_idle() Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 18/96] mm: Handle per-folio private data Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 19/96] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 20/96] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 21/96] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 22/96] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 23/96] mm: Add folio_mapcount Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 24/96] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 25/96] mm/filemap: Add folio_unlock Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 26/96] mm/filemap: Add folio_lock Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 27/96] mm/filemap: Add folio_lock_killable Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 28/96] mm/filemap: Add __folio_lock_async Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 29/96] mm/filemap: Add __folio_lock_or_retry Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 30/96] mm/filemap: Add folio_wait_locked Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 31/96] mm/swap: Add folio_rotate_reclaimable Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 32/96] mm/filemap: Add folio_end_writeback Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 33/96] mm/writeback: Add folio_wait_writeback Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 34/96] mm/writeback: Add folio_wait_stable Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 35/96] mm/filemap: Add folio_wait_bit Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 36/96] mm/filemap: Add folio_wake_bit Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 37/96] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 38/96] mm/filemap: Add folio private_2 functions Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 39/96] fs/netfs: Add folio fscache functions Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 40/96] mm: Add folio_mapped Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 41/96] mm/workingset: Convert workingset_activation to take a folio Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 42/96] mm/swap: Add folio_activate Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 43/96] mm/swap: Add folio_mark_accessed Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 44/96] mm/rmap: Add folio_mkclean Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 45/96] mm: Add kmap_local_folio Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 46/96] mm: Add flush_dcache_folio Matthew Wilcox (Oracle)
2021-05-05 23:35   ` kernel test robot
2021-05-06  2:33     ` Matthew Wilcox
2021-05-05 15:05 ` [PATCH v9 47/96] mm: Add arch_make_folio_accessible Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 48/96] mm/memcg: Remove 'page' parameter to mem_cgroup_charge_statistics Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 49/96] mm/memcg: Use the node id in mem_cgroup_update_tree Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 50/96] mm/memcg: Convert commit_charge to take a folio Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 51/96] mm/memcg: Add folio_charge_cgroup Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 52/96] mm/memcg: Add folio_uncharge_cgroup Matthew Wilcox (Oracle)
2021-05-05 20:24   ` kernel test robot
2021-05-05 15:05 ` [PATCH v9 53/96] mm/memcg: Convert mem_cgroup_track_foreign_dirty_slowpath to folio Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 54/96] mm/writeback: Rename __add_wb_stat to wb_stat_mod Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 55/96] flex_proportions: Allow N events instead of 1 Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 56/96] mm/writeback: Change __wb_writeout_inc to __wb_writeout_add Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 57/96] mm/writeback: Convert test_clear_page_writeback to __folio_end_writeback Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 58/96] mm/writeback: Add folio_start_writeback Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 59/96] mm/writeback: Add folio_mark_dirty Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 60/96] mm/writeback: Use __set_page_dirty in __set_page_dirty_nobuffers Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 61/96] mm/writeback: Add __folio_mark_dirty Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 62/96] mm/writeback: Add filemap_dirty_folio Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 63/96] mm/writeback: Add folio_account_cleaned Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 64/96] mm/writeback: Add folio_cancel_dirty Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 65/96] mm/writeback: Add folio_clear_dirty_for_io Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 66/96] mm/writeback: Add folio_account_redirty Matthew Wilcox (Oracle)
2021-05-05 15:05 ` [PATCH v9 67/96] mm/writeback: Add folio_redirty_for_writepage Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 68/96] mm/filemap: Add i_blocks_per_folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 69/96] mm/filemap: Add folio_mkwrite_check_truncate Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 70/96] mm/filemap: Add readahead_folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 71/96] block: Add bio_add_folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 72/96] block: Add bio_for_each_folio_all Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 73/96] mm/lru: Add folio_lru and folio_is_file_lru Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 74/96] mm/workingset: Convert workingset_refault to take a folio Matthew Wilcox (Oracle)
2021-05-05 20:17   ` kernel test robot
2021-05-05 20:57     ` Matthew Wilcox
2021-05-05 15:06 ` [PATCH v9 75/96] mm/lru: Add folio_add_lru Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 76/96] mm/page_alloc: Add __alloc_folio, __alloc_folio_node and alloc_folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 77/96] mm/filemap: Add filemap_alloc_folio Matthew Wilcox (Oracle)
2021-05-06  0:00   ` kernel test robot
2021-05-06  2:28     ` Matthew Wilcox
2021-05-05 15:06 ` [PATCH v9 78/96] mm/filemap: Add folio_add_to_page_cache Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 79/96] mm/filemap: Convert mapping_get_entry to return a folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 80/96] mm/filemap: Add filemap_get_folio and find_get_folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 81/96] mm/filemap: Add filemap_get_stable_folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 82/96] iomap: Convert to_iomap_page to take a folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 83/96] iomap: Convert iomap_page_create " Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 84/96] iomap: Convert iomap_page_release " Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 85/96] iomap: Convert iomap_releasepage to use " Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 86/96] iomap: Convert iomap_invalidatepage " Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 87/96] iomap: Pass the iomap_page into iomap_set_range_uptodate Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 88/96] iomap: Use folio offsets instead of page offsets Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 89/96] iomap: Convert bio completions to use folios Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 90/96] iomap: Convert readahead and readpage to use a folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 91/96] iomap: Convert iomap_page_mkwrite " Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 92/96] iomap: Convert iomap_write_begin and iomap_write_end to folios Matthew Wilcox (Oracle)
2021-05-05 21:36   ` kernel test robot
2021-05-05 22:10     ` Matthew Wilcox
2021-05-05 15:06 ` [PATCH v9 93/96] iomap: Convert iomap_read_inline_data to take a folio Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 94/96] iomap: Convert iomap_write_end_inline " Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 95/96] iomap: Convert iomap_add_to_ioend " Matthew Wilcox (Oracle)
2021-05-05 15:06 ` [PATCH v9 96/96] iomap: Convert iomap_do_writepage to use " Matthew Wilcox (Oracle)

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox