Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v7 00/28] Memory Folios
@ 2021-04-09 18:50 Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 01/28] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
                   ` (27 more replies)
  0 siblings, 28 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs

Managing memory in 4KiB pages is a serious overhead.  Many benchmarks
benefit from a larger "page size".  As an example, an earlier iteration
of this idea which used compound pages (and wasn't particularly tuned)
got a 7% performance boost when compiling the kernel.

Using compound pages or THPs exposes a serious weakness in our type
system.  Functions are often unprepared for compound pages to be passed
to them, and may only act on PAGE_SIZE chunks.  Even functions which are
aware of compound pages may expect a head page, and do the wrong thing
if passed a tail page.

There have been efforts to label function parameters as 'head' instead
of 'page' to indicate that the function expects a head page, but this
leaves us with runtime assertions instead of using the compiler to prove
that nobody has mistakenly passed a tail page.  Calling a struct page
'head' is also inaccurate as they will work perfectly well on base pages.

We also waste a lot of instructions ensuring that we're not looking at
a tail page.  Almost every call to PageFoo() contains one or more hidden
calls to compound_head().  This also happens for get_page(), put_page()
and many more functions.  There does not appear to be a way to tell gcc
that it can cache the result of compound_head(), nor is there a way to
tell it that compound_head() is idempotent.

This series introduces the 'struct folio' as a replacement for
head-or-base pages.  This initial set reduces the kernel size by
approximately 6kB by removing conversions from tail pages to head pages.
The real purpose of this series is adding infrastructure to enable
further use of the folio.

The medium-term goal is to convert all filesystems and some device
drivers to work in terms of folios.  This series contains a lot of
explicit conversions, but it's important to realise it's removing a lot
of implicit conversions in some relatively hot paths.  There will be very
few conversions from folios when this work is completed; filesystems,
the page cache, the LRU and so on will generally only deal with folios.

I analysed the text size reduction using a config based on Oracle UEK
with all modules changed to built-in.  That's obviously not a kernel
which makes sense to run, but it serves to compare the effects on (many
common) filesystems & drivers, not just the core.

add/remove: 33652/33642 grow/shrink: 1799/1955 up/down: 895792/-901770 (-5978)

For a "just the core" comparison, here's an allnoconfig comparison:
add/remove: 201/197 grow/shrink: 9/29 up/down: 7523/-8797 (-1274)

Current tree at:
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio

(contains another ~100 patches on top of this batch, not all of which are
in good shape for submission)

v7:
 - Rebase on next-20210409
   - keep up with afs changes
   - wait_on_page_fscache() no longer needs to be modified
   - unlock_page_private_2() changed to end_page_private_2()
   - wait_on_page_private_2() is new
   - wait_on_page_private_2_killable() is new
 - Optimise nth_page() instead of avoiding it (Christoph, Kirill)
 - Use nth_page() in folio_file_page()
 - Use static_assert() for FOLIO_MATCH (Rasmus)
 - Add a FOLIO_MATCH that lru and compound_head are at the same offset
 - Make page_count() use folio_ref_count() instead of page_ref_count()
v6:
 - Rebase on next-20210330
   - wait_bit_key patch merged by Linus
   - wait_on_page_writeback_killable() patches merged by Linus
   - Documentation patch merged by Andrew
 - Move folio_next_index() into this series
 - Move folio_offset() and folio_file_offset() into this series
 - Mirror members of struct page (for pagecache / anon) into struct folio,
   so (eg) you can use folio->mapping instead of folio->page.mapping
 - Add folio_ref_* functions, including kernel-doc for folio_ref_count().
 - Add count_memcg_folio_event()
 - Add put_folio_testzero()
 - Add folio_mapcount()
 - Add FolioKsm()
 - Fix afs_page_mkwrite() compilation
 - Fix/improve kernel-doc for
   - struct folio
   - add_folio_wait_queue()
   - wait_for_stable_folio()
   - wait_on_folio_writeback()
   - wait_on_folio_writeback_killable()
v5:
 - Rebase on next-20210319
 - Pull out three bug-fix patches to the front of the series, allowing
   them to be applied earlier.
 - Fix folio_page() against pages being moved between swap & page cache
 - Fix FolioDoubleMap to use the right page flags
 - Rename next_folio() to folio_next() (akpm)
 - Renamed folio stat functions (akpm)
 - Add 'mod' versions of the folio stats for users that already have 'nr'
 - Renamed folio_page to folio_file_page() (akpm)
 - Added kernel-doc for struct folio, folio_next(), folio_index(),
   folio_file_page(), folio_contains(), folio_order(), folio_nr_pages(),
   folio_shift(), folio_size(), page_folio(), get_folio(), put_folio()
 - Make folio_private() work in terms of void * instead of unsigned long
 - Used page_folio() in attach/detach page_private() (hch)
 - Drop afs_page_mkwrite folio conversion from this series
 - Add wait_on_folio_writeback_killable()
 - Convert add_page_wait_queue() to add_folio_wait_queue()
 - Add folio_swap_entry() helper
 - Drop the additions of *FolioFsCache
 - Simplify the addition of lock_folio_memcg() et al
 - Drop test_clear_page_writeback() conversion from this series
 - Add FolioTransHuge() definition
 - Rename __folio_file_mapping() to swapcache_mapping()
 - Added swapcache_index() helper
 - Removed lock_folio_async()
 - Made __lock_folio_async() static to filemap.c
 - Converted unlock_page_private_2() to use a folio internally
v4:
 - Rebase on current Linus tree (including swap fix)
 - Analyse each patch in terms of its effects on kernel text size.
   A few were modified to improve their effect.  In particular, where
   pushing calls to page_folio() into the callers resulted in unacceptable
   size increases, the wrapper was placed in mm/folio-compat.c.  This lets
   us see all the places which are good targets for conversion to folios.
 - Some of the patches were reordered, split or merged in order to make
   more logical sense.
 - Use nth_page() for folio_next() if we're using SPARSEMEM and not
   VMEMMAP (Zi Yan)
 - Increment and decrement page stats in units of pages instead of units
   of folios (Zi Yan)
v3:
 - Rebase on next-20210127.  Two major sources of conflict, the
   generic_file_buffered_read refactoring (in akpm tree) and the
   fscache work (in dhowells tree).
v2:
 - Pare patch series back to just infrastructure and the page waiting
   parts.

Matthew Wilcox (Oracle) (28):
  mm: Optimise nth_page for contiguous memmap
  mm: Introduce struct folio
  mm: Add folio_pgdat and folio_zone
  mm/vmstat: Add functions to account folio statistics
  mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  mm: Add folio reference count functions
  mm: Add put_folio
  mm: Add get_folio
  mm: Create FolioFlags
  mm: Handle per-folio private data
  mm/filemap: Add folio_index, folio_file_page and folio_contains
  mm/filemap: Add folio_next_index
  mm/filemap: Add folio_offset and folio_file_offset
  mm/util: Add folio_mapping and folio_file_mapping
  mm: Add folio_mapcount
  mm/memcg: Add folio wrappers for various functions
  mm/filemap: Add unlock_folio
  mm/filemap: Add lock_folio
  mm/filemap: Add lock_folio_killable
  mm/filemap: Add __lock_folio_async
  mm/filemap: Add __lock_folio_or_retry
  mm/filemap: Add wait_on_folio_locked
  mm/filemap: Add end_folio_writeback
  mm/writeback: Add wait_on_folio_writeback
  mm/writeback: Add wait_for_stable_folio
  mm/filemap: Convert wait_on_page_bit to wait_on_folio_bit
  mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit
  mm/filemap: Convert page wait queues to be folios

 Documentation/core-api/mm-api.rst |   3 +
 fs/afs/write.c                    |   9 +-
 fs/cachefiles/rdwr.c              |  16 +-
 fs/io_uring.c                     |   2 +-
 include/linux/memcontrol.h        |  30 ++++
 include/linux/mm.h                | 177 ++++++++++++++++----
 include/linux/mm_types.h          |  96 +++++++++++
 include/linux/mmdebug.h           |  20 +++
 include/linux/page-flags.h        | 130 +++++++++++---
 include/linux/page_ref.h          |  88 +++++++++-
 include/linux/pagemap.h           | 270 ++++++++++++++++++++++--------
 include/linux/swap.h              |   6 +
 include/linux/vmstat.h            | 107 ++++++++++++
 mm/Makefile                       |   2 +-
 mm/filemap.c                      | 256 ++++++++++++++--------------
 mm/folio-compat.c                 |  37 ++++
 mm/memory.c                       |   8 +-
 mm/page-writeback.c               |  72 +++++---
 mm/swapfile.c                     |   8 +-
 mm/util.c                         |  30 ++--
 20 files changed, 1056 insertions(+), 311 deletions(-)
 create mode 100644 mm/folio-compat.c

-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 01/28] mm: Optimise nth_page for contiguous memmap
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-12  6:08   ` Christoph Hellwig
  2021-04-09 18:50 ` [PATCH v7 02/28] mm: Introduce struct folio Matthew Wilcox (Oracle)
                   ` (26 subsequent siblings)
  27 siblings, 1 reply; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs

If the memmap is virtually contiguous (either because we're using
a virtually mapped memmap or because we don't support a discontig
memmap at all), then we can implement nth_page() by simple addition.
Contrary to popular belief, the compiler is not able to optimise this
itself for a vmemmap configuration.  This reduces one example user (sg.c)
by four instructions:

        struct page *page = nth_page(rsv_schp->pages[k], offset >> PAGE_SHIFT);

before:
   49 8b 45 70             mov    0x70(%r13),%rax
   48 63 c9                movslq %ecx,%rcx
   48 c1 eb 0c             shr    $0xc,%rbx
   48 8b 04 c8             mov    (%rax,%rcx,8),%rax
   48 2b 05 00 00 00 00    sub    0x0(%rip),%rax
           R_X86_64_PC32      vmemmap_base-0x4
   48 c1 f8 06             sar    $0x6,%rax
   48 01 d8                add    %rbx,%rax
   48 c1 e0 06             shl    $0x6,%rax
   48 03 05 00 00 00 00    add    0x0(%rip),%rax
           R_X86_64_PC32      vmemmap_base-0x4

after:
   49 8b 45 70             mov    0x70(%r13),%rax
   48 63 c9                movslq %ecx,%rcx
   48 c1 eb 0c             shr    $0xc,%rbx
   48 c1 e3 06             shl    $0x6,%rbx
   48 03 1c c8             add    (%rax,%rcx,8),%rbx

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b58c73e50da0..036f63a44a5c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -234,7 +234,11 @@ int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
 int __add_to_page_cache_locked(struct page *page, struct address_space *mapping,
 		pgoff_t index, gfp_t gfp, void **shadowp);
 
+#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
+#else
+#define nth_page(page,n) ((page) + (n))
+#endif
 
 /* to align the pointer to the (next) page boundary */
 #define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 02/28] mm: Introduce struct folio
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 01/28] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 22:45   ` kernel test robot
                     ` (2 more replies)
  2021-04-09 18:50 ` [PATCH v7 03/28] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
                   ` (25 subsequent siblings)
  27 siblings, 3 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Jeff Layton

A struct folio is a new abstraction to replace the venerable struct page.
A function which takes a struct folio argument declares that it will
operate on the entire (possibly compound) page, not just PAGE_SIZE bytes.
In return, the caller guarantees that the pointer it is passing does
not point to a tail page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h       | 74 +++++++++++++++++++++++++++++++++++++
 include/linux/mm_types.h | 80 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 154 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 036f63a44a5c..4ece80aa8d05 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -950,6 +950,20 @@ static inline unsigned int compound_order(struct page *page)
 	return page[1].compound_order;
 }
 
+/**
+ * folio_order - The allocation order of a folio.
+ * @folio: The folio.
+ *
+ * A folio is composed of 2^order pages.  See get_order() for the definition
+ * of order.
+ *
+ * Return: The order of the folio.
+ */
+static inline unsigned int folio_order(struct folio *folio)
+{
+	return compound_order(&folio->page);
+}
+
 static inline bool hpage_pincount_available(struct page *page)
 {
 	/*
@@ -1595,6 +1609,65 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
 #endif
 }
 
+/**
+ * folio_nr_pages - The number of pages in the folio.
+ * @folio: The folio.
+ *
+ * Return: A number which is a power of two.
+ */
+static inline unsigned long folio_nr_pages(struct folio *folio)
+{
+	return compound_nr(&folio->page);
+}
+
+/**
+ * folio_next - Move to the next physical folio.
+ * @folio: The folio we're currently operating on.
+ *
+ * If you have physically contiguous memory which may span more than
+ * one folio (eg a &struct bio_vec), use this function to move from one
+ * folio to the next.  Do not use it if the memory is only virtually
+ * contiguous as the folios are almost certainly not adjacent to each
+ * other.  This is the folio equivalent to writing ``page++``.
+ *
+ * Context: We assume that the folios are refcounted and/or locked at a
+ * higher level and do not adjust the reference counts.
+ * Return: The next struct folio.
+ */
+static inline struct folio *folio_next(struct folio *folio)
+{
+	return (struct folio *)nth_page(&folio->page, folio_nr_pages(folio));
+}
+
+/**
+ * folio_shift - The number of bits covered by this folio.
+ * @folio: The folio.
+ *
+ * A folio contains a number of bytes which is a power-of-two in size.
+ * This function tells you which power-of-two the folio is.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split.  It is not necessary for the folio to be locked.
+ * Return: The base-2 logarithm of the size of this folio.
+ */
+static inline unsigned int folio_shift(struct folio *folio)
+{
+	return PAGE_SHIFT + folio_order(folio);
+}
+
+/**
+ * folio_size - The number of bytes in a folio.
+ * @folio: The folio.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split.  It is not necessary for the folio to be locked.
+ * Return: The number of bytes in this folio.
+ */
+static inline size_t folio_size(struct folio *folio)
+{
+	return PAGE_SIZE << folio_order(folio);
+}
+
 /*
  * Some inline functions in vmstat.h depend on page_zone()
  */
@@ -1699,6 +1772,7 @@ extern void pagefault_out_of_memory(void);
 
 #define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)
 #define offset_in_thp(page, p)	((unsigned long)(p) & (thp_size(page) - 1))
+#define offset_in_folio(folio, p) ((unsigned long)(p) & (folio_size(folio) - 1))
 
 /*
  * Flags passed to show_mem() and show_free_areas() to suppress output in
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6613b26a8894..d65050851037 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -224,6 +224,86 @@ struct page {
 #endif
 } _struct_page_alignment;
 
+/**
+ * struct folio - Represents a contiguous set of bytes.
+ * @flags: Identical to the page flags.
+ * @lru: Least Recently Used list; tracks how recently this folio was used.
+ * @mapping: The file this page belongs to, or refers to the anon_vma for
+ *    anonymous pages.
+ * @index: Offset within the file, in units of pages.  For anonymous pages,
+ *    this is the index from the beginning of the mmap.
+ * @private: Filesystem per-folio data (see attach_folio_private()).
+ *    Used for swp_entry_t if FolioSwapCache().
+ * @_mapcount: How many times this folio is mapped to userspace.  Use
+ *    folio_mapcount() to access it.
+ * @_refcount: Number of references to this folio.  Use folio_ref_count()
+ *    to read it.
+ * @memcg_data: Memory Control Group data.
+ *
+ * A folio is a physically, virtually and logically contiguous set
+ * of bytes.  It is a power-of-two in size, and it is aligned to that
+ * same power-of-two.  It is at least as large as %PAGE_SIZE.  If it is
+ * in the page cache, it is at a file offset which is a multiple of that
+ * power-of-two.
+ */
+struct folio {
+	/* private: don't document the anon union */
+	union {
+		struct {
+	/* public: */
+			unsigned long flags;
+			struct list_head lru;
+			struct address_space *mapping;
+			pgoff_t index;
+			unsigned long private;
+			atomic_t _mapcount;
+			atomic_t _refcount;
+#ifdef CONFIG_MEMCG
+			unsigned long memcg_data;
+#endif
+	/* private: the union with struct page is transitional */
+		};
+		struct page page;
+	};
+};
+
+static_assert(sizeof(struct page) == sizeof(struct folio));
+#define FOLIO_MATCH(pg, fl)						\
+	static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
+FOLIO_MATCH(flags, flags);
+FOLIO_MATCH(lru, lru);
+FOLIO_MATCH(compound_head, lru);
+FOLIO_MATCH(index, index);
+FOLIO_MATCH(private, private);
+FOLIO_MATCH(_mapcount, _mapcount);
+FOLIO_MATCH(_refcount, _refcount);
+#ifdef CONFIG_MEMCG
+FOLIO_MATCH(memcg_data, memcg_data);
+#endif
+#undef FOLIO_MATCH
+
+/**
+ * page_folio - Converts from page to folio.
+ * @page: The page.
+ *
+ * Every page is part of a folio.  This function cannot be called on a
+ * NULL pointer.
+ *
+ * Context: No reference, nor lock is required on @page.  If the caller
+ * does not hold a reference, this call may race with a folio split, so
+ * it should re-check the folio still contains this page after gaining
+ * a reference on the folio.
+ * Return: The folio which contains this page.
+ */
+static inline struct folio *page_folio(struct page *page)
+{
+	unsigned long head = READ_ONCE(page->compound_head);
+
+	if (unlikely(head & 1))
+		return (struct folio *)(head - 1);
+	return (struct folio *)page;
+}
+
 static inline atomic_t *compound_mapcount_ptr(struct page *page)
 {
 	return &page[1].compound_mapcount;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 03/28] mm: Add folio_pgdat and folio_zone
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 01/28] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 02/28] mm: Introduce struct folio Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 04/28] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs, Zi Yan,
	Christoph Hellwig, Jeff Layton

These are just convenience wrappers for callers with folios; pgdat and
zone can be reached from tail pages as well as head pages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4ece80aa8d05..4c98b52613b7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1560,6 +1560,16 @@ static inline pg_data_t *page_pgdat(const struct page *page)
 	return NODE_DATA(page_to_nid(page));
 }
 
+static inline struct zone *folio_zone(const struct folio *folio)
+{
+	return page_zone(&folio->page);
+}
+
+static inline pg_data_t *folio_pgdat(const struct folio *folio)
+{
+	return page_pgdat(&folio->page);
+}
+
 #ifdef SECTION_IN_PAGE_FLAGS
 static inline void set_page_section(struct page *page, unsigned long section)
 {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 04/28] mm/vmstat: Add functions to account folio statistics
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 03/28] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 05/28] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Allow page counters to be more readily modified by callers which have
a folio.  Name these wrappers with 'stat' instead of 'state' as requested
by Linus here:
https://lore.kernel.org/linux-mm/CAHk-=wj847SudR-kt+46fT3+xFFgiwpgThvm7DJWGdi4cVrbnQ@mail.gmail.com/

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/vmstat.h | 107 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 3299cd69e4ca..d287d7c31b8f 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -402,6 +402,78 @@ static inline void drain_zonestat(struct zone *zone,
 			struct per_cpu_pageset *pset) { }
 #endif		/* CONFIG_SMP */
 
+static inline void __zone_stat_mod_folio(struct folio *folio,
+		enum zone_stat_item item, long nr)
+{
+	__mod_zone_page_state(folio_zone(folio), item, nr);
+}
+
+static inline void __zone_stat_add_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	__mod_zone_page_state(folio_zone(folio), item, folio_nr_pages(folio));
+}
+
+static inline void __zone_stat_sub_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	__mod_zone_page_state(folio_zone(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void zone_stat_mod_folio(struct folio *folio,
+		enum zone_stat_item item, long nr)
+{
+	mod_zone_page_state(folio_zone(folio), item, nr);
+}
+
+static inline void zone_stat_add_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	mod_zone_page_state(folio_zone(folio), item, folio_nr_pages(folio));
+}
+
+static inline void zone_stat_sub_folio(struct folio *folio,
+		enum zone_stat_item item)
+{
+	mod_zone_page_state(folio_zone(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void __node_stat_mod_folio(struct folio *folio,
+		enum node_stat_item item, long nr)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, nr);
+}
+
+static inline void __node_stat_add_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, folio_nr_pages(folio));
+}
+
+static inline void __node_stat_sub_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	__mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
+}
+
+static inline void node_stat_mod_folio(struct folio *folio,
+		enum node_stat_item item, long nr)
+{
+	mod_node_page_state(folio_pgdat(folio), item, nr);
+}
+
+static inline void node_stat_add_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	mod_node_page_state(folio_pgdat(folio), item, folio_nr_pages(folio));
+}
+
+static inline void node_stat_sub_folio(struct folio *folio,
+		enum node_stat_item item)
+{
+	mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
+}
+
 static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pages,
 					     int migratetype)
 {
@@ -530,6 +602,24 @@ static inline void __dec_lruvec_page_state(struct page *page,
 	__mod_lruvec_page_state(page, idx, -1);
 }
 
+static inline void __lruvec_stat_mod_folio(struct folio *folio,
+					   enum node_stat_item idx, int val)
+{
+	__mod_lruvec_page_state(&folio->page, idx, val);
+}
+
+static inline void __lruvec_stat_add_folio(struct folio *folio,
+					   enum node_stat_item idx)
+{
+	__lruvec_stat_mod_folio(folio, idx, folio_nr_pages(folio));
+}
+
+static inline void __lruvec_stat_sub_folio(struct folio *folio,
+					   enum node_stat_item idx)
+{
+	__lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
+}
+
 static inline void inc_lruvec_page_state(struct page *page,
 					 enum node_stat_item idx)
 {
@@ -542,4 +632,21 @@ static inline void dec_lruvec_page_state(struct page *page,
 	mod_lruvec_page_state(page, idx, -1);
 }
 
+static inline void lruvec_stat_mod_folio(struct folio *folio,
+					 enum node_stat_item idx, int val)
+{
+	mod_lruvec_page_state(&folio->page, idx, val);
+}
+
+static inline void lruvec_stat_add_folio(struct folio *folio,
+					 enum node_stat_item idx)
+{
+	lruvec_stat_mod_folio(folio, idx, folio_nr_pages(folio));
+}
+
+static inline void lruvec_stat_sub_folio(struct folio *folio,
+					 enum node_stat_item idx)
+{
+	lruvec_stat_mod_folio(folio, idx, -folio_nr_pages(folio));
+}
 #endif /* _LINUX_VMSTAT_H */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 05/28] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 04/28] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 06/28] mm: Add folio reference count functions Matthew Wilcox (Oracle)
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs, Zi Yan,
	Christoph Hellwig, Jeff Layton

These are the folio equivalents of VM_BUG_ON_PAGE and VM_WARN_ON_ONCE_PAGE.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mmdebug.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
index 5d0767cb424a..77d24e1dcaec 100644
--- a/include/linux/mmdebug.h
+++ b/include/linux/mmdebug.h
@@ -23,6 +23,13 @@ void dump_mm(const struct mm_struct *mm);
 			BUG();						\
 		}							\
 	} while (0)
+#define VM_BUG_ON_FOLIO(cond, folio)					\
+	do {								\
+		if (unlikely(cond)) {					\
+			dump_page(&folio->page, "VM_BUG_ON_FOLIO(" __stringify(cond)")");\
+			BUG();						\
+		}							\
+	} while (0)
 #define VM_BUG_ON_VMA(cond, vma)					\
 	do {								\
 		if (unlikely(cond)) {					\
@@ -48,6 +55,17 @@ void dump_mm(const struct mm_struct *mm);
 	}								\
 	unlikely(__ret_warn_once);					\
 })
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio)	({			\
+	static bool __section(".data.once") __warned;			\
+	int __ret_warn_once = !!(cond);					\
+									\
+	if (unlikely(__ret_warn_once && !__warned)) {			\
+		dump_page(&folio->page, "VM_WARN_ON_ONCE_FOLIO(" __stringify(cond)")");\
+		__warned = true;					\
+		WARN_ON(1);						\
+	}								\
+	unlikely(__ret_warn_once);					\
+})
 
 #define VM_WARN_ON(cond) (void)WARN_ON(cond)
 #define VM_WARN_ON_ONCE(cond) (void)WARN_ON_ONCE(cond)
@@ -56,11 +74,13 @@ void dump_mm(const struct mm_struct *mm);
 #else
 #define VM_BUG_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_BUG_ON_PAGE(cond, page) VM_BUG_ON(cond)
+#define VM_BUG_ON_FOLIO(cond, folio) VM_BUG_ON(cond)
 #define VM_BUG_ON_VMA(cond, vma) VM_BUG_ON(cond)
 #define VM_BUG_ON_MM(cond, mm) VM_BUG_ON(cond)
 #define VM_WARN_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
+#define VM_WARN_ON_ONCE_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ONCE(cond, format...) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN(cond, format...) BUILD_BUG_ON_INVALID(cond)
 #endif
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 06/28] mm: Add folio reference count functions
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 05/28] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 07/28] mm: Add put_folio Matthew Wilcox (Oracle)
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

These functions mirror their page reference counterparts.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/core-api/mm-api.rst |  1 +
 include/linux/page_ref.h          | 88 ++++++++++++++++++++++++++++++-
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index 34f46df91a8b..1ead2570b217 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -97,3 +97,4 @@ More Memory Management Functions
    :internal:
 .. kernel-doc:: include/linux/mm.h
    :internal:
+.. kernel-doc:: include/linux/page_ref.h
diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index f3318f34fc54..47f0070dc508 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -67,9 +67,31 @@ static inline int page_ref_count(struct page *page)
 	return atomic_read(&page->_refcount);
 }
 
+/**
+ * folio_ref_count - The reference count on this folio.
+ * @folio: The folio.
+ *
+ * The refcount is usually incremented by calls to get_folio() and
+ * decremented by calls to put_folio().  Some typical users of the
+ * folio refcount:
+ *
+ * - Each reference from a page table
+ * - The page cache
+ * - Filesystem private data
+ * - The LRU list
+ * - Pipes
+ * - Direct IO which references this page in the process address space
+ *
+ * Return: The number of references to this folio.
+ */
+static inline int folio_ref_count(struct folio *folio)
+{
+	return page_ref_count(&folio->page);
+}
+
 static inline int page_count(struct page *page)
 {
-	return atomic_read(&compound_head(page)->_refcount);
+	return folio_ref_count(page_folio(page));
 }
 
 static inline void set_page_count(struct page *page, int v)
@@ -79,6 +101,11 @@ static inline void set_page_count(struct page *page, int v)
 		__page_ref_set(page, v);
 }
 
+static inline void set_folio_count(struct folio *folio, int v)
+{
+	set_page_count(&folio->page, v);
+}
+
 /*
  * Setup the page count before being freed into the page allocator for
  * the first time (boot or memory hotplug)
@@ -95,6 +122,11 @@ static inline void page_ref_add(struct page *page, int nr)
 		__page_ref_mod(page, nr);
 }
 
+static inline void folio_ref_add(struct folio *folio, int nr)
+{
+	page_ref_add(&folio->page, nr);
+}
+
 static inline void page_ref_sub(struct page *page, int nr)
 {
 	atomic_sub(nr, &page->_refcount);
@@ -102,6 +134,11 @@ static inline void page_ref_sub(struct page *page, int nr)
 		__page_ref_mod(page, -nr);
 }
 
+static inline void folio_ref_sub(struct folio *folio, int nr)
+{
+	page_ref_sub(&folio->page, nr);
+}
+
 static inline int page_ref_sub_return(struct page *page, int nr)
 {
 	int ret = atomic_sub_return(nr, &page->_refcount);
@@ -111,6 +148,11 @@ static inline int page_ref_sub_return(struct page *page, int nr)
 	return ret;
 }
 
+static inline int folio_ref_sub_return(struct folio *folio, int nr)
+{
+	return page_ref_sub_return(&folio->page, nr);
+}
+
 static inline void page_ref_inc(struct page *page)
 {
 	atomic_inc(&page->_refcount);
@@ -118,6 +160,11 @@ static inline void page_ref_inc(struct page *page)
 		__page_ref_mod(page, 1);
 }
 
+static inline void folio_ref_inc(struct folio *folio)
+{
+	page_ref_inc(&folio->page);
+}
+
 static inline void page_ref_dec(struct page *page)
 {
 	atomic_dec(&page->_refcount);
@@ -125,6 +172,11 @@ static inline void page_ref_dec(struct page *page)
 		__page_ref_mod(page, -1);
 }
 
+static inline void folio_ref_dec(struct folio *folio)
+{
+	page_ref_dec(&folio->page);
+}
+
 static inline int page_ref_sub_and_test(struct page *page, int nr)
 {
 	int ret = atomic_sub_and_test(nr, &page->_refcount);
@@ -134,6 +186,11 @@ static inline int page_ref_sub_and_test(struct page *page, int nr)
 	return ret;
 }
 
+static inline int folio_ref_sub_and_test(struct folio *folio, int nr)
+{
+	return page_ref_sub_and_test(&folio->page, nr);
+}
+
 static inline int page_ref_inc_return(struct page *page)
 {
 	int ret = atomic_inc_return(&page->_refcount);
@@ -143,6 +200,11 @@ static inline int page_ref_inc_return(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_inc_return(struct folio *folio)
+{
+	return page_ref_inc_return(&folio->page);
+}
+
 static inline int page_ref_dec_and_test(struct page *page)
 {
 	int ret = atomic_dec_and_test(&page->_refcount);
@@ -152,6 +214,11 @@ static inline int page_ref_dec_and_test(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_dec_and_test(struct folio *folio)
+{
+	return page_ref_dec_and_test(&folio->page);
+}
+
 static inline int page_ref_dec_return(struct page *page)
 {
 	int ret = atomic_dec_return(&page->_refcount);
@@ -161,6 +228,11 @@ static inline int page_ref_dec_return(struct page *page)
 	return ret;
 }
 
+static inline int folio_ref_dec_return(struct folio *folio)
+{
+	return page_ref_dec_return(&folio->page);
+}
+
 static inline int page_ref_add_unless(struct page *page, int nr, int u)
 {
 	int ret = atomic_add_unless(&page->_refcount, nr, u);
@@ -170,6 +242,11 @@ static inline int page_ref_add_unless(struct page *page, int nr, int u)
 	return ret;
 }
 
+static inline int folio_ref_add_unless(struct folio *folio, int nr, int u)
+{
+	return page_ref_add_unless(&folio->page, nr, u);
+}
+
 static inline int page_ref_freeze(struct page *page, int count)
 {
 	int ret = likely(atomic_cmpxchg(&page->_refcount, count, 0) == count);
@@ -179,6 +256,11 @@ static inline int page_ref_freeze(struct page *page, int count)
 	return ret;
 }
 
+static inline int folio_ref_freeze(struct folio *folio, int count)
+{
+	return page_ref_freeze(&folio->page, count);
+}
+
 static inline void page_ref_unfreeze(struct page *page, int count)
 {
 	VM_BUG_ON_PAGE(page_count(page) != 0, page);
@@ -189,4 +271,8 @@ static inline void page_ref_unfreeze(struct page *page, int count)
 		__page_ref_unfreeze(page, count);
 }
 
+static inline void folio_ref_unfreeze(struct folio *folio, int count)
+{
+	page_ref_unfreeze(&folio->page, count);
+}
 #endif
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 07/28] mm: Add put_folio
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 06/28] mm: Add folio reference count functions Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 08/28] mm: Add get_folio Matthew Wilcox (Oracle)
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs, Zi Yan,
	Christoph Hellwig, Jeff Layton

If we know we have a folio, we can call put_folio() instead of put_page()
and save the overhead of calling compound_head().  Also skips the
devmap checks.

This commit looks like it should be a no-op, but actually saves 1312 bytes
of text with the distro-derived config that I'm testing.  Some functions
grow a little while others shrink.  I presume the compiler is making
different inlining decisions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4c98b52613b7..747c6f47aef6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -751,6 +751,11 @@ static inline int put_page_testzero(struct page *page)
 	return page_ref_dec_and_test(page);
 }
 
+static inline int put_folio_testzero(struct folio *folio)
+{
+	return put_page_testzero(&folio->page);
+}
+
 /*
  * Try to grab a ref unless the page has a refcount of zero, return false if
  * that is the case.
@@ -1242,9 +1247,28 @@ static inline __must_check bool try_get_page(struct page *page)
 	return true;
 }
 
+/**
+ * put_folio - Decrement the reference count on a folio.
+ * @folio: The folio.
+ *
+ * If the folio's reference count reaches zero, the memory will be
+ * released back to the page allocator and may be used by another
+ * allocation immediately.  Do not access the memory or the struct folio
+ * after calling put_folio() unless you can be sure that it wasn't the
+ * last reference.
+ *
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context.  May be called while holding a spinlock.
+ */
+static inline void put_folio(struct folio *folio)
+{
+	if (put_folio_testzero(folio))
+		__put_page(&folio->page);
+}
+
 static inline void put_page(struct page *page)
 {
-	page = compound_head(page);
+	struct folio *folio = page_folio(page);
 
 	/*
 	 * For devmap managed pages we need to catch refcount transition from
@@ -1252,13 +1276,12 @@ static inline void put_page(struct page *page)
 	 * need to inform the device driver through callback. See
 	 * include/linux/memremap.h and HMM for details.
 	 */
-	if (page_is_devmap_managed(page)) {
-		put_devmap_managed_page(page);
+	if (page_is_devmap_managed(&folio->page)) {
+		put_devmap_managed_page(&folio->page);
 		return;
 	}
 
-	if (put_page_testzero(page))
-		__put_page(page);
+	put_folio(folio);
 }
 
 /*
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 08/28] mm: Add get_folio
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (6 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 07/28] mm: Add put_folio Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 09/28] mm: Create FolioFlags Matthew Wilcox (Oracle)
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs, Zi Yan,
	Christoph Hellwig, Jeff Layton

If we know we have a folio, we can call get_folio() instead
of get_page() and save the overhead of calling compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 747c6f47aef6..67d9104c1cc1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1219,18 +1219,26 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
 }
 
 /* 127: arbitrary random number, small enough to assemble well */
-#define page_ref_zero_or_close_to_overflow(page) \
-	((unsigned int) page_ref_count(page) + 127u <= 127u)
+#define folio_ref_zero_or_close_to_overflow(folio) \
+	((unsigned int) folio_ref_count(folio) + 127u <= 127u)
+
+/**
+ * get_folio - Increment the reference count on a folio.
+ * @folio: The folio.
+ *
+ * Context: May be called in any context, as long as you know that
+ * you have a refcount on the folio.  If you do not already have one,
+ * try_grab_page() may be the right interface for you to use.
+ */
+static inline void get_folio(struct folio *folio)
+{
+	VM_BUG_ON_FOLIO(folio_ref_zero_or_close_to_overflow(folio), folio);
+	folio_ref_inc(folio);
+}
 
 static inline void get_page(struct page *page)
 {
-	page = compound_head(page);
-	/*
-	 * Getting a normal page or the head of a compound page
-	 * requires to already have an elevated page->_refcount.
-	 */
-	VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page);
-	page_ref_inc(page);
+	get_folio(page_folio(page));
 }
 
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 09/28] mm: Create FolioFlags
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (7 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 08/28] mm: Add get_folio Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-19 13:25   ` Peter Zijlstra
  2021-04-09 18:50 ` [PATCH v7 10/28] mm: Handle per-folio private data Matthew Wilcox (Oracle)
                   ` (18 subsequent siblings)
  27 siblings, 1 reply; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

These new functions are the folio analogues of the PageFlags functions.
If CONFIG_DEBUG_VM_PGFLAGS is enabled, we check the folio is not a tail
page at every invocation.  Note that this will also catch the PagePoisoned
case as a poisoned page has every bit set, which would include PageTail.

This saves 1727 bytes of text with the distro-derived config that
I'm testing due to removing a double call to compound_head() in
PageSwapCache().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/page-flags.h | 130 ++++++++++++++++++++++++++++++-------
 1 file changed, 107 insertions(+), 23 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..b923a90b3ba5 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -212,6 +212,15 @@ static inline void page_init_poison(struct page *page, size_t size)
 }
 #endif
 
+static unsigned long *folio_flags(struct folio *folio, unsigned n)
+{
+	struct page *page = &folio->page;
+
+	VM_BUG_ON_PGFLAGS(PageTail(page), page);
+	VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
+	return &page[n].flags;
+}
+
 /*
  * Page flags policies wrt compound pages
  *
@@ -256,34 +265,56 @@ static inline void page_init_poison(struct page *page, size_t size)
 		VM_BUG_ON_PGFLAGS(!PageHead(page), page);		\
 		PF_POISONED_CHECK(&page[1]); })
 
+/* Which page is the flag stored in */
+#define FOLIO_PF_ANY		0
+#define FOLIO_PF_HEAD		0
+#define FOLIO_PF_ONLY_HEAD	0
+#define FOLIO_PF_NO_TAIL	0
+#define FOLIO_PF_NO_COMPOUND	0
+#define FOLIO_PF_SECOND		1
+
 /*
  * Macros to create function definitions for page flags
  */
 #define TESTPAGEFLAG(uname, lname, policy)				\
+static __always_inline int Folio##uname(struct folio *folio)		\
+	{ return test_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline int Page##uname(struct page *page)		\
 	{ return test_bit(PG_##lname, &policy(page, 0)->flags); }
 
 #define SETPAGEFLAG(uname, lname, policy)				\
+static __always_inline void SetFolio##uname(struct folio *folio)	\
+	{ set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
 static __always_inline void SetPage##uname(struct page *page)		\
 	{ set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define CLEARPAGEFLAG(uname, lname, policy)				\
+static __always_inline void ClearFolio##uname(struct folio *folio)	\
+	{ clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
 static __always_inline void ClearPage##uname(struct page *page)		\
 	{ clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define __SETPAGEFLAG(uname, lname, policy)				\
+static __always_inline void __SetFolio##uname(struct folio *folio)	\
+	{ __set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }	\
 static __always_inline void __SetPage##uname(struct page *page)		\
 	{ __set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define __CLEARPAGEFLAG(uname, lname, policy)				\
+static __always_inline void __ClearFolio##uname(struct folio *folio)	\
+	{ __clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline void __ClearPage##uname(struct page *page)	\
 	{ __clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define TESTSETFLAG(uname, lname, policy)				\
+static __always_inline int TestSetFolio##uname(struct folio *folio)	\
+	{ return test_and_set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline int TestSetPage##uname(struct page *page)	\
 	{ return test_and_set_bit(PG_##lname, &policy(page, 1)->flags); }
 
 #define TESTCLEARFLAG(uname, lname, policy)				\
+static __always_inline int TestClearFolio##uname(struct folio *folio)	\
+	{ return test_and_clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
 static __always_inline int TestClearPage##uname(struct page *page)	\
 	{ return test_and_clear_bit(PG_##lname, &policy(page, 1)->flags); }
 
@@ -302,21 +333,27 @@ static __always_inline int TestClearPage##uname(struct page *page)	\
 	TESTCLEARFLAG(uname, lname, policy)
 
 #define TESTPAGEFLAG_FALSE(uname)					\
+static inline int Folio##uname(const struct folio *folio) { return 0; }	\
 static inline int Page##uname(const struct page *page) { return 0; }
 
 #define SETPAGEFLAG_NOOP(uname)						\
+static inline void SetFolio##uname(struct folio *folio) { }		\
 static inline void SetPage##uname(struct page *page) {  }
 
 #define CLEARPAGEFLAG_NOOP(uname)					\
+static inline void ClearFolio##uname(struct folio *folio) { }		\
 static inline void ClearPage##uname(struct page *page) {  }
 
 #define __CLEARPAGEFLAG_NOOP(uname)					\
+static inline void __ClearFolio##uname(struct folio *folio) { }		\
 static inline void __ClearPage##uname(struct page *page) {  }
 
 #define TESTSETFLAG_FALSE(uname)					\
+static inline int TestSetFolio##uname(struct folio *folio) { return 0; } \
 static inline int TestSetPage##uname(struct page *page) { return 0; }
 
 #define TESTCLEARFLAG_FALSE(uname)					\
+static inline int TestClearFolio##uname(struct folio *folio) { return 0; } \
 static inline int TestClearPage##uname(struct page *page) { return 0; }
 
 #define PAGEFLAG_FALSE(uname) TESTPAGEFLAG_FALSE(uname)			\
@@ -393,14 +430,18 @@ PAGEFLAG_FALSE(HighMem)
 #endif
 
 #ifdef CONFIG_SWAP
-static __always_inline int PageSwapCache(struct page *page)
+static __always_inline bool FolioSwapCache(struct folio *folio)
 {
-#ifdef CONFIG_THP_SWAP
-	page = compound_head(page);
-#endif
-	return PageSwapBacked(page) && test_bit(PG_swapcache, &page->flags);
+	return FolioSwapBacked(folio) &&
+			test_bit(PG_swapcache, folio_flags(folio, 0));
 
 }
+
+static __always_inline bool PageSwapCache(struct page *page)
+{
+	return FolioSwapCache(page_folio(page));
+}
+
 SETPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 CLEARPAGEFLAG(SwapCache, swapcache, PF_NO_TAIL)
 #else
@@ -478,10 +519,14 @@ static __always_inline int PageMappingFlags(struct page *page)
 	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) != 0;
 }
 
-static __always_inline int PageAnon(struct page *page)
+static __always_inline bool FolioAnon(struct folio *folio)
+{
+	return ((unsigned long)folio->mapping & PAGE_MAPPING_ANON) != 0;
+}
+
+static __always_inline bool PageAnon(struct page *page)
 {
-	page = compound_head(page);
-	return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
+	return FolioAnon(page_folio(page));
 }
 
 static __always_inline int __PageMovable(struct page *page)
@@ -497,30 +542,32 @@ static __always_inline int __PageMovable(struct page *page)
  * is found in VM_MERGEABLE vmas.  It's a PageAnon page, pointing not to any
  * anon_vma, but to that page's node of the stable tree.
  */
-static __always_inline int PageKsm(struct page *page)
+static __always_inline bool FolioKsm(struct folio *folio)
 {
-	page = compound_head(page);
-	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
+	return ((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) ==
 				PAGE_MAPPING_KSM;
 }
+
+static __always_inline bool PageKsm(struct page *page)
+{
+	return FolioKsm(page_folio(page));
+}
 #else
 TESTPAGEFLAG_FALSE(Ksm)
 #endif
 
 u64 stable_page_flags(struct page *page);
 
-static inline int PageUptodate(struct page *page)
+static inline int FolioUptodate(struct folio *folio)
 {
-	int ret;
-	page = compound_head(page);
-	ret = test_bit(PG_uptodate, &(page)->flags);
+	int ret = test_bit(PG_uptodate, folio_flags(folio, 0));
 	/*
 	 * Must ensure that the data we read out of the page is loaded
 	 * _after_ we've loaded page->flags to check for PageUptodate.
 	 * We can skip the barrier if the page is not uptodate, because
 	 * we wouldn't be reading anything from it.
 	 *
-	 * See SetPageUptodate() for the other side of the story.
+	 * See SetFolioUptodate() for the other side of the story.
 	 */
 	if (ret)
 		smp_rmb();
@@ -528,23 +575,36 @@ static inline int PageUptodate(struct page *page)
 	return ret;
 }
 
-static __always_inline void __SetPageUptodate(struct page *page)
+static inline int PageUptodate(struct page *page)
+{
+	return FolioUptodate(page_folio(page));
+}
+
+static __always_inline void __SetFolioUptodate(struct folio *folio)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	smp_wmb();
-	__set_bit(PG_uptodate, &page->flags);
+	__set_bit(PG_uptodate, folio_flags(folio, 0));
 }
 
-static __always_inline void SetPageUptodate(struct page *page)
+static __always_inline void SetFolioUptodate(struct folio *folio)
 {
-	VM_BUG_ON_PAGE(PageTail(page), page);
 	/*
 	 * Memory barrier must be issued before setting the PG_uptodate bit,
 	 * so that all previous stores issued in order to bring the page
 	 * uptodate are actually visible before PageUptodate becomes true.
 	 */
 	smp_wmb();
-	set_bit(PG_uptodate, &page->flags);
+	set_bit(PG_uptodate, folio_flags(folio, 0));
+}
+
+static __always_inline void __SetPageUptodate(struct page *page)
+{
+	__SetFolioUptodate((struct folio *)page);
+}
+
+static __always_inline void SetPageUptodate(struct page *page)
+{
+	SetFolioUptodate((struct folio *)page);
 }
 
 CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
@@ -569,6 +629,17 @@ static inline void set_page_writeback_keepwrite(struct page *page)
 
 __PAGEFLAG(Head, head, PF_ANY) CLEARPAGEFLAG(Head, head, PF_ANY)
 
+/* Whether there are one or multiple pages in a folio */
+static inline bool FolioSingle(struct folio *folio)
+{
+	return !FolioHead(folio);
+}
+
+static inline bool FolioMulti(struct folio *folio)
+{
+	return FolioHead(folio);
+}
+
 static __always_inline void set_compound_head(struct page *page, struct page *head)
 {
 	WRITE_ONCE(page->compound_head, (unsigned long)head + 1);
@@ -592,12 +663,15 @@ static inline void ClearPageCompound(struct page *page)
 #ifdef CONFIG_HUGETLB_PAGE
 int PageHuge(struct page *page);
 int PageHeadHuge(struct page *page);
+static inline bool FolioHuge(struct folio *folio)
+{
+	return PageHeadHuge(&folio->page);
+}
 #else
 TESTPAGEFLAG_FALSE(Huge)
 TESTPAGEFLAG_FALSE(HeadHuge)
 #endif
 
-
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 /*
  * PageHuge() only returns true for hugetlbfs pages, but not for
@@ -613,6 +687,11 @@ static inline int PageTransHuge(struct page *page)
 	return PageHead(page);
 }
 
+static inline bool FolioTransHuge(struct folio *folio)
+{
+	return FolioHead(folio);
+}
+
 /*
  * PageTransCompound returns true for both transparent huge pages
  * and hugetlbfs pages, so it should only be called when it's known
@@ -844,6 +923,11 @@ static inline int page_has_private(struct page *page)
 	return !!(page->flags & PAGE_FLAGS_PRIVATE);
 }
 
+static inline bool folio_has_private(struct folio *folio)
+{
+	return page_has_private(&folio->page);
+}
+
 #undef PF_ANY
 #undef PF_HEAD
 #undef PF_ONLY_HEAD
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 10/28] mm: Handle per-folio private data
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (8 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 09/28] mm: Create FolioFlags Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 11/28] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Add folio_private() and set_folio_private() which mirror page_private()
and set_page_private() -- ie folio private data is the same as page
private data.  The only difference is that these return a void *
instead of an unsigned long, which matches the majority of users.

Turn attach_page_private() into attach_folio_private() and reimplement
attach_page_private() as a wrapper.  No filesystem which uses page private
data currently supports compound pages, so we're free to define the rules.
attach_page_private() may only be called on a head page; if you want
to add private data to a tail page, you can call set_page_private()
directly (and shouldn't increment the page refcount!  That should be
done when adding private data to the head page / folio).

This saves 597 bytes of text with the distro-derived config that I'm
testing due to removing the calls to compound_head() in get_page()
& put_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm_types.h | 16 ++++++++++++++
 include/linux/pagemap.h  | 48 ++++++++++++++++++++++++----------------
 2 files changed, 45 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index d65050851037..1210a8e41fad 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -322,6 +322,12 @@ static inline atomic_t *compound_pincount_ptr(struct page *page)
 #define PAGE_FRAG_CACHE_MAX_SIZE	__ALIGN_MASK(32768, ~PAGE_MASK)
 #define PAGE_FRAG_CACHE_MAX_ORDER	get_order(PAGE_FRAG_CACHE_MAX_SIZE)
 
+/*
+ * page_private can be used on tail pages.  However, PagePrivate is only
+ * checked by the VM on the head page.  So page_private on the tail pages
+ * should be used for data that's ancillary to the head page (eg attaching
+ * buffer heads to tail pages after attaching buffer heads to the head page)
+ */
 #define page_private(page)		((page)->private)
 
 static inline void set_page_private(struct page *page, unsigned long private)
@@ -329,6 +335,16 @@ static inline void set_page_private(struct page *page, unsigned long private)
 	page->private = private;
 }
 
+static inline void *folio_private(struct folio *folio)
+{
+	return (void *)folio->private;
+}
+
+static inline void set_folio_private(struct folio *folio, void *v)
+{
+	folio->private = (unsigned long)v;
+}
+
 struct page_frag_cache {
 	void * va;
 #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 350e1c9726f2..f2fd0b811c1b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -260,42 +260,52 @@ static inline int page_cache_add_speculative(struct page *page, int count)
 }
 
 /**
- * attach_page_private - Attach private data to a page.
- * @page: Page to attach data to.
- * @data: Data to attach to page.
+ * attach_folio_private - Attach private data to a folio.
+ * @folio: Folio to attach data to.
+ * @data: Data to attach to folio.
  *
- * Attaching private data to a page increments the page's reference count.
- * The data must be detached before the page will be freed.
+ * Attaching private data to a folio increments the page's reference count.
+ * The data must be detached before the folio will be freed.
  */
-static inline void attach_page_private(struct page *page, void *data)
+static inline void attach_folio_private(struct folio *folio, void *data)
 {
-	get_page(page);
-	set_page_private(page, (unsigned long)data);
-	SetPagePrivate(page);
+	get_folio(folio);
+	set_folio_private(folio, data);
+	SetFolioPrivate(folio);
 }
 
 /**
- * detach_page_private - Detach private data from a page.
- * @page: Page to detach data from.
+ * detach_folio_private - Detach private data from a folio.
+ * @folio: Folio to detach data from.
  *
- * Removes the data that was previously attached to the page and decrements
+ * Removes the data that was previously attached to the folio and decrements
  * the refcount on the page.
  *
- * Return: Data that was attached to the page.
+ * Return: Data that was attached to the folio.
  */
-static inline void *detach_page_private(struct page *page)
+static inline void *detach_folio_private(struct folio *folio)
 {
-	void *data = (void *)page_private(page);
+	void *data = folio_private(folio);
 
-	if (!PagePrivate(page))
+	if (!FolioPrivate(folio))
 		return NULL;
-	ClearPagePrivate(page);
-	set_page_private(page, 0);
-	put_page(page);
+	ClearFolioPrivate(folio);
+	set_folio_private(folio, NULL);
+	put_folio(folio);
 
 	return data;
 }
 
+static inline void attach_page_private(struct page *page, void *data)
+{
+	attach_folio_private(page_folio(page), data);
+}
+
+static inline void *detach_page_private(struct page *page)
+{
+	return detach_folio_private(page_folio(page));
+}
+
 #ifdef CONFIG_NUMA
 extern struct page *__page_cache_alloc(gfp_t gfp);
 #else
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 11/28] mm/filemap: Add folio_index, folio_file_page and folio_contains
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (9 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 10/28] mm: Handle per-folio private data Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 12/28] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

folio_index() is the equivalent of page_index() for folios.
folio_file_page() is the equivalent of find_subpage().
folio_contains() is the equivalent of thp_contains().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 53 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index f2fd0b811c1b..a8b108a9ac6e 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -462,6 +462,59 @@ static inline bool thp_contains(struct page *head, pgoff_t index)
 	return page_index(head) == (index & ~(thp_nr_pages(head) - 1UL));
 }
 
+#define swapcache_index(folio)	__page_file_index(&(folio)->page)
+
+/**
+ * folio_index - File index of a folio.
+ * @folio: The folio.
+ *
+ * For a folio which is either in the page cache or the swap cache,
+ * return its index within the address_space it belongs to.  If you know
+ * the page is definitely in the page cache, you can look at the folio's
+ * index directly.
+ *
+ * Return: The index (offset in units of pages) of a folio in its file.
+ */
+static inline pgoff_t folio_index(struct folio *folio)
+{
+        if (unlikely(FolioSwapCache(folio)))
+                return swapcache_index(folio);
+        return folio->index;
+}
+
+/**
+ * folio_file_page - The page for a particular index.
+ * @folio: The folio which contains this index.
+ * @index: The index we want to look up.
+ *
+ * Sometimes after looking up a folio in the page cache, we need to
+ * obtain the specific page for an index (eg a page fault).
+ *
+ * Return: The page containing the file data for this index.
+ */
+static inline struct page *folio_file_page(struct folio *folio, pgoff_t index)
+{
+	return nth_page(&folio->page, index & (folio_nr_pages(folio) - 1));
+}
+
+/**
+ * folio_contains - Does this folio contain this index?
+ * @folio: The folio.
+ * @index: The page index within the file.
+ *
+ * Context: The caller should have the page locked in order to prevent
+ * (eg) shmem from moving the page between the page cache and swap cache
+ * and changing its index in the middle of the operation.
+ * Return: true or false.
+ */
+static inline bool folio_contains(struct folio *folio, pgoff_t index)
+{
+	/* HugeTLBfs indexes the page cache in units of hpage_size */
+	if (PageHuge(&folio->page))
+		return folio->index == index;
+	return index - folio_index(folio) < folio_nr_pages(folio);
+}
+
 /*
  * Given the page we found in the page cache, return the page corresponding
  * to this index in the file
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 12/28] mm/filemap: Add folio_next_index
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (10 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 11/28] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 13/28] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

This helper returns the page index of the next folio in the file (ie
the end of this folio, plus one).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index a8b108a9ac6e..5130503519b0 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -482,6 +482,17 @@ static inline pgoff_t folio_index(struct folio *folio)
         return folio->index;
 }
 
+/**
+ * folio_next_index - Get the index of the next folio.
+ * @folio: The current folio.
+ *
+ * Return: The index of the folio which follows this folio in the file.
+ */
+static inline pgoff_t folio_next_index(struct folio *folio)
+{
+	return folio->index + folio_nr_pages(folio);
+}
+
 /**
  * folio_file_page - The page for a particular index.
  * @folio: The folio which contains this index.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 13/28] mm/filemap: Add folio_offset and folio_file_offset
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (11 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 12/28] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 14/28] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

These are just wrappers around their page counterpart.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 5130503519b0..c15e72ee9ea8 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -634,6 +634,16 @@ static inline loff_t page_file_offset(struct page *page)
 	return ((loff_t)page_index(page)) << PAGE_SHIFT;
 }
 
+static inline loff_t folio_offset(struct folio *folio)
+{
+	return page_offset(&folio->page);
+}
+
+static inline loff_t folio_file_offset(struct folio *folio)
+{
+	return page_file_offset(&folio->page);
+}
+
 extern pgoff_t linear_hugepage_index(struct vm_area_struct *vma,
 				     unsigned long address);
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 14/28] mm/util: Add folio_mapping and folio_file_mapping
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (12 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 13/28] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 15/28] mm: Add folio_mapcount Matthew Wilcox (Oracle)
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

These are the folio equivalent of page_mapping() and page_file_mapping().
Add an out-of-line page_mapping() wrapper around folio_mapping()
in order to prevent the page_folio() call from bloating every caller
of page_mapping().  Adjust page_file_mapping() and page_mapping_file()
to use folios internally.  Rename __page_file_mapping() to
swapcache_mapping() and change it to take a folio.

This ends up saving 186 bytes of text overall.  folio_mapping() is
45 bytes shorter than page_mapping() was, but the new page_mapping()
wrapper is 30 bytes.  The major reduction is a few bytes less in dozens
of nfs functions (which call page_file_mapping()).  Most of these appear
to be a slight change in gcc's register allocation decisions, which allow:

   48 8b 56 08         mov    0x8(%rsi),%rdx
   48 8d 42 ff         lea    -0x1(%rdx),%rax
   83 e2 01            and    $0x1,%edx
   48 0f 44 c6         cmove  %rsi,%rax

to become:

   48 8b 46 08         mov    0x8(%rsi),%rax
   48 8d 78 ff         lea    -0x1(%rax),%rdi
   a8 01               test   $0x1,%al
   48 0f 44 fe         cmove  %rsi,%rdi

for a reduction of a single byte.  Once the NFS client is converted to
use folios, this entire sequence will disappear.

Also add folio_mapping() documentation.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 Documentation/core-api/mm-api.rst |  2 ++
 include/linux/mm.h                | 14 -------------
 include/linux/pagemap.h           | 35 +++++++++++++++++++++++++++++--
 include/linux/swap.h              |  6 ++++++
 mm/Makefile                       |  2 +-
 mm/folio-compat.c                 | 13 ++++++++++++
 mm/swapfile.c                     |  8 +++----
 mm/util.c                         | 30 +++++++++++++++-----------
 8 files changed, 77 insertions(+), 33 deletions(-)
 create mode 100644 mm/folio-compat.c

diff --git a/Documentation/core-api/mm-api.rst b/Documentation/core-api/mm-api.rst
index 1ead2570b217..b1f4e6d52199 100644
--- a/Documentation/core-api/mm-api.rst
+++ b/Documentation/core-api/mm-api.rst
@@ -98,3 +98,5 @@ More Memory Management Functions
 .. kernel-doc:: include/linux/mm.h
    :internal:
 .. kernel-doc:: include/linux/page_ref.h
+.. kernel-doc:: mm/util.c
+   :functions: folio_mapping
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 67d9104c1cc1..143b354c3f4a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1749,19 +1749,6 @@ void page_address_init(void);
 
 extern void *page_rmapping(struct page *page);
 extern struct anon_vma *page_anon_vma(struct page *page);
-extern struct address_space *page_mapping(struct page *page);
-
-extern struct address_space *__page_file_mapping(struct page *);
-
-static inline
-struct address_space *page_file_mapping(struct page *page)
-{
-	if (unlikely(PageSwapCache(page)))
-		return __page_file_mapping(page);
-
-	return page->mapping;
-}
-
 extern pgoff_t __page_file_index(struct page *page);
 
 /*
@@ -1776,7 +1763,6 @@ static inline pgoff_t page_index(struct page *page)
 }
 
 bool page_mapped(struct page *page);
-struct address_space *page_mapping(struct page *page);
 
 /*
  * Return true only if the page has been allocated with
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c15e72ee9ea8..f8150dfbf110 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -162,14 +162,45 @@ static inline void filemap_nr_thps_dec(struct address_space *mapping)
 
 void release_pages(struct page **pages, int nr);
 
+struct address_space *page_mapping(struct page *);
+struct address_space *folio_mapping(struct folio *);
+struct address_space *swapcache_mapping(struct folio *);
+
+/**
+ * folio_file_mapping - Find the mapping this folio belongs to.
+ * @folio: The folio.
+ *
+ * For folios which are in the page cache, return the mapping that this
+ * page belongs to.  Folios in the swap cache return the mapping of the
+ * swap file or swap device where the data is stored.  This is different
+ * from the mapping returned by folio_mapping().  The only reason to
+ * use it is if, like NFS, you return 0 from ->activate_swapfile.
+ *
+ * Do not call this for folios which aren't in the page cache or swap cache.
+ */
+static inline struct address_space *folio_file_mapping(struct folio *folio)
+{
+	if (unlikely(FolioSwapCache(folio)))
+		return swapcache_mapping(folio);
+
+	return folio->mapping;
+}
+
+static inline struct address_space *page_file_mapping(struct page *page)
+{
+	return folio_file_mapping(page_folio(page));
+}
+
 /*
  * For file cache pages, return the address_space, otherwise return NULL
  */
 static inline struct address_space *page_mapping_file(struct page *page)
 {
-	if (unlikely(PageSwapCache(page)))
+	struct folio *folio = page_folio(page);
+
+	if (unlikely(FolioSwapCache(folio)))
 		return NULL;
-	return page_mapping(page);
+	return folio_mapping(folio);
 }
 
 /*
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 144727041e78..20766342845b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -314,6 +314,12 @@ struct vma_swap_readahead {
 #endif
 };
 
+static inline swp_entry_t folio_swap_entry(struct folio *folio)
+{
+	swp_entry_t entry = { .val = page_private(&folio->page) };
+	return entry;
+}
+
 /* linux/mm/workingset.c */
 void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
 void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
diff --git a/mm/Makefile b/mm/Makefile
index a9ad6122d468..434c2a46b6c5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -46,7 +46,7 @@ mmu-$(CONFIG_MMU)	+= process_vm_access.o
 endif
 
 obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
-			   maccess.o page-writeback.o \
+			   maccess.o page-writeback.o folio-compat.o \
 			   readahead.o swap.o truncate.o vmscan.o shmem.o \
 			   util.o mmzone.o vmstat.o backing-dev.o \
 			   mm_init.o percpu.o slab_common.o \
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
new file mode 100644
index 000000000000..5e107aa30a62
--- /dev/null
+++ b/mm/folio-compat.c
@@ -0,0 +1,13 @@
+/*
+ * Compatibility functions which bloat the callers too much to make inline.
+ * All of the callers of these functions should be converted to use folios
+ * eventually.
+ */
+
+#include <linux/pagemap.h>
+
+struct address_space *page_mapping(struct page *page)
+{
+	return folio_mapping(page_folio(page));
+}
+EXPORT_SYMBOL(page_mapping);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 149e77454e3c..d0ee24239a83 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3533,13 +3533,13 @@ struct swap_info_struct *page_swap_info(struct page *page)
 }
 
 /*
- * out-of-line __page_file_ methods to avoid include hell.
+ * out-of-line methods to avoid include hell.
  */
-struct address_space *__page_file_mapping(struct page *page)
+struct address_space *swapcache_mapping(struct folio *folio)
 {
-	return page_swap_info(page)->swap_file->f_mapping;
+	return page_swap_info(&folio->page)->swap_file->f_mapping;
 }
-EXPORT_SYMBOL_GPL(__page_file_mapping);
+EXPORT_SYMBOL_GPL(swapcache_mapping);
 
 pgoff_t __page_file_index(struct page *page)
 {
diff --git a/mm/util.c b/mm/util.c
index 0b6dd9d81da7..f9e04471a8d9 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -686,30 +686,36 @@ struct anon_vma *page_anon_vma(struct page *page)
 	return __page_rmapping(page);
 }
 
-struct address_space *page_mapping(struct page *page)
+/**
+ * folio_mapping - Find the mapping where this folio is stored.
+ * @folio: The folio.
+ *
+ * For folios which are in the page cache, return the mapping that this
+ * page belongs to.  Folios in the swap cache return the swap mapping
+ * this page is stored in (which is different from the mapping for the
+ * swap file or swap device where the data is stored).
+ *
+ * You can call this for folios which aren't in the swap cache or page
+ * cache and it will return NULL.
+ */
+struct address_space *folio_mapping(struct folio *folio)
 {
 	struct address_space *mapping;
 
-	page = compound_head(page);
-
 	/* This happens if someone calls flush_dcache_page on slab page */
-	if (unlikely(PageSlab(page)))
+	if (unlikely(FolioSlab(folio)))
 		return NULL;
 
-	if (unlikely(PageSwapCache(page))) {
-		swp_entry_t entry;
-
-		entry.val = page_private(page);
-		return swap_address_space(entry);
-	}
+	if (unlikely(FolioSwapCache(folio)))
+		return swap_address_space(folio_swap_entry(folio));
 
-	mapping = page->mapping;
+	mapping = folio->mapping;
 	if ((unsigned long)mapping & PAGE_MAPPING_ANON)
 		return NULL;
 
 	return (void *)((unsigned long)mapping & ~PAGE_MAPPING_FLAGS);
 }
-EXPORT_SYMBOL(page_mapping);
+EXPORT_SYMBOL(folio_mapping);
 
 /* Slow path of page_mapcount() for compound pages */
 int __page_mapcount(struct page *page)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 15/28] mm: Add folio_mapcount
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (13 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 14/28] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 16/28] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

This is the folio equivalent of page_mapcount().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/mm.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 143b354c3f4a..7bd2ce197e2f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -883,6 +883,22 @@ static inline int page_mapcount(struct page *page)
 	return atomic_read(&page->_mapcount) + 1;
 }
 
+/**
+ * folio_mapcount - The number of mappings of this folio.
+ * @folio: The folio.
+ *
+ * The result includes the number of times any of the pages in the
+ * folio are mapped to userspace.
+ *
+ * Return: The number of page table entries which refer to this folio.
+ */
+static inline int folio_mapcount(struct folio *folio)
+{
+	if (unlikely(FolioMulti(folio)))
+		return __page_mapcount(&folio->page);
+	return atomic_read(&folio->_mapcount) + 1;
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 int total_mapcount(struct page *page);
 int page_trans_huge_mapcount(struct page *page, int *total_mapcount);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 16/28] mm/memcg: Add folio wrappers for various functions
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (14 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 15/28] mm: Add folio_mapcount Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 17/28] mm/filemap: Add unlock_folio Matthew Wilcox (Oracle)
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Add new wrapper functions folio_memcg(), lock_folio_memcg(),
unlock_folio_memcg(), mem_cgroup_folio_lruvec() and
count_memcg_folio_event()

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/memcontrol.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index b8b0a802852c..f15b46f5b06c 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -456,6 +456,11 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
 		return __page_memcg(page);
 }
 
+static inline struct mem_cgroup *folio_memcg(struct folio *folio)
+{
+	return page_memcg(&folio->page);
+}
+
 /*
  * page_memcg_rcu - locklessly get the memory cgroup associated with a page
  * @page: a pointer to the page struct
@@ -1052,6 +1057,15 @@ static inline void count_memcg_page_event(struct page *page,
 		count_memcg_events(memcg, idx, 1);
 }
 
+static inline void count_memcg_folio_event(struct folio *folio,
+					  enum vm_event_item idx)
+{
+	struct mem_cgroup *memcg = folio_memcg(folio);
+
+	if (memcg)
+		count_memcg_events(memcg, idx, folio_nr_pages(folio));
+}
+
 static inline void count_memcg_event_mm(struct mm_struct *mm,
 					enum vm_event_item idx)
 {
@@ -1471,6 +1485,22 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 }
 #endif /* CONFIG_MEMCG */
 
+static inline void lock_folio_memcg(struct folio *folio)
+{
+	lock_page_memcg(&folio->page);
+}
+
+static inline void unlock_folio_memcg(struct folio *folio)
+{
+	unlock_page_memcg(&folio->page);
+}
+
+static inline struct lruvec *mem_cgroup_folio_lruvec(struct folio *folio,
+						    struct pglist_data *pgdat)
+{
+	return mem_cgroup_page_lruvec(&folio->page, pgdat);
+}
+
 static inline void __inc_lruvec_kmem_state(void *p, enum node_stat_item idx)
 {
 	__mod_lruvec_kmem_state(p, idx, 1);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 17/28] mm/filemap: Add unlock_folio
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (15 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 16/28] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 18/28] mm/filemap: Add lock_folio Matthew Wilcox (Oracle)
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Convert unlock_page() to call unlock_folio().  By using a folio we
avoid a call to compound_head().  This shortens the function from 39
bytes to 25 and removes 4 instructions on x86-64.  Because we still
have unlock_page(), it's a net increase of 24 bytes of text for the
kernel as a whole, but any path that uses unlock_folio() will execute
4 fewer instructions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  3 ++-
 mm/filemap.c            | 27 ++++++++++-----------------
 mm/folio-compat.c       |  6 ++++++
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index f8150dfbf110..d94c9f8a451d 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -719,7 +719,8 @@ extern int __lock_page_killable(struct page *page);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
-extern void unlock_page(struct page *page);
+void unlock_page(struct page *page);
+void unlock_folio(struct folio *folio);
 
 /*
  * Return true if the page was successfully locked
diff --git a/mm/filemap.c b/mm/filemap.c
index 7aacec5684c7..d1cf51274921 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1435,29 +1435,22 @@ static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile void *mem
 #endif
 
 /**
- * unlock_page - unlock a locked page
- * @page: the page
+ * unlock_folio - Unlock a locked folio.
+ * @folio: The folio.
  *
- * Unlocks the page and wakes up sleepers in wait_on_page_locked().
- * Also wakes sleepers in wait_on_page_writeback() because the wakeup
- * mechanism between PageLocked pages and PageWriteback pages is shared.
- * But that's OK - sleepers in wait_on_page_writeback() just go back to sleep.
+ * Unlocks the folio and wakes up any thread sleeping on the page lock.
  *
- * Note that this depends on PG_waiters being the sign bit in the byte
- * that contains PG_locked - thus the BUILD_BUG_ON(). That allows us to
- * clear the PG_locked bit and test PG_waiters at the same time fairly
- * portably (architectures that do LL/SC can test any bit, while x86 can
- * test the sign bit).
+ * Context: May be called from interrupt or process context.  May not be
+ * called from NMI context.
  */
-void unlock_page(struct page *page)
+void unlock_folio(struct folio *folio)
 {
 	BUILD_BUG_ON(PG_waiters != 7);
-	page = compound_head(page);
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
-		wake_up_page_bit(page, PG_locked);
+	VM_BUG_ON_FOLIO(!FolioLocked(folio), folio);
+	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio, 0)))
+		wake_up_page_bit(&folio->page, PG_locked);
 }
-EXPORT_SYMBOL(unlock_page);
+EXPORT_SYMBOL(unlock_folio);
 
 /**
  * end_page_private_2 - Clear PG_private_2 and release any waiters
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 5e107aa30a62..02798abf19a1 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -11,3 +11,9 @@ struct address_space *page_mapping(struct page *page)
 	return folio_mapping(page_folio(page));
 }
 EXPORT_SYMBOL(page_mapping);
+
+void unlock_page(struct page *page)
+{
+	return unlock_folio(page_folio(page));
+}
+EXPORT_SYMBOL(unlock_page);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 18/28] mm/filemap: Add lock_folio
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (16 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 17/28] mm/filemap: Add unlock_folio Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 19/28] mm/filemap: Add lock_folio_killable Matthew Wilcox (Oracle)
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

This is like lock_page() but for use by callers who know they have a folio.
Convert __lock_page() to be __lock_folio().  This saves one call to
compound_head() per contended call to lock_page().

Saves 362 bytes of text; mostly from improved register allocation and
inlining decisions.  __lock_folio is 59 bytes while __lock_page was 79.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 24 +++++++++++++++++++-----
 mm/filemap.c            | 29 +++++++++++++++--------------
 2 files changed, 34 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index d94c9f8a451d..c59b19f1cb0f 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -714,7 +714,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 	return true;
 }
 
-extern void __lock_page(struct page *page);
+void __lock_folio(struct folio *folio);
 extern int __lock_page_killable(struct page *page);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
@@ -722,13 +722,24 @@ extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 void unlock_page(struct page *page);
 void unlock_folio(struct folio *folio);
 
+static inline bool trylock_folio(struct folio *folio)
+{
+	return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio, 0)));
+}
+
 /*
  * Return true if the page was successfully locked
  */
 static inline int trylock_page(struct page *page)
 {
-	page = compound_head(page);
-	return (likely(!test_and_set_bit_lock(PG_locked, &page->flags)));
+	return trylock_folio(page_folio(page));
+}
+
+static inline void lock_folio(struct folio *folio)
+{
+	might_sleep();
+	if (!trylock_folio(folio))
+		__lock_folio(folio);
 }
 
 /*
@@ -736,9 +747,12 @@ static inline int trylock_page(struct page *page)
  */
 static inline void lock_page(struct page *page)
 {
+	struct folio *folio;
 	might_sleep();
-	if (!trylock_page(page))
-		__lock_page(page);
+
+	folio = page_folio(page);
+	if (!trylock_folio(folio))
+		__lock_folio(folio);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index d1cf51274921..36289a9816b5 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1187,7 +1187,7 @@ static void wake_up_page(struct page *page, int bit)
  */
 enum behavior {
 	EXCLUSIVE,	/* Hold ref to page and take the bit when woken, like
-			 * __lock_page() waiting on then setting PG_locked.
+			 * __lock_folio() waiting on then setting PG_locked.
 			 */
 	SHARED,		/* Hold ref to page and check the bit when woken, like
 			 * wait_on_page_writeback() waiting on PG_writeback.
@@ -1574,17 +1574,16 @@ void page_endio(struct page *page, bool is_write, int err)
 EXPORT_SYMBOL_GPL(page_endio);
 
 /**
- * __lock_page - get a lock on the page, assuming we need to sleep to get it
- * @__page: the page to lock
+ * __lock_folio - Get a lock on the folio, assuming we need to sleep to get it.
+ * @folio: The folio to lock
  */
-void __lock_page(struct page *__page)
+void __lock_folio(struct folio *folio)
 {
-	struct page *page = compound_head(__page);
-	wait_queue_head_t *q = page_waitqueue(page);
-	wait_on_page_bit_common(q, page, PG_locked, TASK_UNINTERRUPTIBLE,
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
-EXPORT_SYMBOL(__lock_page);
+EXPORT_SYMBOL(__lock_folio);
 
 int __lock_page_killable(struct page *__page)
 {
@@ -1659,10 +1658,10 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			return 0;
 		}
 	} else {
-		__lock_page(page);
+		__lock_folio(page_folio(page));
 	}
-	return 1;
 
+	return 1;
 }
 
 /**
@@ -2806,7 +2805,9 @@ loff_t mapping_seek_hole_data(struct address_space *mapping, loff_t start,
 static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 				     struct file **fpin)
 {
-	if (trylock_page(page))
+	struct folio *folio = page_folio(page);
+
+	if (trylock_folio(folio))
 		return 1;
 
 	/*
@@ -2819,7 +2820,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
-		if (__lock_page_killable(page)) {
+		if (__lock_page_killable(&folio->page)) {
 			/*
 			 * We didn't have the right flags to drop the mmap_lock,
 			 * but all fault_handlers only check for fatal signals
@@ -2831,11 +2832,11 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 			return 0;
 		}
 	} else
-		__lock_page(page);
+		__lock_folio(folio);
+
 	return 1;
 }
 
-
 /*
  * Synchronous readahead happens when we don't even find a page in the page
  * cache at all.  We don't want to perform IO under the mmap sem, so if we have
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 19/28] mm/filemap: Add lock_folio_killable
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (17 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 18/28] mm/filemap: Add lock_folio Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 20/28] mm/filemap: Add __lock_folio_async Matthew Wilcox (Oracle)
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

This is like lock_page_killable() but for use by callers who
know they have a folio.  Convert __lock_page_killable() to be
__lock_folio_killable().  This saves one call to compound_head() per
contended call to lock_page_killable().

__lock_folio_killable() is 20 bytes smaller than __lock_page_killable()
was.  lock_page_maybe_drop_mmap() shrinks by 68 bytes and
__lock_page_or_retry() shrinks by 66 bytes.  That's a total of 154 bytes
of text saved.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 15 ++++++++++-----
 mm/filemap.c            | 17 +++++++++--------
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c59b19f1cb0f..b23b95f771f7 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -715,7 +715,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 }
 
 void __lock_folio(struct folio *folio);
-extern int __lock_page_killable(struct page *page);
+int __lock_folio_killable(struct folio *folio);
 extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
@@ -755,6 +755,14 @@ static inline void lock_page(struct page *page)
 		__lock_folio(folio);
 }
 
+static inline int lock_folio_killable(struct folio *folio)
+{
+	might_sleep();
+	if (!trylock_folio(folio))
+		return __lock_folio_killable(folio);
+	return 0;
+}
+
 /*
  * lock_page_killable is like lock_page but can be interrupted by fatal
  * signals.  It returns 0 if it locked the page and -EINTR if it was
@@ -762,10 +770,7 @@ static inline void lock_page(struct page *page)
  */
 static inline int lock_page_killable(struct page *page)
 {
-	might_sleep();
-	if (!trylock_page(page))
-		return __lock_page_killable(page);
-	return 0;
+	return lock_folio_killable(page_folio(page));
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 36289a9816b5..2a4fa0b5fa88 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1585,14 +1585,13 @@ void __lock_folio(struct folio *folio)
 }
 EXPORT_SYMBOL(__lock_folio);
 
-int __lock_page_killable(struct page *__page)
+int __lock_folio_killable(struct folio *folio)
 {
-	struct page *page = compound_head(__page);
-	wait_queue_head_t *q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, PG_locked, TASK_KILLABLE,
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
-EXPORT_SYMBOL_GPL(__lock_page_killable);
+EXPORT_SYMBOL_GPL(__lock_folio_killable);
 
 int __lock_page_async(struct page *page, struct wait_page_queue *wait)
 {
@@ -1634,6 +1633,8 @@ int __lock_page_async(struct page *page, struct wait_page_queue *wait)
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			 unsigned int flags)
 {
+	struct folio *folio = page_folio(page);
+
 	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
@@ -1652,13 +1653,13 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 	if (flags & FAULT_FLAG_KILLABLE) {
 		int ret;
 
-		ret = __lock_page_killable(page);
+		ret = __lock_folio_killable(folio);
 		if (ret) {
 			mmap_read_unlock(mm);
 			return 0;
 		}
 	} else {
-		__lock_folio(page_folio(page));
+		__lock_folio(folio);
 	}
 
 	return 1;
@@ -2820,7 +2821,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
-		if (__lock_page_killable(&folio->page)) {
+		if (__lock_folio_killable(folio)) {
 			/*
 			 * We didn't have the right flags to drop the mmap_lock,
 			 * but all fault_handlers only check for fatal signals
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 20/28] mm/filemap: Add __lock_folio_async
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (18 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 19/28] mm/filemap: Add lock_folio_killable Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 21/28] mm/filemap: Add __lock_folio_or_retry Matthew Wilcox (Oracle)
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

There aren't any actual callers of lock_page_async(), so remove it.
Convert filemap_update_page() to call __lock_folio_async().

__lock_folio_async() is 21 bytes smaller than __lock_page_async(),
but the real savings come from using a folio in filemap_update_page(),
shrinking it from 514 bytes to 403 bytes, saving 111 bytes.  The text
shrinks by 132 bytes in total.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/io_uring.c           |  2 +-
 include/linux/pagemap.h | 17 -----------------
 mm/filemap.c            | 31 ++++++++++++++++---------------
 3 files changed, 17 insertions(+), 33 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 81e5d156af1c..154fd92ab8a8 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3142,7 +3142,7 @@ static int io_read_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 }
 
 /*
- * This is our waitqueue callback handler, registered through lock_page_async()
+ * This is our waitqueue callback handler, registered through lock_folio_async()
  * when we initially tried to do the IO with the iocb armed our waitqueue.
  * This gets called when the page is unlocked, and we generally expect that to
  * happen when the page IO is completed and the page is now uptodate. This will
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index b23b95f771f7..c63f59d4ae60 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -716,7 +716,6 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 
 void __lock_folio(struct folio *folio);
 int __lock_folio_killable(struct folio *folio);
-extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
 void unlock_page(struct page *page);
@@ -773,22 +772,6 @@ static inline int lock_page_killable(struct page *page)
 	return lock_folio_killable(page_folio(page));
 }
 
-/*
- * lock_page_async - Lock the page, unless this would block. If the page
- * is already locked, then queue a callback when the page becomes unlocked.
- * This callback can then retry the operation.
- *
- * Returns 0 if the page is locked successfully, or -EIOCBQUEUED if the page
- * was already locked and the callback defined in 'wait' was queued.
- */
-static inline int lock_page_async(struct page *page,
-				  struct wait_page_queue *wait)
-{
-	if (!trylock_page(page))
-		return __lock_page_async(page, wait);
-	return 0;
-}
-
 /*
  * lock_page_or_retry - Lock the page, unless this would block and the
  * caller indicated that it can handle a retry.
diff --git a/mm/filemap.c b/mm/filemap.c
index 2a4fa0b5fa88..9485ce2a4820 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1593,18 +1593,18 @@ int __lock_folio_killable(struct folio *folio)
 }
 EXPORT_SYMBOL_GPL(__lock_folio_killable);
 
-int __lock_page_async(struct page *page, struct wait_page_queue *wait)
+static int __lock_folio_async(struct folio *folio, struct wait_page_queue *wait)
 {
-	struct wait_queue_head *q = page_waitqueue(page);
+	struct wait_queue_head *q = page_waitqueue(&folio->page);
 	int ret = 0;
 
-	wait->page = page;
+	wait->page = &folio->page;
 	wait->bit_nr = PG_locked;
 
 	spin_lock_irq(&q->lock);
 	__add_wait_queue_entry_tail(q, &wait->wait);
-	SetPageWaiters(page);
-	ret = !trylock_page(page);
+	SetFolioWaiters(folio);
+	ret = !trylock_folio(folio);
 	/*
 	 * If we were successful now, we know we're still on the
 	 * waitqueue as we're still under the lock. This means it's
@@ -2351,41 +2351,42 @@ static int filemap_update_page(struct kiocb *iocb,
 		struct address_space *mapping, struct iov_iter *iter,
 		struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	int error;
 
-	if (!trylock_page(page)) {
+	if (!trylock_folio(folio)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_NOIO))
 			return -EAGAIN;
 		if (!(iocb->ki_flags & IOCB_WAITQ)) {
-			put_and_wait_on_page_locked(page, TASK_KILLABLE);
+			put_and_wait_on_page_locked(&folio->page, TASK_KILLABLE);
 			return AOP_TRUNCATED_PAGE;
 		}
-		error = __lock_page_async(page, iocb->ki_waitq);
+		error = __lock_folio_async(folio, iocb->ki_waitq);
 		if (error)
 			return error;
 	}
 
-	if (!page->mapping)
+	if (!folio->mapping)
 		goto truncated;
 
 	error = 0;
-	if (filemap_range_uptodate(mapping, iocb->ki_pos, iter, page))
+	if (filemap_range_uptodate(mapping, iocb->ki_pos, iter, &folio->page))
 		goto unlock;
 
 	error = -EAGAIN;
 	if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT | IOCB_WAITQ))
 		goto unlock;
 
-	error = filemap_read_page(iocb->ki_filp, mapping, page);
+	error = filemap_read_page(iocb->ki_filp, mapping, &folio->page);
 	if (error == AOP_TRUNCATED_PAGE)
-		put_page(page);
+		put_folio(folio);
 	return error;
 truncated:
-	unlock_page(page);
-	put_page(page);
+	unlock_folio(folio);
+	put_folio(folio);
 	return AOP_TRUNCATED_PAGE;
 unlock:
-	unlock_page(page);
+	unlock_folio(folio);
 	return error;
 }
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 21/28] mm/filemap: Add __lock_folio_or_retry
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (19 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 20/28] mm/filemap: Add __lock_folio_async Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:50 ` [PATCH v7 22/28] mm/filemap: Add wait_on_folio_locked Matthew Wilcox (Oracle)
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Convert __lock_page_or_retry() to __lock_folio_or_retry().  This actually
saves 4 bytes in the only caller of lock_page_or_retry() (due to better
register allocation) and saves the 20 byte cost of calling page_folio()
in __lock_folio_or_retry() for a total saving of 24 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  9 ++++++---
 mm/filemap.c            | 10 ++++------
 mm/memory.c             |  8 ++++----
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c63f59d4ae60..00864e098693 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -716,7 +716,7 @@ static inline bool wake_page_match(struct wait_page_queue *wait_page,
 
 void __lock_folio(struct folio *folio);
 int __lock_folio_killable(struct folio *folio);
-extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
+int __lock_folio_or_retry(struct folio *folio, struct mm_struct *mm,
 				unsigned int flags);
 void unlock_page(struct page *page);
 void unlock_folio(struct folio *folio);
@@ -777,13 +777,16 @@ static inline int lock_page_killable(struct page *page)
  * caller indicated that it can handle a retry.
  *
  * Return value and mmap_lock implications depend on flags; see
- * __lock_page_or_retry().
+ * __lock_folio_or_retry().
  */
 static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				     unsigned int flags)
 {
+	struct folio *folio;
 	might_sleep();
-	return trylock_page(page) || __lock_page_or_retry(page, mm, flags);
+
+	folio = page_folio(page);
+	return trylock_folio(folio) || __lock_folio_or_retry(folio, mm, flags);
 }
 
 /*
diff --git a/mm/filemap.c b/mm/filemap.c
index 9485ce2a4820..73c8d7102157 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1621,20 +1621,18 @@ static int __lock_folio_async(struct folio *folio, struct wait_page_queue *wait)
 
 /*
  * Return values:
- * 1 - page is locked; mmap_lock is still held.
- * 0 - page is not locked.
+ * 1 - folio is locked; mmap_lock is still held.
+ * 0 - folio is not locked.
  *     mmap_lock has been released (mmap_read_unlock(), unless flags had both
  *     FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT set, in
  *     which case mmap_lock is still held.
  *
  * If neither ALLOW_RETRY nor KILLABLE are set, will always return 1
- * with the page locked and the mmap_lock unperturbed.
+ * with the folio locked and the mmap_lock unperturbed.
  */
-int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
+int __lock_folio_or_retry(struct folio *folio, struct mm_struct *mm,
 			 unsigned int flags)
 {
-	struct folio *folio = page_folio(page);
-
 	if (fault_flag_allow_retry_first(flags)) {
 		/*
 		 * CAUTION! In this case, mmap_lock is not released
diff --git a/mm/memory.c b/mm/memory.c
index cc71a445c76c..a2768ca793e5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4067,7 +4067,7 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults).
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __lock_folio_or_retry().
  * If mmap_lock is released, vma may become invalid (for example
  * by other thread calling munmap()).
  */
@@ -4299,7 +4299,7 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
  * concurrent faults).
  *
  * The mmap_lock may have been released depending on flags and our return value.
- * See filemap_fault() and __lock_page_or_retry().
+ * See filemap_fault() and __lock_folio_or_retry().
  */
 static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 {
@@ -4403,7 +4403,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
  * By the time we get here, we already hold the mm semaphore
  *
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __lock_folio_or_retry().
  */
 static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 		unsigned long address, unsigned int flags)
@@ -4559,7 +4559,7 @@ static inline void mm_account_fault(struct pt_regs *regs,
  * By the time we get here, we already hold the mm semaphore
  *
  * The mmap_lock may have been released depending on flags and our
- * return value.  See filemap_fault() and __lock_page_or_retry().
+ * return value.  See filemap_fault() and __lock_folio_or_retry().
  */
 vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 			   unsigned int flags, struct pt_regs *regs)
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 22/28] mm/filemap: Add wait_on_folio_locked
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (20 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 21/28] mm/filemap: Add __lock_folio_or_retry Matthew Wilcox (Oracle)
@ 2021-04-09 18:50 ` Matthew Wilcox (Oracle)
  2021-04-09 18:51 ` [PATCH v7 23/28] mm/filemap: Add end_folio_writeback Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:50 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Also add wait_on_folio_locked_killable().  Turn wait_on_page_locked()
and wait_on_page_locked_killable() into wrappers.  This eliminates a
call to compound_head() from each call-site, reducing text size by 200
bytes for me.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 26 ++++++++++++++++++--------
 mm/filemap.c            |  4 ++--
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 00864e098693..18c4c2ae8074 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -797,23 +797,33 @@ extern void wait_on_page_bit(struct page *page, int bit_nr);
 extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
 
 /* 
- * Wait for a page to be unlocked.
+ * Wait for a folio to be unlocked.
  *
- * This must be called with the caller "holding" the page,
- * ie with increased "page->count" so that the page won't
+ * This must be called with the caller "holding" the folio,
+ * ie with increased "page->count" so that the folio won't
  * go away during the wait..
  */
+static inline void wait_on_folio_locked(struct folio *folio)
+{
+	if (FolioLocked(folio))
+		wait_on_page_bit(&folio->page, PG_locked);
+}
+
+static inline int wait_on_folio_locked_killable(struct folio *folio)
+{
+	if (!FolioLocked(folio))
+		return 0;
+	return wait_on_page_bit_killable(&folio->page, PG_locked);
+}
+
 static inline void wait_on_page_locked(struct page *page)
 {
-	if (PageLocked(page))
-		wait_on_page_bit(compound_head(page), PG_locked);
+	wait_on_folio_locked(page_folio(page));
 }
 
 static inline int wait_on_page_locked_killable(struct page *page)
 {
-	if (!PageLocked(page))
-		return 0;
-	return wait_on_page_bit_killable(compound_head(page), PG_locked);
+	return wait_on_folio_locked_killable(page_folio(page));
 }
 
 int put_and_wait_on_page_locked(struct page *page, int state);
diff --git a/mm/filemap.c b/mm/filemap.c
index 73c8d7102157..d23430ad5bbc 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1643,9 +1643,9 @@ int __lock_folio_or_retry(struct folio *folio, struct mm_struct *mm,
 
 		mmap_read_unlock(mm);
 		if (flags & FAULT_FLAG_KILLABLE)
-			wait_on_page_locked_killable(page);
+			wait_on_folio_locked_killable(folio);
 		else
-			wait_on_page_locked(page);
+			wait_on_folio_locked(folio);
 		return 0;
 	}
 	if (flags & FAULT_FLAG_KILLABLE) {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 23/28] mm/filemap: Add end_folio_writeback
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (21 preceding siblings ...)
  2021-04-09 18:50 ` [PATCH v7 22/28] mm/filemap: Add wait_on_folio_locked Matthew Wilcox (Oracle)
@ 2021-04-09 18:51 ` Matthew Wilcox (Oracle)
  2021-04-09 18:51 ` [PATCH v7 24/28] mm/writeback: Add wait_on_folio_writeback Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:51 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Add an end_page_writeback() wrapper function for users that are not yet
converted to folios.

end_folio_writeback() is less than half the size of end_page_writeback()
at just 105 bytes compared to 213 bytes, due to removing all the
compound_head() calls.  The 30 byte wrapper function makes this a net
saving of 70 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  3 ++-
 mm/filemap.c            | 38 +++++++++++++++++++-------------------
 mm/folio-compat.c       |  6 ++++++
 3 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 18c4c2ae8074..9bc01429dc25 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -829,7 +829,8 @@ static inline int wait_on_page_locked_killable(struct page *page)
 int put_and_wait_on_page_locked(struct page *page, int state);
 void wait_on_page_writeback(struct page *page);
 int wait_on_page_writeback_killable(struct page *page);
-extern void end_page_writeback(struct page *page);
+void end_page_writeback(struct page *page);
+void end_folio_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
 
 void page_endio(struct page *page, bool is_write, int err);
diff --git a/mm/filemap.c b/mm/filemap.c
index d23430ad5bbc..cdb8250af510 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1175,11 +1175,11 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 	spin_unlock_irqrestore(&q->lock, flags);
 }
 
-static void wake_up_page(struct page *page, int bit)
+static void wake_up_folio(struct folio *folio, int bit)
 {
-	if (!PageWaiters(page))
+	if (!FolioWaiters(folio))
 		return;
-	wake_up_page_bit(page, bit);
+	wake_up_page_bit(&folio->page, bit);
 }
 
 /*
@@ -1512,38 +1512,38 @@ int wait_on_page_private_2_killable(struct page *page)
 EXPORT_SYMBOL(wait_on_page_private_2_killable);
 
 /**
- * end_page_writeback - end writeback against a page
- * @page: the page
+ * end_folio_writeback - End writeback against a folio.
+ * @folio: The folio.
  */
-void end_page_writeback(struct page *page)
+void end_folio_writeback(struct folio *folio)
 {
 	/*
 	 * TestClearPageReclaim could be used here but it is an atomic
 	 * operation and overkill in this particular case. Failing to
-	 * shuffle a page marked for immediate reclaim is too mild to
+	 * shuffle a folio marked for immediate reclaim is too mild to
 	 * justify taking an atomic operation penalty at the end of
-	 * ever page writeback.
+	 * every folio writeback.
 	 */
-	if (PageReclaim(page)) {
-		ClearPageReclaim(page);
-		rotate_reclaimable_page(page);
+	if (FolioReclaim(folio)) {
+		ClearFolioReclaim(folio);
+		rotate_reclaimable_page(&folio->page);
 	}
 
 	/*
-	 * Writeback does not hold a page reference of its own, relying
+	 * Writeback does not hold a folio reference of its own, relying
 	 * on truncation to wait for the clearing of PG_writeback.
-	 * But here we must make sure that the page is not freed and
-	 * reused before the wake_up_page().
+	 * But here we must make sure that the folio is not freed and
+	 * reused before the wake_up_folio().
 	 */
-	get_page(page);
-	if (!test_clear_page_writeback(page))
+	get_folio(folio);
+	if (!test_clear_page_writeback(&folio->page))
 		BUG();
 
 	smp_mb__after_atomic();
-	wake_up_page(page, PG_writeback);
-	put_page(page);
+	wake_up_folio(folio, PG_writeback);
+	put_folio(folio);
 }
-EXPORT_SYMBOL(end_page_writeback);
+EXPORT_SYMBOL(end_folio_writeback);
 
 /*
  * After completing I/O on a page, call this routine to update the page
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 02798abf19a1..d1a1dfe52589 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -17,3 +17,9 @@ void unlock_page(struct page *page)
 	return unlock_folio(page_folio(page));
 }
 EXPORT_SYMBOL(unlock_page);
+
+void end_page_writeback(struct page *page)
+{
+	return end_folio_writeback(page_folio(page));
+}
+EXPORT_SYMBOL(end_page_writeback);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 24/28] mm/writeback: Add wait_on_folio_writeback
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (22 preceding siblings ...)
  2021-04-09 18:51 ` [PATCH v7 23/28] mm/filemap: Add end_folio_writeback Matthew Wilcox (Oracle)
@ 2021-04-09 18:51 ` Matthew Wilcox (Oracle)
  2021-04-09 18:51 ` [PATCH v7 25/28] mm/writeback: Add wait_for_stable_folio Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:51 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

wait_on_page_writeback_killable() only has one caller, so convert it to
call wait_on_folio_writeback_killable().  For the wait_on_page_writeback()
callers, add a compatibility wrapper around wait_on_folio_writeback().

Turning PageWriteback() into FolioWriteback() eliminates a call to
compound_head() which saves 8 bytes and 15 bytes in the two functions.
That is more than offset by adding the wait_on_page_writeback
compatibility wrapper for a net increase in text of 15 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/afs/write.c          |  9 ++++----
 include/linux/pagemap.h |  3 ++-
 mm/folio-compat.c       |  6 ++++++
 mm/page-writeback.c     | 48 ++++++++++++++++++++++++++++-------------
 4 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index dc66ff15dd16..a1def42e2e45 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -831,7 +831,8 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
  */
 vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 {
-	struct page *page = thp_head(vmf->page);
+	struct folio *folio = page_folio(vmf->page);
+	struct page *page = &folio->page;
 	struct file *file = vmf->vma->vm_file;
 	struct inode *inode = file_inode(file);
 	struct afs_vnode *vnode = AFS_FS_I(inode);
@@ -850,7 +851,7 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 		return VM_FAULT_RETRY;
 #endif
 
-	if (wait_on_page_writeback_killable(page))
+	if (wait_on_folio_writeback_killable(folio))
 		return VM_FAULT_RETRY;
 
 	if (lock_page_killable(page) < 0)
@@ -860,8 +861,8 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 	 * details the portion of the page we need to write back and we might
 	 * need to redirty the page if there's a problem.
 	 */
-	if (wait_on_page_writeback_killable(page) < 0) {
-		unlock_page(page);
+	if (wait_on_folio_writeback_killable(folio) < 0) {
+		unlock_folio(folio);
 		return VM_FAULT_RETRY;
 	}
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 9bc01429dc25..99331c35c89c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -828,7 +828,8 @@ static inline int wait_on_page_locked_killable(struct page *page)
 
 int put_and_wait_on_page_locked(struct page *page, int state);
 void wait_on_page_writeback(struct page *page);
-int wait_on_page_writeback_killable(struct page *page);
+void wait_on_folio_writeback(struct folio *folio);
+int wait_on_folio_writeback_killable(struct folio *folio);
 void end_page_writeback(struct page *page);
 void end_folio_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index d1a1dfe52589..6aadecc39fba 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -23,3 +23,9 @@ void end_page_writeback(struct page *page)
 	return end_folio_writeback(page_folio(page));
 }
 EXPORT_SYMBOL(end_page_writeback);
+
+void wait_on_page_writeback(struct page *page)
+{
+	return wait_on_folio_writeback(page_folio(page));
+}
+EXPORT_SYMBOL_GPL(wait_on_page_writeback);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0062d5c57d41..8271f9b24b69 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2818,33 +2818,51 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 }
 EXPORT_SYMBOL(__test_set_page_writeback);
 
-/*
- * Wait for a page to complete writeback
+/**
+ * wait_on_folio_writeback - Wait for a folio to finish writeback.
+ * @folio: The folio to wait for.
+ *
+ * If the folio is currently being written back to storage, wait for the
+ * I/O to complete.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
  */
-void wait_on_page_writeback(struct page *page)
+void wait_on_folio_writeback(struct folio *folio)
 {
-	while (PageWriteback(page)) {
-		trace_wait_on_page_writeback(page, page_mapping(page));
-		wait_on_page_bit(page, PG_writeback);
+	while (FolioWriteback(folio)) {
+		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
+		wait_on_page_bit(&folio->page, PG_writeback);
 	}
 }
-EXPORT_SYMBOL_GPL(wait_on_page_writeback);
+EXPORT_SYMBOL_GPL(wait_on_folio_writeback);
 
-/*
- * Wait for a page to complete writeback.  Returns -EINTR if we get a
- * fatal signal while waiting.
+/**
+ * wait_on_folio_writeback_killable - Wait for a folio to finish writeback.
+ * @folio: The folio to wait for.
+ *
+ * If the folio is currently being written back to storage, wait for the
+ * I/O to complete or a fatal signal to arrive.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
+ * Return: 0 on success, -EINTR if we get a fatal signal while waiting.
  */
-int wait_on_page_writeback_killable(struct page *page)
+int wait_on_folio_writeback_killable(struct folio *folio)
 {
-	while (PageWriteback(page)) {
-		trace_wait_on_page_writeback(page, page_mapping(page));
-		if (wait_on_page_bit_killable(page, PG_writeback))
+	while (FolioWriteback(folio)) {
+		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
+		if (wait_on_page_bit_killable(&folio->page, PG_writeback))
 			return -EINTR;
 	}
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(wait_on_page_writeback_killable);
+EXPORT_SYMBOL_GPL(wait_on_folio_writeback_killable);
 
 /**
  * wait_for_stable_page() - wait for writeback to finish, if necessary.
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 25/28] mm/writeback: Add wait_for_stable_folio
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (23 preceding siblings ...)
  2021-04-09 18:51 ` [PATCH v7 24/28] mm/writeback: Add wait_on_folio_writeback Matthew Wilcox (Oracle)
@ 2021-04-09 18:51 ` Matthew Wilcox (Oracle)
  2021-04-09 18:51 ` [PATCH v7 26/28] mm/filemap: Convert wait_on_page_bit to wait_on_folio_bit Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:51 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Move wait_for_stable_page() into the folio compatibility file.
wait_for_stable_folio() avoids a call to compound_head() and is 14 bytes
smaller than wait_for_stable_page() was.  The net text size grows by 24
bytes as a result of this patch.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h |  1 +
 mm/folio-compat.c       |  6 ++++++
 mm/page-writeback.c     | 24 ++++++++++++++----------
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 99331c35c89c..d50fc5adbee1 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -833,6 +833,7 @@ int wait_on_folio_writeback_killable(struct folio *folio);
 void end_page_writeback(struct page *page);
 void end_folio_writeback(struct folio *folio);
 void wait_for_stable_page(struct page *page);
+void wait_for_stable_folio(struct folio *folio);
 
 void page_endio(struct page *page, bool is_write, int err);
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 6aadecc39fba..335594fe414e 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -29,3 +29,9 @@ void wait_on_page_writeback(struct page *page)
 	return wait_on_folio_writeback(page_folio(page));
 }
 EXPORT_SYMBOL_GPL(wait_on_page_writeback);
+
+void wait_for_stable_page(struct page *page)
+{
+	return wait_for_stable_folio(page_folio(page));
+}
+EXPORT_SYMBOL_GPL(wait_for_stable_page);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 8271f9b24b69..9d55ceec05c0 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2865,17 +2865,21 @@ int wait_on_folio_writeback_killable(struct folio *folio)
 EXPORT_SYMBOL_GPL(wait_on_folio_writeback_killable);
 
 /**
- * wait_for_stable_page() - wait for writeback to finish, if necessary.
- * @page:	The page to wait on.
+ * wait_for_stable_folio() - wait for writeback to finish, if necessary.
+ * @folio: The folio to wait on.
  *
- * This function determines if the given page is related to a backing device
- * that requires page contents to be held stable during writeback.  If so, then
- * it will wait for any pending writeback to complete.
+ * This function determines if the given folio is related to a backing
+ * device that requires folio contents to be held stable during writeback.
+ * If so, then it will wait for any pending writeback to complete.
+ *
+ * Context: Sleeps.  Must be called in process context and with
+ * no spinlocks held.  Caller should hold a reference on the folio.
+ * If the folio is not locked, writeback may start again after writeback
+ * has finished.
  */
-void wait_for_stable_page(struct page *page)
+void wait_for_stable_folio(struct folio *folio)
 {
-	page = thp_head(page);
-	if (page->mapping->host->i_sb->s_iflags & SB_I_STABLE_WRITES)
-		wait_on_page_writeback(page);
+	if (folio->mapping->host->i_sb->s_iflags & SB_I_STABLE_WRITES)
+		wait_on_folio_writeback(folio);
 }
-EXPORT_SYMBOL_GPL(wait_for_stable_page);
+EXPORT_SYMBOL_GPL(wait_for_stable_folio);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 26/28] mm/filemap: Convert wait_on_page_bit to wait_on_folio_bit
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (24 preceding siblings ...)
  2021-04-09 18:51 ` [PATCH v7 25/28] mm/writeback: Add wait_for_stable_folio Matthew Wilcox (Oracle)
@ 2021-04-09 18:51 ` Matthew Wilcox (Oracle)
  2021-04-09 18:51 ` [PATCH v7 27/28] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit Matthew Wilcox (Oracle)
  2021-04-09 18:51 ` [PATCH v7 28/28] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:51 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

We must always wait on the folio, otherwise we won't be woken up.

This commit shrinks the kernel by 691 bytes, mostly due to moving
the page waitqueue lookup into wait_on_folio_bit_common().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 include/linux/pagemap.h | 10 +++---
 mm/filemap.c            | 67 ++++++++++++++++++++---------------------
 mm/page-writeback.c     |  4 +--
 3 files changed, 39 insertions(+), 42 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index d50fc5adbee1..5bccccff48eb 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -790,11 +790,11 @@ static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
 }
 
 /*
- * This is exported only for wait_on_page_locked/wait_on_page_writeback, etc.,
+ * This is exported only for wait_on_folio_locked/wait_on_folio_writeback, etc.,
  * and should not be used directly.
  */
-extern void wait_on_page_bit(struct page *page, int bit_nr);
-extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
+extern void wait_on_folio_bit(struct folio *folio, int bit_nr);
+extern int wait_on_folio_bit_killable(struct folio *folio, int bit_nr);
 
 /* 
  * Wait for a folio to be unlocked.
@@ -806,14 +806,14 @@ extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
 static inline void wait_on_folio_locked(struct folio *folio)
 {
 	if (FolioLocked(folio))
-		wait_on_page_bit(&folio->page, PG_locked);
+		wait_on_folio_bit(folio, PG_locked);
 }
 
 static inline int wait_on_folio_locked_killable(struct folio *folio)
 {
 	if (!FolioLocked(folio))
 		return 0;
-	return wait_on_page_bit_killable(&folio->page, PG_locked);
+	return wait_on_folio_bit_killable(folio, PG_locked);
 }
 
 static inline void wait_on_page_locked(struct page *page)
diff --git a/mm/filemap.c b/mm/filemap.c
index cdb8250af510..8f07e21a8f29 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1102,7 +1102,7 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	 *
 	 * So update the flags atomically, and wake up the waiter
 	 * afterwards to avoid any races. This store-release pairs
-	 * with the load-acquire in wait_on_page_bit_common().
+	 * with the load-acquire in wait_on_folio_bit_common().
 	 */
 	smp_store_release(&wait->flags, flags | WQ_FLAG_WOKEN);
 	wake_up_state(wait->private, mode);
@@ -1183,7 +1183,7 @@ static void wake_up_folio(struct folio *folio, int bit)
 }
 
 /*
- * A choice of three behaviors for wait_on_page_bit_common():
+ * A choice of three behaviors for wait_on_folio_bit_common():
  */
 enum behavior {
 	EXCLUSIVE,	/* Hold ref to page and take the bit when woken, like
@@ -1217,9 +1217,10 @@ static inline bool trylock_page_bit_common(struct page *page, int bit_nr,
 /* How many times do we accept lock stealing from under a waiter? */
 int sysctl_page_lock_unfairness = 5;
 
-static inline int wait_on_page_bit_common(wait_queue_head_t *q,
-	struct page *page, int bit_nr, int state, enum behavior behavior)
+static inline int wait_on_folio_bit_common(struct folio *folio, int bit_nr,
+		int state, enum behavior behavior)
 {
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
 	int unfairness = sysctl_page_lock_unfairness;
 	struct wait_page_queue wait_page;
 	wait_queue_entry_t *wait = &wait_page.wait;
@@ -1228,8 +1229,8 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	unsigned long pflags;
 
 	if (bit_nr == PG_locked &&
-	    !PageUptodate(page) && PageWorkingset(page)) {
-		if (!PageSwapBacked(page)) {
+	    !FolioUptodate(folio) && FolioWorkingset(folio)) {
+		if (!FolioSwapBacked(folio)) {
 			delayacct_thrashing_start();
 			delayacct = true;
 		}
@@ -1239,7 +1240,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 
 	init_wait(wait);
 	wait->func = wake_page_function;
-	wait_page.page = page;
+	wait_page.page = &folio->page;
 	wait_page.bit_nr = bit_nr;
 
 repeat:
@@ -1254,7 +1255,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * Do one last check whether we can get the
 	 * page bit synchronously.
 	 *
-	 * Do the SetPageWaiters() marking before that
+	 * Do the SetFolioWaiters() marking before that
 	 * to let any waker we _just_ missed know they
 	 * need to wake us up (otherwise they'll never
 	 * even go to the slow case that looks at the
@@ -1265,8 +1266,8 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * lock to avoid races.
 	 */
 	spin_lock_irq(&q->lock);
-	SetPageWaiters(page);
-	if (!trylock_page_bit_common(page, bit_nr, wait))
+	SetFolioWaiters(folio);
+	if (!trylock_page_bit_common(&folio->page, bit_nr, wait))
 		__add_wait_queue_entry_tail(q, wait);
 	spin_unlock_irq(&q->lock);
 
@@ -1276,10 +1277,10 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	 * see whether the page bit testing has already
 	 * been done by the wake function.
 	 *
-	 * We can drop our reference to the page.
+	 * We can drop our reference to the folio.
 	 */
 	if (behavior == DROP)
-		put_page(page);
+		put_folio(folio);
 
 	/*
 	 * Note that until the "finish_wait()", or until
@@ -1316,7 +1317,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 		 *
 		 * And if that fails, we'll have to retry this all.
 		 */
-		if (unlikely(test_and_set_bit(bit_nr, &page->flags)))
+		if (unlikely(test_and_set_bit(bit_nr, folio_flags(folio, 0))))
 			goto repeat;
 
 		wait->flags |= WQ_FLAG_DONE;
@@ -1325,7 +1326,7 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 
 	/*
 	 * If a signal happened, this 'finish_wait()' may remove the last
-	 * waiter from the wait-queues, but the PageWaiters bit will remain
+	 * waiter from the wait-queues, but the FolioWaiters bit will remain
 	 * set. That's ok. The next wakeup will take care of it, and trying
 	 * to do it here would be difficult and prone to races.
 	 */
@@ -1356,19 +1357,17 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
 	return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR;
 }
 
-void wait_on_page_bit(struct page *page, int bit_nr)
+void wait_on_folio_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
-	wait_on_page_bit_common(q, page, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
+	wait_on_folio_bit_common(folio, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
 }
-EXPORT_SYMBOL(wait_on_page_bit);
+EXPORT_SYMBOL(wait_on_folio_bit);
 
-int wait_on_page_bit_killable(struct page *page, int bit_nr)
+int wait_on_folio_bit_killable(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, bit_nr, TASK_KILLABLE, SHARED);
+	return wait_on_folio_bit_common(folio, bit_nr, TASK_KILLABLE, SHARED);
 }
-EXPORT_SYMBOL(wait_on_page_bit_killable);
+EXPORT_SYMBOL(wait_on_folio_bit_killable);
 
 /**
  * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
@@ -1385,11 +1384,8 @@ EXPORT_SYMBOL(wait_on_page_bit_killable);
  */
 int put_and_wait_on_page_locked(struct page *page, int state)
 {
-	wait_queue_head_t *q;
-
-	page = compound_head(page);
-	q = page_waitqueue(page);
-	return wait_on_page_bit_common(q, page, PG_locked, state, DROP);
+	return wait_on_folio_bit_common(page_folio(page), PG_locked, state,
+			DROP);
 }
 
 /**
@@ -1481,8 +1477,10 @@ EXPORT_SYMBOL(end_page_private_2);
  */
 void wait_on_page_private_2(struct page *page)
 {
-	while (PagePrivate2(page))
-		wait_on_page_bit(page, PG_private_2);
+	struct folio *folio = page_folio(page);
+
+	while (FolioPrivate2(folio))
+		wait_on_folio_bit(folio, PG_private_2);
 }
 EXPORT_SYMBOL(wait_on_page_private_2);
 
@@ -1499,10 +1497,11 @@ EXPORT_SYMBOL(wait_on_page_private_2);
  */
 int wait_on_page_private_2_killable(struct page *page)
 {
+	struct folio *folio = page_folio(page);
 	int ret = 0;
 
-	while (PagePrivate2(page)) {
-		ret = wait_on_page_bit_killable(page, PG_private_2);
+	while (FolioPrivate2(folio)) {
+		ret = wait_on_folio_bit_killable(folio, PG_private_2);
 		if (ret < 0)
 			break;
 	}
@@ -1579,16 +1578,14 @@ EXPORT_SYMBOL_GPL(page_endio);
  */
 void __lock_folio(struct folio *folio)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
-	wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_UNINTERRUPTIBLE,
+	wait_on_folio_bit_common(folio, PG_locked, TASK_UNINTERRUPTIBLE,
 				EXCLUSIVE);
 }
 EXPORT_SYMBOL(__lock_folio);
 
 int __lock_folio_killable(struct folio *folio)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
-	return wait_on_page_bit_common(q, &folio->page, PG_locked, TASK_KILLABLE,
+	return wait_on_folio_bit_common(folio, PG_locked, TASK_KILLABLE,
 					EXCLUSIVE);
 }
 EXPORT_SYMBOL_GPL(__lock_folio_killable);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 9d55ceec05c0..7aed4feabdd2 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2834,7 +2834,7 @@ void wait_on_folio_writeback(struct folio *folio)
 {
 	while (FolioWriteback(folio)) {
 		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
-		wait_on_page_bit(&folio->page, PG_writeback);
+		wait_on_folio_bit(folio, PG_writeback);
 	}
 }
 EXPORT_SYMBOL_GPL(wait_on_folio_writeback);
@@ -2856,7 +2856,7 @@ int wait_on_folio_writeback_killable(struct folio *folio)
 {
 	while (FolioWriteback(folio)) {
 		trace_wait_on_page_writeback(&folio->page, folio_mapping(folio));
-		if (wait_on_page_bit_killable(&folio->page, PG_writeback))
+		if (wait_on_folio_bit_killable(folio, PG_writeback))
 			return -EINTR;
 	}
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 27/28] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (25 preceding siblings ...)
  2021-04-09 18:51 ` [PATCH v7 26/28] mm/filemap: Convert wait_on_page_bit to wait_on_folio_bit Matthew Wilcox (Oracle)
@ 2021-04-09 18:51 ` Matthew Wilcox (Oracle)
  2021-04-09 18:51 ` [PATCH v7 28/28] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:51 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

All callers have a folio, so use it directly.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 mm/filemap.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 8f07e21a8f29..dfdc04130c5b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1121,14 +1121,14 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	return (flags & WQ_FLAG_EXCLUSIVE) != 0;
 }
 
-static void wake_up_page_bit(struct page *page, int bit_nr)
+static void wake_up_folio_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
+	wait_queue_head_t *q = page_waitqueue(&folio->page);
 	struct wait_page_key key;
 	unsigned long flags;
 	wait_queue_entry_t bookmark;
 
-	key.page = page;
+	key.page = &folio->page;
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
@@ -1163,7 +1163,7 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 	 * page waiters.
 	 */
 	if (!waitqueue_active(q) || !key.page_match) {
-		ClearPageWaiters(page);
+		ClearFolioWaiters(folio);
 		/*
 		 * It's possible to miss clearing Waiters here, when we woke
 		 * our page waiters, but the hashed waitqueue has waiters for
@@ -1179,7 +1179,7 @@ static void wake_up_folio(struct folio *folio, int bit)
 {
 	if (!FolioWaiters(folio))
 		return;
-	wake_up_page_bit(&folio->page, bit);
+	wake_up_folio_bit(folio, bit);
 }
 
 /*
@@ -1444,7 +1444,7 @@ void unlock_folio(struct folio *folio)
 	BUILD_BUG_ON(PG_waiters != 7);
 	VM_BUG_ON_FOLIO(!FolioLocked(folio), folio);
 	if (clear_bit_unlock_is_negative_byte(PG_locked, folio_flags(folio, 0)))
-		wake_up_page_bit(&folio->page, PG_locked);
+		wake_up_folio_bit(folio, PG_locked);
 }
 EXPORT_SYMBOL(unlock_folio);
 
@@ -1461,11 +1461,12 @@ EXPORT_SYMBOL(unlock_folio);
  */
 void end_page_private_2(struct page *page)
 {
-	page = compound_head(page);
-	VM_BUG_ON_PAGE(!PagePrivate2(page), page);
-	clear_bit_unlock(PG_private_2, &page->flags);
-	wake_up_page_bit(page, PG_private_2);
-	put_page(page);
+	struct folio *folio = page_folio(page);
+
+	VM_BUG_ON_FOLIO(!FolioPrivate2(folio), folio);
+	clear_bit_unlock(PG_private_2, folio_flags(folio, 0));
+	wake_up_folio_bit(folio, PG_private_2);
+	put_folio(folio);
 }
 EXPORT_SYMBOL(end_page_private_2);
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v7 28/28] mm/filemap: Convert page wait queues to be folios
  2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
                   ` (26 preceding siblings ...)
  2021-04-09 18:51 ` [PATCH v7 27/28] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit Matthew Wilcox (Oracle)
@ 2021-04-09 18:51 ` Matthew Wilcox (Oracle)
  27 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-04-09 18:51 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

Reinforce that if we're waiting for a bit in a struct page, that's
actually in the head page by changing the type from page to folio.
Increases the size of cachefiles by two bytes, but the kernel core
is unchanged in size.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Layton <jlayton@kernel.org>
---
 fs/cachefiles/rdwr.c    | 16 ++++++++--------
 include/linux/pagemap.h |  8 ++++----
 mm/filemap.c            | 38 +++++++++++++++++++-------------------
 3 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 8ffc40e84a59..364af267ebaa 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -25,20 +25,20 @@ static int cachefiles_read_waiter(wait_queue_entry_t *wait, unsigned mode,
 	struct cachefiles_object *object;
 	struct fscache_retrieval *op = monitor->op;
 	struct wait_page_key *key = _key;
-	struct page *page = wait->private;
+	struct folio *folio = wait->private;
 
 	ASSERT(key);
 
 	_enter("{%lu},%u,%d,{%p,%u}",
 	       monitor->netfs_page->index, mode, sync,
-	       key->page, key->bit_nr);
+	       key->folio, key->bit_nr);
 
-	if (key->page != page || key->bit_nr != PG_locked)
+	if (key->folio != folio || key->bit_nr != PG_locked)
 		return 0;
 
-	_debug("--- monitor %p %lx ---", page, page->flags);
+	_debug("--- monitor %p %lx ---", folio, folio->flags);
 
-	if (!PageUptodate(page) && !PageError(page)) {
+	if (!FolioUptodate(folio) && !FolioError(folio)) {
 		/* unlocked, not uptodate and not erronous? */
 		_debug("page probably truncated");
 	}
@@ -107,7 +107,7 @@ static int cachefiles_read_reissue(struct cachefiles_object *object,
 	put_page(backpage2);
 
 	INIT_LIST_HEAD(&monitor->op_link);
-	add_page_wait_queue(backpage, &monitor->monitor);
+	add_folio_wait_queue(page_folio(backpage), &monitor->monitor);
 
 	if (trylock_page(backpage)) {
 		ret = -EIO;
@@ -294,7 +294,7 @@ static int cachefiles_read_backing_file_one(struct cachefiles_object *object,
 	get_page(backpage);
 	monitor->back_page = backpage;
 	monitor->monitor.private = backpage;
-	add_page_wait_queue(backpage, &monitor->monitor);
+	add_folio_wait_queue(page_folio(backpage), &monitor->monitor);
 	monitor = NULL;
 
 	/* but the page may have been read before the monitor was installed, so
@@ -548,7 +548,7 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 		get_page(backpage);
 		monitor->back_page = backpage;
 		monitor->monitor.private = backpage;
-		add_page_wait_queue(backpage, &monitor->monitor);
+		add_folio_wait_queue(page_folio(backpage), &monitor->monitor);
 		monitor = NULL;
 
 		/* but the page may have been read before the monitor was
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 5bccccff48eb..17df86f2bcde 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -690,13 +690,13 @@ static inline pgoff_t linear_page_index(struct vm_area_struct *vma,
 }
 
 struct wait_page_key {
-	struct page *page;
+	struct folio *folio;
 	int bit_nr;
 	int page_match;
 };
 
 struct wait_page_queue {
-	struct page *page;
+	struct folio *folio;
 	int bit_nr;
 	wait_queue_entry_t wait;
 };
@@ -704,7 +704,7 @@ struct wait_page_queue {
 static inline bool wake_page_match(struct wait_page_queue *wait_page,
 				  struct wait_page_key *key)
 {
-	if (wait_page->page != key->page)
+	if (wait_page->folio != key->folio)
 	       return false;
 	key->page_match = 1;
 
@@ -859,7 +859,7 @@ int wait_on_page_private_2_killable(struct page *page);
 /*
  * Add an arbitrary waiter to a page's wait queue
  */
-extern void add_page_wait_queue(struct page *page, wait_queue_entry_t *waiter);
+void add_folio_wait_queue(struct folio *folio, wait_queue_entry_t *waiter);
 
 /*
  * Fault everything in given userspace address range in.
diff --git a/mm/filemap.c b/mm/filemap.c
index dfdc04130c5b..bc0021632c47 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1019,11 +1019,11 @@ EXPORT_SYMBOL(__page_cache_alloc);
  */
 #define PAGE_WAIT_TABLE_BITS 8
 #define PAGE_WAIT_TABLE_SIZE (1 << PAGE_WAIT_TABLE_BITS)
-static wait_queue_head_t page_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
+static wait_queue_head_t folio_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
 
-static wait_queue_head_t *page_waitqueue(struct page *page)
+static wait_queue_head_t *folio_waitqueue(struct folio *folio)
 {
-	return &page_wait_table[hash_ptr(page, PAGE_WAIT_TABLE_BITS)];
+	return &folio_wait_table[hash_ptr(folio, PAGE_WAIT_TABLE_BITS)];
 }
 
 void __init pagecache_init(void)
@@ -1031,7 +1031,7 @@ void __init pagecache_init(void)
 	int i;
 
 	for (i = 0; i < PAGE_WAIT_TABLE_SIZE; i++)
-		init_waitqueue_head(&page_wait_table[i]);
+		init_waitqueue_head(&folio_wait_table[i]);
 
 	page_writeback_init();
 }
@@ -1086,10 +1086,10 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 	 */
 	flags = wait->flags;
 	if (flags & WQ_FLAG_EXCLUSIVE) {
-		if (test_bit(key->bit_nr, &key->page->flags))
+		if (test_bit(key->bit_nr, &key->folio->flags))
 			return -1;
 		if (flags & WQ_FLAG_CUSTOM) {
-			if (test_and_set_bit(key->bit_nr, &key->page->flags))
+			if (test_and_set_bit(key->bit_nr, &key->folio->flags))
 				return -1;
 			flags |= WQ_FLAG_DONE;
 		}
@@ -1123,12 +1123,12 @@ static int wake_page_function(wait_queue_entry_t *wait, unsigned mode, int sync,
 
 static void wake_up_folio_bit(struct folio *folio, int bit_nr)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	struct wait_page_key key;
 	unsigned long flags;
 	wait_queue_entry_t bookmark;
 
-	key.page = &folio->page;
+	key.folio = folio;
 	key.bit_nr = bit_nr;
 	key.page_match = 0;
 
@@ -1220,7 +1220,7 @@ int sysctl_page_lock_unfairness = 5;
 static inline int wait_on_folio_bit_common(struct folio *folio, int bit_nr,
 		int state, enum behavior behavior)
 {
-	wait_queue_head_t *q = page_waitqueue(&folio->page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	int unfairness = sysctl_page_lock_unfairness;
 	struct wait_page_queue wait_page;
 	wait_queue_entry_t *wait = &wait_page.wait;
@@ -1240,7 +1240,7 @@ static inline int wait_on_folio_bit_common(struct folio *folio, int bit_nr,
 
 	init_wait(wait);
 	wait->func = wake_page_function;
-	wait_page.page = &folio->page;
+	wait_page.folio = folio;
 	wait_page.bit_nr = bit_nr;
 
 repeat:
@@ -1389,23 +1389,23 @@ int put_and_wait_on_page_locked(struct page *page, int state)
 }
 
 /**
- * add_page_wait_queue - Add an arbitrary waiter to a page's wait queue
- * @page: Page defining the wait queue of interest
+ * add_folio_wait_queue - Add an arbitrary waiter to a folio's wait queue
+ * @folio: Folio defining the wait queue of interest
  * @waiter: Waiter to add to the queue
  *
- * Add an arbitrary @waiter to the wait queue for the nominated @page.
+ * Add an arbitrary @waiter to the wait queue for the nominated @folio.
  */
-void add_page_wait_queue(struct page *page, wait_queue_entry_t *waiter)
+void add_folio_wait_queue(struct folio *folio, wait_queue_entry_t *waiter)
 {
-	wait_queue_head_t *q = page_waitqueue(page);
+	wait_queue_head_t *q = folio_waitqueue(folio);
 	unsigned long flags;
 
 	spin_lock_irqsave(&q->lock, flags);
 	__add_wait_queue_entry_tail(q, waiter);
-	SetPageWaiters(page);
+	SetFolioWaiters(folio);
 	spin_unlock_irqrestore(&q->lock, flags);
 }
-EXPORT_SYMBOL_GPL(add_page_wait_queue);
+EXPORT_SYMBOL_GPL(add_folio_wait_queue);
 
 #ifndef clear_bit_unlock_is_negative_byte
 
@@ -1593,10 +1593,10 @@ EXPORT_SYMBOL_GPL(__lock_folio_killable);
 
 static int __lock_folio_async(struct folio *folio, struct wait_page_queue *wait)
 {
-	struct wait_queue_head *q = page_waitqueue(&folio->page);
+	struct wait_queue_head *q = folio_waitqueue(folio);
 	int ret = 0;
 
-	wait->page = &folio->page;
+	wait->folio = folio;
 	wait->bit_nr = PG_locked;
 
 	spin_lock_irq(&q->lock);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v7 02/28] mm: Introduce struct folio
  2021-04-09 18:50 ` [PATCH v7 02/28] mm: Introduce struct folio Matthew Wilcox (Oracle)
@ 2021-04-09 22:45   ` kernel test robot
  2021-04-10  2:43     ` Bogus struct page layout on 32-bit Matthew Wilcox
  2021-04-10  2:51   ` [PATCH v7 02/28] mm: Introduce struct folio kernel test robot
  2021-04-16 15:55   ` Matthew Wilcox
  2 siblings, 1 reply; 47+ messages in thread
From: kernel test robot @ 2021-04-09 22:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm
  Cc: kbuild-all, clang-built-linux, Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Jeff Layton


[-- Attachment #1: Type: text/plain, Size: 23781 bytes --]

Hi "Matthew,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20210409]
[also build test ERROR on v5.12-rc6]
[cannot apply to linux/master linus/master hnaz-linux-mm/master v5.12-rc6 v5.12-rc5 v5.12-rc4]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox-Oracle/Memory-Folios/20210410-031353
base:    e99d8a8495175df8cb8b739f8cf9b0fc9d0cd3b5
config: powerpc-randconfig-r032-20210409 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project dd453a1389b6a7e6d9214b449d3c54981b1a89b6)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # https://github.com/0day-ci/linux/commit/5658a201516d2ed74a34c328e3b55f552d4861d8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/Memory-Folios/20210410-031353
        git checkout 5658a201516d2ed74a34c328e3b55f552d4861d8
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"
   FOLIO_MATCH(lru, lru);
   ^~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:275:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, compound_head) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, compound_head) == offsetof(struct folio, lru)"
   FOLIO_MATCH(compound_head, lru);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:276:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, index) == __builtin_offsetof(struct folio, index)' "offsetof(struct page, index) == offsetof(struct folio, index)"
   FOLIO_MATCH(index, index);
   ^~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:277:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, private) == __builtin_offsetof(struct folio, private)' "offsetof(struct page, private) == offsetof(struct folio, private)"
   FOLIO_MATCH(private, private);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:278:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, _mapcount) == __builtin_offsetof(struct folio, _mapcount)' "offsetof(struct page, _mapcount) == offsetof(struct folio, _mapcount)"
   FOLIO_MATCH(_mapcount, _mapcount);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:279:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, _refcount) == __builtin_offsetof(struct folio, _refcount)' "offsetof(struct page, _refcount) == offsetof(struct folio, _refcount)"
   FOLIO_MATCH(_refcount, _refcount);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:281:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, memcg_data) == __builtin_offsetof(struct folio, memcg_data)' "offsetof(struct page, memcg_data) == offsetof(struct folio, memcg_data)"
   FOLIO_MATCH(memcg_data, memcg_data);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:21:
   include/linux/mman.h:155:9: warning: division by zero is undefined [-Wdivision-by-zero]
                  _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    ) |
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mman.h:133:21: note: expanded from macro '_calc_vm_trans'
      : ((x) & (bit1)) / ((bit1) / (bit2))))
                       ^ ~~~~~~~~~~~~~~~~~
   include/linux/mman.h:156:9: warning: division by zero is undefined [-Wdivision-by-zero]
                  _calc_vm_trans(flags, MAP_SYNC,       VM_SYNC      ) |
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mman.h:133:21: note: expanded from macro '_calc_vm_trans'
      : ((x) & (bit1)) / ((bit1) / (bit2))))
                       ^ ~~~~~~~~~~~~~~~~~
   2 warnings and 7 errors generated.
--
   error: no override and no default toolchain set
   init/Kconfig:70:warning: 'RUSTC_VERSION': number is invalid
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"
   FOLIO_MATCH(lru, lru);
   ^~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:275:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, compound_head) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, compound_head) == offsetof(struct folio, lru)"
   FOLIO_MATCH(compound_head, lru);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:276:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, index) == __builtin_offsetof(struct folio, index)' "offsetof(struct page, index) == offsetof(struct folio, index)"
   FOLIO_MATCH(index, index);
   ^~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:277:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, private) == __builtin_offsetof(struct folio, private)' "offsetof(struct page, private) == offsetof(struct folio, private)"
   FOLIO_MATCH(private, private);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:278:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, _mapcount) == __builtin_offsetof(struct folio, _mapcount)' "offsetof(struct page, _mapcount) == offsetof(struct folio, _mapcount)"
   FOLIO_MATCH(_mapcount, _mapcount);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:279:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, _refcount) == __builtin_offsetof(struct folio, _refcount)' "offsetof(struct page, _refcount) == offsetof(struct folio, _refcount)"
   FOLIO_MATCH(_refcount, _refcount);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:14:
   In file included from include/linux/compat.h:17:
   In file included from include/linux/fs.h:15:
   In file included from include/linux/radix-tree.h:19:
   In file included from include/linux/xarray.h:14:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:21:
>> include/linux/mm_types.h:281:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, memcg_data) == __builtin_offsetof(struct folio, memcg_data)' "offsetof(struct page, memcg_data) == offsetof(struct folio, memcg_data)"
   FOLIO_MATCH(memcg_data, memcg_data);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
           static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: expanded from macro 'static_assert'
   #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:78:41: note: expanded from macro '__static_assert'
   #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                           ^              ~~~~
   In file included from arch/powerpc/kernel/asm-offsets.c:21:
   include/linux/mman.h:155:9: warning: division by zero is undefined [-Wdivision-by-zero]
                  _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    ) |
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mman.h:133:21: note: expanded from macro '_calc_vm_trans'
      : ((x) & (bit1)) / ((bit1) / (bit2))))
                       ^ ~~~~~~~~~~~~~~~~~
   include/linux/mman.h:156:9: warning: division by zero is undefined [-Wdivision-by-zero]
                  _calc_vm_trans(flags, MAP_SYNC,       VM_SYNC      ) |
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/mman.h:133:21: note: expanded from macro '_calc_vm_trans'
      : ((x) & (bit1)) / ((bit1) / (bit2))))
                       ^ ~~~~~~~~~~~~~~~~~
   2 warnings and 7 errors generated.
   make[2]: *** [scripts/Makefile.build:118: arch/powerpc/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [Makefile:1304: prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:222: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +274 include/linux/mm_types.h

   269	
   270	static_assert(sizeof(struct page) == sizeof(struct folio));
   271	#define FOLIO_MATCH(pg, fl)						\
   272		static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
   273	FOLIO_MATCH(flags, flags);
 > 274	FOLIO_MATCH(lru, lru);
 > 275	FOLIO_MATCH(compound_head, lru);
 > 276	FOLIO_MATCH(index, index);
 > 277	FOLIO_MATCH(private, private);
 > 278	FOLIO_MATCH(_mapcount, _mapcount);
 > 279	FOLIO_MATCH(_refcount, _refcount);
   280	#ifdef CONFIG_MEMCG
 > 281	FOLIO_MATCH(memcg_data, memcg_data);
   282	#endif
   283	#undef FOLIO_MATCH
   284	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 41005 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Bogus struct page layout on 32-bit
  2021-04-09 22:45   ` kernel test robot
@ 2021-04-10  2:43     ` Matthew Wilcox
  2021-04-10  6:21       ` Jesper Dangaard Brouer
                         ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Matthew Wilcox @ 2021-04-10  2:43 UTC (permalink / raw)
  To: kernel test robot
  Cc: linux-mm, kbuild-all, clang-built-linux, linux-kernel,
	linux-fsdevel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, linuxppc-dev, linux-arm-kernel,
	Jesper Dangaard Brouer, David S. Miller

On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> >> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"
>    FOLIO_MATCH(lru, lru);
>    include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
>            static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))

Well, this is interesting.  pahole reports:

struct page {
        long unsigned int          flags;                /*     0     4 */
        /* XXX 4 bytes hole, try to pack */
        union {
                struct {
                        struct list_head lru;            /*     8     8 */
...
struct folio {
        union {
                struct {
                        long unsigned int flags;         /*     0     4 */
                        struct list_head lru;            /*     4     8 */

so this assert has absolutely done its job.

But why has this assert triggered?  Why is struct page layout not what
we thought it was?  Turns out it's the dma_addr added in 2019 by commit
c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
the whole union gets moved out by 4 bytes.

Unfortunately, we can't just fix this by putting an 'unsigned long pad'
in front of it.  It still aligns the entire union to 8 bytes, and then
it skips another 4 bytes after the pad.

We can fix it like this ...

+++ b/include/linux/mm_types.h
@@ -96,11 +96,12 @@ struct page {
                        unsigned long private;
                };
                struct {        /* page_pool used by netstack */
+                       unsigned long _page_pool_pad;
                        /**
                         * @dma_addr: might require a 64-bit value even on
                         * 32-bit architectures.
                         */
-                       dma_addr_t dma_addr;
+                       dma_addr_t dma_addr __packed;
                };
                struct {        /* slab, slob and slub */
                        union {

but I don't know if GCC is smart enough to realise that dma_addr is now
on an 8 byte boundary and it can use a normal instruction to access it,
or whether it'll do something daft like use byte loads to access it.

We could also do:

+                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));

and I see pahole, at least sees this correctly:

                struct {
                        long unsigned int _page_pool_pad; /*     4     4 */
                        dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
                } __attribute__((__packed__)) __attribute__((__aligned__(4)));  

This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
/ dma_addr_t.  Advice, please?

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v7 02/28] mm: Introduce struct folio
  2021-04-09 18:50 ` [PATCH v7 02/28] mm: Introduce struct folio Matthew Wilcox (Oracle)
  2021-04-09 22:45   ` kernel test robot
@ 2021-04-10  2:51   ` kernel test robot
  2021-04-16 15:55   ` Matthew Wilcox
  2 siblings, 0 replies; 47+ messages in thread
From: kernel test robot @ 2021-04-10  2:51 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), linux-mm
  Cc: kbuild-all, Matthew Wilcox (Oracle),
	linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Jeff Layton


[-- Attachment #1: Type: text/plain, Size: 26190 bytes --]

Hi "Matthew,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20210409]
[also build test ERROR on v5.12-rc6]
[cannot apply to linux/master linus/master hnaz-linux-mm/master v5.12-rc6 v5.12-rc5 v5.12-rc4]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Wilcox-Oracle/Memory-Folios/20210410-031353
base:    e99d8a8495175df8cb8b739f8cf9b0fc9d0cd3b5
config: mips-gpr_defconfig (attached as .config)
compiler: mipsel-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/5658a201516d2ed74a34c328e3b55f552d4861d8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Matthew-Wilcox-Oracle/Memory-Folios/20210410-031353
        git checkout 5658a201516d2ed74a34c328e3b55f552d4861d8
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=mips 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/bits.h:22,
                    from include/linux/bitops.h:6,
                    from include/linux/log2.h:12,
                    from include/asm-generic/div64.h:53,
                    from arch/mips/include/asm/div64.h:12,
                    from include/linux/math64.h:7,
                    from include/linux/time.h:6,
                    from include/linux/compat.h:10,
                    from arch/mips/kernel/asm-offsets.c:12:
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, lru) == offsetof(struct folio, lru)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:274:1: note: in expansion of macro 'FOLIO_MATCH'
     274 | FOLIO_MATCH(lru, lru);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, compound_head) == offsetof(struct folio, lru)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:275:1: note: in expansion of macro 'FOLIO_MATCH'
     275 | FOLIO_MATCH(compound_head, lru);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, index) == offsetof(struct folio, index)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:276:1: note: in expansion of macro 'FOLIO_MATCH'
     276 | FOLIO_MATCH(index, index);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, private) == offsetof(struct folio, private)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:277:1: note: in expansion of macro 'FOLIO_MATCH'
     277 | FOLIO_MATCH(private, private);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, _mapcount) == offsetof(struct folio, _mapcount)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:278:1: note: in expansion of macro 'FOLIO_MATCH'
     278 | FOLIO_MATCH(_mapcount, _mapcount);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, _refcount) == offsetof(struct folio, _refcount)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:279:1: note: in expansion of macro 'FOLIO_MATCH'
     279 | FOLIO_MATCH(_refcount, _refcount);
         | ^~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:26:6: warning: no previous prototype for 'output_ptreg_defines' [-Wmissing-prototypes]
      26 | void output_ptreg_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:78:6: warning: no previous prototype for 'output_task_defines' [-Wmissing-prototypes]
      78 | void output_task_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:93:6: warning: no previous prototype for 'output_thread_info_defines' [-Wmissing-prototypes]
      93 | void output_thread_info_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:109:6: warning: no previous prototype for 'output_thread_defines' [-Wmissing-prototypes]
     109 | void output_thread_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:137:6: warning: no previous prototype for 'output_thread_fpu_defines' [-Wmissing-prototypes]
     137 | void output_thread_fpu_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:180:6: warning: no previous prototype for 'output_mm_defines' [-Wmissing-prototypes]
     180 | void output_mm_defines(void)
         |      ^~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:219:6: warning: no previous prototype for 'output_sc_defines' [-Wmissing-prototypes]
     219 | void output_sc_defines(void)
         |      ^~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:254:6: warning: no previous prototype for 'output_signal_defined' [-Wmissing-prototypes]
     254 | void output_signal_defined(void)
         |      ^~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:347:6: warning: no previous prototype for 'output_kvm_defines' [-Wmissing-prototypes]
     347 | void output_kvm_defines(void)
         |      ^~~~~~~~~~~~~~~~~~
--
   In file included from include/linux/bits.h:22,
                    from include/linux/bitops.h:6,
                    from include/linux/log2.h:12,
                    from include/asm-generic/div64.h:53,
                    from arch/mips/include/asm/div64.h:12,
                    from include/linux/math64.h:7,
                    from include/linux/time.h:6,
                    from include/linux/compat.h:10,
                    from arch/mips/kernel/asm-offsets.c:12:
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, lru) == offsetof(struct folio, lru)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:274:1: note: in expansion of macro 'FOLIO_MATCH'
     274 | FOLIO_MATCH(lru, lru);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, compound_head) == offsetof(struct folio, lru)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:275:1: note: in expansion of macro 'FOLIO_MATCH'
     275 | FOLIO_MATCH(compound_head, lru);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, index) == offsetof(struct folio, index)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:276:1: note: in expansion of macro 'FOLIO_MATCH'
     276 | FOLIO_MATCH(index, index);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, private) == offsetof(struct folio, private)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:277:1: note: in expansion of macro 'FOLIO_MATCH'
     277 | FOLIO_MATCH(private, private);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, _mapcount) == offsetof(struct folio, _mapcount)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:278:1: note: in expansion of macro 'FOLIO_MATCH'
     278 | FOLIO_MATCH(_mapcount, _mapcount);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, _refcount) == offsetof(struct folio, _refcount)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:279:1: note: in expansion of macro 'FOLIO_MATCH'
     279 | FOLIO_MATCH(_refcount, _refcount);
         | ^~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:26:6: warning: no previous prototype for 'output_ptreg_defines' [-Wmissing-prototypes]
      26 | void output_ptreg_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:78:6: warning: no previous prototype for 'output_task_defines' [-Wmissing-prototypes]
      78 | void output_task_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:93:6: warning: no previous prototype for 'output_thread_info_defines' [-Wmissing-prototypes]
      93 | void output_thread_info_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:109:6: warning: no previous prototype for 'output_thread_defines' [-Wmissing-prototypes]
     109 | void output_thread_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:137:6: warning: no previous prototype for 'output_thread_fpu_defines' [-Wmissing-prototypes]
     137 | void output_thread_fpu_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:180:6: warning: no previous prototype for 'output_mm_defines' [-Wmissing-prototypes]
     180 | void output_mm_defines(void)
         |      ^~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:219:6: warning: no previous prototype for 'output_sc_defines' [-Wmissing-prototypes]
     219 | void output_sc_defines(void)
         |      ^~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:254:6: warning: no previous prototype for 'output_signal_defined' [-Wmissing-prototypes]
     254 | void output_signal_defined(void)
         |      ^~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:347:6: warning: no previous prototype for 'output_kvm_defines' [-Wmissing-prototypes]
     347 | void output_kvm_defines(void)
         |      ^~~~~~~~~~~~~~~~~~
   make[2]: *** [scripts/Makefile.build:118: arch/mips/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [Makefile:1304: prepare0] Error 2
   make[1]: Target 'modules_prepare' not remade because of errors.
   make: *** [Makefile:222: __sub-make] Error 2
   make: Target 'modules_prepare' not remade because of errors.
--
   error: no override and no default toolchain set
   init/Kconfig:70:warning: 'RUSTC_VERSION': number is invalid
   In file included from include/linux/bits.h:22,
                    from include/linux/bitops.h:6,
                    from include/linux/log2.h:12,
                    from include/asm-generic/div64.h:53,
                    from arch/mips/include/asm/div64.h:12,
                    from include/linux/math64.h:7,
                    from include/linux/time.h:6,
                    from include/linux/compat.h:10,
                    from arch/mips/kernel/asm-offsets.c:12:
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, lru) == offsetof(struct folio, lru)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:274:1: note: in expansion of macro 'FOLIO_MATCH'
     274 | FOLIO_MATCH(lru, lru);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, compound_head) == offsetof(struct folio, lru)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:275:1: note: in expansion of macro 'FOLIO_MATCH'
     275 | FOLIO_MATCH(compound_head, lru);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, index) == offsetof(struct folio, index)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:276:1: note: in expansion of macro 'FOLIO_MATCH'
     276 | FOLIO_MATCH(index, index);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, private) == offsetof(struct folio, private)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:277:1: note: in expansion of macro 'FOLIO_MATCH'
     277 | FOLIO_MATCH(private, private);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, _mapcount) == offsetof(struct folio, _mapcount)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:278:1: note: in expansion of macro 'FOLIO_MATCH'
     278 | FOLIO_MATCH(_mapcount, _mapcount);
         | ^~~~~~~~~~~
>> include/linux/build_bug.h:78:41: error: static assertion failed: "offsetof(struct page, _refcount) == offsetof(struct folio, _refcount)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
   include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
   include/linux/mm_types.h:272:2: note: in expansion of macro 'static_assert'
     272 |  static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
         |  ^~~~~~~~~~~~~
   include/linux/mm_types.h:279:1: note: in expansion of macro 'FOLIO_MATCH'
     279 | FOLIO_MATCH(_refcount, _refcount);
         | ^~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:26:6: warning: no previous prototype for 'output_ptreg_defines' [-Wmissing-prototypes]
      26 | void output_ptreg_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:78:6: warning: no previous prototype for 'output_task_defines' [-Wmissing-prototypes]
      78 | void output_task_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:93:6: warning: no previous prototype for 'output_thread_info_defines' [-Wmissing-prototypes]
      93 | void output_thread_info_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:109:6: warning: no previous prototype for 'output_thread_defines' [-Wmissing-prototypes]
     109 | void output_thread_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:137:6: warning: no previous prototype for 'output_thread_fpu_defines' [-Wmissing-prototypes]
     137 | void output_thread_fpu_defines(void)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:180:6: warning: no previous prototype for 'output_mm_defines' [-Wmissing-prototypes]
     180 | void output_mm_defines(void)
         |      ^~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:219:6: warning: no previous prototype for 'output_sc_defines' [-Wmissing-prototypes]
     219 | void output_sc_defines(void)
         |      ^~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:254:6: warning: no previous prototype for 'output_signal_defined' [-Wmissing-prototypes]
     254 | void output_signal_defined(void)
         |      ^~~~~~~~~~~~~~~~~~~~~
   arch/mips/kernel/asm-offsets.c:347:6: warning: no previous prototype for 'output_kvm_defines' [-Wmissing-prototypes]
     347 | void output_kvm_defines(void)
         |      ^~~~~~~~~~~~~~~~~~
   make[2]: *** [scripts/Makefile.build:118: arch/mips/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [Makefile:1304: prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:222: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +78 include/linux/build_bug.h

bc6245e5efd70c Ian Abbott       2017-07-10  60  
6bab69c65013be Rasmus Villemoes 2019-03-07  61  /**
6bab69c65013be Rasmus Villemoes 2019-03-07  62   * static_assert - check integer constant expression at build time
6bab69c65013be Rasmus Villemoes 2019-03-07  63   *
6bab69c65013be Rasmus Villemoes 2019-03-07  64   * static_assert() is a wrapper for the C11 _Static_assert, with a
6bab69c65013be Rasmus Villemoes 2019-03-07  65   * little macro magic to make the message optional (defaulting to the
6bab69c65013be Rasmus Villemoes 2019-03-07  66   * stringification of the tested expression).
6bab69c65013be Rasmus Villemoes 2019-03-07  67   *
6bab69c65013be Rasmus Villemoes 2019-03-07  68   * Contrary to BUILD_BUG_ON(), static_assert() can be used at global
6bab69c65013be Rasmus Villemoes 2019-03-07  69   * scope, but requires the expression to be an integer constant
6bab69c65013be Rasmus Villemoes 2019-03-07  70   * expression (i.e., it is not enough that __builtin_constant_p() is
6bab69c65013be Rasmus Villemoes 2019-03-07  71   * true for expr).
6bab69c65013be Rasmus Villemoes 2019-03-07  72   *
6bab69c65013be Rasmus Villemoes 2019-03-07  73   * Also note that BUILD_BUG_ON() fails the build if the condition is
6bab69c65013be Rasmus Villemoes 2019-03-07  74   * true, while static_assert() fails the build if the expression is
6bab69c65013be Rasmus Villemoes 2019-03-07  75   * false.
6bab69c65013be Rasmus Villemoes 2019-03-07  76   */
6bab69c65013be Rasmus Villemoes 2019-03-07 @77  #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
6bab69c65013be Rasmus Villemoes 2019-03-07 @78  #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
6bab69c65013be Rasmus Villemoes 2019-03-07  79  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 21737 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  2:43     ` Bogus struct page layout on 32-bit Matthew Wilcox
@ 2021-04-10  6:21       ` Jesper Dangaard Brouer
  2021-04-10  8:52         ` Ilias Apalodimas
  2021-04-10 14:17       ` David Laight
  2021-04-10 19:10       ` Arnd Bergmann
  2 siblings, 1 reply; 47+ messages in thread
From: Jesper Dangaard Brouer @ 2021-04-10  6:21 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kernel test robot, linux-mm, kbuild-all, clang-built-linux,
	linux-kernel, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev,
	linux-arm-kernel, David S. Miller, brouer, Ilias Apalodimas,
	Ivan Khoronzhuk, Matteo Croce, netdev

On Sat, 10 Apr 2021 03:43:13 +0100
Matthew Wilcox <willy@infradead.org> wrote:

> On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > >> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"  
> >    FOLIO_MATCH(lru, lru);
> >    include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> >            static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))  
> 
> Well, this is interesting.  pahole reports:
> 
> struct page {
>         long unsigned int          flags;                /*     0     4 */
>         /* XXX 4 bytes hole, try to pack */
>         union {
>                 struct {
>                         struct list_head lru;            /*     8     8 */
> ...
> struct folio {
>         union {
>                 struct {
>                         long unsigned int flags;         /*     0     4 */
>                         struct list_head lru;            /*     4     8 */
> 
> so this assert has absolutely done its job.
> 
> But why has this assert triggered?  Why is struct page layout not what
> we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> the whole union gets moved out by 4 bytes.

Argh, good that you are catching this!

> Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> in front of it.  It still aligns the entire union to 8 bytes, and then
> it skips another 4 bytes after the pad.
> 
> We can fix it like this ...
> 
> +++ b/include/linux/mm_types.h
> @@ -96,11 +96,12 @@ struct page {
>                         unsigned long private;
>                 };
>                 struct {        /* page_pool used by netstack */
> +                       unsigned long _page_pool_pad;

I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
value after @dma_addr, and he could use this area instead.

[1] https://lore.kernel.org/netdev/20210409223801.104657-3-mcroce@linux.microsoft.com/

When adding/changing this, we need to make sure that it doesn't overlap
member @index, because network stack use/check page_is_pfmemalloc().
As far as my calculations this is safe to add.  I always try to keep an
eye out for this, but I wonder if we could have a build check like yours.


>                         /**
>                          * @dma_addr: might require a 64-bit value even on
>                          * 32-bit architectures.
>                          */
> -                       dma_addr_t dma_addr;
> +                       dma_addr_t dma_addr __packed;
>                 };
>                 struct {        /* slab, slob and slub */
>                         union {
> 
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.
> 
> We could also do:
> 
> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
> 
> and I see pahole, at least sees this correctly:
> 
>                 struct {
>                         long unsigned int _page_pool_pad; /*     4     4 */
>                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));  
> 
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.

I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
XDP+page_pool) ?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  6:21       ` Jesper Dangaard Brouer
@ 2021-04-10  8:52         ` Ilias Apalodimas
  2021-04-10 14:06           ` Matthew Wilcox
  2021-04-16  9:26           ` Grygorii Strashko
  0 siblings, 2 replies; 47+ messages in thread
From: Ilias Apalodimas @ 2021-04-10  8:52 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Matthew Wilcox, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Ivan Khoronzhuk, Matteo Croce, netdev,
	Grygorii Strashko

+CC Grygorii for the cpsw part as Ivan's email is not valid anymore

Thanks for catching this. Interesting indeed...

On Sat, 10 Apr 2021 at 09:22, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> On Sat, 10 Apr 2021 03:43:13 +0100
> Matthew Wilcox <willy@infradead.org> wrote:
>
> > On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > > >> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"
> > >    FOLIO_MATCH(lru, lru);
> > >    include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> > >            static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
> >
> > Well, this is interesting.  pahole reports:
> >
> > struct page {
> >         long unsigned int          flags;                /*     0     4 */
> >         /* XXX 4 bytes hole, try to pack */
> >         union {
> >                 struct {
> >                         struct list_head lru;            /*     8     8 */
> > ...
> > struct folio {
> >         union {
> >                 struct {
> >                         long unsigned int flags;         /*     0     4 */
> >                         struct list_head lru;            /*     4     8 */
> >
> > so this assert has absolutely done its job.
> >
> > But why has this assert triggered?  Why is struct page layout not what
> > we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> > c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> > config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> > the whole union gets moved out by 4 bytes.
>
> Argh, good that you are catching this!
>
> > Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> > in front of it.  It still aligns the entire union to 8 bytes, and then
> > it skips another 4 bytes after the pad.
> >
> > We can fix it like this ...
> >
> > +++ b/include/linux/mm_types.h
> > @@ -96,11 +96,12 @@ struct page {
> >                         unsigned long private;
> >                 };
> >                 struct {        /* page_pool used by netstack */
> > +                       unsigned long _page_pool_pad;
>
> I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
> value after @dma_addr, and he could use this area instead.
>
> [1] https://lore.kernel.org/netdev/20210409223801.104657-3-mcroce@linux.microsoft.com/
>
> When adding/changing this, we need to make sure that it doesn't overlap
> member @index, because network stack use/check page_is_pfmemalloc().
> As far as my calculations this is safe to add.  I always try to keep an
> eye out for this, but I wonder if we could have a build check like yours.
>
>
> >                         /**
> >                          * @dma_addr: might require a 64-bit value even on
> >                          * 32-bit architectures.
> >                          */
> > -                       dma_addr_t dma_addr;
> > +                       dma_addr_t dma_addr __packed;
> >                 };
> >                 struct {        /* slab, slob and slub */
> >                         union {
> >
> > but I don't know if GCC is smart enough to realise that dma_addr is now
> > on an 8 byte boundary and it can use a normal instruction to access it,
> > or whether it'll do something daft like use byte loads to access it.
> >
> > We could also do:
> >
> > +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
> >
> > and I see pahole, at least sees this correctly:
> >
> >                 struct {
> >                         long unsigned int _page_pool_pad; /*     4     4 */
> >                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
> >                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));
> >
> > This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> > / dma_addr_t.  Advice, please?
>
> I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.
>
> I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
> board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
> XDP+page_pool) ?
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  8:52         ` Ilias Apalodimas
@ 2021-04-10 14:06           ` Matthew Wilcox
  2021-04-10 15:54             ` Russell King - ARM Linux admin
  2021-04-16  9:26           ` Grygorii Strashko
  1 sibling, 1 reply; 47+ messages in thread
From: Matthew Wilcox @ 2021-04-10 14:06 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Jesper Dangaard Brouer, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Ivan Khoronzhuk, Matteo Croce, netdev,
	Grygorii Strashko

How about moving the flags into the union?  A bit messy, but we don't
have to play games with __packed__.

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 1210a8e41fad..f374d2f06255 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -68,16 +68,22 @@ struct mem_cgroup;
 #endif
 
 struct page {
-	unsigned long flags;		/* Atomic flags, some possibly
-					 * updated asynchronously */
 	/*
-	 * Five words (20/40 bytes) are available in this union.
-	 * WARNING: bit 0 of the first word is used for PageTail(). That
-	 * means the other users of this union MUST NOT use the bit to
+	 * This union is six words (24 / 48 bytes) in size.
+	 * The first word is reserved for atomic flags, often updated
+	 * asynchronously.  Use the PageFoo() macros to access it.  Some
+	 * of the flags can be reused for your own purposes, but the
+	 * word as a whole often contains other information and overwriting
+	 * it will cause functions like page_zone() and page_node() to stop
+	 * working correctly.
+	 *
+	 * Bit 0 of the second word is used for PageTail(). That
+	 * means the other users of this union MUST leave the bit zero to
 	 * avoid collision and false-positive PageTail().
 	 */
 	union {
 		struct {	/* Page cache and anonymous pages */
+			unsigned long flags;
 			/**
 			 * @lru: Pageout list, eg. active_list protected by
 			 * lruvec->lru_lock.  Sometimes used as a generic list
@@ -96,6 +102,8 @@ struct page {
 			unsigned long private;
 		};
 		struct {	/* page_pool used by netstack */
+			unsigned long _pp_flags;
+			unsigned long _pp_pad;
 			/**
 			 * @dma_addr: might require a 64-bit value even on
 			 * 32-bit architectures.
@@ -103,6 +111,7 @@ struct page {
 			dma_addr_t dma_addr;
 		};
 		struct {	/* slab, slob and slub */
+			unsigned long _slab_flags;
 			union {
 				struct list_head slab_list;
 				struct {	/* Partial pages */
@@ -130,6 +139,7 @@ struct page {
 			};
 		};
 		struct {	/* Tail pages of compound page */
+			unsigned long _tail1_flags;
 			unsigned long compound_head;	/* Bit zero is set */
 
 			/* First tail page only */
@@ -139,12 +149,14 @@ struct page {
 			unsigned int compound_nr; /* 1 << compound_order */
 		};
 		struct {	/* Second tail page of compound page */
+			unsigned long _tail2_flags;
 			unsigned long _compound_pad_1;	/* compound_head */
 			atomic_t hpage_pinned_refcount;
 			/* For both global and memcg */
 			struct list_head deferred_list;
 		};
 		struct {	/* Page table pages */
+			unsigned long _pt_flags;
 			unsigned long _pt_pad_1;	/* compound_head */
 			pgtable_t pmd_huge_pte; /* protected by page->ptl */
 			unsigned long _pt_pad_2;	/* mapping */
@@ -159,6 +171,7 @@ struct page {
 #endif
 		};
 		struct {	/* ZONE_DEVICE pages */
+			unsigned long _zd_flags;
 			/** @pgmap: Points to the hosting device page map. */
 			struct dev_pagemap *pgmap;
 			void *zone_device_data;
@@ -174,8 +187,11 @@ struct page {
 			 */
 		};
 
-		/** @rcu_head: You can use this to free a page by RCU. */
-		struct rcu_head rcu_head;
+		struct {
+			unsigned long _rcu_flags;
+			/** @rcu_head: You can use this to free a page by RCU. */
+			struct rcu_head rcu_head;
+		};
 	};
 
 	union {		/* This union is 4 bytes in size. */

^ permalink raw reply	[flat|nested] 47+ messages in thread

* RE: Bogus struct page layout on 32-bit
  2021-04-10  2:43     ` Bogus struct page layout on 32-bit Matthew Wilcox
  2021-04-10  6:21       ` Jesper Dangaard Brouer
@ 2021-04-10 14:17       ` David Laight
  2021-04-10 19:10       ` Arnd Bergmann
  2 siblings, 0 replies; 47+ messages in thread
From: David Laight @ 2021-04-10 14:17 UTC (permalink / raw)
  To: 'Matthew Wilcox', kernel test robot
  Cc: linux-mm, kbuild-all, clang-built-linux, linux-kernel,
	linux-fsdevel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, linuxppc-dev, linux-arm-kernel,
	Jesper Dangaard Brouer, David S. Miller

From: Matthew Wilcox
> Sent: 10 April 2021 03:43
> On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
> > >> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement
> '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page,
> lru) == offsetof(struct folio, lru)"
> >    FOLIO_MATCH(lru, lru);
> >    include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
> >            static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
> 
> Well, this is interesting.  pahole reports:
> 
> struct page {
>         long unsigned int          flags;                /*     0     4 */
>         /* XXX 4 bytes hole, try to pack */
>         union {
>                 struct {
>                         struct list_head lru;            /*     8     8 */
> ...
> struct folio {
>         union {
>                 struct {
>                         long unsigned int flags;         /*     0     4 */
>                         struct list_head lru;            /*     4     8 */
> 
> so this assert has absolutely done its job.
> 
> But why has this assert triggered?  Why is struct page layout not what
> we thought it was?  Turns out it's the dma_addr added in 2019 by commit
> c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
> config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
> the whole union gets moved out by 4 bytes.
> 
> Unfortunately, we can't just fix this by putting an 'unsigned long pad'
> in front of it.  It still aligns the entire union to 8 bytes, and then
> it skips another 4 bytes after the pad.
> 
> We can fix it like this ...
> 
> +++ b/include/linux/mm_types.h
> @@ -96,11 +96,12 @@ struct page {
>                         unsigned long private;
>                 };
>                 struct {        /* page_pool used by netstack */
> +                       unsigned long _page_pool_pad;
>                         /**
>                          * @dma_addr: might require a 64-bit value even on
>                          * 32-bit architectures.
>                          */
> -                       dma_addr_t dma_addr;
> +                       dma_addr_t dma_addr __packed;
>                 };
>                 struct {        /* slab, slob and slub */
>                         union {
> 
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.

I'm betting it will use byte accesses.
Checked - it does seem to use 4-byte accesses.
(godbolt is rather short of 32 bit compilers...)

> 
> We could also do:
> 
> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));

I wonder if __aligned(n) should be defined as
	__attribute__((packed,aligned(n))
I don't think you ever want the 'unpacked' variant.

But explicitly reducing the alignment of single member is much
better than the habit of marking the structure 'packed'.

(Never mind the habit of adding __packed 'because we don't want
the compiler to add random padding.)

> 
> and I see pahole, at least sees this correctly:
> 
>                 struct {
>                         long unsigned int _page_pool_pad; /*     4     4 */
>                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));

Is the attribute on the struct an artifact of pahole?
It should just have an alignment of 4 without anything special.

> 
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

Only those where a 64-bit value is 64-bit aligned.
So that excludes x86 (which can have 64-bit dma) but includes sparc32
(which probably doesn't).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10 14:06           ` Matthew Wilcox
@ 2021-04-10 15:54             ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-04-10 15:54 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Ilias Apalodimas, Jesper Dangaard Brouer, kernel test robot,
	Linux-MM, kbuild-all, clang-built-linux, open list,
	linux-fsdevel, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, linuxppc-dev, Linux ARM, David S. Miller,
	Ivan Khoronzhuk, Matteo Croce, netdev, Grygorii Strashko

On Sat, Apr 10, 2021 at 03:06:52PM +0100, Matthew Wilcox wrote:
> How about moving the flags into the union?  A bit messy, but we don't
> have to play games with __packed__.

Yes, that is probably the better solution, avoiding the games to try
and get the union appropriately placed on 32-bit systems.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  2:43     ` Bogus struct page layout on 32-bit Matthew Wilcox
  2021-04-10  6:21       ` Jesper Dangaard Brouer
  2021-04-10 14:17       ` David Laight
@ 2021-04-10 19:10       ` Arnd Bergmann
  2021-04-11 22:35         ` Matthew Wilcox
  2 siblings, 1 reply; 47+ messages in thread
From: Arnd Bergmann @ 2021-04-10 19:10 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kernel test robot, Linux-MM, kbuild-all, clang-built-linux,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	linuxppc-dev, Linux ARM, Jesper Dangaard Brouer, David S. Miller

On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> +                       dma_addr_t dma_addr __packed;
>                 };
>                 struct {        /* slab, slob and slub */
>                         union {
>
> but I don't know if GCC is smart enough to realise that dma_addr is now
> on an 8 byte boundary and it can use a normal instruction to access it,
> or whether it'll do something daft like use byte loads to access it.
>
> We could also do:
>
> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
>
> and I see pahole, at least sees this correctly:
>
>                 struct {
>                         long unsigned int _page_pool_pad; /*     4     4 */
>                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));
>
> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> / dma_addr_t.  Advice, please?

I've tried out what gcc would make of this:  https://godbolt.org/z/aTEbxxbG3

struct page {
    short a;
    struct {
        short b;
        long long c __attribute__((packed, aligned(2)));
    } __attribute__((packed));
} __attribute__((aligned(8)));

In this structure, 'c' is clearly aligned to eight bytes, and gcc does
realize that
it is safe to use the 'ldrd' instruction for 32-bit arm, which is forbidden on
struct members with less than 4 byte alignment. However, it also complains
that passing a pointer to 'c' into a function that expects a 'long long' is not
allowed because alignof(c) is only '2' here.

(I used 'short' here because I having a 64-bit member misaligned by four
bytes wouldn't make a difference to the instructions on Arm, or any other
32-bit architecture I can think of, regardless of the ABI requirements).

      Arnd

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10 19:10       ` Arnd Bergmann
@ 2021-04-11 22:35         ` Matthew Wilcox
  0 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox @ 2021-04-11 22:35 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: kernel test robot, Linux-MM, kbuild-all, clang-built-linux,
	Linux Kernel Mailing List, Linux FS-devel Mailing List,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	linuxppc-dev, Linux ARM, Jesper Dangaard Brouer, David S. Miller

On Sat, Apr 10, 2021 at 09:10:47PM +0200, Arnd Bergmann wrote:
> On Sat, Apr 10, 2021 at 4:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> > +                       dma_addr_t dma_addr __packed;
> >                 };
> >                 struct {        /* slab, slob and slub */
> >                         union {
> >
> > but I don't know if GCC is smart enough to realise that dma_addr is now
> > on an 8 byte boundary and it can use a normal instruction to access it,
> > or whether it'll do something daft like use byte loads to access it.
> >
> > We could also do:
> >
> > +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
> >
> > and I see pahole, at least sees this correctly:
> >
> >                 struct {
> >                         long unsigned int _page_pool_pad; /*     4     4 */
> >                         dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
> >                 } __attribute__((__packed__)) __attribute__((__aligned__(4)));
> >
> > This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
> > / dma_addr_t.  Advice, please?
> 
> I've tried out what gcc would make of this:  https://godbolt.org/z/aTEbxxbG3
> 
> struct page {
>     short a;
>     struct {
>         short b;
>         long long c __attribute__((packed, aligned(2)));
>     } __attribute__((packed));
> } __attribute__((aligned(8)));
> 
> In this structure, 'c' is clearly aligned to eight bytes, and gcc does
> realize that
> it is safe to use the 'ldrd' instruction for 32-bit arm, which is forbidden on
> struct members with less than 4 byte alignment. However, it also complains
> that passing a pointer to 'c' into a function that expects a 'long long' is not
> allowed because alignof(c) is only '2' here.
> 
> (I used 'short' here because I having a 64-bit member misaligned by four
> bytes wouldn't make a difference to the instructions on Arm, or any other
> 32-bit architecture I can think of, regardless of the ABI requirements).

So ... we could do this:

+++ b/include/linux/types.h
@@ -140,7 +140,7 @@ typedef u64 blkcnt_t;
  * so they don't care about the size of the actual bus addresses.
  */
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
-typedef u64 dma_addr_t;
+typedef u64 __attribute__((aligned(sizeof(void *)))) dma_addr_t;
 #else
 typedef u32 dma_addr_t;
 #endif

but I'm a little scared that this might have unintended consequences.
And Jesper points out that a big-endian 64-bit dma_addr_t can impersonate
a PageTail page, and we should solve that problem while we're at it.
So I don't think we should do this, but thought I should mention it as
a possibility.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v7 01/28] mm: Optimise nth_page for contiguous memmap
  2021-04-09 18:50 ` [PATCH v7 01/28] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
@ 2021-04-12  6:08   ` Christoph Hellwig
  0 siblings, 0 replies; 47+ messages in thread
From: Christoph Hellwig @ 2021-04-12  6:08 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-mm, linux-kernel, linux-fsdevel, linux-cachefs, linux-afs

On Fri, Apr 09, 2021 at 07:50:38PM +0100, Matthew Wilcox (Oracle) wrote:
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-10  8:52         ` Ilias Apalodimas
  2021-04-10 14:06           ` Matthew Wilcox
@ 2021-04-16  9:26           ` Grygorii Strashko
  2021-04-16 14:10             ` Arnd Bergmann
  2021-04-17 13:08             ` David Laight
  1 sibling, 2 replies; 47+ messages in thread
From: Grygorii Strashko @ 2021-04-16  9:26 UTC (permalink / raw)
  To: Ilias Apalodimas, Jesper Dangaard Brouer, Christoph Hellwig
  Cc: Matthew Wilcox, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Matteo Croce, netdev

Hi Ilias, All,

On 10/04/2021 11:52, Ilias Apalodimas wrote:
> +CC Grygorii for the cpsw part as Ivan's email is not valid anymore
> 
> Thanks for catching this. Interesting indeed...
> 
> On Sat, 10 Apr 2021 at 09:22, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>>
>> On Sat, 10 Apr 2021 03:43:13 +0100
>> Matthew Wilcox <willy@infradead.org> wrote:
>>
>>> On Sat, Apr 10, 2021 at 06:45:35AM +0800, kernel test robot wrote:
>>>>>> include/linux/mm_types.h:274:1: error: static_assert failed due to requirement '__builtin_offsetof(struct page, lru) == __builtin_offsetof(struct folio, lru)' "offsetof(struct page, lru) == offsetof(struct folio, lru)"
>>>>     FOLIO_MATCH(lru, lru);
>>>>     include/linux/mm_types.h:272:2: note: expanded from macro 'FOLIO_MATCH'
>>>>             static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
>>>
>>> Well, this is interesting.  pahole reports:
>>>
>>> struct page {
>>>          long unsigned int          flags;                /*     0     4 */
>>>          /* XXX 4 bytes hole, try to pack */
>>>          union {
>>>                  struct {
>>>                          struct list_head lru;            /*     8     8 */
>>> ...
>>> struct folio {
>>>          union {
>>>                  struct {
>>>                          long unsigned int flags;         /*     0     4 */
>>>                          struct list_head lru;            /*     4     8 */
>>>
>>> so this assert has absolutely done its job.
>>>
>>> But why has this assert triggered?  Why is struct page layout not what
>>> we thought it was?  Turns out it's the dma_addr added in 2019 by commit
>>> c25fff7171be ("mm: add dma_addr_t to struct page").  On this particular
>>> config, it's 64-bit, and ppc32 requires alignment to 64-bit.  So
>>> the whole union gets moved out by 4 bytes.
>>
>> Argh, good that you are catching this!
>>
>>> Unfortunately, we can't just fix this by putting an 'unsigned long pad'
>>> in front of it.  It still aligns the entire union to 8 bytes, and then
>>> it skips another 4 bytes after the pad.
>>>
>>> We can fix it like this ...
>>>
>>> +++ b/include/linux/mm_types.h
>>> @@ -96,11 +96,12 @@ struct page {
>>>                          unsigned long private;
>>>                  };
>>>                  struct {        /* page_pool used by netstack */
>>> +                       unsigned long _page_pool_pad;
>>
>> I'm fine with this pad.  Matteo is currently proposing[1] to add a 32-bit
>> value after @dma_addr, and he could use this area instead.
>>
>> [1] https://lore.kernel.org/netdev/20210409223801.104657-3-mcroce@linux.microsoft.com/
>>
>> When adding/changing this, we need to make sure that it doesn't overlap
>> member @index, because network stack use/check page_is_pfmemalloc().
>> As far as my calculations this is safe to add.  I always try to keep an
>> eye out for this, but I wonder if we could have a build check like yours.
>>
>>
>>>                          /**
>>>                           * @dma_addr: might require a 64-bit value even on
>>>                           * 32-bit architectures.
>>>                           */
>>> -                       dma_addr_t dma_addr;
>>> +                       dma_addr_t dma_addr __packed;
>>>                  };
>>>                  struct {        /* slab, slob and slub */
>>>                          union {
>>>
>>> but I don't know if GCC is smart enough to realise that dma_addr is now
>>> on an 8 byte boundary and it can use a normal instruction to access it,
>>> or whether it'll do something daft like use byte loads to access it.
>>>
>>> We could also do:
>>>
>>> +                       dma_addr_t dma_addr __packed __aligned(sizeof(void *));
>>>
>>> and I see pahole, at least sees this correctly:
>>>
>>>                  struct {
>>>                          long unsigned int _page_pool_pad; /*     4     4 */
>>>                          dma_addr_t dma_addr __attribute__((__aligned__(4))); /*     8     8 */
>>>                  } __attribute__((__packed__)) __attribute__((__aligned__(4)));
>>>
>>> This presumably affects any 32-bit architecture with a 64-bit phys_addr_t
>>> / dma_addr_t.  Advice, please?
>>
>> I'm not sure that the 32-bit behavior is with 64-bit (dma) addrs.
>>
>> I don't have any 32-bit boards with 64-bit DMA.  Cc. Ivan, wasn't your
>> board (572x ?) 32-bit with driver 'cpsw' this case (where Ivan added
>> XDP+page_pool) ?

Sry, for delayed reply.

The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA even in case of LPAE (dma-ranges are used).
Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected for the LPAE case
on TI platforms and the fact that it became set is the result of multi-paltform/allXXXconfig/DMA
optimizations and unification.
(just checked - not set in 4.14)

Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig").

The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y by using things like (__force u32)
for example.

Honestly, I've done sanity check of CPSW with LPAE=y (ARCH_DMA_ADDR_T_64BIT=y) very long time ago.

-- 
Best regards,
grygorii

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Bogus struct page layout on 32-bit
  2021-04-16  9:26           ` Grygorii Strashko
@ 2021-04-16 14:10             ` Arnd Bergmann
  2021-04-17 13:08             ` David Laight
  1 sibling, 0 replies; 47+ messages in thread
From: Arnd Bergmann @ 2021-04-16 14:10 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Ilias Apalodimas, Jesper Dangaard Brouer, Christoph Hellwig,
	Matthew Wilcox, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Matteo Croce, netdev

On Fri, Apr 16, 2021 at 11:27 AM 'Grygorii Strashko' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
> On 10/04/2021 11:52, Ilias Apalodimas wrote:
> > +CC Grygorii for the cpsw part as Ivan's email is not valid anymore
> The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA even in case of LPAE (dma-ranges are used).
> Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected for the LPAE case
> on TI platforms and the fact that it became set is the result of multi-paltform/allXXXconfig/DMA
> optimizations and unification.
> (just checked - not set in 4.14)
>
> Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig").

I completely missed this change in the past, and I don't really agree
with it either.

Most 32-bit Arm platforms are in fact limited to 32-bit DMA, even when they have
MMIO or RAM areas above the 4GB boundary that require LPAE.

> The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y by using
> things like (__force u32) for example.
>
> Honestly, I've done sanity check of CPSW with LPAE=y (ARCH_DMA_ADDR_T_64BIT=y) very long time ago.

This is of course a good idea, drivers should work with any
combination of 32-bit
or 64-bit phys_addr_t and dma_addr_t.

        Arnd

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v7 02/28] mm: Introduce struct folio
  2021-04-09 18:50 ` [PATCH v7 02/28] mm: Introduce struct folio Matthew Wilcox (Oracle)
  2021-04-09 22:45   ` kernel test robot
  2021-04-10  2:51   ` [PATCH v7 02/28] mm: Introduce struct folio kernel test robot
@ 2021-04-16 15:55   ` Matthew Wilcox
  2021-04-19  9:06     ` Kirill A. Shutemov
  2 siblings, 1 reply; 47+ messages in thread
From: Matthew Wilcox @ 2021-04-16 15:55 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-fsdevel, linux-cachefs, linux-afs, Jeff Layton

On Fri, Apr 09, 2021 at 07:50:39PM +0100, Matthew Wilcox (Oracle) wrote:
> A struct folio is a new abstraction to replace the venerable struct page.
> A function which takes a struct folio argument declares that it will
> operate on the entire (possibly compound) page, not just PAGE_SIZE bytes.
> In return, the caller guarantees that the pointer it is passing does
> not point to a tail page.
> +++ b/include/linux/mm_types.h
[...]
> +static inline struct folio *page_folio(struct page *page)
> +{
> +	unsigned long head = READ_ONCE(page->compound_head);
> +
> +	if (unlikely(head & 1))
> +		return (struct folio *)(head - 1);
> +	return (struct folio *)page;
> +}

I'm looking at changing this for the next revision, and basing it on
my recent patch to make compound_head() const-preserving:

+#define page_folio(page)       _Generic((page),                        \
+       const struct page *:    (const struct folio *)_compound_head(page), \
+       struct page *:          (struct folio *)_compound_head(page))

I've also noticed an awkward pattern occurring that I think this makes
less awkward:

+/**
+ * folio_page - Return a page from a folio.
+ * @folio: The folio.
+ * @n: The page number to return.
+ *
+ * @n is relative to the start of the folio.  It should be between
+ * 0 and folio_nr_pages(@folio) - 1, but this is not checked for.
+ */
+#define folio_page(folio, n)   nth_page(&(folio)->page, n)

That lets me simplify folio_next():

+static inline struct folio *folio_next(struct folio *folio)
+{
+       return (struct folio *)folio_page(folio, folio_nr_pages(folio));
+}

(it occurs to me this should also be const-preserving, but it's not clear
that's needed yet)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* RE: Bogus struct page layout on 32-bit
  2021-04-16  9:26           ` Grygorii Strashko
  2021-04-16 14:10             ` Arnd Bergmann
@ 2021-04-17 13:08             ` David Laight
  1 sibling, 0 replies; 47+ messages in thread
From: David Laight @ 2021-04-17 13:08 UTC (permalink / raw)
  To: 'Grygorii Strashko',
	Ilias Apalodimas, Jesper Dangaard Brouer, Christoph Hellwig
  Cc: Matthew Wilcox, kernel test robot, Linux-MM, kbuild-all,
	clang-built-linux, open list, linux-fsdevel, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev, Linux ARM,
	David S. Miller, Matteo Croce, netdev

From: Grygorii Strashko
> Sent: 16 April 2021 10:27
...
> Sry, for delayed reply.
> 
> The TI platforms am3/4/5 (cpsw) and Keystone 2 (netcp) can do only 32bit DMA even in case of LPAE
> (dma-ranges are used).
> Originally, as I remember, CONFIG_ARCH_DMA_ADDR_T_64BIT has not been selected for the LPAE case
> on TI platforms and the fact that it became set is the result of multi-paltform/allXXXconfig/DMA
> optimizations and unification.
> (just checked - not set in 4.14)
> 
> Probable commit 4965a68780c5 ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig").
> 
> The TI drivers have been updated, finally to accept ARCH_DMA_ADDR_T_64BIT=y by using things like
> (__force u32)
> for example.

Hmmm using (__force u32) is probably wrong.
If an address +length >= 2**32 can get passed then the IO request
needs to be errored (or a bounce buffer used).

Otherwise you can get particularly horrid corruptions.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v7 02/28] mm: Introduce struct folio
  2021-04-16 15:55   ` Matthew Wilcox
@ 2021-04-19  9:06     ` Kirill A. Shutemov
  0 siblings, 0 replies; 47+ messages in thread
From: Kirill A. Shutemov @ 2021-04-19  9:06 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Jeff Layton

On Fri, Apr 16, 2021 at 04:55:16PM +0100, Matthew Wilcox wrote:
> On Fri, Apr 09, 2021 at 07:50:39PM +0100, Matthew Wilcox (Oracle) wrote:
> > A struct folio is a new abstraction to replace the venerable struct page.
> > A function which takes a struct folio argument declares that it will
> > operate on the entire (possibly compound) page, not just PAGE_SIZE bytes.
> > In return, the caller guarantees that the pointer it is passing does
> > not point to a tail page.
> > +++ b/include/linux/mm_types.h
> [...]
> > +static inline struct folio *page_folio(struct page *page)
> > +{
> > +	unsigned long head = READ_ONCE(page->compound_head);
> > +
> > +	if (unlikely(head & 1))
> > +		return (struct folio *)(head - 1);
> > +	return (struct folio *)page;
> > +}
> 
> I'm looking at changing this for the next revision, and basing it on
> my recent patch to make compound_head() const-preserving:
> 
> +#define page_folio(page)       _Generic((page),                        \
> +       const struct page *:    (const struct folio *)_compound_head(page), \
> +       struct page *:          (struct folio *)_compound_head(page))
> 
> I've also noticed an awkward pattern occurring that I think this makes
> less awkward:
> 
> +/**
> + * folio_page - Return a page from a folio.
> + * @folio: The folio.
> + * @n: The page number to return.
> + *
> + * @n is relative to the start of the folio.  It should be between
> + * 0 and folio_nr_pages(@folio) - 1, but this is not checked for.
> + */
> +#define folio_page(folio, n)   nth_page(&(folio)->page, n)
> 
> That lets me simplify folio_next():
> 
> +static inline struct folio *folio_next(struct folio *folio)
> +{
> +       return (struct folio *)folio_page(folio, folio_nr_pages(folio));
> +}
> 
> (it occurs to me this should also be const-preserving, but it's not clear
> that's needed yet)

Are we risking that we would need to replace inline functions with macros
all the way down? Not sure const-preserving worth it.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v7 09/28] mm: Create FolioFlags
  2021-04-09 18:50 ` [PATCH v7 09/28] mm: Create FolioFlags Matthew Wilcox (Oracle)
@ 2021-04-19 13:25   ` Peter Zijlstra
  2021-04-19 13:55     ` Matthew Wilcox
  0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2021-04-19 13:25 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: linux-mm, linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

On Fri, Apr 09, 2021 at 07:50:46PM +0100, Matthew Wilcox (Oracle) wrote:
> These new functions are the folio analogues of the PageFlags functions.
> If CONFIG_DEBUG_VM_PGFLAGS is enabled, we check the folio is not a tail
> page at every invocation.  Note that this will also catch the PagePoisoned
> case as a poisoned page has every bit set, which would include PageTail.
> 
> This saves 1727 bytes of text with the distro-derived config that
> I'm testing due to removing a double call to compound_head() in
> PageSwapCache().

I vote for dropping the Camels if we're going to rework all this.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v7 09/28] mm: Create FolioFlags
  2021-04-19 13:25   ` Peter Zijlstra
@ 2021-04-19 13:55     ` Matthew Wilcox
  0 siblings, 0 replies; 47+ messages in thread
From: Matthew Wilcox @ 2021-04-19 13:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-mm, linux-kernel, linux-fsdevel, linux-cachefs, linux-afs,
	Christoph Hellwig, Jeff Layton

On Mon, Apr 19, 2021 at 03:25:46PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 07:50:46PM +0100, Matthew Wilcox (Oracle) wrote:
> > These new functions are the folio analogues of the PageFlags functions.
> > If CONFIG_DEBUG_VM_PGFLAGS is enabled, we check the folio is not a tail
> > page at every invocation.  Note that this will also catch the PagePoisoned
> > case as a poisoned page has every bit set, which would include PageTail.
> > 
> > This saves 1727 bytes of text with the distro-derived config that
> > I'm testing due to removing a double call to compound_head() in
> > PageSwapCache().
> 
> I vote for dropping the Camels if we're going to rework all this.

I'm open to that.  It's a bit of rework now, but easier to do it as
part of this than as a separate series.

So, concretely:

PageReferences() becomes folio_referenced()
SetPageReferenced() becomes folio_set_referenced()
ClearPageReferenced() becomes folio_clear_referenced()
__SetFolioReferenced() becomes __folio_set_referenced()
__ClearFolioReferenced() becomes __folio_clear_referenced()
TestSetPageReferenced() becomes folio_test_set_referenced()
TestClearPageReferenced() becomes folio_test_clear_referenced()

We do have some functions already like set_page_writeback(), but I
think those can become folio_set_writeback() without doing any harm.
We also have page_is_young(), page_is_pfmemalloc(), page_is_guard(),
etc.  Should it be folio_referenced()?  or folio_is_referenced()?


^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, back to index

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-09 18:50 [PATCH v7 00/28] Memory Folios Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 01/28] mm: Optimise nth_page for contiguous memmap Matthew Wilcox (Oracle)
2021-04-12  6:08   ` Christoph Hellwig
2021-04-09 18:50 ` [PATCH v7 02/28] mm: Introduce struct folio Matthew Wilcox (Oracle)
2021-04-09 22:45   ` kernel test robot
2021-04-10  2:43     ` Bogus struct page layout on 32-bit Matthew Wilcox
2021-04-10  6:21       ` Jesper Dangaard Brouer
2021-04-10  8:52         ` Ilias Apalodimas
2021-04-10 14:06           ` Matthew Wilcox
2021-04-10 15:54             ` Russell King - ARM Linux admin
2021-04-16  9:26           ` Grygorii Strashko
2021-04-16 14:10             ` Arnd Bergmann
2021-04-17 13:08             ` David Laight
2021-04-10 14:17       ` David Laight
2021-04-10 19:10       ` Arnd Bergmann
2021-04-11 22:35         ` Matthew Wilcox
2021-04-10  2:51   ` [PATCH v7 02/28] mm: Introduce struct folio kernel test robot
2021-04-16 15:55   ` Matthew Wilcox
2021-04-19  9:06     ` Kirill A. Shutemov
2021-04-09 18:50 ` [PATCH v7 03/28] mm: Add folio_pgdat and folio_zone Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 04/28] mm/vmstat: Add functions to account folio statistics Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 05/28] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 06/28] mm: Add folio reference count functions Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 07/28] mm: Add put_folio Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 08/28] mm: Add get_folio Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 09/28] mm: Create FolioFlags Matthew Wilcox (Oracle)
2021-04-19 13:25   ` Peter Zijlstra
2021-04-19 13:55     ` Matthew Wilcox
2021-04-09 18:50 ` [PATCH v7 10/28] mm: Handle per-folio private data Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 11/28] mm/filemap: Add folio_index, folio_file_page and folio_contains Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 12/28] mm/filemap: Add folio_next_index Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 13/28] mm/filemap: Add folio_offset and folio_file_offset Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 14/28] mm/util: Add folio_mapping and folio_file_mapping Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 15/28] mm: Add folio_mapcount Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 16/28] mm/memcg: Add folio wrappers for various functions Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 17/28] mm/filemap: Add unlock_folio Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 18/28] mm/filemap: Add lock_folio Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 19/28] mm/filemap: Add lock_folio_killable Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 20/28] mm/filemap: Add __lock_folio_async Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 21/28] mm/filemap: Add __lock_folio_or_retry Matthew Wilcox (Oracle)
2021-04-09 18:50 ` [PATCH v7 22/28] mm/filemap: Add wait_on_folio_locked Matthew Wilcox (Oracle)
2021-04-09 18:51 ` [PATCH v7 23/28] mm/filemap: Add end_folio_writeback Matthew Wilcox (Oracle)
2021-04-09 18:51 ` [PATCH v7 24/28] mm/writeback: Add wait_on_folio_writeback Matthew Wilcox (Oracle)
2021-04-09 18:51 ` [PATCH v7 25/28] mm/writeback: Add wait_for_stable_folio Matthew Wilcox (Oracle)
2021-04-09 18:51 ` [PATCH v7 26/28] mm/filemap: Convert wait_on_page_bit to wait_on_folio_bit Matthew Wilcox (Oracle)
2021-04-09 18:51 ` [PATCH v7 27/28] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit Matthew Wilcox (Oracle)
2021-04-09 18:51 ` [PATCH v7 28/28] mm/filemap: Convert page wait queues to be folios Matthew Wilcox (Oracle)

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git