linux-sh.vger.kernel.org archive mirror
* [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups
@ 2024-04-09 19:22 David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again David Hildenbrand
                   ` (17 more replies)
  0 siblings, 18 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

This series tracks the mapcount of large folios in a single value, so
it can be read efficiently and atomically, just like the mapcount of
small folios.

folio_mapcount() is then used in a couple more places, most notably to
reduce false negatives in folio_likely_mapped_shared(), and many users of
page_mapcount() are cleaned up (that's maybe why you got CCed on the
full series, sorry sh+xtensa folks! :) ).

The remaining s390x user and one KSM user of page_mapcount() are getting
removed separately on the list right now. I have patches to handle the
other KSM one, the khugepaged one and the kpagecount one; as they are not
as "obvious", I will send them out separately in the future. Once that is
all in place, I'm planning on moving page_mapcount() into
fs/proc/task_mmu.c, the remaining user for the time being (and we can
discuss at LSF/MM details on that :) ).

I proposed the mapcount for large folios (previously called total
mapcount) originally in part of [1] and I later included it in [2] where
it is a requirement. In the meantime, I changed the patch a bit so I
dropped all RB's. During the discussion of [1], Peter Xu correctly raised
that this additional tracking might affect the performance when
PMD->PTE remapping THPs. Since then, I addressed that by batching RMAP
operations during fork(), unmap/zap and when PMD->PTE remapping THPs.

Running some of my micro-benchmarks [3] (fork,munmap,cow-byte,remap) on 1
GiB of memory backed by folios with the same order, I observe the following
on an Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz tuned for reproducible
results as much as possible:

Standard deviation is mostly < 1%, except for order-9, where it's < 2% for
fork() and munmap().

(1) Small folios are not affected (< 1%) in all 4 microbenchmarks.
(2) Order-4 folios are not affected (< 1%) in all 4 microbenchmarks. A bit
    weird compared to the other orders ...
(3) PMD->PTE remapping of order-9 THPs is not affected (< 1%)
(4) COW-byte (COWing a single page by writing a single byte) is not
    affected for any order (< 1%). The page copy_fault overhead dominates
    everything.
(5) fork() is mostly not affected (< 1%), except order-2, where we have
    a slowdown of ~4%. Already for order-3 folios, we're down to a slowdown
    of < 1%.
(6) munmap() sees a slowdown by < 3% for some orders (order-5,
    order-6, order-9), but less for others (< 1% for order-4 and order-8,
    < 2% for order-2, order-3, order-7).

Especially the fork() and munmap() benchmarks are sensitive to each added
instruction and other system noise, so I suspect some of the change and
observed weirdness (order-4) is due to code layout changes and other
factors, but not really due to the added atomics.

So in the common case where we can batch, the added atomics don't really
make a big difference, especially in light of the improvements for large
folios that we recently gained due to batching. Surprisingly, for
some cases where we cannot batch (e.g., COW), the added atomics don't seem
to matter, because other overhead dominates.

My fork and munmap micro-benchmarks don't cover cases where we cannot
batch-process bigger parts of large folios. As this is not the common case,
I'm not worrying about that right now.

Future work is batching RMAP operations during swapout and folio
migration.

Not CCing everybody recommended by get_maintainer.pl (e.g., cgroups folks
just because of the doc update), to reduce noise. Tested on
x86-64, compile-tested on a bunch of other archs. Will do more testing
in the upcoming days.

[1] https://lore.kernel.org/all/20230809083256.699513-1-david@redhat.com/
[2] https://lore.kernel.org/all/20231124132626.235350-1-david@redhat.com/
[3] https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/pte-mapped-folio-benchmarks.c?ref_type=heads

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yin Fengwei <fengwei.yin@intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Richard Chang <richardycc@google.com>

David Hildenbrand (18):
  mm: allow for detecting underflows with page_mapcount() again
  mm/rmap: always inline anon/file rmap duplication of a single PTE
  mm/rmap: add fast-path for small folios when
    adding/removing/duplicating
  mm: track mapcount of large folios in single value
  mm: improve folio_likely_mapped_shared() using the mapcount of large
    folios
  mm: make folio_mapcount() return 0 for small typed folios
  mm/memory: use folio_mapcount() in zap_present_folio_ptes()
  mm/huge_memory: use folio_mapcount() in zap_huge_pmd() sanity check
  mm/memory-failure: use folio_mapcount() in hwpoison_user_mappings()
  mm/page_alloc: use folio_mapped() in __alloc_contig_migrate_range()
  mm/migrate: use folio_likely_mapped_shared() in
    add_page_for_migration()
  sh/mm/cache: use folio_mapped() in copy_from_user_page()
  mm/filemap: use folio_mapcount() in filemap_unaccount_folio()
  mm/migrate_device: use folio_mapcount() in migrate_vma_check_page()
  trace/events/page_ref: trace the raw page mapcount value
  xtensa/mm: convert check_tlb_entry() to sanity check folios
  mm/debug: print only page mapcount (excluding folio entire mapcount)
    in __dump_folio()
  Documentation/admin-guide/cgroup-v1/memory.rst: don't reference
    page_mapcount()

 .../admin-guide/cgroup-v1/memory.rst          |  4 +-
 Documentation/mm/transhuge.rst                | 12 +--
 arch/sh/mm/cache.c                            |  2 +-
 arch/xtensa/mm/tlb.c                          | 11 +--
 include/linux/mm.h                            | 77 +++++++++++--------
 include/linux/mm_types.h                      |  5 +-
 include/linux/rmap.h                          | 40 +++++++++-
 include/trace/events/page_ref.h               |  4 +-
 mm/debug.c                                    | 12 +--
 mm/filemap.c                                  |  2 +-
 mm/huge_memory.c                              |  2 +-
 mm/hugetlb.c                                  |  4 +-
 mm/internal.h                                 |  3 +
 mm/khugepaged.c                               |  2 +-
 mm/memory-failure.c                           |  4 +-
 mm/memory.c                                   |  3 +-
 mm/migrate.c                                  |  2 +-
 mm/migrate_device.c                           | 12 +--
 mm/page_alloc.c                               | 12 ++-
 mm/rmap.c                                     | 60 +++++++--------
 20 files changed, 163 insertions(+), 110 deletions(-)

-- 
2.44.0



* [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 20:06   ` Zi Yan
                     ` (2 more replies)
  2024-04-09 19:22 ` [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE David Hildenbrand
                   ` (16 subsequent siblings)
  17 siblings, 3 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

Commit 53277bcf126d ("mm: support page_mapcount() on page_has_type()
pages") made it impossible to detect mapcount underflows by treating
any negative raw mapcount value as a mapcount of 0.

We perform such underflow checks in zap_present_folio_ptes() and
zap_huge_pmd(), which would currently no longer trigger.

Let's check against PAGE_MAPCOUNT_RESERVE instead by using
page_type_has_type(), like page_has_type() would, so we can still catch
some underflows.
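
To illustrate the difference, here is a minimal userspace sketch (not kernel
code). It assumes PAGE_MAPCOUNT_RESERVE is -128 and that page_type_has_type()
compares the raw value against that reserve, as in include/linux/page-flags.h
around the time of this series. The model_* names are hypothetical, and the
compound-page part of page_mapcount() is left out.

```c
#include <assert.h>

/* Assumed value of PAGE_MAPCOUNT_RESERVE; check page-flags.h in your tree. */
#define MODEL_PAGE_MAPCOUNT_RESERVE (-128)

/* Raw values below the reserve are interpreted as page types, not mapcounts. */
static inline int model_page_type_has_type(unsigned int page_type)
{
	return (int)page_type < MODEL_PAGE_MAPCOUNT_RESERVE;
}

/* Old logic: every negative result is clamped to 0, which hides underflows. */
static inline int model_page_mapcount_old(int raw)
{
	int mapcount = raw + 1;

	if (mapcount < 0)
		mapcount = 0;
	return mapcount;
}

/* New logic: only typed pages report 0; small underflows remain negative. */
static inline int model_page_mapcount_new(int raw)
{
	return model_page_type_has_type((unsigned int)raw) ? 0 : raw + 1;
}

/* A raw value of -2 corresponds to one unmap too many (underflow by one). */
static inline void model_underflow_check(void)
{
	assert(model_page_mapcount_old(-2) == 0);	/* underflow hidden */
	assert(model_page_mapcount_new(-2) == -1);	/* underflow visible */
	assert(model_page_mapcount_new(-1) == 0);	/* unmapped page */
	assert(model_page_mapcount_new(-129) == 0);	/* treated as typed page */
}
```

Under these assumptions, only underflows of fewer than 127 stay visible as
negative mapcounts, which matches the "some underflows" wording above.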

Fixes: 53277bcf126d ("mm: support page_mapcount() on page_has_type() pages")
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef34cf54c14f..0fb8a40f82dd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1229,11 +1229,10 @@ static inline void page_mapcount_reset(struct page *page)
  */
 static inline int page_mapcount(struct page *page)
 {
-	int mapcount = atomic_read(&page->_mapcount) + 1;
+	int mapcount = atomic_read(&page->_mapcount);
 
 	/* Handle page_has_type() pages */
-	if (mapcount < 0)
-		mapcount = 0;
+	mapcount = page_type_has_type(mapcount) ? 0 : mapcount + 1;
 	if (unlikely(PageCompound(page)))
 		mapcount += folio_entire_mapcount(page_folio(page));
 
-- 
2.44.0



* [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-19  2:25   ` Yin, Fengwei
  2024-04-19 14:01   ` Yin, Fengwei
  2024-04-09 19:22 ` [PATCH v1 03/18] mm/rmap: add fast-path for small folios when adding/removing/duplicating David Hildenbrand
                   ` (15 subsequent siblings)
  17 siblings, 2 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

As we grow the code, the compiler might make stupid decisions and
unnecessarily degrade fork() performance. Let's make sure to always inline
functions that operate on a single PTE so the compiler will always
optimize out the loop and avoid a function call.
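
The idea can be sketched in userspace as follows. This is only a model with
hypothetical names (dup_page, dup_rmap_*), using the GCC/Clang always_inline
attribute in place of the kernel's __always_inline. With a constant
nr_pages of 1 and forced inlining, the compiler can fold the do/while loop
into a single increment.

```c
#include <assert.h>
#include <stdatomic.h>

/* GCC/Clang-specific; stands in for the kernel's __always_inline. */
#define model_always_inline inline __attribute__((__always_inline__))

/* Hypothetical stand-in for struct page; only the mapcount is modelled. */
struct dup_page {
	atomic_int mapcount;	/* starts at -1, like page->_mapcount */
};

/* Generic helper: handles nr_pages consecutive pages in a loop. */
static model_always_inline void dup_rmap(struct dup_page *page, int nr_pages)
{
	do {
		atomic_fetch_add(&page->mapcount, 1);
	} while (page++, --nr_pages > 0);
}

/* Batch variant: nr_pages is only known at run time. */
static inline void dup_rmap_ptes(struct dup_page *page, int nr_pages)
{
	dup_rmap(page, nr_pages);
}

/* Single-PTE variant: nr_pages is the constant 1, so the loop can vanish. */
static model_always_inline void dup_rmap_pte(struct dup_page *page)
{
	dup_rmap(page, 1);
}

/* Maps page 0 once on its own, then all four pages as a batch. */
static inline int dup_demo(void)
{
	struct dup_page pages[4];
	int i;

	for (i = 0; i < 4; i++)
		atomic_init(&pages[i].mapcount, -1);

	dup_rmap_pte(&pages[0]);
	dup_rmap_ptes(pages, 4);

	assert(atomic_load(&pages[3].mapcount) == 0);
	return atomic_load(&pages[0].mapcount);
}
```

Compared to a macro that forwards to the batch variant, the inline function
keeps type checking while not relying on the compiler's inlining heuristics.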

This is a preparation for maintaining a total mapcount for large folios.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/rmap.h | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 9bf9324214fc..9549d78928bb 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -347,8 +347,12 @@ static inline void folio_dup_file_rmap_ptes(struct folio *folio,
 {
 	__folio_dup_file_rmap(folio, page, nr_pages, RMAP_LEVEL_PTE);
 }
-#define folio_dup_file_rmap_pte(folio, page) \
-	folio_dup_file_rmap_ptes(folio, page, 1)
+
+static __always_inline void folio_dup_file_rmap_pte(struct folio *folio,
+		struct page *page)
+{
+	__folio_dup_file_rmap(folio, page, 1, RMAP_LEVEL_PTE);
+}
 
 /**
  * folio_dup_file_rmap_pmd - duplicate a PMD mapping of a page range of a folio
@@ -448,8 +452,13 @@ static inline int folio_try_dup_anon_rmap_ptes(struct folio *folio,
 	return __folio_try_dup_anon_rmap(folio, page, nr_pages, src_vma,
 					 RMAP_LEVEL_PTE);
 }
-#define folio_try_dup_anon_rmap_pte(folio, page, vma) \
-	folio_try_dup_anon_rmap_ptes(folio, page, 1, vma)
+
+static __always_inline int folio_try_dup_anon_rmap_pte(struct folio *folio,
+		struct page *page, struct vm_area_struct *src_vma)
+{
+	return __folio_try_dup_anon_rmap(folio, page, 1, src_vma,
+					 RMAP_LEVEL_PTE);
+}
 
 /**
  * folio_try_dup_anon_rmap_pmd - try duplicating a PMD mapping of a page range
-- 
2.44.0



* [PATCH v1 03/18] mm/rmap: add fast-path for small folios when adding/removing/duplicating
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-19 14:02   ` Yin, Fengwei
  2024-04-09 19:22 ` [PATCH v1 04/18] mm: track mapcount of large folios in single value David Hildenbrand
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

Let's add a fast-path for small folios to all relevant rmap functions.
Note that only RMAP_LEVEL_PTE applies.

This is a preparation for tracking the mapcount of large folios in a
single value.
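
A rough userspace model of the PTE-level add path with such a fast path
could look like this. All fp_* names are hypothetical, and the value
0x800000 for ENTIRELY_MAPPED is an assumption taken from mm/internal.h
around this series. Only the counting is modelled, none of the sanity checks
or statistics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define FP_ENTIRELY_MAPPED 0x800000	/* assumed value, see lead-in */

struct fp_page {
	atomic_int mapcount;		/* starts at -1 */
};

struct fp_folio {
	bool large;
	atomic_int nr_pages_mapped;	/* only meaningful for large folios */
};

/* Models atomic_inc_and_test(): true if the counter became 0. */
static inline bool fp_inc_and_test(atomic_int *v)
{
	return atomic_fetch_add(v, 1) + 1 == 0;
}

/* Returns how many pages became newly mapped by this call. */
static inline int fp_add_rmap_ptes(struct fp_folio *folio,
				   struct fp_page *page, int nr_pages)
{
	int nr = 0;

	/* Fast path: a small folio has one page and no per-folio counter. */
	if (!folio->large)
		return fp_inc_and_test(&page->mapcount);

	do {
		if (fp_inc_and_test(&page->mapcount)) {
			int mapped = atomic_fetch_add_explicit(
					&folio->nr_pages_mapped, 1,
					memory_order_relaxed) + 1;

			if (mapped < FP_ENTIRELY_MAPPED)
				nr++;
		}
	} while (page++, --nr_pages > 0);
	return nr;
}

/* Exercises both paths; returns nr_pages_mapped of the large folio. */
static inline int fp_demo(void)
{
	struct fp_folio small_folio, large_folio;
	struct fp_page sp, lp[4];
	int i;

	small_folio.large = false;
	large_folio.large = true;
	atomic_init(&small_folio.nr_pages_mapped, 0);
	atomic_init(&large_folio.nr_pages_mapped, 0);
	atomic_init(&sp.mapcount, -1);
	for (i = 0; i < 4; i++)
		atomic_init(&lp[i].mapcount, -1);

	assert(fp_add_rmap_ptes(&small_folio, &sp, 1) == 1);	/* first map */
	assert(fp_add_rmap_ptes(&small_folio, &sp, 1) == 0);	/* already mapped */
	assert(fp_add_rmap_ptes(&large_folio, lp, 4) == 4);
	assert(fp_add_rmap_ptes(&large_folio, lp, 2) == 0);
	return atomic_load(&large_folio.nr_pages_mapped);
}
```

Keeping the small-folio case separate means later patches can add
large-folio-only accounting to the loop without touching the small path.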

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/rmap.h | 13 +++++++++++++
 mm/rmap.c            | 26 ++++++++++++++++----------
 2 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 9549d78928bb..327f1ca5a487 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -322,6 +322,11 @@ static __always_inline void __folio_dup_file_rmap(struct folio *folio,
 
 	switch (level) {
 	case RMAP_LEVEL_PTE:
+		if (!folio_test_large(folio)) {
+			atomic_inc(&page->_mapcount);
+			break;
+		}
+
 		do {
 			atomic_inc(&page->_mapcount);
 		} while (page++, --nr_pages > 0);
@@ -405,6 +410,14 @@ static __always_inline int __folio_try_dup_anon_rmap(struct folio *folio,
 				if (PageAnonExclusive(page + i))
 					return -EBUSY;
 		}
+
+		if (!folio_test_large(folio)) {
+			if (PageAnonExclusive(page))
+				ClearPageAnonExclusive(page);
+			atomic_inc(&page->_mapcount);
+			break;
+		}
+
 		do {
 			if (PageAnonExclusive(page))
 				ClearPageAnonExclusive(page);
diff --git a/mm/rmap.c b/mm/rmap.c
index 56b313aa2ebf..4bde6d60db6c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1172,15 +1172,18 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
 
 	switch (level) {
 	case RMAP_LEVEL_PTE:
+		if (!folio_test_large(folio)) {
+			nr = atomic_inc_and_test(&page->_mapcount);
+			break;
+		}
+
 		do {
 			first = atomic_inc_and_test(&page->_mapcount);
-			if (first && folio_test_large(folio)) {
+			if (first) {
 				first = atomic_inc_return_relaxed(mapped);
-				first = (first < ENTIRELY_MAPPED);
+				if (first < ENTIRELY_MAPPED)
+					nr++;
 			}
-
-			if (first)
-				nr++;
 		} while (page++, --nr_pages > 0);
 		break;
 	case RMAP_LEVEL_PMD:
@@ -1514,15 +1517,18 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
 
 	switch (level) {
 	case RMAP_LEVEL_PTE:
+		if (!folio_test_large(folio)) {
+			nr = atomic_add_negative(-1, &page->_mapcount);
+			break;
+		}
+
 		do {
 			last = atomic_add_negative(-1, &page->_mapcount);
-			if (last && folio_test_large(folio)) {
+			if (last) {
 				last = atomic_dec_return_relaxed(mapped);
-				last = (last < ENTIRELY_MAPPED);
+				if (last < ENTIRELY_MAPPED)
+					nr++;
 			}
-
-			if (last)
-				nr++;
 		} while (page++, --nr_pages > 0);
 		break;
 	case RMAP_LEVEL_PMD:
-- 
2.44.0



* [PATCH v1 04/18] mm: track mapcount of large folios in single value
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (2 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 03/18] mm/rmap: add fast-path for small folios when adding/removing/duplicating David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 20:13   ` Zi Yan
                     ` (2 more replies)
  2024-04-09 19:22 ` [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios David Hildenbrand
                   ` (13 subsequent siblings)
  17 siblings, 3 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

Let's track the mapcount of large folios in a single value. The mapcount of
a large folio currently corresponds to the sum of the entire mapcount and
all page mapcounts.

This sum is what we actually want to know in folio_mapcount() and it is
also sufficient for implementing folio_mapped().
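
The relationship between the two ways of obtaining the mapcount can be
sketched in userspace like this. The lm_* names and the fixed folio size of
four pages are hypothetical; all counters start at -1, mirroring the kernel
convention. The loop variant models what folio_total_mapcount() computes, the
counter variant models reading a single value.

```c
#include <assert.h>
#include <stdatomic.h>

#define LM_NR_PAGES 4	/* hypothetical folio size for this model */

struct lm_folio {
	atomic_int large_mapcount;		/* single value, starts at -1 */
	atomic_int entire_mapcount;		/* PMD mappings, starts at -1 */
	atomic_int page_mapcount[LM_NR_PAGES];	/* PTE mappings, start at -1 */
};

static inline void lm_init(struct lm_folio *folio)
{
	int i;

	atomic_init(&folio->large_mapcount, -1);
	atomic_init(&folio->entire_mapcount, -1);
	for (i = 0; i < LM_NR_PAGES; i++)
		atomic_init(&folio->page_mapcount[i], -1);
}

/* Map one page via PTE: per-page counter plus the single folio counter. */
static inline void lm_map_pte(struct lm_folio *folio, int idx)
{
	atomic_fetch_add(&folio->page_mapcount[idx], 1);
	atomic_fetch_add(&folio->large_mapcount, 1);
}

/* Map the whole folio via PMD: entire counter plus the single counter. */
static inline void lm_map_pmd(struct lm_folio *folio)
{
	atomic_fetch_add(&folio->entire_mapcount, 1);
	atomic_fetch_add(&folio->large_mapcount, 1);
}

/* Sum of entire mapcount and all page mapcounts, obtained by looping. */
static inline int lm_mapcount_by_loop(struct lm_folio *folio)
{
	int mapcount = atomic_load(&folio->entire_mapcount) + 1;
	int i;

	for (i = 0; i < LM_NR_PAGES; i++)
		mapcount += atomic_load(&folio->page_mapcount[i]) + 1;
	return mapcount;
}

/* The same number, read from one counter without looping. */
static inline int lm_mapcount_by_counter(struct lm_folio *folio)
{
	return atomic_load(&folio->large_mapcount) + 1;
}

/* Two PTE mappings of page 0, one of page 3, one PMD mapping: 4 in total. */
static inline int lm_demo(void)
{
	struct lm_folio folio;

	lm_init(&folio);
	lm_map_pte(&folio, 0);
	lm_map_pte(&folio, 0);
	lm_map_pte(&folio, 3);
	lm_map_pmd(&folio);

	assert(lm_mapcount_by_loop(&folio) == lm_mapcount_by_counter(&folio));
	return lm_mapcount_by_counter(&folio);
}
```

Note that the loop reads several counters at different times, so under
concurrent updates only the single-counter read is one atomic snapshot.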

With PTE-mapped THP becoming more important and more widely used, we want
to avoid looping over all pages of a folio just to obtain the mapcount
of large folios. The comment "In the common case, avoid the loop when no
pages mapped by PTE" in folio_total_mapcount() no longer holds for
mTHP that are always mapped by PTE.

Further, we are planning on using folio_mapcount() more
frequently, and might even want to remove page mapcounts for large
folios in some kernel configs. Therefore, allow for reading the mapcount of
large folios efficiently and atomically without looping over any pages.

Maintain the mapcount also for hugetlb pages for simplicity. Use the new
mapcount to implement folio_mapcount() and folio_mapped(). Make
page_mapped() simply call folio_mapped(). We can now get rid of
folio_large_is_mapped().

_nr_pages_mapped is now only used in rmap code and for debugging
purposes. Keep folio_nr_pages_mapped() around, but document that its use
should be limited to rmap internals and debugging purposes.

This change implies one additional atomic add/sub whenever
mapping/unmapping (parts of) a large folio.

As we now batch RMAP operations for PTE-mapped THP during fork(),
during unmap/zap, and when PTE-remapping a PMD-mapped THP, and we adjust
the large mapcount for a PTE batch only once, the added overhead in the
common case is small. Only when unmapping individual pages of a large folio
(e.g., during COW), the overhead might be bigger in comparison, but it's
essentially one additional atomic operation.
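
As a back-of-the-envelope illustration of why batching keeps the cost low,
the following userspace sketch counts how often the folio-wide counter is
touched. The batch_* names are hypothetical and the plain folio_atomic_ops
field exists only for this illustration; the kernel keeps no such statistic.

```c
#include <assert.h>
#include <stdatomic.h>

#define BATCH_MAX_PAGES 16	/* hypothetical folio size for this model */

struct batch_folio {
	atomic_int large_mapcount;	/* starts at -1 */
	atomic_int page_mapcount[BATCH_MAX_PAGES];
	int folio_atomic_ops;		/* illustration only, not in the kernel */
};

/* Map nr_pages pages starting at idx; adjust the folio counter only once. */
static inline void batch_map_ptes(struct batch_folio *folio, int idx,
				  int nr_pages)
{
	const int orig_nr_pages = nr_pages;

	do {
		atomic_fetch_add(&folio->page_mapcount[idx++], 1);
	} while (--nr_pages > 0);

	atomic_fetch_add(&folio->large_mapcount, orig_nr_pages);
	folio->folio_atomic_ops++;
}

/*
 * Map nr_pages pages in chunks of "batch" pages and return how many atomic
 * operations hit the folio-wide counter.
 */
static inline int batch_demo_ops(int nr_pages, int batch)
{
	struct batch_folio folio;
	int i;

	assert(nr_pages <= BATCH_MAX_PAGES && batch > 0 && nr_pages % batch == 0);

	atomic_init(&folio.large_mapcount, -1);
	for (i = 0; i < BATCH_MAX_PAGES; i++)
		atomic_init(&folio.page_mapcount[i], -1);
	folio.folio_atomic_ops = 0;

	for (i = 0; i < nr_pages; i += batch)
		batch_map_ptes(&folio, i, batch);

	/* The resulting mapcount is the same regardless of the batch size. */
	assert(atomic_load(&folio.large_mapcount) + 1 == nr_pages);
	return folio.folio_atomic_ops;
}
```

In this model, mapping 16 pages as one batch touches the folio counter once,
while mapping them one by one touches it 16 times, which corresponds to the
unbatched case described above.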

Note that our refcount would overflow before the new mapcount does:
each mapping requires a folio reference. Extend the
documentation of folio_mapcount().

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 Documentation/mm/transhuge.rst | 12 +++++-----
 include/linux/mm.h             | 44 ++++++++++++++++------------------
 include/linux/mm_types.h       |  5 ++--
 include/linux/rmap.h           | 10 ++++++++
 mm/debug.c                     |  3 ++-
 mm/hugetlb.c                   |  4 ++--
 mm/internal.h                  |  3 +++
 mm/khugepaged.c                |  2 +-
 mm/page_alloc.c                |  4 ++++
 mm/rmap.c                      | 34 +++++++++-----------------
 10 files changed, 62 insertions(+), 59 deletions(-)

diff --git a/Documentation/mm/transhuge.rst b/Documentation/mm/transhuge.rst
index 93c9239b9ebe..1ba0ad63246c 100644
--- a/Documentation/mm/transhuge.rst
+++ b/Documentation/mm/transhuge.rst
@@ -116,14 +116,14 @@ pages:
     succeeds on tail pages.
 
   - map/unmap of a PMD entry for the whole THP increment/decrement
-    folio->_entire_mapcount and also increment/decrement
-    folio->_nr_pages_mapped by ENTIRELY_MAPPED when _entire_mapcount
-    goes from -1 to 0 or 0 to -1.
+    folio->_entire_mapcount, increment/decrement folio->_large_mapcount
+    and also increment/decrement folio->_nr_pages_mapped by ENTIRELY_MAPPED
+    when _entire_mapcount goes from -1 to 0 or 0 to -1.
 
   - map/unmap of individual pages with PTE entry increment/decrement
-    page->_mapcount and also increment/decrement folio->_nr_pages_mapped
-    when page->_mapcount goes from -1 to 0 or 0 to -1 as this counts
-    the number of pages mapped by PTE.
+    page->_mapcount, increment/decrement folio->_large_mapcount and also
+    increment/decrement folio->_nr_pages_mapped when page->_mapcount goes
+    from -1 to 0 or 0 to -1 as this counts the number of pages mapped by PTE.
 
 split_huge_page internally has to distribute the refcounts in the head
 page to the tail pages before clearing all PG_head/tail bits from the page
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0fb8a40f82dd..1862a216af15 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1239,16 +1239,26 @@ static inline int page_mapcount(struct page *page)
 	return mapcount;
 }
 
-int folio_total_mapcount(const struct folio *folio);
+static inline int folio_large_mapcount(const struct folio *folio)
+{
+	VM_WARN_ON_FOLIO(!folio_test_large(folio), folio);
+	return atomic_read(&folio->_large_mapcount) + 1;
+}
 
 /**
- * folio_mapcount() - Calculate the number of mappings of this folio.
+ * folio_mapcount() - Number of mappings of this folio.
  * @folio: The folio.
  *
- * A large folio tracks both how many times the entire folio is mapped,
- * and how many times each individual page in the folio is mapped.
- * This function calculates the total number of times the folio is
- * mapped.
+ * The folio mapcount corresponds to the number of present user page table
+ * entries that reference any part of a folio. Each such present user page
+ * table entry must be paired with exactly one folio reference.
+ *
+ * For ordinary folios, each user page table entry (PTE/PMD/PUD/...) counts
+ * exactly once.
+ *
+ * For hugetlb folios, each abstracted "hugetlb" user page table entry that
+ * references the entire folio counts exactly once, even when such special
+ * page table entries are comprised of multiple ordinary page table entries.
  *
  * Return: The number of times this folio is mapped.
  */
@@ -1256,17 +1266,7 @@ static inline int folio_mapcount(const struct folio *folio)
 {
 	if (likely(!folio_test_large(folio)))
 		return atomic_read(&folio->_mapcount) + 1;
-	return folio_total_mapcount(folio);
-}
-
-static inline bool folio_large_is_mapped(const struct folio *folio)
-{
-	/*
-	 * Reading _entire_mapcount below could be omitted if hugetlb
-	 * participated in incrementing nr_pages_mapped when compound mapped.
-	 */
-	return atomic_read(&folio->_nr_pages_mapped) > 0 ||
-		atomic_read(&folio->_entire_mapcount) >= 0;
+	return folio_large_mapcount(folio);
 }
 
 /**
@@ -1275,11 +1275,9 @@ static inline bool folio_large_is_mapped(const struct folio *folio)
  *
  * Return: True if any page in this folio is referenced by user page tables.
  */
-static inline bool folio_mapped(struct folio *folio)
+static inline bool folio_mapped(const struct folio *folio)
 {
-	if (likely(!folio_test_large(folio)))
-		return atomic_read(&folio->_mapcount) >= 0;
-	return folio_large_is_mapped(folio);
+	return folio_mapcount(folio) >= 1;
 }
 
 /*
@@ -1289,9 +1287,7 @@ static inline bool folio_mapped(struct folio *folio)
  */
 static inline bool page_mapped(const struct page *page)
 {
-	if (likely(!PageCompound(page)))
-		return atomic_read(&page->_mapcount) >= 0;
-	return folio_large_is_mapped(page_folio(page));
+	return folio_mapped(page_folio(page));
 }
 
 static inline struct page *virt_to_head_page(const void *x)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4260c595a79d..c432add95913 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -289,7 +289,8 @@ typedef struct {
  * @virtual: Virtual address in the kernel direct map.
  * @_last_cpupid: IDs of last CPU and last process that accessed the folio.
  * @_entire_mapcount: Do not use directly, call folio_entire_mapcount().
- * @_nr_pages_mapped: Do not use directly, call folio_mapcount().
+ * @_large_mapcount: Do not use directly, call folio_mapcount().
+ * @_nr_pages_mapped: Do not use outside of rmap and debug code.
  * @_pincount: Do not use directly, call folio_maybe_dma_pinned().
  * @_folio_nr_pages: Do not use directly, call folio_nr_pages().
  * @_hugetlb_subpool: Do not use directly, use accessor in hugetlb.h.
@@ -348,8 +349,8 @@ struct folio {
 		struct {
 			unsigned long _flags_1;
 			unsigned long _head_1;
-			unsigned long _folio_avail;
 	/* public: */
+			atomic_t _large_mapcount;
 			atomic_t _entire_mapcount;
 			atomic_t _nr_pages_mapped;
 			atomic_t _pincount;
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 327f1ca5a487..0f906dc6d280 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -273,6 +273,7 @@ static inline int hugetlb_try_dup_anon_rmap(struct folio *folio,
 		ClearPageAnonExclusive(&folio->page);
 	}
 	atomic_inc(&folio->_entire_mapcount);
+	atomic_inc(&folio->_large_mapcount);
 	return 0;
 }
 
@@ -306,6 +307,7 @@ static inline void hugetlb_add_file_rmap(struct folio *folio)
 	VM_WARN_ON_FOLIO(folio_test_anon(folio), folio);
 
 	atomic_inc(&folio->_entire_mapcount);
+	atomic_inc(&folio->_large_mapcount);
 }
 
 static inline void hugetlb_remove_rmap(struct folio *folio)
@@ -313,11 +315,14 @@ static inline void hugetlb_remove_rmap(struct folio *folio)
 	VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio);
 
 	atomic_dec(&folio->_entire_mapcount);
+	atomic_dec(&folio->_large_mapcount);
 }
 
 static __always_inline void __folio_dup_file_rmap(struct folio *folio,
 		struct page *page, int nr_pages, enum rmap_level level)
 {
+	const int orig_nr_pages = nr_pages;
+
 	__folio_rmap_sanity_checks(folio, page, nr_pages, level);
 
 	switch (level) {
@@ -330,9 +335,11 @@ static __always_inline void __folio_dup_file_rmap(struct folio *folio,
 		do {
 			atomic_inc(&page->_mapcount);
 		} while (page++, --nr_pages > 0);
+		atomic_add(orig_nr_pages, &folio->_large_mapcount);
 		break;
 	case RMAP_LEVEL_PMD:
 		atomic_inc(&folio->_entire_mapcount);
+		atomic_inc(&folio->_large_mapcount);
 		break;
 	}
 }
@@ -382,6 +389,7 @@ static __always_inline int __folio_try_dup_anon_rmap(struct folio *folio,
 		struct page *page, int nr_pages, struct vm_area_struct *src_vma,
 		enum rmap_level level)
 {
+	const int orig_nr_pages = nr_pages;
 	bool maybe_pinned;
 	int i;
 
@@ -423,6 +431,7 @@ static __always_inline int __folio_try_dup_anon_rmap(struct folio *folio,
 				ClearPageAnonExclusive(page);
 			atomic_inc(&page->_mapcount);
 		} while (page++, --nr_pages > 0);
+		atomic_add(orig_nr_pages, &folio->_large_mapcount);
 		break;
 	case RMAP_LEVEL_PMD:
 		if (PageAnonExclusive(page)) {
@@ -431,6 +440,7 @@ static __always_inline int __folio_try_dup_anon_rmap(struct folio *folio,
 			ClearPageAnonExclusive(page);
 		}
 		atomic_inc(&folio->_entire_mapcount);
+		atomic_inc(&folio->_large_mapcount);
 		break;
 	}
 	return 0;
diff --git a/mm/debug.c b/mm/debug.c
index b71186f1fb0b..d064db42af54 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -68,8 +68,9 @@ static void __dump_folio(struct folio *folio, struct page *page,
 			folio_ref_count(folio), mapcount, mapping,
 			folio->index + idx, pfn);
 	if (folio_test_large(folio)) {
-		pr_warn("head: order:%u entire_mapcount:%d nr_pages_mapped:%d pincount:%d\n",
+		pr_warn("head: order:%u mapcount:%d entire_mapcount:%d nr_pages_mapped:%d pincount:%d\n",
 				folio_order(folio),
+				folio_mapcount(folio),
 				folio_entire_mapcount(folio),
 				folio_nr_pages_mapped(folio),
 				atomic_read(&folio->_pincount));
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 454900c84b30..a8536349de13 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1517,7 +1517,7 @@ static void __destroy_compound_gigantic_folio(struct folio *folio,
 	struct page *p;
 
 	atomic_set(&folio->_entire_mapcount, 0);
-	atomic_set(&folio->_nr_pages_mapped, 0);
+	atomic_set(&folio->_large_mapcount, 0);
 	atomic_set(&folio->_pincount, 0);
 
 	for (i = 1; i < nr_pages; i++) {
@@ -2120,7 +2120,7 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
 	/* we rely on prep_new_hugetlb_folio to set the hugetlb flag */
 	folio_set_order(folio, order);
 	atomic_set(&folio->_entire_mapcount, -1);
-	atomic_set(&folio->_nr_pages_mapped, 0);
+	atomic_set(&folio->_large_mapcount, -1);
 	atomic_set(&folio->_pincount, 0);
 	return true;
 
diff --git a/mm/internal.h b/mm/internal.h
index 9d3250b4a08a..51fa6246769c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -72,6 +72,8 @@ void page_writeback_init(void);
 /*
  * How many individual pages have an elevated _mapcount.  Excludes
  * the folio's entire_mapcount.
+ *
+ * Don't use this function outside of debugging code.
  */
 static inline int folio_nr_pages_mapped(const struct folio *folio)
 {
@@ -610,6 +612,7 @@ static inline void prep_compound_head(struct page *page, unsigned int order)
 	struct folio *folio = (struct folio *)page;
 
 	folio_set_order(folio, order);
+	atomic_set(&folio->_large_mapcount, -1);
 	atomic_set(&folio->_entire_mapcount, -1);
 	atomic_set(&folio->_nr_pages_mapped, 0);
 	atomic_set(&folio->_pincount, 0);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 89e2624fb3ff..2f73d2aa9ae8 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1358,7 +1358,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 		 * Check if the page has any GUP (or other external) pins.
 		 *
 		 * Here the check may be racy:
-		 * it may see total_mapcount > refcount in some cases?
+		 * it may see folio_mapcount() > folio_ref_count().
 		 * But such case is ephemeral we could always retry collapse
 		 * later.  However it may report false positive if the page
 		 * has excessive GUP pins (i.e. 512).  Anyway the same check
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index adbb7e6e0c72..393366d4a704 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -941,6 +941,10 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
 			bad_page(page, "nonzero entire_mapcount");
 			goto out;
 		}
+		if (unlikely(folio_large_mapcount(folio))) {
+			bad_page(page, "nonzero large_mapcount");
+			goto out;
+		}
 		if (unlikely(atomic_read(&folio->_nr_pages_mapped))) {
 			bad_page(page, "nonzero nr_pages_mapped");
 			goto out;
diff --git a/mm/rmap.c b/mm/rmap.c
index 4bde6d60db6c..2608c40dffad 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1138,34 +1138,12 @@ int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
 	return page_vma_mkclean_one(&pvmw);
 }
 
-int folio_total_mapcount(const struct folio *folio)
-{
-	int mapcount = folio_entire_mapcount(folio);
-	int nr_pages;
-	int i;
-
-	/* In the common case, avoid the loop when no pages mapped by PTE */
-	if (folio_nr_pages_mapped(folio) == 0)
-		return mapcount;
-	/*
-	 * Add all the PTE mappings of those pages mapped by PTE.
-	 * Limit the loop to folio_nr_pages_mapped()?
-	 * Perhaps: given all the raciness, that may be a good or a bad idea.
-	 */
-	nr_pages = folio_nr_pages(folio);
-	for (i = 0; i < nr_pages; i++)
-		mapcount += atomic_read(&folio_page(folio, i)->_mapcount);
-
-	/* But each of those _mapcounts was based on -1 */
-	mapcount += nr_pages;
-	return mapcount;
-}
-
 static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
 		struct page *page, int nr_pages, enum rmap_level level,
 		int *nr_pmdmapped)
 {
 	atomic_t *mapped = &folio->_nr_pages_mapped;
+	const int orig_nr_pages = nr_pages;
 	int first, nr = 0;
 
 	__folio_rmap_sanity_checks(folio, page, nr_pages, level);
@@ -1185,6 +1163,7 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
 					nr++;
 			}
 		} while (page++, --nr_pages > 0);
+		atomic_add(orig_nr_pages, &folio->_large_mapcount);
 		break;
 	case RMAP_LEVEL_PMD:
 		first = atomic_inc_and_test(&folio->_entire_mapcount);
@@ -1201,6 +1180,7 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
 				nr = 0;
 			}
 		}
+		atomic_inc(&folio->_large_mapcount);
 		break;
 	}
 	return nr;
@@ -1436,10 +1416,14 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 			SetPageAnonExclusive(page);
 		}
 
+		/* increment count (starts at -1) */
+		atomic_set(&folio->_large_mapcount, nr - 1);
 		atomic_set(&folio->_nr_pages_mapped, nr);
 	} else {
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_entire_mapcount, 0);
+		/* increment count (starts at -1) */
+		atomic_set(&folio->_large_mapcount, 0);
 		atomic_set(&folio->_nr_pages_mapped, ENTIRELY_MAPPED);
 		SetPageAnonExclusive(&folio->page);
 		__lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr);
@@ -1522,6 +1506,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
 			break;
 		}
 
+		atomic_sub(nr_pages, &folio->_large_mapcount);
 		do {
 			last = atomic_add_negative(-1, &page->_mapcount);
 			if (last) {
@@ -1532,6 +1517,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
 		} while (page++, --nr_pages > 0);
 		break;
 	case RMAP_LEVEL_PMD:
+		atomic_dec(&folio->_large_mapcount);
 		last = atomic_add_negative(-1, &folio->_entire_mapcount);
 		if (last) {
 			nr = atomic_sub_return_relaxed(ENTIRELY_MAPPED, mapped);
@@ -2714,6 +2700,7 @@ void hugetlb_add_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 	VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio);
 
 	atomic_inc(&folio->_entire_mapcount);
+	atomic_inc(&folio->_large_mapcount);
 	if (flags & RMAP_EXCLUSIVE)
 		SetPageAnonExclusive(&folio->page);
 	VM_WARN_ON_FOLIO(folio_entire_mapcount(folio) > 1 &&
@@ -2728,6 +2715,7 @@ void hugetlb_add_new_anon_rmap(struct folio *folio,
 	BUG_ON(address < vma->vm_start || address >= vma->vm_end);
 	/* increment count (starts at -1) */
 	atomic_set(&folio->_entire_mapcount, 0);
+	atomic_set(&folio->_large_mapcount, 0);
 	folio_clear_hugetlb_restore_reserve(folio);
 	__folio_set_anon(folio, vma, address, true);
 	SetPageAnonExclusive(&folio->page);
-- 
2.44.0
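
The accounting added by the rmap hunks above can be modelled in userspace.
This is only a sketch, not kernel code: struct model_folio and the model_*()
helpers are hypothetical names, and the "+ 1" in the reader is inferred from
the counter starting at -1 and from free_tail_page_prepare() treating a
nonzero folio_large_mapcount() as bad.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the folio's _large_mapcount field. */
struct model_folio {
	atomic_int large_mapcount;	/* starts at -1, like the kernel field */
};

static void model_folio_init(struct model_folio *f)
{
	atomic_init(&f->large_mapcount, -1);
}

/* PTE level: every page table entry in the batch is one mapping. */
static void model_add_rmap_ptes(struct model_folio *f, int nr_pages)
{
	assert(nr_pages > 0);
	atomic_fetch_add(&f->large_mapcount, nr_pages);
}

static void model_remove_rmap_ptes(struct model_folio *f, int nr_pages)
{
	assert(nr_pages > 0);
	atomic_fetch_sub(&f->large_mapcount, nr_pages);
}

/* PMD level: mapping the entire folio counts exactly once. */
static void model_add_rmap_pmd(struct model_folio *f)
{
	atomic_fetch_add(&f->large_mapcount, 1);
}

static void model_remove_rmap_pmd(struct model_folio *f)
{
	atomic_fetch_sub(&f->large_mapcount, 1);
}

/* Reader: a single atomic load, no loop over the individual pages. */
static int model_large_mapcount(struct model_folio *f)
{
	return atomic_load(&f->large_mapcount) + 1;
}
```

In this model, mapping four pages by PTE and the whole folio once by PMD
yields a mapcount of 5, and undoing both brings it back to 0.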


* [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (3 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 04/18] mm: track mapcount of large folios in single value David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-16 10:40   ` Lance Yang
                     ` (2 more replies)
  2024-04-09 19:22 ` [PATCH v1 06/18] mm: make folio_mapcount() return 0 for small typed folios David Hildenbrand
                   ` (12 subsequent siblings)
  17 siblings, 3 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We can now read the mapcount of large folios very efficiently. Use it to
improve our handling of partially-mappable folios, falling back
to making a guess only in case the folio is not "obviously mapped shared".

We can now better detect partially-mappable folios where the first page is
not mapped as "mapped shared", reducing "false negatives"; but false
negatives are still possible.

While at it, fix up a wrong comment (false positive vs. false negative)
for KSM folios.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
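
For illustration only, the decision order described above can be modelled in
userspace. struct shared_model and its fields are hypothetical; in particular,
first_page_mapcount holds the number of PTE mappings of the first page, so
the kernel's "raw _mapcount > 0" becomes "> 1" here.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values the kernel function looks at. */
struct shared_model {
	bool large;			/* folio_test_large() */
	bool hugetlb;			/* folio_test_hugetlb() */
	int nr_pages;			/* folio_nr_pages() */
	int mapcount;			/* folio_mapcount() */
	int entire_mapcount;		/* folio_entire_mapcount() */
	int first_page_mapcount;	/* PTE mappings of the first page */
};

static bool model_likely_mapped_shared(const struct shared_model *f)
{
	/* Only partially-mappable folios require more care. */
	if (!f->large || f->hugetlb)
		return f->mapcount > 1;

	/* A single mapping implies "mapped exclusively". */
	if (f->mapcount <= 1)
		return false;

	/* Some page must be mapped more than once: "mapped shared". */
	if (f->entire_mapcount || f->mapcount > f->nr_pages)
		return true;

	/* Otherwise guess based on the first page. */
	return f->first_page_mapcount > 1;
}
```

A remaining false negative in this model: two processes each map pages 1 and 2
of a 4-page folio by PTE. The mapcount is 4, which does not exceed nr_pages,
and the first page is unmapped, so the guess is "not shared".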

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1862a216af15..daf687f0e8e5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2183,7 +2183,7 @@ static inline size_t folio_size(struct folio *folio)
  *       indicate "mapped shared" (false positive) when two VMAs in the same MM
  *       cover the same file range.
  *    #. For (small) KSM folios, the return value can wrongly indicate "mapped
- *       shared" (false negative), when the folio is mapped multiple times into
+ *       shared" (false positive), when the folio is mapped multiple times into
  *       the same MM.
  *
  * Further, this function only considers current page table mappings that
@@ -2200,7 +2200,22 @@ static inline size_t folio_size(struct folio *folio)
  */
 static inline bool folio_likely_mapped_shared(struct folio *folio)
 {
-	return page_mapcount(folio_page(folio, 0)) > 1;
+	int mapcount = folio_mapcount(folio);
+
+	/* Only partially-mappable folios require more care. */
+	if (!folio_test_large(folio) || unlikely(folio_test_hugetlb(folio)))
+		return mapcount > 1;
+
+	/* A single mapping implies "mapped exclusively". */
+	if (mapcount <= 1)
+		return false;
+
+	/* If any page is mapped more than once we treat it "mapped shared". */
+	if (folio_entire_mapcount(folio) || mapcount > folio_nr_pages(folio))
+		return true;
+
+	/* Let's guess based on the first subpage. */
+	return atomic_read(&folio->_mapcount) > 0;
 }
 
 #ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
-- 
2.44.0


* [PATCH v1 06/18] mm: make folio_mapcount() return 0 for small typed folios
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (4 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-24  9:40   ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 07/18] mm/memory: use folio_mapcount() in zap_present_folio_ptes() David Hildenbrand
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

Large typed folios are already handled properly. Let's also return "0"
for small typed folios, like page_mapcount() currently would.

Consequently, folio_mapcount() will never return negative values for
typed folios, but may return negative values for underflows.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
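
A rough userspace model of the small-folio path, for illustration only.
MODEL_TYPE_THRESHOLD and model_has_type() are hypothetical stand-ins for
page_type_has_type(); the sketch merely assumes that a page type shows up as
a raw value far below anything a mapcount underflow would produce.

```c
#include <stdbool.h>

/* Assumed cut-off: raw values below this are page types, not mapcounts. */
#define MODEL_TYPE_THRESHOLD	(-128)

static bool model_has_type(int raw)
{
	return raw < MODEL_TYPE_THRESHOLD;
}

/* Mapcount of a small folio, given the raw value stored in _mapcount. */
static int model_small_folio_mapcount(int raw)
{
	/* Typed pages (slab, page tables, ...) are never mapped. */
	if (model_has_type(raw))
		return 0;
	/* The raw counter starts at -1, so add 1 for the real count. */
	return raw + 1;
}
```

With this model, a raw value of -1 reports 0 mappings, 0 reports 1, a typed
page reports 0, and a slightly underflowed raw value such as -3 still reports
a negative mapcount (-2), so underflows stay visible.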

diff --git a/include/linux/mm.h b/include/linux/mm.h
index daf687f0e8e5..d453232bba62 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1260,12 +1260,19 @@ static inline int folio_large_mapcount(const struct folio *folio)
  * references the entire folio counts exactly once, even when such special
  * page table entries are comprised of multiple ordinary page table entries.
  *
+ * Will report 0 for pages which cannot be mapped into userspace, such as
+ * slab, page tables and similar.
+ *
  * Return: The number of times this folio is mapped.
  */
 static inline int folio_mapcount(const struct folio *folio)
 {
-	if (likely(!folio_test_large(folio)))
-		return atomic_read(&folio->_mapcount) + 1;
+	int mapcount;
+
+	if (likely(!folio_test_large(folio))) {
+		mapcount = atomic_read(&folio->_mapcount);
+		return page_type_has_type(mapcount) ? 0 : mapcount + 1;
+	}
 	return folio_large_mapcount(folio);
 }
 
-- 
2.44.0


* [PATCH v1 07/18] mm/memory: use folio_mapcount() in zap_present_folio_ptes()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (5 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 06/18] mm: make folio_mapcount() return 0 for small typed folios David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 08/18] mm/huge_memory: use folio_mapcount() in zap_huge_pmd() sanity check David Hildenbrand
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary. In zap_present_folio_ptes(), let's simply check
folio_mapcount(). If there is an accounting issue, it will underflow at
some point when unmapping either way.

Commit 10ebac4f95e7 ("mm/memory: optimize unmap/zap with PTE-mapped THP")
already documented this idea: "If we ever have a cheap folio_mapcount(),
we might just want to check for underflows there.".

There is no change for small folios. For large folios, we'll now catch
more underflows when batch-unmapping, because instead of only testing
the mapcount of the first subpage, we'll test if the folio mapcount
underflows.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
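
To illustrate why the folio-wide check catches more underflows, here is a
small userspace model. It is a sketch under simplifying assumptions: struct
zap_model and all zap_model_*() helpers are hypothetical, only PTE mappings
are modelled, and the old check is reduced to the raw per-page counter of the
first page in the batch.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_PAGES	4

struct zap_model {
	int page_raw[MODEL_NR_PAGES];	/* per-page counters, -1 == unmapped */
	int large_raw;			/* folio-wide counter, -1 == unmapped */
};

static void zap_model_init(struct zap_model *m)
{
	for (int i = 0; i < MODEL_NR_PAGES; i++)
		m->page_raw[i] = -1;
	m->large_raw = -1;
}

static void zap_model_map_pte(struct zap_model *m, int idx)
{
	assert(idx >= 0 && idx < MODEL_NR_PAGES);
	m->page_raw[idx]++;
	m->large_raw++;
}

/* Batch removal of nr PTE mappings starting at page index start. */
static void zap_model_remove_ptes(struct zap_model *m, int start, int nr)
{
	assert(start >= 0 && nr > 0 && start + nr <= MODEL_NR_PAGES);
	m->large_raw -= nr;
	for (int i = 0; i < nr; i++)
		m->page_raw[start + i]--;
}

/* Old-style check: only the first page of the batch. */
static bool zap_model_old_underflow(const struct zap_model *m, int start)
{
	return m->page_raw[start] + 1 < 0;
}

/* New-style check: the mapcount of the whole folio. */
static bool zap_model_new_underflow(const struct zap_model *m)
{
	return m->large_raw + 1 < 0;
}
```

If only page 1 is mapped but a batch of two entries starting at page 1 gets
removed, the first page of the batch ends up at a mapcount of 0 and looks
fine, while the folio-wide mapcount drops to -1 and flags the problem.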

diff --git a/mm/memory.c b/mm/memory.c
index 78422d1c7381..178492efb4af 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1502,8 +1502,7 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
 	if (!delay_rmap) {
 		folio_remove_rmap_ptes(folio, page, nr, vma);
 
-		/* Only sanity-check the first page in a batch. */
-		if (unlikely(page_mapcount(page) < 0))
+		if (unlikely(folio_mapcount(folio) < 0))
 			print_bad_pte(vma, addr, ptent, page);
 	}
 
-- 
2.44.0


* [PATCH v1 08/18] mm/huge_memory: use folio_mapcount() in zap_huge_pmd() sanity check
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (6 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 07/18] mm/memory: use folio_mapcount() in zap_present_folio_ptes() David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 09/18] mm/memory-failure: use folio_mapcount() in hwpoison_user_mappings() David Hildenbrand
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary. Let's similarly check for folio_mapcount() underflows
instead of page_mapcount() underflows like we do in
zap_present_folio_ptes() now.

Instead of the VM_BUG_ON(), we should actually be doing something like
print_bad_pte(). For now, let's keep it simple and use WARN_ON_ONCE(),
performing that check independently of DEBUG_VM.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/huge_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d8d2ed80b0bf..68ac27d229ef 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1851,7 +1851,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 
 			folio = page_folio(page);
 			folio_remove_rmap_pmd(folio, page, vma);
-			VM_BUG_ON_PAGE(page_mapcount(page) < 0, page);
+			WARN_ON_ONCE(folio_mapcount(folio) < 0);
 			VM_BUG_ON_PAGE(!PageHead(page), page);
 		} else if (thp_migration_supported()) {
 			swp_entry_t entry;
-- 
2.44.0


* [PATCH v1 09/18] mm/memory-failure: use folio_mapcount() in hwpoison_user_mappings()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (7 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 08/18] mm/huge_memory: use folio_mapcount() in zap_huge_pmd() sanity check David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 10/18] mm/page_alloc: use folio_mapped() in __alloc_contig_migrate_range() David Hildenbrand
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary. We can only unmap full folios; page_mapped(),
which we check here, is translated to folio_mapped() -- based on
folio_mapcount(). So let's print the folio mapcount instead.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 88359a185c5f..ee2f4b8905ef 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1628,8 +1628,8 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 
 	unmap_success = !page_mapped(p);
 	if (!unmap_success)
-		pr_err("%#lx: failed to unmap page (mapcount=%d)\n",
-		       pfn, page_mapcount(p));
+		pr_err("%#lx: failed to unmap page (folio mapcount=%d)\n",
+		       pfn, folio_mapcount(page_folio(p)));
 
 	/*
 	 * try_to_unmap() might put mlocked page in lru cache, so call
-- 
2.44.0


* [PATCH v1 10/18] mm/page_alloc: use folio_mapped() in __alloc_contig_migrate_range()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (8 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 09/18] mm/memory-failure: use folio_mapcount() in hwpoison_user_mappings() David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 11/18] mm/migrate: use folio_likely_mapped_shared() in add_page_for_migration() David Hildenbrand
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary.

For tracing purposes, we use page_mapcount() in
__alloc_contig_migrate_range(). Adding that mapcount to total_mapped sounds
strange: total_migrated and total_reclaimed would count each page only
once, not multiple times.

But then, isolate_migratepages_range() adds each folio only once to the
list. So for large folios, we would only query the mapcount of the first
page of the folio, which doesn't make much sense.

Let's simply use folio_mapped() * folio_nr_pages(), which makes more
sense as nr_migratepages is also incremented by the number of pages in
the folio in case of successful migration.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/page_alloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
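
As a sketch of the new counting, in plain userspace C: struct trace_folio and
the helpers below are hypothetical, and "mapped" is modelled as a mapcount of
at least 1. Each folio contributes either 0 or its number of pages.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-folio snapshot used only for this illustration. */
struct trace_folio {
	int mapcount;			/* how often the folio is mapped */
	unsigned long nr_pages;		/* pages in the folio */
};

static bool trace_folio_mapped(const struct trace_folio *f)
{
	return f->mapcount >= 1;
}

/* Sum nr_pages over all mapped folios on the (modelled) migration list. */
static unsigned long model_total_mapped(const struct trace_folio *folios,
					size_t n)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++)
		total += (unsigned long)trace_folio_mapped(&folios[i]) *
			 folios[i].nr_pages;
	return total;
}
```

For a small folio mapped three times, an unmapped small folio and a 512-page
folio mapped seven times, this counts 1 + 0 + 512 = 513 pages, independent of
how often each folio is mapped.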

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 393366d4a704..40fc0f60e021 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6389,8 +6389,12 @@ int __alloc_contig_migrate_range(struct compact_control *cc,
 
 		if (trace_mm_alloc_contig_migrate_range_info_enabled()) {
 			total_reclaimed += nr_reclaimed;
-			list_for_each_entry(page, &cc->migratepages, lru)
-				total_mapped += page_mapcount(page);
+			list_for_each_entry(page, &cc->migratepages, lru) {
+				struct folio *folio = page_folio(page);
+
+				total_mapped += folio_mapped(folio) *
+						folio_nr_pages(folio);
+			}
 		}
 
 		ret = migrate_pages(&cc->migratepages, alloc_migration_target,
-- 
2.44.0


* [PATCH v1 11/18] mm/migrate: use folio_likely_mapped_shared() in add_page_for_migration()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (9 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 10/18] mm/page_alloc: use folio_mapped() in __alloc_contig_migrate_range() David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 12/18] sh/mm/cache: use folio_mapped() in copy_from_user_page() David Hildenbrand
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary. In add_page_for_migration(), we actually want to
check if the folio is mapped shared, to reject such folios. So let's
use folio_likely_mapped_shared() instead.

For small folios, fully mapped THP, and hugetlb folios, there is no change.
For partially mapped, shared THP, we should now do a better job at
rejecting such folios.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 285072bca29c..d87ce32645d4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2140,7 +2140,7 @@ static int add_page_for_migration(struct mm_struct *mm, const void __user *p,
 		goto out_putfolio;
 
 	err = -EACCES;
-	if (page_mapcount(page) > 1 && !migrate_all)
+	if (folio_likely_mapped_shared(folio) && !migrate_all)
 		goto out_putfolio;
 
 	err = -EBUSY;
-- 
2.44.0


* [PATCH v1 12/18] sh/mm/cache: use folio_mapped() in copy_from_user_page()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (10 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 11/18] mm/migrate: use folio_likely_mapped_shared() in add_page_for_migration() David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 13/18] mm/filemap: use folio_mapcount() in filemap_unaccount_folio() David Hildenbrand
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary.

We're already using folio_mapped() in copy_user_highpage() and
copy_to_user_page() for a similar purpose, so let's simply use it for
copy_from_user_page() as well.

There is no change for small folios. Likely we won't stumble over many
large folios on sh in that code either way.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/sh/mm/cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sh/mm/cache.c b/arch/sh/mm/cache.c
index 9bcaa5619eab..d8be352e14d2 100644
--- a/arch/sh/mm/cache.c
+++ b/arch/sh/mm/cache.c
@@ -84,7 +84,7 @@ void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
 {
 	struct folio *folio = page_folio(page);
 
-	if (boot_cpu_data.dcache.n_aliases && page_mapcount(page) &&
+	if (boot_cpu_data.dcache.n_aliases && folio_mapped(folio) &&
 	    test_bit(PG_dcache_clean, &folio->flags)) {
 		void *vfrom = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
 		memcpy(dst, vfrom, len);
-- 
2.44.0


* [PATCH v1 13/18] mm/filemap: use folio_mapcount() in filemap_unaccount_folio()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (11 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 12/18] sh/mm/cache: use folio_mapped() in copy_from_user_page() David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 14/18] mm/migrate_device: use folio_mapcount() in migrate_vma_check_page() David Hildenbrand
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary.

Let's use folio_mapcount() instead of page_mapcount() in
filemap_unaccount_folio().

No functional change intended, because we're only dealing with small
folios.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index c668e11cd6ef..d4aa82ad5b59 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -168,7 +168,7 @@ static void filemap_unaccount_folio(struct address_space *mapping,
 		add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
 
 		if (mapping_exiting(mapping) && !folio_test_large(folio)) {
-			int mapcount = page_mapcount(&folio->page);
+			int mapcount = folio_mapcount(folio);
 
 			if (folio_ref_count(folio) >= mapcount + 2) {
 				/*
-- 
2.44.0


* [PATCH v1 14/18] mm/migrate_device: use folio_mapcount() in migrate_vma_check_page()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (12 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 13/18] mm/filemap: use folio_mapcount() in filemap_unaccount_folio() David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 15/18] trace/events/page_ref: trace the raw page mapcount value David Hildenbrand
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary. Let's convert migrate_vma_check_page() to work on
a folio internally so we can remove the page_mapcount() usage.

Note that we reject any large folios.

There is a lot more folio conversion to be had, but that has to wait for
another day. No functional change intended.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/migrate_device.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)
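
The reference accounting in migrate_vma_check_page() can be sketched in
userspace as follows. This is an illustration, not the kernel function:
struct check_model is hypothetical, and because the initial value of "extra"
is not visible in the hunk below, the sketch takes it as a caller_refs
parameter.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the folio state the check looks at. */
struct check_model {
	bool large;		/* folio_test_large() */
	bool zone_device;	/* folio_is_zone_device() */
	bool has_mapping;	/* folio_mapping() != NULL */
	bool has_private;	/* folio_has_private() */
	int ref_count;		/* folio_ref_count() */
	int mapcount;		/* folio_mapcount() */
};

/* Returns true if no unexpected references (e.g. pins) are held. */
static bool model_check_page(const struct check_model *f, int caller_refs)
{
	int extra = caller_refs;

	/* Large folios are rejected outright. */
	if (f->large)
		return false;

	/* ZONE_DEVICE pages have one extra reference. */
	if (f->zone_device)
		extra++;

	/* File-backed: page cache reference, plus one for private data. */
	if (f->has_mapping)
		extra += 1 + f->has_private;

	/* Anything beyond mapcount + expected references is a pin. */
	return f->ref_count - extra <= f->mapcount;
}
```

In this model, an anonymous small folio mapped once with one caller reference
passes at a refcount of 2 and fails at 3, because the third reference is not
explained by a mapping or by an expected extra reference.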

diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index d40b46ae9d65..b929b450b77c 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -324,6 +324,8 @@ static void migrate_vma_collect(struct migrate_vma *migrate)
  */
 static bool migrate_vma_check_page(struct page *page, struct page *fault_page)
 {
+	struct folio *folio = page_folio(page);
+
 	/*
 	 * One extra ref because caller holds an extra reference, either from
 	 * isolate_lru_page() for a regular page, or migrate_vma_collect() for
@@ -336,18 +338,18 @@ static bool migrate_vma_check_page(struct page *page, struct page *fault_page)
 	 * check them than regular pages, because they can be mapped with a pmd
 	 * or with a pte (split pte mapping).
 	 */
-	if (PageCompound(page))
+	if (folio_test_large(folio))
 		return false;
 
 	/* Page from ZONE_DEVICE have one extra reference */
-	if (is_zone_device_page(page))
+	if (folio_is_zone_device(folio))
 		extra++;
 
 	/* For file back page */
-	if (page_mapping(page))
-		extra += 1 + page_has_private(page);
+	if (folio_mapping(folio))
+		extra += 1 + folio_has_private(folio);
 
-	if ((page_count(page) - extra) > page_mapcount(page))
+	if ((folio_ref_count(folio) - extra) > folio_mapcount(folio))
 		return false;
 
 	return true;
-- 
2.44.0


* [PATCH v1 15/18] trace/events/page_ref: trace the raw page mapcount value
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (13 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 14/18] mm/migrate_device: use folio_mapcount() in migrate_vma_check_page() David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:22 ` [PATCH v1 16/18] xtensa/mm: convert check_tlb_entry() to sanity check folios David Hildenbrand
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary. We already trace raw page->refcount, raw page->flags
and raw page->mapping, and don't involve any folios. Let's also trace the
raw mapcount value, which does not consider the entire mapcount of large
folios and does not have "1" added to it.

When dealing with typed folios, this makes a lot more sense. ... and
it's for debugging purposes only either way.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/trace/events/page_ref.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/page_ref.h b/include/trace/events/page_ref.h
index 8a99c1cd417b..fe33a255b7d0 100644
--- a/include/trace/events/page_ref.h
+++ b/include/trace/events/page_ref.h
@@ -30,7 +30,7 @@ DECLARE_EVENT_CLASS(page_ref_mod_template,
 		__entry->pfn = page_to_pfn(page);
 		__entry->flags = page->flags;
 		__entry->count = page_ref_count(page);
-		__entry->mapcount = page_mapcount(page);
+		__entry->mapcount = atomic_read(&page->_mapcount);
 		__entry->mapping = page->mapping;
 		__entry->mt = get_pageblock_migratetype(page);
 		__entry->val = v;
@@ -79,7 +79,7 @@ DECLARE_EVENT_CLASS(page_ref_mod_and_test_template,
 		__entry->pfn = page_to_pfn(page);
 		__entry->flags = page->flags;
 		__entry->count = page_ref_count(page);
-		__entry->mapcount = page_mapcount(page);
+		__entry->mapcount = atomic_read(&page->_mapcount);
 		__entry->mapping = page->mapping;
 		__entry->mt = get_pageblock_migratetype(page);
 		__entry->val = v;
-- 
2.44.0


* [PATCH v1 16/18] xtensa/mm: convert check_tlb_entry() to sanity check folios
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (14 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 15/18] trace/events/page_ref: trace the raw page mapcount value David Hildenbrand
@ 2024-04-09 19:22 ` David Hildenbrand
  2024-04-09 19:23 ` [PATCH v1 17/18] mm/debug: print only page mapcount (excluding folio entire mapcount) in __dump_folio() David Hildenbrand
  2024-04-09 19:23 ` [PATCH v1 18/18] Documentation/admin-guide/cgroup-v1/memory.rst: don't reference page_mapcount() David Hildenbrand
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary. So let's convert check_tlb_entry() to perform
sanity checks on folios instead of pages.

This essentially already happened: page_count() is mapped to
folio_ref_count(), and page_mapped() to folio_mapped() internally.
However, we would have printed the page_mapcount(), which
does not really match what page_mapped() would have checked.

Let's simply print the folio mapcount to avoid using page_mapcount(). For
small folios there is no change.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/xtensa/mm/tlb.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/xtensa/mm/tlb.c b/arch/xtensa/mm/tlb.c
index 4f974b74883c..d8b60d6e50a8 100644
--- a/arch/xtensa/mm/tlb.c
+++ b/arch/xtensa/mm/tlb.c
@@ -256,12 +256,13 @@ static int check_tlb_entry(unsigned w, unsigned e, bool dtlb)
 					dtlb ? 'D' : 'I', w, e, r0, r1, pte);
 			if (pte == 0 || !pte_present(__pte(pte))) {
 				struct page *p = pfn_to_page(r1 >> PAGE_SHIFT);
-				pr_err("page refcount: %d, mapcount: %d\n",
-						page_count(p),
-						page_mapcount(p));
-				if (!page_count(p))
+				struct folio *f = page_folio(p);
+
+				pr_err("folio refcount: %d, mapcount: %d\n",
+					folio_ref_count(f), folio_mapcount(f));
+				if (!folio_ref_count(f))
 					rc |= TLB_INSANE;
-				else if (page_mapcount(p))
+				else if (folio_mapped(f))
 					rc |= TLB_SUSPICIOUS;
 			} else {
 				rc |= TLB_INSANE;
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v1 17/18] mm/debug: print only page mapcount (excluding folio entire mapcount) in __dump_folio()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (15 preceding siblings ...)
  2024-04-09 19:22 ` [PATCH v1 16/18] xtensa/mm: convert check_tlb_entry() to sanity check folios David Hildenbrand
@ 2024-04-09 19:23 ` David Hildenbrand
  2024-04-09 19:23 ` [PATCH v1 18/18] Documentation/admin-guide/cgroup-v1/memory.rst: don't reference page_mapcount() David Hildenbrand
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

Let's simplify and only print the page mapcount: we already print the
large folio mapcount and the entire folio mapcount for large folios
separately; that should be sufficient to figure out what's happening.

While at it, print the page mapcount also if it had an underflow,
filtering out only typed pages.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/debug.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/mm/debug.c b/mm/debug.c
index d064db42af54..69e524c3e601 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -55,15 +55,10 @@ static void __dump_folio(struct folio *folio, struct page *page,
 		unsigned long pfn, unsigned long idx)
 {
 	struct address_space *mapping = folio_mapping(folio);
-	int mapcount = atomic_read(&page->_mapcount) + 1;
+	int mapcount = atomic_read(&page->_mapcount);
 	char *type = "";
 
-	/* Open-code page_mapcount() to avoid looking up a stale folio */
-	if (mapcount < 0)
-		mapcount = 0;
-	if (folio_test_large(folio))
-		mapcount += folio_entire_mapcount(folio);
-
+	mapcount = page_type_has_type(mapcount) ? 0 : mapcount + 1;
 	pr_warn("page: refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n",
 			folio_ref_count(folio), mapcount, mapping,
 			folio->index + idx, pfn);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v1 18/18] Documentation/admin-guide/cgroup-v1/memory.rst: don't reference page_mapcount()
  2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
                   ` (16 preceding siblings ...)
  2024-04-09 19:23 ` [PATCH v1 17/18] mm/debug: print only page mapcount (excluding folio entire mapcount) in __dump_folio() David Hildenbrand
@ 2024-04-09 19:23 ` David Hildenbrand
  17 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-09 19:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, David Hildenbrand, Andrew Morton,
	Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

Let's stop talking about page_mapcount().

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 46110e6a31bb..9cde26d33843 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -802,8 +802,8 @@ a page or a swap can be moved only when it is charged to the task's current
 |   | anonymous pages, file pages (and swaps) in the range mmapped by the task |
 |   | will be moved even if the task hasn't done page fault, i.e. they might   |
 |   | not be the task's "RSS", but other task's "RSS" that maps the same file. |
-|   | And mapcount of the page is ignored (the page can be moved even if       |
-|   | page_mapcount(page) > 1). You must enable Swap Extension (see 2.4) to    |
+|   | The mapcount of the page is ignored (the page can be moved independent   |
+|   | of the mapcount). You must enable Swap Extension (see 2.4) to            |
 |   | enable move of swap charges.                                             |
 +---+--------------------------------------------------------------------------+
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again
  2024-04-09 19:22 ` [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again David Hildenbrand
@ 2024-04-09 20:06   ` Zi Yan
  2024-04-09 21:42   ` Matthew Wilcox
  2024-04-24  9:38   ` David Hildenbrand
  2 siblings, 0 replies; 43+ messages in thread
From: Zi Yan @ 2024-04-09 20:06 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linux-doc, cgroups, linux-sh,
	linux-trace-kernel, linux-fsdevel, Andrew Morton, Matthew Wilcox,
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

[-- Attachment #1: Type: text/plain, Size: 1510 bytes --]

On 9 Apr 2024, at 15:22, David Hildenbrand wrote:

> Commit 53277bcf126d ("mm: support page_mapcount() on page_has_type()
> pages") made it impossible to detect mapcount underflows by treating
> any negative raw mapcount value as a mapcount of 0.
>
> We perform such underflow checks in zap_present_folio_ptes() and
> zap_huge_pmd(), which would currently no longer trigger.
>
> Let's check against PAGE_MAPCOUNT_RESERVE instead by using
> page_type_has_type(), like page_has_type() would, so we can still catch
> some underflows.
>
> Fixes: 53277bcf126d ("mm: support page_mapcount() on page_has_type() pages")
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  include/linux/mm.h | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ef34cf54c14f..0fb8a40f82dd 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1229,11 +1229,10 @@ static inline void page_mapcount_reset(struct page *page)
>   */
>  static inline int page_mapcount(struct page *page)
>  {
> -	int mapcount = atomic_read(&page->_mapcount) + 1;
> +	int mapcount = atomic_read(&page->_mapcount);
>
>  	/* Handle page_has_type() pages */
> -	if (mapcount < 0)
> -		mapcount = 0;
> +	mapcount = page_type_has_type(mapcount) ? 0 : mapcount + 1;
>  	if (unlikely(PageCompound(page)))
>  		mapcount += folio_entire_mapcount(page_folio(page));

LGTM. This could be picked up separately. Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 04/18] mm: track mapcount of large folios in single value
  2024-04-09 19:22 ` [PATCH v1 04/18] mm: track mapcount of large folios in single value David Hildenbrand
@ 2024-04-09 20:13   ` Zi Yan
  2024-04-10  8:20     ` David Hildenbrand
  2024-04-18 14:50   ` Lance Yang
  2024-04-19 14:02   ` Yin, Fengwei
  2 siblings, 1 reply; 43+ messages in thread
From: Zi Yan @ 2024-04-09 20:13 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linux-doc, cgroups, linux-sh,
	linux-trace-kernel, linux-fsdevel, Andrew Morton, Matthew Wilcox,
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

[-- Attachment #1: Type: text/plain, Size: 2186 bytes --]

On 9 Apr 2024, at 15:22, David Hildenbrand wrote:

> Let's track the mapcount of large folios in a single value. The mapcount of
> a large folio currently corresponds to the sum of the entire mapcount and
> all page mapcounts.
>
> This sum is what we actually want to know in folio_mapcount() and it is
> also sufficient for implementing folio_mapped().
>
> With PTE-mapped THP becoming more important and more widely used, we want
> to avoid looping over all pages of a folio just to obtain the mapcount
> of large folios. The comment "In the common case, avoid the loop when no
> pages mapped by PTE" in folio_total_mapcount() does no longer hold for
> mTHP that are always mapped by PTE.
>
> Further, we are planning on using folio_mapcount() more
> frequently, and might even want to remove page mapcounts for large
> folios in some kernel configs. Therefore, allow for reading the mapcount of
> large folios efficiently and atomically without looping over any pages.
>
> Maintain the mapcount also for hugetlb pages for simplicity. Use the new
> mapcount to implement folio_mapcount() and folio_mapped(). Make
> page_mapped() simply call folio_mapped(). We can now get rid of
> folio_large_is_mapped().
>
> _nr_pages_mapped is now only used in rmap code and for debugging
> purposes. Keep folio_nr_pages_mapped() around, but document that its use
> should be limited to rmap internals and debugging purposes.
>
> This change implies one additional atomic add/sub whenever
> mapping/unmapping (parts of) a large folio.
>
> As we now batch RMAP operations for PTE-mapped THP during fork(),
> during unmap/zap, and when PTE-remapping a PMD-mapped THP, and we adjust
> the large mapcount for a PTE batch only once, the added overhead in the
> common case is small. Only when unmapping individual pages of a large folio
> (e.g., during COW), the overhead might be bigger in comparison, but it's
> essentially one additional atomic operation.
>
> Note that before the new mapcount would overflow, already our refcount
> would overflow: each mapping requires a folio reference. Extend the
> focumentation of folio_mapcount().

s/focumentation/documentation/  ;)

--
Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again
  2024-04-09 19:22 ` [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again David Hildenbrand
  2024-04-09 20:06   ` Zi Yan
@ 2024-04-09 21:42   ` Matthew Wilcox
  2024-04-10  8:10     ` David Hildenbrand
  2024-04-24  9:38   ` David Hildenbrand
  2 siblings, 1 reply; 43+ messages in thread
From: Matthew Wilcox @ 2024-04-09 21:42 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linux-doc, cgroups, linux-sh,
	linux-trace-kernel, linux-fsdevel, Andrew Morton, Peter Xu,
	Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

On Tue, Apr 09, 2024 at 09:22:44PM +0200, David Hildenbrand wrote:
> Commit 53277bcf126d ("mm: support page_mapcount() on page_has_type()
> pages") made it impossible to detect mapcount underflows by treating
> any negative raw mapcount value as a mapcount of 0.

Yes, but I don't think this is the right place to check for underflow.
We should be checking for that on modification, not on read.  I think
it's more important for page_mapcount() to be fast than a debugging aid.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again
  2024-04-09 21:42   ` Matthew Wilcox
@ 2024-04-10  8:10     ` David Hildenbrand
  0 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-10  8:10 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, linux-mm, linux-doc, cgroups, linux-sh,
	linux-trace-kernel, linux-fsdevel, Andrew Morton, Peter Xu,
	Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

On 09.04.24 23:42, Matthew Wilcox wrote:
> On Tue, Apr 09, 2024 at 09:22:44PM +0200, David Hildenbrand wrote:
>> Commit 53277bcf126d ("mm: support page_mapcount() on page_has_type()
>> pages") made it impossible to detect mapcount underflows by treating
>> any negative raw mapcount value as a mapcount of 0.
> 
> Yes, but I don't think this is the right place to check for underflow.
> We should be checking for that on modification, not on read.

While I don't disagree (and we'd check more instances that way, for example
deferred rmap removal), that requires a bit more churn, and we'd have to
figure out whether losing some of the information we would have printed in
print_bad_pte() is worth that change.

> I think
> it's more important for page_mapcount() to be fast than a debugging aid.

I really don't think page_mapcount() is a good use of time for
micro-optimizations, but let's investigate:

A big hunk of code in page_mapcount() seems to be the compound handling.
The code before that (reading mapcount, checking for the condition,
conditionally setting it to 0), would generate right now:

  177:	8b 42 30             	mov    0x30(%rdx),%eax
  17a:   b9 00 00 00 00          mov    $0x0,%ecx
  17f:	83 c0 01             	add    $0x1,%eax
  182:	0f 48 c1             	cmovs  %ecx,%eax

My variant is longer:

  17b:	8b 4a 30             	mov    0x30(%rdx),%ecx
  17e:	81 f9 7f ff ff ff    	cmp    $0xffffff7f,%ecx
  184:	8d 41 01             	lea    0x1(%rcx),%eax
  187:	b9 00 00 00 00       	mov    $0x0,%ecx
  18c:	0f 4e c1             	cmovle %ecx,%eax
  18f:	48 8b 0a             	mov    (%rdx),%rcx

The compiler does not seem to do the smart thing, which would
be rearranging the code to effectively be:

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef34cf54c14f..7392596882ae 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1232,7 +1232,7 @@ static inline int page_mapcount(struct page *page)
         int mapcount = atomic_read(&page->_mapcount) + 1;
  
         /* Handle page_has_type() pages */
-       if (mapcount < 0)
+       if (mapcount < PAGE_MAPCOUNT_RESERVE + 1)
                 mapcount = 0;
         if (unlikely(PageCompound(page)))
                 mapcount += folio_entire_mapcount(page_folio(page));


Which would result in:

  177:   8b 42 30                mov    0x30(%rdx),%eax
  17a:   31 c9                   xor    %ecx,%ecx
  17c:   83 c0 01                add    $0x1,%eax
  17f:   83 f8 80                cmp    $0xffffff80,%eax
  182:   0f 4e c1                cmovle %ecx,%eax


Same code length, one more instruction. No jumps.
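
The equivalence of the two C variants can be sanity-checked in userspace.
The following is only a minimal sketch: it assumes PAGE_MAPCOUNT_RESERVE
is -128, models page_type_has_type() as a plain signed comparison, leaves
out the compound handling, and uses made-up helper names (mapcount_v1(),
mapcount_rearranged(), mapcount_clamp_negative()).

```c
#include <assert.h>

/* Assumption: mirrors the kernel's value at the time of this series. */
#define PAGE_MAPCOUNT_RESERVE (-128)

/* Userspace model: raw values below the reserve encode a page type. */
static int page_type_has_type(int raw)
{
	return raw < PAGE_MAPCOUNT_RESERVE;
}

/* Variant from patch 01: decide on the raw value, then add 1. */
static int mapcount_v1(int raw)
{
	return page_type_has_type(raw) ? 0 : raw + 1;
}

/* Rearranged variant: add 1 first, then compare against the shifted bound. */
static int mapcount_rearranged(int raw)
{
	int mapcount = raw + 1;

	if (mapcount < PAGE_MAPCOUNT_RESERVE + 1)
		mapcount = 0;
	return mapcount;
}

/* Behavior before patch 01: any negative result is clamped to 0. */
static int mapcount_clamp_negative(int raw)
{
	int mapcount = raw + 1;

	return mapcount < 0 ? 0 : mapcount;
}

/* The two new variants must agree for raw values around the reserve. */
static void check_variants_agree(void)
{
	for (int raw = -1000; raw <= 1000; raw++)
		assert(mapcount_v1(raw) == mapcount_rearranged(raw));
}
```

For a raw value of -3 (an underflow by two), both new variants return -2,
so an underflow check in a caller such as zap_present_folio_ptes() can
still trigger, whereas the clamping variant returns 0 and hides it.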


I can switch to the above (essentially inlining
page_type_has_type()) for now and look into different sanity checks --
and extending the documentation around page_mapcount() behavior for
underflows -- separately.

... unless you insist that we really have to change that immediately.

Thanks!

-- 
Cheers,

David / dhildenb


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 04/18] mm: track mapcount of large folios in single value
  2024-04-09 20:13   ` Zi Yan
@ 2024-04-10  8:20     ` David Hildenbrand
  0 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-10  8:20 UTC (permalink / raw)
  To: Zi Yan
  Cc: linux-kernel, linux-mm, linux-doc, cgroups, linux-sh,
	linux-trace-kernel, linux-fsdevel, Andrew Morton, Matthew Wilcox,
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

On 09.04.24 22:13, Zi Yan wrote:
> On 9 Apr 2024, at 15:22, David Hildenbrand wrote:
> 
>> Let's track the mapcount of large folios in a single value. The mapcount of
>> a large folio currently corresponds to the sum of the entire mapcount and
>> all page mapcounts.
>>
>> This sum is what we actually want to know in folio_mapcount() and it is
>> also sufficient for implementing folio_mapped().
>>
>> With PTE-mapped THP becoming more important and more widely used, we want
>> to avoid looping over all pages of a folio just to obtain the mapcount
>> of large folios. The comment "In the common case, avoid the loop when no
>> pages mapped by PTE" in folio_total_mapcount() does no longer hold for
>> mTHP that are always mapped by PTE.
>>
>> Further, we are planning on using folio_mapcount() more
>> frequently, and might even want to remove page mapcounts for large
>> folios in some kernel configs. Therefore, allow for reading the mapcount of
>> large folios efficiently and atomically without looping over any pages.
>>
>> Maintain the mapcount also for hugetlb pages for simplicity. Use the new
>> mapcount to implement folio_mapcount() and folio_mapped(). Make
>> page_mapped() simply call folio_mapped(). We can now get rid of
>> folio_large_is_mapped().
>>
>> _nr_pages_mapped is now only used in rmap code and for debugging
>> purposes. Keep folio_nr_pages_mapped() around, but document that its use
>> should be limited to rmap internals and debugging purposes.
>>
>> This change implies one additional atomic add/sub whenever
>> mapping/unmapping (parts of) a large folio.
>>
>> As we now batch RMAP operations for PTE-mapped THP during fork(),
>> during unmap/zap, and when PTE-remapping a PMD-mapped THP, and we adjust
>> the large mapcount for a PTE batch only once, the added overhead in the
>> common case is small. Only when unmapping individual pages of a large folio
>> (e.g., during COW), the overhead might be bigger in comparison, but it's
>> essentially one additional atomic operation.
>>
>> Note that before the new mapcount would overflow, already our refcount
>> would overflow: each mapping requires a folio reference. Extend the
>> focumentation of folio_mapcount().
> 
> s/focumentation/documentation/  ;)

Thanks! :)

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-09 19:22 ` [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios David Hildenbrand
@ 2024-04-16 10:40   ` Lance Yang
  2024-04-16 10:47     ` David Hildenbrand
  2024-04-19  2:29   ` Yin, Fengwei
  2024-04-19 14:03   ` Yin, Fengwei
  2 siblings, 1 reply; 43+ messages in thread
From: Lance Yang @ 2024-04-16 10:40 UTC (permalink / raw)
  To: david
  Cc: akpm, cgroups, chris, corbet, dalias, fengwei.yin, glaubitz,
	hughd, jcmvbkbc, linmiaohe, linux-doc, linux-fsdevel,
	linux-kernel, linux-mm, linux-sh, linux-trace-kernel,
	muchun.song, naoya.horiguchi, peterx, richardycc, ryan.roberts,
	shy828301, willy, ysato, ziy, Lance Yang

Hey David,

Maybe I spotted a bug below.

[...]
 static inline bool folio_likely_mapped_shared(struct folio *folio)
 {
-	return page_mapcount(folio_page(folio, 0)) > 1;
+	int mapcount = folio_mapcount(folio);
+
+	/* Only partially-mappable folios require more care. */
+	if (!folio_test_large(folio) || unlikely(folio_test_hugetlb(folio)))
+		return mapcount > 1;
+
+	/* A single mapping implies "mapped exclusively". */
+	if (mapcount <= 1)
+		return false;
+
+	/* If any page is mapped more than once we treat it "mapped shared". */
+	if (folio_entire_mapcount(folio) || mapcount > folio_nr_pages(folio))
+		return true;

bug: if a PMD-mapped THP is exclusively mapped, the folio_entire_mapcount()
function will return 1 (atomic_read(&folio->_entire_mapcount) + 1).

IIUC, when mapping a PMD entry for the entire THP, folio->_entire_mapcount
increments from -1 to 0.

Thanks,
Lance

+
+	/* Let's guess based on the first subpage. */
+	return atomic_read(&folio->_mapcount) > 0;
 }
[...]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-16 10:40   ` Lance Yang
@ 2024-04-16 10:47     ` David Hildenbrand
  2024-04-16 10:52       ` Lance Yang
  0 siblings, 1 reply; 43+ messages in thread
From: David Hildenbrand @ 2024-04-16 10:47 UTC (permalink / raw)
  To: Lance Yang
  Cc: akpm, cgroups, chris, corbet, dalias, fengwei.yin, glaubitz,
	hughd, jcmvbkbc, linmiaohe, linux-doc, linux-fsdevel,
	linux-kernel, linux-mm, linux-sh, linux-trace-kernel,
	muchun.song, naoya.horiguchi, peterx, richardycc, ryan.roberts,
	shy828301, willy, ysato, ziy

On 16.04.24 12:40, Lance Yang wrote:
> Hey David,
> 
> Maybe I spotted a bug below.

Thanks for the review!

> 
> [...]
>   static inline bool folio_likely_mapped_shared(struct folio *folio)
>   {
> -	return page_mapcount(folio_page(folio, 0)) > 1;
> +	int mapcount = folio_mapcount(folio);
> +
> +	/* Only partially-mappable folios require more care. */
> +	if (!folio_test_large(folio) || unlikely(folio_test_hugetlb(folio)))
> +		return mapcount > 1;
> +
> +	/* A single mapping implies "mapped exclusively". */
> +	if (mapcount <= 1)
> +		return false;
> +
> +	/* If any page is mapped more than once we treat it "mapped shared". */
> +	if (folio_entire_mapcount(folio) || mapcount > folio_nr_pages(folio))
> +		return true;
> 
> bug: if a PMD-mapped THP is exclusively mapped, the folio_entire_mapcount()
> function will return 1 (atomic_read(&folio->_entire_mapcount) + 1).

If it's exclusively mapped, then folio_mapcount(folio)==1. In which case 
the previous statement:

if (mapcount <= 1)
	return false;

Catches it.

IOW, once we reach this point we know that folio_mapcount(folio) > 1, and 
there must be something else besides the entire mapping ("more than once").
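
The control flow can be replayed with a small userspace model. This is
only a sketch under assumptions: struct folio_model and the helpers
likely_mapped_shared() and thp_shared() are made up for illustration, the
fields hold the values the kernel helpers would return, and a THP is
assumed to have 512 pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the values the function reads from a folio. */
struct folio_model {
	bool large;          /* folio_test_large() */
	bool hugetlb;        /* folio_test_hugetlb() */
	int nr_pages;        /* folio_nr_pages() */
	int mapcount;        /* folio_mapcount(): entire + all page mappings */
	int entire_mapcount; /* folio_entire_mapcount(), already includes the +1 */
	int first_page_raw;  /* raw _mapcount of the first page, -1 == unmapped */
};

static bool likely_mapped_shared(const struct folio_model *f)
{
	/* Only partially-mappable folios require more care. */
	if (!f->large || f->hugetlb)
		return f->mapcount > 1;

	/* A single mapping implies "mapped exclusively". */
	if (f->mapcount <= 1)
		return false;

	/* Only reached with mapcount > 1: an entire mapping plus something else. */
	if (f->entire_mapcount || f->mapcount > f->nr_pages)
		return true;

	/* Guess based on the first subpage. */
	return f->first_page_raw > 0;
}

/* Convenience wrapper: evaluate a 512-page, non-hugetlb THP. */
static bool thp_shared(int mapcount, int entire_mapcount, int first_page_raw)
{
	struct folio_model f = {
		.large = true,
		.hugetlb = false,
		.nr_pages = 512,
		.mapcount = mapcount,
		.entire_mapcount = entire_mapcount,
		.first_page_raw = first_page_raw,
	};

	return likely_mapped_shared(&f);
}
```

An exclusively PMD-mapped THP is thp_shared(1, 1, -1): it returns false
at the "mapcount <= 1" check and never reaches the
folio_entire_mapcount() test.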


Or did I not address your concern?

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-16 10:47     ` David Hildenbrand
@ 2024-04-16 10:52       ` Lance Yang
  2024-04-16 10:53         ` David Hildenbrand
  0 siblings, 1 reply; 43+ messages in thread
From: Lance Yang @ 2024-04-16 10:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: akpm, cgroups, chris, corbet, dalias, fengwei.yin, glaubitz,
	hughd, jcmvbkbc, linmiaohe, linux-doc, linux-fsdevel,
	linux-kernel, linux-mm, linux-sh, linux-trace-kernel,
	muchun.song, naoya.horiguchi, peterx, richardycc, ryan.roberts,
	shy828301, willy, ysato, ziy

On Tue, Apr 16, 2024 at 6:47 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 16.04.24 12:40, Lance Yang wrote:
> > Hey David,
> >
> > Maybe I spotted a bug below.
>
> Thanks for the review!
>
> >
> > [...]
> >   static inline bool folio_likely_mapped_shared(struct folio *folio)
> >   {
> > -     return page_mapcount(folio_page(folio, 0)) > 1;
> > +     int mapcount = folio_mapcount(folio);
> > +
> > +     /* Only partially-mappable folios require more care. */
> > +     if (!folio_test_large(folio) || unlikely(folio_test_hugetlb(folio)))
> > +             return mapcount > 1;
> > +
> > +     /* A single mapping implies "mapped exclusively". */
> > +     if (mapcount <= 1)
> > +             return false;
> > +
> > +     /* If any page is mapped more than once we treat it "mapped shared". */
> > +     if (folio_entire_mapcount(folio) || mapcount > folio_nr_pages(folio))
> > +             return true;
> >
> > bug: if a PMD-mapped THP is exclusively mapped, the folio_entire_mapcount()
> > function will return 1 (atomic_read(&folio->_entire_mapcount) + 1).
>
> If it's exclusively mapped, then folio_mapcount(folio)==1. In which case
> the previous statement:
>
> if (mapcount <= 1)
>         return false;
>
> Catches it.

You're right!

>
> IOW, once we reach this point we know that folio_mapcount(folio) > 1, and
> there must be something else besides the entire mapping ("more than once").
>
>
> Or did I not address your concern?

Sorry, my mistake :(

Thanks,
Lance

>
> --
> Cheers,
>
> David / dhildenb
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-16 10:52       ` Lance Yang
@ 2024-04-16 10:53         ` David Hildenbrand
  0 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-16 10:53 UTC (permalink / raw)
  To: Lance Yang
  Cc: akpm, cgroups, chris, corbet, dalias, fengwei.yin, glaubitz,
	hughd, jcmvbkbc, linmiaohe, linux-doc, linux-fsdevel,
	linux-kernel, linux-mm, linux-sh, linux-trace-kernel,
	muchun.song, naoya.horiguchi, peterx, richardycc, ryan.roberts,
	shy828301, willy, ysato, ziy

On 16.04.24 12:52, Lance Yang wrote:
> On Tue, Apr 16, 2024 at 6:47 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 16.04.24 12:40, Lance Yang wrote:
>>> Hey David,
>>>
>>> Maybe I spotted a bug below.
>>
>> Thanks for the review!
>>
>>>
>>> [...]
>>>    static inline bool folio_likely_mapped_shared(struct folio *folio)
>>>    {
>>> -     return page_mapcount(folio_page(folio, 0)) > 1;
>>> +     int mapcount = folio_mapcount(folio);
>>> +
>>> +     /* Only partially-mappable folios require more care. */
>>> +     if (!folio_test_large(folio) || unlikely(folio_test_hugetlb(folio)))
>>> +             return mapcount > 1;
>>> +
>>> +     /* A single mapping implies "mapped exclusively". */
>>> +     if (mapcount <= 1)
>>> +             return false;
>>> +
>>> +     /* If any page is mapped more than once we treat it "mapped shared". */
>>> +     if (folio_entire_mapcount(folio) || mapcount > folio_nr_pages(folio))
>>> +             return true;
>>>
>>> bug: if a PMD-mapped THP is exclusively mapped, the folio_entire_mapcount()
>>> function will return 1 (atomic_read(&folio->_entire_mapcount) + 1).
>>
>> If it's exclusively mapped, then folio_mapcount(folio)==1. In which case
>> the previous statement:
>>
>> if (mapcount <= 1)
>>          return false;
>>
>> Catches it.
> 
> You're right!
> 
>>
>> IOW, once we reach this point we know that folio_mapcount(folio) > 1, and
>> there must be something else besides the entire mapping ("more than once").
>>
>>
>> Or did I not address your concern?
> 
> Sorry, my mistake :(

No worries, thanks for the review and thinking this through!

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 04/18] mm: track mapcount of large folios in single value
  2024-04-09 19:22 ` [PATCH v1 04/18] mm: track mapcount of large folios in single value David Hildenbrand
  2024-04-09 20:13   ` Zi Yan
@ 2024-04-18 14:50   ` Lance Yang
  2024-04-18 15:09     ` David Hildenbrand
  2024-04-19 14:02   ` Yin, Fengwei
  2 siblings, 1 reply; 43+ messages in thread
From: Lance Yang @ 2024-04-18 14:50 UTC (permalink / raw)
  To: david
  Cc: akpm, cgroups, chris, corbet, dalias, fengwei.yin, glaubitz,
	hughd, jcmvbkbc, linmiaohe, linux-doc, linux-fsdevel,
	linux-kernel, linux-mm, linux-sh, linux-trace-kernel,
	muchun.song, naoya.horiguchi, peterx, richardycc, ryan.roberts,
	shy828301, willy, ysato, ziy

Hey David,

FWIW, just a nit below.

diff --git a/mm/rmap.c b/mm/rmap.c
index 2608c40dffad..08bb6834cf72 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1143,7 +1143,6 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
 		int *nr_pmdmapped)
 {
 	atomic_t *mapped = &folio->_nr_pages_mapped;
-	const int orig_nr_pages = nr_pages;
 	int first, nr = 0;
 
 	__folio_rmap_sanity_checks(folio, page, nr_pages, level);
@@ -1155,6 +1154,7 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
 			break;
 		}
 
+		atomic_add(nr_pages, &folio->_large_mapcount);
 		do {
 			first = atomic_inc_and_test(&page->_mapcount);
 			if (first) {
@@ -1163,7 +1163,6 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
 					nr++;
 			}
 		} while (page++, --nr_pages > 0);
-		atomic_add(orig_nr_pages, &folio->_large_mapcount);
 		break;
 	case RMAP_LEVEL_PMD:
 		first = atomic_inc_and_test(&folio->_entire_mapcount);

Thanks,
Lance

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 04/18] mm: track mapcount of large folios in single value
  2024-04-18 14:50   ` Lance Yang
@ 2024-04-18 15:09     ` David Hildenbrand
  2024-04-19  0:31       ` Lance Yang
  0 siblings, 1 reply; 43+ messages in thread
From: David Hildenbrand @ 2024-04-18 15:09 UTC (permalink / raw)
  To: Lance Yang
  Cc: akpm, cgroups, chris, corbet, dalias, fengwei.yin, glaubitz,
	hughd, jcmvbkbc, linmiaohe, linux-doc, linux-fsdevel,
	linux-kernel, linux-mm, linux-sh, linux-trace-kernel,
	muchun.song, naoya.horiguchi, peterx, richardycc, ryan.roberts,
	shy828301, willy, ysato, ziy

On 18.04.24 16:50, Lance Yang wrote:
> Hey David,
> 
> FWIW, just a nit below.

Hi!

Thanks, but that was done on purpose.

This way, we'll have a memory barrier (due to at least one 
atomic_inc_and_test()) between incrementing the folio refcount 
(happening before the rmap change) and incrementing the mapcount.

Is it required? Not 100% sure, refcount vs. mapcount checks are always a 
bit racy. But doing it this way lets me sleep better at night ;)

[with no subpage mapcounts, we'd do the atomic_inc_and_test on the large 
mapcount and have the memory barrier there again; but that's stuff for 
the future]
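
The ordering described above can be approximated in userspace with C11
atomics. This is only a rough sketch: a seq_cst read-modify-write stands
in for the kernel's fully ordered atomic_inc_and_test(), a relaxed one
for the unordered atomic_add(), and struct folio_model,
folio_model_init() and add_rmap_ptes() are made-up names, not the rmap
code itself.

```c
#include <assert.h>
#include <stdatomic.h>

#define MODEL_NR_PAGES 16

struct folio_model {
	atomic_int refcount;                      /* raised by the caller before the rmap change */
	atomic_int large_mapcount;                /* raw value, -1 means "no mappings" */
	atomic_int page_mapcount[MODEL_NR_PAGES]; /* raw values, -1 means "unmapped" */
};

static void folio_model_init(struct folio_model *f)
{
	atomic_init(&f->refcount, 1);
	atomic_init(&f->large_mapcount, -1);
	for (int i = 0; i < MODEL_NR_PAGES; i++)
		atomic_init(&f->page_mapcount[i], -1);
}

/* Returns the number of pages that became mapped for the first time. */
static int add_rmap_ptes(struct folio_model *f, int first, int nr_pages)
{
	const int orig_nr_pages = nr_pages;
	int idx = first, nr = 0;

	do {
		/* Stand-in for atomic_inc_and_test(): ordered RMW, true on -1 -> 0. */
		if (atomic_fetch_add_explicit(&f->page_mapcount[idx], 1,
					      memory_order_seq_cst) == -1)
			nr++;
	} while (idx++, --nr_pages > 0);

	/* Stand-in for atomic_add(): unordered, placed after the ordered RMWs. */
	atomic_fetch_add_explicit(&f->large_mapcount, orig_nr_pages,
				  memory_order_relaxed);
	return nr;
}
```

Moving the relaxed add in front of the loop, as the nit suggests, would
leave no ordered operation between the caller's refcount increment and
the large mapcount update in this model.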

Thanks!

> 
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 2608c40dffad..08bb6834cf72 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1143,7 +1143,6 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
>   		int *nr_pmdmapped)
>   {
>   	atomic_t *mapped = &folio->_nr_pages_mapped;
> -	const int orig_nr_pages = nr_pages;
>   	int first, nr = 0;
>   
>   	__folio_rmap_sanity_checks(folio, page, nr_pages, level);
> @@ -1155,6 +1154,7 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
>   			break;
>   		}
>   
> +		atomic_add(nr_pages, &folio->_large_mapcount);
>   		do {
>   			first = atomic_inc_and_test(&page->_mapcount);
>   			if (first) {
> @@ -1163,7 +1163,6 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
>   					nr++;
>   			}
>   		} while (page++, --nr_pages > 0);
> -		atomic_add(orig_nr_pages, &folio->_large_mapcount);
>   		break;
>   	case RMAP_LEVEL_PMD:
>   		first = atomic_inc_and_test(&folio->_entire_mapcount);
> 
> Thanks,
> Lance
> 

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 04/18] mm: track mapcount of large folios in single value
  2024-04-18 15:09     ` David Hildenbrand
@ 2024-04-19  0:31       ` Lance Yang
  0 siblings, 0 replies; 43+ messages in thread
From: Lance Yang @ 2024-04-19  0:31 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: akpm, cgroups, chris, corbet, dalias, fengwei.yin, glaubitz,
	hughd, jcmvbkbc, linmiaohe, linux-doc, linux-fsdevel,
	linux-kernel, linux-mm, linux-sh, linux-trace-kernel,
	muchun.song, naoya.horiguchi, peterx, richardycc, ryan.roberts,
	shy828301, willy, ysato, ziy

On Thu, Apr 18, 2024 at 11:09 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 18.04.24 16:50, Lance Yang wrote:
> > Hey David,
> >
> > FWIW, just a nit below.
>
> Hi!
>

Thanks for clarifying!

> Thanks, but that was done on purpose.
>
> This way, we'll have a memory barrier (due to at least one
> atomic_inc_and_test()) between incrementing the folio refcount
> (happening before the rmap change) and incrementing the mapcount.
>
> Is it required? Not 100% sure, refcount vs. mapcount checks are always a
> bit racy. But doing it this way let me sleep better at night ;)

Yep, I understood :)

Thanks,
Lance

>
> [with no subpage mapcounts, we'd do the atomic_inc_and_test on the large
> mapcount and have the memory barrier there again; but that's stuff for
> the future]
>
> Thanks!



>
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 2608c40dffad..08bb6834cf72 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1143,7 +1143,6 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
> >               int *nr_pmdmapped)
> >   {
> >       atomic_t *mapped = &folio->_nr_pages_mapped;
> > -     const int orig_nr_pages = nr_pages;
> >       int first, nr = 0;
> >
> >       __folio_rmap_sanity_checks(folio, page, nr_pages, level);
> > @@ -1155,6 +1154,7 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
> >                       break;
> >               }
> >
> > +             atomic_add(nr_pages, &folio->_large_mapcount);
> >               do {
> >                       first = atomic_inc_and_test(&page->_mapcount);
> >                       if (first) {
> > @@ -1163,7 +1163,6 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
> >                                       nr++;
> >                       }
> >               } while (page++, --nr_pages > 0);
> > -             atomic_add(orig_nr_pages, &folio->_large_mapcount);
> >               break;
> >       case RMAP_LEVEL_PMD:
> >               first = atomic_inc_and_test(&folio->_entire_mapcount);
> >
> > Thanks,
> > Lance
> >
>
> --
> Cheers,
>
> David / dhildenb
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE
  2024-04-09 19:22 ` [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE David Hildenbrand
@ 2024-04-19  2:25   ` Yin, Fengwei
  2024-04-19  9:14     ` David Hildenbrand
  2024-04-19 14:01   ` Yin, Fengwei
  1 sibling, 1 reply; 43+ messages in thread
From: Yin, Fengwei @ 2024-04-19  2:25 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang



On 4/10/2024 3:22 AM, David Hildenbrand wrote:
> As we grow the code, the compiler might make stupid decisions and
> unnecessarily degrade fork() performance. Let's make sure to always inline
> functions that operate on a single PTE so the compiler will always
> optimize out the loop and avoid a function call.
> 
> This is a preparation for maintining a total mapcount for large folios.
> 
> Signed-off-by: David Hildenbrand<david@redhat.com>
The patch looks good to me. Just curious: Was this change driven by code
review or by performance profiling? Thanks.


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-09 19:22 ` [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios David Hildenbrand
  2024-04-16 10:40   ` Lance Yang
@ 2024-04-19  2:29   ` Yin, Fengwei
  2024-04-19  9:19     ` David Hildenbrand
  2024-04-19 14:03   ` Yin, Fengwei
  2 siblings, 1 reply; 43+ messages in thread
From: Yin, Fengwei @ 2024-04-19  2:29 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang



On 4/10/2024 3:22 AM, David Hildenbrand wrote:
> @@ -2200,7 +2200,22 @@ static inline size_t folio_size(struct folio *folio)
>    */
>   static inline bool folio_likely_mapped_shared(struct folio *folio)
>   {
> -	return page_mapcount(folio_page(folio, 0)) > 1;
> +	int mapcount = folio_mapcount(folio);
> +
> +	/* Only partially-mappable folios require more care. */
> +	if (!folio_test_large(folio) || unlikely(folio_test_hugetlb(folio)))
> +		return mapcount > 1;
My understanding is that mapcount > folio_nr_pages(folio) can cover
order-0 folios, and that folio_entire_mapcount() can cover hugetlb (I am
not 100% sure about that one). I am wondering whether we can drop the
above two lines? Thanks.


Regards
Yin, Fengwei

> +
> +	/* A single mapping implies "mapped exclusively". */
> +	if (mapcount <= 1)
> +		return false;
> +
> +	/* If any page is mapped more than once we treat it "mapped shared". */
> +	if (folio_entire_mapcount(folio) || mapcount > folio_nr_pages(folio))
> +		return true;
> +
> +	/* Let's guess based on the first subpage. */
> +	return atomic_read(&folio->_mapcount) > 0;
>   }


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE
  2024-04-19  2:25   ` Yin, Fengwei
@ 2024-04-19  9:14     ` David Hildenbrand
  0 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-19  9:14 UTC (permalink / raw)
  To: Yin, Fengwei, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

On 19.04.24 04:25, Yin, Fengwei wrote:
> 
> 
> On 4/10/2024 3:22 AM, David Hildenbrand wrote:
>> As we grow the code, the compiler might make stupid decisions and
>> unnecessarily degrade fork() performance. Let's make sure to always inline
>> functions that operate on a single PTE so the compiler will always
>> optimize out the loop and avoid a function call.
>>
>> This is a preparation for maintining a total mapcount for large folios.
>>
>> Signed-off-by: David Hildenbrand<david@redhat.com>
> The patch looks good to me. Just curious: Is this change driven by code
> reviewing or performance data profiling? Thanks.

It was identified while observing a performance degradation with small 
folios in the fork() microbenchmark discussed in the cover letter 
(mentioned here as "unnecessarily degrade fork() performance").

The added atomic_add() was sufficient for the compiler to stop inlining 
and optimizing out nr_pages; instead, it inserted a call to a function in 
which nr_pages is not optimized out.
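
As a small illustration of the effect, consider the following userspace sketch. The model_* names and the plain int counters are made up for this example, and the attribute spelling assumes GCC or Clang. When the batched helper is forced inline into a wrapper that passes a constant 1, the compiler can drop the loop and the call entirely.

```c
/* GCC/Clang spelling; the kernel's __always_inline expands similarly. */
#define MODEL_ALWAYS_INLINE inline __attribute__((always_inline))

/* Hypothetical counters standing in for the folio and page mapcounts. */
struct model_counts {
	int large_mapcount;
	int page_mapcount[8];
};

/* Batched helper: handles nr_pages consecutive entries. */
static MODEL_ALWAYS_INLINE void model_dup_ptes(struct model_counts *c,
					       int first_idx, int nr_pages)
{
	const int orig_nr_pages = nr_pages;
	int i = first_idx;

	do {
		c->page_mapcount[i]++;
	} while (i++, --nr_pages > 0);

	/* The extra add that made the non-inlined variant attractive. */
	c->large_mapcount += orig_nr_pages;
}

/*
 * Single-entry wrapper: nr_pages is the constant 1, so once the helper
 * is inlined the loop collapses to straight-line code.
 */
static MODEL_ALWAYS_INLINE void model_dup_pte(struct model_counts *c, int idx)
{
	model_dup_ptes(c, idx, 1);
}
```

Comparing the generated assembly with and without the attribute (for example with objdump -d) is one way to check whether a call is actually emitted for a given compiler.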

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-19  2:29   ` Yin, Fengwei
@ 2024-04-19  9:19     ` David Hildenbrand
  2024-04-19 13:47       ` Yin, Fengwei
  0 siblings, 1 reply; 43+ messages in thread
From: David Hildenbrand @ 2024-04-19  9:19 UTC (permalink / raw)
  To: Yin, Fengwei, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

On 19.04.24 04:29, Yin, Fengwei wrote:
> 
> 
> On 4/10/2024 3:22 AM, David Hildenbrand wrote:
>> @@ -2200,7 +2200,22 @@ static inline size_t folio_size(struct folio *folio)
>>     */
>>    static inline bool folio_likely_mapped_shared(struct folio *folio)
>>    {
>> -	return page_mapcount(folio_page(folio, 0)) > 1;
>> +	int mapcount = folio_mapcount(folio);
>> +
>> +	/* Only partially-mappable folios require more care. */
>> +	if (!folio_test_large(folio) || unlikely(folio_test_hugetlb(folio)))
>> +		return mapcount > 1;
> My understanding is that mapcount > folio_nr_pages(folio) can cover
> order 0 folio. And also folio_entire_mapcount() can cover hugetlb (I am
> not 100% sure for this one).  I am wondering whether we can drop above
> two lines? Thanks.

folio_entire_mapcount() does not apply to small folios, so we must not 
call that for small folios.

Regarding hugetlb, subpage mapcounts are completely unused, except for the 
mapcount of subpage 0, which is now *always* negative (it stores a page 
type), so that value cannot be trusted at all.

So in the end, it all looked cleanest to special-case only 
partially-mappable folios, where we know the entire mapcount exists and 
we know that the mapcount of subpage 0 actually stores something 
reasonable (not a type).
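
The decision logic can be played through in userspace with the sketch below. The struct and its field names are hypothetical stand-ins for what folio_mapcount(), folio_entire_mapcount() and the raw _mapcount of subpage 0 would provide; only the control flow mirrors the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values the real helper would read. */
struct model_folio_state {
	bool large;
	bool hugetlb;
	int nr_pages;
	int mapcount;			/* total, as folio_mapcount() */
	int entire_mapcount;		/* only valid if partially mappable */
	int first_page_raw_mapcount;	/* raw value, -1 when unmapped */
};

static bool model_likely_mapped_shared(const struct model_folio_state *f)
{
	/* Small and hugetlb folios: never look at the other fields. */
	if (!f->large || f->hugetlb)
		return f->mapcount > 1;

	/* A single mapping implies "mapped exclusively". */
	if (f->mapcount <= 1)
		return false;

	/* Some page must be mapped more than once. */
	if (f->entire_mapcount || f->mapcount > f->nr_pages)
		return true;

	/* Otherwise guess based on the first subpage. */
	return f->first_page_raw_mapcount > 0;
}
```

For instance, a hugetlb folio with mapcount 1 reports "not shared" here even if first_page_raw_mapcount holds a negative type value, because the early return never reads that field. A four-page folio whose second page is mapped twice (total mapcount 2, first page unmapped) still yields a false negative.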

Thanks!

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-19  9:19     ` David Hildenbrand
@ 2024-04-19 13:47       ` Yin, Fengwei
  2024-04-19 13:48         ` David Hildenbrand
  0 siblings, 1 reply; 43+ messages in thread
From: Yin, Fengwei @ 2024-04-19 13:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang



On 4/19/2024 5:19 PM, David Hildenbrand wrote:
> On 19.04.24 04:29, Yin, Fengwei wrote:
>>
>>
>> On 4/10/2024 3:22 AM, David Hildenbrand wrote:
>>> @@ -2200,7 +2200,22 @@ static inline size_t folio_size(struct folio 
>>> *folio)
>>>     */
>>>    static inline bool folio_likely_mapped_shared(struct folio *folio)
>>>    {
>>> -    return page_mapcount(folio_page(folio, 0)) > 1;
>>> +    int mapcount = folio_mapcount(folio);
>>> +
>>> +    /* Only partially-mappable folios require more care. */
>>> +    if (!folio_test_large(folio) || 
>>> unlikely(folio_test_hugetlb(folio)))
>>> +        return mapcount > 1;
>> My understanding is that mapcount > folio_nr_pages(folio) can cover
>> order 0 folio. And also folio_entire_mapcount() can cover hugetlb (I am
>> not 100% sure for this one).  I am wondering whether we can drop above
>> two lines? Thanks.
> 
> folio_entire_mapcount() does not apply to small folios, so we must not 
> call that for small folios.
Right. I missed this part. Thanks for the clarification.


Regards
Yin, Fengwei

> 
> Regarding hugetlb, subpage mapcounts are completely unused, except 
> subpage 0 mapcount, which is now *always* negative (storing a page type) 
> -- so there is no trusting on that value at all.
> 
> So in the end, it all looked cleanest when only special-casing on 
> partially-mappable folios where we know the entire mapcount exists and 
> we know that subapge mapcount 0 actually stores something reasonable 
> (not a type).
> 
> Thanks!
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-19 13:47       ` Yin, Fengwei
@ 2024-04-19 13:48         ` David Hildenbrand
  0 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-19 13:48 UTC (permalink / raw)
  To: Yin, Fengwei, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

On 19.04.24 15:47, Yin, Fengwei wrote:
> 
> 
> On 4/19/2024 5:19 PM, David Hildenbrand wrote:
>> On 19.04.24 04:29, Yin, Fengwei wrote:
>>>
>>>
>>> On 4/10/2024 3:22 AM, David Hildenbrand wrote:
>>>> @@ -2200,7 +2200,22 @@ static inline size_t folio_size(struct folio
>>>> *folio)
>>>>      */
>>>>     static inline bool folio_likely_mapped_shared(struct folio *folio)
>>>>     {
>>>> -    return page_mapcount(folio_page(folio, 0)) > 1;
>>>> +    int mapcount = folio_mapcount(folio);
>>>> +
>>>> +    /* Only partially-mappable folios require more care. */
>>>> +    if (!folio_test_large(folio) ||
>>>> unlikely(folio_test_hugetlb(folio)))
>>>> +        return mapcount > 1;
>>> My understanding is that mapcount > folio_nr_pages(folio) can cover
>>> order 0 folio. And also folio_entire_mapcount() can cover hugetlb (I am
>>> not 100% sure for this one).  I am wondering whether we can drop above
>>> two lines? Thanks.
>>
>> folio_entire_mapcount() does not apply to small folios, so we must not
>> call that for small folios.
> Right. I missed this part. Thanks for clarification.

Thanks for the review!

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE
  2024-04-09 19:22 ` [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE David Hildenbrand
  2024-04-19  2:25   ` Yin, Fengwei
@ 2024-04-19 14:01   ` Yin, Fengwei
  1 sibling, 0 replies; 43+ messages in thread
From: Yin, Fengwei @ 2024-04-19 14:01 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang



On 4/10/2024 3:22 AM, David Hildenbrand wrote:
> As we grow the code, the compiler might make stupid decisions and
> unnecessarily degrade fork() performance. Let's make sure to always inline
> functions that operate on a single PTE so the compiler will always
> optimize out the loop and avoid a function call.
> 
> This is a preparation for maintining a total mapcount for large folios.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 03/18] mm/rmap: add fast-path for small folios when adding/removing/duplicating
  2024-04-09 19:22 ` [PATCH v1 03/18] mm/rmap: add fast-path for small folios when adding/removing/duplicating David Hildenbrand
@ 2024-04-19 14:02   ` Yin, Fengwei
  0 siblings, 0 replies; 43+ messages in thread
From: Yin, Fengwei @ 2024-04-19 14:02 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang



On 4/10/2024 3:22 AM, David Hildenbrand wrote:
> Let's add a fast-path for small folios to all relevant rmap functions.
> Note that only RMAP_LEVEL_PTE applies.
> 
> This is a preparation for tracking the mapcount of large folios in a
> single value.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 04/18] mm: track mapcount of large folios in single value
  2024-04-09 19:22 ` [PATCH v1 04/18] mm: track mapcount of large folios in single value David Hildenbrand
  2024-04-09 20:13   ` Zi Yan
  2024-04-18 14:50   ` Lance Yang
@ 2024-04-19 14:02   ` Yin, Fengwei
  2 siblings, 0 replies; 43+ messages in thread
From: Yin, Fengwei @ 2024-04-19 14:02 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang



On 4/10/2024 3:22 AM, David Hildenbrand wrote:
> Let's track the mapcount of large folios in a single value. The mapcount of
> a large folio currently corresponds to the sum of the entire mapcount and
> all page mapcounts.
> 
> This sum is what we actually want to know in folio_mapcount() and it is
> also sufficient for implementing folio_mapped().
> 
> With PTE-mapped THP becoming more important and more widely used, we want
> to avoid looping over all pages of a folio just to obtain the mapcount
> of large folios. The comment "In the common case, avoid the loop when no
> pages mapped by PTE" in folio_total_mapcount() does no longer hold for
> mTHP that are always mapped by PTE.
> 
> Further, we are planning on using folio_mapcount() more
> frequently, and might even want to remove page mapcounts for large
> folios in some kernel configs. Therefore, allow for reading the mapcount of
> large folios efficiently and atomically without looping over any pages.
> 
> Maintain the mapcount also for hugetlb pages for simplicity. Use the new
> mapcount to implement folio_mapcount() and folio_mapped(). Make
> page_mapped() simply call folio_mapped(). We can now get rid of
> folio_large_is_mapped().
> 
> _nr_pages_mapped is now only used in rmap code and for debugging
> purposes. Keep folio_nr_pages_mapped() around, but document that its use
> should be limited to rmap internals and debugging purposes.
> 
> This change implies one additional atomic add/sub whenever
> mapping/unmapping (parts of) a large folio.
> 
> As we now batch RMAP operations for PTE-mapped THP during fork(),
> during unmap/zap, and when PTE-remapping a PMD-mapped THP, and we adjust
> the large mapcount for a PTE batch only once, the added overhead in the
> common case is small. Only when unmapping individual pages of a large folio
> (e.g., during COW), the overhead might be bigger in comparison, but it's
> essentially one additional atomic operation.
> 
> Note that before the new mapcount would overflow, already our refcount
> would overflow: each mapping requires a folio reference. Extend the
> focumentation of folio_mapcount().
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios
  2024-04-09 19:22 ` [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios David Hildenbrand
  2024-04-16 10:40   ` Lance Yang
  2024-04-19  2:29   ` Yin, Fengwei
@ 2024-04-19 14:03   ` Yin, Fengwei
  2 siblings, 0 replies; 43+ messages in thread
From: Yin, Fengwei @ 2024-04-19 14:03 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yang Shi, Zi Yan, Jonathan Corbet,
	Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang



On 4/10/2024 3:22 AM, David Hildenbrand wrote:
> We can now read the mapcount of large folios very efficiently. Use it to
> improve our handling of partially-mappable folios, falling back
> to making a guess only in case the folio is not "obviously mapped shared".
> 
> We can now better detect partially-mappable folios where the first page is
> not mapped as "mapped shared", reducing "false negatives"; but false
> negatives are still possible.
> 
> While at it, fixup a wrong comment (false positive vs. false negative)
> for KSM folios.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again
  2024-04-09 19:22 ` [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again David Hildenbrand
  2024-04-09 20:06   ` Zi Yan
  2024-04-09 21:42   ` Matthew Wilcox
@ 2024-04-24  9:38   ` David Hildenbrand
  2 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-24  9:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

On 09.04.24 21:22, David Hildenbrand wrote:
> Commit 53277bcf126d ("mm: support page_mapcount() on page_has_type()
> pages") made it impossible to detect mapcount underflows by treating
> any negative raw mapcount value as a mapcount of 0.
> 
> We perform such underflow checks in zap_present_folio_ptes() and
> zap_huge_pmd(), which would currently no longer trigger.
> 
> Let's check against PAGE_MAPCOUNT_RESERVE instead by using
> page_type_has_type(), like page_has_type() would, so we can still catch
> some underflows.
> 
> Fixes: 53277bcf126d ("mm: support page_mapcount() on page_has_type() pages")
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   include/linux/mm.h | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ef34cf54c14f..0fb8a40f82dd 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1229,11 +1229,10 @@ static inline void page_mapcount_reset(struct page *page)
>    */
>   static inline int page_mapcount(struct page *page)
>   {
> -	int mapcount = atomic_read(&page->_mapcount) + 1;
> +	int mapcount = atomic_read(&page->_mapcount);
>   
>   	/* Handle page_has_type() pages */
> -	if (mapcount < 0)
> -		mapcount = 0;
> +	mapcount = page_type_has_type(mapcount) ? 0 : mapcount + 1;
>   	if (unlikely(PageCompound(page)))
>   		mapcount += folio_entire_mapcount(page_folio(page));
>   

 From b49849001f3d2aad0af93cf2098065d7cbd9a959 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david@redhat.com>
Date: Wed, 24 Apr 2024 10:50:09 +0200
Subject: [PATCH] !fixup: mm: allow for detecting underflows with
  page_mapcount() again

Let's make page_mapcount() slightly more efficient by inlining the
page_type_has_type() check.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
  include/linux/mm.h | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dc33f8269fb52..cf700c5cdd58b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1229,10 +1229,11 @@ static inline void page_mapcount_reset(struct page *page)
   */
  static inline int page_mapcount(struct page *page)
  {
-	int mapcount = atomic_read(&page->_mapcount);
+	int mapcount = atomic_read(&page->_mapcount) + 1;
  
  	/* Handle page_has_type() pages */
-	mapcount = page_type_has_type(mapcount) ? 0 : mapcount + 1;
+	if (mapcount < PAGE_MAPCOUNT_RESERVE + 1)
+		mapcount = 0;
  	if (unlikely(PageCompound(page)))
  		mapcount += folio_entire_mapcount(page_folio(page));
  
-- 
2.44.0
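
For anyone who wants to convince themselves that the fixup is equivalent to the first version while still reporting underflows, here is a small userspace sketch. It works on a plain int instead of page->_mapcount, ignores the PageCompound() part, and assumes PAGE_MAPCOUNT_RESERVE is -128 as in the kernel headers of that time; the model_* names are made up.

```c
/* Assumed value of PAGE_MAPCOUNT_RESERVE; check your kernel headers. */
#define MODEL_PAGE_MAPCOUNT_RESERVE (-128)

/* Before the series: every negative raw value is reported as 0. */
static int model_mapcount_old(int raw)
{
	int mapcount = raw + 1;

	if (mapcount < 0)
		mapcount = 0;
	return mapcount;
}

/* Patch 01 as posted: type check on the raw value, then add 1. */
static int model_mapcount_v1(int raw)
{
	return raw < MODEL_PAGE_MAPCOUNT_RESERVE ? 0 : raw + 1;
}

/* Fixup: add 1 first, then compare against the shifted threshold. */
static int model_mapcount_fixup(int raw)
{
	int mapcount = raw + 1;

	if (mapcount < MODEL_PAGE_MAPCOUNT_RESERVE + 1)
		mapcount = 0;
	return mapcount;
}
```

A raw value of -2 (one unmap too many on a page that was mapped once) is hidden by the old variant but shows up as -1 with both newer variants, while values below the reserve, which encode a page type, still read as 0.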


-- 
Cheers,

David / dhildenb


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v1 06/18] mm: make folio_mapcount() return 0 for small typed folios
  2024-04-09 19:22 ` [PATCH v1 06/18] mm: make folio_mapcount() return 0 for small typed folios David Hildenbrand
@ 2024-04-24  9:40   ` David Hildenbrand
  0 siblings, 0 replies; 43+ messages in thread
From: David Hildenbrand @ 2024-04-24  9:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linux-doc, cgroups, linux-sh, linux-trace-kernel,
	linux-fsdevel, Andrew Morton, Matthew Wilcox (Oracle),
	Peter Xu, Ryan Roberts, Yin Fengwei, Yang Shi, Zi Yan,
	Jonathan Corbet, Hugh Dickins, Yoshinori Sato, Rich Felker,
	John Paul Adrian Glaubitz, Chris Zankel, Max Filippov,
	Muchun Song, Miaohe Lin, Naoya Horiguchi, Richard Chang

On 09.04.24 21:22, David Hildenbrand wrote:
> We already handle it properly for large folios. Let's also return "0"
> for small typed folios, like page_mapcount() currently would.
> 
> Consequently, folio_mapcount() will never return negative values for
> typed folios, but may return negative values for underflows.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   include/linux/mm.h | 11 +++++++++--
>   1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index daf687f0e8e5..d453232bba62 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1260,12 +1260,19 @@ static inline int folio_large_mapcount(const struct folio *folio)
>    * references the entire folio counts exactly once, even when such special
>    * page table entries are comprised of multiple ordinary page table entries.
>    *
> + * Will report 0 for pages which cannot be mapped into userspace, such as
> + * slab, page tables and similar.
> + *
>    * Return: The number of times this folio is mapped.
>    */
>   static inline int folio_mapcount(const struct folio *folio)
>   {
> -	if (likely(!folio_test_large(folio)))
> -		return atomic_read(&folio->_mapcount) + 1;
> +	int mapcount;
> +
> +	if (likely(!folio_test_large(folio))) {
> +		mapcount = atomic_read(&folio->_mapcount);
> +		return page_type_has_type(mapcount) ? 0 : mapcount + 1;
> +	}
>   	return folio_large_mapcount(folio);
>   }
>   

 From 98acfb7ff35cb65fcfca5e799bf58f8afe84a645 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david@redhat.com>
Date: Wed, 24 Apr 2024 10:56:17 +0200
Subject: [PATCH] !fixup: mm: make folio_mapcount() return 0 for small typed
  folios

Just like page_mapcount(), let's make folio_mapcount() slightly more
efficient.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
  include/linux/mm.h | 7 +++++--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index cf700c5cdd58b..78e583b50e421 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1271,8 +1271,11 @@ static inline int folio_mapcount(const struct folio *folio)
  	int mapcount;
  
  	if (likely(!folio_test_large(folio))) {
-		mapcount = atomic_read(&folio->_mapcount);
-		return page_type_has_type(mapcount) ? 0 : mapcount + 1;
+		mapcount = atomic_read(&folio->_mapcount) + 1;
+		/* Handle page_has_type() pages */
+		if (mapcount < PAGE_MAPCOUNT_RESERVE + 1)
+			mapcount = 0;
+		return mapcount;
  	}
  	return folio_large_mapcount(folio);
  }
-- 
2.44.0


-- 
Cheers,

David / dhildenb


^ permalink raw reply related	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2024-04-24  9:40 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
2024-04-09 19:22 [PATCH v1 00/18] mm: mapcount for large folios + page_mapcount() cleanups David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 01/18] mm: allow for detecting underflows with page_mapcount() again David Hildenbrand
2024-04-09 20:06   ` Zi Yan
2024-04-09 21:42   ` Matthew Wilcox
2024-04-10  8:10     ` David Hildenbrand
2024-04-24  9:38   ` David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 02/18] mm/rmap: always inline anon/file rmap duplication of a single PTE David Hildenbrand
2024-04-19  2:25   ` Yin, Fengwei
2024-04-19  9:14     ` David Hildenbrand
2024-04-19 14:01   ` Yin, Fengwei
2024-04-09 19:22 ` [PATCH v1 03/18] mm/rmap: add fast-path for small folios when adding/removing/duplicating David Hildenbrand
2024-04-19 14:02   ` Yin, Fengwei
2024-04-09 19:22 ` [PATCH v1 04/18] mm: track mapcount of large folios in single value David Hildenbrand
2024-04-09 20:13   ` Zi Yan
2024-04-10  8:20     ` David Hildenbrand
2024-04-18 14:50   ` Lance Yang
2024-04-18 15:09     ` David Hildenbrand
2024-04-19  0:31       ` Lance Yang
2024-04-19 14:02   ` Yin, Fengwei
2024-04-09 19:22 ` [PATCH v1 05/18] mm: improve folio_likely_mapped_shared() using the mapcount of large folios David Hildenbrand
2024-04-16 10:40   ` Lance Yang
2024-04-16 10:47     ` David Hildenbrand
2024-04-16 10:52       ` Lance Yang
2024-04-16 10:53         ` David Hildenbrand
2024-04-19  2:29   ` Yin, Fengwei
2024-04-19  9:19     ` David Hildenbrand
2024-04-19 13:47       ` Yin, Fengwei
2024-04-19 13:48         ` David Hildenbrand
2024-04-19 14:03   ` Yin, Fengwei
2024-04-09 19:22 ` [PATCH v1 06/18] mm: make folio_mapcount() return 0 for small typed folios David Hildenbrand
2024-04-24  9:40   ` David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 07/18] mm/memory: use folio_mapcount() in zap_present_folio_ptes() David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 08/18] mm/huge_memory: use folio_mapcount() in zap_huge_pmd() sanity check David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 09/18] mm/memory-failure: use folio_mapcount() in hwpoison_user_mappings() David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 10/18] mm/page_alloc: use folio_mapped() in __alloc_contig_migrate_range() David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 11/18] mm/migrate: use folio_likely_mapped_shared() in add_page_for_migration() David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 12/18] sh/mm/cache: use folio_mapped() in copy_from_user_page() David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 13/18] mm/filemap: use folio_mapcount() in filemap_unaccount_folio() David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 14/18] mm/migrate_device: use folio_mapcount() in migrate_vma_check_page() David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 15/18] trace/events/page_ref: trace the raw page mapcount value David Hildenbrand
2024-04-09 19:22 ` [PATCH v1 16/18] xtensa/mm: convert check_tlb_entry() to sanity check folios David Hildenbrand
2024-04-09 19:23 ` [PATCH v1 17/18] mm/debug: print only page mapcount (excluding folio entire mapcount) in __dump_folio() David Hildenbrand
2024-04-09 19:23 ` [PATCH v1 18/18] Documentation/admin-guide/cgroup-v1/memory.rst: don't reference page_mapcount() David Hildenbrand
