From: James Houghton <jthoughton@google.com>
To: Mike Kravetz <mike.kravetz@oracle.com>,
	Hugh Dickins <hughd@google.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Peter Xu <peterx@redhat.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Jiaqi Yan <jiaqiyan@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	James Houghton <jthoughton@google.com>
Subject: [PATCH 1/2] mm: rmap: make hugetlb pages participate in _nr_pages_mapped
Date: Mon,  6 Mar 2023 23:00:03 +0000	[thread overview]
Message-ID: <20230306230004.1387007-2-jthoughton@google.com> (raw)
In-Reply-To: <20230306230004.1387007-1-jthoughton@google.com>

For compound mappings (compound=true), _nr_pages_mapped will now be
incremented by COMPOUND_MAPPED when the first compound mapping is
created.

For small mappings, _nr_pages_mapped is incremented by 1 when a
particular small page is mapped for the first time. This is incompatible
with folios that are HPageVmemmapOptimized(), as most of their tail
struct pages are mapped read-only.

Currently HugeTLB always passes compound=true, but in the future,
HugeTLB pages may be mapped with small mappings.

To implement this change:
 1. Replace most of HugeTLB's calls to page_dup_file_rmap() with
    page_add_file_rmap(). The call in copy_hugetlb_page_range() is kept.
 2. Update page_add_file_rmap() and page_remove_rmap() to support
    HugeTLB folios.
 3. Update hugepage_add_anon_rmap() and hugepage_add_new_anon_rmap() to
    also increment _nr_pages_mapped properly.

With these changes, folio_large_is_mapped() no longer needs to check
_entire_mapcount.
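
As a concrete illustration of the new invariant, here is a minimal
userspace model of the counting scheme (the constants mirror the
kernel's, but this is only an illustrative sketch, not kernel code):

#include <assert.h>
#include <stdio.h>

#define COMPOUND_MAPPED		0x800000

/* Toy model of a large folio's mapcount state. */
struct folio_model {
	int entire_mapcount;	/* like _entire_mapcount, starts at -1 */
	int nr_pages_mapped;	/* like _nr_pages_mapped, starts at 0 */
};

/* A compound (whole-folio) mapping is added. */
static void map_compound(struct folio_model *f)
{
	if (++f->entire_mapcount == 0)	/* first compound mapper */
		f->nr_pages_mapped += COMPOUND_MAPPED;
}

/* A small (PTE) mapping of one subpage is added. */
static void map_small_page(struct folio_model *f)
{
	f->nr_pages_mapped++;
}

/* With hugetlb participating, _entire_mapcount need not be read. */
static int folio_large_is_mapped(const struct folio_model *f)
{
	return f->nr_pages_mapped > 0;
}

int main(void)
{
	struct folio_model f = { .entire_mapcount = -1, .nr_pages_mapped = 0 };

	assert(!folio_large_is_mapped(&f));
	map_compound(&f);		/* hugetlb-style compound mapping */
	assert(folio_large_is_mapped(&f));
	map_small_page(&f);		/* a future small mapping */
	printf("nr_pages_mapped = %#x\n", f.nr_pages_mapped);
	return 0;
}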

HugeTLB doesn't use LRU or mlock, so page_add_file_rmap() and
page_remove_rmap() skip those pieces for HugeTLB folios. It is also
important that the folio_test_pmd_mappable() check is removed (or
changed), as a HugeTLB page's order can be below HPAGE_PMD_ORDER, as
with arm64's CONT_PTE_SIZE HugeTLB pages (64K, order 4, with 4K base
pages).

This patch limits HugeTLB pages to 16G in size. That limit can be
increased if COMPOUND_MAPPED is raised.
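
The arithmetic behind that limit, as a sketch assuming 4K base pages: a
16G folio has 16G / 4K = 0x400000 subpages, so per-page mappings can
raise _nr_pages_mapped to at most 0x400000, which stays below the
COMPOUND_MAPPED bit at 0x800000. Expressed as compile-time checks:

#include <assert.h>

#define PAGE_SIZE		4096UL		/* assumption: 4K base pages */
#define COMPOUND_MAPPED		0x800000UL
#define FOLIO_PAGES_MAPPED	(COMPOUND_MAPPED - 1)

/* A 16G hugetlb folio has 0x400000 base pages. */
static_assert((16UL << 30) / PAGE_SIZE == 0x400000,
	      "16G folio subpage count");

/* The COMPOUND_MAPPED bit sits above any possible per-page count. */
static_assert(0x400000 <= FOLIO_PAGES_MAPPED,
	      "per-page counts fit below COMPOUND_MAPPED");

int main(void) { return 0; }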

Signed-off-by: James Houghton <jthoughton@google.com>

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ce1590933995..25c898c51129 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1143,12 +1143,7 @@ static inline int total_mapcount(struct page *page)
 
 static inline bool folio_large_is_mapped(struct folio *folio)
 {
-	/*
-	 * Reading _entire_mapcount below could be omitted if hugetlb
-	 * participated in incrementing nr_pages_mapped when compound mapped.
-	 */
-	return atomic_read(&folio->_nr_pages_mapped) > 0 ||
-		atomic_read(&folio->_entire_mapcount) >= 0;
+	return atomic_read(&folio->_nr_pages_mapped) > 0;
 }
 
 /**
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 07abcb6eb203..bf3d327cd1b9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5911,7 +5911,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	if (anon_rmap)
 		hugepage_add_new_anon_rmap(folio, vma, haddr);
 	else
-		page_dup_file_rmap(&folio->page, true);
+		page_add_file_rmap(&folio->page, vma, true);
 	new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE)
 				&& (vma->vm_flags & VM_SHARED)));
 	/*
@@ -6293,7 +6293,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release_unlock;
 
 	if (folio_in_pagecache)
-		page_dup_file_rmap(&folio->page, true);
+		page_add_file_rmap(&folio->page, dst_vma, true);
 	else
 		hugepage_add_new_anon_rmap(folio, dst_vma, dst_addr);
 
diff --git a/mm/internal.h b/mm/internal.h
index fce94775819c..400451a4dd49 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -55,8 +55,7 @@ void page_writeback_init(void);
 /*
  * If a 16GB hugetlb folio were mapped by PTEs of all of its 4kB pages,
  * its nr_pages_mapped would be 0x400000: choose the COMPOUND_MAPPED bit
- * above that range, instead of 2*(PMD_SIZE/PAGE_SIZE).  Hugetlb currently
- * leaves nr_pages_mapped at 0, but avoid surprise if it participates later.
+ * above that range, instead of 2*(PMD_SIZE/PAGE_SIZE).
  */
 #define COMPOUND_MAPPED		0x800000
 #define FOLIO_PAGES_MAPPED	(COMPOUND_MAPPED - 1)
diff --git a/mm/migrate.c b/mm/migrate.c
index 476cd24e8f32..c95c1cbc7a47 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -254,7 +254,7 @@ static bool remove_migration_pte(struct folio *folio,
 				hugepage_add_anon_rmap(new, vma, pvmw.address,
 						       rmap_flags);
 			else
-				page_dup_file_rmap(new, true);
+				page_add_file_rmap(new, vma, true);
 			set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
 		} else
 #endif
diff --git a/mm/rmap.c b/mm/rmap.c
index ba901c416785..4a975429b91a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1316,19 +1316,21 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 	int nr = 0, nr_pmdmapped = 0;
 	bool first;
 
-	VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);
+	VM_BUG_ON_PAGE(compound && !PageTransHuge(page)
+				&& !folio_test_hugetlb(folio), page);
 
 	/* Is page being mapped by PTE? Is this its first map to be added? */
 	if (likely(!compound)) {
+		if (unlikely(folio_test_hugetlb(folio)))
+			VM_BUG_ON_PAGE(HPageVmemmapOptimized(&folio->page),
+				       page);
 		first = atomic_inc_and_test(&page->_mapcount);
 		nr = first;
 		if (first && folio_test_large(folio)) {
 			nr = atomic_inc_return_relaxed(mapped);
 			nr = (nr < COMPOUND_MAPPED);
 		}
-	} else if (folio_test_pmd_mappable(folio)) {
-		/* That test is redundant: it's for safety or to optimize out */
-
+	} else {
 		first = atomic_inc_and_test(&folio->_entire_mapcount);
 		if (first) {
 			nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
@@ -1345,6 +1347,9 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 		}
 	}
 
+	if (folio_test_hugetlb(folio))
+		return;
+
 	if (nr_pmdmapped)
 		__lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
 			NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped);
@@ -1373,24 +1378,18 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 
 	VM_BUG_ON_PAGE(compound && !PageHead(page), page);
 
-	/* Hugetlb pages are not counted in NR_*MAPPED */
-	if (unlikely(folio_test_hugetlb(folio))) {
-		/* hugetlb pages are always mapped with pmds */
-		atomic_dec(&folio->_entire_mapcount);
-		return;
-	}
-
 	/* Is page being unmapped by PTE? Is this its last map to be removed? */
 	if (likely(!compound)) {
+		if (unlikely(folio_test_hugetlb(folio)))
+			VM_BUG_ON_PAGE(HPageVmemmapOptimized(&folio->page),
+				       page);
 		last = atomic_add_negative(-1, &page->_mapcount);
 		nr = last;
 		if (last && folio_test_large(folio)) {
 			nr = atomic_dec_return_relaxed(mapped);
 			nr = (nr < COMPOUND_MAPPED);
 		}
-	} else if (folio_test_pmd_mappable(folio)) {
-		/* That test is redundant: it's for safety or to optimize out */
-
+	} else {
 		last = atomic_add_negative(-1, &folio->_entire_mapcount);
 		if (last) {
 			nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped);
@@ -1407,6 +1406,9 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 		}
 	}
 
+	if (folio_test_hugetlb(folio))
+		return;
+
 	if (nr_pmdmapped) {
 		if (folio_test_anon(folio))
 			idx = NR_ANON_THPS;
@@ -2541,9 +2543,11 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
 	first = atomic_inc_and_test(&folio->_entire_mapcount);
 	VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page);
 	VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page);
-	if (first)
+	if (first) {
+		atomic_add(COMPOUND_MAPPED, &folio->_nr_pages_mapped);
 		__page_set_anon_rmap(folio, page, vma, address,
 				     !!(flags & RMAP_EXCLUSIVE));
+	}
 }
 
 void hugepage_add_new_anon_rmap(struct folio *folio,
@@ -2552,6 +2556,7 @@ void hugepage_add_new_anon_rmap(struct folio *folio,
 	BUG_ON(address < vma->vm_start || address >= vma->vm_end);
 	/* increment count (starts at -1) */
 	atomic_set(&folio->_entire_mapcount, 0);
+	atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
 	folio_clear_hugetlb_restore_reserve(folio);
 	__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 }
-- 
2.40.0.rc0.216.gc4246ad0f0-goog

