linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] create hugetlb flags to consolidate state
@ 2021-01-20  1:30 Mike Kravetz
  2021-01-20  1:30 ` [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags Mike Kravetz
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20  1:30 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, Mike Kravetz

While discussing a series of hugetlb fixes in [1], it became evident
that the hugetlb specific page state information is stored in a somewhat
haphazard manner.  Code dealing with state information would be easier
to read, understand and maintain if this information was stored in a
consistent manner.

This series uses page.private of the hugetlb head page for storing a
set of hugetlb specific page flags.  Routines are priovided for test,
set and clear of the flags.

[1] https://lore.kernel.org/r/20210106084739.63318-1-songmuchun@bytedance.com

Patch -> v2
  Went back to functions similar to 'normal' page flags (Matthew/Muchun)
  Decided to leave check in unmap_and_move_huge_page and print warning

RFC -> PATCH
  Simplified to use a single set of flag manipulation routines (Oscar)
  Moved flags and routines to hugetlb.h (Muchun)
  Changed format of page flag names (Muchun)
  Changed subpool routine names (Matthew)
  More comments in code (Oscar)

Based on v5.11-rc3-mmotm-2021-01-12-01-57

Mike Kravetz (5):
  hugetlb: use page.private for hugetlb specific page flags
  hugetlb: convert page_huge_active() HPageMigratable flag
  hugetlb: only set HPageMigratable for migratable hstates
  hugetlb: convert PageHugeTemporary() to HPageTemporary flag
  hugetlb: convert PageHugeFreed to HPageFreed flag

 fs/hugetlbfs/inode.c       |  14 +---
 include/linux/hugetlb.h    |  95 +++++++++++++++++++++++
 include/linux/page-flags.h |   6 --
 mm/hugetlb.c               | 149 +++++++++++--------------------------
 mm/memory_hotplug.c        |   8 +-
 mm/migrate.c               |   9 +--
 6 files changed, 154 insertions(+), 127 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags
  2021-01-20  1:30 [PATCH v2 0/5] create hugetlb flags to consolidate state Mike Kravetz
@ 2021-01-20  1:30 ` Mike Kravetz
  2021-01-20  8:10   ` Miaohe Lin
                     ` (2 more replies)
  2021-01-20  1:30 ` [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag Mike Kravetz
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20  1:30 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, Mike Kravetz

As hugetlbfs evolved, state information about hugetlb pages was added.
One 'convenient' way of doing this was to use available fields in tail
pages.  Over time, it has become difficult to know the meaning or contents
of fields simply by looking at a small bit of code.  Sometimes, the
naming is just confusing.  For example: The PagePrivate flag indicates
a huge page reservation was consumed and needs to be restored if an error
is encountered and the page is freed before it is instantiated.  The
page.private field contains the pointer to a subpool if the page is
associated with one.

In an effort to make the code more readable, use page.private to contain
hugetlb specific page flags.  These flags will have test, set and clear
functions similar to those used for 'normal' page flags.  More importantly,
an enum of flag values will be created with names that actually reflect
their purpose.

In this patch,
- Create infrastructure for hugetlb specific page flag functions
- Move subpool pointer to page[1].private to make way for flags
  Create routines with meaningful names to modify subpool field
- Use new HPageRestoreReserve flag instead of PagePrivate

Conversion of other state information will happen in subsequent patches.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 fs/hugetlbfs/inode.c    | 12 ++-----
 include/linux/hugetlb.h | 72 +++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb.c            | 45 +++++++++++++-------------
 3 files changed, 97 insertions(+), 32 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 740693d7f255..b8a661780c4a 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -955,15 +955,9 @@ static int hugetlbfs_migrate_page(struct address_space *mapping,
 	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
 
-	/*
-	 * page_private is subpool pointer in hugetlb pages.  Transfer to
-	 * new page.  PagePrivate is not associated with page_private for
-	 * hugetlb pages and can not be set here as only page_huge_active
-	 * pages can be migrated.
-	 */
-	if (page_private(page)) {
-		set_page_private(newpage, page_private(page));
-		set_page_private(page, 0);
+	if (hugetlb_page_subpool(page)) {
+		hugetlb_set_page_subpool(newpage, hugetlb_page_subpool(page));
+		hugetlb_set_page_subpool(page, NULL);
 	}
 
 	if (mode != MIGRATE_SYNC_NO_COPY)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ef5b144b8aac..be71a00ee2a0 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -472,6 +472,64 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 					unsigned long flags);
 #endif /* HAVE_ARCH_HUGETLB_UNMAPPED_AREA */
 
+/*
+ * huegtlb page specific state flags.  These flags are located in page.private
+ * of the hugetlb head page.  Functions created via the below macros should be
+ * used to manipulate these flags.
+ *
+ * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at
+ *	allocation time.  Cleared when page is fully instantiated.  Free
+ *	routine checks flag to restore a reservation on error paths.
+ */
+enum hugetlb_page_flags {
+	HPG_restore_reserve = 0,
+	__NR_HPAGEFLAGS,
+};
+
+/*
+ * Macros to create test, set and clear function definitions for
+ * hugetlb specific page flags.
+ */
+#ifdef CONFIG_HUGETLB_PAGE
+#define TESTHPAGEFLAG(uname, flname)				\
+static inline int HPage##uname(struct page *page)		\
+	{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
+			BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
+	return test_bit(HPG_##flname, &(page->private)); }
+
+#define SETHPAGEFLAG(uname, flname)				\
+static inline void SetHPage##uname(struct page *page)		\
+	{ set_bit(HPG_##flname, &(page->private)); }
+
+#define CLEARHPAGEFLAG(uname, flname)				\
+static inline void ClearHPage##uname(struct page *page)		\
+	{ clear_bit(HPG_##flname, &(page->private)); }
+#else
+#define TESTHPAGEFLAG(uname, flname)				\
+static inline int HPage##uname(struct page *page)		\
+	{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
+			BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
+	return 0 }
+
+#define SETHPAGEFLAG(uname, flname)				\
+static inline void SetHPage##uname(struct page *page)		\
+	{ }
+
+#define CLEARHPAGEFLAG(uname, flname)				\
+static inline void ClearHPage##uname(struct page *page)		\
+	{ }
+#endif
+
+#define HPAGEFLAG(uname, flname)				\
+	TESTHPAGEFLAG(uname, flname)				\
+	SETHPAGEFLAG(uname, flname)				\
+	CLEARHPAGEFLAG(uname, flname)				\
+
+/*
+ * Create functions associated with hugetlb page flags
+ */
+HPAGEFLAG(RestoreReserve, restore_reserve)
+
 #ifdef CONFIG_HUGETLB_PAGE
 
 #define HSTATE_NAME_LEN 32
@@ -531,6 +589,20 @@ extern unsigned int default_hstate_idx;
 
 #define default_hstate (hstates[default_hstate_idx])
 
+/*
+ * hugetlb page subpool pointer located in hpage[1].private
+ */
+static inline struct hugepage_subpool *hugetlb_page_subpool(struct page *hpage)
+{
+	return (struct hugepage_subpool *)(hpage+1)->private;
+}
+
+static inline void hugetlb_set_page_subpool(struct page *hpage,
+					struct hugepage_subpool *subpool)
+{
+	set_page_private(hpage+1, (unsigned long)subpool);
+}
+
 static inline struct hstate *hstate_file(struct file *f)
 {
 	return hstate_inode(file_inode(f));
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 737b2dce19e6..8bed6b5202d2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1133,7 +1133,7 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
 	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
 	page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
 	if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
-		SetPagePrivate(page);
+		SetHPageRestoreReserve(page);
 		h->resv_huge_pages--;
 	}
 
@@ -1407,20 +1407,19 @@ static void __free_huge_page(struct page *page)
 	 */
 	struct hstate *h = page_hstate(page);
 	int nid = page_to_nid(page);
-	struct hugepage_subpool *spool =
-		(struct hugepage_subpool *)page_private(page);
+	struct hugepage_subpool *spool = hugetlb_page_subpool(page);
 	bool restore_reserve;
 
 	VM_BUG_ON_PAGE(page_count(page), page);
 	VM_BUG_ON_PAGE(page_mapcount(page), page);
 
-	set_page_private(page, 0);
+	hugetlb_set_page_subpool(page, NULL);
 	page->mapping = NULL;
-	restore_reserve = PagePrivate(page);
-	ClearPagePrivate(page);
+	restore_reserve = HPageRestoreReserve(page);
+	ClearHPageRestoreReserve(page);
 
 	/*
-	 * If PagePrivate() was set on page, page allocation consumed a
+	 * If HPageRestoreReserve was set on page, page allocation consumed a
 	 * reservation.  If the page was associated with a subpool, there
 	 * would have been a page reserved in the subpool before allocation
 	 * via hugepage_subpool_get_pages().  Since we are 'restoring' the
@@ -2254,24 +2253,24 @@ static long vma_add_reservation(struct hstate *h,
  * This routine is called to restore a reservation on error paths.  In the
  * specific error paths, a huge page was allocated (via alloc_huge_page)
  * and is about to be freed.  If a reservation for the page existed,
- * alloc_huge_page would have consumed the reservation and set PagePrivate
- * in the newly allocated page.  When the page is freed via free_huge_page,
- * the global reservation count will be incremented if PagePrivate is set.
- * However, free_huge_page can not adjust the reserve map.  Adjust the
- * reserve map here to be consistent with global reserve count adjustments
- * to be made by free_huge_page.
+ * alloc_huge_page would have consumed the reservation and set
+ * HPageRestoreReserve in the newly allocated page.  When the page is freed
+ * via free_huge_page, the global reservation count will be incremented if
+ * HPageRestoreReserve is set.  However, free_huge_page can not adjust the
+ * reserve map.  Adjust the reserve map here to be consistent with global
+ * reserve count adjustments to be made by free_huge_page.
  */
 static void restore_reserve_on_error(struct hstate *h,
 			struct vm_area_struct *vma, unsigned long address,
 			struct page *page)
 {
-	if (unlikely(PagePrivate(page))) {
+	if (unlikely(HPageRestoreReserve(page))) {
 		long rc = vma_needs_reservation(h, vma, address);
 
 		if (unlikely(rc < 0)) {
 			/*
 			 * Rare out of memory condition in reserve map
-			 * manipulation.  Clear PagePrivate so that
+			 * manipulation.  Clear HPageRestoreReserve so that
 			 * global reserve count will not be incremented
 			 * by free_huge_page.  This will make it appear
 			 * as though the reservation for this page was
@@ -2280,7 +2279,7 @@ static void restore_reserve_on_error(struct hstate *h,
 			 * is better than inconsistent global huge page
 			 * accounting of reserve counts.
 			 */
-			ClearPagePrivate(page);
+			ClearHPageRestoreReserve(page);
 		} else if (rc) {
 			rc = vma_add_reservation(h, vma, address);
 			if (unlikely(rc < 0))
@@ -2288,7 +2287,7 @@ static void restore_reserve_on_error(struct hstate *h,
 				 * See above comment about rare out of
 				 * memory condition.
 				 */
-				ClearPagePrivate(page);
+				ClearHPageRestoreReserve(page);
 		} else
 			vma_end_reservation(h, vma, address);
 	}
@@ -2369,7 +2368,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 		if (!page)
 			goto out_uncharge_cgroup;
 		if (!avoid_reserve && vma_has_reserves(vma, gbl_chg)) {
-			SetPagePrivate(page);
+			SetHPageRestoreReserve(page);
 			h->resv_huge_pages--;
 		}
 		spin_lock(&hugetlb_lock);
@@ -2387,7 +2386,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 
 	spin_unlock(&hugetlb_lock);
 
-	set_page_private(page, (unsigned long)spool);
+	hugetlb_set_page_subpool(page, spool);
 
 	map_commit = vma_commit_reservation(h, vma, addr);
 	if (unlikely(map_chg > map_commit)) {
@@ -4212,7 +4211,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 	spin_lock(ptl);
 	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) {
-		ClearPagePrivate(new_page);
+		ClearHPageRestoreReserve(new_page);
 
 		/* Break COW */
 		huge_ptep_clear_flush(vma, haddr, ptep);
@@ -4279,7 +4278,7 @@ int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
 
 	if (err)
 		return err;
-	ClearPagePrivate(page);
+	ClearHPageRestoreReserve(page);
 
 	/*
 	 * set page dirty so that it will not be removed from cache/file
@@ -4441,7 +4440,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 		goto backout;
 
 	if (anon_rmap) {
-		ClearPagePrivate(page);
+		ClearHPageRestoreReserve(page);
 		hugepage_add_new_anon_rmap(page, vma, haddr);
 	} else
 		page_dup_rmap(page, true);
@@ -4755,7 +4754,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	if (vm_shared) {
 		page_dup_rmap(page, true);
 	} else {
-		ClearPagePrivate(page);
+		ClearHPageRestoreReserve(page);
 		hugepage_add_new_anon_rmap(page, dst_vma, dst_addr);
 	}
 
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20  1:30 [PATCH v2 0/5] create hugetlb flags to consolidate state Mike Kravetz
  2021-01-20  1:30 ` [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags Mike Kravetz
@ 2021-01-20  1:30 ` Mike Kravetz
  2021-01-20  9:59   ` Oscar Salvador
                     ` (4 more replies)
  2021-01-20  1:30 ` [PATCH v2 3/5] hugetlb: only set HPageMigratable for migratable hstates Mike Kravetz
                   ` (2 subsequent siblings)
  4 siblings, 5 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20  1:30 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, Mike Kravetz

Use the new hugetlb page specific flag HPageMigratable to replace the
page_huge_active interfaces.  By it's name, page_huge_active implied
that a huge page was on the active list.  However, that is not really
what code checking the flag wanted to know.  It really wanted to determine
if the huge page could be migrated.  This happens when the page is actually
added the page cache and/or task page table.  This is the reasoning behind
the name change.

The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
really necessary as we KNOW the page is a hugetlb page.  Therefore, they
are removed.

The routine page_huge_active checked for PageHeadHuge before testing the
active bit.  This is unnecessary in the case where we hold a reference or
lock and know it is a hugetlb head page.  page_huge_active is also called
without holding a reference or lock (scan_movable_pages), and can race with
code freeing the page.  The extra check in page_huge_active shortened the
race window, but did not prevent the race.  Offline code calling
scan_movable_pages already deals with these races, so removing the check
is acceptable.  Add comment to racy code.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 fs/hugetlbfs/inode.c       |  2 +-
 include/linux/hugetlb.h    |  5 +++++
 include/linux/page-flags.h |  6 -----
 mm/hugetlb.c               | 45 ++++++++++----------------------------
 mm/memory_hotplug.c        |  8 ++++++-
 5 files changed, 24 insertions(+), 42 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index b8a661780c4a..ec9f03aa2738 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -735,7 +735,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
-		set_page_huge_active(page);
+		SetHPageMigratable(page);
 		/*
 		 * unlock_page because locked by add_to_page_cache()
 		 * put_page() due to reference from alloc_huge_page()
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index be71a00ee2a0..ce3d03da0133 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -480,9 +480,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
  * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at
  *	allocation time.  Cleared when page is fully instantiated.  Free
  *	routine checks flag to restore a reservation on error paths.
+ * HPG_migratable  - Set after a newly allocated page is added to the page
+ *	cache and/or page tables.  Indicates the page is a candidate for
+ *	migration.
  */
 enum hugetlb_page_flags {
 	HPG_restore_reserve = 0,
+	HPG_migratable,
 	__NR_HPAGEFLAGS,
 };
 
@@ -529,6 +533,7 @@ static inline void ClearHPage##uname(struct page *page)		\
  * Create functions associated with hugetlb page flags
  */
 HPAGEFLAG(RestoreReserve, restore_reserve)
+HPAGEFLAG(Migratable, migratable)
 
 #ifdef CONFIG_HUGETLB_PAGE
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index bc6fd1ee7dd6..04a34c08e0a6 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -592,15 +592,9 @@ static inline void ClearPageCompound(struct page *page)
 #ifdef CONFIG_HUGETLB_PAGE
 int PageHuge(struct page *page);
 int PageHeadHuge(struct page *page);
-bool page_huge_active(struct page *page);
 #else
 TESTPAGEFLAG_FALSE(Huge)
 TESTPAGEFLAG_FALSE(HeadHuge)
-
-static inline bool page_huge_active(struct page *page)
-{
-	return 0;
-}
 #endif
 
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8bed6b5202d2..c24da40626d3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1353,30 +1353,6 @@ struct hstate *size_to_hstate(unsigned long size)
 	return NULL;
 }
 
-/*
- * Test to determine whether the hugepage is "active/in-use" (i.e. being linked
- * to hstate->hugepage_activelist.)
- *
- * This function can be called for tail pages, but never returns true for them.
- */
-bool page_huge_active(struct page *page)
-{
-	return PageHeadHuge(page) && PagePrivate(&page[1]);
-}
-
-/* never called for tail page */
-void set_page_huge_active(struct page *page)
-{
-	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
-	SetPagePrivate(&page[1]);
-}
-
-static void clear_page_huge_active(struct page *page)
-{
-	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
-	ClearPagePrivate(&page[1]);
-}
-
 /*
  * Internal hugetlb specific page flag. Do not use outside of the hugetlb
  * code
@@ -1438,7 +1414,7 @@ static void __free_huge_page(struct page *page)
 	}
 
 	spin_lock(&hugetlb_lock);
-	clear_page_huge_active(page);
+	ClearHPageMigratable(page);
 	hugetlb_cgroup_uncharge_page(hstate_index(h),
 				     pages_per_huge_page(h), page);
 	hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
@@ -4220,7 +4196,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 				make_huge_pte(vma, new_page, 1));
 		page_remove_rmap(old_page, true);
 		hugepage_add_new_anon_rmap(new_page, vma, haddr);
-		set_page_huge_active(new_page);
+		SetHPageMigratable(new_page);
 		/* Make the old page be freed below */
 		new_page = old_page;
 	}
@@ -4457,12 +4433,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	spin_unlock(ptl);
 
 	/*
-	 * Only make newly allocated pages active.  Existing pages found
-	 * in the pagecache could be !page_huge_active() if they have been
-	 * isolated for migration.
+	 * Only set HPageMigratable in newly allocated pages.  Existing pages
+	 * found in the pagecache may not have HPageMigratableset if they have
+	 * been isolated for migration.
 	 */
 	if (new_page)
-		set_page_huge_active(page);
+		SetHPageMigratable(page);
 
 	unlock_page(page);
 out:
@@ -4773,7 +4749,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	update_mmu_cache(dst_vma, dst_addr, dst_pte);
 
 	spin_unlock(ptl);
-	set_page_huge_active(page);
+	SetHPageMigratable(page);
 	if (vm_shared)
 		unlock_page(page);
 	ret = 0;
@@ -5591,12 +5567,13 @@ bool isolate_huge_page(struct page *page, struct list_head *list)
 	bool ret = true;
 
 	spin_lock(&hugetlb_lock);
-	if (!PageHeadHuge(page) || !page_huge_active(page) ||
+	if (!PageHeadHuge(page) ||
+	    !HPageMigratable(page) ||
 	    !get_page_unless_zero(page)) {
 		ret = false;
 		goto unlock;
 	}
-	clear_page_huge_active(page);
+	ClearHPageMigratable(page);
 	list_move_tail(&page->lru, list);
 unlock:
 	spin_unlock(&hugetlb_lock);
@@ -5607,7 +5584,7 @@ void putback_active_hugepage(struct page *page)
 {
 	VM_BUG_ON_PAGE(!PageHead(page), page);
 	spin_lock(&hugetlb_lock);
-	set_page_huge_active(page);
+	SetHPageMigratable(page);
 	list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist);
 	spin_unlock(&hugetlb_lock);
 	put_page(page);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f9d57b9be8c7..563da803e0e0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1260,7 +1260,13 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
 		if (!PageHuge(page))
 			continue;
 		head = compound_head(page);
-		if (page_huge_active(head))
+		/*
+		 * This test is racy as we hold no reference or lock.  The
+		 * hugetlb page could have been free'ed and head is no longer
+		 * a hugetlb page before the following check.  In such unlikely
+		 * cases false positives and negatives are possible.
+		 */
+		if (HPageMigratable(head))
 			goto found;
 		skip = compound_nr(head) - (page - head);
 		pfn += skip - 1;
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 3/5] hugetlb: only set HPageMigratable for migratable hstates
  2021-01-20  1:30 [PATCH v2 0/5] create hugetlb flags to consolidate state Mike Kravetz
  2021-01-20  1:30 ` [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags Mike Kravetz
  2021-01-20  1:30 ` [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag Mike Kravetz
@ 2021-01-20  1:30 ` Mike Kravetz
  2021-01-20  1:30 ` [PATCH v2 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag Mike Kravetz
  2021-01-20  1:30 ` [PATCH v2 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag Mike Kravetz
  4 siblings, 0 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20  1:30 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, Mike Kravetz

The HP_Migratable flag indicates a page is a candidate for migration.
Only set the flag if the page's hstate supports migration.  This allows
the migration paths to detect non-migratable pages earlier.  If migration
is not supported for the hstate, HP_Migratable will not be set, the page
will not be isolated and no attempt will be made to migrate.  We should
never get to unmap_and_move_huge_page for a page where migration is not
supported, so throw a warning if we do.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 fs/hugetlbfs/inode.c    | 2 +-
 include/linux/hugetlb.h | 9 +++++++++
 mm/hugetlb.c            | 8 ++++----
 mm/migrate.c            | 9 ++++-----
 4 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index ec9f03aa2738..8b8acdafd0be 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -735,7 +735,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
-		SetHPageMigratable(page);
+		SetHPageMigratableIfSupported(page);
 		/*
 		 * unlock_page because locked by add_to_page_cache()
 		 * put_page() due to reference from alloc_huge_page()
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ce3d03da0133..1e17529c8b81 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -744,6 +744,15 @@ static inline bool hugepage_migration_supported(struct hstate *h)
 	return arch_hugetlb_migration_supported(h);
 }
 
+/*
+ * Only set HPageMigratable if migration supported for page
+ */
+static inline void SetHPageMigratableIfSupported(struct page *page)
+{
+	if (hugepage_migration_supported(page_hstate(page)))
+		SetHPageMigratable(page);
+}
+
 /*
  * Movability check is different as compared to migration check.
  * It determines whether or not a huge page should be placed on
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c24da40626d3..6e32751489e8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4196,7 +4196,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 				make_huge_pte(vma, new_page, 1));
 		page_remove_rmap(old_page, true);
 		hugepage_add_new_anon_rmap(new_page, vma, haddr);
-		SetHPageMigratable(new_page);
+		SetHPageMigratableIfSupported(new_page);
 		/* Make the old page be freed below */
 		new_page = old_page;
 	}
@@ -4438,7 +4438,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	 * been isolated for migration.
 	 */
 	if (new_page)
-		SetHPageMigratable(page);
+		SetHPageMigratableIfSupported(page);
 
 	unlock_page(page);
 out:
@@ -4749,7 +4749,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	update_mmu_cache(dst_vma, dst_addr, dst_pte);
 
 	spin_unlock(ptl);
-	SetHPageMigratable(page);
+	SetHPageMigratableIfSupported(page);
 	if (vm_shared)
 		unlock_page(page);
 	ret = 0;
@@ -5584,7 +5584,7 @@ void putback_active_hugepage(struct page *page)
 {
 	VM_BUG_ON_PAGE(!PageHead(page), page);
 	spin_lock(&hugetlb_lock);
-	SetHPageMigratable(page);
+	SetHPageMigratableIfSupported(page);
 	list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist);
 	spin_unlock(&hugetlb_lock);
 	put_page(page);
diff --git a/mm/migrate.c b/mm/migrate.c
index 0339f3874d7c..943391cd1a7c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1273,13 +1273,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	struct address_space *mapping = NULL;
 
 	/*
-	 * Migratability of hugepages depends on architectures and their size.
-	 * This check is necessary because some callers of hugepage migration
-	 * like soft offline and memory hotremove don't walk through page
-	 * tables or check whether the hugepage is pmd-based or not before
-	 * kicking migration.
+	 * Support for migration should be checked at isolation time.
+	 * Therefore, we should never get here if migration is not supported
+	 * for the page.
 	 */
 	if (!hugepage_migration_supported(page_hstate(hpage))) {
+		VM_WARN_ON(1);
 		list_move_tail(&hpage->lru, ret);
 		return -ENOSYS;
 	}
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag
  2021-01-20  1:30 [PATCH v2 0/5] create hugetlb flags to consolidate state Mike Kravetz
                   ` (2 preceding siblings ...)
  2021-01-20  1:30 ` [PATCH v2 3/5] hugetlb: only set HPageMigratable for migratable hstates Mike Kravetz
@ 2021-01-20  1:30 ` Mike Kravetz
  2021-01-20 10:09   ` Oscar Salvador
  2021-01-20  1:30 ` [PATCH v2 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag Mike Kravetz
  4 siblings, 1 reply; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20  1:30 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, Mike Kravetz

Use new hugetlb specific HPageTemporary flag to replace the
PageHugeTemporary() interfaces.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 include/linux/hugetlb.h |  6 ++++++
 mm/hugetlb.c            | 36 +++++++-----------------------------
 2 files changed, 13 insertions(+), 29 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 1e17529c8b81..ec329b9cc0fc 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -483,10 +483,15 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
  * HPG_migratable  - Set after a newly allocated page is added to the page
  *	cache and/or page tables.  Indicates the page is a candidate for
  *	migration.
+ * HPG_temporary - - Set on a page that is temporarily allocated from the buddy
+ *	allocator.  Typically used for migration target pages when no pages
+ *	are available in the pool.  The hugetlb free page path will
+ *	immediately free pages with this flag set to the buddy allocator.
  */
 enum hugetlb_page_flags {
 	HPG_restore_reserve = 0,
 	HPG_migratable,
+	HPG_temporary,
 	__NR_HPAGEFLAGS,
 };
 
@@ -534,6 +539,7 @@ static inline void ClearHPage##uname(struct page *page)		\
  */
 HPAGEFLAG(RestoreReserve, restore_reserve)
 HPAGEFLAG(Migratable, migratable)
+HPAGEFLAG(Temporary, temporary)
 
 #ifdef CONFIG_HUGETLB_PAGE
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6e32751489e8..0d2bfc2b6adc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1353,28 +1353,6 @@ struct hstate *size_to_hstate(unsigned long size)
 	return NULL;
 }
 
-/*
- * Internal hugetlb specific page flag. Do not use outside of the hugetlb
- * code
- */
-static inline bool PageHugeTemporary(struct page *page)
-{
-	if (!PageHuge(page))
-		return false;
-
-	return (unsigned long)page[2].mapping == -1U;
-}
-
-static inline void SetPageHugeTemporary(struct page *page)
-{
-	page[2].mapping = (void *)-1U;
-}
-
-static inline void ClearPageHugeTemporary(struct page *page)
-{
-	page[2].mapping = NULL;
-}
-
 static void __free_huge_page(struct page *page)
 {
 	/*
@@ -1422,9 +1400,9 @@ static void __free_huge_page(struct page *page)
 	if (restore_reserve)
 		h->resv_huge_pages++;
 
-	if (PageHugeTemporary(page)) {
+	if (HPageTemporary(page)) {
 		list_del(&page->lru);
-		ClearPageHugeTemporary(page);
+		ClearHPageTemporary(page);
 		update_and_free_page(h, page);
 	} else if (h->surplus_huge_pages_node[nid]) {
 		/* remove the page from active list */
@@ -1863,7 +1841,7 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
 	 * codeflow
 	 */
 	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
-		SetPageHugeTemporary(page);
+		SetHPageTemporary(page);
 		spin_unlock(&hugetlb_lock);
 		put_page(page);
 		return NULL;
@@ -1894,7 +1872,7 @@ static struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
 	 * We do not account these pages as surplus because they are only
 	 * temporary and will be released properly on the last reference
 	 */
-	SetPageHugeTemporary(page);
+	SetHPageTemporary(page);
 
 	return page;
 }
@@ -5607,12 +5585,12 @@ void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
 	 * here as well otherwise the global surplus count will not match
 	 * the per-node's.
 	 */
-	if (PageHugeTemporary(newpage)) {
+	if (HPageTemporary(newpage)) {
 		int old_nid = page_to_nid(oldpage);
 		int new_nid = page_to_nid(newpage);
 
-		SetPageHugeTemporary(oldpage);
-		ClearPageHugeTemporary(newpage);
+		SetHPageTemporary(oldpage);
+		ClearHPageTemporary(newpage);
 
 		spin_lock(&hugetlb_lock);
 		if (h->surplus_huge_pages_node[old_nid]) {
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag
  2021-01-20  1:30 [PATCH v2 0/5] create hugetlb flags to consolidate state Mike Kravetz
                   ` (3 preceding siblings ...)
  2021-01-20  1:30 ` [PATCH v2 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag Mike Kravetz
@ 2021-01-20  1:30 ` Mike Kravetz
  2021-01-20 10:10   ` Oscar Salvador
  2021-01-20 10:44   ` [External] " Muchun Song
  4 siblings, 2 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20  1:30 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, Mike Kravetz

Use new hugetlb specific HPageFreed flag to replace the
PageHugeFreed interfaces.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 include/linux/hugetlb.h |  3 +++
 mm/hugetlb.c            | 23 ++++-------------------
 2 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec329b9cc0fc..8fd0970cefdb 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -487,11 +487,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
  *	allocator.  Typically used for migration target pages when no pages
  *	are available in the pool.  The hugetlb free page path will
  *	immediately free pages with this flag set to the buddy allocator.
+ * HPG_freed - Set when page is on the free lists.
  */
 enum hugetlb_page_flags {
 	HPG_restore_reserve = 0,
 	HPG_migratable,
 	HPG_temporary,
+	HPG_freed,
 	__NR_HPAGEFLAGS,
 };
 
@@ -540,6 +542,7 @@ static inline void ClearHPage##uname(struct page *page)		\
 HPAGEFLAG(RestoreReserve, restore_reserve)
 HPAGEFLAG(Migratable, migratable)
 HPAGEFLAG(Temporary, temporary)
+HPAGEFLAG(Freed, freed)
 
 #ifdef CONFIG_HUGETLB_PAGE
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0d2bfc2b6adc..d5a78aedbfda 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -79,21 +79,6 @@ DEFINE_SPINLOCK(hugetlb_lock);
 static int num_fault_mutexes;
 struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
-static inline bool PageHugeFreed(struct page *head)
-{
-	return page_private(head + 4) == -1UL;
-}
-
-static inline void SetPageHugeFreed(struct page *head)
-{
-	set_page_private(head + 4, -1UL);
-}
-
-static inline void ClearPageHugeFreed(struct page *head)
-{
-	set_page_private(head + 4, 0);
-}
-
 /* Forward declaration */
 static int hugetlb_acct_memory(struct hstate *h, long delta);
 
@@ -1043,7 +1028,7 @@ static void enqueue_huge_page(struct hstate *h, struct page *page)
 	list_move(&page->lru, &h->hugepage_freelists[nid]);
 	h->free_huge_pages++;
 	h->free_huge_pages_node[nid]++;
-	SetPageHugeFreed(page);
+	SetHPageFreed(page);
 }
 
 static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
@@ -1060,7 +1045,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
 
 		list_move(&page->lru, &h->hugepage_activelist);
 		set_page_refcounted(page);
-		ClearPageHugeFreed(page);
+		ClearHPageFreed(page);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
 		return page;
@@ -1474,7 +1459,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 	spin_lock(&hugetlb_lock);
 	h->nr_huge_pages++;
 	h->nr_huge_pages_node[nid]++;
-	ClearPageHugeFreed(page);
+	ClearHPageFreed(page);
 	spin_unlock(&hugetlb_lock);
 }
 
@@ -1747,7 +1732,7 @@ int dissolve_free_huge_page(struct page *page)
 		 * We should make sure that the page is already on the free list
 		 * when it is dissolved.
 		 */
-		if (unlikely(!PageHugeFreed(head))) {
+		if (unlikely(!HPageFreed(head))) {
 			rc = -EAGAIN;
 			goto out;
 		}
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags
  2021-01-20  1:30 ` [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags Mike Kravetz
@ 2021-01-20  8:10   ` Miaohe Lin
  2021-01-20  8:43   ` kernel test robot
  2021-01-20  9:30   ` Oscar Salvador
  2 siblings, 0 replies; 24+ messages in thread
From: Miaohe Lin @ 2021-01-20  8:10 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, linux-kernel,
	linux-mm

Hi:
On 2021/1/20 9:30, Mike Kravetz wrote:
> As hugetlbfs evolved, state information about hugetlb pages was added.
> One 'convenient' way of doing this was to use available fields in tail
> pages.  Over time, it has become difficult to know the meaning or contents
> of fields simply by looking at a small bit of code.  Sometimes, the
> naming is just confusing.  For example: The PagePrivate flag indicates
> a huge page reservation was consumed and needs to be restored if an error
> is encountered and the page is freed before it is instantiated.  The

This PagePrivate flag really confused me for a long time. :(

> page.private field contains the pointer to a subpool if the page is
> associated with one.
> 
> In an effort to make the code more readable, use page.private to contain
> hugetlb specific page flags.  These flags will have test, set and clear
> functions similar to those used for 'normal' page flags.  More importantly,
> an enum of flag values will be created with names that actually reflect
> their purpose.

Thanks for doing this. This would make life easier.

> 
> In this patch,
> - Create infrastructure for hugetlb specific page flag functions
> - Move subpool pointer to page[1].private to make way for flags
>   Create routines with meaningful names to modify subpool field
> - Use new HPageRestoreReserve flag instead of PagePrivate
> 
> Conversion of other state information will happen in subsequent patches.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  fs/hugetlbfs/inode.c    | 12 ++-----
>  include/linux/hugetlb.h | 72 +++++++++++++++++++++++++++++++++++++++++
>  mm/hugetlb.c            | 45 +++++++++++++-------------
>  3 files changed, 97 insertions(+), 32 deletions(-)
> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 740693d7f255..b8a661780c4a 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -955,15 +955,9 @@ static int hugetlbfs_migrate_page(struct address_space *mapping,
>  	if (rc != MIGRATEPAGE_SUCCESS)
>  		return rc;
>  
> -	/*
> -	 * page_private is subpool pointer in hugetlb pages.  Transfer to
> -	 * new page.  PagePrivate is not associated with page_private for
> -	 * hugetlb pages and can not be set here as only page_huge_active
> -	 * pages can be migrated.
> -	 */
> -	if (page_private(page)) {
> -		set_page_private(newpage, page_private(page));
> -		set_page_private(page, 0);
> +	if (hugetlb_page_subpool(page)) {
> +		hugetlb_set_page_subpool(newpage, hugetlb_page_subpool(page));
> +		hugetlb_set_page_subpool(page, NULL);
>  	}
>  
>  	if (mode != MIGRATE_SYNC_NO_COPY)
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index ef5b144b8aac..be71a00ee2a0 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -472,6 +472,64 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>  					unsigned long flags);
>  #endif /* HAVE_ARCH_HUGETLB_UNMAPPED_AREA */
>  
> +/*
> + * huegtlb page specific state flags.  These flags are located in page.private
> + * of the hugetlb head page.  Functions created via the below macros should be
> + * used to manipulate these flags.
> + *
> + * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at
> + *	allocation time.  Cleared when page is fully instantiated.  Free
> + *	routine checks flag to restore a reservation on error paths.
> + */
> +enum hugetlb_page_flags {
> +	HPG_restore_reserve = 0,
> +	__NR_HPAGEFLAGS,
> +};
> +
> +/*
> + * Macros to create test, set and clear function definitions for
> + * hugetlb specific page flags.
> + */
> +#ifdef CONFIG_HUGETLB_PAGE
> +#define TESTHPAGEFLAG(uname, flname)				\
> +static inline int HPage##uname(struct page *page)		\
> +	{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
> +			BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
> +	return test_bit(HPG_##flname, &(page->private)); }
> +
> +#define SETHPAGEFLAG(uname, flname)				\
> +static inline void SetHPage##uname(struct page *page)		\
> +	{ set_bit(HPG_##flname, &(page->private)); }
> +
> +#define CLEARHPAGEFLAG(uname, flname)				\
> +static inline void ClearHPage##uname(struct page *page)		\
> +	{ clear_bit(HPG_##flname, &(page->private)); }
> +#else
> +#define TESTHPAGEFLAG(uname, flname)				\
> +static inline int HPage##uname(struct page *page)		\
> +	{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
> +			BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
> +	return 0 }
> +
> +#define SETHPAGEFLAG(uname, flname)				\
> +static inline void SetHPage##uname(struct page *page)		\
> +	{ }
> +
> +#define CLEARHPAGEFLAG(uname, flname)				\
> +static inline void ClearHPage##uname(struct page *page)		\
> +	{ }
> +#endif
> +
> +#define HPAGEFLAG(uname, flname)				\
> +	TESTHPAGEFLAG(uname, flname)				\
> +	SETHPAGEFLAG(uname, flname)				\
> +	CLEARHPAGEFLAG(uname, flname)				\
> +
> +/*
> + * Create functions associated with hugetlb page flags
> + */
> +HPAGEFLAG(RestoreReserve, restore_reserve)
> +
>  #ifdef CONFIG_HUGETLB_PAGE
>  
>  #define HSTATE_NAME_LEN 32
> @@ -531,6 +589,20 @@ extern unsigned int default_hstate_idx;
>  
>  #define default_hstate (hstates[default_hstate_idx])
>  
> +/*
> + * hugetlb page subpool pointer located in hpage[1].private
> + */
> +static inline struct hugepage_subpool *hugetlb_page_subpool(struct page *hpage)
> +{
> +	return (struct hugepage_subpool *)(hpage+1)->private;
> +}
> +
> +static inline void hugetlb_set_page_subpool(struct page *hpage,
> +					struct hugepage_subpool *subpool)
> +{
> +	set_page_private(hpage+1, (unsigned long)subpool);
> +}
> +
>  static inline struct hstate *hstate_file(struct file *f)
>  {
>  	return hstate_inode(file_inode(f));
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 737b2dce19e6..8bed6b5202d2 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1133,7 +1133,7 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
>  	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
>  	page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
>  	if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
> -		SetPagePrivate(page);
> +		SetHPageRestoreReserve(page);
>  		h->resv_huge_pages--;
>  	}
>  
> @@ -1407,20 +1407,19 @@ static void __free_huge_page(struct page *page)
>  	 */
>  	struct hstate *h = page_hstate(page);
>  	int nid = page_to_nid(page);
> -	struct hugepage_subpool *spool =
> -		(struct hugepage_subpool *)page_private(page);
> +	struct hugepage_subpool *spool = hugetlb_page_subpool(page);
>  	bool restore_reserve;
>  
>  	VM_BUG_ON_PAGE(page_count(page), page);
>  	VM_BUG_ON_PAGE(page_mapcount(page), page);
>  
> -	set_page_private(page, 0);
> +	hugetlb_set_page_subpool(page, NULL);
>  	page->mapping = NULL;
> -	restore_reserve = PagePrivate(page);
> -	ClearPagePrivate(page);
> +	restore_reserve = HPageRestoreReserve(page);
> +	ClearHPageRestoreReserve(page);
>  
>  	/*
> -	 * If PagePrivate() was set on page, page allocation consumed a
> +	 * If HPageRestoreReserve was set on page, page allocation consumed a
>  	 * reservation.  If the page was associated with a subpool, there
>  	 * would have been a page reserved in the subpool before allocation
>  	 * via hugepage_subpool_get_pages().  Since we are 'restoring' the
> @@ -2254,24 +2253,24 @@ static long vma_add_reservation(struct hstate *h,
>   * This routine is called to restore a reservation on error paths.  In the
>   * specific error paths, a huge page was allocated (via alloc_huge_page)
>   * and is about to be freed.  If a reservation for the page existed,
> - * alloc_huge_page would have consumed the reservation and set PagePrivate
> - * in the newly allocated page.  When the page is freed via free_huge_page,
> - * the global reservation count will be incremented if PagePrivate is set.
> - * However, free_huge_page can not adjust the reserve map.  Adjust the
> - * reserve map here to be consistent with global reserve count adjustments
> - * to be made by free_huge_page.
> + * alloc_huge_page would have consumed the reservation and set
> + * HPageRestoreReserve in the newly allocated page.  When the page is freed
> + * via free_huge_page, the global reservation count will be incremented if
> + * HPageRestoreReserve is set.  However, free_huge_page can not adjust the
> + * reserve map.  Adjust the reserve map here to be consistent with global
> + * reserve count adjustments to be made by free_huge_page.
>   */
>  static void restore_reserve_on_error(struct hstate *h,
>  			struct vm_area_struct *vma, unsigned long address,
>  			struct page *page)
>  {
> -	if (unlikely(PagePrivate(page))) {
> +	if (unlikely(HPageRestoreReserve(page))) {
>  		long rc = vma_needs_reservation(h, vma, address);
>  
>  		if (unlikely(rc < 0)) {
>  			/*
>  			 * Rare out of memory condition in reserve map
> -			 * manipulation.  Clear PagePrivate so that
> +			 * manipulation.  Clear HPageRestoreReserve so that
>  			 * global reserve count will not be incremented
>  			 * by free_huge_page.  This will make it appear
>  			 * as though the reservation for this page was
> @@ -2280,7 +2279,7 @@ static void restore_reserve_on_error(struct hstate *h,
>  			 * is better than inconsistent global huge page
>  			 * accounting of reserve counts.
>  			 */
> -			ClearPagePrivate(page);
> +			ClearHPageRestoreReserve(page);
>  		} else if (rc) {
>  			rc = vma_add_reservation(h, vma, address);
>  			if (unlikely(rc < 0))
> @@ -2288,7 +2287,7 @@ static void restore_reserve_on_error(struct hstate *h,
>  				 * See above comment about rare out of
>  				 * memory condition.
>  				 */
> -				ClearPagePrivate(page);
> +				ClearHPageRestoreReserve(page);
>  		} else
>  			vma_end_reservation(h, vma, address);
>  	}
> @@ -2369,7 +2368,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
>  		if (!page)
>  			goto out_uncharge_cgroup;
>  		if (!avoid_reserve && vma_has_reserves(vma, gbl_chg)) {
> -			SetPagePrivate(page);
> +			SetHPageRestoreReserve(page);
>  			h->resv_huge_pages--;
>  		}
>  		spin_lock(&hugetlb_lock);
> @@ -2387,7 +2386,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
>  
>  	spin_unlock(&hugetlb_lock);
>  
> -	set_page_private(page, (unsigned long)spool);
> +	hugetlb_set_page_subpool(page, spool);
>  
>  	map_commit = vma_commit_reservation(h, vma, addr);
>  	if (unlikely(map_chg > map_commit)) {
> @@ -4212,7 +4211,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
>  	spin_lock(ptl);
>  	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
>  	if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) {
> -		ClearPagePrivate(new_page);
> +		ClearHPageRestoreReserve(new_page);
>  
>  		/* Break COW */
>  		huge_ptep_clear_flush(vma, haddr, ptep);
> @@ -4279,7 +4278,7 @@ int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
>  
>  	if (err)
>  		return err;
> -	ClearPagePrivate(page);
> +	ClearHPageRestoreReserve(page);
>  
>  	/*
>  	 * set page dirty so that it will not be removed from cache/file
> @@ -4441,7 +4440,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
>  		goto backout;
>  
>  	if (anon_rmap) {
> -		ClearPagePrivate(page);
> +		ClearHPageRestoreReserve(page);
>  		hugepage_add_new_anon_rmap(page, vma, haddr);
>  	} else
>  		page_dup_rmap(page, true);
> @@ -4755,7 +4754,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  	if (vm_shared) {
>  		page_dup_rmap(page, true);
>  	} else {
> -		ClearPagePrivate(page);
> +		ClearHPageRestoreReserve(page);
>  		hugepage_add_new_anon_rmap(page, dst_vma, dst_addr);
>  	}
>  
> 

Looks good to me.Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags
  2021-01-20  1:30 ` [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags Mike Kravetz
  2021-01-20  8:10   ` Miaohe Lin
@ 2021-01-20  8:43   ` kernel test robot
  2021-01-20  9:30   ` Oscar Salvador
  2 siblings, 0 replies; 24+ messages in thread
From: kernel test robot @ 2021-01-20  8:43 UTC (permalink / raw)
  To: Mike Kravetz, linux-kernel, linux-mm
  Cc: kbuild-all, Michal Hocko, Naoya Horiguchi, Muchun Song,
	David Hildenbrand, Oscar Salvador, Matthew Wilcox, Andrew Morton,
	Linux Memory Management List

[-- Attachment #1: Type: text/plain, Size: 11496 bytes --]

Hi Mike,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20210119]
[also build test ERROR on v5.11-rc4]
[cannot apply to linux/master linus/master hnaz-linux-mm/master v5.11-rc4 v5.11-rc3 v5.11-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mike-Kravetz/create-hugetlb-flags-to-consolidate-state/20210120-142142
base:    b4bb878f3eb3e604ebfe83bbc17eb7af8d99cbf4
config: nds32-defconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/ebeb3dd25715894428a5107720212b022616c02f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mike-Kravetz/create-hugetlb-flags-to-consolidate-state/20210120-142142
        git checkout ebeb3dd25715894428a5107720212b022616c02f
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from kernel/fork.c:51:
   include/linux/hugetlb.h: In function 'HPageRestoreReserve':
>> include/linux/hugetlb.h:512:11: error: expected ';' before '}' token
     512 |  return 0 }
         |           ^
   include/linux/hugetlb.h:524:2: note: in expansion of macro 'TESTHPAGEFLAG'
     524 |  TESTHPAGEFLAG(uname, flname)    \
         |  ^~~~~~~~~~~~~
   include/linux/hugetlb.h:531:1: note: in expansion of macro 'HPAGEFLAG'
     531 | HPAGEFLAG(RestoreReserve, restore_reserve)
         | ^~~~~~~~~
   kernel/fork.c: At top level:
   kernel/fork.c:161:13: warning: no previous prototype for 'arch_release_task_struct' [-Wmissing-prototypes]
     161 | void __weak arch_release_task_struct(struct task_struct *tsk)
         |             ^~~~~~~~~~~~~~~~~~~~~~~~
   kernel/fork.c:746:20: warning: no previous prototype for 'arch_task_cache_init' [-Wmissing-prototypes]
     746 | void __init __weak arch_task_cache_init(void) { }
         |                    ^~~~~~~~~~~~~~~~~~~~
   kernel/fork.c:836:12: warning: no previous prototype for 'arch_dup_task_struct' [-Wmissing-prototypes]
     836 | int __weak arch_dup_task_struct(struct task_struct *dst,
         |            ^~~~~~~~~~~~~~~~~~~~
--
   In file included from kernel/sysctl.c:45:
   include/linux/hugetlb.h: In function 'HPageRestoreReserve':
>> include/linux/hugetlb.h:512:11: error: expected ';' before '}' token
     512 |  return 0 }
         |           ^
   include/linux/hugetlb.h:524:2: note: in expansion of macro 'TESTHPAGEFLAG'
     524 |  TESTHPAGEFLAG(uname, flname)    \
         |  ^~~~~~~~~~~~~
   include/linux/hugetlb.h:531:1: note: in expansion of macro 'HPAGEFLAG'
     531 | HPAGEFLAG(RestoreReserve, restore_reserve)
         | ^~~~~~~~~
--
   In file included from include/linux/migrate.h:8,
                    from kernel/sched/sched.h:53,
                    from kernel/sched/core.c:13:
   include/linux/hugetlb.h: In function 'HPageRestoreReserve':
>> include/linux/hugetlb.h:512:11: error: expected ';' before '}' token
     512 |  return 0 }
         |           ^
   include/linux/hugetlb.h:524:2: note: in expansion of macro 'TESTHPAGEFLAG'
     524 |  TESTHPAGEFLAG(uname, flname)    \
         |  ^~~~~~~~~~~~~
   include/linux/hugetlb.h:531:1: note: in expansion of macro 'HPAGEFLAG'
     531 | HPAGEFLAG(RestoreReserve, restore_reserve)
         | ^~~~~~~~~
   kernel/sched/core.c: In function 'ttwu_stat':
   kernel/sched/core.c:2955:13: warning: variable 'rq' set but not used [-Wunused-but-set-variable]
    2955 |  struct rq *rq;
         |             ^~
   kernel/sched/core.c: In function 'schedule_tail':
   kernel/sched/core.c:4298:13: warning: variable 'rq' set but not used [-Wunused-but-set-variable]
    4298 |  struct rq *rq;
         |             ^~
--
   In file included from include/linux/migrate.h:8,
                    from kernel/sched/sched.h:53,
                    from kernel/sched/fair.c:23:
   include/linux/hugetlb.h: In function 'HPageRestoreReserve':
>> include/linux/hugetlb.h:512:11: error: expected ';' before '}' token
     512 |  return 0 }
         |           ^
   include/linux/hugetlb.h:524:2: note: in expansion of macro 'TESTHPAGEFLAG'
     524 |  TESTHPAGEFLAG(uname, flname)    \
         |  ^~~~~~~~~~~~~
   include/linux/hugetlb.h:531:1: note: in expansion of macro 'HPAGEFLAG'
     531 | HPAGEFLAG(RestoreReserve, restore_reserve)
         | ^~~~~~~~~
   kernel/sched/fair.c: At top level:
   kernel/sched/fair.c:5399:6: warning: no previous prototype for 'init_cfs_bandwidth' [-Wmissing-prototypes]
    5399 | void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
         |      ^~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11218:6: warning: no previous prototype for 'free_fair_sched_group' [-Wmissing-prototypes]
   11218 | void free_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11220:5: warning: no previous prototype for 'alloc_fair_sched_group' [-Wmissing-prototypes]
   11220 | int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
         |     ^~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11225:6: warning: no previous prototype for 'online_fair_sched_group' [-Wmissing-prototypes]
   11225 | void online_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11227:6: warning: no previous prototype for 'unregister_fair_sched_group' [-Wmissing-prototypes]
   11227 | void unregister_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~
--
   In file included from include/linux/migrate.h:8,
                    from kernel/sched/sched.h:53,
                    from kernel/sched/rt.c:6:
   include/linux/hugetlb.h: In function 'HPageRestoreReserve':
>> include/linux/hugetlb.h:512:11: error: expected ';' before '}' token
     512 |  return 0 }
         |           ^
   include/linux/hugetlb.h:524:2: note: in expansion of macro 'TESTHPAGEFLAG'
     524 |  TESTHPAGEFLAG(uname, flname)    \
         |  ^~~~~~~~~~~~~
   include/linux/hugetlb.h:531:1: note: in expansion of macro 'HPAGEFLAG'
     531 | HPAGEFLAG(RestoreReserve, restore_reserve)
         | ^~~~~~~~~
   kernel/sched/rt.c: At top level:
   kernel/sched/rt.c:253:6: warning: no previous prototype for 'free_rt_sched_group' [-Wmissing-prototypes]
     253 | void free_rt_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~
   kernel/sched/rt.c:255:5: warning: no previous prototype for 'alloc_rt_sched_group' [-Wmissing-prototypes]
     255 | int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
         |     ^~~~~~~~~~~~~~~~~~~~
   kernel/sched/rt.c:669:6: warning: no previous prototype for 'sched_rt_bandwidth_account' [-Wmissing-prototypes]
     669 | bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
--
   In file included from include/linux/migrate.h:8,
                    from mm/page_alloc.c:61:
   include/linux/hugetlb.h: In function 'HPageRestoreReserve':
>> include/linux/hugetlb.h:512:11: error: expected ';' before '}' token
     512 |  return 0 }
         |           ^
   include/linux/hugetlb.h:524:2: note: in expansion of macro 'TESTHPAGEFLAG'
     524 |  TESTHPAGEFLAG(uname, flname)    \
         |  ^~~~~~~~~~~~~
   include/linux/hugetlb.h:531:1: note: in expansion of macro 'HPAGEFLAG'
     531 | HPAGEFLAG(RestoreReserve, restore_reserve)
         | ^~~~~~~~~
   mm/page_alloc.c: At top level:
   mm/page_alloc.c:2618:5: warning: no previous prototype for 'find_suitable_fallback' [-Wmissing-prototypes]
    2618 | int find_suitable_fallback(struct free_area *area, unsigned int order,
         |     ^~~~~~~~~~~~~~~~~~~~~~
   mm/page_alloc.c:3597:15: warning: no previous prototype for 'should_fail_alloc_page' [-Wmissing-prototypes]
    3597 | noinline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
         |               ^~~~~~~~~~~~~~~~~~~~~~
   mm/page_alloc.c:6257:23: warning: no previous prototype for 'memmap_init' [-Wmissing-prototypes]
    6257 | void __meminit __weak memmap_init(unsigned long size, int nid,
         |                       ^~~~~~~~~~~
--
   In file included from fs/proc/meminfo.c:6:
   include/linux/hugetlb.h: In function 'HPageRestoreReserve':
>> include/linux/hugetlb.h:512:11: error: expected ';' before '}' token
     512 |  return 0 }
         |           ^
   include/linux/hugetlb.h:524:2: note: in expansion of macro 'TESTHPAGEFLAG'
     524 |  TESTHPAGEFLAG(uname, flname)    \
         |  ^~~~~~~~~~~~~
   include/linux/hugetlb.h:531:1: note: in expansion of macro 'HPAGEFLAG'
     531 | HPAGEFLAG(RestoreReserve, restore_reserve)
         | ^~~~~~~~~
   fs/proc/meminfo.c: At top level:
   fs/proc/meminfo.c:22:28: warning: no previous prototype for 'arch_report_meminfo' [-Wmissing-prototypes]
      22 | void __attribute__((weak)) arch_report_meminfo(struct seq_file *m)
         |                            ^~~~~~~~~~~~~~~~~~~
--
   In file included from kernel/events/core.c:31:
   include/linux/hugetlb.h: In function 'HPageRestoreReserve':
>> include/linux/hugetlb.h:512:11: error: expected ';' before '}' token
     512 |  return 0 }
         |           ^
   include/linux/hugetlb.h:524:2: note: in expansion of macro 'TESTHPAGEFLAG'
     524 |  TESTHPAGEFLAG(uname, flname)    \
         |  ^~~~~~~~~~~~~
   include/linux/hugetlb.h:531:1: note: in expansion of macro 'HPAGEFLAG'
     531 | HPAGEFLAG(RestoreReserve, restore_reserve)
         | ^~~~~~~~~
   kernel/events/core.c: At top level:
   kernel/events/core.c:6539:6: warning: no previous prototype for 'perf_pmu_snapshot_aux' [-Wmissing-prototypes]
    6539 | long perf_pmu_snapshot_aux(struct perf_buffer *rb,
         |      ^~~~~~~~~~~~~~~~~~~~~


vim +512 include/linux/hugetlb.h

   488	
   489	/*
   490	 * Macros to create test, set and clear function definitions for
   491	 * hugetlb specific page flags.
   492	 */
   493	#ifdef CONFIG_HUGETLB_PAGE
   494	#define TESTHPAGEFLAG(uname, flname)				\
   495	static inline int HPage##uname(struct page *page)		\
   496		{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
   497				BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
   498		return test_bit(HPG_##flname, &(page->private)); }
   499	
   500	#define SETHPAGEFLAG(uname, flname)				\
   501	static inline void SetHPage##uname(struct page *page)		\
   502		{ set_bit(HPG_##flname, &(page->private)); }
   503	
   504	#define CLEARHPAGEFLAG(uname, flname)				\
   505	static inline void ClearHPage##uname(struct page *page)		\
   506		{ clear_bit(HPG_##flname, &(page->private)); }
   507	#else
   508	#define TESTHPAGEFLAG(uname, flname)				\
   509	static inline int HPage##uname(struct page *page)		\
   510		{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
   511				BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
 > 512		return 0 }
   513	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 10838 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags
  2021-01-20  1:30 ` [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags Mike Kravetz
  2021-01-20  8:10   ` Miaohe Lin
  2021-01-20  8:43   ` kernel test robot
@ 2021-01-20  9:30   ` Oscar Salvador
  2021-01-20 18:12     ` Mike Kravetz
  2 siblings, 1 reply; 24+ messages in thread
From: Oscar Salvador @ 2021-01-20  9:30 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On Tue, Jan 19, 2021 at 05:30:45PM -0800, Mike Kravetz wrote:
> + * Macros to create test, set and clear function definitions for
> + * hugetlb specific page flags.
> + */
> +#ifdef CONFIG_HUGETLB_PAGE
> +#define TESTHPAGEFLAG(uname, flname)				\
> +static inline int HPage##uname(struct page *page)		\
> +	{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
> +			BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
> +	return test_bit(HPG_##flname, &(page->private)); }
> +
> +#define SETHPAGEFLAG(uname, flname)				\
> +static inline void SetHPage##uname(struct page *page)		\
> +	{ set_bit(HPG_##flname, &(page->private)); }
> +
> +#define CLEARHPAGEFLAG(uname, flname)				\
> +static inline void ClearHPage##uname(struct page *page)		\
> +	{ clear_bit(HPG_##flname, &(page->private)); }
> +#else
> +#define TESTHPAGEFLAG(uname, flname)				\
> +static inline int HPage##uname(struct page *page)		\
> +	{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
> +			BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
> +	return 0 }

You missed a ";" right there.

I might be missing something, but I do not think we need a BUILD_BUG_ON there
when CONFIG_HUGETLB_PAGE is not set?
Actually, would make more sense to move the BUILD_BUG_ON from above to
hugetlb_init?

Other than that, looks good to me, and I think it is a great improvment
towards readability and maintability.

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20  1:30 ` [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag Mike Kravetz
@ 2021-01-20  9:59   ` Oscar Salvador
  2021-01-20 10:00     ` Oscar Salvador
  2021-01-21  8:27   ` Oscar Salvador
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 24+ messages in thread
From: Oscar Salvador @ 2021-01-20  9:59 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On Tue, Jan 19, 2021 at 05:30:46PM -0800, Mike Kravetz wrote:
> Use the new hugetlb page specific flag HPageMigratable to replace the
> page_huge_active interfaces.  By it's name, page_huge_active implied
> that a huge page was on the active list.  However, that is not really
> what code checking the flag wanted to know.  It really wanted to determine
> if the huge page could be migrated.  This happens when the page is actually
> added the page cache and/or task page table.  This is the reasoning behind
> the name change.
> 
> The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
> really necessary as we KNOW the page is a hugetlb page.  Therefore, they
> are removed.
> 
> The routine page_huge_active checked for PageHeadHuge before testing the
> active bit.  This is unnecessary in the case where we hold a reference or
> lock and know it is a hugetlb head page.  page_huge_active is also called
> without holding a reference or lock (scan_movable_pages), and can race with
> code freeing the page.  The extra check in page_huge_active shortened the
> race window, but did not prevent the race.  Offline code calling
> scan_movable_pages already deals with these races, so removing the check
> is acceptable.  Add comment to racy code.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Hi Mike,

This comment addresses both this patch and the next one.

Instead of putting the SetHPageMigratable flag spread over the
allocation paths, would it make more sense to place it in
alloc_huge_page before returning the page?
Then we could opencode SetHPageMigratableIfSupported right there.

I might be missing something and this might not be possible, but if it
is, it would look cleaner and more logical to me.


-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20  9:59   ` Oscar Salvador
@ 2021-01-20 10:00     ` Oscar Salvador
  2021-01-20 21:48       ` Mike Kravetz
  0 siblings, 1 reply; 24+ messages in thread
From: Oscar Salvador @ 2021-01-20 10:00 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On Wed, Jan 20, 2021 at 10:59:05AM +0100, Oscar Salvador wrote:
> On Tue, Jan 19, 2021 at 05:30:46PM -0800, Mike Kravetz wrote:
> > Use the new hugetlb page specific flag HPageMigratable to replace the
> > page_huge_active interfaces.  By it's name, page_huge_active implied
> > that a huge page was on the active list.  However, that is not really
> > what code checking the flag wanted to know.  It really wanted to determine
> > if the huge page could be migrated.  This happens when the page is actually
> > added the page cache and/or task page table.  This is the reasoning behind
> > the name change.
> > 
> > The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
> > really necessary as we KNOW the page is a hugetlb page.  Therefore, they
> > are removed.
> > 
> > The routine page_huge_active checked for PageHeadHuge before testing the
> > active bit.  This is unnecessary in the case where we hold a reference or
> > lock and know it is a hugetlb head page.  page_huge_active is also called
> > without holding a reference or lock (scan_movable_pages), and can race with
> > code freeing the page.  The extra check in page_huge_active shortened the
> > race window, but did not prevent the race.  Offline code calling
> > scan_movable_pages already deals with these races, so removing the check
> > is acceptable.  Add comment to racy code.
> > 
> > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> Hi Mike,
> 
> This comment addresses both this patch and the next one.
> 
> Instead of putting the SetHPageMigratable flag spread over the
> allocation paths, would it make more sense to place it in
> alloc_huge_page before returning the page?
> Then we could opencode SetHPageMigratableIfSupported right there.

and in putback_active_hugepage.

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag
  2021-01-20  1:30 ` [PATCH v2 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag Mike Kravetz
@ 2021-01-20 10:09   ` Oscar Salvador
  2021-01-20 18:14     ` Mike Kravetz
  0 siblings, 1 reply; 24+ messages in thread
From: Oscar Salvador @ 2021-01-20 10:09 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On Tue, Jan 19, 2021 at 05:30:48PM -0800, Mike Kravetz wrote:
> Use new hugetlb specific HPageTemporary flag to replace the
> PageHugeTemporary() interfaces.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

I would have added a brief comment explaining why it is ok to drop
the PageHuge() check in PageHugeTemporary.
AFAICS, the paths checking it already know they are handling with a 
hugetlb page, but still it is better to mention it in the changelog
in case someone wonders.

Other than that looks good to me:

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> ---
>  include/linux/hugetlb.h |  6 ++++++
>  mm/hugetlb.c            | 36 +++++++-----------------------------
>  2 files changed, 13 insertions(+), 29 deletions(-)
> 
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 1e17529c8b81..ec329b9cc0fc 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -483,10 +483,15 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>   * HPG_migratable  - Set after a newly allocated page is added to the page
>   *	cache and/or page tables.  Indicates the page is a candidate for
>   *	migration.
> + * HPG_temporary - - Set on a page that is temporarily allocated from the buddy
> + *	allocator.  Typically used for migration target pages when no pages
> + *	are available in the pool.  The hugetlb free page path will
> + *	immediately free pages with this flag set to the buddy allocator.
>   */
>  enum hugetlb_page_flags {
>  	HPG_restore_reserve = 0,
>  	HPG_migratable,
> +	HPG_temporary,
>  	__NR_HPAGEFLAGS,
>  };
>  
> @@ -534,6 +539,7 @@ static inline void ClearHPage##uname(struct page *page)		\
>   */
>  HPAGEFLAG(RestoreReserve, restore_reserve)
>  HPAGEFLAG(Migratable, migratable)
> +HPAGEFLAG(Temporary, temporary)
>  
>  #ifdef CONFIG_HUGETLB_PAGE
>  
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6e32751489e8..0d2bfc2b6adc 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1353,28 +1353,6 @@ struct hstate *size_to_hstate(unsigned long size)
>  	return NULL;
>  }
>  
> -/*
> - * Internal hugetlb specific page flag. Do not use outside of the hugetlb
> - * code
> - */
> -static inline bool PageHugeTemporary(struct page *page)
> -{
> -	if (!PageHuge(page))
> -		return false;
> -
> -	return (unsigned long)page[2].mapping == -1U;
> -}
> -
> -static inline void SetPageHugeTemporary(struct page *page)
> -{
> -	page[2].mapping = (void *)-1U;
> -}
> -
> -static inline void ClearPageHugeTemporary(struct page *page)
> -{
> -	page[2].mapping = NULL;
> -}
> -
>  static void __free_huge_page(struct page *page)
>  {
>  	/*
> @@ -1422,9 +1400,9 @@ static void __free_huge_page(struct page *page)
>  	if (restore_reserve)
>  		h->resv_huge_pages++;
>  
> -	if (PageHugeTemporary(page)) {
> +	if (HPageTemporary(page)) {
>  		list_del(&page->lru);
> -		ClearPageHugeTemporary(page);
> +		ClearHPageTemporary(page);
>  		update_and_free_page(h, page);
>  	} else if (h->surplus_huge_pages_node[nid]) {
>  		/* remove the page from active list */
> @@ -1863,7 +1841,7 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
>  	 * codeflow
>  	 */
>  	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> -		SetPageHugeTemporary(page);
> +		SetHPageTemporary(page);
>  		spin_unlock(&hugetlb_lock);
>  		put_page(page);
>  		return NULL;
> @@ -1894,7 +1872,7 @@ static struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
>  	 * We do not account these pages as surplus because they are only
>  	 * temporary and will be released properly on the last reference
>  	 */
> -	SetPageHugeTemporary(page);
> +	SetHPageTemporary(page);
>  
>  	return page;
>  }
> @@ -5607,12 +5585,12 @@ void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
>  	 * here as well otherwise the global surplus count will not match
>  	 * the per-node's.
>  	 */
> -	if (PageHugeTemporary(newpage)) {
> +	if (HPageTemporary(newpage)) {
>  		int old_nid = page_to_nid(oldpage);
>  		int new_nid = page_to_nid(newpage);
>  
> -		SetPageHugeTemporary(oldpage);
> -		ClearPageHugeTemporary(newpage);
> +		SetHPageTemporary(oldpage);
> +		ClearHPageTemporary(newpage);
>  
>  		spin_lock(&hugetlb_lock);
>  		if (h->surplus_huge_pages_node[old_nid]) {
> -- 
> 2.29.2
> 

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag
  2021-01-20  1:30 ` [PATCH v2 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag Mike Kravetz
@ 2021-01-20 10:10   ` Oscar Salvador
  2021-01-20 10:44   ` [External] " Muchun Song
  1 sibling, 0 replies; 24+ messages in thread
From: Oscar Salvador @ 2021-01-20 10:10 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On Tue, Jan 19, 2021 at 05:30:49PM -0800, Mike Kravetz wrote:
> Use new hugetlb specific HPageFreed flag to replace the
> PageHugeFreed interfaces.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> ---
>  include/linux/hugetlb.h |  3 +++
>  mm/hugetlb.c            | 23 ++++-------------------
>  2 files changed, 7 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index ec329b9cc0fc..8fd0970cefdb 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -487,11 +487,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>   *	allocator.  Typically used for migration target pages when no pages
>   *	are available in the pool.  The hugetlb free page path will
>   *	immediately free pages with this flag set to the buddy allocator.
> + * HPG_freed - Set when page is on the free lists.
>   */
>  enum hugetlb_page_flags {
>  	HPG_restore_reserve = 0,
>  	HPG_migratable,
>  	HPG_temporary,
> +	HPG_freed,
>  	__NR_HPAGEFLAGS,
>  };
>  
> @@ -540,6 +542,7 @@ static inline void ClearHPage##uname(struct page *page)		\
>  HPAGEFLAG(RestoreReserve, restore_reserve)
>  HPAGEFLAG(Migratable, migratable)
>  HPAGEFLAG(Temporary, temporary)
> +HPAGEFLAG(Freed, freed)
>  
>  #ifdef CONFIG_HUGETLB_PAGE
>  
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 0d2bfc2b6adc..d5a78aedbfda 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -79,21 +79,6 @@ DEFINE_SPINLOCK(hugetlb_lock);
>  static int num_fault_mutexes;
>  struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
>  
> -static inline bool PageHugeFreed(struct page *head)
> -{
> -	return page_private(head + 4) == -1UL;
> -}
> -
> -static inline void SetPageHugeFreed(struct page *head)
> -{
> -	set_page_private(head + 4, -1UL);
> -}
> -
> -static inline void ClearPageHugeFreed(struct page *head)
> -{
> -	set_page_private(head + 4, 0);
> -}
> -
>  /* Forward declaration */
>  static int hugetlb_acct_memory(struct hstate *h, long delta);
>  
> @@ -1043,7 +1028,7 @@ static void enqueue_huge_page(struct hstate *h, struct page *page)
>  	list_move(&page->lru, &h->hugepage_freelists[nid]);
>  	h->free_huge_pages++;
>  	h->free_huge_pages_node[nid]++;
> -	SetPageHugeFreed(page);
> +	SetHPageFreed(page);
>  }
>  
>  static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
> @@ -1060,7 +1045,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
>  
>  		list_move(&page->lru, &h->hugepage_activelist);
>  		set_page_refcounted(page);
> -		ClearPageHugeFreed(page);
> +		ClearHPageFreed(page);
>  		h->free_huge_pages--;
>  		h->free_huge_pages_node[nid]--;
>  		return page;
> @@ -1474,7 +1459,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>  	spin_lock(&hugetlb_lock);
>  	h->nr_huge_pages++;
>  	h->nr_huge_pages_node[nid]++;
> -	ClearPageHugeFreed(page);
> +	ClearHPageFreed(page);
>  	spin_unlock(&hugetlb_lock);
>  }
>  
> @@ -1747,7 +1732,7 @@ int dissolve_free_huge_page(struct page *page)
>  		 * We should make sure that the page is already on the free list
>  		 * when it is dissolved.
>  		 */
> -		if (unlikely(!PageHugeFreed(head))) {
> +		if (unlikely(!HPageFreed(head))) {
>  			rc = -EAGAIN;
>  			goto out;
>  		}
> -- 
> 2.29.2
> 
> 

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [External] [PATCH v2 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag
  2021-01-20  1:30 ` [PATCH v2 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag Mike Kravetz
  2021-01-20 10:10   ` Oscar Salvador
@ 2021-01-20 10:44   ` Muchun Song
  1 sibling, 0 replies; 24+ messages in thread
From: Muchun Song @ 2021-01-20 10:44 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: LKML, Linux Memory Management List, Michal Hocko,
	Naoya Horiguchi, David Hildenbrand, Oscar Salvador,
	Matthew Wilcox, Andrew Morton

On Wed, Jan 20, 2021 at 9:31 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> Use new hugetlb specific HPageFreed flag to replace the
> PageHugeFreed interfaces.
>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

> ---
>  include/linux/hugetlb.h |  3 +++
>  mm/hugetlb.c            | 23 ++++-------------------
>  2 files changed, 7 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index ec329b9cc0fc..8fd0970cefdb 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -487,11 +487,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>   *     allocator.  Typically used for migration target pages when no pages
>   *     are available in the pool.  The hugetlb free page path will
>   *     immediately free pages with this flag set to the buddy allocator.
> + * HPG_freed - Set when page is on the free lists.
>   */
>  enum hugetlb_page_flags {
>         HPG_restore_reserve = 0,
>         HPG_migratable,
>         HPG_temporary,
> +       HPG_freed,
>         __NR_HPAGEFLAGS,
>  };
>
> @@ -540,6 +542,7 @@ static inline void ClearHPage##uname(struct page *page)             \
>  HPAGEFLAG(RestoreReserve, restore_reserve)
>  HPAGEFLAG(Migratable, migratable)
>  HPAGEFLAG(Temporary, temporary)
> +HPAGEFLAG(Freed, freed)
>
>  #ifdef CONFIG_HUGETLB_PAGE
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 0d2bfc2b6adc..d5a78aedbfda 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -79,21 +79,6 @@ DEFINE_SPINLOCK(hugetlb_lock);
>  static int num_fault_mutexes;
>  struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
>
> -static inline bool PageHugeFreed(struct page *head)
> -{
> -       return page_private(head + 4) == -1UL;
> -}
> -
> -static inline void SetPageHugeFreed(struct page *head)
> -{
> -       set_page_private(head + 4, -1UL);
> -}
> -
> -static inline void ClearPageHugeFreed(struct page *head)
> -{
> -       set_page_private(head + 4, 0);
> -}
> -
>  /* Forward declaration */
>  static int hugetlb_acct_memory(struct hstate *h, long delta);
>
> @@ -1043,7 +1028,7 @@ static void enqueue_huge_page(struct hstate *h, struct page *page)
>         list_move(&page->lru, &h->hugepage_freelists[nid]);
>         h->free_huge_pages++;
>         h->free_huge_pages_node[nid]++;
> -       SetPageHugeFreed(page);
> +       SetHPageFreed(page);
>  }
>
>  static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
> @@ -1060,7 +1045,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
>
>                 list_move(&page->lru, &h->hugepage_activelist);
>                 set_page_refcounted(page);
> -               ClearPageHugeFreed(page);
> +               ClearHPageFreed(page);
>                 h->free_huge_pages--;
>                 h->free_huge_pages_node[nid]--;
>                 return page;
> @@ -1474,7 +1459,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>         spin_lock(&hugetlb_lock);
>         h->nr_huge_pages++;
>         h->nr_huge_pages_node[nid]++;
> -       ClearPageHugeFreed(page);
> +       ClearHPageFreed(page);
>         spin_unlock(&hugetlb_lock);
>  }
>
> @@ -1747,7 +1732,7 @@ int dissolve_free_huge_page(struct page *page)
>                  * We should make sure that the page is already on the free list
>                  * when it is dissolved.
>                  */
> -               if (unlikely(!PageHugeFreed(head))) {
> +               if (unlikely(!HPageFreed(head))) {
>                         rc = -EAGAIN;
>                         goto out;
>                 }
> --
> 2.29.2
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags
  2021-01-20  9:30   ` Oscar Salvador
@ 2021-01-20 18:12     ` Mike Kravetz
  0 siblings, 0 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20 18:12 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On 1/20/21 1:30 AM, Oscar Salvador wrote:
> On Tue, Jan 19, 2021 at 05:30:45PM -0800, Mike Kravetz wrote:
>> + * Macros to create test, set and clear function definitions for
>> + * hugetlb specific page flags.
>> + */
>> +#ifdef CONFIG_HUGETLB_PAGE
>> +#define TESTHPAGEFLAG(uname, flname)				\
>> +static inline int HPage##uname(struct page *page)		\
>> +	{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
>> +			BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
>> +	return test_bit(HPG_##flname, &(page->private)); }
>> +
>> +#define SETHPAGEFLAG(uname, flname)				\
>> +static inline void SetHPage##uname(struct page *page)		\
>> +	{ set_bit(HPG_##flname, &(page->private)); }
>> +
>> +#define CLEARHPAGEFLAG(uname, flname)				\
>> +static inline void ClearHPage##uname(struct page *page)		\
>> +	{ clear_bit(HPG_##flname, &(page->private)); }
>> +#else
>> +#define TESTHPAGEFLAG(uname, flname)				\
>> +static inline int HPage##uname(struct page *page)		\
>> +	{ BUILD_BUG_ON(sizeof_field(struct page, private) *	\
>> +			BITS_PER_BYTE < __NR_HPAGEFLAGS);	\
>> +	return 0 }
> 
> You missed a ";" right there.

Thanks.  I made that typo when fixing up some trivial checkpatch warnings.
Lesson learned (again), nothing is too trivial not to introduce erros.

> I might be missing something, but I do not think we need a BUILD_BUG_ON there
> when CONFIG_HUGETLB_PAGE is not set?
> Actually, would make more sense to move the BUILD_BUG_ON from above to
> hugetlb_init?

Yes, hugetlb_init is a better location for the BUILD_BUG_ON.  I initially
put it there before creating the !CONFIG_HUGETLB_PAGE version of the macros.
With the !CONFIG_HUGETLB_PAGE versions, none of the flags are ever used so
it is OK if the BUILD_BUG_ON is not included if !CONFIG_HUGETLB_PAGE.

-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag
  2021-01-20 10:09   ` Oscar Salvador
@ 2021-01-20 18:14     ` Mike Kravetz
  0 siblings, 0 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20 18:14 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On 1/20/21 2:09 AM, Oscar Salvador wrote:
> On Tue, Jan 19, 2021 at 05:30:48PM -0800, Mike Kravetz wrote:
>> Use new hugetlb specific HPageTemporary flag to replace the
>> PageHugeTemporary() interfaces.
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> I would have added a brief comment explaining why it is ok to drop
> the PageHuge() check in PageHugeTemporary.
> AFAICS, the paths checking it already know they are handling with a 
> hugetlb page, but still it is better to mention it in the changelog
> in case someone wonders.

Thanks.

Since I have to do another version, I will add this.

-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20 10:00     ` Oscar Salvador
@ 2021-01-20 21:48       ` Mike Kravetz
  2021-01-21  8:18         ` Oscar Salvador
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Kravetz @ 2021-01-20 21:48 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On 1/20/21 2:00 AM, Oscar Salvador wrote:
> On Wed, Jan 20, 2021 at 10:59:05AM +0100, Oscar Salvador wrote:
>> On Tue, Jan 19, 2021 at 05:30:46PM -0800, Mike Kravetz wrote:
>>> Use the new hugetlb page specific flag HPageMigratable to replace the
>>> page_huge_active interfaces.  By it's name, page_huge_active implied
>>> that a huge page was on the active list.  However, that is not really
>>> what code checking the flag wanted to know.  It really wanted to determine
>>> if the huge page could be migrated.  This happens when the page is actually
>>> added the page cache and/or task page table.  This is the reasoning behind
>>> the name change.
>>>
>>> The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
>>> really necessary as we KNOW the page is a hugetlb page.  Therefore, they
>>> are removed.
>>>
>>> The routine page_huge_active checked for PageHeadHuge before testing the
>>> active bit.  This is unnecessary in the case where we hold a reference or
>>> lock and know it is a hugetlb head page.  page_huge_active is also called
>>> without holding a reference or lock (scan_movable_pages), and can race with
>>> code freeing the page.  The extra check in page_huge_active shortened the
>>> race window, but did not prevent the race.  Offline code calling
>>> scan_movable_pages already deals with these races, so removing the check
>>> is acceptable.  Add comment to racy code.
>>>
>>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>>
>> Hi Mike,
>>
>> This comment addresses both this patch and the next one.
>>
>> Instead of putting the SetHPageMigratable flag spread over the
>> allocation paths, would it make more sense to place it in
>> alloc_huge_page before returning the page?
>> Then we could opencode SetHPageMigratableIfSupported right there.
> 
> and in putback_active_hugepage.


Hi Oscar,

In Muchun's series of hugetlb bug fixes, Michal asked the same question.

https://lore.kernel.org/linux-mm/7e69a55c-d501-6b42-8225-a677f09fb829@oracle.com/

The 'short answer' is that the this would allow a page to be migrated
after allocation but before the page fault code adds it to the page
cache or page tables.  This actually caused bugs in the past.
-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20 21:48       ` Mike Kravetz
@ 2021-01-21  8:18         ` Oscar Salvador
  0 siblings, 0 replies; 24+ messages in thread
From: Oscar Salvador @ 2021-01-21  8:18 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On Wed, Jan 20, 2021 at 01:48:05PM -0800, Mike Kravetz wrote:
> >> This comment addresses both this patch and the next one.
> >>
> >> Instead of putting the SetHPageMigratable flag spread over the
> >> allocation paths, would it make more sense to place it in
> >> alloc_huge_page before returning the page?
> >> Then we could opencode SetHPageMigratableIfSupported right there.
> > 
> > and in putback_active_hugepage.
> 
> 
> Hi Oscar,
> 
> In Muchun's series of hugetlb bug fixes, Michal asked the same question.
> 
> https://lore.kernel.org/linux-mm/7e69a55c-d501-6b42-8225-a677f09fb829@oracle.com/
> 
> The 'short answer' is that the this would allow a page to be migrated
> after allocation but before the page fault code adds it to the page
> cache or page tables.  This actually caused bugs in the past.

Oh, I see. I jumped late into that patchset so I missed some early messages.
Thanks for explaining this again.

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20  1:30 ` [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag Mike Kravetz
  2021-01-20  9:59   ` Oscar Salvador
@ 2021-01-21  8:27   ` Oscar Salvador
  2021-01-21 12:01   ` [External] " Muchun Song
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Oscar Salvador @ 2021-01-21  8:27 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-kernel, linux-mm, Michal Hocko, Naoya Horiguchi,
	Muchun Song, David Hildenbrand, Matthew Wilcox, Andrew Morton

On Tue, Jan 19, 2021 at 05:30:46PM -0800, Mike Kravetz wrote:
> Use the new hugetlb page specific flag HPageMigratable to replace the
> page_huge_active interfaces.  By it's name, page_huge_active implied
> that a huge page was on the active list.  However, that is not really
> what code checking the flag wanted to know.  It really wanted to determine
> if the huge page could be migrated.  This happens when the page is actually
> added the page cache and/or task page table.  This is the reasoning behind
> the name change.
> 
> The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
> really necessary as we KNOW the page is a hugetlb page.  Therefore, they
> are removed.
> 
> The routine page_huge_active checked for PageHeadHuge before testing the
> active bit.  This is unnecessary in the case where we hold a reference or
> lock and know it is a hugetlb head page.  page_huge_active is also called
> without holding a reference or lock (scan_movable_pages), and can race with
> code freeing the page.  The extra check in page_huge_active shortened the
> race window, but did not prevent the race.  Offline code calling
> scan_movable_pages already deals with these races, so removing the check
> is acceptable.  Add comment to racy code.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> -/*
> - * Test to determine whether the hugepage is "active/in-use" (i.e. being linked
> - * to hstate->hugepage_activelist.)
> - *
> - * This function can be called for tail pages, but never returns true for them.
> - */
> -bool page_huge_active(struct page *page)
> -{
> -	return PageHeadHuge(page) && PagePrivate(&page[1]);

This made me think once again.
I wonder if we could ever see a scenario where page[0] is a rightful page while
page[1] is poisoned/unitialized (poison_pages()).
A lot of things would have to happen between the two checks, so I do not see it
possible and as you mentioned earlier, the race is already there.

Just wanted to speak up my mind. 

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [External] [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20  1:30 ` [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag Mike Kravetz
  2021-01-20  9:59   ` Oscar Salvador
  2021-01-21  8:27   ` Oscar Salvador
@ 2021-01-21 12:01   ` Muchun Song
  2021-01-22  6:53   ` Miaohe Lin
  2021-02-03  7:42   ` [External] " Muchun Song
  4 siblings, 0 replies; 24+ messages in thread
From: Muchun Song @ 2021-01-21 12:01 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: LKML, Linux Memory Management List, Michal Hocko,
	Naoya Horiguchi, David Hildenbrand, Oscar Salvador,
	Matthew Wilcox, Andrew Morton

On Wed, Jan 20, 2021 at 9:33 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> Use the new hugetlb page specific flag HPageMigratable to replace the
> page_huge_active interfaces.  By it's name, page_huge_active implied
> that a huge page was on the active list.  However, that is not really
> what code checking the flag wanted to know.  It really wanted to determine
> if the huge page could be migrated.  This happens when the page is actually
> added the page cache and/or task page table.  This is the reasoning behind
> the name change.
>
> The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
> really necessary as we KNOW the page is a hugetlb page.  Therefore, they
> are removed.
>
> The routine page_huge_active checked for PageHeadHuge before testing the
> active bit.  This is unnecessary in the case where we hold a reference or
> lock and know it is a hugetlb head page.  page_huge_active is also called
> without holding a reference or lock (scan_movable_pages), and can race with
> code freeing the page.  The extra check in page_huge_active shortened the
> race window, but did not prevent the race.  Offline code calling
> scan_movable_pages already deals with these races, so removing the check
> is acceptable.  Add comment to racy code.
>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

Thanks.

> ---
>  fs/hugetlbfs/inode.c       |  2 +-
>  include/linux/hugetlb.h    |  5 +++++
>  include/linux/page-flags.h |  6 -----
>  mm/hugetlb.c               | 45 ++++++++++----------------------------
>  mm/memory_hotplug.c        |  8 ++++++-
>  5 files changed, 24 insertions(+), 42 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index b8a661780c4a..ec9f03aa2738 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -735,7 +735,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
>
>                 mutex_unlock(&hugetlb_fault_mutex_table[hash]);
>
> -               set_page_huge_active(page);
> +               SetHPageMigratable(page);
>                 /*
>                  * unlock_page because locked by add_to_page_cache()
>                  * put_page() due to reference from alloc_huge_page()
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index be71a00ee2a0..ce3d03da0133 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -480,9 +480,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>   * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at
>   *     allocation time.  Cleared when page is fully instantiated.  Free
>   *     routine checks flag to restore a reservation on error paths.
> + * HPG_migratable  - Set after a newly allocated page is added to the page
> + *     cache and/or page tables.  Indicates the page is a candidate for
> + *     migration.
>   */
>  enum hugetlb_page_flags {
>         HPG_restore_reserve = 0,
> +       HPG_migratable,
>         __NR_HPAGEFLAGS,
>  };
>
> @@ -529,6 +533,7 @@ static inline void ClearHPage##uname(struct page *page)             \
>   * Create functions associated with hugetlb page flags
>   */
>  HPAGEFLAG(RestoreReserve, restore_reserve)
> +HPAGEFLAG(Migratable, migratable)
>
>  #ifdef CONFIG_HUGETLB_PAGE
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index bc6fd1ee7dd6..04a34c08e0a6 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -592,15 +592,9 @@ static inline void ClearPageCompound(struct page *page)
>  #ifdef CONFIG_HUGETLB_PAGE
>  int PageHuge(struct page *page);
>  int PageHeadHuge(struct page *page);
> -bool page_huge_active(struct page *page);
>  #else
>  TESTPAGEFLAG_FALSE(Huge)
>  TESTPAGEFLAG_FALSE(HeadHuge)
> -
> -static inline bool page_huge_active(struct page *page)
> -{
> -       return 0;
> -}
>  #endif
>
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 8bed6b5202d2..c24da40626d3 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1353,30 +1353,6 @@ struct hstate *size_to_hstate(unsigned long size)
>         return NULL;
>  }
>
> -/*
> - * Test to determine whether the hugepage is "active/in-use" (i.e. being linked
> - * to hstate->hugepage_activelist.)
> - *
> - * This function can be called for tail pages, but never returns true for them.
> - */
> -bool page_huge_active(struct page *page)
> -{
> -       return PageHeadHuge(page) && PagePrivate(&page[1]);
> -}
> -
> -/* never called for tail page */
> -void set_page_huge_active(struct page *page)
> -{
> -       VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
> -       SetPagePrivate(&page[1]);
> -}
> -
> -static void clear_page_huge_active(struct page *page)
> -{
> -       VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
> -       ClearPagePrivate(&page[1]);
> -}
> -
>  /*
>   * Internal hugetlb specific page flag. Do not use outside of the hugetlb
>   * code
> @@ -1438,7 +1414,7 @@ static void __free_huge_page(struct page *page)
>         }
>
>         spin_lock(&hugetlb_lock);
> -       clear_page_huge_active(page);
> +       ClearHPageMigratable(page);
>         hugetlb_cgroup_uncharge_page(hstate_index(h),
>                                      pages_per_huge_page(h), page);
>         hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
> @@ -4220,7 +4196,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
>                                 make_huge_pte(vma, new_page, 1));
>                 page_remove_rmap(old_page, true);
>                 hugepage_add_new_anon_rmap(new_page, vma, haddr);
> -               set_page_huge_active(new_page);
> +               SetHPageMigratable(new_page);
>                 /* Make the old page be freed below */
>                 new_page = old_page;
>         }
> @@ -4457,12 +4433,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
>         spin_unlock(ptl);
>
>         /*
> -        * Only make newly allocated pages active.  Existing pages found
> -        * in the pagecache could be !page_huge_active() if they have been
> -        * isolated for migration.
> +        * Only set HPageMigratable in newly allocated pages.  Existing pages
> +        * found in the pagecache may not have HPageMigratableset if they have
> +        * been isolated for migration.
>          */
>         if (new_page)
> -               set_page_huge_active(page);
> +               SetHPageMigratable(page);
>
>         unlock_page(page);
>  out:
> @@ -4773,7 +4749,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>         update_mmu_cache(dst_vma, dst_addr, dst_pte);
>
>         spin_unlock(ptl);
> -       set_page_huge_active(page);
> +       SetHPageMigratable(page);
>         if (vm_shared)
>                 unlock_page(page);
>         ret = 0;
> @@ -5591,12 +5567,13 @@ bool isolate_huge_page(struct page *page, struct list_head *list)
>         bool ret = true;
>
>         spin_lock(&hugetlb_lock);
> -       if (!PageHeadHuge(page) || !page_huge_active(page) ||
> +       if (!PageHeadHuge(page) ||
> +           !HPageMigratable(page) ||
>             !get_page_unless_zero(page)) {
>                 ret = false;
>                 goto unlock;
>         }
> -       clear_page_huge_active(page);
> +       ClearHPageMigratable(page);
>         list_move_tail(&page->lru, list);
>  unlock:
>         spin_unlock(&hugetlb_lock);
> @@ -5607,7 +5584,7 @@ void putback_active_hugepage(struct page *page)
>  {
>         VM_BUG_ON_PAGE(!PageHead(page), page);
>         spin_lock(&hugetlb_lock);
> -       set_page_huge_active(page);
> +       SetHPageMigratable(page);
>         list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist);
>         spin_unlock(&hugetlb_lock);
>         put_page(page);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index f9d57b9be8c7..563da803e0e0 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1260,7 +1260,13 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>                 if (!PageHuge(page))
>                         continue;
>                 head = compound_head(page);
> -               if (page_huge_active(head))
> +               /*
> +                * This test is racy as we hold no reference or lock.  The
> +                * hugetlb page could have been free'ed and head is no longer
> +                * a hugetlb page before the following check.  In such unlikely
> +                * cases false positives and negatives are possible.
> +                */
> +               if (HPageMigratable(head))
>                         goto found;
>                 skip = compound_nr(head) - (page - head);
>                 pfn += skip - 1;
> --
> 2.29.2
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20  1:30 ` [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag Mike Kravetz
                     ` (2 preceding siblings ...)
  2021-01-21 12:01   ` [External] " Muchun Song
@ 2021-01-22  6:53   ` Miaohe Lin
  2021-01-22 23:12     ` Mike Kravetz
  2021-02-03  7:42   ` [External] " Muchun Song
  4 siblings, 1 reply; 24+ messages in thread
From: Miaohe Lin @ 2021-01-22  6:53 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, linux-kernel,
	linux-mm

Hi:
On 2021/1/20 9:30, Mike Kravetz wrote:
> Use the new hugetlb page specific flag HPageMigratable to replace the
> page_huge_active interfaces.  By it's name, page_huge_active implied
> that a huge page was on the active list.  However, that is not really
> what code checking the flag wanted to know.  It really wanted to determine
> if the huge page could be migrated.  This happens when the page is actually
> added the page cache and/or task page table.  This is the reasoning behind

s/added/added to/

> the name change.
> 
> The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
> really necessary as we KNOW the page is a hugetlb page.  Therefore, they
> are removed.
> 
> The routine page_huge_active checked for PageHeadHuge before testing the
> active bit.  This is unnecessary in the case where we hold a reference or
> lock and know it is a hugetlb head page.  page_huge_active is also called
> without holding a reference or lock (scan_movable_pages), and can race with
> code freeing the page.  The extra check in page_huge_active shortened the
> race window, but did not prevent the race.  Offline code calling
> scan_movable_pages already deals with these races, so removing the check
> is acceptable.  Add comment to racy code.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  fs/hugetlbfs/inode.c       |  2 +-
>  include/linux/hugetlb.h    |  5 +++++
>  include/linux/page-flags.h |  6 -----
>  mm/hugetlb.c               | 45 ++++++++++----------------------------
>  mm/memory_hotplug.c        |  8 ++++++-
>  5 files changed, 24 insertions(+), 42 deletions(-)
> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index b8a661780c4a..ec9f03aa2738 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -735,7 +735,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
>  
>  		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
>  
> -		set_page_huge_active(page);
> +		SetHPageMigratable(page);
>  		/*
>  		 * unlock_page because locked by add_to_page_cache()
>  		 * put_page() due to reference from alloc_huge_page()
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index be71a00ee2a0..ce3d03da0133 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -480,9 +480,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>   * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at
>   *	allocation time.  Cleared when page is fully instantiated.  Free
>   *	routine checks flag to restore a reservation on error paths.
> + * HPG_migratable  - Set after a newly allocated page is added to the page
> + *	cache and/or page tables.  Indicates the page is a candidate for
> + *	migration.
>   */
>  enum hugetlb_page_flags {
>  	HPG_restore_reserve = 0,
> +	HPG_migratable,
>  	__NR_HPAGEFLAGS,
>  };
>  
> @@ -529,6 +533,7 @@ static inline void ClearHPage##uname(struct page *page)		\
>   * Create functions associated with hugetlb page flags
>   */
>  HPAGEFLAG(RestoreReserve, restore_reserve)
> +HPAGEFLAG(Migratable, migratable)
>  
>  #ifdef CONFIG_HUGETLB_PAGE
>  
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index bc6fd1ee7dd6..04a34c08e0a6 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -592,15 +592,9 @@ static inline void ClearPageCompound(struct page *page)
>  #ifdef CONFIG_HUGETLB_PAGE
>  int PageHuge(struct page *page);
>  int PageHeadHuge(struct page *page);
> -bool page_huge_active(struct page *page);
>  #else
>  TESTPAGEFLAG_FALSE(Huge)
>  TESTPAGEFLAG_FALSE(HeadHuge)
> -
> -static inline bool page_huge_active(struct page *page)
> -{
> -	return 0;
> -}
>  #endif
>  
>  
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 8bed6b5202d2..c24da40626d3 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1353,30 +1353,6 @@ struct hstate *size_to_hstate(unsigned long size)
>  	return NULL;
>  }
>  
> -/*
> - * Test to determine whether the hugepage is "active/in-use" (i.e. being linked
> - * to hstate->hugepage_activelist.)
> - *
> - * This function can be called for tail pages, but never returns true for them.
> - */
> -bool page_huge_active(struct page *page)
> -{
> -	return PageHeadHuge(page) && PagePrivate(&page[1]);
> -}
> -
> -/* never called for tail page */
> -void set_page_huge_active(struct page *page)
> -{
> -	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
> -	SetPagePrivate(&page[1]);
> -}
> -
> -static void clear_page_huge_active(struct page *page)
> -{
> -	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
> -	ClearPagePrivate(&page[1]);
> -}
> -
>  /*
>   * Internal hugetlb specific page flag. Do not use outside of the hugetlb
>   * code
> @@ -1438,7 +1414,7 @@ static void __free_huge_page(struct page *page)
>  	}
>  
>  	spin_lock(&hugetlb_lock);
> -	clear_page_huge_active(page);
> +	ClearHPageMigratable(page);
>  	hugetlb_cgroup_uncharge_page(hstate_index(h),
>  				     pages_per_huge_page(h), page);
>  	hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
> @@ -4220,7 +4196,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
>  				make_huge_pte(vma, new_page, 1));
>  		page_remove_rmap(old_page, true);
>  		hugepage_add_new_anon_rmap(new_page, vma, haddr);
> -		set_page_huge_active(new_page);
> +		SetHPageMigratable(new_page);
>  		/* Make the old page be freed below */
>  		new_page = old_page;
>  	}
> @@ -4457,12 +4433,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
>  	spin_unlock(ptl);
>  
>  	/*
> -	 * Only make newly allocated pages active.  Existing pages found
> -	 * in the pagecache could be !page_huge_active() if they have been
> -	 * isolated for migration.
> +	 * Only set HPageMigratable in newly allocated pages.  Existing pages
> +	 * found in the pagecache may not have HPageMigratableset if they have
> +	 * been isolated for migration.
>  	 */
>  	if (new_page)
> -		set_page_huge_active(page);
> +		SetHPageMigratable(page);
>  
>  	unlock_page(page);
>  out:
> @@ -4773,7 +4749,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  	update_mmu_cache(dst_vma, dst_addr, dst_pte);
>  
>  	spin_unlock(ptl);
> -	set_page_huge_active(page);
> +	SetHPageMigratable(page);
>  	if (vm_shared)
>  		unlock_page(page);
>  	ret = 0;
> @@ -5591,12 +5567,13 @@ bool isolate_huge_page(struct page *page, struct list_head *list)
>  	bool ret = true;
>  
>  	spin_lock(&hugetlb_lock);
> -	if (!PageHeadHuge(page) || !page_huge_active(page) ||
> +	if (!PageHeadHuge(page) ||
> +	    !HPageMigratable(page) ||
>  	    !get_page_unless_zero(page)) {
>  		ret = false;
>  		goto unlock;
>  	}
> -	clear_page_huge_active(page);
> +	ClearHPageMigratable(page);
>  	list_move_tail(&page->lru, list);
>  unlock:
>  	spin_unlock(&hugetlb_lock);
> @@ -5607,7 +5584,7 @@ void putback_active_hugepage(struct page *page)
>  {
>  	VM_BUG_ON_PAGE(!PageHead(page), page);
>  	spin_lock(&hugetlb_lock);
> -	set_page_huge_active(page);
> +	SetHPageMigratable(page);
>  	list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist);
>  	spin_unlock(&hugetlb_lock);
>  	put_page(page);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index f9d57b9be8c7..563da803e0e0 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1260,7 +1260,13 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>  		if (!PageHuge(page))
>  			continue;
>  		head = compound_head(page);
> -		if (page_huge_active(head))
> +		/*
> +		 * This test is racy as we hold no reference or lock.  The
> +		 * hugetlb page could have been free'ed and head is no longer
> +		 * a hugetlb page before the following check.  In such unlikely
> +		 * cases false positives and negatives are possible.
> +		 */

Is it necessary to mention that: "offline_pages() could handle these racy cases." in the comment ?

> +		if (HPageMigratable(head))
>  			goto found;
>  		skip = compound_nr(head) - (page - head);
>  		pfn += skip - 1;
> 

Looks good to me. Thanks.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-22  6:53   ` Miaohe Lin
@ 2021-01-22 23:12     ` Mike Kravetz
  0 siblings, 0 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-01-22 23:12 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: Michal Hocko, Naoya Horiguchi, Muchun Song, David Hildenbrand,
	Oscar Salvador, Matthew Wilcox, Andrew Morton, linux-kernel,
	linux-mm

On 1/21/21 10:53 PM, Miaohe Lin wrote:
> Hi:
> On 2021/1/20 9:30, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag HPageMigratable to replace the
>> page_huge_active interfaces.  By it's name, page_huge_active implied
>> that a huge page was on the active list.  However, that is not really
>> what code checking the flag wanted to know.  It really wanted to determine
>> if the huge page could be migrated.  This happens when the page is actually
>> added the page cache and/or task page table.  This is the reasoning behind
> 
> s/added/added to/

Thanks

> 
>> the name change.
>>
...
>> @@ -1260,7 +1260,13 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		if (!PageHuge(page))
>>  			continue;
>>  		head = compound_head(page);
>> -		if (page_huge_active(head))
>> +		/*
>> +		 * This test is racy as we hold no reference or lock.  The
>> +		 * hugetlb page could have been free'ed and head is no longer
>> +		 * a hugetlb page before the following check.  In such unlikely
>> +		 * cases false positives and negatives are possible.
>> +		 */
> 
> Is it necessary to mention that: "offline_pages() could handle these racy cases." in the comment ?
> 

I will enhance the comment to say that calling code must deal with these
possible scenarios.
-- 
Mike Kravetz

>> +		if (HPageMigratable(head))
>>  			goto found;
>>  		skip = compound_nr(head) - (page - head);
>>  		pfn += skip - 1;
>>
> 
> Looks good to me. Thanks.
> 
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [External] [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-01-20  1:30 ` [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag Mike Kravetz
                     ` (3 preceding siblings ...)
  2021-01-22  6:53   ` Miaohe Lin
@ 2021-02-03  7:42   ` Muchun Song
  2021-02-04  0:44     ` Mike Kravetz
  4 siblings, 1 reply; 24+ messages in thread
From: Muchun Song @ 2021-02-03  7:42 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: LKML, Linux Memory Management List, Michal Hocko,
	Naoya Horiguchi, David Hildenbrand, Oscar Salvador,
	Matthew Wilcox, Andrew Morton

On Wed, Jan 20, 2021 at 9:33 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> Use the new hugetlb page specific flag HPageMigratable to replace the
> page_huge_active interfaces.  By it's name, page_huge_active implied
> that a huge page was on the active list.  However, that is not really
> what code checking the flag wanted to know.  It really wanted to determine
> if the huge page could be migrated.  This happens when the page is actually
> added the page cache and/or task page table.  This is the reasoning behind
> the name change.
>
> The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
> really necessary as we KNOW the page is a hugetlb page.  Therefore, they
> are removed.
>
> The routine page_huge_active checked for PageHeadHuge before testing the
> active bit.  This is unnecessary in the case where we hold a reference or
> lock and know it is a hugetlb head page.  page_huge_active is also called
> without holding a reference or lock (scan_movable_pages), and can race with
> code freeing the page.  The extra check in page_huge_active shortened the
> race window, but did not prevent the race.  Offline code calling
> scan_movable_pages already deals with these races, so removing the check
> is acceptable.  Add comment to racy code.
>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Hi Mike,

I found that you may forget to remove set_page_huge_active()
from include/linux/hugetlb.h.

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 37fd248ce271..6f680cf0eee6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -852,7 +852,7 @@ static inline void
huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 }
 #endif

-void set_page_huge_active(struct page *page);
+

Thanks.


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [External] [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
  2021-02-03  7:42   ` [External] " Muchun Song
@ 2021-02-04  0:44     ` Mike Kravetz
  0 siblings, 0 replies; 24+ messages in thread
From: Mike Kravetz @ 2021-02-04  0:44 UTC (permalink / raw)
  To: Muchun Song, Andrew Morton
  Cc: LKML, Linux Memory Management List, Michal Hocko,
	Naoya Horiguchi, David Hildenbrand, Oscar Salvador,
	Matthew Wilcox

On 2/2/21 11:42 PM, Muchun Song wrote:
> On Wed, Jan 20, 2021 at 9:33 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> Hi Mike,
> 
> I found that you may forget to remove set_page_huge_active()
> from include/linux/hugetlb.h.
> 
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 37fd248ce271..6f680cf0eee6 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -852,7 +852,7 @@ static inline void
> huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
>  }
>  #endif
> 
> -void set_page_huge_active(struct page *page);
> +
> 
> Thanks.

Thank you Muchun for finding this.

Thank you Andrew for fixing in your tree.
-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-02-04  0:44 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-20  1:30 [PATCH v2 0/5] create hugetlb flags to consolidate state Mike Kravetz
2021-01-20  1:30 ` [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags Mike Kravetz
2021-01-20  8:10   ` Miaohe Lin
2021-01-20  8:43   ` kernel test robot
2021-01-20  9:30   ` Oscar Salvador
2021-01-20 18:12     ` Mike Kravetz
2021-01-20  1:30 ` [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag Mike Kravetz
2021-01-20  9:59   ` Oscar Salvador
2021-01-20 10:00     ` Oscar Salvador
2021-01-20 21:48       ` Mike Kravetz
2021-01-21  8:18         ` Oscar Salvador
2021-01-21  8:27   ` Oscar Salvador
2021-01-21 12:01   ` [External] " Muchun Song
2021-01-22  6:53   ` Miaohe Lin
2021-01-22 23:12     ` Mike Kravetz
2021-02-03  7:42   ` [External] " Muchun Song
2021-02-04  0:44     ` Mike Kravetz
2021-01-20  1:30 ` [PATCH v2 3/5] hugetlb: only set HPageMigratable for migratable hstates Mike Kravetz
2021-01-20  1:30 ` [PATCH v2 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag Mike Kravetz
2021-01-20 10:09   ` Oscar Salvador
2021-01-20 18:14     ` Mike Kravetz
2021-01-20  1:30 ` [PATCH v2 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag Mike Kravetz
2021-01-20 10:10   ` Oscar Salvador
2021-01-20 10:44   ` [External] " Muchun Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).