linux-kernel.vger.kernel.org archive mirror
* [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios
@ 2022-11-29 22:50 Sidhartha Kumar
  2022-11-29 22:50 ` [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions Sidhartha Kumar
                   ` (9 more replies)
  0 siblings, 10 replies; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

============== OVERVIEW ===========================
Now that many hugetlb helper functions that deal with hugetlb-specific
flags[1] and hugetlb cgroups[2] have been converted to folios, the
higher-level allocation, prep, and freeing functions within hugetlb can
also be converted to operate on folios.
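
The general conversion pattern looks roughly like the sketch below
(made-up demo_*() names, not code taken from any patch): a helper that
operated on a head page via Page*()/HPage*() tests becomes a helper
that takes a folio and uses the folio_test_*() equivalents, with
callers obtaining the folio once via page_folio().

	/* before: page-based helper */
	static bool demo_page_is_free_hugetlb(struct page *page)
	{
		struct page *head = compound_head(page);

		return PageHuge(head) && HPageFreed(head);
	}

	/* after: folio-based helper, no compound_head() needed */
	static bool demo_folio_is_free_hugetlb(struct folio *folio)
	{
		return folio_test_hugetlb(folio) &&
		       folio_test_hugetlb_freed(folio);
	}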

Patch 1 of this series implements wrapper functions for setting the
compound destructor and compound order of a folio. Besides the user
added in patch 1, patches 2 and 9 also use these helper functions.

Patches 2-10 convert the higher level hugetlb functions to folios.

============== TESTING ===========================
LTP:
	Ran 10 back-to-back rounds of the LTP hugetlb test suite.

Gigantic Huge Pages:
	Tested allocation and freeing via hugeadm commands:
		hugeadm --pool-pages-min 1GB:10
		hugeadm --pool-pages-min 1GB:0

Demote:
	Demoted one 1GB hugepage to 512 2MB hugepages:
		echo 1 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
		echo 1 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
		cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
			# 512
		cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
			# 0

Rebased on 10/29/2022 mm-unstable

[1] https://lore.kernel.org/lkml/20220922154207.1575343-1-sidhartha.kumar@oracle.com/
[2] https://lore.kernel.org/linux-mm/20221101223059.460937-1-sidhartha.kumar@oracle.com/

v1 -> v2:
	- fix conflict with "mm,thp,rmap: simplify compound page mapcount handling"
v2 -> v3:
	- v2 contained wrong version of patch 1
v3 -> v4:
	- change instances of folio_{clear, set}_head() to __folio_{clear, set}_head()
	- rebase on top of hugetlb: Fix __prep_compound_gigantic_page page flag setting
v4 -> v5:
	- change comment with HPageVmemmapOptimized to hugetlb_vmemmap_optimized
	  per Tarun Sahu
	- fix Smatch warning in patch 10 by reorganizing alloc_pool_huge_page
	- fix NULL pointer dereference issue in patch 10 per John Hubbard
	- use NULL rather than 0 in patch 6 per David Hildenbrand

Sidhartha Kumar (10):
  mm: add folio dtor and order setter functions
  mm/hugetlb: convert destroy_compound_gigantic_page() to folios
  mm/hugetlb: convert dissolve_free_huge_page() to folios
  mm/hugetlb: convert remove_hugetlb_page() to folios
  mm/hugetlb: convert update_and_free_page() to folios
  mm/hugetlb: convert add_hugetlb_page() to folios and add
    hugetlb_cma_folio()
  mm/hugetlb: convert enqueue_huge_page() to folios
  mm/hugetlb: convert free_gigantic_page() to folios
  mm/hugetlb: convert hugetlb prep functions to folios
  mm/hugetlb: change hugetlb allocation functions to return a folio

 include/linux/mm.h |  16 ++
 mm/hugetlb.c       | 409 ++++++++++++++++++++++-----------------------
 2 files changed, 219 insertions(+), 206 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07  0:18   ` Mike Kravetz
  2022-12-07  3:34   ` Muchun Song
  2022-11-29 22:50 ` [PATCH mm-unstable v5 02/10] mm/hugetlb: convert destroy_compound_gigantic_page() to folios Sidhartha Kumar
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Add folio equivalents for set_compound_order() and set_compound_page_dtor().

Also remove the extra new-lines introduced by "mm/hugetlb: convert
move_hugetlb_state() to folios" and "mm/hugetlb_cgroup: convert
hugetlb_cgroup_uncharge_page() to folios".
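
For illustration, a caller of the new helpers would look roughly like
the sketch below (hypothetical demo_*() function, not part of the diff;
the in-tree users are added in this patch and in patches 2 and 9):

	static void demo_prep_folio(struct hstate *h, struct folio *folio)
	{
		/*
		 * Folio equivalents of set_compound_order() and
		 * set_compound_page_dtor() on the head page.
		 */
		folio_set_compound_order(folio, huge_page_order(h));
		folio_set_compound_dtor(folio, HUGETLB_PAGE_DTOR);
	}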

Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Suggested-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 include/linux/mm.h | 16 ++++++++++++++++
 mm/hugetlb.c       |  4 +---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a48c5ad16a5e..2bdef8a5298a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -972,6 +972,13 @@ static inline void set_compound_page_dtor(struct page *page,
 	page[1].compound_dtor = compound_dtor;
 }
 
+static inline void folio_set_compound_dtor(struct folio *folio,
+		enum compound_dtor_id compound_dtor)
+{
+	VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
+	folio->_folio_dtor = compound_dtor;
+}
+
 void destroy_large_folio(struct folio *folio);
 
 static inline int head_compound_pincount(struct page *head)
@@ -987,6 +994,15 @@ static inline void set_compound_order(struct page *page, unsigned int order)
 #endif
 }
 
+static inline void folio_set_compound_order(struct folio *folio,
+		unsigned int order)
+{
+	folio->_folio_order = order;
+#ifdef CONFIG_64BIT
+	folio->_folio_nr_pages = order ? 1U << order : 0;
+#endif
+}
+
 /* Returns the number of pages in this potentially compound page. */
 static inline unsigned long compound_nr(struct page *page)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4c2fe7fec475..6390de8975c5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1780,7 +1780,7 @@ static void __prep_new_hugetlb_folio(struct hstate *h, struct folio *folio)
 {
 	hugetlb_vmemmap_optimize(h, &folio->page);
 	INIT_LIST_HEAD(&folio->lru);
-	folio->_folio_dtor = HUGETLB_PAGE_DTOR;
+	folio_set_compound_dtor(folio, HUGETLB_PAGE_DTOR);
 	hugetlb_set_folio_subpool(folio, NULL);
 	set_hugetlb_cgroup(folio, NULL);
 	set_hugetlb_cgroup_rsvd(folio, NULL);
@@ -2938,7 +2938,6 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 	 * a reservation exists for the allocation.
 	 */
 	page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve, gbl_chg);
-
 	if (!page) {
 		spin_unlock_irq(&hugetlb_lock);
 		page = alloc_buddy_huge_page_with_mpol(h, vma, addr);
@@ -7351,7 +7350,6 @@ void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int re
 		int old_nid = folio_nid(old_folio);
 		int new_nid = folio_nid(new_folio);
 
-
 		folio_set_hugetlb_temporary(old_folio);
 		folio_clear_hugetlb_temporary(new_folio);
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 02/10] mm/hugetlb: convert destroy_compound_gigantic_page() to folios
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
  2022-11-29 22:50 ` [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07  0:32   ` Mike Kravetz
  2022-11-29 22:50 ` [PATCH mm-unstable v5 03/10] mm/hugetlb: convert dissolve_free_huge_page() " Sidhartha Kumar
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Convert page operations within __destroy_compound_gigantic_page() to the
corresponding folio operations.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 43 +++++++++++++++++++++----------------------
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6390de8975c5..f6f791675c38 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1325,43 +1325,40 @@ static int hstate_next_node_to_free(struct hstate *h, nodemask_t *nodes_allowed)
 		nr_nodes--)
 
 /* used to demote non-gigantic_huge pages as well */
-static void __destroy_compound_gigantic_page(struct page *page,
+static void __destroy_compound_gigantic_folio(struct folio *folio,
 					unsigned int order, bool demote)
 {
 	int i;
 	int nr_pages = 1 << order;
 	struct page *p;
 
-	atomic_set(compound_mapcount_ptr(page), 0);
-	atomic_set(subpages_mapcount_ptr(page), 0);
-	atomic_set(compound_pincount_ptr(page), 0);
+	atomic_set(folio_mapcount_ptr(folio), 0);
+	atomic_set(folio_subpages_mapcount_ptr(folio), 0);
+	atomic_set(folio_pincount_ptr(folio), 0);
 
 	for (i = 1; i < nr_pages; i++) {
-		p = nth_page(page, i);
+		p = folio_page(folio, i);
 		p->mapping = NULL;
 		clear_compound_head(p);
 		if (!demote)
 			set_page_refcounted(p);
 	}
 
-	set_compound_order(page, 0);
-#ifdef CONFIG_64BIT
-	page[1].compound_nr = 0;
-#endif
-	__ClearPageHead(page);
+	folio_set_compound_order(folio, 0);
+	__folio_clear_head(folio);
 }
 
-static void destroy_compound_hugetlb_page_for_demote(struct page *page,
+static void destroy_compound_hugetlb_folio_for_demote(struct folio *folio,
 					unsigned int order)
 {
-	__destroy_compound_gigantic_page(page, order, true);
+	__destroy_compound_gigantic_folio(folio, order, true);
 }
 
 #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-static void destroy_compound_gigantic_page(struct page *page,
+static void destroy_compound_gigantic_folio(struct folio *folio,
 					unsigned int order)
 {
-	__destroy_compound_gigantic_page(page, order, false);
+	__destroy_compound_gigantic_folio(folio, order, false);
 }
 
 static void free_gigantic_page(struct page *page, unsigned int order)
@@ -1430,7 +1427,7 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 	return NULL;
 }
 static inline void free_gigantic_page(struct page *page, unsigned int order) { }
-static inline void destroy_compound_gigantic_page(struct page *page,
+static inline void destroy_compound_gigantic_folio(struct folio *folio,
 						unsigned int order) { }
 #endif
 
@@ -1477,8 +1474,8 @@ static void __remove_hugetlb_page(struct hstate *h, struct page *page,
 	 *
 	 * For gigantic pages set the destructor to the null dtor.  This
 	 * destructor will never be called.  Before freeing the gigantic
-	 * page destroy_compound_gigantic_page will turn the compound page
-	 * into a simple group of pages.  After this the destructor does not
+	 * page destroy_compound_gigantic_folio will turn the folio into a
+	 * simple group of pages.  After this the destructor does not
 	 * apply.
 	 *
 	 * This handles the case where more than one ref is held when and
@@ -1559,6 +1556,7 @@ static void add_hugetlb_page(struct hstate *h, struct page *page,
 static void __update_and_free_page(struct hstate *h, struct page *page)
 {
 	int i;
+	struct folio *folio = page_folio(page);
 	struct page *subpage;
 
 	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
@@ -1587,8 +1585,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 	 * Move PageHWPoison flag from head page to the raw error pages,
 	 * which makes any healthy subpages reusable.
 	 */
-	if (unlikely(PageHWPoison(page)))
-		hugetlb_clear_page_hwpoison(page);
+	if (unlikely(folio_test_hwpoison(folio)))
+		hugetlb_clear_page_hwpoison(&folio->page);
 
 	for (i = 0; i < pages_per_huge_page(h); i++) {
 		subpage = nth_page(page, i);
@@ -1604,7 +1602,7 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 	 */
 	if (hstate_is_gigantic(h) ||
 	    hugetlb_cma_page(page, huge_page_order(h))) {
-		destroy_compound_gigantic_page(page, huge_page_order(h));
+		destroy_compound_gigantic_folio(folio, huge_page_order(h));
 		free_gigantic_page(page, huge_page_order(h));
 	} else {
 		__free_pages(page, huge_page_order(h));
@@ -3437,6 +3435,7 @@ static int demote_free_huge_page(struct hstate *h, struct page *page)
 {
 	int i, nid = page_to_nid(page);
 	struct hstate *target_hstate;
+	struct folio *folio = page_folio(page);
 	struct page *subpage;
 	int rc = 0;
 
@@ -3455,10 +3454,10 @@ static int demote_free_huge_page(struct hstate *h, struct page *page)
 	}
 
 	/*
-	 * Use destroy_compound_hugetlb_page_for_demote for all huge page
+	 * Use destroy_compound_hugetlb_folio_for_demote for all huge page
 	 * sizes as it will not ref count pages.
 	 */
-	destroy_compound_hugetlb_page_for_demote(page, huge_page_order(h));
+	destroy_compound_hugetlb_folio_for_demote(folio, huge_page_order(h));
 
 	/*
 	 * Taking target hstate mutex synchronizes with set_max_huge_pages.
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 03/10] mm/hugetlb: convert dissolve_free_huge_page() to folios
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
  2022-11-29 22:50 ` [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions Sidhartha Kumar
  2022-11-29 22:50 ` [PATCH mm-unstable v5 02/10] mm/hugetlb: convert destroy_compound_gigantic_page() to folios Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07  0:52   ` Mike Kravetz
  2022-11-29 22:50 ` [PATCH mm-unstable v5 04/10] mm/hugetlb: convert remove_hugetlb_page() " Sidhartha Kumar
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Removes compound_head() call by using a folio rather than a head page.
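
As an illustration of why the compound_head() call goes away (sketch
only, demo_*() is a made-up name; the real change is in the hunk
below): a folio is never a tail page, so a single page_folio() lookup
replaces the compound_head() call before inspecting hugetlb state.

	static struct hstate *demo_folio_hstate_of(struct page *page)
	{
		struct folio *folio = page_folio(page);

		if (!folio_test_hugetlb(folio))
			return NULL;
		/* acts on the folio directly, no compound_head() needed */
		return folio_hstate(folio);
	}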

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f6f791675c38..d0acabbf3025 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2128,21 +2128,21 @@ static struct page *remove_pool_huge_page(struct hstate *h,
 int dissolve_free_huge_page(struct page *page)
 {
 	int rc = -EBUSY;
+	struct folio *folio = page_folio(page);
 
 retry:
 	/* Not to disrupt normal path by vainly holding hugetlb_lock */
-	if (!PageHuge(page))
+	if (!folio_test_hugetlb(folio))
 		return 0;
 
 	spin_lock_irq(&hugetlb_lock);
-	if (!PageHuge(page)) {
+	if (!folio_test_hugetlb(folio)) {
 		rc = 0;
 		goto out;
 	}
 
-	if (!page_count(page)) {
-		struct page *head = compound_head(page);
-		struct hstate *h = page_hstate(head);
+	if (!folio_ref_count(folio)) {
+		struct hstate *h = folio_hstate(folio);
 		if (!available_huge_pages(h))
 			goto out;
 
@@ -2150,7 +2150,7 @@ int dissolve_free_huge_page(struct page *page)
 		 * We should make sure that the page is already on the free list
 		 * when it is dissolved.
 		 */
-		if (unlikely(!HPageFreed(head))) {
+		if (unlikely(!folio_test_hugetlb_freed(folio))) {
 			spin_unlock_irq(&hugetlb_lock);
 			cond_resched();
 
@@ -2165,7 +2165,7 @@ int dissolve_free_huge_page(struct page *page)
 			goto retry;
 		}
 
-		remove_hugetlb_page(h, head, false);
+		remove_hugetlb_page(h, &folio->page, false);
 		h->max_huge_pages--;
 		spin_unlock_irq(&hugetlb_lock);
 
@@ -2177,12 +2177,12 @@ int dissolve_free_huge_page(struct page *page)
 		 * Attempt to allocate vmemmmap here so that we can take
 		 * appropriate action on failure.
 		 */
-		rc = hugetlb_vmemmap_restore(h, head);
+		rc = hugetlb_vmemmap_restore(h, &folio->page);
 		if (!rc) {
-			update_and_free_page(h, head, false);
+			update_and_free_page(h, &folio->page, false);
 		} else {
 			spin_lock_irq(&hugetlb_lock);
-			add_hugetlb_page(h, head, false);
+			add_hugetlb_page(h, &folio->page, false);
 			h->max_huge_pages++;
 			spin_unlock_irq(&hugetlb_lock);
 		}
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 04/10] mm/hugetlb: convert remove_hugetlb_page() to folios
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
                   ` (2 preceding siblings ...)
  2022-11-29 22:50 ` [PATCH mm-unstable v5 03/10] mm/hugetlb: convert dissolve_free_huge_page() " Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07  1:43   ` Mike Kravetz
  2022-11-29 22:50 ` [PATCH mm-unstable v5 05/10] mm/hugetlb: convert update_and_free_page() " Sidhartha Kumar
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Removes page_folio() call by converting callers to directly pass a folio
into __remove_hugetlb_page().

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 48 +++++++++++++++++++++++++-----------------------
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d0acabbf3025..90ba01a76f87 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1432,19 +1432,18 @@ static inline void destroy_compound_gigantic_folio(struct folio *folio,
 #endif
 
 /*
- * Remove hugetlb page from lists, and update dtor so that page appears
+ * Remove hugetlb folio from lists, and update dtor so that the folio appears
  * as just a compound page.
  *
- * A reference is held on the page, except in the case of demote.
+ * A reference is held on the folio, except in the case of demote.
  *
  * Must be called with hugetlb lock held.
  */
-static void __remove_hugetlb_page(struct hstate *h, struct page *page,
+static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
 							bool adjust_surplus,
 							bool demote)
 {
-	int nid = page_to_nid(page);
-	struct folio *folio = page_folio(page);
+	int nid = folio_nid(folio);
 
 	VM_BUG_ON_FOLIO(hugetlb_cgroup_from_folio(folio), folio);
 	VM_BUG_ON_FOLIO(hugetlb_cgroup_from_folio_rsvd(folio), folio);
@@ -1453,9 +1452,9 @@ static void __remove_hugetlb_page(struct hstate *h, struct page *page,
 	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 		return;
 
-	list_del(&page->lru);
+	list_del(&folio->lru);
 
-	if (HPageFreed(page)) {
+	if (folio_test_hugetlb_freed(folio)) {
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
 	}
@@ -1485,26 +1484,26 @@ static void __remove_hugetlb_page(struct hstate *h, struct page *page,
 	 * be turned into a page of smaller size.
 	 */
 	if (!demote)
-		set_page_refcounted(page);
+		folio_ref_unfreeze(folio, 1);
 	if (hstate_is_gigantic(h))
-		set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
+		folio_set_compound_dtor(folio, NULL_COMPOUND_DTOR);
 	else
-		set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
+		folio_set_compound_dtor(folio, COMPOUND_PAGE_DTOR);
 
 	h->nr_huge_pages--;
 	h->nr_huge_pages_node[nid]--;
 }
 
-static void remove_hugetlb_page(struct hstate *h, struct page *page,
+static void remove_hugetlb_folio(struct hstate *h, struct folio *folio,
 							bool adjust_surplus)
 {
-	__remove_hugetlb_page(h, page, adjust_surplus, false);
+	__remove_hugetlb_folio(h, folio, adjust_surplus, false);
 }
 
-static void remove_hugetlb_page_for_demote(struct hstate *h, struct page *page,
+static void remove_hugetlb_folio_for_demote(struct hstate *h, struct folio *folio,
 							bool adjust_surplus)
 {
-	__remove_hugetlb_page(h, page, adjust_surplus, true);
+	__remove_hugetlb_folio(h, folio, adjust_surplus, true);
 }
 
 static void add_hugetlb_page(struct hstate *h, struct page *page,
@@ -1639,8 +1638,9 @@ static void free_hpage_workfn(struct work_struct *work)
 		/*
 		 * The VM_BUG_ON_PAGE(!PageHuge(page), page) in page_hstate()
 		 * is going to trigger because a previous call to
-		 * remove_hugetlb_page() will set_compound_page_dtor(page,
-		 * NULL_COMPOUND_DTOR), so do not use page_hstate() directly.
+		 * remove_hugetlb_folio() will call folio_set_compound_dtor
+		 * (folio, NULL_COMPOUND_DTOR), so do not use page_hstate()
+		 * directly.
 		 */
 		h = size_to_hstate(page_size(page));
 
@@ -1749,12 +1749,12 @@ void free_huge_page(struct page *page)
 		h->resv_huge_pages++;
 
 	if (folio_test_hugetlb_temporary(folio)) {
-		remove_hugetlb_page(h, page, false);
+		remove_hugetlb_folio(h, folio, false);
 		spin_unlock_irqrestore(&hugetlb_lock, flags);
 		update_and_free_page(h, page, true);
 	} else if (h->surplus_huge_pages_node[nid]) {
 		/* remove the page from active list */
-		remove_hugetlb_page(h, page, true);
+		remove_hugetlb_folio(h, folio, true);
 		spin_unlock_irqrestore(&hugetlb_lock, flags);
 		update_and_free_page(h, page, true);
 	} else {
@@ -2092,6 +2092,7 @@ static struct page *remove_pool_huge_page(struct hstate *h,
 {
 	int nr_nodes, node;
 	struct page *page = NULL;
+	struct folio *folio;
 
 	lockdep_assert_held(&hugetlb_lock);
 	for_each_node_mask_to_free(h, nr_nodes, node, nodes_allowed) {
@@ -2103,7 +2104,8 @@ static struct page *remove_pool_huge_page(struct hstate *h,
 		    !list_empty(&h->hugepage_freelists[node])) {
 			page = list_entry(h->hugepage_freelists[node].next,
 					  struct page, lru);
-			remove_hugetlb_page(h, page, acct_surplus);
+			folio = page_folio(page);
+			remove_hugetlb_folio(h, folio, acct_surplus);
 			break;
 		}
 	}
@@ -2165,7 +2167,7 @@ int dissolve_free_huge_page(struct page *page)
 			goto retry;
 		}
 
-		remove_hugetlb_page(h, &folio->page, false);
+		remove_hugetlb_folio(h, folio, false);
 		h->max_huge_pages--;
 		spin_unlock_irq(&hugetlb_lock);
 
@@ -2803,7 +2805,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 		 * and enqueue_huge_page() for new_page. The counters will remain
 		 * stable since this happens under the lock.
 		 */
-		remove_hugetlb_page(h, old_page, false);
+		remove_hugetlb_folio(h, old_folio, false);
 
 		/*
 		 * Ref count on new page is already zero as it was dropped
@@ -3230,7 +3232,7 @@ static void try_to_free_low(struct hstate *h, unsigned long count,
 				goto out;
 			if (PageHighMem(page))
 				continue;
-			remove_hugetlb_page(h, page, false);
+			remove_hugetlb_folio(h, page_folio(page), false);
 			list_add(&page->lru, &page_list);
 		}
 	}
@@ -3441,7 +3443,7 @@ static int demote_free_huge_page(struct hstate *h, struct page *page)
 
 	target_hstate = size_to_hstate(PAGE_SIZE << h->demote_order);
 
-	remove_hugetlb_page_for_demote(h, page, false);
+	remove_hugetlb_folio_for_demote(h, folio, false);
 	spin_unlock_irq(&hugetlb_lock);
 
 	rc = hugetlb_vmemmap_restore(h, page);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 05/10] mm/hugetlb: convert update_and_free_page() to folios
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
                   ` (3 preceding siblings ...)
  2022-11-29 22:50 ` [PATCH mm-unstable v5 04/10] mm/hugetlb: convert remove_hugetlb_page() " Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07  2:02   ` Mike Kravetz
  2022-11-29 22:50 ` [PATCH mm-unstable v5 06/10] mm/hugetlb: convert add_hugetlb_page() to folios and add hugetlb_cma_folio() Sidhartha Kumar
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Make more progress on converting the free_huge_page() destructor to
operate on folios by converting update_and_free_page() to folios.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 90ba01a76f87..83777d1ccbf3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1478,7 +1478,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
 	 * apply.
 	 *
 	 * This handles the case where more than one ref is held when and
-	 * after update_and_free_page is called.
+	 * after update_and_free_hugetlb_folio is called.
 	 *
 	 * In the case of demote we do not ref count the page as it will soon
 	 * be turned into a page of smaller size.
@@ -1609,7 +1609,7 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 }
 
 /*
- * As update_and_free_page() can be called under any context, so we cannot
+ * As update_and_free_hugetlb_folio() can be called under any context, so we cannot
  * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
  * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
  * the vmemmap pages.
@@ -1657,11 +1657,11 @@ static inline void flush_free_hpage_work(struct hstate *h)
 		flush_work(&free_hpage_work);
 }
 
-static void update_and_free_page(struct hstate *h, struct page *page,
+static void update_and_free_hugetlb_folio(struct hstate *h, struct folio *folio,
 				 bool atomic)
 {
-	if (!HPageVmemmapOptimized(page) || !atomic) {
-		__update_and_free_page(h, page);
+	if (!folio_test_hugetlb_vmemmap_optimized(folio) || !atomic) {
+		__update_and_free_page(h, &folio->page);
 		return;
 	}
 
@@ -1672,16 +1672,18 @@ static void update_and_free_page(struct hstate *h, struct page *page,
 	 * empty. Otherwise, schedule_work() had been called but the workfn
 	 * hasn't retrieved the list yet.
 	 */
-	if (llist_add((struct llist_node *)&page->mapping, &hpage_freelist))
+	if (llist_add((struct llist_node *)&folio->mapping, &hpage_freelist))
 		schedule_work(&free_hpage_work);
 }
 
 static void update_and_free_pages_bulk(struct hstate *h, struct list_head *list)
 {
 	struct page *page, *t_page;
+	struct folio *folio;
 
 	list_for_each_entry_safe(page, t_page, list, lru) {
-		update_and_free_page(h, page, false);
+		folio = page_folio(page);
+		update_and_free_hugetlb_folio(h, folio, false);
 		cond_resched();
 	}
 }
@@ -1751,12 +1753,12 @@ void free_huge_page(struct page *page)
 	if (folio_test_hugetlb_temporary(folio)) {
 		remove_hugetlb_folio(h, folio, false);
 		spin_unlock_irqrestore(&hugetlb_lock, flags);
-		update_and_free_page(h, page, true);
+		update_and_free_hugetlb_folio(h, folio, true);
 	} else if (h->surplus_huge_pages_node[nid]) {
 		/* remove the page from active list */
 		remove_hugetlb_folio(h, folio, true);
 		spin_unlock_irqrestore(&hugetlb_lock, flags);
-		update_and_free_page(h, page, true);
+		update_and_free_hugetlb_folio(h, folio, true);
 	} else {
 		arch_clear_hugepage_flags(page);
 		enqueue_huge_page(h, page);
@@ -2172,8 +2174,8 @@ int dissolve_free_huge_page(struct page *page)
 		spin_unlock_irq(&hugetlb_lock);
 
 		/*
-		 * Normally update_and_free_page will allocate required vmemmmap
-		 * before freeing the page.  update_and_free_page will fail to
+		 * Normally update_and_free_hugetlb_folio will allocate required vmemmmap
+		 * before freeing the page.  update_and_free_hugetlb_folio will fail to
 		 * free the page if it can not allocate required vmemmap.  We
 		 * need to adjust max_huge_pages if the page is not freed.
 		 * Attempt to allocate vmemmmap here so that we can take
@@ -2181,7 +2183,7 @@ int dissolve_free_huge_page(struct page *page)
 		 */
 		rc = hugetlb_vmemmap_restore(h, &folio->page);
 		if (!rc) {
-			update_and_free_page(h, &folio->page, false);
+			update_and_free_hugetlb_folio(h, folio, false);
 		} else {
 			spin_lock_irq(&hugetlb_lock);
 			add_hugetlb_page(h, &folio->page, false);
@@ -2818,7 +2820,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 		 * Pages have been replaced, we can safely free the old one.
 		 */
 		spin_unlock_irq(&hugetlb_lock);
-		update_and_free_page(h, old_page, false);
+		update_and_free_hugetlb_folio(h, old_folio, false);
 	}
 
 	return ret;
@@ -2827,7 +2829,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 	spin_unlock_irq(&hugetlb_lock);
 	/* Page has a zero ref count, but needs a ref to be freed */
 	folio_ref_unfreeze(new_folio, 1);
-	update_and_free_page(h, new_page, false);
+	update_and_free_hugetlb_folio(h, new_folio, false);
 
 	return ret;
 }
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 06/10] mm/hugetlb: convert add_hugetlb_page() to folios and add hugetlb_cma_folio()
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
                   ` (4 preceding siblings ...)
  2022-11-29 22:50 ` [PATCH mm-unstable v5 05/10] mm/hugetlb: convert update_and_free_page() " Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07 18:38   ` Mike Kravetz
  2022-11-29 22:50 ` [PATCH mm-unstable v5 07/10] mm/hugetlb: convert enqueue_huge_page() to folios Sidhartha Kumar
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Convert add_hugetlb_page() to take in a folio; also convert
hugetlb_cma_page() to take in a folio.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 83777d1ccbf3..57909a0e7157 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -54,13 +54,13 @@ struct hstate hstates[HUGE_MAX_HSTATE];
 #ifdef CONFIG_CMA
 static struct cma *hugetlb_cma[MAX_NUMNODES];
 static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
-static bool hugetlb_cma_page(struct page *page, unsigned int order)
+static bool hugetlb_cma_folio(struct folio *folio, unsigned int order)
 {
-	return cma_pages_valid(hugetlb_cma[page_to_nid(page)], page,
+	return cma_pages_valid(hugetlb_cma[folio_nid(folio)], &folio->page,
 				1 << order);
 }
 #else
-static bool hugetlb_cma_page(struct page *page, unsigned int order)
+static bool hugetlb_cma_folio(struct folio *folio, unsigned int order)
 {
 	return false;
 }
@@ -1506,17 +1506,17 @@ static void remove_hugetlb_folio_for_demote(struct hstate *h, struct folio *foli
 	__remove_hugetlb_folio(h, folio, adjust_surplus, true);
 }
 
-static void add_hugetlb_page(struct hstate *h, struct page *page,
+static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
 			     bool adjust_surplus)
 {
 	int zeroed;
-	int nid = page_to_nid(page);
+	int nid = folio_nid(folio);
 
-	VM_BUG_ON_PAGE(!HPageVmemmapOptimized(page), page);
+	VM_BUG_ON_FOLIO(!folio_test_hugetlb_vmemmap_optimized(folio), folio);
 
 	lockdep_assert_held(&hugetlb_lock);
 
-	INIT_LIST_HEAD(&page->lru);
+	INIT_LIST_HEAD(&folio->lru);
 	h->nr_huge_pages++;
 	h->nr_huge_pages_node[nid]++;
 
@@ -1525,21 +1525,21 @@ static void add_hugetlb_page(struct hstate *h, struct page *page,
 		h->surplus_huge_pages_node[nid]++;
 	}
 
-	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
-	set_page_private(page, 0);
+	folio_set_compound_dtor(folio, HUGETLB_PAGE_DTOR);
+	folio_change_private(folio, NULL);
 	/*
-	 * We have to set HPageVmemmapOptimized again as above
-	 * set_page_private(page, 0) cleared it.
+	 * We have to set hugetlb_vmemmap_optimized again as above
+	 * folio_change_private(folio, NULL) cleared it.
 	 */
-	SetHPageVmemmapOptimized(page);
+	folio_set_hugetlb_vmemmap_optimized(folio);
 
 	/*
-	 * This page is about to be managed by the hugetlb allocator and
+	 * This folio is about to be managed by the hugetlb allocator and
 	 * should have no users.  Drop our reference, and check for others
 	 * just in case.
 	 */
-	zeroed = put_page_testzero(page);
-	if (!zeroed)
+	zeroed = folio_put_testzero(folio);
+	if (unlikely(!zeroed))
 		/*
 		 * It is VERY unlikely soneone else has taken a ref on
 		 * the page.  In this case, we simply return as the
@@ -1548,8 +1548,8 @@ static void add_hugetlb_page(struct hstate *h, struct page *page,
 		 */
 		return;
 
-	arch_clear_hugepage_flags(page);
-	enqueue_huge_page(h, page);
+	arch_clear_hugepage_flags(&folio->page);
+	enqueue_huge_page(h, &folio->page);
 }
 
 static void __update_and_free_page(struct hstate *h, struct page *page)
@@ -1575,7 +1575,7 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 		 * page and put the page back on the hugetlb free list and treat
 		 * as a surplus page.
 		 */
-		add_hugetlb_page(h, page, true);
+		add_hugetlb_folio(h, page_folio(page), true);
 		spin_unlock_irq(&hugetlb_lock);
 		return;
 	}
@@ -1600,7 +1600,7 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 	 * need to be given back to CMA in free_gigantic_page.
 	 */
 	if (hstate_is_gigantic(h) ||
-	    hugetlb_cma_page(page, huge_page_order(h))) {
+	    hugetlb_cma_folio(folio, huge_page_order(h))) {
 		destroy_compound_gigantic_folio(folio, huge_page_order(h));
 		free_gigantic_page(page, huge_page_order(h));
 	} else {
@@ -2186,7 +2186,7 @@ int dissolve_free_huge_page(struct page *page)
 			update_and_free_hugetlb_folio(h, folio, false);
 		} else {
 			spin_lock_irq(&hugetlb_lock);
-			add_hugetlb_page(h, &folio->page, false);
+			add_hugetlb_folio(h, folio, false);
 			h->max_huge_pages++;
 			spin_unlock_irq(&hugetlb_lock);
 		}
@@ -3453,7 +3453,7 @@ static int demote_free_huge_page(struct hstate *h, struct page *page)
 		/* Allocation of vmemmmap failed, we can not demote page */
 		spin_lock_irq(&hugetlb_lock);
 		set_page_refcounted(page);
-		add_hugetlb_page(h, page, false);
+		add_hugetlb_folio(h, page_folio(page), false);
 		return rc;
 	}
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 07/10] mm/hugetlb: convert enqueue_huge_page() to folios
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
                   ` (5 preceding siblings ...)
  2022-11-29 22:50 ` [PATCH mm-unstable v5 06/10] mm/hugetlb: convert add_hugetlb_page() to folios and add hugetlb_cma_folio() Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07 18:46   ` Mike Kravetz
  2022-11-29 22:50 ` [PATCH mm-unstable v5 08/10] mm/hugetlb: convert free_gigantic_page() " Sidhartha Kumar
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Convert callers of enqueue_huge_page() to pass in a folio; the function
is renamed to enqueue_hugetlb_folio().

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 57909a0e7157..c889593d5053 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1127,17 +1127,17 @@ static bool vma_has_reserves(struct vm_area_struct *vma, long chg)
 	return false;
 }
 
-static void enqueue_huge_page(struct hstate *h, struct page *page)
+static void enqueue_hugetlb_folio(struct hstate *h, struct folio *folio)
 {
-	int nid = page_to_nid(page);
+	int nid = folio_nid(folio);
 
 	lockdep_assert_held(&hugetlb_lock);
-	VM_BUG_ON_PAGE(page_count(page), page);
+	VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
 
-	list_move(&page->lru, &h->hugepage_freelists[nid]);
+	list_move(&folio->lru, &h->hugepage_freelists[nid]);
 	h->free_huge_pages++;
 	h->free_huge_pages_node[nid]++;
-	SetHPageFreed(page);
+	folio_set_hugetlb_freed(folio);
 }
 
 static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
@@ -1549,7 +1549,7 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
 		return;
 
 	arch_clear_hugepage_flags(&folio->page);
-	enqueue_huge_page(h, &folio->page);
+	enqueue_hugetlb_folio(h, folio);
 }
 
 static void __update_and_free_page(struct hstate *h, struct page *page)
@@ -1761,7 +1761,7 @@ void free_huge_page(struct page *page)
 		update_and_free_hugetlb_folio(h, folio, true);
 	} else {
 		arch_clear_hugepage_flags(page);
-		enqueue_huge_page(h, page);
+		enqueue_hugetlb_folio(h, folio);
 		spin_unlock_irqrestore(&hugetlb_lock, flags);
 	}
 }
@@ -2438,7 +2438,7 @@ static int gather_surplus_pages(struct hstate *h, long delta)
 		if ((--needed) < 0)
 			break;
 		/* Add the page to the hugetlb allocator */
-		enqueue_huge_page(h, page);
+		enqueue_hugetlb_folio(h, page_folio(page));
 	}
 free:
 	spin_unlock_irq(&hugetlb_lock);
@@ -2804,8 +2804,8 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 		 * Ok, old_page is still a genuine free hugepage. Remove it from
 		 * the freelist and decrease the counters. These will be
 		 * incremented again when calling __prep_account_new_huge_page()
-		 * and enqueue_huge_page() for new_page. The counters will remain
-		 * stable since this happens under the lock.
+		 * and enqueue_hugetlb_folio() for new_folio. The counters will
+		 * remain stable since this happens under the lock.
 		 */
 		remove_hugetlb_folio(h, old_folio, false);
 
@@ -2814,7 +2814,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 		 * earlier.  It can be directly added to the pool free list.
 		 */
 		__prep_account_new_huge_page(h, nid);
-		enqueue_huge_page(h, new_page);
+		enqueue_hugetlb_folio(h, new_folio);
 
 		/*
 		 * Pages have been replaced, we can safely free the old one.
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 08/10] mm/hugetlb: convert free_gigantic_page() to folios
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
                   ` (6 preceding siblings ...)
  2022-11-29 22:50 ` [PATCH mm-unstable v5 07/10] mm/hugetlb: convert enqueue_huge_page() to folios Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07 19:04   ` Mike Kravetz
  2022-11-29 22:50 ` [PATCH mm-unstable v5 09/10] mm/hugetlb: convert hugetlb prep functions " Sidhartha Kumar
  2022-11-29 22:50 ` [PATCH mm-unstable v5 10/10] mm/hugetlb: change hugetlb allocation functions to return a folio Sidhartha Kumar
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Convert callers of free_gigantic_page() to use folios; the function is
then renamed to free_gigantic_folio().

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c889593d5053..5e580ab834c3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1361,18 +1361,20 @@ static void destroy_compound_gigantic_folio(struct folio *folio,
 	__destroy_compound_gigantic_folio(folio, order, false);
 }
 
-static void free_gigantic_page(struct page *page, unsigned int order)
+static void free_gigantic_folio(struct folio *folio, unsigned int order)
 {
 	/*
 	 * If the page isn't allocated using the cma allocator,
 	 * cma_release() returns false.
 	 */
 #ifdef CONFIG_CMA
-	if (cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
+	int nid = folio_nid(folio);
+
+	if (cma_release(hugetlb_cma[nid], &folio->page, 1 << order))
 		return;
 #endif
 
-	free_contig_range(page_to_pfn(page), 1 << order);
+	free_contig_range(folio_pfn(folio), 1 << order);
 }
 
 #ifdef CONFIG_CONTIG_ALLOC
@@ -1426,7 +1428,8 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 {
 	return NULL;
 }
-static inline void free_gigantic_page(struct page *page, unsigned int order) { }
+static inline void free_gigantic_folio(struct folio *folio,
+						unsigned int order) { }
 static inline void destroy_compound_gigantic_folio(struct folio *folio,
 						unsigned int order) { }
 #endif
@@ -1565,7 +1568,7 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 	 * If we don't know which subpages are hwpoisoned, we can't free
 	 * the hugepage, so it's leaked intentionally.
 	 */
-	if (HPageRawHwpUnreliable(page))
+	if (folio_test_hugetlb_raw_hwp_unreliable(folio))
 		return;
 
 	if (hugetlb_vmemmap_restore(h, page)) {
@@ -1575,7 +1578,7 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 		 * page and put the page back on the hugetlb free list and treat
 		 * as a surplus page.
 		 */
-		add_hugetlb_folio(h, page_folio(page), true);
+		add_hugetlb_folio(h, folio, true);
 		spin_unlock_irq(&hugetlb_lock);
 		return;
 	}
@@ -1588,7 +1591,7 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 		hugetlb_clear_page_hwpoison(&folio->page);
 
 	for (i = 0; i < pages_per_huge_page(h); i++) {
-		subpage = nth_page(page, i);
+		subpage = folio_page(folio, i);
 		subpage->flags &= ~(1 << PG_locked | 1 << PG_error |
 				1 << PG_referenced | 1 << PG_dirty |
 				1 << PG_active | 1 << PG_private |
@@ -1597,12 +1600,12 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 
 	/*
 	 * Non-gigantic pages demoted from CMA allocated gigantic pages
-	 * need to be given back to CMA in free_gigantic_page.
+	 * need to be given back to CMA in free_gigantic_folio.
 	 */
 	if (hstate_is_gigantic(h) ||
 	    hugetlb_cma_folio(folio, huge_page_order(h))) {
 		destroy_compound_gigantic_folio(folio, huge_page_order(h));
-		free_gigantic_page(page, huge_page_order(h));
+		free_gigantic_folio(folio, huge_page_order(h));
 	} else {
 		__free_pages(page, huge_page_order(h));
 	}
@@ -2025,6 +2028,7 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 		nodemask_t *node_alloc_noretry)
 {
 	struct page *page;
+	struct folio *folio;
 	bool retry = false;
 
 retry:
@@ -2035,14 +2039,14 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 				nid, nmask, node_alloc_noretry);
 	if (!page)
 		return NULL;
-
+	folio = page_folio(page);
 	if (hstate_is_gigantic(h)) {
 		if (!prep_compound_gigantic_page(page, huge_page_order(h))) {
 			/*
 			 * Rare failure to convert pages to compound page.
 			 * Free pages and try again - ONCE!
 			 */
-			free_gigantic_page(page, huge_page_order(h));
+			free_gigantic_folio(folio, huge_page_order(h));
 			if (!retry) {
 				retry = true;
 				goto retry;
@@ -3050,6 +3054,7 @@ static void __init gather_bootmem_prealloc(void)
 
 	list_for_each_entry(m, &huge_boot_pages, list) {
 		struct page *page = virt_to_page(m);
+		struct folio *folio = page_folio(page);
 		struct hstate *h = m->hstate;
 
 		VM_BUG_ON(!hstate_is_gigantic(h));
@@ -3060,7 +3065,7 @@ static void __init gather_bootmem_prealloc(void)
 			free_huge_page(page); /* add to the hugepage allocator */
 		} else {
 			/* VERY unlikely inflated ref count on a tail page */
-			free_gigantic_page(page, huge_page_order(h));
+			free_gigantic_folio(folio, huge_page_order(h));
 		}
 
 		/*
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 09/10] mm/hugetlb: convert hugetlb prep functions to folios
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
                   ` (7 preceding siblings ...)
  2022-11-29 22:50 ` [PATCH mm-unstable v5 08/10] mm/hugetlb: convert free_gigantic_page() " Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07 19:35   ` Mike Kravetz
  2022-11-29 22:50 ` [PATCH mm-unstable v5 10/10] mm/hugetlb: change hugetlb allocation functions to return a folio Sidhartha Kumar
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar

Convert prep_new_huge_page() and __prep_compound_gigantic_page() to
folios.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 63 +++++++++++++++++++++++++---------------------------
 1 file changed, 30 insertions(+), 33 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5e580ab834c3..f61b4eb58cde 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1789,29 +1789,27 @@ static void __prep_new_hugetlb_folio(struct hstate *h, struct folio *folio)
 	set_hugetlb_cgroup_rsvd(folio, NULL);
 }
 
-static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
+static void prep_new_hugetlb_folio(struct hstate *h, struct folio *folio, int nid)
 {
-	struct folio *folio = page_folio(page);
-
 	__prep_new_hugetlb_folio(h, folio);
 	spin_lock_irq(&hugetlb_lock);
 	__prep_account_new_huge_page(h, nid);
 	spin_unlock_irq(&hugetlb_lock);
 }
 
-static bool __prep_compound_gigantic_page(struct page *page, unsigned int order,
-								bool demote)
+static bool __prep_compound_gigantic_folio(struct folio *folio,
+					unsigned int order, bool demote)
 {
 	int i, j;
 	int nr_pages = 1 << order;
 	struct page *p;
 
-	/* we rely on prep_new_huge_page to set the destructor */
-	set_compound_order(page, order);
-	__ClearPageReserved(page);
-	__SetPageHead(page);
+	/* we rely on prep_new_hugetlb_folio to set the destructor */
+	folio_set_compound_order(folio, order);
+	__folio_clear_reserved(folio);
+	__folio_set_head(folio);
 	for (i = 0; i < nr_pages; i++) {
-		p = nth_page(page, i);
+		p = folio_page(folio, i);
 
 		/*
 		 * For gigantic hugepages allocated through bootmem at
@@ -1853,43 +1851,41 @@ static bool __prep_compound_gigantic_page(struct page *page, unsigned int order,
 			VM_BUG_ON_PAGE(page_count(p), p);
 		}
 		if (i != 0)
-			set_compound_head(p, page);
+			set_compound_head(p, &folio->page);
 	}
-	atomic_set(compound_mapcount_ptr(page), -1);
-	atomic_set(subpages_mapcount_ptr(page), 0);
-	atomic_set(compound_pincount_ptr(page), 0);
+	atomic_set(folio_mapcount_ptr(folio), -1);
+	atomic_set(folio_subpages_mapcount_ptr(folio), 0);
+	atomic_set(folio_pincount_ptr(folio), 0);
 	return true;
 
 out_error:
 	/* undo page modifications made above */
 	for (j = 0; j < i; j++) {
-		p = nth_page(page, j);
+		p = folio_page(folio, j);
 		if (j != 0)
 			clear_compound_head(p);
 		set_page_refcounted(p);
 	}
 	/* need to clear PG_reserved on remaining tail pages  */
 	for (; j < nr_pages; j++) {
-		p = nth_page(page, j);
+		p = folio_page(folio, j);
 		__ClearPageReserved(p);
 	}
-	set_compound_order(page, 0);
-#ifdef CONFIG_64BIT
-	page[1].compound_nr = 0;
-#endif
-	__ClearPageHead(page);
+	folio_set_compound_order(folio, 0);
+	__folio_clear_head(folio);
 	return false;
 }
 
-static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
+static bool prep_compound_gigantic_folio(struct folio *folio,
+							unsigned int order)
 {
-	return __prep_compound_gigantic_page(page, order, false);
+	return __prep_compound_gigantic_folio(folio, order, false);
 }
 
-static bool prep_compound_gigantic_page_for_demote(struct page *page,
+static bool prep_compound_gigantic_folio_for_demote(struct folio *folio,
 							unsigned int order)
 {
-	return __prep_compound_gigantic_page(page, order, true);
+	return __prep_compound_gigantic_folio(folio, order, true);
 }
 
 /*
@@ -2041,7 +2037,7 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 		return NULL;
 	folio = page_folio(page);
 	if (hstate_is_gigantic(h)) {
-		if (!prep_compound_gigantic_page(page, huge_page_order(h))) {
+		if (!prep_compound_gigantic_folio(folio, huge_page_order(h))) {
 			/*
 			 * Rare failure to convert pages to compound page.
 			 * Free pages and try again - ONCE!
@@ -2054,7 +2050,7 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 			return NULL;
 		}
 	}
-	prep_new_huge_page(h, page, page_to_nid(page));
+	prep_new_hugetlb_folio(h, folio, folio_nid(folio));
 
 	return page;
 }
@@ -3058,10 +3054,10 @@ static void __init gather_bootmem_prealloc(void)
 		struct hstate *h = m->hstate;
 
 		VM_BUG_ON(!hstate_is_gigantic(h));
-		WARN_ON(page_count(page) != 1);
-		if (prep_compound_gigantic_page(page, huge_page_order(h))) {
-			WARN_ON(PageReserved(page));
-			prep_new_huge_page(h, page, page_to_nid(page));
+		WARN_ON(folio_ref_count(folio) != 1);
+		if (prep_compound_gigantic_folio(folio, huge_page_order(h))) {
+			WARN_ON(folio_test_reserved(folio));
+			prep_new_hugetlb_folio(h, folio, folio_nid(folio));
 			free_huge_page(page); /* add to the hugepage allocator */
 		} else {
 			/* VERY unlikely inflated ref count on a tail page */
@@ -3480,13 +3476,14 @@ static int demote_free_huge_page(struct hstate *h, struct page *page)
 	for (i = 0; i < pages_per_huge_page(h);
 				i += pages_per_huge_page(target_hstate)) {
 		subpage = nth_page(page, i);
+		folio = page_folio(subpage);
 		if (hstate_is_gigantic(target_hstate))
-			prep_compound_gigantic_page_for_demote(subpage,
+			prep_compound_gigantic_folio_for_demote(folio,
 							target_hstate->order);
 		else
 			prep_compound_page(subpage, target_hstate->order);
 		set_page_private(subpage, 0);
-		prep_new_huge_page(target_hstate, subpage, nid);
+		prep_new_hugetlb_folio(target_hstate, folio, nid);
 		free_huge_page(subpage);
 	}
 	mutex_unlock(&target_hstate->resize_lock);
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH mm-unstable v5 10/10] mm/hugetlb: change hugetlb allocation functions to return a folio
  2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
                   ` (8 preceding siblings ...)
  2022-11-29 22:50 ` [PATCH mm-unstable v5 09/10] mm/hugetlb: convert hugetlb prep functions " Sidhartha Kumar
@ 2022-11-29 22:50 ` Sidhartha Kumar
  2022-12-07 22:01   ` Mike Kravetz
  9 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-11-29 22:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, almasrymina, linmiaohe,
	hughd, tsahu, jhubbard, david, Sidhartha Kumar, Wei Chen,
	Rasmus Villemoes

Many hugetlb allocation helper functions have now been converted to
folios; update their higher-level callers to be compatible with folios.
alloc_pool_huge_page() is reorganized to avoid a smatch warning
reporting that the folio variable is uninitialized.
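
The reorganization follows the shape sketched below (generic sketch
with a hypothetical demo_alloc_on_node() helper; the real hunk for
alloc_pool_huge_page() is further down): success is handled inside the
loop, so the folio pointer is never examined on a path where it was not
assigned.

	static int demo_alloc_pool(struct hstate *h, nodemask_t *nodes_allowed)
	{
		struct folio *folio;
		int nr_nodes, node;

		for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
			folio = demo_alloc_on_node(h, node); /* hypothetical */
			if (folio) {
				/* free it into the hugepage allocator */
				free_huge_page(&folio->page);
				return 1;
			}
		}
		/* 'folio' is never read on this path */
		return 0;
	}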

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reported-by: Wei Chen <harperchen1110@gmail.com>
Suggested-by: John Hubbard <jhubbard@nvidia.com>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 mm/hugetlb.c | 120 ++++++++++++++++++++++++---------------------------
 1 file changed, 57 insertions(+), 63 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f61b4eb58cde..944e1222ea7f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1378,23 +1378,23 @@ static void free_gigantic_folio(struct folio *folio, unsigned int order)
 }
 
 #ifdef CONFIG_CONTIG_ALLOC
-static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
+static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 		int nid, nodemask_t *nodemask)
 {
+	struct page *page;
 	unsigned long nr_pages = pages_per_huge_page(h);
 	if (nid == NUMA_NO_NODE)
 		nid = numa_mem_id();
 
 #ifdef CONFIG_CMA
 	{
-		struct page *page;
 		int node;
 
 		if (hugetlb_cma[nid]) {
 			page = cma_alloc(hugetlb_cma[nid], nr_pages,
 					huge_page_order(h), true);
 			if (page)
-				return page;
+				return page_folio(page);
 		}
 
 		if (!(gfp_mask & __GFP_THISNODE)) {
@@ -1405,17 +1405,18 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 				page = cma_alloc(hugetlb_cma[node], nr_pages,
 						huge_page_order(h), true);
 				if (page)
-					return page;
+					return page_folio(page);
 			}
 		}
 	}
 #endif
 
-	return alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
+	page = alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
+	return page ? page_folio(page) : NULL;
 }
 
 #else /* !CONFIG_CONTIG_ALLOC */
-static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
+static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 					int nid, nodemask_t *nodemask)
 {
 	return NULL;
@@ -1423,7 +1424,7 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 #endif /* CONFIG_CONTIG_ALLOC */
 
 #else /* !CONFIG_ARCH_HAS_GIGANTIC_PAGE */
-static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
+static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 					int nid, nodemask_t *nodemask)
 {
 	return NULL;
@@ -1950,7 +1951,7 @@ pgoff_t hugetlb_basepage_index(struct page *page)
 	return (index << compound_order(page_head)) + compound_idx;
 }
 
-static struct page *alloc_buddy_huge_page(struct hstate *h,
+static struct folio *alloc_buddy_hugetlb_folio(struct hstate *h,
 		gfp_t gfp_mask, int nid, nodemask_t *nmask,
 		nodemask_t *node_alloc_noretry)
 {
@@ -1988,11 +1989,6 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
 		page = NULL;
 	}
 
-	if (page)
-		__count_vm_event(HTLB_BUDDY_PGALLOC);
-	else
-		__count_vm_event(HTLB_BUDDY_PGALLOC_FAIL);
-
 	/*
 	 * If we did not specify __GFP_RETRY_MAYFAIL, but still got a page this
 	 * indicates an overall state change.  Clear bit so that we resume
@@ -2009,7 +2005,13 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
 	if (node_alloc_noretry && !page && alloc_try_hard)
 		node_set(nid, *node_alloc_noretry);
 
-	return page;
+	if (!page) {
+		__count_vm_event(HTLB_BUDDY_PGALLOC_FAIL);
+		return NULL;
+	}
+
+	__count_vm_event(HTLB_BUDDY_PGALLOC);
+	return page_folio(page);
 }
 
 /*
@@ -2019,23 +2021,21 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
  * Note that returned page is 'frozen':  ref count of head page and all tail
  * pages is zero.
  */
-static struct page *alloc_fresh_huge_page(struct hstate *h,
+static struct folio *alloc_fresh_hugetlb_folio(struct hstate *h,
 		gfp_t gfp_mask, int nid, nodemask_t *nmask,
 		nodemask_t *node_alloc_noretry)
 {
-	struct page *page;
 	struct folio *folio;
 	bool retry = false;
 
 retry:
 	if (hstate_is_gigantic(h))
-		page = alloc_gigantic_page(h, gfp_mask, nid, nmask);
+		folio = alloc_gigantic_folio(h, gfp_mask, nid, nmask);
 	else
-		page = alloc_buddy_huge_page(h, gfp_mask,
+		folio = alloc_buddy_hugetlb_folio(h, gfp_mask,
 				nid, nmask, node_alloc_noretry);
-	if (!page)
+	if (!folio)
 		return NULL;
-	folio = page_folio(page);
 	if (hstate_is_gigantic(h)) {
 		if (!prep_compound_gigantic_folio(folio, huge_page_order(h))) {
 			/*
@@ -2052,7 +2052,7 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 	}
 	prep_new_hugetlb_folio(h, folio, folio_nid(folio));
 
-	return page;
+	return folio;
 }
 
 /*
@@ -2062,23 +2062,20 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 static int alloc_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
 				nodemask_t *node_alloc_noretry)
 {
-	struct page *page;
+	struct folio *folio;
 	int nr_nodes, node;
 	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
 
 	for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
-		page = alloc_fresh_huge_page(h, gfp_mask, node, nodes_allowed,
-						node_alloc_noretry);
-		if (page)
-			break;
+		folio = alloc_fresh_hugetlb_folio(h, gfp_mask, node,
+					nodes_allowed, node_alloc_noretry);
+		if (folio) {
+			free_huge_page(&folio->page); /* free it into the hugepage allocator */
+			return 1;
+		}
 	}
 
-	if (!page)
-		return 0;
-
-	free_huge_page(page); /* free it into the hugepage allocator */
-
-	return 1;
+	return 0;
 }
 
 /*
@@ -2237,7 +2234,7 @@ int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
 						int nid, nodemask_t *nmask)
 {
-	struct page *page = NULL;
+	struct folio *folio = NULL;
 
 	if (hstate_is_gigantic(h))
 		return NULL;
@@ -2247,8 +2244,8 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
 		goto out_unlock;
 	spin_unlock_irq(&hugetlb_lock);
 
-	page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
-	if (!page)
+	folio = alloc_fresh_hugetlb_folio(h, gfp_mask, nid, nmask, NULL);
+	if (!folio)
 		return NULL;
 
 	spin_lock_irq(&hugetlb_lock);
@@ -2260,43 +2257,42 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
 	 * codeflow
 	 */
 	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
-		SetHPageTemporary(page);
+		folio_set_hugetlb_temporary(folio);
 		spin_unlock_irq(&hugetlb_lock);
-		free_huge_page(page);
+		free_huge_page(&folio->page);
 		return NULL;
 	}
 
 	h->surplus_huge_pages++;
-	h->surplus_huge_pages_node[page_to_nid(page)]++;
+	h->surplus_huge_pages_node[folio_nid(folio)]++;
 
 out_unlock:
 	spin_unlock_irq(&hugetlb_lock);
 
-	return page;
+	return &folio->page;
 }
 
 static struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
 				     int nid, nodemask_t *nmask)
 {
-	struct page *page;
+	struct folio *folio;
 
 	if (hstate_is_gigantic(h))
 		return NULL;
 
-	page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
-	if (!page)
+	folio = alloc_fresh_hugetlb_folio(h, gfp_mask, nid, nmask, NULL);
+	if (!folio)
 		return NULL;
 
 	/* fresh huge pages are frozen */
-	set_page_refcounted(page);
-
+	folio_ref_unfreeze(folio, 1);
 	/*
 	 * We do not account these pages as surplus because they are only
 	 * temporary and will be released properly on the last reference
 	 */
-	SetHPageTemporary(page);
+	folio_set_hugetlb_temporary(folio);
 
-	return page;
+	return &folio->page;
 }
 
 /*
@@ -2745,19 +2741,18 @@ void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
 }
 
 /*
- * alloc_and_dissolve_huge_page - Allocate a new page and dissolve the old one
+ * alloc_and_dissolve_hugetlb_folio - Allocate a new folio and dissolve
+ * the old one
  * @h: struct hstate old page belongs to
- * @old_page: Old page to dissolve
+ * @old_folio: Old folio to dissolve
  * @list: List to isolate the page in case we need to
  * Returns 0 on success, otherwise negated error.
  */
-static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
-					struct list_head *list)
+static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
+			struct folio *old_folio, struct list_head *list)
 {
 	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
-	struct folio *old_folio = page_folio(old_page);
 	int nid = folio_nid(old_folio);
-	struct page *new_page;
 	struct folio *new_folio;
 	int ret = 0;
 
@@ -2768,26 +2763,25 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
 	 * the pool.  This simplifies and let us do most of the processing
 	 * under the lock.
 	 */
-	new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL);
-	if (!new_page)
+	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
+	if (!new_folio)
 		return -ENOMEM;
-	new_folio = page_folio(new_page);
 	__prep_new_hugetlb_folio(h, new_folio);
 
 retry:
 	spin_lock_irq(&hugetlb_lock);
 	if (!folio_test_hugetlb(old_folio)) {
 		/*
-		 * Freed from under us. Drop new_page too.
+		 * Freed from under us. Drop new_folio too.
 		 */
 		goto free_new;
 	} else if (folio_ref_count(old_folio)) {
 		/*
-		 * Someone has grabbed the page, try to isolate it here.
+		 * Someone has grabbed the folio, try to isolate it here.
 		 * Fail with -EBUSY if not possible.
 		 */
 		spin_unlock_irq(&hugetlb_lock);
-		ret = isolate_hugetlb(old_page, list);
+		ret = isolate_hugetlb(&old_folio->page, list);
 		spin_lock_irq(&hugetlb_lock);
 		goto free_new;
 	} else if (!folio_test_hugetlb_freed(old_folio)) {
@@ -2865,7 +2859,7 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 	if (folio_ref_count(folio) && !isolate_hugetlb(&folio->page, list))
 		ret = 0;
 	else if (!folio_ref_count(folio))
-		ret = alloc_and_dissolve_huge_page(h, &folio->page, list);
+		ret = alloc_and_dissolve_hugetlb_folio(h, folio, list);
 
 	return ret;
 }
@@ -3083,14 +3077,14 @@ static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
 			if (!alloc_bootmem_huge_page(h, nid))
 				break;
 		} else {
-			struct page *page;
+			struct folio *folio;
 			gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
 
-			page = alloc_fresh_huge_page(h, gfp_mask, nid,
+			folio = alloc_fresh_hugetlb_folio(h, gfp_mask, nid,
 					&node_states[N_MEMORY], NULL);
-			if (!page)
+			if (!folio)
 				break;
-			free_huge_page(page); /* free it into the hugepage allocator */
+			free_huge_page(&folio->page); /* free it into the hugepage allocator */
 		}
 		cond_resched();
 	}
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-11-29 22:50 ` [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions Sidhartha Kumar
@ 2022-12-07  0:18   ` Mike Kravetz
  2022-12-07  3:34   ` Muchun Song
  1 sibling, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07  0:18 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Add folio equivalents for set_compound_order() and set_compound_page_dtor().
> 
> Also remove extra new-lines introduced by mm/hugetlb: convert
> move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
> hugetlb_cgroup_uncharge_page() to folios.
> 
> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
> Suggested-by: Muchun Song <songmuchun@bytedance.com>
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  include/linux/mm.h | 16 ++++++++++++++++
>  mm/hugetlb.c       |  4 +---
>  2 files changed, 17 insertions(+), 3 deletions(-)

Sorry, for the delay,

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 02/10] mm/hugetlb: convert destroy_compound_gigantic_page() to folios
  2022-11-29 22:50 ` [PATCH mm-unstable v5 02/10] mm/hugetlb: convert destroy_compound_gigantic_page() to folios Sidhartha Kumar
@ 2022-12-07  0:32   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07  0:32 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Convert page operations within __destroy_compound_gigantic_page() to the
> corresponding folio operations.
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 43 +++++++++++++++++++++----------------------
>  1 file changed, 21 insertions(+), 22 deletions(-)

Looks fine,

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 03/10] mm/hugetlb: convert dissolve_free_huge_page() to folios
  2022-11-29 22:50 ` [PATCH mm-unstable v5 03/10] mm/hugetlb: convert dissolve_free_huge_page() " Sidhartha Kumar
@ 2022-12-07  0:52   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07  0:52 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Removes compound_head() call by using a folio rather than a head page.
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 04/10] mm/hugetlb: convert remove_hugetlb_page() to folios
  2022-11-29 22:50 ` [PATCH mm-unstable v5 04/10] mm/hugetlb: convert remove_hugetlb_page() " Sidhartha Kumar
@ 2022-12-07  1:43   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07  1:43 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Removes page_folio() call by converting callers to directly pass a folio
> into __remove_hugetlb_page().
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 48 +++++++++++++++++++++++++-----------------------
>  1 file changed, 25 insertions(+), 23 deletions(-)

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 05/10] mm/hugetlb: convert update_and_free_page() to folios
  2022-11-29 22:50 ` [PATCH mm-unstable v5 05/10] mm/hugetlb: convert update_and_free_page() " Sidhartha Kumar
@ 2022-12-07  2:02   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07  2:02 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Make more progress on converting the free_huge_page() destructor to
> operate on folios by converting update_and_free_page() to folios.
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 30 ++++++++++++++++--------------
>  1 file changed, 16 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 90ba01a76f87..83777d1ccbf3 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
<snip>

Minor nit, not worth fixing now.  But, we should cleanup sometime.  Comments
below still refer to pages when we are operating on folios.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz

> @@ -2818,7 +2820,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>  		 * Pages have been replaced, we can safely free the old one.
>  		 */
>  		spin_unlock_irq(&hugetlb_lock);
> -		update_and_free_page(h, old_page, false);
> +		update_and_free_hugetlb_folio(h, old_folio, false);
>  	}
>  
>  	return ret;
> @@ -2827,7 +2829,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>  	spin_unlock_irq(&hugetlb_lock);
>  	/* Page has a zero ref count, but needs a ref to be freed */
>  	folio_ref_unfreeze(new_folio, 1);
> -	update_and_free_page(h, new_page, false);
> +	update_and_free_hugetlb_folio(h, new_folio, false);
>  
>  	return ret;
>  }
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-11-29 22:50 ` [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions Sidhartha Kumar
  2022-12-07  0:18   ` Mike Kravetz
@ 2022-12-07  3:34   ` Muchun Song
  2022-12-07  3:42     ` Mike Kravetz
  1 sibling, 1 reply; 33+ messages in thread
From: Muchun Song @ 2022-12-07  3:34 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, Linux Memory Management List, Andrew Morton,
	Muchun Song, Mike Kravetz, Matthew Wilcox, Mina Almasry,
	Miaohe Lin, hughd, tsahu, jhubbard, David Hildenbrand



> On Nov 30, 2022, at 06:50, Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
> 
> Add folio equivalents for set_compound_order() and set_compound_page_dtor().
> 
> Also remove extra new-lines introduced by mm/hugetlb: convert
> move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
> hugetlb_cgroup_uncharge_page() to folios.
> 
> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
> Suggested-by: Muchun Song <songmuchun@bytedance.com>
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
> include/linux/mm.h | 16 ++++++++++++++++
> mm/hugetlb.c       |  4 +---
> 2 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a48c5ad16a5e..2bdef8a5298a 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -972,6 +972,13 @@ static inline void set_compound_page_dtor(struct page *page,
> page[1].compound_dtor = compound_dtor;
> }
> 
> +static inline void folio_set_compound_dtor(struct folio *folio,
> + enum compound_dtor_id compound_dtor)
> +{
> + VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
> + folio->_folio_dtor = compound_dtor;
> +}
> +
> void destroy_large_folio(struct folio *folio);
> 
> static inline int head_compound_pincount(struct page *head)
> @@ -987,6 +994,15 @@ static inline void set_compound_order(struct page *page, unsigned int order)
> #endif
> }
> 
> +static inline void folio_set_compound_order(struct folio *folio,
> + unsigned int order)
> +{
> + folio->_folio_order = order;
> +#ifdef CONFIG_64BIT
> + folio->_folio_nr_pages = order ? 1U << order : 0;

It seems that you think the user could pass 0 to order. However,
->_folio_nr_pages and ->_folio_order fields are invalid for order-0 pages.
You should not touch it. So this should be:

static inline void folio_set_compound_order(struct folio *folio,
					    unsigned int order)
{
	if (!folio_test_large(folio))
		return;

	folio->_folio_order = order;
#ifdef CONFIG_64BIT
	folio->_folio_nr_pages = 1U << order;
#endif
}

If we can make sure all the users of folio_set_compound_order() should pass
a non-order-0 page (it is true for now), then I suggest adding a VM_BUG_ON()
here to catch unexpected users.

static inline void folio_set_compound_order(struct folio *folio,
					    unsigned int order)
{
	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);

	folio->_folio_order = order;
#ifdef CONFIG_64BIT
	folio->_folio_nr_pages = 1U << order;
#endif
}

Thanks.

> +#endif
> +}
> +


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-07  3:34   ` Muchun Song
@ 2022-12-07  3:42     ` Mike Kravetz
  2022-12-07  4:11       ` Muchun Song
  0 siblings, 1 reply; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07  3:42 UTC (permalink / raw)
  To: Muchun Song
  Cc: Sidhartha Kumar, linux-kernel, Linux Memory Management List,
	Andrew Morton, Muchun Song, Matthew Wilcox, Mina Almasry,
	Miaohe Lin, hughd, tsahu, jhubbard, David Hildenbrand

On 12/07/22 11:34, Muchun Song wrote:
> 
> 
> > On Nov 30, 2022, at 06:50, Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
> > 
> > Add folio equivalents for set_compound_order() and set_compound_page_dtor().
> > 
> > Also remove extra new-lines introduced by mm/hugetlb: convert
> > move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
> > hugetlb_cgroup_uncharge_page() to folios.
> > 
> > Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
> > Suggested-by: Muchun Song <songmuchun@bytedance.com>
> > Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> > ---
> > include/linux/mm.h | 16 ++++++++++++++++
> > mm/hugetlb.c       |  4 +---
> > 2 files changed, 17 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index a48c5ad16a5e..2bdef8a5298a 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -972,6 +972,13 @@ static inline void set_compound_page_dtor(struct page *page,
> > page[1].compound_dtor = compound_dtor;
> > }
> > 
> > +static inline void folio_set_compound_dtor(struct folio *folio,
> > + enum compound_dtor_id compound_dtor)
> > +{
> > + VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
> > + folio->_folio_dtor = compound_dtor;
> > +}
> > +
> > void destroy_large_folio(struct folio *folio);
> > 
> > static inline int head_compound_pincount(struct page *head)
> > @@ -987,6 +994,15 @@ static inline void set_compound_order(struct page *page, unsigned int order)
> > #endif
> > }
> > 
> > +static inline void folio_set_compound_order(struct folio *folio,
> > + unsigned int order)
> > +{
> > + folio->_folio_order = order;
> > +#ifdef CONFIG_64BIT
> > + folio->_folio_nr_pages = order ? 1U << order : 0;
> 
> It seems that you think the user could pass 0 to order. However,
> ->_folio_nr_pages and ->_folio_order fields are invalid for order-0 pages.
> You should not touch it. So this should be:
> 
> static inline void folio_set_compound_order(struct folio *folio,
> 					    unsigned int order)
> {
> 	if (!folio_test_large(folio))
> 		return;
> 
> 	folio->_folio_order = order;
> #ifdef CONFIG_64BIT
> 	folio->_folio_nr_pages = 1U << order;
> #endif
> }

I believe this was changed to accommodate the code in
__destroy_compound_gigantic_page().  It is used in a subsequent patch.
Here is the v6.0 version of the routine.

static void __destroy_compound_gigantic_page(struct page *page,
					unsigned int order, bool demote)
{
	int i;
	int nr_pages = 1 << order;
	struct page *p = page + 1;

	atomic_set(compound_mapcount_ptr(page), 0);
	atomic_set(compound_pincount_ptr(page), 0);

	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
		p->mapping = NULL;
		clear_compound_head(p);
		if (!demote)
			set_page_refcounted(p);
	}

	set_compound_order(page, 0);
#ifdef CONFIG_64BIT
	page[1].compound_nr = 0;
#endif
	__ClearPageHead(page);
}


Might have been better to change this set_compound_order call to
folio_set_compound_order in this patch.
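
For illustration, a rough sketch of how the tail of that routine could
look with the new helper (hypothetical, not taken from the posted
patches; a __folio_clear_head() helper is assumed here):

	/* folio equivalent of the open-coded order/compound_nr reset above */
	folio_set_compound_order(folio, 0);	/* also zeroes _folio_nr_pages on 64-bit */
	__folio_clear_head(folio);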
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-07  3:42     ` Mike Kravetz
@ 2022-12-07  4:11       ` Muchun Song
  2022-12-07 18:12         ` Mike Kravetz
  2022-12-12 18:34         ` David Hildenbrand
  0 siblings, 2 replies; 33+ messages in thread
From: Muchun Song @ 2022-12-07  4:11 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Sidhartha Kumar, linux-kernel, Linux Memory Management List,
	Andrew Morton, Muchun Song, Matthew Wilcox, Mina Almasry,
	Miaohe Lin, hughd, tsahu, jhubbard, David Hildenbrand



> On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@oracle.com> wrote:
> 
> On 12/07/22 11:34, Muchun Song wrote:
>> 
>> 
>>> On Nov 30, 2022, at 06:50, Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
>>> 
>>> Add folio equivalents for set_compound_order() and set_compound_page_dtor().
>>> 
>>> Also remove extra new-lines introduced by mm/hugetlb: convert
>>> move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
>>> hugetlb_cgroup_uncharge_page() to folios.
>>> 
>>> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
>>> Suggested-by: Muchun Song <songmuchun@bytedance.com>
>>> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
>>> ---
>>> include/linux/mm.h | 16 ++++++++++++++++
>>> mm/hugetlb.c       |  4 +---
>>> 2 files changed, 17 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index a48c5ad16a5e..2bdef8a5298a 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -972,6 +972,13 @@ static inline void set_compound_page_dtor(struct page *page,
>>> page[1].compound_dtor = compound_dtor;
>>> }
>>> 
>>> +static inline void folio_set_compound_dtor(struct folio *folio,
>>> + enum compound_dtor_id compound_dtor)
>>> +{
>>> + VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
>>> + folio->_folio_dtor = compound_dtor;
>>> +}
>>> +
>>> void destroy_large_folio(struct folio *folio);
>>> 
>>> static inline int head_compound_pincount(struct page *head)
>>> @@ -987,6 +994,15 @@ static inline void set_compound_order(struct page *page, unsigned int order)
>>> #endif
>>> }
>>> 
>>> +static inline void folio_set_compound_order(struct folio *folio,
>>> + unsigned int order)
>>> +{
>>> + folio->_folio_order = order;
>>> +#ifdef CONFIG_64BIT
>>> + folio->_folio_nr_pages = order ? 1U << order : 0;
>> 
>> It seems that you think the user could pass 0 to order. However,
>> ->_folio_nr_pages and ->_folio_order fields are invalid for order-0 pages.
>> You should not touch it. So this should be:
>> 
>> static inline void folio_set_compound_order(struct folio *folio,
>>     unsigned int order)
>> {
>> 	if (!folio_test_large(folio))
>> 		return;
>> 
>> 	folio->_folio_order = order;
>> #ifdef CONFIG_64BIT
>> 	folio->_folio_nr_pages = 1U << order;
>> #endif
>> }
> 
> I believe this was changed to accommodate the code in
> __destroy_compound_gigantic_page().  It is used in a subsequent patch.
> Here is the v6.0 version of the routine.

Thanks for your clarification.

> 
> static void __destroy_compound_gigantic_page(struct page *page,
> unsigned int order, bool demote)
> {
> 	int i;
> 	int nr_pages = 1 << order;
> 	struct page *p = page + 1;
> 
> 	atomic_set(compound_mapcount_ptr(page), 0);
> 	atomic_set(compound_pincount_ptr(page), 0);
> 
> 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
> 		p->mapping = NULL;
> 		clear_compound_head(p);
> 		if (!demote)
> 			set_page_refcounted(p);
> 	}
> 
> 	set_compound_order(page, 0);
> #ifdef CONFIG_64BIT
> 	page[1].compound_nr = 0;
> #endif
> 	__ClearPageHead(page);
> }
> 
> 
> Might have been better to change this set_compound_order call to
> folio_set_compound_order in this patch.
> 

Agree. It has confused me a lot. I suggest changing the code to the
followings. The folio_test_large() check is still to avoid unexpected
users for OOB.

static inline void folio_set_compound_order(struct folio *folio,
					    unsigned int order)
{
	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
	// or
	// if (!folio_test_large(folio))
	// 	return;

	folio->_folio_order = order;
#ifdef CONFIG_64BIT
	folio->_folio_nr_pages = order ? 1U << order : 0;
#endif
}

Thanks.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-07  4:11       ` Muchun Song
@ 2022-12-07 18:12         ` Mike Kravetz
  2022-12-07 18:49           ` Sidhartha Kumar
  2022-12-12 18:34         ` David Hildenbrand
  1 sibling, 1 reply; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07 18:12 UTC (permalink / raw)
  To: Muchun Song
  Cc: Sidhartha Kumar, linux-kernel, Linux Memory Management List,
	Andrew Morton, Muchun Song, Matthew Wilcox, Mina Almasry,
	Miaohe Lin, hughd, tsahu, jhubbard, David Hildenbrand

On 12/07/22 12:11, Muchun Song wrote:
> 
> 
> > On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > 
> > On 12/07/22 11:34, Muchun Song wrote:
> >> 
> >> 
> >>> On Nov 30, 2022, at 06:50, Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
> >>> 
> >>> Add folio equivalents for set_compound_order() and set_compound_page_dtor().
> >>> 
> >>> Also remove extra new-lines introduced by mm/hugetlb: convert
> >>> move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
> >>> hugetlb_cgroup_uncharge_page() to folios.
> >>> 
> >>> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
> >>> Suggested-by: Muchun Song <songmuchun@bytedance.com>
> >>> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> >>> ---
> >>> include/linux/mm.h | 16 ++++++++++++++++
> >>> mm/hugetlb.c       |  4 +---
> >>> 2 files changed, 17 insertions(+), 3 deletions(-)
> >>> 
> >>> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >>> index a48c5ad16a5e..2bdef8a5298a 100644
> >>> --- a/include/linux/mm.h
> >>> +++ b/include/linux/mm.h
> >>> @@ -972,6 +972,13 @@ static inline void set_compound_page_dtor(struct page *page,
> >>> page[1].compound_dtor = compound_dtor;
> >>> }
> >>> 
> >>> +static inline void folio_set_compound_dtor(struct folio *folio,
> >>> + enum compound_dtor_id compound_dtor)
> >>> +{
> >>> + VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
> >>> + folio->_folio_dtor = compound_dtor;
> >>> +}
> >>> +
> >>> void destroy_large_folio(struct folio *folio);
> >>> 
> >>> static inline int head_compound_pincount(struct page *head)
> >>> @@ -987,6 +994,15 @@ static inline void set_compound_order(struct page *page, unsigned int order)
> >>> #endif
> >>> }
> >>> 
> >>> +static inline void folio_set_compound_order(struct folio *folio,
> >>> + unsigned int order)
> >>> +{
> >>> + folio->_folio_order = order;
> >>> +#ifdef CONFIG_64BIT
> >>> + folio->_folio_nr_pages = order ? 1U << order : 0;
> >> 
> >> It seems that you think the user could pass 0 to order. However,
> >> ->_folio_nr_pages and ->_folio_order fields are invalid for order-0 pages.
> >> You should not touch it. So this should be:
> >> 
> >> static inline void folio_set_compound_order(struct folio *folio,
> >>     unsigned int order)
> >> {
> >> 	if (!folio_test_large(folio))
> >> 		return;
> >> 
> >> 	folio->_folio_order = order;
> >> #ifdef CONFIG_64BIT
> >> 	folio->_folio_nr_pages = 1U << order;
> >> #endif
> >> }
> > 
> > I believe this was changed to accommodate the code in
> > __destroy_compound_gigantic_page().  It is used in a subsequent patch.
> > Here is the v6.0 version of the routine.
> 
> Thanks for your clarification.
> 
> > 
> > static void __destroy_compound_gigantic_page(struct page *page,
> > unsigned int order, bool demote)
> > {
> > 	int i;
> > 	int nr_pages = 1 << order;
> > 	struct page *p = page + 1;
> > 
> > 	atomic_set(compound_mapcount_ptr(page), 0);
> > 	atomic_set(compound_pincount_ptr(page), 0);
> > 
> > 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
> > 		p->mapping = NULL;
> > 		clear_compound_head(p);
> > 		if (!demote)
> > 			set_page_refcounted(p);
> > 	}
> > 
> > 	set_compound_order(page, 0);
> > #ifdef CONFIG_64BIT
> > 	page[1].compound_nr = 0;
> > #endif
> > 	__ClearPageHead(page);
> > }
> > 
> > 
> > Might have been better to change this set_compound_order call to
> > folio_set_compound_order in this patch.
> > 
> 
> Agree. It has confused me a lot. I suggest changing the code to the
> followings. The folio_test_large() check is still to avoid unexpected
> users for OOB.
> 
> static inline void folio_set_compound_order(struct folio *folio,
> 					    unsigned int order)
> {
> 	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> 	// or
> 	// if (!folio_test_large(folio))
> 	// 	return;
> 
> 	folio->_folio_order = order;
> #ifdef CONFIG_64BIT
> 	folio->_folio_nr_pages = order ? 1U << order : 0;
> #endif
> }

I think the VM_BUG_ON_FOLIO is appropriate as it would at least flag
data corruption.

Thinking about this some more, it seems that hugetlb is the only caller
that abuses folio_set_compound_order (and previously set_compound_order)
by passing in a zero order.  Since it is unlikely that anyone knows of
this abuse, it might be good to add a comment to the routine to note
why it handles the zero case.  This might help prevent changes which
would potentially break hugetlb.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 06/10] mm/hugetlb: convert add_hugetlb_page() to folios and add hugetlb_cma_folio()
  2022-11-29 22:50 ` [PATCH mm-unstable v5 06/10] mm/hugetlb: convert add_hugetlb_page() to folios and add hugetlb_cma_folio() Sidhartha Kumar
@ 2022-12-07 18:38   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07 18:38 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Convert add_hugetlb_page() to take in a folio, also convert
> hugetlb_cma_page() to take in a folio.
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 42 +++++++++++++++++++++---------------------
>  1 file changed, 21 insertions(+), 21 deletions(-)

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 07/10] mm/hugetlb: convert enqueue_huge_page() to folios
  2022-11-29 22:50 ` [PATCH mm-unstable v5 07/10] mm/hugetlb: convert enqueue_huge_page() to folios Sidhartha Kumar
@ 2022-12-07 18:46   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07 18:46 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Convert callers of enqueue_huge_page() to pass in a folio, function is
> renamed to enqueue_hugetlb_folio().
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-07 18:12         ` Mike Kravetz
@ 2022-12-07 18:49           ` Sidhartha Kumar
  2022-12-07 19:05             ` Sidhartha Kumar
  0 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-12-07 18:49 UTC (permalink / raw)
  To: Mike Kravetz, Muchun Song
  Cc: linux-kernel, Linux Memory Management List, Andrew Morton,
	Muchun Song, Matthew Wilcox, Mina Almasry, Miaohe Lin, hughd,
	tsahu, jhubbard, David Hildenbrand

On 12/7/22 10:12 AM, Mike Kravetz wrote:
> On 12/07/22 12:11, Muchun Song wrote:
>>
>>
>>> On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>
>>> On 12/07/22 11:34, Muchun Song wrote:
>>>>
>>>>
>>>>> On Nov 30, 2022, at 06:50, Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
>>>>>
>>>>> Add folio equivalents for set_compound_order() and set_compound_page_dtor().
>>>>>
>>>>> Also remove extra new-lines introduced by mm/hugetlb: convert
>>>>> move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
>>>>> hugetlb_cgroup_uncharge_page() to folios.
>>>>>
>>>>> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
>>>>> Suggested-by: Muchun Song <songmuchun@bytedance.com>
>>>>> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
>>>>> ---
>>>>> include/linux/mm.h | 16 ++++++++++++++++
>>>>> mm/hugetlb.c       |  4 +---
>>>>> 2 files changed, 17 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>> index a48c5ad16a5e..2bdef8a5298a 100644
>>>>> --- a/include/linux/mm.h
>>>>> +++ b/include/linux/mm.h
>>>>> @@ -972,6 +972,13 @@ static inline void set_compound_page_dtor(struct page *page,
>>>>> page[1].compound_dtor = compound_dtor;
>>>>> }
>>>>>
>>>>> +static inline void folio_set_compound_dtor(struct folio *folio,
>>>>> + enum compound_dtor_id compound_dtor)
>>>>> +{
>>>>> + VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
>>>>> + folio->_folio_dtor = compound_dtor;
>>>>> +}
>>>>> +
>>>>> void destroy_large_folio(struct folio *folio);
>>>>>
>>>>> static inline int head_compound_pincount(struct page *head)
>>>>> @@ -987,6 +994,15 @@ static inline void set_compound_order(struct page *page, unsigned int order)
>>>>> #endif
>>>>> }
>>>>>
>>>>> +static inline void folio_set_compound_order(struct folio *folio,
>>>>> + unsigned int order)
>>>>> +{
>>>>> + folio->_folio_order = order;
>>>>> +#ifdef CONFIG_64BIT
>>>>> + folio->_folio_nr_pages = order ? 1U << order : 0;
>>>>
>>>> It seems that you think the user could pass 0 to order. However,
>>>> ->_folio_nr_pages and ->_folio_order fields are invalid for order-0 pages.
>>>> You should not touch it. So this should be:
>>>>
>>>> static inline void folio_set_compound_order(struct folio *folio,
>>>>      unsigned int order)
>>>> {
>>>> 	if (!folio_test_large(folio))
>>>> 		return;
>>>>
>>>> 	folio->_folio_order = order;
>>>> #ifdef CONFIG_64BIT
>>>> 	folio->_folio_nr_pages = 1U << order;
>>>> #endif
>>>> }
>>>
>>> I believe this was changed to accommodate the code in
>>> __destroy_compound_gigantic_page().  It is used in a subsequent patch.
>>> Here is the v6.0 version of the routine.
>>
>> Thanks for your clarification.
>>
>>>
>>> static void __destroy_compound_gigantic_page(struct page *page,
>>> unsigned int order, bool demote)
>>> {
>>> 	int i;
>>> 	int nr_pages = 1 << order;
>>> 	struct page *p = page + 1;
>>>
>>> 	atomic_set(compound_mapcount_ptr(page), 0);
>>> 	atomic_set(compound_pincount_ptr(page), 0);
>>>
>>> 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
>>> 		p->mapping = NULL;
>>> 		clear_compound_head(p);
>>> 		if (!demote)
>>> 			set_page_refcounted(p);
>>> 	}
>>>
>>> 	set_compound_order(page, 0);
>>> #ifdef CONFIG_64BIT
>>> 	page[1].compound_nr = 0;
>>> #endif
>>> 	__ClearPageHead(page);
>>> }
>>>
>>>
>>> Might have been better to change this set_compound_order call to
>>> folio_set_compound_order in this patch.
>>>
>>
>> Agree. It has confused me a lot. I suggest changing the code to the
>> followings. The folio_test_large() check is still to avoid unexpected
>> users for OOB.
>>
>> static inline void folio_set_compound_order(struct folio *folio,
>> 					    unsigned int order)
>> {
>> 	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>> 	// or
>> 	// if (!folio_test_large(folio))
>> 	// 	return;
>>
>> 	folio->_folio_order = order;
>> #ifdef CONFIG_64BIT
>> 	folio->_folio_nr_pages = order ? 1U << order : 0;
>> #endif
>> }
> 
> I think the VM_BUG_ON_FOLIO is appropriate as it would at least flag
> data corruption.
> 
As Mike pointed out, my intention with supporting the 0 case was to 
cleanup the __destroy_compound_gigantic_page code by moving the ifdef 
CONFIG_64BIT lines to folio_set_compound_order(). I'll add the 
VM_BUG_ON_FOLIO line as well as a comment to make it clear it is not 
normally supported.

> Thinking about this some more, it seems that hugetlb is the only caller
> that abuses folio_set_compound_order (and previously set_compound_order)
> by passing in a zero order.  Since it is unlikely that anyone knows of
> this abuse, it might be good to add a comment to the routine to note
> why it handles the zero case.  This might help prevent changes which
> would potentially break hugetlb.

+/*
+ * _folio_nr_pages and _folio_order are invalid for
+ * order-zero pages. An exception is hugetlb, which passes
+ * in a zero order in __destroy_compound_gigantic_page().
+ */
  static inline void folio_set_compound_order(struct folio *folio,
                 unsigned int order)
  {
+       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
+
         folio->_folio_order = order;
  #ifdef CONFIG_64BIT
         folio->_folio_nr_pages = order ? 1U << order : 0;

Does this comment work?



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 08/10] mm/hugetlb: convert free_gigantic_page() to folios
  2022-11-29 22:50 ` [PATCH mm-unstable v5 08/10] mm/hugetlb: convert free_gigantic_page() " Sidhartha Kumar
@ 2022-12-07 19:04   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07 19:04 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Convert callers of free_gigantic_page() to use folios, function is then
> renamed to free_gigantic_folio().

More changes to __update_and_free_page than expected based on this
description.  However, they are all folio conversion related so ...

> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 29 +++++++++++++++++------------
>  1 file changed, 17 insertions(+), 12 deletions(-)

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-07 18:49           ` Sidhartha Kumar
@ 2022-12-07 19:05             ` Sidhartha Kumar
  2022-12-07 19:25               ` Mike Kravetz
  0 siblings, 1 reply; 33+ messages in thread
From: Sidhartha Kumar @ 2022-12-07 19:05 UTC (permalink / raw)
  To: Mike Kravetz, Muchun Song
  Cc: linux-kernel, Linux Memory Management List, Andrew Morton,
	Muchun Song, Matthew Wilcox, Mina Almasry, Miaohe Lin, hughd,
	tsahu, jhubbard, David Hildenbrand

On 12/7/22 10:49 AM, Sidhartha Kumar wrote:
> On 12/7/22 10:12 AM, Mike Kravetz wrote:
>> On 12/07/22 12:11, Muchun Song wrote:
>>>
>>>
>>>> On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>>
>>>> On 12/07/22 11:34, Muchun Song wrote:
>>>>>
>>>>>
>>>>>> On Nov 30, 2022, at 06:50, Sidhartha Kumar 
>>>>>> <sidhartha.kumar@oracle.com> wrote:
>>>>>>
>>>>>> Add folio equivalents for set_compound_order() and 
>>>>>> set_compound_page_dtor().
>>>>>>
>>>>>> Also remove extra new-lines introduced by mm/hugetlb: convert
>>>>>> move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
>>>>>> hugetlb_cgroup_uncharge_page() to folios.
>>>>>>
>>>>>> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
>>>>>> Suggested-by: Muchun Song <songmuchun@bytedance.com>
>>>>>> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
>>>>>> ---
>>>>>> include/linux/mm.h | 16 ++++++++++++++++
>>>>>> mm/hugetlb.c       |  4 +---
>>>>>> 2 files changed, 17 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>>> index a48c5ad16a5e..2bdef8a5298a 100644
>>>>>> --- a/include/linux/mm.h
>>>>>> +++ b/include/linux/mm.h
>>>>>> @@ -972,6 +972,13 @@ static inline void 
>>>>>> set_compound_page_dtor(struct page *page,
>>>>>> page[1].compound_dtor = compound_dtor;
>>>>>> }
>>>>>>
>>>>>> +static inline void folio_set_compound_dtor(struct folio *folio,
>>>>>> + enum compound_dtor_id compound_dtor)
>>>>>> +{
>>>>>> + VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
>>>>>> + folio->_folio_dtor = compound_dtor;
>>>>>> +}
>>>>>> +
>>>>>> void destroy_large_folio(struct folio *folio);
>>>>>>
>>>>>> static inline int head_compound_pincount(struct page *head)
>>>>>> @@ -987,6 +994,15 @@ static inline void set_compound_order(struct 
>>>>>> page *page, unsigned int order)
>>>>>> #endif
>>>>>> }
>>>>>>
>>>>>> +static inline void folio_set_compound_order(struct folio *folio,
>>>>>> + unsigned int order)
>>>>>> +{
>>>>>> + folio->_folio_order = order;
>>>>>> +#ifdef CONFIG_64BIT
>>>>>> + folio->_folio_nr_pages = order ? 1U << order : 0;
>>>>>
>>>>> It seems that you think the user could pass 0 to order. However,
>>>>> ->_folio_nr_pages and ->_folio_order fields are invalid for order-0 
>>>>> pages.
>>>>> You should not touch it. So this should be:
>>>>>
>>>>> static inline void folio_set_compound_order(struct folio *folio,
>>>>>      unsigned int order)
>>>>> {
>>>>>     if (!folio_test_large(folio))
>>>>>         return;
>>>>>
>>>>>     folio->_folio_order = order;
>>>>> #ifdef CONFIG_64BIT
>>>>>     folio->_folio_nr_pages = 1U << order;
>>>>> #endif
>>>>> }
>>>>
>>>> I believe this was changed to accommodate the code in
>>>> __destroy_compound_gigantic_page().  It is used in a subsequent patch.
>>>> Here is the v6.0 version of the routine.
>>>
>>> Thanks for your clarification.
>>>
>>>>
>>>> static void __destroy_compound_gigantic_page(struct page *page,
>>>> unsigned int order, bool demote)
>>>> {
>>>>     int i;
>>>>     int nr_pages = 1 << order;
>>>>     struct page *p = page + 1;
>>>>
>>>>     atomic_set(compound_mapcount_ptr(page), 0);
>>>>     atomic_set(compound_pincount_ptr(page), 0);
>>>>
>>>>     for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
>>>>         p->mapping = NULL;
>>>>         clear_compound_head(p);
>>>>         if (!demote)
>>>>             set_page_refcounted(p);
>>>>     }
>>>>
>>>>     set_compound_order(page, 0);
>>>> #ifdef CONFIG_64BIT
>>>>     page[1].compound_nr = 0;
>>>> #endif
>>>>     __ClearPageHead(page);
>>>> }
>>>>
>>>>
>>>> Might have been better to change this set_compound_order call to
>>>> folio_set_compound_order in this patch.
>>>>
>>>
>>> Agree. It has confused me a lot. I suggest changing the code to the
>>> followings. The folio_test_large() check is still to avoid unexpected
>>> users for OOB.
>>>
>>> static inline void folio_set_compound_order(struct folio *folio,
>>>                         unsigned int order)
>>> {
>>>     VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>>     // or
>>>     // if (!folio_test_large(folio))
>>>     //     return;
>>>
>>>     folio->_folio_order = order;
>>> #ifdef CONFIG_64BIT
>>>     folio->_folio_nr_pages = order ? 1U << order : 0;
>>> #endif
>>> }
>>
>> I think the VM_BUG_ON_FOLIO is appropriate as it would at least flag
>> data corruption.
>>
> As Mike pointed out, my intention with supporting the 0 case was to 
> cleanup the __destroy_compound_gigantic_page code by moving the ifdef 
> CONFIG_64BIT lines to folio_set_compound_order(). I'll add the 
> VM_BUG_ON_FOLIO line as well as a comment to make it clear it is not 
> normally supported.
> 
>> Thinking about this some more, it seems that hugetlb is the only caller
>> that abuses folio_set_compound_order (and previously set_compound_order)
>> by passing in a zero order.  Since it is unlikely that anyone knows of
>> this abuse, it might be good to add a comment to the routine to note
>> why it handles the zero case.  This might help prevent changes which
>> would potentially break hugetlb.
> 
> +/*
> + * _folio_nr_pages and _folio_order are invalid for
> + * order-zero pages. An exception is hugetlb, which passes
> + * in a zero order in __destroy_compound_gigantic_page().
> + */
>   static inline void folio_set_compound_order(struct folio *folio,
>                  unsigned int order)
>   {
> +       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> +
>          folio->_folio_order = order;
>   #ifdef CONFIG_64BIT
>          folio->_folio_nr_pages = order ? 1U << order : 0;
> 
> Does this comment work?
> 
> 

I will change the comment from referencing 
__destroy_compound_gigantic_page()
to __destroy_compound_gigantic_folio(), although 
__prep_compound_gigantic_folio() is another user of 
folio_set_compound_order(folio, 0). Should the sentence just be "An 
exception is hugetlb, which passes in a zero order"?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-07 19:05             ` Sidhartha Kumar
@ 2022-12-07 19:25               ` Mike Kravetz
  2022-12-08  2:19                 ` Muchun Song
  0 siblings, 1 reply; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07 19:25 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: Muchun Song, linux-kernel, Linux Memory Management List,
	Andrew Morton, Muchun Song, Matthew Wilcox, Mina Almasry,
	Miaohe Lin, hughd, tsahu, jhubbard, David Hildenbrand

On 12/07/22 11:05, Sidhartha Kumar wrote:
> On 12/7/22 10:49 AM, Sidhartha Kumar wrote:
> > On 12/7/22 10:12 AM, Mike Kravetz wrote:
> > > On 12/07/22 12:11, Muchun Song wrote:
> > > > > On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > > > > On 12/07/22 11:34, Muchun Song wrote:
> > > > 
> > > > Agree. It has confused me a lot. I suggest changing the code to the
> > > > followings. The folio_test_large() check is still to avoid unexpected
> > > > users for OOB.
> > > > 
> > > > static inline void folio_set_compound_order(struct folio *folio,
> > > >                         unsigned int order)
> > > > {
> > > >     VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> > > >     // or
> > > >     // if (!folio_test_large(folio))
> > > >     //     return;
> > > > 
> > > >     folio->_folio_order = order;
> > > > #ifdef CONFIG_64BIT
> > > >     folio->_folio_nr_pages = order ? 1U << order : 0;
> > > > #endif
> > > > }
> > > 
> > > I think the VM_BUG_ON_FOLIO is appropriate as it would at least flag
> > > data corruption.
> > > 
> > As Mike pointed out, my intention with supporting the 0 case was to
> > cleanup the __destroy_compound_gigantic_page code by moving the ifdef
> > CONFIG_64BIT lines to folio_set_compound_order(). I'll add the
> > VM_BUG_ON_FOLIO line as well as a comment to make it clear it is not
> > normally supported.
> > 
> > > Thinking about this some more, it seems that hugetlb is the only caller
> > > that abuses folio_set_compound_order (and previously set_compound_order)
> > > by passing in a zero order.  Since it is unlikely that anyone knows of
> > > this abuse, it might be good to add a comment to the routine to note
> > > why it handles the zero case.  This might help prevent changes which
> > > would potentially break hugetlb.
> > 
> > +/*
> > + * _folio_nr_pages and _folio_order are invalid for
> > + * order-zero pages. An exception is hugetlb, which passes
> > + * in a zero order in __destroy_compound_gigantic_page().
> > + */
> >   static inline void folio_set_compound_order(struct folio *folio,
> >                  unsigned int order)
> >   {
> > +       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> > +
> >          folio->_folio_order = order;
> >   #ifdef CONFIG_64BIT
> >          folio->_folio_nr_pages = order ? 1U << order : 0;
> > 
> > Does this comment work?
> > 
> > 
> 
> I will change the comment from referencing
> __destroy_compound_gigantic_page()
> to __destroy_compound_gigantic_folio(), although
> __prep_compound_gigantic_folio() is another user of
> folio_set_compound_order(folio, 0). Should the sentence just be "An
> exception is hugetlb, which passes in a zero order"?

How about a comment like this?

/*
 * folio_set_compound_order is generally passed a non-zero order to
 * set up/create a large folio.  However, hugetlb code abuses this by
 * passing in zero when 'dissolving' a large folio.
 */

My only concern is that someone may modify the routine such that it no
longer works when passed zero order.  It is not likely as anyone should
notice the special case for zero, and look for callers.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 09/10] mm/hugetlb: convert hugetlb prep functions to folios
  2022-11-29 22:50 ` [PATCH mm-unstable v5 09/10] mm/hugetlb: convert hugetlb prep functions " Sidhartha Kumar
@ 2022-12-07 19:35   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07 19:35 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Convert prep_new_huge_page() and __prep_compound_gigantic_page() to
> folios.
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 63 +++++++++++++++++++++++++---------------------------
>  1 file changed, 30 insertions(+), 33 deletions(-)

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 10/10] mm/hugetlb: change hugetlb allocation functions to return a folio
  2022-11-29 22:50 ` [PATCH mm-unstable v5 10/10] mm/hugetlb: change hugetlb allocation functions to return a folio Sidhartha Kumar
@ 2022-12-07 22:01   ` Mike Kravetz
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Kravetz @ 2022-12-07 22:01 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, almasrymina,
	linmiaohe, hughd, tsahu, jhubbard, david, Wei Chen,
	Rasmus Villemoes

On 11/29/22 14:50, Sidhartha Kumar wrote:
> Many hugetlb allocation helper functions have now been converted to
> folios; update their higher level callers to be compatible with folios.
> alloc_pool_huge_page is reorganized to avoid a smatch warning reporting
> that the folio variable is uninitialized.
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Reported-by: Wei Chen <harperchen1110@gmail.com>
> Suggested-by: John Hubbard <jhubbard@nvidia.com>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> ---
>  mm/hugetlb.c | 120 ++++++++++++++++++++++++---------------------------
>  1 file changed, 57 insertions(+), 63 deletions(-)

Thanks!  And, thank you to John, Wei and Rasmus for helping with
previous versions of the patch.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-07 19:25               ` Mike Kravetz
@ 2022-12-08  2:19                 ` Muchun Song
  2022-12-08  2:31                   ` John Hubbard
  0 siblings, 1 reply; 33+ messages in thread
From: Muchun Song @ 2022-12-08  2:19 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Sidhartha Kumar, linux-kernel, Linux Memory Management List,
	Andrew Morton, Muchun Song, Matthew Wilcox, Mina Almasry,
	Miaohe Lin, hughd, tsahu, jhubbard, David Hildenbrand



> On Dec 8, 2022, at 03:25, Mike Kravetz <mike.kravetz@oracle.com> wrote:
> 
> On 12/07/22 11:05, Sidhartha Kumar wrote:
>> On 12/7/22 10:49 AM, Sidhartha Kumar wrote:
>>> On 12/7/22 10:12 AM, Mike Kravetz wrote:
>>>> On 12/07/22 12:11, Muchun Song wrote:
>>>>>> On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>>>> On 12/07/22 11:34, Muchun Song wrote:
>>>>> 
>>>>> Agree. It has confused me a lot. I suggest changing the code to the
>>>>> followings. The folio_test_large() check is still to avoid unexpected
>>>>> users for OOB.
>>>>> 
>>>>> static inline void folio_set_compound_order(struct folio *folio,
>>>>>                         unsigned int order)
>>>>> {
>>>>>     VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>>>>     // or
>>>>>     // if (!folio_test_large(folio))
>>>>>     //     return;
>>>>> 
>>>>>     folio->_folio_order = order;
>>>>> #ifdef CONFIG_64BIT
>>>>>     folio->_folio_nr_pages = order ? 1U << order : 0;
>>>>> #endif
>>>>> }
>>>> 
>>>> I think the VM_BUG_ON_FOLIO is appropriate as it would at least flag
>>>> data corruption.
>>>> 
>>> As Mike pointed out, my intention with supporting the 0 case was to
>>> cleanup the __destroy_compound_gigantic_page code by moving the ifdef
>>> CONFIG_64BIT lines to folio_set_compound_order(). I'll add the
>>> VM_BUG_ON_FOLIO line as well as a comment to make it clear it is not
>>> normally supported.
>>> 
>>>> Thinking about this some more, it seems that hugetlb is the only caller
>>>> that abuses folio_set_compound_order (and previously set_compound_order)
>>>> by passing in a zero order.  Since it is unlikely that anyone knows of
>>>> this abuse, it might be good to add a comment to the routine to note
>>>> why it handles the zero case.  This might help prevent changes which
>>>> would potentially break hugetlb.
>>> 
>>> +/*
>>> + * _folio_nr_pages and _folio_order are invalid for
>>> + * order-zero pages. An exception is hugetlb, which passes
>>> + * in a zero order in __destroy_compound_gigantic_page().
>>> + */
>>>  static inline void folio_set_compound_order(struct folio *folio,
>>>                 unsigned int order)
>>>  {
>>> +       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>> +
>>>         folio->_folio_order = order;
>>>  #ifdef CONFIG_64BIT
>>>         folio->_folio_nr_pages = order ? 1U << order : 0;
>>> 
>>> Does this comment work?
>>> 
>>> 
>> 
>> I will change the comment from referencing
>> __destroy_compound_gigantic_page()
>> to __destroy_compound_gigantic_folio(), although
>> __prep_compound_gigantic_folio() is another user of
>> folio_set_compound_order(folio, 0). Should the sentence just be "An
>> exception is hugetlb, which passes in a zero order"?
> 
> How about a comment like this?
> 
> /*
> * folio_set_compound_order is generally passed a non-zero order to
> * set up/create a large folio.  However, hugetlb code abuses this by
> * passing in zero when 'dissolving' a large folio.
> */

How about adding a new helper like "folio_dissolve_compound(struct folio *folio)"?
then it may be unnecessary to add a comment.
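
A rough sketch of what I have in mind (just a suggestion, the name and
exact form are not settled):

static inline void folio_dissolve_compound(struct folio *folio)
{
	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);

	/* reset the large-folio metadata when dissolving back to base pages */
	folio->_folio_order = 0;
#ifdef CONFIG_64BIT
	folio->_folio_nr_pages = 0;
#endif
}

Then folio_set_compound_order() never needs to handle a zero order.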

Thanks.

> 
> My only concern is that someone may modify the routine such that it no
> longer works when passed zero order.  It is not likely as anyone should
> notice the special case for zero, and look for callers.
> -- 
> Mike Kravetz



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-08  2:19                 ` Muchun Song
@ 2022-12-08  2:31                   ` John Hubbard
  2022-12-08  4:44                     ` Muchun Song
  0 siblings, 1 reply; 33+ messages in thread
From: John Hubbard @ 2022-12-08  2:31 UTC (permalink / raw)
  To: Muchun Song, Mike Kravetz
  Cc: Sidhartha Kumar, linux-kernel, Linux Memory Management List,
	Andrew Morton, Muchun Song, Matthew Wilcox, Mina Almasry,
	Miaohe Lin, hughd, tsahu, David Hildenbrand

On 12/7/22 18:19, Muchun Song wrote:
> How about adding a new helper like "folio_dissolve_compound(struct folio *folio)"?
> then it may be unnecessary to add a comment.
> 

Yes, and in fact, I just finished spelling out exactly how to do that, in [1].


[1] https://lore.kernel.org/all/d17530ad-8e12-8069-d619-a2d72fe80e15@nvidia.com/


thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-08  2:31                   ` John Hubbard
@ 2022-12-08  4:44                     ` Muchun Song
  0 siblings, 0 replies; 33+ messages in thread
From: Muchun Song @ 2022-12-08  4:44 UTC (permalink / raw)
  To: John Hubbard
  Cc: Mike Kravetz, Sidhartha Kumar, linux-kernel,
	Linux Memory Management List, Andrew Morton, Muchun Song,
	Matthew Wilcox, Mina Almasry, Miaohe Lin, hughd, tsahu,
	David Hildenbrand



> On Dec 8, 2022, at 10:31, John Hubbard <jhubbard@nvidia.com> wrote:
> 
> On 12/7/22 18:19, Muchun Song wrote:
>> How about adding a new helper like "folio_dissolve_compound(struct folio *folio)"?
>> then it may be unnecessary to add a comment.
> 
> Yes, and in fact, I just finished spelling out exactly how to do that, in [1].

Thanks for your guidance. I have replied in thread [1].

> 
> 
> [1] https://lore.kernel.org/all/d17530ad-8e12-8069-d619-a2d72fe80e15@nvidia.com/
> 
> 
> thanks,
> -- 
> John Hubbard
> NVIDIA


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-07  4:11       ` Muchun Song
  2022-12-07 18:12         ` Mike Kravetz
@ 2022-12-12 18:34         ` David Hildenbrand
  2022-12-12 18:50           ` Sidhartha Kumar
  1 sibling, 1 reply; 33+ messages in thread
From: David Hildenbrand @ 2022-12-12 18:34 UTC (permalink / raw)
  To: Muchun Song, Mike Kravetz
  Cc: Sidhartha Kumar, linux-kernel, Linux Memory Management List,
	Andrew Morton, Muchun Song, Matthew Wilcox, Mina Almasry,
	Miaohe Lin, hughd, tsahu, jhubbard

On 07.12.22 05:11, Muchun Song wrote:
> 
> 
>> On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>
>> On 12/07/22 11:34, Muchun Song wrote:
>>>
>>>
>>>> On Nov 30, 2022, at 06:50, Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
>>>>
>>>> Add folio equivalents for set_compound_order() and set_compound_page_dtor().
>>>>
>>>> Also remove extra new-lines introduced by mm/hugetlb: convert
>>>> move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
>>>> hugetlb_cgroup_uncharge_page() to folios.
>>>>
>>>> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
>>>> Suggested-by: Muchun Song <songmuchun@bytedance.com>
>>>> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
>>>> ---
>>>> include/linux/mm.h | 16 ++++++++++++++++
>>>> mm/hugetlb.c       |  4 +---
>>>> 2 files changed, 17 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>> index a48c5ad16a5e..2bdef8a5298a 100644
>>>> --- a/include/linux/mm.h
>>>> +++ b/include/linux/mm.h
>>>> @@ -972,6 +972,13 @@ static inline void set_compound_page_dtor(struct page *page,
>>>> page[1].compound_dtor = compound_dtor;
>>>> }
>>>>
>>>> +static inline void folio_set_compound_dtor(struct folio *folio,
>>>> + enum compound_dtor_id compound_dtor)
>>>> +{
>>>> + VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
>>>> + folio->_folio_dtor = compound_dtor;
>>>> +}
>>>> +
>>>> void destroy_large_folio(struct folio *folio);
>>>>
>>>> static inline int head_compound_pincount(struct page *head)
>>>> @@ -987,6 +994,15 @@ static inline void set_compound_order(struct page *page, unsigned int order)
>>>> #endif
>>>> }
>>>>
>>>> +static inline void folio_set_compound_order(struct folio *folio,
>>>> + unsigned int order)
>>>> +{
>>>> + folio->_folio_order = order;
>>>> +#ifdef CONFIG_64BIT
>>>> + folio->_folio_nr_pages = order ? 1U << order : 0;
>>>
>>> It seems that you think the user could pass 0 to order. However,
>>> ->_folio_nr_pages and ->_folio_order fields are invalid for order-0 pages.
>>> You should not touch it. So this should be:
>>>
>>> static inline void folio_set_compound_order(struct folio *folio,
>>>      unsigned int order)
>>> {
>>> 	if (!folio_test_large(folio))
>>> 		return;
>>>
>>> 	folio->_folio_order = order;
>>> #ifdef CONFIG_64BIT
>>> 	folio->_folio_nr_pages = 1U << order;
>>> #endif
>>> }
>>
>> I believe this was changed to accommodate the code in
>> __destroy_compound_gigantic_page().  It is used in a subsequent patch.
>> Here is the v6.0 version of the routine.
> 
> Thanks for your clarification.
> 
>>
>> static void __destroy_compound_gigantic_page(struct page *page,
>> unsigned int order, bool demote)
>> {
>> 	int i;
>> 	int nr_pages = 1 << order;
>> 	struct page *p = page + 1;
>>
>> 	atomic_set(compound_mapcount_ptr(page), 0);
>> 	atomic_set(compound_pincount_ptr(page), 0);
>>
>> 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
>> 		p->mapping = NULL;
>> 		clear_compound_head(p);
>> 		if (!demote)
>> 			set_page_refcounted(p);
>> 	}
>>
>> 	set_compound_order(page, 0);
>> #ifdef CONFIG_64BIT
>> 	page[1].compound_nr = 0;
>> #endif
>> 	__ClearPageHead(page);
>> }
>>
>>
>> Might have been better to change this set_compound_order call to
>> folio_set_compound_order in this patch.
>>
> 
> Agree. It has confused me a lot. I suggest changing the code to the
> following. The folio_test_large() check is still there to catch unexpected
> users and avoid out-of-bounds (OOB) accesses.
> 
> static inline void folio_set_compound_order(struct folio *folio,
> 					    unsigned int order)
> {
> 	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> 	// or
> 	// if (!folio_test_large(folio))
> 	// 	return;
> 
> 	folio->_folio_order = order;
> #ifdef CONFIG_64BIT
> 	folio->_folio_nr_pages = order ? 1U << order : 0;
> #endif
> }
> 
> Thanks.

On mm-stable, dynamically allocating gigantic pages gives me:

[   23.625105] page:00000000ae27bd2a refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1800
[   23.635117] flags: 0x17fffc00000000(node=0|zone=2|lastcpupid=0x1ffff)
[   23.641969] raw: 0017fffc00000000 ffffa4edc489bb58 fffff784c6000048 0000000000000000
[   23.650141] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[   23.657907] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
[   23.663955] ------------[ cut here ]------------
[   23.667680] kernel BUG at include/linux/mm.h:1030!
[   23.673378] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[   23.681610] CPU: 3 PID: 1220 Comm: bash Not tainted 6.1.0-rc4+ #13
[   23.690281] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
[   23.699837] RIP: 0010:__prep_compound_gigantic_folio+0x115/0x120
[   23.706587] Code: c7 44 24 5c 00 00 00 00 5b 44 89 d0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 48 83 e8 09
[   23.725954] RSP: 0018:ffffa4edc489bc90 EFLAGS: 00010296
[   23.731001] RAX: 000000000000003e RBX: ffffffffae39a720 RCX: 0000000000000000
[   23.736065] RDX: 0000000000000001 RSI: ffffffffad522c25 RDI: 00000000ffffffff
[   23.740866] RBP: 0000000000040000 R08: 0000000000000000 R09: ffffa4edc489bb58
[   23.745385] R10: 0000000000000003 R11: ffff926d7ff4b028 R12: fffff784c6000000
[   23.749464] R13: 0000000000040000 R14: fffff784c6000000 R15: ffff9266039fb900
[   23.753176] FS:  00007f69a9703740(0000) GS:ffff92696f8c0000(0000) knlGS:0000000000000000
[   23.758299] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   23.761275] CR2: 0000556789080640 CR3: 0000000105568005 CR4: 0000000000770ee0
[   23.764929] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   23.768572] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   23.772221] PKRU: 55555554
[   23.773696] Call Trace:
[   23.776170]  <TASK>
[   23.777258]  alloc_fresh_hugetlb_folio+0x14b/0x220
[   23.779406]  alloc_pool_huge_page+0x7c/0x110
[   23.781273]  __nr_hugepages_store_common+0x191/0x400
[   23.783369]  ? __mod_memcg_lruvec_state+0x93/0x110
[   23.785343]  ? __mod_lruvec_page_state+0xa6/0x1a0
[   23.787223]  ? _copy_from_iter+0x8c/0x540
[   23.788788]  ? __kmem_cache_alloc_node+0x13b/0x2a0
[   23.790577]  ? kernfs_fop_write_iter+0x174/0x1f0
[   23.792313]  nr_hugepages_store+0x57/0x70
[   23.793754]  kernfs_fop_write_iter+0x11b/0x1f0
[   23.795337]  vfs_write+0x1e1/0x3a0
[   23.796531]  ksys_write+0x53/0xd0
[   23.797690]  do_syscall_64+0x37/0x90
[   23.798947]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   23.800602] RIP: 0033:0x7f69a9501977
[   23.801798] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 14
[   23.807322] RSP: 002b:00007ffce87f3338 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   23.809460] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f69a9501977
[   23.811485] RDX: 0000000000000002 RSI: 00005652a970e5e0 RDI: 0000000000000001
[   23.813443] RBP: 00005652a970e5e0 R08: 0000000000000000 R09: 0000000000000073
[   23.815385] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000002
[   23.817289] R13: 00007f69a95f9780 R14: 0000000000000002 R15: 00007f69a95f49e0
[   23.819176]  </TASK>
[   23.819792] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency_common isst_ir
[   23.829056] ---[ end trace 0000000000000000 ]---
[   23.830413] RIP: 0010:__prep_compound_gigantic_folio+0x115/0x120
[   23.832390] Code: c7 44 24 5c 00 00 00 00 5b 44 89 d0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 48 83 e8 09
[   23.836787] RSP: 0018:ffffa4edc489bc90 EFLAGS: 00010296
[   23.838083] RAX: 000000000000003e RBX: ffffffffae39a720 RCX: 0000000000000000
[   23.839764] RDX: 0000000000000001 RSI: ffffffffad522c25 RDI: 00000000ffffffff
[   23.841456] RBP: 0000000000040000 R08: 0000000000000000 R09: ffffa4edc489bb58
[   23.843150] R10: 0000000000000003 R11: ffff926d7ff4b028 R12: fffff784c6000000
[   23.844836] R13: 0000000000040000 R14: fffff784c6000000 R15: ffff9266039fb900
[   23.846521] FS:  00007f69a9703740(0000) GS:ffff92696f8c0000(0000) knlGS:0000000000000000
[   23.848417] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   23.849797] CR2: 0000556789080640 CR3: 0000000105568005 CR4: 0000000000770ee0
[   23.851491] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   23.853181] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   23.854861] PKRU: 55555554


Something's broken.
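
For reference, the trigger here is just a sysfs write that grows the gigantic
page pool, something like the command below (the path depends on the
configured gigantic page size; shown only as an illustration, matching the
nr_hugepages_store() frames in the trace above):

	echo 1 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages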

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions
  2022-12-12 18:34         ` David Hildenbrand
@ 2022-12-12 18:50           ` Sidhartha Kumar
  0 siblings, 0 replies; 33+ messages in thread
From: Sidhartha Kumar @ 2022-12-12 18:50 UTC (permalink / raw)
  To: David Hildenbrand, Muchun Song, Mike Kravetz
  Cc: linux-kernel, Linux Memory Management List, Andrew Morton,
	Muchun Song, Matthew Wilcox, Mina Almasry, Miaohe Lin, hughd,
	tsahu, jhubbard

On 12/12/22 10:34 AM, David Hildenbrand wrote:
> On 07.12.22 05:11, Muchun Song wrote:
>>
>>
>>> On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>
>>> On 12/07/22 11:34, Muchun Song wrote:
>>>>
>>>>
>>>>> On Nov 30, 2022, at 06:50, Sidhartha Kumar 
>>>>> <sidhartha.kumar@oracle.com> wrote:
>>>>>
>>>>> Add folio equivalents for set_compound_order() and 
>>>>> set_compound_page_dtor().
>>>>>
>>>>> Also remove extra new-lines introduced by mm/hugetlb: convert
>>>>> move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
>>>>> hugetlb_cgroup_uncharge_page() to folios.
>>>>>
>>>>> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
>>>>> Suggested-by: Muchun Song <songmuchun@bytedance.com>
>>>>> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
>>>>> ---
>>>>> include/linux/mm.h | 16 ++++++++++++++++
>>>>> mm/hugetlb.c       |  4 +---
>>>>> 2 files changed, 17 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>> index a48c5ad16a5e..2bdef8a5298a 100644
>>>>> --- a/include/linux/mm.h
>>>>> +++ b/include/linux/mm.h
>>>>> @@ -972,6 +972,13 @@ static inline void 
>>>>> set_compound_page_dtor(struct page *page,
>>>>> page[1].compound_dtor = compound_dtor;
>>>>> }
>>>>>
>>>>> +static inline void folio_set_compound_dtor(struct folio *folio,
>>>>> + enum compound_dtor_id compound_dtor)
>>>>> +{
>>>>> + VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
>>>>> + folio->_folio_dtor = compound_dtor;
>>>>> +}
>>>>> +
>>>>> void destroy_large_folio(struct folio *folio);
>>>>>
>>>>> static inline int head_compound_pincount(struct page *head)
>>>>> @@ -987,6 +994,15 @@ static inline void set_compound_order(struct 
>>>>> page *page, unsigned int order)
>>>>> #endif
>>>>> }
>>>>>
>>>>> +static inline void folio_set_compound_order(struct folio *folio,
>>>>> + unsigned int order)
>>>>> +{
>>>>> + folio->_folio_order = order;
>>>>> +#ifdef CONFIG_64BIT
>>>>> + folio->_folio_nr_pages = order ? 1U << order : 0;
>>>>
>>>> It seems that you think the user could pass 0 as the order. However, the
>>>> ->_folio_nr_pages and ->_folio_order fields are invalid for order-0
>>>> pages, so you should not touch them. This should instead be:
>>>>
>>>> static inline void folio_set_compound_order(struct folio *folio,
>>>>      unsigned int order)
>>>> {
>>>>     if (!folio_test_large(folio))
>>>>         return;
>>>>
>>>>     folio->_folio_order = order;
>>>> #ifdef CONFIG_64BIT
>>>>     folio->_folio_nr_pages = 1U << order;
>>>> #endif
>>>> }
>>>
>>> I believe this was changed to accommodate the code in
>>> __destroy_compound_gigantic_page().  It is used in a subsequent patch.
>>> Here is the v6.0 version of the routine.
>>
>> Thanks for your clarification.
>>
>>>
>>> static void __destroy_compound_gigantic_page(struct page *page,
>>> unsigned int order, bool demote)
>>> {
>>>     int i;
>>>     int nr_pages = 1 << order;
>>>     struct page *p = page + 1;
>>>
>>>     atomic_set(compound_mapcount_ptr(page), 0);
>>>     atomic_set(compound_pincount_ptr(page), 0);
>>>
>>>     for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
>>>         p->mapping = NULL;
>>>         clear_compound_head(p);
>>>         if (!demote)
>>>             set_page_refcounted(p);
>>>     }
>>>
>>>     set_compound_order(page, 0);
>>> #ifdef CONFIG_64BIT
>>>     page[1].compound_nr = 0;
>>> #endif
>>>     __ClearPageHead(page);
>>> }
>>>
>>>
>>> Might have been better to change this set_compound_order call to
>>> folio_set_compound_order in this patch.
>>>
>>
>> Agree. It has confused me a lot. I suggest changing the code to the
>> following. The folio_test_large() check is still there to catch unexpected
>> users and avoid out-of-bounds (OOB) accesses.
>>
>> static inline void folio_set_compound_order(struct folio *folio,
>>                         unsigned int order)
>> {
>>     VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>     // or
>>     // if (!folio_test_large(folio))
>>     //     return;
>>
>>     folio->_folio_order = order;
>> #ifdef CONFIG_64BIT
>>     folio->_folio_nr_pages = order ? 1U << order : 0;
>> #endif
>> }
>>
>> Thanks.
> 
> On mm-stable, dynamically allocating gigantic pages gives me:
> 
> [   23.625105] page:00000000ae27bd2a refcount:1 mapcount:0 
> mapping:0000000000000000 index:0x0 pfn:0x1800
> [   23.635117] flags: 0x17fffc00000000(node=0|zone=2|lastcpupid=0x1ffff)
> [   23.641969] raw: 0017fffc00000000 ffffa4edc489bb58 fffff784c6000048 
> 0000000000000000
> [   23.650141] raw: 0000000000000000 0000000000000000 00000001ffffffff 
> 0000000000000000
> [   23.657907] page dumped because: 
> VM_BUG_ON_FOLIO(!folio_test_large(folio))


static bool __prep_compound_gigantic_folio(struct folio *folio,
					unsigned int order, bool demote)
{
	int i, j;
	int nr_pages = 1 << order;
	struct page *p;

	/* we rely on prep_new_hugetlb_folio to set the destructor */
	folio_set_compound_order(folio, order);
	__folio_clear_reserved(folio);
	__folio_set_head(folio);

The problem appears to be that the head flag is set after the call to
folio_set_compound_order(), which checks for PG_head (via folio_test_large()).
I will test the fix of moving folio_set_compound_order() to after the
__folio_set_head() call.
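
Roughly, the reordered prologue would look like the sketch below (only the
order of the existing calls changes and the rest of the routine is assumed
unchanged, so treat this as a sketch rather than the final patch):

static bool __prep_compound_gigantic_folio(struct folio *folio,
					unsigned int order, bool demote)
{
	int i, j;
	int nr_pages = 1 << order;
	struct page *p;

	__folio_clear_reserved(folio);
	/* set PG_head first so folio_test_large() in the order setter passes */
	__folio_set_head(folio);
	/* we rely on prep_new_hugetlb_folio to set the destructor */
	folio_set_compound_order(folio, order);
	/* ... rest of the routine unchanged ... */

With PG_head already set, the VM_BUG_ON_FOLIO(!folio_test_large(folio)) at
include/linux/mm.h:1030 reported above should no longer fire.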



^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2022-12-12 18:52 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-29 22:50 [PATCH mm-unstable v5 00/10] convert core hugetlb functions to folios Sidhartha Kumar
2022-11-29 22:50 ` [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions Sidhartha Kumar
2022-12-07  0:18   ` Mike Kravetz
2022-12-07  3:34   ` Muchun Song
2022-12-07  3:42     ` Mike Kravetz
2022-12-07  4:11       ` Muchun Song
2022-12-07 18:12         ` Mike Kravetz
2022-12-07 18:49           ` Sidhartha Kumar
2022-12-07 19:05             ` Sidhartha Kumar
2022-12-07 19:25               ` Mike Kravetz
2022-12-08  2:19                 ` Muchun Song
2022-12-08  2:31                   ` John Hubbard
2022-12-08  4:44                     ` Muchun Song
2022-12-12 18:34         ` David Hildenbrand
2022-12-12 18:50           ` Sidhartha Kumar
2022-11-29 22:50 ` [PATCH mm-unstable v5 02/10] mm/hugetlb: convert destroy_compound_gigantic_page() to folios Sidhartha Kumar
2022-12-07  0:32   ` Mike Kravetz
2022-11-29 22:50 ` [PATCH mm-unstable v5 03/10] mm/hugetlb: convert dissolve_free_huge_page() " Sidhartha Kumar
2022-12-07  0:52   ` Mike Kravetz
2022-11-29 22:50 ` [PATCH mm-unstable v5 04/10] mm/hugetlb: convert remove_hugetlb_page() " Sidhartha Kumar
2022-12-07  1:43   ` Mike Kravetz
2022-11-29 22:50 ` [PATCH mm-unstable v5 05/10] mm/hugetlb: convert update_and_free_page() " Sidhartha Kumar
2022-12-07  2:02   ` Mike Kravetz
2022-11-29 22:50 ` [PATCH mm-unstable v5 06/10] mm/hugetlb: convert add_hugetlb_page() to folios and add hugetlb_cma_folio() Sidhartha Kumar
2022-12-07 18:38   ` Mike Kravetz
2022-11-29 22:50 ` [PATCH mm-unstable v5 07/10] mm/hugetlb: convert enqueue_huge_page() to folios Sidhartha Kumar
2022-12-07 18:46   ` Mike Kravetz
2022-11-29 22:50 ` [PATCH mm-unstable v5 08/10] mm/hugetlb: convert free_gigantic_page() " Sidhartha Kumar
2022-12-07 19:04   ` Mike Kravetz
2022-11-29 22:50 ` [PATCH mm-unstable v5 09/10] mm/hugetlb: convert hugetlb prep functions " Sidhartha Kumar
2022-12-07 19:35   ` Mike Kravetz
2022-11-29 22:50 ` [PATCH mm-unstable v5 10/10] mm/hugetlb: change hugetlb allocation functions to return a folio Sidhartha Kumar
2022-12-07 22:01   ` Mike Kravetz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).