linux-mm.kvack.org archive mirror
* [PATCH 00/10] mm/hugetlb: code refine and simplification
@ 2020-08-07  9:12 Wei Yang
  2020-08-07  9:12 ` [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively Wei Yang
                   ` (10 more replies)
  0 siblings, 11 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

Following are some cleanups for hugetlb.

A simple test with tools/testing/selftests/vm/map_hugetlb passes.

Wei Yang (10):
  mm/hugetlb: not necessary to coalesce regions recursively
  mm/hugetlb: make sure to get NULL when list is empty
  mm/hugetlb: use list_splice to merge two list at once
  mm/hugetlb: count file_region to be added when regions_needed != NULL
  mm/hugetlb: remove the redundant check on non_swap_entry()
  mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault()
  mm/hugetlb: a page from buddy is not on any list
  mm/hugetlb: return non-isolated page in the loop instead of break and
    check
  mm/hugetlb: narrow the hugetlb_lock protection area during preparing
    huge page
  mm/hugetlb: not necessary to abuse temporary page to workaround the
    nasty free_huge_page

 mm/hugetlb.c | 101 ++++++++++++++++++++++-----------------------------
 1 file changed, 44 insertions(+), 57 deletions(-)

-- 
2.20.1 (Apple Git-117)




* [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 12:47   ` Baoquan He
  2020-08-10 20:22   ` Mike Kravetz
  2020-08-07  9:12 ` [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty Wei Yang
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

Per my understanding, we keep the regions ordered and always coalesce
adjacent regions properly. So, after inserting a new region, keeping this
property only requires coalescing it with its immediate neighbours; there
is no need to recurse.

Let's simplify this.
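
For illustration, after this change the function only looks at the two
immediate neighbours of rg, roughly (a sketch, not the exact code; the
has_same_uncharge_info() check is assumed from the existing code around
the hunks below):

        prg = list_prev_entry(rg, link);
        if (&prg->link != &resv->regions && prg->to == rg->from &&
            has_same_uncharge_info(prg, rg)) {
                prg->to = rg->to;
                list_del(&rg->link);
                kfree(rg);
                rg = prg;       /* check the next neighbour from here */
        }

        nrg = list_next_entry(rg, link);
        if (&nrg->link != &resv->regions && nrg->from == rg->to &&
            has_same_uncharge_info(nrg, rg)) {
                nrg->from = rg->from;
                list_del(&rg->link);
                kfree(rg);
        }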

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 590111ea6975..62ec74f6d03f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -307,8 +307,7 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
 		list_del(&rg->link);
 		kfree(rg);
 
-		coalesce_file_region(resv, prg);
-		return;
+		rg = prg;
 	}
 
 	nrg = list_next_entry(rg, link);
@@ -318,9 +317,6 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
 
 		list_del(&rg->link);
 		kfree(rg);
-
-		coalesce_file_region(resv, nrg);
-		return;
 	}
 }
 
-- 
2.20.1 (Apple Git-117)




* [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
  2020-08-07  9:12 ` [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 12:49   ` Baoquan He
  2020-08-07  9:12 ` [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once Wei Yang
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

list_first_entry() does not return NULL when the list is empty; it
returns a bogus pointer computed from the list head.

Let's make the expected behavior explicit by using
list_first_entry_or_null(); otherwise an empty list would lead to list
corruption.
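
For reference, the difference between the two helpers (roughly as defined
in include/linux/list.h):

        /*
         * list_first_entry() blindly computes the entry address from
         * head->next, so on an empty list it yields a bogus pointer
         * derived from the list head itself:
         */
        nrg = list_first_entry(&resv->region_cache, struct file_region, link);

        /* list_first_entry_or_null() returns NULL on an empty list instead: */
        nrg = list_first_entry_or_null(&resv->region_cache,
                        struct file_region, link);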

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 62ec74f6d03f..0a2f3851b828 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
 	VM_BUG_ON(resv->region_cache_count <= 0);
 
 	resv->region_cache_count--;
-	nrg = list_first_entry(&resv->region_cache, struct file_region, link);
+	nrg = list_first_entry_or_null(&resv->region_cache,
+			struct file_region, link);
 	VM_BUG_ON(!nrg);
 	list_del(&nrg->link);
 
-- 
2.20.1 (Apple Git-117)




* [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
  2020-08-07  9:12 ` [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively Wei Yang
  2020-08-07  9:12 ` [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 12:53   ` Baoquan He
  2020-08-10 21:07   ` Mike Kravetz
  2020-08-07  9:12 ` [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL Wei Yang
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

Instead of adding the allocated file_regions to region_cache one by one,
we could use list_splice() to merge the two lists at once.

Also, since we already know the number of entries in the list, increase
the counter by that number directly.
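
For reference, list_splice() links the whole source list in front of the
destination head in O(1) (roughly as defined in include/linux/list.h):

        static inline void list_splice(const struct list_head *list,
                                       struct list_head *head)
        {
                if (!list_empty(list))
                        __list_splice(list, head, head->next);
        }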

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0a2f3851b828..929256c130f9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -443,11 +443,8 @@ static int allocate_file_region_entries(struct resv_map *resv,
 
 		spin_lock(&resv->lock);
 
-		list_for_each_entry_safe(rg, trg, &allocated_regions, link) {
-			list_del(&rg->link);
-			list_add(&rg->link, &resv->region_cache);
-			resv->region_cache_count++;
-		}
+		list_splice(&allocated_regions, &resv->region_cache);
+		resv->region_cache_count += to_allocate;
 	}
 
 	return 0;
-- 
2.20.1 (Apple Git-117)




* [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
                   ` (2 preceding siblings ...)
  2020-08-07  9:12 ` [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 12:54   ` Baoquan He
  2020-08-10 21:46   ` Mike Kravetz
  2020-08-07  9:12 ` [PATCH 05/10] mm/hugetlb: remove the redundant check on non_swap_entry() Wei Yang
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

Function add_reservation_in_range() is called in only two ways:

    * count the file_regions needed and return the number in
      regions_needed
    * do the real list operation without counting

This means it is not necessary to have two parameters to distinguish
these two cases.

Just use regions_needed to separate them.
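
The two call patterns then look like this (taken from the callers changed
below):

        /* counting pass: regions_needed != NULL, the list is not modified */
        add_reservation_in_range(resv, f, t, NULL, NULL,
                                 &actual_regions_needed);

        /* modifying pass: regions_needed == NULL, file_regions are added */
        add = add_reservation_in_range(resv, f, t, h_cg, h, NULL);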

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 929256c130f9..d775e514eb2e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -321,16 +321,17 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
 	}
 }
 
-/* Must be called with resv->lock held. Calling this with count_only == true
- * will count the number of pages to be added but will not modify the linked
- * list. If regions_needed != NULL and count_only == true, then regions_needed
- * will indicate the number of file_regions needed in the cache to carry out to
- * add the regions for this range.
+/*
+ * Must be called with resv->lock held.
+ *
+ * Calling this with regions_needed != NULL will count the number of pages
+ * to be added but will not modify the linked list. And regions_needed will
+ * indicate the number of file_regions needed in the cache to carry out to add
+ * the regions for this range.
  */
 static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 				     struct hugetlb_cgroup *h_cg,
-				     struct hstate *h, long *regions_needed,
-				     bool count_only)
+				     struct hstate *h, long *regions_needed)
 {
 	long add = 0;
 	struct list_head *head = &resv->regions;
@@ -366,14 +367,14 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 		 */
 		if (rg->from > last_accounted_offset) {
 			add += rg->from - last_accounted_offset;
-			if (!count_only) {
+			if (!regions_needed) {
 				nrg = get_file_region_entry_from_cache(
 					resv, last_accounted_offset, rg->from);
 				record_hugetlb_cgroup_uncharge_info(h_cg, h,
 								    resv, nrg);
 				list_add(&nrg->link, rg->link.prev);
 				coalesce_file_region(resv, nrg);
-			} else if (regions_needed)
+			} else
 				*regions_needed += 1;
 		}
 
@@ -385,13 +386,13 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 	 */
 	if (last_accounted_offset < t) {
 		add += t - last_accounted_offset;
-		if (!count_only) {
+		if (!regions_needed) {
 			nrg = get_file_region_entry_from_cache(
 				resv, last_accounted_offset, t);
 			record_hugetlb_cgroup_uncharge_info(h_cg, h, resv, nrg);
 			list_add(&nrg->link, rg->link.prev);
 			coalesce_file_region(resv, nrg);
-		} else if (regions_needed)
+		} else
 			*regions_needed += 1;
 	}
 
@@ -484,8 +485,8 @@ static long region_add(struct resv_map *resv, long f, long t,
 retry:
 
 	/* Count how many regions are actually needed to execute this add. */
-	add_reservation_in_range(resv, f, t, NULL, NULL, &actual_regions_needed,
-				 true);
+	add_reservation_in_range(resv, f, t, NULL, NULL,
+				 &actual_regions_needed);
 
 	/*
 	 * Check for sufficient descriptors in the cache to accommodate
@@ -513,7 +514,7 @@ static long region_add(struct resv_map *resv, long f, long t,
 		goto retry;
 	}
 
-	add = add_reservation_in_range(resv, f, t, h_cg, h, NULL, false);
+	add = add_reservation_in_range(resv, f, t, h_cg, h, NULL);
 
 	resv->adds_in_progress -= in_regions_needed;
 
@@ -549,9 +550,9 @@ static long region_chg(struct resv_map *resv, long f, long t,
 
 	spin_lock(&resv->lock);
 
-	/* Count how many hugepages in this range are NOT respresented. */
+	/* Count how many hugepages in this range are NOT represented. */
 	chg = add_reservation_in_range(resv, f, t, NULL, NULL,
-				       out_regions_needed, true);
+				       out_regions_needed);
 
 	if (*out_regions_needed == 0)
 		*out_regions_needed = 1;
-- 
2.20.1 (Apple Git-117)




* [PATCH 05/10] mm/hugetlb: remove the redundant check on non_swap_entry()
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
                   ` (3 preceding siblings ...)
  2020-08-07  9:12 ` [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 12:55   ` Baoquan He
  2020-08-07  9:12 ` [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault() Wei Yang
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

Migration and hwpoison entries are a subset of non_swap_entry().

Remove the redundant check on non_swap_entry().
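
For reference, roughly from include/linux/swapops.h: the migration and
hwpoison swap types are allocated above MAX_SWAPFILES, so any entry that
satisfies is_migration_entry() or is_hwpoison_entry() is already a
non-swap entry:

        static inline int non_swap_entry(swp_entry_t entry)
        {
                return swp_type(entry) >= MAX_SWAPFILES;
        }

        static inline int is_migration_entry(swp_entry_t entry)
        {
                return unlikely(swp_type(entry) == SWP_MIGRATION_READ ||
                                swp_type(entry) == SWP_MIGRATION_WRITE);
        }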

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d775e514eb2e..f5f04e89000d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3778,7 +3778,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
 	if (huge_pte_none(pte) || pte_present(pte))
 		return false;
 	swp = pte_to_swp_entry(pte);
-	if (non_swap_entry(swp) && is_migration_entry(swp))
+	if (is_migration_entry(swp))
 		return true;
 	else
 		return false;
@@ -3791,7 +3791,7 @@ static int is_hugetlb_entry_hwpoisoned(pte_t pte)
 	if (huge_pte_none(pte) || pte_present(pte))
 		return 0;
 	swp = pte_to_swp_entry(pte);
-	if (non_swap_entry(swp) && is_hwpoison_entry(swp))
+	if (is_hwpoison_entry(swp))
 		return 1;
 	else
 		return 0;
-- 
2.20.1 (Apple Git-117)




* [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault()
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
                   ` (4 preceding siblings ...)
  2020-08-07  9:12 ` [PATCH 05/10] mm/hugetlb: remove the redundant check on non_swap_entry() Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 12:59   ` Baoquan He
  2020-08-10 22:00   ` Mike Kravetz
  2020-08-07  9:12 ` [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list Wei Yang
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

Later in hugetlb_fault(), before the proper processing, huge_pte_alloc()
is called unconditionally anyway. It is not necessary to also call it
here when ptep is NULL.
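
A sketch of that later, unconditional call (from the same function, not
part of this diff):

        mapping = vma->vm_file->f_mapping;
        i_mmap_lock_read(mapping);
        ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
        if (!ptep) {
                i_mmap_unlock_read(mapping);
                return VM_FAULT_OOM;
        }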

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f5f04e89000d..fb09e5a83c39 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4534,10 +4534,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
 			return VM_FAULT_HWPOISON_LARGE |
 				VM_FAULT_SET_HINDEX(hstate_index(h));
-	} else {
-		ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
-		if (!ptep)
-			return VM_FAULT_OOM;
 	}
 
 	/*
-- 
2.20.1 (Apple Git-117)




* [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
                   ` (5 preceding siblings ...)
  2020-08-07  9:12 ` [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault() Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 13:06   ` Baoquan He
  2020-08-10 22:25   ` Mike Kravetz
  2020-08-07  9:12 ` [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check Wei Yang
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

A page newly allocated from buddy is not on any list yet, so just using
list_add() is enough; there is no need for list_move().
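
For reference, list_move() is just a delete followed by an add (roughly
as defined in include/linux/list.h), and the delete step assumes the
entry is already linked somewhere, which a page fresh from buddy is not:

        static inline void list_move(struct list_head *list,
                                     struct list_head *head)
        {
                __list_del_entry(list);
                list_add(list, head);
        }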

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fb09e5a83c39..b8e844911b5a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2430,7 +2430,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 			h->resv_huge_pages--;
 		}
 		spin_lock(&hugetlb_lock);
-		list_move(&page->lru, &h->hugepage_activelist);
+		list_add(&page->lru, &h->hugepage_activelist);
 		/* Fall through */
 	}
 	hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, page);
-- 
2.20.1 (Apple Git-117)




* [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
                   ` (6 preceding siblings ...)
  2020-08-07  9:12 ` [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 13:09   ` Baoquan He
  2020-08-10 22:55   ` Mike Kravetz
  2020-08-07  9:12 ` [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page Wei Yang
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

Function dequeue_huge_page_node_exact() iterates the free list and
returns the first non-HWPoison page it finds.

Instead of breaking out of the loop and then re-checking the loop cursor,
we could return from within the loop directly. This avoids the redundant
check after the loop.
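
For reference, list_for_each_entry() terminates with the cursor pointing
at a pseudo-entry computed from the list head (roughly as defined in
include/linux/list.h), which is why the old code had to compare the
cursor against the head after the loop:

        #define list_for_each_entry(pos, head, member)                          \
                for (pos = list_first_entry(head, typeof(*pos), member);        \
                     &pos->member != (head);                                    \
                     pos = list_next_entry(pos, member))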

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b8e844911b5a..9473eb6800e9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1035,20 +1035,18 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
 {
 	struct page *page;
 
-	list_for_each_entry(page, &h->hugepage_freelists[nid], lru)
-		if (!PageHWPoison(page))
-			break;
-	/*
-	 * if 'non-isolated free hugepage' not found on the list,
-	 * the allocation fails.
-	 */
-	if (&h->hugepage_freelists[nid] == &page->lru)
-		return NULL;
-	list_move(&page->lru, &h->hugepage_activelist);
-	set_page_refcounted(page);
-	h->free_huge_pages--;
-	h->free_huge_pages_node[nid]--;
-	return page;
+	list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
+		if (PageHWPoison(page))
+			continue;
+
+		list_move(&page->lru, &h->hugepage_activelist);
+		set_page_refcounted(page);
+		h->free_huge_pages--;
+		h->free_huge_pages_node[nid]--;
+		return page;
+	}
+
+	return NULL;
 }
 
 static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask, int nid,
-- 
2.20.1 (Apple Git-117)




* [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
                   ` (7 preceding siblings ...)
  2020-08-07  9:12 ` [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-07 13:12   ` Baoquan He
  2020-08-10 23:02   ` Mike Kravetz
  2020-08-07  9:12 ` [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page Wei Yang
  2020-08-07 22:25 ` [PATCH 00/10] mm/hugetlb: code refine and simplification Mike Kravetz
  10 siblings, 2 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

set_hugetlb_cgroup[_rsvd]() just manipulates page-local data, which does
not need to be protected by hugetlb_lock.

Let's move these calls out of the locked region.
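
For reference, these helpers only store a pointer in a tail page of the
compound page itself; roughly (a simplified sketch of what
__set_hugetlb_cgroup() boils down to in that kernel, details such as the
compound_order() check omitted):

        if (rsvd)
                page[3].private = (unsigned long)h_cg;  /* rsvd cgroup */
        else
                page[2].private = (unsigned long)h_cg;  /* cgroup */

Nothing shared is touched, so hugetlb_lock does not buy anything here.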

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9473eb6800e9..1f2010c9dd8d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1494,9 +1494,9 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 {
 	INIT_LIST_HEAD(&page->lru);
 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
-	spin_lock(&hugetlb_lock);
 	set_hugetlb_cgroup(page, NULL);
 	set_hugetlb_cgroup_rsvd(page, NULL);
+	spin_lock(&hugetlb_lock);
 	h->nr_huge_pages++;
 	h->nr_huge_pages_node[nid]++;
 	spin_unlock(&hugetlb_lock);
-- 
2.20.1 (Apple Git-117)




* [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
                   ` (8 preceding siblings ...)
  2020-08-07  9:12 ` [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page Wei Yang
@ 2020-08-07  9:12 ` Wei Yang
  2020-08-10  2:17   ` Baoquan He
  2020-08-07 22:25 ` [PATCH 00/10] mm/hugetlb: code refine and simplification Mike Kravetz
  10 siblings, 1 reply; 45+ messages in thread
From: Wei Yang @ 2020-08-07  9:12 UTC (permalink / raw)
  To: mike.kravetz, akpm; +Cc: linux-mm, linux-kernel, Wei Yang

Let's always increase surplus_huge_pages first, so that free_huge_page()
can decrease it at free time, even when we lose the race against a pool
size change and have to drop the freshly allocated page.
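
The balancing decrement happens in the surplus branch of
free_huge_page(); roughly (a sketch, not part of this diff):

        if (h->surplus_huge_pages_node[nid]) {
                /* surplus page: give it back to buddy, fix up the counters */
                list_del(&page->lru);
                update_and_free_page(h, page);
                h->surplus_huge_pages--;
                h->surplus_huge_pages_node[nid]--;
        } else {
                /* otherwise keep it in the pool */
                arch_clear_hugepage_flags(page);
                enqueue_huge_page(h, page);
        }

With the counters bumped before the overcommit check, losing the race and
calling put_page() takes this surplus branch and the accounting stays
balanced, so the PageHugeTemporary() trick is no longer needed.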

Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
---
 mm/hugetlb.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1f2010c9dd8d..a0eb81e0e4c5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1913,21 +1913,19 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
 		return NULL;
 
 	spin_lock(&hugetlb_lock);
+
+	h->surplus_huge_pages++;
+	h->surplus_huge_pages_node[page_to_nid(page)]++;
+
 	/*
 	 * We could have raced with the pool size change.
 	 * Double check that and simply deallocate the new page
-	 * if we would end up overcommiting the surpluses. Abuse
-	 * temporary page to workaround the nasty free_huge_page
-	 * codeflow
+	 * if we would end up overcommiting the surpluses.
 	 */
-	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
-		SetPageHugeTemporary(page);
+	if (h->surplus_huge_pages > h->nr_overcommit_huge_pages) {
 		spin_unlock(&hugetlb_lock);
 		put_page(page);
 		return NULL;
-	} else {
-		h->surplus_huge_pages++;
-		h->surplus_huge_pages_node[page_to_nid(page)]++;
 	}
 
 out_unlock:
-- 
2.20.1 (Apple Git-117)




* Re: [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively
  2020-08-07  9:12 ` [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively Wei Yang
@ 2020-08-07 12:47   ` Baoquan He
  2020-08-10 20:22   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-07 12:47 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> Per my understanding, we keep the regions ordered and would always
> coalesce regions properly. So the task to keep this property is just
> to coalesce its neighbour.
> 
> Let's simplify this.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 590111ea6975..62ec74f6d03f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -307,8 +307,7 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
>  		list_del(&rg->link);
>  		kfree(rg);
>  
> -		coalesce_file_region(resv, prg);
> -		return;
> +		rg = prg;
>  	}
>  
>  	nrg = list_next_entry(rg, link);
> @@ -318,9 +317,6 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
>  
>  		list_del(&rg->link);
>  		kfree(rg);
> -
> -		coalesce_file_region(resv, nrg);

I agree with the change. But it changes the original behaviour of
coalesce_file_region(); I am not sure whether there was any reason for
the recursion, maybe Mike can give a judgement. Personally,

Reviewed-by: Baoquan He <bhe@redhat.com>

> -		return;
>  	}
>  }
>  
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty
  2020-08-07  9:12 ` [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty Wei Yang
@ 2020-08-07 12:49   ` Baoquan He
  2020-08-07 14:28     ` Wei Yang
  0 siblings, 1 reply; 45+ messages in thread
From: Baoquan He @ 2020-08-07 12:49 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> list_first_entry() may not return NULL even when the list is empty.
> 
> Let's make sure the behavior by using list_first_entry_or_null(),
> otherwise it would corrupt the list.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 62ec74f6d03f..0a2f3851b828 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
>  	VM_BUG_ON(resv->region_cache_count <= 0);


We already have the above line; is it possible to get NULL from
list_first_entry()?

>  
>  	resv->region_cache_count--;
> -	nrg = list_first_entry(&resv->region_cache, struct file_region, link);
> +	nrg = list_first_entry_or_null(&resv->region_cache,
> +			struct file_region, link);
>  	VM_BUG_ON(!nrg);
>  	list_del(&nrg->link);
>  
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once
  2020-08-07  9:12 ` [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once Wei Yang
@ 2020-08-07 12:53   ` Baoquan He
  2020-08-10 21:07   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-07 12:53 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> Instead of add allocated file_region one by one to region_cache, we
> could use list_splice to merge two list at once.
> 
> Also we know the number of entries in the list, increase the number
> directly.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 0a2f3851b828..929256c130f9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -443,11 +443,8 @@ static int allocate_file_region_entries(struct resv_map *resv,
>  
>  		spin_lock(&resv->lock);
>  
> -		list_for_each_entry_safe(rg, trg, &allocated_regions, link) {
> -			list_del(&rg->link);
> -			list_add(&rg->link, &resv->region_cache);
> -			resv->region_cache_count++;
> -		}
> +		list_splice(&allocated_regions, &resv->region_cache);
> +		resv->region_cache_count += to_allocate;

Looks good to me.

Reviewed-by: Baoquan He <bhe@redhat.com>

>  	}
>  
>  	return 0;
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL
  2020-08-07  9:12 ` [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL Wei Yang
@ 2020-08-07 12:54   ` Baoquan He
  2020-08-10 21:46   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-07 12:54 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> There are only two cases of function add_reservation_in_range()
> 
>     * count file_region and return the number in regions_needed
>     * do the real list operation without counting
> 
> This means it is not necessary to have two parameters to classify these
> two cases.
> 
> Just use regions_needed to separate them.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

Nice clean up.

Reviewed-by: Baoquan He <bhe@redhat.com>

> ---
>  mm/hugetlb.c | 33 +++++++++++++++++----------------
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 929256c130f9..d775e514eb2e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -321,16 +321,17 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
>  	}
>  }
>  
> -/* Must be called with resv->lock held. Calling this with count_only == true
> - * will count the number of pages to be added but will not modify the linked
> - * list. If regions_needed != NULL and count_only == true, then regions_needed
> - * will indicate the number of file_regions needed in the cache to carry out to
> - * add the regions for this range.
> +/*
> + * Must be called with resv->lock held.
> + *
> + * Calling this with regions_needed != NULL will count the number of pages
> + * to be added but will not modify the linked list. And regions_needed will
> + * indicate the number of file_regions needed in the cache to carry out to add
> + * the regions for this range.
>   */
>  static long add_reservation_in_range(struct resv_map *resv, long f, long t,
>  				     struct hugetlb_cgroup *h_cg,
> -				     struct hstate *h, long *regions_needed,
> -				     bool count_only)
> +				     struct hstate *h, long *regions_needed)
>  {
>  	long add = 0;
>  	struct list_head *head = &resv->regions;
> @@ -366,14 +367,14 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
>  		 */
>  		if (rg->from > last_accounted_offset) {
>  			add += rg->from - last_accounted_offset;
> -			if (!count_only) {
> +			if (!regions_needed) {
>  				nrg = get_file_region_entry_from_cache(
>  					resv, last_accounted_offset, rg->from);
>  				record_hugetlb_cgroup_uncharge_info(h_cg, h,
>  								    resv, nrg);
>  				list_add(&nrg->link, rg->link.prev);
>  				coalesce_file_region(resv, nrg);
> -			} else if (regions_needed)
> +			} else
>  				*regions_needed += 1;
>  		}
>  
> @@ -385,13 +386,13 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
>  	 */
>  	if (last_accounted_offset < t) {
>  		add += t - last_accounted_offset;
> -		if (!count_only) {
> +		if (!regions_needed) {
>  			nrg = get_file_region_entry_from_cache(
>  				resv, last_accounted_offset, t);
>  			record_hugetlb_cgroup_uncharge_info(h_cg, h, resv, nrg);
>  			list_add(&nrg->link, rg->link.prev);
>  			coalesce_file_region(resv, nrg);
> -		} else if (regions_needed)
> +		} else
>  			*regions_needed += 1;
>  	}
>  
> @@ -484,8 +485,8 @@ static long region_add(struct resv_map *resv, long f, long t,
>  retry:
>  
>  	/* Count how many regions are actually needed to execute this add. */
> -	add_reservation_in_range(resv, f, t, NULL, NULL, &actual_regions_needed,
> -				 true);
> +	add_reservation_in_range(resv, f, t, NULL, NULL,
> +				 &actual_regions_needed);
>  
>  	/*
>  	 * Check for sufficient descriptors in the cache to accommodate
> @@ -513,7 +514,7 @@ static long region_add(struct resv_map *resv, long f, long t,
>  		goto retry;
>  	}
>  
> -	add = add_reservation_in_range(resv, f, t, h_cg, h, NULL, false);
> +	add = add_reservation_in_range(resv, f, t, h_cg, h, NULL);
>  
>  	resv->adds_in_progress -= in_regions_needed;
>  
> @@ -549,9 +550,9 @@ static long region_chg(struct resv_map *resv, long f, long t,
>  
>  	spin_lock(&resv->lock);
>  
> -	/* Count how many hugepages in this range are NOT respresented. */
> +	/* Count how many hugepages in this range are NOT represented. */
>  	chg = add_reservation_in_range(resv, f, t, NULL, NULL,
> -				       out_regions_needed, true);
> +				       out_regions_needed);
>  
>  	if (*out_regions_needed == 0)
>  		*out_regions_needed = 1;
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 05/10] mm/hugetlb: remove the redundant check on non_swap_entry()
  2020-08-07  9:12 ` [PATCH 05/10] mm/hugetlb: remove the redundant check on non_swap_entry() Wei Yang
@ 2020-08-07 12:55   ` Baoquan He
  2020-08-07 14:28     ` Wei Yang
  0 siblings, 1 reply; 45+ messages in thread
From: Baoquan He @ 2020-08-07 12:55 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> Migration and hwpoison entry is a subset of non_swap_entry().
> 
> Remove the redundant check on non_swap_entry().
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

Hmm, I have posted a patch to do the same thing, which got reviewed by
people.

https://lore.kernel.org/linux-mm/20200723104636.GS32539@MiWiFi-R3L-srv/

> ---
>  mm/hugetlb.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index d775e514eb2e..f5f04e89000d 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3778,7 +3778,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
>  	if (huge_pte_none(pte) || pte_present(pte))
>  		return false;
>  	swp = pte_to_swp_entry(pte);
> -	if (non_swap_entry(swp) && is_migration_entry(swp))
> +	if (is_migration_entry(swp))
>  		return true;
>  	else
>  		return false;
> @@ -3791,7 +3791,7 @@ static int is_hugetlb_entry_hwpoisoned(pte_t pte)
>  	if (huge_pte_none(pte) || pte_present(pte))
>  		return 0;
>  	swp = pte_to_swp_entry(pte);
> -	if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> +	if (is_hwpoison_entry(swp))
>  		return 1;
>  	else
>  		return 0;
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault()
  2020-08-07  9:12 ` [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault() Wei Yang
@ 2020-08-07 12:59   ` Baoquan He
  2020-08-10 22:00   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-07 12:59 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> Before proper processing, huge_pte_alloc() would be called
> un-conditionally. It is not necessary to do this when ptep is NULL.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f5f04e89000d..fb09e5a83c39 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4534,10 +4534,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
>  			return VM_FAULT_HWPOISON_LARGE |
>  				VM_FAULT_SET_HINDEX(hstate_index(h));
> -	} else {
> -		ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
> -		if (!ptep)
> -			return VM_FAULT_OOM;

Right, seems a relic from Mike's i_mmap_rwsem handling patches.

Reviewed-by: Baoquan He <bhe@redhat.com>

>  	}
>  
>  	/*
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list
  2020-08-07  9:12 ` [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list Wei Yang
@ 2020-08-07 13:06   ` Baoquan He
  2020-08-10 22:25   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-07 13:06 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> The page allocated from buddy is not on any list, so just use list_add()
> is enough.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index fb09e5a83c39..b8e844911b5a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2430,7 +2430,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
>  			h->resv_huge_pages--;
>  		}
>  		spin_lock(&hugetlb_lock);
> -		list_move(&page->lru, &h->hugepage_activelist);
> +		list_add(&page->lru, &h->hugepage_activelist);

Looks good to me.

Reviewed-by: Baoquan He <bhe@redhat.com>

>  		/* Fall through */
>  	}
>  	hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, page);
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check
  2020-08-07  9:12 ` [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check Wei Yang
@ 2020-08-07 13:09   ` Baoquan He
  2020-08-07 14:32     ` Wei Yang
  2020-08-10 22:55   ` Mike Kravetz
  1 sibling, 1 reply; 45+ messages in thread
From: Baoquan He @ 2020-08-07 13:09 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> Function dequeue_huge_page_node_exact() iterates the free list and
> return the first non-isolated one.
> 
> Instead of break and check the loop variant, we could return in the loop
> directly. This could reduce some redundant check.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 26 ++++++++++++--------------
>  1 file changed, 12 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index b8e844911b5a..9473eb6800e9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1035,20 +1035,18 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
>  {
>  	struct page *page;
>  
> -	list_for_each_entry(page, &h->hugepage_freelists[nid], lru)
> -		if (!PageHWPoison(page))
> -			break;

I don't see how it can reduce a redundant check; these look like just
two different styles.

> -	/*
> -	 * if 'non-isolated free hugepage' not found on the list,
> -	 * the allocation fails.

But the above code comment seems stale; the code checks for an HWPoison
page directly, not the old isolated page check.

> -	 */
> -	if (&h->hugepage_freelists[nid] == &page->lru)
> -		return NULL;
> -	list_move(&page->lru, &h->hugepage_activelist);
> -	set_page_refcounted(page);
> -	h->free_huge_pages--;
> -	h->free_huge_pages_node[nid]--;
> -	return page;
> +	list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
> +		if (PageHWPoison(page))
> +			continue;
> +
> +		list_move(&page->lru, &h->hugepage_activelist);
> +		set_page_refcounted(page);
> +		h->free_huge_pages--;
> +		h->free_huge_pages_node[nid]--;
> +		return page;
> +	}
> +
> +	return NULL;
>  }
>  
>  static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask, int nid,
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page
  2020-08-07  9:12 ` [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page Wei Yang
@ 2020-08-07 13:12   ` Baoquan He
  2020-08-10 23:02   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-07 13:12 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> set_hugetlb_cgroup_[rsvd] just manipulate page local data, which is not
> necessary to be protected by hugetlb_lock.
> 
> Let's take this out.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9473eb6800e9..1f2010c9dd8d 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1494,9 +1494,9 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>  {
>  	INIT_LIST_HEAD(&page->lru);
>  	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
> -	spin_lock(&hugetlb_lock);
>  	set_hugetlb_cgroup(page, NULL);
>  	set_hugetlb_cgroup_rsvd(page, NULL);
> +	spin_lock(&hugetlb_lock);

Looks good to me.

Reviewed-by: Baoquan He <bhe@redhat.com>

>  	h->nr_huge_pages++;
>  	h->nr_huge_pages_node[nid]++;
>  	spin_unlock(&hugetlb_lock);
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty
  2020-08-07 12:49   ` Baoquan He
@ 2020-08-07 14:28     ` Wei Yang
  2020-08-10  0:57       ` Baoquan He
  2020-08-10 20:28       ` Mike Kravetz
  0 siblings, 2 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07 14:28 UTC (permalink / raw)
  To: Baoquan He; +Cc: Wei Yang, mike.kravetz, akpm, linux-mm, linux-kernel

On Fri, Aug 07, 2020 at 08:49:51PM +0800, Baoquan He wrote:
>On 08/07/20 at 05:12pm, Wei Yang wrote:
>> list_first_entry() may not return NULL even when the list is empty.
>> 
>> Let's make sure the behavior by using list_first_entry_or_null(),
>> otherwise it would corrupt the list.
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
>> ---
>>  mm/hugetlb.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 62ec74f6d03f..0a2f3851b828 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
>>  	VM_BUG_ON(resv->region_cache_count <= 0);
>
>
>We have had above line, is it possible to be NULL from list_first_entry?
>
>>  
>>  	resv->region_cache_count--;
>> -	nrg = list_first_entry(&resv->region_cache, struct file_region, link);
>> +	nrg = list_first_entry_or_null(&resv->region_cache,
>> +			struct file_region, link);
>>  	VM_BUG_ON(!nrg);

Or we can remove this VM_BUG_ON()?

>>  	list_del(&nrg->link);
>>  
>> -- 
>> 2.20.1 (Apple Git-117)
>> 
>> 

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 05/10] mm/hugetlb: remove the redundant check on non_swap_entry()
  2020-08-07 12:55   ` Baoquan He
@ 2020-08-07 14:28     ` Wei Yang
  0 siblings, 0 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07 14:28 UTC (permalink / raw)
  To: Baoquan He; +Cc: Wei Yang, mike.kravetz, akpm, linux-mm, linux-kernel

On Fri, Aug 07, 2020 at 08:55:50PM +0800, Baoquan He wrote:
>On 08/07/20 at 05:12pm, Wei Yang wrote:
>> Migration and hwpoison entry is a subset of non_swap_entry().
>> 
>> Remove the redundant check on non_swap_entry().
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
>
>Hmm, I have posted one patch to do the same thing, got reivewed by
>people.
>
>https://lore.kernel.org/linux-mm/20200723104636.GS32539@MiWiFi-R3L-srv/
>

Nice

>> ---
>>  mm/hugetlb.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index d775e514eb2e..f5f04e89000d 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3778,7 +3778,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
>>  	if (huge_pte_none(pte) || pte_present(pte))
>>  		return false;
>>  	swp = pte_to_swp_entry(pte);
>> -	if (non_swap_entry(swp) && is_migration_entry(swp))
>> +	if (is_migration_entry(swp))
>>  		return true;
>>  	else
>>  		return false;
>> @@ -3791,7 +3791,7 @@ static int is_hugetlb_entry_hwpoisoned(pte_t pte)
>>  	if (huge_pte_none(pte) || pte_present(pte))
>>  		return 0;
>>  	swp = pte_to_swp_entry(pte);
>> -	if (non_swap_entry(swp) && is_hwpoison_entry(swp))
>> +	if (is_hwpoison_entry(swp))
>>  		return 1;
>>  	else
>>  		return 0;
>> -- 
>> 2.20.1 (Apple Git-117)
>> 
>> 

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check
  2020-08-07 13:09   ` Baoquan He
@ 2020-08-07 14:32     ` Wei Yang
  0 siblings, 0 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-07 14:32 UTC (permalink / raw)
  To: Baoquan He; +Cc: Wei Yang, mike.kravetz, akpm, linux-mm, linux-kernel

On Fri, Aug 07, 2020 at 09:09:31PM +0800, Baoquan He wrote:
>On 08/07/20 at 05:12pm, Wei Yang wrote:
>> Function dequeue_huge_page_node_exact() iterates the free list and
>> return the first non-isolated one.
>> 
>> Instead of break and check the loop variant, we could return in the loop
>> directly. This could reduce some redundant check.
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
>> ---
>>  mm/hugetlb.c | 26 ++++++++++++--------------
>>  1 file changed, 12 insertions(+), 14 deletions(-)
>> 
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index b8e844911b5a..9473eb6800e9 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1035,20 +1035,18 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
>>  {
>>  	struct page *page;
>>  
>> -	list_for_each_entry(page, &h->hugepage_freelists[nid], lru)
>> -		if (!PageHWPoison(page))
>> -			break;
>
>I don't see how it can reduce redundant check, just two different
>styles.
>

list_for_each_entry() stops by checking whether the cursor reaches the
list head.

>> -	/*
>> -	 * if 'non-isolated free hugepage' not found on the list,
>> -	 * the allocation fails.
>
>But the above code comment seems stale, it checks HWPoision page()
>directly, but not the old isolated page checking.
>
>> -	 */
>> -	if (&h->hugepage_freelists[nid] == &page->lru)
>> -		return NULL;

And here the check is effectively done again if we iterate the whole
list without breaking.

By putting the code inside the loop, we can avoid this extra check.

>> -	list_move(&page->lru, &h->hugepage_activelist);
>> -	set_page_refcounted(page);
>> -	h->free_huge_pages--;
>> -	h->free_huge_pages_node[nid]--;
>> -	return page;
>> +	list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
>> +		if (PageHWPoison(page))
>> +			continue;
>> +
>> +		list_move(&page->lru, &h->hugepage_activelist);
>> +		set_page_refcounted(page);
>> +		h->free_huge_pages--;
>> +		h->free_huge_pages_node[nid]--;
>> +		return page;
>> +	}
>> +
>> +	return NULL;
>>  }
>>  
>>  static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask, int nid,
>> -- 
>> 2.20.1 (Apple Git-117)
>> 
>> 

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 00/10] mm/hugetlb: code refine and simplification
  2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
                   ` (9 preceding siblings ...)
  2020-08-07  9:12 ` [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page Wei Yang
@ 2020-08-07 22:25 ` Mike Kravetz
  10 siblings, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-07 22:25 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel

On 8/7/20 2:12 AM, Wei Yang wrote:
> Following are some cleanup for hugetlb.
> 
> Simple test with tools/testing/selftests/vm/map_hugetlb pass.
> 
> Wei Yang (10):
>   mm/hugetlb: not necessary to coalesce regions recursively
>   mm/hugetlb: make sure to get NULL when list is empty
>   mm/hugetlb: use list_splice to merge two list at once
>   mm/hugetlb: count file_region to be added when regions_needed != NULL
>   mm/hugetlb: remove the redundant check on non_swap_entry()
>   mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault()
>   mm/hugetlb: a page from buddy is not on any list
>   mm/hugetlb: return non-isolated page in the loop instead of break and
>     check
>   mm/hugetlb: narrow the hugetlb_lock protection area during preparing
>     huge page
>   mm/hugetlb: not necessary to abuse temporary page to workaround the
>     nasty free_huge_page
> 
>  mm/hugetlb.c | 101 ++++++++++++++++++++++-----------------------------
>  1 file changed, 44 insertions(+), 57 deletions(-)

Thanks Wei Yang!

I'll take a look at these next week.
-- 
Mike Kravetz



* Re: [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty
  2020-08-07 14:28     ` Wei Yang
@ 2020-08-10  0:57       ` Baoquan He
  2020-08-10 20:28       ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-10  0:57 UTC (permalink / raw)
  To: Wei Yang; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 10:28pm, Wei Yang wrote:
> On Fri, Aug 07, 2020 at 08:49:51PM +0800, Baoquan He wrote:
> >On 08/07/20 at 05:12pm, Wei Yang wrote:
> >> list_first_entry() may not return NULL even when the list is empty.
> >> 
> >> Let's make sure the behavior by using list_first_entry_or_null(),
> >> otherwise it would corrupt the list.
> >> 
> >> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> >> ---
> >>  mm/hugetlb.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >> index 62ec74f6d03f..0a2f3851b828 100644
> >> --- a/mm/hugetlb.c
> >> +++ b/mm/hugetlb.c
> >> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
> >>  	VM_BUG_ON(resv->region_cache_count <= 0);
> >
> >
> >We have had above line, is it possible to be NULL from list_first_entry?
> >
> >>  
> >>  	resv->region_cache_count--;
> >> -	nrg = list_first_entry(&resv->region_cache, struct file_region, link);
> >> +	nrg = list_first_entry_or_null(&resv->region_cache,
> >> +			struct file_region, link);
> >>  	VM_BUG_ON(!nrg);
> 
> Or we can remove this VM_BUG_ON()?

Yeah, that's fine with me.

> 
> >>  	list_del(&nrg->link);
> >>  
> >> -- 
> >> 2.20.1 (Apple Git-117)
> >> 
> >> 
> 
> -- 
> Wei Yang
> Help you, Help me
> 




* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-07  9:12 ` [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page Wei Yang
@ 2020-08-10  2:17   ` Baoquan He
  2020-08-11  0:19     ` Mike Kravetz
  0 siblings, 1 reply; 45+ messages in thread
From: Baoquan He @ 2020-08-10  2:17 UTC (permalink / raw)
  To: Wei Yang, mhocko; +Cc: mike.kravetz, akpm, linux-mm, linux-kernel

On 08/07/20 at 05:12pm, Wei Yang wrote:
> Let's always increase surplus_huge_pages and so that free_huge_page
> could decrease it at free time.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 1f2010c9dd8d..a0eb81e0e4c5 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1913,21 +1913,19 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
>  		return NULL;
>  
>  	spin_lock(&hugetlb_lock);
> +
> +	h->surplus_huge_pages++;
> +	h->surplus_huge_pages_node[page_to_nid(page)]++;
> +
>  	/*
>  	 * We could have raced with the pool size change.
>  	 * Double check that and simply deallocate the new page
> -	 * if we would end up overcommiting the surpluses. Abuse
> -	 * temporary page to workaround the nasty free_huge_page
> -	 * codeflow
> +	 * if we would end up overcommiting the surpluses.
>  	 */
> -	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> -		SetPageHugeTemporary(page);

Hmm, the temporary page approach was taken intentionally in
commit 9980d744a0428 ("mm, hugetlb: get rid of surplus page accounting tricks").
From the code, this is done while holding hugetlb_lock and the code flow
is straightforward, so it should be safe. Adding Michal to CC.


> +	if (h->surplus_huge_pages > h->nr_overcommit_huge_pages) {
>  		spin_unlock(&hugetlb_lock);
>  		put_page(page);
>  		return NULL;
> -	} else {
> -		h->surplus_huge_pages++;
> -		h->surplus_huge_pages_node[page_to_nid(page)]++;
>  	}
>  
>  out_unlock:
> -- 
> 2.20.1 (Apple Git-117)
> 
> 




* Re: [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively
  2020-08-07  9:12 ` [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively Wei Yang
  2020-08-07 12:47   ` Baoquan He
@ 2020-08-10 20:22   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-10 20:22 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel

On 8/7/20 2:12 AM, Wei Yang wrote:
> Per my understanding, we keep the regions ordered and would always
> coalesce regions properly. So the task to keep this property is just
> to coalesce its neighbour.
> 
> Let's simplify this.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

Thanks!  It is unfortunate that the region management code is difficult
to understand.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz



* Re: [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty
  2020-08-07 14:28     ` Wei Yang
  2020-08-10  0:57       ` Baoquan He
@ 2020-08-10 20:28       ` Mike Kravetz
  2020-08-10 23:05         ` Wei Yang
  1 sibling, 1 reply; 45+ messages in thread
From: Mike Kravetz @ 2020-08-10 20:28 UTC (permalink / raw)
  To: Wei Yang, Baoquan He; +Cc: akpm, linux-mm, linux-kernel

On 8/7/20 7:28 AM, Wei Yang wrote:
> On Fri, Aug 07, 2020 at 08:49:51PM +0800, Baoquan He wrote:
>> On 08/07/20 at 05:12pm, Wei Yang wrote:
>>> list_first_entry() may not return NULL even when the list is empty.
>>>
>>> Let's make sure the behavior by using list_first_entry_or_null(),
>>> otherwise it would corrupt the list.
>>>
>>> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
>>> ---
>>>  mm/hugetlb.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 62ec74f6d03f..0a2f3851b828 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
>>>  	VM_BUG_ON(resv->region_cache_count <= 0);
>>
>>
>> We have had above line, is it possible to be NULL from list_first_entry?
>>
>>>  
>>>  	resv->region_cache_count--;
>>> -	nrg = list_first_entry(&resv->region_cache, struct file_region, link);
>>> +	nrg = list_first_entry_or_null(&resv->region_cache,
>>> +			struct file_region, link);
>>>  	VM_BUG_ON(!nrg);
> 
> Or we can remove this VM_BUG_ON()?
> 

I would prefer that we just remove the 'VM_BUG_ON(!nrg)'.  Code elsewhere
is responsible for making sure there is ALWAYS an entry in the cache.  That
is why the 'VM_BUG_ON(resv->region_cache_count <= 0)' is at the beginning
of the routine.

-- 
Mike Kravetz



* Re: [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once
  2020-08-07  9:12 ` [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once Wei Yang
  2020-08-07 12:53   ` Baoquan He
@ 2020-08-10 21:07   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-10 21:07 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel

On 8/7/20 2:12 AM, Wei Yang wrote:
> Instead of add allocated file_region one by one to region_cache, we
> could use list_splice to merge two list at once.
> 
> Also we know the number of entries in the list, increase the number
> directly.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

Thanks!

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz



* Re: [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL
  2020-08-07  9:12 ` [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL Wei Yang
  2020-08-07 12:54   ` Baoquan He
@ 2020-08-10 21:46   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-10 21:46 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel

On 8/7/20 2:12 AM, Wei Yang wrote:
> There are only two cases of function add_reservation_in_range()
> 
>     * count file_region and return the number in regions_needed
>     * do the real list operation without counting
> 
> This means it is not necessary to have two parameters to classify these
> two cases.
> 
> Just use regions_needed to separate them.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

Thanks!

I really like removal of the 'count_only' parameter.  However, within the
routine 'count_only' made the code just a little easier to understand. I
might have:
- Removed the function parameter.
- Added local variable
  bool count_only = regions_needed != NULL;
- Left the remaining code as is.

I'm OK with the proposed change.  Any change to readability is VERY minimal.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz



* Re: [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault()
  2020-08-07  9:12 ` [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault() Wei Yang
  2020-08-07 12:59   ` Baoquan He
@ 2020-08-10 22:00   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-10 22:00 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel

On 8/7/20 2:12 AM, Wei Yang wrote:
> Before proper processing, huge_pte_alloc() would be called
> un-conditionally. It is not necessary to do this when ptep is NULL.

Worse, that extra call is a bug.  I believe Andrew pulled this patch into
his queue.  It still could use a review.

https://lore.kernel.org/linux-mm/e670f327-5cf9-1959-96e4-6dc7cc30d3d5@oracle.com/
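
To make the problem shape concrete, a simplified illustration (not the
exact code or hunk):

	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
	if (ptep) {
		/* migration / hwpoison checks on the existing entry */
	}
	...
	/* called unconditionally, even when ptep is already valid */
	ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));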

-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list
  2020-08-07  9:12 ` [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list Wei Yang
  2020-08-07 13:06   ` Baoquan He
@ 2020-08-10 22:25   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-10 22:25 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel

On 8/7/20 2:12 AM, Wei Yang wrote:
> The page allocated from buddy is not on any list, so just using list_add()
> is enough.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

Thanks!

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
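
(For reference, list_move() from include/linux/list.h is just a
__list_del_entry() followed by a list_add(); for a page whose ->lru is not
on any list yet, the delete step does no useful work:

	static inline void list_move(struct list_head *list, struct list_head *head)
	{
		__list_del_entry(list);
		list_add(list, head);
	}
)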

-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check
  2020-08-07  9:12 ` [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check Wei Yang
  2020-08-07 13:09   ` Baoquan He
@ 2020-08-10 22:55   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-10 22:55 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel

On 8/7/20 2:12 AM, Wei Yang wrote:
> Function dequeue_huge_page_node_exact() iterates the free list and
> returns the first non-isolated page.
> 
> Instead of breaking out and then checking the loop variable, we could
> return from within the loop directly.  This removes a redundant check.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

I agree with Baoquan He in that this is more of a style change.  Certainly
there is the potential to avoid an extra check and that is always good.

The real value in this patch (IMO) is removal of the stale comment.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
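
Roughly, and with the details assumed rather than copied from the hunk, the
loop ends up looking like:

	list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
		if (PageHWPoison(page))
			continue;

		list_move(&page->lru, &h->hugepage_activelist);
		set_page_refcounted(page);
		h->free_huge_pages--;
		h->free_huge_pages_node[nid]--;
		return page;
	}

	return NULL;

so the post-loop "did we walk off the end of the list?" test, and the stale
comment next to it, both go away.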

-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page
  2020-08-07  9:12 ` [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page Wei Yang
  2020-08-07 13:12   ` Baoquan He
@ 2020-08-10 23:02   ` Mike Kravetz
  1 sibling, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-10 23:02 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel

On 8/7/20 2:12 AM, Wei Yang wrote:
> set_hugetlb_cgroup_[rsvd]() just manipulates page-local data, which does
> not need to be protected by hugetlb_lock.
> 
> Let's take this out.
> 
> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>

Thanks!

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
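
The idea, sketched only for illustration (details are assumptions here, not
copied from the hunk):

	static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
	{
		INIT_LIST_HEAD(&page->lru);
		set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
		set_hugetlb_cgroup(page, NULL);
		set_hugetlb_cgroup_rsvd(page, NULL);
		spin_lock(&hugetlb_lock);	/* only the hstate counters need it */
		h->nr_huge_pages++;
		h->nr_huge_pages_node[nid]++;
		spin_unlock(&hugetlb_lock);
	}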

-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty
  2020-08-10 20:28       ` Mike Kravetz
@ 2020-08-10 23:05         ` Wei Yang
  0 siblings, 0 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-10 23:05 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Wei Yang, Baoquan He, akpm, linux-mm, linux-kernel

On Mon, Aug 10, 2020 at 01:28:46PM -0700, Mike Kravetz wrote:
>On 8/7/20 7:28 AM, Wei Yang wrote:
>> On Fri, Aug 07, 2020 at 08:49:51PM +0800, Baoquan He wrote:
>>> On 08/07/20 at 05:12pm, Wei Yang wrote:
>>>> list_first_entry() does not return NULL even when the list is empty.
>>>>
>>>> Let's make the behavior explicit by using list_first_entry_or_null(),
>>>> otherwise we could end up corrupting the list.
>>>>
>>>> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
>>>> ---
>>>>  mm/hugetlb.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>>> index 62ec74f6d03f..0a2f3851b828 100644
>>>> --- a/mm/hugetlb.c
>>>> +++ b/mm/hugetlb.c
>>>> @@ -237,7 +237,8 @@ get_file_region_entry_from_cache(struct resv_map *resv, long from, long to)
>>>>  	VM_BUG_ON(resv->region_cache_count <= 0);
>>>
>>>
>>> We already have the above line; is it possible for list_first_entry() to return NULL here?
>>>
>>>>  
>>>>  	resv->region_cache_count--;
>>>> -	nrg = list_first_entry(&resv->region_cache, struct file_region, link);
>>>> +	nrg = list_first_entry_or_null(&resv->region_cache,
>>>> +			struct file_region, link);
>>>>  	VM_BUG_ON(!nrg);
>> 
>> Or we can remove this VM_BUG_ON()?
>> 
>
>I would prefer that we just remove the 'VM_BUG_ON(!nrg)'.  Code elsewhere
>is responsible for making sure there is ALWAYS an entry in the cache.  That
>is why the 'VM_BUG_ON(resv->region_cache_count <= 0)' is at the beginning
>of the routine.

Sure, will change to this.

>
>-- 
>Mike Kravetz

-- 
Wei Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-10  2:17   ` Baoquan He
@ 2020-08-11  0:19     ` Mike Kravetz
  2020-08-11  1:51       ` Baoquan He
  0 siblings, 1 reply; 45+ messages in thread
From: Mike Kravetz @ 2020-08-11  0:19 UTC (permalink / raw)
  To: Baoquan He, Wei Yang, mhocko; +Cc: akpm, linux-mm, linux-kernel

On 8/9/20 7:17 PM, Baoquan He wrote:
> On 08/07/20 at 05:12pm, Wei Yang wrote:
>> Let's always increase surplus_huge_pages and so that free_huge_page
>> could decrease it at free time.
>>
>> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
>> ---
>>  mm/hugetlb.c | 14 ++++++--------
>>  1 file changed, 6 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 1f2010c9dd8d..a0eb81e0e4c5 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1913,21 +1913,19 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
>>  		return NULL;
>>  
>>  	spin_lock(&hugetlb_lock);
>> +
>> +	h->surplus_huge_pages++;
>> +	h->surplus_huge_pages_node[page_to_nid(page)]++;
>> +
>>  	/*
>>  	 * We could have raced with the pool size change.
>>  	 * Double check that and simply deallocate the new page
>> -	 * if we would end up overcommiting the surpluses. Abuse
>> -	 * temporary page to workaround the nasty free_huge_page
>> -	 * codeflow
>> +	 * if we would end up overcommiting the surpluses.
>>  	 */
>> -	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
>> -		SetPageHugeTemporary(page);
> 
> Hmm, the temporary page way is taken intentionally in
> commit 9980d744a0428 ("mm, hugetlb: get rid of surplus page accounting tricks").
> From code, this is done inside hugetlb_lock holding, and the code flow
> is straightforward, should be safe. Adding Michal to CC.
> 

I remember when the temporary page code was added for page migration.
The use of temporary page here was added at about the same time.  Temporary
page does have one advantage in that it will not CAUSE surplus count to
exceed overcommit.  This patch could cause surplus to exceed overcommit
for a very short period of time.  However, do note that for this to happen
the code needs to race with a pool resize which itself could cause surplus
to exceed overcommit.

IMO both approaches are valid.
- Advantage of temporary page is that it can not cause surplus to exceed
  overcommit.  Disadvantage is as mentioned in the comment 'abuse of temporary
  page'.
- Advantage of this patch is that it uses existing counters.  Disadvantage
  is that it can momentarily cause surplus to exceed overcommit.

Unless someone has a strong opinion, I prefer the changes in this patch.
-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-11  0:19     ` Mike Kravetz
@ 2020-08-11  1:51       ` Baoquan He
  2020-08-11  6:54         ` Michal Hocko
  0 siblings, 1 reply; 45+ messages in thread
From: Baoquan He @ 2020-08-11  1:51 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Wei Yang, mhocko, akpm, linux-mm, linux-kernel

On 08/10/20 at 05:19pm, Mike Kravetz wrote:
> On 8/9/20 7:17 PM, Baoquan He wrote:
> > On 08/07/20 at 05:12pm, Wei Yang wrote:
> >> Let's always increase surplus_huge_pages and so that free_huge_page
> >> could decrease it at free time.
> >>
> >> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> >> ---
> >>  mm/hugetlb.c | 14 ++++++--------
> >>  1 file changed, 6 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >> index 1f2010c9dd8d..a0eb81e0e4c5 100644
> >> --- a/mm/hugetlb.c
> >> +++ b/mm/hugetlb.c
> >> @@ -1913,21 +1913,19 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
> >>  		return NULL;
> >>  
> >>  	spin_lock(&hugetlb_lock);
> >> +
> >> +	h->surplus_huge_pages++;
> >> +	h->surplus_huge_pages_node[page_to_nid(page)]++;
> >> +
> >>  	/*
> >>  	 * We could have raced with the pool size change.
> >>  	 * Double check that and simply deallocate the new page
> >> -	 * if we would end up overcommiting the surpluses. Abuse
> >> -	 * temporary page to workaround the nasty free_huge_page
> >> -	 * codeflow
> >> +	 * if we would end up overcommiting the surpluses.
> >>  	 */
> >> -	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> >> -		SetPageHugeTemporary(page);
> > 
> > Hmm, the temporary page way is taken intentionally in
> > commit 9980d744a0428 ("mm, hugetlb: get rid of surplus page accounting tricks").
> > From code, this is done inside hugetlb_lock holding, and the code flow
> > is straightforward, should be safe. Adding Michal to CC.
> > 
> 
> I remember when the temporary page code was added for page migration.
> The use of temporary page here was added at about the same time.  Temporary
> page does have one advantage in that it will not CAUSE surplus count to
> exceed overcommit.  This patch could cause surplus to exceed overcommit
> for a very short period of time.  However, do note that for this to happen
> the code needs to race with a pool resize which itself could cause surplus
> to exceed overcommit.
> 
> IMO both approaches are valid.
> - Advantage of temporary page is that it can not cause surplus to exceed
>   overcommit.  Disadvantage is as mentioned in the comment 'abuse of temporary
>   page'.
> - Advantage of this patch is that it uses existing counters.  Disadvantage
>   is that it can momentarily cause surplus to exceed overcommit.

Yeah, since it's all done inside hugetlb_lock, should be OK even
though it may cause surplus to exceed overcommit.
> 
> Unless someone has a strong opinion, I prefer the changes in this patch.

Agree, I also prefer the code change in this patch, to remove the
unnecessary confusion about the temporary page.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-11  1:51       ` Baoquan He
@ 2020-08-11  6:54         ` Michal Hocko
  2020-08-11 21:43           ` Mike Kravetz
  2020-08-11 23:55           ` Baoquan He
  0 siblings, 2 replies; 45+ messages in thread
From: Michal Hocko @ 2020-08-11  6:54 UTC (permalink / raw)
  To: Baoquan He; +Cc: Mike Kravetz, Wei Yang, akpm, linux-mm, linux-kernel

On Tue 11-08-20 09:51:48, Baoquan He wrote:
> On 08/10/20 at 05:19pm, Mike Kravetz wrote:
> > On 8/9/20 7:17 PM, Baoquan He wrote:
> > > On 08/07/20 at 05:12pm, Wei Yang wrote:
> > >> Let's always increase surplus_huge_pages and so that free_huge_page
> > >> could decrease it at free time.
> > >>
> > >> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> > >> ---
> > >>  mm/hugetlb.c | 14 ++++++--------
> > >>  1 file changed, 6 insertions(+), 8 deletions(-)
> > >>
> > >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > >> index 1f2010c9dd8d..a0eb81e0e4c5 100644
> > >> --- a/mm/hugetlb.c
> > >> +++ b/mm/hugetlb.c
> > >> @@ -1913,21 +1913,19 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
> > >>  		return NULL;
> > >>  
> > >>  	spin_lock(&hugetlb_lock);
> > >> +
> > >> +	h->surplus_huge_pages++;
> > >> +	h->surplus_huge_pages_node[page_to_nid(page)]++;
> > >> +
> > >>  	/*
> > >>  	 * We could have raced with the pool size change.
> > >>  	 * Double check that and simply deallocate the new page
> > >> -	 * if we would end up overcommiting the surpluses. Abuse
> > >> -	 * temporary page to workaround the nasty free_huge_page
> > >> -	 * codeflow
> > >> +	 * if we would end up overcommiting the surpluses.
> > >>  	 */
> > >> -	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> > >> -		SetPageHugeTemporary(page);
> > > 
> > > Hmm, the temporary page way is taken intentionally in
> > > commit 9980d744a0428 ("mm, hugetlb: get rid of surplus page accounting tricks").
> > > From code, this is done inside hugetlb_lock holding, and the code flow
> > > is straightforward, should be safe. Adding Michal to CC.

But the lock is not held during the migration, right?

> > I remember when the temporary page code was added for page migration.
> > The use of temporary page here was added at about the same time.  Temporary
> > page does have one advantage in that it will not CAUSE surplus count to
> > exceed overcommit.  This patch could cause surplus to exceed overcommit
> > for a very short period of time.  However, do note that for this to happen
> > the code needs to race with a pool resize which itself could cause surplus
> > to exceed overcommit.

Correct.

> > IMO both approaches are valid.
> > - Advantage of temporary page is that it can not cause surplus to exceed
> >   overcommit.  Disadvantage is as mentioned in the comment 'abuse of temporary
> >   page'.
> > - Advantage of this patch is that it uses existing counters.  Disadvantage
> >   is that it can momentarily cause surplus to exceed overcommit.

Do I remember correctly that this can cause an allocation failure due to the
overcommit check? In other words, it would be a user space visible thing?

> Yeah, since it's all done inside hugetlb_lock, should be OK even
> though it may cause surplus to exceed overcommit.
> > 
> > Unless someone has a strong opinion, I prefer the changes in this patch.
> 
> Agree, I also prefer the code change in this patch, to remove the
> unnecessary confusion about the temporary page.

I have managed to forget all the juicy details since I have made that
change. All that remains is that the surplus pages accounting was quite
tricky and back then I didn't figure out a simpler method that would
achieve the consistent look at those counters. As mentioned above I
suspect this could lead to pre-mature allocation failures while the
migration is ongoing. Sure quite unlikely to happen and the race window
is likely very small. Maybe this is even acceptable but I would strongly
recommend to have all this thinking documented in the changelog.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-11  6:54         ` Michal Hocko
@ 2020-08-11 21:43           ` Mike Kravetz
  2020-08-11 23:19             ` Wei Yang
                               ` (2 more replies)
  2020-08-11 23:55           ` Baoquan He
  1 sibling, 3 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-11 21:43 UTC (permalink / raw)
  To: Michal Hocko, Baoquan He; +Cc: Wei Yang, akpm, linux-mm, linux-kernel

On 8/10/20 11:54 PM, Michal Hocko wrote:
> 
> I have managed to forget all the juicy details since I have made that
> change. All that remains is that the surplus pages accounting was quite
> tricky and back then I didn't figure out a simpler method that would
> achieve the consistent look at those counters. As mentioned above I
> suspect this could lead to pre-mature allocation failures while the
> migration is ongoing.

It is likely lost in the e-mail thread, but the suggested change was to
alloc_surplus_huge_page().  The code which allocates the migration target
(alloc_migrate_huge_page) will not be changed.  So, this should not be
an issue.

>                       Sure quite unlikely to happen and the race window
> is likely very small. Maybe this is even acceptable but I would strongly
> recommend to have all this thinking documented in the changelog.

I wrote down a description of what happens in the two different approaches
"temporary page" vs "surplus page".  It is at the very end of this e-mail.
When looking at the details, I came up with what may be an even better
approach.  Why not just call the low level routine to free the page instead
of going through put_page/free_huge_page?  At the very least, it saves a
lock roundtrip and there is no need to worry about the counters/accounting.

Here is a patch to do that.  However, we are optimizing a return path in
a race condition that we are unlikely to ever hit.  I 'tested' it by allocating
an 'extra' page and freeing it via this method in alloc_surplus_huge_page.

From 864c5f8ef4900c95ca3f6f2363a85f3cb25e793e Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Tue, 11 Aug 2020 12:45:41 -0700
Subject: [PATCH] hugetlb: optimize race error return in
 alloc_surplus_huge_page

The routine alloc_surplus_huge_page() could race with a pool
size change.  If this happens, the allocated page may not be needed.
To free the page, the current code will 'Abuse temporary page to
workaround the nasty free_huge_page codeflow'.  Instead, directly
call the low level routine that free_huge_page uses.  This works
out well because the page is new, we hold the only reference and
already hold the hugetlb_lock.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/hugetlb.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 590111ea6975..ac89b91fba86 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1923,14 +1923,17 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
 	/*
 	 * We could have raced with the pool size change.
 	 * Double check that and simply deallocate the new page
-	 * if we would end up overcommiting the surpluses. Abuse
-	 * temporary page to workaround the nasty free_huge_page
-	 * codeflow
+	 * if we would end up overcommiting the surpluses.
 	 */
 	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
-		SetPageHugeTemporary(page);
+		/*
+		 * Since this page is new, we hold the only reference, and
+		 * we already hold the hugetlb_lock call the low level free
+		 * page routine.  This saves at least a lock roundtrip.
+		 */
+		(void)put_page_testzero(page); /* don't call destructor */
+		update_and_free_page(h, page);
 		spin_unlock(&hugetlb_lock);
-		put_page(page);
 		return NULL;
 	} else {
 		h->surplus_huge_pages++;
-- 
2.25.4


Here is a description of the difference in "Temporary Page" vs "Surplus
Page" approach.

Both only allocate a fresh huge page if surplus_huge_pages is less than
nr_overcommit_huge_pages.  Of course, the lock protecting those counts
must be dropped to perform the allocation.  After reacquiring the lock
is where we have the proposed difference in behavior.

temporary page behavior
-----------------------
if surplus_huge_pages >= h->nr_overcommit_huge_pages
	SetPageHugeTemporary(page)
	spin_unlock(&hugetlb_lock);
	put_page(page);

At this time we know surplus_huge_pages is 'at least' nr_overcommit_huge_pages.
As a result, any new allocation will fail.
Only user visible result is that number of huge pages will be one greater than
that specified by user and overcommit values.  This is only visible for the
short time until the page is actually freed as a result of put_page().

free_huge_page()
	number of huge pages will be decremented

surplus page behavior
---------------------
surplus_huge_pages++
surplus_huge_pages_node[page_to_nid(page)]++
if surplus_huge_pages > nr_overcommit_huge_pages
	spin_unlock(&hugetlb_lock);
	put_page(page);

At this time we know surplus_huge_pages is greater than
nr_overcommit_huge_pages.  As a result, any new allocation will fail.
User visible result is an increase in surplus pages as well as number of
huge pages.  In addition, surplus pages will exceed overcommit.  This is
only visible for the short time until the page is actually freed as a
result of put_page().

free_huge_page()
	number of huge pages will be decremented
	h->surplus_huge_pages--;
	h->surplus_huge_pages_node[nid]--;



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-11 21:43           ` Mike Kravetz
@ 2020-08-11 23:19             ` Wei Yang
  2020-08-11 23:25               ` Mike Kravetz
  2020-08-12  5:40             ` Baoquan He
  2020-08-13 11:46             ` Michal Hocko
  2 siblings, 1 reply; 45+ messages in thread
From: Wei Yang @ 2020-08-11 23:19 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Michal Hocko, Baoquan He, Wei Yang, akpm, linux-mm, linux-kernel

On Tue, Aug 11, 2020 at 02:43:28PM -0700, Mike Kravetz wrote:
>On 8/10/20 11:54 PM, Michal Hocko wrote:
>> 
>> I have managed to forget all the juicy details since I have made that
>> change. All that remains is that the surplus pages accounting was quite
>> tricky and back then I didn't figure out a simpler method that would
>> achieve the consistent look at those counters. As mentioned above I
>> suspect this could lead to pre-mature allocation failures while the
>> migration is ongoing.
>
>It is likely lost in the e-mail thread, but the suggested change was to
>alloc_surplus_huge_page().  The code which allocates the migration target
>(alloc_migrate_huge_page) will not be changed.  So, this should not be
>an issue.
>
>>                       Sure quite unlikely to happen and the race window
>> is likely very small. Maybe this is even acceptable but I would strongly
>> recommend to have all this thinking documented in the changelog.
>
>I wrote down a description of what happens in the two different approaches
>"temporary page" vs "surplus page".  It is at the very end of this e-mail.
>When looking at the details, I came up with what may be an even better
>approach.  Why not just call the low level routine to free the page instead
>of going through put_page/free_huge_page?  At the very least, it saves a
>lock roundtrip and there is no need to worry about the counters/accounting.
>
>Here is a patch to do that.  However, we are optimizing a return path in
>a race condition that we are unlikely to ever hit.  I 'tested' it by allocating
>an 'extra' page and freeing it via this method in alloc_surplus_huge_page.
>
>>From 864c5f8ef4900c95ca3f6f2363a85f3cb25e793e Mon Sep 17 00:00:00 2001
>From: Mike Kravetz <mike.kravetz@oracle.com>
>Date: Tue, 11 Aug 2020 12:45:41 -0700
>Subject: [PATCH] hugetlb: optimize race error return in
> alloc_surplus_huge_page
>
>The routine alloc_surplus_huge_page() could race with a pool
>size change.  If this happens, the allocated page may not be needed.
>To free the page, the current code will 'Abuse temporary page to
>workaround the nasty free_huge_page codeflow'.  Instead, directly
>call the low level routine that free_huge_page uses.  This works
>out well because the page is new, we hold the only reference and
>already hold the hugetlb_lock.
>
>Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>---
> mm/hugetlb.c | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
>diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>index 590111ea6975..ac89b91fba86 100644
>--- a/mm/hugetlb.c
>+++ b/mm/hugetlb.c
>@@ -1923,14 +1923,17 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
> 	/*
> 	 * We could have raced with the pool size change.
> 	 * Double check that and simply deallocate the new page
>-	 * if we would end up overcommiting the surpluses. Abuse
>-	 * temporary page to workaround the nasty free_huge_page
>-	 * codeflow
>+	 * if we would end up overcommiting the surpluses.
> 	 */
> 	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
>-		SetPageHugeTemporary(page);
>+		/*
>+		 * Since this page is new, we hold the only reference, and
>+		 * we already hold the hugetlb_lock call the low level free
>+		 * page routine.  This saves at least a lock roundtrip.

The change looks good to me, though I am not sure I understand the "lock roundtrip".
You mean we don't need to release the hugetlb_lock?

>+		 */
>+		(void)put_page_testzero(page); /* don't call destructor */
>+		update_and_free_page(h, page);
> 		spin_unlock(&hugetlb_lock);
>-		put_page(page);
> 		return NULL;
> 	} else {
> 		h->surplus_huge_pages++;
>-- 
>2.25.4
>
>
>Here is a description of the difference in "Temporary Page" vs "Surplus
>Page" approach.
>
>Both only allocate a fresh huge page if surplus_huge_pages is less than
>nr_overcommit_huge_pages.  Of course, the lock protecting those counts
>must be dropped to perform the allocation.  After reacquiring the lock
>is where we have the proposed difference in behavior.
>
>temporary page behavior
>-----------------------
>if surplus_huge_pages >= h->nr_overcommit_huge_pages
>	SetPageHugeTemporary(page)
>	spin_unlock(&hugetlb_lock);
>	put_page(page);
>
>At this time we know surplus_huge_pages is 'at least' nr_overcommit_huge_pages.
>As a result, any new allocation will fail.
>Only user visible result is that number of huge pages will be one greater than
>that specified by user and overcommit values.  This is only visible for the
>short time until the page is actually freed as a result of put_page().
>
>free_huge_page()
>	number of huge pages will be decremented
>
>surplus page behavior
>---------------------
>surplus_huge_pages++
>surplus_huge_pages_node[page_to_nid(page)]++
>if surplus_huge_pages > nr_overcommit_huge_pages
>	spin_unlock(&hugetlb_lock);
>	put_page(page);
>
>At this time we know surplus_huge_pages is greater than
>nr_overcommit_huge_pages.  As a result, any new allocation will fail.
>User visible result is an increase in surplus pages as well as number of
>huge pages.  In addition, surplus pages will exceed overcommit.  This is
>only visible for the short time until the page is actually freed as a
>result of put_page().
>
>free_huge_page()
>	number of huge pages will be decremented
>	h->surplus_huge_pages--;
>	h->surplus_huge_pages_node[nid]--;

-- 
Wei Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-11 23:19             ` Wei Yang
@ 2020-08-11 23:25               ` Mike Kravetz
  0 siblings, 0 replies; 45+ messages in thread
From: Mike Kravetz @ 2020-08-11 23:25 UTC (permalink / raw)
  To: Wei Yang; +Cc: Michal Hocko, Baoquan He, akpm, linux-mm, linux-kernel

On 8/11/20 4:19 PM, Wei Yang wrote:
> On Tue, Aug 11, 2020 at 02:43:28PM -0700, Mike Kravetz wrote:
>> Subject: [PATCH] hugetlb: optimize race error return in
>> alloc_surplus_huge_page
>>
>> The routine alloc_surplus_huge_page() could race with a pool
>> size change.  If this happens, the allocated page may not be needed.
>> To free the page, the current code will 'Abuse temporary page to
>> workaround the nasty free_huge_page codeflow'.  Instead, directly
>> call the low level routine that free_huge_page uses.  This works
>> out well because the page is new, we hold the only reference and
>> already hold the hugetlb_lock.
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>> ---
>> mm/hugetlb.c | 13 ++++++++-----
>> 1 file changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 590111ea6975..ac89b91fba86 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1923,14 +1923,17 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
>> 	/*
>> 	 * We could have raced with the pool size change.
>> 	 * Double check that and simply deallocate the new page
>> -	 * if we would end up overcommiting the surpluses. Abuse
>> -	 * temporary page to workaround the nasty free_huge_page
>> -	 * codeflow
>> +	 * if we would end up overcommiting the surpluses.
>> 	 */
>> 	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
>> -		SetPageHugeTemporary(page);
>> +		/*
>> +		 * Since this page is new, we hold the only reference, and
>> +		 * we already hold the hugetlb_lock call the low level free
>> +		 * page routine.  This saves at least a lock roundtrip.
> 
> The change looks good to me, though I am not sure I understand the "lock roundtrip".
> You mean we don't need to release the hugetlb_lock?

Correct.
Normally we would free the page via free_huge_page() processing.  To do that
we need to drop hugetlb_lock and call put_page/free_huge_page which will
need to acquire the hugetlb_lock again.
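
Roughly, the two paths compare like this (simplified, just to show the lock
traffic):

	/* via put_page()/free_huge_page() */
	spin_unlock(&hugetlb_lock);
	put_page(page);
		-> free_huge_page()
			spin_lock(&hugetlb_lock);
			/* adjust counters, hand the page back */
			spin_unlock(&hugetlb_lock);

	/* direct path in the proposed patch */
	(void)put_page_testzero(page);
	update_and_free_page(h, page);
	spin_unlock(&hugetlb_lock);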
-- 
Mike Kravetz

> 
>> +		 */
>> +		(void)put_page_testzero(page); /* don't call destructor */
>> +		update_and_free_page(h, page);
>> 		spin_unlock(&hugetlb_lock);
>> -		put_page(page);
>> 		return NULL;
>> 	} else {
>> 		h->surplus_huge_pages++;
>> -- 
>> 2.25.4


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-11  6:54         ` Michal Hocko
  2020-08-11 21:43           ` Mike Kravetz
@ 2020-08-11 23:55           ` Baoquan He
  1 sibling, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-11 23:55 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Mike Kravetz, Wei Yang, akpm, linux-mm, linux-kernel

On 08/11/20 at 08:54am, Michal Hocko wrote:
> On Tue 11-08-20 09:51:48, Baoquan He wrote:
> > On 08/10/20 at 05:19pm, Mike Kravetz wrote:
> > > On 8/9/20 7:17 PM, Baoquan He wrote:
> > > > On 08/07/20 at 05:12pm, Wei Yang wrote:
> > > >> Let's always increase surplus_huge_pages and so that free_huge_page
> > > >> could decrease it at free time.
> > > >>
> > > >> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> > > >> ---
> > > >>  mm/hugetlb.c | 14 ++++++--------
> > > >>  1 file changed, 6 insertions(+), 8 deletions(-)
> > > >>
> > > >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > >> index 1f2010c9dd8d..a0eb81e0e4c5 100644
> > > >> --- a/mm/hugetlb.c
> > > >> +++ b/mm/hugetlb.c
> > > >> @@ -1913,21 +1913,19 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
> > > >>  		return NULL;
> > > >>  
> > > >>  	spin_lock(&hugetlb_lock);
> > > >> +
> > > >> +	h->surplus_huge_pages++;
> > > >> +	h->surplus_huge_pages_node[page_to_nid(page)]++;
> > > >> +
> > > >>  	/*
> > > >>  	 * We could have raced with the pool size change.
> > > >>  	 * Double check that and simply deallocate the new page
> > > >> -	 * if we would end up overcommiting the surpluses. Abuse
> > > >> -	 * temporary page to workaround the nasty free_huge_page
> > > >> -	 * codeflow
> > > >> +	 * if we would end up overcommiting the surpluses.
> > > >>  	 */
> > > >> -	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> > > >> -		SetPageHugeTemporary(page);
> > > > 
> > > > Hmm, the temporary page way is taken intentionally in
> > > > commit 9980d744a0428 ("mm, hugetlb: get rid of surplus page accounting tricks").
> > > > From code, this is done inside hugetlb_lock holding, and the code flow
> > > > is straightforward, should be safe. Adding Michal to CC.
> 
> But the lock is not held during the migration, right?

I see what I misunderstood about the hugetlb_lock holding.  put_page() is
called after releasing hugetlb_lock in alloc_surplus_huge_page(); I
mistakenly thought put_page() was called inside the hugetlb_lock.
Yes, there's obviously a race window, and the temporary
page way is an effective way to not mess up the surplus_huge_pages
accounting.

> 
> > > I remember when the temporary page code was added for page migration.
> > > The use of temporary page here was added at about the same time.  Temporary
> > > page does have one advantage in that it will not CAUSE surplus count to
> > > exceed overcommit.  This patch could cause surplus to exceed overcommit
> > > for a very short period of time.  However, do note that for this to happen
> > > the code needs to race with a pool resize which itself could cause surplus
> > > to exceed overcommit.
> 
> Correct.
> 
> > > IMO both approaches are valid.
> > > - Advantage of temporary page is that it can not cause surplus to exceed
> > >   overcommit.  Disadvantage is as mentioned in the comment 'abuse of temporary
> > >   page'.
> > > - Advantage of this patch is that it uses existing counters.  Disadvantage
> > >   is that it can momentarily cause surplus to exceed overcommit.
> 
> Do I remember correctly that this can cause an allocation failure due to the
> overcommit check? In other words, it would be a user space visible thing?
> 
> > Yeah, since it's all done inside hugetlb_lock, should be OK even
> > though it may cause surplus to exceed overcommit.
> > > 
> > > Unless someone has a strong opinion, I prefer the changes in this patch.
> > 
> > Agree, I also prefer the code change in this patch, to remove the
> > unnecessary confusion about the temporary page.
> 
> I have managed to forget all the juicy details since I have made that
> change. All that remains is that the surplus pages accounting was quite
> tricky and back then I didn't figure out a simpler method that would
> achieve the consistent look at those counters. As mentioned above I
> suspect this could lead to pre-mature allocation failures while the
> migration is ongoing. Sure quite unlikely to happen and the race window
> is likely very small. Maybe this is even acceptable but I would strongly
> recommend to have all this thinking documented in the changelog.
> -- 
> Michal Hocko
> SUSE Labs
> 



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-11 21:43           ` Mike Kravetz
  2020-08-11 23:19             ` Wei Yang
@ 2020-08-12  5:40             ` Baoquan He
  2020-08-13 11:46             ` Michal Hocko
  2 siblings, 0 replies; 45+ messages in thread
From: Baoquan He @ 2020-08-12  5:40 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Michal Hocko, Wei Yang, akpm, linux-mm, linux-kernel

On 08/11/20 at 02:43pm, Mike Kravetz wrote:
> Here is a patch to do that.  However, we are optimizing a return path in
> a race condition that we are unlikely to ever hit.  I 'tested' it by allocating
> an 'extra' page and freeing it via this method in alloc_surplus_huge_page.
> 
> From 864c5f8ef4900c95ca3f6f2363a85f3cb25e793e Mon Sep 17 00:00:00 2001
> From: Mike Kravetz <mike.kravetz@oracle.com>
> Date: Tue, 11 Aug 2020 12:45:41 -0700
> Subject: [PATCH] hugetlb: optimize race error return in
>  alloc_surplus_huge_page
> 
> The routine alloc_surplus_huge_page() could race with a pool
> size change.  If this happens, the allocated page may not be needed.
> To free the page, the current code will 'Abuse temporary page to
> workaround the nasty free_huge_page codeflow'.  Instead, directly
> call the low level routine that free_huge_page uses.  This works
> out well because the page is new, we hold the only reference and
> already hold the hugetlb_lock.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  mm/hugetlb.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 590111ea6975..ac89b91fba86 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1923,14 +1923,17 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
>  	/*
>  	 * We could have raced with the pool size change.
>  	 * Double check that and simply deallocate the new page
> -	 * if we would end up overcommiting the surpluses. Abuse
> -	 * temporary page to workaround the nasty free_huge_page
> -	 * codeflow
> +	 * if we would end up overcommiting the surpluses.
>  	 */
>  	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> -		SetPageHugeTemporary(page);
> +		/*
> +		 * Since this page is new, we hold the only reference, and
> +		 * we already hold the hugetlb_lock call the low level free
> +		 * page routine.  This saves at least a lock roundtrip.
> +		 */
> +		(void)put_page_testzero(page); /* don't call destructor */
> +		update_and_free_page(h, page);

Yeah, taking this code change, or keeping the temporary page way as is,
both look good.

>  		spin_unlock(&hugetlb_lock);
> -		put_page(page);
>  		return NULL;
>  	} else {
>  		h->surplus_huge_pages++;



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-11 21:43           ` Mike Kravetz
  2020-08-11 23:19             ` Wei Yang
  2020-08-12  5:40             ` Baoquan He
@ 2020-08-13 11:46             ` Michal Hocko
  2020-08-17  3:04               ` Wei Yang
  2 siblings, 1 reply; 45+ messages in thread
From: Michal Hocko @ 2020-08-13 11:46 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Baoquan He, Wei Yang, akpm, linux-mm, linux-kernel

On Tue 11-08-20 14:43:28, Mike Kravetz wrote:
> On 8/10/20 11:54 PM, Michal Hocko wrote:
> > 
> > I have managed to forget all the juicy details since I have made that
> > change. All that remains is that the surplus pages accounting was quite
> > tricky and back then I didn't figure out a simpler method that would
> > achieve the consistent look at those counters. As mentioned above I
> > suspect this could lead to pre-mature allocation failures while the
> > migration is ongoing.
> 
> It is likely lost in the e-mail thread, but the suggested change was to
> alloc_surplus_huge_page().  The code which allocates the migration target
> (alloc_migrate_huge_page) will not be changed.  So, this should not be
> an issue.

OK, I've missed that obviously.

> >                       Sure quite unlikely to happen and the race window
> > is likely very small. Maybe this is even acceptable but I would strongly
> > recommend to have all this thinking documented in the changelog.
> 
> I wrote down a description of what happens in the two different approaches
> "temporary page" vs "surplus page".  It is at the very end of this e-mail.
> When looking at the details, I came up with what may be an even better
> approach.  Why not just call the low level routine to free the page instead
> of going through put_page/free_huge_page?  At the very least, it saves a
> lock roundtrip and there is no need to worry about the counters/accounting.
> 
> Here is a patch to do that.  However, we are optimizing a return path in
> a race condition that we are unlikely to ever hit.  I 'tested' it by allocating
> an 'extra' page and freeing it via this method in alloc_surplus_huge_page.
> 
> >From 864c5f8ef4900c95ca3f6f2363a85f3cb25e793e Mon Sep 17 00:00:00 2001
> From: Mike Kravetz <mike.kravetz@oracle.com>
> Date: Tue, 11 Aug 2020 12:45:41 -0700
> Subject: [PATCH] hugetlb: optimize race error return in
>  alloc_surplus_huge_page
> 
> The routine alloc_surplus_huge_page() could race with a pool
> size change.  If this happens, the allocated page may not be needed.
> To free the page, the current code will 'Abuse temporary page to
> workaround the nasty free_huge_page codeflow'.  Instead, directly
> call the low level routine that free_huge_page uses.  This works
> out well because the page is new, we hold the only reference and
> already hold the hugetlb_lock.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  mm/hugetlb.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 590111ea6975..ac89b91fba86 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1923,14 +1923,17 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
>  	/*
>  	 * We could have raced with the pool size change.
>  	 * Double check that and simply deallocate the new page
> -	 * if we would end up overcommiting the surpluses. Abuse
> -	 * temporary page to workaround the nasty free_huge_page
> -	 * codeflow
> +	 * if we would end up overcommiting the surpluses.
>  	 */
>  	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> -		SetPageHugeTemporary(page);
> +		/*
> +		 * Since this page is new, we hold the only reference, and
> +		 * we already hold the hugetlb_lock call the low level free
> +		 * page routine.  This saves at least a lock roundtrip.
> +		 */
> +		(void)put_page_testzero(page); /* don't call destructor */
> +		update_and_free_page(h, page);
>  		spin_unlock(&hugetlb_lock);
> -		put_page(page);
>  		return NULL;
>  	} else {
>  		h->surplus_huge_pages++;

Yes this makes sense. I would have to think about this more to be
confident and give Acked-by but this looks sensible from a quick glance.

Thanks!
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
  2020-08-13 11:46             ` Michal Hocko
@ 2020-08-17  3:04               ` Wei Yang
  0 siblings, 0 replies; 45+ messages in thread
From: Wei Yang @ 2020-08-17  3:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Kravetz, Baoquan He, Wei Yang, akpm, linux-mm, linux-kernel

On Thu, Aug 13, 2020 at 01:46:38PM +0200, Michal Hocko wrote:
>On Tue 11-08-20 14:43:28, Mike Kravetz wrote:
>> On 8/10/20 11:54 PM, Michal Hocko wrote:
>> > 
>> > I have managed to forget all the juicy details since I have made that
>> > change. All that remains is that the surplus pages accounting was quite
>> > tricky and back then I didn't figure out a simpler method that would
>> > achieve the consistent look at those counters. As mentioned above I
>> > suspect this could lead to pre-mature allocation failures while the
>> > migration is ongoing.
>> 
>> It is likely lost in the e-mail thread, but the suggested change was to
>> alloc_surplus_huge_page().  The code which allocates the migration target
>> (alloc_migrate_huge_page) will not be changed.  So, this should not be
>> an issue.
>
>OK, I've missed that obviously.
>
>> >                       Sure quite unlikely to happen and the race window
>> > is likely very small. Maybe this is even acceptable but I would strongly
>> > recommend to have all this thinking documented in the changelog.
>> 
>> I wrote down a description of what happens in the two different approaches
>> "temporary page" vs "surplus page".  It is at the very end of this e-mail.
>> When looking at the details, I came up with what may be an even better
>> approach.  Why not just call the low level routine to free the page instead
>> of going through put_page/free_huge_page?  At the very least, it saves a
>> lock roundtrip and there is no need to worry about the counters/accounting.
>> 
>> Here is a patch to do that.  However, we are optimizing a return path in
>> a race condition that we are unlikely to ever hit.  I 'tested' it by allocating
>> an 'extra' page and freeing it via this method in alloc_surplus_huge_page.
>> 
>> >From 864c5f8ef4900c95ca3f6f2363a85f3cb25e793e Mon Sep 17 00:00:00 2001
>> From: Mike Kravetz <mike.kravetz@oracle.com>
>> Date: Tue, 11 Aug 2020 12:45:41 -0700
>> Subject: [PATCH] hugetlb: optimize race error return in
>>  alloc_surplus_huge_page
>> 
>> The routine alloc_surplus_huge_page() could race with a pool
>> size change.  If this happens, the allocated page may not be needed.
>> To free the page, the current code will 'Abuse temporary page to
>> workaround the nasty free_huge_page codeflow'.  Instead, directly
>> call the low level routine that free_huge_page uses.  This works
>> out well because the page is new, we hold the only reference and
>> already hold the hugetlb_lock.
>> 
>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>> ---
>>  mm/hugetlb.c | 13 ++++++++-----
>>  1 file changed, 8 insertions(+), 5 deletions(-)
>> 
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 590111ea6975..ac89b91fba86 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1923,14 +1923,17 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
>>  	/*
>>  	 * We could have raced with the pool size change.
>>  	 * Double check that and simply deallocate the new page
>> -	 * if we would end up overcommiting the surpluses. Abuse
>> -	 * temporary page to workaround the nasty free_huge_page
>> -	 * codeflow
>> +	 * if we would end up overcommiting the surpluses.
>>  	 */
>>  	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
>> -		SetPageHugeTemporary(page);
>> +		/*
>> +		 * Since this page is new, we hold the only reference, and
>> +		 * we already hold the hugetlb_lock call the low level free
>> +		 * page routine.  This saves at least a lock roundtrip.
>> +		 */
>> +		(void)put_page_testzero(page); /* don't call destructor */
>> +		update_and_free_page(h, page);
>>  		spin_unlock(&hugetlb_lock);
>> -		put_page(page);
>>  		return NULL;
>>  	} else {
>>  		h->surplus_huge_pages++;
>
>Yes this makes sense. I would have to think about this more to be
>confident and give Acked-by but this looks sensible from a quick glance.
>

If it is OK, may I send v2 without this one, to give the discussion more
time?

>Thanks!
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2020-08-17  3:04 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
2020-08-07  9:12 ` [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively Wei Yang
2020-08-07 12:47   ` Baoquan He
2020-08-10 20:22   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty Wei Yang
2020-08-07 12:49   ` Baoquan He
2020-08-07 14:28     ` Wei Yang
2020-08-10  0:57       ` Baoquan He
2020-08-10 20:28       ` Mike Kravetz
2020-08-10 23:05         ` Wei Yang
2020-08-07  9:12 ` [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once Wei Yang
2020-08-07 12:53   ` Baoquan He
2020-08-10 21:07   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL Wei Yang
2020-08-07 12:54   ` Baoquan He
2020-08-10 21:46   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 05/10] mm/hugetlb: remove the redundant check on non_swap_entry() Wei Yang
2020-08-07 12:55   ` Baoquan He
2020-08-07 14:28     ` Wei Yang
2020-08-07  9:12 ` [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault() Wei Yang
2020-08-07 12:59   ` Baoquan He
2020-08-10 22:00   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list Wei Yang
2020-08-07 13:06   ` Baoquan He
2020-08-10 22:25   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check Wei Yang
2020-08-07 13:09   ` Baoquan He
2020-08-07 14:32     ` Wei Yang
2020-08-10 22:55   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page Wei Yang
2020-08-07 13:12   ` Baoquan He
2020-08-10 23:02   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page Wei Yang
2020-08-10  2:17   ` Baoquan He
2020-08-11  0:19     ` Mike Kravetz
2020-08-11  1:51       ` Baoquan He
2020-08-11  6:54         ` Michal Hocko
2020-08-11 21:43           ` Mike Kravetz
2020-08-11 23:19             ` Wei Yang
2020-08-11 23:25               ` Mike Kravetz
2020-08-12  5:40             ` Baoquan He
2020-08-13 11:46             ` Michal Hocko
2020-08-17  3:04               ` Wei Yang
2020-08-11 23:55           ` Baoquan He
2020-08-07 22:25 ` [PATCH 00/10] mm/hugetlb: code refine and simplification Mike Kravetz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).