From: Mike Kravetz <mike.kravetz@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@suse.com>,
	Shakeel Butt <shakeelb@google.com>,
	Oscar Salvador <osalvador@suse.de>,
	David Hildenbrand <david@redhat.com>,
	Muchun Song <songmuchun@bytedance.com>,
	David Rientjes <rientjes@google.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	HORIGUCHI NAOYA <naoya.horiguchi@nec.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	Waiman Long <longman@redhat.com>, Peter Xu <peterx@redhat.com>,
	Mina Almasry <almasrymina@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: [RFC PATCH 2/8] hugetlb: recompute min_count when dropping hugetlb_lock
Date: Fri, 19 Mar 2021 15:42:03 -0700
Message-ID: <20210319224209.150047-3-mike.kravetz@oracle.com>
In-Reply-To: <20210319224209.150047-1-mike.kravetz@oracle.com>

The routine set_max_huge_pages reduces the number of hugetlb pages
by calling free_pool_huge_page in a loop.  It does this as long as
persistent_huge_pages() is above a calculated min_count value.
However, this loop can conditionally drop hugetlb_lock (via
cond_resched_lock), and in some circumstances free_pool_huge_page
can drop hugetlb_lock as well.  If the lock is dropped, counters
could change and the calculated min_count value may no longer be
valid.

The routine try_to_free_low has the same issue.

Recalculate min_count in each loop iteration as hugetlb_lock may have
been dropped.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
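
Note (not part of the commit): the stale min_count problem described
above can be sketched in userspace.  This is only a simplified model
under stated assumptions, not kernel code.  struct pool, free_one(),
lock_dropped() and shrink_pool() are hypothetical names; only the
arithmetic in min_hp_count() is taken from the patch.  lock_dropped()
stands in for another task adding reservations while hugetlb_lock is
not held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three hstate counters used by the patch. */
struct pool {
	unsigned long resv_huge_pages;
	unsigned long nr_huge_pages;
	unsigned long free_huge_pages;
};

/* Same arithmetic as min_hp_count() in the patch. */
static unsigned long min_hp_count(const struct pool *p, unsigned long count)
{
	unsigned long min_count;

	min_count = p->resv_huge_pages + p->nr_huge_pages - p->free_huge_pages;
	return min_count > count ? min_count : count;
}

/* Model of freeing one free page from the pool. */
static void free_one(struct pool *p)
{
	assert(p->free_huge_pages > 0);
	p->free_huge_pages--;
	p->nr_huge_pages--;
}

/*
 * Assumption for illustration: during the first iteration the lock is
 * dropped and another task reserves two pages.
 */
static void lock_dropped(struct pool *p, unsigned long iteration)
{
	if (iteration == 0)
		p->resv_huge_pages += 2;
}

/* Shrink the pool toward count and return the final nr_huge_pages. */
static unsigned long shrink_pool(struct pool p, unsigned long count,
				 bool recompute)
{
	unsigned long min_count = min_hp_count(&p, count);
	unsigned long i = 0;

	while (min_count < p.nr_huge_pages && p.free_huge_pages > 0) {
		free_one(&p);
		lock_dropped(&p, i++);
		/* The patch recomputes here; the old code did not. */
		if (recompute)
			min_count = min_hp_count(&p, count);
	}
	return p.nr_huge_pages;
}
```

In this model, starting from 10 pages with 6 free and a target count
of 4, the stale variant (recompute == false) shrinks the pool to 4
pages and leaves no free page to back the two new reservations.  The
recomputing variant stops at 6 pages, keeping 2 free pages for them.
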
 mm/hugetlb.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d5be25f910e8..c537274c2a38 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2521,11 +2521,20 @@ static void __init report_hugepages(void)
 	}
 }
 
+static inline unsigned long min_hp_count(struct hstate *h, unsigned long count)
+{
+	unsigned long min_count;
+
+	min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages;
+	return max(count, min_count);
+}
+
 #ifdef CONFIG_HIGHMEM
 static void try_to_free_low(struct hstate *h, unsigned long count,
 						nodemask_t *nodes_allowed)
 {
 	int i;
+	unsigned long min_count = min_hp_count(h, count);
 
 	if (hstate_is_gigantic(h))
 		return;
@@ -2534,7 +2543,7 @@ static void try_to_free_low(struct hstate *h, unsigned long count,
 		struct page *page, *next;
 		struct list_head *freel = &h->hugepage_freelists[i];
 		list_for_each_entry_safe(page, next, freel, lru) {
-			if (count >= h->nr_huge_pages)
+			if (min_count >= h->nr_huge_pages)
 				return;
 			if (PageHighMem(page))
 				continue;
@@ -2542,6 +2551,12 @@ static void try_to_free_low(struct hstate *h, unsigned long count,
 			update_and_free_page(h, page);
 			h->free_huge_pages--;
 			h->free_huge_pages_node[page_to_nid(page)]--;
+
+			/*
+			 * update_and_free_page could have dropped lock so
+			 * recompute min_count.
+			 */
+			min_count = min_hp_count(h, count);
 		}
 	}
 }
@@ -2695,13 +2710,15 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	 * and won't grow the pool anywhere else. Not until one of the
 	 * sysctls are changed, or the surplus pages go out of use.
 	 */
-	min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages;
-	min_count = max(count, min_count);
-	try_to_free_low(h, min_count, nodes_allowed);
+	min_count = min_hp_count(h, count);
+	try_to_free_low(h, count, nodes_allowed);
 	while (min_count < persistent_huge_pages(h)) {
 		if (!free_pool_huge_page(h, nodes_allowed, 0))
 			break;
 		cond_resched_lock(&hugetlb_lock);
+
+		/* Recompute min_count in case hugetlb_lock was dropped */
+		min_count = min_hp_count(h, count);
 	}
 	while (count < persistent_huge_pages(h)) {
 		if (!adjust_pool_surplus(h, nodes_allowed, 1))
-- 
2.30.2

