From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org,
	mhocko@suse.com, mike.kravetz@oracle.com,
	mm-commits@vger.kernel.org, osalvador@suse.de,
	shy828301@gmail.com, songmuchun@bytedance.com,
	stable@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 02/18] mm: hugetlb: fix a race between freeing and dissolving the page
Date: Thu, 04 Feb 2021 18:32:06 -0800
Message-ID: <20210205023206.qxjmf4Zwv%akpm@linux-foundation.org>
In-Reply-To: <20210204183135.e123f0d6027529f2cf500cf2@linux-foundation.org>

From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: hugetlb: fix a race between freeing and dissolving the page

There is a race condition between __free_huge_page()
and dissolve_free_huge_page().

CPU0:                         CPU1:

// page_count(page) == 1
put_page(page)
  __free_huge_page(page)
                              dissolve_free_huge_page(page)
                                spin_lock(&hugetlb_lock)
                                // PageHuge(page) && !page_count(page)
                                update_and_free_page(page)
                                // page is freed to the buddy
                                spin_unlock(&hugetlb_lock)
    spin_lock(&hugetlb_lock)
    clear_page_huge_active(page)
    enqueue_huge_page(page)
    // Wrong: the page has already been freed to the buddy
    spin_unlock(&hugetlb_lock)

The race window is between put_page() and dissolve_free_huge_page().

As a result, __free_huge_page() would corrupt page(s) already in the buddy
allocator.

To fix this, make sure that the page is already on the free list
before it is dissolved.
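
The ordering problem can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not kernel code: struct model_page,
the model_* functions and the MODEL_* result codes are hypothetical names
invented for the illustration, and the locking is indicated by comments
rather than implemented. It shows a dissolve attempt that arrives after the
refcount has dropped to zero but before the page has been enqueued, and that
is told to retry instead of freeing the page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a HugeTLB head page; not a kernel type. */
struct model_page {
	int  refcount;     /* models page_count(page) */
	bool on_free_list; /* models the "freed" marker set when the page is enqueued */
	bool in_buddy;     /* true once the page was handed back to the buddy allocator */
};

enum model_result { MODEL_DISSOLVED, MODEL_RETRY, MODEL_BUSY };

/* First half of put_page(): the refcount reaches zero before any lock is taken. */
static void model_put_page(struct model_page *p)
{
	p->refcount--;
}

/* Second half: what the freeing path does once it holds the lock. */
static void model_enqueue(struct model_page *p)
{
	p->on_free_list = true;
}

/* One pass of the dissolve path, imagined to run with the lock held. */
static enum model_result model_dissolve_once(struct model_page *p)
{
	if (p->refcount != 0)
		return MODEL_BUSY;   /* page is still in use */
	if (!p->on_free_list)
		return MODEL_RETRY;  /* the freeing CPU has not enqueued it yet */
	p->on_free_list = false;
	p->in_buddy = true;          /* safe: nobody will enqueue the page later */
	return MODEL_DISSOLVED;
}
```

Calling model_put_page(), then model_dissolve_once(), then model_enqueue()
reproduces the interleaving from the diagram: the middle call reports
MODEL_RETRY, and only a later call, made after the enqueue, dissolves the page.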

Link: https://lkml.kernel.org/r/20210115124942.46403-4-songmuchun@bytedance.com
Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |   39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

--- a/mm/hugetlb.c~mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page
+++ a/mm/hugetlb.c
@@ -79,6 +79,21 @@ DEFINE_SPINLOCK(hugetlb_lock);
 static int num_fault_mutexes;
 struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
+static inline bool PageHugeFreed(struct page *head)
+{
+	return page_private(head + 4) == -1UL;
+}
+
+static inline void SetPageHugeFreed(struct page *head)
+{
+	set_page_private(head + 4, -1UL);
+}
+
+static inline void ClearPageHugeFreed(struct page *head)
+{
+	set_page_private(head + 4, 0);
+}
+
 /* Forward declaration */
 static int hugetlb_acct_memory(struct hstate *h, long delta);
 
@@ -1028,6 +1043,7 @@ static void enqueue_huge_page(struct hst
 	list_move(&page->lru, &h->hugepage_freelists[nid]);
 	h->free_huge_pages++;
 	h->free_huge_pages_node[nid]++;
+	SetPageHugeFreed(page);
 }
 
 static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
@@ -1044,6 +1060,7 @@ static struct page *dequeue_huge_page_no
 
 		list_move(&page->lru, &h->hugepage_activelist);
 		set_page_refcounted(page);
+		ClearPageHugeFreed(page);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
 		return page;
@@ -1505,6 +1522,7 @@ static void prep_new_huge_page(struct hs
 	spin_lock(&hugetlb_lock);
 	h->nr_huge_pages++;
 	h->nr_huge_pages_node[nid]++;
+	ClearPageHugeFreed(page);
 	spin_unlock(&hugetlb_lock);
 }
 
@@ -1755,6 +1773,7 @@ int dissolve_free_huge_page(struct page
 {
 	int rc = -EBUSY;
 
+retry:
 	/* Not to disrupt normal path by vainly holding hugetlb_lock */
 	if (!PageHuge(page))
 		return 0;
@@ -1771,6 +1790,26 @@ int dissolve_free_huge_page(struct page
 		int nid = page_to_nid(head);
 		if (h->free_huge_pages - h->resv_huge_pages == 0)
 			goto out;
+
+		/*
+		 * We should make sure that the page is already on the free list
+		 * when it is dissolved.
+		 */
+		if (unlikely(!PageHugeFreed(head))) {
+			spin_unlock(&hugetlb_lock);
+			cond_resched();
+
+			/*
+			 * Theoretically, we should return -EBUSY when we
+			 * encounter this race. In fact, we have a chance
+			 * to successfully dissolve the page if we do a
+			 * retry. Because the race window is quite small.
+			 * If we seize this opportunity, it is an optimization
+			 * for increasing the success rate of dissolving page.
+			 */
+			goto retry;
+		}
+
 		/*
 		 * Move PageHWPoison flag from head page to the raw error page,
 		 * which makes any subpages rather than the error page reusable.
_
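
For readers following the flag handling in the diff, the marker's life cycle
can be sketched in plain C. This is an illustrative model, not the kernel
implementation: struct sketch_page, the sketch_* functions, SKETCH_FREED_SLOT
and SKETCH_NR_PAGES are hypothetical names, and the array merely stands in
for the struct pages of one compound page. The only details taken from the
patch are that the marker lives in the private field at offset 4 from the
head page, that -1UL means "on the free list", and that it is cleared on
preparation and dequeue and set on enqueue.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_FREED_SLOT 4 /* mirrors "head + 4" in the patch */
#define SKETCH_NR_PAGES   8 /* hypothetical compound size; must exceed the slot */

/* Hypothetical stand-in for struct page, reduced to its private field. */
struct sketch_page {
	unsigned long private;
};

static bool sketch_page_huge_freed(const struct sketch_page *head)
{
	return head[SKETCH_FREED_SLOT].private == -1UL;
}

/* Models prep_new_huge_page(): a new huge page starts out not freed. */
static void sketch_prep_new(struct sketch_page *head)
{
	head[SKETCH_FREED_SLOT].private = 0;
}

/* Models enqueue_huge_page(): the page is now on the free list. */
static void sketch_enqueue(struct sketch_page *head)
{
	head[SKETCH_FREED_SLOT].private = -1UL;
}

/* Models the dequeue path: the page leaves the free list for a user. */
static void sketch_dequeue(struct sketch_page *head)
{
	head[SKETCH_FREED_SLOT].private = 0;
}
```

The marker is true only between an enqueue and the next dequeue, which is the
window in which the dissolve path is allowed to free the page.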
