[PATCH] mm: hugetlb: fix a race between memory-failure/soft_offline and gather_surplus_pages

From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, akpm@linux-foundation.org,
	mhocko@suse.com, osalvador@suse.de
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH] mm: hugetlb: fix a race between memory-failure/soft_offline and gather_surplus_pages
Date: Wed, 21 Apr 2021 14:02:59 +0800
Message-ID: <20210421060259.67554-1-songmuchun@bytedance.com>

A possible bad scenario:

CPU0:                           CPU1:

                                gather_surplus_pages()
                                  page = alloc_surplus_huge_page()
memory_failure_hugetlb()
  get_hwpoison_page(page)
    __get_hwpoison_page(page)
      get_page_unless_zero(page)
                                  zero = put_page_testzero(page)
                                  VM_BUG_ON_PAGE(!zero, page)
                                  enqueue_huge_page(h, page)
  put_page(page)

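The memory-failure or soft_offline handlers can take a temporary
reference on the page, so put_page_testzero() in gather_surplus_pages()
is not guaranteed to drop the refcount to zero. When that happens we
trigger VM_BUG_ON_PAGE() and wrongly add the page to the hugetlb pool
list while it still has a user.

Fix this by enqueueing the page only when put_page_testzero() actually
drops the refcount to zero.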
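
For illustration only, the interleaving above can be replayed in user
space. This is a minimal sketch, not kernel code: struct fake_page,
fake_get_unless_zero(), fake_put_testzero() and replay_race() are
hypothetical names that model only the refcount with C11 atomics, and
the two CPUs are replayed in order on a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the refcount is modelled. */
struct fake_page {
	atomic_int refcount;
};

/* Models get_page_unless_zero(): take a reference unless the count is 0. */
static bool fake_get_unless_zero(struct fake_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Models put_page_testzero(): drop a reference, report whether it hit 0. */
static bool fake_put_testzero(struct fake_page *p)
{
	return atomic_fetch_sub(&p->refcount, 1) == 1;
}

/*
 * Replays the diagram's ordering on one thread.
 * Returns true if the gather_surplus_pages() side would enqueue the page.
 */
static bool replay_race(bool poison_handler_runs)
{
	struct fake_page page;
	bool pinned = false, enqueue;

	atomic_init(&page.refcount, 1);	/* reference held after allocation */

	if (poison_handler_runs)
		pinned = fake_get_unless_zero(&page);	/* CPU0: 1 -> 2 */

	/* CPU1: reaches 0 only if nobody else pinned the page. */
	enqueue = fake_put_testzero(&page);

	if (pinned) {
		/* CPU0: its put now drops the last reference. */
		bool last = fake_put_testzero(&page);

		assert(last);
	}
	return enqueue;
}
```

In this model replay_race(false) reports that the page would be
enqueued, while replay_race(true) reports that it would not, which is
the case the old VM_BUG_ON_PAGE() tripped over.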

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/hugetlb.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3476aa06da70..6c96332db34b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2145,17 +2145,14 @@ static int gather_surplus_pages(struct hstate *h, long delta)
 
 	/* Free the needed pages to the hugetlb pool */
 	list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
-		int zeroed;
-
 		if ((--needed) < 0)
 			break;
 		/*
-		 * This page is now managed by the hugetlb allocator and has
-		 * no users -- drop the buddy allocator's reference.
+		 * The refcount can possibly be increased by memory-failure or
+		 * soft_offline handlers.
 		 */
-		zeroed = put_page_testzero(page);
-		VM_BUG_ON_PAGE(!zeroed, page);
-		enqueue_huge_page(h, page);
+		if (likely(put_page_testzero(page)))
+			enqueue_huge_page(h, page);
 	}
 free:
 	spin_unlock_irq(&hugetlb_lock);
-- 
2.11.0