All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/5] Fix some bugs about HugeTLB code
@ 2021-01-15 12:49 Muchun Song
  2021-01-15 12:49 ` [PATCH v6 1/5] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Muchun Song @ 2021-01-15 12:49 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song

This patch series aims to fix some bugs and add some improvements.

Changelog since v5 -> v6:
 - Simplify patch #3 and update commit changelog.

Changelog since v4 -> v5:
 - Squash "mm: hugetlb: retry dissolve page when hitting race"
   to "mm: hugetlb: fix a race between freeing and dissolving the page"

Changelog since v3 -> v4:
  - Update commit log of patch #1.
  - Drop "mm: hugetlb: add return -EAGAIN for dissolve_free_huge_page"
  - Add "mm: hugetlb: retry dissolve page when hitting race"

Changelog since v2 -> v3:
  - Update commit log.
  - Using head[3].private to indicate the page is freed in patch #3.

Changelog since v1 -> v2:
  - Export set_page_huge_active() in patch #2 to fix.
  - Using head[3].mapping to indicate the page is freed in patch #3.
  - Flush @free_hpage_work in patch #4.

Muchun Song (5):
  mm: migrate: do not migrate HugeTLB page whose refcount is one
  mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
  mm: hugetlb: fix a race between freeing and dissolving the page
  mm: hugetlb: fix a race between isolating and freeing page
  mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active

 fs/hugetlbfs/inode.c    |  3 ++-
 include/linux/hugetlb.h |  2 ++
 mm/hugetlb.c            | 48 +++++++++++++++++++++++++++++++++++++++++++-----
 mm/migrate.c            |  6 ++++++
 4 files changed, 53 insertions(+), 6 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v6 1/5] mm: migrate: do not migrate HugeTLB page whose refcount is one
  2021-01-15 12:49 [PATCH v6 0/5] Fix some bugs about HugeTLB code Muchun Song
@ 2021-01-15 12:49 ` Muchun Song
  2021-01-15 12:49 ` [PATCH v6 2/5] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page Muchun Song
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2021-01-15 12:49 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song,
	Yang Shi, Michal Hocko, David Hildenbrand, Oscar Salvador

All pages isolated for the migration have an elevated reference count
and therefore seeing a reference count equal to 1 means that the last
user of the page has dropped the reference and the page has became
unused and there doesn't make much sense to migrate it anymore. This has
been done for regular pages and this patch does the same for hugetlb
pages. Although the likelyhood of the race is rather small for hugetlb
pages it makes sense the two code paths in sync.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Yang Shi <shy828301@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
---
 mm/migrate.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/migrate.c b/mm/migrate.c
index 4385f2fb5d18..a6631c4eb6a6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 		return -ENOSYS;
 	}
 
+	if (page_count(hpage) == 1) {
+		/* page was freed from under us. So we are done. */
+		putback_active_hugepage(hpage);
+		return MIGRATEPAGE_SUCCESS;
+	}
+
 	new_hpage = get_new_page(hpage, private);
 	if (!new_hpage)
 		return -ENOMEM;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v6 2/5] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
  2021-01-15 12:49 [PATCH v6 0/5] Fix some bugs about HugeTLB code Muchun Song
  2021-01-15 12:49 ` [PATCH v6 1/5] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
@ 2021-01-15 12:49 ` Muchun Song
  2021-01-15 12:49 ` [PATCH v6 3/5] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2021-01-15 12:49 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song,
	Michal Hocko, Oscar Salvador, stable

If a new hugetlb page is allocated during fallocate it will not be
marked as active (set_page_huge_active) which will result in a later
isolate_huge_page failure when the page migration code would like to
move that page. Such a failure would be unexpected and wrong.

Only export set_page_huge_active, just leave clear_page_huge_active
as static. Because there are no external users.

Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: stable@vger.kernel.org
---
 fs/hugetlbfs/inode.c    | 3 ++-
 include/linux/hugetlb.h | 2 ++
 mm/hugetlb.c            | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index b5c109703daa..21c20fd5f9ee 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -735,9 +735,10 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
+		set_page_huge_active(page);
 		/*
 		 * unlock_page because locked by add_to_page_cache()
-		 * page_put due to reference from alloc_huge_page()
+		 * put_page() due to reference from alloc_huge_page()
 		 */
 		unlock_page(page);
 		put_page(page);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ebca2ef02212..b5807f23caf8 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -770,6 +770,8 @@ static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 }
 #endif
 
+void set_page_huge_active(struct page *page);
+
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1f3bf1710b66..4741d60f8955 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1348,7 +1348,7 @@ bool page_huge_active(struct page *page)
 }
 
 /* never called for tail page */
-static void set_page_huge_active(struct page *page)
+void set_page_huge_active(struct page *page)
 {
 	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
 	SetPagePrivate(&page[1]);
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v6 3/5] mm: hugetlb: fix a race between freeing and dissolving the page
  2021-01-15 12:49 [PATCH v6 0/5] Fix some bugs about HugeTLB code Muchun Song
  2021-01-15 12:49 ` [PATCH v6 1/5] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
  2021-01-15 12:49 ` [PATCH v6 2/5] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page Muchun Song
@ 2021-01-15 12:49 ` Muchun Song
  2021-01-15 13:22   ` Oscar Salvador
  2021-01-18  9:13   ` Michal Hocko
  2021-01-15 12:49 ` [PATCH v6 4/5] mm: hugetlb: fix a race between isolating and freeing page Muchun Song
  2021-01-15 12:49 ` [PATCH v6 5/5] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active Muchun Song
  4 siblings, 2 replies; 10+ messages in thread
From: Muchun Song @ 2021-01-15 12:49 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song, stable

There is a race condition between __free_huge_page()
and dissolve_free_huge_page().

CPU0:                         CPU1:

// page_count(page) == 1
put_page(page)
  __free_huge_page(page)
                              dissolve_free_huge_page(page)
                                spin_lock(&hugetlb_lock)
                                // PageHuge(page) && !page_count(page)
                                update_and_free_page(page)
                                // page is freed to the buddy
                                spin_unlock(&hugetlb_lock)
    spin_lock(&hugetlb_lock)
    clear_page_huge_active(page)
    enqueue_huge_page(page)
    // It is wrong, the page is already freed
    spin_unlock(&hugetlb_lock)

The race windows is between put_page() and dissolve_free_huge_page().

We should make sure that the page is already on the free list
when it is dissolved.

As a result __free_huge_page would corrupt page(s) already in the buddy
allocator.

Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: stable@vger.kernel.org
---
 mm/hugetlb.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4741d60f8955..b99fe4a2b435 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -79,6 +79,21 @@ DEFINE_SPINLOCK(hugetlb_lock);
 static int num_fault_mutexes;
 struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
+static inline bool PageHugeFreed(struct page *head)
+{
+	return page_private(head + 4) == -1UL;
+}
+
+static inline void SetPageHugeFreed(struct page *head)
+{
+	set_page_private(head + 4, -1UL);
+}
+
+static inline void ClearPageHugeFreed(struct page *head)
+{
+	set_page_private(head + 4, 0);
+}
+
 /* Forward declaration */
 static int hugetlb_acct_memory(struct hstate *h, long delta);
 
@@ -1028,6 +1043,7 @@ static void enqueue_huge_page(struct hstate *h, struct page *page)
 	list_move(&page->lru, &h->hugepage_freelists[nid]);
 	h->free_huge_pages++;
 	h->free_huge_pages_node[nid]++;
+	SetPageHugeFreed(page);
 }
 
 static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
@@ -1044,6 +1060,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
 
 		list_move(&page->lru, &h->hugepage_activelist);
 		set_page_refcounted(page);
+		ClearPageHugeFreed(page);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
 		return page;
@@ -1504,6 +1521,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 	spin_lock(&hugetlb_lock);
 	h->nr_huge_pages++;
 	h->nr_huge_pages_node[nid]++;
+	ClearPageHugeFreed(page);
 	spin_unlock(&hugetlb_lock);
 }
 
@@ -1754,6 +1772,7 @@ int dissolve_free_huge_page(struct page *page)
 {
 	int rc = -EBUSY;
 
+retry:
 	/* Not to disrupt normal path by vainly holding hugetlb_lock */
 	if (!PageHuge(page))
 		return 0;
@@ -1770,6 +1789,26 @@ int dissolve_free_huge_page(struct page *page)
 		int nid = page_to_nid(head);
 		if (h->free_huge_pages - h->resv_huge_pages == 0)
 			goto out;
+
+		/*
+		 * We should make sure that the page is already on the free list
+		 * when it is dissolved.
+		 */
+		if (unlikely(!PageHugeFreed(head))) {
+			spin_unlock(&hugetlb_lock);
+			cond_resched();
+
+			/*
+			 * Theoretically, we should return -EBUSY when we
+			 * encounter this race. In fact, we have a chance
+			 * to successfully dissolve the page if we do a
+			 * retry. Because the race window is quite small.
+			 * If we seize this opportunity, it is an optimization
+			 * for increasing the success rate of dissolving page.
+			 */
+			goto retry;
+		}
+
 		/*
 		 * Move PageHWPoison flag from head page to the raw error page,
 		 * which makes any subpages rather than the error page reusable.
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v6 4/5] mm: hugetlb: fix a race between isolating and freeing page
  2021-01-15 12:49 [PATCH v6 0/5] Fix some bugs about HugeTLB code Muchun Song
                   ` (2 preceding siblings ...)
  2021-01-15 12:49 ` [PATCH v6 3/5] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
@ 2021-01-15 12:49 ` Muchun Song
  2021-01-15 13:26   ` Oscar Salvador
  2021-01-15 12:49 ` [PATCH v6 5/5] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active Muchun Song
  4 siblings, 1 reply; 10+ messages in thread
From: Muchun Song @ 2021-01-15 12:49 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song,
	Michal Hocko, stable

There is a race between isolate_huge_page() and __free_huge_page().

CPU0:                                       CPU1:

if (PageHuge(page))
                                            put_page(page)
                                              __free_huge_page(page)
                                                  spin_lock(&hugetlb_lock)
                                                  update_and_free_page(page)
                                                    set_compound_page_dtor(page,
                                                      NULL_COMPOUND_DTOR)
                                                  spin_unlock(&hugetlb_lock)
  isolate_huge_page(page)
    // trigger BUG_ON
    VM_BUG_ON_PAGE(!PageHead(page), page)
    spin_lock(&hugetlb_lock)
    page_huge_active(page)
      // trigger BUG_ON
      VM_BUG_ON_PAGE(!PageHuge(page), page)
    spin_unlock(&hugetlb_lock)

When we isolate a HugeTLB page on CPU0. Meanwhile, we free it to the
buddy allocator on CPU1. Then, we can trigger a BUG_ON on CPU0. Because
it is already freed to the buddy allocator.

Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: stable@vger.kernel.org
---
 mm/hugetlb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b99fe4a2b435..f2cef3cf1f85 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5572,9 +5572,9 @@ bool isolate_huge_page(struct page *page, struct list_head *list)
 {
 	bool ret = true;
 
-	VM_BUG_ON_PAGE(!PageHead(page), page);
 	spin_lock(&hugetlb_lock);
-	if (!page_huge_active(page) || !get_page_unless_zero(page)) {
+	if (!PageHeadHuge(page) || !page_huge_active(page) ||
+	    !get_page_unless_zero(page)) {
 		ret = false;
 		goto unlock;
 	}
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v6 5/5] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
  2021-01-15 12:49 [PATCH v6 0/5] Fix some bugs about HugeTLB code Muchun Song
                   ` (3 preceding siblings ...)
  2021-01-15 12:49 ` [PATCH v6 4/5] mm: hugetlb: fix a race between isolating and freeing page Muchun Song
@ 2021-01-15 12:49 ` Muchun Song
  2021-01-15 13:27   ` Oscar Salvador
  4 siblings, 1 reply; 10+ messages in thread
From: Muchun Song @ 2021-01-15 12:49 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song,
	Michal Hocko, stable

The page_huge_active() can be called from scan_movable_pages() which
do not hold a reference count to the HugeTLB page. So when we call
page_huge_active() from scan_movable_pages(), the HugeTLB page can
be freed parallel. Then we will trigger a BUG_ON which is in the
page_huge_active() when CONFIG_DEBUG_VM is enabled. Just remove the
VM_BUG_ON_PAGE.

Fixes: 7e1f049efb86 ("mm: hugetlb: cleanup using paeg_huge_active()")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: stable@vger.kernel.org
---
 mm/hugetlb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f2cef3cf1f85..2d671f3743f5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1360,8 +1360,7 @@ struct hstate *size_to_hstate(unsigned long size)
  */
 bool page_huge_active(struct page *page)
 {
-	VM_BUG_ON_PAGE(!PageHuge(page), page);
-	return PageHead(page) && PagePrivate(&page[1]);
+	return PageHeadHuge(page) && PagePrivate(&page[1]);
 }
 
 /* never called for tail page */
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 3/5] mm: hugetlb: fix a race between freeing and dissolving the page
  2021-01-15 12:49 ` [PATCH v6 3/5] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
@ 2021-01-15 13:22   ` Oscar Salvador
  2021-01-18  9:13   ` Michal Hocko
  1 sibling, 0 replies; 10+ messages in thread
From: Oscar Salvador @ 2021-01-15 13:22 UTC (permalink / raw)
  To: Muchun Song
  Cc: mike.kravetz, akpm, n-horiguchi, ak, mhocko, linux-mm,
	linux-kernel, stable

On Fri, Jan 15, 2021 at 08:49:40PM +0800, Muchun Song wrote:
> There is a race condition between __free_huge_page()
> and dissolve_free_huge_page().
> 
> CPU0:                         CPU1:
> 
> // page_count(page) == 1
> put_page(page)
>   __free_huge_page(page)
>                               dissolve_free_huge_page(page)
>                                 spin_lock(&hugetlb_lock)
>                                 // PageHuge(page) && !page_count(page)
>                                 update_and_free_page(page)
>                                 // page is freed to the buddy
>                                 spin_unlock(&hugetlb_lock)
>     spin_lock(&hugetlb_lock)
>     clear_page_huge_active(page)
>     enqueue_huge_page(page)
>     // It is wrong, the page is already freed
>     spin_unlock(&hugetlb_lock)
> 
> The race windows is between put_page() and dissolve_free_huge_page().
> 
> We should make sure that the page is already on the free list
> when it is dissolved.
> 
> As a result __free_huge_page would corrupt page(s) already in the buddy
> allocator.
> 
> Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: stable@vger.kernel.org

LGTM,

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 4/5] mm: hugetlb: fix a race between isolating and freeing page
  2021-01-15 12:49 ` [PATCH v6 4/5] mm: hugetlb: fix a race between isolating and freeing page Muchun Song
@ 2021-01-15 13:26   ` Oscar Salvador
  0 siblings, 0 replies; 10+ messages in thread
From: Oscar Salvador @ 2021-01-15 13:26 UTC (permalink / raw)
  To: Muchun Song
  Cc: mike.kravetz, akpm, n-horiguchi, ak, mhocko, linux-mm,
	linux-kernel, Michal Hocko, stable

On Fri, Jan 15, 2021 at 08:49:41PM +0800, Muchun Song wrote:
> There is a race between isolate_huge_page() and __free_huge_page().
> 
> CPU0:                                       CPU1:
> 
> if (PageHuge(page))
>                                             put_page(page)
>                                               __free_huge_page(page)
>                                                   spin_lock(&hugetlb_lock)
>                                                   update_and_free_page(page)
>                                                     set_compound_page_dtor(page,
>                                                       NULL_COMPOUND_DTOR)
>                                                   spin_unlock(&hugetlb_lock)
>   isolate_huge_page(page)
>     // trigger BUG_ON
>     VM_BUG_ON_PAGE(!PageHead(page), page)
>     spin_lock(&hugetlb_lock)
>     page_huge_active(page)
>       // trigger BUG_ON
>       VM_BUG_ON_PAGE(!PageHuge(page), page)
>     spin_unlock(&hugetlb_lock)
> 
> When we isolate a HugeTLB page on CPU0. Meanwhile, we free it to the
> buddy allocator on CPU1. Then, we can trigger a BUG_ON on CPU0. Because
> it is already freed to the buddy allocator.
> 
> Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 5/5] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
  2021-01-15 12:49 ` [PATCH v6 5/5] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active Muchun Song
@ 2021-01-15 13:27   ` Oscar Salvador
  0 siblings, 0 replies; 10+ messages in thread
From: Oscar Salvador @ 2021-01-15 13:27 UTC (permalink / raw)
  To: Muchun Song
  Cc: mike.kravetz, akpm, n-horiguchi, ak, mhocko, linux-mm,
	linux-kernel, Michal Hocko, stable

On Fri, Jan 15, 2021 at 08:49:42PM +0800, Muchun Song wrote:
> The page_huge_active() can be called from scan_movable_pages() which
> do not hold a reference count to the HugeTLB page. So when we call
> page_huge_active() from scan_movable_pages(), the HugeTLB page can
> be freed parallel. Then we will trigger a BUG_ON which is in the
> page_huge_active() when CONFIG_DEBUG_VM is enabled. Just remove the
> VM_BUG_ON_PAGE.
> 
> Fixes: 7e1f049efb86 ("mm: hugetlb: cleanup using paeg_huge_active()")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 3/5] mm: hugetlb: fix a race between freeing and dissolving the page
  2021-01-15 12:49 ` [PATCH v6 3/5] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
  2021-01-15 13:22   ` Oscar Salvador
@ 2021-01-18  9:13   ` Michal Hocko
  1 sibling, 0 replies; 10+ messages in thread
From: Michal Hocko @ 2021-01-18  9:13 UTC (permalink / raw)
  To: Muchun Song
  Cc: mike.kravetz, akpm, n-horiguchi, ak, linux-mm, linux-kernel, stable

On Fri 15-01-21 20:49:40, Muchun Song wrote:
> There is a race condition between __free_huge_page()
> and dissolve_free_huge_page().
> 
> CPU0:                         CPU1:
> 
> // page_count(page) == 1
> put_page(page)
>   __free_huge_page(page)
>                               dissolve_free_huge_page(page)
>                                 spin_lock(&hugetlb_lock)
>                                 // PageHuge(page) && !page_count(page)
>                                 update_and_free_page(page)
>                                 // page is freed to the buddy
>                                 spin_unlock(&hugetlb_lock)
>     spin_lock(&hugetlb_lock)
>     clear_page_huge_active(page)
>     enqueue_huge_page(page)
>     // It is wrong, the page is already freed
>     spin_unlock(&hugetlb_lock)
> 
> The race windows is between put_page() and dissolve_free_huge_page().
> 
> We should make sure that the page is already on the free list
> when it is dissolved.
> 
> As a result __free_huge_page would corrupt page(s) already in the buddy
> allocator.
> 
> Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: stable@vger.kernel.org

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  mm/hugetlb.c | 39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 4741d60f8955..b99fe4a2b435 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -79,6 +79,21 @@ DEFINE_SPINLOCK(hugetlb_lock);
>  static int num_fault_mutexes;
>  struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
>  
> +static inline bool PageHugeFreed(struct page *head)
> +{
> +	return page_private(head + 4) == -1UL;
> +}
> +
> +static inline void SetPageHugeFreed(struct page *head)
> +{
> +	set_page_private(head + 4, -1UL);
> +}
> +
> +static inline void ClearPageHugeFreed(struct page *head)
> +{
> +	set_page_private(head + 4, 0);
> +}
> +
>  /* Forward declaration */
>  static int hugetlb_acct_memory(struct hstate *h, long delta);
>  
> @@ -1028,6 +1043,7 @@ static void enqueue_huge_page(struct hstate *h, struct page *page)
>  	list_move(&page->lru, &h->hugepage_freelists[nid]);
>  	h->free_huge_pages++;
>  	h->free_huge_pages_node[nid]++;
> +	SetPageHugeFreed(page);
>  }
>  
>  static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
> @@ -1044,6 +1060,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
>  
>  		list_move(&page->lru, &h->hugepage_activelist);
>  		set_page_refcounted(page);
> +		ClearPageHugeFreed(page);
>  		h->free_huge_pages--;
>  		h->free_huge_pages_node[nid]--;
>  		return page;
> @@ -1504,6 +1521,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
>  	spin_lock(&hugetlb_lock);
>  	h->nr_huge_pages++;
>  	h->nr_huge_pages_node[nid]++;
> +	ClearPageHugeFreed(page);
>  	spin_unlock(&hugetlb_lock);
>  }
>  
> @@ -1754,6 +1772,7 @@ int dissolve_free_huge_page(struct page *page)
>  {
>  	int rc = -EBUSY;
>  
> +retry:
>  	/* Not to disrupt normal path by vainly holding hugetlb_lock */
>  	if (!PageHuge(page))
>  		return 0;
> @@ -1770,6 +1789,26 @@ int dissolve_free_huge_page(struct page *page)
>  		int nid = page_to_nid(head);
>  		if (h->free_huge_pages - h->resv_huge_pages == 0)
>  			goto out;
> +
> +		/*
> +		 * We should make sure that the page is already on the free list
> +		 * when it is dissolved.
> +		 */
> +		if (unlikely(!PageHugeFreed(head))) {
> +			spin_unlock(&hugetlb_lock);
> +			cond_resched();
> +
> +			/*
> +			 * Theoretically, we should return -EBUSY when we
> +			 * encounter this race. In fact, we have a chance
> +			 * to successfully dissolve the page if we do a
> +			 * retry. Because the race window is quite small.
> +			 * If we seize this opportunity, it is an optimization
> +			 * for increasing the success rate of dissolving page.
> +			 */
> +			goto retry;
> +		}
> +
>  		/*
>  		 * Move PageHWPoison flag from head page to the raw error page,
>  		 * which makes any subpages rather than the error page reusable.
> -- 
> 2.11.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-01-18  9:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-15 12:49 [PATCH v6 0/5] Fix some bugs about HugeTLB code Muchun Song
2021-01-15 12:49 ` [PATCH v6 1/5] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
2021-01-15 12:49 ` [PATCH v6 2/5] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page Muchun Song
2021-01-15 12:49 ` [PATCH v6 3/5] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
2021-01-15 13:22   ` Oscar Salvador
2021-01-18  9:13   ` Michal Hocko
2021-01-15 12:49 ` [PATCH v6 4/5] mm: hugetlb: fix a race between isolating and freeing page Muchun Song
2021-01-15 13:26   ` Oscar Salvador
2021-01-15 12:49 ` [PATCH v6 5/5] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active Muchun Song
2021-01-15 13:27   ` Oscar Salvador

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.