linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/6] Fix some bugs about HugeTLB code
@ 2021-01-13  5:22 Muchun Song
  2021-01-13  5:22 ` [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Muchun Song @ 2021-01-13  5:22 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song

This patch series aims to fix some bugs and add some improvements.

Changelog since v3 -> v4:
  - Update commit log of patch #1.
  - Drop "mm: hugetlb: add return -EAGAIN for dissolve_free_huge_page"
  - Add "mm: hugetlb: retry dissolve page when hitting race"

Changelog since v2 -> v3:
  - Update commit log.
  - Using head[3].private to indicate the page is freed in patch #3.

Changelog since v1 -> v2:
  - Export set_page_huge_active() in patch #2 to fix.
  - Using head[3].mapping to indicate the page is freed in patch #3.
  - Flush @free_hpage_work in patch #4.

Muchun Song (6):
  mm: migrate: do not migrate HugeTLB page whose refcount is one
  mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
  mm: hugetlb: fix a race between freeing and dissolving the page
  mm: hugetlb: retry dissolve page when hitting race
  mm: hugetlb: fix a race between isolating and freeing page
  mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active

 fs/hugetlbfs/inode.c    |  3 ++-
 include/linux/hugetlb.h |  2 ++
 mm/hugetlb.c            | 51 ++++++++++++++++++++++++++++++++++++++++++++-----
 mm/migrate.c            |  6 ++++++
 4 files changed, 56 insertions(+), 6 deletions(-)

-- 
2.11.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one
  2021-01-13  5:22 [PATCH v4 0/6] Fix some bugs about HugeTLB code Muchun Song
@ 2021-01-13  5:22 ` Muchun Song
  2021-01-13 10:30   ` David Hildenbrand
  2021-01-13 10:57   ` Oscar Salvador
  2021-01-13  5:22 ` [PATCH v4 2/6] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page Muchun Song
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Muchun Song @ 2021-01-13  5:22 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song,
	Yang Shi, Michal Hocko

All pages isolated for the migration have an elevated reference count
and therefore seeing a reference count equal to 1 means that the last
user of the page has dropped the reference and the page has became
unused and there doesn't make much sense to migrate it anymore. This has
been done for regular pages and this patch does the same for hugetlb
pages. Although the likelyhood of the race is rather small for hugetlb
pages it makes sense the two code paths in sync.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Yang Shi <shy828301@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/migrate.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/migrate.c b/mm/migrate.c
index 4385f2fb5d18..a6631c4eb6a6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 		return -ENOSYS;
 	}
 
+	if (page_count(hpage) == 1) {
+		/* page was freed from under us. So we are done. */
+		putback_active_hugepage(hpage);
+		return MIGRATEPAGE_SUCCESS;
+	}
+
 	new_hpage = get_new_page(hpage, private);
 	if (!new_hpage)
 		return -ENOMEM;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 2/6] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
  2021-01-13  5:22 [PATCH v4 0/6] Fix some bugs about HugeTLB code Muchun Song
  2021-01-13  5:22 ` [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
@ 2021-01-13  5:22 ` Muchun Song
  2021-01-13 11:00   ` Oscar Salvador
  2021-01-13  5:22 ` [PATCH v4 3/6] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Muchun Song @ 2021-01-13  5:22 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song,
	Michal Hocko, stable

If a new hugetlb page is allocated during fallocate it will not be
marked as active (set_page_huge_active) which will result in a later
isolate_huge_page failure when the page migration code would like to
move that page. Such a failure would be unexpected and wrong.

Only export set_page_huge_active, just leave clear_page_huge_active
as static. Because there are no external users.

Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: stable@vger.kernel.org
---
 fs/hugetlbfs/inode.c    | 3 ++-
 include/linux/hugetlb.h | 2 ++
 mm/hugetlb.c            | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index b5c109703daa..21c20fd5f9ee 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -735,9 +735,10 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
+		set_page_huge_active(page);
 		/*
 		 * unlock_page because locked by add_to_page_cache()
-		 * page_put due to reference from alloc_huge_page()
+		 * put_page() due to reference from alloc_huge_page()
 		 */
 		unlock_page(page);
 		put_page(page);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ebca2ef02212..b5807f23caf8 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -770,6 +770,8 @@ static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 }
 #endif
 
+void set_page_huge_active(struct page *page);
+
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1f3bf1710b66..4741d60f8955 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1348,7 +1348,7 @@ bool page_huge_active(struct page *page)
 }
 
 /* never called for tail page */
-static void set_page_huge_active(struct page *page)
+void set_page_huge_active(struct page *page)
 {
 	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
 	SetPagePrivate(&page[1]);
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 3/6] mm: hugetlb: fix a race between freeing and dissolving the page
  2021-01-13  5:22 [PATCH v4 0/6] Fix some bugs about HugeTLB code Muchun Song
  2021-01-13  5:22 ` [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
  2021-01-13  5:22 ` [PATCH v4 2/6] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page Muchun Song
@ 2021-01-13  5:22 ` Muchun Song
  2021-01-13  9:31   ` Michal Hocko
  2021-01-13  5:22 ` [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race Muchun Song
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Muchun Song @ 2021-01-13  5:22 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song, stable

There is a race condition between __free_huge_page()
and dissolve_free_huge_page().

CPU0:                         CPU1:

// page_count(page) == 1
put_page(page)
  __free_huge_page(page)
                              dissolve_free_huge_page(page)
                                spin_lock(&hugetlb_lock)
                                // PageHuge(page) && !page_count(page)
                                update_and_free_page(page)
                                // page is freed to the buddy
                                spin_unlock(&hugetlb_lock)
    spin_lock(&hugetlb_lock)
    clear_page_huge_active(page)
    enqueue_huge_page(page)
    // It is wrong, the page is already freed
    spin_unlock(&hugetlb_lock)

The race windows is between put_page() and dissolve_free_huge_page().

We should make sure that the page is already on the free list
when it is dissolved.

Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: stable@vger.kernel.org
---
 mm/hugetlb.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4741d60f8955..4a9011e12175 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -79,6 +79,21 @@ DEFINE_SPINLOCK(hugetlb_lock);
 static int num_fault_mutexes;
 struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
+static inline bool PageHugeFreed(struct page *head)
+{
+	return page_private(head + 4) == -1UL;
+}
+
+static inline void SetPageHugeFreed(struct page *head)
+{
+	set_page_private(head + 4, -1UL);
+}
+
+static inline void ClearPageHugeFreed(struct page *head)
+{
+	set_page_private(head + 4, 0);
+}
+
 /* Forward declaration */
 static int hugetlb_acct_memory(struct hstate *h, long delta);
 
@@ -1028,6 +1043,7 @@ static void enqueue_huge_page(struct hstate *h, struct page *page)
 	list_move(&page->lru, &h->hugepage_freelists[nid]);
 	h->free_huge_pages++;
 	h->free_huge_pages_node[nid]++;
+	SetPageHugeFreed(page);
 }
 
 static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
@@ -1044,6 +1060,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
 
 		list_move(&page->lru, &h->hugepage_activelist);
 		set_page_refcounted(page);
+		ClearPageHugeFreed(page);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
 		return page;
@@ -1504,6 +1521,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 	spin_lock(&hugetlb_lock);
 	h->nr_huge_pages++;
 	h->nr_huge_pages_node[nid]++;
+	ClearPageHugeFreed(page);
 	spin_unlock(&hugetlb_lock);
 }
 
@@ -1770,6 +1788,14 @@ int dissolve_free_huge_page(struct page *page)
 		int nid = page_to_nid(head);
 		if (h->free_huge_pages - h->resv_huge_pages == 0)
 			goto out;
+
+		/*
+		 * We should make sure that the page is already on the free list
+		 * when it is dissolved.
+		 */
+		if (unlikely(!PageHugeFreed(head)))
+			goto out;
+
 		/*
 		 * Move PageHWPoison flag from head page to the raw error page,
 		 * which makes any subpages rather than the error page reusable.
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13  5:22 [PATCH v4 0/6] Fix some bugs about HugeTLB code Muchun Song
                   ` (2 preceding siblings ...)
  2021-01-13  5:22 ` [PATCH v4 3/6] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
@ 2021-01-13  5:22 ` Muchun Song
  2021-01-13  9:33   ` Michal Hocko
  2021-01-13  5:22 ` [PATCH v4 5/6] mm: hugetlb: fix a race between isolating and freeing page Muchun Song
  2021-01-13  5:22 ` [PATCH v4 6/6] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active Muchun Song
  5 siblings, 1 reply; 21+ messages in thread
From: Muchun Song @ 2021-01-13  5:22 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song

There is a race between dissolve_free_huge_page() and put_page().
Theoretically, we should return -EBUSY when we encounter this race.
In fact, we have a chance to successfully dissolve the page if we
do a retry. Because the race window is quite small. If we seize
this opportunity, it is an optimization for increasing the success
rate of dissolving page.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/hugetlb.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4a9011e12175..898e4ea43e13 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1772,6 +1772,7 @@ int dissolve_free_huge_page(struct page *page)
 {
 	int rc = -EBUSY;
 
+retry:
 	/* Not to disrupt normal path by vainly holding hugetlb_lock */
 	if (!PageHuge(page))
 		return 0;
@@ -1793,8 +1794,23 @@ int dissolve_free_huge_page(struct page *page)
 		 * We should make sure that the page is already on the free list
 		 * when it is dissolved.
 		 */
-		if (unlikely(!PageHugeFreed(head)))
-			goto out;
+		if (unlikely(!PageHugeFreed(head))) {
+			spin_unlock(&hugetlb_lock);
+
+			/*
+			 * Theoretically, we should return -EBUSY when we
+			 * encounter this race. In fact, we have a chance
+			 * to successfully dissolve the page if we do a
+			 * retry. Because the race window is quite small.
+			 * If we seize this opportunity, it is an optimization
+			 * for increasing the success rate of dissolving page.
+			 */
+			while (PageHeadHuge(head) && !PageHugeFreed(head)) {
+				cond_resched();
+				cpu_relax();
+			}
+			goto retry;
+		}
 
 		/*
 		 * Move PageHWPoison flag from head page to the raw error page,
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 5/6] mm: hugetlb: fix a race between isolating and freeing page
  2021-01-13  5:22 [PATCH v4 0/6] Fix some bugs about HugeTLB code Muchun Song
                   ` (3 preceding siblings ...)
  2021-01-13  5:22 ` [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race Muchun Song
@ 2021-01-13  5:22 ` Muchun Song
  2021-01-13  5:22 ` [PATCH v4 6/6] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active Muchun Song
  5 siblings, 0 replies; 21+ messages in thread
From: Muchun Song @ 2021-01-13  5:22 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song,
	Michal Hocko, stable

There is a race between isolate_huge_page() and __free_huge_page().

CPU0:                                       CPU1:

if (PageHuge(page))
                                            put_page(page)
                                              __free_huge_page(page)
                                                  spin_lock(&hugetlb_lock)
                                                  update_and_free_page(page)
                                                    set_compound_page_dtor(page,
                                                      NULL_COMPOUND_DTOR)
                                                  spin_unlock(&hugetlb_lock)
  isolate_huge_page(page)
    // trigger BUG_ON
    VM_BUG_ON_PAGE(!PageHead(page), page)
    spin_lock(&hugetlb_lock)
    page_huge_active(page)
      // trigger BUG_ON
      VM_BUG_ON_PAGE(!PageHuge(page), page)
    spin_unlock(&hugetlb_lock)

When we isolate a HugeTLB page on CPU0. Meanwhile, we free it to the
buddy allocator on CPU1. Then, we can trigger a BUG_ON on CPU0. Because
it is already freed to the buddy allocator.

Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: stable@vger.kernel.org
---
 mm/hugetlb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 898e4ea43e13..a7ed22811672 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5575,9 +5575,9 @@ bool isolate_huge_page(struct page *page, struct list_head *list)
 {
 	bool ret = true;
 
-	VM_BUG_ON_PAGE(!PageHead(page), page);
 	spin_lock(&hugetlb_lock);
-	if (!page_huge_active(page) || !get_page_unless_zero(page)) {
+	if (!PageHeadHuge(page) || !page_huge_active(page) ||
+	    !get_page_unless_zero(page)) {
 		ret = false;
 		goto unlock;
 	}
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 6/6] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
  2021-01-13  5:22 [PATCH v4 0/6] Fix some bugs about HugeTLB code Muchun Song
                   ` (4 preceding siblings ...)
  2021-01-13  5:22 ` [PATCH v4 5/6] mm: hugetlb: fix a race between isolating and freeing page Muchun Song
@ 2021-01-13  5:22 ` Muchun Song
  5 siblings, 0 replies; 21+ messages in thread
From: Muchun Song @ 2021-01-13  5:22 UTC (permalink / raw)
  To: mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Muchun Song,
	Michal Hocko, stable

The page_huge_active() can be called from scan_movable_pages() which
do not hold a reference count to the HugeTLB page. So when we call
page_huge_active() from scan_movable_pages(), the HugeTLB page can
be freed parallel. Then we will trigger a BUG_ON which is in the
page_huge_active() when CONFIG_DEBUG_VM is enabled. Just remove the
VM_BUG_ON_PAGE.

Fixes: 7e1f049efb86 ("mm: hugetlb: cleanup using paeg_huge_active()")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: stable@vger.kernel.org
---
 mm/hugetlb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a7ed22811672..8c6005a538a2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1360,8 +1360,7 @@ struct hstate *size_to_hstate(unsigned long size)
  */
 bool page_huge_active(struct page *page)
 {
-	VM_BUG_ON_PAGE(!PageHuge(page), page);
-	return PageHead(page) && PagePrivate(&page[1]);
+	return PageHeadHuge(page) && PagePrivate(&page[1]);
 }
 
 /* never called for tail page */
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 3/6] mm: hugetlb: fix a race between freeing and dissolving the page
  2021-01-13  5:22 ` [PATCH v4 3/6] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
@ 2021-01-13  9:31   ` Michal Hocko
  0 siblings, 0 replies; 21+ messages in thread
From: Michal Hocko @ 2021-01-13  9:31 UTC (permalink / raw)
  To: Muchun Song
  Cc: mike.kravetz, akpm, n-horiguchi, ak, linux-mm, linux-kernel, stable

On Wed 13-01-21 13:22:06, Muchun Song wrote:
> There is a race condition between __free_huge_page()
> and dissolve_free_huge_page().
> 
> CPU0:                         CPU1:
> 
> // page_count(page) == 1
> put_page(page)
>   __free_huge_page(page)
>                               dissolve_free_huge_page(page)
>                                 spin_lock(&hugetlb_lock)
>                                 // PageHuge(page) && !page_count(page)
>                                 update_and_free_page(page)
>                                 // page is freed to the buddy
>                                 spin_unlock(&hugetlb_lock)
>     spin_lock(&hugetlb_lock)
>     clear_page_huge_active(page)
>     enqueue_huge_page(page)
>     // It is wrong, the page is already freed
>     spin_unlock(&hugetlb_lock)
> 
> The race windows is between put_page() and dissolve_free_huge_page().
> 
> We should make sure that the page is already on the free list
> when it is dissolved.

Please describe the effect of the bug.
"
As a result __free_huge_page would corrupt page(s) already in the buddy
allocator.
"

> 
> Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: stable@vger.kernel.org
[...]
> @@ -1770,6 +1788,14 @@ int dissolve_free_huge_page(struct page *page)
>  		int nid = page_to_nid(head);
>  		if (h->free_huge_pages - h->resv_huge_pages == 0)
>  			goto out;
> +
> +		/*
> +		 * We should make sure that the page is already on the free list
> +		 * when it is dissolved.
> +		 */
> +		if (unlikely(!PageHugeFreed(head)))
> +			goto out;

I believe we have agreed to retry for this temporary state.

> +
>  		/*
>  		 * Move PageHWPoison flag from head page to the raw error page,
>  		 * which makes any subpages rather than the error page reusable.
> -- 
> 2.11.0

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13  5:22 ` [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race Muchun Song
@ 2021-01-13  9:33   ` Michal Hocko
  2021-01-13 10:14     ` [External] " Muchun Song
  0 siblings, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2021-01-13  9:33 UTC (permalink / raw)
  To: Muchun Song; +Cc: mike.kravetz, akpm, n-horiguchi, ak, linux-mm, linux-kernel

On Wed 13-01-21 13:22:07, Muchun Song wrote:
> There is a race between dissolve_free_huge_page() and put_page().
> Theoretically, we should return -EBUSY when we encounter this race.
> In fact, we have a chance to successfully dissolve the page if we
> do a retry. Because the race window is quite small. If we seize
> this opportunity, it is an optimization for increasing the success
> rate of dissolving page.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  mm/hugetlb.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 4a9011e12175..898e4ea43e13 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1772,6 +1772,7 @@ int dissolve_free_huge_page(struct page *page)
>  {
>  	int rc = -EBUSY;
>  
> +retry:
>  	/* Not to disrupt normal path by vainly holding hugetlb_lock */
>  	if (!PageHuge(page))
>  		return 0;
> @@ -1793,8 +1794,23 @@ int dissolve_free_huge_page(struct page *page)
>  		 * We should make sure that the page is already on the free list
>  		 * when it is dissolved.
>  		 */
> -		if (unlikely(!PageHugeFreed(head)))
> -			goto out;
> +		if (unlikely(!PageHugeFreed(head))) {
> +			spin_unlock(&hugetlb_lock);
> +
> +			/*
> +			 * Theoretically, we should return -EBUSY when we
> +			 * encounter this race. In fact, we have a chance
> +			 * to successfully dissolve the page if we do a
> +			 * retry. Because the race window is quite small.
> +			 * If we seize this opportunity, it is an optimization
> +			 * for increasing the success rate of dissolving page.
> +			 */
> +			while (PageHeadHuge(head) && !PageHugeFreed(head)) {
> +				cond_resched();
> +				cpu_relax();
> +			}
> +			goto retry;

OK, so you have done the retry here. Please fold it into the previous
patch. Also do we need cpu_relax on top of cond_resched as well?

> +		}
>  
>  		/*
>  		 * Move PageHWPoison flag from head page to the raw error page,
> -- 
> 2.11.0

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13  9:33   ` Michal Hocko
@ 2021-01-13 10:14     ` Muchun Song
  2021-01-13 10:38       ` Michal Hocko
  0 siblings, 1 reply; 21+ messages in thread
From: Muchun Song @ 2021-01-13 10:14 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Kravetz, Andrew Morton, Naoya Horiguchi, Andi Kleen,
	Linux Memory Management List, LKML

On Wed, Jan 13, 2021 at 5:33 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 13-01-21 13:22:07, Muchun Song wrote:
> > There is a race between dissolve_free_huge_page() and put_page().
> > Theoretically, we should return -EBUSY when we encounter this race.
> > In fact, we have a chance to successfully dissolve the page if we
> > do a retry. Because the race window is quite small. If we seize
> > this opportunity, it is an optimization for increasing the success
> > rate of dissolving page.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  mm/hugetlb.c | 20 ++++++++++++++++++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 4a9011e12175..898e4ea43e13 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1772,6 +1772,7 @@ int dissolve_free_huge_page(struct page *page)
> >  {
> >       int rc = -EBUSY;
> >
> > +retry:
> >       /* Not to disrupt normal path by vainly holding hugetlb_lock */
> >       if (!PageHuge(page))
> >               return 0;
> > @@ -1793,8 +1794,23 @@ int dissolve_free_huge_page(struct page *page)
> >                * We should make sure that the page is already on the free list
> >                * when it is dissolved.
> >                */
> > -             if (unlikely(!PageHugeFreed(head)))
> > -                     goto out;
> > +             if (unlikely(!PageHugeFreed(head))) {
> > +                     spin_unlock(&hugetlb_lock);
> > +
> > +                     /*
> > +                      * Theoretically, we should return -EBUSY when we
> > +                      * encounter this race. In fact, we have a chance
> > +                      * to successfully dissolve the page if we do a
> > +                      * retry. Because the race window is quite small.
> > +                      * If we seize this opportunity, it is an optimization
> > +                      * for increasing the success rate of dissolving page.
> > +                      */
> > +                     while (PageHeadHuge(head) && !PageHugeFreed(head)) {
> > +                             cond_resched();
> > +                             cpu_relax();
> > +                     }
> > +                     goto retry;
>
> OK, so you have done the retry here. Please fold it into the previous
> patch. Also do we need cpu_relax on top of cond_resched as well?

Because the previous patch is a bugfix and should be backprt to the other
stable tree, right? I just want the fix patch to be small enough.
So I do the retry in this patch. If you do not agree with this. I
will fold this into the previous patch.

Do you mean this?

cpu_relax();
cond_resched();
cpu_relax();

If yes, IMHO, I don't think it is necessary.

>
> > +             }
> >
> >               /*
> >                * Move PageHWPoison flag from head page to the raw error page,
> > --
> > 2.11.0
>
> --
> Michal Hocko
> SUSE Labs


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one
  2021-01-13  5:22 ` [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
@ 2021-01-13 10:30   ` David Hildenbrand
  2021-01-13 10:57   ` Oscar Salvador
  1 sibling, 0 replies; 21+ messages in thread
From: David Hildenbrand @ 2021-01-13 10:30 UTC (permalink / raw)
  To: Muchun Song, mike.kravetz, akpm
  Cc: n-horiguchi, ak, mhocko, linux-mm, linux-kernel, Yang Shi, Michal Hocko

On 13.01.21 06:22, Muchun Song wrote:
> All pages isolated for the migration have an elevated reference count
> and therefore seeing a reference count equal to 1 means that the last
> user of the page has dropped the reference and the page has became
> unused and there doesn't make much sense to migrate it anymore. This has
> been done for regular pages and this patch does the same for hugetlb
> pages. Although the likelyhood of the race is rather small for hugetlb
> pages it makes sense the two code paths in sync.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Acked-by: Yang Shi <shy828301@gmail.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/migrate.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 4385f2fb5d18..a6631c4eb6a6 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>  		return -ENOSYS;
>  	}
>  
> +	if (page_count(hpage) == 1) {
> +		/* page was freed from under us. So we are done. */
> +		putback_active_hugepage(hpage);
> +		return MIGRATEPAGE_SUCCESS;
> +	}
> +
>  	new_hpage = get_new_page(hpage, private);
>  	if (!new_hpage)
>  		return -ENOMEM;
> 

Happy to say that I now know understand what's going on here

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13 10:14     ` [External] " Muchun Song
@ 2021-01-13 10:38       ` Michal Hocko
  2021-01-13 11:11         ` Muchun Song
  0 siblings, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2021-01-13 10:38 UTC (permalink / raw)
  To: Muchun Song
  Cc: Mike Kravetz, Andrew Morton, Naoya Horiguchi, Andi Kleen,
	Linux Memory Management List, LKML

On Wed 13-01-21 18:14:55, Muchun Song wrote:
> On Wed, Jan 13, 2021 at 5:33 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 13-01-21 13:22:07, Muchun Song wrote:
> > > There is a race between dissolve_free_huge_page() and put_page().
> > > Theoretically, we should return -EBUSY when we encounter this race.
> > > In fact, we have a chance to successfully dissolve the page if we
> > > do a retry. Because the race window is quite small. If we seize
> > > this opportunity, it is an optimization for increasing the success
> > > rate of dissolving page.
> > >
> > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > ---
> > >  mm/hugetlb.c | 20 ++++++++++++++++++--
> > >  1 file changed, 18 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 4a9011e12175..898e4ea43e13 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -1772,6 +1772,7 @@ int dissolve_free_huge_page(struct page *page)
> > >  {
> > >       int rc = -EBUSY;
> > >
> > > +retry:
> > >       /* Not to disrupt normal path by vainly holding hugetlb_lock */
> > >       if (!PageHuge(page))
> > >               return 0;
> > > @@ -1793,8 +1794,23 @@ int dissolve_free_huge_page(struct page *page)
> > >                * We should make sure that the page is already on the free list
> > >                * when it is dissolved.
> > >                */
> > > -             if (unlikely(!PageHugeFreed(head)))
> > > -                     goto out;
> > > +             if (unlikely(!PageHugeFreed(head))) {
> > > +                     spin_unlock(&hugetlb_lock);
> > > +
> > > +                     /*
> > > +                      * Theoretically, we should return -EBUSY when we
> > > +                      * encounter this race. In fact, we have a chance
> > > +                      * to successfully dissolve the page if we do a
> > > +                      * retry. Because the race window is quite small.
> > > +                      * If we seize this opportunity, it is an optimization
> > > +                      * for increasing the success rate of dissolving page.
> > > +                      */
> > > +                     while (PageHeadHuge(head) && !PageHugeFreed(head)) {
> > > +                             cond_resched();
> > > +                             cpu_relax();
> > > +                     }
> > > +                     goto retry;
> >
> > OK, so you have done the retry here. Please fold it into the previous
> > patch. Also do we need cpu_relax on top of cond_resched as well?
> 
> Because the previous patch is a bugfix and should be backprt to the other
> stable tree, right?

Yes, it is a bugfix but it arguably opens another issue so the follow up
patch should better be applied along with it.

> I just want the fix patch to be small enough.
> So I do the retry in this patch. If you do not agree with this. I
> will fold this into the previous patch.
> 
> Do you mean this?
> 
> cpu_relax();
> cond_resched();
> cpu_relax();

No, I am questiong the use of cpu_relax. What is the point?

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one
  2021-01-13  5:22 ` [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
  2021-01-13 10:30   ` David Hildenbrand
@ 2021-01-13 10:57   ` Oscar Salvador
  2021-01-13 11:03     ` [External] " Muchun Song
  1 sibling, 1 reply; 21+ messages in thread
From: Oscar Salvador @ 2021-01-13 10:57 UTC (permalink / raw)
  To: Muchun Song
  Cc: mike.kravetz, akpm, n-horiguchi, ak, mhocko, linux-mm,
	linux-kernel, Yang Shi, Michal Hocko

On Wed, Jan 13, 2021 at 01:22:04PM +0800, Muchun Song wrote:
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 4385f2fb5d18..a6631c4eb6a6 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>  		return -ENOSYS;
>  	}
>  
> +	if (page_count(hpage) == 1) {
> +		/* page was freed from under us. So we are done. */
> +		putback_active_hugepage(hpage);
> +		return MIGRATEPAGE_SUCCESS;
> +	}
> +

I was a bit puzzled as why we did not go to the "goto" block that already does
this, but then I saw the put_new_page/new_hpage handlind further down.
It could be re-arranged but out of scope, so:

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/6] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
  2021-01-13  5:22 ` [PATCH v4 2/6] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page Muchun Song
@ 2021-01-13 11:00   ` Oscar Salvador
  0 siblings, 0 replies; 21+ messages in thread
From: Oscar Salvador @ 2021-01-13 11:00 UTC (permalink / raw)
  To: Muchun Song
  Cc: mike.kravetz, akpm, n-horiguchi, ak, mhocko, linux-mm,
	linux-kernel, Michal Hocko, stable

On Wed, Jan 13, 2021 at 01:22:05PM +0800, Muchun Song wrote:
> If a new hugetlb page is allocated during fallocate it will not be
> marked as active (set_page_huge_active) which will result in a later
> isolate_huge_page failure when the page migration code would like to
> move that page. Such a failure would be unexpected and wrong.
> 
> Only export set_page_huge_active, just leave clear_page_huge_active
> as static. Because there are no external users.
> 
> Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one
  2021-01-13 10:57   ` Oscar Salvador
@ 2021-01-13 11:03     ` Muchun Song
  0 siblings, 0 replies; 21+ messages in thread
From: Muchun Song @ 2021-01-13 11:03 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Mike Kravetz, Andrew Morton, Naoya Horiguchi, Andi Kleen, mhocko,
	Linux Memory Management List, LKML, Yang Shi, Michal Hocko

On Wed, Jan 13, 2021 at 6:57 PM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Wed, Jan 13, 2021 at 01:22:04PM +0800, Muchun Song wrote:
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 4385f2fb5d18..a6631c4eb6a6 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> >               return -ENOSYS;
> >       }
> >
> > +     if (page_count(hpage) == 1) {
> > +             /* page was freed from under us. So we are done. */
> > +             putback_active_hugepage(hpage);
> > +             return MIGRATEPAGE_SUCCESS;
> > +     }
> > +
>
> I was a bit puzzled as why we did not go to the "goto" block that already does
> this, but then I saw the put_new_page/new_hpage handlind further down.
> It could be re-arranged but out of scope, so:

Agree. I also thought about this. :)

>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>

Thanks.

>
> --
> Oscar Salvador
> SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13 10:38       ` Michal Hocko
@ 2021-01-13 11:11         ` Muchun Song
  2021-01-13 11:14           ` Oscar Salvador
  2021-01-13 11:22           ` Michal Hocko
  0 siblings, 2 replies; 21+ messages in thread
From: Muchun Song @ 2021-01-13 11:11 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Kravetz, Andrew Morton, Naoya Horiguchi, Andi Kleen,
	Linux Memory Management List, LKML

On Wed, Jan 13, 2021 at 6:38 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 13-01-21 18:14:55, Muchun Song wrote:
> > On Wed, Jan 13, 2021 at 5:33 PM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Wed 13-01-21 13:22:07, Muchun Song wrote:
> > > > There is a race between dissolve_free_huge_page() and put_page().
> > > > Theoretically, we should return -EBUSY when we encounter this race.
> > > > In fact, we have a chance to successfully dissolve the page if we
> > > > do a retry. Because the race window is quite small. If we seize
> > > > this opportunity, it is an optimization for increasing the success
> > > > rate of dissolving page.
> > > >
> > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > > ---
> > > >  mm/hugetlb.c | 20 ++++++++++++++++++--
> > > >  1 file changed, 18 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > > index 4a9011e12175..898e4ea43e13 100644
> > > > --- a/mm/hugetlb.c
> > > > +++ b/mm/hugetlb.c
> > > > @@ -1772,6 +1772,7 @@ int dissolve_free_huge_page(struct page *page)
> > > >  {
> > > >       int rc = -EBUSY;
> > > >
> > > > +retry:
> > > >       /* Not to disrupt normal path by vainly holding hugetlb_lock */
> > > >       if (!PageHuge(page))
> > > >               return 0;
> > > > @@ -1793,8 +1794,23 @@ int dissolve_free_huge_page(struct page *page)
> > > >                * We should make sure that the page is already on the free list
> > > >                * when it is dissolved.
> > > >                */
> > > > -             if (unlikely(!PageHugeFreed(head)))
> > > > -                     goto out;
> > > > +             if (unlikely(!PageHugeFreed(head))) {
> > > > +                     spin_unlock(&hugetlb_lock);
> > > > +
> > > > +                     /*
> > > > +                      * Theoretically, we should return -EBUSY when we
> > > > +                      * encounter this race. In fact, we have a chance
> > > > +                      * to successfully dissolve the page if we do a
> > > > +                      * retry. Because the race window is quite small.
> > > > +                      * If we seize this opportunity, it is an optimization
> > > > +                      * for increasing the success rate of dissolving page.
> > > > +                      */
> > > > +                     while (PageHeadHuge(head) && !PageHugeFreed(head)) {
> > > > +                             cond_resched();
> > > > +                             cpu_relax();
> > > > +                     }
> > > > +                     goto retry;
> > >
> > > OK, so you have done the retry here. Please fold it into the previous
> > > patch. Also do we need cpu_relax on top of cond_resched as well?
> >
> > Because the previous patch is a bugfix and should be backprt to the other
> > stable tree, right?
>
> Yes, it is a bugfix but it arguably opens another issue so the follow up
> patch should better be applied along with it.

OK. I will fold this one into the previous one. Thanks.

>
> > I just want the fix patch to be small enough.
> > So I do the retry in this patch. If you do not agree with this. I
> > will fold this into the previous patch.
> >
> > Do you mean this?
> >
> > cpu_relax();
> > cond_resched();
> > cpu_relax();
>
> No, I am questiong the use of cpu_relax. What is the point?

If there is no task to be scheduled. Here is just a while loop.
The cpu_relax is a good thing to insert into busy-wait loops,
right?


>
> --
> Michal Hocko
> SUSE Labs


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13 11:11         ` Muchun Song
@ 2021-01-13 11:14           ` Oscar Salvador
  2021-01-13 11:20             ` Muchun Song
  2021-01-13 11:22           ` Michal Hocko
  1 sibling, 1 reply; 21+ messages in thread
From: Oscar Salvador @ 2021-01-13 11:14 UTC (permalink / raw)
  To: Muchun Song
  Cc: Michal Hocko, Mike Kravetz, Andrew Morton, Naoya Horiguchi,
	Andi Kleen, Linux Memory Management List, LKML

On Wed, Jan 13, 2021 at 07:11:06PM +0800, Muchun Song wrote:
> If there is no task to be scheduled. Here is just a while loop.
> The cpu_relax is a good thing to insert into busy-wait loops,
> right?

But if the race window is that small, does it make sense?

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13 11:14           ` Oscar Salvador
@ 2021-01-13 11:20             ` Muchun Song
  2021-01-13 12:03               ` Michal Hocko
  0 siblings, 1 reply; 21+ messages in thread
From: Muchun Song @ 2021-01-13 11:20 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Michal Hocko, Mike Kravetz, Andrew Morton, Naoya Horiguchi,
	Andi Kleen, Linux Memory Management List, LKML

On Wed, Jan 13, 2021 at 7:15 PM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Wed, Jan 13, 2021 at 07:11:06PM +0800, Muchun Song wrote:
> > If there is no task to be scheduled. Here is just a while loop.
> > The cpu_relax is a good thing to insert into busy-wait loops,
> > right?
>
> But if the race window is that small, does it make sense?

Actually, there is one exception. The race window could
become larger. If the page is freed via a workqueue (see
free_huge_page()). In this case, the cpu_relax() can
make sense. Right?

>
> --
> Oscar Salvador
> SUSE L3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13 11:11         ` Muchun Song
  2021-01-13 11:14           ` Oscar Salvador
@ 2021-01-13 11:22           ` Michal Hocko
  2021-01-13 12:15             ` Muchun Song
  1 sibling, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2021-01-13 11:22 UTC (permalink / raw)
  To: Muchun Song
  Cc: Mike Kravetz, Andrew Morton, Naoya Horiguchi, Andi Kleen,
	Linux Memory Management List, LKML

On Wed 13-01-21 19:11:06, Muchun Song wrote:
> On Wed, Jan 13, 2021 at 6:38 PM Michal Hocko <mhocko@suse.com> wrote:
[...]
> > > I just want the fix patch to be small enough.
> > > So I do the retry in this patch. If you do not agree with this. I
> > > will fold this into the previous patch.
> > >
> > > Do you mean this?
> > >
> > > cpu_relax();
> > > cond_resched();
> > > cpu_relax();
> >
> > No, I am questiong the use of cpu_relax. What is the point?
> 
> If there is no task to be scheduled. Here is just a while loop.
> The cpu_relax is a good thing to insert into busy-wait loops,
> right?

Well in an ideal world we would simply have a way to block and wait for
the particular page. This is probably an overkill for a rare event like
this. And while you are right that theoretically there might be nobody
else to run but I find it rather unlikely considering that this path is
racing with somebody. Sure there is even less likely possibility that
the race is actually waiting for worker context but really I would just
make it simple retry loop. If we ever hit a real busy loop then this
would be pretty straightforward to spot and fix up.

It's not like I am against the patch with cpu_relax but I find it
excessive for this purpose. If you feel strongly then just keep it.

Once the two patches are squashed I will ack it.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13 11:20             ` Muchun Song
@ 2021-01-13 12:03               ` Michal Hocko
  0 siblings, 0 replies; 21+ messages in thread
From: Michal Hocko @ 2021-01-13 12:03 UTC (permalink / raw)
  To: Muchun Song
  Cc: Oscar Salvador, Mike Kravetz, Andrew Morton, Naoya Horiguchi,
	Andi Kleen, Linux Memory Management List, LKML

On Wed 13-01-21 19:20:17, Muchun Song wrote:
> On Wed, Jan 13, 2021 at 7:15 PM Oscar Salvador <osalvador@suse.de> wrote:
> >
> > On Wed, Jan 13, 2021 at 07:11:06PM +0800, Muchun Song wrote:
> > > If there is no task to be scheduled. Here is just a while loop.
> > > The cpu_relax is a good thing to insert into busy-wait loops,
> > > right?
> >
> > But if the race window is that small, does it make sense?
> 
> Actually, there is one exception. The race window could
> become larger. If the page is freed via a workqueue (see
> free_huge_page()). In this case, the cpu_relax() can
> make sense. Right?

The system would have to be under serious stress for WQ to clog. I do
not expect there would be nothing runable at that stage. Possible?
Maybe but if that matters than a short sleep would be more preferable
than cpu_relax.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [External] Re: [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race
  2021-01-13 11:22           ` Michal Hocko
@ 2021-01-13 12:15             ` Muchun Song
  0 siblings, 0 replies; 21+ messages in thread
From: Muchun Song @ 2021-01-13 12:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Mike Kravetz, Andrew Morton, Naoya Horiguchi, Andi Kleen,
	Linux Memory Management List, LKML

On Wed, Jan 13, 2021 at 7:22 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 13-01-21 19:11:06, Muchun Song wrote:
> > On Wed, Jan 13, 2021 at 6:38 PM Michal Hocko <mhocko@suse.com> wrote:
> [...]
> > > > I just want the fix patch to be small enough.
> > > > So I do the retry in this patch. If you do not agree with this. I
> > > > will fold this into the previous patch.
> > > >
> > > > Do you mean this?
> > > >
> > > > cpu_relax();
> > > > cond_resched();
> > > > cpu_relax();
> > >
> > > No, I am questiong the use of cpu_relax. What is the point?
> >
> > If there is no task to be scheduled. Here is just a while loop.
> > The cpu_relax is a good thing to insert into busy-wait loops,
> > right?
>
> Well in an ideal world we would simply have a way to block and wait for
> the particular page. This is probably an overkill for a rare event like
> this. And while you are right that theoretically there might be nobody
> else to run but I find it rather unlikely considering that this path is
> racing with somebody. Sure there is even less likely possibility that
> the race is actually waiting for worker context but really I would just
> make it simple retry loop. If we ever hit a real busy loop then this
> would be pretty straightforward to spot and fix up.
>
> It's not like I am against the patch with cpu_relax but I find it
> excessive for this purpose. If you feel strongly then just keep it.
>
> Once the two patches are squashed I will ack it.

OK. I will do this. Thanks.

> --
> Michal Hocko
> SUSE Labs


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-01-13 12:16 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-13  5:22 [PATCH v4 0/6] Fix some bugs about HugeTLB code Muchun Song
2021-01-13  5:22 ` [PATCH v4 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one Muchun Song
2021-01-13 10:30   ` David Hildenbrand
2021-01-13 10:57   ` Oscar Salvador
2021-01-13 11:03     ` [External] " Muchun Song
2021-01-13  5:22 ` [PATCH v4 2/6] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page Muchun Song
2021-01-13 11:00   ` Oscar Salvador
2021-01-13  5:22 ` [PATCH v4 3/6] mm: hugetlb: fix a race between freeing and dissolving the page Muchun Song
2021-01-13  9:31   ` Michal Hocko
2021-01-13  5:22 ` [PATCH v4 4/6] mm: hugetlb: retry dissolve page when hitting race Muchun Song
2021-01-13  9:33   ` Michal Hocko
2021-01-13 10:14     ` [External] " Muchun Song
2021-01-13 10:38       ` Michal Hocko
2021-01-13 11:11         ` Muchun Song
2021-01-13 11:14           ` Oscar Salvador
2021-01-13 11:20             ` Muchun Song
2021-01-13 12:03               ` Michal Hocko
2021-01-13 11:22           ` Michal Hocko
2021-01-13 12:15             ` Muchun Song
2021-01-13  5:22 ` [PATCH v4 5/6] mm: hugetlb: fix a race between isolating and freeing page Muchun Song
2021-01-13  5:22 ` [PATCH v4 6/6] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active Muchun Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).