linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] mm: misc migrate cleanup and improvement
@ 2020-11-03 13:03 Yang Shi
  2020-11-03 13:03 ` [PATCH 1/5] mm: truncate_complete_page is not existed anymore Yang Shi
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Yang Shi @ 2020-11-03 13:03 UTC (permalink / raw)
  To: mhocko, ziy, songliubraving, mgorman, jack, akpm
  Cc: shy828301, linux-mm, linux-kernel


Some miscellaneous migrate code cleanups and improvements.

Yang Shi (5):
      mm: truncate_complete_page is not existed anymore
      mm: migrate: simplify the logic for handling permanent failure
      mm: migrate: skip shared exec THP for NUMA balancing
      mm: migrate: clean up migrate_prep{_local}
      mm: migrate: return -ENOSYS if THP migration is unsupported

 include/linux/migrate.h |   4 +--
 mm/mempolicy.c          |   8 ++----
 mm/migrate.c            | 148 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
 mm/vmscan.c             |   2 +-
 4 files changed, 99 insertions(+), 63 deletions(-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] mm: truncate_complete_page is not existed anymore
  2020-11-03 13:03 [PATCH 0/5] mm: misc migrate cleanup and improvement Yang Shi
@ 2020-11-03 13:03 ` Yang Shi
  2020-11-06  9:16   ` Jan Kara
  2020-11-03 13:03 ` [PATCH 2/5] mm: migrate: simplify the logic for handling permanent failure Yang Shi
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Yang Shi @ 2020-11-03 13:03 UTC (permalink / raw)
  To: mhocko, ziy, songliubraving, mgorman, jack, akpm
  Cc: shy828301, linux-mm, linux-kernel

Commit 9f4e41f47178 ("mm: refactor truncate_complete_page()") refactored
truncate_complete_page() away, so the function no longer exists.  Correct
the comments in vmscan and migrate that still refer to it to avoid
confusion.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/migrate.c | 2 +-
 mm/vmscan.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 5ca5842df5db..8a2e7e19e27b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1106,7 +1106,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	 * and treated as swapcache but it has no rmap yet.
 	 * Calling try_to_unmap() against a page->mapping==NULL page will
 	 * trigger a BUG.  So handle it here.
-	 * 2. An orphaned page (see truncate_complete_page) might have
+	 * 2. An orphaned page (see truncate_cleanup_page) might have
 	 * fs-private metadata. The page can be picked up due to memory
 	 * offlining.  Everywhere else except page reclaim, the page is
 	 * invisible to the vm, so the page can not be migrated.  So try to
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1b8f0e059767..165cca87edc8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1393,7 +1393,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		 *
 		 * Rarely, pages can have buffers and no ->mapping.  These are
 		 * the pages which were not successfully invalidated in
-		 * truncate_complete_page().  We try to drop those buffers here
+		 * truncate_cleanup_page().  We try to drop those buffers here
 		 * and if that worked, and the page is no longer mapped into
 		 * process address space (page_count == 1) it can be freed.
 		 * Otherwise, leave the page on the LRU so it is swappable.
-- 
2.26.2



* [PATCH 2/5] mm: migrate: simplify the logic for handling permanent failure
  2020-11-03 13:03 [PATCH 0/5] mm: misc migrate cleanup and improvement Yang Shi
  2020-11-03 13:03 ` [PATCH 1/5] mm: truncate_complete_page is not existed anymore Yang Shi
@ 2020-11-03 13:03 ` Yang Shi
  2020-11-06 20:03   ` Zi Yan
  2020-11-03 13:03 ` [PATCH 3/5] mm: migrate: skip shared exec THP for NUMA balancing Yang Shi
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Yang Shi @ 2020-11-03 13:03 UTC (permalink / raw)
  To: mhocko, ziy, songliubraving, mgorman, jack, akpm
  Cc: shy828301, linux-mm, linux-kernel

When unmap_and_move{_huge_page}() returns something other than -EAGAIN or
MIGRATEPAGE_SUCCESS, the page is put back to the LRU, or to the proper
list if it is a non-LRU movable page.  But the callers always call
putback_movable_pages() later on to put the failed pages back anyway, so
putting every single page back immediately is not very efficient, and the
code looks convoluted.

Instead, put each failed page on a separate list, then splice that list
back onto the migrate list once all pages have been tried.  It remains
the caller's responsibility to call putback_movable_pages() to handle
the failures.  This also makes the code simpler and more readable.

After the change the rules are:
    * Success: a non-hugetlb page is freed, a hugetlb page is put
               back
    * -EAGAIN: the page stays on the from list
    * -ENOMEM: the page stays on the from list
    * Other errno: the page is put on the ret_pages list, then spliced
                   back to the from list

The from list is empty iff all pages were migrated successfully; that
was not the case before.  This has no impact on existing callsites.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/migrate.c | 58 ++++++++++++++++++++++++++--------------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 8a2e7e19e27b..c33c92495ead 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1169,7 +1169,8 @@ static int unmap_and_move(new_page_t get_new_page,
 				   free_page_t put_new_page,
 				   unsigned long private, struct page *page,
 				   int force, enum migrate_mode mode,
-				   enum migrate_reason reason)
+				   enum migrate_reason reason,
+				   struct list_head *ret)
 {
 	int rc = MIGRATEPAGE_SUCCESS;
 	struct page *newpage = NULL;
@@ -1206,7 +1207,14 @@ static int unmap_and_move(new_page_t get_new_page,
 		 * migrated will have kept its references and be restored.
 		 */
 		list_del(&page->lru);
+	}
 
+	/*
+	 * If migration is successful, releases reference grabbed during
+	 * isolation. Otherwise, restore the page to right list unless
+	 * we want to retry.
+	 */
+	if (rc == MIGRATEPAGE_SUCCESS) {
 		/*
 		 * Compaction can migrate also non-LRU pages which are
 		 * not accounted to NR_ISOLATED_*. They can be recognized
@@ -1215,35 +1223,16 @@ static int unmap_and_move(new_page_t get_new_page,
 		if (likely(!__PageMovable(page)))
 			mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON +
 					page_is_file_lru(page), -thp_nr_pages(page));
-	}
 
-	/*
-	 * If migration is successful, releases reference grabbed during
-	 * isolation. Otherwise, restore the page to right list unless
-	 * we want to retry.
-	 */
-	if (rc == MIGRATEPAGE_SUCCESS) {
 		if (reason != MR_MEMORY_FAILURE)
 			/*
 			 * We release the page in page_handle_poison.
 			 */
 			put_page(page);
 	} else {
-		if (rc != -EAGAIN) {
-			if (likely(!__PageMovable(page))) {
-				putback_lru_page(page);
-				goto put_new;
-			}
+		if (rc != -EAGAIN)
+			list_add_tail(&page->lru, ret);
 
-			lock_page(page);
-			if (PageMovable(page))
-				putback_movable_page(page);
-			else
-				__ClearPageIsolated(page);
-			unlock_page(page);
-			put_page(page);
-		}
-put_new:
 		if (put_new_page)
 			put_new_page(newpage, private);
 		else
@@ -1274,7 +1263,8 @@ static int unmap_and_move(new_page_t get_new_page,
 static int unmap_and_move_huge_page(new_page_t get_new_page,
 				free_page_t put_new_page, unsigned long private,
 				struct page *hpage, int force,
-				enum migrate_mode mode, int reason)
+				enum migrate_mode mode, int reason,
+				struct list_head *ret)
 {
 	int rc = -EAGAIN;
 	int page_was_mapped = 0;
@@ -1290,7 +1280,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	 * kicking migration.
 	 */
 	if (!hugepage_migration_supported(page_hstate(hpage))) {
-		putback_active_hugepage(hpage);
+		list_move_tail(&hpage->lru, ret);
 		return -ENOSYS;
 	}
 
@@ -1372,8 +1362,10 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 out_unlock:
 	unlock_page(hpage);
 out:
-	if (rc != -EAGAIN)
+	if (rc == MIGRATEPAGE_SUCCESS)
 		putback_active_hugepage(hpage);
+	else if (rc != -EAGAIN && rc != MIGRATEPAGE_SUCCESS)
+		list_move_tail(&hpage->lru, ret);
 
 	/*
 	 * If migration was not successful and there's a freeing callback, use
@@ -1404,8 +1396,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
  *
  * The function returns after 10 attempts or if no pages are movable any more
  * because the list has become empty or no retryable pages exist any more.
- * The caller should call putback_movable_pages() to return pages to the LRU
- * or free list only if ret != 0.
+ * It is caller's responsibility to call putback_movable_pages() to return pages
+ * to the LRU or free list only if ret != 0.
  *
  * Returns the number of pages that were not migrated, or an error code.
  */
@@ -1426,6 +1418,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	struct page *page2;
 	int swapwrite = current->flags & PF_SWAPWRITE;
 	int rc, nr_subpages;
+	LIST_HEAD(ret_pages);
 
 	if (!swapwrite)
 		current->flags |= PF_SWAPWRITE;
@@ -1448,11 +1441,12 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 			if (PageHuge(page))
 				rc = unmap_and_move_huge_page(get_new_page,
 						put_new_page, private, page,
-						pass > 2, mode, reason);
+						pass > 2, mode, reason,
+						&ret_pages);
 			else
 				rc = unmap_and_move(get_new_page, put_new_page,
 						private, page, pass > 2, mode,
-						reason);
+						reason, &ret_pages);
 
 			switch(rc) {
 			case -ENOMEM:
@@ -1519,6 +1513,12 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	nr_thp_failed += thp_retry;
 	rc = nr_failed;
 out:
+	/*
+	 * Put the permanent failure page back to migration list, they
+	 * will be put back to the right list by the caller.
+	 */
+	list_splice(&ret_pages, from);
+
 	count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
 	count_vm_events(PGMIGRATE_FAIL, nr_failed);
 	count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
-- 
2.26.2



* [PATCH 3/5] mm: migrate: skip shared exec THP for NUMA balancing
  2020-11-03 13:03 [PATCH 0/5] mm: misc migrate cleanup and improvement Yang Shi
  2020-11-03 13:03 ` [PATCH 1/5] mm: truncate_complete_page is not existed anymore Yang Shi
  2020-11-03 13:03 ` [PATCH 2/5] mm: migrate: simplify the logic for handling permanent failure Yang Shi
@ 2020-11-03 13:03 ` Yang Shi
  2020-11-03 13:03 ` [PATCH 4/5] mm: migrate: clean up migrate_prep{_local} Yang Shi
  2020-11-03 13:03 ` [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported Yang Shi
  4 siblings, 0 replies; 14+ messages in thread
From: Yang Shi @ 2020-11-03 13:03 UTC (permalink / raw)
  To: mhocko, ziy, songliubraving, mgorman, jack, akpm
  Cc: shy828301, linux-mm, linux-kernel

NUMA balancing already skips shared exec base pages.  Since
CONFIG_READ_ONLY_THP_FOR_FS was introduced, shared exec THPs may exist as
well, so skip such THPs for NUMA balancing too.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/migrate.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index c33c92495ead..9a32bb128f31 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2061,6 +2061,16 @@ bool pmd_trans_migrating(pmd_t pmd)
 	return PageLocked(page);
 }
 
+static inline bool is_shared_exec_page(struct vm_area_struct *vma,
+				       struct page *page)
+{
+	if (page_mapcount(page) != 1 && page_is_file_lru(page) &&
+	    (vma->vm_flags & VM_EXEC))
+		return true;
+
+	return false;
+}
+
 /*
  * Attempt to migrate a misplaced page to the specified destination
  * node. Caller is expected to have an elevated reference count on
@@ -2078,8 +2088,7 @@ int migrate_misplaced_page(struct page *page, struct vm_area_struct *vma,
 	 * Don't migrate file pages that are mapped in multiple processes
 	 * with execute permissions as they are probably shared libraries.
 	 */
-	if (page_mapcount(page) != 1 && page_is_file_lru(page) &&
-	    (vma->vm_flags & VM_EXEC))
+	if (is_shared_exec_page(vma, page))
 		goto out;
 
 	/*
@@ -2134,6 +2143,10 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	int page_lru = page_is_file_lru(page);
 	unsigned long start = address & HPAGE_PMD_MASK;
 
+	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+	    is_shared_exec_page(vma, page))
+		goto out;
+
 	new_page = alloc_pages_node(node,
 		(GFP_TRANSHUGE_LIGHT | __GFP_THISNODE),
 		HPAGE_PMD_ORDER);
@@ -2245,6 +2258,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 
 out_unlock:
 	unlock_page(page);
+out:
 	put_page(page);
 	return 0;
 }
-- 
2.26.2



* [PATCH 4/5] mm: migrate: clean up migrate_prep{_local}
  2020-11-03 13:03 [PATCH 0/5] mm: misc migrate cleanup and improvement Yang Shi
                   ` (2 preceding siblings ...)
  2020-11-03 13:03 ` [PATCH 3/5] mm: migrate: skip shared exec THP for NUMA balancing Yang Shi
@ 2020-11-03 13:03 ` Yang Shi
  2020-11-06 20:05   ` Zi Yan
  2020-11-03 13:03 ` [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported Yang Shi
  4 siblings, 1 reply; 14+ messages in thread
From: Yang Shi @ 2020-11-03 13:03 UTC (permalink / raw)
  To: mhocko, ziy, songliubraving, mgorman, jack, akpm
  Cc: shy828301, linux-mm, linux-kernel

migrate_prep{_local}() never fails, so it is pointless for it to have a
return value and for callers to check that value.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/migrate.h | 4 ++--
 mm/mempolicy.c          | 8 ++------
 mm/migrate.c            | 8 ++------
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 0f8d1583fa8e..4594838a0f7c 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -45,8 +45,8 @@ extern struct page *alloc_migration_target(struct page *page, unsigned long priv
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 extern void putback_movable_page(struct page *page);
 
-extern int migrate_prep(void);
-extern int migrate_prep_local(void);
+extern void migrate_prep(void);
+extern void migrate_prep_local(void);
 extern void migrate_page_states(struct page *newpage, struct page *page);
 extern void migrate_page_copy(struct page *newpage, struct page *page);
 extern int migrate_huge_page_move_mapping(struct address_space *mapping,
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3fde772ef5ef..780861312008 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1114,9 +1114,7 @@ int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
 	int err;
 	nodemask_t tmp;
 
-	err = migrate_prep();
-	if (err)
-		return err;
+	migrate_prep();
 
 	mmap_read_lock(mm);
 
@@ -1315,9 +1313,7 @@ static long do_mbind(unsigned long start, unsigned long len,
 
 	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
 
-		err = migrate_prep();
-		if (err)
-			goto mpol_out;
+		migrate_prep();
 	}
 	{
 		NODEMASK_SCRATCH(scratch);
diff --git a/mm/migrate.c b/mm/migrate.c
index 9a32bb128f31..8f6a61c9274b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -62,7 +62,7 @@
  * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
  * undesirable, use migrate_prep_local()
  */
-int migrate_prep(void)
+void migrate_prep(void)
 {
 	/*
 	 * Clear the LRU lists so pages can be isolated.
@@ -71,16 +71,12 @@ int migrate_prep(void)
 	 * pages that may be busy.
 	 */
 	lru_add_drain_all();
-
-	return 0;
 }
 
 /* Do the necessary work of migrate_prep but not if it involves other CPUs */
-int migrate_prep_local(void)
+void migrate_prep_local(void)
 {
 	lru_add_drain();
-
-	return 0;
 }
 
 int isolate_movable_page(struct page *page, isolate_mode_t mode)
-- 
2.26.2



* [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported
  2020-11-03 13:03 [PATCH 0/5] mm: misc migrate cleanup and improvement Yang Shi
                   ` (3 preceding siblings ...)
  2020-11-03 13:03 ` [PATCH 4/5] mm: migrate: clean up migrate_prep{_local} Yang Shi
@ 2020-11-03 13:03 ` Yang Shi
  2020-11-06 20:17   ` Zi Yan
  4 siblings, 1 reply; 14+ messages in thread
From: Yang Shi @ 2020-11-03 13:03 UTC (permalink / raw)
  To: mhocko, ziy, songliubraving, mgorman, jack, akpm
  Cc: shy828301, linux-mm, linux-kernel

In the current implementation unmap_and_move() returns -ENOMEM if THP
migration is unsupported, and the THP is then split.  If the split fails,
migrate_pages() just exits without trying to migrate the remaining pages.
That doesn't make much sense: there may be plenty of free memory to
migrate the other pages, and there may be a lot of base pages on the
list.

Return -ENOSYS instead, to be consistent with hugetlb, and if the THP
split fails just skip that page and try the other pages on the list.

Only skip the whole list and exit when free memory is really low, i.e.
on -ENOMEM.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/migrate.c | 62 ++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 46 insertions(+), 16 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 8f6a61c9274b..b3466d8c7f03 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1172,7 +1172,7 @@ static int unmap_and_move(new_page_t get_new_page,
 	struct page *newpage = NULL;
 
 	if (!thp_migration_supported() && PageTransHuge(page))
-		return -ENOMEM;
+		return -ENOSYS;
 
 	if (page_count(page) == 1) {
 		/* page was freed from under us. So we are done. */
@@ -1376,6 +1376,20 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	return rc;
 }
 
+static inline int try_split_thp(struct page *page, struct page *page2,
+				struct list_head *from)
+{
+	int rc = 0;
+
+	lock_page(page);
+	rc = split_huge_page_to_list(page, from);
+	unlock_page(page);
+	if (!rc)
+		list_safe_reset_next(page, page2, lru);
+
+	return rc;
+}
+
 /*
  * migrate_pages - migrate the pages specified in a list, to the free pages
  *		   supplied as the target for the page migration
@@ -1445,24 +1459,40 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 						reason, &ret_pages);
 
 			switch(rc) {
+			/*
+			 * THP migration might be unsupported or the
+			 * allocation could've failed so we should
+			 * retry on the same page with the THP split
+			 * to base pages.
+			 *
+			 * Head page is retried immediately and tail
+			 * pages are added to the tail of the list so
+			 * we encounter them after the rest of the list
+			 * is processed.
+			 */
+			case -ENOSYS:
+				/* THP migration is unsupported */
+				if (is_thp) {
+					if (!try_split_thp(page, page2, from)) {
+						nr_thp_split++;
+						goto retry;
+					}
+
+					nr_thp_failed++;
+					nr_failed += nr_subpages;
+					break;
+				}
+
+				/* Hugetlb migration is unsupported */
+				nr_failed++;
+				break;
 			case -ENOMEM:
 				/*
-				 * THP migration might be unsupported or the
-				 * allocation could've failed so we should
-				 * retry on the same page with the THP split
-				 * to base pages.
-				 *
-				 * Head page is retried immediately and tail
-				 * pages are added to the tail of the list so
-				 * we encounter them after the rest of the list
-				 * is processed.
+				 * When memory is low, don't bother to try to migrate
+				 * other pages, just exit.
 				 */
 				if (is_thp) {
-					lock_page(page);
-					rc = split_huge_page_to_list(page, from);
-					unlock_page(page);
-					if (!rc) {
-						list_safe_reset_next(page, page2, lru);
+					if (!try_split_thp(page, page2, from)) {
 						nr_thp_split++;
 						goto retry;
 					}
@@ -1490,7 +1520,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				break;
 			default:
 				/*
-				 * Permanent failure (-EBUSY, -ENOSYS, etc.):
+				 * Permanent failure (-EBUSY, etc.):
 				 * unlike -EAGAIN case, the failed page is
 				 * removed from migration page list and not
 				 * retried in the next outer loop.
-- 
2.26.2



* Re: [PATCH 1/5] mm: truncate_complete_page is not existed anymore
  2020-11-03 13:03 ` [PATCH 1/5] mm: truncate_complete_page is not existed anymore Yang Shi
@ 2020-11-06  9:16   ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2020-11-06  9:16 UTC (permalink / raw)
  To: Yang Shi
  Cc: mhocko, ziy, songliubraving, mgorman, jack, akpm, linux-mm, linux-kernel

On Tue 03-11-20 05:03:30, Yang Shi wrote:
> The commit 9f4e41f4717832e34cca153ced62b4a1d7e26c0e ("mm: refactor
> truncate_complete_page()") refactored truncate_complete_page(), and it
> is not existed anymore, correct the comment in vmscan and migrate to avoid
> confusion.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>

Thanks! Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 2/5] mm: migrate: simplify the logic for handling permanent failure
  2020-11-03 13:03 ` [PATCH 2/5] mm: migrate: simplify the logic for handling permanent failure Yang Shi
@ 2020-11-06 20:03   ` Zi Yan
  2020-11-06 21:34     ` Yang Shi
  0 siblings, 1 reply; 14+ messages in thread
From: Zi Yan @ 2020-11-06 20:03 UTC (permalink / raw)
  To: Yang Shi
  Cc: mhocko, songliubraving, mgorman, jack, akpm, linux-mm, linux-kernel


On 3 Nov 2020, at 8:03, Yang Shi wrote:

> When unmap_and_move{_huge_page}() returns !-EAGAIN and !MIGRATEPAGE_SUCCESS,
> the page would be put back to LRU or proper list if it is non-LRU movable
> page.  But, the callers always call putback_movable_pages() to put the
> failed pages back later on, so it seems not very efficient to put every
> single page back immediately, and the code looks convoluted.
>
> Put the failed page on a separate list, then splice the list to migrate
> list when all pages are tried.  It is the caller's responsibility to
> call putback_movable_pages() to handle failures.  This also makes the
> code simpler and more readable.
>
> After the change the rules are:
>     * Success: non hugetlb page will be freed, hugetlb page will be put
>                back
>     * -EAGAIN: stay on the from list
>     * -ENOMEM: stay on the from list
>     * Other errno: put on ret_pages list then splice to from list

Can you put this before the switch statement in migrate_pages()? That will
be very helpful for understanding the code.
>
> The from list would be empty iff all pages are migrated successfully, it

s/iff/if, unless you really mean if and only if. :)


Everything else looks good to me. Thanks for making the code cleaner.
With the changes above, you can add Reviewed-by: Zi Yan <ziy@nvidia.com>.

> was not so before.  This has no impact to current existing callsites.
>
> Signed-off-by: Yang Shi <shy828301@gmail.com>


—
Best Regards,
Yan Zi



* Re: [PATCH 4/5] mm: migrate: clean up migrate_prep{_local}
  2020-11-03 13:03 ` [PATCH 4/5] mm: migrate: clean up migrate_prep{_local} Yang Shi
@ 2020-11-06 20:05   ` Zi Yan
  0 siblings, 0 replies; 14+ messages in thread
From: Zi Yan @ 2020-11-06 20:05 UTC (permalink / raw)
  To: Yang Shi
  Cc: mhocko, songliubraving, mgorman, jack, akpm, linux-mm, linux-kernel


On 3 Nov 2020, at 8:03, Yang Shi wrote:

> The migrate_prep{_local} never fails, so it is pointless to have return
> value and check the return value.
>
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  include/linux/migrate.h | 4 ++--
>  mm/mempolicy.c          | 8 ++------
>  mm/migrate.c            | 8 ++------
>  3 files changed, 6 insertions(+), 14 deletions(-)
>

LGTM. Thanks. Reviewed-by: Zi Yan <ziy@nvidia.com>


—
Best Regards,
Yan Zi



* Re: [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported
  2020-11-03 13:03 ` [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported Yang Shi
@ 2020-11-06 20:17   ` Zi Yan
  2020-11-06 21:53     ` Yang Shi
  2020-11-06 22:02     ` Yang Shi
  0 siblings, 2 replies; 14+ messages in thread
From: Zi Yan @ 2020-11-06 20:17 UTC (permalink / raw)
  To: Yang Shi
  Cc: mhocko, songliubraving, mgorman, jack, akpm, linux-mm, linux-kernel


On 3 Nov 2020, at 8:03, Yang Shi wrote:

> In the current implementation unmap_and_move() would return -ENOMEM if
> THP migration is unsupported, then the THP will be split.  If split is
> failed just exit without trying to migrate other pages.  It doesn't make
> too much sense since there may be enough free memory to migrate other
> pages and there may be a lot base pages on the list.
>
> Return -ENOSYS to make consistent with hugetlb.  And if THP split is
> failed just skip and try other pages on the list.
>
> Just skip the whole list and exit when free memory is really low.
>
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  mm/migrate.c | 62 ++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 46 insertions(+), 16 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8f6a61c9274b..b3466d8c7f03 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1172,7 +1172,7 @@ static int unmap_and_move(new_page_t get_new_page,
>  	struct page *newpage = NULL;
>
>  	if (!thp_migration_supported() && PageTransHuge(page))
> -		return -ENOMEM;
> +		return -ENOSYS;
>
>  	if (page_count(page) == 1) {
>  		/* page was freed from under us. So we are done. */
> @@ -1376,6 +1376,20 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>  	return rc;
>  }
>
> +static inline int try_split_thp(struct page *page, struct page *page2,
> +				struct list_head *from)
> +{
> +	int rc = 0;
> +
> +	lock_page(page);
> +	rc = split_huge_page_to_list(page, from);
> +	unlock_page(page);
> +	if (!rc)
> +		list_safe_reset_next(page, page2, lru);

This does not work as expected, right? After macro expansion, we have
page2 = list_next_entry(page, lru). Since page2 itself is passed by value, the
assignment only updates the callee's local copy and never reaches the caller.
You need to pass a pointer to page2 (struct page **) here.

> +
> +	return rc;
> +}
> +
>  /*
>   * migrate_pages - migrate the pages specified in a list, to the free pages
>   *		   supplied as the target for the page migration
> @@ -1445,24 +1459,40 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>  						reason, &ret_pages);
>
>  			switch(rc) {
> +			/*
> +			 * THP migration might be unsupported or the
> +			 * allocation could've failed so we should
> +			 * retry on the same page with the THP split
> +			 * to base pages.
> +			 *
> +			 * Head page is retried immediately and tail
> +			 * pages are added to the tail of the list so
> +			 * we encounter them after the rest of the list
> +			 * is processed.
> +			 */
> +			case -ENOSYS:
> +				/* THP migration is unsupported */
> +				if (is_thp) {
> +					if (!try_split_thp(page, page2, from)) {
> +						nr_thp_split++;
> +						goto retry;
> +					}
> +
> +					nr_thp_failed++;
> +					nr_failed += nr_subpages;
> +					break;
> +				}
> +
> +				/* Hugetlb migration is unsupported */
> +				nr_failed++;
> +				break;
>  			case -ENOMEM:
>  				/*
> -				 * THP migration might be unsupported or the
> -				 * allocation could've failed so we should
> -				 * retry on the same page with the THP split
> -				 * to base pages.
> -				 *
> -				 * Head page is retried immediately and tail
> -				 * pages are added to the tail of the list so
> -				 * we encounter them after the rest of the list
> -				 * is processed.
> +				 * When memory is low, don't bother to try to migrate
> +				 * other pages, just exit.

The comment does not match the code below. For THPs, the code tries to split the THP
and migrate the base pages if the split is successful.

>  				 */
>  				if (is_thp) {
> -					lock_page(page);
> -					rc = split_huge_page_to_list(page, from);
> -					unlock_page(page);
> -					if (!rc) {
> -						list_safe_reset_next(page, page2, lru);
> +					if (!try_split_thp(page, page2, from)) {
>  						nr_thp_split++;
>  						goto retry;
>  					}
> @@ -1490,7 +1520,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>  				break;
>  			default:
>  				/*
> -				 * Permanent failure (-EBUSY, -ENOSYS, etc.):
> +				 * Permanent failure (-EBUSY, etc.):
>  				 * unlike -EAGAIN case, the failed page is
>  				 * removed from migration page list and not
>  				 * retried in the next outer loop.
> -- 
> 2.26.2


—
Best Regards,
Yan Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/5] mm: migrate: simplify the logic for handling permanent failure
  2020-11-06 20:03   ` Zi Yan
@ 2020-11-06 21:34     ` Yang Shi
  0 siblings, 0 replies; 14+ messages in thread
From: Yang Shi @ 2020-11-06 21:34 UTC (permalink / raw)
  To: Zi Yan
  Cc: Michal Hocko, Song Liu, Mel Gorman, Jan Kara, Andrew Morton,
	Linux MM, Linux Kernel Mailing List

On Fri, Nov 6, 2020 at 12:03 PM Zi Yan <ziy@nvidia.com> wrote:
>
> On 3 Nov 2020, at 8:03, Yang Shi wrote:
>
> > When unmap_and_move{_huge_page}() returns !-EAGAIN and !MIGRATEPAGE_SUCCESS,
> > the page would be put back to LRU or proper list if it is non-LRU movable
> > page.  But, the callers always call putback_movable_pages() to put the
> > failed pages back later on, so it seems not very efficient to put every
> > single page back immediately, and the code looks convoluted.
> >
> > Put the failed page on a separate list, then splice the list to migrate
> > list when all pages are tried.  It is the caller's responsibility to
> > call putback_movable_pages() to handle failures.  This also makes the
> > code simpler and more readable.
> >
> > After the change the rules are:
> >     * Success: non hugetlb page will be freed, hugetlb page will be put
> >                back
> >     * -EAGAIN: stay on the from list
> >     * -ENOMEM: stay on the from list
> >     * Other errno: put on ret_pages list then splice to from list
>
> Can you put this before the switch statement in migrate_pages()? That
> would be very helpful for understanding the code.

Sure, I agree the switch case deserves some comments.

> >
> > The from list would be empty iff all pages are migrated successfully, it
>
> s/iff/if unless you really mean if and only if. :)

Yes, I mean if and only if.

>
>
> Everything else looks good to me. Thanks for making the code cleaner.
> With the changes above, you can add Reviewed-by: Zi Yan <ziy@nvidia.com>.

Thanks.

>
> > was not so before.  This has no impact to current existing callsites.
> >
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  mm/migrate.c | 58 ++++++++++++++++++++++++++--------------------------
> >  1 file changed, 29 insertions(+), 29 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 8a2e7e19e27b..c33c92495ead 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1169,7 +1169,8 @@ static int unmap_and_move(new_page_t get_new_page,
> >                                  free_page_t put_new_page,
> >                                  unsigned long private, struct page *page,
> >                                  int force, enum migrate_mode mode,
> > -                                enum migrate_reason reason)
> > +                                enum migrate_reason reason,
> > +                                struct list_head *ret)
> >  {
> >       int rc = MIGRATEPAGE_SUCCESS;
> >       struct page *newpage = NULL;
> > @@ -1206,7 +1207,14 @@ static int unmap_and_move(new_page_t get_new_page,
> >                * migrated will have kept its references and be restored.
> >                */
> >               list_del(&page->lru);
> > +     }
> >
> > +     /*
> > +      * If migration is successful, releases reference grabbed during
> > +      * isolation. Otherwise, restore the page to right list unless
> > +      * we want to retry.
> > +      */
> > +     if (rc == MIGRATEPAGE_SUCCESS) {
> >               /*
> >                * Compaction can migrate also non-LRU pages which are
> >                * not accounted to NR_ISOLATED_*. They can be recognized
> > @@ -1215,35 +1223,16 @@ static int unmap_and_move(new_page_t get_new_page,
> >               if (likely(!__PageMovable(page)))
> >                       mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON +
> >                                       page_is_file_lru(page), -thp_nr_pages(page));
> > -     }
> >
> > -     /*
> > -      * If migration is successful, releases reference grabbed during
> > -      * isolation. Otherwise, restore the page to right list unless
> > -      * we want to retry.
> > -      */
> > -     if (rc == MIGRATEPAGE_SUCCESS) {
> >               if (reason != MR_MEMORY_FAILURE)
> >                       /*
> >                        * We release the page in page_handle_poison.
> >                        */
> >                       put_page(page);
> >       } else {
> > -             if (rc != -EAGAIN) {
> > -                     if (likely(!__PageMovable(page))) {
> > -                             putback_lru_page(page);
> > -                             goto put_new;
> > -                     }
> > +             if (rc != -EAGAIN)
> > +                     list_add_tail(&page->lru, ret);
> >
> > -                     lock_page(page);
> > -                     if (PageMovable(page))
> > -                             putback_movable_page(page);
> > -                     else
> > -                             __ClearPageIsolated(page);
> > -                     unlock_page(page);
> > -                     put_page(page);
> > -             }
> > -put_new:
> >               if (put_new_page)
> >                       put_new_page(newpage, private);
> >               else
> > @@ -1274,7 +1263,8 @@ static int unmap_and_move(new_page_t get_new_page,
> >  static int unmap_and_move_huge_page(new_page_t get_new_page,
> >                               free_page_t put_new_page, unsigned long private,
> >                               struct page *hpage, int force,
> > -                             enum migrate_mode mode, int reason)
> > +                             enum migrate_mode mode, int reason,
> > +                             struct list_head *ret)
> >  {
> >       int rc = -EAGAIN;
> >       int page_was_mapped = 0;
> > @@ -1290,7 +1280,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> >        * kicking migration.
> >        */
> >       if (!hugepage_migration_supported(page_hstate(hpage))) {
> > -             putback_active_hugepage(hpage);
> > +             list_move_tail(&hpage->lru, ret);
> >               return -ENOSYS;
> >       }
> >
> > @@ -1372,8 +1362,10 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> >  out_unlock:
> >       unlock_page(hpage);
> >  out:
> > -     if (rc != -EAGAIN)
> > +     if (rc == MIGRATEPAGE_SUCCESS)
> >               putback_active_hugepage(hpage);
> > +     else if (rc != -EAGAIN && rc != MIGRATEPAGE_SUCCESS)
> > +             list_move_tail(&hpage->lru, ret);
> >
> >       /*
> >        * If migration was not successful and there's a freeing callback, use
> > @@ -1404,8 +1396,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> >   *
> >   * The function returns after 10 attempts or if no pages are movable any more
> >   * because the list has become empty or no retryable pages exist any more.
> > - * The caller should call putback_movable_pages() to return pages to the LRU
> > - * or free list only if ret != 0.
> > + * It is caller's responsibility to call putback_movable_pages() to return pages
> > + * to the LRU or free list only if ret != 0.
> >   *
> >   * Returns the number of pages that were not migrated, or an error code.
> >   */
> > @@ -1426,6 +1418,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >       struct page *page2;
> >       int swapwrite = current->flags & PF_SWAPWRITE;
> >       int rc, nr_subpages;
> > +     LIST_HEAD(ret_pages);
> >
> >       if (!swapwrite)
> >               current->flags |= PF_SWAPWRITE;
> > @@ -1448,11 +1441,12 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >                       if (PageHuge(page))
> >                               rc = unmap_and_move_huge_page(get_new_page,
> >                                               put_new_page, private, page,
> > -                                             pass > 2, mode, reason);
> > +                                             pass > 2, mode, reason,
> > +                                             &ret_pages);
> >                       else
> >                               rc = unmap_and_move(get_new_page, put_new_page,
> >                                               private, page, pass > 2, mode,
> > -                                             reason);
> > +                                             reason, &ret_pages);
> >
> >                       switch(rc) {
> >                       case -ENOMEM:
> > @@ -1519,6 +1513,12 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >       nr_thp_failed += thp_retry;
> >       rc = nr_failed;
> >  out:
> > +     /*
> > +      * Put the permanent failure page back to migration list, they
> > +      * will be put back to the right list by the caller.
> > +      */
> > +     list_splice(&ret_pages, from);
> > +
> >       count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
> >       count_vm_events(PGMIGRATE_FAIL, nr_failed);
> >       count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
> > --
> > 2.26.2
>
>
> —
> Best Regards,
> Yan Zi

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported
  2020-11-06 20:17   ` Zi Yan
@ 2020-11-06 21:53     ` Yang Shi
  2020-11-06 22:02     ` Yang Shi
  1 sibling, 0 replies; 14+ messages in thread
From: Yang Shi @ 2020-11-06 21:53 UTC (permalink / raw)
  To: Zi Yan
  Cc: Michal Hocko, Song Liu, Mel Gorman, Jan Kara, Andrew Morton,
	Linux MM, Linux Kernel Mailing List

On Fri, Nov 6, 2020 at 12:17 PM Zi Yan <ziy@nvidia.com> wrote:
>
> On 3 Nov 2020, at 8:03, Yang Shi wrote:
>
> > In the current implementation unmap_and_move() would return -ENOMEM if
> > THP migration is unsupported, then the THP will be split.  If split is
> > failed just exit without trying to migrate other pages.  It doesn't make
> > too much sense since there may be enough free memory to migrate other
> > pages and there may be a lot base pages on the list.
> >
> > Return -ENOSYS to make consistent with hugetlb.  And if THP split is
> > failed just skip and try other pages on the list.
> >
> > Just skip the whole list and exit when free memory is really low.
> >
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  mm/migrate.c | 62 ++++++++++++++++++++++++++++++++++++++--------------
> >  1 file changed, 46 insertions(+), 16 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 8f6a61c9274b..b3466d8c7f03 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1172,7 +1172,7 @@ static int unmap_and_move(new_page_t get_new_page,
> >       struct page *newpage = NULL;
> >
> >       if (!thp_migration_supported() && PageTransHuge(page))
> > -             return -ENOMEM;
> > +             return -ENOSYS;
> >
> >       if (page_count(page) == 1) {
> >               /* page was freed from under us. So we are done. */
> > @@ -1376,6 +1376,20 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> >       return rc;
> >  }
> >
> > +static inline int try_split_thp(struct page *page, struct page *page2,
> > +                             struct list_head *from)
> > +{
> > +     int rc = 0;
> > +
> > +     lock_page(page);
> > +     rc = split_huge_page_to_list(page, from);
> > +     unlock_page(page);
> > +     if (!rc)
> > +             list_safe_reset_next(page, page2, lru);
>
> This does not work as expected, right? After macro expansion, we have
> page2 = list_next_entry(page, lru). Since page2 is passed as a pointer, the change
> does not return back the caller. You need to use the pointer to page2 here.

Yes, I should pass in **page2. Thanks for catching this.

>
> > +
> > +     return rc;
> > +}
> > +
> >  /*
> >   * migrate_pages - migrate the pages specified in a list, to the free pages
> >   *              supplied as the target for the page migration
> > @@ -1445,24 +1459,40 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >                                               reason, &ret_pages);
> >
> >                       switch(rc) {
> > +                     /*
> > +                      * THP migration might be unsupported or the
> > +                      * allocation could've failed so we should
> > +                      * retry on the same page with the THP split
> > +                      * to base pages.
> > +                      *
> > +                      * Head page is retried immediately and tail
> > +                      * pages are added to the tail of the list so
> > +                      * we encounter them after the rest of the list
> > +                      * is processed.
> > +                      */
> > +                     case -ENOSYS:
> > +                             /* THP migration is unsupported */
> > +                             if (is_thp) {
> > +                                     if (!try_split_thp(page, page2, from)) {
> > +                                             nr_thp_split++;
> > +                                             goto retry;
> > +                                     }
> > +
> > +                                     nr_thp_failed++;
> > +                                     nr_failed += nr_subpages;
> > +                                     break;
> > +                             }
> > +
> > +                             /* Hugetlb migration is unsupported */
> > +                             nr_failed++;
> > +                             break;
> >                       case -ENOMEM:
> >                               /*
> > -                              * THP migration might be unsupported or the
> > -                              * allocation could've failed so we should
> > -                              * retry on the same page with the THP split
> > -                              * to base pages.
> > -                              *
> > -                              * Head page is retried immediately and tail
> > -                              * pages are added to the tail of the list so
> > -                              * we encounter them after the rest of the list
> > -                              * is processed.
> > +                              * When memory is low, don't bother to try to migrate
> > +                              * other pages, just exit.
>
> The comment does not match the code below. For THPs, the code tries to split the THP
> and migrate the base pages if the split is successful.

The comment here just covers the "goto out" rather than the split, because
I thought the split was already covered by the comment before "case -ENOSYS",
which says "unsupported or the allocation could've failed", so I didn't
repeat it here.

>
> >                                */
> >                               if (is_thp) {
> > -                                     lock_page(page);
> > -                                     rc = split_huge_page_to_list(page, from);
> > -                                     unlock_page(page);
> > -                                     if (!rc) {
> > -                                             list_safe_reset_next(page, page2, lru);
> > +                                     if (!try_split_thp(page, page2, from)) {
> >                                               nr_thp_split++;
> >                                               goto retry;
> >                                       }
> > @@ -1490,7 +1520,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >                               break;
> >                       default:
> >                               /*
> > -                              * Permanent failure (-EBUSY, -ENOSYS, etc.):
> > +                              * Permanent failure (-EBUSY, etc.):
> >                                * unlike -EAGAIN case, the failed page is
> >                                * removed from migration page list and not
> >                                * retried in the next outer loop.
> > --
> > 2.26.2
>
>
> —
> Best Regards,
> Yan Zi

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported
  2020-11-06 20:17   ` Zi Yan
  2020-11-06 21:53     ` Yang Shi
@ 2020-11-06 22:02     ` Yang Shi
  2020-11-06 23:15       ` Yang Shi
  1 sibling, 1 reply; 14+ messages in thread
From: Yang Shi @ 2020-11-06 22:02 UTC (permalink / raw)
  To: Zi Yan
  Cc: Michal Hocko, Song Liu, Mel Gorman, Jan Kara, Andrew Morton,
	Linux MM, Linux Kernel Mailing List

On Fri, Nov 6, 2020 at 12:17 PM Zi Yan <ziy@nvidia.com> wrote:
>
> On 3 Nov 2020, at 8:03, Yang Shi wrote:
>
> > In the current implementation unmap_and_move() would return -ENOMEM if
> > THP migration is unsupported, then the THP will be split.  If split is
> > failed just exit without trying to migrate other pages.  It doesn't make
> > too much sense since there may be enough free memory to migrate other
> > pages and there may be a lot base pages on the list.
> >
> > Return -ENOSYS to make consistent with hugetlb.  And if THP split is
> > failed just skip and try other pages on the list.
> >
> > Just skip the whole list and exit when free memory is really low.
> >
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  mm/migrate.c | 62 ++++++++++++++++++++++++++++++++++++++--------------
> >  1 file changed, 46 insertions(+), 16 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 8f6a61c9274b..b3466d8c7f03 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1172,7 +1172,7 @@ static int unmap_and_move(new_page_t get_new_page,
> >       struct page *newpage = NULL;
> >
> >       if (!thp_migration_supported() && PageTransHuge(page))
> > -             return -ENOMEM;
> > +             return -ENOSYS;
> >
> >       if (page_count(page) == 1) {
> >               /* page was freed from under us. So we are done. */
> > @@ -1376,6 +1376,20 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> >       return rc;
> >  }
> >
> > +static inline int try_split_thp(struct page *page, struct page *page2,
> > +                             struct list_head *from)
> > +{
> > +     int rc = 0;
> > +
> > +     lock_page(page);
> > +     rc = split_huge_page_to_list(page, from);
> > +     unlock_page(page);
> > +     if (!rc)
> > +             list_safe_reset_next(page, page2, lru);
>
> This does not work as expected, right? After macro expansion, we have
> page2 = list_next_entry(page, lru). Since page2 is passed as a pointer, the change
> does not return back the caller. You need to use the pointer to page2 here.
>
> > +
> > +     return rc;
> > +}
> > +
> >  /*
> >   * migrate_pages - migrate the pages specified in a list, to the free pages
> >   *              supplied as the target for the page migration
> > @@ -1445,24 +1459,40 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >                                               reason, &ret_pages);
> >
> >                       switch(rc) {
> > +                     /*
> > +                      * THP migration might be unsupported or the
> > +                      * allocation could've failed so we should
> > +                      * retry on the same page with the THP split
> > +                      * to base pages.
> > +                      *
> > +                      * Head page is retried immediately and tail
> > +                      * pages are added to the tail of the list so
> > +                      * we encounter them after the rest of the list
> > +                      * is processed.
> > +                      */
> > +                     case -ENOSYS:
> > +                             /* THP migration is unsupported */
> > +                             if (is_thp) {
> > +                                     if (!try_split_thp(page, page2, from)) {
> > +                                             nr_thp_split++;
> > +                                             goto retry;
> > +                                     }
> > +
> > +                                     nr_thp_failed++;
> > +                                     nr_failed += nr_subpages;
> > +                                     break;
> > +                             }
> > +
> > +                             /* Hugetlb migration is unsupported */
> > +                             nr_failed++;
> > +                             break;
> >                       case -ENOMEM:
> >                               /*
> > -                              * THP migration might be unsupported or the
> > -                              * allocation could've failed so we should
> > -                              * retry on the same page with the THP split
> > -                              * to base pages.
> > -                              *
> > -                              * Head page is retried immediately and tail
> > -                              * pages are added to the tail of the list so
> > -                              * we encounter them after the rest of the list
> > -                              * is processed.
> > +                              * When memory is low, don't bother to try to migrate
> > +                              * other pages, just exit.
>
> The comment does not match the code below. For THPs, the code tries to split the THP
> and migrate the base pages if the split is successful.

BTW, actually I don't think -ENOSYS is a proper return value for this
case since it typically means "the syscall doesn't exist". IMHO, it
should return -EINVAL. And actually the return value doesn't matter
much, since all callers of migrate_pages() just check whether the return
value != 0. And neither -ENOSYS nor -EINVAL will be visible to userspace,
since migrate_pages() just returns the number of failed pages for this
case.

Anyway, the return value is a little messy: it may be -ENOMEM, 0, or
nr_failed. But it looks like the callers only care whether ret != 0, so
it may be better to return nr_failed for the -ENOMEM case too.

>
> >                                */
> >                               if (is_thp) {
> > -                                     lock_page(page);
> > -                                     rc = split_huge_page_to_list(page, from);
> > -                                     unlock_page(page);
> > -                                     if (!rc) {
> > -                                             list_safe_reset_next(page, page2, lru);
> > +                                     if (!try_split_thp(page, page2, from)) {
> >                                               nr_thp_split++;
> >                                               goto retry;
> >                                       }
> > @@ -1490,7 +1520,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> >                               break;
> >                       default:
> >                               /*
> > -                              * Permanent failure (-EBUSY, -ENOSYS, etc.):
> > +                              * Permanent failure (-EBUSY, etc.):
> >                                * unlike -EAGAIN case, the failed page is
> >                                * removed from migration page list and not
> >                                * retried in the next outer loop.
> > --
> > 2.26.2
>
>
> —
> Best Regards,
> Yan Zi

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported
  2020-11-06 22:02     ` Yang Shi
@ 2020-11-06 23:15       ` Yang Shi
  0 siblings, 0 replies; 14+ messages in thread
From: Yang Shi @ 2020-11-06 23:15 UTC (permalink / raw)
  To: Zi Yan
  Cc: Michal Hocko, Song Liu, Mel Gorman, Jan Kara, Andrew Morton,
	Linux MM, Linux Kernel Mailing List

On Fri, Nov 6, 2020 at 2:02 PM Yang Shi <shy828301@gmail.com> wrote:
>
> On Fri, Nov 6, 2020 at 12:17 PM Zi Yan <ziy@nvidia.com> wrote:
> >
> > On 3 Nov 2020, at 8:03, Yang Shi wrote:
> >
> > > In the current implementation unmap_and_move() would return -ENOMEM if
> > > THP migration is unsupported, then the THP will be split.  If split is
> > > failed just exit without trying to migrate other pages.  It doesn't make
> > > too much sense since there may be enough free memory to migrate other
> > > pages and there may be a lot base pages on the list.
> > >
> > > Return -ENOSYS to make consistent with hugetlb.  And if THP split is
> > > failed just skip and try other pages on the list.
> > >
> > > Just skip the whole list and exit when free memory is really low.
> > >
> > > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > > ---
> > >  mm/migrate.c | 62 ++++++++++++++++++++++++++++++++++++++--------------
> > >  1 file changed, 46 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index 8f6a61c9274b..b3466d8c7f03 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -1172,7 +1172,7 @@ static int unmap_and_move(new_page_t get_new_page,
> > >       struct page *newpage = NULL;
> > >
> > >       if (!thp_migration_supported() && PageTransHuge(page))
> > > -             return -ENOMEM;
> > > +             return -ENOSYS;
> > >
> > >       if (page_count(page) == 1) {
> > >               /* page was freed from under us. So we are done. */
> > > @@ -1376,6 +1376,20 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> > >       return rc;
> > >  }
> > >
> > > +static inline int try_split_thp(struct page *page, struct page *page2,
> > > +                             struct list_head *from)
> > > +{
> > > +     int rc = 0;
> > > +
> > > +     lock_page(page);
> > > +     rc = split_huge_page_to_list(page, from);
> > > +     unlock_page(page);
> > > +     if (!rc)
> > > +             list_safe_reset_next(page, page2, lru);
> >
> > This does not work as expected, right? After macro expansion, we have
> > page2 = list_next_entry(page, lru). Since page2 is passed as a pointer, the change
> > does not return back the caller. You need to use the pointer to page2 here.
> >
> > > +
> > > +     return rc;
> > > +}
> > > +
> > >  /*
> > >   * migrate_pages - migrate the pages specified in a list, to the free pages
> > >   *              supplied as the target for the page migration
> > > @@ -1445,24 +1459,40 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > >                                               reason, &ret_pages);
> > >
> > >                       switch(rc) {
> > > +                     /*
> > > +                      * THP migration might be unsupported or the
> > > +                      * allocation could've failed so we should
> > > +                      * retry on the same page with the THP split
> > > +                      * to base pages.
> > > +                      *
> > > +                      * Head page is retried immediately and tail
> > > +                      * pages are added to the tail of the list so
> > > +                      * we encounter them after the rest of the list
> > > +                      * is processed.
> > > +                      */
> > > +                     case -ENOSYS:
> > > +                             /* THP migration is unsupported */
> > > +                             if (is_thp) {
> > > +                                     if (!try_split_thp(page, page2, from)) {
> > > +                                             nr_thp_split++;
> > > +                                             goto retry;
> > > +                                     }
> > > +
> > > +                                     nr_thp_failed++;
> > > +                                     nr_failed += nr_subpages;
> > > +                                     break;
> > > +                             }
> > > +
> > > +                             /* Hugetlb migration is unsupported */
> > > +                             nr_failed++;
> > > +                             break;
> > >                       case -ENOMEM:
> > >                               /*
> > > -                              * THP migration might be unsupported or the
> > > -                              * allocation could've failed so we should
> > > -                              * retry on the same page with the THP split
> > > -                              * to base pages.
> > > -                              *
> > > -                              * Head page is retried immediately and tail
> > > -                              * pages are added to the tail of the list so
> > > -                              * we encounter them after the rest of the list
> > > -                              * is processed.
> > > +                              * When memory is low, don't bother to try to migrate
> > > +                              * other pages, just exit.
> >
> > The comment does not match the code below. For THPs, the code tries to split the THP
> > and migrate the base pages if the split is successful.
>
> BTW, I don't actually think -ENOSYS is a proper return value for this
> case since it typically means "the syscall doesn't exist". IMHO, it
> should return -EINVAL. And actually the return value doesn't matter
> much since all callers of migrate_pages() just check whether the
> return value != 0. And neither -ENOSYS nor -EINVAL will be visible to
> userspace since migrate_pages() just returns the number of failed
> pages for this case.
>
> Anyway the return value is a little bit messy: it may return -ENOMEM,
> 0, or nr_failed. But it looks like the callers only care whether ret
> != 0, so it may be better to let it return nr_failed for the -ENOMEM
> case too.

After rechecking all the callsites, it seems I overlooked two cases:
     * compaction: it checks whether the return value is -ENOMEM; if
so it may return COMPACT_CONTENDED
     * the migrate_pages syscall: it may return -ENOMEM to userspace,
but the man page doesn't document this case, so I'm not sure it is
legitimate

>
> >
> > >                                */
> > >                               if (is_thp) {
> > > -                                     lock_page(page);
> > > -                                     rc = split_huge_page_to_list(page, from);
> > > -                                     unlock_page(page);
> > > -                                     if (!rc) {
> > > -                                             list_safe_reset_next(page, page2, lru);
> > > +                                     if (!try_split_thp(page, page2, from)) {
> > >                                               nr_thp_split++;
> > >                                               goto retry;
> > >                                       }
> > > @@ -1490,7 +1520,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > >                               break;
> > >                       default:
> > >                               /*
> > > -                              * Permanent failure (-EBUSY, -ENOSYS, etc.):
> > > +                              * Permanent failure (-EBUSY, etc.):
> > >                                * unlike -EAGAIN case, the failed page is
> > >                                * removed from migration page list and not
> > >                                * retried in the next outer loop.
> > > --
> > > 2.26.2
> >
> >
> > —
> > Best Regards,
> > Yan Zi



Thread overview: 14+ messages
2020-11-03 13:03 [PATCH 0/5] mm: misc migrate cleanup and improvement Yang Shi
2020-11-03 13:03 ` [PATCH 1/5] mm: truncate_complete_page is not existed anymore Yang Shi
2020-11-06  9:16   ` Jan Kara
2020-11-03 13:03 ` [PATCH 2/5] mm: migrate: simplify the logic for handling permanent failure Yang Shi
2020-11-06 20:03   ` Zi Yan
2020-11-06 21:34     ` Yang Shi
2020-11-03 13:03 ` [PATCH 3/5] mm: migrate: skip shared exec THP for NUMA balancing Yang Shi
2020-11-03 13:03 ` [PATCH 4/5] mm: migrate: clean up migrate_prep{_local} Yang Shi
2020-11-06 20:05   ` Zi Yan
2020-11-03 13:03 ` [PATCH 5/5] mm: migrate: return -ENOSYS if THP migration is unsupported Yang Shi
2020-11-06 20:17   ` Zi Yan
2020-11-06 21:53     ` Yang Shi
2020-11-06 22:02     ` Yang Shi
2020-11-06 23:15       ` Yang Shi
