* [PATCH -v4 0/9] migrate_pages(): batch TLB flushing
@ 2023-02-06  6:33 Huang Ying
  2023-02-06  6:33 ` [PATCH -v4 1/9] migrate_pages: organize stats with struct migrate_pages_stats Huang Ying
                   ` (9 more replies)
  0 siblings, 10 replies; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang, Ying, Zi Yan, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz,
	Hyeonggon Yoo

From: "Huang, Ying" <ying.huang@intel.com>

Currently, migrate_pages() migrates folios one by one, as in the
following pseudo-code,

  for each folio
    unmap
    flush TLB
    copy
    restore map

If multiple folios are passed to migrate_pages(), there are
opportunities to batch the TLB flushing and copying.  That is, we can
change the code to something like the following,

  for each folio
    unmap
  for each folio
    flush TLB
  for each folio
    copy
  for each folio
    restore map

The total number of TLB flushing IPIs can be reduced considerably, and
a hardware accelerator such as DSA can be used to accelerate the folio
copying.
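
As a rough, user-space illustration of why batching helps (the unmap,
flush, copy and remap helpers below are stand-ins, not kernel APIs),
the per-folio scheme performs one TLB flush per folio while the
batched scheme needs only one flush for the whole batch:

  #include <stdio.h>

  #define NR_FOLIOS 512

  static int nr_flushes;

  static void unmap_one(int folio) { (void)folio; }  /* stand-in for unmap */
  static void flush_tlb(void) { nr_flushes++; }      /* stand-in for the IPI-based flush */
  static void copy_one(int folio) { (void)folio; }   /* stand-in for the folio copy */
  static void remap_one(int folio) { (void)folio; }  /* stand-in for restoring the map */

  int main(void)
  {
      int i;

      /* Per-folio scheme: one flush per folio. */
      nr_flushes = 0;
      for (i = 0; i < NR_FOLIOS; i++) {
          unmap_one(i);
          flush_tlb();
          copy_one(i);
          remap_one(i);
      }
      printf("per-folio flushes: %d\n", nr_flushes);

      /* Batched scheme: unmap all, flush once, then copy and remap. */
      nr_flushes = 0;
      for (i = 0; i < NR_FOLIOS; i++)
          unmap_one(i);
      flush_tlb();
      for (i = 0; i < NR_FOLIOS; i++)
          copy_one(i);
      for (i = 0; i < NR_FOLIOS; i++)
          remap_one(i);
      printf("batched flushes: %d\n", nr_flushes);

      return 0;
  }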

So in this patchset, we refactor the migrate_pages() implementation
and implement batched TLB flushing.  Based on this, hardware-accelerated
folio copying can be implemented.

If too many folios are passed to migrate_pages(), a naive batched
implementation may unmap too many folios at the same time.  That
increases the chance that a task has to wait for the migrated folios
to be mapped again, which may hurt latency.  To deal with this issue,
the number of folios unmapped in one batch is restricted to no more
than HPAGE_PMD_NR base pages.  That is, the impact is at the same
level as THP migration.

We use the following test to measure the performance impact of the
patchset,

On a 2-socket Intel server,

 - Run pmbench memory accessing benchmark

 - Run `migratepages` to migrate pages of pmbench between node 0 and
   node 1 back and forth.

With the patchset, the number of TLB flushing IPIs is reduced by 99.1%
during the test, and the number of pages migrated successfully per
second increases by 291.7%.

This patchset is based on v6.2-rc4.

Changes:

v4:

- Fixed another bug about non-LRU folio migration.  Thanks Hyeonggon!

v3:

- Rebased on v6.2-rc4

- Fixed a bug about non-LRU folio migration.  Thanks Mike!

- Fixed some comments.  Thanks Baolin!

- Collected reviewed-by.

v2:

- Rebased on v6.2-rc3

- Fixed a forced type cast warning.  Thanks Kees!

- Added more comments and cleaned up the code.  Thanks Andrew, Zi, Alistair, Dan!

- Collected reviewed-by.

from rfc to v1:

- Rebased on v6.2-rc1

- Fix the deadlock issue caused by locking multiple pages synchronously
  per Alistair's comments.  Thanks!

- Fix the autonumabench panic per Rao's comments and fix.  Thanks!

- Other minor fixes per comments. Thanks!

Best Regards,
Huang, Ying


* [PATCH -v4 1/9] migrate_pages: organize stats with struct migrate_pages_stats
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-07 16:28   ` haoxin
  2023-02-06  6:33 ` [PATCH -v4 2/9] migrate_pages: separate hugetlb folios migration Huang Ying
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Alistair Popple, Zi Yan,
	Baolin Wang, Yang Shi, Oscar Salvador, Matthew Wilcox,
	Bharata B Rao, haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

Define struct migrate_pages_stats to organize the various statistics
in migrate_pages().  This makes it easier to collect and consume the
statistics in multiple functions.  This will be needed in the
following patches in the series.
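
For readers unfamiliar with the pattern, here is a minimal user-space
sketch (the struct mirrors the layout added below, but the
account_one() helper and its caller are made up for illustration):

  #include <stdio.h>
  #include <string.h>

  /* Group related counters so callees can update them via one pointer. */
  struct migrate_stats {
      int nr_succeeded;      /* in units of base pages */
      int nr_failed_pages;   /* in units of base pages */
      int nr_thp_succeeded;
      int nr_thp_failed;
      int nr_thp_split;
  };

  /* A helper can account a result without a long list of counter arguments. */
  static void account_one(struct migrate_stats *stats, int ok, int nr_pages)
  {
      if (ok)
          stats->nr_succeeded += nr_pages;
      else
          stats->nr_failed_pages += nr_pages;
  }

  int main(void)
  {
      struct migrate_stats stats;

      memset(&stats, 0, sizeof(stats));
      account_one(&stats, 1, 4);   /* e.g. a 4-page folio migrated */
      account_one(&stats, 0, 1);   /* e.g. a base page that failed */
      printf("succeeded=%d failed=%d\n",
             stats.nr_succeeded, stats.nr_failed_pages);
      return 0;
  }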

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/migrate.c | 60 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index a4d3fc65085f..ef388a9e4747 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1396,6 +1396,16 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
 	return rc;
 }
 
+struct migrate_pages_stats {
+	int nr_succeeded;	/* Normal and large folios migrated successfully, in
+				   units of base pages */
+	int nr_failed_pages;	/* Normal and large folios failed to be migrated, in
+				   units of base pages.  Untried folios aren't counted */
+	int nr_thp_succeeded;	/* THP migrated successfully */
+	int nr_thp_failed;	/* THP failed to be migrated */
+	int nr_thp_split;	/* THP split before migrating */
+};
+
 /*
  * migrate_pages - migrate the folios specified in a list, to the free folios
  *		   supplied as the target for the page migration
@@ -1430,13 +1440,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	int large_retry = 1;
 	int thp_retry = 1;
 	int nr_failed = 0;
-	int nr_failed_pages = 0;
 	int nr_retry_pages = 0;
-	int nr_succeeded = 0;
-	int nr_thp_succeeded = 0;
 	int nr_large_failed = 0;
-	int nr_thp_failed = 0;
-	int nr_thp_split = 0;
 	int pass = 0;
 	bool is_large = false;
 	bool is_thp = false;
@@ -1446,9 +1451,11 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	LIST_HEAD(split_folios);
 	bool nosplit = (reason == MR_NUMA_MISPLACED);
 	bool no_split_folio_counting = false;
+	struct migrate_pages_stats stats;
 
 	trace_mm_migrate_pages_start(mode, reason);
 
+	memset(&stats, 0, sizeof(stats));
 split_folio_migration:
 	for (pass = 0; pass < 10 && (retry || large_retry); pass++) {
 		retry = 0;
@@ -1502,9 +1509,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				/* Large folio migration is unsupported */
 				if (is_large) {
 					nr_large_failed++;
-					nr_thp_failed += is_thp;
+					stats.nr_thp_failed += is_thp;
 					if (!try_split_folio(folio, &split_folios)) {
-						nr_thp_split += is_thp;
+						stats.nr_thp_split += is_thp;
 						break;
 					}
 				/* Hugetlb migration is unsupported */
@@ -1512,7 +1519,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 					nr_failed++;
 				}
 
-				nr_failed_pages += nr_pages;
+				stats.nr_failed_pages += nr_pages;
 				list_move_tail(&folio->lru, &ret_folios);
 				break;
 			case -ENOMEM:
@@ -1522,13 +1529,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				 */
 				if (is_large) {
 					nr_large_failed++;
-					nr_thp_failed += is_thp;
+					stats.nr_thp_failed += is_thp;
 					/* Large folio NUMA faulting doesn't split to retry. */
 					if (!nosplit) {
 						int ret = try_split_folio(folio, &split_folios);
 
 						if (!ret) {
-							nr_thp_split += is_thp;
+							stats.nr_thp_split += is_thp;
 							break;
 						} else if (reason == MR_LONGTERM_PIN &&
 							   ret == -EAGAIN) {
@@ -1546,7 +1553,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 					nr_failed++;
 				}
 
-				nr_failed_pages += nr_pages + nr_retry_pages;
+				stats.nr_failed_pages += nr_pages + nr_retry_pages;
 				/*
 				 * There might be some split folios of fail-to-migrate large
 				 * folios left in split_folios list. Move them back to migration
@@ -1556,7 +1563,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				list_splice_init(&split_folios, from);
 				/* nr_failed isn't updated for not used */
 				nr_large_failed += large_retry;
-				nr_thp_failed += thp_retry;
+				stats.nr_thp_failed += thp_retry;
 				goto out;
 			case -EAGAIN:
 				if (is_large) {
@@ -1568,8 +1575,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				nr_retry_pages += nr_pages;
 				break;
 			case MIGRATEPAGE_SUCCESS:
-				nr_succeeded += nr_pages;
-				nr_thp_succeeded += is_thp;
+				stats.nr_succeeded += nr_pages;
+				stats.nr_thp_succeeded += is_thp;
 				break;
 			default:
 				/*
@@ -1580,20 +1587,20 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				 */
 				if (is_large) {
 					nr_large_failed++;
-					nr_thp_failed += is_thp;
+					stats.nr_thp_failed += is_thp;
 				} else if (!no_split_folio_counting) {
 					nr_failed++;
 				}
 
-				nr_failed_pages += nr_pages;
+				stats.nr_failed_pages += nr_pages;
 				break;
 			}
 		}
 	}
 	nr_failed += retry;
 	nr_large_failed += large_retry;
-	nr_thp_failed += thp_retry;
-	nr_failed_pages += nr_retry_pages;
+	stats.nr_thp_failed += thp_retry;
+	stats.nr_failed_pages += nr_retry_pages;
 	/*
 	 * Try to migrate split folios of fail-to-migrate large folios, no
 	 * nr_failed counting in this round, since all split folios of a
@@ -1626,16 +1633,17 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	if (list_empty(from))
 		rc = 0;
 
-	count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
-	count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
-	count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
-	count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
-	count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
-	trace_mm_migrate_pages(nr_succeeded, nr_failed_pages, nr_thp_succeeded,
-			       nr_thp_failed, nr_thp_split, mode, reason);
+	count_vm_events(PGMIGRATE_SUCCESS, stats.nr_succeeded);
+	count_vm_events(PGMIGRATE_FAIL, stats.nr_failed_pages);
+	count_vm_events(THP_MIGRATION_SUCCESS, stats.nr_thp_succeeded);
+	count_vm_events(THP_MIGRATION_FAIL, stats.nr_thp_failed);
+	count_vm_events(THP_MIGRATION_SPLIT, stats.nr_thp_split);
+	trace_mm_migrate_pages(stats.nr_succeeded, stats.nr_failed_pages,
+			       stats.nr_thp_succeeded, stats.nr_thp_failed,
+			       stats.nr_thp_split, mode, reason);
 
 	if (ret_succeeded)
-		*ret_succeeded = nr_succeeded;
+		*ret_succeeded = stats.nr_succeeded;
 
 	return rc;
 }
-- 
2.35.1



* [PATCH -v4 2/9] migrate_pages: separate hugetlb folios migration
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
  2023-02-06  6:33 ` [PATCH -v4 1/9] migrate_pages: organize stats with struct migrate_pages_stats Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-07 16:42   ` haoxin
  2023-02-06  6:33 ` [PATCH -v4 3/9] migrate_pages: restrict number of pages to migrate in batch Huang Ying
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Baolin Wang, Zi Yan,
	Yang Shi, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz,
	Hyeonggon Yoo

This is a preparation patch for batching the folio unmapping and
moving of non-hugetlb folios.  Based on that, we can batch the TLB
shootdown during folio migration and make it possible to use a
hardware accelerator for the folio copying.

In this patch, the migration of hugetlb folios and non-hugetlb folios
is separated in migrate_pages() to make it easy to change the
non-hugetlb folio migration implementation.
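
A rough user-space sketch of the separation (the item type and
is_hugetlb flag are made up; the kernel code below operates on folio
lists and calls migrate_hugetlbs() for the first pass):

  #include <stdio.h>

  struct item {
      int id;
      int is_hugetlb;
  };

  static void migrate_hugetlb_item(struct item *it)
  {
      printf("hugetlb pass: item %d\n", it->id);
  }

  static void migrate_normal_item(struct item *it)
  {
      printf("normal pass:  item %d\n", it->id);
  }

  int main(void)
  {
      struct item items[] = { {0, 0}, {1, 1}, {2, 0}, {3, 1} };
      int i, n = sizeof(items) / sizeof(items[0]);

      /* First pass: handle only hugetlb items (cf. migrate_hugetlbs()). */
      for (i = 0; i < n; i++)
          if (items[i].is_hugetlb)
              migrate_hugetlb_item(&items[i]);

      /* Second pass: the remaining items, which later patches batch. */
      for (i = 0; i < n; i++)
          if (!items[i].is_hugetlb)
              migrate_normal_item(&items[i]);

      return 0;
  }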

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/migrate.c | 141 +++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 119 insertions(+), 22 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index ef388a9e4747..be7f37523463 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1396,6 +1396,8 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
 	return rc;
 }
 
+#define NR_MAX_MIGRATE_PAGES_RETRY	10
+
 struct migrate_pages_stats {
 	int nr_succeeded;	/* Normal and large folios migrated successfully, in
 				   units of base pages */
@@ -1406,6 +1408,95 @@ struct migrate_pages_stats {
 	int nr_thp_split;	/* THP split before migrating */
 };
 
+/*
+ * Returns the number of hugetlb folios that were not migrated, or an error code
+ * after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no hugetlb folios are movable
+ * any more because the list has become empty or no retryable hugetlb folios
+ * exist any more. It is caller's responsibility to call putback_movable_pages()
+ * only if ret != 0.
+ */
+static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
+			    free_page_t put_new_page, unsigned long private,
+			    enum migrate_mode mode, int reason,
+			    struct migrate_pages_stats *stats,
+			    struct list_head *ret_folios)
+{
+	int retry = 1;
+	int nr_failed = 0;
+	int nr_retry_pages = 0;
+	int pass = 0;
+	struct folio *folio, *folio2;
+	int rc, nr_pages;
+
+	for (pass = 0; pass < NR_MAX_MIGRATE_PAGES_RETRY && retry; pass++) {
+		retry = 0;
+		nr_retry_pages = 0;
+
+		list_for_each_entry_safe(folio, folio2, from, lru) {
+			if (!folio_test_hugetlb(folio))
+				continue;
+
+			nr_pages = folio_nr_pages(folio);
+
+			cond_resched();
+
+			rc = unmap_and_move_huge_page(get_new_page,
+						      put_new_page, private,
+						      &folio->page, pass > 2, mode,
+						      reason, ret_folios);
+			/*
+			 * The rules are:
+			 *	Success: hugetlb folio will be put back
+			 *	-EAGAIN: stay on the from list
+			 *	-ENOMEM: stay on the from list
+			 *	-ENOSYS: stay on the from list
+			 *	Other errno: put on ret_folios list
+			 */
+			switch(rc) {
+			case -ENOSYS:
+				/* Hugetlb migration is unsupported */
+				nr_failed++;
+				stats->nr_failed_pages += nr_pages;
+				list_move_tail(&folio->lru, ret_folios);
+				break;
+			case -ENOMEM:
+				/*
+				 * When memory is low, don't bother to try to migrate
+				 * other folios, just exit.
+				 */
+				stats->nr_failed_pages += nr_pages + nr_retry_pages;
+				return -ENOMEM;
+			case -EAGAIN:
+				retry++;
+				nr_retry_pages += nr_pages;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				stats->nr_succeeded += nr_pages;
+				break;
+			default:
+				/*
+				 * Permanent failure (-EBUSY, etc.):
+				 * unlike -EAGAIN case, the failed folio is
+				 * removed from migration folio list and not
+				 * retried in the next outer loop.
+				 */
+				nr_failed++;
+				stats->nr_failed_pages += nr_pages;
+				break;
+			}
+		}
+	}
+	/*
+	 * nr_failed is number of hugetlb folios failed to be migrated.  After
+	 * NR_MAX_MIGRATE_PAGES_RETRY attempts, give up and count retried hugetlb
+	 * folios as failed.
+	 */
+	nr_failed += retry;
+	stats->nr_failed_pages += nr_retry_pages;
+
+	return nr_failed;
+}
+
 /*
  * migrate_pages - migrate the folios specified in a list, to the free folios
  *		   supplied as the target for the page migration
@@ -1422,10 +1513,10 @@ struct migrate_pages_stats {
  * @ret_succeeded:	Set to the number of folios migrated successfully if
  *			the caller passes a non-NULL pointer.
  *
- * The function returns after 10 attempts or if no folios are movable any more
- * because the list has become empty or no retryable folios exist any more.
- * It is caller's responsibility to call putback_movable_pages() to return folios
- * to the LRU or free list only if ret != 0.
+ * The function returns after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no folios
+ * are movable any more because the list has become empty or no retryable folios
+ * exist any more. It is caller's responsibility to call putback_movable_pages()
+ * only if ret != 0.
  *
  * Returns the number of {normal folio, large folio, hugetlb} that were not
  * migrated, or an error code. The number of large folio splits will be
@@ -1439,7 +1530,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	int retry = 1;
 	int large_retry = 1;
 	int thp_retry = 1;
-	int nr_failed = 0;
+	int nr_failed;
 	int nr_retry_pages = 0;
 	int nr_large_failed = 0;
 	int pass = 0;
@@ -1456,38 +1547,45 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	trace_mm_migrate_pages_start(mode, reason);
 
 	memset(&stats, 0, sizeof(stats));
+	rc = migrate_hugetlbs(from, get_new_page, put_new_page, private, mode, reason,
+			      &stats, &ret_folios);
+	if (rc < 0)
+		goto out;
+	nr_failed = rc;
+
 split_folio_migration:
-	for (pass = 0; pass < 10 && (retry || large_retry); pass++) {
+	for (pass = 0;
+	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
+	     pass++) {
 		retry = 0;
 		large_retry = 0;
 		thp_retry = 0;
 		nr_retry_pages = 0;
 
 		list_for_each_entry_safe(folio, folio2, from, lru) {
+			/* Retried hugetlb folios will be kept in list  */
+			if (folio_test_hugetlb(folio)) {
+				list_move_tail(&folio->lru, &ret_folios);
+				continue;
+			}
+
 			/*
 			 * Large folio statistics is based on the source large
 			 * folio. Capture required information that might get
 			 * lost during migration.
 			 */
-			is_large = folio_test_large(folio) && !folio_test_hugetlb(folio);
+			is_large = folio_test_large(folio);
 			is_thp = is_large && folio_test_pmd_mappable(folio);
 			nr_pages = folio_nr_pages(folio);
+
 			cond_resched();
 
-			if (folio_test_hugetlb(folio))
-				rc = unmap_and_move_huge_page(get_new_page,
-						put_new_page, private,
-						&folio->page, pass > 2, mode,
-						reason,
-						&ret_folios);
-			else
-				rc = unmap_and_move(get_new_page, put_new_page,
-						private, folio, pass > 2, mode,
-						reason, &ret_folios);
+			rc = unmap_and_move(get_new_page, put_new_page,
+					    private, folio, pass > 2, mode,
+					    reason, &ret_folios);
 			/*
 			 * The rules are:
-			 *	Success: non hugetlb folio will be freed, hugetlb
-			 *		 folio will be put back
+			 *	Success: folio will be freed
 			 *	-EAGAIN: stay on the from list
 			 *	-ENOMEM: stay on the from list
 			 *	-ENOSYS: stay on the from list
@@ -1514,7 +1612,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 						stats.nr_thp_split += is_thp;
 						break;
 					}
-				/* Hugetlb migration is unsupported */
 				} else if (!no_split_folio_counting) {
 					nr_failed++;
 				}
@@ -1608,8 +1705,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	 */
 	if (!list_empty(&split_folios)) {
 		/*
-		 * Move non-migrated folios (after 10 retries) to ret_folios
-		 * to avoid migrating them again.
+		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
+		 * retries) to ret_folios to avoid migrating them again.
 		 */
 		list_splice_init(from, &ret_folios);
 		list_splice_init(&split_folios, from);
-- 
2.35.1



* [PATCH -v4 3/9] migrate_pages: restrict number of pages to migrate in batch
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
  2023-02-06  6:33 ` [PATCH -v4 1/9] migrate_pages: organize stats with struct migrate_pages_stats Huang Ying
  2023-02-06  6:33 ` [PATCH -v4 2/9] migrate_pages: separate hugetlb folios migration Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-07 17:01   ` haoxin
  2023-02-06  6:33 ` [PATCH -v4 4/9] migrate_pages: split unmap_and_move() to _unmap() and _move() Huang Ying
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Baolin Wang, Zi Yan,
	Yang Shi, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz,
	Hyeonggon Yoo

This is a preparation patch to batch the folio unmapping and moving
for non-hugetlb folios.

If we batched the folio unmapping, all folios to be migrated would be
unmapped before their contents and flags are copied.  If the folios
passed to migrate_pages() cover too many pages, the affected processes
would be stopped for too long, resulting in high latency.  For
example, the migrate_pages() syscall will call migrate_pages() with
all folios of a process.  To avoid this possible issue, in this patch
we restrict the number of pages migrated in one batch to no more than
HPAGE_PMD_NR.  That is, the impact is at the same level as THP
migration.
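
A minimal user-space sketch of the batching limit (the per-folio page
counts and the 512-page limit are illustrative assumptions; the kernel
code below works on folio lists and uses NR_MAX_BATCHED_MIGRATION,
i.e. HPAGE_PMD_NR when THP is configured):

  #include <stdio.h>

  #define MAX_BATCH_PAGES 512   /* stand-in for NR_MAX_BATCHED_MIGRATION */

  static void migrate_batch(int first, int last, int pages)
  {
      printf("batch: folios [%d, %d), %d pages\n", first, last, pages);
  }

  int main(void)
  {
      /* Per-folio page counts: base pages mixed with a few large folios. */
      int nr_pages[] = { 1, 1, 512, 1, 256, 256, 1, 1 };
      int n = sizeof(nr_pages) / sizeof(nr_pages[0]);
      int start = 0;

      while (start < n) {
          int pages = 0, i = start;

          /* Take folios while the batch stays within the page limit. */
          while (i < n && pages + nr_pages[i] <= MAX_BATCH_PAGES)
              pages += nr_pages[i++];

          /* A single folio larger than the limit still forms its own batch. */
          if (i == start)
              pages = nr_pages[i++];

          migrate_batch(start, i, pages);
          start = i;
      }
      return 0;
  }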

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/migrate.c | 174 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 106 insertions(+), 68 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index be7f37523463..9a667039c34c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1396,6 +1396,11 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
 	return rc;
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define NR_MAX_BATCHED_MIGRATION	HPAGE_PMD_NR
+#else
+#define NR_MAX_BATCHED_MIGRATION	512
+#endif
 #define NR_MAX_MIGRATE_PAGES_RETRY	10
 
 struct migrate_pages_stats {
@@ -1497,40 +1502,15 @@ static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
 	return nr_failed;
 }
 
-/*
- * migrate_pages - migrate the folios specified in a list, to the free folios
- *		   supplied as the target for the page migration
- *
- * @from:		The list of folios to be migrated.
- * @get_new_page:	The function used to allocate free folios to be used
- *			as the target of the folio migration.
- * @put_new_page:	The function used to free target folios if migration
- *			fails, or NULL if no special handling is necessary.
- * @private:		Private data to be passed on to get_new_page()
- * @mode:		The migration mode that specifies the constraints for
- *			folio migration, if any.
- * @reason:		The reason for folio migration.
- * @ret_succeeded:	Set to the number of folios migrated successfully if
- *			the caller passes a non-NULL pointer.
- *
- * The function returns after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no folios
- * are movable any more because the list has become empty or no retryable folios
- * exist any more. It is caller's responsibility to call putback_movable_pages()
- * only if ret != 0.
- *
- * Returns the number of {normal folio, large folio, hugetlb} that were not
- * migrated, or an error code. The number of large folio splits will be
- * considered as the number of non-migrated large folio, no matter how many
- * split folios of the large folio are migrated successfully.
- */
-int migrate_pages(struct list_head *from, new_page_t get_new_page,
+static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 		free_page_t put_new_page, unsigned long private,
-		enum migrate_mode mode, int reason, unsigned int *ret_succeeded)
+		enum migrate_mode mode, int reason, struct list_head *ret_folios,
+		struct migrate_pages_stats *stats)
 {
 	int retry = 1;
 	int large_retry = 1;
 	int thp_retry = 1;
-	int nr_failed;
+	int nr_failed = 0;
 	int nr_retry_pages = 0;
 	int nr_large_failed = 0;
 	int pass = 0;
@@ -1538,20 +1518,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	bool is_thp = false;
 	struct folio *folio, *folio2;
 	int rc, nr_pages;
-	LIST_HEAD(ret_folios);
 	LIST_HEAD(split_folios);
 	bool nosplit = (reason == MR_NUMA_MISPLACED);
 	bool no_split_folio_counting = false;
-	struct migrate_pages_stats stats;
-
-	trace_mm_migrate_pages_start(mode, reason);
-
-	memset(&stats, 0, sizeof(stats));
-	rc = migrate_hugetlbs(from, get_new_page, put_new_page, private, mode, reason,
-			      &stats, &ret_folios);
-	if (rc < 0)
-		goto out;
-	nr_failed = rc;
 
 split_folio_migration:
 	for (pass = 0;
@@ -1563,12 +1532,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 		nr_retry_pages = 0;
 
 		list_for_each_entry_safe(folio, folio2, from, lru) {
-			/* Retried hugetlb folios will be kept in list  */
-			if (folio_test_hugetlb(folio)) {
-				list_move_tail(&folio->lru, &ret_folios);
-				continue;
-			}
-
 			/*
 			 * Large folio statistics is based on the source large
 			 * folio. Capture required information that might get
@@ -1582,15 +1545,14 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 
 			rc = unmap_and_move(get_new_page, put_new_page,
 					    private, folio, pass > 2, mode,
-					    reason, &ret_folios);
+					    reason, ret_folios);
 			/*
 			 * The rules are:
 			 *	Success: folio will be freed
 			 *	-EAGAIN: stay on the from list
 			 *	-ENOMEM: stay on the from list
 			 *	-ENOSYS: stay on the from list
-			 *	Other errno: put on ret_folios list then splice to
-			 *		     from list
+			 *	Other errno: put on ret_folios list
 			 */
 			switch(rc) {
 			/*
@@ -1607,17 +1569,17 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				/* Large folio migration is unsupported */
 				if (is_large) {
 					nr_large_failed++;
-					stats.nr_thp_failed += is_thp;
+					stats->nr_thp_failed += is_thp;
 					if (!try_split_folio(folio, &split_folios)) {
-						stats.nr_thp_split += is_thp;
+						stats->nr_thp_split += is_thp;
 						break;
 					}
 				} else if (!no_split_folio_counting) {
 					nr_failed++;
 				}
 
-				stats.nr_failed_pages += nr_pages;
-				list_move_tail(&folio->lru, &ret_folios);
+				stats->nr_failed_pages += nr_pages;
+				list_move_tail(&folio->lru, ret_folios);
 				break;
 			case -ENOMEM:
 				/*
@@ -1626,13 +1588,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				 */
 				if (is_large) {
 					nr_large_failed++;
-					stats.nr_thp_failed += is_thp;
+					stats->nr_thp_failed += is_thp;
 					/* Large folio NUMA faulting doesn't split to retry. */
 					if (!nosplit) {
 						int ret = try_split_folio(folio, &split_folios);
 
 						if (!ret) {
-							stats.nr_thp_split += is_thp;
+							stats->nr_thp_split += is_thp;
 							break;
 						} else if (reason == MR_LONGTERM_PIN &&
 							   ret == -EAGAIN) {
@@ -1650,17 +1612,17 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 					nr_failed++;
 				}
 
-				stats.nr_failed_pages += nr_pages + nr_retry_pages;
+				stats->nr_failed_pages += nr_pages + nr_retry_pages;
 				/*
 				 * There might be some split folios of fail-to-migrate large
-				 * folios left in split_folios list. Move them back to migration
+				 * folios left in split_folios list. Move them to ret_folios
 				 * list so that they could be put back to the right list by
 				 * the caller otherwise the folio refcnt will be leaked.
 				 */
-				list_splice_init(&split_folios, from);
+				list_splice_init(&split_folios, ret_folios);
 				/* nr_failed isn't updated for not used */
 				nr_large_failed += large_retry;
-				stats.nr_thp_failed += thp_retry;
+				stats->nr_thp_failed += thp_retry;
 				goto out;
 			case -EAGAIN:
 				if (is_large) {
@@ -1672,8 +1634,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				nr_retry_pages += nr_pages;
 				break;
 			case MIGRATEPAGE_SUCCESS:
-				stats.nr_succeeded += nr_pages;
-				stats.nr_thp_succeeded += is_thp;
+				stats->nr_succeeded += nr_pages;
+				stats->nr_thp_succeeded += is_thp;
 				break;
 			default:
 				/*
@@ -1684,20 +1646,20 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				 */
 				if (is_large) {
 					nr_large_failed++;
-					stats.nr_thp_failed += is_thp;
+					stats->nr_thp_failed += is_thp;
 				} else if (!no_split_folio_counting) {
 					nr_failed++;
 				}
 
-				stats.nr_failed_pages += nr_pages;
+				stats->nr_failed_pages += nr_pages;
 				break;
 			}
 		}
 	}
 	nr_failed += retry;
 	nr_large_failed += large_retry;
-	stats.nr_thp_failed += thp_retry;
-	stats.nr_failed_pages += nr_retry_pages;
+	stats->nr_thp_failed += thp_retry;
+	stats->nr_failed_pages += nr_retry_pages;
 	/*
 	 * Try to migrate split folios of fail-to-migrate large folios, no
 	 * nr_failed counting in this round, since all split folios of a
@@ -1708,7 +1670,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
 		 * retries) to ret_folios to avoid migrating them again.
 		 */
-		list_splice_init(from, &ret_folios);
+		list_splice_init(from, ret_folios);
 		list_splice_init(&split_folios, from);
 		no_split_folio_counting = true;
 		retry = 1;
@@ -1716,6 +1678,82 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	}
 
 	rc = nr_failed + nr_large_failed;
+out:
+	return rc;
+}
+
+/*
+ * migrate_pages - migrate the folios specified in a list, to the free folios
+ *		   supplied as the target for the page migration
+ *
+ * @from:		The list of folios to be migrated.
+ * @get_new_page:	The function used to allocate free folios to be used
+ *			as the target of the folio migration.
+ * @put_new_page:	The function used to free target folios if migration
+ *			fails, or NULL if no special handling is necessary.
+ * @private:		Private data to be passed on to get_new_page()
+ * @mode:		The migration mode that specifies the constraints for
+ *			folio migration, if any.
+ * @reason:		The reason for folio migration.
+ * @ret_succeeded:	Set to the number of folios migrated successfully if
+ *			the caller passes a non-NULL pointer.
+ *
+ * The function returns after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no folios
+ * are movable any more because the list has become empty or no retryable folios
+ * exist any more. It is caller's responsibility to call putback_movable_pages()
+ * only if ret != 0.
+ *
+ * Returns the number of {normal folio, large folio, hugetlb} that were not
+ * migrated, or an error code. The number of large folio splits will be
+ * considered as the number of non-migrated large folio, no matter how many
+ * split folios of the large folio are migrated successfully.
+ */
+int migrate_pages(struct list_head *from, new_page_t get_new_page,
+		free_page_t put_new_page, unsigned long private,
+		enum migrate_mode mode, int reason, unsigned int *ret_succeeded)
+{
+	int rc, rc_gather;
+	int nr_pages;
+	struct folio *folio, *folio2;
+	LIST_HEAD(folios);
+	LIST_HEAD(ret_folios);
+	struct migrate_pages_stats stats;
+
+	trace_mm_migrate_pages_start(mode, reason);
+
+	memset(&stats, 0, sizeof(stats));
+
+	rc_gather = migrate_hugetlbs(from, get_new_page, put_new_page, private,
+				     mode, reason, &stats, &ret_folios);
+	if (rc_gather < 0)
+		goto out;
+again:
+	nr_pages = 0;
+	list_for_each_entry_safe(folio, folio2, from, lru) {
+		/* Retried hugetlb folios will be kept in list  */
+		if (folio_test_hugetlb(folio)) {
+			list_move_tail(&folio->lru, &ret_folios);
+			continue;
+		}
+
+		nr_pages += folio_nr_pages(folio);
+		if (nr_pages > NR_MAX_BATCHED_MIGRATION)
+			break;
+	}
+	if (nr_pages > NR_MAX_BATCHED_MIGRATION)
+		list_cut_before(&folios, from, &folio->lru);
+	else
+		list_splice_init(from, &folios);
+	rc = migrate_pages_batch(&folios, get_new_page, put_new_page, private,
+				 mode, reason, &ret_folios, &stats);
+	list_splice_tail_init(&folios, &ret_folios);
+	if (rc < 0) {
+		rc_gather = rc;
+		goto out;
+	}
+	rc_gather += rc;
+	if (!list_empty(from))
+		goto again;
 out:
 	/*
 	 * Put the permanent failure folio back to migration list, they
@@ -1728,7 +1766,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	 * are migrated successfully.
 	 */
 	if (list_empty(from))
-		rc = 0;
+		rc_gather = 0;
 
 	count_vm_events(PGMIGRATE_SUCCESS, stats.nr_succeeded);
 	count_vm_events(PGMIGRATE_FAIL, stats.nr_failed_pages);
@@ -1742,7 +1780,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	if (ret_succeeded)
 		*ret_succeeded = stats.nr_succeeded;
 
-	return rc;
+	return rc_gather;
 }
 
 struct page *alloc_migration_target(struct page *page, unsigned long private)
-- 
2.35.1



* [PATCH -v4 4/9] migrate_pages: split unmap_and_move() to _unmap() and _move()
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
                   ` (2 preceding siblings ...)
  2023-02-06  6:33 ` [PATCH -v4 3/9] migrate_pages: restrict number of pages to migrate in batch Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-07 17:11   ` haoxin
  2023-02-06  6:33 ` [PATCH -v4 5/9] migrate_pages: batch _unmap and _move Huang Ying
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Baolin Wang, Zi Yan,
	Yang Shi, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz,
	Hyeonggon Yoo

This is a preparation patch to batch the folio unmapping and moving.

In this patch, unmap_and_move() is split into migrate_folio_unmap()
and migrate_folio_move(), so that we can batch _unmap() and _move() in
different loops later.  To pass some information between unmap and
move, the otherwise unused dst->mapping and dst->private fields of the
destination folio are used.
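
A user-space analogue of the record/extract pattern (the struct and
field names here are invented; the kernel code below stashes the
values in the destination folio's mapping and private fields):

  #include <stdio.h>

  /* Two otherwise-unused pointer-sized fields, like dst->mapping/private. */
  struct dst_state {
      void *slot0;
      void *slot1;
  };

  struct anon_ctx { int value; };

  /* Stash state produced by the unmap phase in the destination object. */
  static void record_state(struct dst_state *dst, unsigned long page_was_mapped,
                           struct anon_ctx *ctx)
  {
      dst->slot0 = (void *)ctx;
      dst->slot1 = (void *)page_was_mapped;
  }

  /* Recover (and clear) that state at the start of the move phase. */
  static void extract_state(struct dst_state *dst, unsigned long *page_was_mapped,
                            struct anon_ctx **ctx)
  {
      *ctx = dst->slot0;
      *page_was_mapped = (unsigned long)dst->slot1;
      dst->slot0 = NULL;
      dst->slot1 = NULL;
  }

  int main(void)
  {
      struct dst_state dst = { NULL, NULL };
      struct anon_ctx ctx = { 42 };
      struct anon_ctx *out;
      unsigned long mapped;

      record_state(&dst, 1, &ctx);
      /* ... later, in a separate loop ... */
      extract_state(&dst, &mapped, &out);
      printf("page_was_mapped=%lu value=%d\n", mapped, out->value);
      return 0;
  }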

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 include/linux/migrate.h |   1 +
 mm/migrate.c            | 170 ++++++++++++++++++++++++++++++----------
 2 files changed, 130 insertions(+), 41 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 3ef77f52a4f0..7376074f2e1e 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -18,6 +18,7 @@ struct migration_target_control;
  * - zero on page migration success;
  */
 #define MIGRATEPAGE_SUCCESS		0
+#define MIGRATEPAGE_UNMAP		1
 
 /**
  * struct movable_operations - Driver page migration
diff --git a/mm/migrate.c b/mm/migrate.c
index 9a667039c34c..0428449149f4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1009,11 +1009,53 @@ static int move_to_new_folio(struct folio *dst, struct folio *src,
 	return rc;
 }
 
-static int __unmap_and_move(struct folio *src, struct folio *dst,
+/*
+ * To record some information during migration, we uses some unused
+ * fields (mapping and private) of struct folio of the newly allocated
+ * destination folio.  This is safe because nobody is using them
+ * except us.
+ */
+static void __migrate_folio_record(struct folio *dst,
+				   unsigned long page_was_mapped,
+				   struct anon_vma *anon_vma)
+{
+	dst->mapping = (void *)anon_vma;
+	dst->private = (void *)page_was_mapped;
+}
+
+static void __migrate_folio_extract(struct folio *dst,
+				   int *page_was_mappedp,
+				   struct anon_vma **anon_vmap)
+{
+	*anon_vmap = (void *)dst->mapping;
+	*page_was_mappedp = (unsigned long)dst->private;
+	dst->mapping = NULL;
+	dst->private = NULL;
+}
+
+/* Cleanup src folio upon migration success */
+static void migrate_folio_done(struct folio *src,
+			       enum migrate_reason reason)
+{
+	/*
+	 * Compaction can migrate also non-LRU pages which are
+	 * not accounted to NR_ISOLATED_*. They can be recognized
+	 * as __PageMovable
+	 */
+	if (likely(!__folio_test_movable(src)))
+		mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
+				    folio_is_file_lru(src), -folio_nr_pages(src));
+
+	if (reason != MR_MEMORY_FAILURE)
+		/* We release the page in page_handle_poison. */
+		folio_put(src);
+}
+
+static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
 				int force, enum migrate_mode mode)
 {
 	int rc = -EAGAIN;
-	bool page_was_mapped = false;
+	int page_was_mapped = 0;
 	struct anon_vma *anon_vma = NULL;
 	bool is_lru = !__PageMovable(&src->page);
 
@@ -1089,8 +1131,8 @@ static int __unmap_and_move(struct folio *src, struct folio *dst,
 		goto out_unlock;
 
 	if (unlikely(!is_lru)) {
-		rc = move_to_new_folio(dst, src, mode);
-		goto out_unlock_both;
+		__migrate_folio_record(dst, page_was_mapped, anon_vma);
+		return MIGRATEPAGE_UNMAP;
 	}
 
 	/*
@@ -1115,11 +1157,42 @@ static int __unmap_and_move(struct folio *src, struct folio *dst,
 		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
 			       !folio_test_ksm(src) && !anon_vma, src);
 		try_to_migrate(src, 0);
-		page_was_mapped = true;
+		page_was_mapped = 1;
 	}
 
-	if (!folio_mapped(src))
-		rc = move_to_new_folio(dst, src, mode);
+	if (!folio_mapped(src)) {
+		__migrate_folio_record(dst, page_was_mapped, anon_vma);
+		return MIGRATEPAGE_UNMAP;
+	}
+
+	if (page_was_mapped)
+		remove_migration_ptes(src, src, false);
+
+out_unlock_both:
+	folio_unlock(dst);
+out_unlock:
+	/* Drop an anon_vma reference if we took one */
+	if (anon_vma)
+		put_anon_vma(anon_vma);
+	folio_unlock(src);
+out:
+
+	return rc;
+}
+
+static int __migrate_folio_move(struct folio *src, struct folio *dst,
+				enum migrate_mode mode)
+{
+	int rc;
+	int page_was_mapped = 0;
+	struct anon_vma *anon_vma = NULL;
+	bool is_lru = !__PageMovable(&src->page);
+
+	__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
+
+	rc = move_to_new_folio(dst, src, mode);
+	if (unlikely(!is_lru))
+		goto out_unlock_both;
 
 	/*
 	 * When successful, push dst to LRU immediately: so that if it
@@ -1142,12 +1215,10 @@ static int __unmap_and_move(struct folio *src, struct folio *dst,
 
 out_unlock_both:
 	folio_unlock(dst);
-out_unlock:
 	/* Drop an anon_vma reference if we took one */
 	if (anon_vma)
 		put_anon_vma(anon_vma);
 	folio_unlock(src);
-out:
 	/*
 	 * If migration is successful, decrease refcount of dst,
 	 * which will not free the page because new page owner increased
@@ -1159,19 +1230,15 @@ static int __unmap_and_move(struct folio *src, struct folio *dst,
 	return rc;
 }
 
-/*
- * Obtain the lock on folio, remove all ptes and migrate the folio
- * to the newly allocated folio in dst.
- */
-static int unmap_and_move(new_page_t get_new_page,
-				   free_page_t put_new_page,
-				   unsigned long private, struct folio *src,
-				   int force, enum migrate_mode mode,
-				   enum migrate_reason reason,
-				   struct list_head *ret)
+/* Obtain the lock on page, remove all ptes. */
+static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
+			       unsigned long private, struct folio *src,
+			       struct folio **dstp, int force,
+			       enum migrate_mode mode, enum migrate_reason reason,
+			       struct list_head *ret)
 {
 	struct folio *dst;
-	int rc = MIGRATEPAGE_SUCCESS;
+	int rc = MIGRATEPAGE_UNMAP;
 	struct page *newpage = NULL;
 
 	if (!thp_migration_supported() && folio_test_transhuge(src))
@@ -1182,20 +1249,50 @@ static int unmap_and_move(new_page_t get_new_page,
 		folio_clear_active(src);
 		folio_clear_unevictable(src);
 		/* free_pages_prepare() will clear PG_isolated. */
-		goto out;
+		list_del(&src->lru);
+		migrate_folio_done(src, reason);
+		return MIGRATEPAGE_SUCCESS;
 	}
 
 	newpage = get_new_page(&src->page, private);
 	if (!newpage)
 		return -ENOMEM;
 	dst = page_folio(newpage);
+	*dstp = dst;
 
 	dst->private = NULL;
-	rc = __unmap_and_move(src, dst, force, mode);
+	rc = __migrate_folio_unmap(src, dst, force, mode);
+	if (rc == MIGRATEPAGE_UNMAP)
+		return rc;
+
+	/*
+	 * A page that has not been migrated will have kept its
+	 * references and be restored.
+	 */
+	/* restore the folio to right list. */
+	if (rc != -EAGAIN)
+		list_move_tail(&src->lru, ret);
+
+	if (put_new_page)
+		put_new_page(&dst->page, private);
+	else
+		folio_put(dst);
+
+	return rc;
+}
+
+/* Migrate the folio to the newly allocated folio in dst. */
+static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
+			      struct folio *src, struct folio *dst,
+			      enum migrate_mode mode, enum migrate_reason reason,
+			      struct list_head *ret)
+{
+	int rc;
+
+	rc = __migrate_folio_move(src, dst, mode);
 	if (rc == MIGRATEPAGE_SUCCESS)
 		set_page_owner_migrate_reason(&dst->page, reason);
 
-out:
 	if (rc != -EAGAIN) {
 		/*
 		 * A folio that has been migrated has all references
@@ -1211,20 +1308,7 @@ static int unmap_and_move(new_page_t get_new_page,
 	 * we want to retry.
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
-		/*
-		 * Compaction can migrate also non-LRU folios which are
-		 * not accounted to NR_ISOLATED_*. They can be recognized
-		 * as __folio_test_movable
-		 */
-		if (likely(!__folio_test_movable(src)))
-			mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
-					folio_is_file_lru(src), -folio_nr_pages(src));
-
-		if (reason != MR_MEMORY_FAILURE)
-			/*
-			 * We release the folio in page_handle_poison.
-			 */
-			folio_put(src);
+		migrate_folio_done(src, reason);
 	} else {
 		if (rc != -EAGAIN)
 			list_add_tail(&src->lru, ret);
@@ -1516,7 +1600,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	int pass = 0;
 	bool is_large = false;
 	bool is_thp = false;
-	struct folio *folio, *folio2;
+	struct folio *folio, *folio2, *dst = NULL;
 	int rc, nr_pages;
 	LIST_HEAD(split_folios);
 	bool nosplit = (reason == MR_NUMA_MISPLACED);
@@ -1543,9 +1627,13 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 
 			cond_resched();
 
-			rc = unmap_and_move(get_new_page, put_new_page,
-					    private, folio, pass > 2, mode,
-					    reason, ret_folios);
+			rc = migrate_folio_unmap(get_new_page, put_new_page, private,
+						 folio, &dst, pass > 2, mode,
+						 reason, ret_folios);
+			if (rc == MIGRATEPAGE_UNMAP)
+				rc = migrate_folio_move(put_new_page, private,
+							folio, dst, mode,
+							reason, ret_folios);
 			/*
 			 * The rules are:
 			 *	Success: folio will be freed
-- 
2.35.1



* [PATCH -v4 5/9] migrate_pages: batch _unmap and _move
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
                   ` (3 preceding siblings ...)
  2023-02-06  6:33 ` [PATCH -v4 4/9] migrate_pages: split unmap_and_move() to _unmap() and _move() Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-06 16:10   ` Zi Yan
  2023-02-07 17:33   ` haoxin
  2023-02-06  6:33 ` [PATCH -v4 6/9] migrate_pages: move migrate_folio_unmap() Huang Ying
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Hyeonggon Yoo, Zi Yan,
	Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox,
	Bharata B Rao, Alistair Popple, haoxin, Minchan Kim,
	Mike Kravetz

In this patch, the _unmap and _move stages of the folio migration are
batched.  That is, previously it was,

  for each folio
    _unmap()
    _move()

Now, it is,

  for each folio
    _unmap()
  for each folio
    _move()

Based on this, we can batch the TLB flushing and use a hardware
accelerator to copy folios between the batched _unmap and batched
_move stages.
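
A minimal user-space sketch of the two-phase structure (parallel
arrays stand in for the unmap_folios/dst_folios lists kept by the
kernel code below; the helpers are stand-ins, not kernel APIs):

  #include <stdio.h>

  #define NR 4

  struct folio_stub { int id; };

  /* Phase 1: unmap a source folio and remember its destination. */
  static void unmap_one(struct folio_stub *src, struct folio_stub *dst)
  {
      dst->id = src->id + 100;   /* pretend a destination was allocated */
      printf("unmap src %d -> dst %d\n", src->id, dst->id);
  }

  /* Phase 2: move (copy and remap) a pair collected in phase 1. */
  static void move_one(struct folio_stub *src, struct folio_stub *dst)
  {
      printf("move  src %d -> dst %d\n", src->id, dst->id);
  }

  int main(void)
  {
      struct folio_stub src[NR] = { {0}, {1}, {2}, {3} };
      struct folio_stub dst[NR];
      int i;

      for (i = 0; i < NR; i++)
          unmap_one(&src[i], &dst[i]);

      /* A single batched TLB flush (and a batched copy) can go here. */

      for (i = 0; i < NR; i++)
          move_one(&src[i], &dst[i]);

      return 0;
  }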

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/migrate.c | 208 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 184 insertions(+), 24 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 0428449149f4..fa7212330cb6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1033,6 +1033,33 @@ static void __migrate_folio_extract(struct folio *dst,
 	dst->private = NULL;
 }
 
+/* Restore the source folio to the original state upon failure */
+static void migrate_folio_undo_src(struct folio *src,
+				   int page_was_mapped,
+				   struct anon_vma *anon_vma,
+				   struct list_head *ret)
+{
+	if (page_was_mapped)
+		remove_migration_ptes(src, src, false);
+	/* Drop an anon_vma reference if we took one */
+	if (anon_vma)
+		put_anon_vma(anon_vma);
+	folio_unlock(src);
+	list_move_tail(&src->lru, ret);
+}
+
+/* Restore the destination folio to the original state upon failure */
+static void migrate_folio_undo_dst(struct folio *dst,
+				   free_page_t put_new_page,
+				   unsigned long private)
+{
+	folio_unlock(dst);
+	if (put_new_page)
+		put_new_page(&dst->page, private);
+	else
+		folio_put(dst);
+}
+
 /* Cleanup src folio upon migration success */
 static void migrate_folio_done(struct folio *src,
 			       enum migrate_reason reason)
@@ -1052,7 +1079,7 @@ static void migrate_folio_done(struct folio *src,
 }
 
 static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
-				int force, enum migrate_mode mode)
+				 int force, bool force_lock, enum migrate_mode mode)
 {
 	int rc = -EAGAIN;
 	int page_was_mapped = 0;
@@ -1079,6 +1106,17 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
 		if (current->flags & PF_MEMALLOC)
 			goto out;
 
+		/*
+		 * We have locked some folios, to avoid deadlock, we cannot
+		 * lock the folio synchronously.  Go out to process (and
+		 * unlock) all the locked folios.  Then we can lock the folio
+		 * synchronously.
+		 */
+		if (!force_lock) {
+			rc = -EDEADLOCK;
+			goto out;
+		}
+
 		folio_lock(src);
 	}
 
@@ -1187,10 +1225,20 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
 	int page_was_mapped = 0;
 	struct anon_vma *anon_vma = NULL;
 	bool is_lru = !__PageMovable(&src->page);
+	struct list_head *prev;
 
 	__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
+	prev = dst->lru.prev;
+	list_del(&dst->lru);
 
 	rc = move_to_new_folio(dst, src, mode);
+
+	if (rc == -EAGAIN) {
+		list_add(&dst->lru, prev);
+		__migrate_folio_record(dst, page_was_mapped, anon_vma);
+		return rc;
+	}
+
 	if (unlikely(!is_lru))
 		goto out_unlock_both;
 
@@ -1233,7 +1281,7 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
 /* Obtain the lock on page, remove all ptes. */
 static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
 			       unsigned long private, struct folio *src,
-			       struct folio **dstp, int force,
+			       struct folio **dstp, int force, bool force_lock,
 			       enum migrate_mode mode, enum migrate_reason reason,
 			       struct list_head *ret)
 {
@@ -1261,7 +1309,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
 	*dstp = dst;
 
 	dst->private = NULL;
-	rc = __migrate_folio_unmap(src, dst, force, mode);
+	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
 	if (rc == MIGRATEPAGE_UNMAP)
 		return rc;
 
@@ -1270,7 +1318,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
 	 * references and be restored.
 	 */
 	/* restore the folio to right list. */
-	if (rc != -EAGAIN)
+	if (rc != -EAGAIN && rc != -EDEADLOCK)
 		list_move_tail(&src->lru, ret);
 
 	if (put_new_page)
@@ -1309,9 +1357,8 @@ static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
 		migrate_folio_done(src, reason);
-	} else {
-		if (rc != -EAGAIN)
-			list_add_tail(&src->lru, ret);
+	} else if (rc != -EAGAIN) {
+		list_add_tail(&src->lru, ret);
 
 		if (put_new_page)
 			put_new_page(&dst->page, private);
@@ -1591,7 +1638,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 		enum migrate_mode mode, int reason, struct list_head *ret_folios,
 		struct migrate_pages_stats *stats)
 {
-	int retry = 1;
+	int retry;
 	int large_retry = 1;
 	int thp_retry = 1;
 	int nr_failed = 0;
@@ -1600,13 +1647,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	int pass = 0;
 	bool is_large = false;
 	bool is_thp = false;
-	struct folio *folio, *folio2, *dst = NULL;
-	int rc, nr_pages;
+	struct folio *folio, *folio2, *dst = NULL, *dst2;
+	int rc, rc_saved, nr_pages;
 	LIST_HEAD(split_folios);
+	LIST_HEAD(unmap_folios);
+	LIST_HEAD(dst_folios);
 	bool nosplit = (reason == MR_NUMA_MISPLACED);
 	bool no_split_folio_counting = false;
+	bool force_lock;
 
-split_folio_migration:
+retry:
+	rc_saved = 0;
+	force_lock = true;
+	retry = 1;
 	for (pass = 0;
 	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
 	     pass++) {
@@ -1628,16 +1681,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 			cond_resched();
 
 			rc = migrate_folio_unmap(get_new_page, put_new_page, private,
-						 folio, &dst, pass > 2, mode,
-						 reason, ret_folios);
-			if (rc == MIGRATEPAGE_UNMAP)
-				rc = migrate_folio_move(put_new_page, private,
-							folio, dst, mode,
-							reason, ret_folios);
+						 folio, &dst, pass > 2, force_lock,
+						 mode, reason, ret_folios);
 			/*
 			 * The rules are:
 			 *	Success: folio will be freed
+			 *	Unmap: folio will be put on unmap_folios list,
+			 *	       dst folio put on dst_folios list
 			 *	-EAGAIN: stay on the from list
+			 *	-EDEADLOCK: stay on the from list
 			 *	-ENOMEM: stay on the from list
 			 *	-ENOSYS: stay on the from list
 			 *	Other errno: put on ret_folios list
@@ -1672,7 +1724,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 			case -ENOMEM:
 				/*
 				 * When memory is low, don't bother to try to migrate
-				 * other folios, just exit.
+				 * other folios, move unmapped folios, then exit.
 				 */
 				if (is_large) {
 					nr_large_failed++;
@@ -1711,7 +1763,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 				/* nr_failed isn't updated for not used */
 				nr_large_failed += large_retry;
 				stats->nr_thp_failed += thp_retry;
-				goto out;
+				rc_saved = rc;
+				if (list_empty(&unmap_folios))
+					goto out;
+				else
+					goto move;
+			case -EDEADLOCK:
+				/*
+				 * The folio cannot be locked for potential deadlock.
+				 * Go move (and unlock) all locked folios.  Then we can
+				 * try again.
+				 */
+				rc_saved = rc;
+				goto move;
 			case -EAGAIN:
 				if (is_large) {
 					large_retry++;
@@ -1725,6 +1789,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 				stats->nr_succeeded += nr_pages;
 				stats->nr_thp_succeeded += is_thp;
 				break;
+			case MIGRATEPAGE_UNMAP:
+				/*
+				 * We have locked some folios, don't force lock
+				 * to avoid deadlock.
+				 */
+				force_lock = false;
+				list_move_tail(&folio->lru, &unmap_folios);
+				list_add_tail(&dst->lru, &dst_folios);
+				break;
 			default:
 				/*
 				 * Permanent failure (-EBUSY, etc.):
@@ -1748,12 +1821,95 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	nr_large_failed += large_retry;
 	stats->nr_thp_failed += thp_retry;
 	stats->nr_failed_pages += nr_retry_pages;
+move:
+	retry = 1;
+	for (pass = 0;
+	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
+	     pass++) {
+		retry = 0;
+		large_retry = 0;
+		thp_retry = 0;
+		nr_retry_pages = 0;
+
+		dst = list_first_entry(&dst_folios, struct folio, lru);
+		dst2 = list_next_entry(dst, lru);
+		list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
+			is_large = folio_test_large(folio);
+			is_thp = is_large && folio_test_pmd_mappable(folio);
+			nr_pages = folio_nr_pages(folio);
+
+			cond_resched();
+
+			rc = migrate_folio_move(put_new_page, private,
+						folio, dst, mode,
+						reason, ret_folios);
+			/*
+			 * The rules are:
+			 *	Success: folio will be freed
+			 *	-EAGAIN: stay on the unmap_folios list
+			 *	Other errno: put on ret_folios list
+			 */
+			switch(rc) {
+			case -EAGAIN:
+				if (is_large) {
+					large_retry++;
+					thp_retry += is_thp;
+				} else if (!no_split_folio_counting) {
+					retry++;
+				}
+				nr_retry_pages += nr_pages;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				stats->nr_succeeded += nr_pages;
+				stats->nr_thp_succeeded += is_thp;
+				break;
+			default:
+				if (is_large) {
+					nr_large_failed++;
+					stats->nr_thp_failed += is_thp;
+				} else if (!no_split_folio_counting) {
+					nr_failed++;
+				}
+
+				stats->nr_failed_pages += nr_pages;
+				break;
+			}
+			dst = dst2;
+			dst2 = list_next_entry(dst, lru);
+		}
+	}
+	nr_failed += retry;
+	nr_large_failed += large_retry;
+	stats->nr_thp_failed += thp_retry;
+	stats->nr_failed_pages += nr_retry_pages;
+
+	if (rc_saved)
+		rc = rc_saved;
+	else
+		rc = nr_failed + nr_large_failed;
+out:
+	/* Cleanup remaining folios */
+	dst = list_first_entry(&dst_folios, struct folio, lru);
+	dst2 = list_next_entry(dst, lru);
+	list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
+		int page_was_mapped = 0;
+		struct anon_vma *anon_vma = NULL;
+
+		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
+		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
+				       ret_folios);
+		list_del(&dst->lru);
+		migrate_folio_undo_dst(dst, put_new_page, private);
+		dst = dst2;
+		dst2 = list_next_entry(dst, lru);
+	}
+
 	/*
 	 * Try to migrate split folios of fail-to-migrate large folios, no
 	 * nr_failed counting in this round, since all split folios of a
 	 * large folio is counted as 1 failure in the first round.
 	 */
-	if (!list_empty(&split_folios)) {
+	if (rc >= 0 && !list_empty(&split_folios)) {
 		/*
 		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
 		 * retries) to ret_folios to avoid migrating them again.
@@ -1761,12 +1917,16 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 		list_splice_init(from, ret_folios);
 		list_splice_init(&split_folios, from);
 		no_split_folio_counting = true;
-		retry = 1;
-		goto split_folio_migration;
+		goto retry;
 	}
 
-	rc = nr_failed + nr_large_failed;
-out:
+	/*
+	 * We have unlocked all locked folios, so we can force lock now, let's
+	 * try again.
+	 */
+	if (rc == -EDEADLOCK)
+		goto retry;
+
 	return rc;
 }
 
-- 
2.35.1



* [PATCH -v4 6/9] migrate_pages: move migrate_folio_unmap()
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
                   ` (4 preceding siblings ...)
  2023-02-06  6:33 ` [PATCH -v4 5/9] migrate_pages: batch _unmap and _move Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-07 14:40   ` Zi Yan
  2023-02-06  6:33 ` [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move Huang Ying
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Zi Yan, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz,
	Hyeonggon Yoo

Just move the position of the functions.  There is no functional
change.  This makes the next patch easier to review by putting the
code near its position in that patch.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/migrate.c | 102 +++++++++++++++++++++++++--------------------------
 1 file changed, 51 insertions(+), 51 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index fa7212330cb6..23eb01cfae4c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1218,6 +1218,57 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
 	return rc;
 }
 
+/* Obtain the lock on page, remove all ptes. */
+static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
+			       unsigned long private, struct folio *src,
+			       struct folio **dstp, int force, bool force_lock,
+			       enum migrate_mode mode, enum migrate_reason reason,
+			       struct list_head *ret)
+{
+	struct folio *dst;
+	int rc = MIGRATEPAGE_UNMAP;
+	struct page *newpage = NULL;
+
+	if (!thp_migration_supported() && folio_test_transhuge(src))
+		return -ENOSYS;
+
+	if (folio_ref_count(src) == 1) {
+		/* Folio was freed from under us. So we are done. */
+		folio_clear_active(src);
+		folio_clear_unevictable(src);
+		/* free_pages_prepare() will clear PG_isolated. */
+		list_del(&src->lru);
+		migrate_folio_done(src, reason);
+		return MIGRATEPAGE_SUCCESS;
+	}
+
+	newpage = get_new_page(&src->page, private);
+	if (!newpage)
+		return -ENOMEM;
+	dst = page_folio(newpage);
+	*dstp = dst;
+
+	dst->private = NULL;
+	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
+	if (rc == MIGRATEPAGE_UNMAP)
+		return rc;
+
+	/*
+	 * A page that has not been migrated will have kept its
+	 * references and be restored.
+	 */
+	/* restore the folio to right list. */
+	if (rc != -EAGAIN && rc != -EDEADLOCK)
+		list_move_tail(&src->lru, ret);
+
+	if (put_new_page)
+		put_new_page(&dst->page, private);
+	else
+		folio_put(dst);
+
+	return rc;
+}
+
 static int __migrate_folio_move(struct folio *src, struct folio *dst,
 				enum migrate_mode mode)
 {
@@ -1278,57 +1329,6 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
 	return rc;
 }
 
-/* Obtain the lock on page, remove all ptes. */
-static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
-			       unsigned long private, struct folio *src,
-			       struct folio **dstp, int force, bool force_lock,
-			       enum migrate_mode mode, enum migrate_reason reason,
-			       struct list_head *ret)
-{
-	struct folio *dst;
-	int rc = MIGRATEPAGE_UNMAP;
-	struct page *newpage = NULL;
-
-	if (!thp_migration_supported() && folio_test_transhuge(src))
-		return -ENOSYS;
-
-	if (folio_ref_count(src) == 1) {
-		/* Folio was freed from under us. So we are done. */
-		folio_clear_active(src);
-		folio_clear_unevictable(src);
-		/* free_pages_prepare() will clear PG_isolated. */
-		list_del(&src->lru);
-		migrate_folio_done(src, reason);
-		return MIGRATEPAGE_SUCCESS;
-	}
-
-	newpage = get_new_page(&src->page, private);
-	if (!newpage)
-		return -ENOMEM;
-	dst = page_folio(newpage);
-	*dstp = dst;
-
-	dst->private = NULL;
-	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
-	if (rc == MIGRATEPAGE_UNMAP)
-		return rc;
-
-	/*
-	 * A page that has not been migrated will have kept its
-	 * references and be restored.
-	 */
-	/* restore the folio to right list. */
-	if (rc != -EAGAIN && rc != -EDEADLOCK)
-		list_move_tail(&src->lru, ret);
-
-	if (put_new_page)
-		put_new_page(&dst->page, private);
-	else
-		folio_put(dst);
-
-	return rc;
-}
-
 /* Migrate the folio to the newly allocated folio in dst. */
 static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
 			      struct folio *src, struct folio *dst,
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
                   ` (5 preceding siblings ...)
  2023-02-06  6:33 ` [PATCH -v4 6/9] migrate_pages: move migrate_folio_unmap() Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-07 14:50   ` Zi Yan
  2023-02-06  6:33 ` [PATCH -v4 8/9] migrate_pages: batch flushing TLB Huang Ying
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Zi Yan, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz,
	Hyeonggon Yoo

This is a code cleanup patch to reduce the duplicated code between the
_unmap and _move stages of migrate_pages().  No functionality change
is expected.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/migrate.c | 203 ++++++++++++++++++++-------------------------------
 1 file changed, 81 insertions(+), 122 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 23eb01cfae4c..9378fa2ad4a5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1037,6 +1037,7 @@ static void __migrate_folio_extract(struct folio *dst,
 static void migrate_folio_undo_src(struct folio *src,
 				   int page_was_mapped,
 				   struct anon_vma *anon_vma,
+				   bool locked,
 				   struct list_head *ret)
 {
 	if (page_was_mapped)
@@ -1044,16 +1045,20 @@ static void migrate_folio_undo_src(struct folio *src,
 	/* Drop an anon_vma reference if we took one */
 	if (anon_vma)
 		put_anon_vma(anon_vma);
-	folio_unlock(src);
-	list_move_tail(&src->lru, ret);
+	if (locked)
+		folio_unlock(src);
+	if (ret)
+		list_move_tail(&src->lru, ret);
 }
 
 /* Restore the destination folio to the original state upon failure */
 static void migrate_folio_undo_dst(struct folio *dst,
+				   bool locked,
 				   free_page_t put_new_page,
 				   unsigned long private)
 {
-	folio_unlock(dst);
+	if (locked)
+		folio_unlock(dst);
 	if (put_new_page)
 		put_new_page(&dst->page, private);
 	else
@@ -1078,13 +1083,42 @@ static void migrate_folio_done(struct folio *src,
 		folio_put(src);
 }
 
-static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
-				 int force, bool force_lock, enum migrate_mode mode)
+/* Obtain the lock on page, remove all ptes. */
+static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
+			       unsigned long private, struct folio *src,
+			       struct folio **dstp, int force, bool force_lock,
+			       enum migrate_mode mode, enum migrate_reason reason,
+			       struct list_head *ret)
 {
+	struct folio *dst;
 	int rc = -EAGAIN;
+	struct page *newpage = NULL;
 	int page_was_mapped = 0;
 	struct anon_vma *anon_vma = NULL;
 	bool is_lru = !__PageMovable(&src->page);
+	bool locked = false;
+	bool dst_locked = false;
+
+	if (!thp_migration_supported() && folio_test_transhuge(src))
+		return -ENOSYS;
+
+	if (folio_ref_count(src) == 1) {
+		/* Folio was freed from under us. So we are done. */
+		folio_clear_active(src);
+		folio_clear_unevictable(src);
+		/* free_pages_prepare() will clear PG_isolated. */
+		list_del(&src->lru);
+		migrate_folio_done(src, reason);
+		return MIGRATEPAGE_SUCCESS;
+	}
+
+	newpage = get_new_page(&src->page, private);
+	if (!newpage)
+		return -ENOMEM;
+	dst = page_folio(newpage);
+	*dstp = dst;
+
+	dst->private = NULL;
 
 	if (!folio_trylock(src)) {
 		if (!force || mode == MIGRATE_ASYNC)
@@ -1119,6 +1153,7 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
 
 		folio_lock(src);
 	}
+	locked = true;
 
 	if (folio_test_writeback(src)) {
 		/*
@@ -1133,10 +1168,10 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
 			break;
 		default:
 			rc = -EBUSY;
-			goto out_unlock;
+			goto out;
 		}
 		if (!force)
-			goto out_unlock;
+			goto out;
 		folio_wait_writeback(src);
 	}
 
@@ -1166,7 +1201,8 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
 	 * This is much like races on refcount of oldpage: just don't BUG().
 	 */
 	if (unlikely(!folio_trylock(dst)))
-		goto out_unlock;
+		goto out;
+	dst_locked = true;
 
 	if (unlikely(!is_lru)) {
 		__migrate_folio_record(dst, page_was_mapped, anon_vma);
@@ -1188,7 +1224,7 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
 	if (!src->mapping) {
 		if (folio_test_private(src)) {
 			try_to_free_buffers(src);
-			goto out_unlock_both;
+			goto out;
 		}
 	} else if (folio_mapped(src)) {
 		/* Establish migration ptes */
@@ -1203,74 +1239,26 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
 		return MIGRATEPAGE_UNMAP;
 	}
 
-	if (page_was_mapped)
-		remove_migration_ptes(src, src, false);
-
-out_unlock_both:
-	folio_unlock(dst);
-out_unlock:
-	/* Drop an anon_vma reference if we took one */
-	if (anon_vma)
-		put_anon_vma(anon_vma);
-	folio_unlock(src);
 out:
-
-	return rc;
-}
-
-/* Obtain the lock on page, remove all ptes. */
-static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
-			       unsigned long private, struct folio *src,
-			       struct folio **dstp, int force, bool force_lock,
-			       enum migrate_mode mode, enum migrate_reason reason,
-			       struct list_head *ret)
-{
-	struct folio *dst;
-	int rc = MIGRATEPAGE_UNMAP;
-	struct page *newpage = NULL;
-
-	if (!thp_migration_supported() && folio_test_transhuge(src))
-		return -ENOSYS;
-
-	if (folio_ref_count(src) == 1) {
-		/* Folio was freed from under us. So we are done. */
-		folio_clear_active(src);
-		folio_clear_unevictable(src);
-		/* free_pages_prepare() will clear PG_isolated. */
-		list_del(&src->lru);
-		migrate_folio_done(src, reason);
-		return MIGRATEPAGE_SUCCESS;
-	}
-
-	newpage = get_new_page(&src->page, private);
-	if (!newpage)
-		return -ENOMEM;
-	dst = page_folio(newpage);
-	*dstp = dst;
-
-	dst->private = NULL;
-	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
-	if (rc == MIGRATEPAGE_UNMAP)
-		return rc;
-
 	/*
 	 * A page that has not been migrated will have kept its
 	 * references and be restored.
 	 */
 	/* restore the folio to right list. */
-	if (rc != -EAGAIN && rc != -EDEADLOCK)
-		list_move_tail(&src->lru, ret);
+	if (rc == -EAGAIN || rc == -EDEADLOCK)
+		ret = NULL;
 
-	if (put_new_page)
-		put_new_page(&dst->page, private);
-	else
-		folio_put(dst);
+	migrate_folio_undo_src(src, page_was_mapped, anon_vma, locked, ret);
+	migrate_folio_undo_dst(dst, dst_locked, put_new_page, private);
 
 	return rc;
 }
 
-static int __migrate_folio_move(struct folio *src, struct folio *dst,
-				enum migrate_mode mode)
+/* Migrate the folio to the newly allocated folio in dst. */
+static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
+			      struct folio *src, struct folio *dst,
+			      enum migrate_mode mode, enum migrate_reason reason,
+			      struct list_head *ret)
 {
 	int rc;
 	int page_was_mapped = 0;
@@ -1283,12 +1271,8 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
 	list_del(&dst->lru);
 
 	rc = move_to_new_folio(dst, src, mode);
-
-	if (rc == -EAGAIN) {
-		list_add(&dst->lru, prev);
-		__migrate_folio_record(dst, page_was_mapped, anon_vma);
-		return rc;
-	}
+	if (rc)
+		goto out;
 
 	if (unlikely(!is_lru))
 		goto out_unlock_both;
@@ -1302,70 +1286,45 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
 	 * unsuccessful, and other cases when a page has been temporarily
 	 * isolated from the unevictable LRU: but this case is the easiest.
 	 */
-	if (rc == MIGRATEPAGE_SUCCESS) {
-		folio_add_lru(dst);
-		if (page_was_mapped)
-			lru_add_drain();
-	}
+	folio_add_lru(dst);
+	if (page_was_mapped)
+		lru_add_drain();
 
 	if (page_was_mapped)
-		remove_migration_ptes(src,
-			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
+		remove_migration_ptes(src, dst, false);
 
 out_unlock_both:
 	folio_unlock(dst);
-	/* Drop an anon_vma reference if we took one */
-	if (anon_vma)
-		put_anon_vma(anon_vma);
-	folio_unlock(src);
+	set_page_owner_migrate_reason(&dst->page, reason);
 	/*
 	 * If migration is successful, decrease refcount of dst,
 	 * which will not free the page because new page owner increased
 	 * refcounter.
 	 */
-	if (rc == MIGRATEPAGE_SUCCESS)
-		folio_put(dst);
-
-	return rc;
-}
-
-/* Migrate the folio to the newly allocated folio in dst. */
-static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
-			      struct folio *src, struct folio *dst,
-			      enum migrate_mode mode, enum migrate_reason reason,
-			      struct list_head *ret)
-{
-	int rc;
-
-	rc = __migrate_folio_move(src, dst, mode);
-	if (rc == MIGRATEPAGE_SUCCESS)
-		set_page_owner_migrate_reason(&dst->page, reason);
-
-	if (rc != -EAGAIN) {
-		/*
-		 * A folio that has been migrated has all references
-		 * removed and will be freed. A folio that has not been
-		 * migrated will have kept its references and be restored.
-		 */
-		list_del(&src->lru);
-	}
+	folio_put(dst);
 
 	/*
-	 * If migration is successful, releases reference grabbed during
-	 * isolation. Otherwise, restore the folio to right list unless
-	 * we want to retry.
+	 * A page that has been migrated has all references removed
+	 * and will be freed.
 	 */
-	if (rc == MIGRATEPAGE_SUCCESS) {
-		migrate_folio_done(src, reason);
-	} else if (rc != -EAGAIN) {
-		list_add_tail(&src->lru, ret);
+	list_del(&src->lru);
+	/* Drop an anon_vma reference if we took one */
+	if (anon_vma)
+		put_anon_vma(anon_vma);
+	folio_unlock(src);
+	migrate_folio_done(src, reason);
 
-		if (put_new_page)
-			put_new_page(&dst->page, private);
-		else
-			folio_put(dst);
+	return rc;
+out:
+	if (rc == -EAGAIN) {
+		list_add(&dst->lru, prev);
+		__migrate_folio_record(dst, page_was_mapped, anon_vma);
+		return rc;
 	}
 
+	migrate_folio_undo_src(src, page_was_mapped, anon_vma, true, ret);
+	migrate_folio_undo_dst(dst, true, put_new_page, private);
+
 	return rc;
 }
 
@@ -1897,9 +1856,9 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 
 		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
 		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
-				       ret_folios);
+				       true, ret_folios);
 		list_del(&dst->lru);
-		migrate_folio_undo_dst(dst, put_new_page, private);
+		migrate_folio_undo_dst(dst, true, put_new_page, private);
 		dst = dst2;
 		dst2 = list_next_entry(dst, lru);
 	}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH -v4 8/9] migrate_pages: batch flushing TLB
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
                   ` (6 preceding siblings ...)
  2023-02-06  6:33 ` [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-07 14:52   ` Zi Yan
  2023-02-07 17:44   ` haoxin
  2023-02-06  6:33 ` [PATCH -v4 9/9] migrate_pages: move THP/hugetlb migration support check to simplify code Huang Ying
  2023-02-08  6:21 ` [PATCH -v4 0/9] migrate_pages(): batch TLB flushing haoxin
  9 siblings, 2 replies; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Zi Yan, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz,
	Hyeonggon Yoo

TLB flushing can cost quite a few CPU cycles during folio migration in
some situations, for example, when migrating a folio of a process with
multiple active threads that run on multiple CPUs.  After batching the
_unmap and _move stages in migrate_pages(), the TLB flushing can easily
be batched with the existing TLB flush batching mechanism.  This patch
implements that.

We use the following test case to test the patch.

On a 2-socket Intel server,

- Run pmbench memory accessing benchmark

- Run `migratepages` to migrate pages of pmbench between node 0 and
  node 1 back and forth.

With the patch, the TLB flushing IPI reduces 99.1% during the test and
the number of pages migrated successfully per second increases 291.7%.

NOTE: TLB flushing is batched only for normal folios, not for THP
folios, because the overhead of TLB flushing for THP folios is much
lower than that for normal folios (about 1/512 on the x86 platform).
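
The batched unmap stage then boils down to roughly the following shape
(a condensed sketch with a hypothetical helper name, not the actual
code; the real logic, including the retry loops and error handling, is
in migrate_pages_batch() in the diff below):

  /*
   * Sketch only: the unmap pass clears the PTEs but merely queues the
   * TLB flush (TTU_BATCH_FLUSH), and a single try_to_unmap_flush()
   * before the move pass performs the actual, batched flush.
   */
  static void unmap_folios_batched_sketch(struct list_head *folios)
  {
  	struct folio *folio;

  	list_for_each_entry(folio, folios, lru)
  		try_to_migrate(folio, TTU_BATCH_FLUSH);

  	try_to_unmap_flush();
  }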

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/migrate.c |  4 +++-
 mm/rmap.c    | 20 +++++++++++++++++---
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 9378fa2ad4a5..ca6e2ff02a09 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1230,7 +1230,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
 		/* Establish migration ptes */
 		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
 			       !folio_test_ksm(src) && !anon_vma, src);
-		try_to_migrate(src, 0);
+		try_to_migrate(src, TTU_BATCH_FLUSH);
 		page_was_mapped = 1;
 	}
 
@@ -1781,6 +1781,8 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	stats->nr_thp_failed += thp_retry;
 	stats->nr_failed_pages += nr_retry_pages;
 move:
+	try_to_unmap_flush();
+
 	retry = 1;
 	for (pass = 0;
 	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
diff --git a/mm/rmap.c b/mm/rmap.c
index b616870a09be..2e125f3e462e 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1976,7 +1976,21 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		} else {
 			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
 			/* Nuke the page table entry. */
-			pteval = ptep_clear_flush(vma, address, pvmw.pte);
+			if (should_defer_flush(mm, flags)) {
+				/*
+				 * We clear the PTE but do not flush so potentially
+				 * a remote CPU could still be writing to the folio.
+				 * If the entry was previously clean then the
+				 * architecture must guarantee that a clear->dirty
+				 * transition on a cached TLB entry is written through
+				 * and traps if the PTE is unmapped.
+				 */
+				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
+
+				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
+			} else {
+				pteval = ptep_clear_flush(vma, address, pvmw.pte);
+			}
 		}
 
 		/* Set the dirty flag on the folio now the pte is gone. */
@@ -2148,10 +2162,10 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
 
 	/*
 	 * Migration always ignores mlock and only supports TTU_RMAP_LOCKED and
-	 * TTU_SPLIT_HUGE_PMD and TTU_SYNC flags.
+	 * TTU_SPLIT_HUGE_PMD, TTU_SYNC, and TTU_BATCH_FLUSH flags.
 	 */
 	if (WARN_ON_ONCE(flags & ~(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD |
-					TTU_SYNC)))
+					TTU_SYNC | TTU_BATCH_FLUSH)))
 		return;
 
 	if (folio_is_zone_device(folio) &&
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH -v4 9/9] migrate_pages: move THP/hugetlb migration support check to simplify code
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
                   ` (7 preceding siblings ...)
  2023-02-06  6:33 ` [PATCH -v4 8/9] migrate_pages: batch flushing TLB Huang Ying
@ 2023-02-06  6:33 ` Huang Ying
  2023-02-07 14:53   ` Zi Yan
  2023-02-08  6:21 ` [PATCH -v4 0/9] migrate_pages(): batch TLB flushing haoxin
  9 siblings, 1 reply; 33+ messages in thread
From: Huang Ying @ 2023-02-06  6:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Alistair Popple, Zi Yan,
	Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox,
	Bharata B Rao, haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

This is a code cleanup patch; no functionality change is expected.
After the change, the number of code lines is reduced, especially in
the long migrate_pages_batch().

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Suggested-by: Alistair Popple <apopple@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: haoxin <xhao@linux.alibaba.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/migrate.c | 83 +++++++++++++++++++++++-----------------------------
 1 file changed, 36 insertions(+), 47 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index ca6e2ff02a09..83d7ec8dfa66 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1099,9 +1099,6 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
 	bool locked = false;
 	bool dst_locked = false;
 
-	if (!thp_migration_supported() && folio_test_transhuge(src))
-		return -ENOSYS;
-
 	if (folio_ref_count(src) == 1) {
 		/* Folio was freed from under us. So we are done. */
 		folio_clear_active(src);
@@ -1359,16 +1356,6 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	struct anon_vma *anon_vma = NULL;
 	struct address_space *mapping = NULL;
 
-	/*
-	 * Migratability of hugepages depends on architectures and their size.
-	 * This check is necessary because some callers of hugepage migration
-	 * like soft offline and memory hotremove don't walk through page
-	 * tables or check whether the hugepage is pmd-based or not before
-	 * kicking migration.
-	 */
-	if (!hugepage_migration_supported(page_hstate(hpage)))
-		return -ENOSYS;
-
 	if (folio_ref_count(src) == 1) {
 		/* page was freed from under us. So we are done. */
 		putback_active_hugepage(hpage);
@@ -1535,6 +1522,20 @@ static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
 
 			cond_resched();
 
+			/*
+			 * Migratability of hugepages depends on architectures and
+			 * their size.  This check is necessary because some callers
+			 * of hugepage migration like soft offline and memory
+			 * hotremove don't walk through page tables or check whether
+			 * the hugepage is pmd-based or not before kicking migration.
+			 */
+			if (!hugepage_migration_supported(folio_hstate(folio))) {
+				nr_failed++;
+				stats->nr_failed_pages += nr_pages;
+				list_move_tail(&folio->lru, ret_folios);
+				continue;
+			}
+
 			rc = unmap_and_move_huge_page(get_new_page,
 						      put_new_page, private,
 						      &folio->page, pass > 2, mode,
@@ -1544,16 +1545,9 @@ static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
 			 *	Success: hugetlb folio will be put back
 			 *	-EAGAIN: stay on the from list
 			 *	-ENOMEM: stay on the from list
-			 *	-ENOSYS: stay on the from list
 			 *	Other errno: put on ret_folios list
 			 */
 			switch(rc) {
-			case -ENOSYS:
-				/* Hugetlb migration is unsupported */
-				nr_failed++;
-				stats->nr_failed_pages += nr_pages;
-				list_move_tail(&folio->lru, ret_folios);
-				break;
 			case -ENOMEM:
 				/*
 				 * When memory is low, don't bother to try to migrate
@@ -1639,6 +1633,28 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 
 			cond_resched();
 
+			/*
+			 * Large folio migration might be unsupported or
+			 * the allocation might be failed so we should retry
+			 * on the same folio with the large folio split
+			 * to normal folios.
+			 *
+			 * Split folios are put in split_folios, and
+			 * we will migrate them after the rest of the
+			 * list is processed.
+			 */
+			if (!thp_migration_supported() && is_thp) {
+				nr_large_failed++;
+				stats->nr_thp_failed++;
+				if (!try_split_folio(folio, &split_folios)) {
+					stats->nr_thp_split++;
+					continue;
+				}
+				stats->nr_failed_pages += nr_pages;
+				list_move_tail(&folio->lru, ret_folios);
+				continue;
+			}
+
 			rc = migrate_folio_unmap(get_new_page, put_new_page, private,
 						 folio, &dst, pass > 2, force_lock,
 						 mode, reason, ret_folios);
@@ -1650,36 +1666,9 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 			 *	-EAGAIN: stay on the from list
 			 *	-EDEADLOCK: stay on the from list
 			 *	-ENOMEM: stay on the from list
-			 *	-ENOSYS: stay on the from list
 			 *	Other errno: put on ret_folios list
 			 */
 			switch(rc) {
-			/*
-			 * Large folio migration might be unsupported or
-			 * the allocation could've failed so we should retry
-			 * on the same folio with the large folio split
-			 * to normal folios.
-			 *
-			 * Split folios are put in split_folios, and
-			 * we will migrate them after the rest of the
-			 * list is processed.
-			 */
-			case -ENOSYS:
-				/* Large folio migration is unsupported */
-				if (is_large) {
-					nr_large_failed++;
-					stats->nr_thp_failed += is_thp;
-					if (!try_split_folio(folio, &split_folios)) {
-						stats->nr_thp_split += is_thp;
-						break;
-					}
-				} else if (!no_split_folio_counting) {
-					nr_failed++;
-				}
-
-				stats->nr_failed_pages += nr_pages;
-				list_move_tail(&folio->lru, ret_folios);
-				break;
 			case -ENOMEM:
 				/*
 				 * When memory is low, don't bother to try to migrate
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 5/9] migrate_pages: batch _unmap and _move
  2023-02-06  6:33 ` [PATCH -v4 5/9] migrate_pages: batch _unmap and _move Huang Ying
@ 2023-02-06 16:10   ` Zi Yan
  2023-02-07  5:58     ` Huang, Ying
  2023-02-13  6:55     ` Huang, Ying
  2023-02-07 17:33   ` haoxin
  1 sibling, 2 replies; 33+ messages in thread
From: Zi Yan @ 2023-02-06 16:10 UTC (permalink / raw)
  To: Huang Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Hyeonggon Yoo, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz

[-- Attachment #1: Type: text/plain, Size: 14595 bytes --]

On 6 Feb 2023, at 1:33, Huang Ying wrote:

> In this patch the _unmap and _move stages of the folio migration are
> batched.  That is, previously, it was,
>
>   for each folio
>     _unmap()
>     _move()
>
> Now, it is,
>
>   for each folio
>     _unmap()
>   for each folio
>     _move()
>
> Based on this, we can batch the TLB flushing and use some hardware
> accelerator to copy folios between batched _unmap and batched _move
> stages.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  mm/migrate.c | 208 +++++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 184 insertions(+), 24 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 0428449149f4..fa7212330cb6 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1033,6 +1033,33 @@ static void __migrate_folio_extract(struct folio *dst,
>  	dst->private = NULL;
>  }
>
> +/* Restore the source folio to the original state upon failure */
> +static void migrate_folio_undo_src(struct folio *src,
> +				   int page_was_mapped,
> +				   struct anon_vma *anon_vma,
> +				   struct list_head *ret)
> +{
> +	if (page_was_mapped)
> +		remove_migration_ptes(src, src, false);
> +	/* Drop an anon_vma reference if we took one */
> +	if (anon_vma)
> +		put_anon_vma(anon_vma);
> +	folio_unlock(src);
> +	list_move_tail(&src->lru, ret);
> +}
> +
> +/* Restore the destination folio to the original state upon failure */
> +static void migrate_folio_undo_dst(struct folio *dst,
> +				   free_page_t put_new_page,
> +				   unsigned long private)
> +{
> +	folio_unlock(dst);
> +	if (put_new_page)
> +		put_new_page(&dst->page, private);
> +	else
> +		folio_put(dst);
> +}
> +
>  /* Cleanup src folio upon migration success */
>  static void migrate_folio_done(struct folio *src,
>  			       enum migrate_reason reason)
> @@ -1052,7 +1079,7 @@ static void migrate_folio_done(struct folio *src,
>  }
>
>  static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
> -				int force, enum migrate_mode mode)
> +				 int force, bool force_lock, enum migrate_mode mode)
>  {
>  	int rc = -EAGAIN;
>  	int page_was_mapped = 0;
> @@ -1079,6 +1106,17 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>  		if (current->flags & PF_MEMALLOC)
>  			goto out;
>
> +		/*
> +		 * We have locked some folios, to avoid deadlock, we cannot
> +		 * lock the folio synchronously.  Go out to process (and
> +		 * unlock) all the locked folios.  Then we can lock the folio
> +		 * synchronously.
> +		 */
The comment alone is quite confusing, and the variable might be better
renamed to avoid_force_lock, since there is already a force variable used
to force-lock the folio. The new variable is intended to discourage
force-locking a folio in order to avoid a potential deadlock.

How about the following, since "lock synchronously" might not be as
straightforward as "wait to lock"?

/*
 * We have locked some folios and are going to wait to lock this folio.
 * To avoid a potential deadlock, let's bail out and not do that. The
 * locked folios will be moved and unlocked, then we can wait to lock
 * this folio
 */
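
Purely as an illustration of the pattern (not kernel code, and with
made-up names): the same "do not block while already holding locks"
rule in a self-contained userspace sketch.

  #include <pthread.h>
  #include <stdbool.h>

  /*
   * Try to take a batch of locks.  Blocking is only allowed while we
   * hold nothing; once at least one lock is held, a contended lock
   * makes us back out completely (the -EDEADLOCK case above), so the
   * caller can process what it already has and retry later.
   */
  static bool lock_batch(pthread_mutex_t **locks, int n)
  {
  	for (int i = 0; i < n; i++) {
  		if (i == 0) {
  			/* Nothing held yet: waiting is safe. */
  			pthread_mutex_lock(locks[i]);
  			continue;
  		}
  		if (pthread_mutex_trylock(locks[i])) {
  			/* Drop everything and let the caller retry. */
  			while (--i >= 0)
  				pthread_mutex_unlock(locks[i]);
  			return false;
  		}
  	}
  	return true;
  }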

> +		if (!force_lock) {
> +			rc = -EDEADLOCK;
> +			goto out;
> +		}
> +
>  		folio_lock(src);
>  	}
>
> @@ -1187,10 +1225,20 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>  	int page_was_mapped = 0;
>  	struct anon_vma *anon_vma = NULL;
>  	bool is_lru = !__PageMovable(&src->page);
> +	struct list_head *prev;
>
>  	__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
> +	prev = dst->lru.prev;
> +	list_del(&dst->lru);
>
>  	rc = move_to_new_folio(dst, src, mode);
> +
> +	if (rc == -EAGAIN) {
> +		list_add(&dst->lru, prev);
> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
> +		return rc;
> +	}
> +
>  	if (unlikely(!is_lru))
>  		goto out_unlock_both;
>
> @@ -1233,7 +1281,7 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>  /* Obtain the lock on page, remove all ptes. */
>  static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>  			       unsigned long private, struct folio *src,
> -			       struct folio **dstp, int force,
> +			       struct folio **dstp, int force, bool force_lock,
>  			       enum migrate_mode mode, enum migrate_reason reason,
>  			       struct list_head *ret)
>  {
> @@ -1261,7 +1309,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>  	*dstp = dst;
>
>  	dst->private = NULL;
> -	rc = __migrate_folio_unmap(src, dst, force, mode);
> +	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
>  	if (rc == MIGRATEPAGE_UNMAP)
>  		return rc;
>
> @@ -1270,7 +1318,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>  	 * references and be restored.
>  	 */
>  	/* restore the folio to right list. */
> -	if (rc != -EAGAIN)
> +	if (rc != -EAGAIN && rc != -EDEADLOCK)
>  		list_move_tail(&src->lru, ret);
>
>  	if (put_new_page)
> @@ -1309,9 +1357,8 @@ static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
>  	 */
>  	if (rc == MIGRATEPAGE_SUCCESS) {
>  		migrate_folio_done(src, reason);
> -	} else {
> -		if (rc != -EAGAIN)
> -			list_add_tail(&src->lru, ret);
> +	} else if (rc != -EAGAIN) {
> +		list_add_tail(&src->lru, ret);
>
>  		if (put_new_page)
>  			put_new_page(&dst->page, private);
> @@ -1591,7 +1638,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  		enum migrate_mode mode, int reason, struct list_head *ret_folios,
>  		struct migrate_pages_stats *stats)

Like I said in my last comment on this patch, the migrate_pages_batch()
function deserves a detailed comment about its working flow, including
the error handling. Right now you only put some of it in the git log,
which will be hard to access after several later code changes.

How about the following?

/*
 * migrate_pages_batch() first unmaps as many folios on the from list as
 * possible, then migrates the unmapped folios. During the unmap stage,
 * different situations are handled differently:
 * 1. ENOSYS, unsupported large folio migration: move to the ret_folios list
 * 2. ENOMEM, low memory at the destination: migrate the existing unmapped
 *    folios and stop, since the existing unmapped folios have new pages
 *    allocated and can be migrated
 * 3. EDEADLOCK, the folio to be unmapped is locked by someone else: to avoid
 *    deadlock, migrate the existing unmapped folios and try to lock again
 * 4. MIGRATEPAGE_SUCCESS, the folio was freed under us: no action
 * 5. MIGRATEPAGE_UNMAP, unmap succeeded: set avoid_force_lock to true to
 *    avoid waiting to lock a folio in the future, which could deadlock.
 *
 * Folios that are unmapped but cannot be migrated will have their original
 * states restored during the cleanup stage at the end.
 */

>  {
> -	int retry = 1;
> +	int retry;
>  	int large_retry = 1;
>  	int thp_retry = 1;
>  	int nr_failed = 0;
> @@ -1600,13 +1647,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  	int pass = 0;
>  	bool is_large = false;
>  	bool is_thp = false;
> -	struct folio *folio, *folio2, *dst = NULL;
> -	int rc, nr_pages;
> +	struct folio *folio, *folio2, *dst = NULL, *dst2;
> +	int rc, rc_saved, nr_pages;
>  	LIST_HEAD(split_folios);
> +	LIST_HEAD(unmap_folios);
> +	LIST_HEAD(dst_folios);
>  	bool nosplit = (reason == MR_NUMA_MISPLACED);
>  	bool no_split_folio_counting = false;
> +	bool force_lock;
>
> -split_folio_migration:
> +retry:
> +	rc_saved = 0;
> +	force_lock = true;
> +	retry = 1;
>  	for (pass = 0;
>  	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
>  	     pass++) {
> @@ -1628,16 +1681,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  			cond_resched();
>
>  			rc = migrate_folio_unmap(get_new_page, put_new_page, private,
> -						 folio, &dst, pass > 2, mode,
> -						 reason, ret_folios);
> -			if (rc == MIGRATEPAGE_UNMAP)
> -				rc = migrate_folio_move(put_new_page, private,
> -							folio, dst, mode,
> -							reason, ret_folios);
> +						 folio, &dst, pass > 2, force_lock,
> +						 mode, reason, ret_folios);
>  			/*
>  			 * The rules are:
>  			 *	Success: folio will be freed
> +			 *	Unmap: folio will be put on unmap_folios list,
> +			 *	       dst folio put on dst_folios list
>  			 *	-EAGAIN: stay on the from list
> +			 *	-EDEADLOCK: stay on the from list
>  			 *	-ENOMEM: stay on the from list
>  			 *	-ENOSYS: stay on the from list
>  			 *	Other errno: put on ret_folios list
> @@ -1672,7 +1724,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  			case -ENOMEM:
>  				/*
>  				 * When memory is low, don't bother to try to migrate
> -				 * other folios, just exit.
> +				 * other folios, move unmapped folios, then exit.
>  				 */
>  				if (is_large) {
>  					nr_large_failed++;
> @@ -1711,7 +1763,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  				/* nr_failed isn't updated for not used */
>  				nr_large_failed += large_retry;
>  				stats->nr_thp_failed += thp_retry;
> -				goto out;
> +				rc_saved = rc;
> +				if (list_empty(&unmap_folios))
> +					goto out;
> +				else
> +					goto move;
> +			case -EDEADLOCK:
> +				/*
> +				 * The folio cannot be locked for potential deadlock.
> +				 * Go move (and unlock) all locked folios.  Then we can
> +				 * try again.
> +				 */
> +				rc_saved = rc;
> +				goto move;
>  			case -EAGAIN:
>  				if (is_large) {
>  					large_retry++;
> @@ -1725,6 +1789,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  				stats->nr_succeeded += nr_pages;
>  				stats->nr_thp_succeeded += is_thp;
>  				break;
> +			case MIGRATEPAGE_UNMAP:
> +				/*
> +				 * We have locked some folios, don't force lock
> +				 * to avoid deadlock.
> +				 */
> +				force_lock = false;
> +				list_move_tail(&folio->lru, &unmap_folios);
> +				list_add_tail(&dst->lru, &dst_folios);
> +				break;
>  			default:
>  				/*
>  				 * Permanent failure (-EBUSY, etc.):
> @@ -1748,12 +1821,95 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  	nr_large_failed += large_retry;
>  	stats->nr_thp_failed += thp_retry;
>  	stats->nr_failed_pages += nr_retry_pages;
> +move:
> +	retry = 1;
> +	for (pass = 0;
> +	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
> +	     pass++) {
> +		retry = 0;
> +		large_retry = 0;
> +		thp_retry = 0;
> +		nr_retry_pages = 0;
> +
> +		dst = list_first_entry(&dst_folios, struct folio, lru);
> +		dst2 = list_next_entry(dst, lru);
> +		list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
> +			is_large = folio_test_large(folio);
> +			is_thp = is_large && folio_test_pmd_mappable(folio);
> +			nr_pages = folio_nr_pages(folio);
> +
> +			cond_resched();
> +
> +			rc = migrate_folio_move(put_new_page, private,
> +						folio, dst, mode,
> +						reason, ret_folios);
> +			/*
> +			 * The rules are:
> +			 *	Success: folio will be freed
> +			 *	-EAGAIN: stay on the unmap_folios list
> +			 *	Other errno: put on ret_folios list
> +			 */
> +			switch(rc) {
> +			case -EAGAIN:
> +				if (is_large) {
> +					large_retry++;
> +					thp_retry += is_thp;
> +				} else if (!no_split_folio_counting) {
> +					retry++;
> +				}
> +				nr_retry_pages += nr_pages;
> +				break;
> +			case MIGRATEPAGE_SUCCESS:
> +				stats->nr_succeeded += nr_pages;
> +				stats->nr_thp_succeeded += is_thp;
> +				break;
> +			default:
> +				if (is_large) {
> +					nr_large_failed++;
> +					stats->nr_thp_failed += is_thp;
> +				} else if (!no_split_folio_counting) {
> +					nr_failed++;
> +				}
> +
> +				stats->nr_failed_pages += nr_pages;
> +				break;
> +			}
> +			dst = dst2;
> +			dst2 = list_next_entry(dst, lru);
> +		}
> +	}
> +	nr_failed += retry;
> +	nr_large_failed += large_retry;
> +	stats->nr_thp_failed += thp_retry;
> +	stats->nr_failed_pages += nr_retry_pages;
> +
> +	if (rc_saved)
> +		rc = rc_saved;
> +	else
> +		rc = nr_failed + nr_large_failed;
> +out:
> +	/* Cleanup remaining folios */
> +	dst = list_first_entry(&dst_folios, struct folio, lru);
> +	dst2 = list_next_entry(dst, lru);
> +	list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
> +		int page_was_mapped = 0;
> +		struct anon_vma *anon_vma = NULL;
> +
> +		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
> +		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
> +				       ret_folios);
> +		list_del(&dst->lru);
> +		migrate_folio_undo_dst(dst, put_new_page, private);
> +		dst = dst2;
> +		dst2 = list_next_entry(dst, lru);
> +	}
> +
>  	/*
>  	 * Try to migrate split folios of fail-to-migrate large folios, no
>  	 * nr_failed counting in this round, since all split folios of a
>  	 * large folio is counted as 1 failure in the first round.
>  	 */
> -	if (!list_empty(&split_folios)) {
> +	if (rc >= 0 && !list_empty(&split_folios)) {
>  		/*
>  		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
>  		 * retries) to ret_folios to avoid migrating them again.
> @@ -1761,12 +1917,16 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  		list_splice_init(from, ret_folios);
>  		list_splice_init(&split_folios, from);
>  		no_split_folio_counting = true;
> -		retry = 1;
> -		goto split_folio_migration;
> +		goto retry;
>  	}
>
> -	rc = nr_failed + nr_large_failed;
> -out:
> +	/*
> +	 * We have unlocked all locked folios, so we can force lock now, let's
> +	 * try again.
> +	 */
> +	if (rc == -EDEADLOCK)
> +		goto retry;
> +
>  	return rc;
>  }
>
> -- 
> 2.35.1

After renaming the variable (or giving it a better name) and adding the
comments, you can add Reviewed-by: Zi Yan <ziy@nvidia.com>

Thanks.

--
Best Regards,
Yan, Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 5/9] migrate_pages: batch _unmap and _move
  2023-02-06 16:10   ` Zi Yan
@ 2023-02-07  5:58     ` Huang, Ying
  2023-02-13  6:55     ` Huang, Ying
  1 sibling, 0 replies; 33+ messages in thread
From: Huang, Ying @ 2023-02-07  5:58 UTC (permalink / raw)
  To: Zi Yan
  Cc: Andrew Morton, linux-mm, linux-kernel, Hyeonggon Yoo, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz

Zi Yan <ziy@nvidia.com> writes:

> On 6 Feb 2023, at 1:33, Huang Ying wrote:
>
>> In this patch the _unmap and _move stage of the folio migration is
>> batched.  That for, previously, it is,
>>
>>   for each folio
>>     _unmap()
>>     _move()
>>
>> Now, it is,
>>
>>   for each folio
>>     _unmap()
>>   for each folio
>>     _move()
>>
>> Based on this, we can batch the TLB flushing and use some hardware
>> accelerator to copy folios between batched _unmap and batched _move
>> stages.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Bharata B Rao <bharata@amd.com>
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: haoxin <xhao@linux.alibaba.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> ---
>>  mm/migrate.c | 208 +++++++++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 184 insertions(+), 24 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 0428449149f4..fa7212330cb6 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1033,6 +1033,33 @@ static void __migrate_folio_extract(struct folio *dst,
>>  	dst->private = NULL;
>>  }
>>
>> +/* Restore the source folio to the original state upon failure */
>> +static void migrate_folio_undo_src(struct folio *src,
>> +				   int page_was_mapped,
>> +				   struct anon_vma *anon_vma,
>> +				   struct list_head *ret)
>> +{
>> +	if (page_was_mapped)
>> +		remove_migration_ptes(src, src, false);
>> +	/* Drop an anon_vma reference if we took one */
>> +	if (anon_vma)
>> +		put_anon_vma(anon_vma);
>> +	folio_unlock(src);
>> +	list_move_tail(&src->lru, ret);
>> +}
>> +
>> +/* Restore the destination folio to the original state upon failure */
>> +static void migrate_folio_undo_dst(struct folio *dst,
>> +				   free_page_t put_new_page,
>> +				   unsigned long private)
>> +{
>> +	folio_unlock(dst);
>> +	if (put_new_page)
>> +		put_new_page(&dst->page, private);
>> +	else
>> +		folio_put(dst);
>> +}
>> +
>>  /* Cleanup src folio upon migration success */
>>  static void migrate_folio_done(struct folio *src,
>>  			       enum migrate_reason reason)
>> @@ -1052,7 +1079,7 @@ static void migrate_folio_done(struct folio *src,
>>  }
>>
>>  static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>> -				int force, enum migrate_mode mode)
>> +				 int force, bool force_lock, enum migrate_mode mode)
>>  {
>>  	int rc = -EAGAIN;
>>  	int page_was_mapped = 0;
>> @@ -1079,6 +1106,17 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>  		if (current->flags & PF_MEMALLOC)
>>  			goto out;
>>
>> +		/*
>> +		 * We have locked some folios, to avoid deadlock, we cannot
>> +		 * lock the folio synchronously.  Go out to process (and
>> +		 * unlock) all the locked folios.  Then we can lock the folio
>> +		 * synchronously.
>> +		 */
> The comment alone is quite confusing, and the variable might be better
> renamed to avoid_force_lock, since there is already a force variable used
> to force-lock the folio. The new variable is intended to discourage
> force-locking a folio in order to avoid a potential deadlock.

OK.  Will rename "force_lock" to "avoid_force_lock" in the next version.

> How about the following, since "lock synchronously" might not be as
> straightforward as "wait to lock"?
>
> /*
>  * We have locked some folios and are going to wait to lock this folio.
>  * To avoid a potential deadlock, let's bail out and not do that. The
>  * locked folios will be moved and unlocked, then we can wait to lock
>  * this folio
>  */

Thanks!  It looks better.  Will use it in the next version.

Best Regards,
Huang, Ying

>> +		if (!force_lock) {
>> +			rc = -EDEADLOCK;
>> +			goto out;
>> +		}
>> +
>>  		folio_lock(src);
>>  	}
>>
>> @@ -1187,10 +1225,20 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>  	int page_was_mapped = 0;
>>  	struct anon_vma *anon_vma = NULL;
>>  	bool is_lru = !__PageMovable(&src->page);
>> +	struct list_head *prev;
>>
>>  	__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
>> +	prev = dst->lru.prev;
>> +	list_del(&dst->lru);
>>
>>  	rc = move_to_new_folio(dst, src, mode);
>> +
>> +	if (rc == -EAGAIN) {
>> +		list_add(&dst->lru, prev);
>> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
>> +		return rc;
>> +	}
>> +
>>  	if (unlikely(!is_lru))
>>  		goto out_unlock_both;
>>
>> @@ -1233,7 +1281,7 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>  /* Obtain the lock on page, remove all ptes. */
>>  static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>>  			       unsigned long private, struct folio *src,
>> -			       struct folio **dstp, int force,
>> +			       struct folio **dstp, int force, bool force_lock,
>>  			       enum migrate_mode mode, enum migrate_reason reason,
>>  			       struct list_head *ret)
>>  {
>> @@ -1261,7 +1309,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>>  	*dstp = dst;
>>
>>  	dst->private = NULL;
>> -	rc = __migrate_folio_unmap(src, dst, force, mode);
>> +	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
>>  	if (rc == MIGRATEPAGE_UNMAP)
>>  		return rc;
>>
>> @@ -1270,7 +1318,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>>  	 * references and be restored.
>>  	 */
>>  	/* restore the folio to right list. */
>> -	if (rc != -EAGAIN)
>> +	if (rc != -EAGAIN && rc != -EDEADLOCK)
>>  		list_move_tail(&src->lru, ret);
>>
>>  	if (put_new_page)
>> @@ -1309,9 +1357,8 @@ static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
>>  	 */
>>  	if (rc == MIGRATEPAGE_SUCCESS) {
>>  		migrate_folio_done(src, reason);
>> -	} else {
>> -		if (rc != -EAGAIN)
>> -			list_add_tail(&src->lru, ret);
>> +	} else if (rc != -EAGAIN) {
>> +		list_add_tail(&src->lru, ret);
>>
>>  		if (put_new_page)
>>  			put_new_page(&dst->page, private);
>> @@ -1591,7 +1638,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  		enum migrate_mode mode, int reason, struct list_head *ret_folios,
>>  		struct migrate_pages_stats *stats)
>
> Like I said in my last comment on this patch, the migrate_pages_batch()
> function deserves a detailed comment about its working flow, including
> the error handling. Right now you only put some of it in the git log,
> which will be hard to access after several later code changes.
>
> How about the following?
>
> /*
>  * migrate_pages_batch() first unmaps as many folios on the from list as
>  * possible, then migrates the unmapped folios. During the unmap stage,
>  * different situations are handled differently:
>  * 1. ENOSYS, unsupported large folio migration: move to the ret_folios list
>  * 2. ENOMEM, low memory at the destination: migrate the existing unmapped
>  *    folios and stop, since the existing unmapped folios have new pages
>  *    allocated and can be migrated
>  * 3. EDEADLOCK, the folio to be unmapped is locked by someone else: to avoid
>  *    deadlock, migrate the existing unmapped folios and try to lock again
>  * 4. MIGRATEPAGE_SUCCESS, the folio was freed under us: no action
>  * 5. MIGRATEPAGE_UNMAP, unmap succeeded: set avoid_force_lock to true to
>  *    avoid waiting to lock a folio in the future, which could deadlock.
>  *
>  * Folios that are unmapped but cannot be migrated will have their original
>  * states restored during the cleanup stage at the end.
>  */
>
>>  {
>> -	int retry = 1;
>> +	int retry;
>>  	int large_retry = 1;
>>  	int thp_retry = 1;
>>  	int nr_failed = 0;
>> @@ -1600,13 +1647,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  	int pass = 0;
>>  	bool is_large = false;
>>  	bool is_thp = false;
>> -	struct folio *folio, *folio2, *dst = NULL;
>> -	int rc, nr_pages;
>> +	struct folio *folio, *folio2, *dst = NULL, *dst2;
>> +	int rc, rc_saved, nr_pages;
>>  	LIST_HEAD(split_folios);
>> +	LIST_HEAD(unmap_folios);
>> +	LIST_HEAD(dst_folios);
>>  	bool nosplit = (reason == MR_NUMA_MISPLACED);
>>  	bool no_split_folio_counting = false;
>> +	bool force_lock;
>>
>> -split_folio_migration:
>> +retry:
>> +	rc_saved = 0;
>> +	force_lock = true;
>> +	retry = 1;
>>  	for (pass = 0;
>>  	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
>>  	     pass++) {
>> @@ -1628,16 +1681,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  			cond_resched();
>>
>>  			rc = migrate_folio_unmap(get_new_page, put_new_page, private,
>> -						 folio, &dst, pass > 2, mode,
>> -						 reason, ret_folios);
>> -			if (rc == MIGRATEPAGE_UNMAP)
>> -				rc = migrate_folio_move(put_new_page, private,
>> -							folio, dst, mode,
>> -							reason, ret_folios);
>> +						 folio, &dst, pass > 2, force_lock,
>> +						 mode, reason, ret_folios);
>>  			/*
>>  			 * The rules are:
>>  			 *	Success: folio will be freed
>> +			 *	Unmap: folio will be put on unmap_folios list,
>> +			 *	       dst folio put on dst_folios list
>>  			 *	-EAGAIN: stay on the from list
>> +			 *	-EDEADLOCK: stay on the from list
>>  			 *	-ENOMEM: stay on the from list
>>  			 *	-ENOSYS: stay on the from list
>>  			 *	Other errno: put on ret_folios list
>> @@ -1672,7 +1724,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  			case -ENOMEM:
>>  				/*
>>  				 * When memory is low, don't bother to try to migrate
>> -				 * other folios, just exit.
>> +				 * other folios, move unmapped folios, then exit.
>>  				 */
>>  				if (is_large) {
>>  					nr_large_failed++;
>> @@ -1711,7 +1763,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  				/* nr_failed isn't updated for not used */
>>  				nr_large_failed += large_retry;
>>  				stats->nr_thp_failed += thp_retry;
>> -				goto out;
>> +				rc_saved = rc;
>> +				if (list_empty(&unmap_folios))
>> +					goto out;
>> +				else
>> +					goto move;
>> +			case -EDEADLOCK:
>> +				/*
>> +				 * The folio cannot be locked for potential deadlock.
>> +				 * Go move (and unlock) all locked folios.  Then we can
>> +				 * try again.
>> +				 */
>> +				rc_saved = rc;
>> +				goto move;
>>  			case -EAGAIN:
>>  				if (is_large) {
>>  					large_retry++;
>> @@ -1725,6 +1789,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  				stats->nr_succeeded += nr_pages;
>>  				stats->nr_thp_succeeded += is_thp;
>>  				break;
>> +			case MIGRATEPAGE_UNMAP:
>> +				/*
>> +				 * We have locked some folios, don't force lock
>> +				 * to avoid deadlock.
>> +				 */
>> +				force_lock = false;
>> +				list_move_tail(&folio->lru, &unmap_folios);
>> +				list_add_tail(&dst->lru, &dst_folios);
>> +				break;
>>  			default:
>>  				/*
>>  				 * Permanent failure (-EBUSY, etc.):
>> @@ -1748,12 +1821,95 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  	nr_large_failed += large_retry;
>>  	stats->nr_thp_failed += thp_retry;
>>  	stats->nr_failed_pages += nr_retry_pages;
>> +move:
>> +	retry = 1;
>> +	for (pass = 0;
>> +	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
>> +	     pass++) {
>> +		retry = 0;
>> +		large_retry = 0;
>> +		thp_retry = 0;
>> +		nr_retry_pages = 0;
>> +
>> +		dst = list_first_entry(&dst_folios, struct folio, lru);
>> +		dst2 = list_next_entry(dst, lru);
>> +		list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
>> +			is_large = folio_test_large(folio);
>> +			is_thp = is_large && folio_test_pmd_mappable(folio);
>> +			nr_pages = folio_nr_pages(folio);
>> +
>> +			cond_resched();
>> +
>> +			rc = migrate_folio_move(put_new_page, private,
>> +						folio, dst, mode,
>> +						reason, ret_folios);
>> +			/*
>> +			 * The rules are:
>> +			 *	Success: folio will be freed
>> +			 *	-EAGAIN: stay on the unmap_folios list
>> +			 *	Other errno: put on ret_folios list
>> +			 */
>> +			switch(rc) {
>> +			case -EAGAIN:
>> +				if (is_large) {
>> +					large_retry++;
>> +					thp_retry += is_thp;
>> +				} else if (!no_split_folio_counting) {
>> +					retry++;
>> +				}
>> +				nr_retry_pages += nr_pages;
>> +				break;
>> +			case MIGRATEPAGE_SUCCESS:
>> +				stats->nr_succeeded += nr_pages;
>> +				stats->nr_thp_succeeded += is_thp;
>> +				break;
>> +			default:
>> +				if (is_large) {
>> +					nr_large_failed++;
>> +					stats->nr_thp_failed += is_thp;
>> +				} else if (!no_split_folio_counting) {
>> +					nr_failed++;
>> +				}
>> +
>> +				stats->nr_failed_pages += nr_pages;
>> +				break;
>> +			}
>> +			dst = dst2;
>> +			dst2 = list_next_entry(dst, lru);
>> +		}
>> +	}
>> +	nr_failed += retry;
>> +	nr_large_failed += large_retry;
>> +	stats->nr_thp_failed += thp_retry;
>> +	stats->nr_failed_pages += nr_retry_pages;
>> +
>> +	if (rc_saved)
>> +		rc = rc_saved;
>> +	else
>> +		rc = nr_failed + nr_large_failed;
>> +out:
>> +	/* Cleanup remaining folios */
>> +	dst = list_first_entry(&dst_folios, struct folio, lru);
>> +	dst2 = list_next_entry(dst, lru);
>> +	list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
>> +		int page_was_mapped = 0;
>> +		struct anon_vma *anon_vma = NULL;
>> +
>> +		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
>> +		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
>> +				       ret_folios);
>> +		list_del(&dst->lru);
>> +		migrate_folio_undo_dst(dst, put_new_page, private);
>> +		dst = dst2;
>> +		dst2 = list_next_entry(dst, lru);
>> +	}
>> +
>>  	/*
>>  	 * Try to migrate split folios of fail-to-migrate large folios, no
>>  	 * nr_failed counting in this round, since all split folios of a
>>  	 * large folio is counted as 1 failure in the first round.
>>  	 */
>> -	if (!list_empty(&split_folios)) {
>> +	if (rc >= 0 && !list_empty(&split_folios)) {
>>  		/*
>>  		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
>>  		 * retries) to ret_folios to avoid migrating them again.
>> @@ -1761,12 +1917,16 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  		list_splice_init(from, ret_folios);
>>  		list_splice_init(&split_folios, from);
>>  		no_split_folio_counting = true;
>> -		retry = 1;
>> -		goto split_folio_migration;
>> +		goto retry;
>>  	}
>>
>> -	rc = nr_failed + nr_large_failed;
>> -out:
>> +	/*
>> +	 * We have unlocked all locked folios, so we can force lock now, let's
>> +	 * try again.
>> +	 */
>> +	if (rc == -EDEADLOCK)
>> +		goto retry;
>> +
>>  	return rc;
>>  }
>>
>> -- 
>> 2.35.1
>
> After renaming the variable (or giving it a better name) and adding the
> comments, you can add Reviewed-by: Zi Yan <ziy@nvidia.com>
>
> Thanks.
>
> --
> Best Regards,
> Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 6/9] migrate_pages: move migrate_folio_unmap()
  2023-02-06  6:33 ` [PATCH -v4 6/9] migrate_pages: move migrate_folio_unmap() Huang Ying
@ 2023-02-07 14:40   ` Zi Yan
  0 siblings, 0 replies; 33+ messages in thread
From: Zi Yan @ 2023-02-07 14:40 UTC (permalink / raw)
  To: Huang Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

[-- Attachment #1: Type: text/plain, Size: 940 bytes --]

On 6 Feb 2023, at 1:33, Huang Ying wrote:

> Just move the position of the functions.  There is no functionality
> change.  This is to make the next patch easier to review by putting
> the code near its position in that patch.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>  mm/migrate.c | 102 +++++++++++++++++++++++++--------------------------
>  1 file changed, 51 insertions(+), 51 deletions(-)
>
LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move
  2023-02-06  6:33 ` [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move Huang Ying
@ 2023-02-07 14:50   ` Zi Yan
  2023-02-08 12:02     ` Huang, Ying
  0 siblings, 1 reply; 33+ messages in thread
From: Zi Yan @ 2023-02-07 14:50 UTC (permalink / raw)
  To: Huang Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

[-- Attachment #1: Type: text/plain, Size: 11616 bytes --]

On 6 Feb 2023, at 1:33, Huang Ying wrote:

> This is a code cleanup patch to reduce the duplicated code between the
> _unmap and _move stages of migrate_pages().  No functionality change
> is expected.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>  mm/migrate.c | 203 ++++++++++++++++++++-------------------------------
>  1 file changed, 81 insertions(+), 122 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 23eb01cfae4c..9378fa2ad4a5 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1037,6 +1037,7 @@ static void __migrate_folio_extract(struct folio *dst,
>  static void migrate_folio_undo_src(struct folio *src,
>  				   int page_was_mapped,
>  				   struct anon_vma *anon_vma,
> +				   bool locked,
>  				   struct list_head *ret)
>  {
>  	if (page_was_mapped)
> @@ -1044,16 +1045,20 @@ static void migrate_folio_undo_src(struct folio *src,
>  	/* Drop an anon_vma reference if we took one */
>  	if (anon_vma)
>  		put_anon_vma(anon_vma);
> -	folio_unlock(src);
> -	list_move_tail(&src->lru, ret);
> +	if (locked)
> +		folio_unlock(src);

Having a comment would be better.
/* A page that has not been migrated, move it to a list for later restoration */
> +	if (ret)
> +		list_move_tail(&src->lru, ret);
>  }
>
>  /* Restore the destination folio to the original state upon failure */
>  static void migrate_folio_undo_dst(struct folio *dst,
> +				   bool locked,
>  				   free_page_t put_new_page,
>  				   unsigned long private)
>  {
> -	folio_unlock(dst);
> +	if (locked)
> +		folio_unlock(dst);
>  	if (put_new_page)
>  		put_new_page(&dst->page, private);
>  	else
> @@ -1078,13 +1083,42 @@ static void migrate_folio_done(struct folio *src,
>  		folio_put(src);
>  }
>
> -static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
> -				 int force, bool force_lock, enum migrate_mode mode)
> +/* Obtain the lock on page, remove all ptes. */
> +static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
> +			       unsigned long private, struct folio *src,
> +			       struct folio **dstp, int force, bool force_lock,
> +			       enum migrate_mode mode, enum migrate_reason reason,
> +			       struct list_head *ret)
>  {
> +	struct folio *dst;
>  	int rc = -EAGAIN;
> +	struct page *newpage = NULL;
>  	int page_was_mapped = 0;
>  	struct anon_vma *anon_vma = NULL;
>  	bool is_lru = !__PageMovable(&src->page);
> +	bool locked = false;
> +	bool dst_locked = false;
> +
> +	if (!thp_migration_supported() && folio_test_transhuge(src))
> +		return -ENOSYS;
> +
> +	if (folio_ref_count(src) == 1) {
> +		/* Folio was freed from under us. So we are done. */
> +		folio_clear_active(src);
> +		folio_clear_unevictable(src);
> +		/* free_pages_prepare() will clear PG_isolated. */
> +		list_del(&src->lru);
> +		migrate_folio_done(src, reason);
> +		return MIGRATEPAGE_SUCCESS;
> +	}
> +
> +	newpage = get_new_page(&src->page, private);
> +	if (!newpage)
> +		return -ENOMEM;
> +	dst = page_folio(newpage);
> +	*dstp = dst;
> +
> +	dst->private = NULL;
>
>  	if (!folio_trylock(src)) {
>  		if (!force || mode == MIGRATE_ASYNC)
> @@ -1119,6 +1153,7 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>
>  		folio_lock(src);
>  	}
> +	locked = true;
>
>  	if (folio_test_writeback(src)) {
>  		/*
> @@ -1133,10 +1168,10 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>  			break;
>  		default:
>  			rc = -EBUSY;
> -			goto out_unlock;
> +			goto out;
>  		}
>  		if (!force)
> -			goto out_unlock;
> +			goto out;
>  		folio_wait_writeback(src);
>  	}
>
> @@ -1166,7 +1201,8 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>  	 * This is much like races on refcount of oldpage: just don't BUG().
>  	 */
>  	if (unlikely(!folio_trylock(dst)))
> -		goto out_unlock;
> +		goto out;
> +	dst_locked = true;
>
>  	if (unlikely(!is_lru)) {
>  		__migrate_folio_record(dst, page_was_mapped, anon_vma);
> @@ -1188,7 +1224,7 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>  	if (!src->mapping) {
>  		if (folio_test_private(src)) {
>  			try_to_free_buffers(src);
> -			goto out_unlock_both;
> +			goto out;
>  		}
>  	} else if (folio_mapped(src)) {
>  		/* Establish migration ptes */
> @@ -1203,74 +1239,26 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>  		return MIGRATEPAGE_UNMAP;
>  	}
>
> -	if (page_was_mapped)
> -		remove_migration_ptes(src, src, false);
> -
> -out_unlock_both:
> -	folio_unlock(dst);
> -out_unlock:
> -	/* Drop an anon_vma reference if we took one */
> -	if (anon_vma)
> -		put_anon_vma(anon_vma);
> -	folio_unlock(src);
>  out:
> -
> -	return rc;
> -}
> -
> -/* Obtain the lock on page, remove all ptes. */
> -static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
> -			       unsigned long private, struct folio *src,
> -			       struct folio **dstp, int force, bool force_lock,
> -			       enum migrate_mode mode, enum migrate_reason reason,
> -			       struct list_head *ret)
> -{
> -	struct folio *dst;
> -	int rc = MIGRATEPAGE_UNMAP;
> -	struct page *newpage = NULL;
> -
> -	if (!thp_migration_supported() && folio_test_transhuge(src))
> -		return -ENOSYS;
> -
> -	if (folio_ref_count(src) == 1) {
> -		/* Folio was freed from under us. So we are done. */
> -		folio_clear_active(src);
> -		folio_clear_unevictable(src);
> -		/* free_pages_prepare() will clear PG_isolated. */
> -		list_del(&src->lru);
> -		migrate_folio_done(src, reason);
> -		return MIGRATEPAGE_SUCCESS;
> -	}
> -
> -	newpage = get_new_page(&src->page, private);
> -	if (!newpage)
> -		return -ENOMEM;
> -	dst = page_folio(newpage);
> -	*dstp = dst;
> -
> -	dst->private = NULL;
> -	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
> -	if (rc == MIGRATEPAGE_UNMAP)
> -		return rc;
> -
>  	/*
>  	 * A page that has not been migrated will have kept its
>  	 * references and be restored.
>  	 */
>  	/* restore the folio to right list. */

This comment is stale. Probably should be
/* Keep the folio and we will try it again later */

> -	if (rc != -EAGAIN && rc != -EDEADLOCK)
> -		list_move_tail(&src->lru, ret);
> +	if (rc == -EAGAIN || rc == -EDEADLOCK)
> +		ret = NULL;
>
> -	if (put_new_page)
> -		put_new_page(&dst->page, private);
> -	else
> -		folio_put(dst);
> +	migrate_folio_undo_src(src, page_was_mapped, anon_vma, locked, ret);
> +	migrate_folio_undo_dst(dst, dst_locked, put_new_page, private);
>
>  	return rc;
>  }
>
> -static int __migrate_folio_move(struct folio *src, struct folio *dst,
> -				enum migrate_mode mode)
> +/* Migrate the folio to the newly allocated folio in dst. */
> +static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
> +			      struct folio *src, struct folio *dst,
> +			      enum migrate_mode mode, enum migrate_reason reason,
> +			      struct list_head *ret)
>  {
>  	int rc;
>  	int page_was_mapped = 0;
> @@ -1283,12 +1271,8 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>  	list_del(&dst->lru);
>
>  	rc = move_to_new_folio(dst, src, mode);
> -
> -	if (rc == -EAGAIN) {
> -		list_add(&dst->lru, prev);
> -		__migrate_folio_record(dst, page_was_mapped, anon_vma);
> -		return rc;
> -	}
> +	if (rc)
> +		goto out;
>
>  	if (unlikely(!is_lru))
>  		goto out_unlock_both;
> @@ -1302,70 +1286,45 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>  	 * unsuccessful, and other cases when a page has been temporarily
>  	 * isolated from the unevictable LRU: but this case is the easiest.
>  	 */
> -	if (rc == MIGRATEPAGE_SUCCESS) {
> -		folio_add_lru(dst);
> -		if (page_was_mapped)
> -			lru_add_drain();
> -	}
> +	folio_add_lru(dst);
> +	if (page_was_mapped)
> +		lru_add_drain();
>
>  	if (page_was_mapped)
> -		remove_migration_ptes(src,
> -			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
> +		remove_migration_ptes(src, dst, false);
>
>  out_unlock_both:
>  	folio_unlock(dst);
> -	/* Drop an anon_vma reference if we took one */
> -	if (anon_vma)
> -		put_anon_vma(anon_vma);
> -	folio_unlock(src);
> +	set_page_owner_migrate_reason(&dst->page, reason);
>  	/*
>  	 * If migration is successful, decrease refcount of dst,
>  	 * which will not free the page because new page owner increased
>  	 * refcounter.
>  	 */
> -	if (rc == MIGRATEPAGE_SUCCESS)
> -		folio_put(dst);
> -
> -	return rc;
> -}
> -
> -/* Migrate the folio to the newly allocated folio in dst. */
> -static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
> -			      struct folio *src, struct folio *dst,
> -			      enum migrate_mode mode, enum migrate_reason reason,
> -			      struct list_head *ret)
> -{
> -	int rc;
> -
> -	rc = __migrate_folio_move(src, dst, mode);
> -	if (rc == MIGRATEPAGE_SUCCESS)
> -		set_page_owner_migrate_reason(&dst->page, reason);
> -
> -	if (rc != -EAGAIN) {
> -		/*
> -		 * A folio that has been migrated has all references
> -		 * removed and will be freed. A folio that has not been
> -		 * migrated will have kept its references and be restored.
> -		 */
> -		list_del(&src->lru);
> -	}
> +	folio_put(dst);
>
>  	/*
> -	 * If migration is successful, releases reference grabbed during
> -	 * isolation. Otherwise, restore the folio to right list unless
> -	 * we want to retry.
> +	 * A page that has been migrated has all references removed
> +	 * and will be freed.
>  	 */
> -	if (rc == MIGRATEPAGE_SUCCESS) {
> -		migrate_folio_done(src, reason);
> -	} else if (rc != -EAGAIN) {
> -		list_add_tail(&src->lru, ret);
> +	list_del(&src->lru);
> +	/* Drop an anon_vma reference if we took one */
> +	if (anon_vma)
> +		put_anon_vma(anon_vma);
> +	folio_unlock(src);
> +	migrate_folio_done(src, reason);
>
> -		if (put_new_page)
> -			put_new_page(&dst->page, private);
> -		else
> -			folio_put(dst);
> +	return rc;
> +out:
> +	if (rc == -EAGAIN) {
> +		list_add(&dst->lru, prev);
> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
> +		return rc;
>  	}
>
> +	migrate_folio_undo_src(src, page_was_mapped, anon_vma, true, ret);
> +	migrate_folio_undo_dst(dst, true, put_new_page, private);
> +
>  	return rc;
>  }
>
> @@ -1897,9 +1856,9 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>
>  		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
>  		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
> -				       ret_folios);
> +				       true, ret_folios);
>  		list_del(&dst->lru);
> -		migrate_folio_undo_dst(dst, put_new_page, private);
> +		migrate_folio_undo_dst(dst, true, put_new_page, private);
>  		dst = dst2;
>  		dst2 = list_next_entry(dst, lru);
>  	}
> -- 
> 2.35.1

Everything else looks good to me, just need to fix the two comments above.
Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 8/9] migrate_pages: batch flushing TLB
  2023-02-06  6:33 ` [PATCH -v4 8/9] migrate_pages: batch flushing TLB Huang Ying
@ 2023-02-07 14:52   ` Zi Yan
  2023-02-08 11:27     ` Huang, Ying
  2023-02-07 17:44   ` haoxin
  1 sibling, 1 reply; 33+ messages in thread
From: Zi Yan @ 2023-02-07 14:52 UTC (permalink / raw)
  To: Huang Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

[-- Attachment #1: Type: text/plain, Size: 4277 bytes --]

On 6 Feb 2023, at 1:33, Huang Ying wrote:

> The TLB flushing can cost quite a few CPU cycles during the folio
> migration in some situations, for example, when migrating a folio of a
> process with multiple active threads that run on multiple CPUs.  After
> batching the _unmap and _move in migrate_pages(), the TLB flushing can
> be batched easily with the existing TLB flush batching mechanism.
> This patch implements that.
>
> We use the following test case to test the patch.
>
> On a 2-socket Intel server,
>
> - Run pmbench memory accessing benchmark
>
> - Run `migratepages` to migrate pages of pmbench between node 0 and
>   node 1 back and forth.
>
> With the patch, the TLB flushing IPI reduces 99.1% during the test and
> the number of pages migrated successfully per second increases 291.7%.
>
> NOTE: TLB flushing is batched only for normal folios, not for THP
> folios, because the overhead of TLB flushing for THP folios is much
> lower than that for normal folios (about 1/512 on the x86 platform).
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>  mm/migrate.c |  4 +++-
>  mm/rmap.c    | 20 +++++++++++++++++---
>  2 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 9378fa2ad4a5..ca6e2ff02a09 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1230,7 +1230,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>  		/* Establish migration ptes */
>  		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>  			       !folio_test_ksm(src) && !anon_vma, src);
> -		try_to_migrate(src, 0);
> +		try_to_migrate(src, TTU_BATCH_FLUSH);
>  		page_was_mapped = 1;
>  	}
>
> @@ -1781,6 +1781,8 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  	stats->nr_thp_failed += thp_retry;
>  	stats->nr_failed_pages += nr_retry_pages;
>  move:

Maybe a comment:
/* Flush TLBs for all the unmapped pages */

> +	try_to_unmap_flush();
> +
>  	retry = 1;
>  	for (pass = 0;
>  	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b616870a09be..2e125f3e462e 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1976,7 +1976,21 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
>  		} else {
>  			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
>  			/* Nuke the page table entry. */
> -			pteval = ptep_clear_flush(vma, address, pvmw.pte);
> +			if (should_defer_flush(mm, flags)) {
> +				/*
> +				 * We clear the PTE but do not flush so potentially
> +				 * a remote CPU could still be writing to the folio.
> +				 * If the entry was previously clean then the
> +				 * architecture must guarantee that a clear->dirty
> +				 * transition on a cached TLB entry is written through
> +				 * and traps if the PTE is unmapped.
> +				 */
> +				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
> +
> +				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
> +			} else {
> +				pteval = ptep_clear_flush(vma, address, pvmw.pte);
> +			}
>  		}
>
>  		/* Set the dirty flag on the folio now the pte is gone. */
> @@ -2148,10 +2162,10 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
>
>  	/*
>  	 * Migration always ignores mlock and only supports TTU_RMAP_LOCKED and
> -	 * TTU_SPLIT_HUGE_PMD and TTU_SYNC flags.
> +	 * TTU_SPLIT_HUGE_PMD, TTU_SYNC, and TTU_BATCH_FLUSH flags.
>  	 */
>  	if (WARN_ON_ONCE(flags & ~(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD |
> -					TTU_SYNC)))
> +					TTU_SYNC | TTU_BATCH_FLUSH)))
>  		return;
>
>  	if (folio_is_zone_device(folio) &&
> -- 
> 2.35.1

Everything else looks good to me. Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread
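
A minimal sketch (not from the patch series) of the control flow the hunks
above add: try_to_migrate(..., TTU_BATCH_FLUSH) makes try_to_migrate_one()
clear the PTEs with ptep_get_and_clear() and only record a pending flush
via set_tlb_ubc_flush_pending(), and a single try_to_unmap_flush() then
covers every PTE cleared for the batch before the folios are moved.  The
wrapper function name and the bare folio list are assumptions made for
illustration; try_to_unmap_flush() is a kernel-internal helper, not part of
a public header.

/* Illustrative only: defer the per-PTE flush, then flush once per batch. */
static void batch_unmap_then_flush_sketch(struct list_head *folios)
{
	struct folio *folio, *folio2;

	list_for_each_entry_safe(folio, folio2, folios, lru)
		try_to_migrate(folio, TTU_BATCH_FLUSH);	/* flush deferred */

	try_to_unmap_flush();	/* one flush for the whole batch */
}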

* Re: [PATCH -v4 9/9] migrate_pages: move THP/hugetlb migration support check to simplify code
  2023-02-06  6:33 ` [PATCH -v4 9/9] migrate_pages: move THP/hugetlb migration support check to simplify code Huang Ying
@ 2023-02-07 14:53   ` Zi Yan
  0 siblings, 0 replies; 33+ messages in thread
From: Zi Yan @ 2023-02-07 14:53 UTC (permalink / raw)
  To: Huang Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Alistair Popple, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

[-- Attachment #1: Type: text/plain, Size: 928 bytes --]

On 6 Feb 2023, at 1:33, Huang Ying wrote:

> This is a code cleanup patch; no functionality change is expected.
> After the change, the number of lines is reduced, especially in the
> long migrate_pages_batch().
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Suggested-by: Alistair Popple <apopple@nvidia.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>  mm/migrate.c | 83 +++++++++++++++++++++++-----------------------------
>  1 file changed, 36 insertions(+), 47 deletions(-)
>
LGTM. Thanks. Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 1/9] migrate_pages: organize stats with struct migrate_pages_stats
  2023-02-06  6:33 ` [PATCH -v4 1/9] migrate_pages: organize stats with struct migrate_pages_stats Huang Ying
@ 2023-02-07 16:28   ` haoxin
  0 siblings, 0 replies; 33+ messages in thread
From: haoxin @ 2023-02-07 16:28 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Alistair Popple, Zi Yan, Baolin Wang,
	Yang Shi, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Minchan Kim, Mike Kravetz, Hyeonggon Yoo


On 2023/2/6 2:33 PM, Huang Ying wrote:
> Define struct migrate_pages_stats to organize the various statistics
> in migrate_pages().  This makes it easier to collect and consume the
> statistics in multiple functions.  This will be needed in the
> following patches in the series.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reviewed-by: Alistair Popple <apopple@nvidia.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>   mm/migrate.c | 60 +++++++++++++++++++++++++++++-----------------------
>   1 file changed, 34 insertions(+), 26 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index a4d3fc65085f..ef388a9e4747 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1396,6 +1396,16 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
>   	return rc;
>   }
>   
> +struct migrate_pages_stats {
> +	int nr_succeeded;	/* Normal and large folios migrated successfully, in
> +				   units of base pages */
> +	int nr_failed_pages;	/* Normal and large folios failed to be migrated, in
> +				   units of base pages.  Untried folios aren't counted */
> +	int nr_thp_succeeded;	/* THP migrated successfully */
> +	int nr_thp_failed;	/* THP failed to be migrated */
> +	int nr_thp_split;	/* THP split before migrating */
> +};
> +
>   /*
>    * migrate_pages - migrate the folios specified in a list, to the free folios
>    *		   supplied as the target for the page migration
> @@ -1430,13 +1440,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	int large_retry = 1;
>   	int thp_retry = 1;
>   	int nr_failed = 0;
> -	int nr_failed_pages = 0;
>   	int nr_retry_pages = 0;
> -	int nr_succeeded = 0;
> -	int nr_thp_succeeded = 0;
>   	int nr_large_failed = 0;
> -	int nr_thp_failed = 0;
> -	int nr_thp_split = 0;
>   	int pass = 0;
>   	bool is_large = false;
>   	bool is_thp = false;
> @@ -1446,9 +1451,11 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	LIST_HEAD(split_folios);
>   	bool nosplit = (reason == MR_NUMA_MISPLACED);
>   	bool no_split_folio_counting = false;
> +	struct migrate_pages_stats stats;
>   
>   	trace_mm_migrate_pages_start(mode, reason);
>   
> +	memset(&stats, 0, sizeof(stats));
>   split_folio_migration:
>   	for (pass = 0; pass < 10 && (retry || large_retry); pass++) {
>   		retry = 0;
> @@ -1502,9 +1509,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				/* Large folio migration is unsupported */
>   				if (is_large) {
>   					nr_large_failed++;
> -					nr_thp_failed += is_thp;
> +					stats.nr_thp_failed += is_thp;
>   					if (!try_split_folio(folio, &split_folios)) {
> -						nr_thp_split += is_thp;
> +						stats.nr_thp_split += is_thp;
>   						break;
>   					}
>   				/* Hugetlb migration is unsupported */
> @@ -1512,7 +1519,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   					nr_failed++;
>   				}
>   
> -				nr_failed_pages += nr_pages;
> +				stats.nr_failed_pages += nr_pages;
>   				list_move_tail(&folio->lru, &ret_folios);
>   				break;
>   			case -ENOMEM:
> @@ -1522,13 +1529,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				 */
>   				if (is_large) {
>   					nr_large_failed++;
> -					nr_thp_failed += is_thp;
> +					stats.nr_thp_failed += is_thp;
>   					/* Large folio NUMA faulting doesn't split to retry. */
>   					if (!nosplit) {
>   						int ret = try_split_folio(folio, &split_folios);
>   
>   						if (!ret) {
> -							nr_thp_split += is_thp;
> +							stats.nr_thp_split += is_thp;
>   							break;
>   						} else if (reason == MR_LONGTERM_PIN &&
>   							   ret == -EAGAIN) {
> @@ -1546,7 +1553,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   					nr_failed++;
>   				}
>   
> -				nr_failed_pages += nr_pages + nr_retry_pages;
> +				stats.nr_failed_pages += nr_pages + nr_retry_pages;
>   				/*
>   				 * There might be some split folios of fail-to-migrate large
>   				 * folios left in split_folios list. Move them back to migration
> @@ -1556,7 +1563,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				list_splice_init(&split_folios, from);
>   				/* nr_failed isn't updated for not used */
>   				nr_large_failed += large_retry;
> -				nr_thp_failed += thp_retry;
> +				stats.nr_thp_failed += thp_retry;
>   				goto out;
>   			case -EAGAIN:
>   				if (is_large) {
> @@ -1568,8 +1575,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				nr_retry_pages += nr_pages;
>   				break;
>   			case MIGRATEPAGE_SUCCESS:
> -				nr_succeeded += nr_pages;
> -				nr_thp_succeeded += is_thp;
> +				stats.nr_succeeded += nr_pages;
> +				stats.nr_thp_succeeded += is_thp;
>   				break;
>   			default:
>   				/*
> @@ -1580,20 +1587,20 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				 */
>   				if (is_large) {
>   					nr_large_failed++;
> -					nr_thp_failed += is_thp;
> +					stats.nr_thp_failed += is_thp;
>   				} else if (!no_split_folio_counting) {
>   					nr_failed++;
>   				}
>   
> -				nr_failed_pages += nr_pages;
> +				stats.nr_failed_pages += nr_pages;
>   				break;
>   			}
>   		}
>   	}
>   	nr_failed += retry;
>   	nr_large_failed += large_retry;
> -	nr_thp_failed += thp_retry;
> -	nr_failed_pages += nr_retry_pages;
> +	stats.nr_thp_failed += thp_retry;
> +	stats.nr_failed_pages += nr_retry_pages;
>   	/*
>   	 * Try to migrate split folios of fail-to-migrate large folios, no
>   	 * nr_failed counting in this round, since all split folios of a
> @@ -1626,16 +1633,17 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	if (list_empty(from))
>   		rc = 0;
>   
> -	count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
> -	count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
> -	count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
> -	count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
> -	count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
> -	trace_mm_migrate_pages(nr_succeeded, nr_failed_pages, nr_thp_succeeded,
> -			       nr_thp_failed, nr_thp_split, mode, reason);
> +	count_vm_events(PGMIGRATE_SUCCESS, stats.nr_succeeded);
> +	count_vm_events(PGMIGRATE_FAIL, stats.nr_failed_pages);
> +	count_vm_events(THP_MIGRATION_SUCCESS, stats.nr_thp_succeeded);
> +	count_vm_events(THP_MIGRATION_FAIL, stats.nr_thp_failed);
> +	count_vm_events(THP_MIGRATION_SPLIT, stats.nr_thp_split);
> +	trace_mm_migrate_pages(stats.nr_succeeded, stats.nr_failed_pages,
> +			       stats.nr_thp_succeeded, stats.nr_thp_failed,
> +			       stats.nr_thp_split, mode, reason);
>   
>   	if (ret_succeeded)
> -		*ret_succeeded = nr_succeeded;
> +		*ret_succeeded = stats.nr_succeeded;
>   
>   	return rc;
>   }
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 2/9] migrate_pages: separate hugetlb folios migration
  2023-02-06  6:33 ` [PATCH -v4 2/9] migrate_pages: separate hugetlb folios migration Huang Ying
@ 2023-02-07 16:42   ` haoxin
  2023-02-08 11:35     ` Huang, Ying
  0 siblings, 1 reply; 33+ messages in thread
From: haoxin @ 2023-02-07 16:42 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Baolin Wang, Zi Yan, Yang Shi,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	Minchan Kim, Mike Kravetz, Hyeonggon Yoo


On 2023/2/6 2:33 PM, Huang Ying wrote:
> This is a preparation patch to batch the folio unmapping and moving
> for the non-hugetlb folios.  Based on that we can batch the TLB
> shootdown during the folio migration and make it possible to use some
> hardware accelerator for the folio copying.
>
> In this patch the migration of hugetlb folios and non-hugetlb folios is
> separated in migrate_pages() to make it easy to change the non-hugetlb
> folio migration implementation.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>   mm/migrate.c | 141 +++++++++++++++++++++++++++++++++++++++++++--------
>   1 file changed, 119 insertions(+), 22 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index ef388a9e4747..be7f37523463 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1396,6 +1396,8 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
>   	return rc;
>   }
>   
> +#define NR_MAX_MIGRATE_PAGES_RETRY	10
> +
>   struct migrate_pages_stats {
>   	int nr_succeeded;	/* Normal and large folios migrated successfully, in
>   				   units of base pages */
> @@ -1406,6 +1408,95 @@ struct migrate_pages_stats {
>   	int nr_thp_split;	/* THP split before migrating */
>   };
>   
> +/*
> + * Returns the number of hugetlb folios that were not migrated, or an error code
> + * after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no hugetlb folios are movable
> + * any more because the list has become empty or no retryable hugetlb folios
> + * exist any more. It is caller's responsibility to call putback_movable_pages()
> + * only if ret != 0.
> + */
> +static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
> +			    free_page_t put_new_page, unsigned long private,
> +			    enum migrate_mode mode, int reason,
> +			    struct migrate_pages_stats *stats,
> +			    struct list_head *ret_folios)
> +{
> +	int retry = 1;
> +	int nr_failed = 0;
> +	int nr_retry_pages = 0;
> +	int pass = 0;
> +	struct folio *folio, *folio2;
> +	int rc, nr_pages;
> +
> +	for (pass = 0; pass < NR_MAX_MIGRATE_PAGES_RETRY && retry; pass++) {
> +		retry = 0;
> +		nr_retry_pages = 0;
> +
> +		list_for_each_entry_safe(folio, folio2, from, lru) {
> +			if (!folio_test_hugetlb(folio))
> +				continue;
> +
> +			nr_pages = folio_nr_pages(folio);
> +
> +			cond_resched();
Just curious: why put cond_resched() here?  It makes "nr_pages =
folio_nr_pages(folio)" look separated from the code below it.
> +
> +			rc = unmap_and_move_huge_page(get_new_page,
> +						      put_new_page, private,
> +						      &folio->page, pass > 2, mode,
> +						      reason, ret_folios);
> +			/*
> +			 * The rules are:
> +			 *	Success: hugetlb folio will be put back
> +			 *	-EAGAIN: stay on the from list
> +			 *	-ENOMEM: stay on the from list
> +			 *	-ENOSYS: stay on the from list
> +			 *	Other errno: put on ret_folios list
> +			 */
> +			switch(rc) {
> +			case -ENOSYS:
> +				/* Hugetlb migration is unsupported */
> +				nr_failed++;
> +				stats->nr_failed_pages += nr_pages;
> +				list_move_tail(&folio->lru, ret_folios);
> +				break;
> +			case -ENOMEM:
> +				/*
> +				 * When memory is low, don't bother to try to migrate
> +				 * other folios, just exit.
> +				 */
> +				stats->nr_failed_pages += nr_pages + nr_retry_pages;
> +				return -ENOMEM;
> +			case -EAGAIN:
> +				retry++;
> +				nr_retry_pages += nr_pages;
> +				break;
> +			case MIGRATEPAGE_SUCCESS:
> +				stats->nr_succeeded += nr_pages;
> +				break;
> +			default:
> +				/*
> +				 * Permanent failure (-EBUSY, etc.):
> +				 * unlike -EAGAIN case, the failed folio is
> +				 * removed from migration folio list and not
> +				 * retried in the next outer loop.
> +				 */
> +				nr_failed++;
> +				stats->nr_failed_pages += nr_pages;
> +				break;
> +			}
> +		}
> +	}
> +	/*
> +	 * nr_failed is number of hugetlb folios failed to be migrated.  After
> +	 * NR_MAX_MIGRATE_PAGES_RETRY attempts, give up and count retried hugetlb
> +	 * folios as failed.
> +	 */
> +	nr_failed += retry;
> +	stats->nr_failed_pages += nr_retry_pages;
> +
> +	return nr_failed;
> +}
> +
>   /*
>    * migrate_pages - migrate the folios specified in a list, to the free folios
>    *		   supplied as the target for the page migration
> @@ -1422,10 +1513,10 @@ struct migrate_pages_stats {
>    * @ret_succeeded:	Set to the number of folios migrated successfully if
>    *			the caller passes a non-NULL pointer.
>    *
> - * The function returns after 10 attempts or if no folios are movable any more
> - * because the list has become empty or no retryable folios exist any more.
> - * It is caller's responsibility to call putback_movable_pages() to return folios
> - * to the LRU or free list only if ret != 0.
> + * The function returns after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no folios
> + * are movable any more because the list has become empty or no retryable folios
> + * exist any more. It is caller's responsibility to call putback_movable_pages()
> + * only if ret != 0.
>    *
>    * Returns the number of {normal folio, large folio, hugetlb} that were not
>    * migrated, or an error code. The number of large folio splits will be
> @@ -1439,7 +1530,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	int retry = 1;
>   	int large_retry = 1;
>   	int thp_retry = 1;
> -	int nr_failed = 0;
> +	int nr_failed;
>   	int nr_retry_pages = 0;
>   	int nr_large_failed = 0;
>   	int pass = 0;
> @@ -1456,38 +1547,45 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	trace_mm_migrate_pages_start(mode, reason);
>   
>   	memset(&stats, 0, sizeof(stats));
> +	rc = migrate_hugetlbs(from, get_new_page, put_new_page, private, mode, reason,
> +			      &stats, &ret_folios);
> +	if (rc < 0)
> +		goto out;
> +	nr_failed = rc;
> +
>   split_folio_migration:
> -	for (pass = 0; pass < 10 && (retry || large_retry); pass++) {
> +	for (pass = 0;
> +	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
> +	     pass++) {
>   		retry = 0;
>   		large_retry = 0;
>   		thp_retry = 0;
>   		nr_retry_pages = 0;
>   
>   		list_for_each_entry_safe(folio, folio2, from, lru) {
> +			/* Retried hugetlb folios will be kept in list  */
> +			if (folio_test_hugetlb(folio)) {
> +				list_move_tail(&folio->lru, &ret_folios);
> +				continue;
> +			}
> +
>   			/*
>   			 * Large folio statistics is based on the source large
>   			 * folio. Capture required information that might get
>   			 * lost during migration.
>   			 */
> -			is_large = folio_test_large(folio) && !folio_test_hugetlb(folio);
> +			is_large = folio_test_large(folio);
>   			is_thp = is_large && folio_test_pmd_mappable(folio);
>   			nr_pages = folio_nr_pages(folio);
> +
>   			cond_resched();
>   
> -			if (folio_test_hugetlb(folio))
> -				rc = unmap_and_move_huge_page(get_new_page,
> -						put_new_page, private,
> -						&folio->page, pass > 2, mode,
> -						reason,
> -						&ret_folios);
> -			else
> -				rc = unmap_and_move(get_new_page, put_new_page,
> -						private, folio, pass > 2, mode,
> -						reason, &ret_folios);
> +			rc = unmap_and_move(get_new_page, put_new_page,
> +					    private, folio, pass > 2, mode,
> +					    reason, &ret_folios);
>   			/*
>   			 * The rules are:
> -			 *	Success: non hugetlb folio will be freed, hugetlb
> -			 *		 folio will be put back
> +			 *	Success: folio will be freed
>   			 *	-EAGAIN: stay on the from list
>   			 *	-ENOMEM: stay on the from list
>   			 *	-ENOSYS: stay on the from list
> @@ -1514,7 +1612,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   						stats.nr_thp_split += is_thp;
>   						break;
>   					}
> -				/* Hugetlb migration is unsupported */
>   				} else if (!no_split_folio_counting) {
>   					nr_failed++;
>   				}
> @@ -1608,8 +1705,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	 */
>   	if (!list_empty(&split_folios)) {
>   		/*
> -		 * Move non-migrated folios (after 10 retries) to ret_folios
> -		 * to avoid migrating them again.
> +		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
> +		 * retries) to ret_folios to avoid migrating them again.
>   		 */
>   		list_splice_init(from, &ret_folios);
>   		list_splice_init(&split_folios, from);
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread
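
A hypothetical caller sketch (not from the series) of the return convention
documented in the comments quoted above: migrate_pages() returns 0 when
every folio was migrated, a positive count of folios that were not, or a
negative error code, and the caller calls putback_movable_pages() only when
the return value is non-zero.  new_folio/free_folio are placeholder
callbacks of the real new_page_t/free_page_t types, and the mode/reason
values are arbitrary choices for the example.

/* Illustrative only: act on the documented return convention. */
static void sketch_migrate_list(struct list_head *folio_list,
				new_page_t new_folio, free_page_t free_folio)
{
	int ret;

	ret = migrate_pages(folio_list, new_folio, free_folio, 0,
			    MIGRATE_SYNC, MR_SYSCALL, NULL);
	if (ret)	/* some folios were not migrated, or an error code */
		putback_movable_pages(folio_list);
}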

* Re: [PATCH -v4 3/9] migrate_pages: restrict number of pages to migrate in batch
  2023-02-06  6:33 ` [PATCH -v4 3/9] migrate_pages: restrict number of pages to migrate in batch Huang Ying
@ 2023-02-07 17:01   ` haoxin
  0 siblings, 0 replies; 33+ messages in thread
From: haoxin @ 2023-02-07 17:01 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Baolin Wang, Zi Yan, Yang Shi,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	Minchan Kim, Mike Kravetz, Hyeonggon Yoo


On 2023/2/6 2:33 PM, Huang Ying wrote:
> This is a preparation patch to batch the folio unmapping and moving
> for non-hugetlb folios.
>
> If we had batched the folio unmapping, all folios to be migrated would
> be unmapped before copying the contents and flags of the folios.  If
> the folios that were passed to migrate_pages() were too many in units
> of pages, the execution of the processes would be stopped for too long
> a time, resulting in too long a latency.  For example, the migrate_pages()
> syscall will call migrate_pages() with all folios of a process.  To avoid
> this possible issue, in this patch, we restrict the number of pages to be
> migrated in a batch to no more than HPAGE_PMD_NR.  That is, the influence
> is at the same level as that of THP migration.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>   mm/migrate.c | 174 +++++++++++++++++++++++++++++++--------------------
>   1 file changed, 106 insertions(+), 68 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index be7f37523463..9a667039c34c 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1396,6 +1396,11 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
>   	return rc;
>   }
>   
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#define NR_MAX_BATCHED_MIGRATION	HPAGE_PMD_NR
> +#else
> +#define NR_MAX_BATCHED_MIGRATION	512
> +#endif
>   #define NR_MAX_MIGRATE_PAGES_RETRY	10
>   
>   struct migrate_pages_stats {
> @@ -1497,40 +1502,15 @@ static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
>   	return nr_failed;
>   }
>   
> -/*
> - * migrate_pages - migrate the folios specified in a list, to the free folios
> - *		   supplied as the target for the page migration
> - *
> - * @from:		The list of folios to be migrated.
> - * @get_new_page:	The function used to allocate free folios to be used
> - *			as the target of the folio migration.
> - * @put_new_page:	The function used to free target folios if migration
> - *			fails, or NULL if no special handling is necessary.
> - * @private:		Private data to be passed on to get_new_page()
> - * @mode:		The migration mode that specifies the constraints for
> - *			folio migration, if any.
> - * @reason:		The reason for folio migration.
> - * @ret_succeeded:	Set to the number of folios migrated successfully if
> - *			the caller passes a non-NULL pointer.
> - *
> - * The function returns after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no folios
> - * are movable any more because the list has become empty or no retryable folios
> - * exist any more. It is caller's responsibility to call putback_movable_pages()
> - * only if ret != 0.
> - *
> - * Returns the number of {normal folio, large folio, hugetlb} that were not
> - * migrated, or an error code. The number of large folio splits will be
> - * considered as the number of non-migrated large folio, no matter how many
> - * split folios of the large folio are migrated successfully.
> - */
> -int migrate_pages(struct list_head *from, new_page_t get_new_page,
> +static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   		free_page_t put_new_page, unsigned long private,
> -		enum migrate_mode mode, int reason, unsigned int *ret_succeeded)
> +		enum migrate_mode mode, int reason, struct list_head *ret_folios,
> +		struct migrate_pages_stats *stats)
>   {
>   	int retry = 1;
>   	int large_retry = 1;
>   	int thp_retry = 1;
> -	int nr_failed;
> +	int nr_failed = 0;
>   	int nr_retry_pages = 0;
>   	int nr_large_failed = 0;
>   	int pass = 0;
> @@ -1538,20 +1518,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	bool is_thp = false;
>   	struct folio *folio, *folio2;
>   	int rc, nr_pages;
> -	LIST_HEAD(ret_folios);
>   	LIST_HEAD(split_folios);
>   	bool nosplit = (reason == MR_NUMA_MISPLACED);
>   	bool no_split_folio_counting = false;
> -	struct migrate_pages_stats stats;
> -
> -	trace_mm_migrate_pages_start(mode, reason);
> -
> -	memset(&stats, 0, sizeof(stats));
> -	rc = migrate_hugetlbs(from, get_new_page, put_new_page, private, mode, reason,
> -			      &stats, &ret_folios);
> -	if (rc < 0)
> -		goto out;
> -	nr_failed = rc;
>   
>   split_folio_migration:
>   	for (pass = 0;
> @@ -1563,12 +1532,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   		nr_retry_pages = 0;
>   
>   		list_for_each_entry_safe(folio, folio2, from, lru) {
> -			/* Retried hugetlb folios will be kept in list  */
> -			if (folio_test_hugetlb(folio)) {
> -				list_move_tail(&folio->lru, &ret_folios);
> -				continue;
> -			}
> -
>   			/*
>   			 * Large folio statistics is based on the source large
>   			 * folio. Capture required information that might get
> @@ -1582,15 +1545,14 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   
>   			rc = unmap_and_move(get_new_page, put_new_page,
>   					    private, folio, pass > 2, mode,
> -					    reason, &ret_folios);
> +					    reason, ret_folios);
>   			/*
>   			 * The rules are:
>   			 *	Success: folio will be freed
>   			 *	-EAGAIN: stay on the from list
>   			 *	-ENOMEM: stay on the from list
>   			 *	-ENOSYS: stay on the from list
> -			 *	Other errno: put on ret_folios list then splice to
> -			 *		     from list
> +			 *	Other errno: put on ret_folios list
>   			 */
>   			switch(rc) {
>   			/*
> @@ -1607,17 +1569,17 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				/* Large folio migration is unsupported */
>   				if (is_large) {
>   					nr_large_failed++;
> -					stats.nr_thp_failed += is_thp;
> +					stats->nr_thp_failed += is_thp;
>   					if (!try_split_folio(folio, &split_folios)) {
> -						stats.nr_thp_split += is_thp;
> +						stats->nr_thp_split += is_thp;
>   						break;
>   					}
>   				} else if (!no_split_folio_counting) {
>   					nr_failed++;
>   				}
>   
> -				stats.nr_failed_pages += nr_pages;
> -				list_move_tail(&folio->lru, &ret_folios);
> +				stats->nr_failed_pages += nr_pages;
> +				list_move_tail(&folio->lru, ret_folios);
>   				break;
>   			case -ENOMEM:
>   				/*
> @@ -1626,13 +1588,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				 */
>   				if (is_large) {
>   					nr_large_failed++;
> -					stats.nr_thp_failed += is_thp;
> +					stats->nr_thp_failed += is_thp;
>   					/* Large folio NUMA faulting doesn't split to retry. */
>   					if (!nosplit) {
>   						int ret = try_split_folio(folio, &split_folios);
>   
>   						if (!ret) {
> -							stats.nr_thp_split += is_thp;
> +							stats->nr_thp_split += is_thp;
>   							break;
>   						} else if (reason == MR_LONGTERM_PIN &&
>   							   ret == -EAGAIN) {
> @@ -1650,17 +1612,17 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   					nr_failed++;
>   				}
>   
> -				stats.nr_failed_pages += nr_pages + nr_retry_pages;
> +				stats->nr_failed_pages += nr_pages + nr_retry_pages;
>   				/*
>   				 * There might be some split folios of fail-to-migrate large
> -				 * folios left in split_folios list. Move them back to migration
> +				 * folios left in split_folios list. Move them to ret_folios
>   				 * list so that they could be put back to the right list by
>   				 * the caller otherwise the folio refcnt will be leaked.
>   				 */
> -				list_splice_init(&split_folios, from);
> +				list_splice_init(&split_folios, ret_folios);
>   				/* nr_failed isn't updated for not used */
>   				nr_large_failed += large_retry;
> -				stats.nr_thp_failed += thp_retry;
> +				stats->nr_thp_failed += thp_retry;
>   				goto out;
>   			case -EAGAIN:
>   				if (is_large) {
> @@ -1672,8 +1634,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				nr_retry_pages += nr_pages;
>   				break;
>   			case MIGRATEPAGE_SUCCESS:
> -				stats.nr_succeeded += nr_pages;
> -				stats.nr_thp_succeeded += is_thp;
> +				stats->nr_succeeded += nr_pages;
> +				stats->nr_thp_succeeded += is_thp;
>   				break;
>   			default:
>   				/*
> @@ -1684,20 +1646,20 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				 */
>   				if (is_large) {
>   					nr_large_failed++;
> -					stats.nr_thp_failed += is_thp;
> +					stats->nr_thp_failed += is_thp;
>   				} else if (!no_split_folio_counting) {
>   					nr_failed++;
>   				}
>   
> -				stats.nr_failed_pages += nr_pages;
> +				stats->nr_failed_pages += nr_pages;
>   				break;
>   			}
>   		}
>   	}
>   	nr_failed += retry;
>   	nr_large_failed += large_retry;
> -	stats.nr_thp_failed += thp_retry;
> -	stats.nr_failed_pages += nr_retry_pages;
> +	stats->nr_thp_failed += thp_retry;
> +	stats->nr_failed_pages += nr_retry_pages;
>   	/*
>   	 * Try to migrate split folios of fail-to-migrate large folios, no
>   	 * nr_failed counting in this round, since all split folios of a
> @@ -1708,7 +1670,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
>   		 * retries) to ret_folios to avoid migrating them again.
>   		 */
> -		list_splice_init(from, &ret_folios);
> +		list_splice_init(from, ret_folios);
>   		list_splice_init(&split_folios, from);
>   		no_split_folio_counting = true;
>   		retry = 1;
> @@ -1716,6 +1678,82 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	}
>   
>   	rc = nr_failed + nr_large_failed;
> +out:
> +	return rc;
> +}
> +
> +/*
> + * migrate_pages - migrate the folios specified in a list, to the free folios
> + *		   supplied as the target for the page migration
> + *
> + * @from:		The list of folios to be migrated.
> + * @get_new_page:	The function used to allocate free folios to be used
> + *			as the target of the folio migration.
> + * @put_new_page:	The function used to free target folios if migration
> + *			fails, or NULL if no special handling is necessary.
> + * @private:		Private data to be passed on to get_new_page()
> + * @mode:		The migration mode that specifies the constraints for
> + *			folio migration, if any.
> + * @reason:		The reason for folio migration.
> + * @ret_succeeded:	Set to the number of folios migrated successfully if
> + *			the caller passes a non-NULL pointer.
> + *
> + * The function returns after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no folios
> + * are movable any more because the list has become empty or no retryable folios
> + * exist any more. It is caller's responsibility to call putback_movable_pages()
> + * only if ret != 0.
> + *
> + * Returns the number of {normal folio, large folio, hugetlb} that were not
> + * migrated, or an error code. The number of large folio splits will be
> + * considered as the number of non-migrated large folio, no matter how many
> + * split folios of the large folio are migrated successfully.
> + */
> +int migrate_pages(struct list_head *from, new_page_t get_new_page,
> +		free_page_t put_new_page, unsigned long private,
> +		enum migrate_mode mode, int reason, unsigned int *ret_succeeded)
> +{
> +	int rc, rc_gather;
> +	int nr_pages;
> +	struct folio *folio, *folio2;
> +	LIST_HEAD(folios);
> +	LIST_HEAD(ret_folios);
> +	struct migrate_pages_stats stats;
> +
> +	trace_mm_migrate_pages_start(mode, reason);
> +
> +	memset(&stats, 0, sizeof(stats));
> +
> +	rc_gather = migrate_hugetlbs(from, get_new_page, put_new_page, private,
> +				     mode, reason, &stats, &ret_folios);
> +	if (rc_gather < 0)
> +		goto out;
> +again:
> +	nr_pages = 0;
> +	list_for_each_entry_safe(folio, folio2, from, lru) {
> +		/* Retried hugetlb folios will be kept in list  */
> +		if (folio_test_hugetlb(folio)) {
> +			list_move_tail(&folio->lru, &ret_folios);
> +			continue;
> +		}
> +
> +		nr_pages += folio_nr_pages(folio);
> +		if (nr_pages > NR_MAX_BATCHED_MIGRATION)
> +			break;
> +	}
> +	if (nr_pages > NR_MAX_BATCHED_MIGRATION)
> +		list_cut_before(&folios, from, &folio->lru);
> +	else
> +		list_splice_init(from, &folios);
> +	rc = migrate_pages_batch(&folios, get_new_page, put_new_page, private,
> +				 mode, reason, &ret_folios, &stats);
> +	list_splice_tail_init(&folios, &ret_folios);
> +	if (rc < 0) {
> +		rc_gather = rc;
> +		goto out;
> +	}
> +	rc_gather += rc;
> +	if (!list_empty(from))
> +		goto again;
>   out:
>   	/*
>   	 * Put the permanent failure folio back to migration list, they
> @@ -1728,7 +1766,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	 * are migrated successfully.
>   	 */
>   	if (list_empty(from))
> -		rc = 0;
> +		rc_gather = 0;
>   
>   	count_vm_events(PGMIGRATE_SUCCESS, stats.nr_succeeded);
>   	count_vm_events(PGMIGRATE_FAIL, stats.nr_failed_pages);
> @@ -1742,7 +1780,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   	if (ret_succeeded)
>   		*ret_succeeded = stats.nr_succeeded;
>   
> -	return rc;
> +	return rc_gather;
>   }
>   
>   struct page *alloc_migration_target(struct page *page, unsigned long private)
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread
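
Since the hunks above are split across several contexts, here is a
condensed, illustrative restatement of the batch-carving loop this patch
adds (stats, error handling, and the hugetlb skip elided).
migrate_one_batch() is a placeholder standing in for migrate_pages_batch()
and is assumed to migrate and empty the batch list; it is not a real
function.

/*
 * Illustrative only: carve at most NR_MAX_BATCHED_MIGRATION base pages
 * (HPAGE_PMD_NR with THP enabled, 512 otherwise) off the front of 'from'
 * and migrate them as one batch, repeating until the list is empty.
 */
static void sketch_migrate_in_batches(struct list_head *from)
{
	LIST_HEAD(folios);
	struct folio *folio, *folio2;
	int nr_pages;

again:
	nr_pages = 0;
	list_for_each_entry_safe(folio, folio2, from, lru) {
		nr_pages += folio_nr_pages(folio);
		if (nr_pages > NR_MAX_BATCHED_MIGRATION)
			break;
	}
	/* Keep the folio that pushed over the limit for the next batch. */
	if (nr_pages > NR_MAX_BATCHED_MIGRATION)
		list_cut_before(&folios, from, &folio->lru);
	else
		list_splice_init(from, &folios);

	migrate_one_batch(&folios);	/* placeholder: empties 'folios' */

	if (!list_empty(from))
		goto again;
}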

* Re: [PATCH -v4 4/9] migrate_pages: split unmap_and_move() to _unmap() and _move()
  2023-02-06  6:33 ` [PATCH -v4 4/9] migrate_pages: split unmap_and_move() to _unmap() and _move() Huang Ying
@ 2023-02-07 17:11   ` haoxin
  2023-02-07 17:27     ` haoxin
  0 siblings, 1 reply; 33+ messages in thread
From: haoxin @ 2023-02-07 17:11 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Baolin Wang, Zi Yan, Yang Shi,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	Minchan Kim, Mike Kravetz, Hyeonggon Yoo


On 2023/2/6 2:33 PM, Huang Ying wrote:
> This is a preparation patch to batch the folio unmapping and moving.
>
> In this patch, unmap_and_move() is split into migrate_folio_unmap() and
> migrate_folio_move().  So we can batch _unmap() and _move() in
> different loops later.  To pass some information between unmap and
> move, the originally unused dst->mapping and dst->private fields are used.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>   include/linux/migrate.h |   1 +
>   mm/migrate.c            | 170 ++++++++++++++++++++++++++++++----------
>   2 files changed, 130 insertions(+), 41 deletions(-)
>
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 3ef77f52a4f0..7376074f2e1e 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -18,6 +18,7 @@ struct migration_target_control;
>    * - zero on page migration success;
>    */
>   #define MIGRATEPAGE_SUCCESS		0
> +#define MIGRATEPAGE_UNMAP		1
>   
>   /**
>    * struct movable_operations - Driver page migration
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 9a667039c34c..0428449149f4 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1009,11 +1009,53 @@ static int move_to_new_folio(struct folio *dst, struct folio *src,
>   	return rc;
>   }
>   
> -static int __unmap_and_move(struct folio *src, struct folio *dst,
> +/*
> + * To record some information during migration, we uses some unused

uses / use

> + * fields (mapping and private) of struct folio of the newly allocated
> + * destination folio.  This is safe because nobody is using them
> + * except us.
> + */
> +static void __migrate_folio_record(struct folio *dst,
> +				   unsigned long page_was_mapped,
> +				   struct anon_vma *anon_vma)
> +{
> +	dst->mapping = (void *)anon_vma;
> +	dst->private = (void *)page_was_mapped;
> +}
> +
> +static void __migrate_folio_extract(struct folio *dst,
> +				   int *page_was_mappedp,
> +				   struct anon_vma **anon_vmap)
> +{
> +	*anon_vmap = (void *)dst->mapping;
> +	*page_was_mappedp = (unsigned long)dst->private;
> +	dst->mapping = NULL;
> +	dst->private = NULL;
> +}
> +
> +/* Cleanup src folio upon migration success */
> +static void migrate_folio_done(struct folio *src,
> +			       enum migrate_reason reason)
> +{
> +	/*
> +	 * Compaction can migrate also non-LRU pages which are
> +	 * not accounted to NR_ISOLATED_*. They can be recognized
> +	 * as __PageMovable
> +	 */
> +	if (likely(!__folio_test_movable(src)))
> +		mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
> +				    folio_is_file_lru(src), -folio_nr_pages(src));
> +
> +	if (reason != MR_MEMORY_FAILURE)
> +		/* We release the page in page_handle_poison. */
> +		folio_put(src);
> +}
> +
> +static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>   				int force, enum migrate_mode mode)
>   {
>   	int rc = -EAGAIN;
> -	bool page_was_mapped = false;
> +	int page_was_mapped = 0;
>   	struct anon_vma *anon_vma = NULL;
>   	bool is_lru = !__PageMovable(&src->page);
>   
> @@ -1089,8 +1131,8 @@ static int __unmap_and_move(struct folio *src, struct folio *dst,
>   		goto out_unlock;
>   
>   	if (unlikely(!is_lru)) {
> -		rc = move_to_new_folio(dst, src, mode);
> -		goto out_unlock_both;
> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
> +		return MIGRATEPAGE_UNMAP;
>   	}
>   
>   	/*
> @@ -1115,11 +1157,42 @@ static int __unmap_and_move(struct folio *src, struct folio *dst,
>   		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>   			       !folio_test_ksm(src) && !anon_vma, src);
>   		try_to_migrate(src, 0);
> -		page_was_mapped = true;
> +		page_was_mapped = 1;
>   	}
>   
> -	if (!folio_mapped(src))
> -		rc = move_to_new_folio(dst, src, mode);
> +	if (!folio_mapped(src)) {
> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
> +		return MIGRATEPAGE_UNMAP;
> +	}
> +
> +	if (page_was_mapped)
> +		remove_migration_ptes(src, src, false);
> +
> +out_unlock_both:
> +	folio_unlock(dst);
> +out_unlock:
> +	/* Drop an anon_vma reference if we took one */
> +	if (anon_vma)
> +		put_anon_vma(anon_vma);
> +	folio_unlock(src);
> +out:
> +
> +	return rc;
> +}
> +
> +static int __migrate_folio_move(struct folio *src, struct folio *dst,
> +				enum migrate_mode mode)
> +{
> +	int rc;
> +	int page_was_mapped = 0;
> +	struct anon_vma *anon_vma = NULL;
> +	bool is_lru = !__PageMovable(&src->page);
> +
> +	__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
> +
> +	rc = move_to_new_folio(dst, src, mode);
> +	if (unlikely(!is_lru))
> +		goto out_unlock_both;
>   
>   	/*
>   	 * When successful, push dst to LRU immediately: so that if it
> @@ -1142,12 +1215,10 @@ static int __unmap_and_move(struct folio *src, struct folio *dst,
>   
>   out_unlock_both:
>   	folio_unlock(dst);
> -out_unlock:
>   	/* Drop an anon_vma reference if we took one */
>   	if (anon_vma)
>   		put_anon_vma(anon_vma);
>   	folio_unlock(src);
> -out:
>   	/*
>   	 * If migration is successful, decrease refcount of dst,
>   	 * which will not free the page because new page owner increased
> @@ -1159,19 +1230,15 @@ static int __unmap_and_move(struct folio *src, struct folio *dst,
>   	return rc;
>   }
>   
> -/*
> - * Obtain the lock on folio, remove all ptes and migrate the folio
> - * to the newly allocated folio in dst.
> - */
> -static int unmap_and_move(new_page_t get_new_page,
> -				   free_page_t put_new_page,
> -				   unsigned long private, struct folio *src,
> -				   int force, enum migrate_mode mode,
> -				   enum migrate_reason reason,
> -				   struct list_head *ret)
> +/* Obtain the lock on page, remove all ptes. */
> +static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
> +			       unsigned long private, struct folio *src,
> +			       struct folio **dstp, int force,
> +			       enum migrate_mode mode, enum migrate_reason reason,
> +			       struct list_head *ret)
>   {
>   	struct folio *dst;
> -	int rc = MIGRATEPAGE_SUCCESS;
> +	int rc = MIGRATEPAGE_UNMAP;
>   	struct page *newpage = NULL;
>   
>   	if (!thp_migration_supported() && folio_test_transhuge(src))
> @@ -1182,20 +1249,50 @@ static int unmap_and_move(new_page_t get_new_page,
>   		folio_clear_active(src);
>   		folio_clear_unevictable(src);
>   		/* free_pages_prepare() will clear PG_isolated. */
> -		goto out;
> +		list_del(&src->lru);
> +		migrate_folio_done(src, reason);
> +		return MIGRATEPAGE_SUCCESS;
>   	}
>   
>   	newpage = get_new_page(&src->page, private);
>   	if (!newpage)
>   		return -ENOMEM;
>   	dst = page_folio(newpage);
> +	*dstp = dst;
>   
>   	dst->private = NULL;
> -	rc = __unmap_and_move(src, dst, force, mode);
> +	rc = __migrate_folio_unmap(src, dst, force, mode);
> +	if (rc == MIGRATEPAGE_UNMAP)
> +		return rc;
> +
> +	/*
> +	 * A page that has not been migrated will have kept its
> +	 * references and be restored.
> +	 */
> +	/* restore the folio to right list. */
> +	if (rc != -EAGAIN)
> +		list_move_tail(&src->lru, ret);
> +
> +	if (put_new_page)
> +		put_new_page(&dst->page, private);
> +	else
> +		folio_put(dst);
> +
> +	return rc;
> +}
> +
> +/* Migrate the folio to the newly allocated folio in dst. */
> +static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
> +			      struct folio *src, struct folio *dst,
> +			      enum migrate_mode mode, enum migrate_reason reason,
> +			      struct list_head *ret)
> +{
> +	int rc;
> +
> +	rc = __migrate_folio_move(src, dst, mode);
>   	if (rc == MIGRATEPAGE_SUCCESS)
>   		set_page_owner_migrate_reason(&dst->page, reason);
>   
> -out:
>   	if (rc != -EAGAIN) {
>   		/*
>   		 * A folio that has been migrated has all references
> @@ -1211,20 +1308,7 @@ static int unmap_and_move(new_page_t get_new_page,
>   	 * we want to retry.
>   	 */
>   	if (rc == MIGRATEPAGE_SUCCESS) {
> -		/*
> -		 * Compaction can migrate also non-LRU folios which are
> -		 * not accounted to NR_ISOLATED_*. They can be recognized
> -		 * as __folio_test_movable
> -		 */
> -		if (likely(!__folio_test_movable(src)))
> -			mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
> -					folio_is_file_lru(src), -folio_nr_pages(src));
> -
> -		if (reason != MR_MEMORY_FAILURE)
> -			/*
> -			 * We release the folio in page_handle_poison.
> -			 */
> -			folio_put(src);
> +		migrate_folio_done(src, reason);
>   	} else {
>   		if (rc != -EAGAIN)
>   			list_add_tail(&src->lru, ret);
> @@ -1516,7 +1600,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   	int pass = 0;
>   	bool is_large = false;
>   	bool is_thp = false;
> -	struct folio *folio, *folio2;
> +	struct folio *folio, *folio2, *dst = NULL;
>   	int rc, nr_pages;
>   	LIST_HEAD(split_folios);
>   	bool nosplit = (reason == MR_NUMA_MISPLACED);
> @@ -1543,9 +1627,13 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   
>   			cond_resched();
>   
> -			rc = unmap_and_move(get_new_page, put_new_page,
> -					    private, folio, pass > 2, mode,
> -					    reason, ret_folios);
> +			rc = migrate_folio_unmap(get_new_page, put_new_page, private,
> +						 folio, &dst, pass > 2, mode,
> +						 reason, ret_folios);
> +			if (rc == MIGRATEPAGE_UNMAP)
> +				rc = migrate_folio_move(put_new_page, private,
> +							folio, dst, mode,
> +							reason, ret_folios);
How do we deal with the case where all the pages are unmapped
successfully, but only some of them are moved successfully?
>   			/*
>   			 * The rules are:
>   			 *	Success: folio will be freed

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 4/9] migrate_pages: split unmap_and_move() to _unmap() and _move()
  2023-02-07 17:11   ` haoxin
@ 2023-02-07 17:27     ` haoxin
  0 siblings, 0 replies; 33+ messages in thread
From: haoxin @ 2023-02-07 17:27 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Baolin Wang, Zi Yan, Yang Shi,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	Minchan Kim, Mike Kravetz, Hyeonggon Yoo


On 2023/2/8 at 1:11 AM, haoxin wrote:
>
> On 2023/2/6 at 2:33 PM, Huang Ying wrote:
>> This is a preparation patch to batch the folio unmapping and moving.
>>
>> In this patch, unmap_and_move() is split to migrate_folio_unmap() and
>> migrate_folio_move().  So, we can batch _unmap() and _move() in
>> different loops later.  To pass some information between unmap and
>> move, the original unused dst->mapping and dst->private are used.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Bharata B Rao <bharata@amd.com>
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: haoxin <xhao@linux.alibaba.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>> ---
>>   include/linux/migrate.h |   1 +
>>   mm/migrate.c            | 170 ++++++++++++++++++++++++++++++----------
>>   2 files changed, 130 insertions(+), 41 deletions(-)
>>
>> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
>> index 3ef77f52a4f0..7376074f2e1e 100644
>> --- a/include/linux/migrate.h
>> +++ b/include/linux/migrate.h
>> @@ -18,6 +18,7 @@ struct migration_target_control;
>>    * - zero on page migration success;
>>    */
>>   #define MIGRATEPAGE_SUCCESS        0
>> +#define MIGRATEPAGE_UNMAP        1
>>     /**
>>    * struct movable_operations - Driver page migration
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 9a667039c34c..0428449149f4 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1009,11 +1009,53 @@ static int move_to_new_folio(struct folio 
>> *dst, struct folio *src,
>>       return rc;
>>   }
>>   -static int __unmap_and_move(struct folio *src, struct folio *dst,
>> +/*
>> + * To record some information during migration, we uses
s/uses/use/
>>   some unused
>> + * fields (mapping and private) of struct folio of the newly allocated
>> + * destination folio.  This is safe because nobody is using them
>> + * except us.
>> + */
>> +static void __migrate_folio_record(struct folio *dst,
>> +                   unsigned long page_was_mapped,
>> +                   struct anon_vma *anon_vma)
>> +{
>> +    dst->mapping = (void *)anon_vma;
>> +    dst->private = (void *)page_was_mapped;
>> +}
>> +
>> +static void __migrate_folio_extract(struct folio *dst,
>> +                   int *page_was_mappedp,
>> +                   struct anon_vma **anon_vmap)
>> +{
>> +    *anon_vmap = (void *)dst->mapping;
>> +    *page_was_mappedp = (unsigned long)dst->private;
>> +    dst->mapping = NULL;
>> +    dst->private = NULL;
>> +}
>> +
>> +/* Cleanup src folio upon migration success */
>> +static void migrate_folio_done(struct folio *src,
>> +                   enum migrate_reason reason)
>> +{
>> +    /*
>> +     * Compaction can migrate also non-LRU pages which are
>> +     * not accounted to NR_ISOLATED_*. They can be recognized
>> +     * as __PageMovable
>> +     */
>> +    if (likely(!__folio_test_movable(src)))
>> +        mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
>> +                    folio_is_file_lru(src), -folio_nr_pages(src));
>> +
>> +    if (reason != MR_MEMORY_FAILURE)
>> +        /* We release the page in page_handle_poison. */
>> +        folio_put(src);
>> +}
>> +
>> +static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>                   int force, enum migrate_mode mode)
>>   {
>>       int rc = -EAGAIN;
>> -    bool page_was_mapped = false;
>> +    int page_was_mapped = 0;
>>       struct anon_vma *anon_vma = NULL;
>>       bool is_lru = !__PageMovable(&src->page);
>>   @@ -1089,8 +1131,8 @@ static int __unmap_and_move(struct folio 
>> *src, struct folio *dst,
>>           goto out_unlock;
>>         if (unlikely(!is_lru)) {
>> -        rc = move_to_new_folio(dst, src, mode);
>> -        goto out_unlock_both;
>> +        __migrate_folio_record(dst, page_was_mapped, anon_vma);
>> +        return MIGRATEPAGE_UNMAP;
>>       }
>>         /*
>> @@ -1115,11 +1157,42 @@ static int __unmap_and_move(struct folio 
>> *src, struct folio *dst,
>>           VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>>                      !folio_test_ksm(src) && !anon_vma, src);
>>           try_to_migrate(src, 0);
>> -        page_was_mapped = true;
>> +        page_was_mapped = 1;
>>       }
>>   -    if (!folio_mapped(src))
>> -        rc = move_to_new_folio(dst, src, mode);
>> +    if (!folio_mapped(src)) {
>> +        __migrate_folio_record(dst, page_was_mapped, anon_vma);
>> +        return MIGRATEPAGE_UNMAP;
>> +    }
>> +
>> +    if (page_was_mapped)
>> +        remove_migration_ptes(src, src, false);
>> +
>> +out_unlock_both:
>> +    folio_unlock(dst);
>> +out_unlock:
>> +    /* Drop an anon_vma reference if we took one */
>> +    if (anon_vma)
>> +        put_anon_vma(anon_vma);
>> +    folio_unlock(src);
>> +out:
>> +
>> +    return rc;
>> +}
>> +
>> +static int __migrate_folio_move(struct folio *src, struct folio *dst,
>> +                enum migrate_mode mode)
>> +{
>> +    int rc;
>> +    int page_was_mapped = 0;
>> +    struct anon_vma *anon_vma = NULL;
>> +    bool is_lru = !__PageMovable(&src->page);
>> +
>> +    __migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
>> +
>> +    rc = move_to_new_folio(dst, src, mode);
>> +    if (unlikely(!is_lru))
>> +        goto out_unlock_both;
>>         /*
>>        * When successful, push dst to LRU immediately: so that if it
>> @@ -1142,12 +1215,10 @@ static int __unmap_and_move(struct folio 
>> *src, struct folio *dst,
>>     out_unlock_both:
>>       folio_unlock(dst);
>> -out_unlock:
>>       /* Drop an anon_vma reference if we took one */
>>       if (anon_vma)
>>           put_anon_vma(anon_vma);
>>       folio_unlock(src);
>> -out:
>>       /*
>>        * If migration is successful, decrease refcount of dst,
>>        * which will not free the page because new page owner increased
>> @@ -1159,19 +1230,15 @@ static int __unmap_and_move(struct folio 
>> *src, struct folio *dst,
>>       return rc;
>>   }
>>   -/*
>> - * Obtain the lock on folio, remove all ptes and migrate the folio
>> - * to the newly allocated folio in dst.
>> - */
>> -static int unmap_and_move(new_page_t get_new_page,
>> -                   free_page_t put_new_page,
>> -                   unsigned long private, struct folio *src,
>> -                   int force, enum migrate_mode mode,
>> -                   enum migrate_reason reason,
>> -                   struct list_head *ret)
>> +/* Obtain the lock on page, remove all ptes. */
>> +static int migrate_folio_unmap(new_page_t get_new_page, free_page_t 
>> put_new_page,
>> +                   unsigned long private, struct folio *src,
>> +                   struct folio **dstp, int force,
>> +                   enum migrate_mode mode, enum migrate_reason reason,
>> +                   struct list_head *ret)
>>   {
>>       struct folio *dst;
>> -    int rc = MIGRATEPAGE_SUCCESS;
>> +    int rc = MIGRATEPAGE_UNMAP;
>>       struct page *newpage = NULL;
>>         if (!thp_migration_supported() && folio_test_transhuge(src))
>> @@ -1182,20 +1249,50 @@ static int unmap_and_move(new_page_t 
>> get_new_page,
>>           folio_clear_active(src);
>>           folio_clear_unevictable(src);
>>           /* free_pages_prepare() will clear PG_isolated. */
>> -        goto out;
>> +        list_del(&src->lru);
>> +        migrate_folio_done(src, reason);
>> +        return MIGRATEPAGE_SUCCESS;
>>       }
>>         newpage = get_new_page(&src->page, private);
>>       if (!newpage)
>>           return -ENOMEM;
>>       dst = page_folio(newpage);
>> +    *dstp = dst;
>>         dst->private = NULL;
>> -    rc = __unmap_and_move(src, dst, force, mode);
>> +    rc = __migrate_folio_unmap(src, dst, force, mode);
>> +    if (rc == MIGRATEPAGE_UNMAP)
>> +        return rc;
>> +
>> +    /*
>> +     * A page that has not been migrated will have kept its
>> +     * references and be restored.
>> +     */
>> +    /* restore the folio to right list. */
>> +    if (rc != -EAGAIN)
>> +        list_move_tail(&src->lru, ret);
>> +
>> +    if (put_new_page)
>> +        put_new_page(&dst->page, private);
>> +    else
>> +        folio_put(dst);
>> +
>> +    return rc;
>> +}
>> +
>> +/* Migrate the folio to the newly allocated folio in dst. */
>> +static int migrate_folio_move(free_page_t put_new_page, unsigned 
>> long private,
>> +                  struct folio *src, struct folio *dst,
>> +                  enum migrate_mode mode, enum migrate_reason reason,
>> +                  struct list_head *ret)
>> +{
>> +    int rc;
>> +
>> +    rc = __migrate_folio_move(src, dst, mode);
>>       if (rc == MIGRATEPAGE_SUCCESS)
>>           set_page_owner_migrate_reason(&dst->page, reason);
>>   -out:
>>       if (rc != -EAGAIN) {
>>           /*
>>            * A folio that has been migrated has all references
>> @@ -1211,20 +1308,7 @@ static int unmap_and_move(new_page_t 
>> get_new_page,
>>        * we want to retry.
>>        */
>>       if (rc == MIGRATEPAGE_SUCCESS) {
>> -        /*
>> -         * Compaction can migrate also non-LRU folios which are
>> -         * not accounted to NR_ISOLATED_*. They can be recognized
>> -         * as __folio_test_movable
>> -         */
>> -        if (likely(!__folio_test_movable(src)))
>> -            mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
>> -                    folio_is_file_lru(src), -folio_nr_pages(src));
>> -
>> -        if (reason != MR_MEMORY_FAILURE)
>> -            /*
>> -             * We release the folio in page_handle_poison.
>> -             */
>> -            folio_put(src);
>> +        migrate_folio_done(src, reason);
>>       } else {
>>           if (rc != -EAGAIN)
>>               list_add_tail(&src->lru, ret);
>> @@ -1516,7 +1600,7 @@ static int migrate_pages_batch(struct list_head 
>> *from, new_page_t get_new_page,
>>       int pass = 0;
>>       bool is_large = false;
>>       bool is_thp = false;
>> -    struct folio *folio, *folio2;
>> +    struct folio *folio, *folio2, *dst = NULL;
>>       int rc, nr_pages;
>>       LIST_HEAD(split_folios);
>>       bool nosplit = (reason == MR_NUMA_MISPLACED);
>> @@ -1543,9 +1627,13 @@ static int migrate_pages_batch(struct 
>> list_head *from, new_page_t get_new_page,
>>                 cond_resched();
>>   -            rc = unmap_and_move(get_new_page, put_new_page,
>> -                        private, folio, pass > 2, mode,
>> -                        reason, ret_folios);
>> +            rc = migrate_folio_unmap(get_new_page, put_new_page, 
>> private,
>> +                         folio, &dst, pass > 2, mode,
>> +                         reason, ret_folios);
>> +            if (rc == MIGRATEPAGE_UNMAP)
>> +                rc = migrate_folio_move(put_new_page, private,
>> +                            folio, dst, mode,
>> +                            reason, ret_folios);
> How do we deal with the case where all the pages are unmapped
> successfully, but only some of them are moved successfully?

Please ignore this, I got the answer from patch 5.
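
For anyone else reading along: patch 5 adds two undo helpers, and the
cleanup path of migrate_pages_batch() walks the remaining unmapped folios
and restores them with these helpers before returning.  Roughly (condensed
from that patch, comments added by me):

	/* Put the source folio back to the state before _unmap */
	static void migrate_folio_undo_src(struct folio *src, int page_was_mapped,
					   struct anon_vma *anon_vma,
					   struct list_head *ret)
	{
		if (page_was_mapped)
			remove_migration_ptes(src, src, false);	/* re-establish PTEs */
		if (anon_vma)
			put_anon_vma(anon_vma);
		folio_unlock(src);
		list_move_tail(&src->lru, ret);		/* back to the right list */
	}

	/* Release the no longer needed destination folio */
	static void migrate_folio_undo_dst(struct folio *dst, free_page_t put_new_page,
					   unsigned long private)
	{
		folio_unlock(dst);
		if (put_new_page)
			put_new_page(&dst->page, private);
		else
			folio_put(dst);
	}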

Reviewed-by: Xin Hao <xhao@linux.alibaba.com>

>>               /*
>>                * The rules are:
>>                *    Success: folio will be freed

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 5/9] migrate_pages: batch _unmap and _move
  2023-02-06  6:33 ` [PATCH -v4 5/9] migrate_pages: batch _unmap and _move Huang Ying
  2023-02-06 16:10   ` Zi Yan
@ 2023-02-07 17:33   ` haoxin
  1 sibling, 0 replies; 33+ messages in thread
From: haoxin @ 2023-02-07 17:33 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Hyeonggon Yoo, Zi Yan, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, Minchan Kim, Mike Kravetz

I will do some tests on my arm64 server, but there is currently a problem
with my machine environment, so I will provide the test results later.

Of course, I should combine it with this patch:
https://lore.kernel.org/lkml/20221117082648.47526-1-yangyicong@huawei.com/

On 2023/2/6 at 2:33 PM, Huang Ying wrote:
> In this patch the _unmap and _move stage of the folio migration is
> batched.  That is, previously, it was,
>
>    for each folio
>      _unmap()
>      _move()
>
> Now, it is,
>
>    for each folio
>      _unmap()
>    for each folio
>      _move()
>
> Based on this, we can batch the TLB flushing and use some hardware
> accelerator to copy folios between batched _unmap and batched _move
> stages.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>   mm/migrate.c | 208 +++++++++++++++++++++++++++++++++++++++++++++------
>   1 file changed, 184 insertions(+), 24 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 0428449149f4..fa7212330cb6 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1033,6 +1033,33 @@ static void __migrate_folio_extract(struct folio *dst,
>   	dst->private = NULL;
>   }
>   
> +/* Restore the source folio to the original state upon failure */
> +static void migrate_folio_undo_src(struct folio *src,
> +				   int page_was_mapped,
> +				   struct anon_vma *anon_vma,
> +				   struct list_head *ret)
> +{
> +	if (page_was_mapped)
> +		remove_migration_ptes(src, src, false);
> +	/* Drop an anon_vma reference if we took one */
> +	if (anon_vma)
> +		put_anon_vma(anon_vma);
> +	folio_unlock(src);
> +	list_move_tail(&src->lru, ret);
> +}
> +
> +/* Restore the destination folio to the original state upon failure */
> +static void migrate_folio_undo_dst(struct folio *dst,
> +				   free_page_t put_new_page,
> +				   unsigned long private)
> +{
> +	folio_unlock(dst);
> +	if (put_new_page)
> +		put_new_page(&dst->page, private);
> +	else
> +		folio_put(dst);
> +}
> +
>   /* Cleanup src folio upon migration success */
>   static void migrate_folio_done(struct folio *src,
>   			       enum migrate_reason reason)
> @@ -1052,7 +1079,7 @@ static void migrate_folio_done(struct folio *src,
>   }
>   
>   static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
> -				int force, enum migrate_mode mode)
> +				 int force, bool force_lock, enum migrate_mode mode)
>   {
>   	int rc = -EAGAIN;
>   	int page_was_mapped = 0;
> @@ -1079,6 +1106,17 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>   		if (current->flags & PF_MEMALLOC)
>   			goto out;
>   
> +		/*
> +		 * We have locked some folios, to avoid deadlock, we cannot
> +		 * lock the folio synchronously.  Go out to process (and
> +		 * unlock) all the locked folios.  Then we can lock the folio
> +		 * synchronously.
> +		 */
> +		if (!force_lock) {
> +			rc = -EDEADLOCK;
> +			goto out;
> +		}
> +
>   		folio_lock(src);
>   	}
>   
> @@ -1187,10 +1225,20 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>   	int page_was_mapped = 0;
>   	struct anon_vma *anon_vma = NULL;
>   	bool is_lru = !__PageMovable(&src->page);
> +	struct list_head *prev;
>   
>   	__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
> +	prev = dst->lru.prev;
> +	list_del(&dst->lru);
>   
>   	rc = move_to_new_folio(dst, src, mode);
> +
> +	if (rc == -EAGAIN) {
> +		list_add(&dst->lru, prev);
> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
> +		return rc;
> +	}
> +
>   	if (unlikely(!is_lru))
>   		goto out_unlock_both;
>   
> @@ -1233,7 +1281,7 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>   /* Obtain the lock on page, remove all ptes. */
>   static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>   			       unsigned long private, struct folio *src,
> -			       struct folio **dstp, int force,
> +			       struct folio **dstp, int force, bool force_lock,
>   			       enum migrate_mode mode, enum migrate_reason reason,
>   			       struct list_head *ret)
>   {
> @@ -1261,7 +1309,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>   	*dstp = dst;
>   
>   	dst->private = NULL;
> -	rc = __migrate_folio_unmap(src, dst, force, mode);
> +	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
>   	if (rc == MIGRATEPAGE_UNMAP)
>   		return rc;
>   
> @@ -1270,7 +1318,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>   	 * references and be restored.
>   	 */
>   	/* restore the folio to right list. */
> -	if (rc != -EAGAIN)
> +	if (rc != -EAGAIN && rc != -EDEADLOCK)
>   		list_move_tail(&src->lru, ret);
>   
>   	if (put_new_page)
> @@ -1309,9 +1357,8 @@ static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
>   	 */
>   	if (rc == MIGRATEPAGE_SUCCESS) {
>   		migrate_folio_done(src, reason);
> -	} else {
> -		if (rc != -EAGAIN)
> -			list_add_tail(&src->lru, ret);
> +	} else if (rc != -EAGAIN) {
> +		list_add_tail(&src->lru, ret);
>   
>   		if (put_new_page)
>   			put_new_page(&dst->page, private);
> @@ -1591,7 +1638,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   		enum migrate_mode mode, int reason, struct list_head *ret_folios,
>   		struct migrate_pages_stats *stats)
>   {
> -	int retry = 1;
> +	int retry;
>   	int large_retry = 1;
>   	int thp_retry = 1;
>   	int nr_failed = 0;
> @@ -1600,13 +1647,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   	int pass = 0;
>   	bool is_large = false;
>   	bool is_thp = false;
> -	struct folio *folio, *folio2, *dst = NULL;
> -	int rc, nr_pages;
> +	struct folio *folio, *folio2, *dst = NULL, *dst2;
> +	int rc, rc_saved, nr_pages;
>   	LIST_HEAD(split_folios);
> +	LIST_HEAD(unmap_folios);
> +	LIST_HEAD(dst_folios);
>   	bool nosplit = (reason == MR_NUMA_MISPLACED);
>   	bool no_split_folio_counting = false;
> +	bool force_lock;
>   
> -split_folio_migration:
> +retry:
> +	rc_saved = 0;
> +	force_lock = true;
> +	retry = 1;
>   	for (pass = 0;
>   	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
>   	     pass++) {
> @@ -1628,16 +1681,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   			cond_resched();
>   
>   			rc = migrate_folio_unmap(get_new_page, put_new_page, private,
> -						 folio, &dst, pass > 2, mode,
> -						 reason, ret_folios);
> -			if (rc == MIGRATEPAGE_UNMAP)
> -				rc = migrate_folio_move(put_new_page, private,
> -							folio, dst, mode,
> -							reason, ret_folios);
> +						 folio, &dst, pass > 2, force_lock,
> +						 mode, reason, ret_folios);
>   			/*
>   			 * The rules are:
>   			 *	Success: folio will be freed
> +			 *	Unmap: folio will be put on unmap_folios list,
> +			 *	       dst folio put on dst_folios list
>   			 *	-EAGAIN: stay on the from list
> +			 *	-EDEADLOCK: stay on the from list
>   			 *	-ENOMEM: stay on the from list
>   			 *	-ENOSYS: stay on the from list
>   			 *	Other errno: put on ret_folios list
> @@ -1672,7 +1724,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   			case -ENOMEM:
>   				/*
>   				 * When memory is low, don't bother to try to migrate
> -				 * other folios, just exit.
> +				 * other folios, move unmapped folios, then exit.
>   				 */
>   				if (is_large) {
>   					nr_large_failed++;
> @@ -1711,7 +1763,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   				/* nr_failed isn't updated for not used */
>   				nr_large_failed += large_retry;
>   				stats->nr_thp_failed += thp_retry;
> -				goto out;
> +				rc_saved = rc;
> +				if (list_empty(&unmap_folios))
> +					goto out;
> +				else
> +					goto move;
> +			case -EDEADLOCK:
> +				/*
> +				 * The folio cannot be locked for potential deadlock.
> +				 * Go move (and unlock) all locked folios.  Then we can
> +				 * try again.
> +				 */
> +				rc_saved = rc;
> +				goto move;
>   			case -EAGAIN:
>   				if (is_large) {
>   					large_retry++;
> @@ -1725,6 +1789,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   				stats->nr_succeeded += nr_pages;
>   				stats->nr_thp_succeeded += is_thp;
>   				break;
> +			case MIGRATEPAGE_UNMAP:
> +				/*
> +				 * We have locked some folios, don't force lock
> +				 * to avoid deadlock.
> +				 */
> +				force_lock = false;
> +				list_move_tail(&folio->lru, &unmap_folios);
> +				list_add_tail(&dst->lru, &dst_folios);
> +				break;
>   			default:
>   				/*
>   				 * Permanent failure (-EBUSY, etc.):
> @@ -1748,12 +1821,95 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   	nr_large_failed += large_retry;
>   	stats->nr_thp_failed += thp_retry;
>   	stats->nr_failed_pages += nr_retry_pages;
> +move:
> +	retry = 1;
> +	for (pass = 0;
> +	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
> +	     pass++) {
> +		retry = 0;
> +		large_retry = 0;
> +		thp_retry = 0;
> +		nr_retry_pages = 0;
> +
> +		dst = list_first_entry(&dst_folios, struct folio, lru);
> +		dst2 = list_next_entry(dst, lru);
> +		list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
> +			is_large = folio_test_large(folio);
> +			is_thp = is_large && folio_test_pmd_mappable(folio);
> +			nr_pages = folio_nr_pages(folio);
> +
> +			cond_resched();
> +
> +			rc = migrate_folio_move(put_new_page, private,
> +						folio, dst, mode,
> +						reason, ret_folios);
> +			/*
> +			 * The rules are:
> +			 *	Success: folio will be freed
> +			 *	-EAGAIN: stay on the unmap_folios list
> +			 *	Other errno: put on ret_folios list
> +			 */
> +			switch(rc) {
> +			case -EAGAIN:
> +				if (is_large) {
> +					large_retry++;
> +					thp_retry += is_thp;
> +				} else if (!no_split_folio_counting) {
> +					retry++;
> +				}
> +				nr_retry_pages += nr_pages;
> +				break;
> +			case MIGRATEPAGE_SUCCESS:
> +				stats->nr_succeeded += nr_pages;
> +				stats->nr_thp_succeeded += is_thp;
> +				break;
> +			default:
> +				if (is_large) {
> +					nr_large_failed++;
> +					stats->nr_thp_failed += is_thp;
> +				} else if (!no_split_folio_counting) {
> +					nr_failed++;
> +				}
> +
> +				stats->nr_failed_pages += nr_pages;
> +				break;
> +			}
> +			dst = dst2;
> +			dst2 = list_next_entry(dst, lru);
> +		}
> +	}
> +	nr_failed += retry;
> +	nr_large_failed += large_retry;
> +	stats->nr_thp_failed += thp_retry;
> +	stats->nr_failed_pages += nr_retry_pages;
> +
> +	if (rc_saved)
> +		rc = rc_saved;
> +	else
> +		rc = nr_failed + nr_large_failed;
> +out:
> +	/* Cleanup remaining folios */
> +	dst = list_first_entry(&dst_folios, struct folio, lru);
> +	dst2 = list_next_entry(dst, lru);
> +	list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
> +		int page_was_mapped = 0;
> +		struct anon_vma *anon_vma = NULL;
> +
> +		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
> +		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
> +				       ret_folios);
> +		list_del(&dst->lru);
> +		migrate_folio_undo_dst(dst, put_new_page, private);
> +		dst = dst2;
> +		dst2 = list_next_entry(dst, lru);
> +	}
> +
>   	/*
>   	 * Try to migrate split folios of fail-to-migrate large folios, no
>   	 * nr_failed counting in this round, since all split folios of a
>   	 * large folio is counted as 1 failure in the first round.
>   	 */
> -	if (!list_empty(&split_folios)) {
> +	if (rc >= 0 && !list_empty(&split_folios)) {
>   		/*
>   		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
>   		 * retries) to ret_folios to avoid migrating them again.
> @@ -1761,12 +1917,16 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   		list_splice_init(from, ret_folios);
>   		list_splice_init(&split_folios, from);
>   		no_split_folio_counting = true;
> -		retry = 1;
> -		goto split_folio_migration;
> +		goto retry;
>   	}
>   
> -	rc = nr_failed + nr_large_failed;
> -out:
> +	/*
> +	 * We have unlocked all locked folios, so we can force lock now, let's
> +	 * try again.
> +	 */
> +	if (rc == -EDEADLOCK)
> +		goto retry;
> +
>   	return rc;
>   }
>   

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 8/9] migrate_pages: batch flushing TLB
  2023-02-06  6:33 ` [PATCH -v4 8/9] migrate_pages: batch flushing TLB Huang Ying
  2023-02-07 14:52   ` Zi Yan
@ 2023-02-07 17:44   ` haoxin
  1 sibling, 0 replies; 33+ messages in thread
From: haoxin @ 2023-02-07 17:44 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Zi Yan, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	Minchan Kim, Mike Kravetz, Hyeonggon Yoo


On 2023/2/6 at 2:33 PM, Huang Ying wrote:
> The TLB flushing will cost quite some CPU cycles during the folio
> migration in some situations.  For example, when migrate a folio of a
> process with multiple active threads that run on multiple CPUs.  After
> batching the _unmap and _move in migrate_pages(), the TLB flushing can
> be batched easily with the existing TLB flush batching mechanism.
> This patch implements that.
>
> We use the following test case to test the patch.
>
> On a 2-socket Intel server,
>
> - Run pmbench memory accessing benchmark
>
> - Run `migratepages` to migrate pages of pmbench between node 0 and
>    node 1 back and forth.
>
> With the patch, the TLB flushing IPI reduces 99.1% during the test and
> the number of pages migrated successfully per second increases 291.7%.
>
> NOTE: TLB flushing is batched only for normal folios, not for THP
> folios.  Because the overhead of TLB flushing for THP folios is much
> lower than that for normal folios (about 1/512 on x86 platform).
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>   mm/migrate.c |  4 +++-
>   mm/rmap.c    | 20 +++++++++++++++++---
>   2 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 9378fa2ad4a5..ca6e2ff02a09 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1230,7 +1230,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>   		/* Establish migration ptes */
>   		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>   			       !folio_test_ksm(src) && !anon_vma, src);
> -		try_to_migrate(src, 0);
> +		try_to_migrate(src, TTU_BATCH_FLUSH);
>   		page_was_mapped = 1;
>   	}
>   
> @@ -1781,6 +1781,8 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>   	stats->nr_thp_failed += thp_retry;
>   	stats->nr_failed_pages += nr_retry_pages;
>   move:
> +	try_to_unmap_flush();
> +
>   	retry = 1;
>   	for (pass = 0;
>   	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b616870a09be..2e125f3e462e 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1976,7 +1976,21 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
>   		} else {
>   			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
>   			/* Nuke the page table entry. */
> -			pteval = ptep_clear_flush(vma, address, pvmw.pte);
> +			if (should_defer_flush(mm, flags)) {
> +				/*
> +				 * We clear the PTE but do not flush so potentially
> +				 * a remote CPU could still be writing to the folio.
> +				 * If the entry was previously clean then the
> +				 * architecture must guarantee that a clear->dirty
> +				 * transition on a cached TLB entry is written through
> +				 * and traps if the PTE is unmapped.
> +				 */
> +				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
Nice work, Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
> +
> +				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
> +			} else {
> +				pteval = ptep_clear_flush(vma, address, pvmw.pte);
> +			}
>   		}
>   
>   		/* Set the dirty flag on the folio now the pte is gone. */
> @@ -2148,10 +2162,10 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
>   
>   	/*
>   	 * Migration always ignores mlock and only supports TTU_RMAP_LOCKED and
> -	 * TTU_SPLIT_HUGE_PMD and TTU_SYNC flags.
> +	 * TTU_SPLIT_HUGE_PMD, TTU_SYNC, and TTU_BATCH_FLUSH flags.
>   	 */
>   	if (WARN_ON_ONCE(flags & ~(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD |
> -					TTU_SYNC)))
> +					TTU_SYNC | TTU_BATCH_FLUSH)))
>   		return;
>   
>   	if (folio_is_zone_device(folio) &&

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 0/9] migrate_pages(): batch TLB flushing
  2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
                   ` (8 preceding siblings ...)
  2023-02-06  6:33 ` [PATCH -v4 9/9] migrate_pages: move THP/hugetlb migration support check to simplify code Huang Ying
@ 2023-02-08  6:21 ` haoxin
  2023-02-08  6:27   ` haoxin
  2023-02-08 11:25   ` Huang, Ying
  9 siblings, 2 replies; 33+ messages in thread
From: haoxin @ 2023-02-08  6:21 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Zi Yan, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	Minchan Kim, Mike Kravetz, Hyeonggon Yoo

[-- Attachment #1: Type: text/plain, Size: 4359 bytes --]

On my arm64 server with 128 cores, 2 numa nodes.

I used memhog as the benchmark:

     numactl -m -C 5 memhog -r100000 1G

The test results are as below:

  With this patch:

     #time migratepages 8490 0 1

     real 0m1.161s

     user 0m0.000s

     sys 0m1.161s

without this patch:

     #time migratepages 8460 0 1

     real 0m2.068s

     user 0m0.001s

     sys 0m2.068s

So you can see the migration performance improvement is about *+78%*.

This is the perf record info.

w/o
+   51.07%     0.09%  migratepages  [kernel.kallsyms]  [k] migrate_folio_extra
+   42.43%     0.04%  migratepages  [kernel.kallsyms]  [k] folio_copy
+   42.34%    42.34%  migratepages  [kernel.kallsyms]  [k] __pi_copy_page
+   33.99%     0.09%  migratepages  [kernel.kallsyms]  [k] rmap_walk_anon
+   32.35%     0.04%  migratepages  [kernel.kallsyms]  [k] try_to_migrate
*+   27.78%    27.78%  migratepages  [kernel.kallsyms]  [k] ptep_clear_flush *
+    8.19%     6.64%  migratepages  [kernel.kallsyms]  [k] folio_migrate_flags

w/ this patch
+   18.57%     0.13%  migratepages     [kernel.kallsyms]   [k] migrate_pages
+   18.23%     0.07%  migratepages     [kernel.kallsyms]   [k] migrate_pages_batch
+   16.29%     0.13%  migratepages     [kernel.kallsyms]   [k] migrate_folio_move
+   12.73%     0.10%  migratepages     [kernel.kallsyms]   [k] move_to_new_folio
+   12.52%     0.06%  migratepages     [kernel.kallsyms]   [k] migrate_folio_extra

Therefore, this patchset helps improve the performance of page migration.
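
The profile matches the unmap-side change in patch 8: with the batched
_unmap, try_to_migrate_one() can defer the TLB flush instead of doing a
synchronous ptep_clear_flush() for every PTE, and migrate_pages_batch()
then flushes once for the whole batch.  A condensed sketch of that path
(taken from patch 8 of this series, comments added by me):

	if (should_defer_flush(mm, flags)) {
		/* clear the PTE and queue a deferred, batched flush */
		pteval = ptep_get_and_clear(mm, address, pvmw.pte);
		set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
	} else {
		/* fall back to one synchronous flush per PTE */
		pteval = ptep_clear_flush(vma, address, pvmw.pte);
	}

	/* ... and later, once per batch, in migrate_pages_batch(): */
	try_to_unmap_flush();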


So,  you can add Tested-by: Xin Hao <xhao@linux.alibaba.com>


On 2023/2/6 at 2:33 PM, Huang Ying wrote:
> From: "Huang, Ying"<ying.huang@intel.com>
>
> Now, migrate_pages() migrate folios one by one, like the fake code as
> follows,
>
>    for each folio
>      unmap
>      flush TLB
>      copy
>      restore map
>
> If multiple folios are passed to migrate_pages(), there are
> opportunities to batch the TLB flushing and copying.  That is, we can
> change the code to something as follows,
>
>    for each folio
>      unmap
>    for each folio
>      flush TLB
>    for each folio
>      copy
>    for each folio
>      restore map
>
> The total number of TLB flushing IPI can be reduced considerably.  And
> we may use some hardware accelerator such as DSA to accelerate the
> folio copying.
>
> So in this patch, we refactor the migrate_pages() implementation and
> implement the TLB flushing batching.  Base on this, hardware
> accelerated folio copying can be implemented.
>
> If too many folios are passed to migrate_pages(), in the naive batched
> implementation, we may unmap too many folios at the same time.  The
> possibility for a task to wait for the migrated folios to be mapped
> again increases.  So the latency may be hurt.  To deal with this
> issue, the max number of folios be unmapped in batch is restricted to
> no more than HPAGE_PMD_NR in the unit of page.  That is, the influence
> is at the same level of THP migration.
>
> We use the following test to measure the performance impact of the
> patchset,
>
> On a 2-socket Intel server,
>
>   - Run pmbench memory accessing benchmark
>
>   - Run `migratepages` to migrate pages of pmbench between node 0 and
>     node 1 back and forth.
>
> With the patch, the TLB flushing IPI reduces 99.1% during the test and
> the number of pages migrated successfully per second increases 291.7%.
>
> This patchset is based on v6.2-rc4.
>
> Changes:
>
> v4:
>
> - Fixed another bug about non-LRU folio migration.  Thanks Hyeonggon!
>
> v3:
>
> - Rebased on v6.2-rc4
>
> - Fixed a bug about non-LRU folio migration.  Thanks Mike!
>
> - Fixed some comments.  Thanks Baolin!
>
> - Collected reviewed-by.
>
> v2:
>
> - Rebased on v6.2-rc3
>
> - Fixed type force cast warning.  Thanks Kees!
>
> - Added more comments and cleaned up the code.  Thanks Andrew, Zi, Alistair, Dan!
>
> - Collected reviewed-by.
>
> from rfc to v1:
>
> - Rebased on v6.2-rc1
>
> - Fix the deadlock issue caused by locking multiple pages synchronously
>    per Alistair's comments.  Thanks!
>
> - Fix the autonumabench panic per Rao's comments and fix.  Thanks!
>
> - Other minor fixes per comments. Thanks!
>
> Best Regards,
> Huang, Ying

[-- Attachment #2: Type: text/html, Size: 19511 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 0/9] migrate_pages(): batch TLB flushing
  2023-02-08  6:21 ` [PATCH -v4 0/9] migrate_pages(): batch TLB flushing haoxin
@ 2023-02-08  6:27   ` haoxin
  2023-02-08 11:04     ` Jonathan Cameron
  2023-02-08 11:25   ` Huang, Ying
  1 sibling, 1 reply; 33+ messages in thread
From: haoxin @ 2023-02-08  6:27 UTC (permalink / raw)
  To: Huang Ying, Andrew Morton
  Cc: linux-mm, linux-kernel, Zi Yan, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	Minchan Kim, Mike Kravetz, Hyeonggon Yoo

[-- Attachment #1: Type: text/plain, Size: 4644 bytes --]


On 2023/2/8 at 2:21 PM, haoxin wrote:
>
> On my arm64 server with 128 cores, 2 numa nodes.
>
> I used memhog as benchmark :
>
>     numactl -m -C 5 memhog -r100000 1G
>
     A correction: the command should be `numactl -m 0 -C 5 memhog -r100000 1G`

> The test result as below:
>
>  With this patch:
>
>     #time migratepages 8490 0 1
>
>     real 0m1.161s
>
>     user 0m0.000s
>
>     sys 0m1.161s
>
> without this patch:
>
>     #time migratepages 8460 0 1
>
>     real 0m2.068s
>
>     user 0m0.001s
>
>     sys 0m2.068s
>
> So you can see the migration performance improvement about *+78%*
>
> This is the perf record info.
>
> w/o
> +   51.07%     0.09%  migratepages  [kernel.kallsyms]  [k] 
> migrate_folio_extra
> +   42.43%     0.04%  migratepages  [kernel.kallsyms]  [k] folio_copy
> +   42.34%    42.34%  migratepages  [kernel.kallsyms]  [k] __pi_copy_page
> +   33.99%     0.09%  migratepages  [kernel.kallsyms]  [k] rmap_walk_anon
> +   32.35%     0.04%  migratepages  [kernel.kallsyms]  [k] try_to_migrate
> *+   27.78%    27.78%  migratepages  [kernel.kallsyms]  [k] 
> ptep_clear_flush *
> +    8.19%     6.64%  migratepages  [kernel.kallsyms]  [k] 
> folio_migrate_flagsmigrati_tlb_flush
>
> w/ this patch
> +   18.57%     0.13%  migratepages     [kernel.kallsyms]   [k] 
> migrate_pages
> +   18.23%     0.07%  migratepages     [kernel.kallsyms]   [k] 
> migrate_pages_batch
> +   16.29%     0.13%  migratepages     [kernel.kallsyms]   [k] 
> migrate_folio_move
> +   12.73%     0.10%  migratepages     [kernel.kallsyms]   [k] 
> move_to_new_folio
> +   12.52%     0.06%  migratepages     [kernel.kallsyms]   [k] 
> migrate_folio_extra
>
> Therefore, this patch helps improve performance in page migration
>
>
> So,  you can add Tested-by: Xin Hao <xhao@linux.alibaba.com>
>
>
> On 2023/2/6 at 2:33 PM, Huang Ying wrote:
>> From: "Huang, Ying"<ying.huang@intel.com>
>>
>> Now, migrate_pages() migrate folios one by one, like the fake code as
>> follows,
>>
>>    for each folio
>>      unmap
>>      flush TLB
>>      copy
>>      restore map
>>
>> If multiple folios are passed to migrate_pages(), there are
>> opportunities to batch the TLB flushing and copying.  That is, we can
>> change the code to something as follows,
>>
>>    for each folio
>>      unmap
>>    for each folio
>>      flush TLB
>>    for each folio
>>      copy
>>    for each folio
>>      restore map
>>
>> The total number of TLB flushing IPI can be reduced considerably.  And
>> we may use some hardware accelerator such as DSA to accelerate the
>> folio copying.
>>
>> So in this patch, we refactor the migrate_pages() implementation and
>> implement the TLB flushing batching.  Base on this, hardware
>> accelerated folio copying can be implemented.
>>
>> If too many folios are passed to migrate_pages(), in the naive batched
>> implementation, we may unmap too many folios at the same time.  The
>> possibility for a task to wait for the migrated folios to be mapped
>> again increases.  So the latency may be hurt.  To deal with this
>> issue, the max number of folios be unmapped in batch is restricted to
>> no more than HPAGE_PMD_NR in the unit of page.  That is, the influence
>> is at the same level of THP migration.
>>
>> We use the following test to measure the performance impact of the
>> patchset,
>>
>> On a 2-socket Intel server,
>>
>>   - Run pmbench memory accessing benchmark
>>
>>   - Run `migratepages` to migrate pages of pmbench between node 0 and
>>     node 1 back and forth.
>>
>> With the patch, the TLB flushing IPI reduces 99.1% during the test and
>> the number of pages migrated successfully per second increases 291.7%.
>>
>> This patchset is based on v6.2-rc4.
>>
>> Changes:
>>
>> v4:
>>
>> - Fixed another bug about non-LRU folio migration.  Thanks Hyeonggon!
>>
>> v3:
>>
>> - Rebased on v6.2-rc4
>>
>> - Fixed a bug about non-LRU folio migration.  Thanks Mike!
>>
>> - Fixed some comments.  Thanks Baolin!
>>
>> - Collected reviewed-by.
>>
>> v2:
>>
>> - Rebased on v6.2-rc3
>>
>> - Fixed type force cast warning.  Thanks Kees!
>>
>> - Added more comments and cleaned up the code.  Thanks Andrew, Zi, Alistair, Dan!
>>
>> - Collected reviewed-by.
>>
>> from rfc to v1:
>>
>> - Rebased on v6.2-rc1
>>
>> - Fix the deadlock issue caused by locking multiple pages synchronously
>>    per Alistair's comments.  Thanks!
>>
>> - Fix the autonumabench panic per Rao's comments and fix.  Thanks!
>>
>> - Other minor fixes per comments. Thanks!
>>
>> Best Regards,
>> Huang, Ying

[-- Attachment #2: Type: text/html, Size: 7147 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 0/9] migrate_pages(): batch TLB flushing
  2023-02-08  6:27   ` haoxin
@ 2023-02-08 11:04     ` Jonathan Cameron
  0 siblings, 0 replies; 33+ messages in thread
From: Jonathan Cameron @ 2023-02-08 11:04 UTC (permalink / raw)
  To: haoxin
  Cc: Huang Ying, Andrew Morton, linux-mm, linux-kernel, Zi Yan,
	Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox,
	Bharata B Rao, Alistair Popple, Minchan Kim, Mike Kravetz,
	Hyeonggon Yoo, yangyicong

On Wed, 8 Feb 2023 14:27:32 +0800
haoxin <xhao@linux.alibaba.com> wrote:

> On 2023/2/8 at 2:21 PM, haoxin wrote:
> >
> > On my arm64 server with 128 cores, 2 numa nodes.
> >
> > I used memhog as benchmark :
> >
> >     numactl -m -C 5 memhog -r100000 1G
> >  
>      Do a fix, numactl -m 0 -C 5 memhog -r100000 1G

Nice results - thanks for sharing.

Just to confirm, is this with the arm64 TLB batching patch set applied?
https://lore.kernel.org/linux-arm-kernel/20221117082648.47526-1-yangyicong@huawei.com/

I think that still hasn't been applied upstream.

Jonathan

> 
> > The test result as below:
> >
> >  With this patch:
> >
> >     #time migratepages 8490 0 1
> >
> >     real 0m1.161s
> >
> >     user 0m0.000s
> >
> >     sys 0m1.161s
> >
> > without this patch:
> >
> >     #time migratepages 8460 0 1
> >
> >     real 0m2.068s
> >
> >     user 0m0.001s
> >
> >     sys 0m2.068s
> >
> > So you can see the migration performance improvement about *+78%*
> >
> > This is the perf record info.
> >
> > w/o
> > +   51.07%     0.09%  migratepages  [kernel.kallsyms]  [k] 
> > migrate_folio_extra
> > +   42.43%     0.04%  migratepages  [kernel.kallsyms]  [k] folio_copy
> > +   42.34%    42.34%  migratepages  [kernel.kallsyms]  [k] __pi_copy_page
> > +   33.99%     0.09%  migratepages  [kernel.kallsyms]  [k] rmap_walk_anon
> > +   32.35%     0.04%  migratepages  [kernel.kallsyms]  [k] try_to_migrate
> > *+   27.78%    27.78%  migratepages  [kernel.kallsyms]  [k] 
> > ptep_clear_flush *
> > +    8.19%     6.64%  migratepages  [kernel.kallsyms]  [k] 
> > folio_migrate_flagsmigrati_tlb_flush
> >
> > w/ this patch
> > +   18.57%     0.13%  migratepages     [kernel.kallsyms]   [k] 
> > migrate_pages
> > +   18.23%     0.07%  migratepages     [kernel.kallsyms]   [k] 
> > migrate_pages_batch
> > +   16.29%     0.13%  migratepages     [kernel.kallsyms]   [k] 
> > migrate_folio_move
> > +   12.73%     0.10%  migratepages     [kernel.kallsyms]   [k] 
> > move_to_new_folio
> > +   12.52%     0.06%  migratepages     [kernel.kallsyms]   [k] 
> > migrate_folio_extra
> >
> > Therefore, this patch helps improve performance in page migration
> >
> >
> > So,  you can add Tested-by: Xin Hao <xhao@linux.alibaba.com>
> >
> >
> > On 2023/2/6 at 2:33 PM, Huang Ying wrote:
> >> From: "Huang, Ying"<ying.huang@intel.com>
> >>
> >> Now, migrate_pages() migrate folios one by one, like the fake code as
> >> follows,
> >>
> >>    for each folio
> >>      unmap
> >>      flush TLB
> >>      copy
> >>      restore map
> >>
> >> If multiple folios are passed to migrate_pages(), there are
> >> opportunities to batch the TLB flushing and copying.  That is, we can
> >> change the code to something as follows,
> >>
> >>    for each folio
> >>      unmap
> >>    for each folio
> >>      flush TLB
> >>    for each folio
> >>      copy
> >>    for each folio
> >>      restore map
> >>
> >> The total number of TLB flushing IPI can be reduced considerably.  And
> >> we may use some hardware accelerator such as DSA to accelerate the
> >> folio copying.
> >>
> >> So in this patch, we refactor the migrate_pages() implementation and
> >> implement the TLB flushing batching.  Base on this, hardware
> >> accelerated folio copying can be implemented.
> >>
> >> If too many folios are passed to migrate_pages(), in the naive batched
> >> implementation, we may unmap too many folios at the same time.  The
> >> possibility for a task to wait for the migrated folios to be mapped
> >> again increases.  So the latency may be hurt.  To deal with this
> >> issue, the max number of folios be unmapped in batch is restricted to
> >> no more than HPAGE_PMD_NR in the unit of page.  That is, the influence
> >> is at the same level of THP migration.
> >>
> >> We use the following test to measure the performance impact of the
> >> patchset,
> >>
> >> On a 2-socket Intel server,
> >>
> >>   - Run pmbench memory accessing benchmark
> >>
> >>   - Run `migratepages` to migrate pages of pmbench between node 0 and
> >>     node 1 back and forth.
> >>
> >> With the patch, the TLB flushing IPI reduces 99.1% during the test and
> >> the number of pages migrated successfully per second increases 291.7%.
> >>
> >> This patchset is based on v6.2-rc4.
> >>
> >> Changes:
> >>
> >> v4:
> >>
> >> - Fixed another bug about non-LRU folio migration.  Thanks Hyeonggon!
> >>
> >> v3:
> >>
> >> - Rebased on v6.2-rc4
> >>
> >> - Fixed a bug about non-LRU folio migration.  Thanks Mike!
> >>
> >> - Fixed some comments.  Thanks Baolin!
> >>
> >> - Collected reviewed-by.
> >>
> >> v2:
> >>
> >> - Rebased on v6.2-rc3
> >>
> >> - Fixed type force cast warning.  Thanks Kees!
> >>
> >> - Added more comments and cleaned up the code.  Thanks Andrew, Zi, Alistair, Dan!
> >>
> >> - Collected reviewed-by.
> >>
> >> from rfc to v1:
> >>
> >> - Rebased on v6.2-rc1
> >>
> >> - Fix the deadlock issue caused by locking multiple pages synchronously
> >>    per Alistair's comments.  Thanks!
> >>
> >> - Fix the autonumabench panic per Rao's comments and fix.  Thanks!
> >>
> >> - Other minor fixes per comments. Thanks!
> >>
> >> Best Regards,
> >> Huang, Ying  


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 0/9] migrate_pages(): batch TLB flushing
  2023-02-08  6:21 ` [PATCH -v4 0/9] migrate_pages(): batch TLB flushing haoxin
  2023-02-08  6:27   ` haoxin
@ 2023-02-08 11:25   ` Huang, Ying
  1 sibling, 0 replies; 33+ messages in thread
From: Huang, Ying @ 2023-02-08 11:25 UTC (permalink / raw)
  To: haoxin
  Cc: Andrew Morton, linux-mm, linux-kernel, Zi Yan, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=ascii, Size: 4373 bytes --]

haoxin <xhao@linux.alibaba.com> writes:

> On my arm64 server with 128 cores, 2 numa nodes.
>
> I used memhog as benchmark :
>
>   numactl -m -C 5 memhog -r100000 1G
>
> The test result as below:
>
> With this patch:
>
>   #time migratepages 8490 0 1
>
>   real 0m1.161s
>
>   user 0m0.000s
>
>   sys 0m1.161s
>
> without this patch:
>
>   #time migratepages 8460 0 1
>
>   real 0m2.068s
>
>   user 0m0.001s
>
>   sys 0m2.068s
>
> So you can see the migration performance improvement about *+78%*
>
> This is the perf record info.
>
> w/o
> +  51.07%   0.09% migratepages
> [kernel.kallsyms] [k] migrate_folio_extra
> +  42.43%   0.04% migratepages [kernel.kallsyms] [k] folio_copy
> +  42.34%  42.34% migratepages [kernel.kallsyms] [k] __pi_copy_page
> +  33.99%   0.09% migratepages [kernel.kallsyms] [k] rmap_walk_anon
> +  32.35%   0.04% migratepages [kernel.kallsyms] [k] try_to_migrate
> *+  27.78%  27.78% migratepages
>  [kernel.kallsyms] [k] ptep_clear_flush *
> +  8.19%   6.64% migratepages
> [kernel.kallsyms] [k] folio_migrate_flagsmigrati_tlb_flush
>
> w/ this patch
> +  18.57%   0.13%
> migratepages   [kernel.kallsyms]  [k]
> migrate_pages
> +  18.23%   0.07%
> migratepages   [kernel.kallsyms]  [k]
> migrate_pages_batch
> +  16.29%   0.13%
> migratepages   [kernel.kallsyms]  [k]
> migrate_folio_move
> +  12.73%   0.10%
> migratepages   [kernel.kallsyms]  [k]
> move_to_new_folio
> +  12.52%   0.06%
> migratepages   [kernel.kallsyms]  [k]
> migrate_folio_extra
>
> Therefore, this patch helps improve performance in page migration
>
>
> So, you can add Tested-by: Xin Hao <xhao@linux.alibaba.com>

Thank you very much!

Best Regards,
Huang, Ying

>
> On 2023/2/6 at 2:33 PM, Huang Ying wrote:
>> From: "Huang, Ying"<ying.huang@intel.com>
>>
>> Now, migrate_pages() migrate folios one by one, like the fake code as
>> follows,
>>
>>    for each folio
>>      unmap
>>      flush TLB
>>      copy
>>      restore map
>>
>> If multiple folios are passed to migrate_pages(), there are
>> opportunities to batch the TLB flushing and copying.  That is, we can
>> change the code to something as follows,
>>
>>    for each folio
>>      unmap
>>    for each folio
>>      flush TLB
>>    for each folio
>>      copy
>>    for each folio
>>      restore map
>>
>> The total number of TLB flushing IPI can be reduced considerably.  And
>> we may use some hardware accelerator such as DSA to accelerate the
>> folio copying.
>>
>> So in this patch, we refactor the migrate_pages() implementation and
>> implement the TLB flushing batching.  Base on this, hardware
>> accelerated folio copying can be implemented.
>>
>> If too many folios are passed to migrate_pages(), in the naive batched
>> implementation, we may unmap too many folios at the same time.  The
>> possibility for a task to wait for the migrated folios to be mapped
>> again increases.  So the latency may be hurt.  To deal with this
>> issue, the max number of folios be unmapped in batch is restricted to
>> no more than HPAGE_PMD_NR in the unit of page.  That is, the influence
>> is at the same level of THP migration.
>>
>> We use the following test to measure the performance impact of the
>> patchset,
>>
>> On a 2-socket Intel server,
>>
>>   - Run pmbench memory accessing benchmark
>>
>>   - Run `migratepages` to migrate pages of pmbench between node 0 and
>>     node 1 back and forth.
>>
>> With the patch, the TLB flushing IPI reduces 99.1% during the test and
>> the number of pages migrated successfully per second increases 291.7%.
>>
>> This patchset is based on v6.2-rc4.
>>
>> Changes:
>>
>> v4:
>>
>> - Fixed another bug about non-LRU folio migration.  Thanks Hyeonggon!
>>
>> v3:
>>
>> - Rebased on v6.2-rc4
>>
>> - Fixed a bug about non-LRU folio migration.  Thanks Mike!
>>
>> - Fixed some comments.  Thanks Baolin!
>>
>> - Collected reviewed-by.
>>
>> v2:
>>
>> - Rebased on v6.2-rc3
>>
>> - Fixed type force cast warning.  Thanks Kees!
>>
>> - Added more comments and cleaned up the code.  Thanks Andrew, Zi, Alistair, Dan!
>>
>> - Collected reviewed-by.
>>
>> from rfc to v1:
>>
>> - Rebased on v6.2-rc1
>>
>> - Fix the deadlock issue caused by locking multiple pages synchronously
>>    per Alistair's comments.  Thanks!
>>
>> - Fix the autonumabench panic per Rao's comments and fix.  Thanks!
>>
>> - Other minor fixes per comments. Thanks!
>>
>> Best Regards,
>> Huang, Ying

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 8/9] migrate_pages: batch flushing TLB
  2023-02-07 14:52   ` Zi Yan
@ 2023-02-08 11:27     ` Huang, Ying
  0 siblings, 0 replies; 33+ messages in thread
From: Huang, Ying @ 2023-02-08 11:27 UTC (permalink / raw)
  To: Zi Yan
  Cc: Andrew Morton, linux-mm, linux-kernel, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

Zi Yan <ziy@nvidia.com> writes:

> On 6 Feb 2023, at 1:33, Huang Ying wrote:
>
>> The TLB flushing will cost quite some CPU cycles during the folio
>> migration in some situations.  For example, when migrate a folio of a
>> process with multiple active threads that run on multiple CPUs.  After
>> batching the _unmap and _move in migrate_pages(), the TLB flushing can
>> be batched easily with the existing TLB flush batching mechanism.
>> This patch implements that.
>>
>> We use the following test case to test the patch.
>>
>> On a 2-socket Intel server,
>>
>> - Run pmbench memory accessing benchmark
>>
>> - Run `migratepages` to migrate pages of pmbench between node 0 and
>>   node 1 back and forth.
>>
>> With the patch, the TLB flushing IPI reduces 99.1% during the test and
>> the number of pages migrated successfully per second increases 291.7%.
>>
>> NOTE: TLB flushing is batched only for normal folios, not for THP
>> folios.  Because the overhead of TLB flushing for THP folios is much
>> lower than that for normal folios (about 1/512 on x86 platform).
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Bharata B Rao <bharata@amd.com>
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: haoxin <xhao@linux.alibaba.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>> ---
>>  mm/migrate.c |  4 +++-
>>  mm/rmap.c    | 20 +++++++++++++++++---
>>  2 files changed, 20 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 9378fa2ad4a5..ca6e2ff02a09 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1230,7 +1230,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>>  		/* Establish migration ptes */
>>  		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>>  			       !folio_test_ksm(src) && !anon_vma, src);
>> -		try_to_migrate(src, 0);
>> +		try_to_migrate(src, TTU_BATCH_FLUSH);
>>  		page_was_mapped = 1;
>>  	}
>>
>> @@ -1781,6 +1781,8 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  	stats->nr_thp_failed += thp_retry;
>>  	stats->nr_failed_pages += nr_retry_pages;
>>  move:
>
> Maybe a comment:
> /* Flush TLBs for all the unmapped pages */

OK.  Will do that in the next version.
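
For reference, with that comment in place the batched flow is roughly the
following sketch (simplified pseudo-C, not the exact code; try_to_migrate(),
set_tlb_ubc_flush_pending() and try_to_unmap_flush() are the real helpers
touched by this patch, the loop bodies are shorthand):

	/*
	 * Unmap pass: with TTU_BATCH_FLUSH, try_to_migrate_one() clears
	 * the PTEs without flushing and only records a pending deferred
	 * flush via set_tlb_ubc_flush_pending().
	 */
	list_for_each_entry_safe(folio, folio2, from, lru)
		rc = migrate_folio_unmap(...);	/* try_to_migrate(src, TTU_BATCH_FLUSH) */
move:
	/* Flush TLBs for all the unmapped pages */
	try_to_unmap_flush();

	/* Move pass: copy the folios and restore the migration PTEs. */
	list_for_each_entry_safe(folio, folio2, &unmap_folios, lru)
		rc = migrate_folio_move(...);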

Best Regards,
Huang, Ying

>> +	try_to_unmap_flush();
>> +
>>  	retry = 1;
>>  	for (pass = 0;
>>  	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index b616870a09be..2e125f3e462e 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1976,7 +1976,21 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
>>  		} else {
>>  			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
>>  			/* Nuke the page table entry. */
>> -			pteval = ptep_clear_flush(vma, address, pvmw.pte);
>> +			if (should_defer_flush(mm, flags)) {
>> +				/*
>> +				 * We clear the PTE but do not flush so potentially
>> +				 * a remote CPU could still be writing to the folio.
>> +				 * If the entry was previously clean then the
>> +				 * architecture must guarantee that a clear->dirty
>> +				 * transition on a cached TLB entry is written through
>> +				 * and traps if the PTE is unmapped.
>> +				 */
>> +				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
>> +
>> +				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
>> +			} else {
>> +				pteval = ptep_clear_flush(vma, address, pvmw.pte);
>> +			}
>>  		}
>>
>>  		/* Set the dirty flag on the folio now the pte is gone. */
>> @@ -2148,10 +2162,10 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
>>
>>  	/*
>>  	 * Migration always ignores mlock and only supports TTU_RMAP_LOCKED and
>> -	 * TTU_SPLIT_HUGE_PMD and TTU_SYNC flags.
>> +	 * TTU_SPLIT_HUGE_PMD, TTU_SYNC, and TTU_BATCH_FLUSH flags.
>>  	 */
>>  	if (WARN_ON_ONCE(flags & ~(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD |
>> -					TTU_SYNC)))
>> +					TTU_SYNC | TTU_BATCH_FLUSH)))
>>  		return;
>>
>>  	if (folio_is_zone_device(folio) &&
>> -- 
>> 2.35.1
>
> Everything else looks good to me. Reviewed-by: Zi Yan <ziy@nvidia.com>
>
> --
> Best Regards,
> Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 2/9] migrate_pages: separate hugetlb folios migration
  2023-02-07 16:42   ` haoxin
@ 2023-02-08 11:35     ` Huang, Ying
  0 siblings, 0 replies; 33+ messages in thread
From: Huang, Ying @ 2023-02-08 11:35 UTC (permalink / raw)
  To: haoxin
  Cc: Andrew Morton, linux-mm, linux-kernel, Baolin Wang, Zi Yan,
	Yang Shi, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

haoxin <xhao@linux.alibaba.com> writes:

> On 2023/2/6 2:33 PM, Huang Ying wrote:
>> This is a preparation patch to batch the folio unmapping and moving
>> for the non-hugetlb folios.  Based on that we can batch the TLB
>> shootdown during the folio migration and make it possible to use some
>> hardware accelerator for the folio copying.
>>
>> In this patch the hugetlb folios and non-hugetlb folios migration is
>> separated in migrate_pages() to make it easy to change the non-hugetlb
>> folios migration implementation.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Bharata B Rao <bharata@amd.com>
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: haoxin <xhao@linux.alibaba.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>> ---
>>   mm/migrate.c | 141 +++++++++++++++++++++++++++++++++++++++++++--------
>>   1 file changed, 119 insertions(+), 22 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index ef388a9e4747..be7f37523463 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1396,6 +1396,8 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
>>   	return rc;
>>   }
>>   +#define NR_MAX_MIGRATE_PAGES_RETRY	10
>> +
>>   struct migrate_pages_stats {
>>   	int nr_succeeded;	/* Normal and large folios migrated successfully, in
>>   				   units of base pages */
>> @@ -1406,6 +1408,95 @@ struct migrate_pages_stats {
>>   	int nr_thp_split;	/* THP split before migrating */
>>   };
>>   +/*
>> + * Returns the number of hugetlb folios that were not migrated, or an error code
>> + * after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no hugetlb folios are movable
>> + * any more because the list has become empty or no retryable hugetlb folios
>> + * exist any more. It is caller's responsibility to call putback_movable_pages()
>> + * only if ret != 0.
>> + */
>> +static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
>> +			    free_page_t put_new_page, unsigned long private,
>> +			    enum migrate_mode mode, int reason,
>> +			    struct migrate_pages_stats *stats,
>> +			    struct list_head *ret_folios)
>> +{
>> +	int retry = 1;
>> +	int nr_failed = 0;
>> +	int nr_retry_pages = 0;
>> +	int pass = 0;
>> +	struct folio *folio, *folio2;
>> +	int rc, nr_pages;
>> +
>> +	for (pass = 0; pass < NR_MAX_MIGRATE_PAGES_RETRY && retry; pass++) {
>> +		retry = 0;
>> +		nr_retry_pages = 0;
>> +
>> +		list_for_each_entry_safe(folio, folio2, from, lru) {
>> +			if (!folio_test_hugetlb(folio))
>> +				continue;
>> +
>> +			nr_pages = folio_nr_pages(folio);
>> +
>> +			cond_resched();
> Just curious, why put cond_resched() here? It makes
> "nr_pages = folio_nr_pages(folio)" look separated from the code
> below.

This is the original behavior.  Per my understanding, this can reduce
the scheduling latency during page migration.
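
That is, the walk keeps one voluntary preemption point per folio, roughly
(illustrative sketch only; the migration call and error handling are elided):

	list_for_each_entry_safe(folio, folio2, from, lru) {
		if (!folio_test_hugetlb(folio))
			continue;

		nr_pages = folio_nr_pages(folio);

		cond_resched();	/* let the scheduler run between folios */

		rc = unmap_and_move_huge_page(...);
		/* ... rc handling as in the patch ... */
	}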

>> +
>> +			rc = unmap_and_move_huge_page(get_new_page,
>> +						      put_new_page, private,
>> +						      &folio->page, pass > 2, mode,
>> +						      reason, ret_folios);
>> +			/*
>> +			 * The rules are:
>> +			 *	Success: hugetlb folio will be put back
>> +			 *	-EAGAIN: stay on the from list
>> +			 *	-ENOMEM: stay on the from list
>> +			 *	-ENOSYS: stay on the from list
>> +			 *	Other errno: put on ret_folios list
>> +			 */
>> +			switch(rc) {
>> +			case -ENOSYS:
>> +				/* Hugetlb migration is unsupported */
>> +				nr_failed++;
>> +				stats->nr_failed_pages += nr_pages;
>> +				list_move_tail(&folio->lru, ret_folios);
>> +				break;
>> +			case -ENOMEM:
>> +				/*
>> +				 * When memory is low, don't bother to try to migrate
>> +				 * other folios, just exit.
>> +				 */
>> +				stats->nr_failed_pages += nr_pages + nr_retry_pages;
>> +				return -ENOMEM;
>> +			case -EAGAIN:
>> +				retry++;
>> +				nr_retry_pages += nr_pages;
>> +				break;
>> +			case MIGRATEPAGE_SUCCESS:
>> +				stats->nr_succeeded += nr_pages;
>> +				break;
>> +			default:
>> +				/*
>> +				 * Permanent failure (-EBUSY, etc.):
>> +				 * unlike -EAGAIN case, the failed folio is
>> +				 * removed from migration folio list and not
>> +				 * retried in the next outer loop.
>> +				 */
>> +				nr_failed++;
>> +				stats->nr_failed_pages += nr_pages;
>> +				break;
>> +			}
>> +		}
>> +	}
>> +	/*
>> +	 * nr_failed is number of hugetlb folios failed to be migrated.  After
>> +	 * NR_MAX_MIGRATE_PAGES_RETRY attempts, give up and count retried hugetlb
>> +	 * folios as failed.
>> +	 */
>> +	nr_failed += retry;
>> +	stats->nr_failed_pages += nr_retry_pages;
>> +
>> +	return nr_failed;
>> +}
>> +
>>   /*
>>    * migrate_pages - migrate the folios specified in a list, to the free folios
>>    *		   supplied as the target for the page migration
>> @@ -1422,10 +1513,10 @@ struct migrate_pages_stats {
>>    * @ret_succeeded:	Set to the number of folios migrated successfully if
>>    *			the caller passes a non-NULL pointer.
>>    *
>> - * The function returns after 10 attempts or if no folios are movable any more
>> - * because the list has become empty or no retryable folios exist any more.
>> - * It is caller's responsibility to call putback_movable_pages() to return folios
>> - * to the LRU or free list only if ret != 0.
>> + * The function returns after NR_MAX_MIGRATE_PAGES_RETRY attempts or if no folios
>> + * are movable any more because the list has become empty or no retryable folios
>> + * exist any more. It is caller's responsibility to call putback_movable_pages()
>> + * only if ret != 0.
>>    *
>>    * Returns the number of {normal folio, large folio, hugetlb} that were not
>>    * migrated, or an error code. The number of large folio splits will be
>> @@ -1439,7 +1530,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>   	int retry = 1;
>>   	int large_retry = 1;
>>   	int thp_retry = 1;
>> -	int nr_failed = 0;
>> +	int nr_failed;
>>   	int nr_retry_pages = 0;
>>   	int nr_large_failed = 0;
>>   	int pass = 0;
>> @@ -1456,38 +1547,45 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>   	trace_mm_migrate_pages_start(mode, reason);
>>     	memset(&stats, 0, sizeof(stats));
>> +	rc = migrate_hugetlbs(from, get_new_page, put_new_page, private, mode, reason,
>> +			      &stats, &ret_folios);
>> +	if (rc < 0)
>> +		goto out;
>> +	nr_failed = rc;
>> +
>>   split_folio_migration:
>> -	for (pass = 0; pass < 10 && (retry || large_retry); pass++) {
>> +	for (pass = 0;
>> +	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
>> +	     pass++) {
>>   		retry = 0;
>>   		large_retry = 0;
>>   		thp_retry = 0;
>>   		nr_retry_pages = 0;
>>     		list_for_each_entry_safe(folio, folio2, from, lru) {
>> +			/* Retried hugetlb folios will be kept in list  */
>> +			if (folio_test_hugetlb(folio)) {
>> +				list_move_tail(&folio->lru, &ret_folios);
>> +				continue;
>> +			}
>> +
>>   			/*
>>   			 * Large folio statistics is based on the source large
>>   			 * folio. Capture required information that might get
>>   			 * lost during migration.
>>   			 */
>> -			is_large = folio_test_large(folio) && !folio_test_hugetlb(folio);
>> +			is_large = folio_test_large(folio);
>>   			is_thp = is_large && folio_test_pmd_mappable(folio);
>>   			nr_pages = folio_nr_pages(folio);
>> +
>>   			cond_resched();
>>   -			if (folio_test_hugetlb(folio))
>> -				rc = unmap_and_move_huge_page(get_new_page,
>> -						put_new_page, private,
>> -						&folio->page, pass > 2, mode,
>> -						reason,
>> -						&ret_folios);
>> -			else
>> -				rc = unmap_and_move(get_new_page, put_new_page,
>> -						private, folio, pass > 2, mode,
>> -						reason, &ret_folios);
>> +			rc = unmap_and_move(get_new_page, put_new_page,
>> +					    private, folio, pass > 2, mode,
>> +					    reason, &ret_folios);
>>   			/*
>>   			 * The rules are:
>> -			 *	Success: non hugetlb folio will be freed, hugetlb
>> -			 *		 folio will be put back
>> +			 *	Success: folio will be freed
>>   			 *	-EAGAIN: stay on the from list
>>   			 *	-ENOMEM: stay on the from list
>>   			 *	-ENOSYS: stay on the from list
>> @@ -1514,7 +1612,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>   						stats.nr_thp_split += is_thp;
>>   						break;
>>   					}
>> -				/* Hugetlb migration is unsupported */
>>   				} else if (!no_split_folio_counting) {
>>   					nr_failed++;
>>   				}
>> @@ -1608,8 +1705,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>   	 */
>>   	if (!list_empty(&split_folios)) {
>>   		/*
>> -		 * Move non-migrated folios (after 10 retries) to ret_folios
>> -		 * to avoid migrating them again.
>> +		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
>> +		 * retries) to ret_folios to avoid migrating them again.
>>   		 */
>>   		list_splice_init(from, &ret_folios);
>>   		list_splice_init(&split_folios, from);
> Reviewed-by: Xin Hao <xhao@linux.alibaba.com>

Thanks!

Best Regards,
Huang, Ying

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move
  2023-02-07 14:50   ` Zi Yan
@ 2023-02-08 12:02     ` Huang, Ying
  2023-02-08 19:47       ` Zi Yan
  0 siblings, 1 reply; 33+ messages in thread
From: Huang, Ying @ 2023-02-08 12:02 UTC (permalink / raw)
  To: Zi Yan
  Cc: Andrew Morton, linux-mm, linux-kernel, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

Zi Yan <ziy@nvidia.com> writes:

> On 6 Feb 2023, at 1:33, Huang Ying wrote:
>
>> This is a code cleanup patch to reduce the duplicated code between the
>> _unmap and _move stages of migrate_pages().  No functionality change
>> is expected.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Bharata B Rao <bharata@amd.com>
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: haoxin <xhao@linux.alibaba.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>> ---
>>  mm/migrate.c | 203 ++++++++++++++++++++-------------------------------
>>  1 file changed, 81 insertions(+), 122 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 23eb01cfae4c..9378fa2ad4a5 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1037,6 +1037,7 @@ static void __migrate_folio_extract(struct folio *dst,
>>  static void migrate_folio_undo_src(struct folio *src,
>>  				   int page_was_mapped,
>>  				   struct anon_vma *anon_vma,
>> +				   bool locked,
>>  				   struct list_head *ret)
>>  {
>>  	if (page_was_mapped)
>> @@ -1044,16 +1045,20 @@ static void migrate_folio_undo_src(struct folio *src,
>>  	/* Drop an anon_vma reference if we took one */
>>  	if (anon_vma)
>>  		put_anon_vma(anon_vma);
>> -	folio_unlock(src);
>> -	list_move_tail(&src->lru, ret);
>> +	if (locked)
>> +		folio_unlock(src);
>
> Having a comment would be better.
> /* A page that has not been migrated, move it to a list for later restoration */

Emm... the page state has been restored in the previous operations of
the function.  This is the last step and the page will be moved to
"return" list, then the caller of migrate_pages() will call
putback_movable_pages().
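
As for the caller side, the contract is roughly the usual pattern (sketch
only; mtc and reason stand in for whatever the particular caller uses):

	LIST_HEAD(pagelist);
	/* ... isolate the folios to be migrated onto pagelist ... */
	ret = migrate_pages(&pagelist, alloc_migration_target, NULL,
			    (unsigned long)&mtc, MIGRATE_SYNC, reason, NULL);
	if (ret)
		putback_movable_pages(&pagelist);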

We have some comments for the function (migrate_folio_undo_src()) as
follows,

/* Restore the source folio to the original state upon failure */

>> +	if (ret)
>> +		list_move_tail(&src->lru, ret);
>>  }
>>
>>  /* Restore the destination folio to the original state upon failure */
>>  static void migrate_folio_undo_dst(struct folio *dst,
>> +				   bool locked,
>>  				   free_page_t put_new_page,
>>  				   unsigned long private)
>>  {
>> -	folio_unlock(dst);
>> +	if (locked)
>> +		folio_unlock(dst);
>>  	if (put_new_page)
>>  		put_new_page(&dst->page, private);
>>  	else
>> @@ -1078,13 +1083,42 @@ static void migrate_folio_done(struct folio *src,
>>  		folio_put(src);
>>  }
>>
>> -static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>> -				 int force, bool force_lock, enum migrate_mode mode)
>> +/* Obtain the lock on page, remove all ptes. */
>> +static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>> +			       unsigned long private, struct folio *src,
>> +			       struct folio **dstp, int force, bool force_lock,
>> +			       enum migrate_mode mode, enum migrate_reason reason,
>> +			       struct list_head *ret)
>>  {
>> +	struct folio *dst;
>>  	int rc = -EAGAIN;
>> +	struct page *newpage = NULL;
>>  	int page_was_mapped = 0;
>>  	struct anon_vma *anon_vma = NULL;
>>  	bool is_lru = !__PageMovable(&src->page);
>> +	bool locked = false;
>> +	bool dst_locked = false;
>> +
>> +	if (!thp_migration_supported() && folio_test_transhuge(src))
>> +		return -ENOSYS;
>> +
>> +	if (folio_ref_count(src) == 1) {
>> +		/* Folio was freed from under us. So we are done. */
>> +		folio_clear_active(src);
>> +		folio_clear_unevictable(src);
>> +		/* free_pages_prepare() will clear PG_isolated. */
>> +		list_del(&src->lru);
>> +		migrate_folio_done(src, reason);
>> +		return MIGRATEPAGE_SUCCESS;
>> +	}
>> +
>> +	newpage = get_new_page(&src->page, private);
>> +	if (!newpage)
>> +		return -ENOMEM;
>> +	dst = page_folio(newpage);
>> +	*dstp = dst;
>> +
>> +	dst->private = NULL;
>>
>>  	if (!folio_trylock(src)) {
>>  		if (!force || mode == MIGRATE_ASYNC)
>> @@ -1119,6 +1153,7 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>
>>  		folio_lock(src);
>>  	}
>> +	locked = true;
>>
>>  	if (folio_test_writeback(src)) {
>>  		/*
>> @@ -1133,10 +1168,10 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>  			break;
>>  		default:
>>  			rc = -EBUSY;
>> -			goto out_unlock;
>> +			goto out;
>>  		}
>>  		if (!force)
>> -			goto out_unlock;
>> +			goto out;
>>  		folio_wait_writeback(src);
>>  	}
>>
>> @@ -1166,7 +1201,8 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>  	 * This is much like races on refcount of oldpage: just don't BUG().
>>  	 */
>>  	if (unlikely(!folio_trylock(dst)))
>> -		goto out_unlock;
>> +		goto out;
>> +	dst_locked = true;
>>
>>  	if (unlikely(!is_lru)) {
>>  		__migrate_folio_record(dst, page_was_mapped, anon_vma);
>> @@ -1188,7 +1224,7 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>  	if (!src->mapping) {
>>  		if (folio_test_private(src)) {
>>  			try_to_free_buffers(src);
>> -			goto out_unlock_both;
>> +			goto out;
>>  		}
>>  	} else if (folio_mapped(src)) {
>>  		/* Establish migration ptes */
>> @@ -1203,74 +1239,26 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>  		return MIGRATEPAGE_UNMAP;
>>  	}
>>
>> -	if (page_was_mapped)
>> -		remove_migration_ptes(src, src, false);
>> -
>> -out_unlock_both:
>> -	folio_unlock(dst);
>> -out_unlock:
>> -	/* Drop an anon_vma reference if we took one */
>> -	if (anon_vma)
>> -		put_anon_vma(anon_vma);
>> -	folio_unlock(src);
>>  out:
>> -
>> -	return rc;
>> -}
>> -
>> -/* Obtain the lock on page, remove all ptes. */
>> -static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>> -			       unsigned long private, struct folio *src,
>> -			       struct folio **dstp, int force, bool force_lock,
>> -			       enum migrate_mode mode, enum migrate_reason reason,
>> -			       struct list_head *ret)
>> -{
>> -	struct folio *dst;
>> -	int rc = MIGRATEPAGE_UNMAP;
>> -	struct page *newpage = NULL;
>> -
>> -	if (!thp_migration_supported() && folio_test_transhuge(src))
>> -		return -ENOSYS;
>> -
>> -	if (folio_ref_count(src) == 1) {
>> -		/* Folio was freed from under us. So we are done. */
>> -		folio_clear_active(src);
>> -		folio_clear_unevictable(src);
>> -		/* free_pages_prepare() will clear PG_isolated. */
>> -		list_del(&src->lru);
>> -		migrate_folio_done(src, reason);
>> -		return MIGRATEPAGE_SUCCESS;
>> -	}
>> -
>> -	newpage = get_new_page(&src->page, private);
>> -	if (!newpage)
>> -		return -ENOMEM;
>> -	dst = page_folio(newpage);
>> -	*dstp = dst;
>> -
>> -	dst->private = NULL;
>> -	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
>> -	if (rc == MIGRATEPAGE_UNMAP)
>> -		return rc;
>> -
>>  	/*
>>  	 * A page that has not been migrated will have kept its
>>  	 * references and be restored.
>>  	 */
>>  	/* restore the folio to right list. */
>
> This comment is stale. Probably should be
> /* Keep the folio and we will try it again later */

Good catch!  Will revise this in the next version.

Best Regards,
Huang, Ying

>> -	if (rc != -EAGAIN && rc != -EDEADLOCK)
>> -		list_move_tail(&src->lru, ret);
>> +	if (rc == -EAGAIN || rc == -EDEADLOCK)
>> +		ret = NULL;
>>
>> -	if (put_new_page)
>> -		put_new_page(&dst->page, private);
>> -	else
>> -		folio_put(dst);
>> +	migrate_folio_undo_src(src, page_was_mapped, anon_vma, locked, ret);
>> +	migrate_folio_undo_dst(dst, dst_locked, put_new_page, private);
>>
>>  	return rc;
>>  }
>>
>> -static int __migrate_folio_move(struct folio *src, struct folio *dst,
>> -				enum migrate_mode mode)
>> +/* Migrate the folio to the newly allocated folio in dst. */
>> +static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
>> +			      struct folio *src, struct folio *dst,
>> +			      enum migrate_mode mode, enum migrate_reason reason,
>> +			      struct list_head *ret)
>>  {
>>  	int rc;
>>  	int page_was_mapped = 0;
>> @@ -1283,12 +1271,8 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>  	list_del(&dst->lru);
>>
>>  	rc = move_to_new_folio(dst, src, mode);
>> -
>> -	if (rc == -EAGAIN) {
>> -		list_add(&dst->lru, prev);
>> -		__migrate_folio_record(dst, page_was_mapped, anon_vma);
>> -		return rc;
>> -	}
>> +	if (rc)
>> +		goto out;
>>
>>  	if (unlikely(!is_lru))
>>  		goto out_unlock_both;
>> @@ -1302,70 +1286,45 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>  	 * unsuccessful, and other cases when a page has been temporarily
>>  	 * isolated from the unevictable LRU: but this case is the easiest.
>>  	 */
>> -	if (rc == MIGRATEPAGE_SUCCESS) {
>> -		folio_add_lru(dst);
>> -		if (page_was_mapped)
>> -			lru_add_drain();
>> -	}
>> +	folio_add_lru(dst);
>> +	if (page_was_mapped)
>> +		lru_add_drain();
>>
>>  	if (page_was_mapped)
>> -		remove_migration_ptes(src,
>> -			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
>> +		remove_migration_ptes(src, dst, false);
>>
>>  out_unlock_both:
>>  	folio_unlock(dst);
>> -	/* Drop an anon_vma reference if we took one */
>> -	if (anon_vma)
>> -		put_anon_vma(anon_vma);
>> -	folio_unlock(src);
>> +	set_page_owner_migrate_reason(&dst->page, reason);
>>  	/*
>>  	 * If migration is successful, decrease refcount of dst,
>>  	 * which will not free the page because new page owner increased
>>  	 * refcounter.
>>  	 */
>> -	if (rc == MIGRATEPAGE_SUCCESS)
>> -		folio_put(dst);
>> -
>> -	return rc;
>> -}
>> -
>> -/* Migrate the folio to the newly allocated folio in dst. */
>> -static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
>> -			      struct folio *src, struct folio *dst,
>> -			      enum migrate_mode mode, enum migrate_reason reason,
>> -			      struct list_head *ret)
>> -{
>> -	int rc;
>> -
>> -	rc = __migrate_folio_move(src, dst, mode);
>> -	if (rc == MIGRATEPAGE_SUCCESS)
>> -		set_page_owner_migrate_reason(&dst->page, reason);
>> -
>> -	if (rc != -EAGAIN) {
>> -		/*
>> -		 * A folio that has been migrated has all references
>> -		 * removed and will be freed. A folio that has not been
>> -		 * migrated will have kept its references and be restored.
>> -		 */
>> -		list_del(&src->lru);
>> -	}
>> +	folio_put(dst);
>>
>>  	/*
>> -	 * If migration is successful, releases reference grabbed during
>> -	 * isolation. Otherwise, restore the folio to right list unless
>> -	 * we want to retry.
>> +	 * A page that has been migrated has all references removed
>> +	 * and will be freed.
>>  	 */
>> -	if (rc == MIGRATEPAGE_SUCCESS) {
>> -		migrate_folio_done(src, reason);
>> -	} else if (rc != -EAGAIN) {
>> -		list_add_tail(&src->lru, ret);
>> +	list_del(&src->lru);
>> +	/* Drop an anon_vma reference if we took one */
>> +	if (anon_vma)
>> +		put_anon_vma(anon_vma);
>> +	folio_unlock(src);
>> +	migrate_folio_done(src, reason);
>>
>> -		if (put_new_page)
>> -			put_new_page(&dst->page, private);
>> -		else
>> -			folio_put(dst);
>> +	return rc;
>> +out:
>> +	if (rc == -EAGAIN) {
>> +		list_add(&dst->lru, prev);
>> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
>> +		return rc;
>>  	}
>>
>> +	migrate_folio_undo_src(src, page_was_mapped, anon_vma, true, ret);
>> +	migrate_folio_undo_dst(dst, true, put_new_page, private);
>> +
>>  	return rc;
>>  }
>>
>> @@ -1897,9 +1856,9 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>
>>  		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
>>  		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
>> -				       ret_folios);
>> +				       true, ret_folios);
>>  		list_del(&dst->lru);
>> -		migrate_folio_undo_dst(dst, put_new_page, private);
>> +		migrate_folio_undo_dst(dst, true, put_new_page, private);
>>  		dst = dst2;
>>  		dst2 = list_next_entry(dst, lru);
>>  	}
>> -- 
>> 2.35.1
>
> Everything else looks good to me, just need to fix the two comments above.
> Reviewed-by: Zi Yan <ziy@nvidia.com>
>
> --
> Best Regards,
> Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move
  2023-02-08 12:02     ` Huang, Ying
@ 2023-02-08 19:47       ` Zi Yan
  2023-02-10  7:09         ` Huang, Ying
  0 siblings, 1 reply; 33+ messages in thread
From: Zi Yan @ 2023-02-08 19:47 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

On 8 Feb 2023, at 7:02, Huang, Ying wrote:

> Zi Yan <ziy@nvidia.com> writes:
>
>> On 6 Feb 2023, at 1:33, Huang Ying wrote:
>>
>>> This is a code cleanup patch to reduce the duplicated code between the
>>> _unmap and _move stages of migrate_pages().  No functionality change
>>> is expected.
>>>
>>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Yang Shi <shy828301@gmail.com>
>>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>>> Cc: Oscar Salvador <osalvador@suse.de>
>>> Cc: Matthew Wilcox <willy@infradead.org>
>>> Cc: Bharata B Rao <bharata@amd.com>
>>> Cc: Alistair Popple <apopple@nvidia.com>
>>> Cc: haoxin <xhao@linux.alibaba.com>
>>> Cc: Minchan Kim <minchan@kernel.org>
>>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>>> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>>> ---
>>>  mm/migrate.c | 203 ++++++++++++++++++++-------------------------------
>>>  1 file changed, 81 insertions(+), 122 deletions(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 23eb01cfae4c..9378fa2ad4a5 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1037,6 +1037,7 @@ static void __migrate_folio_extract(struct folio *dst,
>>>  static void migrate_folio_undo_src(struct folio *src,
>>>  				   int page_was_mapped,
>>>  				   struct anon_vma *anon_vma,
>>> +				   bool locked,
>>>  				   struct list_head *ret)
>>>  {
>>>  	if (page_was_mapped)
>>> @@ -1044,16 +1045,20 @@ static void migrate_folio_undo_src(struct folio *src,
>>>  	/* Drop an anon_vma reference if we took one */
>>>  	if (anon_vma)
>>>  		put_anon_vma(anon_vma);
>>> -	folio_unlock(src);
>>> -	list_move_tail(&src->lru, ret);
>>> +	if (locked)
>>> +		folio_unlock(src);
>>
>> Having a comment would be better.
>> /* A page that has not been migrated, move it to a list for later restoration */
>
> Emm... the page state has been restored in the previous operations of
> the function.  This is the last step and the page will be moved to
> "return" list, then the caller of migrate_pages() will call
> putback_movable_pages().

But if (rc == -EAGAIN || rc == -EDEADLOCK) then ret will be NULL, thus the page
will not be put back, right? And for both cases, the src page state is not
changed at all. So probably only call migrate_folio_undo_src() when
(rc != -EAGAIN && rc != -EDEADLOCK)? And still require ret to be non NULL.

>
> We have some comments for the function (migrate_folio_undo_src()) as
> follows,
>
> /* Restore the source folio to the original state upon failure */
>
>>> +	if (ret)
>>> +		list_move_tail(&src->lru, ret);
>>>  }
>>>
>>>  /* Restore the destination folio to the original state upon failure */
>>>  static void migrate_folio_undo_dst(struct folio *dst,
>>> +				   bool locked,
>>>  				   free_page_t put_new_page,
>>>  				   unsigned long private)
>>>  {
>>> -	folio_unlock(dst);
>>> +	if (locked)
>>> +		folio_unlock(dst);
>>>  	if (put_new_page)
>>>  		put_new_page(&dst->page, private);
>>>  	else
>>> @@ -1078,13 +1083,42 @@ static void migrate_folio_done(struct folio *src,
>>>  		folio_put(src);
>>>  }
>>>
>>> -static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>> -				 int force, bool force_lock, enum migrate_mode mode)
>>> +/* Obtain the lock on page, remove all ptes. */
>>> +static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>>> +			       unsigned long private, struct folio *src,
>>> +			       struct folio **dstp, int force, bool force_lock,
>>> +			       enum migrate_mode mode, enum migrate_reason reason,
>>> +			       struct list_head *ret)
>>>  {
>>> +	struct folio *dst;
>>>  	int rc = -EAGAIN;
>>> +	struct page *newpage = NULL;
>>>  	int page_was_mapped = 0;
>>>  	struct anon_vma *anon_vma = NULL;
>>>  	bool is_lru = !__PageMovable(&src->page);
>>> +	bool locked = false;
>>> +	bool dst_locked = false;
>>> +
>>> +	if (!thp_migration_supported() && folio_test_transhuge(src))
>>> +		return -ENOSYS;
>>> +
>>> +	if (folio_ref_count(src) == 1) {
>>> +		/* Folio was freed from under us. So we are done. */
>>> +		folio_clear_active(src);
>>> +		folio_clear_unevictable(src);
>>> +		/* free_pages_prepare() will clear PG_isolated. */
>>> +		list_del(&src->lru);
>>> +		migrate_folio_done(src, reason);
>>> +		return MIGRATEPAGE_SUCCESS;
>>> +	}
>>> +
>>> +	newpage = get_new_page(&src->page, private);
>>> +	if (!newpage)
>>> +		return -ENOMEM;
>>> +	dst = page_folio(newpage);
>>> +	*dstp = dst;
>>> +
>>> +	dst->private = NULL;
>>>
>>>  	if (!folio_trylock(src)) {
>>>  		if (!force || mode == MIGRATE_ASYNC)
>>> @@ -1119,6 +1153,7 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>>
>>>  		folio_lock(src);
>>>  	}
>>> +	locked = true;
>>>
>>>  	if (folio_test_writeback(src)) {
>>>  		/*
>>> @@ -1133,10 +1168,10 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>>  			break;
>>>  		default:
>>>  			rc = -EBUSY;
>>> -			goto out_unlock;
>>> +			goto out;
>>>  		}
>>>  		if (!force)
>>> -			goto out_unlock;
>>> +			goto out;
>>>  		folio_wait_writeback(src);
>>>  	}
>>>
>>> @@ -1166,7 +1201,8 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>>  	 * This is much like races on refcount of oldpage: just don't BUG().
>>>  	 */
>>>  	if (unlikely(!folio_trylock(dst)))
>>> -		goto out_unlock;
>>> +		goto out;
>>> +	dst_locked = true;
>>>
>>>  	if (unlikely(!is_lru)) {
>>>  		__migrate_folio_record(dst, page_was_mapped, anon_vma);
>>> @@ -1188,7 +1224,7 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>>  	if (!src->mapping) {
>>>  		if (folio_test_private(src)) {
>>>  			try_to_free_buffers(src);
>>> -			goto out_unlock_both;
>>> +			goto out;
>>>  		}
>>>  	} else if (folio_mapped(src)) {
>>>  		/* Establish migration ptes */
>>> @@ -1203,74 +1239,26 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>>  		return MIGRATEPAGE_UNMAP;
>>>  	}
>>>
>>> -	if (page_was_mapped)
>>> -		remove_migration_ptes(src, src, false);
>>> -
>>> -out_unlock_both:
>>> -	folio_unlock(dst);
>>> -out_unlock:
>>> -	/* Drop an anon_vma reference if we took one */
>>> -	if (anon_vma)
>>> -		put_anon_vma(anon_vma);
>>> -	folio_unlock(src);
>>>  out:
>>> -
>>> -	return rc;
>>> -}
>>> -
>>> -/* Obtain the lock on page, remove all ptes. */
>>> -static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>>> -			       unsigned long private, struct folio *src,
>>> -			       struct folio **dstp, int force, bool force_lock,
>>> -			       enum migrate_mode mode, enum migrate_reason reason,
>>> -			       struct list_head *ret)
>>> -{
>>> -	struct folio *dst;
>>> -	int rc = MIGRATEPAGE_UNMAP;
>>> -	struct page *newpage = NULL;
>>> -
>>> -	if (!thp_migration_supported() && folio_test_transhuge(src))
>>> -		return -ENOSYS;
>>> -
>>> -	if (folio_ref_count(src) == 1) {
>>> -		/* Folio was freed from under us. So we are done. */
>>> -		folio_clear_active(src);
>>> -		folio_clear_unevictable(src);
>>> -		/* free_pages_prepare() will clear PG_isolated. */
>>> -		list_del(&src->lru);
>>> -		migrate_folio_done(src, reason);
>>> -		return MIGRATEPAGE_SUCCESS;
>>> -	}
>>> -
>>> -	newpage = get_new_page(&src->page, private);
>>> -	if (!newpage)
>>> -		return -ENOMEM;
>>> -	dst = page_folio(newpage);
>>> -	*dstp = dst;
>>> -
>>> -	dst->private = NULL;
>>> -	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
>>> -	if (rc == MIGRATEPAGE_UNMAP)
>>> -		return rc;
>>> -
>>>  	/*
>>>  	 * A page that has not been migrated will have kept its
>>>  	 * references and be restored.
>>>  	 */
>>>  	/* restore the folio to right list. */
>>
>> This comment is stale. Probably should be
>> /* Keep the folio and we will try it again later */
>
> Good catch!  Will revise this in the next version.
>
> Best Regards,
> Huang, Ying
>
>>> -	if (rc != -EAGAIN && rc != -EDEADLOCK)
>>> -		list_move_tail(&src->lru, ret);
>>> +	if (rc == -EAGAIN || rc == -EDEADLOCK)
>>> +		ret = NULL;
>>>
>>> -	if (put_new_page)
>>> -		put_new_page(&dst->page, private);
>>> -	else
>>> -		folio_put(dst);
>>> +	migrate_folio_undo_src(src, page_was_mapped, anon_vma, locked, ret);
>>> +	migrate_folio_undo_dst(dst, dst_locked, put_new_page, private);
>>>
>>>  	return rc;
>>>  }
>>>
>>> -static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>> -				enum migrate_mode mode)
>>> +/* Migrate the folio to the newly allocated folio in dst. */
>>> +static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
>>> +			      struct folio *src, struct folio *dst,
>>> +			      enum migrate_mode mode, enum migrate_reason reason,
>>> +			      struct list_head *ret)
>>>  {
>>>  	int rc;
>>>  	int page_was_mapped = 0;
>>> @@ -1283,12 +1271,8 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>>  	list_del(&dst->lru);
>>>
>>>  	rc = move_to_new_folio(dst, src, mode);
>>> -
>>> -	if (rc == -EAGAIN) {
>>> -		list_add(&dst->lru, prev);
>>> -		__migrate_folio_record(dst, page_was_mapped, anon_vma);
>>> -		return rc;
>>> -	}
>>> +	if (rc)
>>> +		goto out;
>>>
>>>  	if (unlikely(!is_lru))
>>>  		goto out_unlock_both;
>>> @@ -1302,70 +1286,45 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>>  	 * unsuccessful, and other cases when a page has been temporarily
>>>  	 * isolated from the unevictable LRU: but this case is the easiest.
>>>  	 */
>>> -	if (rc == MIGRATEPAGE_SUCCESS) {
>>> -		folio_add_lru(dst);
>>> -		if (page_was_mapped)
>>> -			lru_add_drain();
>>> -	}
>>> +	folio_add_lru(dst);
>>> +	if (page_was_mapped)
>>> +		lru_add_drain();
>>>
>>>  	if (page_was_mapped)
>>> -		remove_migration_ptes(src,
>>> -			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
>>> +		remove_migration_ptes(src, dst, false);
>>>
>>>  out_unlock_both:
>>>  	folio_unlock(dst);
>>> -	/* Drop an anon_vma reference if we took one */
>>> -	if (anon_vma)
>>> -		put_anon_vma(anon_vma);
>>> -	folio_unlock(src);
>>> +	set_page_owner_migrate_reason(&dst->page, reason);
>>>  	/*
>>>  	 * If migration is successful, decrease refcount of dst,
>>>  	 * which will not free the page because new page owner increased
>>>  	 * refcounter.
>>>  	 */
>>> -	if (rc == MIGRATEPAGE_SUCCESS)
>>> -		folio_put(dst);
>>> -
>>> -	return rc;
>>> -}
>>> -
>>> -/* Migrate the folio to the newly allocated folio in dst. */
>>> -static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
>>> -			      struct folio *src, struct folio *dst,
>>> -			      enum migrate_mode mode, enum migrate_reason reason,
>>> -			      struct list_head *ret)
>>> -{
>>> -	int rc;
>>> -
>>> -	rc = __migrate_folio_move(src, dst, mode);
>>> -	if (rc == MIGRATEPAGE_SUCCESS)
>>> -		set_page_owner_migrate_reason(&dst->page, reason);
>>> -
>>> -	if (rc != -EAGAIN) {
>>> -		/*
>>> -		 * A folio that has been migrated has all references
>>> -		 * removed and will be freed. A folio that has not been
>>> -		 * migrated will have kept its references and be restored.
>>> -		 */
>>> -		list_del(&src->lru);
>>> -	}
>>> +	folio_put(dst);
>>>
>>>  	/*
>>> -	 * If migration is successful, releases reference grabbed during
>>> -	 * isolation. Otherwise, restore the folio to right list unless
>>> -	 * we want to retry.
>>> +	 * A page that has been migrated has all references removed
>>> +	 * and will be freed.
>>>  	 */
>>> -	if (rc == MIGRATEPAGE_SUCCESS) {
>>> -		migrate_folio_done(src, reason);
>>> -	} else if (rc != -EAGAIN) {
>>> -		list_add_tail(&src->lru, ret);
>>> +	list_del(&src->lru);
>>> +	/* Drop an anon_vma reference if we took one */
>>> +	if (anon_vma)
>>> +		put_anon_vma(anon_vma);
>>> +	folio_unlock(src);
>>> +	migrate_folio_done(src, reason);
>>>
>>> -		if (put_new_page)
>>> -			put_new_page(&dst->page, private);
>>> -		else
>>> -			folio_put(dst);
>>> +	return rc;
>>> +out:
>>> +	if (rc == -EAGAIN) {
>>> +		list_add(&dst->lru, prev);
>>> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
>>> +		return rc;
>>>  	}
>>>
>>> +	migrate_folio_undo_src(src, page_was_mapped, anon_vma, true, ret);
>>> +	migrate_folio_undo_dst(dst, true, put_new_page, private);
>>> +
>>>  	return rc;
>>>  }
>>>
>>> @@ -1897,9 +1856,9 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>>
>>>  		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
>>>  		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
>>> -				       ret_folios);
>>> +				       true, ret_folios);
>>>  		list_del(&dst->lru);
>>> -		migrate_folio_undo_dst(dst, put_new_page, private);
>>> +		migrate_folio_undo_dst(dst, true, put_new_page, private);
>>>  		dst = dst2;
>>>  		dst2 = list_next_entry(dst, lru);
>>>  	}
>>> -- 
>>> 2.35.1
>>
>> Everything else looks good to me, just need to fix the two comments above.
>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>>
>> --
>> Best Regards,
>> Yan, Zi


--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move
  2023-02-08 19:47       ` Zi Yan
@ 2023-02-10  7:09         ` Huang, Ying
  0 siblings, 0 replies; 33+ messages in thread
From: Huang, Ying @ 2023-02-10  7:09 UTC (permalink / raw)
  To: Zi Yan
  Cc: Andrew Morton, linux-mm, linux-kernel, Yang Shi, Baolin Wang,
	Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple,
	haoxin, Minchan Kim, Mike Kravetz, Hyeonggon Yoo

Zi Yan <ziy@nvidia.com> writes:

> On 8 Feb 2023, at 7:02, Huang, Ying wrote:
>
>> Zi Yan <ziy@nvidia.com> writes:
>>
>>> On 6 Feb 2023, at 1:33, Huang Ying wrote:
>>>
>>>> This is a code cleanup patch to reduce the duplicated code between the
>>>> _unmap and _move stages of migrate_pages().  No functionality change
>>>> is expected.
>>>>
>>>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>>>> Cc: Zi Yan <ziy@nvidia.com>
>>>> Cc: Yang Shi <shy828301@gmail.com>
>>>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>> Cc: Matthew Wilcox <willy@infradead.org>
>>>> Cc: Bharata B Rao <bharata@amd.com>
>>>> Cc: Alistair Popple <apopple@nvidia.com>
>>>> Cc: haoxin <xhao@linux.alibaba.com>
>>>> Cc: Minchan Kim <minchan@kernel.org>
>>>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>>>> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>>>> ---
>>>>  mm/migrate.c | 203 ++++++++++++++++++++-------------------------------
>>>>  1 file changed, 81 insertions(+), 122 deletions(-)
>>>>
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index 23eb01cfae4c..9378fa2ad4a5 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -1037,6 +1037,7 @@ static void __migrate_folio_extract(struct folio *dst,
>>>>  static void migrate_folio_undo_src(struct folio *src,
>>>>  				   int page_was_mapped,
>>>>  				   struct anon_vma *anon_vma,
>>>> +				   bool locked,
>>>>  				   struct list_head *ret)
>>>>  {
>>>>  	if (page_was_mapped)
>>>> @@ -1044,16 +1045,20 @@ static void migrate_folio_undo_src(struct folio *src,
>>>>  	/* Drop an anon_vma reference if we took one */
>>>>  	if (anon_vma)
>>>>  		put_anon_vma(anon_vma);
>>>> -	folio_unlock(src);
>>>> -	list_move_tail(&src->lru, ret);
>>>> +	if (locked)
>>>> +		folio_unlock(src);
>>>
>>> Having a comment would be better.
>>> /* A page that has not been migrated, move it to a list for later restoration */
>>
>> Emm... the page state has been restored in the previous operations of
>> the function.  This is the last step and the page will be moved to
>> "return" list, then the caller of migrate_pages() will call
>> putback_movable_pages().
>
> But if (rc == -EAGAIN || rc == -EDEADLOCK) then ret will be NULL, thus the page
> will not be put back, right?

Yes.  That is a special case.

> And for both cases, the src page state is not
> changed at all.

Their state needs to be restored to the original state too before they
can be processed again.  That is also done by the previous operations.
For example, if the folio has been locked, we need to unlock it before
returning with -EAGAIN; otherwise we would run into a double lock.
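
That is, roughly (illustrative, mirroring the code in this patch):

	if (rc == -EAGAIN || rc == -EDEADLOCK)
		ret = NULL;	/* keep src on 'from' so it is retried */
	/*
	 * migrate_folio_undo_src() still unlocks src (when locked) and
	 * drops any anon_vma reference; only the move to the 'ret' list
	 * is skipped.  If src stayed locked here, the retry pass would
	 * block on folio_lock(src) against ourselves.
	 */
	migrate_folio_undo_src(src, page_was_mapped, anon_vma, locked, ret);
	migrate_folio_undo_dst(dst, dst_locked, put_new_page, private);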

> So probably only call migrate_folio_undo_src() when
> (rc != -EAGAIN && rc != -EDEADLOCK)? And still require ret to be non NULL.
>
>>
>> We have some comments for the function (migrate_folio_undo_src()) as
>> follows,
>>
>> /* Restore the source folio to the original state upon failure */
>>
>>>> +	if (ret)
>>>> +		list_move_tail(&src->lru, ret);
>>>>  }
>>>>

[snip]

Best Regards,
Huang, Ying

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH -v4 5/9] migrate_pages: batch _unmap and _move
  2023-02-06 16:10   ` Zi Yan
  2023-02-07  5:58     ` Huang, Ying
@ 2023-02-13  6:55     ` Huang, Ying
  1 sibling, 0 replies; 33+ messages in thread
From: Huang, Ying @ 2023-02-13  6:55 UTC (permalink / raw)
  To: Zi Yan
  Cc: Andrew Morton, linux-mm, linux-kernel, Hyeonggon Yoo, Yang Shi,
	Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao,
	Alistair Popple, haoxin, Minchan Kim, Mike Kravetz

Zi Yan <ziy@nvidia.com> writes:

> On 6 Feb 2023, at 1:33, Huang Ying wrote:
>
>> In this patch the _unmap and _move stage of the folio migration is
>> batched.  That for, previously, it is,
>>
>>   for each folio
>>     _unmap()
>>     _move()
>>
>> Now, it is,
>>
>>   for each folio
>>     _unmap()
>>   for each folio
>>     _move()
>>
>> Based on this, we can batch the TLB flushing and use some hardware
>> accelerator to copy folios between batched _unmap and batched _move
>> stages.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Bharata B Rao <bharata@amd.com>
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: haoxin <xhao@linux.alibaba.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> ---
>>  mm/migrate.c | 208 +++++++++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 184 insertions(+), 24 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 0428449149f4..fa7212330cb6 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1033,6 +1033,33 @@ static void __migrate_folio_extract(struct folio *dst,
>>  	dst->private = NULL;
>>  }
>>
>> +/* Restore the source folio to the original state upon failure */
>> +static void migrate_folio_undo_src(struct folio *src,
>> +				   int page_was_mapped,
>> +				   struct anon_vma *anon_vma,
>> +				   struct list_head *ret)
>> +{
>> +	if (page_was_mapped)
>> +		remove_migration_ptes(src, src, false);
>> +	/* Drop an anon_vma reference if we took one */
>> +	if (anon_vma)
>> +		put_anon_vma(anon_vma);
>> +	folio_unlock(src);
>> +	list_move_tail(&src->lru, ret);
>> +}
>> +
>> +/* Restore the destination folio to the original state upon failure */
>> +static void migrate_folio_undo_dst(struct folio *dst,
>> +				   free_page_t put_new_page,
>> +				   unsigned long private)
>> +{
>> +	folio_unlock(dst);
>> +	if (put_new_page)
>> +		put_new_page(&dst->page, private);
>> +	else
>> +		folio_put(dst);
>> +}
>> +
>>  /* Cleanup src folio upon migration success */
>>  static void migrate_folio_done(struct folio *src,
>>  			       enum migrate_reason reason)
>> @@ -1052,7 +1079,7 @@ static void migrate_folio_done(struct folio *src,
>>  }
>>
>>  static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>> -				int force, enum migrate_mode mode)
>> +				 int force, bool force_lock, enum migrate_mode mode)
>>  {
>>  	int rc = -EAGAIN;
>>  	int page_was_mapped = 0;
>> @@ -1079,6 +1106,17 @@ static int __migrate_folio_unmap(struct folio *src, struct folio *dst,
>>  		if (current->flags & PF_MEMALLOC)
>>  			goto out;
>>
>> +		/*
>> +		 * We have locked some folios, to avoid deadlock, we cannot
>> +		 * lock the folio synchronously.  Go out to process (and
>> +		 * unlock) all the locked folios.  Then we can lock the folio
>> +		 * synchronously.
>> +		 */
> The comment alone is quite confusing and the variable might be better
> renamed to avoid_force_lock, since there is a force variable to force
> lock folio already. And the variable intends to discourage force lock
> on a folio to avoid potential deadlock.
>
> How about? Since "lock synchronously" might not be as straightforward
> as wait to lock.
>
> /*
>  * We have locked some folios and are going to wait to lock this folio.
>  * To avoid a potential deadlock, let's bail out and not do that. The
>  * locked folios will be moved and unlocked, then we can wait to lock
>  * this folio
>  */
>
>> +		if (!force_lock) {
>> +			rc = -EDEADLOCK;
>> +			goto out;
>> +		}
>> +
>>  		folio_lock(src);
>>  	}
>>
>> @@ -1187,10 +1225,20 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>  	int page_was_mapped = 0;
>>  	struct anon_vma *anon_vma = NULL;
>>  	bool is_lru = !__PageMovable(&src->page);
>> +	struct list_head *prev;
>>
>>  	__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
>> +	prev = dst->lru.prev;
>> +	list_del(&dst->lru);
>>
>>  	rc = move_to_new_folio(dst, src, mode);
>> +
>> +	if (rc == -EAGAIN) {
>> +		list_add(&dst->lru, prev);
>> +		__migrate_folio_record(dst, page_was_mapped, anon_vma);
>> +		return rc;
>> +	}
>> +
>>  	if (unlikely(!is_lru))
>>  		goto out_unlock_both;
>>
>> @@ -1233,7 +1281,7 @@ static int __migrate_folio_move(struct folio *src, struct folio *dst,
>>  /* Obtain the lock on page, remove all ptes. */
>>  static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>>  			       unsigned long private, struct folio *src,
>> -			       struct folio **dstp, int force,
>> +			       struct folio **dstp, int force, bool force_lock,
>>  			       enum migrate_mode mode, enum migrate_reason reason,
>>  			       struct list_head *ret)
>>  {
>> @@ -1261,7 +1309,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>>  	*dstp = dst;
>>
>>  	dst->private = NULL;
>> -	rc = __migrate_folio_unmap(src, dst, force, mode);
>> +	rc = __migrate_folio_unmap(src, dst, force, force_lock, mode);
>>  	if (rc == MIGRATEPAGE_UNMAP)
>>  		return rc;
>>
>> @@ -1270,7 +1318,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>>  	 * references and be restored.
>>  	 */
>>  	/* restore the folio to right list. */
>> -	if (rc != -EAGAIN)
>> +	if (rc != -EAGAIN && rc != -EDEADLOCK)
>>  		list_move_tail(&src->lru, ret);
>>
>>  	if (put_new_page)
>> @@ -1309,9 +1357,8 @@ static int migrate_folio_move(free_page_t put_new_page, unsigned long private,
>>  	 */
>>  	if (rc == MIGRATEPAGE_SUCCESS) {
>>  		migrate_folio_done(src, reason);
>> -	} else {
>> -		if (rc != -EAGAIN)
>> -			list_add_tail(&src->lru, ret);
>> +	} else if (rc != -EAGAIN) {
>> +		list_add_tail(&src->lru, ret);
>>
>>  		if (put_new_page)
>>  			put_new_page(&dst->page, private);
>> @@ -1591,7 +1638,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  		enum migrate_mode mode, int reason, struct list_head *ret_folios,
>>  		struct migrate_pages_stats *stats)
>
> Like I said in my last comment to this patch, migrate_pages_batch() function
> deserves a detailed comment about its working flow including the error handling.
> Now you only put some in the git log, which is hard to access after several code
> changes later.
>
> How about?
>
> /*
>  * migrate_pages_batch() first unmaps pages in the from as many as possible,
>  * then migrates the unmapped pages. During unmap process, different situations
>  * are handled differently:
>  * 1. ENOSYS, unsupported large folio migration: move to ret_folios list
>  * 2. ENOMEM, lower memory at the destination: migrate existing unmapped folios
>  *    and stop, since existing unmapped folios have new pages allocated and can
>  *    be migrated
>  * 3. EDEADLOCK, to be unmapped page is locked by someone else, to avoid deadlock,
>  *    we migrate existing unmapped pages and try to lock again
>  * 4. MIGRATEPAGE_SUCCESS, the folios was freed under us: no action
>  * 5. MIGRATEPAGE_UNMAP, unmap succeeded: set avoid_force_lock to true to avoid
>  *    wait to lock a folio in the future to avoid deadlock.
>  *
>  * For folios unmapped but cannot be migrated, we will restore their original
>  * states during cleanup stage at the end.
>  */

Sorry, I didn't notice the above comments in the previous reply.

The comments appear too detailed to me.  I think that it's better for
people to get the details from the code itself.  So, I want to use the
much simplified version below.

/*
 * migrate_pages_batch() first unmaps folios in the from list as many as
 * possible, then move the unmapped folios.
 */
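
For reference, the overall shape that comment summarizes is roughly as
follows (sketch; the labels and list names are the ones used in the
patch, the call arguments are shorthand):

retry:
	/* unmap pass: collect successfully unmapped folios */
	list_for_each_entry_safe(folio, folio2, from, lru) {
		rc = migrate_folio_unmap(...);
		if (rc == MIGRATEPAGE_UNMAP) {
			list_move_tail(&folio->lru, &unmap_folios);
			list_add_tail(&dst->lru, &dst_folios);
		}
		/* -ENOMEM or -EDEADLOCK: stop unmapping, go move what we have */
	}
move:
	/* move pass: copy the folios and restore the migration PTEs */
	list_for_each_entry_safe(folio, folio2, &unmap_folios, lru)
		rc = migrate_folio_move(...);
out:
	/* cleanup: undo src/dst state for folios unmapped but not moved */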

Best Regards,
Huang, Ying

>>  {
>> -	int retry = 1;
>> +	int retry;
>>  	int large_retry = 1;
>>  	int thp_retry = 1;
>>  	int nr_failed = 0;
>> @@ -1600,13 +1647,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  	int pass = 0;
>>  	bool is_large = false;
>>  	bool is_thp = false;
>> -	struct folio *folio, *folio2, *dst = NULL;
>> -	int rc, nr_pages;
>> +	struct folio *folio, *folio2, *dst = NULL, *dst2;
>> +	int rc, rc_saved, nr_pages;
>>  	LIST_HEAD(split_folios);
>> +	LIST_HEAD(unmap_folios);
>> +	LIST_HEAD(dst_folios);
>>  	bool nosplit = (reason == MR_NUMA_MISPLACED);
>>  	bool no_split_folio_counting = false;
>> +	bool force_lock;
>>
>> -split_folio_migration:
>> +retry:
>> +	rc_saved = 0;
>> +	force_lock = true;
>> +	retry = 1;
>>  	for (pass = 0;
>>  	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
>>  	     pass++) {
>> @@ -1628,16 +1681,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  			cond_resched();
>>
>>  			rc = migrate_folio_unmap(get_new_page, put_new_page, private,
>> -						 folio, &dst, pass > 2, mode,
>> -						 reason, ret_folios);
>> -			if (rc == MIGRATEPAGE_UNMAP)
>> -				rc = migrate_folio_move(put_new_page, private,
>> -							folio, dst, mode,
>> -							reason, ret_folios);
>> +						 folio, &dst, pass > 2, force_lock,
>> +						 mode, reason, ret_folios);
>>  			/*
>>  			 * The rules are:
>>  			 *	Success: folio will be freed
>> +			 *	Unmap: folio will be put on unmap_folios list,
>> +			 *	       dst folio put on dst_folios list
>>  			 *	-EAGAIN: stay on the from list
>> +			 *	-EDEADLOCK: stay on the from list
>>  			 *	-ENOMEM: stay on the from list
>>  			 *	-ENOSYS: stay on the from list
>>  			 *	Other errno: put on ret_folios list
>> @@ -1672,7 +1724,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  			case -ENOMEM:
>>  				/*
>>  				 * When memory is low, don't bother to try to migrate
>> -				 * other folios, just exit.
>> +				 * other folios, move unmapped folios, then exit.
>>  				 */
>>  				if (is_large) {
>>  					nr_large_failed++;
>> @@ -1711,7 +1763,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  				/* nr_failed isn't updated for not used */
>>  				nr_large_failed += large_retry;
>>  				stats->nr_thp_failed += thp_retry;
>> -				goto out;
>> +				rc_saved = rc;
>> +				if (list_empty(&unmap_folios))
>> +					goto out;
>> +				else
>> +					goto move;
>> +			case -EDEADLOCK:
>> +				/*
>> +				 * The folio cannot be locked for potential deadlock.
>> +				 * Go move (and unlock) all locked folios.  Then we can
>> +				 * try again.
>> +				 */
>> +				rc_saved = rc;
>> +				goto move;
>>  			case -EAGAIN:
>>  				if (is_large) {
>>  					large_retry++;
>> @@ -1725,6 +1789,15 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  				stats->nr_succeeded += nr_pages;
>>  				stats->nr_thp_succeeded += is_thp;
>>  				break;
>> +			case MIGRATEPAGE_UNMAP:
>> +				/*
>> +				 * We have locked some folios, don't force lock
>> +				 * to avoid deadlock.
>> +				 */
>> +				force_lock = false;
>> +				list_move_tail(&folio->lru, &unmap_folios);
>> +				list_add_tail(&dst->lru, &dst_folios);
>> +				break;
>>  			default:
>>  				/*
>>  				 * Permanent failure (-EBUSY, etc.):
>> @@ -1748,12 +1821,95 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  	nr_large_failed += large_retry;
>>  	stats->nr_thp_failed += thp_retry;
>>  	stats->nr_failed_pages += nr_retry_pages;
>> +move:
>> +	retry = 1;
>> +	for (pass = 0;
>> +	     pass < NR_MAX_MIGRATE_PAGES_RETRY && (retry || large_retry);
>> +	     pass++) {
>> +		retry = 0;
>> +		large_retry = 0;
>> +		thp_retry = 0;
>> +		nr_retry_pages = 0;
>> +
>> +		dst = list_first_entry(&dst_folios, struct folio, lru);
>> +		dst2 = list_next_entry(dst, lru);
>> +		list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
>> +			is_large = folio_test_large(folio);
>> +			is_thp = is_large && folio_test_pmd_mappable(folio);
>> +			nr_pages = folio_nr_pages(folio);
>> +
>> +			cond_resched();
>> +
>> +			rc = migrate_folio_move(put_new_page, private,
>> +						folio, dst, mode,
>> +						reason, ret_folios);
>> +			/*
>> +			 * The rules are:
>> +			 *	Success: folio will be freed
>> +			 *	-EAGAIN: stay on the unmap_folios list
>> +			 *	Other errno: put on ret_folios list
>> +			 */
>> +			switch(rc) {
>> +			case -EAGAIN:
>> +				if (is_large) {
>> +					large_retry++;
>> +					thp_retry += is_thp;
>> +				} else if (!no_split_folio_counting) {
>> +					retry++;
>> +				}
>> +				nr_retry_pages += nr_pages;
>> +				break;
>> +			case MIGRATEPAGE_SUCCESS:
>> +				stats->nr_succeeded += nr_pages;
>> +				stats->nr_thp_succeeded += is_thp;
>> +				break;
>> +			default:
>> +				if (is_large) {
>> +					nr_large_failed++;
>> +					stats->nr_thp_failed += is_thp;
>> +				} else if (!no_split_folio_counting) {
>> +					nr_failed++;
>> +				}
>> +
>> +				stats->nr_failed_pages += nr_pages;
>> +				break;
>> +			}
>> +			dst = dst2;
>> +			dst2 = list_next_entry(dst, lru);
>> +		}
>> +	}
>> +	nr_failed += retry;
>> +	nr_large_failed += large_retry;
>> +	stats->nr_thp_failed += thp_retry;
>> +	stats->nr_failed_pages += nr_retry_pages;
>> +
>> +	if (rc_saved)
>> +		rc = rc_saved;
>> +	else
>> +		rc = nr_failed + nr_large_failed;
>> +out:
>> +	/* Cleanup remaining folios */
>> +	dst = list_first_entry(&dst_folios, struct folio, lru);
>> +	dst2 = list_next_entry(dst, lru);
>> +	list_for_each_entry_safe(folio, folio2, &unmap_folios, lru) {
>> +		int page_was_mapped = 0;
>> +		struct anon_vma *anon_vma = NULL;
>> +
>> +		__migrate_folio_extract(dst, &page_was_mapped, &anon_vma);
>> +		migrate_folio_undo_src(folio, page_was_mapped, anon_vma,
>> +				       ret_folios);
>> +		list_del(&dst->lru);
>> +		migrate_folio_undo_dst(dst, put_new_page, private);
>> +		dst = dst2;
>> +		dst2 = list_next_entry(dst, lru);
>> +	}
>> +
>>  	/*
>>  	 * Try to migrate split folios of fail-to-migrate large folios, no
>>  	 * nr_failed counting in this round, since all split folios of a
>>  	 * large folio is counted as 1 failure in the first round.
>>  	 */
>> -	if (!list_empty(&split_folios)) {
>> +	if (rc >= 0 && !list_empty(&split_folios)) {
>>  		/*
>>  		 * Move non-migrated folios (after NR_MAX_MIGRATE_PAGES_RETRY
>>  		 * retries) to ret_folios to avoid migrating them again.
>> @@ -1761,12 +1917,16 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>>  		list_splice_init(from, ret_folios);
>>  		list_splice_init(&split_folios, from);
>>  		no_split_folio_counting = true;
>> -		retry = 1;
>> -		goto split_folio_migration;
>> +		goto retry;
>>  	}
>>
>> -	rc = nr_failed + nr_large_failed;
>> -out:
>> +	/*
>> +	 * We have unlocked all locked folios, so we can force lock now, let's
>> +	 * try again.
>> +	 */
>> +	if (rc == -EDEADLOCK)
>> +		goto retry;
>> +
>>  	return rc;
>>  }
>>
>> -- 
>> 2.35.1
>
> After rename the variable (or give it a better name) and add the comments,
> you can add Reviewed-by: Zi Yan <ziy@nvidia.com>
>
> Thanks.
>
> --
> Best Regards,
> Yan, Zi

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2023-02-13  6:56 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-06  6:33 [PATCH -v4 0/9] migrate_pages(): batch TLB flushing Huang Ying
2023-02-06  6:33 ` [PATCH -v4 1/9] migrate_pages: organize stats with struct migrate_pages_stats Huang Ying
2023-02-07 16:28   ` haoxin
2023-02-06  6:33 ` [PATCH -v4 2/9] migrate_pages: separate hugetlb folios migration Huang Ying
2023-02-07 16:42   ` haoxin
2023-02-08 11:35     ` Huang, Ying
2023-02-06  6:33 ` [PATCH -v4 3/9] migrate_pages: restrict number of pages to migrate in batch Huang Ying
2023-02-07 17:01   ` haoxin
2023-02-06  6:33 ` [PATCH -v4 4/9] migrate_pages: split unmap_and_move() to _unmap() and _move() Huang Ying
2023-02-07 17:11   ` haoxin
2023-02-07 17:27     ` haoxin
2023-02-06  6:33 ` [PATCH -v4 5/9] migrate_pages: batch _unmap and _move Huang Ying
2023-02-06 16:10   ` Zi Yan
2023-02-07  5:58     ` Huang, Ying
2023-02-13  6:55     ` Huang, Ying
2023-02-07 17:33   ` haoxin
2023-02-06  6:33 ` [PATCH -v4 6/9] migrate_pages: move migrate_folio_unmap() Huang Ying
2023-02-07 14:40   ` Zi Yan
2023-02-06  6:33 ` [PATCH -v4 7/9] migrate_pages: share more code between _unmap and _move Huang Ying
2023-02-07 14:50   ` Zi Yan
2023-02-08 12:02     ` Huang, Ying
2023-02-08 19:47       ` Zi Yan
2023-02-10  7:09         ` Huang, Ying
2023-02-06  6:33 ` [PATCH -v4 8/9] migrate_pages: batch flushing TLB Huang Ying
2023-02-07 14:52   ` Zi Yan
2023-02-08 11:27     ` Huang, Ying
2023-02-07 17:44   ` haoxin
2023-02-06  6:33 ` [PATCH -v4 9/9] migrate_pages: move THP/hugetlb migration support check to simplify code Huang Ying
2023-02-07 14:53   ` Zi Yan
2023-02-08  6:21 ` [PATCH -v4 0/9] migrate_pages(): batch TLB flushing haoxin
2023-02-08  6:27   ` haoxin
2023-02-08 11:04     ` Jonathan Cameron
2023-02-08 11:25   ` Huang, Ying
