* [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment.
@ 2022-01-05 21:47 Zi Yan
  2022-01-05 21:47 ` [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others Zi Yan
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

Hi all,

This patchset tries to remove the MAX_ORDER - 1 alignment requirement for CMA
and alloc_contig_range(). It prepares for my upcoming changes to make MAX_ORDER
adjustable at boot time[1]. It is on top of mmotm-2021-12-29-20-07.

The MAX_ORDER - 1 alignment requirement comes from the fact that
alloc_contig_range() isolates pageblocks to remove free memory from the buddy
allocator, but isolating only a subset of the pageblocks within a free page
that spans multiple pageblocks causes free page accounting issues: an isolated
page might not be put on the right free list, since the code takes the
migratetype of the first pageblock as the migratetype of the whole free page.
This is based on the discussion at [2].

To remove the requirement, this patchset:
1. still isolates pageblocks at MAX_ORDER - 1 granularity;
2. but saves the pageblock migratetypes outside the specified range of
   alloc_contig_range() and restores them after all pages within the range
   become free after __alloc_contig_migrate_range();
3. only checks unmovable pages within the range instead of the MAX_ORDER - 1
   aligned range during isolation, to avoid alloc_contig_range() failure when
   pageblocks within a MAX_ORDER - 1 aligned range are allocated separately;
4. splits free pages spanning multiple pageblocks at the beginning and the end
   of the range and puts the split pages on the right migratetype free lists
   based on the pageblock migratetypes;
5. returns pages not in the range as it did before.

Isolation needs to be done at MAX_ORDER - 1 granularity, because otherwise
either 1) the size of the to-be-isolated page (free, PageHuge, THP, or other
PageCompound) would have to be detected to make sure all pageblocks belonging
to a single page are isolated together, with the pageblock migratetypes
outside the range restored later, or 2) if isolation happened at pageblock
granularity, a free page with pageblocks of different migratetypes could be
seen in the free page path and would need to be split and freed at pageblock
granularity.
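
For reference, the resulting alloc_contig_range() flow can be sketched as
follows (simplified pseudo-C using the helper names from patch 4; error
handling, locking, and the final head/tail free are omitted; the numbered
comments map to the list items above):

    isolate_start = pfn_max_align_down(start);  /* MAX_ORDER - 1 aligned */
    isolate_end   = pfn_max_align_up(end);
    alloc_start   = ALIGN_DOWN(start, pageblock_nr_pages);
    alloc_end     = ALIGN(end, pageblock_nr_pages);

    /* 2. remember the migratetypes outside [alloc_start, alloc_end) */
    num = save_migratetypes(saved_mt, isolate_start, alloc_start);
    save_migratetypes(&saved_mt[num], alloc_end, isolate_end);

    /* 1. isolate at MAX_ORDER - 1 granularity, then migrate pages away */
    start_isolate_page_range(isolate_start, isolate_end, migratetype, 0);
    __alloc_contig_migrate_range(&cc, start, end);

    /* 2. put the saved migratetypes back */
    restore_migratetypes(saved_mt, ...);

    /* 4. split free pages straddling alloc_start or alloc_end and put
     *    the pageblock_order pieces on the matching free lists */
    split_free_page_into_pageblocks(free_page, order, zone);

    /* grab the now-free range and undo the isolation */
    isolate_freepages_range(&cc, alloc_start, alloc_end);
    undo_isolate_page_range(alloc_start, alloc_end, migratetype);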

One optimization might come later:
1. make MIGRATE_ISOLATE a separate bit to avoid saving and restoring the
   existing migratetypes before and after isolation, respectively (see the
   sketch below).
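
A very rough sketch of what that could look like (purely hypothetical; these
helpers do not exist yet):

    /* hypothetical interface, for illustration only */
    set_pageblock_isolate(page);    /* original migratetype kept intact */
    /* ... alloc_contig_range() does its work ... */
    clear_pageblock_isolate(page);  /* nothing to save or restore */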

Feel free to give comments and suggestions. Thanks.


[1] https://lore.kernel.org/linux-mm/20210805190253.2795604-1-zi.yan@sent.com/
[2] https://lore.kernel.org/linux-mm/d19fb078-cb9b-f60f-e310-fdeea1b947d2@redhat.com/

Zi Yan (8):
  mm: page_alloc: avoid merging non-fallbackable pageblocks with others.
  mm: compaction: handle non-lru compound pages properly in
    isolate_migratepages_block().
  mm: migrate: allocate the right size of non hugetlb or THP compound
    pages.
  mm: make alloc_contig_range work at pageblock granularity
  mm: page_isolation: check specified range for unmovable pages during
    isolation.
  mm: cma: use pageblock_order as the single alignment
  drivers: virtio_mem: use pageblock size as the minimum virtio_mem
    size.
  arch: powerpc: adjust fadump alignment to be pageblock aligned.

 arch/powerpc/include/asm/fadump-internal.h |   4 +-
 drivers/virtio/virtio_mem.c                |   3 +-
 include/linux/mmzone.h                     |  11 +-
 include/linux/page-isolation.h             |   3 +-
 kernel/dma/contiguous.c                    |   2 +-
 mm/cma.c                                   |   6 +-
 mm/compaction.c                            |  10 +-
 mm/memory_hotplug.c                        |  12 +-
 mm/migrate.c                               |  11 +-
 mm/page_alloc.c                            | 328 +++++++++++----------
 mm/page_isolation.c                        | 148 +++++++++-
 11 files changed, 353 insertions(+), 185 deletions(-)

-- 
2.34.1



* [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others.
  2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
@ 2022-01-05 21:47 ` Zi Yan
  2022-01-12 10:54   ` David Hildenbrand
  2022-01-05 21:47 ` [RFC PATCH v3 2/8] mm: compaction: handle non-lru compound pages properly in isolate_migratepages_block() Zi Yan
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

This is done in addition to MIGRATE_ISOLATE pageblock merge avoidance.
It prepares for the upcoming removal of the MAX_ORDER-1 alignment
requirement for CMA and alloc_contig_range().

MIGRATE_HIGHATOMIC should not merge with other migratetypes like
MIGRATE_ISOLATE and MIGRATE_CMA[1], so this commit prevents that too.
Also add MIGRATE_HIGHATOMIC to the fallbacks array for completeness.

[1] https://lore.kernel.org/linux-mm/20211130100853.GP3366@techsingularity.net/
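
To illustrate which migratetypes the new helper treats as having fallbacks,
here is a minimal userspace sketch with a local copy of the relevant part of
enum migratetype (values mirror include/linux/mmzone.h with CONFIG_CMA; the
demo itself is not part of the patch):

    #include <stdio.h>

    enum migratetype {
        MIGRATE_UNMOVABLE,
        MIGRATE_MOVABLE,
        MIGRATE_RECLAIMABLE,
        MIGRATE_PCPTYPES,       /* number of types on the pcp lists */
        MIGRATE_HIGHATOMIC = MIGRATE_PCPTYPES,
        MIGRATE_CMA,
        MIGRATE_ISOLATE,
        MIGRATE_TYPES
    };

    /* same predicate as the patch adds to mmzone.h */
    static int migratetype_has_fallback(int mt)
    {
        return mt < MIGRATE_PCPTYPES;
    }

    int main(void)
    {
        /* only UNMOVABLE, MOVABLE and RECLAIMABLE may fall back/merge */
        printf("MOVABLE: %d\n", migratetype_has_fallback(MIGRATE_MOVABLE));       /* 1 */
        printf("HIGHATOMIC: %d\n", migratetype_has_fallback(MIGRATE_HIGHATOMIC)); /* 0 */
        printf("CMA: %d\n", migratetype_has_fallback(MIGRATE_CMA));               /* 0 */
        return 0;
    }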

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/mmzone.h |  6 ++++++
 mm/page_alloc.c        | 28 ++++++++++++++++++----------
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index aed44e9b5d89..0aa549653e4e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -83,6 +83,12 @@ static inline bool is_migrate_movable(int mt)
 	return is_migrate_cma(mt) || mt == MIGRATE_MOVABLE;
 }
 
+/* See fallbacks[MIGRATE_TYPES][3] in page_alloc.c */
+static inline bool migratetype_has_fallback(int mt)
+{
+	return mt < MIGRATE_PCPTYPES;
+}
+
 #define for_each_migratetype_order(order, type) \
 	for (order = 0; order < MAX_ORDER; order++) \
 		for (type = 0; type < MIGRATE_TYPES; type++)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8dd6399bafb5..5193c953dbf8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1042,6 +1042,12 @@ buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
 	return page_is_buddy(higher_page, higher_buddy, order + 1);
 }
 
+static inline bool has_non_fallback_pageblock(struct zone *zone)
+{
+	return has_isolate_pageblock(zone) || zone_cma_pages(zone) != 0 ||
+		zone->nr_reserved_highatomic != 0;
+}
+
 /*
  * Freeing function for a buddy system allocator.
  *
@@ -1117,14 +1123,15 @@ static inline void __free_one_page(struct page *page,
 	}
 	if (order < MAX_ORDER - 1) {
 		/* If we are here, it means order is >= pageblock_order.
-		 * We want to prevent merge between freepages on isolate
-		 * pageblock and normal pageblock. Without this, pageblock
-		 * isolation could cause incorrect freepage or CMA accounting.
+		 * We want to prevent merge between freepages on pageblock
+		 * without fallbacks and normal pageblock. Without this,
+		 * pageblock isolation could cause incorrect freepage or CMA
+		 * accounting or HIGHATOMIC accounting.
 		 *
 		 * We don't want to hit this code for the more frequent
 		 * low-order merging.
 		 */
-		if (unlikely(has_isolate_pageblock(zone))) {
+		if (unlikely(has_non_fallback_pageblock(zone))) {
 			int buddy_mt;
 
 			buddy_pfn = __find_buddy_pfn(pfn, order);
@@ -1132,8 +1139,8 @@ static inline void __free_one_page(struct page *page,
 			buddy_mt = get_pageblock_migratetype(buddy);
 
 			if (migratetype != buddy_mt
-					&& (is_migrate_isolate(migratetype) ||
-						is_migrate_isolate(buddy_mt)))
+					&& (!migratetype_has_fallback(migratetype) ||
+						!migratetype_has_fallback(buddy_mt)))
 				goto done_merging;
 		}
 		max_order = order + 1;
@@ -2484,6 +2491,7 @@ static int fallbacks[MIGRATE_TYPES][3] = {
 	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,   MIGRATE_TYPES },
 	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_TYPES },
 	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,   MIGRATE_TYPES },
+	[MIGRATE_HIGHATOMIC] = { MIGRATE_TYPES }, /* Never used */
 #ifdef CONFIG_CMA
 	[MIGRATE_CMA]         = { MIGRATE_TYPES }, /* Never used */
 #endif
@@ -2795,8 +2803,8 @@ static void reserve_highatomic_pageblock(struct page *page, struct zone *zone,
 
 	/* Yoink! */
 	mt = get_pageblock_migratetype(page);
-	if (!is_migrate_highatomic(mt) && !is_migrate_isolate(mt)
-	    && !is_migrate_cma(mt)) {
+	/* Only reserve normal pageblock */
+	if (migratetype_has_fallback(mt)) {
 		zone->nr_reserved_highatomic += pageblock_nr_pages;
 		set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
 		move_freepages_block(zone, page, MIGRATE_HIGHATOMIC, NULL);
@@ -3545,8 +3553,8 @@ int __isolate_free_page(struct page *page, unsigned int order)
 		struct page *endpage = page + (1 << order) - 1;
 		for (; page < endpage; page += pageblock_nr_pages) {
 			int mt = get_pageblock_migratetype(page);
-			if (!is_migrate_isolate(mt) && !is_migrate_cma(mt)
-			    && !is_migrate_highatomic(mt))
+			/* Only change normal pageblock */
+			if (migratetype_has_fallback(mt))
 				set_pageblock_migratetype(page,
 							  MIGRATE_MOVABLE);
 		}
-- 
2.34.1



* [RFC PATCH v3 2/8] mm: compaction: handle non-lru compound pages properly in isolate_migratepages_block().
  2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
  2022-01-05 21:47 ` [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others Zi Yan
@ 2022-01-05 21:47 ` Zi Yan
  2022-01-12 11:01   ` David Hildenbrand
  2022-01-05 21:47 ` [RFC PATCH v3 3/8] mm: migrate: allocate the right size of non hugetlb or THP compound pages Zi Yan
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

In isolate_migratepages_block(), a !PageLRU tail page can be encountered
when the page is larger than a pageblock. Use the compound head page for
the checks inside and skip the entire compound page when isolation
succeeds.
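
A quick userspace check of the pfn arithmetic added below (the pfn values
and the order are made up for illustration):

    #include <stdio.h>

    int main(void)
    {
        unsigned long head_pfn = 0x1000;         /* compound head of an order-10 page */
        unsigned int order = 10;                 /* 4MB with 4KB pages, > one pageblock */
        unsigned long page_pfn = head_pfn + 700; /* !PageLRU tail hit by the scan */
        unsigned long low_pfn = page_pfn;

        /* same adjustment as the patch: point low_pfn at the last pfn of
         * the compound page, so the loop's low_pfn++ skips past it */
        low_pfn += (1UL << order) - 1 - (page_pfn - head_pfn);

        printf("last pfn of page: %#lx, low_pfn now: %#lx\n",
               head_pfn + (1UL << order) - 1, low_pfn);  /* both 0x13ff */
        return 0;
    }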

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/compaction.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index b4e94cda3019..ad9053fbbe06 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -979,19 +979,23 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		 * Skip any other type of page
 		 */
 		if (!PageLRU(page)) {
+			struct page *head = compound_head(page);
 			/*
 			 * __PageMovable can return false positive so we need
 			 * to verify it under page_lock.
 			 */
-			if (unlikely(__PageMovable(page)) &&
-					!PageIsolated(page)) {
+			if (unlikely(__PageMovable(head)) &&
+					!PageIsolated(head)) {
 				if (locked) {
 					unlock_page_lruvec_irqrestore(locked, flags);
 					locked = NULL;
 				}
 
-				if (!isolate_movable_page(page, isolate_mode))
+				if (!isolate_movable_page(head, isolate_mode)) {
+					low_pfn += (1 << compound_order(head)) - 1 - (page - head);
+					page = head;
 					goto isolate_success;
+				}
 			}
 
 			goto isolate_fail;
-- 
2.34.1



* [RFC PATCH v3 3/8] mm: migrate: allocate the right size of non hugetlb or THP compound pages.
  2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
  2022-01-05 21:47 ` [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others Zi Yan
  2022-01-05 21:47 ` [RFC PATCH v3 2/8] mm: compaction: handle non-lru compound pages properly in isolate_migratepages_block() Zi Yan
@ 2022-01-05 21:47 ` Zi Yan
  2022-01-12 11:04   ` David Hildenbrand
  2022-01-05 21:47 ` [RFC PATCH v3 4/8] mm: make alloc_contig_range work at pageblock granularity Zi Yan
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

alloc_migration_target() is used by alloc_contig_range(), and non-LRU
movable compound pages can be migrated. The current code does not allocate
the right page size for such pages. Check for THP precisely using
is_transparent_hugepage() and add allocation support for non-LRU compound
pages.
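
Condensed, the order selection in alloc_migration_target() after this patch
becomes (gfp-mask details omitted; see the diff below for the real code):

    if (PageHuge(page))
        /* hugetlb: handled by alloc_huge_page_nodemask() */;
    else if (is_transparent_hugepage(page))
        order = HPAGE_PMD_ORDER;        /* THP */
    else if (PageCompound(page)) {
        gfp_mask |= __GFP_COMP;         /* non-LRU movable compound page */
        order = compound_order(page);   /* allocate the matching size */
    }
    /* else: order stays 0 for a base page */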

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/migrate.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index c7da064b4781..b1851ffb8576 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1546,9 +1546,7 @@ struct page *alloc_migration_target(struct page *page, unsigned long private)
 
 		gfp_mask = htlb_modify_alloc_mask(h, gfp_mask);
 		return alloc_huge_page_nodemask(h, nid, mtc->nmask, gfp_mask);
-	}
-
-	if (PageTransHuge(page)) {
+	} else if (is_transparent_hugepage(page)) {
 		/*
 		 * clear __GFP_RECLAIM to make the migration callback
 		 * consistent with regular THP allocations.
@@ -1556,14 +1554,19 @@ struct page *alloc_migration_target(struct page *page, unsigned long private)
 		gfp_mask &= ~__GFP_RECLAIM;
 		gfp_mask |= GFP_TRANSHUGE;
 		order = HPAGE_PMD_ORDER;
+	} else if (PageCompound(page)) {
+		/* for non-LRU movable compound pages */
+		gfp_mask |= __GFP_COMP;
+		order = compound_order(page);
 	}
+
 	zidx = zone_idx(page_zone(page));
 	if (is_highmem_idx(zidx) || zidx == ZONE_MOVABLE)
 		gfp_mask |= __GFP_HIGHMEM;
 
 	new_page = __alloc_pages(gfp_mask, order, nid, mtc->nmask);
 
-	if (new_page && PageTransHuge(new_page))
+	if (new_page && is_transparent_hugepage(page))
 		prep_transhuge_page(new_page);
 
 	return new_page;
-- 
2.34.1



* [RFC PATCH v3 4/8] mm: make alloc_contig_range work at pageblock granularity
  2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
                   ` (2 preceding siblings ...)
  2022-01-05 21:47 ` [RFC PATCH v3 3/8] mm: migrate: allocate the right size of non hugetlb or THP compound pages Zi Yan
@ 2022-01-05 21:47 ` Zi Yan
  2022-01-05 21:47 ` [RFC PATCH v3 5/8] mm: page_isolation: check specified range for unmovable pages during isolation Zi Yan
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

alloc_contig_range() worked at MAX_ORDER-1 granularity to avoid merging
pageblocks with different migratetypes. It might unnecessarily convert
extra pageblocks at the beginning and at the end of the range. Change
alloc_contig_range() to work at pageblock granularity.

This is done by restoring pageblock types and splitting >pageblock_order
free pages after isolating at MAX_ORDER-1 granularity and migrating pages
away at pageblock granularity. The reason for this process is that during
isolation, some pages, either free or in-use, might span multiple
pageblocks, and isolating only part of them can cause free page accounting
issues. Restoring the migratetypes of the pageblocks not in the interesting
range later is much easier.
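
A small userspace example of the two alignments involved (the pfn values and
the x86-64-like constants are assumptions for illustration):

    #include <stdio.h>

    #define ALIGN_DOWN(x, a)  ((x) & ~((a) - 1))
    #define ALIGN(x, a)       (((x) + (a) - 1) & ~((a) - 1))

    int main(void)
    {
        unsigned long pageblock_nr_pages = 512;       /* 2MB pageblocks */
        unsigned long max_order_nr_pages = 1024;      /* MAX_ORDER = 11 */
        unsigned long start = 0x12345, end = 0x15000; /* requested pfns */

        unsigned long isolate_start = ALIGN_DOWN(start, max_order_nr_pages);
        unsigned long isolate_end = ALIGN(end, max_order_nr_pages);
        unsigned long alloc_start = ALIGN_DOWN(start, pageblock_nr_pages);
        unsigned long alloc_end = ALIGN(end, pageblock_nr_pages);

        /* isolation still covers [0x12000, 0x15000), but only the
         * migratetypes of [isolate_start, alloc_start) and
         * [alloc_end, isolate_end) have to be saved and restored */
        printf("isolate [%#lx, %#lx), alloc [%#lx, %#lx)\n",
               isolate_start, isolate_end, alloc_start, alloc_end);
        return 0;
    }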

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/page_alloc.c | 174 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 154 insertions(+), 20 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5193c953dbf8..e1c09ae54e31 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8986,8 +8986,8 @@ struct page *has_unmovable_pages(struct zone *zone, struct page *page,
 #ifdef CONFIG_CONTIG_ALLOC
 static unsigned long pfn_max_align_down(unsigned long pfn)
 {
-	return pfn & ~(max_t(unsigned long, MAX_ORDER_NR_PAGES,
-			     pageblock_nr_pages) - 1);
+	return ALIGN_DOWN(pfn, max_t(unsigned long, MAX_ORDER_NR_PAGES,
+				     pageblock_nr_pages));
 }
 
 static unsigned long pfn_max_align_up(unsigned long pfn)
@@ -9076,6 +9076,52 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	return 0;
 }
 
+static inline int save_migratetypes(unsigned char *migratetypes,
+				unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long pfn = start_pfn;
+	int num = 0;
+
+	while (pfn < end_pfn) {
+		migratetypes[num] = get_pageblock_migratetype(pfn_to_page(pfn));
+		num++;
+		pfn += pageblock_nr_pages;
+	}
+	return num;
+}
+
+static inline int restore_migratetypes(unsigned char *migratetypes,
+				unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long pfn = start_pfn;
+	int num = 0;
+
+	while (pfn < end_pfn) {
+		set_pageblock_migratetype(pfn_to_page(pfn), migratetypes[num]);
+		num++;
+		pfn += pageblock_nr_pages;
+	}
+	return num;
+}
+
+static inline void split_free_page_into_pageblocks(struct page *free_page,
+				int order, struct zone *zone)
+{
+	unsigned long pfn;
+
+	spin_lock(&zone->lock);
+	del_page_from_free_list(free_page, zone, order);
+	for (pfn = page_to_pfn(free_page);
+	     pfn < page_to_pfn(free_page) + (1UL << order);
+	     pfn += pageblock_nr_pages) {
+		int mt = get_pfnblock_migratetype(pfn_to_page(pfn), pfn);
+
+		__free_one_page(pfn_to_page(pfn), pfn, zone, pageblock_order,
+				mt, FPI_NONE);
+	}
+	spin_unlock(&zone->lock);
+}
+
 /**
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
@@ -9101,8 +9147,15 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 		       unsigned migratetype, gfp_t gfp_mask)
 {
 	unsigned long outer_start, outer_end;
+	unsigned long isolate_start = pfn_max_align_down(start);
+	unsigned long isolate_end = pfn_max_align_up(end);
+	unsigned long alloc_start = ALIGN_DOWN(start, pageblock_nr_pages);
+	unsigned long alloc_end = ALIGN(end, pageblock_nr_pages);
+	unsigned long num_pageblock_to_save;
 	unsigned int order;
 	int ret = 0;
+	unsigned char *saved_mt;
+	int num;
 
 	struct compact_control cc = {
 		.nr_migratepages = 0,
@@ -9116,11 +9169,30 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	};
 	INIT_LIST_HEAD(&cc.migratepages);
 
+	/*
+	 * TODO: make MIGRATE_ISOLATE a standalone bit to avoid overwriting
+	 * the existing migratetype. Then, we will not need the save and restore
+	 * process here.
+	 */
+
+	/* Save the migratetypes of the pageblocks before start and after end */
+	num_pageblock_to_save = (alloc_start - isolate_start) / pageblock_nr_pages
+				+ (isolate_end - alloc_end) / pageblock_nr_pages;
+	saved_mt =
+		kmalloc_array(num_pageblock_to_save,
+			      sizeof(unsigned char), GFP_KERNEL);
+	if (!saved_mt)
+		return -ENOMEM;
+
+	num = save_migratetypes(saved_mt, isolate_start, alloc_start);
+
+	num = save_migratetypes(&saved_mt[num], alloc_end, isolate_end);
+
 	/*
 	 * What we do here is we mark all pageblocks in range as
 	 * MIGRATE_ISOLATE.  Because pageblock and max order pages may
 	 * have different sizes, and due to the way page allocator
-	 * work, we align the range to biggest of the two pages so
+	 * work, we align the isolation range to biggest of the two so
 	 * that page allocator won't try to merge buddies from
 	 * different pageblocks and change MIGRATE_ISOLATE to some
 	 * other migration type.
@@ -9130,6 +9202,20 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 * we are interested in).  This will put all the pages in
 	 * range back to page allocator as MIGRATE_ISOLATE.
 	 *
+	 * Afterwards, we restore the migratetypes of the pageblocks not
+	 * in range, split free pages spanning outside the range,
+	 * and put split free pages (at pageblock_order) to the right
+	 * migratetype list.
+	 *
+	 * NOTE: the above approach is used because isolating only a
+	 * subset of the pageblocks of a page, either free or in-use,
+	 * that spans multiple pageblocks can cause free page
+	 * accounting issues. For example, if only the second
+	 * pageblock is isolated from a page with 2 pageblocks, after
+	 * the page is free, it will be put in the first pageblock
+	 * migratetype list instead of having 2 pageblocks in two
+	 * separate migratetype lists.
+	 *
 	 * When this is done, we take the pages in range from page
 	 * allocator removing them from the buddy system.  This way
 	 * page allocator will never consider using them.
@@ -9140,10 +9226,9 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 * put back to page allocator so that buddy can use them.
 	 */
 
-	ret = start_isolate_page_range(pfn_max_align_down(start),
-				       pfn_max_align_up(end), migratetype, 0);
+	ret = start_isolate_page_range(isolate_start, isolate_end, migratetype, 0);
 	if (ret)
-		return ret;
+		goto done;
 
 	drain_all_pages(cc.zone);
 
@@ -9179,6 +9264,19 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 * isolated thus they won't get removed from buddy.
 	 */
 
+	/*
+	 * Restore migratetypes of pageblocks outside [start, end)
+	 * TODO: remove it when MIGRATE_ISOLATE becomes a standalone bit
+	 */
+
+	num = restore_migratetypes(saved_mt, isolate_start, alloc_start);
+
+	num = restore_migratetypes(&saved_mt[num], alloc_end, isolate_end);
+
+	/*
+	 * Split free page spanning [isolate_start, alloc_start) and put the
+	 * pageblocks in the right migratetype lists.
+	 */
 	order = 0;
 	outer_start = start;
 	while (!PageBuddy(pfn_to_page(outer_start))) {
@@ -9193,37 +9291,73 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 		order = buddy_order(pfn_to_page(outer_start));
 
 		/*
-		 * outer_start page could be small order buddy page and
-		 * it doesn't include start page. Adjust outer_start
-		 * in this case to report failed page properly
-		 * on tracepoint in test_pages_isolated()
+		 * split the free page that contains the start page and put
+		 * the pageblocks in the right migratetype lists
 		 */
-		if (outer_start + (1UL << order) <= start)
-			outer_start = start;
+		if (outer_start + (1UL << order) > start) {
+			struct page *free_page = pfn_to_page(outer_start);
+
+			split_free_page_into_pageblocks(free_page, order, cc.zone);
+		}
+	}
+
+	/*
+	 * Split free page spanning [alloc_end, isolate_end) and put the
+	 * pageblocks in the right migratetype list
+	 */
+	for (outer_end = alloc_end; outer_end < isolate_end;) {
+		unsigned long begin_pfn = outer_end;
+
+		order = 0;
+		while (!PageBuddy(pfn_to_page(outer_end))) {
+			if (++order >= MAX_ORDER) {
+				outer_end = begin_pfn;
+				break;
+			}
+			outer_end &= ~0UL << order;
+		}
+
+		if (outer_end != begin_pfn) {
+			order = buddy_order(pfn_to_page(outer_end));
+
+			/*
+			 * split the free page that contains begin_pfn and put
+			 * the pageblocks in the right migratetype lists
+			 */
+			VM_BUG_ON(outer_end + (1UL << order) <= begin_pfn);
+			{
+				struct page *free_page = pfn_to_page(outer_end);
+
+				split_free_page_into_pageblocks(free_page, order, cc.zone);
+			}
+			outer_end += 1UL << order;
+		} else
+			outer_end = begin_pfn + 1;
 	}
 
 	/* Make sure the range is really isolated. */
-	if (test_pages_isolated(outer_start, end, 0)) {
+	if (test_pages_isolated(alloc_start, alloc_end, 0)) {
 		ret = -EBUSY;
 		goto done;
 	}
 
 	/* Grab isolated pages from freelists. */
-	outer_end = isolate_freepages_range(&cc, outer_start, end);
+	outer_end = isolate_freepages_range(&cc, alloc_start, alloc_end);
 	if (!outer_end) {
 		ret = -EBUSY;
 		goto done;
 	}
 
 	/* Free head and tail (if any) */
-	if (start != outer_start)
-		free_contig_range(outer_start, start - outer_start);
-	if (end != outer_end)
-		free_contig_range(end, outer_end - end);
+	if (start != alloc_start)
+		free_contig_range(alloc_start, start - alloc_start);
+	if (end != alloc_end)
+		free_contig_range(end, alloc_end - end);
 
 done:
-	undo_isolate_page_range(pfn_max_align_down(start),
-				pfn_max_align_up(end), migratetype);
+	kfree(saved_mt);
+	undo_isolate_page_range(alloc_start,
+				alloc_end, migratetype);
 	return ret;
 }
 EXPORT_SYMBOL(alloc_contig_range);
-- 
2.34.1



* [RFC PATCH v3 5/8] mm: page_isolation: check specified range for unmovable pages during isolation.
  2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
                   ` (3 preceding siblings ...)
  2022-01-05 21:47 ` [RFC PATCH v3 4/8] mm: make alloc_contig_range work at pageblock granularity Zi Yan
@ 2022-01-05 21:47 ` Zi Yan
  2022-01-14 13:38   ` David Hildenbrand
  2022-01-05 21:47 ` [RFC PATCH v3 6/8] mm: cma: use pageblock_order as the single alignment Zi Yan
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

Enable set_migratetype_isolate() to check a specified sub-range for
unmovable pages during isolation. Page isolation is done at
max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) granularity, but not all
pages within that granularity are intended to be isolated. For example,
alloc_contig_range(), which uses page isolation, allows ranges without
alignment. This commit makes the unmovable page check only look at the
pages of interest, so that page isolation can succeed for any
non-overlapping ranges.

has_unmovable_pages() is moved to mm/page_isolation.c since it is only
used by page isolation.
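
The effect of the new check can be demonstrated with a small userspace
sketch of the clamping done at the top of the reworked has_unmovable_pages()
(pfn values and the pageblock size are assumptions for illustration):

    #include <stdio.h>

    #define MAX(a, b)   ((a) > (b) ? (a) : (b))
    #define MIN(a, b)   ((a) < (b) ? (a) : (b))
    #define ALIGN(x, a) (((x) + (a) - 1) & ~((unsigned long)(a) - 1))

    int main(void)
    {
        unsigned long pageblock_nr_pages = 512;  /* 2MB pageblocks */
        unsigned long block_pfn = 0x12000;       /* first pfn of the pageblock */
        unsigned long start_pfn = 0x12345;       /* caller's range start */
        unsigned long end_pfn = 0x15000;         /* caller's range end */

        /* same clamping as the patch: only scan the pfns of this
         * pageblock that intersect [start_pfn, end_pfn) */
        unsigned long first = MAX(block_pfn, start_pfn);
        unsigned long last = MIN(ALIGN(first + 1, pageblock_nr_pages), end_pfn);

        printf("scan [%#lx, %#lx) instead of [%#lx, %#lx)\n",
               first, last, block_pfn, block_pfn + pageblock_nr_pages);
        return 0;
    }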

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/page-isolation.h |   3 +-
 mm/memory_hotplug.c            |  12 ++-
 mm/page_alloc.c                | 122 +--------------------------
 mm/page_isolation.c            | 148 +++++++++++++++++++++++++++++++--
 4 files changed, 153 insertions(+), 132 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 572458016331..a4d2687ed4e6 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -33,8 +33,6 @@ static inline bool is_migrate_isolate(int migratetype)
 #define MEMORY_OFFLINE	0x1
 #define REPORT_FAILURE	0x2
 
-struct page *has_unmovable_pages(struct zone *zone, struct page *page,
-				 int migratetype, int flags);
 void set_pageblock_migratetype(struct page *page, int migratetype);
 int move_freepages_block(struct zone *zone, struct page *page,
 				int migratetype, int *num_movable);
@@ -44,6 +42,7 @@ int move_freepages_block(struct zone *zone, struct page *page,
  */
 int
 start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
+			 unsigned long isolate_start, unsigned long isolate_end,
 			 unsigned migratetype, int flags);
 
 /*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0139b77c51d5..5db84c3fa882 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1901,8 +1901,18 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	zone_pcp_disable(zone);
 	lru_cache_disable();
 
-	/* set above range as isolated */
+	/*
+	 * set above range as isolated
+	 *
+	 * start_pfn and end_pfn are the same as isolate_start and isolate_end,
+	 * because start_pfn and end_pfn are already PAGES_PER_SECTION
+	 * (>= MAX_ORDER_NR_PAGES) aligned; if start_pfn is
+	 * pageblock_nr_pages aligned in memmap_on_memory case, there is no
+	 * need to isolate pages before start_pfn, since they are used by
+	 * memmap thus not user visible.
+	 */
 	ret = start_isolate_page_range(start_pfn, end_pfn,
+				       start_pfn, end_pfn,
 				       MIGRATE_MOVABLE,
 				       MEMORY_OFFLINE | REPORT_FAILURE);
 	if (ret) {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e1c09ae54e31..faee7637740a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8864,125 +8864,6 @@ void *__init alloc_large_system_hash(const char *tablename,
 	return table;
 }
 
-/*
- * This function checks whether pageblock includes unmovable pages or not.
- *
- * PageLRU check without isolation or lru_lock could race so that
- * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
- * check without lock_page also may miss some movable non-lru pages at
- * race condition. So you can't expect this function should be exact.
- *
- * Returns a page without holding a reference. If the caller wants to
- * dereference that page (e.g., dumping), it has to make sure that it
- * cannot get removed (e.g., via memory unplug) concurrently.
- *
- */
-struct page *has_unmovable_pages(struct zone *zone, struct page *page,
-				 int migratetype, int flags)
-{
-	unsigned long iter = 0;
-	unsigned long pfn = page_to_pfn(page);
-	unsigned long offset = pfn % pageblock_nr_pages;
-
-	if (is_migrate_cma_page(page)) {
-		/*
-		 * CMA allocations (alloc_contig_range) really need to mark
-		 * isolate CMA pageblocks even when they are not movable in fact
-		 * so consider them movable here.
-		 */
-		if (is_migrate_cma(migratetype))
-			return NULL;
-
-		return page;
-	}
-
-	for (; iter < pageblock_nr_pages - offset; iter++) {
-		page = pfn_to_page(pfn + iter);
-
-		/*
-		 * Both, bootmem allocations and memory holes are marked
-		 * PG_reserved and are unmovable. We can even have unmovable
-		 * allocations inside ZONE_MOVABLE, for example when
-		 * specifying "movablecore".
-		 */
-		if (PageReserved(page))
-			return page;
-
-		/*
-		 * If the zone is movable and we have ruled out all reserved
-		 * pages then it should be reasonably safe to assume the rest
-		 * is movable.
-		 */
-		if (zone_idx(zone) == ZONE_MOVABLE)
-			continue;
-
-		/*
-		 * Hugepages are not in LRU lists, but they're movable.
-		 * THPs are on the LRU, but need to be counted as #small pages.
-		 * We need not scan over tail pages because we don't
-		 * handle each tail page individually in migration.
-		 */
-		if (PageHuge(page) || PageTransCompound(page)) {
-			struct page *head = compound_head(page);
-			unsigned int skip_pages;
-
-			if (PageHuge(page)) {
-				if (!hugepage_migration_supported(page_hstate(head)))
-					return page;
-			} else if (!PageLRU(head) && !__PageMovable(head)) {
-				return page;
-			}
-
-			skip_pages = compound_nr(head) - (page - head);
-			iter += skip_pages - 1;
-			continue;
-		}
-
-		/*
-		 * We can't use page_count without pin a page
-		 * because another CPU can free compound page.
-		 * This check already skips compound tails of THP
-		 * because their page->_refcount is zero at all time.
-		 */
-		if (!page_ref_count(page)) {
-			if (PageBuddy(page))
-				iter += (1 << buddy_order(page)) - 1;
-			continue;
-		}
-
-		/*
-		 * The HWPoisoned page may be not in buddy system, and
-		 * page_count() is not 0.
-		 */
-		if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
-			continue;
-
-		/*
-		 * We treat all PageOffline() pages as movable when offlining
-		 * to give drivers a chance to decrement their reference count
-		 * in MEM_GOING_OFFLINE in order to indicate that these pages
-		 * can be offlined as there are no direct references anymore.
-		 * For actually unmovable PageOffline() where the driver does
-		 * not support this, we will fail later when trying to actually
-		 * move these pages that still have a reference count > 0.
-		 * (false negatives in this function only)
-		 */
-		if ((flags & MEMORY_OFFLINE) && PageOffline(page))
-			continue;
-
-		if (__PageMovable(page) || PageLRU(page))
-			continue;
-
-		/*
-		 * If there are RECLAIMABLE pages, we need to check
-		 * it.  But now, memory offline itself doesn't call
-		 * shrink_node_slabs() and it still to be fixed.
-		 */
-		return page;
-	}
-	return NULL;
-}
-
 #ifdef CONFIG_CONTIG_ALLOC
 static unsigned long pfn_max_align_down(unsigned long pfn)
 {
@@ -9226,7 +9107,8 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 * put back to page allocator so that buddy can use them.
 	 */
 
-	ret = start_isolate_page_range(isolate_start, isolate_end, migratetype, 0);
+	ret = start_isolate_page_range(start, end, isolate_start, isolate_end,
+				       migratetype, 0);
 	if (ret)
 		goto done;
 
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 6a0ddda6b3c5..7a7991460eb9 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -15,12 +15,143 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/page_isolation.h>
 
-static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
+/*
+ * This function checks whether pageblock within [start_pfn, end_pfn) includes
+ * unmovable pages or not.
+ *
+ * PageLRU check without isolation or lru_lock could race so that
+ * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
+ * check without lock_page also may miss some movable non-lru pages at
+ * race condition. So you can't expect this function should be exact.
+ *
+ * Returns a page without holding a reference. If the caller wants to
+ * dereference that page (e.g., dumping), it has to make sure that it
+ * cannot get removed (e.g., via memory unplug) concurrently.
+ *
+ */
+static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
+				 int migratetype, int flags,
+				 unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long first_pfn = max(page_to_pfn(page), start_pfn);
+	unsigned long pfn = first_pfn;
+	unsigned long last_pfn = min(ALIGN(pfn + 1, pageblock_nr_pages), end_pfn);
+
+	page = pfn_to_page(pfn);
+
+	if (is_migrate_cma_page(page)) {
+		/*
+		 * CMA allocations (alloc_contig_range) really need to mark
+		 * isolate CMA pageblocks even when they are not movable in fact
+		 * so consider them movable here.
+		 */
+		if (is_migrate_cma(migratetype))
+			return NULL;
+
+		return page;
+	}
+
+	for (pfn = first_pfn; pfn < last_pfn; pfn++) {
+		page = pfn_to_page(pfn);
+
+		/*
+		 * Both, bootmem allocations and memory holes are marked
+		 * PG_reserved and are unmovable. We can even have unmovable
+		 * allocations inside ZONE_MOVABLE, for example when
+		 * specifying "movablecore".
+		 */
+		if (PageReserved(page))
+			return page;
+
+		/*
+		 * If the zone is movable and we have ruled out all reserved
+		 * pages then it should be reasonably safe to assume the rest
+		 * is movable.
+		 */
+		if (zone_idx(zone) == ZONE_MOVABLE)
+			continue;
+
+		/*
+		 * Hugepages are not in LRU lists, but they're movable.
+		 * THPs are on the LRU, but need to be counted as #small pages.
+		 * We need not scan over tail pages because we don't
+		 * handle each tail page individually in migration.
+		 */
+		if (PageHuge(page) || PageTransCompound(page)) {
+			struct page *head = compound_head(page);
+			unsigned int skip_pages;
+
+			if (PageHuge(page)) {
+				if (!hugepage_migration_supported(page_hstate(head)))
+					return page;
+			} else if (!PageLRU(head) && !__PageMovable(head)) {
+				return page;
+			}
+
+			skip_pages = compound_nr(head) - (page - head);
+			pfn += skip_pages - 1;
+			continue;
+		}
+
+		/*
+		 * We can't use page_count without pin a page
+		 * because another CPU can free compound page.
+		 * This check already skips compound tails of THP
+		 * because their page->_refcount is zero at all time.
+		 */
+		if (!page_ref_count(page)) {
+			if (PageBuddy(page))
+				pfn += (1 << buddy_order(page)) - 1;
+			continue;
+		}
+
+		/*
+		 * The HWPoisoned page may be not in buddy system, and
+		 * page_count() is not 0.
+		 */
+		if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
+			continue;
+
+		/*
+		 * We treat all PageOffline() pages as movable when offlining
+		 * to give drivers a chance to decrement their reference count
+		 * in MEM_GOING_OFFLINE in order to indicate that these pages
+		 * can be offlined as there are no direct references anymore.
+		 * For actually unmovable PageOffline() where the driver does
+		 * not support this, we will fail later when trying to actually
+		 * move these pages that still have a reference count > 0.
+		 * (false negatives in this function only)
+		 */
+		if ((flags & MEMORY_OFFLINE) && PageOffline(page))
+			continue;
+
+		if (__PageMovable(page) || PageLRU(page))
+			continue;
+
+		/*
+		 * If there are RECLAIMABLE pages, we need to check
+		 * it.  But now, memory offline itself doesn't call
+		 * shrink_node_slabs() and it still to be fixed.
+		 */
+		return page;
+	}
+	return NULL;
+}
+
+/*
+ * This function sets the pageblock migratetype to isolate if no unmovable page is
+ * present in [start_pfn, end_pfn). The pageblock must be within
+ * [start_pfn, end_pfn).
+ */
+static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags,
+			unsigned long start_pfn, unsigned long end_pfn)
 {
 	struct zone *zone = page_zone(page);
 	struct page *unmovable;
 	unsigned long flags;
 
+	VM_BUG_ON(page_to_pfn(page) < start_pfn || page_to_pfn(page) >= end_pfn);
+
 	spin_lock_irqsave(&zone->lock, flags);
 
 	/*
@@ -37,7 +168,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
 	 * We just check MOVABLE pages.
 	 */
-	unmovable = has_unmovable_pages(zone, page, migratetype, isol_flags);
+	unmovable = has_unmovable_pages(zone, page, migratetype, isol_flags, start_pfn, end_pfn);
 	if (!unmovable) {
 		unsigned long nr_pages;
 		int mt = get_pageblock_migratetype(page);
@@ -185,20 +316,19 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * Return: 0 on success and -EBUSY if any part of range cannot be isolated.
  */
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
+			     unsigned long isolate_start, unsigned long isolate_end,
 			     unsigned migratetype, int flags)
 {
 	unsigned long pfn;
 	struct page *page;
 
-	BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
-	BUG_ON(!IS_ALIGNED(end_pfn, pageblock_nr_pages));
-
-	for (pfn = start_pfn;
-	     pfn < end_pfn;
+	for (pfn = isolate_start;
+	     pfn < isolate_end;
 	     pfn += pageblock_nr_pages) {
 		page = __first_valid_page(pfn, pageblock_nr_pages);
-		if (page && set_migratetype_isolate(page, migratetype, flags)) {
-			undo_isolate_page_range(start_pfn, pfn, migratetype);
+		if (page && set_migratetype_isolate(page, migratetype, flags,
+					start_pfn, end_pfn)) {
+			undo_isolate_page_range(isolate_start, pfn, migratetype);
 			return -EBUSY;
 		}
 	}
-- 
2.34.1



* [RFC PATCH v3 6/8] mm: cma: use pageblock_order as the single alignment
  2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
                   ` (4 preceding siblings ...)
  2022-01-05 21:47 ` [RFC PATCH v3 5/8] mm: page_isolation: check specified range for unmovable pages during isolation Zi Yan
@ 2022-01-05 21:47 ` Zi Yan
  2022-01-05 21:47 ` [RFC PATCH v3 7/8] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size Zi Yan
  2022-01-05 21:47 ` [RFC PATCH v3 8/8] arch: powerpc: adjust fadump alignment to be pageblock aligned Zi Yan
  7 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

Now alloc_contig_range() works at pageblock granularity. Change CMA
allocation, which uses alloc_contig_range(), to use pageblock_order
alignment.
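
For a concrete sense of the change, assuming 4KB pages with pageblock_order 9
and MAX_ORDER 11 (typical x86-64 defaults; an assumption, not taken from the
patch):

    #include <stdio.h>

    int main(void)
    {
        unsigned long page_size = 4096;
        int pageblock_order = 9, max_order = 11;
        int old_order = (max_order - 1) > pageblock_order ?
                                (max_order - 1) : pageblock_order;

        /* minimum CMA region alignment before and after the patch */
        printf("before: %lu KB\n", (page_size << old_order) / 1024);        /* 4096 KB */
        printf("after:  %lu KB\n", (page_size << pageblock_order) / 1024);  /* 2048 KB */
        return 0;
    }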

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/mmzone.h  | 5 +----
 kernel/dma/contiguous.c | 2 +-
 mm/cma.c                | 6 ++----
 mm/page_alloc.c         | 6 +++---
 4 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0aa549653e4e..d28a02a893d6 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -54,10 +54,7 @@ enum migratetype {
 	 *
 	 * The way to use it is to change migratetype of a range of
 	 * pageblocks to MIGRATE_CMA which can be done by
-	 * __free_pageblock_cma() function.  What is important though
-	 * is that a range of pageblocks must be aligned to
-	 * MAX_ORDER_NR_PAGES should biggest page be bigger than
-	 * a single pageblock.
+	 * __free_pageblock_cma() function.
 	 */
 	MIGRATE_CMA,
 #endif
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 3d63d91cba5c..ac35b14b0786 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -399,7 +399,7 @@ static const struct reserved_mem_ops rmem_cma_ops = {
 
 static int __init rmem_cma_setup(struct reserved_mem *rmem)
 {
-	phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
+	phys_addr_t align = PAGE_SIZE << pageblock_order;
 	phys_addr_t mask = align - 1;
 	unsigned long node = rmem->fdt_node;
 	bool default_cma = of_get_flat_dt_prop(node, "linux,cma-default", NULL);
diff --git a/mm/cma.c b/mm/cma.c
index bc9ca8f3c487..d171158bd418 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -180,8 +180,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 		return -EINVAL;
 
 	/* ensure minimal alignment required by mm core */
-	alignment = PAGE_SIZE <<
-			max_t(unsigned long, MAX_ORDER - 1, pageblock_order);
+	alignment = PAGE_SIZE << pageblock_order;
 
 	/* alignment should be aligned with order_per_bit */
 	if (!IS_ALIGNED(alignment >> PAGE_SHIFT, 1 << order_per_bit))
@@ -268,8 +267,7 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 	 * migratetype page by page allocator's buddy algorithm. In the case,
 	 * you couldn't get a contiguous memory, which is not what we want.
 	 */
-	alignment = max(alignment,  (phys_addr_t)PAGE_SIZE <<
-			  max_t(unsigned long, MAX_ORDER - 1, pageblock_order));
+	alignment = max(alignment,  (phys_addr_t)PAGE_SIZE << pageblock_order);
 	if (fixed && base & (alignment - 1)) {
 		ret = -EINVAL;
 		pr_err("Region at %pa must be aligned to %pa bytes\n",
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index faee7637740a..63d76f436ed1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -9013,8 +9013,8 @@ static inline void split_free_page_into_pageblocks(struct page *free_page,
  *			be either of the two.
  * @gfp_mask:	GFP mask to use during compaction
  *
- * The PFN range does not have to be pageblock or MAX_ORDER_NR_PAGES
- * aligned.  The PFN range must belong to a single zone.
+ * The PFN range does not have to be pageblock aligned. The PFN range must
+ * belong to a single zone.
  *
  * The first thing this routine does is attempt to MIGRATE_ISOLATE all
  * pageblocks in the range.  Once isolated, the pageblocks should not
@@ -9130,7 +9130,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	ret = 0;
 
 	/*
-	 * Pages from [start, end) are within a MAX_ORDER_NR_PAGES
+	 * Pages from [start, end) are within a pageblock_nr_pages
 	 * aligned blocks that are marked as MIGRATE_ISOLATE.  What's
 	 * more, all pages in [start, end) are free in page allocator.
 	 * What we are going to do is to allocate all pages from
-- 
2.34.1



* [RFC PATCH v3 7/8] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size.
  2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
                   ` (5 preceding siblings ...)
  2022-01-05 21:47 ` [RFC PATCH v3 6/8] mm: cma: use pageblock_order as the single alignment Zi Yan
@ 2022-01-05 21:47 ` Zi Yan
  2022-01-14 13:44   ` David Hildenbrand
  2022-01-05 21:47 ` [RFC PATCH v3 8/8] arch: powerpc: adjust fadump alignment to be pageblock aligned Zi Yan
  7 siblings, 1 reply; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

alloc_contig_range() now only needs to be aligned to pageblock_order, so
drop the virtio_mem requirement that its minimum subblock size be the max
of pageblock_nr_pages and MAX_ORDER_NR_PAGES.
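
Assuming 4KB pages, pageblock_order 9, and MAX_ORDER 11 (typical x86-64
values, not taken from the patch), the minimum subblock size works out as:

    /* before: sb_size = max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) * PAGE_SIZE
     *                 = max(1024, 512) * 4KB = 4MB
     * after:  sb_size = pageblock_nr_pages * PAGE_SIZE
     *                 = 512 * 4KB = 2MB
     * (still subject to the device_block_size and memory block size
     * checks that follow in virtio_mem_init_hotplug()) */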

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 drivers/virtio/virtio_mem.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index a6a78685cfbe..2664dc16d0f9 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2481,8 +2481,7 @@ static int virtio_mem_init_hotplug(struct virtio_mem *vm)
 	 * - Is required for now for alloc_contig_range() to work reliably -
 	 *   it doesn't properly handle smaller granularity on ZONE_NORMAL.
 	 */
-	sb_size = max_t(uint64_t, MAX_ORDER_NR_PAGES,
-			pageblock_nr_pages) * PAGE_SIZE;
+	sb_size = pageblock_nr_pages * PAGE_SIZE;
 	sb_size = max_t(uint64_t, vm->device_block_size, sb_size);
 
 	if (sb_size < memory_block_size_bytes() && !force_bbm) {
-- 
2.34.1



* [RFC PATCH v3 8/8] arch: powerpc: adjust fadump alignment to be pageblock aligned.
  2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
                   ` (6 preceding siblings ...)
  2022-01-05 21:47 ` [RFC PATCH v3 7/8] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size Zi Yan
@ 2022-01-05 21:47 ` Zi Yan
  7 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-05 21:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

CMA only requires pageblock alignment now. Change CMA alignment in
fadump too.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 arch/powerpc/include/asm/fadump-internal.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h b/arch/powerpc/include/asm/fadump-internal.h
index 52189928ec08..fbfca85b4200 100644
--- a/arch/powerpc/include/asm/fadump-internal.h
+++ b/arch/powerpc/include/asm/fadump-internal.h
@@ -20,9 +20,7 @@
 #define memblock_num_regions(memblock_type)	(memblock.memblock_type.cnt)
 
 /* Alignment per CMA requirement. */
-#define FADUMP_CMA_ALIGNMENT	(PAGE_SIZE <<				\
-				 max_t(unsigned long, MAX_ORDER - 1,	\
-				 pageblock_order))
+#define FADUMP_CMA_ALIGNMENT	(PAGE_SIZE << pageblock_order)
 
 /* FAD commands */
 #define FADUMP_REGISTER			1
-- 
2.34.1



* Re: [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others.
  2022-01-05 21:47 ` [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others Zi Yan
@ 2022-01-12 10:54   ` David Hildenbrand
  2022-01-13 11:36     ` Mike Rapoport
  2022-01-13 14:49     ` Zi Yan
  0 siblings, 2 replies; 24+ messages in thread
From: David Hildenbrand @ 2022-01-12 10:54 UTC (permalink / raw)
  To: Zi Yan, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 05.01.22 22:47, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> This is done in addition to MIGRATE_ISOLATE pageblock merge avoidance.
> It prepares for the upcoming removal of the MAX_ORDER-1 alignment
> requirement for CMA and alloc_contig_range().
> 
> MIGRATE_HIGHATOMIC should not merge with other migratetypes like
> MIGRATE_ISOLATE and MIGRATE_CMA[1], so this commit prevents that too.
> Also add MIGRATE_HIGHATOMIC to the fallbacks array for completeness.
> 
> [1] https://lore.kernel.org/linux-mm/20211130100853.GP3366@techsingularity.net/
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  include/linux/mmzone.h |  6 ++++++
>  mm/page_alloc.c        | 28 ++++++++++++++++++----------
>  2 files changed, 24 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index aed44e9b5d89..0aa549653e4e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -83,6 +83,12 @@ static inline bool is_migrate_movable(int mt)
>  	return is_migrate_cma(mt) || mt == MIGRATE_MOVABLE;
>  }
>  
> +/* See fallbacks[MIGRATE_TYPES][3] in page_alloc.c */
> +static inline bool migratetype_has_fallback(int mt)
> +{
> +	return mt < MIGRATE_PCPTYPES;
> +}
> +
>  #define for_each_migratetype_order(order, type) \
>  	for (order = 0; order < MAX_ORDER; order++) \
>  		for (type = 0; type < MIGRATE_TYPES; type++)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 8dd6399bafb5..5193c953dbf8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1042,6 +1042,12 @@ buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
>  	return page_is_buddy(higher_page, higher_buddy, order + 1);
>  }
>  
> +static inline bool has_non_fallback_pageblock(struct zone *zone)
> +{
> +	return has_isolate_pageblock(zone) || zone_cma_pages(zone) != 0 ||
> +		zone->nr_reserved_highatomic != 0;
> +}

Due to zone_cma_pages(), the unlikely() below will be very wrong on many
setups. Previously, isolation really was a corner case. CMA and
highatomic are less of a corner case ...

I'm not even sure if this check is worth having around anymore at all,
or if it would be easier and cheaper to just always check both
migration types unconditionally. Would certainly simplify the code.
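
E.g., for merges at >= pageblock_order something like this might already
be good enough (completely untested):

    buddy_pfn = __find_buddy_pfn(pfn, order);
    buddy = page + (buddy_pfn - pfn);
    buddy_mt = get_pageblock_migratetype(buddy);

    /* no zone flag check: just look at both migratetypes */
    if (migratetype != buddy_mt &&
        (!migratetype_has_fallback(migratetype) ||
         !migratetype_has_fallback(buddy_mt)))
            goto done_merging;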

Side note: what we actually care about is has_free_non_fallback_pageblock(),
as we can only merge with free pageblocks. But that might not necessarily
be cheaper to test/track/check.

> +
>  /*
>   * Freeing function for a buddy system allocator.
>   *
> @@ -1117,14 +1123,15 @@ static inline void __free_one_page(struct page *page,
>  	}
>  	if (order < MAX_ORDER - 1) {
>  		/* If we are here, it means order is >= pageblock_order.
> -		 * We want to prevent merge between freepages on isolate
> -		 * pageblock and normal pageblock. Without this, pageblock
> -		 * isolation could cause incorrect freepage or CMA accounting.
> +		 * We want to prevent merge between freepages on pageblock
> +		 * without fallbacks and normal pageblock. Without this,
> +		 * pageblock isolation could cause incorrect freepage or CMA
> +		 * accounting or HIGHATOMIC accounting.
>  		 *
>  		 * We don't want to hit this code for the more frequent
>  		 * low-order merging.
>  		 */
> -		if (unlikely(has_isolate_pageblock(zone))) {
> +		if (unlikely(has_non_fallback_pageblock(zone))) {
>  			int buddy_mt;
>  
>  			buddy_pfn = __find_buddy_pfn(pfn, order);
> @@ -1132,8 +1139,8 @@ static inline void __free_one_page(struct page *page,
>  			buddy_mt = get_pageblock_migratetype(buddy);
>  
>  			if (migratetype != buddy_mt
> -					&& (is_migrate_isolate(migratetype) ||
> -						is_migrate_isolate(buddy_mt)))
> +					&& (!migratetype_has_fallback(migratetype) ||
> +						!migratetype_has_fallback(buddy_mt)))
>  				goto done_merging;
>  		}
>  		max_order = order + 1;
> @@ -2484,6 +2491,7 @@ static int fallbacks[MIGRATE_TYPES][3] = {
>  	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,   MIGRATE_TYPES },
>  	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_TYPES },
>  	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,   MIGRATE_TYPES },
> +	[MIGRATE_HIGHATOMIC] = { MIGRATE_TYPES }, /* Never used */
>  #ifdef CONFIG_CMA
>  	[MIGRATE_CMA]         = { MIGRATE_TYPES }, /* Never used */
>  #endif
> @@ -2795,8 +2803,8 @@ static void reserve_highatomic_pageblock(struct page *page, struct zone *zone,
>  
>  	/* Yoink! */
>  	mt = get_pageblock_migratetype(page);
> -	if (!is_migrate_highatomic(mt) && !is_migrate_isolate(mt)
> -	    && !is_migrate_cma(mt)) {
> +	/* Only reserve normal pageblock */
> +	if (migratetype_has_fallback(mt)) {
>  		zone->nr_reserved_highatomic += pageblock_nr_pages;
>  		set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
>  		move_freepages_block(zone, page, MIGRATE_HIGHATOMIC, NULL);
> @@ -3545,8 +3553,8 @@ int __isolate_free_page(struct page *page, unsigned int order)
>  		struct page *endpage = page + (1 << order) - 1;
>  		for (; page < endpage; page += pageblock_nr_pages) {
>  			int mt = get_pageblock_migratetype(page);
> -			if (!is_migrate_isolate(mt) && !is_migrate_cma(mt)
> -			    && !is_migrate_highatomic(mt))
> +			/* Only change normal pageblock */
> +			if (migratetype_has_fallback(mt))
>  				set_pageblock_migratetype(page,
>  							  MIGRATE_MOVABLE);
>  		}

That part is a nice cleanup IMHO. Although the "has fallback" part is a
bit imprecise. "migratetype_is_mergable()" might be a bit clearer.
ideally "migratetype_is_mergable_with_other_types()". Can we come up
with a nice name for that?

-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v3 2/8] mm: compaction: handle non-lru compound pages properly in isolate_migratepages_block().
  2022-01-05 21:47 ` [RFC PATCH v3 2/8] mm: compaction: handle non-lru compound pages properly in isolate_migratepages_block() Zi Yan
@ 2022-01-12 11:01   ` David Hildenbrand
  2022-01-13 14:57     ` Zi Yan
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2022-01-12 11:01 UTC (permalink / raw)
  To: Zi Yan, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 05.01.22 22:47, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> In isolate_migratepages_block(), a !PageLRU tail page can be encountered
> when the page is larger than a pageblock. Use compound head page for the
> checks inside and skip the entire compound page when isolation succeeds.
> 

This will currently never happen, due to the way we always isolate
MAX_ORDER -1 ranges, correct?

Better note that in the patch description, because currently it reads
like it's an actual fix "can be encountered".

> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/compaction.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index b4e94cda3019..ad9053fbbe06 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -979,19 +979,23 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  		 * Skip any other type of page
>  		 */
>  		if (!PageLRU(page)) {
> +			struct page *head = compound_head(page);
>  			/*
>  			 * __PageMovable can return false positive so we need
>  			 * to verify it under page_lock.
>  			 */
> -			if (unlikely(__PageMovable(page)) &&
> -					!PageIsolated(page)) {
> +			if (unlikely(__PageMovable(head)) &&
> +					!PageIsolated(head)) {
>  				if (locked) {
>  					unlock_page_lruvec_irqrestore(locked, flags);
>  					locked = NULL;
>  				}
>  
> -				if (!isolate_movable_page(page, isolate_mode))
> +				if (!isolate_movable_page(head, isolate_mode)) {
> +					low_pfn += (1 << compound_order(head)) - 1 - (page - head);
> +					page = head;
>  					goto isolate_success;
> +				}
>  			}
>  
>  			goto isolate_fail;


-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v3 3/8] mm: migrate: allocate the right size of non hugetlb or THP compound pages.
  2022-01-05 21:47 ` [RFC PATCH v3 3/8] mm: migrate: allocate the right size of non hugetlb or THP compound pages Zi Yan
@ 2022-01-12 11:04   ` David Hildenbrand
  2022-01-13 15:46     ` Zi Yan
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2022-01-12 11:04 UTC (permalink / raw)
  To: Zi Yan, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 05.01.22 22:47, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> alloc_migration_target() is used by alloc_contig_range() and non-LRU
> movable compound pages can be migrated. Current code does not allocate the
> right page size for such pages. Check THP precisely using
> is_transparent_huge() and add allocation support for non-LRU compound
> pages.

IIRC, we don't have any non-lru migratable pages that are compound
pages. Read: not used and not supported :)

Why is this required in the context of this series?


-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others.
  2022-01-12 10:54   ` David Hildenbrand
@ 2022-01-13 11:36     ` Mike Rapoport
  2022-01-13 12:28       ` David Hildenbrand
  2022-01-13 14:49     ` Zi Yan
  1 sibling, 1 reply; 24+ messages in thread
From: Mike Rapoport @ 2022-01-13 11:36 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Zi Yan, linux-mm, linux-kernel, Michael Ellerman,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, linuxppc-dev,
	virtualization, iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On Wed, Jan 12, 2022 at 11:54:49AM +0100, David Hildenbrand wrote:
> On 05.01.22 22:47, Zi Yan wrote:
> > From: Zi Yan <ziy@nvidia.com>
> > 
> > This is done in addition to MIGRATE_ISOLATE pageblock merge avoidance.
> > It prepares for the upcoming removal of the MAX_ORDER-1 alignment
> > requirement for CMA and alloc_contig_range().
> > 
> > MIGRATE_HIGHATOMIC should not merge with other migratetypes like
> > MIGRATE_ISOLATE and MIGRATE_CMA[1], so this commit prevents that too.
> > Also add MIGRATE_HIGHATOMIC to fallbacks array for completeness.
> > 
> > [1] https://lore.kernel.org/linux-mm/20211130100853.GP3366@techsingularity.net/
> > 
> > Signed-off-by: Zi Yan <ziy@nvidia.com>
> > ---
> >  include/linux/mmzone.h |  6 ++++++
> >  mm/page_alloc.c        | 28 ++++++++++++++++++----------
> >  2 files changed, 24 insertions(+), 10 deletions(-)
> > 

...

> > @@ -3545,8 +3553,8 @@ int __isolate_free_page(struct page *page, unsigned int order)
> >  		struct page *endpage = page + (1 << order) - 1;
> >  		for (; page < endpage; page += pageblock_nr_pages) {
> >  			int mt = get_pageblock_migratetype(page);
> > -			if (!is_migrate_isolate(mt) && !is_migrate_cma(mt)
> > -			    && !is_migrate_highatomic(mt))
> > +			/* Only change normal pageblock */
> > +			if (migratetype_has_fallback(mt))
> >  				set_pageblock_migratetype(page,
> >  							  MIGRATE_MOVABLE);
> >  		}
> 
> That part is a nice cleanup IMHO. Although the "has fallback" part is a
> bit imprecise. "migratetype_is_mergable()" might be a bit clearer.
> Ideally "migratetype_is_mergable_with_other_types()". Can we come up
> with a nice name for that?

migratetype_is_mergable() kinda implies "_with_other_types", no?

I like migratetype_is_mergable() more than _has_fallback().

My $0.02 to bikeshedding :)
 
> -- 
> Thanks,
> 
> David / dhildenb
> 
> 

-- 
Sincerely yours,
Mike.


* Re: [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others.
  2022-01-13 11:36     ` Mike Rapoport
@ 2022-01-13 12:28       ` David Hildenbrand
  2022-01-13 14:50         ` Zi Yan
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2022-01-13 12:28 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Zi Yan, linux-mm, linux-kernel, Michael Ellerman,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, linuxppc-dev,
	virtualization, iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 13.01.22 12:36, Mike Rapoport wrote:
> On Wed, Jan 12, 2022 at 11:54:49AM +0100, David Hildenbrand wrote:
>> On 05.01.22 22:47, Zi Yan wrote:
>>> From: Zi Yan <ziy@nvidia.com>
>>>
>>> This is done in addition to MIGRATE_ISOLATE pageblock merge avoidance.
>>> It prepares for the upcoming removal of the MAX_ORDER-1 alignment
>>> requirement for CMA and alloc_contig_range().
>>>
>>> MIGRATE_HIGHATOMIC should not merge with other migratetypes like
>>> MIGRATE_ISOLATE and MIGRATE_CMA[1], so this commit prevents that too.
>>> Also add MIGRATE_HIGHATOMIC to fallbacks array for completeness.
>>>
>>> [1] https://lore.kernel.org/linux-mm/20211130100853.GP3366@techsingularity.net/
>>>
>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>> ---
>>>  include/linux/mmzone.h |  6 ++++++
>>>  mm/page_alloc.c        | 28 ++++++++++++++++++----------
>>>  2 files changed, 24 insertions(+), 10 deletions(-)
>>>
> 
> ...
> 
>>> @@ -3545,8 +3553,8 @@ int __isolate_free_page(struct page *page, unsigned int order)
>>>  		struct page *endpage = page + (1 << order) - 1;
>>>  		for (; page < endpage; page += pageblock_nr_pages) {
>>>  			int mt = get_pageblock_migratetype(page);
>>> -			if (!is_migrate_isolate(mt) && !is_migrate_cma(mt)
>>> -			    && !is_migrate_highatomic(mt))
>>> +			/* Only change normal pageblock */
>>> +			if (migratetype_has_fallback(mt))
>>>  				set_pageblock_migratetype(page,
>>>  							  MIGRATE_MOVABLE);
>>>  		}
>>
>> That part is a nice cleanup IMHO. Although the "has fallback" part is a
>> bit imprecise. "migratetype_is_mergable()" might be a bit clearer.
>> Ideally "migratetype_is_mergable_with_other_types()". Can we come up
>> with a nice name for that?
> 
> migratetype_is_mergable() kinda implies "_with_other_types", no?
> 
> I like migratetype_is_mergable() more than _has_fallback().
> 
> My $0.02 to bikeshedding :)

:)

Yeah, for me migratetype_is_mergable() would also be good enough. I
think I was at first thinking one could mistake it for a dedicated
migratetype. But such functions are historically called

is_migrate_cma/is_migrate_isolate/....

-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others.
  2022-01-12 10:54   ` David Hildenbrand
  2022-01-13 11:36     ` Mike Rapoport
@ 2022-01-13 14:49     ` Zi Yan
  1 sibling, 0 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-13 14:49 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 12 Jan 2022, at 5:54, David Hildenbrand wrote:

> On 05.01.22 22:47, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> This is done in addition to MIGRATE_ISOLATE pageblock merge avoidance.
>> It prepares for the upcoming removal of the MAX_ORDER-1 alignment
>> requirement for CMA and alloc_contig_range().
>>
>> MIGRATE_HIGHATOMIC should not merge with other migratetypes like
>> MIGRATE_ISOLATE and MIGRATE_CMA[1], so this commit prevents that too.
>> Also add MIGRATE_HIGHATOMIC to fallbacks array for completeness.
>>
>> [1] https://lore.kernel.org/linux-mm/20211130100853.GP3366@techsingularity.net/
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>  include/linux/mmzone.h |  6 ++++++
>>  mm/page_alloc.c        | 28 ++++++++++++++++++----------
>>  2 files changed, 24 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index aed44e9b5d89..0aa549653e4e 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -83,6 +83,12 @@ static inline bool is_migrate_movable(int mt)
>>  	return is_migrate_cma(mt) || mt == MIGRATE_MOVABLE;
>>  }
>>
>> +/* See fallbacks[MIGRATE_TYPES][3] in page_alloc.c */
>> +static inline bool migratetype_has_fallback(int mt)
>> +{
>> +	return mt < MIGRATE_PCPTYPES;
>> +}
>> +
>>  #define for_each_migratetype_order(order, type) \
>>  	for (order = 0; order < MAX_ORDER; order++) \
>>  		for (type = 0; type < MIGRATE_TYPES; type++)
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 8dd6399bafb5..5193c953dbf8 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1042,6 +1042,12 @@ buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
>>  	return page_is_buddy(higher_page, higher_buddy, order + 1);
>>  }
>>
>> +static inline bool has_non_fallback_pageblock(struct zone *zone)
>> +{
>> +	return has_isolate_pageblock(zone) || zone_cma_pages(zone) != 0 ||
>> +		zone->nr_reserved_highatomic != 0;
>> +}
>
> Due to zone_cma_pages(), the unlikely() below will be very wrong on many
> setups. Previously, isolation really was a corner case. CMA and
> highatomic are less of a corner case ...

Got it.

>
> I'm not even sure if this check is worth having around anymore at all,
> or if it would be easier and cheaper to just always check both
> migration types unconditionally. Would certainly simplify the code.

I will remove the if check below since, like you said, the check is
no longer a corner case once the highatomic and CMA checks are added.
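
Concretely, the merge check in __free_one_page() would then look roughly
like this (a sketch based on the hunk quoted below, with the unlikely()
gate removed; untested, and still using this patch's helper name):

	if (order < MAX_ORDER - 1) {
		int buddy_mt;

		/* No has_non_fallback_pageblock() fast path: always
		 * look up the buddy's migratetype before merging.
		 */
		buddy_pfn = __find_buddy_pfn(pfn, order);
		buddy = page + (buddy_pfn - pfn);
		buddy_mt = get_pageblock_migratetype(buddy);

		if (migratetype != buddy_mt
				&& (!migratetype_has_fallback(migratetype) ||
					!migratetype_has_fallback(buddy_mt)))
			goto done_merging;
	}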

>
> Side note: we actually care about has_free_non_fallback_pageblock(): we
> can only merge with free pageblocks. But that might not necessarily be
> cheaper to test/track/check.
>

I agree that what we are actually looking for is free pageblocks of these
migratetypes. But tracking them is nontrivial.

>> +
>>  /*
>>   * Freeing function for a buddy system allocator.
>>   *
>> @@ -1117,14 +1123,15 @@ static inline void __free_one_page(struct page *page,
>>  	}
>>  	if (order < MAX_ORDER - 1) {
>>  		/* If we are here, it means order is >= pageblock_order.
>> -		 * We want to prevent merge between freepages on isolate
>> -		 * pageblock and normal pageblock. Without this, pageblock
>> -		 * isolation could cause incorrect freepage or CMA accounting.
>> +		 * We want to prevent merge between freepages on pageblock
>> +		 * without fallbacks and normal pageblock. Without this,
>> +		 * pageblock isolation could cause incorrect freepage or CMA
>> +		 * accounting or HIGHATOMIC accounting.
>>  		 *
>>  		 * We don't want to hit this code for the more frequent
>>  		 * low-order merging.
>>  		 */
>> -		if (unlikely(has_isolate_pageblock(zone))) {
>> +		if (unlikely(has_non_fallback_pageblock(zone))) {
>>  			int buddy_mt;
>>
>>  			buddy_pfn = __find_buddy_pfn(pfn, order);
>> @@ -1132,8 +1139,8 @@ static inline void __free_one_page(struct page *page,
>>  			buddy_mt = get_pageblock_migratetype(buddy);
>>
>>  			if (migratetype != buddy_mt
>> -					&& (is_migrate_isolate(migratetype) ||
>> -						is_migrate_isolate(buddy_mt)))
>> +					&& (!migratetype_has_fallback(migratetype) ||
>> +						!migratetype_has_fallback(buddy_mt)))
>>  				goto done_merging;
>>  		}
>>  		max_order = order + 1;
>> @@ -2484,6 +2491,7 @@ static int fallbacks[MIGRATE_TYPES][3] = {
>>  	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,   MIGRATE_TYPES },
>>  	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_TYPES },
>>  	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,   MIGRATE_TYPES },
>> +	[MIGRATE_HIGHATOMIC] = { MIGRATE_TYPES }, /* Never used */
>>  #ifdef CONFIG_CMA
>>  	[MIGRATE_CMA]         = { MIGRATE_TYPES }, /* Never used */
>>  #endif
>> @@ -2795,8 +2803,8 @@ static void reserve_highatomic_pageblock(struct page *page, struct zone *zone,
>>
>>  	/* Yoink! */
>>  	mt = get_pageblock_migratetype(page);
>> -	if (!is_migrate_highatomic(mt) && !is_migrate_isolate(mt)
>> -	    && !is_migrate_cma(mt)) {
>> +	/* Only reserve normal pageblock */
>> +	if (migratetype_has_fallback(mt)) {
>>  		zone->nr_reserved_highatomic += pageblock_nr_pages;
>>  		set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
>>  		move_freepages_block(zone, page, MIGRATE_HIGHATOMIC, NULL);
>> @@ -3545,8 +3553,8 @@ int __isolate_free_page(struct page *page, unsigned int order)
>>  		struct page *endpage = page + (1 << order) - 1;
>>  		for (; page < endpage; page += pageblock_nr_pages) {
>>  			int mt = get_pageblock_migratetype(page);
>> -			if (!is_migrate_isolate(mt) && !is_migrate_cma(mt)
>> -			    && !is_migrate_highatomic(mt))
>> +			/* Only change normal pageblock */
>> +			if (migratetype_has_fallback(mt))
>>  				set_pageblock_migratetype(page,
>>  							  MIGRATE_MOVABLE);
>>  		}
>
> That part is a nice cleanup IMHO. Although the "has fallback" part is a
> bit imprecise. "migratetype_is_mergable()" might be a bit clearer.
>> Ideally "migratetype_is_mergable_with_other_types()". Can we come up
> with a nice name for that?

Sure. Will change the name.

Thank you for the comments.


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others.
  2022-01-13 12:28       ` David Hildenbrand
@ 2022-01-13 14:50         ` Zi Yan
  0 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-13 14:50 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Mike Rapoport, linux-mm, linux-kernel, Michael Ellerman,
	Christoph Hellwig, Marek Szyprowski, Robin Murphy, linuxppc-dev,
	virtualization, iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 13 Jan 2022, at 7:28, David Hildenbrand wrote:

> On 13.01.22 12:36, Mike Rapoport wrote:
>> On Wed, Jan 12, 2022 at 11:54:49AM +0100, David Hildenbrand wrote:
>>> On 05.01.22 22:47, Zi Yan wrote:
>>>> From: Zi Yan <ziy@nvidia.com>
>>>>
>>>> This is done in addition to MIGRATE_ISOLATE pageblock merge avoidance.
>>>> It prepares for the upcoming removal of the MAX_ORDER-1 alignment
>>>> requirement for CMA and alloc_contig_range().
>>>>
>>>> MIGRATE_HIGHATOMIC should not merge with other migratetypes like
>>>> MIGRATE_ISOLATE and MIGRATE_CMA[1], so this commit prevents that too.
>>>> Also add MIGRATE_HIGHATOMIC to fallbacks array for completeness.
>>>>
>>>> [1] https://lore.kernel.org/linux-mm/20211130100853.GP3366@techsingularity.net/
>>>>
>>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>>> ---
>>>>  include/linux/mmzone.h |  6 ++++++
>>>>  mm/page_alloc.c        | 28 ++++++++++++++++++----------
>>>>  2 files changed, 24 insertions(+), 10 deletions(-)
>>>>
>>
>> ...
>>
>>>> @@ -3545,8 +3553,8 @@ int __isolate_free_page(struct page *page, unsigned int order)
>>>>  		struct page *endpage = page + (1 << order) - 1;
>>>>  		for (; page < endpage; page += pageblock_nr_pages) {
>>>>  			int mt = get_pageblock_migratetype(page);
>>>> -			if (!is_migrate_isolate(mt) && !is_migrate_cma(mt)
>>>> -			    && !is_migrate_highatomic(mt))
>>>> +			/* Only change normal pageblock */
>>>> +			if (migratetype_has_fallback(mt))
>>>>  				set_pageblock_migratetype(page,
>>>>  							  MIGRATE_MOVABLE);
>>>>  		}
>>>
>>> That part is a nice cleanup IMHO. Although the "has fallback" part is a
>>> bit imprecise. "migratetype_is_mergable()" might be a bit clearer.
>>> Ideally "migratetype_is_mergable_with_other_types()". Can we come up
>>> with a nice name for that?
>>
>> migratetype_is_mergable() kinda implies "_with_other_types", no?
>>
>> I like migratetype_is_mergable() more than _has_fallback().
>>
>> My $0.02 to bikeshedding :)
>
> :)
>
> Yeah, for me migratetype_is_mergable() would also be good enough. I
> think I was at first thinking one could mistake it for a dedicated
> migratetype. But such functions are historically called
>
> is_migrate_cma/is_migrate_isolate/....
>
> -- 
> Thanks,
>
> David / dhildenb

OK. Will use migratetype_is_mergable() instead.
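
For reference, a minimal sketch of the renamed helper (the body is the
same as migratetype_has_fallback() from this patch; only the name
changes, and the exact spelling may still be bikeshedded):

/* See fallbacks[MIGRATE_TYPES][3] in page_alloc.c */
static inline bool migratetype_is_mergable(int mt)
{
	return mt < MIGRATE_PCPTYPES;
}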


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH v3 2/8] mm: compaction: handle non-lru compound pages properly in isolate_migratepages_block().
  2022-01-12 11:01   ` David Hildenbrand
@ 2022-01-13 14:57     ` Zi Yan
  2022-01-13 16:23       ` Zi Yan
  0 siblings, 1 reply; 24+ messages in thread
From: Zi Yan @ 2022-01-13 14:57 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 12 Jan 2022, at 6:01, David Hildenbrand wrote:

> On 05.01.22 22:47, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> In isolate_migratepages_block(), a !PageLRU tail page can be encountered
>> when the page is larger than a pageblock. Use compound head page for the
>> checks inside and skip the entire compound page when isolation succeeds.
>>
>
> This will currently never happen, due to the way we always isolate
> MAX_ORDER - 1 ranges, correct?

You are right.

>
> Better to note that in the patch description, because currently it reads
> like it's an actual fix ("can be encountered").
>

Will do. This is a preparation patch for the upcoming commits.


>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>  mm/compaction.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index b4e94cda3019..ad9053fbbe06 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -979,19 +979,23 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>  		 * Skip any other type of page
>>  		 */
>>  		if (!PageLRU(page)) {
>> +			struct page *head = compound_head(page);
>>  			/*
>>  			 * __PageMovable can return false positive so we need
>>  			 * to verify it under page_lock.
>>  			 */
>> -			if (unlikely(__PageMovable(page)) &&
>> -					!PageIsolated(page)) {
>> +			if (unlikely(__PageMovable(head)) &&
>> +					!PageIsolated(head)) {
>>  				if (locked) {
>>  					unlock_page_lruvec_irqrestore(locked, flags);
>>  					locked = NULL;
>>  				}
>>
>> -				if (!isolate_movable_page(page, isolate_mode))
>> +				if (!isolate_movable_page(head, isolate_mode)) {
>> +					low_pfn += (1 << compound_order(head)) - 1 - (page - head);
>> +					page = head;
>>  					goto isolate_success;
>> +				}
>>  			}
>>
>>  			goto isolate_fail;
>
>
> -- 
> Thanks,
>
> David / dhildenb


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH v3 3/8] mm: migrate: allocate the right size of non hugetlb or THP compound pages.
  2022-01-12 11:04   ` David Hildenbrand
@ 2022-01-13 15:46     ` Zi Yan
  2022-01-13 15:50       ` David Hildenbrand
  0 siblings, 1 reply; 24+ messages in thread
From: Zi Yan @ 2022-01-13 15:46 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 12 Jan 2022, at 6:04, David Hildenbrand wrote:

> On 05.01.22 22:47, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> alloc_migration_target() is used by alloc_contig_range() and non-LRU
>> movable compound pages can be migrated. Current code does not allocate the
>> right page size for such pages. Check THP precisely using
>> is_transparent_huge() and add allocation support for non-LRU compound
>> pages.
>
> IIRC, we don't have any non-lru migratable pages that are compound
> pages. Read: not used and not supported :)

OK, but nothing prevents one from writing a driver that allocates compound
pages and provides address_space->migratepage() and address_space->isolate_page().

Actually, to test this series, I wrote a kernel module that allocates
an order-10 page, gives it a fake address_space with migratepage() and
isolate_page(), calls __SetPageMovable() on it, and then calls
alloc_contig_range() on the page range. Apparently, my kernel module is
not supported by the kernel; thus, I added this patch.

Do you have an alternative to my kernel module for testing, so that I
do not even need this patch myself?

> Why is this required in the context of this series?

It might not be required. I will drop it.

--
Best Regards,
Yan, Zi



* Re: [RFC PATCH v3 3/8] mm: migrate: allocate the right size of non hugetlb or THP compound pages.
  2022-01-13 15:46     ` Zi Yan
@ 2022-01-13 15:50       ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2022-01-13 15:50 UTC (permalink / raw)
  To: Zi Yan
  Cc: linux-mm, linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 13.01.22 16:46, Zi Yan wrote:
> On 12 Jan 2022, at 6:04, David Hildenbrand wrote:
> 
>> On 05.01.22 22:47, Zi Yan wrote:
>>> From: Zi Yan <ziy@nvidia.com>
>>>
>>> alloc_migration_target() is used by alloc_contig_range() and non-LRU
>>> movable compound pages can be migrated. Current code does not allocate the
>>> right page size for such pages. Check THP precisely using
>>> is_transparent_huge() and add allocation support for non-LRU compound
>>> pages.
>>
>> IIRC, we don't have any non-lru migratable pages that are compound
>> pages. Read: not used and not supported :)
> 
> OK, but nothing prevents one from writing a driver that allocates compound
> pages and provides address_space->migratepage() and address_space->isolate_page().
> 
> Actually, to test this series, I wrote a kernel module that allocates
> an order-10 page, gives it a fake address_space with migratepage() and
> isolate_page(), calls __SetPageMovable() on it, and then calls
> alloc_contig_range() on the page range. Apparently, my kernel module is
> not supported by the kernel; thus, I added this patch.
> 
> Do you have an alternative to my kernel module for testing, so that I
> do not even need this patch myself?
> 
>> Why is this required in the context of this series?
> 
> It might not be required. I will drop it.

That's why I think it would be best to drop it. If you need it in a
different context, better to submit it in that context.

Makes this series easier to digest :)


-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v3 2/8] mm: compaction: handle non-lru compound pages properly in isolate_migratepages_block().
  2022-01-13 14:57     ` Zi Yan
@ 2022-01-13 16:23       ` Zi Yan
  0 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-13 16:23 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 13 Jan 2022, at 9:57, Zi Yan wrote:

> On 12 Jan 2022, at 6:01, David Hildenbrand wrote:
>
>> On 05.01.22 22:47, Zi Yan wrote:
>>> From: Zi Yan <ziy@nvidia.com>
>>>
>>> In isolate_migratepages_block(), a !PageLRU tail page can be encountered
>>> when the page is larger than a pageblock. Use compound head page for the
>>> checks inside and skip the entire compound page when isolation succeeds.
>>>
>>
>> This will currently never happen, due to the way we always isolate
>> MAX_ORDER - 1 ranges, correct?
>
> You are right.
>
>>
>> Better to note that in the patch description, because currently it reads
>> like it's an actual fix ("can be encountered").
>>
>
> Will do. This is a preparation patch for the upcoming commits.

I will drop this one too. As you mentioned in [1], there are no
non-lru migratable compound pages. This is only used by my local
test code.

[1] https://lore.kernel.org/linux-mm/970ca2a4-416d-7e8f-37c7-510c5b050f4b@redhat.com/


>
>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>> ---
>>>  mm/compaction.c | 10 +++++++---
>>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>> index b4e94cda3019..ad9053fbbe06 100644
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -979,19 +979,23 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>  		 * Skip any other type of page
>>>  		 */
>>>  		if (!PageLRU(page)) {
>>> +			struct page *head = compound_head(page);
>>>  			/*
>>>  			 * __PageMovable can return false positive so we need
>>>  			 * to verify it under page_lock.
>>>  			 */
>>> -			if (unlikely(__PageMovable(page)) &&
>>> -					!PageIsolated(page)) {
>>> +			if (unlikely(__PageMovable(head)) &&
>>> +					!PageIsolated(head)) {
>>>  				if (locked) {
>>>  					unlock_page_lruvec_irqrestore(locked, flags);
>>>  					locked = NULL;
>>>  				}
>>>
>>> -				if (!isolate_movable_page(page, isolate_mode))
>>> +				if (!isolate_movable_page(head, isolate_mode)) {
>>> +					low_pfn += (1 << compound_order(head)) - 1 - (page - head);
>>> +					page = head;
>>>  					goto isolate_success;
>>> +				}
>>>  			}
>>>
>>>  			goto isolate_fail;
>>
>>
>> -- 
>> Thanks,
>>
>> David / dhildenb
>
>
> --
> Best Regards,
> Yan, Zi


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH v3 5/8] mm: page_isolation: check specified range for unmovable pages during isolation.
  2022-01-05 21:47 ` [RFC PATCH v3 5/8] mm: page_isolation: check specified range for unmovable pages during isolation Zi Yan
@ 2022-01-14 13:38   ` David Hildenbrand
  2022-01-14 15:14     ` Zi Yan
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2022-01-14 13:38 UTC (permalink / raw)
  To: Zi Yan, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 05.01.22 22:47, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> Enable set_migratetype_isolate() to check specified sub-range for
> unmovable pages during isolation. Page isolation is done
> at max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) granularity, but not all
> pages within that granularity are intended to be isolated. For example,
> alloc_contig_range(), which uses page isolation, allows ranges without
> alignment. This commit makes the unmovable page check only look at
> pages within the range of interest, so that page isolation can succeed
> for any non-overlapping ranges.

Are you handling the case where we start checking in the middle of a
compound page and actually have to look up the head to figure out
whether it is movable or not?

> 
> has_unmovable_pages() is moved to mm/page_isolation.c since it is only
> used by page isolation.

Please move that into a separate patch upfront; that makes this patch
much easier to review.

-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v3 7/8] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size.
  2022-01-05 21:47 ` [RFC PATCH v3 7/8] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size Zi Yan
@ 2022-01-14 13:44   ` David Hildenbrand
  2022-01-14 15:15     ` Zi Yan
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2022-01-14 13:44 UTC (permalink / raw)
  To: Zi Yan, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 05.01.22 22:47, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> alloc_contig_range() now only needs to be aligned to pageblock_order,
> so drop the virtio_mem size requirement that it be the max of
> pageblock_order and MAX_ORDER.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  drivers/virtio/virtio_mem.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
> index a6a78685cfbe..2664dc16d0f9 100644
> --- a/drivers/virtio/virtio_mem.c
> +++ b/drivers/virtio/virtio_mem.c
> @@ -2481,8 +2481,7 @@ static int virtio_mem_init_hotplug(struct virtio_mem *vm)
>  	 * - Is required for now for alloc_contig_range() to work reliably -
>  	 *   it doesn't properly handle smaller granularity on ZONE_NORMAL.
>  	 */

Please also update this comment.

-- 
Thanks,

David / dhildenb



* Re: [RFC PATCH v3 5/8] mm: page_isolation: check specified range for unmovable pages during isolation.
  2022-01-14 13:38   ` David Hildenbrand
@ 2022-01-14 15:14     ` Zi Yan
  0 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-14 15:14 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 14 Jan 2022, at 8:38, David Hildenbrand wrote:

> On 05.01.22 22:47, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> Enable set_migratetype_isolate() to check specified sub-range for
>> unmovable pages during isolation. Page isolation is done
>> at max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) granularity, but not all
>> pages within that granularity are intended to be isolated. For example,
>> alloc_contig_range(), which uses page isolation, allows ranges without
>> alignment. This commit makes the unmovable page check only look at
>> pages within the range of interest, so that page isolation can succeed
>> for any non-overlapping ranges.
>
> Are you handling the case where we start checking in the middle of a
> compound page and actually have to look up the head to figure out
> whether it is movable or not?
>

Yes. has_unmovable_pages() has that check already.
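
For reference, the existing compound-page handling in
has_unmovable_pages() reads roughly like this (paraphrased from
mm/page_alloc.c of this era, not from the patch itself):

	if (PageHuge(page) || PageTransCompound(page)) {
		struct page *head = compound_head(page);
		unsigned int skip_pages;

		if (PageHuge(page)) {
			if (!hugepage_migration_supported(page_hstate(head)))
				return page;
		} else if (!PageLRU(head) && !__PageMovable(head)) {
			return page;
		}

		/* skip over the tail pages of this compound page */
		skip_pages = compound_nr(head) - (page - head);
		iter += skip_pages - 1;
		continue;
	}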


>>
>> has_unmovable_pages() is moved to mm/page_isolation.c since it is only
>> used by page isolation.
>
> Please move that into a separate patch upfront; that makes this patch
> much easier to review.

Sure. Will do.

Thanks.

--
Best Regards,
Yan, Zi



* Re: [RFC PATCH v3 7/8] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size.
  2022-01-14 13:44   ` David Hildenbrand
@ 2022-01-14 15:15     ` Zi Yan
  0 siblings, 0 replies; 24+ messages in thread
From: Zi Yan @ 2022-01-14 15:15 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren

On 14 Jan 2022, at 8:44, David Hildenbrand wrote:

> On 05.01.22 22:47, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> alloc_contig_range() now only needs to be aligned to pageblock_order,
>> so drop the virtio_mem size requirement that it be the max of
>> pageblock_order and MAX_ORDER.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>  drivers/virtio/virtio_mem.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
>> index a6a78685cfbe..2664dc16d0f9 100644
>> --- a/drivers/virtio/virtio_mem.c
>> +++ b/drivers/virtio/virtio_mem.c
>> @@ -2481,8 +2481,7 @@ static int virtio_mem_init_hotplug(struct virtio_mem *vm)
>>  	 * - Is required for now for alloc_contig_range() to work reliably -
>>  	 *   it doesn't properly handle smaller granularity on ZONE_NORMAL.
>>  	 */
>
> Please also update this comment.

No problem. Thanks for pointing this out.


--
Best Regards,
Yan, Zi



Thread overview: 24+ messages
2022-01-05 21:47 [RFC PATCH v3 0/8] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
2022-01-05 21:47 ` [RFC PATCH v3 1/8] mm: page_alloc: avoid merging non-fallbackable pageblocks with others Zi Yan
2022-01-12 10:54   ` David Hildenbrand
2022-01-13 11:36     ` Mike Rapoport
2022-01-13 12:28       ` David Hildenbrand
2022-01-13 14:50         ` Zi Yan
2022-01-13 14:49     ` Zi Yan
2022-01-05 21:47 ` [RFC PATCH v3 2/8] mm: compaction: handle non-lru compound pages properly in isolate_migratepages_block() Zi Yan
2022-01-12 11:01   ` David Hildenbrand
2022-01-13 14:57     ` Zi Yan
2022-01-13 16:23       ` Zi Yan
2022-01-05 21:47 ` [RFC PATCH v3 3/8] mm: migrate: allocate the right size of non hugetlb or THP compound pages Zi Yan
2022-01-12 11:04   ` David Hildenbrand
2022-01-13 15:46     ` Zi Yan
2022-01-13 15:50       ` David Hildenbrand
2022-01-05 21:47 ` [RFC PATCH v3 4/8] mm: make alloc_contig_range work at pageblock granularity Zi Yan
2022-01-05 21:47 ` [RFC PATCH v3 5/8] mm: page_isolation: check specified range for unmovable pages during isolation Zi Yan
2022-01-14 13:38   ` David Hildenbrand
2022-01-14 15:14     ` Zi Yan
2022-01-05 21:47 ` [RFC PATCH v3 6/8] mm: cma: use pageblock_order as the single alignment Zi Yan
2022-01-05 21:47 ` [RFC PATCH v3 7/8] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size Zi Yan
2022-01-14 13:44   ` David Hildenbrand
2022-01-14 15:15     ` Zi Yan
2022-01-05 21:47 ` [RFC PATCH v3 8/8] arch: powerpc: adjust fadump alignment to be pageblock aligned Zi Yan
