linux-mm.kvack.org archive mirror
* [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups
@ 2020-08-19 10:11 David Hildenbrand
  2020-08-19 10:11 ` [PATCH v1 01/11] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
                   ` (10 more replies)
  0 siblings, 11 replies; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Baoquan He,
	Charan Teja Reddy, Dan Williams, Fenghua Yu, Logan Gunthorpe,
	Matthew Wilcox (Oracle),
	Mel Gorman, Michal Hocko, Michel Lespinasse, Mike Rapoport,
	Oscar Salvador, Pankaj Gupta, Tony Luck, Wei Yang

These are a bunch of cleanups for online_pages()/offline_pages() and
related code, mostly getting rid of memory hole handling that is no longer
necessary. There is only a single walk_system_ram_range() call left in
offline_pages(), to make sure we don't have any memory holes. I had some
of these patches lying around for a while but didn't have time to
polish them.

In addition, the last patch marks all pageblocks of the memory to be
onlined MIGRATE_ISOLATE, so pages that have just been exposed to the buddy
cannot get allocated before onlining is complete. Once the heavy lifting
is done, the pageblocks are set to MIGRATE_MOVABLE, so that allocations
become possible.
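
Roughly, the resulting flow in online_pages() looks as follows (a sketch
distilled from the patches below, not a literal diff):

	/* expose the range to the zone with all pageblocks isolated */
	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
	/* free the pages to the buddy and mark the sections online */
	online_pages_range(pfn, nr_pages);
	...
	/* basic onlining is complete, allow allocation of onlined pages */
	undo_isolate_page_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE);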

I played with DIMMs and virtio-mem on x86-64 and didn't spot any surprises.
I verified that the number of isolated pageblocks is correctly handled when
onlining/offlining.

David Hildenbrand (11):
  mm/memory_hotplug: inline __offline_pages() into offline_pages()
  mm/memory_hotplug: enforce section granularity when onlining/offlining
  mm/memory_hotplug: simplify checking if all pages are isolated in
    offline_pages()
  mm/memory_hotplug: simplify offlining of pages in offline_pages()
  mm/page_alloc: simplify __offline_isolated_pages()
  mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages()
  mm/page_isolation: simplify return value of start_isolate_page_range()
  mm/memory_hotplug: simplify page onlining
  mm/page_alloc: drop stale pageblock comment in memmap_init_zone*()
  mm: pass migratetype into memmap_init_zone() and
    move_pfn_range_to_zone()
  mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining
    memory

 arch/ia64/mm/init.c            |   4 +-
 include/linux/memory_hotplug.h |   7 +-
 include/linux/mm.h             |   3 +-
 mm/Kconfig                     |   2 +-
 mm/memory_hotplug.c            | 156 ++++++++++++++-------------------
 mm/memremap.c                  |   3 +-
 mm/page_alloc.c                |  64 ++++----------
 mm/page_isolation.c            |   7 +-
 8 files changed, 98 insertions(+), 148 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v1 01/11] mm/memory_hotplug: inline __offline_pages() into offline_pages()
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 12:23   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 02/11] mm/memory_hotplug: enforce section granularity when onlining/offlining David Hildenbrand
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

There is only a single user, offline_pages(). Let's inline it, to make
it look more similar to online_pages().

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7f62d69660e06..c781d386d87f9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1473,11 +1473,10 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
 	return 0;
 }
 
-static int __ref __offline_pages(unsigned long start_pfn,
-		  unsigned long end_pfn)
+int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
-	unsigned long pfn, nr_pages = 0;
-	unsigned long offlined_pages = 0;
+	const unsigned long end_pfn = start_pfn + nr_pages;
+	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
 	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
 	struct zone *zone;
@@ -1494,9 +1493,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	 * memory holes PG_reserved, don't need pfn_valid() checks, and can
 	 * avoid using walk_system_ram_range() later.
 	 */
-	walk_system_ram_range(start_pfn, end_pfn - start_pfn, &nr_pages,
+	walk_system_ram_range(start_pfn, nr_pages, &system_ram_pages,
 			      count_system_ram_pages_cb);
-	if (nr_pages != end_pfn - start_pfn) {
+	if (system_ram_pages != nr_pages) {
 		ret = -EINVAL;
 		reason = "memory holes";
 		goto failed_removal;
@@ -1631,11 +1630,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	return ret;
 }
 
-int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
-{
-	return __offline_pages(start_pfn, start_pfn + nr_pages);
-}
-
 static int check_memblock_offlined_cb(struct memory_block *mem, void *arg)
 {
 	int ret = !is_memblock_offlined(mem);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 02/11] mm/memory_hotplug: enforce section granularity when onlining/offlining
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
  2020-08-19 10:11 ` [PATCH v1 01/11] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 12:37   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 03/11] mm/memory_hotplug: simplify checking if all pages are isolated in offline_pages() David Hildenbrand
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

Already two people (including me) have tried to offline subsections,
because the function looks like it can deal with them. But we really can
only online/offline full sections (e.g., we can only mark full sections
online/offline via SECTION_IS_ONLINE).

Add a simple safety net to document the restriction now. Current users
(core and powernv/memtrace) respect these restrictions.
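
Note that checking the bitwise OR with a single IS_ALIGNED() covers both
the start and the size: any bit below section granularity set in either
value shows up in the OR. For illustration (assuming x86-64, where
PAGES_PER_SECTION is 0x8000):

	IS_ALIGNED(0x8000 | 0x8000, 0x8000)	/* true: one full section */
	IS_ALIGNED(0x8000 | 0x4000, 0x8000)	/* false: subsection, -EINVAL */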

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c781d386d87f9..6856702af68d9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -801,6 +801,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	int ret;
 	struct memory_notify arg;
 
+	/* We can only online full sections (e.g., SECTION_IS_ONLINE) */
+	if (WARN_ON_ONCE(!nr_pages ||
+			 !IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION)))
+		return -EINVAL;
+
 	mem_hotplug_begin();
 
 	/* associate pfn range with the zone */
@@ -1483,6 +1488,11 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	struct memory_notify arg;
 	char *reason;
 
+	/* We can only offline full sections (e.g., SECTION_IS_ONLINE) */
+	if (WARN_ON_ONCE(!nr_pages ||
+			 !IS_ALIGNED(start_pfn | nr_pages, PAGES_PER_SECTION)))
+		return -EINVAL;
+
 	mem_hotplug_begin();
 
 	/*
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 03/11] mm/memory_hotplug: simplify checking if all pages are isolated in offline_pages()
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
  2020-08-19 10:11 ` [PATCH v1 01/11] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
  2020-08-19 10:11 ` [PATCH v1 02/11] mm/memory_hotplug: enforce section granularity when onlining/offlining David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 12:39   ` Michal Hocko
  2020-08-31 10:10   ` Pankaj Gupta
  2020-08-19 10:11 ` [PATCH v1 04/11] mm/memory_hotplug: simplify offlining of pages " David Hildenbrand
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

We make sure that we cannot have any memory holes right at the beginning
of offline_pages(). We no longer need walk_system_ram_range() and can
call test_pages_isolated() directly.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6856702af68d9..f64478349148d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1384,17 +1384,6 @@ offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
 	return 0;
 }
 
-/*
- * Check all pages in range, recorded as memory resource, are isolated.
- */
-static int
-check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
-			void *data)
-{
-	return test_pages_isolated(start_pfn, start_pfn + nr_pages,
-				   MEMORY_OFFLINE);
-}
-
 static int __init cmdline_parse_movable_node(char *p)
 {
 	movable_node_enabled = true;
@@ -1579,10 +1568,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 			reason = "failure to dissolve huge pages";
 			goto failed_removal_isolated;
 		}
-		/* check again */
-		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
-					    NULL, check_pages_isolated_cb);
-	} while (ret);
+	} while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
 
 	/* Ok, all of our target is isolated.
 	   We cannot do rollback at this point. */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 04/11] mm/memory_hotplug: simplify offlining of pages in offline_pages()
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (2 preceding siblings ...)
  2020-08-19 10:11 ` [PATCH v1 03/11] mm/memory_hotplug: simplify checking if all pages are isolated in offline_pages() David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 12:40   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 05/11] mm/page_alloc: simplify __offline_isolated_pages() David Hildenbrand
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

We make sure that we cannot have any memory holes right at the beginning
of offline_pages(). We no longer need walk_system_ram_range() and can
call __offline_isolated_pages() directly.

offlined_pages always corresponds to nr_pages, so we can simplify that.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 28 ++++++++--------------------
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f64478349148d..50aa5df696e9d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1373,17 +1373,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 	return ret;
 }
 
-/* Mark all sections offline and remove all free pages from the buddy. */
-static int
-offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
-			void *data)
-{
-	unsigned long *offlined_pages = (unsigned long *)data;
-
-	*offlined_pages += __offline_isolated_pages(start, start + nr_pages);
-	return 0;
-}
-
 static int __init cmdline_parse_movable_node(char *p)
 {
 	movable_node_enabled = true;
@@ -1470,7 +1459,7 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
 int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
-	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
+	unsigned long pfn, system_ram_pages = 0;
 	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
 	struct zone *zone;
@@ -1570,11 +1559,10 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 		}
 	} while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
 
-	/* Ok, all of our target is isolated.
-	   We cannot do rollback at this point. */
-	walk_system_ram_range(start_pfn, end_pfn - start_pfn,
-			      &offlined_pages, offline_isolated_pages_cb);
-	pr_info("Offlined Pages %ld\n", offlined_pages);
+	/* Mark all sections offline and remove free pages from the buddy. */
+	__offline_isolated_pages(start_pfn, end_pfn);
+	pr_info("Offlined Pages %ld\n", nr_pages);
+
 	/*
 	 * Onlining will reset pagetype flags and makes migrate type
 	 * MOVABLE, so just need to decrease the number of isolated
@@ -1585,11 +1573,11 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	/* removal success */
-	adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
-	zone->present_pages -= offlined_pages;
+	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
+	zone->present_pages -= nr_pages;
 
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages -= offlined_pages;
+	zone->zone_pgdat->node_present_pages -= nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
 	init_per_zone_wmark_min();
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 05/11] mm/page_alloc: simplify __offline_isolated_pages()
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (3 preceding siblings ...)
  2020-08-19 10:11 ` [PATCH v1 04/11] mm/memory_hotplug: simplify offlining of pages " David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 12:48   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 06/11] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages() David Hildenbrand
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

offline_pages() is the only user. __offline_isolated_pages() never gets
called with ranges that contain memory holes and we no longer care about
the return value. Drop the return value handling and all pfn_valid()
checks.

Update the documentation.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/memory_hotplug.h |  4 ++--
 mm/page_alloc.c                | 27 ++++-----------------------
 2 files changed, 6 insertions(+), 25 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 375515803cd83..0b461691d1a49 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -103,8 +103,8 @@ extern int online_pages(unsigned long pfn, unsigned long nr_pages,
 			int online_type, int nid);
 extern struct zone *test_pages_in_a_zone(unsigned long start_pfn,
 					 unsigned long end_pfn);
-extern unsigned long __offline_isolated_pages(unsigned long start_pfn,
-						unsigned long end_pfn);
+extern void __offline_isolated_pages(unsigned long start_pfn,
+				     unsigned long end_pfn);
 
 typedef void (*online_page_callback_t)(struct page *page, unsigned int order);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cf0b25161feae..03f585f95dc60 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8692,35 +8692,21 @@ void zone_pcp_reset(struct zone *zone)
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
 /*
- * All pages in the range must be in a single zone and isolated
- * before calling this.
+ * All pages in the range must be in a single zone, must not contain holes,
+ * must span full sections, and must be isolated before calling this function.
  */
-unsigned long
-__offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
+void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 {
+	unsigned long pfn = start_pfn;
 	struct page *page;
 	struct zone *zone;
 	unsigned int order;
-	unsigned long pfn;
 	unsigned long flags;
-	unsigned long offlined_pages = 0;
-
-	/* find the first valid pfn */
-	for (pfn = start_pfn; pfn < end_pfn; pfn++)
-		if (pfn_valid(pfn))
-			break;
-	if (pfn == end_pfn)
-		return offlined_pages;
 
 	offline_mem_sections(pfn, end_pfn);
 	zone = page_zone(pfn_to_page(pfn));
 	spin_lock_irqsave(&zone->lock, flags);
-	pfn = start_pfn;
 	while (pfn < end_pfn) {
-		if (!pfn_valid(pfn)) {
-			pfn++;
-			continue;
-		}
 		page = pfn_to_page(pfn);
 		/*
 		 * The HWPoisoned page may be not in buddy system, and
@@ -8728,7 +8714,6 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 		 */
 		if (unlikely(!PageBuddy(page) && PageHWPoison(page))) {
 			pfn++;
-			offlined_pages++;
 			continue;
 		}
 		/*
@@ -8739,20 +8724,16 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 			BUG_ON(page_count(page));
 			BUG_ON(PageBuddy(page));
 			pfn++;
-			offlined_pages++;
 			continue;
 		}
 
 		BUG_ON(page_count(page));
 		BUG_ON(!PageBuddy(page));
 		order = page_order(page);
-		offlined_pages += 1 << order;
 		del_page_from_free_list(page, zone, order);
 		pfn += (1 << order);
 	}
 	spin_unlock_irqrestore(&zone->lock, flags);
-
-	return offlined_pages;
 }
 #endif
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 06/11] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages()
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (4 preceding siblings ...)
  2020-08-19 10:11 ` [PATCH v1 05/11] mm/page_alloc: simplify __offline_isolated_pages() David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 12:58   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 07/11] mm/page_isolation: simplify return value of start_isolate_page_range() David Hildenbrand
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

We make sure that we cannot have any memory holes right at the beginning
of offline_pages(), and we only support onlining/offlining full sections.
Both sections and pageblocks are a power of two in size, and sections
always span full pageblocks.

We can directly calculate the number of isolated pageblocks from nr_pages.
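
For example, with x86-64 defaults (128 MiB sections, 2 MiB pageblocks),
offlining a single section works out to:

	/* illustrative numbers only */
	nr_pages / pageblock_nr_pages == 32768 / 512 == 64 isolated pageblocks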

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 50aa5df696e9d..098361fcb4504 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1460,10 +1460,10 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
 	unsigned long pfn, system_ram_pages = 0;
-	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
 	struct zone *zone;
 	struct memory_notify arg;
+	int ret, node;
 	char *reason;
 
 	/* We can only offline full sections (e.g., SECTION_IS_ONLINE) */
@@ -1507,7 +1507,6 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 		reason = "failure to isolate range";
 		goto failed_removal;
 	}
-	nr_isolate_pageblock = ret;
 
 	arg.start_pfn = start_pfn;
 	arg.nr_pages = nr_pages;
@@ -1569,7 +1568,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	 * pageblocks zone counter here.
 	 */
 	spin_lock_irqsave(&zone->lock, flags);
-	zone->nr_isolate_pageblock -= nr_isolate_pageblock;
+	zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	/* removal success */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 07/11] mm/page_isolation: simplify return value of start_isolate_page_range()
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (5 preceding siblings ...)
  2020-08-19 10:11 ` [PATCH v1 06/11] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages() David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 13:00   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 08/11] mm/memory_hotplug: simplify page onlining David Hildenbrand
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

Callers no longer need the number of isolated pageblocks. Let's
simplify.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 2 +-
 mm/page_alloc.c     | 2 +-
 mm/page_isolation.c | 7 ++-----
 3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 098361fcb4504..0011a1115381c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1503,7 +1503,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	ret = start_isolate_page_range(start_pfn, end_pfn,
 				       MIGRATE_MOVABLE,
 				       MEMORY_OFFLINE | REPORT_FAILURE);
-	if (ret < 0) {
+	if (ret) {
 		reason = "failure to isolate range";
 		goto failed_removal;
 	}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 03f585f95dc60..848664352dfe2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8456,7 +8456,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 
 	ret = start_isolate_page_range(pfn_max_align_down(start),
 				       pfn_max_align_up(end), migratetype, 0);
-	if (ret < 0)
+	if (ret)
 		return ret;
 
 	/*
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 242c03121d731..b066c860e606e 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -170,8 +170,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * pageblocks we may have modified and return -EBUSY to caller. This
  * prevents two threads from simultaneously working on overlapping ranges.
  *
- * Return: the number of isolated pageblocks on success and -EBUSY if any part
- * of range cannot be isolated.
+ * Return: 0 on success and -EBUSY if any part of range cannot be isolated.
  */
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 			     unsigned migratetype, int flags)
@@ -179,7 +178,6 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	unsigned long pfn;
 	unsigned long undo_pfn;
 	struct page *page;
-	int nr_isolate_pageblock = 0;
 
 	BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
 	BUG_ON(!IS_ALIGNED(end_pfn, pageblock_nr_pages));
@@ -193,10 +191,9 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 				undo_pfn = pfn;
 				goto undo;
 			}
-			nr_isolate_pageblock++;
 		}
 	}
-	return nr_isolate_pageblock;
+	return 0;
 undo:
 	for (pfn = start_pfn;
 	     pfn < undo_pfn;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 08/11] mm/memory_hotplug: simplify page onlining
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (6 preceding siblings ...)
  2020-08-19 10:11 ` [PATCH v1 07/11] mm/page_isolation: simplify return value of start_isolate_page_range() David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 13:02   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 09/11] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() David Hildenbrand
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

We don't allow offlining memory with holes, all boot memory is online,
and hotplugged memory cannot have holes.

We can now simplify the onlining of pages. As we only allow
onlining/offlining full sections, and sections always span full
MAX_ORDER_NR_PAGES, we can just process the range in chunks of order
MAX_ORDER - 1 without further special handling.

The number of onlined pages simply corresponds to the number of pages we
were requested to online.

While at it, refine the comment regarding the callback not exposing all
pages to the buddy.
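
For example, with x86-64 defaults (MAX_ORDER == 11, 128 MiB sections),
onlining a single section boils down to:

	/* illustrative numbers only */
	PAGES_PER_SECTION / MAX_ORDER_NR_PAGES == 32768 / 1024 == 32
	/* -> 32 callback invocations, each covering a 4 MiB chunk */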

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 38 ++++++++++----------------------------
 1 file changed, 10 insertions(+), 28 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0011a1115381c..3aba0d956f9b1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -617,31 +617,22 @@ void generic_online_page(struct page *page, unsigned int order)
 }
 EXPORT_SYMBOL_GPL(generic_online_page);
 
-static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
-			void *arg)
+static void online_pages_range(unsigned long start_pfn, unsigned long nr_pages)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
 	unsigned long pfn;
-	int order;
 
 	/*
-	 * Online the pages. The callback might decide to keep some pages
-	 * PG_reserved (to add them to the buddy later), but we still account
-	 * them as being online/belonging to this zone ("present").
+	 * Online the pages in MAX_ORDER - 1 aligned chunks. The callback might
+	 * decide to not expose all pages to the buddy (e.g., expose them
+	 * later). We account all pages as being online and belonging to this
+	 * zone ("present").
 	 */
-	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
-		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
-		/* __free_pages_core() wants pfns to be aligned to the order */
-		if (WARN_ON_ONCE(!IS_ALIGNED(pfn, 1ul << order)))
-			order = 0;
-		(*online_page_callback)(pfn_to_page(pfn), order);
-	}
+	for (pfn = start_pfn; pfn < end_pfn; pfn += MAX_ORDER_NR_PAGES)
+		(*online_page_callback)(pfn_to_page(pfn), MAX_ORDER - 1);
 
 	/* mark all involved sections as online */
 	online_mem_sections(start_pfn, end_pfn);
-
-	*(unsigned long *)arg += nr_pages;
-	return 0;
 }
 
 /* check which state of node_states will be changed when online memory */
@@ -795,7 +786,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 		       int online_type, int nid)
 {
 	unsigned long flags;
-	unsigned long onlined_pages = 0;
 	struct zone *zone;
 	int need_zonelists_rebuild = 0;
 	int ret;
@@ -831,19 +821,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 		setup_zone_pageset(zone);
 	}
 
-	ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages,
-		online_pages_range);
-	if (ret) {
-		/* not a single memory resource was applicable */
-		if (need_zonelists_rebuild)
-			zone_pcp_reset(zone);
-		goto failed_addition;
-	}
-
-	zone->present_pages += onlined_pages;
+	online_pages_range(pfn, nr_pages);
+	zone->present_pages += nr_pages;
 
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages += onlined_pages;
+	zone->zone_pgdat->node_present_pages += nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
 	/*
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 09/11] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*()
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (7 preceding siblings ...)
  2020-08-19 10:11 ` [PATCH v1 08/11] mm/memory_hotplug: simplify page onlining David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 13:05   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 10/11] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() David Hildenbrand
  2020-08-19 10:11 ` [PATCH v1 11/11] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory David Hildenbrand
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador, Mel Gorman

Commit ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are
set during struct page init") moved the actual zone range check, leaving
only the alignment check for pageblocks.

Let's drop the stale comment and make the pageblock check easier to read.
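
Both forms test exactly the same bits, as pageblock_nr_pages is a power
of two; IS_ALIGNED() merely spells the intention out:

	/* IS_ALIGNED(x, a) is defined as ((x) & ((a) - 1)) == 0 */
	!(pfn & (pageblock_nr_pages - 1)) == IS_ALIGNED(pfn, pageblock_nr_pages)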

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/page_alloc.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 848664352dfe2..5db0b35f95e20 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6022,13 +6022,8 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		 * to reserve their blocks rather than leaking throughout
 		 * the address space during boot when many long-lived
 		 * kernel allocations are made.
-		 *
-		 * bitmap is created for zone's valid pfn range. but memmap
-		 * can be created for invalid pages (for alignment)
-		 * check here not to call set_pageblock_migratetype() against
-		 * pfn out of zone.
 		 */
-		if (!(pfn & (pageblock_nr_pages - 1))) {
+		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 			cond_resched();
 		}
@@ -6091,15 +6086,10 @@ void __ref memmap_init_zone_device(struct zone *zone,
 		 * the address space during boot when many long-lived
 		 * kernel allocations are made.
 		 *
-		 * bitmap is created for zone's valid pfn range. but memmap
-		 * can be created for invalid pages (for alignment)
-		 * check here not to call set_pageblock_migratetype() against
-		 * pfn out of zone.
-		 *
 		 * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
 		 * because this is done early in section_activate()
 		 */
-		if (!(pfn & (pageblock_nr_pages - 1))) {
+		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 			cond_resched();
 		}
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 10/11] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone()
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (8 preceding siblings ...)
  2020-08-19 10:11 ` [PATCH v1 09/11] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 13:08   ` Michal Hocko
  2020-08-19 10:11 ` [PATCH v1 11/11] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory David Hildenbrand
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador, Tony Luck,
	Fenghua Yu, Logan Gunthorpe, Dan Williams, Mike Rapoport,
	Matthew Wilcox (Oracle),
	Michel Lespinasse, linux-ia64

On the memory hotplug path, we want to start with MIGRATE_ISOLATE, to
un-isolate the pages after memory onlining is complete. Let's allow
passing in the migratetype.
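
A sketch of the call sites after this change (memory onlining is only
switched over to MIGRATE_ISOLATE in the next patch):

	/* ZONE_DEVICE (memremap) and, for now, memory onlining */
	move_pfn_range_to_zone(zone, start_pfn, nr_pages, altmap, MIGRATE_MOVABLE);
	/* memory onlining, as of the next patch */
	move_pfn_range_to_zone(zone, start_pfn, nr_pages, NULL, MIGRATE_ISOLATE);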

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/ia64/mm/init.c            |  4 ++--
 include/linux/memory_hotplug.h |  3 ++-
 include/linux/mm.h             |  3 ++-
 mm/memory_hotplug.c            | 11 ++++++++---
 mm/memremap.c                  |  3 ++-
 mm/page_alloc.c                | 21 ++++++++++++---------
 6 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 0b3fb4c7af292..82b7a46ddd23d 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -538,7 +538,7 @@ virtual_memmap_init(u64 start, u64 end, void *arg)
 	if (map_start < map_end)
 		memmap_init_zone((unsigned long)(map_end - map_start),
 				 args->nid, args->zone, page_to_pfn(map_start),
-				 MEMMAP_EARLY, NULL);
+				 MEMMAP_EARLY, NULL, MIGRATE_MOVABLE);
 	return 0;
 }
 
@@ -548,7 +548,7 @@ memmap_init (unsigned long size, int nid, unsigned long zone,
 {
 	if (!vmem_map) {
 		memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY,
-				NULL);
+				 NULL, MIGRATE_MOVABLE);
 	} else {
 		struct page *start;
 		struct memmap_init_callback_data args;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 0b461691d1a49..cbafeda859380 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -346,7 +346,8 @@ extern int add_memory_resource(int nid, struct resource *resource);
 extern int add_memory_driver_managed(int nid, u64 start, u64 size,
 				     const char *resource_name);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
-		unsigned long nr_pages, struct vmem_altmap *altmap);
+				   unsigned long nr_pages,
+				   struct vmem_altmap *altmap, int migratetype);
 extern void remove_pfn_range_from_zone(struct zone *zone,
 				       unsigned long start_pfn,
 				       unsigned long nr_pages);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8ab941cf73f44..c842aa2a97ba2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2409,7 +2409,8 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
 
 extern void set_dma_reserve(unsigned long new_dma_reserve);
 extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long,
-		enum memmap_context, struct vmem_altmap *);
+			    enum memmap_context, struct vmem_altmap *,
+			    int migratetype);
 extern void setup_per_zone_wmarks(void);
 extern int __meminit init_per_zone_wmark_min(void);
 extern void mem_init(void);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 3aba0d956f9b1..1c16a5def781e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -693,9 +693,14 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
  * Associate the pfn range with the given zone, initializing the memmaps
  * and resizing the pgdat/zone data to span the added pages. After this
  * call, all affected pages are PG_reserved.
+ *
+ * All aligned pageblocks are initialized to the specified migratetype
+ * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
+ * zone stats (e.g., nr_isolate_pageblock) are touched.
  */
 void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
-		unsigned long nr_pages, struct vmem_altmap *altmap)
+				  unsigned long nr_pages,
+				  struct vmem_altmap *altmap, int migratetype)
 {
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	int nid = pgdat->node_id;
@@ -720,7 +725,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 	 * are reserved so nobody should be touching them so we should be safe
 	 */
 	memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
-			MEMMAP_HOTPLUG, altmap);
+			 MEMMAP_HOTPLUG, altmap, migratetype);
 
 	set_zone_contiguous(zone);
 }
@@ -800,7 +805,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 
 	/* associate pfn range with the zone */
 	zone = zone_for_pfn_range(online_type, nid, pfn, nr_pages);
-	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL);
+	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
 
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
diff --git a/mm/memremap.c b/mm/memremap.c
index 8afcc54c89286..04dc1f4ed634e 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -342,7 +342,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
 
 		zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
 		move_pfn_range_to_zone(zone, PHYS_PFN(res->start),
-				PHYS_PFN(resource_size(res)), params.altmap);
+				       PHYS_PFN(resource_size(res)),
+				       params.altmap, MIGRATE_MOVABLE);
 	}
 
 	mem_hotplug_done();
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5db0b35f95e20..9f2dc61968689 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5970,10 +5970,15 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
  * Initially all pages are reserved - free ones are freed
  * up by memblock_free_all() once the early boot process is
  * done. Non-atomic initialization, single-pass.
+ *
+ * All aligned pageblocks are initialized to the specified migratetype
+ * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
+ * zone stats (e.g., nr_isolate_pageblock) are touched.
  */
 void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
-		unsigned long start_pfn, enum memmap_context context,
-		struct vmem_altmap *altmap)
+				unsigned long start_pfn,
+				enum memmap_context context,
+				struct vmem_altmap *altmap, int migratetype)
 {
 	unsigned long pfn, end_pfn = start_pfn + size;
 	struct page *page;
@@ -6017,14 +6022,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			__SetPageReserved(page);
 
 		/*
-		 * Mark the block movable so that blocks are reserved for
-		 * movable at startup. This will force kernel allocations
-		 * to reserve their blocks rather than leaking throughout
-		 * the address space during boot when many long-lived
-		 * kernel allocations are made.
+		 * Usually, we want to mark the pageblock MIGRATE_MOVABLE,
+		 * such that unmovable allocations won't be scattered all
+		 * over the place during system boot.
 		 */
 		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
-			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
+			set_pageblock_migratetype(page, migratetype);
 			cond_resched();
 		}
 		pfn++;
@@ -6124,7 +6127,7 @@ void __meminit __weak memmap_init(unsigned long size, int nid,
 		if (end_pfn > start_pfn) {
 			size = end_pfn - start_pfn;
 			memmap_init_zone(size, nid, zone, start_pfn,
-					 MEMMAP_EARLY, NULL);
+					 MEMMAP_EARLY, NULL, MIGRATE_MOVABLE);
 		}
 	}
 }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v1 11/11] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory
  2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (9 preceding siblings ...)
  2020-08-19 10:11 ` [PATCH v1 10/11] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() David Hildenbrand
@ 2020-08-19 10:11 ` David Hildenbrand
  2020-08-19 13:16   ` Michal Hocko
  10 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 10:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador,
	Charan Teja Reddy

Currently, it can happen that pages are allocated (and freed) via the buddy
before we have finished basic memory onlining.

For example, pages are exposed to the buddy and can be allocated before
we actually mark the sections online. Allocated pages could suddenly
fail pfn_to_online_page() checks. We had similar issues with pcp
handling, when pages got allocated+freed before we reached
zone_pcp_update() in online_pages() [1].

Instead, mark all pageblocks MIGRATE_ISOLATE, such that allocations are
impossible. Once done with the heavy lifting, use
undo_isolate_page_range() to move the pages to the MIGRATE_MOVABLE
freelist, marking them ready for allocation. Similar to offline_pages(),
we have to manually adjust zone->nr_isolate_pageblock.

[1] https://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
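
Sketch of the window being closed (simplified, see the diff for the
actual ordering):

	/* before: */
	(*online_page_callback)(page, order);	/* allocatable via the buddy,
						 * but the section is not
						 * marked online yet */
	online_mem_sections(start_pfn, end_pfn);

	/* after: pageblocks start out MIGRATE_ISOLATE, so nothing can be
	 * allocated until undo_isolate_page_range() at the very end */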

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/Kconfig          |  2 +-
 mm/memory_hotplug.c | 32 ++++++++++++++++++++++----------
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 6c974888f86f9..85a16ce1dbc49 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -152,6 +152,7 @@ config HAVE_BOOTMEM_INFO_NODE
 # eventually, we can have this option just 'select SPARSEMEM'
 config MEMORY_HOTPLUG
 	bool "Allow for memory hot-add"
+	select MEMORY_ISOLATION
 	depends on SPARSEMEM || X86_64_ACPI_NUMA
 	depends on ARCH_ENABLE_MEMORY_HOTPLUG
 	depends on 64BIT || BROKEN
@@ -178,7 +179,6 @@ config MEMORY_HOTPLUG_DEFAULT_ONLINE
 
 config MEMORY_HOTREMOVE
 	bool "Allow for memory hot remove"
-	select MEMORY_ISOLATION
 	select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
 	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
 	depends on MIGRATION
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1c16a5def781e..35d56cbd3e45b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -805,7 +805,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 
 	/* associate pfn range with the zone */
 	zone = zone_for_pfn_range(online_type, nid, pfn, nr_pages);
-	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
+	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
 
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
@@ -816,6 +816,14 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	if (ret)
 		goto failed_addition;
 
+	/*
+	 * Fixup the number of isolated pageblocks before marking the sections
+	 * online, such that undo_isolate_page_range() works correctly.
+	 */
+	spin_lock_irqsave(&zone->lock, flags);
+	zone->nr_isolate_pageblock += nr_pages / pageblock_nr_pages;
+	spin_unlock_irqrestore(&zone->lock, flags);
+
 	/*
 	 * If this zone is not populated, then it is not in zonelist.
 	 * This means the page allocator ignores this zone.
@@ -833,21 +841,25 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	zone->zone_pgdat->node_present_pages += nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
+	node_states_set_node(nid, &arg);
+	if (need_zonelists_rebuild)
+		build_all_zonelists(NULL);
+	zone_pcp_update(zone);
+
+	/* Basic onlining is complete, allow allocation of onlined pages. */
+	undo_isolate_page_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE);
+
 	/*
 	 * When exposing larger, physically contiguous memory areas to the
 	 * buddy, shuffling in the buddy (when freeing onlined pages, putting
 	 * them either to the head or the tail of the freelist) is only helpful
 	 * for maintaining the shuffle, but not for creating the initial
 	 * shuffle. Shuffle the whole zone to make sure the just onlined pages
-	 * are properly distributed across the whole freelist.
+	 * are properly distributed across the whole freelist. Make sure to
+	 * shuffle once pageblocks are no longer isolated.
 	 */
 	shuffle_zone(zone);
 
-	node_states_set_node(nid, &arg);
-	if (need_zonelists_rebuild)
-		build_all_zonelists(NULL);
-	zone_pcp_update(zone);
-
 	init_per_zone_wmark_min();
 
 	kswapd_run(nid);
@@ -1550,9 +1562,9 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	pr_info("Offlined Pages %ld\n", nr_pages);
 
 	/*
-	 * Onlining will reset pagetype flags and makes migrate type
-	 * MOVABLE, so just need to decrease the number of isolated
-	 * pageblocks zone counter here.
+	 * The memory sections are marked offline, and the pageblock flags are
+	 * effectively stale; nobody should be touching them. Fixup the number
+	 * of isolated pageblocks, memory onlining will properly revert this.
 	 */
 	spin_lock_irqsave(&zone->lock, flags);
 	zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 01/11] mm/memory_hotplug: inline __offline_pages() into offline_pages()
  2020-08-19 10:11 ` [PATCH v1 01/11] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
@ 2020-08-19 12:23   ` Michal Hocko
  0 siblings, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 12:23 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 12:11:47, David Hildenbrand wrote:
> There is only a single user, offline_pages(). Let's inline it, to make
> it look more similar to online_pages().
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory_hotplug.c | 16 +++++-----------
>  1 file changed, 5 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 7f62d69660e06..c781d386d87f9 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1473,11 +1473,10 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
>  	return 0;
>  }
>  
> -static int __ref __offline_pages(unsigned long start_pfn,
> -		  unsigned long end_pfn)
> +int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  {
> -	unsigned long pfn, nr_pages = 0;
> -	unsigned long offlined_pages = 0;
> +	const unsigned long end_pfn = start_pfn + nr_pages;
> +	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
>  	int ret, node, nr_isolate_pageblock;
>  	unsigned long flags;
>  	struct zone *zone;
> @@ -1494,9 +1493,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	 * memory holes PG_reserved, don't need pfn_valid() checks, and can
>  	 * avoid using walk_system_ram_range() later.
>  	 */
> -	walk_system_ram_range(start_pfn, end_pfn - start_pfn, &nr_pages,
> +	walk_system_ram_range(start_pfn, nr_pages, &system_ram_pages,
>  			      count_system_ram_pages_cb);
> -	if (nr_pages != end_pfn - start_pfn) {
> +	if (system_ram_pages != nr_pages) {
>  		ret = -EINVAL;
>  		reason = "memory holes";
>  		goto failed_removal;
> @@ -1631,11 +1630,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	return ret;
>  }
>  
> -int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> -{
> -	return __offline_pages(start_pfn, start_pfn + nr_pages);
> -}
> -
>  static int check_memblock_offlined_cb(struct memory_block *mem, void *arg)
>  {
>  	int ret = !is_memblock_offlined(mem);
> -- 
> 2.26.2
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 02/11] mm/memory_hotplug: enforce section granularity when onlining/offlining
  2020-08-19 10:11 ` [PATCH v1 02/11] mm/memory_hotplug: enforce section granularity when onlining/offlining David Hildenbrand
@ 2020-08-19 12:37   ` Michal Hocko
  2020-08-19 12:43     ` David Hildenbrand
  0 siblings, 1 reply; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 12:37 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 12:11:48, David Hildenbrand wrote:
> Already two people (including me) have tried to offline subsections,
> because the function looks like it can deal with them. But we really can
> only online/offline full sections (e.g., we can only mark full sections
> online/offline via SECTION_IS_ONLINE).
> 
> Add a simple safety net to document the restriction now. Current users
> (core and powernv/memtrace) respect these restrictions.

I do agree with the warning because it clarifies our expectations
indeed. See below for more questions.

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index c781d386d87f9..6856702af68d9 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -801,6 +801,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  	int ret;
>  	struct memory_notify arg;
>  
> +	/* We can only online full sections (e.g., SECTION_IS_ONLINE) */
> +	if (WARN_ON_ONCE(!nr_pages ||
> +			 !IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION)))
> +		return -EINVAL;

This looks unnecessarily cryptic to me. Do you want to catch a full
section operation that doesn't start at the usual section boundary? If
yes, the above doesn't work unless I am missing something.

Why don't you simply WARN_ON_ONCE(nr_pages % PAGES_PER_SECTION)?
!nr_pages doesn't sound like something interesting to care about, or why
do we care?

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 03/11] mm/memory_hotplug: simplify checking if all pages are isolated in offline_pages()
  2020-08-19 10:11 ` [PATCH v1 03/11] mm/memory_hotplug: simplify checking if all pages are isolated in offline_pages() David Hildenbrand
@ 2020-08-19 12:39   ` Michal Hocko
  2020-08-31 10:10   ` Pankaj Gupta
  1 sibling, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 12:39 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 12:11:49, David Hildenbrand wrote:
> We make sure that we cannot have any memory holes right at the beginning
> of offline_pages(). We no longer need walk_system_ram_range() and can
> call test_pages_isolated() directly.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory_hotplug.c | 16 +---------------
>  1 file changed, 1 insertion(+), 15 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 6856702af68d9..f64478349148d 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1384,17 +1384,6 @@ offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
>  	return 0;
>  }
>  
> -/*
> - * Check all pages in range, recorded as memory resource, are isolated.
> - */
> -static int
> -check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
> -			void *data)
> -{
> -	return test_pages_isolated(start_pfn, start_pfn + nr_pages,
> -				   MEMORY_OFFLINE);
> -}
> -
>  static int __init cmdline_parse_movable_node(char *p)
>  {
>  	movable_node_enabled = true;
> @@ -1579,10 +1568,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  			reason = "failure to dissolve huge pages";
>  			goto failed_removal_isolated;
>  		}
> -		/* check again */
> -		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> -					    NULL, check_pages_isolated_cb);
> -	} while (ret);
> +	} while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
>  
>  	/* Ok, all of our target is isolated.
>  	   We cannot do rollback at this point. */
> -- 
> 2.26.2

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 04/11] mm/memory_hotplug: simplify offlining of pages in offline_pages()
  2020-08-19 10:11 ` [PATCH v1 04/11] mm/memory_hotplug: simplify offlining of pages " David Hildenbrand
@ 2020-08-19 12:40   ` Michal Hocko
  2020-08-19 12:44     ` David Hildenbrand
  0 siblings, 1 reply; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 12:40 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 12:11:50, David Hildenbrand wrote:
> We make sure that we cannot have any memory holes right at the beginning
> of offline_pages(). We no longer need walk_system_ram_range() and can
> call __offline_isolated_pages() directly.
> 
> offlined_pages always corresponds to nr_pages, so we can simplify that.

I would probably fold this into the previous patch as they are dealing
with the same thing.

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory_hotplug.c | 28 ++++++++--------------------
>  1 file changed, 8 insertions(+), 20 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index f64478349148d..50aa5df696e9d 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1373,17 +1373,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  	return ret;
>  }
>  
> -/* Mark all sections offline and remove all free pages from the buddy. */
> -static int
> -offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
> -			void *data)
> -{
> -	unsigned long *offlined_pages = (unsigned long *)data;
> -
> -	*offlined_pages += __offline_isolated_pages(start, start + nr_pages);
> -	return 0;
> -}
> -
>  static int __init cmdline_parse_movable_node(char *p)
>  {
>  	movable_node_enabled = true;
> @@ -1470,7 +1459,7 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
>  int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  {
>  	const unsigned long end_pfn = start_pfn + nr_pages;
> -	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
> +	unsigned long pfn, system_ram_pages = 0;
>  	int ret, node, nr_isolate_pageblock;
>  	unsigned long flags;
>  	struct zone *zone;
> @@ -1570,11 +1559,10 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  		}
>  	} while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
>  
> -	/* Ok, all of our target is isolated.
> -	   We cannot do rollback at this point. */
> -	walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> -			      &offlined_pages, offline_isolated_pages_cb);
> -	pr_info("Offlined Pages %ld\n", offlined_pages);
> +	/* Mark all sections offline and remove free pages from the buddy. */
> +	__offline_isolated_pages(start_pfn, end_pfn);
> +	pr_info("Offlined Pages %ld\n", nr_pages);
> +
>  	/*
>  	 * Onlining will reset pagetype flags and makes migrate type
>  	 * MOVABLE, so just need to decrease the number of isolated
> @@ -1585,11 +1573,11 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  	spin_unlock_irqrestore(&zone->lock, flags);
>  
>  	/* removal success */
> -	adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
> -	zone->present_pages -= offlined_pages;
> +	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
> +	zone->present_pages -= nr_pages;
>  
>  	pgdat_resize_lock(zone->zone_pgdat, &flags);
> -	zone->zone_pgdat->node_present_pages -= offlined_pages;
> +	zone->zone_pgdat->node_present_pages -= nr_pages;
>  	pgdat_resize_unlock(zone->zone_pgdat, &flags);
>  
>  	init_per_zone_wmark_min();
> -- 
> 2.26.2
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 02/11] mm/memory_hotplug: enforce section granularity when onlining/offlining
  2020-08-19 12:37   ` Michal Hocko
@ 2020-08-19 12:43     ` David Hildenbrand
  2020-08-19 12:54       ` Michal Hocko
  0 siblings, 1 reply; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 12:43 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On 19.08.20 14:37, Michal Hocko wrote:
> On Wed 19-08-20 12:11:48, David Hildenbrand wrote:
>> Already two people (including me) tried to offline subsections, because
>> the function looks like it can deal with it. But we really can only
>> online/offline full sections (e.g., we can only mark full sections
>> online/offline via SECTION_IS_ONLINE).
>>
>> Add a simple safety net to document the restriction now. Current users
>> (core and powernv/memtrace) respect these restrictions.
> 
> I do agree with the warning because it clarifies our expectations
> indeed. See below for more questions.
> 
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  mm/memory_hotplug.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index c781d386d87f9..6856702af68d9 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -801,6 +801,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>>  	int ret;
>>  	struct memory_notify arg;
>>  
>> +	/* We can only online full sections (e.g., SECTION_IS_ONLINE) */
>> +	if (WARN_ON_ONCE(!nr_pages ||
>> +			 !IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION)))
>> +		return -EINVAL;
> 
> This looks unnecessarily cryptic to me. Do you want to catch full
> section operation that doesn't start at the usual section boundary? If
> yes the above doesn't work unless I am missing something.
> 
> Why don't you simply WARN_ON_ONCE(nr_pages % PAGES_PER_SECTION).
> !nr_pages doesn't sound like something interesting to care about or why
> do we care?
> 

Also the start pfn has to be section aligned, so we always cover fully
aligned sections (e.g., not two partial ones).

It's essentially a compressed version of

!nr_pages || !IS_ALIGNED(pfn, PAGES_PER_SECTION) || !IS_ALIGNED(nr_pages,
PAGES_PER_SECTION)

which is the same as

!nr_pages || pfn % PAGES_PER_SECTION || nr_pages % PAGES_PER_SECTION

or

!nr_pages || (pfn | nr_pages) % PAGES_PER_SECTION

I consider IS_ALIGNED easier to read than % PAGES_PER_SECTION. I can
certainly un-compress, whatever you prefer, thanks.
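
If it helps, here is a tiny userspace demo of the equivalence.
PAGES_PER_SECTION is just an example value below (1 << 15, as on x86-64
with 4k pages); any power of two behaves the same, and the simplified
IS_ALIGNED() drops the typeof() cast the kernel macro has:

#include <assert.h>

#define PAGES_PER_SECTION	32768UL	/* example value */
#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

int main(void)
{
	unsigned long pfn, nr_pages;

	for (pfn = 0; pfn < 2 * PAGES_PER_SECTION; pfn++)
		for (nr_pages = 0; nr_pages < 2 * PAGES_PER_SECTION;
		     nr_pages += 512) {
			/* the compressed check ... */
			int c = IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION);
			/* ... vs. the expanded one */
			int e = IS_ALIGNED(pfn, PAGES_PER_SECTION) &&
				IS_ALIGNED(nr_pages, PAGES_PER_SECTION);

			assert(c == e);
		}
	return 0;
}

OR-ing the operands first cannot hide a set low bit in either of them, so
both forms test exactly the same thing.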

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 04/11] mm/memory_hotplug: simplify offlining of pages in offline_pages()
  2020-08-19 12:40   ` Michal Hocko
@ 2020-08-19 12:44     ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 12:44 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On 19.08.20 14:40, Michal Hocko wrote:
> On Wed 19-08-20 12:11:50, David Hildenbrand wrote:
>> We make sure that we cannot have any memory holes right at the beginning
>> of offline_pages(). We no longer need walk_system_ram_range() and can
>> call __offline_isolated_pages() directly.
>>
>> offlined_pages always corresponds to nr_pages, so we can simplify that.
> 
> I would probably fold this into the previous patch as they are dealing
> with the same thing.

Will do, shouldn't harm readability. Thanks!

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 05/11] mm/page_alloc: simplify __offline_isolated_pages()
  2020-08-19 10:11 ` [PATCH v1 05/11] mm/page_alloc: simplify __offline_isolated_pages() David Hildenbrand
@ 2020-08-19 12:48   ` Michal Hocko
  0 siblings, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 12:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 12:11:51, David Hildenbrand wrote:
> offline_pages() is the only user.  __offline_isolated_pages() never gets
> called with ranges that contain memory holes and we no longer care about
> the return value. Drop the return value handling and all pfn_valid()
> checks.
> 
> Update the documentation.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/linux/memory_hotplug.h |  4 ++--
>  mm/page_alloc.c                | 27 ++++-----------------------
>  2 files changed, 6 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 375515803cd83..0b461691d1a49 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -103,8 +103,8 @@ extern int online_pages(unsigned long pfn, unsigned long nr_pages,
>  			int online_type, int nid);
>  extern struct zone *test_pages_in_a_zone(unsigned long start_pfn,
>  					 unsigned long end_pfn);
> -extern unsigned long __offline_isolated_pages(unsigned long start_pfn,
> -						unsigned long end_pfn);
> +extern void __offline_isolated_pages(unsigned long start_pfn,
> +				     unsigned long end_pfn);
>  
>  typedef void (*online_page_callback_t)(struct page *page, unsigned int order);
>  
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cf0b25161feae..03f585f95dc60 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8692,35 +8692,21 @@ void zone_pcp_reset(struct zone *zone)
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
>  /*
> - * All pages in the range must be in a single zone and isolated
> - * before calling this.
> + * All pages in the range must be in a single zone, must not contain holes,
> + * must span full sections, and must be isolated before calling this function.
>   */
> -unsigned long
> -__offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
> +void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
>  {
> +	unsigned long pfn = start_pfn;
>  	struct page *page;
>  	struct zone *zone;
>  	unsigned int order;
> -	unsigned long pfn;
>  	unsigned long flags;
> -	unsigned long offlined_pages = 0;
> -
> -	/* find the first valid pfn */
> -	for (pfn = start_pfn; pfn < end_pfn; pfn++)
> -		if (pfn_valid(pfn))
> -			break;
> -	if (pfn == end_pfn)
> -		return offlined_pages;
>  
>  	offline_mem_sections(pfn, end_pfn);
>  	zone = page_zone(pfn_to_page(pfn));
>  	spin_lock_irqsave(&zone->lock, flags);
> -	pfn = start_pfn;
>  	while (pfn < end_pfn) {
> -		if (!pfn_valid(pfn)) {
> -			pfn++;
> -			continue;
> -		}
>  		page = pfn_to_page(pfn);
>  		/*
>  		 * The HWPoisoned page may be not in buddy system, and
> @@ -8728,7 +8714,6 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
>  		 */
>  		if (unlikely(!PageBuddy(page) && PageHWPoison(page))) {
>  			pfn++;
> -			offlined_pages++;
>  			continue;
>  		}
>  		/*
> @@ -8739,20 +8724,16 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
>  			BUG_ON(page_count(page));
>  			BUG_ON(PageBuddy(page));
>  			pfn++;
> -			offlined_pages++;
>  			continue;
>  		}
>  
>  		BUG_ON(page_count(page));
>  		BUG_ON(!PageBuddy(page));
>  		order = page_order(page);
> -		offlined_pages += 1 << order;
>  		del_page_from_free_list(page, zone, order);
>  		pfn += (1 << order);
>  	}
>  	spin_unlock_irqrestore(&zone->lock, flags);
> -
> -	return offlined_pages;
>  }
>  #endif
>  
> -- 
> 2.26.2
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 02/11] mm/memory_hotplug: enforce section granularity when onlining/offlining
  2020-08-19 12:43     ` David Hildenbrand
@ 2020-08-19 12:54       ` Michal Hocko
  0 siblings, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 12:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 14:43:28, David Hildenbrand wrote:
> On 19.08.20 14:37, Michal Hocko wrote:
> > On Wed 19-08-20 12:11:48, David Hildenbrand wrote:
> >> Already two people (including me) tried to offline subsections, because
> >> the function looks like it can deal with it. But we really can only
> >> online/offline full sections (e.g., we can only mark full sections
> >> online/offline via SECTION_IS_ONLINE).
> >>
> >> Add a simple safety net to document the restriction now. Current users
> >> (core and powernv/memtrace) respect these restrictions.
> > 
> > I do agree with the warning because it clarifies our expectations
> > indeed. See below for more questions.
> > 
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: Michal Hocko <mhocko@suse.com>
> >> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> >> Cc: Baoquan He <bhe@redhat.com>
> >> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> >> Cc: Oscar Salvador <osalvador@suse.de>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> >> ---
> >>  mm/memory_hotplug.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> >> index c781d386d87f9..6856702af68d9 100644
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -801,6 +801,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
> >>  	int ret;
> >>  	struct memory_notify arg;
> >>  
> >> +	/* We can only online full sections (e.g., SECTION_IS_ONLINE) */
> >> +	if (WARN_ON_ONCE(!nr_pages ||
> >> +			 !IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION)))
> >> +		return -EINVAL;
> > 
> > This looks unnecessarily cryptic to me. Do you want to catch full
> > section operation that doesn't start at the usual section boundary? If
> > yes the above doesn't work unless I am missing something.
> > 
> > Why don't you simply WARN_ON_ONCE(nr_pages % PAGES_PER_SECTION).
> > !nr_pages doesn't sound like something interesting to care about or why
> > do we care?
> > 
> 
> Also the start pfn has to be section aligned, so we always cover fully
> aligned sections (e.g., not two partial ones).

OK, I've misread your intention. I thought that we check for the start
pfn prior to this warning but we only do that after.

Acked-by: Michal Hocko <mhocko@suse.com>
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 06/11] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages()
  2020-08-19 10:11 ` [PATCH v1 06/11] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages() David Hildenbrand
@ 2020-08-19 12:58   ` Michal Hocko
  0 siblings, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 12:58 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 12:11:52, David Hildenbrand wrote:
> We make sure that we cannot have any memory holes right at the beginning
> of offline_pages(), and we only support onlining/offlining full sections.
> Both sections and pageblocks are a power of two in size, and sections
> always span full pageblocks.
> 
> We can directly calculate the number of isolated pageblocks from nr_pages.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>
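
Just to make the arithmetic concrete, with x86-64 defaults (4k pages;
other configs differ):

	PAGES_PER_SECTION  = 1 << 15 = 32768	/* 128 MiB section  */
	pageblock_nr_pages = 1 << 9  = 512	/* 2 MiB pageblock  */

so nr_pages / pageblock_nr_pages gives 64 isolated pageblocks per offlined
section, and as both values are powers of two there is never a remainder.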

> ---
>  mm/memory_hotplug.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 50aa5df696e9d..098361fcb4504 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1460,10 +1460,10 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  {
>  	const unsigned long end_pfn = start_pfn + nr_pages;
>  	unsigned long pfn, system_ram_pages = 0;
> -	int ret, node, nr_isolate_pageblock;
>  	unsigned long flags;
>  	struct zone *zone;
>  	struct memory_notify arg;
> +	int ret, node;
>  	char *reason;
>  
>  	/* We can only offline full sections (e.g., SECTION_IS_ONLINE) */
> @@ -1507,7 +1507,6 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  		reason = "failure to isolate range";
>  		goto failed_removal;
>  	}
> -	nr_isolate_pageblock = ret;
>  
>  	arg.start_pfn = start_pfn;
>  	arg.nr_pages = nr_pages;
> @@ -1569,7 +1568,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  	 * pageblocks zone counter here.
>  	 */
>  	spin_lock_irqsave(&zone->lock, flags);
> -	zone->nr_isolate_pageblock -= nr_isolate_pageblock;
> +	zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;
>  	spin_unlock_irqrestore(&zone->lock, flags);
>  
>  	/* removal success */
> -- 
> 2.26.2
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 07/11] mm/page_isolation: simplify return value of start_isolate_page_range()
  2020-08-19 10:11 ` [PATCH v1 07/11] mm/page_isolation: simplify return value of start_isolate_page_range() David Hildenbrand
@ 2020-08-19 13:00   ` Michal Hocko
  0 siblings, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 13:00 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 12:11:53, David Hildenbrand wrote:
> Callers no longer need the number of isolated pageblocks. Let's
> simplify.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory_hotplug.c | 2 +-
>  mm/page_alloc.c     | 2 +-
>  mm/page_isolation.c | 7 ++-----
>  3 files changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 098361fcb4504..0011a1115381c 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1503,7 +1503,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  	ret = start_isolate_page_range(start_pfn, end_pfn,
>  				       MIGRATE_MOVABLE,
>  				       MEMORY_OFFLINE | REPORT_FAILURE);
> -	if (ret < 0) {
> +	if (ret) {
>  		reason = "failure to isolate range";
>  		goto failed_removal;
>  	}
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 03f585f95dc60..848664352dfe2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8456,7 +8456,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  
>  	ret = start_isolate_page_range(pfn_max_align_down(start),
>  				       pfn_max_align_up(end), migratetype, 0);
> -	if (ret < 0)
> +	if (ret)
>  		return ret;
>  
>  	/*
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 242c03121d731..b066c860e606e 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -170,8 +170,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
>   * pageblocks we may have modified and return -EBUSY to caller. This
>   * prevents two threads from simultaneously working on overlapping ranges.
>   *
> - * Return: the number of isolated pageblocks on success and -EBUSY if any part
> - * of range cannot be isolated.
> + * Return: 0 on success and -EBUSY if any part of range cannot be isolated.
>   */
>  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>  			     unsigned migratetype, int flags)
> @@ -179,7 +178,6 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>  	unsigned long pfn;
>  	unsigned long undo_pfn;
>  	struct page *page;
> -	int nr_isolate_pageblock = 0;
>  
>  	BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
>  	BUG_ON(!IS_ALIGNED(end_pfn, pageblock_nr_pages));
> @@ -193,10 +191,9 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>  				undo_pfn = pfn;
>  				goto undo;
>  			}
> -			nr_isolate_pageblock++;
>  		}
>  	}
> -	return nr_isolate_pageblock;
> +	return 0;
>  undo:
>  	for (pfn = start_pfn;
>  	     pfn < undo_pfn;
> -- 
> 2.26.2

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 08/11] mm/memory_hotplug: simplify page onlining
  2020-08-19 10:11 ` [PATCH v1 08/11] mm/memory_hotplug: simplify page onlining David Hildenbrand
@ 2020-08-19 13:02   ` Michal Hocko
  0 siblings, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 13:02 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed 19-08-20 12:11:54, David Hildenbrand wrote:
> We don't allow offlining memory with holes, all boot memory is online,
> and all hotplugged memory cannot have holes.
> 
> We can now simplify onlining of pages. As we only allow onlining/offlining
> full sections and sections always span full MAX_ORDER_NR_PAGES, we can just
> process the pages in MAX_ORDER - 1 aligned chunks without further special
> handling.
> 
> The number of onlined pages simply corresponds to the number of pages we
> were requested to online.
> 
> While at it, refine the comment regarding the callback not exposing all
> pages to the buddy.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>
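
As a quick sanity check of the new loop, with x86-64 defaults (4k pages;
other configs differ): MAX_ORDER_NR_PAGES = 1 << 10 = 1024 (4 MiB) and
PAGES_PER_SECTION = 1 << 15 = 32768, so onlining a single 128 MiB section
becomes exactly 32 callback invocations at order MAX_ORDER - 1, and the
section-granularity check from patch #2 guarantees there is never an
unaligned tail chunk.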

> ---
>  mm/memory_hotplug.c | 38 ++++++++++----------------------------
>  1 file changed, 10 insertions(+), 28 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0011a1115381c..3aba0d956f9b1 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -617,31 +617,22 @@ void generic_online_page(struct page *page, unsigned int order)
>  }
>  EXPORT_SYMBOL_GPL(generic_online_page);
>  
> -static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> -			void *arg)
> +static void online_pages_range(unsigned long start_pfn, unsigned long nr_pages)
>  {
>  	const unsigned long end_pfn = start_pfn + nr_pages;
>  	unsigned long pfn;
> -	int order;
>  
>  	/*
> -	 * Online the pages. The callback might decide to keep some pages
> -	 * PG_reserved (to add them to the buddy later), but we still account
> -	 * them as being online/belonging to this zone ("present").
> +	 * Online the pages in MAX_ORDER - 1 aligned chunks. The callback might
> +	 * decide to not expose all pages to the buddy (e.g., expose them
> +	 * later). We account all pages as being online and belonging to this
> +	 * zone ("present").
>  	 */
> -	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
> -		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
> -		/* __free_pages_core() wants pfns to be aligned to the order */
> -		if (WARN_ON_ONCE(!IS_ALIGNED(pfn, 1ul << order)))
> -			order = 0;
> -		(*online_page_callback)(pfn_to_page(pfn), order);
> -	}
> +	for (pfn = start_pfn; pfn < end_pfn; pfn += MAX_ORDER_NR_PAGES)
> +		(*online_page_callback)(pfn_to_page(pfn), MAX_ORDER - 1);
>  
>  	/* mark all involved sections as online */
>  	online_mem_sections(start_pfn, end_pfn);
> -
> -	*(unsigned long *)arg += nr_pages;
> -	return 0;
>  }
>  
>  /* check which state of node_states will be changed when online memory */
> @@ -795,7 +786,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  		       int online_type, int nid)
>  {
>  	unsigned long flags;
> -	unsigned long onlined_pages = 0;
>  	struct zone *zone;
>  	int need_zonelists_rebuild = 0;
>  	int ret;
> @@ -831,19 +821,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  		setup_zone_pageset(zone);
>  	}
>  
> -	ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages,
> -		online_pages_range);
> -	if (ret) {
> -		/* not a single memory resource was applicable */
> -		if (need_zonelists_rebuild)
> -			zone_pcp_reset(zone);
> -		goto failed_addition;
> -	}
> -
> -	zone->present_pages += onlined_pages;
> +	online_pages_range(pfn, nr_pages);
> +	zone->present_pages += nr_pages;
>  
>  	pgdat_resize_lock(zone->zone_pgdat, &flags);
> -	zone->zone_pgdat->node_present_pages += onlined_pages;
> +	zone->zone_pgdat->node_present_pages += nr_pages;
>  	pgdat_resize_unlock(zone->zone_pgdat, &flags);
>  
>  	/*
> -- 
> 2.26.2
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 09/11] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*()
  2020-08-19 10:11 ` [PATCH v1 09/11] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() David Hildenbrand
@ 2020-08-19 13:05   ` Michal Hocko
  0 siblings, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 13:05 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador, Mel Gorman

On Wed 19-08-20 12:11:55, David Hildenbrand wrote:
> Commit ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are
> set during struct page init") moved the actual zone range check, leaving
> only the alignment check for pageblocks.
> 
> Let's drop the stale comment and make the pageblock check easier to read.

I do agree that IS_ALIGNED is easier to read in this case.
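
(For reference, include/linux/kernel.h defines

	#define IS_ALIGNED(x, a)	(((x) & ((typeof(x))(a) - 1)) == 0)

so it is the exact same bitmask test as the open-coded variant, just with
the intent spelled out.)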

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Mel Gorman <mgorman@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/page_alloc.c | 14 ++------------
>  1 file changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 848664352dfe2..5db0b35f95e20 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6022,13 +6022,8 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  		 * to reserve their blocks rather than leaking throughout
>  		 * the address space during boot when many long-lived
>  		 * kernel allocations are made.
> -		 *
> -		 * bitmap is created for zone's valid pfn range. but memmap
> -		 * can be created for invalid pages (for alignment)
> -		 * check here not to call set_pageblock_migratetype() against
> -		 * pfn out of zone.
>  		 */
> -		if (!(pfn & (pageblock_nr_pages - 1))) {
> +		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
>  			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
>  			cond_resched();
>  		}
> @@ -6091,15 +6086,10 @@ void __ref memmap_init_zone_device(struct zone *zone,
>  		 * the address space during boot when many long-lived
>  		 * kernel allocations are made.
>  		 *
> -		 * bitmap is created for zone's valid pfn range. but memmap
> -		 * can be created for invalid pages (for alignment)
> -		 * check here not to call set_pageblock_migratetype() against
> -		 * pfn out of zone.
> -		 *
>  		 * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
>  		 * because this is done early in section_activate()
>  		 */
> -		if (!(pfn & (pageblock_nr_pages - 1))) {
> +		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
>  			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
>  			cond_resched();
>  		}
> -- 
> 2.26.2
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 10/11] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone()
  2020-08-19 10:11 ` [PATCH v1 10/11] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() David Hildenbrand
@ 2020-08-19 13:08   ` Michal Hocko
  0 siblings, 0 replies; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 13:08 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador, Tony Luck, Fenghua Yu,
	Logan Gunthorpe, Dan Williams, Mike Rapoport,
	Matthew Wilcox (Oracle),
	Michel Lespinasse, linux-ia64

On Wed 19-08-20 12:11:56, David Hildenbrand wrote:
> On the memory hotplug path, we want to start with MIGRATE_ISOLATE, to
> un-isolate the pages after memory onlining is complete. Let's allow
> passing in the migratetype.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Cc: Michel Lespinasse <walken@google.com>
> Cc: linux-ia64@vger.kernel.org
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 11/11] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory
  2020-08-19 10:11 ` [PATCH v1 11/11] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory David Hildenbrand
@ 2020-08-19 13:16   ` Michal Hocko
  2020-08-19 13:19     ` David Hildenbrand
  0 siblings, 1 reply; 28+ messages in thread
From: Michal Hocko @ 2020-08-19 13:16 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador, Charan Teja Reddy

On Wed 19-08-20 12:11:57, David Hildenbrand wrote:
> Currently, it can happen that pages are allocated (and freed) via the buddy
> before we finished basic memory onlining.
> 
> For example, pages are exposed to the buddy and can be allocated before
> we actually mark the sections online. Allocated pages could suddenly
> fail pfn_to_online_page() checks. We had similar issues with pcp
> handling, when pages are allocated+freed before we reach
> zone_pcp_update() in online_pages() [1].
> 
> Instead, mark all pageblocks MIGRATE_ISOLATE, such that allocations are
> impossible. Once done with the heavy lifting, use
> undo_isolate_page_range() to move the pages to the MIGRATE_MOVABLE
> freelist, marking them ready for allocation. Similar to offline_pages(),
> we have to manually adjust zone->nr_isolate_pageblock.
> 
> [1] https://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Charan Teja Reddy <charante@codeaurora.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Yes this looks very sensible and we should have done that from the
beginning. I just have one minor comment below
> @@ -816,6 +816,14 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  	if (ret)
>  		goto failed_addition;
>  
> +	/*
> +	 * Fixup the number of isolated pageblocks before marking the sections
> +	 * onlining, such that undo_isolate_page_range() works correctly.
> +	 */
> +	spin_lock_irqsave(&zone->lock, flags);
> +	zone->nr_isolate_pageblock += nr_pages / pageblock_nr_pages;
> +	spin_unlock_irqrestore(&zone->lock, flags);
> +

I am not entirely happy about this. I am wondering whether it would make
more sense to keep the counter in sync already in memmap_init_zone. Sure
we add a branch to the boot time initialization - and it always fails
there - but the code would be cleaner and we wouldn't have to do tricks
like this in caller(s).
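
Roughly what I have in mind for memmap_init_zone(), on top of patch #10 -
completely untested, locking glossed over, just to illustrate the idea:

		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
			set_pageblock_migratetype(page, migratetype);
			/*
			 * Keep nr_isolate_pageblock in sync right where the
			 * pageblock is marked. The branch is never taken
			 * during boot, because boot memory is not
			 * initialized MIGRATE_ISOLATE.
			 */
			if (migratetype == MIGRATE_ISOLATE)
				page_zone(page)->nr_isolate_pageblock++;
			cond_resched();
		}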
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 11/11] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory
  2020-08-19 13:16   ` Michal Hocko
@ 2020-08-19 13:19     ` David Hildenbrand
  0 siblings, 0 replies; 28+ messages in thread
From: David Hildenbrand @ 2020-08-19 13:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador, Charan Teja Reddy

On 19.08.20 15:16, Michal Hocko wrote:
> On Wed 19-08-20 12:11:57, David Hildenbrand wrote:
>> Currently, it can happen that pages are allocated (and freed) via the buddy
>> before we finished basic memory onlining.
>>
>> For example, pages are exposed to the buddy and can be allocated before
>> we actually mark the sections online. Allocated pages could suddenly
>> fail pfn_to_online_page() checks. We had similar issues with pcp
>> handling, when pages are allocated+freed before we reach
>> zone_pcp_update() in online_pages() [1].
>>
>> Instead, mark all pageblocks MIGRATE_ISOLATE, such that allocations are
>> impossible. Once done with the heavy lifting, use
>> undo_isolate_page_range() to move the pages to the MIGRATE_MOVABLE
>> freelist, marking them ready for allocation. Similar to offline_pages(),
>> we have to manually adjust zone->nr_isolate_pageblock.
>>
>> [1] https://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Charan Teja Reddy <charante@codeaurora.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> Yes this looks very sensible and we should have done that from the
> beginning. I just have one minor comment below
>> @@ -816,6 +816,14 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>>  	if (ret)
>>  		goto failed_addition;
>>  
>> +	/*
>> +	 * Fixup the number of isolated pageblocks before marking the sections
>> +	 * onlining, such that undo_isolate_page_range() works correctly.
>> +	 */
>> +	spin_lock_irqsave(&zone->lock, flags);
>> +	zone->nr_isolate_pageblock += nr_pages / pageblock_nr_pages;
>> +	spin_unlock_irqrestore(&zone->lock, flags);
>> +
> 
> I am not entirely happy about this. I am wondering whether it would make
> more sense to keep the counter in sync already in memmap_init_zone. Sure
> we add a branch to the boot time initialization - and it always fails
> there - but the code would be cleaner and we wouldn't have to do tricks
> like this in caller(s).

I had that in mind initially. The issue is that we have to fixup in case
onlining fails, which I consider even more ugly. Also

1. It's the complete reverse of the offlining path now.
2. pageblock flags are essentially stale unless the section is online;
my approach moves the handling to the point where nothing else will go
wrong and we are just about to mark sections online. That looks a little
cleaner to me.

Unless there are strong opinions, I'd prefer to keep it like this.
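
To make point 1 more concrete: with the counter bumped during memmap init,
every error path taken after move_pfn_range_to_zone() would need an undo
along these lines (hypothetical, not part of this series):

	/* in the failure path of online_pages() */
	spin_lock_irqsave(&zone->lock, flags);
	zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;
	spin_unlock_irqrestore(&zone->lock, flags);

So it's the same adjustment either way, just spread across the error
handling instead of sitting right next to undo_isolate_page_range().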

Thanks for the very fast review Michal!

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v1 03/11] mm/memory_hotplug: simplify checking if all pages are isolated in offline_pages()
  2020-08-19 10:11 ` [PATCH v1 03/11] mm/memory_hotplug: simplify checking if all pages are isolated in offline_pages() David Hildenbrand
  2020-08-19 12:39   ` Michal Hocko
@ 2020-08-31 10:10   ` Pankaj Gupta
  1 sibling, 0 replies; 28+ messages in thread
From: Pankaj Gupta @ 2020-08-31 10:10 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: LKML, Linux MM, Andrew Morton, Michal Hocko, Wei Yang,
	Baoquan He, Oscar Salvador

> We make sure that we cannot have any memory holes right at the beginning
> of offline_pages(). We no longer need walk_system_ram_range() and can
> call test_pages_isolated() directly.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 16 +---------------
>  1 file changed, 1 insertion(+), 15 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 6856702af68d9..f64478349148d 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1384,17 +1384,6 @@ offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
>         return 0;
>  }
>
> -/*
> - * Check all pages in range, recorded as memory resource, are isolated.
> - */
> -static int
> -check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
> -                       void *data)
> -{
> -       return test_pages_isolated(start_pfn, start_pfn + nr_pages,
> -                                  MEMORY_OFFLINE);
> -}
> -
>  static int __init cmdline_parse_movable_node(char *p)
>  {
>         movable_node_enabled = true;
> @@ -1579,10 +1568,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>                         reason = "failure to dissolve huge pages";
>                         goto failed_removal_isolated;
>                 }
> -               /* check again */
> -               ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> -                                           NULL, check_pages_isolated_cb);
> -       } while (ret);
> +       } while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
>
>         /* Ok, all of our target is isolated.
>            We cannot do rollback at this point. */
> --
> 2.26.2

Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-08-31 10:11 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-19 10:11 [PATCH v1 00/11] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
2020-08-19 10:11 ` [PATCH v1 01/11] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
2020-08-19 12:23   ` Michal Hocko
2020-08-19 10:11 ` [PATCH v1 02/11] mm/memory_hotplug: enforce section granularity when onlining/offlining David Hildenbrand
2020-08-19 12:37   ` Michal Hocko
2020-08-19 12:43     ` David Hildenbrand
2020-08-19 12:54       ` Michal Hocko
2020-08-19 10:11 ` [PATCH v1 03/11] mm/memory_hotplug: simplify checking if all pages are isolated in offline_pages() David Hildenbrand
2020-08-19 12:39   ` Michal Hocko
2020-08-31 10:10   ` Pankaj Gupta
2020-08-19 10:11 ` [PATCH v1 04/11] mm/memory_hotplug: simplify offlining of pages " David Hildenbrand
2020-08-19 12:40   ` Michal Hocko
2020-08-19 12:44     ` David Hildenbrand
2020-08-19 10:11 ` [PATCH v1 05/11] mm/page_alloc: simplify __offline_isolated_pages() David Hildenbrand
2020-08-19 12:48   ` Michal Hocko
2020-08-19 10:11 ` [PATCH v1 06/11] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages() David Hildenbrand
2020-08-19 12:58   ` Michal Hocko
2020-08-19 10:11 ` [PATCH v1 07/11] mm/page_isolation: simplify return value of start_isolate_page_range() David Hildenbrand
2020-08-19 13:00   ` Michal Hocko
2020-08-19 10:11 ` [PATCH v1 08/11] mm/memory_hotplug: simplify page onlining David Hildenbrand
2020-08-19 13:02   ` Michal Hocko
2020-08-19 10:11 ` [PATCH v1 09/11] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() David Hildenbrand
2020-08-19 13:05   ` Michal Hocko
2020-08-19 10:11 ` [PATCH v1 10/11] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() David Hildenbrand
2020-08-19 13:08   ` Michal Hocko
2020-08-19 10:11 ` [PATCH v1 11/11] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory David Hildenbrand
2020-08-19 13:16   ` Michal Hocko
2020-08-19 13:19     ` David Hildenbrand
