linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups
@ 2020-08-19 17:59 David Hildenbrand
  2020-08-19 17:59 ` [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Baoquan He,
	Charan Teja Reddy, Dan Williams, Fenghua Yu, Logan Gunthorpe,
	Matthew Wilcox (Oracle),
	Mel Gorman, Michal Hocko, Michel Lespinasse, Mike Rapoport,
	Oscar Salvador, Pankaj Gupta, Tony Luck, Wei Yang

These are a bunch of cleanups for online_pages()/offline_pages() and
related code, mostly getting rid of memory hole handling that is no longer
necessary. There is only a single walk_system_ram_range() call left in
offline_pages(), to make sure we don't have any memory holes. I had some
of these patches lying around for a longer time but didn't have time to
polish them.

In addition, the last patch marks all pageblocks of memory to get onlined
MIGRATE_ISOLATE, so pages that have just been exposed to the buddy cannot
get allocated before onlining is complete. Once heavy lifting is done,
the pageblocks are set to MIGRATE_MOVABLE, such that allocations are
possible.

I played with DIMMs and virtio-mem on x86-64 and didn't spot any surprises.
I verified that the numer of isolated pageblocks is correctly handled when
onlining/offlining.

v1 -> v2:
- "mm/memory_hotplug: enforce section granularity when onlining/offlining"
-- Extended the patch description regarding alignment requirements
-- Fixed a typo
- Squashed "mm/memory_hotplug: simplify offlining of pages in
  offline_pages()" and "mm/memory_hotplug: simplify checking if all pages are
  isolated in offline_pages()", resulting in "mm/memory_hotplug: simplify page
  offlining"
- Added ACKs

David Hildenbrand (10):
  mm/memory_hotplug: inline __offline_pages() into offline_pages()
  mm/memory_hotplug: enforce section granularity when onlining/offlining
  mm/memory_hotplug: simplify page offlining
  mm/page_alloc: simplify __offline_isolated_pages()
  mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages()
  mm/page_isolation: simplify return value of start_isolate_page_range()
  mm/memory_hotplug: simplify page onlining
  mm/page_alloc: drop stale pageblock comment in memmap_init_zone*()
  mm: pass migratetype into memmap_init_zone() and
    move_pfn_range_to_zone()
  mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining
    memory

 arch/ia64/mm/init.c            |   4 +-
 include/linux/memory_hotplug.h |   7 +-
 include/linux/mm.h             |   3 +-
 mm/Kconfig                     |   2 +-
 mm/memory_hotplug.c            | 156 ++++++++++++++-------------------
 mm/memremap.c                  |   3 +-
 mm/page_alloc.c                |  64 ++++----------
 mm/page_isolation.c            |   7 +-
 8 files changed, 98 insertions(+), 148 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages()
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 10:26   ` Oscar Salvador
  2020-08-31 10:05   ` Pankaj Gupta
  2020-08-19 17:59 ` [PATCH v2 02/10] mm/memory_hotplug: enforce section granularity when onlining/offlining David Hildenbrand
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

There is only a single user, offline_pages(). Let's inline, to make
it look more similar to online_pages().

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7f62d69660e06..c781d386d87f9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1473,11 +1473,10 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
 	return 0;
 }
 
-static int __ref __offline_pages(unsigned long start_pfn,
-		  unsigned long end_pfn)
+int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
-	unsigned long pfn, nr_pages = 0;
-	unsigned long offlined_pages = 0;
+	const unsigned long end_pfn = start_pfn + nr_pages;
+	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
 	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
 	struct zone *zone;
@@ -1494,9 +1493,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	 * memory holes PG_reserved, don't need pfn_valid() checks, and can
 	 * avoid using walk_system_ram_range() later.
 	 */
-	walk_system_ram_range(start_pfn, end_pfn - start_pfn, &nr_pages,
+	walk_system_ram_range(start_pfn, nr_pages, &system_ram_pages,
 			      count_system_ram_pages_cb);
-	if (nr_pages != end_pfn - start_pfn) {
+	if (system_ram_pages != nr_pages) {
 		ret = -EINVAL;
 		reason = "memory holes";
 		goto failed_removal;
@@ -1631,11 +1630,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	return ret;
 }
 
-int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
-{
-	return __offline_pages(start_pfn, start_pfn + nr_pages);
-}
-
 static int check_memblock_offlined_cb(struct memory_block *mem, void *arg)
 {
 	int ret = !is_memblock_offlined(mem);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 02/10] mm/memory_hotplug: enforce section granularity when onlining/offlining
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
  2020-08-19 17:59 ` [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 10:39   ` Oscar Salvador
  2020-08-19 17:59 ` [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining David Hildenbrand
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

Already two people (including me) tried to offline subsections, because
the function looks like it can deal with it. But we really can only
online/offline full sections that are properly aligned (e.g., we can only
mark full sections online/offline via SECTION_IS_ONLINE).

Add a simple safety net to document the restriction now. Current users
(core and powernv/memtrace) respect these restrictions.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c781d386d87f9..6856702af68d9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -801,6 +801,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	int ret;
 	struct memory_notify arg;
 
+	/* We can only online full sections (e.g., SECTION_IS_ONLINE) */
+	if (WARN_ON_ONCE(!nr_pages ||
+			 !IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION)))
+		return -EINVAL;
+
 	mem_hotplug_begin();
 
 	/* associate pfn range with the zone */
@@ -1483,6 +1488,11 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	struct memory_notify arg;
 	char *reason;
 
+	/* We can only offline full sections (e.g., SECTION_IS_ONLINE) */
+	if (WARN_ON_ONCE(!nr_pages ||
+			 !IS_ALIGNED(start_pfn | nr_pages, PAGES_PER_SECTION)))
+		return -EINVAL;
+
 	mem_hotplug_begin();
 
 	/*
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
  2020-08-19 17:59 ` [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
  2020-08-19 17:59 ` [PATCH v2 02/10] mm/memory_hotplug: enforce section granularity when onlining/offlining David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 10:44   ` Oscar Salvador
  2020-09-03 21:58   ` Andrew Morton
  2020-08-19 17:59 ` [PATCH v2 04/10] mm/page_alloc: simplify __offline_isolated_pages() David Hildenbrand
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

We make sure that we cannot have any memory holes right at the beginning
of offline_pages(). We no longer need walk_system_ram_range() and can
call test_pages_isolated() and __offline_isolated_pages() directly.

offlined_pages always corresponds to nr_pages, so we can simplify that.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 46 ++++++++++-----------------------------------
 1 file changed, 10 insertions(+), 36 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6856702af68d9..50aa5df696e9d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1373,28 +1373,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 	return ret;
 }
 
-/* Mark all sections offline and remove all free pages from the buddy. */
-static int
-offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
-			void *data)
-{
-	unsigned long *offlined_pages = (unsigned long *)data;
-
-	*offlined_pages += __offline_isolated_pages(start, start + nr_pages);
-	return 0;
-}
-
-/*
- * Check all pages in range, recorded as memory resource, are isolated.
- */
-static int
-check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
-			void *data)
-{
-	return test_pages_isolated(start_pfn, start_pfn + nr_pages,
-				   MEMORY_OFFLINE);
-}
-
 static int __init cmdline_parse_movable_node(char *p)
 {
 	movable_node_enabled = true;
@@ -1481,7 +1459,7 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
 int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
-	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
+	unsigned long pfn, system_ram_pages = 0;
 	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
 	struct zone *zone;
@@ -1579,16 +1557,12 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 			reason = "failure to dissolve huge pages";
 			goto failed_removal_isolated;
 		}
-		/* check again */
-		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
-					    NULL, check_pages_isolated_cb);
-	} while (ret);
-
-	/* Ok, all of our target is isolated.
-	   We cannot do rollback at this point. */
-	walk_system_ram_range(start_pfn, end_pfn - start_pfn,
-			      &offlined_pages, offline_isolated_pages_cb);
-	pr_info("Offlined Pages %ld\n", offlined_pages);
+	} while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
+
+	/* Mark all sections offline and remove free pages from the buddy. */
+	__offline_isolated_pages(start_pfn, end_pfn);
+	pr_info("Offlined Pages %ld\n", nr_pages);
+
 	/*
 	 * Onlining will reset pagetype flags and makes migrate type
 	 * MOVABLE, so just need to decrease the number of isolated
@@ -1599,11 +1573,11 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	/* removal success */
-	adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
-	zone->present_pages -= offlined_pages;
+	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
+	zone->present_pages -= nr_pages;
 
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages -= offlined_pages;
+	zone->zone_pgdat->node_present_pages -= nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
 	init_per_zone_wmark_min();
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 04/10] mm/page_alloc: simplify __offline_isolated_pages()
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (2 preceding siblings ...)
  2020-08-19 17:59 ` [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 10:48   ` Oscar Salvador
  2020-08-19 17:59 ` [PATCH v2 05/10] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages() David Hildenbrand
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

offline_pages() is the only user. __offline_isolated_pages() never gets
called with ranges that contain memory holes and we no longer care about
the return value. Drop the return value handling and all pfn_valid()
checks.

Update the documentation.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/memory_hotplug.h |  4 ++--
 mm/page_alloc.c                | 27 ++++-----------------------
 2 files changed, 6 insertions(+), 25 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 375515803cd83..0b461691d1a49 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -103,8 +103,8 @@ extern int online_pages(unsigned long pfn, unsigned long nr_pages,
 			int online_type, int nid);
 extern struct zone *test_pages_in_a_zone(unsigned long start_pfn,
 					 unsigned long end_pfn);
-extern unsigned long __offline_isolated_pages(unsigned long start_pfn,
-						unsigned long end_pfn);
+extern void __offline_isolated_pages(unsigned long start_pfn,
+				     unsigned long end_pfn);
 
 typedef void (*online_page_callback_t)(struct page *page, unsigned int order);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cf0b25161feae..03f585f95dc60 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8692,35 +8692,21 @@ void zone_pcp_reset(struct zone *zone)
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
 /*
- * All pages in the range must be in a single zone and isolated
- * before calling this.
+ * All pages in the range must be in a single zone, must not contain holes,
+ * must span full sections, and must be isolated before calling this function.
  */
-unsigned long
-__offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
+void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 {
+	unsigned long pfn = start_pfn;
 	struct page *page;
 	struct zone *zone;
 	unsigned int order;
-	unsigned long pfn;
 	unsigned long flags;
-	unsigned long offlined_pages = 0;
-
-	/* find the first valid pfn */
-	for (pfn = start_pfn; pfn < end_pfn; pfn++)
-		if (pfn_valid(pfn))
-			break;
-	if (pfn == end_pfn)
-		return offlined_pages;
 
 	offline_mem_sections(pfn, end_pfn);
 	zone = page_zone(pfn_to_page(pfn));
 	spin_lock_irqsave(&zone->lock, flags);
-	pfn = start_pfn;
 	while (pfn < end_pfn) {
-		if (!pfn_valid(pfn)) {
-			pfn++;
-			continue;
-		}
 		page = pfn_to_page(pfn);
 		/*
 		 * The HWPoisoned page may be not in buddy system, and
@@ -8728,7 +8714,6 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 		 */
 		if (unlikely(!PageBuddy(page) && PageHWPoison(page))) {
 			pfn++;
-			offlined_pages++;
 			continue;
 		}
 		/*
@@ -8739,20 +8724,16 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 			BUG_ON(page_count(page));
 			BUG_ON(PageBuddy(page));
 			pfn++;
-			offlined_pages++;
 			continue;
 		}
 
 		BUG_ON(page_count(page));
 		BUG_ON(!PageBuddy(page));
 		order = page_order(page);
-		offlined_pages += 1 << order;
 		del_page_from_free_list(page, zone, order);
 		pfn += (1 << order);
 	}
 	spin_unlock_irqrestore(&zone->lock, flags);
-
-	return offlined_pages;
 }
 #endif
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 05/10] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages()
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (3 preceding siblings ...)
  2020-08-19 17:59 ` [PATCH v2 04/10] mm/page_alloc: simplify __offline_isolated_pages() David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 10:49   ` Oscar Salvador
  2020-08-19 17:59 ` [PATCH v2 06/10] mm/page_isolation: simplify return value of start_isolate_page_range() David Hildenbrand
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

We make sure that we cannot have any memory holes right at the beginning
of offline_pages() and we only support to online/offline full sections.
Both, sections and pageblocks are a power of two in size, and sections
always span full pageblocks.

We can directly calculate the number of isolated pageblocks from nr_pages.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 50aa5df696e9d..098361fcb4504 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1460,10 +1460,10 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
 	unsigned long pfn, system_ram_pages = 0;
-	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
 	struct zone *zone;
 	struct memory_notify arg;
+	int ret, node;
 	char *reason;
 
 	/* We can only offline full sections (e.g., SECTION_IS_ONLINE) */
@@ -1507,7 +1507,6 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 		reason = "failure to isolate range";
 		goto failed_removal;
 	}
-	nr_isolate_pageblock = ret;
 
 	arg.start_pfn = start_pfn;
 	arg.nr_pages = nr_pages;
@@ -1569,7 +1568,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	 * pageblocks zone counter here.
 	 */
 	spin_lock_irqsave(&zone->lock, flags);
-	zone->nr_isolate_pageblock -= nr_isolate_pageblock;
+	zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	/* removal success */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 06/10] mm/page_isolation: simplify return value of start_isolate_page_range()
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (4 preceding siblings ...)
  2020-08-19 17:59 ` [PATCH v2 05/10] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages() David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 10:51   ` Oscar Salvador
  2020-08-19 17:59 ` [PATCH v2 07/10] mm/memory_hotplug: simplify page onlining David Hildenbrand
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

Callers no longer need the number of isolated pageblocks. Let's
simplify.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 2 +-
 mm/page_alloc.c     | 2 +-
 mm/page_isolation.c | 7 ++-----
 3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 098361fcb4504..0011a1115381c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1503,7 +1503,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	ret = start_isolate_page_range(start_pfn, end_pfn,
 				       MIGRATE_MOVABLE,
 				       MEMORY_OFFLINE | REPORT_FAILURE);
-	if (ret < 0) {
+	if (ret) {
 		reason = "failure to isolate range";
 		goto failed_removal;
 	}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 03f585f95dc60..848664352dfe2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8456,7 +8456,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 
 	ret = start_isolate_page_range(pfn_max_align_down(start),
 				       pfn_max_align_up(end), migratetype, 0);
-	if (ret < 0)
+	if (ret)
 		return ret;
 
 	/*
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 242c03121d731..b066c860e606e 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -170,8 +170,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * pageblocks we may have modified and return -EBUSY to caller. This
  * prevents two threads from simultaneously working on overlapping ranges.
  *
- * Return: the number of isolated pageblocks on success and -EBUSY if any part
- * of range cannot be isolated.
+ * Return: 0 on success and -EBUSY if any part of range cannot be isolated.
  */
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 			     unsigned migratetype, int flags)
@@ -179,7 +178,6 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	unsigned long pfn;
 	unsigned long undo_pfn;
 	struct page *page;
-	int nr_isolate_pageblock = 0;
 
 	BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
 	BUG_ON(!IS_ALIGNED(end_pfn, pageblock_nr_pages));
@@ -193,10 +191,9 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 				undo_pfn = pfn;
 				goto undo;
 			}
-			nr_isolate_pageblock++;
 		}
 	}
-	return nr_isolate_pageblock;
+	return 0;
 undo:
 	for (pfn = start_pfn;
 	     pfn < undo_pfn;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 07/10] mm/memory_hotplug: simplify page onlining
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (5 preceding siblings ...)
  2020-08-19 17:59 ` [PATCH v2 06/10] mm/page_isolation: simplify return value of start_isolate_page_range() David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 10:55   ` Oscar Salvador
  2020-08-19 17:59 ` [PATCH v2 08/10] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() David Hildenbrand
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador

We don't allow to offline memory with holes, all boot memory is online,
and all hotplugged memory cannot have holes.

We can now simplify onlining of pages. As we only allow to online/offline
full sections and sections always span full MAX_ORDER_NR_PAGES, we can just
process MAX_ORDER - 1 pages without further special handling.

The number of onlined pages simply corresponds to the number of pages we
were requested to online.

While at it, refine the comment regarding the callback not exposing all
pages to the buddy.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 38 ++++++++++----------------------------
 1 file changed, 10 insertions(+), 28 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0011a1115381c..3aba0d956f9b1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -617,31 +617,22 @@ void generic_online_page(struct page *page, unsigned int order)
 }
 EXPORT_SYMBOL_GPL(generic_online_page);
 
-static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
-			void *arg)
+static void online_pages_range(unsigned long start_pfn, unsigned long nr_pages)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
 	unsigned long pfn;
-	int order;
 
 	/*
-	 * Online the pages. The callback might decide to keep some pages
-	 * PG_reserved (to add them to the buddy later), but we still account
-	 * them as being online/belonging to this zone ("present").
+	 * Online the pages in MAX_ORDER - 1 aligned chunks. The callback might
+	 * decide to not expose all pages to the buddy (e.g., expose them
+	 * later). We account all pages as being online and belonging to this
+	 * zone ("present").
 	 */
-	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
-		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
-		/* __free_pages_core() wants pfns to be aligned to the order */
-		if (WARN_ON_ONCE(!IS_ALIGNED(pfn, 1ul << order)))
-			order = 0;
-		(*online_page_callback)(pfn_to_page(pfn), order);
-	}
+	for (pfn = start_pfn; pfn < end_pfn; pfn += MAX_ORDER_NR_PAGES)
+		(*online_page_callback)(pfn_to_page(pfn), MAX_ORDER - 1);
 
 	/* mark all involved sections as online */
 	online_mem_sections(start_pfn, end_pfn);
-
-	*(unsigned long *)arg += nr_pages;
-	return 0;
 }
 
 /* check which state of node_states will be changed when online memory */
@@ -795,7 +786,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 		       int online_type, int nid)
 {
 	unsigned long flags;
-	unsigned long onlined_pages = 0;
 	struct zone *zone;
 	int need_zonelists_rebuild = 0;
 	int ret;
@@ -831,19 +821,11 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 		setup_zone_pageset(zone);
 	}
 
-	ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages,
-		online_pages_range);
-	if (ret) {
-		/* not a single memory resource was applicable */
-		if (need_zonelists_rebuild)
-			zone_pcp_reset(zone);
-		goto failed_addition;
-	}
-
-	zone->present_pages += onlined_pages;
+	online_pages_range(pfn, nr_pages);
+	zone->present_pages += nr_pages;
 
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages += onlined_pages;
+	zone->zone_pgdat->node_present_pages += nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
 	/*
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 08/10] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*()
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (6 preceding siblings ...)
  2020-08-19 17:59 ` [PATCH v2 07/10] mm/memory_hotplug: simplify page onlining David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 10:58   ` Oscar Salvador
  2020-08-19 17:59 ` [PATCH v2 09/10] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() David Hildenbrand
  2020-08-19 17:59 ` [PATCH v2 10/10] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory David Hildenbrand
  9 siblings, 1 reply; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador, Mel Gorman

Commit ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are
set during struct page init") moved the actual zone range check, leaving
only the alignment check for pageblocks.

Let's drop the stale comment and make the pageblock check easier to read.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/page_alloc.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 848664352dfe2..5db0b35f95e20 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6022,13 +6022,8 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		 * to reserve their blocks rather than leaking throughout
 		 * the address space during boot when many long-lived
 		 * kernel allocations are made.
-		 *
-		 * bitmap is created for zone's valid pfn range. but memmap
-		 * can be created for invalid pages (for alignment)
-		 * check here not to call set_pageblock_migratetype() against
-		 * pfn out of zone.
 		 */
-		if (!(pfn & (pageblock_nr_pages - 1))) {
+		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 			cond_resched();
 		}
@@ -6091,15 +6086,10 @@ void __ref memmap_init_zone_device(struct zone *zone,
 		 * the address space during boot when many long-lived
 		 * kernel allocations are made.
 		 *
-		 * bitmap is created for zone's valid pfn range. but memmap
-		 * can be created for invalid pages (for alignment)
-		 * check here not to call set_pageblock_migratetype() against
-		 * pfn out of zone.
-		 *
 		 * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
 		 * because this is done early in section_activate()
 		 */
-		if (!(pfn & (pageblock_nr_pages - 1))) {
+		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 			cond_resched();
 		}
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 09/10] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone()
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (7 preceding siblings ...)
  2020-08-19 17:59 ` [PATCH v2 08/10] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 11:59   ` Oscar Salvador
  2020-08-19 17:59 ` [PATCH v2 10/10] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory David Hildenbrand
  9 siblings, 1 reply; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador, Tony Luck,
	Fenghua Yu, Logan Gunthorpe, Dan Williams, Mike Rapoport,
	Matthew Wilcox (Oracle),
	Michel Lespinasse, linux-ia64

On the memory onlining path, we want to start with MIGRATE_ISOLATE, to
un-isolate the pages after memory onlining is complete. Let's allow
passing in the migratetype.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/ia64/mm/init.c            |  4 ++--
 include/linux/memory_hotplug.h |  3 ++-
 include/linux/mm.h             |  3 ++-
 mm/memory_hotplug.c            | 11 ++++++++---
 mm/memremap.c                  |  3 ++-
 mm/page_alloc.c                | 21 ++++++++++++---------
 6 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 0b3fb4c7af292..82b7a46ddd23d 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -538,7 +538,7 @@ virtual_memmap_init(u64 start, u64 end, void *arg)
 	if (map_start < map_end)
 		memmap_init_zone((unsigned long)(map_end - map_start),
 				 args->nid, args->zone, page_to_pfn(map_start),
-				 MEMMAP_EARLY, NULL);
+				 MEMMAP_EARLY, NULL, MIGRATE_MOVABLE);
 	return 0;
 }
 
@@ -548,7 +548,7 @@ memmap_init (unsigned long size, int nid, unsigned long zone,
 {
 	if (!vmem_map) {
 		memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY,
-				NULL);
+				 NULL, MIGRATE_MOVABLE);
 	} else {
 		struct page *start;
 		struct memmap_init_callback_data args;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 0b461691d1a49..cbafeda859380 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -346,7 +346,8 @@ extern int add_memory_resource(int nid, struct resource *resource);
 extern int add_memory_driver_managed(int nid, u64 start, u64 size,
 				     const char *resource_name);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
-		unsigned long nr_pages, struct vmem_altmap *altmap);
+				   unsigned long nr_pages,
+				   struct vmem_altmap *altmap, int migratetype);
 extern void remove_pfn_range_from_zone(struct zone *zone,
 				       unsigned long start_pfn,
 				       unsigned long nr_pages);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8ab941cf73f44..c842aa2a97ba2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2409,7 +2409,8 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
 
 extern void set_dma_reserve(unsigned long new_dma_reserve);
 extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long,
-		enum memmap_context, struct vmem_altmap *);
+			    enum memmap_context, struct vmem_altmap *,
+			    int migratetype);
 extern void setup_per_zone_wmarks(void);
 extern int __meminit init_per_zone_wmark_min(void);
 extern void mem_init(void);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 3aba0d956f9b1..1c16a5def781e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -693,9 +693,14 @@ static void __meminit resize_pgdat_range(struct pglist_data *pgdat, unsigned lon
  * Associate the pfn range with the given zone, initializing the memmaps
  * and resizing the pgdat/zone data to span the added pages. After this
  * call, all affected pages are PG_reserved.
+ *
+ * All aligned pageblocks are initialized to the specified migratetype
+ * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
+ * zone stats (e.g., nr_isolate_pageblock) are touched.
  */
 void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
-		unsigned long nr_pages, struct vmem_altmap *altmap)
+				  unsigned long nr_pages,
+				  struct vmem_altmap *altmap, int migratetype)
 {
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	int nid = pgdat->node_id;
@@ -720,7 +725,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 	 * are reserved so nobody should be touching them so we should be safe
 	 */
 	memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
-			MEMMAP_HOTPLUG, altmap);
+			 MEMMAP_HOTPLUG, altmap, migratetype);
 
 	set_zone_contiguous(zone);
 }
@@ -800,7 +805,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 
 	/* associate pfn range with the zone */
 	zone = zone_for_pfn_range(online_type, nid, pfn, nr_pages);
-	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL);
+	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
 
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
diff --git a/mm/memremap.c b/mm/memremap.c
index 8afcc54c89286..04dc1f4ed634e 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -342,7 +342,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
 
 		zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
 		move_pfn_range_to_zone(zone, PHYS_PFN(res->start),
-				PHYS_PFN(resource_size(res)), params.altmap);
+				       PHYS_PFN(resource_size(res)),
+				       params.altmap, MIGRATE_MOVABLE);
 	}
 
 	mem_hotplug_done();
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5db0b35f95e20..9f2dc61968689 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5970,10 +5970,15 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
  * Initially all pages are reserved - free ones are freed
  * up by memblock_free_all() once the early boot process is
  * done. Non-atomic initialization, single-pass.
+ *
+ * All aligned pageblocks are initialized to the specified migratetype
+ * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
+ * zone stats (e.g., nr_isolate_pageblock) are touched.
  */
 void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
-		unsigned long start_pfn, enum memmap_context context,
-		struct vmem_altmap *altmap)
+				unsigned long start_pfn,
+				enum memmap_context context,
+				struct vmem_altmap *altmap, int migratetype)
 {
 	unsigned long pfn, end_pfn = start_pfn + size;
 	struct page *page;
@@ -6017,14 +6022,12 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			__SetPageReserved(page);
 
 		/*
-		 * Mark the block movable so that blocks are reserved for
-		 * movable at startup. This will force kernel allocations
-		 * to reserve their blocks rather than leaking throughout
-		 * the address space during boot when many long-lived
-		 * kernel allocations are made.
+		 * Usually, we want to mark the pageblock MIGRATE_MOVABLE,
+		 * such that unmovable allocations won't be scattered all
+		 * over the place during system boot.
 		 */
 		if (IS_ALIGNED(pfn, pageblock_nr_pages)) {
-			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
+			set_pageblock_migratetype(page, migratetype);
 			cond_resched();
 		}
 		pfn++;
@@ -6124,7 +6127,7 @@ void __meminit __weak memmap_init(unsigned long size, int nid,
 		if (end_pfn > start_pfn) {
 			size = end_pfn - start_pfn;
 			memmap_init_zone(size, nid, zone, start_pfn,
-					 MEMMAP_EARLY, NULL);
+					 MEMMAP_EARLY, NULL, MIGRATE_MOVABLE);
 		}
 	}
 }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 10/10] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory
  2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
                   ` (8 preceding siblings ...)
  2020-08-19 17:59 ` [PATCH v2 09/10] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() David Hildenbrand
@ 2020-08-19 17:59 ` David Hildenbrand
  2020-08-24 12:07   ` Oscar Salvador
  9 siblings, 1 reply; 30+ messages in thread
From: David Hildenbrand @ 2020-08-19 17:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Michal Hocko, Andrew Morton,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador,
	Charan Teja Reddy

Currently, it can happen that pages are allocated (and freed) via the buddy
before we finished basic memory onlining.

For example, pages are exposed to the buddy and can be allocated before
we actually mark the sections online. Allocated pages could suddenly
fail pfn_to_online_page() checks. We had similar issues with pcp
handling, when pages are allocated+freed before we reach
zone_pcp_update() in online_pages() [1].

Instead, mark all pageblocks MIGRATE_ISOLATE, such that allocations are
impossible. Once done with the heavy lifting, use
undo_isolate_page_range() to move the pages to the MIGRATE_MOVABLE
freelist, marking them ready for allocation. Similar to offline_pages(),
we have to manually adjust zone->nr_isolate_pageblock.

[1] https://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/Kconfig          |  2 +-
 mm/memory_hotplug.c | 32 ++++++++++++++++++++++----------
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 6c974888f86f9..85a16ce1dbc49 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -152,6 +152,7 @@ config HAVE_BOOTMEM_INFO_NODE
 # eventually, we can have this option just 'select SPARSEMEM'
 config MEMORY_HOTPLUG
 	bool "Allow for memory hot-add"
+	select MEMORY_ISOLATION
 	depends on SPARSEMEM || X86_64_ACPI_NUMA
 	depends on ARCH_ENABLE_MEMORY_HOTPLUG
 	depends on 64BIT || BROKEN
@@ -178,7 +179,6 @@ config MEMORY_HOTPLUG_DEFAULT_ONLINE
 
 config MEMORY_HOTREMOVE
 	bool "Allow for memory hot remove"
-	select MEMORY_ISOLATION
 	select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
 	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
 	depends on MIGRATION
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1c16a5def781e..35d56cbd3e45b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -805,7 +805,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 
 	/* associate pfn range with the zone */
 	zone = zone_for_pfn_range(online_type, nid, pfn, nr_pages);
-	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
+	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
 
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
@@ -816,6 +816,14 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	if (ret)
 		goto failed_addition;
 
+	/*
+	 * Fixup the number of isolated pageblocks before marking the sections
+	 * onlining, such that undo_isolate_page_range() works correctly.
+	 */
+	spin_lock_irqsave(&zone->lock, flags);
+	zone->nr_isolate_pageblock += nr_pages / pageblock_nr_pages;
+	spin_unlock_irqrestore(&zone->lock, flags);
+
 	/*
 	 * If this zone is not populated, then it is not in zonelist.
 	 * This means the page allocator ignores this zone.
@@ -833,21 +841,25 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	zone->zone_pgdat->node_present_pages += nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
+	node_states_set_node(nid, &arg);
+	if (need_zonelists_rebuild)
+		build_all_zonelists(NULL);
+	zone_pcp_update(zone);
+
+	/* Basic onlining is complete, allow allocation of onlined pages. */
+	undo_isolate_page_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE);
+
 	/*
 	 * When exposing larger, physically contiguous memory areas to the
 	 * buddy, shuffling in the buddy (when freeing onlined pages, putting
 	 * them either to the head or the tail of the freelist) is only helpful
 	 * for maintaining the shuffle, but not for creating the initial
 	 * shuffle. Shuffle the whole zone to make sure the just onlined pages
-	 * are properly distributed across the whole freelist.
+	 * are properly distributed across the whole freelist. Make sure to
+	 * shuffle once pageblocks are no longer isolated.
 	 */
 	shuffle_zone(zone);
 
-	node_states_set_node(nid, &arg);
-	if (need_zonelists_rebuild)
-		build_all_zonelists(NULL);
-	zone_pcp_update(zone);
-
 	init_per_zone_wmark_min();
 
 	kswapd_run(nid);
@@ -1550,9 +1562,9 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	pr_info("Offlined Pages %ld\n", nr_pages);
 
 	/*
-	 * Onlining will reset pagetype flags and makes migrate type
-	 * MOVABLE, so just need to decrease the number of isolated
-	 * pageblocks zone counter here.
+	 * The memory sections are marked offline, and the pageblock flags
+	 * effectively stale; nobody should be touching them. Fixup the number
+	 * of isolated pageblocks, memory onlining will properly revert this.
 	 */
 	spin_lock_irqsave(&zone->lock, flags);
 	zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages()
  2020-08-19 17:59 ` [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
@ 2020-08-24 10:26   ` Oscar Salvador
  2020-08-31 10:05   ` Pankaj Gupta
  1 sibling, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 10:26 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta

On Wed, Aug 19, 2020 at 07:59:48PM +0200, David Hildenbrand wrote:
> There is only a single user, offline_pages(). Let's inline, to make
> it look more similar to online_pages().
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 02/10] mm/memory_hotplug: enforce section granularity when onlining/offlining
  2020-08-19 17:59 ` [PATCH v2 02/10] mm/memory_hotplug: enforce section granularity when onlining/offlining David Hildenbrand
@ 2020-08-24 10:39   ` Oscar Salvador
  2020-08-25  2:11     ` Wei Yang
  0 siblings, 1 reply; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 10:39 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta

On Wed, Aug 19, 2020 at 07:59:49PM +0200, David Hildenbrand wrote:
> Already two people (including me) tried to offline subsections, because
> the function looks like it can deal with it. But we really can only
> online/offline full sections that are properly aligned (e.g., we can only
> mark full sections online/offline via SECTION_IS_ONLINE).
> 
> Add a simple safety net to document the restriction now. Current users
> (core and powernv/memtrace) respect these restrictions.

It's been a while since I looked at sub-section handling stuff so sorry to ask
this, but was it true that we can hot-{remove,add} sub-section granularity, while
we can only online /offline on section granularity?

> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining
  2020-08-19 17:59 ` [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining David Hildenbrand
@ 2020-08-24 10:44   ` Oscar Salvador
  2020-09-03 21:58   ` Andrew Morton
  1 sibling, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 10:44 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta

On Wed, Aug 19, 2020 at 07:59:50PM +0200, David Hildenbrand wrote:
> We make sure that we cannot have any memory holes right at the beginning
> of offline_pages(). We no longer need walk_system_ram_range() and can
> call test_pages_isolated() and __offline_isolated_pages() directly.
> 
> offlined_pages always corresponds to nr_pages, so we can simplify that.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/10] mm/page_alloc: simplify __offline_isolated_pages()
  2020-08-19 17:59 ` [PATCH v2 04/10] mm/page_alloc: simplify __offline_isolated_pages() David Hildenbrand
@ 2020-08-24 10:48   ` Oscar Salvador
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 10:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta

On Wed, Aug 19, 2020 at 07:59:51PM +0200, David Hildenbrand wrote:
> offline_pages() is the only user. __offline_isolated_pages() never gets
> called with ranges that contain memory holes and we no longer care about
> the return value. Drop the return value handling and all pfn_valid()
> checks.
> 
> Update the documentation.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

pretty nice

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 05/10] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages()
  2020-08-19 17:59 ` [PATCH v2 05/10] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages() David Hildenbrand
@ 2020-08-24 10:49   ` Oscar Salvador
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 10:49 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta

On Wed, Aug 19, 2020 at 07:59:52PM +0200, David Hildenbrand wrote:
> We make sure that we cannot have any memory holes right at the beginning
> of offline_pages() and we only support to online/offline full sections.
> Both, sections and pageblocks are a power of two in size, and sections
> always span full pageblocks.
> 
> We can directly calculate the number of isolated pageblocks from nr_pages.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 06/10] mm/page_isolation: simplify return value of start_isolate_page_range()
  2020-08-19 17:59 ` [PATCH v2 06/10] mm/page_isolation: simplify return value of start_isolate_page_range() David Hildenbrand
@ 2020-08-24 10:51   ` Oscar Salvador
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 10:51 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta

On Wed, Aug 19, 2020 at 07:59:53PM +0200, David Hildenbrand wrote:
> Callers no longer need the number of isolated pageblocks. Let's
> simplify.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 07/10] mm/memory_hotplug: simplify page onlining
  2020-08-19 17:59 ` [PATCH v2 07/10] mm/memory_hotplug: simplify page onlining David Hildenbrand
@ 2020-08-24 10:55   ` Oscar Salvador
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 10:55 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta

On Wed, Aug 19, 2020 at 07:59:54PM +0200, David Hildenbrand wrote:
> We don't allow to offline memory with holes, all boot memory is online,
> and all hotplugged memory cannot have holes.
> 
> We can now simplify onlining of pages. As we only allow to online/offline
> full sections and sections always span full MAX_ORDER_NR_PAGES, we can just
> process MAX_ORDER - 1 pages without further special handling.
> 
> The number of onlined pages simply corresponds to the number of pages we
> were requested to online.
> 
> While at it, refine the comment regarding the callback not exposing all
> pages to the buddy.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 08/10] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*()
  2020-08-19 17:59 ` [PATCH v2 08/10] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() David Hildenbrand
@ 2020-08-24 10:58   ` Oscar Salvador
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 10:58 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta, Mel Gorman

On Wed, Aug 19, 2020 at 07:59:55PM +0200, David Hildenbrand wrote:
> Commit ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are
> set during struct page init") moved the actual zone range check, leaving
> only the alignment check for pageblocks.
> 
> Let's drop the stale comment and make the pageblock check easier to read.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Mel Gorman <mgorman@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 09/10] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone()
  2020-08-19 17:59 ` [PATCH v2 09/10] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() David Hildenbrand
@ 2020-08-24 11:59   ` Oscar Salvador
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 11:59 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta, Tony Luck, Fenghua Yu, Logan Gunthorpe,
	Dan Williams, Mike Rapoport, Matthew Wilcox (Oracle),
	Michel Lespinasse, linux-ia64

On Wed, Aug 19, 2020 at 07:59:56PM +0200, David Hildenbrand wrote:
> On the memory onlining path, we want to start with MIGRATE_ISOLATE, to
> un-isolate the pages after memory onlining is complete. Let's allow
> passing in the migratetype.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Cc: Michel Lespinasse <walken@google.com>
> Cc: linux-ia64@vger.kernel.org
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 10/10] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory
  2020-08-19 17:59 ` [PATCH v2 10/10] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory David Hildenbrand
@ 2020-08-24 12:07   ` Oscar Salvador
  0 siblings, 0 replies; 30+ messages in thread
From: Oscar Salvador @ 2020-08-24 12:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Pankaj Gupta, Charan Teja Reddy

On Wed, Aug 19, 2020 at 07:59:57PM +0200, David Hildenbrand wrote:
> Currently, it can happen that pages are allocated (and freed) via the buddy
> before we finished basic memory onlining.
> 
> For example, pages are exposed to the buddy and can be allocated before
> we actually mark the sections online. Allocated pages could suddenly
> fail pfn_to_online_page() checks. We had similar issues with pcp
> handling, when pages are allocated+freed before we reach
> zone_pcp_update() in online_pages() [1].
> 
> Instead, mark all pageblocks MIGRATE_ISOLATE, such that allocations are
> impossible. Once done with the heavy lifting, use
> undo_isolate_page_range() to move the pages to the MIGRATE_MOVABLE
> freelist, marking them ready for allocation. Similar to offline_pages(),
> we have to manually adjust zone->nr_isolate_pageblock.
> 
> [1] https://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Charan Teja Reddy <charante@codeaurora.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

-- 
Oscar Salvador
SUSE L3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 02/10] mm/memory_hotplug: enforce section granularity when onlining/offlining
  2020-08-24 10:39   ` Oscar Salvador
@ 2020-08-25  2:11     ` Wei Yang
  2020-09-08  9:14       ` David Hildenbrand
  0 siblings, 1 reply; 30+ messages in thread
From: Wei Yang @ 2020-08-25  2:11 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: David Hildenbrand, linux-kernel, linux-mm, Michal Hocko,
	Andrew Morton, Wei Yang, Baoquan He, Pankaj Gupta

On Mon, Aug 24, 2020 at 12:39:18PM +0200, Oscar Salvador wrote:
>On Wed, Aug 19, 2020 at 07:59:49PM +0200, David Hildenbrand wrote:
>> Already two people (including me) tried to offline subsections, because
>> the function looks like it can deal with it. But we really can only
>> online/offline full sections that are properly aligned (e.g., we can only
>> mark full sections online/offline via SECTION_IS_ONLINE).
>> 
>> Add a simple safety net to document the restriction now. Current users
>> (core and powernv/memtrace) respect these restrictions.
>
>It's been a while since I looked at sub-section handling stuff so sorry to ask
>this, but was it true that we can hot-{remove,add} sub-section granularity, while
>we can only online /offline on section granularity?
>

Seems you are right.

>> 
>> Acked-by: Michal Hocko <mhocko@suse.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>
>Reviewed-by: Oscar Salvador <osalvador@suse.de>
>
>
>-- 
>Oscar Salvador
>SUSE L3

-- 
Wei Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages()
  2020-08-19 17:59 ` [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
  2020-08-24 10:26   ` Oscar Salvador
@ 2020-08-31 10:05   ` Pankaj Gupta
  1 sibling, 0 replies; 30+ messages in thread
From: Pankaj Gupta @ 2020-08-31 10:05 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: LKML, Linux MM, Michal Hocko, Andrew Morton, Wei Yang,
	Baoquan He, Oscar Salvador

> There is only a single user, offline_pages(). Let's inline, to make
> it look more similar to online_pages().
>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 16 +++++-----------
>  1 file changed, 5 insertions(+), 11 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 7f62d69660e06..c781d386d87f9 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1473,11 +1473,10 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
>         return 0;
>  }
>
> -static int __ref __offline_pages(unsigned long start_pfn,
> -                 unsigned long end_pfn)
> +int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  {
> -       unsigned long pfn, nr_pages = 0;
> -       unsigned long offlined_pages = 0;
> +       const unsigned long end_pfn = start_pfn + nr_pages;
> +       unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
>         int ret, node, nr_isolate_pageblock;
>         unsigned long flags;
>         struct zone *zone;
> @@ -1494,9 +1493,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
>          * memory holes PG_reserved, don't need pfn_valid() checks, and can
>          * avoid using walk_system_ram_range() later.
>          */
> -       walk_system_ram_range(start_pfn, end_pfn - start_pfn, &nr_pages,
> +       walk_system_ram_range(start_pfn, nr_pages, &system_ram_pages,
>                               count_system_ram_pages_cb);
> -       if (nr_pages != end_pfn - start_pfn) {
> +       if (system_ram_pages != nr_pages) {
>                 ret = -EINVAL;
>                 reason = "memory holes";
>                 goto failed_removal;
> @@ -1631,11 +1630,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
>         return ret;
>  }
>
> -int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> -{
> -       return __offline_pages(start_pfn, start_pfn + nr_pages);
> -}
> -
>  static int check_memblock_offlined_cb(struct memory_block *mem, void *arg)
>  {
>         int ret = !is_memblock_offlined(mem);

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining
  2020-08-19 17:59 ` [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining David Hildenbrand
  2020-08-24 10:44   ` Oscar Salvador
@ 2020-09-03 21:58   ` Andrew Morton
  2020-09-04  5:47     ` David Hildenbrand
  1 sibling, 1 reply; 30+ messages in thread
From: Andrew Morton @ 2020-09-03 21:58 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Wed, 19 Aug 2020 19:59:50 +0200 David Hildenbrand <david@redhat.com> wrote:

> We make sure that we cannot have any memory holes right at the beginning
> of offline_pages(). We no longer need walk_system_ram_range() and can
> call test_pages_isolated() and __offline_isolated_pages() directly.
> 
> offlined_pages always corresponds to nr_pages, so we can simplify that.

This patch ran afoul of Pavel's "mm/memory_hotplug: drain per-cpu pages
again during memory offline", here:

> @@ -1481,7 +1459,7 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
>  int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  {
>  	const unsigned long end_pfn = start_pfn + nr_pages;
> -	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
> +	unsigned long pfn, system_ram_pages = 0;
>  	int ret, node, nr_isolate_pageblock;
>  	unsigned long flags;
>  	struct zone *zone;
> @@ -1579,16 +1557,12 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  			reason = "failure to dissolve huge pages";
>  			goto failed_removal_isolated;
>  		}
> -		/* check again */
> -		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> -					    NULL, check_pages_isolated_cb);
> -	} while (ret);
> -
> -	/* Ok, all of our target is isolated.
> -	   We cannot do rollback at this point. */
> -	walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> -			      &offlined_pages, offline_isolated_pages_cb);
> -	pr_info("Offlined Pages %ld\n", offlined_pages);
> +	} while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
> +
> +	/* Mark all sections offline and remove free pages from the buddy. */
> +	__offline_isolated_pages(start_pfn, end_pfn);
> +	pr_info("Offlined Pages %ld\n", nr_pages);
> +
>  	/*
>  	 * Onlining will reset pagetype flags and makes migrate type

I did this.  Looks OK?

From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: simplify page offlining

We make sure that we cannot have any memory holes right at the beginning
of offline_pages().  We no longer need walk_system_ram_range() and can
call test_pages_isolated() and __offline_isolated_pages() directly.

offlined_pages always corresponds to nr_pages, so we can simplify that.

Link: https://lkml.kernel.org/r/20200819175957.28465-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory_hotplug.c |   61 +++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 36 deletions(-)

--- a/mm/memory_hotplug.c~mm-memory_hotplug-simplify-page-offlining
+++ a/mm/memory_hotplug.c
@@ -1383,28 +1383,6 @@ do_migrate_range(unsigned long start_pfn
 	return ret;
 }
 
-/* Mark all sections offline and remove all free pages from the buddy. */
-static int
-offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
-			void *data)
-{
-	unsigned long *offlined_pages = (unsigned long *)data;
-
-	*offlined_pages += __offline_isolated_pages(start, start + nr_pages);
-	return 0;
-}
-
-/*
- * Check all pages in range, recorded as memory resource, are isolated.
- */
-static int
-check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
-			void *data)
-{
-	return test_pages_isolated(start_pfn, start_pfn + nr_pages,
-				   MEMORY_OFFLINE);
-}
-
 static int __init cmdline_parse_movable_node(char *p)
 {
 	movable_node_enabled = true;
@@ -1491,7 +1469,7 @@ static int count_system_ram_pages_cb(uns
 int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
-	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
+	unsigned long pfn, system_ram_pages = 0;
 	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
 	struct zone *zone;
@@ -1589,16 +1567,27 @@ int __ref offline_pages(unsigned long st
 			reason = "failure to dissolve huge pages";
 			goto failed_removal_isolated;
 		}
-		/* check again */
-		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
-					    NULL, check_pages_isolated_cb);
-	} while (ret);
-
-	/* Ok, all of our target is isolated.
-	   We cannot do rollback at this point. */
-	walk_system_ram_range(start_pfn, end_pfn - start_pfn,
-			      &offlined_pages, offline_isolated_pages_cb);
-	pr_info("Offlined Pages %ld\n", offlined_pages);
+
+		/*
+		 * per-cpu pages are drained in start_isolate_page_range, but if
+		 * there are still pages that are not free, make sure that we
+		 * drain again, because when we isolated range we might
+		 * have raced with another thread that was adding pages to pcp
+		 * list.
+		 *
+		 * Forward progress should be still guaranteed because
+		 * pages on the pcp list can only belong to MOVABLE_ZONE
+		 * because has_unmovable_pages explicitly checks for
+		 * PageBuddy on freed pages on other zones.
+		 */
+		if (ret)
+			drain_all_pages(zone);
+	} while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
+
+	/* Mark all sections offline and remove free pages from the buddy. */
+	__offline_isolated_pages(start_pfn, end_pfn);
+	pr_info("Offlined Pages %ld\n", nr_pages);
+
 	/*
 	 * Onlining will reset pagetype flags and makes migrate type
 	 * MOVABLE, so just need to decrease the number of isolated
@@ -1609,11 +1598,11 @@ int __ref offline_pages(unsigned long st
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	/* removal success */
-	adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
-	zone->present_pages -= offlined_pages;
+	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
+	zone->present_pages -= nr_pages;
 
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages -= offlined_pages;
+	zone->zone_pgdat->node_present_pages -= nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
 	init_per_zone_wmark_min();
_



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining
  2020-09-03 21:58   ` Andrew Morton
@ 2020-09-04  5:47     ` David Hildenbrand
  2020-09-04  7:46       ` Michal Hocko
  2020-09-04 19:21       ` Andrew Morton
  0 siblings, 2 replies; 30+ messages in thread
From: David Hildenbrand @ 2020-09-04  5:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, linux-kernel, linux-mm, Michal Hocko,
	Wei Yang, Baoquan He, Pankaj Gupta, Oscar Salvador



> Am 03.09.2020 um 23:58 schrieb Andrew Morton <akpm@linux-foundation.org>:
> 
> On Wed, 19 Aug 2020 19:59:50 +0200 David Hildenbrand <david@redhat.com> wrote:
> 
>> We make sure that we cannot have any memory holes right at the beginning
>> of offline_pages(). We no longer need walk_system_ram_range() and can
>> call test_pages_isolated() and __offline_isolated_pages() directly.
>> 
>> offlined_pages always corresponds to nr_pages, so we can simplify that.
> 
> This patch ran afoul of Pavel's "mm/memory_hotplug: drain per-cpu pages
> again during memory offline", here:
> 
>> @@ -1481,7 +1459,7 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
>> int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>> {
>>    const unsigned long end_pfn = start_pfn + nr_pages;
>> -    unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
>> +    unsigned long pfn, system_ram_pages = 0;
>>    int ret, node, nr_isolate_pageblock;
>>    unsigned long flags;
>>    struct zone *zone;
>> @@ -1579,16 +1557,12 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>>            reason = "failure to dissolve huge pages";
>>            goto failed_removal_isolated;
>>        }
>> -        /* check again */
>> -        ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
>> -                        NULL, check_pages_isolated_cb);
>> -    } while (ret);
>> -
>> -    /* Ok, all of our target is isolated.
>> -       We cannot do rollback at this point. */
>> -    walk_system_ram_range(start_pfn, end_pfn - start_pfn,
>> -                  &offlined_pages, offline_isolated_pages_cb);
>> -    pr_info("Offlined Pages %ld\n", offlined_pages);
>> +    } while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
>> +
>> +    /* Mark all sections offline and remove free pages from the buddy. */
>> +    __offline_isolated_pages(start_pfn, end_pfn);
>> +    pr_info("Offlined Pages %ld\n", nr_pages);
>> +
>>    /*
>>     * Onlining will reset pagetype flags and makes migrate type
> 
> I did this.  Looks OK?
> 

Reading on my smartphone, it looks like you squashed both patches?

> From: David Hildenbrand <david@redhat.com>
> Subject: mm/memory_hotplug: simplify page offlining
> 
> We make sure that we cannot have any memory holes right at the beginning
> of offline_pages().  We no longer need walk_system_ram_range() and can
> call test_pages_isolated() and __offline_isolated_pages() directly.
> 
> offlined_pages always corresponds to nr_pages, so we can simplify that.
> 
> Link: https://lkml.kernel.org/r/20200819175957.28465-4-david@redhat.com
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Charan Teja Reddy <charante@codeaurora.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Michel Lespinasse <walken@google.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
> mm/memory_hotplug.c |   61 +++++++++++++++++-------------------------
> 1 file changed, 25 insertions(+), 36 deletions(-)
> 
> --- a/mm/memory_hotplug.c~mm-memory_hotplug-simplify-page-offlining
> +++ a/mm/memory_hotplug.c
> @@ -1383,28 +1383,6 @@ do_migrate_range(unsigned long start_pfn
>    return ret;
> }
> 
> -/* Mark all sections offline and remove all free pages from the buddy. */
> -static int
> -offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
> -            void *data)
> -{
> -    unsigned long *offlined_pages = (unsigned long *)data;
> -
> -    *offlined_pages += __offline_isolated_pages(start, start + nr_pages);
> -    return 0;
> -}
> -
> -/*
> - * Check all pages in range, recorded as memory resource, are isolated.
> - */
> -static int
> -check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
> -            void *data)
> -{
> -    return test_pages_isolated(start_pfn, start_pfn + nr_pages,
> -                   MEMORY_OFFLINE);
> -}
> -
> static int __init cmdline_parse_movable_node(char *p)
> {
>    movable_node_enabled = true;
> @@ -1491,7 +1469,7 @@ static int count_system_ram_pages_cb(uns
> int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> {
>    const unsigned long end_pfn = start_pfn + nr_pages;
> -    unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
> +    unsigned long pfn, system_ram_pages = 0;
>    int ret, node, nr_isolate_pageblock;
>    unsigned long flags;
>    struct zone *zone;
> @@ -1589,16 +1567,27 @@ int __ref offline_pages(unsigned long st
>            reason = "failure to dissolve huge pages";
>            goto failed_removal_isolated;
>        }
> -        /* check again */
> -        ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> -                        NULL, check_pages_isolated_cb);
> -    } while (ret);
> -
> -    /* Ok, all of our target is isolated.
> -       We cannot do rollback at this point. */
> -    walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> -                  &offlined_pages, offline_isolated_pages_cb);
> -    pr_info("Offlined Pages %ld\n", offlined_pages);
> +
> +        /*
> +         * per-cpu pages are drained in start_isolate_page_range, but if
> +         * there are still pages that are not free, make sure that we
> +         * drain again, because when we isolated range we might
> +         * have raced with another thread that was adding pages to pcp
> +         * list.
> +         *
> +         * Forward progress should be still guaranteed because
> +         * pages on the pcp list can only belong to MOVABLE_ZONE
> +         * because has_unmovable_pages explicitly checks for
> +         * PageBuddy on freed pages on other zones.
> +         */
> +        if (ret)
> +            drain_all_pages(zone);
> +    } while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));

I think we have to do

ret = test_pages_isolated()
if (ret)
...
} while (ret);

So keeping the old code flow. I cannot resend before next Tuesday.

> +
> +    /* Mark all sections offline and remove free pages from the buddy. */
> +    __offline_isolated_pages(start_pfn, end_pfn);
> +    pr_info("Offlined Pages %ld\n", nr_pages);
> +
>    /*
>     * Onlining will reset pagetype flags and makes migrate type
>     * MOVABLE, so just need to decrease the number of isolated
> @@ -1609,11 +1598,11 @@ int __ref offline_pages(unsigned long st
>    spin_unlock_irqrestore(&zone->lock, flags);
> 
>    /* removal success */
> -    adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
> -    zone->present_pages -= offlined_pages;
> +    adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
> +    zone->present_pages -= nr_pages;
> 
>    pgdat_resize_lock(zone->zone_pgdat, &flags);
> -    zone->zone_pgdat->node_present_pages -= offlined_pages;
> +    zone->zone_pgdat->node_present_pages -= nr_pages;
>    pgdat_resize_unlock(zone->zone_pgdat, &flags);
> 
>    init_per_zone_wmark_min();
> _
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining
  2020-09-04  5:47     ` David Hildenbrand
@ 2020-09-04  7:46       ` Michal Hocko
  2020-09-04 19:21       ` Andrew Morton
  1 sibling, 0 replies; 30+ messages in thread
From: Michal Hocko @ 2020-09-04  7:46 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, linux-kernel, linux-mm, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Fri 04-09-20 07:47:45, David Hildenbrand wrote:
> 
> 
> > Am 03.09.2020 um 23:58 schrieb Andrew Morton <akpm@linux-foundation.org>:
[...]
> > @@ -1589,16 +1567,27 @@ int __ref offline_pages(unsigned long st
> >            reason = "failure to dissolve huge pages";
> >            goto failed_removal_isolated;
> >        }
> > -        /* check again */
> > -        ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> > -                        NULL, check_pages_isolated_cb);
> > -    } while (ret);
> > -
> > -    /* Ok, all of our target is isolated.
> > -       We cannot do rollback at this point. */
> > -    walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> > -                  &offlined_pages, offline_isolated_pages_cb);
> > -    pr_info("Offlined Pages %ld\n", offlined_pages);
> > +
> > +        /*
> > +         * per-cpu pages are drained in start_isolate_page_range, but if
> > +         * there are still pages that are not free, make sure that we
> > +         * drain again, because when we isolated range we might
> > +         * have raced with another thread that was adding pages to pcp
> > +         * list.
> > +         *
> > +         * Forward progress should be still guaranteed because
> > +         * pages on the pcp list can only belong to MOVABLE_ZONE
> > +         * because has_unmovable_pages explicitly checks for
> > +         * PageBuddy on freed pages on other zones.
> > +         */
> > +        if (ret)
> > +            drain_all_pages(zone);
> > +    } while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
> 
> I think we have to do
> 
> ret = test_pages_isolated()
> if (ret)

Yes.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining
  2020-09-04  5:47     ` David Hildenbrand
  2020-09-04  7:46       ` Michal Hocko
@ 2020-09-04 19:21       ` Andrew Morton
  2020-09-07  6:45         ` Michal Hocko
  1 sibling, 1 reply; 30+ messages in thread
From: Andrew Morton @ 2020-09-04 19:21 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Michal Hocko, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Fri, 4 Sep 2020 07:47:45 +0200 David Hildenbrand <david@redhat.com> wrote:

> 
> 
> > +         * PageBuddy on freed pages on other zones.
> > +         */
> > +        if (ret)
> > +            drain_all_pages(zone);
> > +    } while (test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE));
> 
> I think we have to do
> 
> ret = test_pages_isolated()
> if (ret)
> ...
> } while (ret);
> 
> So keeping the old code flow. I cannot resend before next Tuesday.

Here's what I now (effectively) have:


From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: simplify page offlining

We make sure that we cannot have any memory holes right at the beginning
of offline_pages().  We no longer need walk_system_ram_range() and can
call test_pages_isolated() and __offline_isolated_pages() directly.

offlined_pages always corresponds to nr_pages, so we can simplify that.

Link: https://lkml.kernel.org/r/20200819175957.28465-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>

--- a/mm/memory_hotplug.c~mm-memory_hotplug-simplify-page-offlining
+++ a/mm/memory_hotplug.c
@@ -1383,28 +1383,6 @@ do_migrate_range(unsigned long start_pfn
 	return ret;
 }
 
-/* Mark all sections offline and remove all free pages from the buddy. */
-static int
-offline_isolated_pages_cb(unsigned long start, unsigned long nr_pages,
-			void *data)
-{
-	unsigned long *offlined_pages = (unsigned long *)data;
-
-	*offlined_pages += __offline_isolated_pages(start, start + nr_pages);
-	return 0;
-}
-
-/*
- * Check all pages in range, recorded as memory resource, are isolated.
- */
-static int
-check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
-			void *data)
-{
-	return test_pages_isolated(start_pfn, start_pfn + nr_pages,
-				   MEMORY_OFFLINE);
-}
-
 static int __init cmdline_parse_movable_node(char *p)
 {
 	movable_node_enabled = true;
@@ -1491,7 +1469,7 @@ static int count_system_ram_pages_cb(uns
 int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
-	unsigned long pfn, system_ram_pages = 0, offlined_pages = 0;
+	unsigned long pfn, system_ram_pages = 0;
 	int ret, node, nr_isolate_pageblock;
 	unsigned long flags;
 	struct zone *zone;
@@ -1589,9 +1567,7 @@ int __ref offline_pages(unsigned long st
 			reason = "failure to dissolve huge pages";
 			goto failed_removal_isolated;
 		}
-		/* check again */
-		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
-					    NULL, check_pages_isolated_cb);
+
 		/*
 		 * per-cpu pages are drained in start_isolate_page_range, but if
 		 * there are still pages that are not free, make sure that we
@@ -1604,15 +1580,15 @@ int __ref offline_pages(unsigned long st
 		 * because has_unmovable_pages explicitly checks for
 		 * PageBuddy on freed pages on other zones.
 		 */
+		ret = test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE);
 		if (ret)
 			drain_all_pages(zone);
 	} while (ret);
 
-	/* Ok, all of our target is isolated.
-	   We cannot do rollback at this point. */
-	walk_system_ram_range(start_pfn, end_pfn - start_pfn,
-			      &offlined_pages, offline_isolated_pages_cb);
-	pr_info("Offlined Pages %ld\n", offlined_pages);
+	/* Mark all sections offline and remove free pages from the buddy. */
+	__offline_isolated_pages(start_pfn, end_pfn);
+	pr_info("Offlined Pages %ld\n", nr_pages);
+
 	/*
 	 * Onlining will reset pagetype flags and makes migrate type
 	 * MOVABLE, so just need to decrease the number of isolated
@@ -1623,11 +1599,11 @@ int __ref offline_pages(unsigned long st
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	/* removal success */
-	adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
-	zone->present_pages -= offlined_pages;
+	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
+	zone->present_pages -= nr_pages;
 
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages -= offlined_pages;
+	zone->zone_pgdat->node_present_pages -= nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
 	init_per_zone_wmark_min();
_



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining
  2020-09-04 19:21       ` Andrew Morton
@ 2020-09-07  6:45         ` Michal Hocko
  2020-09-08  9:10           ` David Hildenbrand
  0 siblings, 1 reply; 30+ messages in thread
From: Michal Hocko @ 2020-09-07  6:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, linux-kernel, linux-mm, Wei Yang, Baoquan He,
	Pankaj Gupta, Oscar Salvador

On Fri 04-09-20 12:21:34, Andrew Morton wrote:
> On Fri, 4 Sep 2020 07:47:45 +0200 David Hildenbrand <david@redhat.com> wrote:
[...]
> @@ -1589,9 +1567,7 @@ int __ref offline_pages(unsigned long st
>  			reason = "failure to dissolve huge pages";
>  			goto failed_removal_isolated;
>  		}
> -		/* check again */
> -		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
> -					    NULL, check_pages_isolated_cb);
> +
>  		/*
>  		 * per-cpu pages are drained in start_isolate_page_range, but if
>  		 * there are still pages that are not free, make sure that we
> @@ -1604,15 +1580,15 @@ int __ref offline_pages(unsigned long st
>  		 * because has_unmovable_pages explicitly checks for
>  		 * PageBuddy on freed pages on other zones.
>  		 */
> +		ret = test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE);
>  		if (ret)
>  			drain_all_pages(zone);
>  	} while (ret);

Looks ok
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining
  2020-09-07  6:45         ` Michal Hocko
@ 2020-09-08  9:10           ` David Hildenbrand
  0 siblings, 0 replies; 30+ messages in thread
From: David Hildenbrand @ 2020-09-08  9:10 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: linux-kernel, linux-mm, Wei Yang, Baoquan He, Pankaj Gupta,
	Oscar Salvador

On 07.09.20 08:45, Michal Hocko wrote:
> On Fri 04-09-20 12:21:34, Andrew Morton wrote:
>> On Fri, 4 Sep 2020 07:47:45 +0200 David Hildenbrand <david@redhat.com> wrote:
> [...]
>> @@ -1589,9 +1567,7 @@ int __ref offline_pages(unsigned long st
>>  			reason = "failure to dissolve huge pages";
>>  			goto failed_removal_isolated;
>>  		}
>> -		/* check again */
>> -		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
>> -					    NULL, check_pages_isolated_cb);
>> +
>>  		/*
>>  		 * per-cpu pages are drained in start_isolate_page_range, but if
>>  		 * there are still pages that are not free, make sure that we
>> @@ -1604,15 +1580,15 @@ int __ref offline_pages(unsigned long st
>>  		 * because has_unmovable_pages explicitly checks for
>>  		 * PageBuddy on freed pages on other zones.
>>  		 */
>> +		ret = test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE);
>>  		if (ret)
>>  			drain_all_pages(zone);
>>  	} while (ret);
> 
> Looks ok
> 

Agreed, thanks Michal and Andrew.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 02/10] mm/memory_hotplug: enforce section granularity when onlining/offlining
  2020-08-25  2:11     ` Wei Yang
@ 2020-09-08  9:14       ` David Hildenbrand
  0 siblings, 0 replies; 30+ messages in thread
From: David Hildenbrand @ 2020-09-08  9:14 UTC (permalink / raw)
  To: Wei Yang, Oscar Salvador
  Cc: linux-kernel, linux-mm, Michal Hocko, Andrew Morton, Baoquan He,
	Pankaj Gupta

On 25.08.20 04:11, Wei Yang wrote:
> On Mon, Aug 24, 2020 at 12:39:18PM +0200, Oscar Salvador wrote:
>> On Wed, Aug 19, 2020 at 07:59:49PM +0200, David Hildenbrand wrote:
>>> Already two people (including me) tried to offline subsections, because
>>> the function looks like it can deal with it. But we really can only
>>> online/offline full sections that are properly aligned (e.g., we can only
>>> mark full sections online/offline via SECTION_IS_ONLINE).
>>>
>>> Add a simple safety net to document the restriction now. Current users
>>> (core and powernv/memtrace) respect these restrictions.
>>
>> It's been a while since I looked at sub-section handling stuff so sorry to ask
>> this, but was it true that we can hot-{remove,add} sub-section granularity, while
>> we can only online /offline on section granularity?
>>

Yes, we can hot-{remove,add} sub-section granularity ZONE_DEVICE memory,
but not memory to be managed by the buddy.

Examples are
- Memory block devices span 1..X sections and can either be
  online/offline
- We can only mark full sections to be online/offline in sparsemem
- Besides section handling, current onlining/offlining code could only
  work in MAX_ORDER - 1 granularity, not necessarily sub-section
  granularity.

Thanks for having a look.

> 
> Seems you are right.


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2020-09-08  9:15 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-19 17:59 [PATCH v2 00/10] mm/memory_hotplug: online_pages()/offline_pages() cleanups David Hildenbrand
2020-08-19 17:59 ` [PATCH v2 01/10] mm/memory_hotplug: inline __offline_pages() into offline_pages() David Hildenbrand
2020-08-24 10:26   ` Oscar Salvador
2020-08-31 10:05   ` Pankaj Gupta
2020-08-19 17:59 ` [PATCH v2 02/10] mm/memory_hotplug: enforce section granularity when onlining/offlining David Hildenbrand
2020-08-24 10:39   ` Oscar Salvador
2020-08-25  2:11     ` Wei Yang
2020-09-08  9:14       ` David Hildenbrand
2020-08-19 17:59 ` [PATCH v2 03/10] mm/memory_hotplug: simplify page offlining David Hildenbrand
2020-08-24 10:44   ` Oscar Salvador
2020-09-03 21:58   ` Andrew Morton
2020-09-04  5:47     ` David Hildenbrand
2020-09-04  7:46       ` Michal Hocko
2020-09-04 19:21       ` Andrew Morton
2020-09-07  6:45         ` Michal Hocko
2020-09-08  9:10           ` David Hildenbrand
2020-08-19 17:59 ` [PATCH v2 04/10] mm/page_alloc: simplify __offline_isolated_pages() David Hildenbrand
2020-08-24 10:48   ` Oscar Salvador
2020-08-19 17:59 ` [PATCH v2 05/10] mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages() David Hildenbrand
2020-08-24 10:49   ` Oscar Salvador
2020-08-19 17:59 ` [PATCH v2 06/10] mm/page_isolation: simplify return value of start_isolate_page_range() David Hildenbrand
2020-08-24 10:51   ` Oscar Salvador
2020-08-19 17:59 ` [PATCH v2 07/10] mm/memory_hotplug: simplify page onlining David Hildenbrand
2020-08-24 10:55   ` Oscar Salvador
2020-08-19 17:59 ` [PATCH v2 08/10] mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() David Hildenbrand
2020-08-24 10:58   ` Oscar Salvador
2020-08-19 17:59 ` [PATCH v2 09/10] mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() David Hildenbrand
2020-08-24 11:59   ` Oscar Salvador
2020-08-19 17:59 ` [PATCH v2 10/10] mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory David Hildenbrand
2020-08-24 12:07   ` Oscar Salvador

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).