[PATCH v1 1/2] mm/memory_hotplug: set node_start_pfn of hotadded pgdat to 0
From: David Hildenbrand @ 2020-04-22 15:53 UTC
To: linux-kernel
Cc: linux-mm, David Hildenbrand, Pankaj Gupta, Andrew Morton,
Michal Hocko, Baoquan He, Oscar Salvador
A hotadded node/pgdat spans no pages at all until memory is moved to the
zone/node via move_pfn_range_to_zone() -> resize_pgdat_range() - e.g.,
when onlining memory blocks. We don't have to initialize node_start_pfn
to the start of the memory we are adding.
In particular, there is an inconsistency:
- Hotplugging memory to a memory-less node with cpus: node_start_pfn == 0
- Offlining and removing last memory from a node: node_start_pfn == 0
- Hotplugging memory to a memory-less node without cpus: node_start_pfn != 0
As soon as memory is onlined, node_start_pfn is overwritten with the
actual start. E.g., when adding two DIMMs but onlining only one of
them, only that DIMM (with online memory blocks) is spanned by the
node.
Currently, the validity of node_start_pfn really is linked to
node_spanned_pages != 0. With node_spanned_pages == 0 (e.g., before
onlining memory), it has no meaning.
So let's stop setting node_start_pfn, only to have it overwritten later
via move_pfn_range_to_zone(). This avoids confusion when reading the
code: no magic is applied to node_start_pfn in this function when
hotadding a pgdat.
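For illustration only - not part of this patch, and the helper name
below is made up: with this change, node_start_pfn of an empty
(hotadded, not yet onlined) node is simply 0, so a consumer of a pgdat
should only trust node_start_pfn while the node actually spans pages:

  /*
   * Illustrative sketch, not part of this patch: node_start_pfn is
   * meaningful only while the node spans pages; an empty node has
   * node_spanned_pages == 0 (and, with this patch, node_start_pfn == 0).
   */
  static bool node_start_pfn_valid(int nid)
  {
          pg_data_t *pgdat = NODE_DATA(nid);

          return pgdat && pgdat->node_spanned_pages != 0;
  }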
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/memory_hotplug.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b65e0d1c9812..4dc263d2525a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -866,10 +866,9 @@ static void reset_node_present_pages(pg_data_t *pgdat)
}
/* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
-static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
+static pg_data_t __ref *hotadd_new_pgdat(int nid)
{
struct pglist_data *pgdat;
- unsigned long start_pfn = PFN_DOWN(start);
pgdat = NODE_DATA(nid);
if (!pgdat) {
@@ -899,9 +898,8 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
}
/* we can use NODE_DATA(nid) from here */
-
pgdat->node_id = nid;
- pgdat->node_start_pfn = start_pfn;
+ pgdat->node_start_pfn = 0;
/* init node's zones as empty zones, we don't have any present pages.*/
free_area_init_core_hotplug(nid);
@@ -936,7 +934,6 @@ static void rollback_node_hotadd(int nid)
/**
* try_online_node - online a node if offlined
* @nid: the node ID
- * @start: start addr of the node
* @set_node_online: Whether we want to online the node
* called by cpu_up() to online a node without onlined memory.
*
@@ -945,7 +942,7 @@ static void rollback_node_hotadd(int nid)
* 0 -> the node is already online
* -ENOMEM -> the node could not be allocated
*/
-static int __try_online_node(int nid, u64 start, bool set_node_online)
+static int __try_online_node(int nid, bool set_node_online)
{
pg_data_t *pgdat;
int ret = 1;
@@ -953,7 +950,7 @@ static int __try_online_node(int nid, u64 start, bool set_node_online)
if (node_online(nid))
return 0;
- pgdat = hotadd_new_pgdat(nid, start);
+ pgdat = hotadd_new_pgdat(nid);
if (!pgdat) {
pr_err("Cannot online node %d due to NULL pgdat\n", nid);
ret = -ENOMEM;
@@ -977,7 +974,7 @@ int try_online_node(int nid)
int ret;
mem_hotplug_begin();
- ret = __try_online_node(nid, 0, true);
+ ret = __try_online_node(nid, true);
mem_hotplug_done();
return ret;
}
@@ -1036,7 +1033,7 @@ int __ref add_memory_resource(int nid, struct resource *res)
*/
memblock_add_node(start, size, nid);
- ret = __try_online_node(nid, start, false);
+ ret = __try_online_node(nid, false);
if (ret < 0)
goto error;
new_node = ret;
--
2.25.3
[PATCH v1 2/2] mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK
From: David Hildenbrand @ 2020-04-22 15:53 UTC
To: linux-kernel
Cc: linux-mm, David Hildenbrand, Mike Rapoport, Michal Hocko,
Andrew Morton, Michal Hocko, Baoquan He, Oscar Salvador,
Pankaj Gupta, Anshuman Khandual
The comment in add_memory_resource() is stale: hotadd_new_pgdat() will
no longer call get_pfn_range_for_nid(), as a hotadded pgdat will simply
span no pages at all, until memory is moved to the zone/node via
move_pfn_range_to_zone() - e.g., when onlining memory blocks.
The only archs that care about memblocks for hotplugged memory (either
for iterating over all system RAM or testing for memory validity) are
arm64, s390x, and powerpc - due to CONFIG_ARCH_KEEP_MEMBLOCK. Without
CONFIG_ARCH_KEEP_MEMBLOCK, we can simply stop messing with memblocks.
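To spell out the pattern applied below (illustrative only; these
wrappers are not part of the patch and their names are made up), the
memblock updates simply become conditional on the architecture keeping
memblock data alive after boot:

  /*
   * Hypothetical wrappers, not part of this patch: only architectures
   * with CONFIG_ARCH_KEEP_MEMBLOCK keep memblock data after early boot,
   * so only they need the memblock updates on memory hot(un)plug.
   */
  static inline void hotplug_memblock_add(u64 start, u64 size, int nid)
  {
          if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
                  memblock_add_node(start, size, nid);
  }

  static inline void hotplug_memblock_remove(u64 start, u64 size)
  {
          if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
                  memblock_remove(start, size);
  }

The patch below open-codes these checks at the call sites instead,
keeping the diff minimal.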
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/Kconfig | 3 +++
mm/memory_hotplug.c | 20 ++++++++++----------
2 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/mm/Kconfig b/mm/Kconfig
index 3af64646f343..db626b8d4fdb 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -133,6 +133,9 @@ config HAVE_FAST_GUP
depends on MMU
bool
+# Don't discard allocated memory used to track "memory" and "reserved" memblocks
+# after early boot, so it can still be used to test for validity of memory.
+# Also, memblocks are updated with memory hot(un)plug.
config ARCH_KEEP_MEMBLOCK
bool
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 4dc263d2525a..555137bd0882 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1025,13 +1025,8 @@ int __ref add_memory_resource(int nid, struct resource *res)
mem_hotplug_begin();
- /*
- * Add new range to memblock so that when hotadd_new_pgdat() is called
- * to allocate new pgdat, get_pfn_range_for_nid() will be able to find
- * this new range and calculate total pages correctly. The range will
- * be removed at hot-remove time.
- */
- memblock_add_node(start, size, nid);
+ if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
+ memblock_add_node(start, size, nid);
ret = __try_online_node(nid, false);
if (ret < 0)
@@ -1080,7 +1075,8 @@ int __ref add_memory_resource(int nid, struct resource *res)
/* rollback pgdat allocation and others */
if (new_node)
rollback_node_hotadd(nid);
- memblock_remove(start, size);
+ if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
+ memblock_remove(start, size);
mem_hotplug_done();
return ret;
}
@@ -1677,8 +1673,12 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
mem_hotplug_begin();
arch_remove_memory(nid, start, size, NULL);
- memblock_free(start, size);
- memblock_remove(start, size);
+
+ if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
+ memblock_free(start, size);
+ memblock_remove(start, size);
+ }
+
__release_memory_resource(start, size);
try_offline_node(nid);
--
2.25.3