linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/6] mm/memory_hotplug: Exit early in __remove_pages() on BUGs
       [not found] <20190826101012.10575-1-david@redhat.com>
@ 2019-08-26 10:10 ` David Hildenbrand
  2019-08-26 10:10 ` [PATCH v2 2/6] mm: Exit early in set_zone_contiguous() if already contiguous David Hildenbrand
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-08-26 10:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Dan Williams, Wei Yang

The error path should never happen in practice (unless bringing up a new
device driver, or on BUGs). However, it's clearer to not touch anything
in case we are going to return right away. Move the check/return.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 32a5386758ce..71779b7b14df 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -542,13 +542,13 @@ void __remove_pages(struct zone *zone, unsigned long pfn,
 	unsigned long map_offset = 0;
 	unsigned long nr, start_sec, end_sec;
 
+	if (check_pfn_span(pfn, nr_pages, "remove"))
+		return;
+
 	map_offset = vmem_altmap_offset(altmap);
 
 	clear_zone_contiguous(zone);
 
-	if (check_pfn_span(pfn, nr_pages, "remove"))
-		return;
-
 	start_sec = pfn_to_section_nr(pfn);
 	end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
 	for (nr = start_sec; nr <= end_sec; nr++) {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/6] mm: Exit early in set_zone_contiguous() if already contiguous
       [not found] <20190826101012.10575-1-david@redhat.com>
  2019-08-26 10:10 ` [PATCH v2 1/6] mm/memory_hotplug: Exit early in __remove_pages() on BUGs David Hildenbrand
@ 2019-08-26 10:10 ` David Hildenbrand
  2019-08-26 10:10 ` [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory David Hildenbrand
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-08-26 10:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Michal Hocko,
	Vlastimil Babka, Oscar Salvador, Pavel Tatashin, Mel Gorman,
	Mike Rapoport, Dan Williams, Alexander Duyck

No need to recompute in case the zone is already marked contiguous.
We will soon exploit this on the memory removal path, where we will only
clear zone->contiguous on zones that intersect with the memory to be
removed.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/page_alloc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5b799e11fba3..995708e05cde 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1546,6 +1546,9 @@ void set_zone_contiguous(struct zone *zone)
 	unsigned long block_start_pfn = zone->zone_start_pfn;
 	unsigned long block_end_pfn;
 
+	if (zone->contiguous)
+		return;
+
 	block_end_pfn = ALIGN(block_start_pfn + 1, pageblock_nr_pages);
 	for (; block_start_pfn < zone_end_pfn(zone);
 			block_start_pfn = block_end_pfn,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
       [not found] <20190826101012.10575-1-david@redhat.com>
  2019-08-26 10:10 ` [PATCH v2 1/6] mm/memory_hotplug: Exit early in __remove_pages() on BUGs David Hildenbrand
  2019-08-26 10:10 ` [PATCH v2 2/6] mm: Exit early in set_zone_contiguous() if already contiguous David Hildenbrand
@ 2019-08-26 10:10 ` David Hildenbrand
  2019-08-29 15:39   ` Michal Hocko
  2019-08-26 10:10 ` [PATCH v2 4/6] mm/memory_hotplug: Cleanup __remove_pages() David Hildenbrand
  2019-08-26 10:10 ` [PATCH v2 5/6] mm: Introduce for_each_zone_nid() David Hildenbrand
  4 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2019-08-26 10:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Dan Williams, Wei Yang

It is easier than I though to trigger a kernel bug by removing memory that
was never onlined. With CONFIG_DEBUG_VM the memmap is initialized with
garbage, resulting in the detection of a broken zone when removing memory.
Without CONFIG_DEBUG_VM it is less likely - but we could still have
garbage in the memmap.

:/# [   23.912993] BUG: unable to handle page fault for address: 000000000000353d
[   23.914219] #PF: supervisor write access in kernel mode
[   23.915199] #PF: error_code(0x0002) - not-present page
[   23.916160] PGD 0 P4D 0
[   23.916627] Oops: 0002 [#1] SMP PTI
[   23.917256] CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
[   23.918900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
[   23.921194] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[   23.922249] RIP: 0010:clear_zone_contiguous+0x5/0x10
[   23.923173] Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
[   23.926876] RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
[   23.927928] RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
[   23.929458] RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
[   23.930899] RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
[   23.932362] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
[   23.933603] R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
[   23.934913] FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
[   23.936294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   23.937481] CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
[   23.938687] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   23.939889] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   23.941168] Call Trace:
[   23.941580]  __remove_pages+0x4b/0x640
[   23.942303]  ? mark_held_locks+0x49/0x70
[   23.943149]  arch_remove_memory+0x63/0x8d
[   23.943921]  try_remove_memory+0xdb/0x130
[   23.944766]  ? walk_memory_blocks+0x7f/0x9e
[   23.945616]  __remove_memory+0xa/0x11
[   23.946274]  acpi_memory_device_remove+0x70/0x100
[   23.947308]  acpi_bus_trim+0x55/0x90
[   23.947914]  acpi_device_hotplug+0x227/0x3a0
[   23.948714]  acpi_hotplug_work_fn+0x1a/0x30
[   23.949433]  process_one_work+0x221/0x550
[   23.950190]  worker_thread+0x50/0x3b0
[   23.950993]  kthread+0x105/0x140
[   23.951644]  ? process_one_work+0x550/0x550
[   23.952508]  ? kthread_park+0x80/0x80
[   23.953367]  ret_from_fork+0x3a/0x50
[   23.954025] Modules linked in:
[   23.954613] CR2: 000000000000353d
[   23.955248] ---[ end trace 93d982b1fb3e1a69 ]---

But the problem is more extreme: When removing memory we could have
- Single memory blocks that fall into no zone (never onlined)
- Single memory blocks that fall into multiple zones (offlined+re-onlined)
- Multiple memory blocks that fall into different zones
Right now, the zones don't get updated properly in these cases.

So let's simply process all zones for now until we can properly handle
this via the reverse of move_pfn_range_to_zone() (which would then be
called something like remove_pfn_range_from_zone()), for example, when
offlining memory or before removing ZONE_DEVICE memory.

To speed things up, only mark applicable zones non-contiguous (and
therefore reduce the zones to recompute) and skip non-intersecting zones
when trying to resize. shrink_zone_span() and shrink_pgdat_span() seem
to be able to cope just fine with pfn ranges they don't actually
contain (but still intersect with).

Don't check for zone_intersects() when triggering set_zone_contiguous()
- we might have resized the zone and the check might no longer hold. For
now, we have to try to recompute any zone (which will be skipped in case
the zone is already contiguous).

Note1: Detecting which memory is still part of a zone is not easy before
removing memory as the detection relies almost completely on pfn_valid()
right now. pfn_online() cannot be used as ZONE_DEVICE memory is never
online. pfn_present() cannot be used as all memory is present once it was
added (but not onlined). We need to rethink/refactor this properly.

Note2: We are safe to call zone_intersects() without locking (as already
done by onlining code in default_zone_for_pfn()), as we are protected by
the memory hotplug lock - just like zone->contiguous.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 71779b7b14df..27f0457b7512 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -505,22 +505,28 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn,
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	unsigned long flags;
 
+	if (!zone_intersects(zone, start_pfn, nr_pages))
+		return;
+
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
 	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
 	shrink_pgdat_span(pgdat, start_pfn, start_pfn + nr_pages);
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 }
 
-static void __remove_section(struct zone *zone, unsigned long pfn,
-		unsigned long nr_pages, unsigned long map_offset,
-		struct vmem_altmap *altmap)
+static void __remove_section(unsigned long pfn, unsigned long nr_pages,
+			     unsigned long map_offset,
+			     struct vmem_altmap *altmap)
 {
 	struct mem_section *ms = __nr_to_section(pfn_to_section_nr(pfn));
+	struct zone *zone;
 
 	if (WARN_ON_ONCE(!valid_section(ms)))
 		return;
 
-	__remove_zone(zone, pfn, nr_pages);
+	/* TODO: move zone handling out of memory removal path */
+	for_each_zone(zone)
+		__remove_zone(zone, pfn, nr_pages);
 	sparse_remove_section(ms, pfn, nr_pages, map_offset, altmap);
 }
 
@@ -547,7 +553,10 @@ void __remove_pages(struct zone *zone, unsigned long pfn,
 
 	map_offset = vmem_altmap_offset(altmap);
 
-	clear_zone_contiguous(zone);
+	/* TODO: move zone handling out of memory removal path */
+	for_each_zone(zone)
+		if (zone_intersects(zone, pfn, nr_pages))
+			clear_zone_contiguous(zone);
 
 	start_sec = pfn_to_section_nr(pfn);
 	end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
@@ -557,13 +566,15 @@ void __remove_pages(struct zone *zone, unsigned long pfn,
 		cond_resched();
 		pfns = min(nr_pages, PAGES_PER_SECTION
 				- (pfn & ~PAGE_SECTION_MASK));
-		__remove_section(zone, pfn, pfns, map_offset, altmap);
+		__remove_section(pfn, pfns, map_offset, altmap);
 		pfn += pfns;
 		nr_pages -= pfns;
 		map_offset = 0;
 	}
 
-	set_zone_contiguous(zone);
+	/* TODO: move zone handling out of memory removal path */
+	for_each_zone(zone)
+		set_zone_contiguous(zone);
 }
 
 int set_online_page_callback(online_page_callback_t callback)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 4/6] mm/memory_hotplug: Cleanup __remove_pages()
       [not found] <20190826101012.10575-1-david@redhat.com>
                   ` (2 preceding siblings ...)
  2019-08-26 10:10 ` [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory David Hildenbrand
@ 2019-08-26 10:10 ` David Hildenbrand
  2019-08-26 10:10 ` [PATCH v2 5/6] mm: Introduce for_each_zone_nid() David Hildenbrand
  4 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-08-26 10:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Dan Williams, Wei Yang

Let's drop the basically unused section stuff and simplify.

Also, let's use a shorter variant to calculate the number of pages to
the next section boundary.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 27f0457b7512..e88c96cf9d77 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -545,8 +545,9 @@ static void __remove_section(unsigned long pfn, unsigned long nr_pages,
 void __remove_pages(struct zone *zone, unsigned long pfn,
 		    unsigned long nr_pages, struct vmem_altmap *altmap)
 {
+	const unsigned long end_pfn = pfn + nr_pages;
+	unsigned long cur_nr_pages;
 	unsigned long map_offset = 0;
-	unsigned long nr, start_sec, end_sec;
 
 	if (check_pfn_span(pfn, nr_pages, "remove"))
 		return;
@@ -558,17 +559,11 @@ void __remove_pages(struct zone *zone, unsigned long pfn,
 		if (zone_intersects(zone, pfn, nr_pages))
 			clear_zone_contiguous(zone);
 
-	start_sec = pfn_to_section_nr(pfn);
-	end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
-	for (nr = start_sec; nr <= end_sec; nr++) {
-		unsigned long pfns;
-
+	for (; pfn < end_pfn; pfn += cur_nr_pages) {
 		cond_resched();
-		pfns = min(nr_pages, PAGES_PER_SECTION
-				- (pfn & ~PAGE_SECTION_MASK));
-		__remove_section(pfn, pfns, map_offset, altmap);
-		pfn += pfns;
-		nr_pages -= pfns;
+		/* Select all remaining pages up to the next section boundary */
+		cur_nr_pages = min(end_pfn - pfn, -(pfn | PAGE_SECTION_MASK));
+		__remove_section(pfn, cur_nr_pages, map_offset, altmap);
 		map_offset = 0;
 	}
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 5/6] mm: Introduce for_each_zone_nid()
       [not found] <20190826101012.10575-1-david@redhat.com>
                   ` (3 preceding siblings ...)
  2019-08-26 10:10 ` [PATCH v2 4/6] mm/memory_hotplug: Cleanup __remove_pages() David Hildenbrand
@ 2019-08-26 10:10 ` David Hildenbrand
  4 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-08-26 10:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Vlastimil Babka,
	Dan Williams, Mel Gorman, Michal Hocko, Wei Yang,
	Johannes Weiner, Arun KS

Allow to iterate all zones belonging to a nid.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Arun KS <arunks@codeaurora.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mmzone.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 8b5f758942a2..71f2b9b55069 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1004,6 +1004,11 @@ extern struct zone *next_zone(struct zone *zone);
 			; /* do nothing */		\
 		else
 
+#define for_each_zone_nid(zone, nid)			\
+	for (zone = (NODE_DATA(nid))->node_zones;	\
+	     zone && zone_to_nid(zone) == nid;		\
+	     zone = next_zone(zone))
+
 static inline struct zone *zonelist_zone(struct zoneref *zoneref)
 {
 	return zoneref->zone;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-26 10:10 ` [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory David Hildenbrand
@ 2019-08-29 15:39   ` Michal Hocko
  2019-08-29 15:54     ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2019-08-29 15:39 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On Mon 26-08-19 12:10:09, David Hildenbrand wrote:
> It is easier than I though to trigger a kernel bug by removing memory that
> was never onlined. With CONFIG_DEBUG_VM the memmap is initialized with
> garbage, resulting in the detection of a broken zone when removing memory.
> Without CONFIG_DEBUG_VM it is less likely - but we could still have
> garbage in the memmap.
> 
> :/# [   23.912993] BUG: unable to handle page fault for address: 000000000000353d
> [   23.914219] #PF: supervisor write access in kernel mode
> [   23.915199] #PF: error_code(0x0002) - not-present page
> [   23.916160] PGD 0 P4D 0
> [   23.916627] Oops: 0002 [#1] SMP PTI
> [   23.917256] CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
> [   23.918900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
> [   23.921194] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> [   23.922249] RIP: 0010:clear_zone_contiguous+0x5/0x10
> [   23.923173] Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
> [   23.926876] RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
> [   23.927928] RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
> [   23.929458] RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
> [   23.930899] RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
> [   23.932362] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
> [   23.933603] R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
> [   23.934913] FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
> [   23.936294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   23.937481] CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
> [   23.938687] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   23.939889] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   23.941168] Call Trace:
> [   23.941580]  __remove_pages+0x4b/0x640
> [   23.942303]  ? mark_held_locks+0x49/0x70
> [   23.943149]  arch_remove_memory+0x63/0x8d
> [   23.943921]  try_remove_memory+0xdb/0x130
> [   23.944766]  ? walk_memory_blocks+0x7f/0x9e
> [   23.945616]  __remove_memory+0xa/0x11
> [   23.946274]  acpi_memory_device_remove+0x70/0x100
> [   23.947308]  acpi_bus_trim+0x55/0x90
> [   23.947914]  acpi_device_hotplug+0x227/0x3a0
> [   23.948714]  acpi_hotplug_work_fn+0x1a/0x30
> [   23.949433]  process_one_work+0x221/0x550
> [   23.950190]  worker_thread+0x50/0x3b0
> [   23.950993]  kthread+0x105/0x140
> [   23.951644]  ? process_one_work+0x550/0x550
> [   23.952508]  ? kthread_park+0x80/0x80
> [   23.953367]  ret_from_fork+0x3a/0x50
> [   23.954025] Modules linked in:
> [   23.954613] CR2: 000000000000353d
> [   23.955248] ---[ end trace 93d982b1fb3e1a69 ]---

Yes, this is indeed nasty. I didin't think of this when separating
memmap initialization from the hotremove. This means that the zone
pointer is a garbage in arch_remove_memory already. The proper fix is to
remove it from that level down. Moreover the zone is only needed for the
shrinking code and zone continuous thingy. The later belongs to offlining
code unless I am missing something. I can see that you are removing zone
parameter in a later patch but wouldn't it be just better to remove the
whole zone thing in a single patch and have this as a bug fix for a rare
bug with a fixes tag?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-29 15:39   ` Michal Hocko
@ 2019-08-29 15:54     ` David Hildenbrand
  2019-08-29 16:27       ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2019-08-29 15:54 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On 29.08.19 17:39, Michal Hocko wrote:
> On Mon 26-08-19 12:10:09, David Hildenbrand wrote:
>> It is easier than I though to trigger a kernel bug by removing memory that
>> was never onlined. With CONFIG_DEBUG_VM the memmap is initialized with
>> garbage, resulting in the detection of a broken zone when removing memory.
>> Without CONFIG_DEBUG_VM it is less likely - but we could still have
>> garbage in the memmap.
>>
>> :/# [   23.912993] BUG: unable to handle page fault for address: 000000000000353d
>> [   23.914219] #PF: supervisor write access in kernel mode
>> [   23.915199] #PF: error_code(0x0002) - not-present page
>> [   23.916160] PGD 0 P4D 0
>> [   23.916627] Oops: 0002 [#1] SMP PTI
>> [   23.917256] CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
>> [   23.918900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
>> [   23.921194] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>> [   23.922249] RIP: 0010:clear_zone_contiguous+0x5/0x10
>> [   23.923173] Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
>> [   23.926876] RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
>> [   23.927928] RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
>> [   23.929458] RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
>> [   23.930899] RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
>> [   23.932362] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
>> [   23.933603] R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
>> [   23.934913] FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
>> [   23.936294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   23.937481] CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
>> [   23.938687] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [   23.939889] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [   23.941168] Call Trace:
>> [   23.941580]  __remove_pages+0x4b/0x640
>> [   23.942303]  ? mark_held_locks+0x49/0x70
>> [   23.943149]  arch_remove_memory+0x63/0x8d
>> [   23.943921]  try_remove_memory+0xdb/0x130
>> [   23.944766]  ? walk_memory_blocks+0x7f/0x9e
>> [   23.945616]  __remove_memory+0xa/0x11
>> [   23.946274]  acpi_memory_device_remove+0x70/0x100
>> [   23.947308]  acpi_bus_trim+0x55/0x90
>> [   23.947914]  acpi_device_hotplug+0x227/0x3a0
>> [   23.948714]  acpi_hotplug_work_fn+0x1a/0x30
>> [   23.949433]  process_one_work+0x221/0x550
>> [   23.950190]  worker_thread+0x50/0x3b0
>> [   23.950993]  kthread+0x105/0x140
>> [   23.951644]  ? process_one_work+0x550/0x550
>> [   23.952508]  ? kthread_park+0x80/0x80
>> [   23.953367]  ret_from_fork+0x3a/0x50
>> [   23.954025] Modules linked in:
>> [   23.954613] CR2: 000000000000353d
>> [   23.955248] ---[ end trace 93d982b1fb3e1a69 ]---
> 
> Yes, this is indeed nasty. I didin't think of this when separating
> memmap initialization from the hotremove. This means that the zone
> pointer is a garbage in arch_remove_memory already. The proper fix is to
> remove it from that level down. Moreover the zone is only needed for the
> shrinking code and zone continuous thingy. The later belongs to offlining
> code unless I am missing something. I can see that you are removing zone
> parameter in a later patch but wouldn't it be just better to remove the
> whole zone thing in a single patch and have this as a bug fix for a rare
> bug with a fixes tag?
> 

If I remember correctly, this patch already fixed the issue for me,
without the other cleanup (removing the zone parameter). But I might be
wrong.

Anyhow, I'll send a v4 shortly (either this evening or tomorrow), so you
can safe yourself some review time and wait for that one :)

I'll try to see if I can attach fixes tags to selected commits. But if
it makes review harder, I prefer keeping this split (this has been
broken for a long time either way).

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-29 15:54     ` David Hildenbrand
@ 2019-08-29 16:27       ` Michal Hocko
  2019-08-29 16:59         ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2019-08-29 16:27 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On Thu 29-08-19 17:54:35, David Hildenbrand wrote:
> On 29.08.19 17:39, Michal Hocko wrote:
> > On Mon 26-08-19 12:10:09, David Hildenbrand wrote:
> >> It is easier than I though to trigger a kernel bug by removing memory that
> >> was never onlined. With CONFIG_DEBUG_VM the memmap is initialized with
> >> garbage, resulting in the detection of a broken zone when removing memory.
> >> Without CONFIG_DEBUG_VM it is less likely - but we could still have
> >> garbage in the memmap.
> >>
> >> :/# [   23.912993] BUG: unable to handle page fault for address: 000000000000353d
> >> [   23.914219] #PF: supervisor write access in kernel mode
> >> [   23.915199] #PF: error_code(0x0002) - not-present page
> >> [   23.916160] PGD 0 P4D 0
> >> [   23.916627] Oops: 0002 [#1] SMP PTI
> >> [   23.917256] CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
> >> [   23.918900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
> >> [   23.921194] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> >> [   23.922249] RIP: 0010:clear_zone_contiguous+0x5/0x10
> >> [   23.923173] Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
> >> [   23.926876] RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
> >> [   23.927928] RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
> >> [   23.929458] RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
> >> [   23.930899] RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
> >> [   23.932362] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
> >> [   23.933603] R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
> >> [   23.934913] FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
> >> [   23.936294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [   23.937481] CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
> >> [   23.938687] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> [   23.939889] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> [   23.941168] Call Trace:
> >> [   23.941580]  __remove_pages+0x4b/0x640
> >> [   23.942303]  ? mark_held_locks+0x49/0x70
> >> [   23.943149]  arch_remove_memory+0x63/0x8d
> >> [   23.943921]  try_remove_memory+0xdb/0x130
> >> [   23.944766]  ? walk_memory_blocks+0x7f/0x9e
> >> [   23.945616]  __remove_memory+0xa/0x11
> >> [   23.946274]  acpi_memory_device_remove+0x70/0x100
> >> [   23.947308]  acpi_bus_trim+0x55/0x90
> >> [   23.947914]  acpi_device_hotplug+0x227/0x3a0
> >> [   23.948714]  acpi_hotplug_work_fn+0x1a/0x30
> >> [   23.949433]  process_one_work+0x221/0x550
> >> [   23.950190]  worker_thread+0x50/0x3b0
> >> [   23.950993]  kthread+0x105/0x140
> >> [   23.951644]  ? process_one_work+0x550/0x550
> >> [   23.952508]  ? kthread_park+0x80/0x80
> >> [   23.953367]  ret_from_fork+0x3a/0x50
> >> [   23.954025] Modules linked in:
> >> [   23.954613] CR2: 000000000000353d
> >> [   23.955248] ---[ end trace 93d982b1fb3e1a69 ]---
> > 
> > Yes, this is indeed nasty. I didin't think of this when separating
> > memmap initialization from the hotremove. This means that the zone
> > pointer is a garbage in arch_remove_memory already. The proper fix is to
> > remove it from that level down. Moreover the zone is only needed for the
> > shrinking code and zone continuous thingy. The later belongs to offlining
> > code unless I am missing something. I can see that you are removing zone
> > parameter in a later patch but wouldn't it be just better to remove the
> > whole zone thing in a single patch and have this as a bug fix for a rare
> > bug with a fixes tag?
> > 
> 
> If I remember correctly, this patch already fixed the issue for me,

That might be the case because nothing else does access zone on the way.
But the pointer is simply bogus. Removing it is the proper way to fix
it. And I argue that zone shouldn't even be necessary. Re-evaluating
continuous status of the zone is really something for offlining phase.
Check how we use pfn_to_online_page there.

> without the other cleanup (removing the zone parameter). But I might be
> wrong.
> 
> Anyhow, I'll send a v4 shortly (either this evening or tomorrow), so you
> can safe yourself some review time and wait for that one :)

No rush, really... It seems this is quite unlikely event as most hotplug
usecases simply online memory before removing it later on.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-29 16:27       ` Michal Hocko
@ 2019-08-29 16:59         ` David Hildenbrand
  2019-08-30  6:01           ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2019-08-29 16:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On 29.08.19 18:27, Michal Hocko wrote:
> On Thu 29-08-19 17:54:35, David Hildenbrand wrote:
>> On 29.08.19 17:39, Michal Hocko wrote:
>>> On Mon 26-08-19 12:10:09, David Hildenbrand wrote:
>>>> It is easier than I though to trigger a kernel bug by removing memory that
>>>> was never onlined. With CONFIG_DEBUG_VM the memmap is initialized with
>>>> garbage, resulting in the detection of a broken zone when removing memory.
>>>> Without CONFIG_DEBUG_VM it is less likely - but we could still have
>>>> garbage in the memmap.
>>>>
>>>> :/# [   23.912993] BUG: unable to handle page fault for address: 000000000000353d
>>>> [   23.914219] #PF: supervisor write access in kernel mode
>>>> [   23.915199] #PF: error_code(0x0002) - not-present page
>>>> [   23.916160] PGD 0 P4D 0
>>>> [   23.916627] Oops: 0002 [#1] SMP PTI
>>>> [   23.917256] CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
>>>> [   23.918900] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
>>>> [   23.921194] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>>>> [   23.922249] RIP: 0010:clear_zone_contiguous+0x5/0x10
>>>> [   23.923173] Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
>>>> [   23.926876] RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
>>>> [   23.927928] RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
>>>> [   23.929458] RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
>>>> [   23.930899] RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
>>>> [   23.932362] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
>>>> [   23.933603] R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
>>>> [   23.934913] FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
>>>> [   23.936294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [   23.937481] CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
>>>> [   23.938687] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> [   23.939889] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>> [   23.941168] Call Trace:
>>>> [   23.941580]  __remove_pages+0x4b/0x640
>>>> [   23.942303]  ? mark_held_locks+0x49/0x70
>>>> [   23.943149]  arch_remove_memory+0x63/0x8d
>>>> [   23.943921]  try_remove_memory+0xdb/0x130
>>>> [   23.944766]  ? walk_memory_blocks+0x7f/0x9e
>>>> [   23.945616]  __remove_memory+0xa/0x11
>>>> [   23.946274]  acpi_memory_device_remove+0x70/0x100
>>>> [   23.947308]  acpi_bus_trim+0x55/0x90
>>>> [   23.947914]  acpi_device_hotplug+0x227/0x3a0
>>>> [   23.948714]  acpi_hotplug_work_fn+0x1a/0x30
>>>> [   23.949433]  process_one_work+0x221/0x550
>>>> [   23.950190]  worker_thread+0x50/0x3b0
>>>> [   23.950993]  kthread+0x105/0x140
>>>> [   23.951644]  ? process_one_work+0x550/0x550
>>>> [   23.952508]  ? kthread_park+0x80/0x80
>>>> [   23.953367]  ret_from_fork+0x3a/0x50
>>>> [   23.954025] Modules linked in:
>>>> [   23.954613] CR2: 000000000000353d
>>>> [   23.955248] ---[ end trace 93d982b1fb3e1a69 ]---
>>>
>>> Yes, this is indeed nasty. I didin't think of this when separating
>>> memmap initialization from the hotremove. This means that the zone
>>> pointer is a garbage in arch_remove_memory already. The proper fix is to
>>> remove it from that level down. Moreover the zone is only needed for the
>>> shrinking code and zone continuous thingy. The later belongs to offlining
>>> code unless I am missing something. I can see that you are removing zone
>>> parameter in a later patch but wouldn't it be just better to remove the
>>> whole zone thing in a single patch and have this as a bug fix for a rare
>>> bug with a fixes tag?
>>>
>>
>> If I remember correctly, this patch already fixed the issue for me,
> 
> That might be the case because nothing else does access zone on the way.
> But the pointer is simply bogus. Removing it is the proper way to fix
> it. And I argue that zone shouldn't even be necessary. Re-evaluating
> continuous status of the zone is really something for offlining phase.
> Check how we use pfn_to_online_page there.

Yeah I'm with you, I think you spotted patch 6/6 of this series and v3
already that does exactly that. It's just a matter of rearranging things.

> 
>> without the other cleanup (removing the zone parameter). But I might be
>> wrong.
>>
>> Anyhow, I'll send a v4 shortly (either this evening or tomorrow), so you
>> can safe yourself some review time and wait for that one :)
> 
> No rush, really... It seems this is quite unlikely event as most hotplug
> usecases simply online memory before removing it later on.
> 

I can trigger it reliably right now while working/testing virtio-mem, so
I finally want to clean up this mess :) (has been on my list for a long
time). I'll try to hunt for the right commit id's that broke it.

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-29 16:59         ` David Hildenbrand
@ 2019-08-30  6:01           ` Michal Hocko
  2019-08-30  6:20             ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2019-08-30  6:01 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On Thu 29-08-19 18:59:31, David Hildenbrand wrote:
> On 29.08.19 18:27, Michal Hocko wrote:
[...]
> > No rush, really... It seems this is quite unlikely event as most hotplug
> > usecases simply online memory before removing it later on.
> > 
> 
> I can trigger it reliably right now while working/testing virtio-mem, so
> I finally want to clean up this mess :) (has been on my list for a long
> time). I'll try to hunt for the right commit id's that broke it.

f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to
zones until online")

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-30  6:01           ` Michal Hocko
@ 2019-08-30  6:20             ` David Hildenbrand
  2019-08-30  6:47               ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2019-08-30  6:20 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On 30.08.19 08:01, Michal Hocko wrote:
> On Thu 29-08-19 18:59:31, David Hildenbrand wrote:
>> On 29.08.19 18:27, Michal Hocko wrote:
> [...]
>>> No rush, really... It seems this is quite unlikely event as most hotplug
>>> usecases simply online memory before removing it later on.
>>>
>>
>> I can trigger it reliably right now while working/testing virtio-mem, so
>> I finally want to clean up this mess :) (has been on my list for a long
>> time). I'll try to hunt for the right commit id's that broke it.
> 
> f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to
> zones until online")
> 

Right, I can see that that commit still does a

set_page_node(page, nid);
SetPageReserved(page);

for every added page. But of course the zone_idx might contain garbage.
So for this garbage pointer, f1dd2cd13c4b seems to be the right one.

Regarding shrink_zone_span(), I suspect it was introduced by

d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-30  6:20             ` David Hildenbrand
@ 2019-08-30  6:47               ` Michal Hocko
  2019-08-30  7:07                 ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2019-08-30  6:47 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On Fri 30-08-19 08:20:32, David Hildenbrand wrote:
[...]
> Regarding shrink_zone_span(), I suspect it was introduced by
>
> d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")

zone shrinking code is much older - 815121d2b5cd5. But I do not think
this is really needed for Fixes tag.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-30  6:47               ` Michal Hocko
@ 2019-08-30  7:07                 ` David Hildenbrand
  2019-08-30  8:31                   ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2019-08-30  7:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On 30.08.19 08:47, Michal Hocko wrote:
> On Fri 30-08-19 08:20:32, David Hildenbrand wrote:
> [...]
>> Regarding shrink_zone_span(), I suspect it was introduced by
>>
>> d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
> 
> zone shrinking code is much older - 815121d2b5cd5. But I do not think
> this is really needed for Fixes tag.
> 

Yes it's older, but since d0dc12e86b31 we could run into uninitialized
nids for added memory when trying to shrink the zone.

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory
  2019-08-30  7:07                 ` David Hildenbrand
@ 2019-08-30  8:31                   ` Michal Hocko
  0 siblings, 0 replies; 14+ messages in thread
From: Michal Hocko @ 2019-08-30  8:31 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Wei Yang

On Fri 30-08-19 09:07:59, David Hildenbrand wrote:
> On 30.08.19 08:47, Michal Hocko wrote:
> > On Fri 30-08-19 08:20:32, David Hildenbrand wrote:
> > [...]
> >> Regarding shrink_zone_span(), I suspect it was introduced by
> >>
> >> d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
> > 
> > zone shrinking code is much older - 815121d2b5cd5. But I do not think
> > this is really needed for Fixes tag.
> > 
> 
> Yes it's older, but since d0dc12e86b31 we could run into uninitialized
> nids for added memory when trying to shrink the zone.

Well, not really. Strictly speaking pre d0dc12e86b31 would initialize
struct page to zero so it is initialized but that doesn't mean that the
struct page has a meaningfull and usable content. Zone index would be 0
so the lowest zone which is not something you want to use on the
hotremove path.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-08-30  8:31 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190826101012.10575-1-david@redhat.com>
2019-08-26 10:10 ` [PATCH v2 1/6] mm/memory_hotplug: Exit early in __remove_pages() on BUGs David Hildenbrand
2019-08-26 10:10 ` [PATCH v2 2/6] mm: Exit early in set_zone_contiguous() if already contiguous David Hildenbrand
2019-08-26 10:10 ` [PATCH v2 3/6] mm/memory_hotplug: Process all zones when removing memory David Hildenbrand
2019-08-29 15:39   ` Michal Hocko
2019-08-29 15:54     ` David Hildenbrand
2019-08-29 16:27       ` Michal Hocko
2019-08-29 16:59         ` David Hildenbrand
2019-08-30  6:01           ` Michal Hocko
2019-08-30  6:20             ` David Hildenbrand
2019-08-30  6:47               ` Michal Hocko
2019-08-30  7:07                 ` David Hildenbrand
2019-08-30  8:31                   ` Michal Hocko
2019-08-26 10:10 ` [PATCH v2 4/6] mm/memory_hotplug: Cleanup __remove_pages() David Hildenbrand
2019-08-26 10:10 ` [PATCH v2 5/6] mm: Introduce for_each_zone_nid() David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).