Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages()
       [not found] <20191006085646.5768-1-david@redhat.com>
@ 2019-10-06  8:56 ` David Hildenbrand
  2019-10-06 19:58   ` Damian Tometzki
  2019-10-14  9:05   ` David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 02/10] mm/memmap_init: Update variable name in memmap_init_zone David Hildenbrand
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, linux-ia64, Ira Weiny, linux-sh, Jason Gunthorpe,
	Aneesh Kumar K.V, x86, David Hildenbrand, linux-mm,
	Logan Gunthorpe, Dan Williams, linuxppc-dev, Andrew Morton,
	linux-arm-kernel

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>

With an altmap, the part of the memmap that falls into the reserved
altmap space is not initialized and, therefore, contains a garbage NID
and a garbage zone. Make sure to read the NID/zone from a memmap that
was initialized.

This fixes a kernel crash that is observed when destroying a namespace:

[   81.356173] kernel BUG at include/linux/mm.h:1107!
cpu 0x1: Vector: 700 (Program Check) at [c000000274087890]
    pc: c0000000004b9728: memunmap_pages+0x238/0x340
    lr: c0000000004b9724: memunmap_pages+0x234/0x340
...
    pid   = 3669, comm = ndctl
kernel BUG at include/linux/mm.h:1107!
[c000000274087ba0] c0000000009e3500 devm_action_release+0x30/0x50
[c000000274087bc0] c0000000009e4758 release_nodes+0x268/0x2d0
[c000000274087c30] c0000000009dd144 device_release_driver_internal+0x174/0x240
[c000000274087c70] c0000000009d9dfc unbind_store+0x13c/0x190
[c000000274087cb0] c0000000009d8a24 drv_attr_store+0x44/0x60
[c000000274087cd0] c0000000005a7470 sysfs_kf_write+0x70/0xa0
[c000000274087d10] c0000000005a5cac kernfs_fop_write+0x1ac/0x290
[c000000274087d60] c0000000004be45c __vfs_write+0x3c/0x70
[c000000274087d80] c0000000004c26e4 vfs_write+0xe4/0x200
[c000000274087dd0] c0000000004c2a6c ksys_write+0x7c/0x140
[c000000274087e20] c00000000000bbd0 system_call+0x5c/0x68

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[ minimize code changes, rephrase description ]
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memremap.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/memremap.c b/mm/memremap.c
index 557e53c6fb46..8c2fb44c3b4d 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -123,6 +123,7 @@ static void dev_pagemap_cleanup(struct dev_pagemap *pgmap)
 void memunmap_pages(struct dev_pagemap *pgmap)
 {
 	struct resource *res = &pgmap->res;
+	struct page *first_page;
 	unsigned long pfn;
 	int nid;
 
@@ -131,14 +132,16 @@ void memunmap_pages(struct dev_pagemap *pgmap)
 		put_page(pfn_to_page(pfn));
 	dev_pagemap_cleanup(pgmap);
 
+	/* make sure to access a memmap that was actually initialized */
+	first_page = pfn_to_page(pfn_first(pgmap));
+
 	/* pages are dead and unused, undo the arch mapping */
-	nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
+	nid = page_to_nid(first_page);
 
 	mem_hotplug_begin();
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
-		pfn = PHYS_PFN(res->start);
-		__remove_pages(page_zone(pfn_to_page(pfn)), pfn,
-				 PHYS_PFN(resource_size(res)), NULL);
+		__remove_pages(page_zone(first_page), PHYS_PFN(res->start),
+			       PHYS_PFN(resource_size(res)), NULL);
 	} else {
 		arch_remove_memory(nid, res->start, resource_size(res),
 				pgmap_altmap(pgmap));
-- 
2.21.0
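
To illustrate the fix above: with an altmap, the struct pages at the start
of the device range are never initialized, so the NID and zone have to be
read from the first pfn past that region. The user-space sketch below models
this. The names altmap_model, altmap_offset_model() and
first_initialized_pfn() are hypothetical stand-ins, and treating the
uninitialized region as "reserve + free" pfns is a simplifying assumption
about what pfn_first() relies on, not a copy of the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct vmem_altmap. */
struct altmap_model {
	unsigned long base_pfn;	/* first pfn of the device range */
	unsigned long reserve;	/* pfns reserved for driver use */
	unsigned long free;	/* pfns set aside to store the memmap */
};

/* Number of leading pfns whose struct pages are assumed uninitialized. */
static unsigned long altmap_offset_model(const struct altmap_model *altmap)
{
	return altmap ? altmap->reserve + altmap->free : 0;
}

/* First pfn whose struct page is safe to read the NID/zone from. */
static unsigned long first_initialized_pfn(unsigned long res_start_pfn,
					   const struct altmap_model *altmap)
{
	return res_start_pfn + altmap_offset_model(altmap);
}
```

Without an altmap the offset is 0, so the first pfn of the resource is
used unchanged.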


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH v6 02/10] mm/memmap_init: Update variable name in memmap_init_zone
       [not found] <20191006085646.5768-1-david@redhat.com>
  2019-10-06  8:56 ` [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages() David Hildenbrand
@ 2019-10-06  8:56 ` David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 03/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_pgdat_span() David Hildenbrand
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, linux-sh, Pankaj Gupta,
	Aneesh Kumar K.V, Alexander Duyck, x86, David Hildenbrand,
	Mel Gorman, Mike Rapoport, linux-mm, Pavel Tatashin,
	Alexander Potapenko, Vlastimil Babka, Andrew Morton,
	linuxppc-dev, Dan Williams, linux-arm-kernel, Oscar Salvador

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>

The third argument is actually the number of pages. Change the variable
name from size to nr_pages to make this clearer.

No functional change in this patch.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Alexander Potapenko <glider@google.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/page_alloc.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 15c2050c629b..b0b2d5464000 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5936,10 +5936,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 #ifdef CONFIG_ZONE_DEVICE
 void __ref memmap_init_zone_device(struct zone *zone,
 				   unsigned long start_pfn,
-				   unsigned long size,
+				   unsigned long nr_pages,
 				   struct dev_pagemap *pgmap)
 {
-	unsigned long pfn, end_pfn = start_pfn + size;
+	unsigned long pfn, end_pfn = start_pfn + nr_pages;
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	struct vmem_altmap *altmap = pgmap_altmap(pgmap);
 	unsigned long zone_idx = zone_idx(zone);
@@ -5956,7 +5956,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
 	 */
 	if (altmap) {
 		start_pfn = altmap->base_pfn + vmem_altmap_offset(altmap);
-		size = end_pfn - start_pfn;
+		nr_pages = end_pfn - start_pfn;
 	}
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
@@ -6003,7 +6003,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
 	}
 
 	pr_info("%s initialised %lu pages in %ums\n", __func__,
-		size, jiffies_to_msecs(jiffies - start));
+		nr_pages, jiffies_to_msecs(jiffies - start));
 }
 
 #endif
-- 
2.21.0



* [PATCH v6 03/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_pgdat_span()
       [not found] <20191006085646.5768-1-david@redhat.com>
  2019-10-06  8:56 ` [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages() David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 02/10] mm/memmap_init: Update variable name in memmap_init_zone David Hildenbrand
@ 2019-10-06  8:56 ` David Hildenbrand
  2019-10-14  9:31   ` David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span() David Hildenbrand
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	Aneesh Kumar K . V, x86, David Hildenbrand, linux-mm, Wei Yang,
	Andrew Morton, linuxppc-dev, Dan Williams, linux-arm-kernel,
	Oscar Salvador

We might use the nid of memmaps that were never initialized. For
example, if the memmap was poisoned, we will crash the kernel in
pfn_to_nid() right now. Let's use the calculated boundaries of the separate
zones instead. This now also avoids having to iterate over a whole bunch of
subsections again, after shrinking one zone.

Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory
hotplug"), the memmap was initialized to 0 and the node was set to the
right value. After that commit, the node might be garbage.

We'll have to fix shrink_zone_span() next.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 72 ++++++++++-----------------------------------
 1 file changed, 15 insertions(+), 57 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 680b4b3e57d9..86b4dc18e831 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -436,67 +436,25 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	zone_span_writeunlock(zone);
 }
 
-static void shrink_pgdat_span(struct pglist_data *pgdat,
-			      unsigned long start_pfn, unsigned long end_pfn)
+static void update_pgdat_span(struct pglist_data *pgdat)
 {
-	unsigned long pgdat_start_pfn = pgdat->node_start_pfn;
-	unsigned long p = pgdat_end_pfn(pgdat); /* pgdat_end_pfn namespace clash */
-	unsigned long pgdat_end_pfn = p;
-	unsigned long pfn;
-	int nid = pgdat->node_id;
-
-	if (pgdat_start_pfn == start_pfn) {
-		/*
-		 * If the section is smallest section in the pgdat, it need
-		 * shrink pgdat->node_start_pfn and pgdat->node_spanned_pages.
-		 * In this case, we find second smallest valid mem_section
-		 * for shrinking zone.
-		 */
-		pfn = find_smallest_section_pfn(nid, NULL, end_pfn,
-						pgdat_end_pfn);
-		if (pfn) {
-			pgdat->node_start_pfn = pfn;
-			pgdat->node_spanned_pages = pgdat_end_pfn - pfn;
-		}
-	} else if (pgdat_end_pfn == end_pfn) {
-		/*
-		 * If the section is biggest section in the pgdat, it need
-		 * shrink pgdat->node_spanned_pages.
-		 * In this case, we find second biggest valid mem_section for
-		 * shrinking zone.
-		 */
-		pfn = find_biggest_section_pfn(nid, NULL, pgdat_start_pfn,
-					       start_pfn);
-		if (pfn)
-			pgdat->node_spanned_pages = pfn - pgdat_start_pfn + 1;
-	}
-
-	/*
-	 * If the section is not biggest or smallest mem_section in the pgdat,
-	 * it only creates a hole in the pgdat. So in this case, we need not
-	 * change the pgdat.
-	 * But perhaps, the pgdat has only hole data. Thus it check the pgdat
-	 * has only hole or not.
-	 */
-	pfn = pgdat_start_pfn;
-	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SUBSECTION) {
-		if (unlikely(!pfn_valid(pfn)))
-			continue;
-
-		if (pfn_to_nid(pfn) != nid)
-			continue;
+	unsigned long node_start_pfn = 0, node_end_pfn = 0;
+	struct zone *zone;
 
-		/* Skip range to be removed */
-		if (pfn >= start_pfn && pfn < end_pfn)
-			continue;
+	for (zone = pgdat->node_zones;
+	     zone < pgdat->node_zones + MAX_NR_ZONES; zone++) {
+		unsigned long zone_end_pfn = zone->zone_start_pfn +
+					     zone->spanned_pages;
 
-		/* If we find valid section, we have nothing to do */
-		return;
+		/* No need to lock the zones, they can't change. */
+		if (zone_end_pfn > node_end_pfn)
+			node_end_pfn = zone_end_pfn;
+		if (zone->zone_start_pfn < node_start_pfn)
+			node_start_pfn = zone->zone_start_pfn;
 	}
 
-	/* The pgdat has no valid section */
-	pgdat->node_start_pfn = 0;
-	pgdat->node_spanned_pages = 0;
+	pgdat->node_start_pfn = node_start_pfn;
+	pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
 }
 
 static void __remove_zone(struct zone *zone, unsigned long start_pfn,
@@ -507,7 +465,7 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn,
 
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
 	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
-	shrink_pgdat_span(pgdat, start_pfn, start_pfn + nr_pages);
+	update_pgdat_span(pgdat);
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 }
 
-- 
2.21.0
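
The idea of deriving the node span from the spans of its zones can be
sketched in user space as follows. This is a minimal model, not the code
from the diff: zone_model, span_model and node_span_from_zones() are
hypothetical names. Skipping empty zones and seeding the span from the
first populated zone is a choice made in this sketch, because a start
value of 0 can never be lowered by a plain "<" comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct zone. */
struct zone_model {
	unsigned long start_pfn;
	unsigned long spanned_pages;
};

/* Resulting node span: start pfn plus number of spanned pages. */
struct span_model {
	unsigned long start_pfn;
	unsigned long spanned_pages;
};

/* Compute the smallest pfn range that covers all populated zones. */
static struct span_model node_span_from_zones(const struct zone_model *zones,
					      size_t nr_zones)
{
	unsigned long start = 0, end = 0;
	size_t i;

	for (i = 0; i < nr_zones; i++) {
		unsigned long zone_end = zones[i].start_pfn +
					 zones[i].spanned_pages;

		/* Empty zones must not influence the node span. */
		if (!zones[i].spanned_pages)
			continue;
		/* Seed the span from the first populated zone. */
		if (!end) {
			start = zones[i].start_pfn;
			end = zone_end;
			continue;
		}
		if (zone_end > end)
			end = zone_end;
		if (zones[i].start_pfn < start)
			start = zones[i].start_pfn;
	}

	return (struct span_model){ .start_pfn = start,
				    .spanned_pages = end - start };
}
```

A node whose zones are all empty ends up with a start pfn of 0 and 0
spanned pages in this model.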



* [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span()
       [not found] <20191006085646.5768-1-david@redhat.com>
                   ` (2 preceding siblings ...)
  2019-10-06  8:56 ` [PATCH v6 03/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_pgdat_span() David Hildenbrand
@ 2019-10-06  8:56 ` David Hildenbrand
  2019-10-14  9:32   ` David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 06/10] mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone() David Hildenbrand
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	Aneesh Kumar K . V, x86, David Hildenbrand, linux-mm,
	Andrew Morton, linuxppc-dev, Dan Williams, linux-arm-kernel,
	Oscar Salvador

Let's limit shrinking to !ZONE_DEVICE so we can fix the current code. We
should never try to touch the memmap of offline sections where we could
have uninitialized memmaps and could trigger BUGs when calling
page_to_nid() on poisoned pages.

There is no reliable way to distinguish an uninitialized memmap from an
initialized memmap that belongs to ZONE_DEVICE, as we don't have
anything like SECTION_IS_ONLINE we can use similar to
pfn_to_online_section() for !ZONE_DEVICE memory. E.g.,
set_zone_contiguous() similarly relies on pfn_to_online_section() and
will therefore never mark a ZONE_DEVICE zone as contiguous. No longer
shrinking ZONE_DEVICE zones therefore results in no observable changes,
besides /proc/zoneinfo indicating different boundaries - something we
can totally live with.

Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory
hotplug"), the memmap was initialized with 0 and the node with the
right value. So the zone might be wrong but not garbage. After that
commit, both the zone and the node will be garbage when touching
uninitialized memmaps.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 86b4dc18e831..f96608d24f6a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -331,7 +331,7 @@ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone,
 				     unsigned long end_pfn)
 {
 	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SUBSECTION) {
-		if (unlikely(!pfn_valid(start_pfn)))
+		if (unlikely(!pfn_to_online_page(start_pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(start_pfn) != nid))
@@ -356,7 +356,7 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone,
 	/* pfn is the end pfn of a memory section. */
 	pfn = end_pfn - 1;
 	for (; pfn >= start_pfn; pfn -= PAGES_PER_SUBSECTION) {
-		if (unlikely(!pfn_valid(pfn)))
+		if (unlikely(!pfn_to_online_page(pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(pfn) != nid))
@@ -415,7 +415,7 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	 */
 	pfn = zone_start_pfn;
 	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SUBSECTION) {
-		if (unlikely(!pfn_valid(pfn)))
+		if (unlikely(!pfn_to_online_page(pfn)))
 			continue;
 
 		if (page_zone(pfn_to_page(pfn)) != zone)
@@ -463,6 +463,16 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn,
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	unsigned long flags;
 
+#ifdef CONFIG_ZONE_DEVICE
+	/*
+	 * Zone shrinking code cannot properly deal with ZONE_DEVICE. So
+	 * we will not try to shrink the zones - which is okay as
+	 * set_zone_contiguous() cannot deal with ZONE_DEVICE either way.
+	 */
+	if (zone_idx(zone) == ZONE_DEVICE)
+		return;
+#endif
+
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
 	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
 	update_pgdat_span(pgdat);
-- 
2.21.0



* [PATCH v6 06/10] mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone()
       [not found] <20191006085646.5768-1-david@redhat.com>
                   ` (3 preceding siblings ...)
  2019-10-06  8:56 ` [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span() David Hildenbrand
@ 2019-10-06  8:56 ` David Hildenbrand
  2019-10-16 14:01   ` David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 07/10] mm/memory_hotplug: We always have a zone in find_(smallest|biggest)_section_pfn David Hildenbrand
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	x86, David Hildenbrand, linux-mm, Andrew Morton, linuxppc-dev,
	Dan Williams, linux-arm-kernel, Oscar Salvador

Let's poison the pages similar to when adding new memory in
sparse_add_section(). Also call remove_pfn_range_from_zone() from
memunmap_pages(), so we can poison the memmap from there as well.

While at it, calculate the pfn in memunmap_pages() only once.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 3 +++
 mm/memremap.c       | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 5b003ffa5dc9..bf5173e7913d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -464,6 +464,9 @@ void __ref remove_pfn_range_from_zone(struct zone *zone,
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	unsigned long flags;
 
+	/* Poison struct pages because they are now uninitialized again. */
+	page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages);
+
 #ifdef CONFIG_ZONE_DEVICE
 	/*
 	 * Zone shrinking code cannot properly deal with ZONE_DEVICE. So
diff --git a/mm/memremap.c b/mm/memremap.c
index 70263e6f093e..7fed8bd32a18 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -139,6 +139,8 @@ void memunmap_pages(struct dev_pagemap *pgmap)
 	nid = page_to_nid(first_page);
 
 	mem_hotplug_begin();
+	remove_pfn_range_from_zone(page_zone(first_page), PHYS_PFN(res->start),
+				   PHYS_PFN(resource_size(res)));
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
 		__remove_pages(PHYS_PFN(res->start),
 			       PHYS_PFN(resource_size(res)), NULL);
-- 
2.21.0
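
What poisoning a memmap range buys can be shown with a small user-space
model. Everything here is an illustrative assumption: page_model is a
heavily reduced stand-in for struct page, and the all-ones byte pattern
and the flags check only mimic the general idea of page_init_poison() and
a later "is this page poisoned?" test. They are not the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct page. */
struct page_model {
	unsigned long flags;
	unsigned long private_data;
};

/* Assumed poison byte: every byte of a poisoned page is all ones. */
#define MODEL_POISON_BYTE 0xff

/* Overwrite nr_pages page structures with the poison pattern. */
static void poison_range(struct page_model *pages, size_t nr_pages)
{
	memset(pages, MODEL_POISON_BYTE, sizeof(*pages) * nr_pages);
}

/* A page counts as poisoned in this model if its flags are all ones. */
static int is_poisoned(const struct page_model *page)
{
	return page->flags == ~0UL;
}
```

Code that later reads the NID or zone from such a page can detect the
pattern instead of silently using garbage.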



* [PATCH v6 07/10] mm/memory_hotplug: We always have a zone in find_(smallest|biggest)_section_pfn
       [not found] <20191006085646.5768-1-david@redhat.com>
                   ` (4 preceding siblings ...)
  2019-10-06  8:56 ` [PATCH v6 06/10] mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone() David Hildenbrand
@ 2019-10-06  8:56 ` David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 08/10] mm/memory_hotplug: Don't check for "all holes" in shrink_zone_span() David Hildenbrand
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	x86, David Hildenbrand, linux-mm, Wei Yang, Andrew Morton,
	linuxppc-dev, Dan Williams, linux-arm-kernel, Oscar Salvador

With shrink_pgdat_span() out of the way, we now always have a valid
zone.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index bf5173e7913d..f294918f7211 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -337,7 +337,7 @@ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone,
 		if (unlikely(pfn_to_nid(start_pfn) != nid))
 			continue;
 
-		if (zone && zone != page_zone(pfn_to_page(start_pfn)))
+		if (zone != page_zone(pfn_to_page(start_pfn)))
 			continue;
 
 		return start_pfn;
@@ -362,7 +362,7 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone,
 		if (unlikely(pfn_to_nid(pfn) != nid))
 			continue;
 
-		if (zone && zone != page_zone(pfn_to_page(pfn)))
+		if (zone != page_zone(pfn_to_page(pfn)))
 			continue;
 
 		return pfn;
-- 
2.21.0



* [PATCH v6 08/10] mm/memory_hotplug: Don't check for "all holes" in shrink_zone_span()
       [not found] <20191006085646.5768-1-david@redhat.com>
                   ` (5 preceding siblings ...)
  2019-10-06  8:56 ` [PATCH v6 07/10] mm/memory_hotplug: We always have a zone in find_(smallest|biggest)_section_pfn David Hildenbrand
@ 2019-10-06  8:56 ` David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 09/10] mm/memory_hotplug: Drop local variables " David Hildenbrand
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	x86, David Hildenbrand, linux-mm, Wei Yang, Andrew Morton,
	linuxppc-dev, Dan Williams, linux-arm-kernel, Oscar Salvador

If we have holes, the holes will automatically get detected and removed
once we remove the next bigger/smaller section. The extra checks can
go.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 34 +++++++---------------------------
 1 file changed, 7 insertions(+), 27 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f294918f7211..8dafa1ba8d9f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -393,6 +393,9 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 		if (pfn) {
 			zone->zone_start_pfn = pfn;
 			zone->spanned_pages = zone_end_pfn - pfn;
+		} else {
+			zone->zone_start_pfn = 0;
+			zone->spanned_pages = 0;
 		}
 	} else if (zone_end_pfn == end_pfn) {
 		/*
@@ -405,34 +408,11 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 					       start_pfn);
 		if (pfn)
 			zone->spanned_pages = pfn - zone_start_pfn + 1;
+		else {
+			zone->zone_start_pfn = 0;
+			zone->spanned_pages = 0;
+		}
 	}
-
-	/*
-	 * The section is not biggest or smallest mem_section in the zone, it
-	 * only creates a hole in the zone. So in this case, we need not
-	 * change the zone. But perhaps, the zone has only hole data. Thus
-	 * it check the zone has only hole or not.
-	 */
-	pfn = zone_start_pfn;
-	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SUBSECTION) {
-		if (unlikely(!pfn_to_online_page(pfn)))
-			continue;
-
-		if (page_zone(pfn_to_page(pfn)) != zone)
-			continue;
-
-		/* Skip range to be removed */
-		if (pfn >= start_pfn && pfn < end_pfn)
-			continue;
-
-		/* If we find valid section, we have nothing to do */
-		zone_span_writeunlock(zone);
-		return;
-	}
-
-	/* The zone has no valid section */
-	zone->zone_start_pfn = 0;
-	zone->spanned_pages = 0;
 	zone_span_writeunlock(zone);
 }
 
-- 
2.21.0



* [PATCH v6 09/10] mm/memory_hotplug: Drop local variables in shrink_zone_span()
       [not found] <20191006085646.5768-1-david@redhat.com>
                   ` (6 preceding siblings ...)
  2019-10-06  8:56 ` [PATCH v6 08/10] mm/memory_hotplug: Don't check for "all holes" in shrink_zone_span() David Hildenbrand
@ 2019-10-06  8:56 ` " David Hildenbrand
  2019-10-06  8:56 ` [PATCH v6 10/10] mm/memory_hotplug: Cleanup __remove_pages() David Hildenbrand
       [not found] ` <20191006085646.5768-6-david@redhat.com>
  9 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	x86, David Hildenbrand, linux-mm, Wei Yang, Andrew Morton,
	linuxppc-dev, Dan Williams, linux-arm-kernel, Oscar Salvador

Get rid of the unnecessary local variables.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 8dafa1ba8d9f..843481bd507d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -374,14 +374,11 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone,
 static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 			     unsigned long end_pfn)
 {
-	unsigned long zone_start_pfn = zone->zone_start_pfn;
-	unsigned long z = zone_end_pfn(zone); /* zone_end_pfn namespace clash */
-	unsigned long zone_end_pfn = z;
 	unsigned long pfn;
 	int nid = zone_to_nid(zone);
 
 	zone_span_writelock(zone);
-	if (zone_start_pfn == start_pfn) {
+	if (zone->zone_start_pfn == start_pfn) {
 		/*
 		 * If the section is smallest section in the zone, it need
 		 * shrink zone->zone_start_pfn and zone->zone_spanned_pages.
@@ -389,25 +386,25 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 		 * for shrinking zone.
 		 */
 		pfn = find_smallest_section_pfn(nid, zone, end_pfn,
-						zone_end_pfn);
+						zone_end_pfn(zone));
 		if (pfn) {
+			zone->spanned_pages = zone_end_pfn(zone) - pfn;
 			zone->zone_start_pfn = pfn;
-			zone->spanned_pages = zone_end_pfn - pfn;
 		} else {
 			zone->zone_start_pfn = 0;
 			zone->spanned_pages = 0;
 		}
-	} else if (zone_end_pfn == end_pfn) {
+	} else if (zone_end_pfn(zone) == end_pfn) {
 		/*
 		 * If the section is biggest section in the zone, it need
 		 * shrink zone->spanned_pages.
 		 * In this case, we find second biggest valid mem_section for
 		 * shrinking zone.
 		 */
-		pfn = find_biggest_section_pfn(nid, zone, zone_start_pfn,
+		pfn = find_biggest_section_pfn(nid, zone, zone->zone_start_pfn,
 					       start_pfn);
 		if (pfn)
-			zone->spanned_pages = pfn - zone_start_pfn + 1;
+			zone->spanned_pages = pfn - zone->zone_start_pfn + 1;
 		else {
 			zone->zone_start_pfn = 0;
 			zone->spanned_pages = 0;
-- 
2.21.0



* [PATCH v6 10/10] mm/memory_hotplug: Cleanup __remove_pages()
       [not found] <20191006085646.5768-1-david@redhat.com>
                   ` (7 preceding siblings ...)
  2019-10-06  8:56 ` [PATCH v6 09/10] mm/memory_hotplug: Drop local variables " David Hildenbrand
@ 2019-10-06  8:56 ` David Hildenbrand
       [not found] ` <20191006085646.5768-6-david@redhat.com>
  9 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06  8:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	x86, David Hildenbrand, linux-mm, Wei Yang, Andrew Morton,
	linuxppc-dev, Dan Williams, linux-arm-kernel, Oscar Salvador

Let's drop the basically unused section stuff and simplify.

Also, let's use a shorter variant to calculate the number of pages to
the next section boundary.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 843481bd507d..2275240cfa10 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -490,25 +490,20 @@ static void __remove_section(unsigned long pfn, unsigned long nr_pages,
 void __remove_pages(unsigned long pfn, unsigned long nr_pages,
 		    struct vmem_altmap *altmap)
 {
+	const unsigned long end_pfn = pfn + nr_pages;
+	unsigned long cur_nr_pages;
 	unsigned long map_offset = 0;
-	unsigned long nr, start_sec, end_sec;
 
 	map_offset = vmem_altmap_offset(altmap);
 
 	if (check_pfn_span(pfn, nr_pages, "remove"))
 		return;
 
-	start_sec = pfn_to_section_nr(pfn);
-	end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
-	for (nr = start_sec; nr <= end_sec; nr++) {
-		unsigned long pfns;
-
+	for (; pfn < end_pfn; pfn += cur_nr_pages) {
 		cond_resched();
-		pfns = min(nr_pages, PAGES_PER_SECTION
-				- (pfn & ~PAGE_SECTION_MASK));
-		__remove_section(pfn, pfns, map_offset, altmap);
-		pfn += pfns;
-		nr_pages -= pfns;
+		/* Select all remaining pages up to the next section boundary */
+		cur_nr_pages = min(end_pfn - pfn, -(pfn | PAGE_SECTION_MASK));
+		__remove_section(pfn, cur_nr_pages, map_offset, altmap);
 		map_offset = 0;
 	}
 }
-- 
2.21.0
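
The shorter expression for "pages up to the next section boundary" may
not be obvious at first sight. The user-space sketch below compares it
with the longer form it replaces and walks a pfn range in per-section
chunks. The MODEL_* constants are assumptions for illustration (2^15
pages per section, as with 128 MiB sections and 4 KiB pages), and the
function names are hypothetical.

```c
#include <assert.h>

/* Assumed section geometry, for illustration only. */
#define MODEL_SECTION_SHIFT	15UL
#define MODEL_PAGES_PER_SECTION	(1UL << MODEL_SECTION_SHIFT)
#define MODEL_SECTION_MASK	(~(MODEL_PAGES_PER_SECTION - 1))

/* Long form: section size minus the offset of pfn within its section. */
static unsigned long pages_to_section_end_long(unsigned long pfn)
{
	return MODEL_PAGES_PER_SECTION - (pfn & ~MODEL_SECTION_MASK);
}

/*
 * Short form: OR-ing in the mask sets all section-number bits, so the
 * unsigned negation yields the distance to the next section boundary.
 */
static unsigned long pages_to_section_end_short(unsigned long pfn)
{
	return -(pfn | MODEL_SECTION_MASK);
}

/* Count how many per-section chunks a removal of nr_pages would take. */
static unsigned long count_chunks(unsigned long pfn, unsigned long nr_pages)
{
	const unsigned long end_pfn = pfn + nr_pages;
	unsigned long cur_nr_pages, chunks = 0;

	for (; pfn < end_pfn; pfn += cur_nr_pages) {
		unsigned long remaining = end_pfn - pfn;

		/* Take everything up to the next boundary, capped at the end. */
		cur_nr_pages = pages_to_section_end_short(pfn);
		if (cur_nr_pages > remaining)
			cur_nr_pages = remaining;
		chunks++;
	}
	return chunks;
}
```

Both forms agree for every pfn because the arithmetic is unsigned and
wraps modulo the word size. A pfn that sits exactly on a section boundary
yields a full section.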



* Re: [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages()
  2019-10-06  8:56 ` [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages() David Hildenbrand
@ 2019-10-06 19:58   ` Damian Tometzki
  2019-10-06 20:13     ` David Hildenbrand
  2019-10-14  9:05   ` David Hildenbrand
  1 sibling, 1 reply; 19+ messages in thread
From: Damian Tometzki @ 2019-10-06 19:58 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-s390, linux-ia64, Ira Weiny, linux-sh, Jason Gunthorpe,
	Aneesh Kumar K.V, x86, linux-kernel, linux-mm, Logan Gunthorpe,
	Dan Williams, linuxppc-dev, Andrew Morton, linux-arm-kernel

Hello David,

Patch 05/10 is missing from the patch series.


On Sun, 06. Oct 10:56, David Hildenbrand wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
> 
> With an altmap, the memmap falling into the reserved altmap space are
> not initialized and, therefore, contain a garbage NID and a garbage
> zone. Make sure to read the NID/zone from a memmap that was initialized.
> 
> This fixes a kernel crash that is observed when destroying a namespace:
> 
> [   81.356173] kernel BUG at include/linux/mm.h:1107!
> cpu 0x1: Vector: 700 (Program Check) at [c000000274087890]
>     pc: c0000000004b9728: memunmap_pages+0x238/0x340
>     lr: c0000000004b9724: memunmap_pages+0x234/0x340
> ...
>     pid   = 3669, comm = ndctl
> kernel BUG at include/linux/mm.h:1107!
> [c000000274087ba0] c0000000009e3500 devm_action_release+0x30/0x50
> [c000000274087bc0] c0000000009e4758 release_nodes+0x268/0x2d0
> [c000000274087c30] c0000000009dd144 device_release_driver_internal+0x174/0x240
> [c000000274087c70] c0000000009d9dfc unbind_store+0x13c/0x190
> [c000000274087cb0] c0000000009d8a24 drv_attr_store+0x44/0x60
> [c000000274087cd0] c0000000005a7470 sysfs_kf_write+0x70/0xa0
> [c000000274087d10] c0000000005a5cac kernfs_fop_write+0x1ac/0x290
> [c000000274087d60] c0000000004be45c __vfs_write+0x3c/0x70
> [c000000274087d80] c0000000004c26e4 vfs_write+0xe4/0x200
> [c000000274087dd0] c0000000004c2a6c ksys_write+0x7c/0x140
> [c000000274087e20] c00000000000bbd0 system_call+0x5c/0x68
> 
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> [ minimize code changes, rephrase description ]
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memremap.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memremap.c b/mm/memremap.c
> index 557e53c6fb46..8c2fb44c3b4d 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -123,6 +123,7 @@ static void dev_pagemap_cleanup(struct dev_pagemap *pgmap)
>  void memunmap_pages(struct dev_pagemap *pgmap)
>  {
>  	struct resource *res = &pgmap->res;
> +	struct page *first_page;
>  	unsigned long pfn;
>  	int nid;
>  
> @@ -131,14 +132,16 @@ void memunmap_pages(struct dev_pagemap *pgmap)
>  		put_page(pfn_to_page(pfn));
>  	dev_pagemap_cleanup(pgmap);
>  
> +	/* make sure to access a memmap that was actually initialized */
> +	first_page = pfn_to_page(pfn_first(pgmap));
> +
>  	/* pages are dead and unused, undo the arch mapping */
> -	nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
> +	nid = page_to_nid(first_page);

Why do we need 'nid = page_to_nid(first_page)'? We don't use it anymore in this function.

>  
>  	mem_hotplug_begin();
>  	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
> -		pfn = PHYS_PFN(res->start);
> -		__remove_pages(page_zone(pfn_to_page(pfn)), pfn,
> -				 PHYS_PFN(resource_size(res)), NULL);
> +		__remove_pages(page_zone(first_page), PHYS_PFN(res->start),
> +			       PHYS_PFN(resource_size(res)), NULL);
>  	} else {
>  		arch_remove_memory(nid, res->start, resource_size(res),
>  				pgmap_altmap(pgmap));
> -- 
> 2.21.0
>
Best regards
Damian
 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages()
  2019-10-06 19:58   ` Damian Tometzki
@ 2019-10-06 20:13     ` David Hildenbrand
  0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-06 20:13 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-arm-kernel, linux-ia64,
	linuxppc-dev, linux-s390, linux-sh, x86, Aneesh Kumar K.V,
	Dan Williams, Andrew Morton, Jason Gunthorpe, Logan Gunthorpe,
	Ira Weiny

On 06.10.19 21:58, Damian Tometzki wrote:
> Hello David,
> 
> patch 05/10 is missing in the patch series. 
> 

Hi Damian,

Not really. It could be that lkml is slow today. E.g., check

https://marc.info/?l=linux-mm&m=157035222620403&w=2

and especially

https://marc.info/?l=linux-mm&m=157035225120440&w=2

All mails popped up on the mm list.

> 
> On Sun, 06. Oct 10:56, David Hildenbrand wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
>>
>> With an altmap, the memmap falling into the reserved altmap space are
>> not initialized and, therefore, contain a garbage NID and a garbage
>> zone. Make sure to read the NID/zone from a memmap that was initialized.
>>
>> This fixes a kernel crash that is observed when destroying a namespace:
>>
>> [   81.356173] kernel BUG at include/linux/mm.h:1107!
>> cpu 0x1: Vector: 700 (Program Check) at [c000000274087890]
>>     pc: c0000000004b9728: memunmap_pages+0x238/0x340
>>     lr: c0000000004b9724: memunmap_pages+0x234/0x340
>> ...
>>     pid   = 3669, comm = ndctl
>> kernel BUG at include/linux/mm.h:1107!
>> [c000000274087ba0] c0000000009e3500 devm_action_release+0x30/0x50
>> [c000000274087bc0] c0000000009e4758 release_nodes+0x268/0x2d0
>> [c000000274087c30] c0000000009dd144 device_release_driver_internal+0x174/0x240
>> [c000000274087c70] c0000000009d9dfc unbind_store+0x13c/0x190
>> [c000000274087cb0] c0000000009d8a24 drv_attr_store+0x44/0x60
>> [c000000274087cd0] c0000000005a7470 sysfs_kf_write+0x70/0xa0
>> [c000000274087d10] c0000000005a5cac kernfs_fop_write+0x1ac/0x290
>> [c000000274087d60] c0000000004be45c __vfs_write+0x3c/0x70
>> [c000000274087d80] c0000000004c26e4 vfs_write+0xe4/0x200
>> [c000000274087dd0] c0000000004c2a6c ksys_write+0x7c/0x140
>> [c000000274087e20] c00000000000bbd0 system_call+0x5c/0x68
>>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Jason Gunthorpe <jgg@ziepe.ca>
>> Cc: Logan Gunthorpe <logang@deltatee.com>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> [ minimize code changes, rephrase description ]
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  mm/memremap.c | 11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/memremap.c b/mm/memremap.c
>> index 557e53c6fb46..8c2fb44c3b4d 100644
>> --- a/mm/memremap.c
>> +++ b/mm/memremap.c
>> @@ -123,6 +123,7 @@ static void dev_pagemap_cleanup(struct dev_pagemap *pgmap)
>>  void memunmap_pages(struct dev_pagemap *pgmap)
>>  {
>>  	struct resource *res = &pgmap->res;
>> +	struct page *first_page;
>>  	unsigned long pfn;
>>  	int nid;
>>  
>> @@ -131,14 +132,16 @@ void memunmap_pages(struct dev_pagemap *pgmap)
>>  		put_page(pfn_to_page(pfn));
>>  	dev_pagemap_cleanup(pgmap);
>>  
>> +	/* make sure to access a memmap that was actually initialized */
>> +	first_page = pfn_to_page(pfn_first(pgmap));
>> +
>>  	/* pages are dead and unused, undo the arch mapping */
>> -	nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
>> +	nid = page_to_nid(first_page);
> 
> Why do we need 'nid = page_to_nid(first_page)'? We don't use it anymore in this function.

Please see ...

> 
>>  
>>  	mem_hotplug_begin();
>>  	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
>> -		pfn = PHYS_PFN(res->start);
>> -		__remove_pages(page_zone(pfn_to_page(pfn)), pfn,
>> -				 PHYS_PFN(resource_size(res)), NULL);
>> +		__remove_pages(page_zone(first_page), PHYS_PFN(res->start),
>> +			       PHYS_PFN(resource_size(res)), NULL);
>>  	} else {
>>  		arch_remove_memory(nid, res->start, resource_size(res),
                                   ^ here

:)

>>  				pgmap_altmap(pgmap));



-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages()
  2019-10-06  8:56 ` [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages() David Hildenbrand
  2019-10-06 19:58   ` Damian Tometzki
@ 2019-10-14  9:05   ` David Hildenbrand
  1 sibling, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-14  9:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, linux-ia64, Ira Weiny, linux-sh, Jason Gunthorpe,
	Aneesh Kumar K.V, x86, linux-mm, Logan Gunthorpe, Dan Williams,
	linuxppc-dev, Andrew Morton, linux-arm-kernel

On 06.10.19 10:56, David Hildenbrand wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
> 
> With an altmap, the memmap falling into the reserved altmap space are
> not initialized and, therefore, contain a garbage NID and a garbage
> zone. Make sure to read the NID/zone from a memmap that was initialized.
> 
> This fixes a kernel crash that is observed when destroying a namespace:
> 
> [   81.356173] kernel BUG at include/linux/mm.h:1107!
> cpu 0x1: Vector: 700 (Program Check) at [c000000274087890]
>      pc: c0000000004b9728: memunmap_pages+0x238/0x340
>      lr: c0000000004b9724: memunmap_pages+0x234/0x340
> ...
>      pid   = 3669, comm = ndctl
> kernel BUG at include/linux/mm.h:1107!
> [c000000274087ba0] c0000000009e3500 devm_action_release+0x30/0x50
> [c000000274087bc0] c0000000009e4758 release_nodes+0x268/0x2d0
> [c000000274087c30] c0000000009dd144 device_release_driver_internal+0x174/0x240
> [c000000274087c70] c0000000009d9dfc unbind_store+0x13c/0x190
> [c000000274087cb0] c0000000009d8a24 drv_attr_store+0x44/0x60
> [c000000274087cd0] c0000000005a7470 sysfs_kf_write+0x70/0xa0
> [c000000274087d10] c0000000005a5cac kernfs_fop_write+0x1ac/0x290
> [c000000274087d60] c0000000004be45c __vfs_write+0x3c/0x70
> [c000000274087d80] c0000000004c26e4 vfs_write+0xe4/0x200
> [c000000274087dd0] c0000000004c2a6c ksys_write+0x7c/0x140
> [c000000274087e20] c00000000000bbd0 system_call+0x5c/0x68
> 
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> [ minimize code changes, rephrase description ]
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   mm/memremap.c | 11 +++++++----
>   1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memremap.c b/mm/memremap.c
> index 557e53c6fb46..8c2fb44c3b4d 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -123,6 +123,7 @@ static void dev_pagemap_cleanup(struct dev_pagemap *pgmap)
>   void memunmap_pages(struct dev_pagemap *pgmap)
>   {
>   	struct resource *res = &pgmap->res;
> +	struct page *first_page;
>   	unsigned long pfn;
>   	int nid;
>   
> @@ -131,14 +132,16 @@ void memunmap_pages(struct dev_pagemap *pgmap)
>   		put_page(pfn_to_page(pfn));
>   	dev_pagemap_cleanup(pgmap);
>   
> +	/* make sure to access a memmap that was actually initialized */
> +	first_page = pfn_to_page(pfn_first(pgmap));
> +
>   	/* pages are dead and unused, undo the arch mapping */
> -	nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
> +	nid = page_to_nid(first_page);
>   
>   	mem_hotplug_begin();
>   	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
> -		pfn = PHYS_PFN(res->start);
> -		__remove_pages(page_zone(pfn_to_page(pfn)), pfn,
> -				 PHYS_PFN(resource_size(res)), NULL);
> +		__remove_pages(page_zone(first_page), PHYS_PFN(res->start),
> +			       PHYS_PFN(resource_size(res)), NULL);
>   	} else {
>   		arch_remove_memory(nid, res->start, resource_size(res),
>   				pgmap_altmap(pgmap));
> 

@Andrew, can you add

Fixes: 2c2a5af6fed2 ("mm, memory_hotplug: add nid parameter to arch_remove_memory")

(which basically introduced the nid = page_to_nid(first_page))

The "page_zone(pfn_to_page(pfn))" was introduced by 69324b8f4833 ("mm,
devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support"). However, I
think we will never have driver-reserved memory with
MEMORY_DEVICE_PRIVATE (no altmap AFAIKS).

Also, I think

Cc: stable@vger.kernel.org # v5.0+

makes sense.

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 03/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_pgdat_span()
  2019-10-06  8:56 ` [PATCH v6 03/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_pgdat_span() David Hildenbrand
@ 2019-10-14  9:31   ` David Hildenbrand
  0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-14  9:31 UTC (permalink / raw)
  To: linux-kernel, Andrew Morton
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	Aneesh Kumar K . V, x86, linux-mm, Wei Yang, Dan Williams,
	linuxppc-dev, linux-arm-kernel, Oscar Salvador

On 06.10.19 10:56, David Hildenbrand wrote:
> We might use the nid of memmaps that were never initialized. For
> example, if the memmap was poisoned, we will crash the kernel in
> pfn_to_nid() right now. Let's use the calculated boundaries of the separate
> zones instead. This now also avoids having to iterate over a whole bunch of
> subsections again, after shrinking one zone.
> 
> Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory
> hotplug"), the memmap was initialized to 0 and the node was set to the
> right value. After that commit, the node might be garbage.
> 
> We'll have to fix shrink_zone_span() next.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Wei Yang <richardw.yang@linux.intel.com>
> Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")

@Andrew, can you convert that to

Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visible after d0dc12e86b319

and add

Cc: stable@vger.kernel.org # v4.13+

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span()
  2019-10-06  8:56 ` [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span() David Hildenbrand
@ 2019-10-14  9:32   ` David Hildenbrand
  2019-10-14 19:17     ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2019-10-14  9:32 UTC (permalink / raw)
  To: linux-kernel, Andrew Morton
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	Aneesh Kumar K . V, x86, linux-mm, Dan Williams, linuxppc-dev,
	linux-arm-kernel, Oscar Salvador

On 06.10.19 10:56, David Hildenbrand wrote:
> Let's limit shrinking to !ZONE_DEVICE so we can fix the current code. We
> should never try to touch the memmap of offline sections where we could
> have uninitialized memmaps and could trigger BUGs when calling
> page_to_nid() on poisoned pages.
> 
> There is no reliable way to distinguish an uninitialized memmap from an
> initialized memmap that belongs to ZONE_DEVICE, as we don't have
> anything like SECTION_IS_ONLINE we can use similar to
> pfn_to_online_section() for !ZONE_DEVICE memory. E.g.,
> set_zone_contiguous() similarly relies on pfn_to_online_section() and
> will therefore never mark a ZONE_DEVICE zone as contiguous. No longer
> shrinking the ZONE_DEVICE zone therefore results in no observable changes,
> besides /proc/zoneinfo indicating different boundaries - something we
> can totally live with.
> 
> Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory
> hotplug"), the memmap was initialized with 0 and the node with the
> right value. So the zone might be wrong but not garbage. After that
> commit, both the zone and the node will be garbage when touching
> uninitialized memmaps.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")

@Andrew, can you convert that to

Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visible after d0dc12e86b319

and add

Cc: stable@vger.kernel.org # v4.13+


-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span()
  2019-10-14  9:32   ` David Hildenbrand
@ 2019-10-14 19:17     ` Andrew Morton
  2019-11-19 14:16       ` David Hildenbrand
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2019-10-14 19:17 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	Aneesh Kumar K . V, x86, linux-kernel, linux-mm, Dan Williams,
	linuxppc-dev, linux-arm-kernel, Oscar Salvador

On Mon, 14 Oct 2019 11:32:13 +0200 David Hildenbrand <david@redhat.com> wrote:

> > Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
> 
> @Andrew, can you convert that to
> 
> Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded 
> memory to zones until online") # visible after d0dc12e86b319
> 
> and add
> 
> Cc: stable@vger.kernel.org # v4.13+

Done, thanks.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 06/10] mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone()
  2019-10-06  8:56 ` [PATCH v6 06/10] mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone() David Hildenbrand
@ 2019-10-16 14:01   ` David Hildenbrand
  0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-10-16 14:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	x86, linux-mm, Andrew Morton, linuxppc-dev, Dan Williams,
	linux-arm-kernel, Oscar Salvador

On 06.10.19 10:56, David Hildenbrand wrote:
> Let's poison the pages similar to when adding new memory in
> sparse_add_section(). Also call remove_pfn_range_from_zone() from
> memunmap_pages(), so we can poison the memmap from there as well.
> 
> While at it, calculate the pfn in memunmap_pages() only once.

FWIW, this comment is stale and could be dropped :)

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span()
  2019-10-14 19:17     ` Andrew Morton
@ 2019-11-19 14:16       ` David Hildenbrand
  2019-11-19 20:44         ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2019-11-19 14:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	Aneesh Kumar K . V, x86, linux-kernel, Alexander Duyck, linux-mm,
	Dan Williams, linuxppc-dev, Toshiki Fukasawa, linux-arm-kernel,
	Oscar Salvador

On 14.10.19 21:17, Andrew Morton wrote:
> On Mon, 14 Oct 2019 11:32:13 +0200 David Hildenbrand <david@redhat.com> wrote:
> 
>>> Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
>>
>> @Andrew, can you convert that to
>>
>> Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded
>> memory to zones until online") # visible after d0dc12e86b319
>>
>> and add
>>
>> Cc: stable@vger.kernel.org # v4.13+
> 
> Done, thanks.
> 

Just a note that Toshiki reported a BUG (race between delayed
initialization of ZONE_DEVICE memmaps without holding the memory
hotplug lock and concurrent zone shrinking).

https://lkml.org/lkml/2019/11/14/1040

"Iteration of create and destroy namespace causes the panic as below:

[   41.207694] kernel BUG at mm/page_alloc.c:535!
[   41.208109] invalid opcode: 0000 [#1] SMP PTI
[   41.208508] CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6
[   41.209064] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[   41.210175] RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0
[   41.210643] Code: 04 41 83 e2 3c 48 8d 04 a8 48 c1 e0 07 48 03 04 dd e0 59 55 bb 48 8b 58 68 48 39 da 73 0e 48 c7 c6 70 ac 11 bb e8 1b b2 fd ff <0f> 0b 48 03 58 78 48 39 da 73 e9 49 01 ca b9 3f 00 00 00 4f 8d 0c
[   41.212354] RSP: 0018:ffffac0d41557c80 EFLAGS: 00010246
[   41.212821] RAX: 000000000000004a RBX: 0000000000244a00 RCX: 0000000000000000
[   41.213459] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffbb1197dc
[   41.214100] RBP: 000000000000000c R08: 0000000000000439 R09: 0000000000000059
[   41.214736] R10: 0000000000000000 R11: ffffac0d41557b08 R12: ffff8be475ea72b0
[   41.215376] R13: 000000000000fa00 R14: 0000000000250000 R15: 00000000fffc0bb5
[   41.216008] FS:  00007f30862ab600(0000) GS:ffff8be57bc40000(0000) knlGS:0000000000000000
[   41.216771] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   41.217299] CR2: 000055e824d0d508 CR3: 0000000231dac000 CR4: 00000000000006e0
[   41.217934] Call Trace:
[   41.218225]  memmap_init_zone_device+0x165/0x17c
[   41.218642]  memremap_pages+0x4c1/0x540
[   41.218989]  devm_memremap_pages+0x1d/0x60
[   41.219367]  pmem_attach_disk+0x16b/0x600 [nd_pmem]
[   41.219804]  ? devm_nsio_enable+0xb8/0xe0
[   41.220172]  nvdimm_bus_probe+0x69/0x1c0
[   41.220526]  really_probe+0x1c2/0x3e0
[   41.220856]  driver_probe_device+0xb4/0x100
[   41.221238]  device_driver_attach+0x4f/0x60
[   41.221611]  bind_store+0xc9/0x110
[   41.221919]  kernfs_fop_write+0x116/0x190
[   41.222326]  vfs_write+0xa5/0x1a0
[   41.222626]  ksys_write+0x59/0xd0
[   41.222927]  do_syscall_64+0x5b/0x180
[   41.223264]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   41.223714] RIP: 0033:0x7f30865d0ed8
[   41.224037] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 45 78 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[   41.225920] RSP: 002b:00007fffe5d30a78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   41.226608] RAX: ffffffffffffffda RBX: 000055e824d07f40 RCX: 00007f30865d0ed8
[   41.227242] RDX: 0000000000000007 RSI: 000055e824d07f40 RDI: 0000000000000004
[   41.227870] RBP: 0000000000000007 R08: 0000000000000007 R09: 0000000000000006
[   41.228753] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[   41.229419] R13: 00007f30862ab528 R14: 0000000000000001 R15: 000055e824d07f40

While creating a namespace and initializing memmap, if you destroy the namespace
and shrink the zone, it will initialize the memmap outside the zone and
trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page) in
set_pfnblock_flags_mask()."


This BUG is also mitigated by this commit, where we stop shrinking the
ZONE_DEVICE zone for now, until we can do it in a safe and clean way.
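
To make the failing check concrete: zone_spans_pfn() tests whether a pfn lies within [zone_start_pfn, zone_start_pfn + spanned_pages). If the span shrinks while the memmap for a pfn in the old span is still being initialized, the check fails for that pfn. The userspace sketch below models only that condition. struct zone_model and shrink_zone_to() are hypothetical stand-ins, the pfn values are made up for illustration, and no locking or actual concurrency is modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two struct zone fields that matter here. */
struct zone_model {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

static unsigned long zone_end_pfn(const struct zone_model *zone)
{
	return zone->zone_start_pfn + zone->spanned_pages;
}

/* Same condition as the check behind the reported VM_BUG_ON_PAGE(). */
static bool zone_spans_pfn(const struct zone_model *zone, unsigned long pfn)
{
	return zone->zone_start_pfn <= pfn && pfn < zone_end_pfn(zone);
}

/* Illustrative shrink: move the end of the span down to new_end_pfn. */
static void shrink_zone_to(struct zone_model *zone, unsigned long new_end_pfn)
{
	assert(new_end_pfn >= zone->zone_start_pfn);
	zone->spanned_pages = new_end_pfn - zone->zone_start_pfn;
}
```

In this model, a pfn that passed zone_spans_pfn() before shrink_zone_to() fails it afterwards, which is the state the delayed memmap initialization can run into when the zone is shrunk underneath it.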

-- 

Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span()
  2019-11-19 14:16       ` David Hildenbrand
@ 2019-11-19 20:44         ` Andrew Morton
  0 siblings, 0 replies; 19+ messages in thread
From: Andrew Morton @ 2019-11-19 20:44 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	Aneesh Kumar K . V, x86, linux-kernel, Alexander Duyck, linux-mm,
	Dan Williams, linuxppc-dev, Toshiki Fukasawa, linux-arm-kernel,
	Oscar Salvador

On Tue, 19 Nov 2019 15:16:22 +0100 David Hildenbrand <david@redhat.com> wrote:

> On 14.10.19 21:17, Andrew Morton wrote:
> > On Mon, 14 Oct 2019 11:32:13 +0200 David Hildenbrand <david@redhat.com> wrote:
> > 
> >>> Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
> >>
> >> @Andrew, can you convert that to
> >>
> >> Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded
> >> memory to zones until online") # visible after d0dc12e86b319
> >>
> >> and add
> >>
> >> Cc: stable@vger.kernel.org # v4.13+
> > 
> > Done, thanks.
> > 
> 
> Just a note that Toshiki reported a BUG (race between delayed
> initialization of ZONE_DEVICE memmaps without holding the memory
> hotplug lock and concurrent zone shrinking).
> 
> https://lkml.org/lkml/2019/11/14/1040
> 
> "Iteration of create and destroy namespace causes the panic as below:
> 
> [   41.207694] kernel BUG at mm/page_alloc.c:535!
> [   41.208109] invalid opcode: 0000 [#1] SMP PTI
> [   41.208508] CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6
> [   41.209064] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> [   41.210175] RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0
> [   41.210643] Code: 04 41 83 e2 3c 48 8d 04 a8 48 c1 e0 07 48 03 04 dd e0 59 55 bb 48 8b 58 68 48 39 da 73 0e 48 c7 c6 70 ac 11 bb e8 1b b2 fd ff <0f> 0b 48 03 58 78 48 39 da 73 e9 49 01 ca b9 3f 00 00 00 4f 8d 0c
> [   41.212354] RSP: 0018:ffffac0d41557c80 EFLAGS: 00010246
> [   41.212821] RAX: 000000000000004a RBX: 0000000000244a00 RCX: 0000000000000000
> [   41.213459] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffbb1197dc
> [   41.214100] RBP: 000000000000000c R08: 0000000000000439 R09: 0000000000000059
> [   41.214736] R10: 0000000000000000 R11: ffffac0d41557b08 R12: ffff8be475ea72b0
> [   41.215376] R13: 000000000000fa00 R14: 0000000000250000 R15: 00000000fffc0bb5
> [   41.216008] FS:  00007f30862ab600(0000) GS:ffff8be57bc40000(0000) knlGS:0000000000000000
> [   41.216771] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   41.217299] CR2: 000055e824d0d508 CR3: 0000000231dac000 CR4: 00000000000006e0
> [   41.217934] Call Trace:
> [   41.218225]  memmap_init_zone_device+0x165/0x17c
> [   41.218642]  memremap_pages+0x4c1/0x540
> [   41.218989]  devm_memremap_pages+0x1d/0x60
> [   41.219367]  pmem_attach_disk+0x16b/0x600 [nd_pmem]
> [   41.219804]  ? devm_nsio_enable+0xb8/0xe0
> [   41.220172]  nvdimm_bus_probe+0x69/0x1c0
> [   41.220526]  really_probe+0x1c2/0x3e0
> [   41.220856]  driver_probe_device+0xb4/0x100
> [   41.221238]  device_driver_attach+0x4f/0x60
> [   41.221611]  bind_store+0xc9/0x110
> [   41.221919]  kernfs_fop_write+0x116/0x190
> [   41.222326]  vfs_write+0xa5/0x1a0
> [   41.222626]  ksys_write+0x59/0xd0
> [   41.222927]  do_syscall_64+0x5b/0x180
> [   41.223264]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   41.223714] RIP: 0033:0x7f30865d0ed8
> [   41.224037] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 45 78 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
> [   41.225920] RSP: 002b:00007fffe5d30a78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [   41.226608] RAX: ffffffffffffffda RBX: 000055e824d07f40 RCX: 00007f30865d0ed8
> [   41.227242] RDX: 0000000000000007 RSI: 000055e824d07f40 RDI: 0000000000000004
> [   41.227870] RBP: 0000000000000007 R08: 0000000000000007 R09: 0000000000000006
> [   41.228753] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> [   41.229419] R13: 00007f30862ab528 R14: 0000000000000001 R15: 000055e824d07f40
> 
> While creating a namespace and initializing memmap, if you destroy the namespace
> and shrink the zone, it will initialize the memmap outside the zone and
> trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page) in
> set_pfnblock_flags_mask()."
> 
> 
> This BUG is also mitigated by this commit, where we stop shrinking the
> ZONE_DEVICE zone for now, until we can do it in a safe and clean way.
> 

OK, thanks.  I updated the changelog, added Reported-by:Toshiki and
shall squeeze this fix into 5.4.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 05/10] mm/memory_hotplug: Shrink zones when offlining memory
       [not found]   ` <20191203151030.GB2600@linux>
@ 2019-12-03 15:27     ` David Hildenbrand
  0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2019-12-03 15:27 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-s390, Michal Hocko, linux-ia64, Pavel Tatashin, linux-sh,
	Aneesh Kumar K.V, x86, linux-kernel, Matthew Wilcox \(Oracle\),
	linux-mm, Logan Gunthorpe, Greg Kroah-Hartman, Andrew Morton,
	linuxppc-dev, Dan Williams, linux-arm-kernel

On 03.12.19 16:10, Oscar Salvador wrote:
> On Sun, Oct 06, 2019 at 10:56:41AM +0200, David Hildenbrand wrote:
>> Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> I did not see anything wrong with the approach taken, and it makes sense to me.
> The only thing that puzzles me is we seem to not balance spanned_pages
> for ZONE_DEVICE anymore.
> memremap_pages() increments them via move_pfn_range_to_zone, but we skip
> ZONE_DEVICE in remove_pfn_range_from_zone.

Yes, documented e.g., in

commit 7ce700bf11b5e2cb84e4352bbdf2123a7a239c84
Author: David Hildenbrand <david@redhat.com>
Date:   Thu Nov 21 17:53:56 2019 -0800

    mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()

Needs some more thought - but is definitely not urgent (well, now it's
at least no longer completely broken).
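
The imbalance Oscar describes can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct span_model, model_add_range() and model_remove_range() are hypothetical names, the grow step follows the usual "extend the span to cover the new range" logic, and the remove step handles only the case where the removed range covers the whole span.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a zone's span bookkeeping. */
struct span_model {
	bool is_device;			/* models zone_idx(zone) == ZONE_DEVICE */
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

static unsigned long span_end(const struct span_model *zone)
{
	return zone->zone_start_pfn + zone->spanned_pages;
}

/* Grow the span so that it covers [start_pfn, start_pfn + nr_pages). */
static void model_add_range(struct span_model *zone, unsigned long start_pfn,
			    unsigned long nr_pages)
{
	unsigned long old_end = span_end(zone);
	unsigned long new_end = start_pfn + nr_pages;

	if (!zone->spanned_pages || start_pfn < zone->zone_start_pfn)
		zone->zone_start_pfn = start_pfn;
	if (new_end < old_end)
		new_end = old_end;
	zone->spanned_pages = new_end - zone->zone_start_pfn;
}

/* Shrink the span again, except for device zones, which are skipped. */
static void model_remove_range(struct span_model *zone, unsigned long start_pfn,
			       unsigned long nr_pages)
{
	if (zone->is_device)
		return;	/* span is intentionally left untouched */

	/* Simplification: only "range covers the whole span" is modeled. */
	if (start_pfn <= zone->zone_start_pfn &&
	    start_pfn + nr_pages >= span_end(zone)) {
		zone->zone_start_pfn = 0;
		zone->spanned_pages = 0;
	}
}
```

Adding and then removing the same range returns a regular zone to an empty span, while a device zone keeps the grown span. In this model that stale span is the only visible effect.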

> 
> That is not really related to this patch, so I might be missing something,
> but it caught my eye while reviewing this.
> 
> Anyway, for this one:
> 
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> 

Thanks!

> 
> off-topic: I __think__ we really need to trim the CC list.

Yes we should :) - done.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, back to index

Thread overview: 19+ messages
     [not found] <20191006085646.5768-1-david@redhat.com>
2019-10-06  8:56 ` [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages() David Hildenbrand
2019-10-06 19:58   ` Damian Tometzki
2019-10-06 20:13     ` David Hildenbrand
2019-10-14  9:05   ` David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 02/10] mm/memmap_init: Update variable name in memmap_init_zone David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 03/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_pgdat_span() David Hildenbrand
2019-10-14  9:31   ` David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span() David Hildenbrand
2019-10-14  9:32   ` David Hildenbrand
2019-10-14 19:17     ` Andrew Morton
2019-11-19 14:16       ` David Hildenbrand
2019-11-19 20:44         ` Andrew Morton
2019-10-06  8:56 ` [PATCH v6 06/10] mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone() David Hildenbrand
2019-10-16 14:01   ` David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 07/10] mm/memory_hotplug: We always have a zone in find_(smallest|biggest)_section_pfn David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 08/10] mm/memory_hotplug: Don't check for "all holes" in shrink_zone_span() David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 09/10] mm/memory_hotplug: Drop local variables " David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 10/10] mm/memory_hotplug: Cleanup __remove_pages() David Hildenbrand
     [not found] ` <20191006085646.5768-6-david@redhat.com>
     [not found]   ` <20191203151030.GB2600@linux>
2019-12-03 15:27     ` [PATCH v6 05/10] mm/memory_hotplug: Shrink zones when offlining memory David Hildenbrand
