* [PATCH 1/3] mem-hotplug: introduce movablenode option
@ 2016-08-04 11:23 ` Xishi Qiu
  0 siblings, 0 replies; 26+ messages in thread
From: Xishi Qiu @ 2016-08-04 11:23 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Andrew Morton, Michal Hocko, David Rientjes,
	Joonsoo Kim, Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki
  Cc: Linux MM, LKML

This patch introduces a new boot option, movablenode.

To support memory hotplug, the boot option "movable_node" is needed. To
debug memory hotplug, the boot options "movable_node" and "movablenode"
are both needed.

e.g. movable_node movablenode=1,2,4

This means nodes 1, 2 and 4 will be set as movable nodes, and the other nodes
will be unmovable. Usually movable nodes are parsed from the SRAT table
offered by the BIOS, so this boot option is intended for debugging.

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
---
 Documentation/kernel-parameters.txt |  4 ++++
 arch/x86/mm/srat.c                  | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 82b42c9..f8726f8 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2319,6 +2319,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	movable_node	[KNL,X86] Boot-time switch to enable the effects
 			of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
 
+	movablenode=	[KNL,X86] Boot-time switch to set which node is
+			movable node.
+			Format: <movable nid>,...,<movable nid>
+
 	MTD_Partition=	[MTD]
 			Format: <name>,<region-number>,<size>,<offset>
 
diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index b5f8218..c4cd81a 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -157,6 +157,38 @@ static inline int save_add_info(void) {return 1;}
 static inline int save_add_info(void) {return 0;}
 #endif
 
+static nodemask_t movablenode_mask;
+
+static void __init parse_movablenode_one(char *p)
+{
+	int node;
+
+	get_option(&p, &node);
+	node_set(node, movablenode_mask);
+}
+
+/*
+ * movablenode=<movable nid>,...,<movable nid> sets which node is movable
+ * node.
+ */
+static int __init parse_movablenode_opt(char *str)
+{
+#ifdef CONFIG_MOVABLE_NODE
+	while (str) {
+		char *k = strchr(str, ',');
+
+		if (k)
+			*k++ = 0;
+		parse_movablenode_one(str);
+		str = k;
+	}
+#else
+	pr_warn("movable_node option not supported\n");
+#endif
+	return 0;
+}
+early_param("movablenode", parse_movablenode_opt);
+
 /* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
 int __init
 acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
@@ -205,6 +237,10 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 
 	max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));
 
+	if (node_isset(node, movablenode_mask) && memblock_mark_hotplug(start, ma->length))
+		pr_warn("SRAT debug: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
+			(unsigned long long)start, (unsigned long long)end - 1);
+
 	return 0;
 out_err_bad_srat:
 	bad_srat();
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org


* [PATCH 2/3] mem-hotplug: fix node spanned pages when we have a movable node
  2016-08-04 11:23 ` Xishi Qiu
@ 2016-08-04 11:24   ` Xishi Qiu
  -1 siblings, 0 replies; 26+ messages in thread
From: Xishi Qiu @ 2016-08-04 11:24 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Andrew Morton, Michal Hocko, David Rientjes,
	Joonsoo Kim, Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki
  Cc: Linux MM, LKML

commit 342332e6a925e9ed015e5465062c38d2b86ec8f9 rewrote the calculation of
node spanned pages. But when we have a movable node, the node's spanned pages
are counted twice. That's because we have an empty Normal zone: its present
pages are zero, but its spanned pages are not.

e.g.
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x0000007c7fffffff]
[    0.000000] Movable zone start for each node
[    0.000000]   Node 1: 0x0000001080000000
[    0.000000]   Node 2: 0x0000002080000000
[    0.000000]   Node 3: 0x0000003080000000
[    0.000000]   Node 4: 0x0000003c80000000
[    0.000000]   Node 5: 0x0000004c80000000
[    0.000000]   Node 6: 0x0000005c80000000
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009ffff]
[    0.000000]   node   0: [mem 0x0000000000100000-0x000000007552afff]
[    0.000000]   node   0: [mem 0x000000007bd46000-0x000000007bd46fff]
[    0.000000]   node   0: [mem 0x000000007bdcd000-0x000000007bffffff]
[    0.000000]   node   0: [mem 0x0000000100000000-0x000000107fffffff]
[    0.000000]   node   1: [mem 0x0000001080000000-0x000000207fffffff]
[    0.000000]   node   2: [mem 0x0000002080000000-0x000000307fffffff]
[    0.000000]   node   3: [mem 0x0000003080000000-0x0000003c7fffffff]
[    0.000000]   node   4: [mem 0x0000003c80000000-0x0000004c7fffffff]
[    0.000000]   node   5: [mem 0x0000004c80000000-0x0000005c7fffffff]
[    0.000000]   node   6: [mem 0x0000005c80000000-0x0000006c7fffffff]
[    0.000000]   node   7: [mem 0x0000006c80000000-0x0000007c7fffffff]

node1:
[  760.227767] Normal, start=0x1080000, present=0x0, spanned=0x1000000
[  760.234024] Movable, start=0x1080000, present=0x1000000, spanned=0x1000000
[  760.240883] pgdat, start=0x1080000, present=0x1000000, spanned=0x2000000

After applying this patch, the problem is fixed.
node1:
[  289.770922] Normal, start=0x0, present=0x0, spanned=0x0
[  289.776153] Movable, start=0x1080000, present=0x1000000, spanned=0x1000000
[  289.783019] pgdat, start=0x1080000, present=0x1000000, spanned=0x1000000

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
---
 mm/page_alloc.c | 54 +++++++++++++++++++++++-------------------------------
 1 file changed, 23 insertions(+), 31 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6903b69..2b258ec 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5173,15 +5173,6 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 		/*
-		 * If not mirrored_kernelcore and ZONE_MOVABLE exists, range
-		 * from zone_movable_pfn[nid] to end of each node should be
-		 * ZONE_MOVABLE not ZONE_NORMAL. skip it.
-		 */
-		if (!mirrored_kernelcore && zone_movable_pfn[nid])
-			if (zone == ZONE_NORMAL && pfn >= zone_movable_pfn[nid])
-				continue;
-
-		/*
 		 * Check given memblock attribute by firmware which can affect
 		 * kernel memory layout.  If zone==ZONE_MOVABLE but memory is
 		 * mirrored, it's an overlapped memmap init. skip it.
@@ -5619,6 +5610,12 @@ static void __meminit adjust_zone_range_for_zone_movable(int nid,
 			*zone_end_pfn = min(node_end_pfn,
 				arch_zone_highest_possible_pfn[movable_zone]);
 
+		/* Adjust for ZONE_MOVABLE starting within this range */
+		} else if (!mirrored_kernelcore &&
+			*zone_start_pfn < zone_movable_pfn[nid] &&
+			*zone_end_pfn > zone_movable_pfn[nid]) {
+			*zone_end_pfn = zone_movable_pfn[nid];
+
 		/* Check if this whole range is within ZONE_MOVABLE */
 		} else if (*zone_start_pfn >= zone_movable_pfn[nid])
 			*zone_start_pfn = *zone_end_pfn;
@@ -5722,28 +5719,23 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
 	 * Treat pages to be ZONE_MOVABLE in ZONE_NORMAL as absent pages
 	 * and vice versa.
 	 */
-	if (zone_movable_pfn[nid]) {
-		if (mirrored_kernelcore) {
-			unsigned long start_pfn, end_pfn;
-			struct memblock_region *r;
-
-			for_each_memblock(memory, r) {
-				start_pfn = clamp(memblock_region_memory_base_pfn(r),
-						  zone_start_pfn, zone_end_pfn);
-				end_pfn = clamp(memblock_region_memory_end_pfn(r),
-						zone_start_pfn, zone_end_pfn);
-
-				if (zone_type == ZONE_MOVABLE &&
-				    memblock_is_mirror(r))
-					nr_absent += end_pfn - start_pfn;
-
-				if (zone_type == ZONE_NORMAL &&
-				    !memblock_is_mirror(r))
-					nr_absent += end_pfn - start_pfn;
-			}
-		} else {
-			if (zone_type == ZONE_NORMAL)
-				nr_absent += node_end_pfn - zone_movable_pfn[nid];
+	if (mirrored_kernelcore && zone_movable_pfn[nid]) {
+		unsigned long start_pfn, end_pfn;
+		struct memblock_region *r;
+
+		for_each_memblock(memory, r) {
+			start_pfn = clamp(memblock_region_memory_base_pfn(r),
+					  zone_start_pfn, zone_end_pfn);
+			end_pfn = clamp(memblock_region_memory_end_pfn(r),
+					zone_start_pfn, zone_end_pfn);
+
+			if (zone_type == ZONE_MOVABLE &&
+			    memblock_is_mirror(r))
+				nr_absent += end_pfn - start_pfn;
+
+			if (zone_type == ZONE_NORMAL &&
+			    !memblock_is_mirror(r))
+				nr_absent += end_pfn - start_pfn;
 		}
 	}
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread


* [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-04 11:23 ` Xishi Qiu
@ 2016-08-04 11:25   ` Xishi Qiu
  -1 siblings, 0 replies; 26+ messages in thread
From: Xishi Qiu @ 2016-08-04 11:25 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Andrew Morton, Michal Hocko, David Rientjes,
	Joonsoo Kim, Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki
  Cc: Linux MM, LKML

MAX_ORDER_NR_PAGES is usually 4M while a pageblock is usually 2M, so we set
only one pageblock's migratetype in deferred_free_range() when the pfn is
aligned to MAX_ORDER_NR_PAGES.

Also, we failed to free the last block in deferred_init_memmap().

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
---
 mm/page_alloc.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2b258ec..e0ec3b6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1399,15 +1399,18 @@ static void __init deferred_free_range(struct page *page,
 		return;
 
 	/* Free a large naturally-aligned chunk if possible */
-	if (nr_pages == MAX_ORDER_NR_PAGES &&
-	    (pfn & (MAX_ORDER_NR_PAGES-1)) == 0) {
+	if (nr_pages == pageblock_nr_pages &&
+	    (pfn & (pageblock_nr_pages - 1)) == 0) {
 		set_pageblock_migratetype(page, MIGRATE_MOVABLE);
-		__free_pages_boot_core(page, MAX_ORDER-1);
+		__free_pages_boot_core(page, pageblock_order);
 		return;
 	}
 
-	for (i = 0; i < nr_pages; i++, page++)
+	for (i = 0; i < nr_pages; i++, page++, pfn++) {
+		if ((pfn & (pageblock_nr_pages - 1)) == 0)
+			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 		__free_pages_boot_core(page, 0);
+	}
 }
 
 /* Completion tracking for deferred_init_memmap() threads */
@@ -1475,9 +1478,9 @@ static int __init deferred_init_memmap(void *data)
 
 			/*
 			 * Ensure pfn_valid is checked every
-			 * MAX_ORDER_NR_PAGES for memory holes
+			 * pageblock_nr_pages for memory holes
 			 */
-			if ((pfn & (MAX_ORDER_NR_PAGES - 1)) == 0) {
+			if ((pfn & (pageblock_nr_pages - 1)) == 0) {
 				if (!pfn_valid(pfn)) {
 					page = NULL;
 					goto free_range;
@@ -1490,7 +1493,7 @@ static int __init deferred_init_memmap(void *data)
 			}
 
 			/* Minimise pfn page lookups and scheduler checks */
-			if (page && (pfn & (MAX_ORDER_NR_PAGES - 1)) != 0) {
+			if (page && (pfn & (pageblock_nr_pages - 1)) != 0) {
 				page++;
 			} else {
 				nr_pages += nr_to_free;
@@ -1526,6 +1529,9 @@ free_range:
 			free_base_page = NULL;
 			free_base_pfn = nr_to_free = 0;
 		}
+		/* Free the last block of pages to allocator */
+		nr_pages += nr_to_free;
+		deferred_free_range(free_base_page, free_base_pfn, nr_to_free);
 
 		first_init_pfn = max(end_pfn, first_init_pfn);
 	}
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread


* Re: [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-04 11:25   ` Xishi Qiu
@ 2016-08-04 11:36     ` Xishi Qiu
  -1 siblings, 0 replies; 26+ messages in thread
From: Xishi Qiu @ 2016-08-04 11:36 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Andrew Morton, Michal Hocko, David Rientjes,
	Joonsoo Kim, Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki
  Cc: Linux MM, LKML

On 2016/8/4 19:25, Xishi Qiu wrote:

> MAX_ORDER_NR_PAGES is usually 4M, and a pageblock is usually 2M, so we only
> set one pageblock's migratetype in deferred_free_range() if pfn is aligned
> to MAX_ORDER_NR_PAGES.
> 
> Also we missed to free the last block in deferred_init_memmap().
> 
> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>

Sorry for the typo: this patch is 3/3, and 1/3 is this one:
"[PATCH 1/3] mem-hotplug: introduce movablenode option"

However, they are all independent.

Thanks,
Xishi Qiu

^ permalink raw reply	[flat|nested] 26+ messages in thread


* Re: [PATCH 1/3] mem-hotplug: introduce movablenode option
  2016-08-04 11:23 ` Xishi Qiu
@ 2016-08-11 23:13   ` Andrew Morton
  -1 siblings, 0 replies; 26+ messages in thread
From: Andrew Morton @ 2016-08-11 23:13 UTC (permalink / raw)
  To: Xishi Qiu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Michal Hocko, David Rientjes, Joonsoo Kim,
	Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki, Linux MM, LKML

On Thu, 4 Aug 2016 19:23:54 +0800 Xishi Qiu <qiuxishi@huawei.com> wrote:

> This patch introduces a new boot option movablenode.
> 
> To support memory hotplug, boot option "movable_node" is needed. And to
> support debug memory hotplug, boot option "movable_node" and "movablenode"
> are both needed.
> 
> e.g. movable_node movablenode=1,2,4

I have some naming concerns.  "movable_node" and "movablenode" is just
confusing and ugly.

Can we just use the one parameter?   eg,

	vmlinux movable_node

or

	vmlinux movable_node=1,2,4

if not that, then how about "movable_node" and "movable_nodes"?  Then
every instance of "movablenode" in the patch itself should become
"movable_nodes" to be consistent with the command line parameter.

> It means node 1,2,4 will be set to movable nodes, the other nodes are
> unmovable nodes. Usually movable nodes are parsed from SRAT table which
> offered by BIOS, so this boot option is used for debug.
> 
>
> ---
>  Documentation/kernel-parameters.txt |  4 ++++
>  arch/x86/mm/srat.c                  | 36 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 40 insertions(+)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 82b42c9..f8726f8 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2319,6 +2319,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  	movable_node	[KNL,X86] Boot-time switch to enable the effects
>  			of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
>  
> +	movablenode=	[KNL,X86] Boot-time switch to set which node is
> +			movable node.
> +			Format: <movable nid>,...,<movable nid>

I think the docs should emphasize that this option disables the usual
SRAT-driven allocation and replaces it with manual allocation.

Also, can we please have more details in the patch changelog?  Why do we
*need* this?  Just for debugging?  Normally people will just use
SRAT-based allocation so normal users won't use this?  If so, why is
this debugging feature considered useful enough to add to the kernel?

^ permalink raw reply	[flat|nested] 26+ messages in thread


* Re: [PATCH 1/3] mem-hotplug: introduce movablenode option
  2016-08-11 23:13   ` Andrew Morton
@ 2016-08-15  1:40     ` Xishi Qiu
  -1 siblings, 0 replies; 26+ messages in thread
From: Xishi Qiu @ 2016-08-15  1:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Michal Hocko, David Rientjes, Joonsoo Kim,
	Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki, Linux MM, LKML

On 2016/8/12 7:13, Andrew Morton wrote:

> On Thu, 4 Aug 2016 19:23:54 +0800 Xishi Qiu <qiuxishi@huawei.com> wrote:
> 
>> This patch introduces a new boot option movablenode.
>>
>> To support memory hotplug, boot option "movable_node" is needed. And to
>> support debug memory hotplug, boot option "movable_node" and "movablenode"
>> are both needed.
>>
>> e.g. movable_node movablenode=1,2,4
> 
> I have some naming concerns.  "movable_node" and "movablenode" is just
> confusing and ugly.
> 

Hi Andrew,

OK, what about the other two fix patches?

Thanks,
Xishi Qiu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-04 11:25   ` Xishi Qiu
@ 2016-08-16  8:41     ` Michal Hocko
  -1 siblings, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2016-08-16  8:41 UTC (permalink / raw)
  To: Xishi Qiu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Andrew Morton, David Rientjes, Joonsoo Kim,
	Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki, Linux MM, LKML

On Thu 04-08-16 19:25:03, Xishi Qiu wrote:
> MAX_ORDER_NR_PAGES is usually 4M, and a pageblock is usually 2M, so we only
> set one pageblock's migratetype in deferred_free_range() if pfn is aligned
> to MAX_ORDER_NR_PAGES.

Do I read the changelog correctly and the bug causes leaking unmovable
allocations into movable zones?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-16  8:41     ` Michal Hocko
@ 2016-08-16  8:56       ` Xishi Qiu
  -1 siblings, 0 replies; 26+ messages in thread
From: Xishi Qiu @ 2016-08-16  8:56 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Andrew Morton, David Rientjes, Joonsoo Kim,
	Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki, Linux MM, LKML

On 2016/8/16 16:41, Michal Hocko wrote:

> On Thu 04-08-16 19:25:03, Xishi Qiu wrote:
>> MAX_ORDER_NR_PAGES is usually 4M, and a pageblock is usually 2M, so we only
>> set one pageblock's migratetype in deferred_free_range() if pfn is aligned
>> to MAX_ORDER_NR_PAGES.
> 
> Do I read the changelog correctly and the bug causes leaking unmovable
> allocations into movable zones?

Hi Michal,

This bug causes uninitialized migratetypes; as you can see from
"cat /proc/pagetypeinfo", almost half the blocks are Unmovable.

This bug also fails to free the last block of pages, which causes a memory leak.

Thanks,
Xishi Qiu

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-16  8:56       ` Xishi Qiu
@ 2016-08-16  9:23         ` Michal Hocko
  -1 siblings, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2016-08-16  9:23 UTC (permalink / raw)
  To: Xishi Qiu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Vlastimil Babka,
	Mel Gorman, Andrew Morton, David Rientjes, Joonsoo Kim,
	Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki, Linux MM, LKML

On Tue 16-08-16 16:56:54, Xishi Qiu wrote:
> On 2016/8/16 16:41, Michal Hocko wrote:
> 
> > On Thu 04-08-16 19:25:03, Xishi Qiu wrote:
> >> MAX_ORDER_NR_PAGES is usually 4M, and a pageblock is usually 2M, so we only
> >> set one pageblock's migratetype in deferred_free_range() if pfn is aligned
> >> to MAX_ORDER_NR_PAGES.
> > 
> > Do I read the changelog correctly and the bug causes leaking unmovable
> > allocations into movable zones?
> 
> Hi Michal,
> 
> This bug will cause uninitialized migratetype, you can see from
> "cat /proc/pagetypeinfo", almost half blocks are Unmovable.

Please add that information to the changelog. Leaking unmovable
allocations to the movable zones defeats the whole purpose of the
movable zone so I guess we really want to mark this for stable.
AFAICS it should also note:
Fixes: ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are set during struct page init")
and stable 4.2+

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-16  9:23         ` Michal Hocko
@ 2016-08-16 10:01           ` Xishi Qiu
  -1 siblings, 0 replies; 26+ messages in thread
From: Xishi Qiu @ 2016-08-16 10:01 UTC (permalink / raw)
  To: Michal Hocko, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Vlastimil Babka, Mel Gorman, Andrew Morton, David Rientjes,
	Joonsoo Kim, Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki
  Cc: Linux MM, LKML

Fixes: ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are set during struct page init")
Cc: stable@vger.kernel.org # 4.2+

On x86_64, MAX_ORDER_NR_PAGES usually covers 4MB while a pageblock covers
2MB, so we only set one pageblock's migratetype in deferred_free_range()
when pfn is aligned to MAX_ORDER_NR_PAGES. That leaves the other pageblocks
with an uninitialized migratetype; "cat /proc/pagetypeinfo" shows almost
half of the blocks as Unmovable.

We also missed freeing the last block in deferred_init_memmap(), which
causes a memory leak.

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
---
 mm/page_alloc.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2b258ec..e0ec3b6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1399,15 +1399,18 @@ static void __init deferred_free_range(struct page *page,
 		return;
 
 	/* Free a large naturally-aligned chunk if possible */
-	if (nr_pages == MAX_ORDER_NR_PAGES &&
-	    (pfn & (MAX_ORDER_NR_PAGES-1)) == 0) {
+	if (nr_pages == pageblock_nr_pages &&
+	    (pfn & (pageblock_nr_pages - 1)) == 0) {
 		set_pageblock_migratetype(page, MIGRATE_MOVABLE);
-		__free_pages_boot_core(page, MAX_ORDER-1);
+		__free_pages_boot_core(page, pageblock_order);
 		return;
 	}
 
-	for (i = 0; i < nr_pages; i++, page++)
+	for (i = 0; i < nr_pages; i++, page++, pfn++) {
+		if ((pfn & (pageblock_nr_pages - 1)) == 0)
+			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 		__free_pages_boot_core(page, 0);
+	}
 }
 
 /* Completion tracking for deferred_init_memmap() threads */
@@ -1475,9 +1478,9 @@ static int __init deferred_init_memmap(void *data)
 
 			/*
 			 * Ensure pfn_valid is checked every
-			 * MAX_ORDER_NR_PAGES for memory holes
+			 * pageblock_nr_pages for memory holes
 			 */
-			if ((pfn & (MAX_ORDER_NR_PAGES - 1)) == 0) {
+			if ((pfn & (pageblock_nr_pages - 1)) == 0) {
 				if (!pfn_valid(pfn)) {
 					page = NULL;
 					goto free_range;
@@ -1490,7 +1493,7 @@ static int __init deferred_init_memmap(void *data)
 			}
 
 			/* Minimise pfn page lookups and scheduler checks */
-			if (page && (pfn & (MAX_ORDER_NR_PAGES - 1)) != 0) {
+			if (page && (pfn & (pageblock_nr_pages - 1)) != 0) {
 				page++;
 			} else {
 				nr_pages += nr_to_free;
@@ -1526,6 +1529,9 @@ free_range:
 			free_base_page = NULL;
 			free_base_pfn = nr_to_free = 0;
 		}
+		/* Free the last block of pages to allocator */
+		nr_pages += nr_to_free;
+		deferred_free_range(free_base_page, free_base_pfn, nr_to_free);
 
 		first_init_pfn = max(end_pfn, first_init_pfn);
 	}
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-16  9:23         ` Michal Hocko
@ 2016-08-16 10:12           ` Vlastimil Babka
  -1 siblings, 0 replies; 26+ messages in thread
From: Vlastimil Babka @ 2016-08-16 10:12 UTC (permalink / raw)
  To: Michal Hocko, Xishi Qiu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Mel Gorman,
	Andrew Morton, David Rientjes, Joonsoo Kim, Taku Izumi,
	'Kirill A . Shutemov',
	Kamezawa Hiroyuki, Linux MM, LKML

On 08/16/2016 11:23 AM, Michal Hocko wrote:
> On Tue 16-08-16 16:56:54, Xishi Qiu wrote:
>> On 2016/8/16 16:41, Michal Hocko wrote:
>>
>>> On Thu 04-08-16 19:25:03, Xishi Qiu wrote:
>>>> MAX_ORDER_NR_PAGES is usually 4M, and a pageblock is usually 2M, so we only
>>>> set one pageblock's migratetype in deferred_free_range() if pfn is aligned
>>>> to MAX_ORDER_NR_PAGES.
>>>
>>> Do I read the changelog correctly and the bug causes leaking unmovable
>>> allocations into movable zones?
>>
>> Hi Michal,
>>
>> This bug will cause uninitialized migratetype, you can see from
>> "cat /proc/pagetypeinfo", almost half blocks are Unmovable.
>
> Please add that information to the changelog. Leaking unmovable
> allocations to the movable zones defeats the whole purpose of the
> movable zone so I guess we really want to mark this for stable.

Note that it's not that severe. The pageblock migratetype is just a 
heuristic against fragmentation. It shouldn't allow unmovable allocations 
from movable zones anyway (although I can't find what really does govern it).

> AFAICS it should also note:
> Fixes: ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are set during struct page init")
> and stable 4.2+

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-16 10:12           ` Vlastimil Babka
@ 2016-08-16 10:20             ` Xishi Qiu
  -1 siblings, 0 replies; 26+ messages in thread
From: Xishi Qiu @ 2016-08-16 10:20 UTC (permalink / raw)
  To: Vlastimil Babka, Michal Hocko
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Mel Gorman,
	Andrew Morton, David Rientjes, Joonsoo Kim, Taku Izumi,
	'Kirill A . Shutemov',
	Kamezawa Hiroyuki, Linux MM, LKML

On 2016/8/16 18:12, Vlastimil Babka wrote:

> On 08/16/2016 11:23 AM, Michal Hocko wrote:
>> On Tue 16-08-16 16:56:54, Xishi Qiu wrote:
>>> On 2016/8/16 16:41, Michal Hocko wrote:
>>>
>>>> On Thu 04-08-16 19:25:03, Xishi Qiu wrote:
>>>>> MAX_ORDER_NR_PAGES is usually 4M, and a pageblock is usually 2M, so we only
>>>>> set one pageblock's migratetype in deferred_free_range() if pfn is aligned
>>>>> to MAX_ORDER_NR_PAGES.
>>>>
>>>> Do I read the changelog correctly and the bug causes leaking unmovable
>>>> allocations into movable zones?
>>>
>>> Hi Michal,
>>>
>>> This bug will cause uninitialized migratetype, you can see from
>>> "cat /proc/pagetypeinfo", almost half blocks are Unmovable.
>>
>> Please add that information to the changelog. Leaking unmovable
>> allocations to the movable zones defeats the whole purpose of the
>> movable zone so I guess we really want to mark this for stable.
> 
> Note that it's not as severe. Pageblock migratetype is just heuristic against fragmentation. It should not allow unmovable allocations from movable zones (although I can't find what really does govern it).
> 

Yes, leaking an unmovable migratetype into the movable zone is fine for
mem-offline; we check every page in offline_pages().
But as I pointed out, we missed freeing the last block in
deferred_init_memmap(), and that will cause mem-offline to fail.

Thanks,
Xishi Qiu

>> AFAICS it should also note:
>> Fixes: ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are set during struct page init")
>> and stable 4.2+

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init
  2016-08-16 10:12           ` Vlastimil Babka
@ 2016-08-16 11:10             ` Michal Hocko
  -1 siblings, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2016-08-16 11:10 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Xishi Qiu, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Mel Gorman, Andrew Morton, David Rientjes, Joonsoo Kim,
	Taku Izumi, 'Kirill A . Shutemov',
	Kamezawa Hiroyuki, Linux MM, LKML

On Tue 16-08-16 12:12:07, Vlastimil Babka wrote:
> On 08/16/2016 11:23 AM, Michal Hocko wrote:
> > On Tue 16-08-16 16:56:54, Xishi Qiu wrote:
> > > On 2016/8/16 16:41, Michal Hocko wrote:
> > > 
> > > > On Thu 04-08-16 19:25:03, Xishi Qiu wrote:
> > > > > MAX_ORDER_NR_PAGES is usually 4M, and a pageblock is usually 2M, so we only
> > > > > set one pageblock's migratetype in deferred_free_range() if pfn is aligned
> > > > > to MAX_ORDER_NR_PAGES.
> > > > 
> > > > Do I read the changelog correctly and the bug causes leaking unmovable
> > > > allocations into movable zones?
> > > 
> > > Hi Michal,
> > > 
> > > This bug will cause uninitialized migratetype, you can see from
> > > "cat /proc/pagetypeinfo", almost half blocks are Unmovable.
> > 
> > Please add that information to the changelog. Leaking unmovable
> > allocations to the movable zones defeats the whole purpose of the
> > movable zone so I guess we really want to mark this for stable.
> 
> Note that it's not as severe. Pageblock migratetype is just heuristic
> against fragmentation. It should not allow unmovable allocations from
> movable zones (although I can't find what really does govern it).

You are right! gfp_zone would disallow movable zones from the zone
list. So we indeed cannot leak unmovable allocations into the movable
zone, and then this doesn't really sound important enough to bother
with a stable backport. It would be really great to have all of this in
the changelog. This code is far from straightforward, so some
assistance from the changelog is more than welcome.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2016-08-16 11:11 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-04 11:23 [PATCH 1/3] mem-hotplug: introduce movablenode option Xishi Qiu
2016-08-04 11:24 ` [PATCH 2/3] mem-hotplug: fix node spanned pages when we have a movable node Xishi Qiu
2016-08-04 11:25 ` [PATCH 1/3] mm: fix set pageblock migratetype in deferred struct page init Xishi Qiu
2016-08-04 11:36   ` Xishi Qiu
2016-08-16  8:41   ` Michal Hocko
2016-08-16  8:56     ` Xishi Qiu
2016-08-16  9:23       ` Michal Hocko
2016-08-16 10:01         ` [PATCH v2] " Xishi Qiu
2016-08-16 10:12         ` [PATCH 1/3] " Vlastimil Babka
2016-08-16 10:20           ` Xishi Qiu
2016-08-16 11:10           ` Michal Hocko
2016-08-11 23:13 ` [PATCH 1/3] mem-hotplug: introduce movablenode option Andrew Morton
2016-08-15  1:40   ` Xishi Qiu
