linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/2] mm: Introduce kernelcore=mirror option
@ 2016-01-08  8:25 Taku Izumi
  2016-01-08  8:26 ` Taku Izumi
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Taku Izumi @ 2016-01-08  8:25 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: tony.luck, qiuxishi, kamezawa.hiroyu, mel, dave.hansen, matt,
	arnd, steve.capper, sudeep.holla, Taku Izumi

Xeon E7 v3 based systems support Address Range Mirroring,
and a UEFI BIOS compliant with UEFI spec 2.5 can report which
ranges are mirrored (reliable) via the EFI memory map.
The Linux kernel now uses this information and allocates
boot-time memory from the reliable region.

My requirements are:
  - allocate kernel memory from the mirrored region
  - allocate user memory from the non-mirrored region

ZONE_MOVABLE is useful for meeting these requirements.
By placing the non-mirrored ranges into ZONE_MOVABLE,
mirrored memory is used for kernel allocations.

My idea is to extend the existing "kernelcore" option and
introduce a kernelcore=mirror option. By specifying
"mirror" instead of an amount of memory, the
non-mirrored regions are placed into ZONE_MOVABLE.
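
To make the proposed interface concrete, here is a minimal userspace
sketch of how the value of kernelcore= could be told apart: either the
keyword "mirror" or a size in the nn[KMGTPE] format. The names
parse_kernelcore_value() and struct kernelcore_req are hypothetical and
exist only for this illustration; the patch itself checks for "mirror"
with parse_option_str() and otherwise falls back to cmdline_parse_core().
Overflow of very large sizes is not handled here.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of the patch. */
struct kernelcore_req {
	bool mirror;              /* true for kernelcore=mirror */
	unsigned long long bytes; /* requested size otherwise */
};

/*
 * Parse the value part of "kernelcore=<value>".
 * Returns 0 on success, -1 if the value is neither "mirror" nor a size.
 */
static int parse_kernelcore_value(const char *val, struct kernelcore_req *req)
{
	static const char suffixes[] = "KMGTPE";
	unsigned long long n;
	unsigned int shift = 0;
	char *end;

	req->mirror = false;
	req->bytes = 0;

	/* "mirror" and a size are mutually exclusive. */
	if (strcmp(val, "mirror") == 0) {
		req->mirror = true;
		return 0;
	}

	n = strtoull(val, &end, 0);
	if (end == val)
		return -1;

	/* Optional single binary suffix: K=2^10, M=2^20, ... E=2^60. */
	if (*end != '\0') {
		const char *s = strchr(suffixes, toupper((unsigned char)*end));

		if (!s || end[1] != '\0')
			return -1;
		shift = 10 * (unsigned int)(s - suffixes + 1);
	}

	req->bytes = n << shift;
	return 0;
}
```

With this sketch, "mirror" sets only the flag, while "4G" yields a
byte count of 4 << 30 with the flag left clear.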

Earlier discussions are at: 
 https://lkml.org/lkml/2015/10/9/24
 https://lkml.org/lkml/2015/10/15/9
 https://lkml.org/lkml/2015/11/27/18
 https://lkml.org/lkml/2015/12/8/836

For example, suppose a 2-node system with the following memory
ranges:
  node 0 [mem 0x0000000000001000-0x000000109fffffff] 
  node 1 [mem 0x00000010a0000000-0x000000209fffffff]
and the following ranges are marked as reliable (mirrored):
  [0x0000000000000000-0x0000000100000000] 
  [0x0000000100000000-0x0000000180000000] 
  [0x0000000800000000-0x0000000880000000] 
  [0x00000010a0000000-0x0000001120000000]
  [0x00000017a0000000-0x0000001820000000] 

If you specify kernelcore=mirror, ZONE_NORMAL and ZONE_MOVABLE
are arranged as below:

 - node 0:
  ZONE_NORMAL : [0x0000000100000000-0x00000010a0000000]
  ZONE_MOVABLE: [0x0000000180000000-0x00000010a0000000]
 - node 1: 
  ZONE_NORMAL : [0x00000010a0000000-0x00000020a0000000]
  ZONE_MOVABLE: [0x0000001120000000-0x00000020a0000000]
 
In the overlapping range, pages within ZONE_NORMAL that belong to
ZONE_MOVABLE are treated as absent pages, and vice versa.
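
The ZONE_MOVABLE start addresses in the example can be reproduced with
a small userspace sketch. It assumes the memory map is already split at
the mirrored/non-mirrored boundaries and follows the rule used in patch
2/2: for each node, take the lowest non-mirrored region that starts at
or above 4GB. struct region, example_regions and find_movable_start()
are hypothetical names for this illustration and are not kernel
interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PFN_4GB    (1ULL << (32 - PAGE_SHIFT)) /* 0x100000 */
#define MAX_NODES  2

/* Simplified stand-in for a memblock region; end is exclusive. */
struct region {
	int nid;
	uint64_t base;
	uint64_t end;
	bool mirrored;
};

/* The two-node example, split at mirrored/non-mirrored boundaries. */
static const struct region example_regions[] = {
	{ 0, 0x0000000000001000ULL, 0x0000000100000000ULL, true  },
	{ 0, 0x0000000100000000ULL, 0x0000000180000000ULL, true  },
	{ 0, 0x0000000180000000ULL, 0x0000000800000000ULL, false },
	{ 0, 0x0000000800000000ULL, 0x0000000880000000ULL, true  },
	{ 0, 0x0000000880000000ULL, 0x00000010a0000000ULL, false },
	{ 1, 0x00000010a0000000ULL, 0x0000001120000000ULL, true  },
	{ 1, 0x0000001120000000ULL, 0x00000017a0000000ULL, false },
	{ 1, 0x00000017a0000000ULL, 0x0000001820000000ULL, true  },
	{ 1, 0x0000001820000000ULL, 0x00000020a0000000ULL, false },
};

/*
 * For each node, record the lowest start PFN of a non-mirrored region
 * at or above 4GB. A value of 0 means the node gets no ZONE_MOVABLE.
 */
static void find_movable_start(const struct region *r, size_t n,
			       uint64_t movable_pfn[MAX_NODES])
{
	size_t i;

	for (i = 0; i < MAX_NODES; i++)
		movable_pfn[i] = 0;

	for (i = 0; i < n; i++) {
		uint64_t start_pfn = r[i].base >> PAGE_SHIFT;

		/* Mirrored memory and memory below 4GB stay with the kernel. */
		if (r[i].mirrored || start_pfn < PFN_4GB)
			continue;

		if (!movable_pfn[r[i].nid] || start_pfn < movable_pfn[r[i].nid])
			movable_pfn[r[i].nid] = start_pfn;
	}
}
```

For the example map this gives 0x0000000180000000 for node 0 and
0x0000001120000000 for node 1, matching the ZONE_MOVABLE start
addresses listed above.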

This patchset is based on the "akpm" branch of linux-next.


v1 -> v2:
 - Refine so that the above example case can also be
   handled properly
v2 -> v3:
 - Change the option name from kernelcore=reliable
   to kernelcore=mirror and fix some documentation,
   as pointed out by Andrew Morton
v3 -> v4:
 - Fix up the case of CONFIG_HAVE_MEMBLOCK_NODE_MAP=n
   (fixes boot failure on ARM machines)
 - No functional change in case of CONFIG_HAVE_MEMBLOCK_NODE_MAP=y


Taku Izumi (2):
  mm/page_alloc.c: calculate zone_start_pfn at
    zone_spanned_pages_in_node()
  mm/page_alloc.c: introduce kernelcore=mirror option

 Documentation/kernel-parameters.txt |  12 ++-
 mm/page_alloc.c                     | 154 ++++++++++++++++++++++++++++++++----
 2 files changed, 148 insertions(+), 18 deletions(-)

-- 
1.9.1


* [PATCH v4 1/2] mm/page_alloc.c: calculate zone_start_pfn at zone_spanned_pages_in_node()
  2016-01-08  8:25 [PATCH v4 0/2] mm: Introduce kernelcore=mirror option Taku Izumi
  2016-01-08  8:26 ` Taku Izumi
@ 2016-01-08  8:26 ` Taku Izumi
  2016-01-08  8:26 ` [PATCH v4 2/2] mm/page_alloc.c: introduce kernelcore=mirror option Taku Izumi
  2 siblings, 0 replies; 7+ messages in thread
From: Taku Izumi @ 2016-01-08  8:26 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: tony.luck, qiuxishi, kamezawa.hiroyu, mel, dave.hansen, matt,
	arnd, steve.capper, sudeep.holla, Taku Izumi

Currently each zone's zone_start_pfn is calculated in
free_area_init_core().  However, a zone's range is already fixed by the
time zone_spanned_pages_in_node() is invoked.

This patch changes the code so that each zone->zone_start_pfn is
calculated in zone_spanned_pages_in_node().
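
As a rough illustration of the CONFIG_HAVE_MEMBLOCK_NODE_MAP=n case,
the following standalone sketch mirrors the idea of the new helper: a
zone starts where the preceding zones of the node end, so the start is
the node start plus the sizes of all lower zones. zone_span() and
NR_ZONES are hypothetical names used only here; the kernel function is
zone_spanned_pages_in_node() and the array bound is MAX_NR_ZONES.

```c
#define NR_ZONES 3 /* illustrative; the kernel uses MAX_NR_ZONES */

/*
 * Return the spanned pages of one zone and report where it starts and
 * ends. Zones are assumed to be laid out back to back from the node
 * start, in zone order.
 */
static unsigned long zone_span(unsigned int zone_type,
			       unsigned long node_start_pfn,
			       const unsigned long zones_size[NR_ZONES],
			       unsigned long *zone_start_pfn,
			       unsigned long *zone_end_pfn)
{
	unsigned int z;

	/* Skip over every zone that sits below this one. */
	*zone_start_pfn = node_start_pfn;
	for (z = 0; z < zone_type; z++)
		*zone_start_pfn += zones_size[z];

	*zone_end_pfn = *zone_start_pfn + zones_size[zone_type];

	return zones_size[zone_type];
}
```

A caller would store the reported start only when the returned size is
non-zero, and 0 otherwise, as calculate_node_totalpages() does in the
diff below.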

v1 -> v2:
 - Fix up the case of CONFIG_HAVE_MEMBLOCK_NODE_MAP=n
 - No functional change in case of CONFIG_HAVE_MEMBLOCK_NODE_MAP=y

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
---
 mm/page_alloc.c | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3c3a5c5..efb8996 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5076,31 +5076,31 @@ static unsigned long __meminit zone_spanned_pages_in_node(int nid,
 					unsigned long zone_type,
 					unsigned long node_start_pfn,
 					unsigned long node_end_pfn,
+					unsigned long *zone_start_pfn,
+					unsigned long *zone_end_pfn,
 					unsigned long *ignored)
 {
-	unsigned long zone_start_pfn, zone_end_pfn;
-
 	/* When hotadd a new node from cpu_up(), the node should be empty */
 	if (!node_start_pfn && !node_end_pfn)
 		return 0;
 
 	/* Get the start and end of the zone */
-	zone_start_pfn = arch_zone_lowest_possible_pfn[zone_type];
-	zone_end_pfn = arch_zone_highest_possible_pfn[zone_type];
+	*zone_start_pfn = arch_zone_lowest_possible_pfn[zone_type];
+	*zone_end_pfn = arch_zone_highest_possible_pfn[zone_type];
 	adjust_zone_range_for_zone_movable(nid, zone_type,
 				node_start_pfn, node_end_pfn,
-				&zone_start_pfn, &zone_end_pfn);
+				zone_start_pfn, zone_end_pfn);
 
 	/* Check that this node has pages within the zone's required range */
-	if (zone_end_pfn < node_start_pfn || zone_start_pfn > node_end_pfn)
+	if (*zone_end_pfn < node_start_pfn || *zone_start_pfn > node_end_pfn)
 		return 0;
 
 	/* Move the zone boundaries inside the node if necessary */
-	zone_end_pfn = min(zone_end_pfn, node_end_pfn);
-	zone_start_pfn = max(zone_start_pfn, node_start_pfn);
+	*zone_end_pfn = min(*zone_end_pfn, node_end_pfn);
+	*zone_start_pfn = max(*zone_start_pfn, node_start_pfn);
 
 	/* Return the spanned pages */
-	return zone_end_pfn - zone_start_pfn;
+	return *zone_end_pfn - *zone_start_pfn;
 }
 
 /*
@@ -5165,8 +5165,18 @@ static inline unsigned long __meminit zone_spanned_pages_in_node(int nid,
 					unsigned long zone_type,
 					unsigned long node_start_pfn,
 					unsigned long node_end_pfn,
+					unsigned long *zone_start_pfn,
+					unsigned long *zone_end_pfn,
 					unsigned long *zones_size)
 {
+	unsigned int zone;
+
+	*zone_start_pfn = node_start_pfn;
+	for (zone = 0; zone < zone_type; zone++)
+		*zone_start_pfn += zones_size[zone];
+
+	*zone_end_pfn = *zone_start_pfn + zones_size[zone_type];
+
 	return zones_size[zone_type];
 }
 
@@ -5195,15 +5205,22 @@ static void __meminit calculate_node_totalpages(struct pglist_data *pgdat,
 
 	for (i = 0; i < MAX_NR_ZONES; i++) {
 		struct zone *zone = pgdat->node_zones + i;
+		unsigned long zone_start_pfn, zone_end_pfn;
 		unsigned long size, real_size;
 
 		size = zone_spanned_pages_in_node(pgdat->node_id, i,
 						  node_start_pfn,
 						  node_end_pfn,
+						  &zone_start_pfn,
+						  &zone_end_pfn,
 						  zones_size);
 		real_size = size - zone_absent_pages_in_node(pgdat->node_id, i,
 						  node_start_pfn, node_end_pfn,
 						  zholes_size);
+		if (size)
+			zone->zone_start_pfn = zone_start_pfn;
+		else
+			zone->zone_start_pfn = 0;
 		zone->spanned_pages = size;
 		zone->present_pages = real_size;
 
@@ -5324,7 +5341,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
 {
 	enum zone_type j;
 	int nid = pgdat->node_id;
-	unsigned long zone_start_pfn = pgdat->node_start_pfn;
 	int ret;
 
 	pgdat_resize_init(pgdat);
@@ -5340,6 +5356,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
 	for (j = 0; j < MAX_NR_ZONES; j++) {
 		struct zone *zone = pgdat->node_zones + j;
 		unsigned long size, realsize, freesize, memmap_pages;
+		unsigned long zone_start_pfn = zone->zone_start_pfn;
 
 		size = zone->spanned_pages;
 		realsize = freesize = zone->present_pages;
@@ -5408,7 +5425,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
 		ret = init_currently_empty_zone(zone, zone_start_pfn, size);
 		BUG_ON(ret);
 		memmap_init(size, nid, j, zone_start_pfn);
-		zone_start_pfn += size;
 	}
 }
 
@@ -5476,6 +5492,8 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
 	pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid,
 		(u64)start_pfn << PAGE_SHIFT,
 		end_pfn ? ((u64)end_pfn << PAGE_SHIFT) - 1 : 0);
+#else
+	start_pfn = node_start_pfn;
 #endif
 	calculate_node_totalpages(pgdat, start_pfn, end_pfn,
 				  zones_size, zholes_size);
-- 
1.9.1


* [PATCH v4 2/2] mm/page_alloc.c: introduce kernelcore=mirror option
  2016-01-08  8:25 [PATCH v4 0/2] mm: Introduce kernelcore=mirror option Taku Izumi
  2016-01-08  8:26 ` Taku Izumi
  2016-01-08  8:26 ` [PATCH v4 1/2] mm/page_alloc.c: calculate zone_start_pfn at zone_spanned_pages_in_node() Taku Izumi
@ 2016-01-08  8:26 ` Taku Izumi
  2016-01-08 17:02   ` Sudeep Holla
  2 siblings, 1 reply; 7+ messages in thread
From: Taku Izumi @ 2016-01-08  8:26 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: tony.luck, qiuxishi, kamezawa.hiroyu, mel, dave.hansen, matt,
	arnd, steve.capper, sudeep.holla, Taku Izumi

This patch extends the existing "kernelcore" option and introduces a
kernelcore=mirror option.  By specifying "mirror" instead of an amount
of memory, non-mirrored (non-reliable) regions are placed into
ZONE_MOVABLE.

v1 -> v2:
 - Refine so that the following case can also be
   handled properly:

 Node X:  |MMMMMM------MMMMMM--------|
   (legend) M: mirrored  -: not mirrored

 In this case, ZONE_NORMAL and ZONE_MOVABLE are
 arranged as below:

 Node X:  |MMMMMM------MMMMMM--------|
          |ooooooxxxxxxooooooxxxxxxxx| ZONE_NORMAL
                |ooooooxxxxxxoooooooo| ZONE_MOVABLE
   (legend) o: present  x: absent

v2 -> v3:
 - Fix build with CONFIG_HAVE_MEMBLOCK_NODE_MAP=n
 - No functional change in case of CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
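
The present/absent layout in the Node X diagram can be checked with a
small userspace sketch of the overlap accounting. It treats one
character of the diagram as one page and ignores real memory holes,
which the kernel counts separately via __absent_pages_in_range().
struct pfn_region, node_x, clamp_pfn() and overlap_absent_pages() are
hypothetical names for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum zone_kind { ZONE_KIND_NORMAL, ZONE_KIND_MOVABLE }; /* sketch-only */

struct pfn_region {
	unsigned long start_pfn; /* inclusive */
	unsigned long end_pfn;   /* exclusive */
	bool mirrored;
};

/* Node X from the diagram: |MMMMMM------MMMMMM--------| */
static const struct pfn_region node_x[] = {
	{  0,  6, true  },
	{  6, 12, false },
	{ 12, 18, true  },
	{ 18, 26, false },
};

static unsigned long clamp_pfn(unsigned long v, unsigned long lo,
			       unsigned long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Count pages inside [zone_start, zone_end) that belong to the other
 * zone: mirrored pages are absent from the movable zone, non-mirrored
 * pages are absent from the normal zone.
 */
static unsigned long overlap_absent_pages(enum zone_kind kind,
					  unsigned long zone_start,
					  unsigned long zone_end,
					  const struct pfn_region *r, size_t n)
{
	unsigned long absent = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Clip the region to the zone span before counting. */
		unsigned long s = clamp_pfn(r[i].start_pfn, zone_start, zone_end);
		unsigned long e = clamp_pfn(r[i].end_pfn, zone_start, zone_end);

		if (kind == ZONE_KIND_MOVABLE && r[i].mirrored)
			absent += e - s;
		if (kind == ZONE_KIND_NORMAL && !r[i].mirrored)
			absent += e - s;
	}

	return absent;
}
```

For ZONE_NORMAL spanning [0, 26) this counts 14 absent pages, and for
ZONE_MOVABLE spanning [6, 26) it counts 6, which matches the number of
"x" marks on each line of the diagram.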

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
---
 Documentation/kernel-parameters.txt |  12 +++-
 mm/page_alloc.c                     | 114 ++++++++++++++++++++++++++++++++++--
 2 files changed, 119 insertions(+), 7 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 0ee59ec..af375ee 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1715,7 +1715,9 @@ Such letter suffixes can also be entirely omitted.
 
 	keepinitrd	[HW,ARM]
 
-	kernelcore=nn[KMGTPE]	[KNL,X86,IA-64,PPC] This parameter
+	kernelcore=	[KNL,X86,IA-64,PPC]
+			Format: nn[KMGTPE] | "mirror"
+			This parameter
 			specifies the amount of memory usable by the kernel
 			for non-movable allocations.  The requested amount is
 			spread evenly throughout all nodes in the system. The
@@ -1731,6 +1733,14 @@ Such letter suffixes can also be entirely omitted.
 			use the HighMem zone if it exists, and the Normal
 			zone if it does not.
 
+			Instead of specifying the amount of memory (nn[KMGTPE]),
+			you can specify "mirror" option. In case "mirror"
+			option is specified, mirrored (reliable) memory is used
+			for non-movable allocations and remaining memory is used
+			for Movable pages. nn[KMGTPE] and "mirror" are exclusive,
+			so you can NOT specify nn[KMGTPE] and "mirror" at the same
+			time.
+
 	kgdbdbgp=	[KGDB,HW] kgdb over EHCI usb debug port.
 			Format: <Controller#>[,poll interval]
 			The controller # is the number of the ehci usb debug
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index efb8996..b528328 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -260,6 +260,7 @@ static unsigned long __meminitdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
 static unsigned long __initdata required_kernelcore;
 static unsigned long __initdata required_movablecore;
 static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
+static bool mirrored_kernelcore;
 
 /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
 int movable_zone;
@@ -4613,6 +4614,9 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 	unsigned long pfn;
 	struct zone *z;
 	unsigned long nr_initialised = 0;
+#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
+	struct memblock_region *r = NULL, *tmp;
+#endif
 
 	if (highest_memmap_pfn < end_pfn - 1)
 		highest_memmap_pfn = end_pfn - 1;
@@ -4639,6 +4643,40 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			if (!update_defer_init(pgdat, pfn, end_pfn,
 						&nr_initialised))
 				break;
+
+			/*
+			 * if not mirrored_kernelcore and ZONE_MOVABLE exists,
+			 * range from zone_movable_pfn[nid] to end of each node
+			 * should be ZONE_MOVABLE not ZONE_NORMAL. skip it.
+			 */
+			if (!mirrored_kernelcore && zone_movable_pfn[nid])
+				if (zone == ZONE_NORMAL &&
+				    pfn >= zone_movable_pfn[nid])
+					continue;
+
+#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
+			/*
+			 * check given memblock attribute by firmware which
+			 * can affect kernel memory layout.
+			 * if zone==ZONE_MOVABLE but memory is mirrored,
+			 * it's an overlapped memmap init. skip it.
+			 */
+			if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
+				if (!r ||
+				    pfn >= memblock_region_memory_end_pfn(r)) {
+					for_each_memblock(memory, tmp)
+						if (pfn < memblock_region_memory_end_pfn(tmp))
+							break;
+					r = tmp;
+				}
+				if (pfn >= memblock_region_memory_base_pfn(r) &&
+				    memblock_is_mirror(r)) {
+					/* already initialized as NORMAL */
+					pfn = memblock_region_memory_end_pfn(r);
+					continue;
+				}
+			}
+#endif
 		}
 
 		/*
@@ -5057,11 +5095,6 @@ static void __meminit adjust_zone_range_for_zone_movable(int nid,
 			*zone_end_pfn = min(node_end_pfn,
 				arch_zone_highest_possible_pfn[movable_zone]);
 
-		/* Adjust for ZONE_MOVABLE starting within this range */
-		} else if (*zone_start_pfn < zone_movable_pfn[nid] &&
-				*zone_end_pfn > zone_movable_pfn[nid]) {
-			*zone_end_pfn = zone_movable_pfn[nid];
-
 		/* Check if this whole range is within ZONE_MOVABLE */
 		} else if (*zone_start_pfn >= zone_movable_pfn[nid])
 			*zone_start_pfn = *zone_end_pfn;
@@ -5146,6 +5179,7 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
 	unsigned long zone_low = arch_zone_lowest_possible_pfn[zone_type];
 	unsigned long zone_high = arch_zone_highest_possible_pfn[zone_type];
 	unsigned long zone_start_pfn, zone_end_pfn;
+	unsigned long nr_absent;
 
 	/* When hotadd a new node from cpu_up(), the node should be empty */
 	if (!node_start_pfn && !node_end_pfn)
@@ -5157,7 +5191,39 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
 	adjust_zone_range_for_zone_movable(nid, zone_type,
 			node_start_pfn, node_end_pfn,
 			&zone_start_pfn, &zone_end_pfn);
-	return __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
+	nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
+
+	/*
+	 * ZONE_MOVABLE handling.
+	 * Treat pages to be ZONE_MOVABLE in ZONE_NORMAL as absent pages
+	 * and vice versa.
+	 */
+	if (zone_movable_pfn[nid]) {
+		if (mirrored_kernelcore) {
+			unsigned long start_pfn, end_pfn;
+			struct memblock_region *r;
+
+			for_each_memblock(memory, r) {
+				start_pfn = clamp(memblock_region_memory_base_pfn(r),
+						  zone_start_pfn, zone_end_pfn);
+				end_pfn = clamp(memblock_region_memory_end_pfn(r),
+						zone_start_pfn, zone_end_pfn);
+
+				if (zone_type == ZONE_MOVABLE &&
+				    memblock_is_mirror(r))
+					nr_absent += end_pfn - start_pfn;
+
+				if (zone_type == ZONE_NORMAL &&
+				    !memblock_is_mirror(r))
+					nr_absent += end_pfn - start_pfn;
+			}
+		} else {
+			if (zone_type == ZONE_NORMAL)
+				nr_absent += node_end_pfn - zone_movable_pfn[nid];
+		}
+	}
+
+	return nr_absent;
 }
 
 #else /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
@@ -5665,6 +5731,36 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	}
 
 	/*
+	 * If kernelcore=mirror is specified, ignore movablecore option
+	 */
+	if (mirrored_kernelcore) {
+		bool mem_below_4gb_not_mirrored = false;
+
+		for_each_memblock(memory, r) {
+			if (memblock_is_mirror(r))
+				continue;
+
+			nid = r->nid;
+
+			usable_startpfn = memblock_region_memory_base_pfn(r);
+
+			if (usable_startpfn < 0x100000) {
+				mem_below_4gb_not_mirrored = true;
+				continue;
+			}
+
+			zone_movable_pfn[nid] = zone_movable_pfn[nid] ?
+				min(usable_startpfn, zone_movable_pfn[nid]) :
+				usable_startpfn;
+		}
+
+		if (mem_below_4gb_not_mirrored)
+			pr_warn("This configuration results in unmirrored kernel memory.\n");
+
+		goto out2;
+	}
+
+	/*
 	 * If movablecore=nn[KMGTPE] was specified, calculate what size of
 	 * kernelcore that corresponds so that memory usable for
 	 * any allocation type is evenly spread. If both kernelcore
@@ -5924,6 +6020,12 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
  */
 static int __init cmdline_parse_kernelcore(char *p)
 {
+	/* parse kernelcore=mirror */
+	if (parse_option_str(p, "mirror")) {
+		mirrored_kernelcore = true;
+		return 0;
+	}
+
 	return cmdline_parse_core(p, &required_kernelcore);
 }
 
-- 
1.9.1


* Re: [PATCH v4 2/2] mm/page_alloc.c: introduce kernelcore=mirror option
  2016-01-08  8:26 ` [PATCH v4 2/2] mm/page_alloc.c: introduce kernelcore=mirror option Taku Izumi
@ 2016-01-08 17:02   ` Sudeep Holla
  2016-01-08 23:12     ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Sudeep Holla @ 2016-01-08 17:02 UTC (permalink / raw)
  To: Taku Izumi, linux-kernel, linux-mm, akpm
  Cc: Sudeep Holla, tony.luck, qiuxishi, kamezawa.hiroyu, mel,
	dave.hansen, matt, arnd, steve.capper



On 08/01/16 08:26, Taku Izumi wrote:
> This patch extends existing "kernelcore" option and introduces
> kernelcore=mirror option.  By specifying "mirror" instead of specifying
> the amount of memory, non-mirrored (non-reliable) region will be arranged
> into ZONE_MOVABLE.
>
> v1 -> v2:
>   - Refine so that the following case also can be
>     handled properly:
>
>   Node X:  |MMMMMM------MMMMMM--------|
>     (legend) M: mirrored  -: not mirrrored
>
>   In this case, ZONE_NORMAL and ZONE_MOVABLE are
>   arranged like bellow:
>
>   Node X:  |MMMMMM------MMMMMM--------|
>            |ooooooxxxxxxooooooxxxxxxxx| ZONE_NORMAL
>                  |ooooooxxxxxxoooooooo| ZONE_MOVABLE
>     (legend) o: present  x: absent
>
> v2 -> v3:
>   - Fix build with CONFIG_HAVE_MEMBLOCK_NODE_MAP=n
>   - No functional change in case of CONFIG_HAVE_MEMBLOCK_NODE_MAP=y
>
> Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
> ---
>   Documentation/kernel-parameters.txt |  12 +++-
>   mm/page_alloc.c                     | 114 ++++++++++++++++++++++++++++++++++--
>   2 files changed, 119 insertions(+), 7 deletions(-)
>

[...]

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index efb8996..b528328 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -260,6 +260,7 @@ static unsigned long __meminitdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
>   static unsigned long __initdata required_kernelcore;
>   static unsigned long __initdata required_movablecore;
>   static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
> +static bool mirrored_kernelcore;
>
>   /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
>   int movable_zone;
> @@ -4613,6 +4614,9 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>   	unsigned long pfn;
>   	struct zone *z;
>   	unsigned long nr_initialised = 0;
> +#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
> +	struct memblock_region *r = NULL, *tmp;
> +#endif
>
>   	if (highest_memmap_pfn < end_pfn - 1)
>   		highest_memmap_pfn = end_pfn - 1;
> @@ -4639,6 +4643,40 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>   			if (!update_defer_init(pgdat, pfn, end_pfn,
>   						&nr_initialised))
>   				break;
> +
> +			/*
> +			 * if not mirrored_kernelcore and ZONE_MOVABLE exists,
> +			 * range from zone_movable_pfn[nid] to end of each node
> +			 * should be ZONE_MOVABLE not ZONE_NORMAL. skip it.
> +			 */
> +			if (!mirrored_kernelcore && zone_movable_pfn[nid])
> +				if (zone == ZONE_NORMAL &&
> +				    pfn >= zone_movable_pfn[nid])
> +					continue;
> +

I tried this with today's -next, the above lines gave compilation error.
Moved them below into HAVE_MEMBLOCK_NODE_MAP and tested it on ARM64.
I don't see the previous backtraces. Let me know if that's correct or
you can post a version that compiles correctly and I can give a try.

> +#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
> +			/*
> +			 * check given memblock attribute by firmware which
> +			 * can affect kernel memory layout.
> +			 * if zone==ZONE_MOVABLE but memory is mirrored,
> +			 * it's an overlapped memmap init. skip it.
> +			 */
> +			if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
> +				if (!r ||
> +				    pfn >= memblock_region_memory_end_pfn(r)) {
> +					for_each_memblock(memory, tmp)
> +						if (pfn < memblock_region_memory_end_pfn(tmp))
> +							break;
> +					r = tmp;
> +				}
> +				if (pfn >= memblock_region_memory_base_pfn(r) &&
> +				    memblock_is_mirror(r)) {
> +					/* already initialized as NORMAL */
> +					pfn = memblock_region_memory_end_pfn(r);
> +					continue;
> +				}
> +			}
> +#endif
>   		}

-- 
Regards,
Sudeep


* Re: [PATCH v4 2/2] mm/page_alloc.c: introduce kernelcore=mirror option
  2016-01-08 17:02   ` Sudeep Holla
@ 2016-01-08 23:12     ` Andrew Morton
  2016-01-11  9:56       ` Sudeep Holla
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2016-01-08 23:12 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Taku Izumi, linux-kernel, linux-mm, tony.luck, qiuxishi,
	kamezawa.hiroyu, mel, dave.hansen, matt, arnd, steve.capper

On Fri, 8 Jan 2016 17:02:39 +0000 Sudeep Holla <sudeep.holla@arm.com> wrote:

> > +
> > +			/*
> > +			 * if not mirrored_kernelcore and ZONE_MOVABLE exists,
> > +			 * range from zone_movable_pfn[nid] to end of each node
> > +			 * should be ZONE_MOVABLE not ZONE_NORMAL. skip it.
> > +			 */
> > +			if (!mirrored_kernelcore && zone_movable_pfn[nid])
> > +				if (zone == ZONE_NORMAL &&
> > +				    pfn >= zone_movable_pfn[nid])
> > +					continue;
> > +
> 
> I tried this with today's -next, the above lines gave compilation error.
> Moved them below into HAVE_MEMBLOCK_NODE_MAP and tested it on ARM64.
> I don't see the previous backtraces. Let me know if that's correct or
> you can post a version that compiles correctly and I can give a try.

Thanks.   I'll include the below and shall add your tested-by:, OK?

From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-page_allocc-introduce-kernelcore=mirror-option-fix

fix build with CONFIG_HAVE_MEMBLOCK_NODE_MAP=n

Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN Documentation/kernel-parameters.txt~mm-page_allocc-introduce-kernelcore=mirror-option-fix Documentation/kernel-parameters.txt
diff -puN mm/page_alloc.c~mm-page_allocc-introduce-kernelcore=mirror-option-fix mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_allocc-introduce-kernelcore=mirror-option-fix
+++ a/mm/page_alloc.c
@@ -4627,6 +4627,7 @@ void __meminit memmap_init_zone(unsigned
 						&nr_initialised))
 				break;
 
+#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 			/*
 			 * if not mirrored_kernelcore and ZONE_MOVABLE exists,
 			 * range from zone_movable_pfn[nid] to end of each node
@@ -4637,7 +4638,6 @@ void __meminit memmap_init_zone(unsigned
 				    pfn >= zone_movable_pfn[nid])
 					continue;
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 			/*
 			 * check given memblock attribute by firmware which
 			 * can affect kernel memory layout.
_


* Re: [PATCH v4 2/2] mm/page_alloc.c: introduce kernelcore=mirror option
  2016-01-08 23:12     ` Andrew Morton
@ 2016-01-11  9:56       ` Sudeep Holla
  0 siblings, 0 replies; 7+ messages in thread
From: Sudeep Holla @ 2016-01-11  9:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sudeep Holla, Taku Izumi, linux-kernel, linux-mm, tony.luck,
	qiuxishi, kamezawa.hiroyu, mel, dave.hansen, matt, arnd,
	steve.capper



On 08/01/16 23:12, Andrew Morton wrote:
> On Fri, 8 Jan 2016 17:02:39 +0000 Sudeep Holla <sudeep.holla@arm.com> wrote:
>
>>> +
>>> +			/*
>>> +			 * if not mirrored_kernelcore and ZONE_MOVABLE exists,
>>> +			 * range from zone_movable_pfn[nid] to end of each node
>>> +			 * should be ZONE_MOVABLE not ZONE_NORMAL. skip it.
>>> +			 */
>>> +			if (!mirrored_kernelcore && zone_movable_pfn[nid])
>>> +				if (zone == ZONE_NORMAL &&
>>> +				    pfn >= zone_movable_pfn[nid])
>>> +					continue;
>>> +
>>
>> I tried this with today's -next, the above lines gave compilation error.
>> Moved them below into HAVE_MEMBLOCK_NODE_MAP and tested it on ARM64.
>> I don't see the previous backtraces. Let me know if that's correct or
>> you can post a version that compiles correctly and I can give a try.
>
> Thanks.   I'll include the below and shall add your tested-by:, OK?
>

Yes this is the exact change I tested. Also I retested your latest patch
set with today's -next. So,

Tested-by: Sudeep Holla <sudeep.holla@arm.com>

-- 
Regards,
Sudeep

