From: Tang Chen <tangchen@cn.fujitsu.com>
To: tglx@linutronix.de, mingo@elte.hu, hpa@zytor.com,
	akpm@linux-foundation.org, tj@kernel.org, trenn@suse.de,
	yinghai@kernel.org, jiang.liu@huawei.com, wency@cn.fujitsu.com,
	laijs@cn.fujitsu.com, isimatu.yasuaki@jp.fujitsu.com,
	izumi.taku@jp.fujitsu.com, mgorman@suse.de, minchan@kernel.org,
	mina86@mina86.com, gong.chen@linux.intel.com,
	vasilis.liaskovitis@profitbricks.com, lwoodman@redhat.com,
	riel@redhat.com, jweiner@redhat.com, prarit@redhat.com,
	zhangyanfei@cn.fujitsu.com, yanghy@cn.fujitsu.com
Cc: x86@kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-acpi@vger.kernel.org
Subject: [PATCH 20/21] x86, numa, acpi, memory-hotplug: Make movablecore=acpi have higher priority.
Date: Fri, 19 Jul 2013 15:59:33 +0800
Message-ID: <1374220774-29974-21-git-send-email-tangchen@cn.fujitsu.com>
In-Reply-To: <1374220774-29974-1-git-send-email-tangchen@cn.fujitsu.com>

Arranging hotpluggable memory as ZONE_MOVABLE degrades NUMA performance,
because the kernel itself cannot use movable memory. Users who do not use
memory hotplug and do not want to lose NUMA performance need a way to disable
this functionality, so this patch extends the movablecore boot option.

If users specify the original movablecore=nn@ss boot option, the kernel
arranges [ss, ss+nn) as ZONE_MOVABLE. The kernelcore=nn@ss boot option is
similar, except that it specifies ZONE_NORMAL ranges.

Now, if users specify "movablecore=acpi" on the kernel command line, the kernel
arranges the memory marked hotpluggable in SRAT as ZONE_MOVABLE. In that case,
all other movablecore=nn@ss and kernelcore=nn@ss options are ignored.

Users who do not want this need not specify anything; the kernel then behaves
as before.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
---
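Note (not part of the commit log): the movablecore=acpi branch added to
find_zone_movable_pfns_for_nodes() can be pictured with the user-space sketch
below. It is only an illustration under stated assumptions, not the kernel
code: struct sketch_region, the SKETCH_* constants and both function names are
hypothetical stand-ins for struct memblock_region, MEMBLK_HOTPLUGGABLE,
memblock_is_hotpluggable() and the loop in the patch, and a 4 KiB page size is
assumed. For every hotpluggable reserved region, the lowest start PFN seen for
its node is kept, and 0 means that no start has been recorded yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12    /* assumption: 4 KiB pages */
#define SKETCH_MAX_NODES    4     /* hypothetical node limit */
#define SKETCH_HOTPLUGGABLE 0x1u  /* stand-in for MEMBLK_HOTPLUGGABLE */

/* Simplified, hypothetical stand-in for struct memblock_region. */
struct sketch_region {
	uint64_t base;       /* physical start address */
	uint64_t size;       /* length in bytes (unused by the sketch) */
	unsigned int flags;  /* SKETCH_HOTPLUGGABLE or 0 */
	int nid;             /* NUMA node the region belongs to */
};

/* Mirrors the flag test done by memblock_is_hotpluggable() in the patch. */
static bool sketch_is_hotpluggable(const struct sketch_region *r)
{
	return (r->flags & SKETCH_HOTPLUGGABLE) != 0;
}

/*
 * For each node, record the lowest start PFN among its hotpluggable
 * regions. An entry of 0 means "nothing recorded yet", as in the patch.
 */
static void sketch_find_movable_pfns(const struct sketch_region *regions,
				     size_t cnt,
				     uint64_t movable_pfn[SKETCH_MAX_NODES])
{
	for (size_t i = 0; i < cnt; i++) {
		const struct sketch_region *r = &regions[i];
		uint64_t start_pfn;

		if (!sketch_is_hotpluggable(r))
			continue;

		assert(r->nid >= 0 && r->nid < SKETCH_MAX_NODES);

		/* Equivalent of PFN_DOWN(base) for the assumed page size. */
		start_pfn = r->base >> SKETCH_PAGE_SHIFT;
		if (movable_pfn[r->nid] == 0 || start_pfn < movable_pfn[r->nid])
			movable_pfn[r->nid] = start_pfn;
	}
}
```

Calling sketch_find_movable_pfns() on a zero-initialized array with two
hotpluggable regions on node 1 leaves the lower of the two start PFNs in
movable_pfn[1], and leaves nodes without hotpluggable regions at 0.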
 include/linux/memblock.h |    1 +
 mm/memblock.c            |    5 +++++
 mm/page_alloc.c          |   31 +++++++++++++++++++++++++++++--
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index d520015..28ba511 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -64,6 +64,7 @@ int memblock_free(phys_addr_t base, phys_addr_t size);
 int memblock_reserve(phys_addr_t base, phys_addr_t size);
 int memblock_reserve_hotpluggable(phys_addr_t base, phys_addr_t size, int nid);
 int memblock_reserve_node(phys_addr_t base, phys_addr_t size, int nid);
+bool memblock_is_hotpluggable(struct memblock_region *region);
 void memblock_free_hotpluggable(void);
 void memblock_trim_memory(phys_addr_t align);
 
diff --git a/mm/memblock.c b/mm/memblock.c
index 1f5dc12..fd3ded8 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -616,6 +616,11 @@ int __init_memblock memblock_reserve_hotpluggable(phys_addr_t base,
 	return memblock_reserve_region(base, size, nid, MEMBLK_HOTPLUGGABLE);
 }
 
+bool __init_memblock memblock_is_hotpluggable(struct memblock_region *region)
+{
+	return region->flags & MEMBLK_HOTPLUGGABLE;
+}
+
 /**
  * __next_free_mem_range - next function for for_each_free_mem_range()
  * @idx: pointer to u64 loop variable
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6271c36..cdb7919 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4880,9 +4880,37 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	nodemask_t saved_node_state = node_states[N_MEMORY];
 	unsigned long totalpages = early_calculate_totalpages();
 	int usable_nodes = nodes_weight(node_states[N_MEMORY]);
+	struct memblock_type *reserved = &memblock.reserved;
 
 	/*
-	 * If movablecore was specified, calculate what size of
+	 * Need to find movable_zone earlier in case movablecore=acpi is
+	 * specified.
+	 */
+	find_usable_zone_for_movable();
+
+	/*
+	 * If movablecore=acpi was specified, then zone_movable_pfn[] has been
+	 * initialized, and no more work needs to be done.
+	 * NOTE: In this case, we ignore kernelcore option.
+	 */
+	if (movablecore_enable_srat) {
+		for (i = 0; i < reserved->cnt; i++) {
+			if (!memblock_is_hotpluggable(&reserved->regions[i]))
+				continue;
+
+			nid = reserved->regions[i].nid;
+
+			usable_startpfn = PFN_DOWN(reserved->regions[i].base);
+			zone_movable_pfn[nid] = zone_movable_pfn[nid] ?
+				min(usable_startpfn, zone_movable_pfn[nid]) :
+				usable_startpfn;
+		}
+
+		goto out;
+	}
+
+	/*
+	 * If movablecore=nn[KMG] was specified, calculate what size of
 	 * kernelcore that corresponds so that memory usable for
 	 * any allocation type is evenly spread. If both kernelcore
 	 * and movablecore are specified, then the value of kernelcore
@@ -4908,7 +4936,6 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 		goto out;
 
 	/* usable_startpfn is the lowest possible pfn ZONE_MOVABLE can be at */
-	find_usable_zone_for_movable();
 	usable_startpfn = arch_zone_lowest_possible_pfn[movable_zone];
 
 restart:
-- 
1.7.1
