* [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug
@ 2012-08-02  6:01 Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 01/23 V2] node_states: introduce N_MEMORY Lai Jiangshan
                   ` (23 more replies)
  0 siblings, 24 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel; +Cc: Lai Jiangshan

	A) Introduction:

This patchset adds MOVABLE-dedicated nodes and an online_movable hotplug action
to memory management.

They are useful for anti-fragmentation (hugepages, high-order allocations, ...)
and for memory hot-removal (virtualization, power saving, moving memory between
systems to make better use of it).

	B) Changes from V1:

The original V1 patchset of MOVABLE-dedicated node is here:
http://comments.gmane.org/gmane.linux.kernel.mm/78122

V2 adds N_MEMORY and the notion of a "MOVABLE-dedicated node",
and fixes some related problems.

The original V1 patchset of "add online_movable" is here:
https://lkml.org/lkml/2012/7/4/145

V2 discards the MIGRATE_HOTREMOVE approach and uses a more straightforward
implementation (only one patch).

	C) User Interface:

When users (administrators of large systems) need to configure some nodes or
memory as MOVABLE:
	1 Use kernelcore_max_addr=XX at boot
	2 Use the online_movable hotplug action at run time
We may introduce more convenient interfaces later, such as a
	movable_node=NODE_LIST boot option.
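
From user space, item 2 might look roughly like the sketch below. It is only
an illustration under assumptions: it assumes the action string is written to
the per-block sysfs state file (/sys/devices/system/memory/memoryN/state), it
uses the online_movable name from the patch titles, and the helper name
set_memory_block_state() is hypothetical and not part of this series.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write a hotplug action (for example "online_movable") to a memory
 * block's state file. Returns 0 on success, -1 on failure.
 *
 * Assumption: state_path names a sysfs file such as
 * /sys/devices/system/memory/memoryN/state. Nothing here is kernel code.
 */
int set_memory_block_state(const char *state_path, const char *action)
{
	FILE *f = fopen(state_path, "w");
	size_t len = strlen(action);
	int ok;

	if (!f)
		return -1;

	ok = fwrite(action, 1, len, f) == len;

	/* fclose() flushes the buffer, so a rejected write can surface here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A caller would pass the state file of the block to be onlined; the block
number depends on the machine and is not given here.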

	D) Patches

Patch1        Introduce N_MEMORY
Patch2-13     Use N_MEMORY instead of N_HIGH_MEMORY.
              The patches are separated by subsystem;
              *these conversions were (and must be) checked carefully*.
              Patch13 also changes the node_states initialization
Patch14,15,17 Fix problems in the current code (all related to hotplug)
Patch18       Add config to allow MOVABLE-dedicated node
Patch19-22    Add kernelcore_max_addr
Patch23       Add online_movable


Lai Jiangshan (19):
  node_states: introduce N_MEMORY
  cpuset: use N_MEMORY instead N_HIGH_MEMORY
  procfs: use N_MEMORY instead N_HIGH_MEMORY
  oom: use N_MEMORY instead N_HIGH_MEMORY
  mm,migrate: use N_MEMORY instead N_HIGH_MEMORY
  mempolicy: use N_MEMORY instead N_HIGH_MEMORY
  memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  hugetlb: use N_MEMORY instead N_HIGH_MEMORY
  vmstat: use N_MEMORY instead N_HIGH_MEMORY
  kthread: use N_MEMORY instead N_HIGH_MEMORY
  init: use N_MEMORY instead N_HIGH_MEMORY
  vmscan: use N_MEMORY instead N_HIGH_MEMORY
  page_alloc: use N_MEMORY instead N_HIGH_MEMORY and change the node_states initialization
  slub, hotplug: ignore unrelated node's hot-adding and hot-removing
  memory_hotplug: fix missing nodemask management
  numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
  page_alloc.c: don't subtract unrelated memmap from zone's present pages
  page_alloc: add kernelcore_max_addr
  mm, memory-hotplug: add online_movable

Yasuaki Ishimatsu (4):
  x86: get pg_data_t's memory from other node
  x86: use memblock_set_current_limit() to set memblock.current_limit
  memblock: limit memory address from memblock
  memblock: compare current_limit with end variable at
    memblock_find_in_range_node()

 Documentation/cgroups/cpusets.txt   |    2 +-
 Documentation/kernel-parameters.txt |    9 +++
 Documentation/memory-hotplug.txt    |   16 ++++-
 arch/x86/kernel/setup.c             |    4 +-
 arch/x86/mm/init_64.c               |    4 +-
 arch/x86/mm/numa.c                  |    8 ++-
 drivers/base/memory.c               |   19 +++--
 drivers/base/node.c                 |    8 ++-
 fs/proc/kcore.c                     |    2 +-
 fs/proc/task_mmu.c                  |    4 +-
 include/linux/cpuset.h              |    2 +-
 include/linux/memblock.h            |    1 +
 include/linux/memory_hotplug.h      |   13 +++-
 include/linux/nodemask.h            |    5 ++
 init/main.c                         |    2 +-
 kernel/cpuset.c                     |   32 ++++----
 kernel/kthread.c                    |    2 +-
 mm/Kconfig                          |    8 ++
 mm/hugetlb.c                        |   24 +++---
 mm/memblock.c                       |   10 ++-
 mm/memcontrol.c                     |   18 +++---
 mm/memory_hotplug.c                 |  137 ++++++++++++++++++++++++++++++++---
 mm/mempolicy.c                      |   12 ++--
 mm/migrate.c                        |    2 +-
 mm/oom_kill.c                       |    2 +-
 mm/page_alloc.c                     |   96 +++++++++++++++----------
 mm/page_cgroup.c                    |    2 +-
 mm/slub.c                           |    6 ++
 mm/vmscan.c                         |    4 +-
 mm/vmstat.c                         |    4 +-
 30 files changed, 335 insertions(+), 123 deletions(-)



* [RFC PATCH 01/23 V2] node_states: introduce N_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 02/23 V2] cpuset: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel; +Cc: Lai Jiangshan

We have N_NORMAL_MEMORY to stand for the nodes that have normal memory with
zone_type <= ZONE_NORMAL.

And we have N_HIGH_MEMORY to stand for the nodes that have normal or high
memory.

But we have no name for the nodes that have *any* memory.

And we have N_CPU but no N_MEMORY.

Current code reuses N_HIGH_MEMORY for this purpose, because currently any node
that has memory must have high memory or normal memory.

A)	But this reuse is bad for *readability*, because the name
	N_HIGH_MEMORY only says "high or normal":

A.example 1)
	mem_cgroup_nr_lru_pages():
		for_each_node_state(nid, N_HIGH_MEMORY)

	Readers will be confused (why does this function only count nodes
	with high or normal memory? does it count ZONE_MOVABLE's lru pages?)
	until someone tells them that N_HIGH_MEMORY is reused to stand for
	nodes that have any memory.

A.cont) If we introduce N_MEMORY, we can reduce this confusion
	AND make the code clearer:

A.example 2) mm/page_cgroup.c uses N_HIGH_MEMORY twice:

	The first use is in page_cgroup_init(void):
		for_each_node_state(nid, N_HIGH_MEMORY) {

	It means that if the node has memory, we allocate a page_cgroup map
	for the node. We should use N_MEMORY here instead to gain clarity.

	The second use is in alloc_page_cgroup():
		if (node_state(nid, N_HIGH_MEMORY))
			addr = vzalloc_node(size, nid);

	It means the node has high or normal memory that the kernel can
	allocate from. We should keep N_HIGH_MEMORY here, and it will be
	better once the "any memory" meaning of N_HIGH_MEMORY is removed.

B)	This reuse becomes outdated once we introduce MOVABLE-dedicated nodes.
	A MOVABLE-dedicated node should appear in neither
	node_states[N_HIGH_MEMORY] nor node_states[N_NORMAL_MEMORY],
	because a MOVABLE-dedicated node has no high or normal memory.

	On x86_64, N_HIGH_MEMORY == N_NORMAL_MEMORY, so if a MOVABLE-dedicated
	node is in node_states[N_HIGH_MEMORY], it is also in
	node_states[N_NORMAL_MEMORY], which breaks SLUB.

	SLUB uses
		for_each_node_state(nid, N_NORMAL_MEMORY)
	and creates a kmem_cache_node for the MOVABLE-dedicated node, which
	causes problems.

In short, we need N_MEMORY. For now we just introduce it as an alias of
N_HIGH_MEMORY, and fix all improper uses of N_HIGH_MEMORY in later patches.
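
The effect of the alias can be modelled outside the kernel. The sketch below
is a simplified user-space stand-in, not the kernel's nodemask code: the enum
is reduced to the configuration without highmem, nodemask_t is replaced by a
plain bitmask, MAX_NODES is an illustrative constant, and the helper bodies
are assumptions that only borrow the kernel's names.

```c
#include <stdbool.h>

#define MAX_NODES 8	/* illustrative stand-in for MAX_NUMNODES */

/* Simplified enum for a configuration where highmem is not enabled. */
enum node_states {
	N_POSSIBLE,
	N_ONLINE,
	N_NORMAL_MEMORY,
	N_HIGH_MEMORY = N_NORMAL_MEMORY,
	N_MEMORY = N_HIGH_MEMORY,	/* the alias added by this patch */
	N_CPU,
	NR_NODE_STATES
};

/* One bitmask per state, one bit per node (stand-in for nodemask_t). */
unsigned int node_states[NR_NODE_STATES];

/* Mark node nid as being in the given state. */
void node_set_state(int nid, enum node_states state)
{
	if (nid >= 0 && nid < MAX_NODES)
		node_states[state] |= 1u << nid;
}

/* Test whether node nid is in the given state. */
bool node_state(int nid, enum node_states state)
{
	if (nid < 0 || nid >= MAX_NODES)
		return false;
	return (node_states[state] & (1u << nid)) != 0;
}
```

In this model, calling node_set_state(1, N_MEMORY) also makes
node_state(1, N_NORMAL_MEMORY) true, because all three names index the same
mask. That is the situation described in B), and it is why the alias alone is
not enough for a MOVABLE-dedicated node.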

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 include/linux/nodemask.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 7afc363..c6ebdc9 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -380,6 +380,7 @@ enum node_states {
 #else
 	N_HIGH_MEMORY = N_NORMAL_MEMORY,
 #endif
+	N_MEMORY = N_HIGH_MEMORY,
 	N_CPU,		/* The node has one or more cpus */
 	NR_NODE_STATES
 };
-- 
1.7.1



* [RFC PATCH 02/23 V2] cpuset: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 01/23 V2] node_states: introduce N_MEMORY Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 03/23 V2] procfs: " Lai Jiangshan
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Paul Menage, Rob Landley, linux-doc

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.
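
As an illustration of how the cpuset code consumes this mask, the walk-up in
guarantee_online_mems() can be modelled in user space as below. This is a
simplified sketch under assumptions: nodemask_t is a plain bitmask, struct
cpuset is cut down to two fields, and the function returns the mask instead
of filling *pmask as the kernel function does.

```c
#include <stddef.h>

typedef unsigned int nodemask_t;	/* stand-in: one bit per node */

/* Reduced cpuset: only the fields the walk needs. */
struct cpuset {
	nodemask_t mems_allowed;
	struct cpuset *parent;
};

/*
 * Walk up from cs until a cpuset's mems_allowed overlaps n_memory, the set
 * of nodes that have any memory. If no ancestor overlaps, fall back to
 * n_memory itself.
 */
nodemask_t guarantee_online_mems(const struct cpuset *cs, nodemask_t n_memory)
{
	while (cs && !(cs->mems_allowed & n_memory))
		cs = cs->parent;

	return cs ? (cs->mems_allowed & n_memory) : n_memory;
}
```

With N_MEMORY as the second argument, a cpuset that only lists
MOVABLE-dedicated nodes would still resolve to its own nodes instead of
walking up to a parent.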

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/cgroups/cpusets.txt |    2 +-
 include/linux/cpuset.h            |    2 +-
 kernel/cpuset.c                   |   32 ++++++++++++++++----------------
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt
index cefd3d8..12e01d4 100644
--- a/Documentation/cgroups/cpusets.txt
+++ b/Documentation/cgroups/cpusets.txt
@@ -218,7 +218,7 @@ and name space for cpusets, with a minimum of additional kernel code.
 The cpus and mems files in the root (top_cpuset) cpuset are
 read-only.  The cpus file automatically tracks the value of
 cpu_online_mask using a CPU hotplug notifier, and the mems file
-automatically tracks the value of node_states[N_HIGH_MEMORY]--i.e.,
+automatically tracks the value of node_states[N_MEMORY]--i.e.,
 nodes with memory--using the cpuset_track_online_nodes() hook.
 
 
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 838320f..8c8a60d 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -144,7 +144,7 @@ static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
 	return node_possible_map;
 }
 
-#define cpuset_current_mems_allowed (node_states[N_HIGH_MEMORY])
+#define cpuset_current_mems_allowed (node_states[N_MEMORY])
 static inline void cpuset_init_current_mems_allowed(void) {}
 
 static inline int cpuset_nodemask_valid_mems_allowed(nodemask_t *nodemask)
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index f33c715..2b133db 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -302,10 +302,10 @@ static void guarantee_online_cpus(const struct cpuset *cs,
  * are online, with memory.  If none are online with memory, walk
  * up the cpuset hierarchy until we find one that does have some
  * online mems.  If we get all the way to the top and still haven't
- * found any online mems, return node_states[N_HIGH_MEMORY].
+ * found any online mems, return node_states[N_MEMORY].
  *
  * One way or another, we guarantee to return some non-empty subset
- * of node_states[N_HIGH_MEMORY].
+ * of node_states[N_MEMORY].
  *
  * Call with callback_mutex held.
  */
@@ -313,14 +313,14 @@ static void guarantee_online_cpus(const struct cpuset *cs,
 static void guarantee_online_mems(const struct cpuset *cs, nodemask_t *pmask)
 {
 	while (cs && !nodes_intersects(cs->mems_allowed,
-					node_states[N_HIGH_MEMORY]))
+					node_states[N_MEMORY]))
 		cs = cs->parent;
 	if (cs)
 		nodes_and(*pmask, cs->mems_allowed,
-					node_states[N_HIGH_MEMORY]);
+					node_states[N_MEMORY]);
 	else
-		*pmask = node_states[N_HIGH_MEMORY];
-	BUG_ON(!nodes_intersects(*pmask, node_states[N_HIGH_MEMORY]));
+		*pmask = node_states[N_MEMORY];
+	BUG_ON(!nodes_intersects(*pmask, node_states[N_MEMORY]));
 }
 
 /*
@@ -1100,7 +1100,7 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 		return -ENOMEM;
 
 	/*
-	 * top_cpuset.mems_allowed tracks node_stats[N_HIGH_MEMORY];
+	 * top_cpuset.mems_allowed tracks node_stats[N_MEMORY];
 	 * it's read-only
 	 */
 	if (cs == &top_cpuset) {
@@ -1122,7 +1122,7 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 			goto done;
 
 		if (!nodes_subset(trialcs->mems_allowed,
-				node_states[N_HIGH_MEMORY])) {
+				node_states[N_MEMORY])) {
 			retval =  -EINVAL;
 			goto done;
 		}
@@ -2034,7 +2034,7 @@ static struct cpuset *cpuset_next(struct list_head *queue)
  * before dropping down to the next.  It always processes a node before
  * any of its children.
  *
- * In the case of memory hot-unplug, it will remove nodes from N_HIGH_MEMORY
+ * In the case of memory hot-unplug, it will remove nodes from N_MEMORY
  * if all present pages from a node are offlined.
  */
 static void
@@ -2073,7 +2073,7 @@ scan_cpusets_upon_hotplug(struct cpuset *root, enum hotplug_event event)
 
 			/* Continue past cpusets with all mems online */
 			if (nodes_subset(cp->mems_allowed,
-					node_states[N_HIGH_MEMORY]))
+					node_states[N_MEMORY]))
 				continue;
 
 			oldmems = cp->mems_allowed;
@@ -2081,7 +2081,7 @@ scan_cpusets_upon_hotplug(struct cpuset *root, enum hotplug_event event)
 			/* Remove offline mems from this cpuset. */
 			mutex_lock(&callback_mutex);
 			nodes_and(cp->mems_allowed, cp->mems_allowed,
-						node_states[N_HIGH_MEMORY]);
+						node_states[N_MEMORY]);
 			mutex_unlock(&callback_mutex);
 
 			/* Move tasks from the empty cpuset to a parent */
@@ -2134,8 +2134,8 @@ void cpuset_update_active_cpus(bool cpu_online)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 /*
- * Keep top_cpuset.mems_allowed tracking node_states[N_HIGH_MEMORY].
- * Call this routine anytime after node_states[N_HIGH_MEMORY] changes.
+ * Keep top_cpuset.mems_allowed tracking node_states[N_MEMORY].
+ * Call this routine anytime after node_states[N_MEMORY] changes.
  * See cpuset_update_active_cpus() for CPU hotplug handling.
  */
 static int cpuset_track_online_nodes(struct notifier_block *self,
@@ -2148,7 +2148,7 @@ static int cpuset_track_online_nodes(struct notifier_block *self,
 	case MEM_ONLINE:
 		oldmems = top_cpuset.mems_allowed;
 		mutex_lock(&callback_mutex);
-		top_cpuset.mems_allowed = node_states[N_HIGH_MEMORY];
+		top_cpuset.mems_allowed = node_states[N_MEMORY];
 		mutex_unlock(&callback_mutex);
 		update_tasks_nodemask(&top_cpuset, &oldmems, NULL);
 		break;
@@ -2177,7 +2177,7 @@ static int cpuset_track_online_nodes(struct notifier_block *self,
 void __init cpuset_init_smp(void)
 {
 	cpumask_copy(top_cpuset.cpus_allowed, cpu_active_mask);
-	top_cpuset.mems_allowed = node_states[N_HIGH_MEMORY];
+	top_cpuset.mems_allowed = node_states[N_MEMORY];
 
 	hotplug_memory_notifier(cpuset_track_online_nodes, 10);
 
@@ -2245,7 +2245,7 @@ void cpuset_init_current_mems_allowed(void)
  *
  * Description: Returns the nodemask_t mems_allowed of the cpuset
  * attached to the specified @tsk.  Guaranteed to return some non-empty
- * subset of node_states[N_HIGH_MEMORY], even if this means going outside the
+ * subset of node_states[N_MEMORY], even if this means going outside the
  * tasks cpuset.
  **/
 
-- 
1.7.1



* [RFC PATCH 03/23 V2] procfs: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 01/23 V2] node_states: introduce N_MEMORY Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 02/23 V2] cpuset: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 04/23 V2] oom: " Lai Jiangshan
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Laura Vasilescu, Jiri Kosina,
	WANG Cong, Djalal Harouni, Hugh Dickins, Naoya Horiguchi,
	David Rientjes, Konstantin Khlebnikov

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 fs/proc/kcore.c    |    2 +-
 fs/proc/task_mmu.c |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 86c67ee..e96d4f1 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -249,7 +249,7 @@ static int kcore_update_ram(void)
 	/* Not inialized....update now */
 	/* find out "max pfn" */
 	end_pfn = 0;
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long node_end;
 		node_end  = NODE_DATA(nid)->node_start_pfn +
 			NODE_DATA(nid)->node_spanned_pages;
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4540b8f..ed3d381 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1080,7 +1080,7 @@ static struct page *can_gather_numa_stats(pte_t pte, struct vm_area_struct *vma,
 		return NULL;
 
 	nid = page_to_nid(page);
-	if (!node_isset(nid, node_states[N_HIGH_MEMORY]))
+	if (!node_isset(nid, node_states[N_MEMORY]))
 		return NULL;
 
 	return page;
@@ -1232,7 +1232,7 @@ static int show_numa_map(struct seq_file *m, void *v, int is_pid)
 	if (md->writeback)
 		seq_printf(m, " writeback=%lu", md->writeback);
 
-	for_each_node_state(n, N_HIGH_MEMORY)
+	for_each_node_state(n, N_MEMORY)
 		if (md->node[n])
 			seq_printf(m, " N%d=%lu", n, md->node[n]);
 out:
-- 
1.7.1



* [RFC PATCH 04/23 V2] oom: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (2 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 03/23 V2] procfs: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 05/23 V2] mm,migrate: " Lai Jiangshan
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, David Rientjes, KAMEZAWA Hiroyuki,
	Michal Hocko, KOSAKI Motohiro, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/oom_kill.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ac300c9..1e58f12 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -257,7 +257,7 @@ static enum oom_constraint constrained_alloc(struct zonelist *zonelist,
 	 * the page allocator means a mempolicy is in effect.  Cpuset policy
 	 * is enforced in get_page_from_freelist().
 	 */
-	if (nodemask && !nodes_subset(node_states[N_HIGH_MEMORY], *nodemask)) {
+	if (nodemask && !nodes_subset(node_states[N_MEMORY], *nodemask)) {
 		*totalpages = total_swap_pages;
 		for_each_node_mask(nid, *nodemask)
 			*totalpages += node_spanned_pages(nid);
-- 
1.7.1



* [RFC PATCH 05/23 V2] mm,migrate: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (3 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 04/23 V2] oom: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02 16:09   ` Christoph Lameter
  2012-08-02  6:01 ` [RFC PATCH 06/23 V2] mempolicy: " Lai Jiangshan
                   ` (18 subsequent siblings)
  23 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Hugh Dickins, Mel Gorman,
	Wang Sheng-Hui, Christoph Lameter, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/migrate.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index be26d5c..dbe4f86 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1226,7 +1226,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 			if (node < 0 || node >= MAX_NUMNODES)
 				goto out_pm;
 
-			if (!node_state(node, N_HIGH_MEMORY))
+			if (!node_state(node, N_MEMORY))
 				goto out_pm;
 
 			err = -EACCES;
-- 
1.7.1



* [RFC PATCH 06/23 V2] mempolicy: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (4 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 05/23 V2] mm,migrate: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 07/23 V2] memcontrol: " Lai Jiangshan
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Mel Gorman, David Rientjes,
	Rik van Riel, KOSAKI Motohiro, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.
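
Several of the call sites converted below are validity checks of the form
"the requested nodes must all have memory". A minimal user-space model of
that check, with a plain bitmask standing in for nodemask_t and a
hypothetical helper name check_target_nodes(), might look like this:

```c
#include <errno.h>

typedef unsigned int nodemask_t;	/* stand-in: one bit per node */

/* Nonzero if every node set in a is also set in b. */
int nodes_subset(nodemask_t a, nodemask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Reject a request that names a node without memory. n_memory plays the
 * role of node_states[N_MEMORY]. Hypothetical helper, not a kernel function.
 */
int check_target_nodes(nodemask_t requested, nodemask_t n_memory)
{
	return nodes_subset(requested, n_memory) ? 0 : -EINVAL;
}
```

Once N_MEMORY stops being an alias, a check written against N_HIGH_MEMORY
would start rejecting MOVABLE-dedicated nodes, which is the reason for
converting these sites.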

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/mempolicy.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 1d771e4..ad0381d 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -212,9 +212,9 @@ static int mpol_set_nodemask(struct mempolicy *pol,
 	/* if mode is MPOL_DEFAULT, pol is NULL. This is right. */
 	if (pol == NULL)
 		return 0;
-	/* Check N_HIGH_MEMORY */
+	/* Check N_MEMORY */
 	nodes_and(nsc->mask1,
-		  cpuset_current_mems_allowed, node_states[N_HIGH_MEMORY]);
+		  cpuset_current_mems_allowed, node_states[N_MEMORY]);
 
 	VM_BUG_ON(!nodes);
 	if (pol->mode == MPOL_PREFERRED && nodes_empty(*nodes))
@@ -1363,7 +1363,7 @@ SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned long, maxnode,
 		goto out_put;
 	}
 
-	if (!nodes_subset(*new, node_states[N_HIGH_MEMORY])) {
+	if (!nodes_subset(*new, node_states[N_MEMORY])) {
 		err = -EINVAL;
 		goto out_put;
 	}
@@ -2314,7 +2314,7 @@ void __init numa_policy_init(void)
 	 * fall back to the largest node if they're all smaller.
 	 */
 	nodes_clear(interleave_nodes);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long total_pages = node_present_pages(nid);
 
 		/* Preserve the largest node */
@@ -2395,7 +2395,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
 		*nodelist++ = '\0';
 		if (nodelist_parse(nodelist, nodes))
 			goto out;
-		if (!nodes_subset(nodes, node_states[N_HIGH_MEMORY]))
+		if (!nodes_subset(nodes, node_states[N_MEMORY]))
 			goto out;
 	} else
 		nodes_clear(nodes);
@@ -2429,7 +2429,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
 		 * Default to online nodes with memory if no nodelist
 		 */
 		if (!nodelist)
-			nodes = node_states[N_HIGH_MEMORY];
+			nodes = node_states[N_MEMORY];
 		break;
 	case MPOL_LOCAL:
 		/*
-- 
1.7.1



* [RFC PATCH 07/23 V2] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (5 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 06/23 V2] mempolicy: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 08/23 V2] hugetlb: " Lai Jiangshan
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Johannes Weiner, Michal Hocko, Balbir Singh,
	KAMEZAWA Hiroyuki, Tejun Heo, Li Zefan, cgroups, linux-mm,
	containers

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memcontrol.c  |   18 +++++++++---------
 mm/page_cgroup.c |    2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f72b5e5..4402c2e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -797,7 +797,7 @@ static unsigned long mem_cgroup_nr_lru_pages(struct mem_cgroup *memcg,
 	int nid;
 	u64 total = 0;
 
-	for_each_node_state(nid, N_HIGH_MEMORY)
+	for_each_node_state(nid, N_MEMORY)
 		total += mem_cgroup_node_nr_lru_pages(memcg, nid, lru_mask);
 	return total;
 }
@@ -1549,9 +1549,9 @@ static void mem_cgroup_may_update_nodemask(struct mem_cgroup *memcg)
 		return;
 
 	/* make a nodemask where this memcg uses memory from */
-	memcg->scan_nodes = node_states[N_HIGH_MEMORY];
+	memcg->scan_nodes = node_states[N_MEMORY];
 
-	for_each_node_mask(nid, node_states[N_HIGH_MEMORY]) {
+	for_each_node_mask(nid, node_states[N_MEMORY]) {
 
 		if (!test_mem_cgroup_node_reclaimable(memcg, nid, false))
 			node_clear(nid, memcg->scan_nodes);
@@ -1622,7 +1622,7 @@ static bool mem_cgroup_reclaimable(struct mem_cgroup *memcg, bool noswap)
 	/*
 	 * Check rest of nodes.
 	 */
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		if (node_isset(nid, memcg->scan_nodes))
 			continue;
 		if (test_mem_cgroup_node_reclaimable(memcg, nid, noswap))
@@ -3700,7 +3700,7 @@ move_account:
 		drain_all_stock_sync(memcg);
 		ret = 0;
 		mem_cgroup_start_move(memcg);
-		for_each_node_state(node, N_HIGH_MEMORY) {
+		for_each_node_state(node, N_MEMORY) {
 			for (zid = 0; !ret && zid < MAX_NR_ZONES; zid++) {
 				enum lru_list lru;
 				for_each_lru(lru) {
@@ -4025,7 +4025,7 @@ static int mem_control_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	total_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL);
 	seq_printf(m, "total=%lu", total_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid, LRU_ALL);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
 	}
@@ -4033,7 +4033,7 @@ static int mem_control_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	file_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL_FILE);
 	seq_printf(m, "file=%lu", file_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				LRU_ALL_FILE);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
@@ -4042,7 +4042,7 @@ static int mem_control_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	anon_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL_ANON);
 	seq_printf(m, "anon=%lu", anon_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				LRU_ALL_ANON);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
@@ -4051,7 +4051,7 @@ static int mem_control_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	unevictable_nr = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_UNEVICTABLE));
 	seq_printf(m, "unevictable=%lu", unevictable_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				BIT(LRU_UNEVICTABLE));
 		seq_printf(m, " N%d=%lu", nid, node_nr);
diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index eb750f8..e775239 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -271,7 +271,7 @@ void __init page_cgroup_init(void)
 	if (mem_cgroup_disabled())
 		return;
 
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long start_pfn, end_pfn;
 
 		start_pfn = node_start_pfn(nid);
-- 
1.7.1



* [RFC PATCH 08/23 V2] hugetlb: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (6 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 07/23 V2] memcontrol: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-04 14:02   ` Hillf Danton
  2012-08-02  6:01 ` [RFC PATCH 09/23 V2] vmstat: " Lai Jiangshan
                   ` (15 subsequent siblings)
  23 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Greg Kroah-Hartman, Andrew Morton, Hillf Danton,
	Michal Hocko, KAMEZAWA Hiroyuki, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes that have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 drivers/base/node.c |    2 +-
 mm/hugetlb.c        |   24 ++++++++++++------------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index af1a177..31f4805 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -227,7 +227,7 @@ static node_registration_func_t __hugetlb_unregister_node;
 static inline bool hugetlb_register_node(struct node *node)
 {
 	if (__hugetlb_register_node &&
-			node_state(node->dev.id, N_HIGH_MEMORY)) {
+			node_state(node->dev.id, N_MEMORY)) {
 		__hugetlb_register_node(node);
 		return true;
 	}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e198831..661db47 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1046,7 +1046,7 @@ static void return_unused_surplus_pages(struct hstate *h,
 	 * on-line nodes with memory and will handle the hstate accounting.
 	 */
 	while (nr_pages--) {
-		if (!free_pool_huge_page(h, &node_states[N_HIGH_MEMORY], 1))
+		if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
 			break;
 	}
 }
@@ -1150,14 +1150,14 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 int __weak alloc_bootmem_huge_page(struct hstate *h)
 {
 	struct huge_bootmem_page *m;
-	int nr_nodes = nodes_weight(node_states[N_HIGH_MEMORY]);
+	int nr_nodes = nodes_weight(node_states[N_MEMORY]);
 
 	while (nr_nodes) {
 		void *addr;
 
 		addr = __alloc_bootmem_node_nopanic(
 				NODE_DATA(hstate_next_node_to_alloc(h,
-						&node_states[N_HIGH_MEMORY])),
+						&node_states[N_MEMORY])),
 				huge_page_size(h), huge_page_size(h), 0);
 
 		if (addr) {
@@ -1229,7 +1229,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 			if (!alloc_bootmem_huge_page(h))
 				break;
 		} else if (!alloc_fresh_huge_page(h,
-					 &node_states[N_HIGH_MEMORY]))
+					 &node_states[N_MEMORY]))
 			break;
 	}
 	h->max_huge_pages = i;
@@ -1497,7 +1497,7 @@ static ssize_t nr_hugepages_store_common(bool obey_mempolicy,
 		if (!(obey_mempolicy &&
 				init_nodemask_of_mempolicy(nodes_allowed))) {
 			NODEMASK_FREE(nodes_allowed);
-			nodes_allowed = &node_states[N_HIGH_MEMORY];
+			nodes_allowed = &node_states[N_MEMORY];
 		}
 	} else if (nodes_allowed) {
 		/*
@@ -1507,11 +1507,11 @@ static ssize_t nr_hugepages_store_common(bool obey_mempolicy,
 		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
 		init_nodemask_of_node(nodes_allowed, nid);
 	} else
-		nodes_allowed = &node_states[N_HIGH_MEMORY];
+		nodes_allowed = &node_states[N_MEMORY];
 
 	h->max_huge_pages = set_max_huge_pages(h, count, nodes_allowed);
 
-	if (nodes_allowed != &node_states[N_HIGH_MEMORY])
+	if (nodes_allowed != &node_states[N_MEMORY])
 		NODEMASK_FREE(nodes_allowed);
 
 	return len;
@@ -1812,7 +1812,7 @@ static void hugetlb_register_all_nodes(void)
 {
 	int nid;
 
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		struct node *node = &node_devices[nid];
 		if (node->dev.id == nid)
 			hugetlb_register_node(node);
@@ -1906,8 +1906,8 @@ void __init hugetlb_add_hstate(unsigned order)
 	h->free_huge_pages = 0;
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
-	h->next_nid_to_alloc = first_node(node_states[N_HIGH_MEMORY]);
-	h->next_nid_to_free = first_node(node_states[N_HIGH_MEMORY]);
+	h->next_nid_to_alloc = first_node(node_states[N_MEMORY]);
+	h->next_nid_to_free = first_node(node_states[N_MEMORY]);
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/1024);
 
@@ -1995,11 +1995,11 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 		if (!(obey_mempolicy &&
 			       init_nodemask_of_mempolicy(nodes_allowed))) {
 			NODEMASK_FREE(nodes_allowed);
-			nodes_allowed = &node_states[N_HIGH_MEMORY];
+			nodes_allowed = &node_states[N_MEMORY];
 		}
 		h->max_huge_pages = set_max_huge_pages(h, tmp, nodes_allowed);
 
-		if (nodes_allowed != &node_states[N_HIGH_MEMORY])
+		if (nodes_allowed != &node_states[N_MEMORY])
 			NODEMASK_FREE(nodes_allowed);
 	}
 out:
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 09/23 V2] vmstat: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (7 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 08/23 V2] hugetlb: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02 16:09   ` Christoph Lameter
  2012-08-02  6:01 ` [RFC PATCH 10/23 V2] kthread: " Lai Jiangshan
                   ` (14 subsequent siblings)
  23 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Christoph Lameter,
	KAMEZAWA Hiroyuki, David Rientjes, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/vmstat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1bbbbd9..aa3da12 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -917,7 +917,7 @@ static int pagetypeinfo_show(struct seq_file *m, void *arg)
 	pg_data_t *pgdat = (pg_data_t *)arg;
 
 	/* check memoryless node */
-	if (!node_state(pgdat->node_id, N_HIGH_MEMORY))
+	if (!node_state(pgdat->node_id, N_MEMORY))
 		return 0;
 
 	seq_printf(m, "Page block order: %d\n", pageblock_order);
@@ -1279,7 +1279,7 @@ static int unusable_show(struct seq_file *m, void *arg)
 	pg_data_t *pgdat = (pg_data_t *)arg;
 
 	/* check memoryless node */
-	if (!node_state(pgdat->node_id, N_HIGH_MEMORY))
+	if (!node_state(pgdat->node_id, N_MEMORY))
 		return 0;
 
 	walk_zones_in_node(m, pgdat, unusable_show_print);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 10/23 V2] kthread: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (8 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 09/23 V2] vmstat: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 11/23 V2] init: " Lai Jiangshan
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Tejun Heo, Paul Gortmaker,
	Henrique de Moraes Holschuh, Oleg Nesterov

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/kthread.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 3d3de63..4139962 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -280,7 +280,7 @@ int kthreadd(void *unused)
 	set_task_comm(tsk, "kthreadd");
 	ignore_signals(tsk);
 	set_cpus_allowed_ptr(tsk, cpu_all_mask);
-	set_mems_allowed(node_states[N_HIGH_MEMORY]);
+	set_mems_allowed(node_states[N_MEMORY]);
 
 	current->flags |= PF_NOFREEZE;
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 11/23 V2] init: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (9 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 10/23 V2] kthread: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 12/23 V2] vmscan: " Lai Jiangshan
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rusty Russell, Ingo Molnar, Peter Zijlstra,
	Jim Cromie, Pawel Moll

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 init/main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/init/main.c b/init/main.c
index 4121d1f..c9317aa 100644
--- a/init/main.c
+++ b/init/main.c
@@ -846,7 +846,7 @@ static int __init kernel_init(void * unused)
 	/*
 	 * init can allocate pages on any node
 	 */
-	set_mems_allowed(node_states[N_HIGH_MEMORY]);
+	set_mems_allowed(node_states[N_MEMORY]);
 	/*
 	 * init can run on any cpu.
 	 */
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 12/23 V2] vmscan: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (10 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 11/23 V2] init: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 13/23 V2] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, KAMEZAWA Hiroyuki, Hugh Dickins,
	Minchan Kim, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/vmscan.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 66e4310..1888026 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2921,7 +2921,7 @@ static int __devinit cpu_callback(struct notifier_block *nfb,
 	int nid;
 
 	if (action == CPU_ONLINE || action == CPU_ONLINE_FROZEN) {
-		for_each_node_state(nid, N_HIGH_MEMORY) {
+		for_each_node_state(nid, N_MEMORY) {
 			pg_data_t *pgdat = NODE_DATA(nid);
 			const struct cpumask *mask;
 
@@ -2976,7 +2976,7 @@ static int __init kswapd_init(void)
 	int nid;
 
 	swap_setup();
-	for_each_node_state(nid, N_HIGH_MEMORY)
+	for_each_node_state(nid, N_MEMORY)
  		kswapd_run(nid);
 	hotcpu_notifier(cpu_callback, 0);
 	return 0;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 13/23 V2] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (11 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 12/23 V2] vmscan: " Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 14/23 V2] slub, hotplug: ignore unrelated node's hot-adding and hot-removing Lai Jiangshan
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Tejun Heo, Pekka Enberg, Yinghai Lu, David Rientjes,
	Andrew Morton, Michal Hocko, KAMEZAWA Hiroyuki, Minchan Kim,
	linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Since we introduced N_MEMORY, we update the initialization of node_states.
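
As a rough user-space illustration of the node-state semantics described above, the sketch below classifies one node from its per-zone present-page counts. It loosely mirrors check_for_memory() from this patch, but `struct node_flags` and `classify_node()` are hypothetical names, and the zone layout assumes a configuration in which N_NORMAL_MEMORY, N_HIGH_MEMORY and N_MEMORY are all distinct states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Zone order as with CONFIG_HIGHMEM enabled (assumption for this sketch). */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE, MAX_NR_ZONES };

/* Hypothetical per-node stand-in for the node_states[] bitmaps. */
struct node_flags {
	bool has_normal_memory;	/* models N_NORMAL_MEMORY */
	bool has_high_memory;	/* models N_HIGH_MEMORY */
	bool has_memory;	/* models N_MEMORY */
};

struct node_flags classify_node(const unsigned long *present_pages)
{
	struct node_flags f = { false, false, false };
	int zt;

	assert(present_pages != NULL);

	/* N_MEMORY: any populated zone counts, including ZONE_MOVABLE. */
	for (zt = 0; zt < MAX_NR_ZONES; zt++)
		if (present_pages[zt])
			f.has_memory = true;

	/*
	 * Below ZONE_MOVABLE, only the lowest populated zone decides,
	 * following the loop in check_for_memory().
	 */
	for (zt = 0; zt < ZONE_MOVABLE; zt++) {
		if (present_pages[zt]) {
			f.has_high_memory = true;
			if (zt <= ZONE_NORMAL)
				f.has_normal_memory = true;
			break;
		}
	}
	return f;
}
```

In this model, a node whose only populated zone is ZONE_MOVABLE ends up with has_memory set and the other two flags clear, which is the MOVABLE-dedicated case the series is about.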

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 arch/x86/mm/init_64.c |    4 +++-
 mm/page_alloc.c       |   40 ++++++++++++++++++++++------------------
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2b6b4a3..005f00c 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -625,7 +625,9 @@ void __init paging_init(void)
 	 *	 numa support is not compiled in, and later node_set_state
 	 *	 will not set it back.
 	 */
-	node_clear_state(0, N_NORMAL_MEMORY);
+	node_clear_state(0, N_MEMORY);
+	if (N_MEMORY != N_NORMAL_MEMORY)
+		node_clear_state(0, N_NORMAL_MEMORY);
 
 	zone_sizes_init();
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4a4f921..0571f2a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1646,7 +1646,7 @@ bool zone_watermark_ok_safe(struct zone *z, int order, unsigned long mark,
  *
  * If the zonelist cache is present in the passed in zonelist, then
  * returns a pointer to the allowed node mask (either the current
- * tasks mems_allowed, or node_states[N_HIGH_MEMORY].)
+ * tasks mems_allowed, or node_states[N_MEMORY].)
  *
  * If the zonelist cache is not available for this zonelist, does
  * nothing and returns NULL.
@@ -1675,7 +1675,7 @@ static nodemask_t *zlc_setup(struct zonelist *zonelist, int alloc_flags)
 
 	allowednodes = !in_interrupt() && (alloc_flags & ALLOC_CPUSET) ?
 					&cpuset_current_mems_allowed :
-					&node_states[N_HIGH_MEMORY];
+					&node_states[N_MEMORY];
 	return allowednodes;
 }
 
@@ -3070,7 +3070,7 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
 		return node;
 	}
 
-	for_each_node_state(n, N_HIGH_MEMORY) {
+	for_each_node_state(n, N_MEMORY) {
 
 		/* Don't want a node to appear more than once */
 		if (node_isset(n, *used_node_mask))
@@ -3212,7 +3212,7 @@ static int default_zonelist_order(void)
  	 * local memory, NODE_ORDER may be suitable.
          */
 	average_size = total_size /
-				(nodes_weight(node_states[N_HIGH_MEMORY]) + 1);
+				(nodes_weight(node_states[N_MEMORY]) + 1);
 	for_each_online_node(nid) {
 		low_kmem_size = 0;
 		total_size = 0;
@@ -4587,7 +4587,7 @@ unsigned long __init find_min_pfn_with_active_regions(void)
 /*
  * early_calculate_totalpages()
  * Sum pages in active regions for movable zone.
- * Populate N_HIGH_MEMORY for calculating usable_nodes.
+ * Populate N_MEMORY for calculating usable_nodes.
  */
 static unsigned long __init early_calculate_totalpages(void)
 {
@@ -4600,7 +4600,7 @@ static unsigned long __init early_calculate_totalpages(void)
 
 		totalpages += pages;
 		if (pages)
-			node_set_state(nid, N_HIGH_MEMORY);
+			node_set_state(nid, N_MEMORY);
 	}
   	return totalpages;
 }
@@ -4617,9 +4617,9 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	unsigned long usable_startpfn;
 	unsigned long kernelcore_node, kernelcore_remaining;
 	/* save the state before borrow the nodemask */
-	nodemask_t saved_node_state = node_states[N_HIGH_MEMORY];
+	nodemask_t saved_node_state = node_states[N_MEMORY];
 	unsigned long totalpages = early_calculate_totalpages();
-	int usable_nodes = nodes_weight(node_states[N_HIGH_MEMORY]);
+	int usable_nodes = nodes_weight(node_states[N_MEMORY]);
 
 	/*
 	 * If movablecore was specified, calculate what size of
@@ -4654,7 +4654,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 restart:
 	/* Spread kernelcore memory as evenly as possible throughout nodes */
 	kernelcore_node = required_kernelcore / usable_nodes;
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long start_pfn, end_pfn;
 
 		/*
@@ -4746,23 +4746,27 @@ restart:
 
 out:
 	/* restore the node_state */
-	node_states[N_HIGH_MEMORY] = saved_node_state;
+	node_states[N_MEMORY] = saved_node_state;
 }
 
-/* Any regular memory on that node ? */
-static void check_for_regular_memory(pg_data_t *pgdat)
+/* Any regular or high memory on that node ? */
+static void check_for_memory(pg_data_t *pgdat, int nid)
 {
-#ifdef CONFIG_HIGHMEM
 	enum zone_type zone_type;
 
-	for (zone_type = 0; zone_type <= ZONE_NORMAL; zone_type++) {
+	if (N_MEMORY == N_NORMAL_MEMORY)
+		return;
+
+	for (zone_type = 0; zone_type <= ZONE_MOVABLE - 1; zone_type++) {
 		struct zone *zone = &pgdat->node_zones[zone_type];
 		if (zone->present_pages) {
-			node_set_state(zone_to_nid(zone), N_NORMAL_MEMORY);
+			node_set_state(nid, N_HIGH_MEMORY);
+			if (N_NORMAL_MEMORY != N_HIGH_MEMORY &&
+			    zone_type <= ZONE_NORMAL)
+				node_set_state(nid, N_NORMAL_MEMORY);
 			break;
 		}
 	}
-#endif
 }
 
 /**
@@ -4845,8 +4849,8 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 
 		/* Any memory on that node */
 		if (pgdat->node_present_pages)
-			node_set_state(nid, N_HIGH_MEMORY);
-		check_for_regular_memory(pgdat);
+			node_set_state(nid, N_MEMORY);
+		check_for_memory(pgdat, nid);
 	}
 }
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 14/23 V2] slub, hotplug: ignore unrelated node's hot-adding and hot-removing
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (12 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 13/23 V2] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 15/23 V2] memory_hotplug: fix missing nodemask management Lai Jiangshan
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Christoph Lameter, Pekka Enberg, Matt Mackall, linux-mm

SLUB only focuses on the nodes which have normal memory, so ignore the
hot-adding and hot-removing of other nodes.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/slub.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 8c691fa..4c5bdc0 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3577,6 +3577,9 @@ static void slab_mem_offline_callback(void *arg)
 	if (offline_node < 0)
 		return;
 
+	if (page_zonenum(pfn_to_page(marg->start_pfn)) > ZONE_NORMAL)
+		return;
+
 	down_read(&slub_lock);
 	list_for_each_entry(s, &slab_caches, list) {
 		n = get_node(s, offline_node);
@@ -3611,6 +3614,9 @@ static int slab_mem_going_online_callback(void *arg)
 	if (nid < 0)
 		return 0;
 
+	if (page_zonenum(pfn_to_page(marg->start_pfn)) > ZONE_NORMAL)
+		return 0;
+
 	/*
 	 * We are bringing a node online. No memory is available yet. We must
 	 * allocate a kmem_cache_node structure in order to bring the node
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 15/23 V2] memory_hotplug: fix missing nodemask management
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (13 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 14/23 V2] slub, hotplug: ignore unrelated node's hot-adding and hot-removing Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 16/23 V2] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rob Landley, Andrew Morton, Paul Gortmaker,
	Bjorn Helgaas, David Rientjes, Wen Congyang, linux-doc, linux-mm

Currently memory_hotplug only manages node_states[N_HIGH_MEMORY];
it forgets to manage node_states[N_NORMAL_MEMORY]. Fix it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/memory-hotplug.txt |    2 +-
 mm/memory_hotplug.c              |   23 +++++++++++++++++++++--
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 6d0c251..89f21b2 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -382,7 +382,7 @@ struct memory_notify {
 
 start_pfn is start_pfn of online/offline memory.
 nr_pages is # of pages of online/offline memory.
-status_change_nid is set node id when N_HIGH_MEMORY of nodemask is (will be)
+status_change_nid is set node id when N_MEMORY of nodemask is (will be)
 set/clear. It means a new(memoryless) node gets new memory by online and a
 node loses all memory. If this is -1, then nodemask status is not changed.
 If status_changed_nid >= 0, callback should create/discard structures for the
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 427bb29..c44b39e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -522,8 +522,18 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 	init_per_zone_wmark_min();
 
 	if (onlined_pages) {
+		enum zone_type zoneid = zone_idx(zone);
+
 		kswapd_run(zone_to_nid(zone));
-		node_set_state(zone_to_nid(zone), N_HIGH_MEMORY);
+
+		node_set_state(nid, N_MEMORY);
+		if (zoneid <= ZONE_NORMAL && N_NORMAL_MEMORY != N_MEMORY)
+			node_set_state(nid, N_NORMAL_MEMORY);
+#ifdef CONFIG_HIGHMEM
+		if (zoneid <= ZONE_HIGHMEM && N_HIGH_MEMORY != N_MEMORY)
+			node_set_state(nid, N_HIGH_MEMORY);
+#endif
+
 	}
 
 	vm_total_pages = nr_free_pagecache_pages();
@@ -966,7 +976,16 @@ repeat:
 	init_per_zone_wmark_min();
 
 	if (!node_present_pages(node)) {
-		node_clear_state(node, N_HIGH_MEMORY);
+		enum zone_type zoneid = zone_idx(zone);
+
+		node_clear_state(node, N_MEMORY);
+		if (zoneid <= ZONE_NORMAL && N_NORMAL_MEMORY != N_MEMORY)
+			node_clear_state(node, N_NORMAL_MEMORY);
+#ifdef CONFIG_HIGHMEM
+		if (zoneid <= ZONE_HIGHMEM && N_HIGH_MEMORY != N_MEMORY)
+			node_clear_state(node, N_HIGH_MEMORY);
+#endif
+
 		kswapd_stop(node);
 	}
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 16/23 V2] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (14 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 15/23 V2] memory_hotplug: fix missing nodemask management Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 17/23 V2] page_alloc.c: don't subtract unrelated memmap from zone's present pages Lai Jiangshan
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Greg Kroah-Hartman, Andrew Morton, Jan Beulich,
	Seth Jennings, Dan Magenheimer, Michal Hocko, KAMEZAWA Hiroyuki,
	Minchan Kim, linux-mm

Everything is prepared, so we can actually introduce N_MEMORY.
Add CONFIG_MOVABLE_NODE so that we can use it for movable-dedicated nodes.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 drivers/base/node.c      |    6 ++++++
 include/linux/nodemask.h |    4 ++++
 mm/Kconfig               |    8 ++++++++
 mm/page_alloc.c          |    3 +++
 4 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 31f4805..4bf5629 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -621,6 +621,9 @@ static struct node_attr node_state_attr[] = {
 #ifdef CONFIG_HIGHMEM
 	_NODE_ATTR(has_high_memory, N_HIGH_MEMORY),
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	_NODE_ATTR(has_memory, N_MEMORY),
+#endif
 };
 
 static struct attribute *node_state_attrs[] = {
@@ -631,6 +634,9 @@ static struct attribute *node_state_attrs[] = {
 #ifdef CONFIG_HIGHMEM
 	&node_state_attr[4].attr.attr,
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	&node_state_attr[4].attr.attr,
+#endif
 	NULL
 };
 
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index c6ebdc9..4e2cbfa 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -380,7 +380,11 @@ enum node_states {
 #else
 	N_HIGH_MEMORY = N_NORMAL_MEMORY,
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	N_MEMORY,		/* The node has memory(regular, high, movable) */
+#else
 	N_MEMORY = N_HIGH_MEMORY,
+#endif
 	N_CPU,		/* The node has one or more cpus */
 	NR_NODE_STATES
 };
diff --git a/mm/Kconfig b/mm/Kconfig
index 82fed4e..4371c65 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -140,6 +140,14 @@ config ARCH_DISCARD_MEMBLOCK
 config NO_BOOTMEM
 	boolean
 
+config MOVABLE_NODE
+	boolean "Enable to assign a node has only movable memory"
+	depends on HAVE_MEMBLOCK
+	depends on NO_BOOTMEM
+	depends on X86_64
+	depends on NUMA
+	default y
+
 # eventually, we can have this option just 'select SPARSEMEM'
 config MEMORY_HOTPLUG
 	bool "Allow for memory hot-add"
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0571f2a..737faf7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -91,6 +91,9 @@ nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
 #ifdef CONFIG_HIGHMEM
 	[N_HIGH_MEMORY] = { { [0] = 1UL } },
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	[N_MEMORY] = { { [0] = 1UL } },
+#endif
 	[N_CPU] = { { [0] = 1UL } },
 #endif	/* NUMA */
 };
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 17/23 V2] page_alloc.c: don't subtract unrelated memmap from zone's present pages
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (15 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 16/23 V2] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 18/23 V2] page_alloc: add kernelcore_max_addr Lai Jiangshan
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Michal Hocko, KAMEZAWA Hiroyuki,
	Minchan Kim, linux-mm

A)======
Currently, the memory page map (struct page array) is not defined in struct zone.
It is defined in several ways:

FLATMEM: global memmap, can be allocated from any zone <= ZONE_NORMAL
CONFIG_DISCONTIGMEM: node-specific memmap, can be allocated from any
		     zone <= ZONE_NORMAL within that node.
CONFIG_SPARSEMEM: memory-section-specific memmap, can be allocated from any zone;
		  with CONFIG_SPARSEMEM_VMEMMAP, it is not even physically contiguous.

So the memmap is not directly related to the zone, and its memory can be
allocated outside the zone. It is therefore wrong to subtract the memmap's size
from the zone's present pages.

B)======
When the system has large holes, the subtracted present-pages size becomes
very small or negative. This makes memory management work badly in the zone, or
makes the zone unusable even though the real present-pages size is actually large.
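
To make the effect of holes concrete, here is a small user-space sketch of the accounting that this patch removes. The helper names are hypothetical, and the 4 KiB page size and 64-byte struct page are assumptions chosen for the example. The memmap cost is computed from the spanned size (holes included), then subtracted from the present size.

```c
#include <assert.h>

#define PAGE_SHIFT		12UL			/* assumption: 4 KiB pages */
#define PAGE_SIZE		(1UL << PAGE_SHIFT)
#define STRUCT_PAGE_SIZE	64UL			/* assumption: sizeof(struct page) */

/* Pages needed for the memmap covering 'spanned' pages, rounded up. */
unsigned long memmap_pages_for(unsigned long spanned)
{
	unsigned long bytes = spanned * STRUCT_PAGE_SIZE;

	return (bytes + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Present pages after the old subtraction, modelled on the removed hunk. */
unsigned long old_realsize(unsigned long spanned, unsigned long absent)
{
	unsigned long realsize, memmap;

	assert(absent <= spanned);
	realsize = spanned - absent;
	memmap = memmap_pages_for(spanned);

	/* The removed code only subtracted when the result stayed non-negative. */
	if (realsize >= memmap)
		realsize -= memmap;
	return realsize;
}
```

With these assumed sizes, a zone spanning 16777216 pages of which only 300000 are present is charged 262144 memmap pages, so it is accounted as having 37856 pages although 300000 are really there.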

C)======
The subtracted present-pages size also causes a problem during memory hot-removal:
zone->present_pages may wrap around and become huge (it is an unsigned long).

D)======
The memory page map is large, long-lived, unreclaimable memory, so it is good to
subtract it for proper watermarks.
So a new, proper approach is needed to do something similar,
and the new approach should also handle other long-lived unreclaimable memory.

The current approach of blindly subtracting from the present-pages size is wrong; remove it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/page_alloc.c |   20 +-------------------
 1 files changed, 1 insertions(+), 19 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 737faf7..03ad63d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4360,30 +4360,12 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 
 	for (j = 0; j < MAX_NR_ZONES; j++) {
 		struct zone *zone = pgdat->node_zones + j;
-		unsigned long size, realsize, memmap_pages;
+		unsigned long size, realsize;
 
 		size = zone_spanned_pages_in_node(nid, j, zones_size);
 		realsize = size - zone_absent_pages_in_node(nid, j,
 								zholes_size);
 
-		/*
-		 * Adjust realsize so that it accounts for how much memory
-		 * is used by this zone for memmap. This affects the watermark
-		 * and per-cpu initialisations
-		 */
-		memmap_pages =
-			PAGE_ALIGN(size * sizeof(struct page)) >> PAGE_SHIFT;
-		if (realsize >= memmap_pages) {
-			realsize -= memmap_pages;
-			if (memmap_pages)
-				printk(KERN_DEBUG
-				       "  %s zone: %lu pages used for memmap\n",
-				       zone_names[j], memmap_pages);
-		} else
-			printk(KERN_WARNING
-				"  %s zone: %lu pages exceeds realsize %lu\n",
-				zone_names[j], memmap_pages, realsize);
-
 		/* Account for reserved pages */
 		if (j == 0 && realsize > dma_reserve) {
 			realsize -= dma_reserve;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC PATCH 18/23 V2] page_alloc: add kernelcore_max_addr
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (16 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 17/23 V2] page_alloc.c: don't subtract unrelated memmap from zone's present pages Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 19/23 V2] x86: get pg_data_t's memory from other node Lai Jiangshan
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rob Landley, Andrew Morton, Michal Hocko,
	KAMEZAWA Hiroyuki, Minchan Kim, linux-doc, linux-mm

The current ZONE_MOVABLE (kernelcore=) setting policy with boot options doesn't meet
our requirements. We need something like a kernelcore_max_addr=XX boot option
to limit the upper address of kernelcore.

The memory at higher addresses will be migratable (movable) and is
easier to offline (it is always ready to be offlined when the system doesn't require
so much memory).

This makes things easier when we dynamically hot-add/remove memory, makes better
use of memory, and helps THP.

kernelcore_max_addr=, kernelcore= and movablecore= can all be safely specified
at the same time (or any two of them).
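
The address limit described above can be pictured with a minimal user-space sketch. It only models the clamping step (end_pfn = min(kernelcore_max_pfn, end_pfn)) added to find_zone_movable_pfns_for_nodes(); the function names are hypothetical and 4 KiB pages are assumed.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Convert a physical address limit to a page frame number. */
unsigned long addr_to_pfn(unsigned long long addr)
{
	return (unsigned long)(addr >> PAGE_SHIFT);
}

/*
 * Number of pages in [start_pfn, end_pfn) that may still be used for
 * kernelcore once the range is clamped at kernelcore_max_pfn.
 */
unsigned long kernelcore_pages_in_range(unsigned long start_pfn,
					unsigned long end_pfn,
					unsigned long kernelcore_max_pfn)
{
	assert(start_pfn <= end_pfn);

	if (end_pfn > kernelcore_max_pfn)
		end_pfn = kernelcore_max_pfn;
	if (start_pfn >= end_pfn)
		return 0;	/* the whole range lies above the limit */
	return end_pfn - start_pfn;
}
```

Under these assumptions, a limit of 4 GiB corresponds to pfn 0x100000, so a range that starts at or above that pfn contributes no kernelcore pages and would be left entirely to ZONE_MOVABLE.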

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/kernel-parameters.txt |    9 +++++++++
 mm/page_alloc.c                     |   29 ++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 12783fa..48dff61 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1216,6 +1216,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			use the HighMem zone if it exists, and the Normal
 			zone if it does not.
 
+	kernelcore_max_addr=nn[KMG]	[KNL,X86,IA-64,PPC] This parameter
+			has the same effect as the kernelcore parameter,
+			except that it specifies the upper physical address
+			of the memory range usable by the kernel for
+			non-movable allocations. If both kernelcore and
+			kernelcore_max_addr are specified, this request
+			takes priority over kernelcore.
+			See the kernelcore parameter.
+
 	kgdbdbgp=	[KGDB,HW] kgdb over EHCI usb debug port.
 			Format: <Controller#>[,poll interval]
 			The controller # is the number of the ehci usb debug
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 03ad63d..65ac5c9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -204,6 +204,7 @@ static unsigned long __meminitdata dma_reserve;
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 static unsigned long __meminitdata arch_zone_lowest_possible_pfn[MAX_NR_ZONES];
 static unsigned long __meminitdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
+static unsigned long __initdata required_kernelcore_max_pfn;
 static unsigned long __initdata required_kernelcore;
 static unsigned long __initdata required_movablecore;
 static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
@@ -4600,6 +4601,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 {
 	int i, nid;
 	unsigned long usable_startpfn;
+	unsigned long kernelcore_max_pfn;
 	unsigned long kernelcore_node, kernelcore_remaining;
 	/* save the state before borrow the nodemask */
 	nodemask_t saved_node_state = node_states[N_MEMORY];
@@ -4628,6 +4630,9 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 		required_kernelcore = max(required_kernelcore, corepages);
 	}
 
+	if (required_kernelcore_max_pfn && !required_kernelcore)
+		required_kernelcore = totalpages;
+
 	/* If kernelcore was not specified, there is no ZONE_MOVABLE */
 	if (!required_kernelcore)
 		goto out;
@@ -4636,6 +4641,12 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	find_usable_zone_for_movable();
 	usable_startpfn = arch_zone_lowest_possible_pfn[movable_zone];
 
+	if (required_kernelcore_max_pfn)
+		kernelcore_max_pfn = required_kernelcore_max_pfn;
+	else
+		kernelcore_max_pfn = ULONG_MAX >> PAGE_SHIFT;
+	kernelcore_max_pfn = max(kernelcore_max_pfn, usable_startpfn);
+
 restart:
 	/* Spread kernelcore memory as evenly as possible throughout nodes */
 	kernelcore_node = required_kernelcore / usable_nodes;
@@ -4662,8 +4673,12 @@ restart:
 			unsigned long size_pages;
 
 			start_pfn = max(start_pfn, zone_movable_pfn[nid]);
-			if (start_pfn >= end_pfn)
+			end_pfn = min(kernelcore_max_pfn, end_pfn);
+			if (start_pfn >= end_pfn) {
+				if (!zone_movable_pfn[nid])
+					zone_movable_pfn[nid] = start_pfn;
 				continue;
+			}
 
 			/* Account for what is only usable for kernelcore */
 			if (start_pfn < usable_startpfn) {
@@ -4854,6 +4869,18 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
 	return 0;
 }
 
+#ifdef CONFIG_MOVABLE_NODE
+/*
+ * kernelcore_max_addr=addr sets the upper physical address of the memory range
+ * used for allocations that cannot be reclaimed or migrated.
+ */
+static int __init cmdline_parse_kernelcore_max_addr(char *p)
+{
+	return cmdline_parse_core(p, &required_kernelcore_max_pfn);
+}
+early_param("kernelcore_max_addr", cmdline_parse_kernelcore_max_addr);
+#endif
+
 /*
  * kernelcore=size sets the amount of memory for use for allocations that
  * cannot be reclaimed or migrated.
-- 
1.7.1



* [RFC PATCH 19/23 V2] x86: get pg_data_t's memory from other node
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (17 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 18/23 V2] page_alloc: add kernelcore_max_addr Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 20/23 V2] x86: use memblock_set_current_limit() to set memblock.current_limit Lai Jiangshan
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Yasuaki Ishimatsu, Lai Jiangshan, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Andrew Morton, Wanlong Gao, Rusty Russell,
	Bjorn Helgaas

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

If the system can create a movable node, i.e. a node whose memory is all
allocated as ZONE_MOVABLE, setup_node_data() cannot allocate memory for
that node's pg_data_t from the node itself.
So when memblock_alloc_nid() fails, setup_node_data() retries with
memblock_alloc().
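
The fallback described above can be sketched in user space. This is only an
illustrative model, not the kernel code: struct boot_pool, alloc_on_node(),
alloc_on_any_node() and alloc_node_data() are hypothetical names, and a node
with no free bytes stands in for a movable-dedicated node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: bytes each node can still give to early boot allocations. */
#define NR_NODES 4
#define NO_NODE  (-1)

struct boot_pool {
	size_t free_bytes[NR_NODES];
};

/* Try to take @size bytes from node @nid only; return @nid or NO_NODE. */
static int alloc_on_node(struct boot_pool *pool, size_t size, int nid)
{
	assert(nid >= 0 && nid < NR_NODES);
	if (pool->free_bytes[nid] < size)
		return NO_NODE;
	pool->free_bytes[nid] -= size;
	return nid;
}

/* Take @size bytes from the first node that has room; return it or NO_NODE. */
static int alloc_on_any_node(struct boot_pool *pool, size_t size)
{
	for (int nid = 0; nid < NR_NODES; nid++)
		if (alloc_on_node(pool, size, nid) != NO_NODE)
			return nid;
	return NO_NODE;
}

/* Prefer the local node; fall back to any node only if that attempt failed. */
static int alloc_node_data(struct boot_pool *pool, size_t size, int nid)
{
	int got = alloc_on_node(pool, size, nid);

	if (got == NO_NODE)
		got = alloc_on_any_node(pool, size);
	return got;
}
```

In this sketch the any-node attempt runs only after the node-local attempt
has failed, so a successful local allocation is never replaced.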

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 arch/x86/mm/numa.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 2d125be..a86e315 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -223,9 +223,13 @@ static void __init setup_node_data(int nid, u64 start, u64 end)
 		remapped = true;
 	} else {
 		nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
-		if (!nd_pa) {
-			pr_err("Cannot find %zu bytes in node %d\n",
+		if (!nd_pa)
+			printk(KERN_WARNING "Cannot find %zu bytes in node %d\n",
 			       nd_size, nid);
+		nd_pa = memblock_alloc(nd_size, SMP_CACHE_BYTES);
+		if (!nd_pa) {
+			pr_err("Cannot find %zu bytes in other node\n",
+			       nd_size);
 			return;
 		}
 		nd = __va(nd_pa);
-- 
1.7.1



* [RFC PATCH 20/23 V2] x86: use memblock_set_current_limit() to set memblock.current_limit
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (18 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 19/23 V2] x86: get pg_data_t's memory from other node Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 21/23 V2] memblock: limit memory address from memblock Lai Jiangshan
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Yasuaki Ishimatsu, Lai Jiangshan, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Jarkko Sakkinen, Matt Fleming,
	Andrew Morton

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

memblock.current_limit is set directly even though
memblock_set_current_limit() is provided for that purpose. Use the helper instead.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 arch/x86/kernel/setup.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f4b9b80..bb9d9f8 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -889,7 +889,7 @@ void __init setup_arch(char **cmdline_p)
 
 	cleanup_highmap();
 
-	memblock.current_limit = get_max_mapped();
+	memblock_set_current_limit(get_max_mapped());
 	memblock_x86_fill();
 
 	/*
@@ -925,7 +925,7 @@ void __init setup_arch(char **cmdline_p)
 		max_low_pfn = max_pfn;
 	}
 #endif
-	memblock.current_limit = get_max_mapped();
+	memblock_set_current_limit(get_max_mapped());
 	dma_contiguous_reserve(0);
 
 	/*
-- 
1.7.1



* [RFC PATCH 21/23 V2] memblock: limit memory address from memblock
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (19 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 20/23 V2] x86: use memblock_set_current_limit() to set memblock.current_limit Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 22/23 V2] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Yasuaki Ishimatsu, Lai Jiangshan, Tejun Heo, Andrew Morton,
	Yinghai Lu, Sam Ravnborg, Ingo Molnar, Gavin Shan, Michal Hocko,
	KAMEZAWA Hiroyuki, Minchan Kim, linux-mm

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Setting kernelcore_max_addr means that all memory above the address given
by the boot parameter is allocated as ZONE_MOVABLE. So memory which
is allocated by memblock should also be limited by the parameter.

This patch limits the memory handed out by memblock accordingly.
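
As a rough illustration of the clamping rule, here is a stand-alone sketch.
It is a simplified model under stated assumptions: phys_addr_t is modelled as
a 64-bit integer, PAGE_SHIFT is assumed to be 12, and
set_memblock_limit_from_pfn() and set_current_limit() are hypothetical
stand-ins for the code touched by the diff below.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* stand-in for the kernel type */

#define PAGE_SHIFT 12		/* assumed 4 KiB pages */

/* 0 means "no kernelcore_max_addr was given". */
static phys_addr_t memblock_limit;
static phys_addr_t current_limit;

/* Convert the parsed page frame number into a byte address cap. */
static void set_memblock_limit_from_pfn(unsigned long max_pfn)
{
	/* Widen before shifting so the address cannot be truncated. */
	assert((uint64_t)max_pfn <= (UINT64_MAX >> PAGE_SHIFT));
	memblock_limit = (phys_addr_t)max_pfn << PAGE_SHIFT;
}

/* The requested limit is honoured only while it stays below the cap. */
static void set_current_limit(phys_addr_t limit)
{
	if (!memblock_limit || memblock_limit > limit)
		current_limit = limit;
	else
		current_limit = memblock_limit;
}
```

Widening the pfn before the shift is a choice made in this sketch; it matters
where unsigned long is narrower than phys_addr_t.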

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 include/linux/memblock.h |    1 +
 mm/memblock.c            |    5 ++++-
 mm/page_alloc.c          |    6 +++++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 19dc455..f2977ae 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -42,6 +42,7 @@ struct memblock {
 
 extern struct memblock memblock;
 extern int memblock_debug;
+extern phys_addr_t memblock_limit;
 
 #define memblock_dbg(fmt, ...) \
 	if (memblock_debug) printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
diff --git a/mm/memblock.c b/mm/memblock.c
index 5cc6731..663b805 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -931,7 +931,10 @@ int __init_memblock memblock_is_region_reserved(phys_addr_t base, phys_addr_t si
 
 void __init_memblock memblock_set_current_limit(phys_addr_t limit)
 {
-	memblock.current_limit = limit;
+	if (!memblock_limit || (memblock_limit > limit))
+		memblock.current_limit = limit;
+	else
+		memblock.current_limit = memblock_limit;
 }
 
 static void __init_memblock memblock_dump(struct memblock_type *type, char *name)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 65ac5c9..c4d3aa0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -209,6 +209,8 @@ static unsigned long __initdata required_kernelcore;
 static unsigned long __initdata required_movablecore;
 static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
 
+phys_addr_t memblock_limit;
+
 /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
 int movable_zone;
 EXPORT_SYMBOL(movable_zone);
@@ -4876,7 +4878,9 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
  */
 static int __init cmdline_parse_kernelcore_max_addr(char *p)
 {
-	return cmdline_parse_core(p, &required_kernelcore_max_pfn);
+	cmdline_parse_core(p, &required_kernelcore_max_pfn);
+	memblock_limit = required_kernelcore_max_pfn << PAGE_SHIFT;
+	return 0;
 }
 early_param("kernelcore_max_addr", cmdline_parse_kernelcore_max_addr);
 #endif
-- 
1.7.1



* [RFC PATCH 22/23 V2] memblock: compare current_limit with end variable at memblock_find_in_range_node()
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (20 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 21/23 V2] memblock: limit memory address from memblock Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-02  6:01 ` [RFC PATCH 23/23 V2] mm, memory-hotplug: add online_movable Lai Jiangshan
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Yasuaki Ishimatsu, Lai Jiangshan, Tejun Heo, Andrew Morton,
	Ingo Molnar, Gavin Shan, Yinghai Lu, linux-mm

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

memblock_find_in_range_node() does not compare memblock.current_limit
with its end argument. Thus, even if memblock.current_limit is smaller than
end, the function can return a memory address that is above
memblock.current_limit.

This patch adds the check to memblock_find_in_range_node().
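
The effect of the added check can be sketched outside the kernel. This is a
minimal model, not the real allocator: ALLOC_ACCESSIBLE is an assumed
sentinel, clamp_search_end() and find_top_down() are hypothetical helpers,
and the search is reduced to a single free range.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* stand-in for the kernel type */

/* Assumed sentinel meaning "anything currently accessible". */
#define ALLOC_ACCESSIBLE 0

/* Never let the search window end above the current limit. */
static phys_addr_t clamp_search_end(phys_addr_t end, phys_addr_t current_limit)
{
	if (end == ALLOC_ACCESSIBLE || end > current_limit)
		return current_limit;
	return end;
}

/*
 * Return the highest @align-aligned base in [start, end) that fits @size
 * after clamping @end, or 0 if nothing fits.
 */
static phys_addr_t find_top_down(phys_addr_t start, phys_addr_t end,
				 phys_addr_t size, phys_addr_t align,
				 phys_addr_t current_limit)
{
	phys_addr_t cand;

	assert(align && (align & (align - 1)) == 0);	/* power of two */
	end = clamp_search_end(end, current_limit);
	if (end < size || end - size < start)
		return 0;
	cand = (end - size) & ~(align - 1);		/* round down */
	return cand >= start ? cand : 0;
}
```

With the clamp in place, a caller that passes a very large end value gets the
same answer as one that asks for accessible memory only.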

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memblock.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 663b805..ce7fcb6 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -99,11 +99,12 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 					phys_addr_t align, int nid)
 {
 	phys_addr_t this_start, this_end, cand;
+	phys_addr_t current_limit = memblock.current_limit;
 	u64 i;
 
 	/* pump up @end */
-	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
-		end = memblock.current_limit;
+	if ((end == MEMBLOCK_ALLOC_ACCESSIBLE) || (end > current_limit))
+		end = current_limit;
 
 	/* avoid allocating the first page */
 	start = max_t(phys_addr_t, start, PAGE_SIZE);
-- 
1.7.1



* [RFC PATCH 23/23 V2] mm, memory-hotplug: add online_movable
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (21 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 22/23 V2] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
@ 2012-08-02  6:01 ` Lai Jiangshan
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
  23 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-02  6:01 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rob Landley, Greg Kroah-Hartman, Paul Gortmaker,
	Andrew Morton, Bjorn Helgaas, David Rientjes, Wen Congyang,
	linux-doc, linux-mm

When a memory block/memory section is onlined with "online_movable", the kernel
will hold no direct references to the pages of that memory block,
so the memory can be removed at any time when needed.

This makes dynamic hot-add/remove of memory easier, makes better
use of memory, and helps THP.

Current constraint: only a memory block which is adjacent to ZONE_MOVABLE
can be onlined from ZONE_NORMAL to ZONE_MOVABLE.

For the opposite onlining behavior, we also introduce "online_kernel", which
changes a memory block from ZONE_MOVABLE to ZONE_NORMAL when it is onlined.
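
The adjacency constraint can be illustrated with a small model of two
neighbouring zones. This is a sketch under simplifying assumptions, not the
code in the diff below: struct zone_span, span_end() and move_range_right()
are hypothetical, there is no locking, and the range check is stricter than
in the patch (the moved range must be exactly the tail of the lower zone).

```c
#include <assert.h>

/* Hypothetical, lock-free model of a zone's pfn span. */
struct zone_span {
	unsigned long start_pfn;
	unsigned long spanned_pages;
};

static unsigned long span_end(const struct zone_span *z)
{
	return z->start_pfn + z->spanned_pages;
}

/*
 * Move [start_pfn, end_pfn) from the top of @lower to the bottom of @upper.
 * Returns 0 on success, -1 if the range is not the tail of @lower.
 */
static int move_range_right(struct zone_span *lower, struct zone_span *upper,
			    unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long upper_end = span_end(upper);

	assert(start_pfn < end_pfn);
	if (start_pfn < lower->start_pfn)	/* begins below @lower */
		return -1;
	if (end_pfn != span_end(lower))		/* not the tail of @lower */
		return -1;
	if (upper->spanned_pages == 0)		/* empty zone becomes the range */
		upper_end = end_pfn;

	lower->spanned_pages = start_pfn - lower->start_pfn;
	upper->start_pfn = start_pfn;
	upper->spanned_pages = upper_end - start_pfn;
	return 0;
}
```

A range in the middle of the lower zone is rejected because moving it would
leave the lower zone with a hole, which a single start/size pair cannot
describe.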

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/memory-hotplug.txt |   14 ++++-
 drivers/base/memory.c            |   19 ++++--
 include/linux/memory_hotplug.h   |   13 ++++-
 mm/memory_hotplug.c              |  114 +++++++++++++++++++++++++++++++++++--
 4 files changed, 144 insertions(+), 16 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 89f21b2..7b1269c 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -161,7 +161,8 @@ a recent addition and not present on older kernels.
 		    in the memory block.
 'state'           : read-write
                     at read:  contains online/offline state of memory.
-                    at write: user can specify "online", "offline" command
+                    at write: user can specify "online_kernel",
+                    "online_movable", "online", "offline" command
                     which will be performed on al sections in the block.
 'phys_device'     : read-only: designed to show the name of physical memory
                     device.  This is not well implemented now.
@@ -255,6 +256,17 @@ For onlining, you have to write "online" to the section's state file as:
 
 % echo online > /sys/devices/system/memory/memoryXXX/state
 
+This onlining will not change the ZONE type of the target memory section.
+If the memory section is in ZONE_NORMAL, you can change it to ZONE_MOVABLE:
+
+% echo online_movable > /sys/devices/system/memory/memoryXXX/state
+(NOTE: current limit: this memory section must be adjacent to ZONE_MOVABLE)
+
+And if the memory section is in ZONE_MOVABLE, you can change it to ZONE_NORMAL:
+
+% echo online_kernel > /sys/devices/system/memory/memoryXXX/state
+(NOTE: current limit: this memory section must be adjacent to ZONE_NORMAL)
+
 After this, section memoryXXX's state will be 'online' and the amount of
 available memory will be increased.
 
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 7dda4f7..1ad2f48 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -246,7 +246,7 @@ static bool pages_correctly_reserved(unsigned long start_pfn,
  * OK to have direct references to sparsemem variables in here.
  */
 static int
-memory_block_action(unsigned long phys_index, unsigned long action)
+memory_block_action(unsigned long phys_index, unsigned long action, int online_type)
 {
 	unsigned long start_pfn, start_paddr;
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
@@ -262,7 +262,7 @@ memory_block_action(unsigned long phys_index, unsigned long action)
 			if (!pages_correctly_reserved(start_pfn, nr_pages))
 				return -EBUSY;
 
-			ret = online_pages(start_pfn, nr_pages);
+			ret = online_pages(start_pfn, nr_pages, online_type);
 			break;
 		case MEM_OFFLINE:
 			start_paddr = page_to_pfn(first_page) << PAGE_SHIFT;
@@ -279,7 +279,8 @@ memory_block_action(unsigned long phys_index, unsigned long action)
 }
 
 static int memory_block_change_state(struct memory_block *mem,
-		unsigned long to_state, unsigned long from_state_req)
+		unsigned long to_state, unsigned long from_state_req,
+		int online_type)
 {
 	int ret = 0;
 
@@ -293,7 +294,7 @@ static int memory_block_change_state(struct memory_block *mem,
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
-	ret = memory_block_action(mem->start_section_nr, to_state);
+	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
 
 	if (ret) {
 		mem->state = from_state_req;
@@ -325,10 +326,14 @@ store_mem_state(struct device *dev,
 
 	mem = container_of(dev, struct memory_block, dev);
 
-	if (!strncmp(buf, "online", min((int)count, 6)))
-		ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
+	if (!strncmp(buf, "online_kernel", min((int)count, 13)))
+		ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE, ONLINE_KERNEL);
+	else if (!strncmp(buf, "online_movable", min((int)count, 14)))
+		ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE, ONLINE_MOVABLE);
+	else if (!strncmp(buf, "online", min((int)count, 6)))
+		ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE, ONLINE_KEEP);
 	else if(!strncmp(buf, "offline", min((int)count, 7)))
-		ret = memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
+		ret = memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
 
 	if (ret)
 		return ret;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 910550f..047cd1d 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -25,6 +25,13 @@ enum {
 	MEMORY_HOTPLUG_MAX_BOOTMEM_TYPE = NODE_INFO,
 };
 
+/* Types for controlling the zone type of onlined memory */
+enum {
+	ONLINE_KEEP,
+	ONLINE_KERNEL,
+	ONLINE_MOVABLE,
+};
+
 /*
  * pgdat resizing functions
  */
@@ -45,6 +52,10 @@ void pgdat_resize_init(struct pglist_data *pgdat)
 }
 /*
  * Zone resizing functions
+ *
+ * Note: any attempt to resize a zone should hold both pgdat_resize_lock()
+ * and zone_span_writelock(). This ensures the size of a zone
+ * can't be changed while pgdat_resize_lock() is held.
  */
 static inline unsigned zone_span_seqbegin(struct zone *zone)
 {
@@ -70,7 +81,7 @@ extern int zone_grow_free_lists(struct zone *zone, unsigned long new_nr_pages);
 extern int zone_grow_waitqueues(struct zone *zone, unsigned long nr_pages);
 extern int add_one_highpage(struct page *page, int pfn, int bad_ppro);
 /* VM interface that may be used by firmware interface */
-extern int online_pages(unsigned long, unsigned long);
+extern int online_pages(unsigned long, unsigned long, int);
 extern void __offline_isolated_pages(unsigned long, unsigned long);
 
 typedef void (*online_page_callback_t)(struct page *page);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c44b39e..b5ee3db 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -210,6 +210,89 @@ static void grow_zone_span(struct zone *zone, unsigned long start_pfn,
 	zone_span_writeunlock(zone);
 }
 
+static void resize_zone(struct zone *zone, unsigned long start_pfn,
+		unsigned long end_pfn)
+{
+
+	zone_span_writelock(zone);
+
+	zone->zone_start_pfn = start_pfn;
+	zone->spanned_pages = end_pfn - start_pfn;
+
+	zone_span_writeunlock(zone);
+}
+
+static void fix_zone_id(struct zone *zone, unsigned long start_pfn,
+		unsigned long end_pfn)
+{
+	enum zone_type zid = zone_idx(zone);
+	int nid = zone->zone_pgdat->node_id;
+	unsigned long pfn;
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn++)
+		set_page_links(pfn_to_page(pfn), zid, nid, pfn);
+}
+
+static int move_pfn_range_left(struct zone *z1, struct zone *z2,
+		unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long flags;
+
+	pgdat_resize_lock(z1->zone_pgdat, &flags);
+
+	/* can't move pfns which are higher than @z2 */
+	if (end_pfn > z2->zone_start_pfn + z2->spanned_pages)
+		goto out_fail;
+	/* the moved-out part must be at the leftmost end of @z2 */
+	if (start_pfn > z2->zone_start_pfn)
+		goto out_fail;
+	/* must be included in or overlap @z2 */
+	if (end_pfn <= z2->zone_start_pfn)
+		goto out_fail;
+
+	resize_zone(z1, z1->zone_start_pfn, end_pfn);
+	resize_zone(z2, end_pfn, z2->zone_start_pfn + z2->spanned_pages);
+
+	pgdat_resize_unlock(z1->zone_pgdat, &flags);
+
+	fix_zone_id(z1, start_pfn, end_pfn);
+
+	return 0;
+out_fail:
+	pgdat_resize_unlock(z1->zone_pgdat, &flags);
+	return -1;
+}
+
+static int move_pfn_range_right(struct zone *z1, struct zone *z2,
+		unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long flags;
+
+	pgdat_resize_lock(z1->zone_pgdat, &flags);
+
+	/* can't move pfns which are lower than @z1 */
+	if (z1->zone_start_pfn > start_pfn)
+		goto out_fail;
+	/* the moved-out part must be at the rightmost end of @z1 */
+	if (z1->zone_start_pfn + z1->spanned_pages >  end_pfn)
+		goto out_fail;
+	/* must be included in or overlap @z1 */
+	if (start_pfn >= z1->zone_start_pfn + z1->spanned_pages)
+		goto out_fail;
+
+	resize_zone(z1, z1->zone_start_pfn, start_pfn);
+	resize_zone(z2, start_pfn, z2->zone_start_pfn + z2->spanned_pages);
+
+	pgdat_resize_unlock(z1->zone_pgdat, &flags);
+
+	fix_zone_id(z2, start_pfn, end_pfn);
+
+	return 0;
+out_fail:
+	pgdat_resize_unlock(z1->zone_pgdat, &flags);
+	return -1;
+}
+
 static void grow_pgdat_span(struct pglist_data *pgdat, unsigned long start_pfn,
 			    unsigned long end_pfn)
 {
@@ -457,7 +540,7 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 }
 
 
-int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
+int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_type)
 {
 	unsigned long onlined_pages = 0;
 	struct zone *zone;
@@ -467,6 +550,29 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 	struct memory_notify arg;
 
 	lock_memory_hotplug();
+	/*
+	 * This doesn't need a lock to do pfn_to_page().
+	 * The section can't be removed here because of the
+	 * memory_block->state_mutex.
+	 */
+	zone = page_zone(pfn_to_page(pfn));
+
+	if (online_type == ONLINE_KERNEL && zone_idx(zone) == ZONE_MOVABLE) {
+		if (move_pfn_range_left(zone - 1, zone, pfn, pfn + nr_pages)) {
+			unlock_memory_hotplug();
+			return -1;
+		}
+	}
+	if (online_type == ONLINE_MOVABLE && zone_idx(zone) == ZONE_MOVABLE - 1) {
+		if (move_pfn_range_right(zone, zone + 1, pfn, pfn + nr_pages)) {
+			unlock_memory_hotplug();
+			return -1;
+		}
+	}
+
+	/* The code above may have changed the zone of the pfn range */
+	zone = page_zone(pfn_to_page(pfn));
+
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
 	arg.status_change_nid = -1;
@@ -483,12 +589,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 		return ret;
 	}
 	/*
-	 * This doesn't need a lock to do pfn_to_page().
-	 * The section can't be removed here because of the
-	 * memory_block->state_mutex.
-	 */
-	zone = page_zone(pfn_to_page(pfn));
-	/*
 	 * If this zone is not populated, then it is not in zonelist.
 	 * This means the page allocator ignores this zone.
 	 * So, zonelist must be updated after online.
-- 
1.7.1



* Re: [RFC PATCH 09/23 V2] vmstat: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 ` [RFC PATCH 09/23 V2] vmstat: " Lai Jiangshan
@ 2012-08-02 16:09   ` Christoph Lameter
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Lameter @ 2012-08-02 16:09 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Mel Gorman, linux-kernel, Andrew Morton, KAMEZAWA Hiroyuki,
	David Rientjes, linux-mm

On Thu, 2 Aug 2012, Lai Jiangshan wrote:

> The code here need to handle with the nodes which have memory, we should
> use N_MEMORY instead.

Acked-by: Christoph Lameter <cl@linux.com>


* Re: [RFC PATCH 05/23 V2] mm,migrate: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 ` [RFC PATCH 05/23 V2] mm,migrate: " Lai Jiangshan
@ 2012-08-02 16:09   ` Christoph Lameter
  0 siblings, 0 replies; 54+ messages in thread
From: Christoph Lameter @ 2012-08-02 16:09 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Mel Gorman, linux-kernel, Andrew Morton, Hugh Dickins,
	Mel Gorman, Wang Sheng-Hui, linux-mm

On Thu, 2 Aug 2012, Lai Jiangshan wrote:

> The code here need to handle with the nodes which have memory, we should
> use N_MEMORY instead.

Acked-by: Christoph Lameter <cl@linux.com>


* Re: [RFC PATCH 08/23 V2] hugetlb: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-02  6:01 ` [RFC PATCH 08/23 V2] hugetlb: " Lai Jiangshan
@ 2012-08-04 14:02   ` Hillf Danton
  0 siblings, 0 replies; 54+ messages in thread
From: Hillf Danton @ 2012-08-04 14:02 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Mel Gorman, linux-kernel, Greg Kroah-Hartman, Andrew Morton,
	Michal Hocko, KAMEZAWA Hiroyuki, linux-mm

On Thu, Aug 2, 2012 at 2:01 PM, Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
> N_HIGH_MEMORY stands for the nodes that has normal or high memory.
> N_MEMORY stands for the nodes that has any memory.
>
> The code here need to handle with the nodes which have memory, we should
> use N_MEMORY instead.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  drivers/base/node.c |    2 +-
>  mm/hugetlb.c        |   24 ++++++++++++------------
>  2 files changed, 13 insertions(+), 13 deletions(-)
>

Better if the patch is split for hugetlb and node respectively.

Acked-by: Hillf Danton <dhillf@gmail.com>

> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index af1a177..31f4805 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -227,7 +227,7 @@ static node_registration_func_t __hugetlb_unregister_node;
>  static inline bool hugetlb_register_node(struct node *node)
>  {
>         if (__hugetlb_register_node &&
> -                       node_state(node->dev.id, N_HIGH_MEMORY)) {
> +                       node_state(node->dev.id, N_MEMORY)) {
>                 __hugetlb_register_node(node);
>                 return true;
>         }
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index e198831..661db47 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1046,7 +1046,7 @@ static void return_unused_surplus_pages(struct hstate *h,
>          * on-line nodes with memory and will handle the hstate accounting.
>          */
>         while (nr_pages--) {
> -               if (!free_pool_huge_page(h, &node_states[N_HIGH_MEMORY], 1))
> +               if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
>                         break;
>         }
>  }
> @@ -1150,14 +1150,14 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
>  int __weak alloc_bootmem_huge_page(struct hstate *h)
>  {
>         struct huge_bootmem_page *m;
> -       int nr_nodes = nodes_weight(node_states[N_HIGH_MEMORY]);
> +       int nr_nodes = nodes_weight(node_states[N_MEMORY]);
>
>         while (nr_nodes) {
>                 void *addr;
>
>                 addr = __alloc_bootmem_node_nopanic(
>                                 NODE_DATA(hstate_next_node_to_alloc(h,
> -                                               &node_states[N_HIGH_MEMORY])),
> +                                               &node_states[N_MEMORY])),
>                                 huge_page_size(h), huge_page_size(h), 0);
>
>                 if (addr) {
> @@ -1229,7 +1229,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
>                         if (!alloc_bootmem_huge_page(h))
>                                 break;
>                 } else if (!alloc_fresh_huge_page(h,
> -                                        &node_states[N_HIGH_MEMORY]))
> +                                        &node_states[N_MEMORY]))
>                         break;
>         }
>         h->max_huge_pages = i;
> @@ -1497,7 +1497,7 @@ static ssize_t nr_hugepages_store_common(bool obey_mempolicy,
>                 if (!(obey_mempolicy &&
>                                 init_nodemask_of_mempolicy(nodes_allowed))) {
>                         NODEMASK_FREE(nodes_allowed);
> -                       nodes_allowed = &node_states[N_HIGH_MEMORY];
> +                       nodes_allowed = &node_states[N_MEMORY];
>                 }
>         } else if (nodes_allowed) {
>                 /*
> @@ -1507,11 +1507,11 @@ static ssize_t nr_hugepages_store_common(bool obey_mempolicy,
>                 count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
>                 init_nodemask_of_node(nodes_allowed, nid);
>         } else
> -               nodes_allowed = &node_states[N_HIGH_MEMORY];
> +               nodes_allowed = &node_states[N_MEMORY];
>
>         h->max_huge_pages = set_max_huge_pages(h, count, nodes_allowed);
>
> -       if (nodes_allowed != &node_states[N_HIGH_MEMORY])
> +       if (nodes_allowed != &node_states[N_MEMORY])
>                 NODEMASK_FREE(nodes_allowed);
>
>         return len;
> @@ -1812,7 +1812,7 @@ static void hugetlb_register_all_nodes(void)
>  {
>         int nid;
>
> -       for_each_node_state(nid, N_HIGH_MEMORY) {
> +       for_each_node_state(nid, N_MEMORY) {
>                 struct node *node = &node_devices[nid];
>                 if (node->dev.id == nid)
>                         hugetlb_register_node(node);
> @@ -1906,8 +1906,8 @@ void __init hugetlb_add_hstate(unsigned order)
>         h->free_huge_pages = 0;
>         for (i = 0; i < MAX_NUMNODES; ++i)
>                 INIT_LIST_HEAD(&h->hugepage_freelists[i]);
> -       h->next_nid_to_alloc = first_node(node_states[N_HIGH_MEMORY]);
> -       h->next_nid_to_free = first_node(node_states[N_HIGH_MEMORY]);
> +       h->next_nid_to_alloc = first_node(node_states[N_MEMORY]);
> +       h->next_nid_to_free = first_node(node_states[N_MEMORY]);
>         snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
>                                         huge_page_size(h)/1024);
>
> @@ -1995,11 +1995,11 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
>                 if (!(obey_mempolicy &&
>                                init_nodemask_of_mempolicy(nodes_allowed))) {
>                         NODEMASK_FREE(nodes_allowed);
> -                       nodes_allowed = &node_states[N_HIGH_MEMORY];
> +                       nodes_allowed = &node_states[N_MEMORY];
>                 }
>                 h->max_huge_pages = set_max_huge_pages(h, tmp, nodes_allowed);
>
> -               if (nodes_allowed != &node_states[N_HIGH_MEMORY])
> +               if (nodes_allowed != &node_states[N_MEMORY])
>                         NODEMASK_FREE(nodes_allowed);
>         }
>  out:
> --
> 1.7.1
>


* [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug
  2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                   ` (22 preceding siblings ...)
  2012-08-02  6:01 ` [RFC PATCH 23/23 V2] mm, memory-hotplug: add online_movable Lai Jiangshan
@ 2012-08-06  9:22 ` Lai Jiangshan
  2012-08-06  9:22   ` [RFC V3 PATCH 01/25] page_alloc.c: don't subtract unrelated memmap from zone's present pages Lai Jiangshan
                     ` (25 more replies)
  23 siblings, 26 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:22 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel; +Cc: Lai Jiangshan

	A) Introduction:

This patchset adds MOVABLE-dedicated nodes and the online_movable action to memory management.

They are useful for anti-fragmentation (hugepages, high-order allocations...) and
for hot-removal of memory (virtualization, power saving, moving memory between
systems to make better use of it).

	B) User Interface:

When users (administrators of large systems) need to configure some nodes or memory as MOVABLE:
	1 Use kernelcore_max_addr=XX at boot
	2 Use the online_movable hotplug action at run time
We may introduce a more convenient interface, such as a
	movable_node=NODE_LIST boot option.
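
To make the intended effect of kernelcore_max_addr concrete, here is a small
sketch of how a single address cut-off could split each node into a kernel
part and a MOVABLE part. It only models the outcome described in this series:
PAGE_SHIFT is assumed to be 12, and struct node_range, addr_to_pfn(),
movable_start_pfn() and node_is_movable_dedicated() are hypothetical names.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Hypothetical description of one node's physical range. */
struct node_range {
	unsigned long start_pfn;	/* inclusive */
	unsigned long end_pfn;		/* exclusive */
};

static unsigned long addr_to_pfn(unsigned long long addr)
{
	return (unsigned long)(addr >> PAGE_SHIFT);
}

/* First pfn of the node that would fall into ZONE_MOVABLE. */
static unsigned long movable_start_pfn(const struct node_range *node,
				       unsigned long kernelcore_max_pfn)
{
	assert(node->start_pfn <= node->end_pfn);
	if (kernelcore_max_pfn <= node->start_pfn)
		return node->start_pfn;		/* whole node is movable */
	if (kernelcore_max_pfn >= node->end_pfn)
		return node->end_pfn;		/* no movable memory */
	return kernelcore_max_pfn;		/* node is split */
}

/* A non-empty node lying entirely above the cut-off is MOVABLE-dedicated. */
static int node_is_movable_dedicated(const struct node_range *node,
				     unsigned long kernelcore_max_pfn)
{
	return node->start_pfn < node->end_pfn &&
	       movable_start_pfn(node, kernelcore_max_pfn) == node->start_pfn;
}
```

For example, with a 4 GiB cut-off a node spanning 4 GiB to 8 GiB would be
MOVABLE-dedicated in this model, while a node spanning 2 GiB to 6 GiB would
be split at the 4 GiB mark.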

	C) Patches

Patch1-3      Fix problems in the current code (all related to hotplug)
Patch4        Clean up node_state_attr
Patch5        Introduce N_MEMORY
Patch6-18     Use N_MEMORY instead of N_HIGH_MEMORY.
              The patches are separated by subsystem,
              *these conversions were (and must be) checked carefully*.
              Patch18 also changes the node_states initialization
Patch19       Add config to allow MOVABLE-dedicated node
Patch20-24    Add kernelcore_max_addr
Patch25       Add online_movable and online_kernel


	D) Changes

Changes from V2 to V3:
	Proper nodemask management

Changes from V1 to V2:

The original V1 patchset of MOVABLE-dedicated node is here:
http://comments.gmane.org/gmane.linux.kernel.mm/78122

The new V2 adds N_MEMORY and a notion of "MOVABLE-dedicated node".
It also fixes some related problems.

The original V1 patchset of "add online_movable" is here:
https://lkml.org/lkml/2012/7/4/145

The new V2 discards the MIGRATE_HOTREMOVE approach and uses a more
straightforward implementation (only 1 patch).


Lai Jiangshan (21):
  page_alloc.c: don't subtract unrelated memmap from zone's present
    pages
  memory_hotplug: fix missing nodemask management
  slub, hotplug: ignore unrelated node's hot-adding and hot-removing
  node: cleanup node_state_attr
  node_states: introduce N_MEMORY
  cpuset: use N_MEMORY instead N_HIGH_MEMORY
  procfs: use N_MEMORY instead N_HIGH_MEMORY
  memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  oom: use N_MEMORY instead N_HIGH_MEMORY
  mm,migrate: use N_MEMORY instead N_HIGH_MEMORY
  mempolicy: use N_MEMORY instead N_HIGH_MEMORY
  hugetlb: use N_MEMORY instead N_HIGH_MEMORY
  vmstat: use N_MEMORY instead N_HIGH_MEMORY
  kthread: use N_MEMORY instead N_HIGH_MEMORY
  init: use N_MEMORY instead N_HIGH_MEMORY
  vmscan: use N_MEMORY instead N_HIGH_MEMORY
  page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states
    initialization
  hotplug: update nodemasks management
  numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
  page_alloc: add kernelcore_max_addr
  mm, memory-hotplug: add online_movable and online_kernel

Yasuaki Ishimatsu (4):
  x86: get pg_data_t's memory from other node
  x86: use memblock_set_current_limit() to set memblock.current_limit
  memblock: limit memory address from memblock
  memblock: compare current_limit with end variable at
    memblock_find_in_range_node()

 Documentation/cgroups/cpusets.txt   |    2 +-
 Documentation/kernel-parameters.txt |    9 ++
 Documentation/memory-hotplug.txt    |   24 +++-
 arch/x86/kernel/setup.c             |    4 +-
 arch/x86/mm/init_64.c               |    4 +-
 arch/x86/mm/numa.c                  |    8 +-
 drivers/base/memory.c               |   19 ++-
 drivers/base/node.c                 |   28 +++--
 fs/proc/kcore.c                     |    2 +-
 fs/proc/task_mmu.c                  |    4 +-
 include/linux/cpuset.h              |    2 +-
 include/linux/memblock.h            |    1 +
 include/linux/memory.h              |    2 +
 include/linux/memory_hotplug.h      |   13 ++-
 include/linux/nodemask.h            |    5 +
 init/main.c                         |    2 +-
 kernel/cpuset.c                     |   32 +++---
 kernel/kthread.c                    |    2 +-
 mm/Kconfig                          |    8 ++
 mm/hugetlb.c                        |   24 ++--
 mm/memblock.c                       |   10 +-
 mm/memcontrol.c                     |   18 ++--
 mm/memory_hotplug.c                 |  232 ++++++++++++++++++++++++++++++++---
 mm/mempolicy.c                      |   12 +-
 mm/migrate.c                        |    2 +-
 mm/oom_kill.c                       |    2 +-
 mm/page_alloc.c                     |   96 +++++++++------
 mm/page_cgroup.c                    |    2 +-
 mm/slub.c                           |    4 +-
 mm/vmscan.c                         |    4 +-
 mm/vmstat.c                         |    4 +-
 31 files changed, 437 insertions(+), 144 deletions(-)

-- 
1.7.4.4


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 01/25] page_alloc.c: don't subtract unrelated memmap from zone's present pages
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
@ 2012-08-06  9:22   ` Lai Jiangshan
  2012-08-06  9:22   ` [RFC V3 PATCH 02/25] memory_hotplug: fix missing nodemask management Lai Jiangshan
                     ` (24 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:22 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Michal Hocko, KAMEZAWA Hiroyuki,
	Minchan Kim, linux-mm

A)======
Currently, the memory page map (the struct page array) is not defined in
struct zone. It is defined in several ways:

FLATMEM: global memmap, can be allocated from any zone <= ZONE_NORMAL
CONFIG_DISCONTIGMEM: node-specific memmap, can be allocated from any
		     zone <= ZONE_NORMAL within that node.
CONFIG_SPARSEMEM: memory-section-specific memmap, can be allocated from any zone;
		  with CONFIG_SPARSEMEM_VMEMMAP, it is not even physically contiguous.

So the memmap is not directly related to the zone, and its memory can be
allocated outside the zone, so it is wrong to subtract the memmap's size from
the zone's present pages.

B)======
When the system has large holes, the subtracted present-pages size becomes
very small or negative, which makes memory management work badly in the zone
or makes the zone unusable even though the real present-pages size is large.

C)======
The subtracted present-pages size also causes a problem during memory
hot-remove: zone->present_pages may wrap around and become huge (it is an
unsigned long).

D)======
The memory page map is large, long-living, unreclaimable memory, so it is good
to subtract it in order to get proper watermarks.
A new, proper approach is needed to do that, and the new approach should also
handle other long-living unreclaimable memory.

The current approach of blindly subtracting from the present-pages size is
wrong, so remove it.
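
As a rough illustration of B) and C), the following user-space sketch models
the removed arithmetic. It is not kernel code: the sketch_* names, the 4 KiB
page size and the 64-byte struct page size are assumptions chosen only to make
the numbers concrete.

```c
#include <assert.h>

/* Illustrative values only; the real ones depend on arch and config. */
#define SKETCH_PAGE_SHIFT	12UL
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_STRUCT_PAGE_SIZE	64UL	/* assumed sizeof(struct page) */

/* Pages needed to hold the memmap of a zone spanning `spanned` pages. */
static unsigned long sketch_memmap_pages(unsigned long spanned)
{
	unsigned long bytes = spanned * SKETCH_STRUCT_PAGE_SIZE;

	return (bytes + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/*
 * Model of the removed adjustment: the memmap of the whole span
 * (holes included) is charged against the pages really present.
 */
static unsigned long sketch_old_realsize(unsigned long spanned,
					 unsigned long absent)
{
	unsigned long realsize, memmap;

	assert(absent <= spanned);
	realsize = spanned - absent;
	memmap = sketch_memmap_pages(spanned);
	if (realsize >= memmap)
		realsize -= memmap;
	return realsize;
}

/* Hot-remove model: subtracting more than was accounted wraps around. */
static unsigned long sketch_offline(unsigned long present_pages,
				    unsigned long nr_pages)
{
	return present_pages - nr_pages;
}
```

With these assumed sizes, a zone spanning 1M pages of which only 20000 are
present is accounted as 3616 pages, and offlining the 20000 real pages from
that count wraps the unsigned counter.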

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/page_alloc.c |   20 +-------------------
 1 files changed, 1 insertions(+), 19 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4a4f921..9312702 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4357,30 +4357,12 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 
 	for (j = 0; j < MAX_NR_ZONES; j++) {
 		struct zone *zone = pgdat->node_zones + j;
-		unsigned long size, realsize, memmap_pages;
+		unsigned long size, realsize;
 
 		size = zone_spanned_pages_in_node(nid, j, zones_size);
 		realsize = size - zone_absent_pages_in_node(nid, j,
 								zholes_size);
 
-		/*
-		 * Adjust realsize so that it accounts for how much memory
-		 * is used by this zone for memmap. This affects the watermark
-		 * and per-cpu initialisations
-		 */
-		memmap_pages =
-			PAGE_ALIGN(size * sizeof(struct page)) >> PAGE_SHIFT;
-		if (realsize >= memmap_pages) {
-			realsize -= memmap_pages;
-			if (memmap_pages)
-				printk(KERN_DEBUG
-				       "  %s zone: %lu pages used for memmap\n",
-				       zone_names[j], memmap_pages);
-		} else
-			printk(KERN_WARNING
-				"  %s zone: %lu pages exceeds realsize %lu\n",
-				zone_names[j], memmap_pages, realsize);
-
 		/* Account for reserved pages */
 		if (j == 0 && realsize > dma_reserve) {
 			realsize -= dma_reserve;
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 02/25] memory_hotplug: fix missing nodemask management
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
  2012-08-06  9:22   ` [RFC V3 PATCH 01/25] page_alloc.c: don't subtract unrelated memmap from zone's present pages Lai Jiangshan
@ 2012-08-06  9:22   ` Lai Jiangshan
  2012-08-06  9:22   ` [RFC V3 PATCH 03/25] slub, hotplug: ignore unrelated node's hot-adding and hot-removing Lai Jiangshan
                     ` (23 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:22 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rob Landley, Kay Sievers, Greg Kroah-Hartman,
	Andrew Morton, Paul Gortmaker, Bjorn Helgaas, David Rientjes,
	linux-doc, linux-mm

Currently memory_hotplug only manages node_states[N_HIGH_MEMORY];
it forgets to manage node_states[N_NORMAL_MEMORY]. Fix it.
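
The online-side decision can be modelled in user space roughly as below. This
is only a sketch of the logic in check_nodemasks_changes_online() from the
diff; the sketch_* names and the boolean parameters (which stand in for
node_state() lookups and for the CONFIG_HIGHMEM distinction) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_zone {
	SK_ZONE_DMA,
	SK_ZONE_NORMAL,
	SK_ZONE_HIGHMEM,
	SK_ZONE_MOVABLE
};

struct sketch_notify {
	int status_change_nid_normal;	/* -1: N_NORMAL_MEMORY unchanged */
	int status_change_nid;		/* -1: N_HIGH_MEMORY unchanged */
};

/*
 * node_has_normal / node_has_any model node_state(nid, N_NORMAL_MEMORY)
 * and node_state(nid, N_HIGH_MEMORY) before the pages go online.
 * highmem == false models N_HIGH_MEMORY == N_NORMAL_MEMORY.
 */
static struct sketch_notify sketch_check_online(int nid, enum sketch_zone zone,
						bool node_has_normal,
						bool node_has_any,
						bool highmem)
{
	enum sketch_zone zone_last = highmem ? SK_ZONE_NORMAL : SK_ZONE_MOVABLE;
	struct sketch_notify arg = { -1, -1 };

	assert(nid >= 0);
	if (zone <= zone_last && !node_has_normal)
		arg.status_change_nid_normal = nid;
	if (!node_has_any)
		arg.status_change_nid = nid;
	return arg;
}
```

For example, onlining highmem on an empty node changes only the "any memory"
state, while onlining normal memory on a node that so far had only highmem
changes only the "normal memory" state.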

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/memory-hotplug.txt |    5 ++-
 include/linux/memory.h           |    1 +
 mm/memory_hotplug.c              |   94 +++++++++++++++++++++++++++++++------
 3 files changed, 83 insertions(+), 17 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 6d0c251..6e6cbc7 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -377,15 +377,18 @@ The third argument is passed by pointer of struct memory_notify.
 struct memory_notify {
        unsigned long start_pfn;
        unsigned long nr_pages;
+       int status_change_nid_normal;
        int status_change_nid;
 }
 
 start_pfn is start_pfn of online/offline memory.
 nr_pages is # of pages of online/offline memory.
+status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
+is (will be) set/clear, if this is -1, then nodemask status is not changed.
 status_change_nid is set node id when N_HIGH_MEMORY of nodemask is (will be)
 set/clear. It means a new(memoryless) node gets new memory by online and a
 node loses all memory. If this is -1, then nodemask status is not changed.
-If status_changed_nid >= 0, callback should create/discard structures for the
+If status_changed_nid* >= 0, callback should create/discard structures for the
 node if necessary.
 
 --------------
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 1ac7f6e..6b9202b 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -53,6 +53,7 @@ int arch_get_memory_phys_device(unsigned long start_pfn);
 struct memory_notify {
 	unsigned long start_pfn;
 	unsigned long nr_pages;
+	int status_change_nid_normal;
 	int status_change_nid;
 };
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 427bb29..3438c4a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -456,6 +456,34 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 	return 0;
 }
 
+static void check_nodemasks_changes_online(unsigned long nr_pages,
+	struct zone *zone, struct memory_notify *arg)
+{
+	int nid = zone_to_nid(zone);
+	enum zone_type zone_last = ZONE_NORMAL;
+
+	if (N_HIGH_MEMORY == N_NORMAL_MEMORY)
+		zone_last = ZONE_MOVABLE;
+
+	if (zone_idx(zone) <= zone_last && !node_state(nid, N_NORMAL_MEMORY))
+		arg->status_change_nid_normal = nid;
+	else
+		arg->status_change_nid_normal = -1;
+
+	if (!node_state(nid, N_HIGH_MEMORY))
+		arg->status_change_nid = nid;
+	else
+		arg->status_change_nid = -1;
+}
+
+static void set_nodemasks(int node, struct memory_notify *arg)
+{
+	if (arg->status_change_nid_normal >= 0)
+		node_set_state(node, N_NORMAL_MEMORY);
+
+	node_set_state(node, N_HIGH_MEMORY);
+}
+
 
 int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 {
@@ -467,13 +495,18 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 	struct memory_notify arg;
 
 	lock_memory_hotplug();
+	/*
+	 * This doesn't need a lock to do pfn_to_page().
+	 * The section can't be removed here because of the
+	 * memory_block->state_mutex.
+	 */
+	zone = page_zone(pfn_to_page(pfn));
+
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
-	arg.status_change_nid = -1;
+	check_nodemasks_changes_online(nr_pages, zone, &arg);
 
 	nid = page_to_nid(pfn_to_page(pfn));
-	if (node_present_pages(nid) == 0)
-		arg.status_change_nid = nid;
 
 	ret = memory_notify(MEM_GOING_ONLINE, &arg);
 	ret = notifier_to_errno(ret);
@@ -483,12 +516,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 		return ret;
 	}
 	/*
-	 * This doesn't need a lock to do pfn_to_page().
-	 * The section can't be removed here because of the
-	 * memory_block->state_mutex.
-	 */
-	zone = page_zone(pfn_to_page(pfn));
-	/*
 	 * If this zone is not populated, then it is not in zonelist.
 	 * This means the page allocator ignores this zone.
 	 * So, zonelist must be updated after online.
@@ -523,7 +550,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 
 	if (onlined_pages) {
 		kswapd_run(zone_to_nid(zone));
-		node_set_state(zone_to_nid(zone), N_HIGH_MEMORY);
+		set_nodemasks(zone_to_nid(zone), &arg);
 	}
 
 	vm_total_pages = nr_free_pagecache_pages();
@@ -865,6 +892,44 @@ check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
 	return offlined;
 }
 
+static void check_nodemasks_changes_offline(unsigned long nr_pages,
+		struct zone *zone, struct memory_notify *arg)
+{
+	struct pglist_data *pgdat = zone->zone_pgdat;
+	unsigned long present_pages = 0;
+	enum zone_type zt, zone_last = ZONE_NORMAL;
+
+	if (N_HIGH_MEMORY == N_NORMAL_MEMORY)
+		zone_last = ZONE_MOVABLE;
+
+	for (zt = 0; zt <= zone_last; zt++)
+		present_pages += pgdat->node_zones[zt].present_pages;
+	if (zone_idx(zone) <= zone_last && nr_pages >= present_pages)
+		arg->status_change_nid_normal = zone_to_nid(zone);
+	else
+		arg->status_change_nid_normal = -1;
+
+	zone_last = ZONE_MOVABLE;
+	for (; zt <= zone_last; zt++)
+		present_pages += pgdat->node_zones[zt].present_pages;
+	if (nr_pages >= present_pages)
+		arg->status_change_nid = zone_to_nid(zone);
+	else
+		arg->status_change_nid = -1;
+}
+
+static void clear_nodemasks(int node, struct memory_notify *arg)
+{
+	if (arg->status_change_nid_normal >= 0)
+		node_clear_state(node, N_NORMAL_MEMORY);
+
+	if (N_HIGH_MEMORY == N_NORMAL_MEMORY)
+		return;
+
+	if (arg->status_change_nid >= 0)
+		node_clear_state(node, N_HIGH_MEMORY);
+}
+
 static int __ref offline_pages(unsigned long start_pfn,
 		  unsigned long end_pfn, unsigned long timeout)
 {
@@ -898,9 +963,7 @@ static int __ref offline_pages(unsigned long start_pfn,
 
 	arg.start_pfn = start_pfn;
 	arg.nr_pages = nr_pages;
-	arg.status_change_nid = -1;
-	if (nr_pages >= node_present_pages(node))
-		arg.status_change_nid = node;
+	check_nodemasks_changes_offline(nr_pages, zone, &arg);
 
 	ret = memory_notify(MEM_GOING_OFFLINE, &arg);
 	ret = notifier_to_errno(ret);
@@ -965,10 +1028,9 @@ repeat:
 
 	init_per_zone_wmark_min();
 
-	if (!node_present_pages(node)) {
-		node_clear_state(node, N_HIGH_MEMORY);
+	clear_nodemasks(node, &arg);
+	if (arg.status_change_nid >= 0)
 		kswapd_stop(node);
-	}
 
 	vm_total_pages = nr_free_pagecache_pages();
 	writeback_set_ratelimit();
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 03/25] slub, hotplug: ignore unrelated node's hot-adding and hot-removing
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
  2012-08-06  9:22   ` [RFC V3 PATCH 01/25] page_alloc.c: don't subtract unrelated memmap from zone's present pages Lai Jiangshan
  2012-08-06  9:22   ` [RFC V3 PATCH 02/25] memory_hotplug: fix missing nodemask management Lai Jiangshan
@ 2012-08-06  9:22   ` Lai Jiangshan
  2012-08-06  9:22   ` [RFC V3 PATCH 04/25] node: cleanup node_state_attr Lai Jiangshan
                     ` (22 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:22 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Christoph Lameter, Pekka Enberg, Matt Mackall, linux-mm

SLUB only focuses on the nodes which have normal memory, so ignore the other
nodes' hot-adding and hot-removing.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/slub.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 8c691fa..f8b137a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3568,7 +3568,7 @@ static void slab_mem_offline_callback(void *arg)
 	struct memory_notify *marg = arg;
 	int offline_node;
 
-	offline_node = marg->status_change_nid;
+	offline_node = marg->status_change_nid_normal;
 
 	/*
 	 * If the node still has available memory. we need kmem_cache_node
@@ -3601,7 +3601,7 @@ static int slab_mem_going_online_callback(void *arg)
 	struct kmem_cache_node *n;
 	struct kmem_cache *s;
 	struct memory_notify *marg = arg;
-	int nid = marg->status_change_nid;
+	int nid = marg->status_change_nid_normal;
 	int ret = 0;
 
 	/*
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 04/25] node: cleanup node_state_attr
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (2 preceding siblings ...)
  2012-08-06  9:22   ` [RFC V3 PATCH 03/25] slub, hotplug: ignore unrelated node's hot-adding and hot-removing Lai Jiangshan
@ 2012-08-06  9:22   ` Lai Jiangshan
  2012-08-06  9:22   ` [RFC V3 PATCH 05/25] node_states: introduce N_MEMORY Lai Jiangshan
                     ` (21 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:22 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel; +Cc: Lai Jiangshan, Greg Kroah-Hartman

Make it more readable and make it easier to add a new state.
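
The idea is plain C99 designated initializers indexed by the enum, so a slot
is named by its state instead of by a magic position. A stand-alone sketch
(the sketch_* names are made up for illustration and are not the kernel's):

```c
#include <assert.h>
#include <string.h>

enum sketch_state {
	SK_POSSIBLE,
	SK_ONLINE,
	SK_NORMAL_MEMORY,
	SK_CPU,
	SK_NR_STATES
};

struct sketch_attr {
	const char *name;
	enum sketch_state state;
};

/* The source order of the entries no longer matters; the index does. */
static const struct sketch_attr sketch_attrs[SK_NR_STATES] = {
	[SK_POSSIBLE]		= { "possible", SK_POSSIBLE },
	[SK_ONLINE]		= { "online", SK_ONLINE },
	[SK_CPU]		= { "has_cpu", SK_CPU },
	[SK_NORMAL_MEMORY]	= { "has_normal_memory", SK_NORMAL_MEMORY },
};

/* Look an attribute up by state rather than by a hard-coded number. */
static const char *sketch_attr_name(enum sketch_state s)
{
	assert(s < SK_NR_STATES);
	return sketch_attrs[s].name;
}
```

Adding a state then only needs a new enum value and one new initializer line.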

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 drivers/base/node.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index af1a177..5d7731e 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -614,23 +614,23 @@ static ssize_t show_node_state(struct device *dev,
 	{ __ATTR(name, 0444, show_node_state, NULL), state }
 
 static struct node_attr node_state_attr[] = {
-	_NODE_ATTR(possible, N_POSSIBLE),
-	_NODE_ATTR(online, N_ONLINE),
-	_NODE_ATTR(has_normal_memory, N_NORMAL_MEMORY),
-	_NODE_ATTR(has_cpu, N_CPU),
+	[N_POSSIBLE] = _NODE_ATTR(possible, N_POSSIBLE),
+	[N_ONLINE] = _NODE_ATTR(online, N_ONLINE),
+	[N_NORMAL_MEMORY] = _NODE_ATTR(has_normal_memory, N_NORMAL_MEMORY),
 #ifdef CONFIG_HIGHMEM
-	_NODE_ATTR(has_high_memory, N_HIGH_MEMORY),
+	[N_HIGH_MEMORY] = _NODE_ATTR(has_high_memory, N_HIGH_MEMORY),
 #endif
+	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
 };
 
 static struct attribute *node_state_attrs[] = {
-	&node_state_attr[0].attr.attr,
-	&node_state_attr[1].attr.attr,
-	&node_state_attr[2].attr.attr,
-	&node_state_attr[3].attr.attr,
+	&node_state_attr[N_POSSIBLE].attr.attr,
+	&node_state_attr[N_ONLINE].attr.attr,
+	&node_state_attr[N_NORMAL_MEMORY].attr.attr,
 #ifdef CONFIG_HIGHMEM
-	&node_state_attr[4].attr.attr,
+	&node_state_attr[N_HIGH_MEMORY].attr.attr,
 #endif
+	&node_state_attr[N_CPU].attr.attr,
 	NULL
 };
 
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 05/25] node_states: introduce N_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (3 preceding siblings ...)
  2012-08-06  9:22   ` [RFC V3 PATCH 04/25] node: cleanup node_state_attr Lai Jiangshan
@ 2012-08-06  9:22   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 06/25] cpuset: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
                     ` (20 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:22 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel; +Cc: Lai Jiangshan, Christoph Lameter, Hillf Danton

We have N_NORMAL_MEMORY to stand for the nodes that have normal memory, i.e.
zone_type <= ZONE_NORMAL.

And we have N_HIGH_MEMORY to stand for the nodes that have normal or high
memory.

But we don't have any word to stand for the nodes that have *any* memory.

And we have N_CPU but no N_MEMORY.

Current code reuses N_HIGH_MEMORY for this purpose because, currently, any
node which has memory must have high memory or normal memory.

A)	But this reuse is bad for *readability*, because the name
	N_HIGH_MEMORY just stands for high or normal memory:

A.example 1)
	mem_cgroup_nr_lru_pages():
		for_each_node_state(nid, N_HIGH_MEMORY)

	The reader will be confused (why does this function only count high or
	normal memory nodes? does it count ZONE_MOVABLE's lru pages?)
	until someone tells them that N_HIGH_MEMORY is reused to stand for
	nodes that have any memory.

A.cont) If we introduce N_MEMORY, we can reduce this confusion
	AND make the code clearer:

A.example 2) mm/page_cgroup.c uses N_HIGH_MEMORY twice:

	The first use is in page_cgroup_init(void):
		for_each_node_state(nid, N_HIGH_MEMORY) {

	It means that if the node has memory, we will allocate a page_cgroup map
	for the node. We should use N_MEMORY here instead to make this clearer.

	The second use is in alloc_page_cgroup():
		if (node_state(nid, N_HIGH_MEMORY))
			addr = vzalloc_node(size, nid);

	It checks whether the node has high or normal memory that can be
	allocated by the kernel. We should keep N_HIGH_MEMORY here, and it will
	be better if the "any memory" semantic of N_HIGH_MEMORY is removed.

B)	This reuse becomes outdated if we introduce MOVABLE-dedicated nodes.
	A MOVABLE-dedicated node should appear in neither
	node_states[N_HIGH_MEMORY] nor node_states[N_NORMAL_MEMORY],
	because a MOVABLE-dedicated node has no high or normal memory.

	On x86_64, N_HIGH_MEMORY=N_NORMAL_MEMORY, so if a MOVABLE-dedicated node
	is in node_states[N_HIGH_MEMORY], it is also in
	node_states[N_NORMAL_MEMORY], which makes SLUB go wrong.

	SLUB uses
		for_each_node_state(nid, N_NORMAL_MEMORY)
	and so creates a kmem_cache_node for the MOVABLE-dedicated node, which
	causes problems.

In short, we need an N_MEMORY. For now we just introduce it as an alias of
N_HIGH_MEMORY and fix all improper usages of N_HIGH_MEMORY in later patches.
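
To make the aliasing concrete, here is a small user-space model of the enum
after this patch. The SK_*/sketch_* names, the SKETCH_HIGHMEM switch and the
one-word bitmask are assumptions for illustration; the real definitions are
in include/linux/nodemask.h.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_HIGHMEM 0	/* 0 models a config without highmem, e.g. x86_64 */

enum sketch_node_states {
	SK_N_POSSIBLE,
	SK_N_ONLINE,
	SK_N_NORMAL_MEMORY,
#if SKETCH_HIGHMEM
	SK_N_HIGH_MEMORY,
#else
	SK_N_HIGH_MEMORY = SK_N_NORMAL_MEMORY,
#endif
	SK_N_MEMORY = SK_N_HIGH_MEMORY,	/* only an alias at this point */
	SK_N_CPU,
	SK_NR_NODE_STATES
};

/* One bitmask of node ids per state (at most 32 nodes in this model). */
static unsigned long sketch_masks[SK_NR_NODE_STATES];

static void sketch_node_set_state(int nid, enum sketch_node_states s)
{
	assert(nid >= 0 && nid < 32);
	sketch_masks[s] |= 1UL << nid;
}

static bool sketch_node_state(int nid, enum sketch_node_states s)
{
	assert(nid >= 0 && nid < 32);
	return (sketch_masks[s] >> nid) & 1UL;
}
```

Because the alias adds no enum slot, N_CPU keeps its value. In this model,
marking a node in SK_N_MEMORY also marks it in SK_N_NORMAL_MEMORY while the
three names share one mask, which is the situation described in B).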

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 include/linux/nodemask.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 7afc363..c6ebdc9 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -380,6 +380,7 @@ enum node_states {
 #else
 	N_HIGH_MEMORY = N_NORMAL_MEMORY,
 #endif
+	N_MEMORY = N_HIGH_MEMORY,
 	N_CPU,		/* The node has one or more cpus */
 	NR_NODE_STATES
 };
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 06/25] cpuset: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (4 preceding siblings ...)
  2012-08-06  9:22   ` [RFC V3 PATCH 05/25] node_states: introduce N_MEMORY Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 07/25] procfs: " Lai Jiangshan
                     ` (19 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Paul Menage, Rob Landley, linux-doc

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 Documentation/cgroups/cpusets.txt |    2 +-
 include/linux/cpuset.h            |    2 +-
 kernel/cpuset.c                   |   32 ++++++++++++++++----------------
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt
index cefd3d8..12e01d4 100644
--- a/Documentation/cgroups/cpusets.txt
+++ b/Documentation/cgroups/cpusets.txt
@@ -218,7 +218,7 @@ and name space for cpusets, with a minimum of additional kernel code.
 The cpus and mems files in the root (top_cpuset) cpuset are
 read-only.  The cpus file automatically tracks the value of
 cpu_online_mask using a CPU hotplug notifier, and the mems file
-automatically tracks the value of node_states[N_HIGH_MEMORY]--i.e.,
+automatically tracks the value of node_states[N_MEMORY]--i.e.,
 nodes with memory--using the cpuset_track_online_nodes() hook.
 
 
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 838320f..8c8a60d29 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -144,7 +144,7 @@ static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
 	return node_possible_map;
 }
 
-#define cpuset_current_mems_allowed (node_states[N_HIGH_MEMORY])
+#define cpuset_current_mems_allowed (node_states[N_MEMORY])
 static inline void cpuset_init_current_mems_allowed(void) {}
 
 static inline int cpuset_nodemask_valid_mems_allowed(nodemask_t *nodemask)
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index f33c715..2b133db 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -302,10 +302,10 @@ static void guarantee_online_cpus(const struct cpuset *cs,
  * are online, with memory.  If none are online with memory, walk
  * up the cpuset hierarchy until we find one that does have some
  * online mems.  If we get all the way to the top and still haven't
- * found any online mems, return node_states[N_HIGH_MEMORY].
+ * found any online mems, return node_states[N_MEMORY].
  *
  * One way or another, we guarantee to return some non-empty subset
- * of node_states[N_HIGH_MEMORY].
+ * of node_states[N_MEMORY].
  *
  * Call with callback_mutex held.
  */
@@ -313,14 +313,14 @@ static void guarantee_online_cpus(const struct cpuset *cs,
 static void guarantee_online_mems(const struct cpuset *cs, nodemask_t *pmask)
 {
 	while (cs && !nodes_intersects(cs->mems_allowed,
-					node_states[N_HIGH_MEMORY]))
+					node_states[N_MEMORY]))
 		cs = cs->parent;
 	if (cs)
 		nodes_and(*pmask, cs->mems_allowed,
-					node_states[N_HIGH_MEMORY]);
+					node_states[N_MEMORY]);
 	else
-		*pmask = node_states[N_HIGH_MEMORY];
-	BUG_ON(!nodes_intersects(*pmask, node_states[N_HIGH_MEMORY]));
+		*pmask = node_states[N_MEMORY];
+	BUG_ON(!nodes_intersects(*pmask, node_states[N_MEMORY]));
 }
 
 /*
@@ -1100,7 +1100,7 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 		return -ENOMEM;
 
 	/*
-	 * top_cpuset.mems_allowed tracks node_stats[N_HIGH_MEMORY];
+	 * top_cpuset.mems_allowed tracks node_stats[N_MEMORY];
 	 * it's read-only
 	 */
 	if (cs == &top_cpuset) {
@@ -1122,7 +1122,7 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 			goto done;
 
 		if (!nodes_subset(trialcs->mems_allowed,
-				node_states[N_HIGH_MEMORY])) {
+				node_states[N_MEMORY])) {
 			retval =  -EINVAL;
 			goto done;
 		}
@@ -2034,7 +2034,7 @@ static struct cpuset *cpuset_next(struct list_head *queue)
  * before dropping down to the next.  It always processes a node before
  * any of its children.
  *
- * In the case of memory hot-unplug, it will remove nodes from N_HIGH_MEMORY
+ * In the case of memory hot-unplug, it will remove nodes from N_MEMORY
  * if all present pages from a node are offlined.
  */
 static void
@@ -2073,7 +2073,7 @@ scan_cpusets_upon_hotplug(struct cpuset *root, enum hotplug_event event)
 
 			/* Continue past cpusets with all mems online */
 			if (nodes_subset(cp->mems_allowed,
-					node_states[N_HIGH_MEMORY]))
+					node_states[N_MEMORY]))
 				continue;
 
 			oldmems = cp->mems_allowed;
@@ -2081,7 +2081,7 @@ scan_cpusets_upon_hotplug(struct cpuset *root, enum hotplug_event event)
 			/* Remove offline mems from this cpuset. */
 			mutex_lock(&callback_mutex);
 			nodes_and(cp->mems_allowed, cp->mems_allowed,
-						node_states[N_HIGH_MEMORY]);
+						node_states[N_MEMORY]);
 			mutex_unlock(&callback_mutex);
 
 			/* Move tasks from the empty cpuset to a parent */
@@ -2134,8 +2134,8 @@ void cpuset_update_active_cpus(bool cpu_online)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 /*
- * Keep top_cpuset.mems_allowed tracking node_states[N_HIGH_MEMORY].
- * Call this routine anytime after node_states[N_HIGH_MEMORY] changes.
+ * Keep top_cpuset.mems_allowed tracking node_states[N_MEMORY].
+ * Call this routine anytime after node_states[N_MEMORY] changes.
  * See cpuset_update_active_cpus() for CPU hotplug handling.
  */
 static int cpuset_track_online_nodes(struct notifier_block *self,
@@ -2148,7 +2148,7 @@ static int cpuset_track_online_nodes(struct notifier_block *self,
 	case MEM_ONLINE:
 		oldmems = top_cpuset.mems_allowed;
 		mutex_lock(&callback_mutex);
-		top_cpuset.mems_allowed = node_states[N_HIGH_MEMORY];
+		top_cpuset.mems_allowed = node_states[N_MEMORY];
 		mutex_unlock(&callback_mutex);
 		update_tasks_nodemask(&top_cpuset, &oldmems, NULL);
 		break;
@@ -2177,7 +2177,7 @@ static int cpuset_track_online_nodes(struct notifier_block *self,
 void __init cpuset_init_smp(void)
 {
 	cpumask_copy(top_cpuset.cpus_allowed, cpu_active_mask);
-	top_cpuset.mems_allowed = node_states[N_HIGH_MEMORY];
+	top_cpuset.mems_allowed = node_states[N_MEMORY];
 
 	hotplug_memory_notifier(cpuset_track_online_nodes, 10);
 
@@ -2245,7 +2245,7 @@ void cpuset_init_current_mems_allowed(void)
  *
  * Description: Returns the nodemask_t mems_allowed of the cpuset
  * attached to the specified @tsk.  Guaranteed to return some non-empty
- * subset of node_states[N_HIGH_MEMORY], even if this means going outside the
+ * subset of node_states[N_MEMORY], even if this means going outside the
  * tasks cpuset.
  **/
 
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 07/25] procfs: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (5 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 06/25] cpuset: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 08/25] memcontrol: " Lai Jiangshan
                     ` (18 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Laura Vasilescu, Jiri Kosina,
	WANG Cong, Hugh Dickins, Naoya Horiguchi, David Rientjes,
	Konstantin Khlebnikov

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 fs/proc/kcore.c    |    2 +-
 fs/proc/task_mmu.c |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 86c67ee..e96d4f1 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -249,7 +249,7 @@ static int kcore_update_ram(void)
 	/* Not inialized....update now */
 	/* find out "max pfn" */
 	end_pfn = 0;
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long node_end;
 		node_end  = NODE_DATA(nid)->node_start_pfn +
 			NODE_DATA(nid)->node_spanned_pages;
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4540b8f..ed3d381 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1080,7 +1080,7 @@ static struct page *can_gather_numa_stats(pte_t pte, struct vm_area_struct *vma,
 		return NULL;
 
 	nid = page_to_nid(page);
-	if (!node_isset(nid, node_states[N_HIGH_MEMORY]))
+	if (!node_isset(nid, node_states[N_MEMORY]))
 		return NULL;
 
 	return page;
@@ -1232,7 +1232,7 @@ static int show_numa_map(struct seq_file *m, void *v, int is_pid)
 	if (md->writeback)
 		seq_printf(m, " writeback=%lu", md->writeback);
 
-	for_each_node_state(n, N_HIGH_MEMORY)
+	for_each_node_state(n, N_MEMORY)
 		if (md->node[n])
 			seq_printf(m, " N%d=%lu", n, md->node[n]);
 out:
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 08/25] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (6 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 07/25] procfs: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 09/25] oom: " Lai Jiangshan
                     ` (17 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Johannes Weiner, Michal Hocko, Balbir Singh,
	KAMEZAWA Hiroyuki, Tejun Heo, Li Zefan, cgroups, linux-mm,
	containers

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memcontrol.c  |   18 +++++++++---------
 mm/page_cgroup.c |    2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f72b5e5..4402c2e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -797,7 +797,7 @@ static unsigned long mem_cgroup_nr_lru_pages(struct mem_cgroup *memcg,
 	int nid;
 	u64 total = 0;
 
-	for_each_node_state(nid, N_HIGH_MEMORY)
+	for_each_node_state(nid, N_MEMORY)
 		total += mem_cgroup_node_nr_lru_pages(memcg, nid, lru_mask);
 	return total;
 }
@@ -1549,9 +1549,9 @@ static void mem_cgroup_may_update_nodemask(struct mem_cgroup *memcg)
 		return;
 
 	/* make a nodemask where this memcg uses memory from */
-	memcg->scan_nodes = node_states[N_HIGH_MEMORY];
+	memcg->scan_nodes = node_states[N_MEMORY];
 
-	for_each_node_mask(nid, node_states[N_HIGH_MEMORY]) {
+	for_each_node_mask(nid, node_states[N_MEMORY]) {
 
 		if (!test_mem_cgroup_node_reclaimable(memcg, nid, false))
 			node_clear(nid, memcg->scan_nodes);
@@ -1622,7 +1622,7 @@ static bool mem_cgroup_reclaimable(struct mem_cgroup *memcg, bool noswap)
 	/*
 	 * Check rest of nodes.
 	 */
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		if (node_isset(nid, memcg->scan_nodes))
 			continue;
 		if (test_mem_cgroup_node_reclaimable(memcg, nid, noswap))
@@ -3700,7 +3700,7 @@ move_account:
 		drain_all_stock_sync(memcg);
 		ret = 0;
 		mem_cgroup_start_move(memcg);
-		for_each_node_state(node, N_HIGH_MEMORY) {
+		for_each_node_state(node, N_MEMORY) {
 			for (zid = 0; !ret && zid < MAX_NR_ZONES; zid++) {
 				enum lru_list lru;
 				for_each_lru(lru) {
@@ -4025,7 +4025,7 @@ static int mem_control_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	total_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL);
 	seq_printf(m, "total=%lu", total_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid, LRU_ALL);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
 	}
@@ -4033,7 +4033,7 @@ static int mem_control_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	file_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL_FILE);
 	seq_printf(m, "file=%lu", file_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				LRU_ALL_FILE);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
@@ -4042,7 +4042,7 @@ static int mem_control_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	anon_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL_ANON);
 	seq_printf(m, "anon=%lu", anon_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				LRU_ALL_ANON);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
@@ -4051,7 +4051,7 @@ static int mem_control_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	unevictable_nr = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_UNEVICTABLE));
 	seq_printf(m, "unevictable=%lu", unevictable_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				BIT(LRU_UNEVICTABLE));
 		seq_printf(m, " N%d=%lu", nid, node_nr);
diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index eb750f8..e775239 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -271,7 +271,7 @@ void __init page_cgroup_init(void)
 	if (mem_cgroup_disabled())
 		return;
 
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long start_pfn, end_pfn;
 
 		start_pfn = node_start_pfn(nid);
-- 
1.7.4.4



* [RFC V3 PATCH 09/25] oom: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (7 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 08/25] memcontrol: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 10/25] mm,migrate: " Lai Jiangshan
                     ` (16 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, David Rientjes, KAMEZAWA Hiroyuki,
	Michal Hocko, KOSAKI Motohiro, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
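Note on the check changed below: a minimal userspace sketch, assuming a
nodemask small enough to fit in one unsigned long. The model_* names are
hypothetical and only mirror the shape of the nodes_subset() test in
constrained_alloc(). An allocation is treated as mempolicy-constrained when
its nodemask leaves out some node that has memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical nodemask: one bit per node. */
typedef unsigned long model_nodemask_t;

/* True if every node set in a is also set in b, like nodes_subset(a, b). */
static bool model_nodes_subset(model_nodemask_t a, model_nodemask_t b)
{
	return (a & ~b) == 0;
}

/*
 * True when a nodemask was passed and it excludes at least one node
 * that has memory. memory_nodes stands in for node_states[N_MEMORY].
 */
static bool model_is_mempolicy_constrained(const model_nodemask_t *nodemask,
					   model_nodemask_t memory_nodes)
{
	return nodemask != NULL &&
	       !model_nodes_subset(memory_nodes, *nodemask);
}
```

If memory_nodes were built from N_HIGH_MEMORY, a movable-only node would be
absent from it, so a policy that excludes only that node would not be seen
as a constraint. Building it from N_MEMORY avoids that in this model.
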
 mm/oom_kill.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ac300c9..1e58f12 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -257,7 +257,7 @@ static enum oom_constraint constrained_alloc(struct zonelist *zonelist,
 	 * the page allocator means a mempolicy is in effect.  Cpuset policy
 	 * is enforced in get_page_from_freelist().
 	 */
-	if (nodemask && !nodes_subset(node_states[N_HIGH_MEMORY], *nodemask)) {
+	if (nodemask && !nodes_subset(node_states[N_MEMORY], *nodemask)) {
 		*totalpages = total_swap_pages;
 		for_each_node_mask(nid, *nodemask)
 			*totalpages += node_spanned_pages(nid);
-- 
1.7.4.4



* [RFC V3 PATCH 10/25] mm,migrate: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (8 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 09/25] oom: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 11/25] mempolicy: " Lai Jiangshan
                     ` (15 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Hugh Dickins, Mel Gorman,
	Christoph Lameter, Wang Sheng-Hui, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux.com>
---
 mm/migrate.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index be26d5c..dbe4f86 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1226,7 +1226,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 			if (node < 0 || node >= MAX_NUMNODES)
 				goto out_pm;
 
-			if (!node_state(node, N_HIGH_MEMORY))
+			if (!node_state(node, N_MEMORY))
 				goto out_pm;
 
 			err = -EACCES;
-- 
1.7.4.4



* [RFC V3 PATCH 11/25] mempolicy: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (9 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 10/25] mm,migrate: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 12/25] hugetlb: " Lai Jiangshan
                     ` (14 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Mel Gorman, David Rientjes,
	Rik van Riel, KOSAKI Motohiro, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/mempolicy.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 1d771e4..ad0381d 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -212,9 +212,9 @@ static int mpol_set_nodemask(struct mempolicy *pol,
 	/* if mode is MPOL_DEFAULT, pol is NULL. This is right. */
 	if (pol == NULL)
 		return 0;
-	/* Check N_HIGH_MEMORY */
+	/* Check N_MEMORY */
 	nodes_and(nsc->mask1,
-		  cpuset_current_mems_allowed, node_states[N_HIGH_MEMORY]);
+		  cpuset_current_mems_allowed, node_states[N_MEMORY]);
 
 	VM_BUG_ON(!nodes);
 	if (pol->mode == MPOL_PREFERRED && nodes_empty(*nodes))
@@ -1363,7 +1363,7 @@ SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned long, maxnode,
 		goto out_put;
 	}
 
-	if (!nodes_subset(*new, node_states[N_HIGH_MEMORY])) {
+	if (!nodes_subset(*new, node_states[N_MEMORY])) {
 		err = -EINVAL;
 		goto out_put;
 	}
@@ -2314,7 +2314,7 @@ void __init numa_policy_init(void)
 	 * fall back to the largest node if they're all smaller.
 	 */
 	nodes_clear(interleave_nodes);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long total_pages = node_present_pages(nid);
 
 		/* Preserve the largest node */
@@ -2395,7 +2395,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
 		*nodelist++ = '\0';
 		if (nodelist_parse(nodelist, nodes))
 			goto out;
-		if (!nodes_subset(nodes, node_states[N_HIGH_MEMORY]))
+		if (!nodes_subset(nodes, node_states[N_MEMORY]))
 			goto out;
 	} else
 		nodes_clear(nodes);
@@ -2429,7 +2429,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
 		 * Default to online nodes with memory if no nodelist
 		 */
 		if (!nodelist)
-			nodes = node_states[N_HIGH_MEMORY];
+			nodes = node_states[N_MEMORY];
 		break;
 	case MPOL_LOCAL:
 		/*
-- 
1.7.4.4



* [RFC V3 PATCH 12/25] hugetlb: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (10 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 11/25] mempolicy: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 13/25] vmstat: " Lai Jiangshan
                     ` (13 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Greg Kroah-Hartman, Andrew Morton, Hillf Danton,
	Michal Hocko, KAMEZAWA Hiroyuki, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 drivers/base/node.c |    2 +-
 mm/hugetlb.c        |   24 ++++++++++++------------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 5d7731e..4c3aa7c 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -227,7 +227,7 @@ static node_registration_func_t __hugetlb_unregister_node;
 static inline bool hugetlb_register_node(struct node *node)
 {
 	if (__hugetlb_register_node &&
-			node_state(node->dev.id, N_HIGH_MEMORY)) {
+			node_state(node->dev.id, N_MEMORY)) {
 		__hugetlb_register_node(node);
 		return true;
 	}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e198831..661db47 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1046,7 +1046,7 @@ static void return_unused_surplus_pages(struct hstate *h,
 	 * on-line nodes with memory and will handle the hstate accounting.
 	 */
 	while (nr_pages--) {
-		if (!free_pool_huge_page(h, &node_states[N_HIGH_MEMORY], 1))
+		if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
 			break;
 	}
 }
@@ -1150,14 +1150,14 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 int __weak alloc_bootmem_huge_page(struct hstate *h)
 {
 	struct huge_bootmem_page *m;
-	int nr_nodes = nodes_weight(node_states[N_HIGH_MEMORY]);
+	int nr_nodes = nodes_weight(node_states[N_MEMORY]);
 
 	while (nr_nodes) {
 		void *addr;
 
 		addr = __alloc_bootmem_node_nopanic(
 				NODE_DATA(hstate_next_node_to_alloc(h,
-						&node_states[N_HIGH_MEMORY])),
+						&node_states[N_MEMORY])),
 				huge_page_size(h), huge_page_size(h), 0);
 
 		if (addr) {
@@ -1229,7 +1229,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 			if (!alloc_bootmem_huge_page(h))
 				break;
 		} else if (!alloc_fresh_huge_page(h,
-					 &node_states[N_HIGH_MEMORY]))
+					 &node_states[N_MEMORY]))
 			break;
 	}
 	h->max_huge_pages = i;
@@ -1497,7 +1497,7 @@ static ssize_t nr_hugepages_store_common(bool obey_mempolicy,
 		if (!(obey_mempolicy &&
 				init_nodemask_of_mempolicy(nodes_allowed))) {
 			NODEMASK_FREE(nodes_allowed);
-			nodes_allowed = &node_states[N_HIGH_MEMORY];
+			nodes_allowed = &node_states[N_MEMORY];
 		}
 	} else if (nodes_allowed) {
 		/*
@@ -1507,11 +1507,11 @@ static ssize_t nr_hugepages_store_common(bool obey_mempolicy,
 		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
 		init_nodemask_of_node(nodes_allowed, nid);
 	} else
-		nodes_allowed = &node_states[N_HIGH_MEMORY];
+		nodes_allowed = &node_states[N_MEMORY];
 
 	h->max_huge_pages = set_max_huge_pages(h, count, nodes_allowed);
 
-	if (nodes_allowed != &node_states[N_HIGH_MEMORY])
+	if (nodes_allowed != &node_states[N_MEMORY])
 		NODEMASK_FREE(nodes_allowed);
 
 	return len;
@@ -1812,7 +1812,7 @@ static void hugetlb_register_all_nodes(void)
 {
 	int nid;
 
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		struct node *node = &node_devices[nid];
 		if (node->dev.id == nid)
 			hugetlb_register_node(node);
@@ -1906,8 +1906,8 @@ void __init hugetlb_add_hstate(unsigned order)
 	h->free_huge_pages = 0;
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
-	h->next_nid_to_alloc = first_node(node_states[N_HIGH_MEMORY]);
-	h->next_nid_to_free = first_node(node_states[N_HIGH_MEMORY]);
+	h->next_nid_to_alloc = first_node(node_states[N_MEMORY]);
+	h->next_nid_to_free = first_node(node_states[N_MEMORY]);
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/1024);
 
@@ -1995,11 +1995,11 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 		if (!(obey_mempolicy &&
 			       init_nodemask_of_mempolicy(nodes_allowed))) {
 			NODEMASK_FREE(nodes_allowed);
-			nodes_allowed = &node_states[N_HIGH_MEMORY];
+			nodes_allowed = &node_states[N_MEMORY];
 		}
 		h->max_huge_pages = set_max_huge_pages(h, tmp, nodes_allowed);
 
-		if (nodes_allowed != &node_states[N_HIGH_MEMORY])
+		if (nodes_allowed != &node_states[N_MEMORY])
 			NODEMASK_FREE(nodes_allowed);
 	}
 out:
-- 
1.7.4.4



* [RFC V3 PATCH 13/25] vmstat: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (11 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 12/25] hugetlb: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 14/25] kthread: " Lai Jiangshan
                     ` (12 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, Christoph Lameter,
	KAMEZAWA Hiroyuki, David Rientjes, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux.com>
---
 mm/vmstat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1bbbbd9..aa3da12 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -917,7 +917,7 @@ static int pagetypeinfo_show(struct seq_file *m, void *arg)
 	pg_data_t *pgdat = (pg_data_t *)arg;
 
 	/* check memoryless node */
-	if (!node_state(pgdat->node_id, N_HIGH_MEMORY))
+	if (!node_state(pgdat->node_id, N_MEMORY))
 		return 0;
 
 	seq_printf(m, "Page block order: %d\n", pageblock_order);
@@ -1279,7 +1279,7 @@ static int unusable_show(struct seq_file *m, void *arg)
 	pg_data_t *pgdat = (pg_data_t *)arg;
 
 	/* check memoryless node */
-	if (!node_state(pgdat->node_id, N_HIGH_MEMORY))
+	if (!node_state(pgdat->node_id, N_MEMORY))
 		return 0;
 
 	walk_zones_in_node(m, pgdat, unusable_show_print);
-- 
1.7.4.4



* [RFC V3 PATCH 14/25] kthread: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (12 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 13/25] vmstat: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 15/25] init: " Lai Jiangshan
                     ` (11 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Tejun Heo, Paul Gortmaker,
	Henrique de Moraes Holschuh, Oleg Nesterov

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/kthread.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 3d3de63..4139962 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -280,7 +280,7 @@ int kthreadd(void *unused)
 	set_task_comm(tsk, "kthreadd");
 	ignore_signals(tsk);
 	set_cpus_allowed_ptr(tsk, cpu_all_mask);
-	set_mems_allowed(node_states[N_HIGH_MEMORY]);
+	set_mems_allowed(node_states[N_MEMORY]);
 
 	current->flags |= PF_NOFREEZE;
 
-- 
1.7.4.4



* [RFC V3 PATCH 15/25] init: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (13 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 14/25] kthread: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 16/25] vmscan: " Lai Jiangshan
                     ` (10 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rusty Russell, Ingo Molnar, Peter Zijlstra,
	Jim Cromie, Pawel Moll

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 init/main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/init/main.c b/init/main.c
index 4121d1f..c9317aa 100644
--- a/init/main.c
+++ b/init/main.c
@@ -846,7 +846,7 @@ static int __init kernel_init(void * unused)
 	/*
 	 * init can allocate pages on any node
 	 */
-	set_mems_allowed(node_states[N_HIGH_MEMORY]);
+	set_mems_allowed(node_states[N_MEMORY]);
 	/*
 	 * init can run on any cpu.
 	 */
-- 
1.7.4.4



* [RFC V3 PATCH 16/25] vmscan: use N_MEMORY instead N_HIGH_MEMORY
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (14 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 15/25] init: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 17/25] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
                     ` (9 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Andrew Morton, KAMEZAWA Hiroyuki, Hugh Dickins,
	Minchan Kim, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 mm/vmscan.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 66e4310..1888026 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2921,7 +2921,7 @@ static int __devinit cpu_callback(struct notifier_block *nfb,
 	int nid;
 
 	if (action == CPU_ONLINE || action == CPU_ONLINE_FROZEN) {
-		for_each_node_state(nid, N_HIGH_MEMORY) {
+		for_each_node_state(nid, N_MEMORY) {
 			pg_data_t *pgdat = NODE_DATA(nid);
 			const struct cpumask *mask;
 
@@ -2976,7 +2976,7 @@ static int __init kswapd_init(void)
 	int nid;
 
 	swap_setup();
-	for_each_node_state(nid, N_HIGH_MEMORY)
+	for_each_node_state(nid, N_MEMORY)
  		kswapd_run(nid);
 	hotcpu_notifier(cpu_callback, 0);
 	return 0;
-- 
1.7.4.4



* [RFC V3 PATCH 17/25] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (15 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 16/25] vmscan: " Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 18/25] hotplug: update nodemasks management Lai Jiangshan
                     ` (8 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Tejun Heo, Pekka Enberg, Yinghai Lu, David Rientjes,
	Andrew Morton, Michal Hocko, KAMEZAWA Hiroyuki, Minchan Kim,
	linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle the nodes which have memory, so we should
use N_MEMORY instead.

Since N_MEMORY has been introduced, also update the initialization of
node_states.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
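Note on the node_states initialization changed below: a minimal userspace
sketch of the classification that free_area_init_nodes() and
check_for_memory() perform together. It assumes a configuration in which
N_NORMAL_MEMORY, N_HIGH_MEMORY and N_MEMORY are all distinct, and the
model_* and MZ_* names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical zone indices; they only mimic the kernel's zone ordering. */
enum model_zone { MZ_DMA, MZ_NORMAL, MZ_HIGHMEM, MZ_MOVABLE, MZ_COUNT };

/* Hypothetical stand-ins for the three node state bits of one node. */
struct model_state {
	bool normal_memory;	/* models N_NORMAL_MEMORY */
	bool high_memory;	/* models N_HIGH_MEMORY */
	bool memory;		/* models N_MEMORY */
};

static struct model_state model_classify(const unsigned long present[MZ_COUNT])
{
	struct model_state st = { false, false, false };
	unsigned long total = 0;
	int zt;

	/* Any present page on the node sets the N_MEMORY stand-in. */
	for (zt = 0; zt < MZ_COUNT; zt++)
		total += present[zt];
	st.memory = total != 0;

	/* The first populated zone below MOVABLE decides the other two. */
	for (zt = 0; zt < MZ_MOVABLE; zt++) {
		if (present[zt]) {
			st.high_memory = true;
			if (zt <= MZ_NORMAL)
				st.normal_memory = true;
			break;
		}
	}
	return st;
}
```

In this model a node with pages only in the movable zone ends up with the
N_MEMORY stand-in set and the other two clear, which is the case the
MOVABLE-dedicated node relies on.
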
 arch/x86/mm/init_64.c |    4 +++-
 mm/page_alloc.c       |   40 ++++++++++++++++++++++------------------
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2b6b4a3..005f00c 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -625,7 +625,9 @@ void __init paging_init(void)
 	 *	 numa support is not compiled in, and later node_set_state
 	 *	 will not set it back.
 	 */
-	node_clear_state(0, N_NORMAL_MEMORY);
+	node_clear_state(0, N_MEMORY);
+	if (N_MEMORY != N_NORMAL_MEMORY)
+		node_clear_state(0, N_NORMAL_MEMORY);
 
 	zone_sizes_init();
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9312702..edffc35 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1646,7 +1646,7 @@ bool zone_watermark_ok_safe(struct zone *z, int order, unsigned long mark,
  *
  * If the zonelist cache is present in the passed in zonelist, then
  * returns a pointer to the allowed node mask (either the current
- * tasks mems_allowed, or node_states[N_HIGH_MEMORY].)
+ * tasks mems_allowed, or node_states[N_MEMORY].)
  *
  * If the zonelist cache is not available for this zonelist, does
  * nothing and returns NULL.
@@ -1675,7 +1675,7 @@ static nodemask_t *zlc_setup(struct zonelist *zonelist, int alloc_flags)
 
 	allowednodes = !in_interrupt() && (alloc_flags & ALLOC_CPUSET) ?
 					&cpuset_current_mems_allowed :
-					&node_states[N_HIGH_MEMORY];
+					&node_states[N_MEMORY];
 	return allowednodes;
 }
 
@@ -3070,7 +3070,7 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
 		return node;
 	}
 
-	for_each_node_state(n, N_HIGH_MEMORY) {
+	for_each_node_state(n, N_MEMORY) {
 
 		/* Don't want a node to appear more than once */
 		if (node_isset(n, *used_node_mask))
@@ -3212,7 +3212,7 @@ static int default_zonelist_order(void)
  	 * local memory, NODE_ORDER may be suitable.
          */
 	average_size = total_size /
-				(nodes_weight(node_states[N_HIGH_MEMORY]) + 1);
+				(nodes_weight(node_states[N_MEMORY]) + 1);
 	for_each_online_node(nid) {
 		low_kmem_size = 0;
 		total_size = 0;
@@ -4569,7 +4569,7 @@ unsigned long __init find_min_pfn_with_active_regions(void)
 /*
  * early_calculate_totalpages()
  * Sum pages in active regions for movable zone.
- * Populate N_HIGH_MEMORY for calculating usable_nodes.
+ * Populate N_MEMORY for calculating usable_nodes.
  */
 static unsigned long __init early_calculate_totalpages(void)
 {
@@ -4582,7 +4582,7 @@ static unsigned long __init early_calculate_totalpages(void)
 
 		totalpages += pages;
 		if (pages)
-			node_set_state(nid, N_HIGH_MEMORY);
+			node_set_state(nid, N_MEMORY);
 	}
   	return totalpages;
 }
@@ -4599,9 +4599,9 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	unsigned long usable_startpfn;
 	unsigned long kernelcore_node, kernelcore_remaining;
 	/* save the state before borrow the nodemask */
-	nodemask_t saved_node_state = node_states[N_HIGH_MEMORY];
+	nodemask_t saved_node_state = node_states[N_MEMORY];
 	unsigned long totalpages = early_calculate_totalpages();
-	int usable_nodes = nodes_weight(node_states[N_HIGH_MEMORY]);
+	int usable_nodes = nodes_weight(node_states[N_MEMORY]);
 
 	/*
 	 * If movablecore was specified, calculate what size of
@@ -4636,7 +4636,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 restart:
 	/* Spread kernelcore memory as evenly as possible throughout nodes */
 	kernelcore_node = required_kernelcore / usable_nodes;
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long start_pfn, end_pfn;
 
 		/*
@@ -4728,23 +4728,27 @@ restart:
 
 out:
 	/* restore the node_state */
-	node_states[N_HIGH_MEMORY] = saved_node_state;
+	node_states[N_MEMORY] = saved_node_state;
 }
 
-/* Any regular memory on that node ? */
-static void check_for_regular_memory(pg_data_t *pgdat)
+/* Any regular or high memory on that node ? */
+static void check_for_memory(pg_data_t *pgdat, int nid)
 {
-#ifdef CONFIG_HIGHMEM
 	enum zone_type zone_type;
 
-	for (zone_type = 0; zone_type <= ZONE_NORMAL; zone_type++) {
+	if (N_MEMORY == N_NORMAL_MEMORY)
+		return;
+
+	for (zone_type = 0; zone_type <= ZONE_MOVABLE - 1; zone_type++) {
 		struct zone *zone = &pgdat->node_zones[zone_type];
 		if (zone->present_pages) {
-			node_set_state(zone_to_nid(zone), N_NORMAL_MEMORY);
+			node_set_state(nid, N_HIGH_MEMORY);
+			if (N_NORMAL_MEMORY != N_HIGH_MEMORY &&
+			    zone_type <= ZONE_NORMAL)
+				node_set_state(nid, N_NORMAL_MEMORY);
 			break;
 		}
 	}
-#endif
 }
 
 /**
@@ -4827,8 +4831,8 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 
 		/* Any memory on that node */
 		if (pgdat->node_present_pages)
-			node_set_state(nid, N_HIGH_MEMORY);
-		check_for_regular_memory(pgdat);
+			node_set_state(nid, N_MEMORY);
+		check_for_memory(pgdat, nid);
 	}
 }
 
-- 
1.7.4.4



* [RFC V3 PATCH 18/25] hotplug: update nodemasks management
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (16 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 17/25] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 19/25] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
                     ` (7 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rob Landley, Kay Sievers, Greg Kroah-Hartman,
	Andrew Morton, Paul Gortmaker, Bjorn Helgaas, David Rientjes,
	linux-doc, linux-mm

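Update the nodemask management in memory hotplug for N_MEMORY, and add
status_change_nid_high to struct memory_notify so that changes to
N_HIGH_MEMORY are reported separately from changes to N_MEMORY.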

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
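Note on the online path changed below: a minimal userspace sketch of how the
three status_change_nid* fields could be derived when a range is onlined. It
assumes a configuration in which all three node states are distinct. The
model_* and MZ_* names are hypothetical, and the node state lookups are
passed in as plain booleans instead of calling node_state().

```c
#include <stdbool.h>

/* Hypothetical zone indices; they only mimic the kernel's zone ordering. */
enum model_zone { MZ_DMA, MZ_NORMAL, MZ_HIGHMEM, MZ_MOVABLE, MZ_COUNT };

/* Hypothetical counterpart of the status_change_nid* notifier fields. */
struct model_notify {
	int nid_normal;	/* node gaining N_NORMAL_MEMORY, or -1 */
	int nid_high;	/* node gaining N_HIGH_MEMORY, or -1 */
	int nid;	/* node gaining N_MEMORY, or -1 */
};

/*
 * zone_idx is the zone receiving the new pages. The has_* flags
 * describe the node's states before the pages are onlined.
 */
static struct model_notify model_check_online(int nid, int zone_idx,
					      bool has_normal, bool has_high,
					      bool has_memory)
{
	struct model_notify arg;

	arg.nid_normal = (zone_idx <= MZ_NORMAL && !has_normal) ? nid : -1;
	arg.nid_high = (zone_idx <= MZ_HIGHMEM && !has_high) ? nid : -1;
	arg.nid = !has_memory ? nid : -1;
	return arg;
}
```

Onlining movable pages on a memoryless node reports a change only in the
last field, so a notifier callback that cares about normal or high memory
would see -1 and could skip its setup in this model.
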
 Documentation/memory-hotplug.txt |    5 +++-
 include/linux/memory.h           |    1 +
 mm/memory_hotplug.c              |   49 +++++++++++++++++++++++++++++++++----
 3 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 6e6cbc7..70bc1c7 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -378,6 +378,7 @@ struct memory_notify {
        unsigned long start_pfn;
        unsigned long nr_pages;
        int status_change_nid_normal;
+       int status_change_nid_high;
        int status_change_nid;
 }
 
@@ -385,7 +386,9 @@ start_pfn is start_pfn of online/offline memory.
 nr_pages is # of pages of online/offline memory.
 status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
 is (will be) set/clear, if this is -1, then nodemask status is not changed.
-status_change_nid is set node id when N_HIGH_MEMORY of nodemask is (will be)
+status_change_nid_high is set node id when N_HIGH_MEMORY of nodemask
+is (will be) set/clear, if this is -1, then nodemask status is not changed.
+status_change_nid is set node id when N_MEMORY of nodemask is (will be)
 set/clear. It means a new(memoryless) node gets new memory by online and a
 node loses all memory. If this is -1, then nodemask status is not changed.
 If status_changed_nid* >= 0, callback should create/discard structures for the
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 6b9202b..8089e49 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -54,6 +54,7 @@ struct memory_notify {
 	unsigned long start_pfn;
 	unsigned long nr_pages;
 	int status_change_nid_normal;
+	int status_change_nid_high;
 	int status_change_nid;
 };
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 3438c4a..c2c96a4 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -462,7 +462,7 @@ static void check_nodemasks_changes_online(unsigned long nr_pages,
 	int nid = zone_to_nid(zone);
 	enum zone_type zone_last = ZONE_NORMAL;
 
-	if (N_HIGH_MEMORY == N_NORMAL_MEMORY)
+	if (N_MEMORY == N_NORMAL_MEMORY)
 		zone_last = ZONE_MOVABLE;
 
 	if (zone_idx(zone) <= zone_last && !node_state(nid, N_NORMAL_MEMORY))
@@ -470,7 +470,20 @@ static void check_nodemasks_changes_online(unsigned long nr_pages,
 	else
 		arg->status_change_nid_normal = -1;
 
-	if (!node_state(nid, N_HIGH_MEMORY))
+#ifdef CONFIG_HIGHMEM
+	zone_last = ZONE_HIGHMEM;
+	if (N_MEMORY == N_HIGH_MEMORY)
+		zone_last = ZONE_MOVABLE;
+
+	if (zone_idx(zone) <= zone_last && !node_state(nid, N_HIGH_MEMORY))
+		arg->status_change_nid_high = nid;
+	else
+		arg->status_change_nid_high = -1;
+#else
+	arg->status_change_nid_high = arg->status_change_nid_normal;
+#endif
+
+	if (!node_state(nid, N_MEMORY))
 		arg->status_change_nid = nid;
 	else
 		arg->status_change_nid = -1;
@@ -481,7 +494,10 @@ static void set_nodemasks(int node, struct memory_notify *arg)
 	if (arg->status_change_nid_normal >= 0)
 		node_set_state(node, N_NORMAL_MEMORY);
 
-	node_set_state(node, N_HIGH_MEMORY);
+	if (arg->status_change_nid_high >= 0)
+		node_set_state(node, N_HIGH_MEMORY);
+
+	node_set_state(node, N_MEMORY);
 }
 
 
@@ -899,7 +915,7 @@ static void check_nodemasks_changes_offline(unsigned long nr_pages,
 	unsigned long present_pages = 0;
 	enum zone_type zt, zone_last = ZONE_NORMAL;
 
-	if (N_HIGH_MEMORY == N_NORMAL_MEMORY)
+	if (N_MEMORY == N_NORMAL_MEMORY)
 		zone_last = ZONE_MOVABLE;
 
 	for (zt = 0; zt <= zone_last; zt++)
@@ -909,6 +925,21 @@ static void check_nodemasks_changes_offline(unsigned long nr_pages,
 	else
 		arg->status_change_nid_normal = -1;
 
+#ifdef CONFIG_HIGHMEM
+	zone_last = ZONE_HIGHMEM;
+	if (N_MEMORY == N_HIGH_MEMORY)
+		zone_last = ZONE_MOVABLE;
+
+	for (; zt <= zone_last; zt++)
+		present_pages += pgdat->node_zones[zt].present_pages;
+	if (zone_idx(zone) <= zone_last && nr_pages >= present_pages)
+		arg->status_change_nid_high = zone_to_nid(zone);
+	else
+		arg->status_change_nid_high = -1;
+#else
+	arg->status_change_nid_high = arg->status_change_nid_normal;
+#endif
+
 	zone_last = ZONE_MOVABLE;
 	for (; zt <= zone_last; zt++)
 		present_pages += pgdat->node_zones[zt].present_pages;
@@ -923,11 +954,17 @@ static void clear_nodemasks(int node, struct memory_notify *arg)
 	if (arg->status_change_nid_normal >= 0)
 		node_clear_state(node, N_NORMAL_MEMORY);
 
-	if (N_HIGH_MEMORY == N_NORMAL_MEMORY)
+	if (N_MEMORY == N_NORMAL_MEMORY)
 		return;
 
-	if (arg->status_change_nid >= 0)
+	if (arg->status_change_nid_high >= 0)
 		node_clear_state(node, N_HIGH_MEMORY);
+
+	if (N_MEMORY == N_HIGH_MEMORY)
+		return;
+
+	if (arg->status_change_nid >= 0)
+		node_clear_state(node, N_MEMORY);
 }
 
 static int __ref offline_pages(unsigned long start_pfn,
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 19/25] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (17 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 18/25] hotplug: update nodemasks management Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 20/25] page_alloc: add kernelcore_max_addr Lai Jiangshan
                     ` (6 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Greg Kroah-Hartman, Christoph Lameter,
	Hillf Danton, Andrew Morton, Jan Beulich, Seth Jennings,
	Dan Magenheimer, Michal Hocko, KAMEZAWA Hiroyuki, Minchan Kim,
	linux-mm

Everything is now prepared, so we can actually introduce N_MEMORY.
Add CONFIG_MOVABLE_NODE so that it can be used for movable-dedicated nodes.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 drivers/base/node.c      |    6 ++++++
 include/linux/nodemask.h |    4 ++++
 mm/Kconfig               |    8 ++++++++
 mm/page_alloc.c          |    3 +++
 4 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 4c3aa7c..653b5e2 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -620,6 +620,9 @@ static struct node_attr node_state_attr[] = {
 #ifdef CONFIG_HIGHMEM
 	[N_HIGH_MEMORY] = _NODE_ATTR(has_high_memory, N_HIGH_MEMORY),
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
+#endif
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
 };
 
@@ -630,6 +633,9 @@ static struct attribute *node_state_attrs[] = {
 #ifdef CONFIG_HIGHMEM
 	&node_state_attr[N_HIGH_MEMORY].attr.attr,
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	&node_state_attr[N_MEMORY].attr.attr,
+#endif
 	&node_state_attr[N_CPU].attr.attr,
 	NULL
 };
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index c6ebdc9..4e2cbfa 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -380,7 +380,11 @@ enum node_states {
 #else
 	N_HIGH_MEMORY = N_NORMAL_MEMORY,
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	N_MEMORY,		/* The node has memory(regular, high, movable) */
+#else
 	N_MEMORY = N_HIGH_MEMORY,
+#endif
 	N_CPU,		/* The node has one or more cpus */
 	NR_NODE_STATES
 };
diff --git a/mm/Kconfig b/mm/Kconfig
index 82fed4e..4371c65 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -140,6 +140,14 @@ config ARCH_DISCARD_MEMBLOCK
 config NO_BOOTMEM
 	boolean
 
+config MOVABLE_NODE
+	boolean "Allow a node to have only movable memory"
+	depends on HAVE_MEMBLOCK
+	depends on NO_BOOTMEM
+	depends on X86_64
+	depends on NUMA
+	default y
+
 # eventually, we can have this option just 'select SPARSEMEM'
 config MEMORY_HOTPLUG
 	bool "Allow for memory hot-add"
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index edffc35..03ad63d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -91,6 +91,9 @@ nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
 #ifdef CONFIG_HIGHMEM
 	[N_HIGH_MEMORY] = { { [0] = 1UL } },
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	[N_MEMORY] = { { [0] = 1UL } },
+#endif
 	[N_CPU] = { { [0] = 1UL } },
 #endif	/* NUMA */
 };
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 20/25] page_alloc: add kernelcore_max_addr
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (18 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 19/25] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 21/25] x86: get pg_data_t's memory from other node Lai Jiangshan
                     ` (5 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rob Landley, Andrew Morton, Michal Hocko,
	KAMEZAWA Hiroyuki, Minchan Kim, linux-doc, linux-mm

The current policy for setting up ZONE_MOVABLE with the kernelcore= boot option
doesn't meet our requirements. We need a boot option such as
kernelcore_max_addr=XX to limit the upper address of kernelcore.

Memory above that address will be migratable (movable) and is easier to
offline (it is always ready to be offlined when the system doesn't require
so much memory).

This makes dynamic hot-add/remove of memory easier, makes better use of
memory, and helps THP.

kernelcore_max_addr=, kernelcore= and movablecore= can all be safely specified
at the same time (or any two of them).
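
The address clamping described above can be modelled in user space. This is
only a minimal sketch, not the kernel code: the helper names
effective_kernelcore_max_pfn() and clamp_range_end() are hypothetical, PFNs
are plain unsigned longs, and a 4 KiB page size is assumed.

```c
#include <assert.h>	/* only needed if you add checks around the helpers */
#include <limits.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Hypothetical helper: pick the upper PFN bound for kernelcore.
 * required_max_pfn == 0 means kernelcore_max_addr= was not given.
 */
static unsigned long effective_kernelcore_max_pfn(unsigned long required_max_pfn,
						  unsigned long usable_startpfn)
{
	unsigned long max_pfn;

	if (required_max_pfn)
		max_pfn = required_max_pfn;
	else
		max_pfn = ULONG_MAX >> SKETCH_PAGE_SHIFT;

	/* never place the bound below the first PFN usable for ZONE_MOVABLE */
	return max_pfn > usable_startpfn ? max_pfn : usable_startpfn;
}

/* Hypothetical helper: clamp the end of a memory range to the bound. */
static unsigned long clamp_range_end(unsigned long end_pfn,
				     unsigned long kernelcore_max_pfn)
{
	return end_pfn < kernelcore_max_pfn ? end_pfn : kernelcore_max_pfn;
}
```

With a bound of PFN 0x100000, a range ending at 0x200000 is cut to 0x100000
for kernelcore purposes, and everything above it is left for ZONE_MOVABLE.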

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/kernel-parameters.txt |    9 +++++++++
 mm/page_alloc.c                     |   29 ++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 12783fa..48dff61 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1216,6 +1216,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			use the HighMem zone if it exists, and the Normal
 			zone if it does not.
 
+	kernelcore_max_addr=nn[KMG]	[KNL,X86,IA-64,PPC] This parameter
+			has the same effect as the kernelcore parameter,
+			except that it specifies the upper physical address
+			of the memory range usable by the kernel for
+			non-movable allocations. If both kernelcore and
+			kernelcore_max_addr are specified, this parameter
+			takes priority over kernelcore.
+			See the kernelcore parameter.
+
 	kgdbdbgp=	[KGDB,HW] kgdb over EHCI usb debug port.
 			Format: <Controller#>[,poll interval]
 			The controller # is the number of the ehci usb debug
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 03ad63d..65ac5c9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -204,6 +204,7 @@ static unsigned long __meminitdata dma_reserve;
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 static unsigned long __meminitdata arch_zone_lowest_possible_pfn[MAX_NR_ZONES];
 static unsigned long __meminitdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
+static unsigned long __initdata required_kernelcore_max_pfn;
 static unsigned long __initdata required_kernelcore;
 static unsigned long __initdata required_movablecore;
 static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
@@ -4600,6 +4601,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 {
 	int i, nid;
 	unsigned long usable_startpfn;
+	unsigned long kernelcore_max_pfn;
 	unsigned long kernelcore_node, kernelcore_remaining;
 	/* save the state before borrow the nodemask */
 	nodemask_t saved_node_state = node_states[N_MEMORY];
@@ -4628,6 +4630,9 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 		required_kernelcore = max(required_kernelcore, corepages);
 	}
 
+	if (required_kernelcore_max_pfn && !required_kernelcore)
+		required_kernelcore = totalpages;
+
 	/* If kernelcore was not specified, there is no ZONE_MOVABLE */
 	if (!required_kernelcore)
 		goto out;
@@ -4636,6 +4641,12 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	find_usable_zone_for_movable();
 	usable_startpfn = arch_zone_lowest_possible_pfn[movable_zone];
 
+	if (required_kernelcore_max_pfn)
+		kernelcore_max_pfn = required_kernelcore_max_pfn;
+	else
+		kernelcore_max_pfn = ULONG_MAX >> PAGE_SHIFT;
+	kernelcore_max_pfn = max(kernelcore_max_pfn, usable_startpfn);
+
 restart:
 	/* Spread kernelcore memory as evenly as possible throughout nodes */
 	kernelcore_node = required_kernelcore / usable_nodes;
@@ -4662,8 +4673,12 @@ restart:
 			unsigned long size_pages;
 
 			start_pfn = max(start_pfn, zone_movable_pfn[nid]);
-			if (start_pfn >= end_pfn)
+			end_pfn = min(kernelcore_max_pfn, end_pfn);
+			if (start_pfn >= end_pfn) {
+				if (!zone_movable_pfn[nid])
+					zone_movable_pfn[nid] = start_pfn;
 				continue;
+			}
 
 			/* Account for what is only usable for kernelcore */
 			if (start_pfn < usable_startpfn) {
@@ -4854,6 +4869,18 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
 	return 0;
 }
 
+#ifdef CONFIG_MOVABLE_NODE
+/*
+ * kernelcore_max_addr=addr sets the upper physical address of the memory
+ * range used for allocations that cannot be reclaimed or migrated.
+ */
+static int __init cmdline_parse_kernelcore_max_addr(char *p)
+{
+	return cmdline_parse_core(p, &required_kernelcore_max_pfn);
+}
+early_param("kernelcore_max_addr", cmdline_parse_kernelcore_max_addr);
+#endif
+
 /*
  * kernelcore=size sets the amount of memory for use for allocations that
  * cannot be reclaimed or migrated.
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 21/25] x86: get pg_data_t's memory from other node
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (19 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 20/25] page_alloc: add kernelcore_max_addr Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 22/25] x86: use memblock_set_current_limit() to set memblock.current_limit Lai Jiangshan
                     ` (4 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Yasuaki Ishimatsu, Lai Jiangshan, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Andrew Morton, Rusty Russell, Bjorn Helgaas

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

If the system can create a movable node, in which all memory of the
node is allocated as ZONE_MOVABLE, setup_node_data() cannot allocate
memory for the node's pg_data_t from that node.
So when memblock_alloc_nid() fails, setup_node_data() retries with
memblock_alloc().
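
The fallback described above can be sketched in user space as follows. This
is a rough model under stated assumptions, not the x86 code: the callback
types and the stub allocators are hypothetical stand-ins for
memblock_alloc_nid() and memblock_alloc(), and 0 stands for "allocation
failed". The sketch falls back only when the node-local attempt fails.

```c
#include <assert.h>	/* only needed if you add checks around the helpers */
#include <stddef.h>

/* Hypothetical allocator signatures; a return value of 0 means failure. */
typedef unsigned long (*alloc_nid_fn)(size_t size, int nid);
typedef unsigned long (*alloc_any_fn)(size_t size);

/* Try the node-local allocator first, then any node as a fallback. */
static unsigned long alloc_node_data(size_t size, int nid,
				     alloc_nid_fn try_nid, alloc_any_fn try_any)
{
	unsigned long pa = try_nid(size, nid);

	if (!pa)
		pa = try_any(size);	/* e.g. the node is movable-only */
	return pa;
}

/* Stub allocators for experimenting with the helper (made-up addresses). */
static unsigned long stub_nid_ok(size_t size, int nid)
{
	(void)size;
	return 0x1000UL * (unsigned long)(nid + 1);
}

static unsigned long stub_nid_fail(size_t size, int nid)
{
	(void)size;
	(void)nid;
	return 0;
}

static unsigned long stub_any_ok(size_t size)
{
	(void)size;
	return 0x9000UL;
}

static unsigned long stub_any_fail(size_t size)
{
	(void)size;
	return 0;
}
```

When the node-local stub succeeds, its address is used and the fallback is
never called; when both fail, the caller sees 0 and must give up.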

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 arch/x86/mm/numa.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 2d125be..a86e315 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -223,9 +223,13 @@ static void __init setup_node_data(int nid, u64 start, u64 end)
 		remapped = true;
 	} else {
 		nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
-		if (!nd_pa) {
-			pr_err("Cannot find %zu bytes in node %d\n",
+		if (!nd_pa)
+			printk(KERN_WARNING "Cannot find %zu bytes in node %d\n",
 			       nd_size, nid);
+		nd_pa = memblock_alloc(nd_size, SMP_CACHE_BYTES);
+		if (!nd_pa) {
+			pr_err("Cannot find %zu bytes in other node\n",
+			       nd_size);
 			return;
 		}
 		nd = __va(nd_pa);
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 22/25] x86: use memblock_set_current_limit() to set memblock.current_limit
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (20 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 21/25] x86: get pg_data_t's memory from other node Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 23/25] memblock: limit memory address from memblock Lai Jiangshan
                     ` (3 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Yasuaki Ishimatsu, Lai Jiangshan, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Jarkko Sakkinen, Matt Fleming,
	Andrew Morton

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

memblock.current_limit is set directly even though
memblock_set_current_limit() is provided for this purpose.
Use the helper instead.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 arch/x86/kernel/setup.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f4b9b80..bb9d9f8 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -889,7 +889,7 @@ void __init setup_arch(char **cmdline_p)
 
 	cleanup_highmap();
 
-	memblock.current_limit = get_max_mapped();
+	memblock_set_current_limit(get_max_mapped());
 	memblock_x86_fill();
 
 	/*
@@ -925,7 +925,7 @@ void __init setup_arch(char **cmdline_p)
 		max_low_pfn = max_pfn;
 	}
 #endif
-	memblock.current_limit = get_max_mapped();
+	memblock_set_current_limit(get_max_mapped());
 	dma_contiguous_reserve(0);
 
 	/*
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 23/25] memblock: limit memory address from memblock
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (21 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 22/25] x86: use memblock_set_current_limit() to set memblock.current_limit Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 24/25] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
                     ` (2 subsequent siblings)
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Yasuaki Ishimatsu, Lai Jiangshan, Tejun Heo, Andrew Morton,
	Yinghai Lu, Sam Ravnborg, Ingo Molnar, Gavin Shan, Michal Hocko,
	KAMEZAWA Hiroyuki, Minchan Kim, linux-mm

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Setting kernelcore_max_addr means that all memory above the address
given by the boot parameter is allocated as ZONE_MOVABLE, so memory
allocated by memblock should also be limited by the parameter.

This patch limits the memory that memblock hands out accordingly.
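
The limit selection can be sketched in isolation. This is a minimal user-space
model, assuming a 64-bit physical address type; effective_current_limit() is a
hypothetical name, and a memblock_limit of 0 is taken to mean "no
kernelcore_max_addr given".

```c
#include <assert.h>	/* only needed if you add checks around the helper */
#include <stdint.h>

typedef uint64_t sketch_phys_addr_t;	/* assumption: 64-bit physical addresses */

/*
 * Hypothetical helper: the limit that ends up in effect when a caller
 * requests `requested` and the boot parameter imposes `memblock_limit`.
 */
static sketch_phys_addr_t effective_current_limit(sketch_phys_addr_t requested,
						  sketch_phys_addr_t memblock_limit)
{
	/* no boot-time limit, or the request is already below it */
	if (!memblock_limit || memblock_limit > requested)
		return requested;

	/* otherwise the boot-time limit wins */
	return memblock_limit;
}
```

In other words, the effective limit is the smaller of the two values whenever
a boot-time limit exists.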

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 include/linux/memblock.h |    1 +
 mm/memblock.c            |    5 ++++-
 mm/page_alloc.c          |    6 +++++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 19dc455..f2977ae 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -42,6 +42,7 @@ struct memblock {
 
 extern struct memblock memblock;
 extern int memblock_debug;
+extern phys_addr_t memblock_limit;
 
 #define memblock_dbg(fmt, ...) \
 	if (memblock_debug) printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
diff --git a/mm/memblock.c b/mm/memblock.c
index 5cc6731..663b805 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -931,7 +931,10 @@ int __init_memblock memblock_is_region_reserved(phys_addr_t base, phys_addr_t si
 
 void __init_memblock memblock_set_current_limit(phys_addr_t limit)
 {
-	memblock.current_limit = limit;
+	if (!memblock_limit || (memblock_limit > limit))
+		memblock.current_limit = limit;
+	else
+		memblock.current_limit = memblock_limit;
 }
 
 static void __init_memblock memblock_dump(struct memblock_type *type, char *name)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 65ac5c9..c4d3aa0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -209,6 +209,8 @@ static unsigned long __initdata required_kernelcore;
 static unsigned long __initdata required_movablecore;
 static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
 
+phys_addr_t memblock_limit;
+
 /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
 int movable_zone;
 EXPORT_SYMBOL(movable_zone);
@@ -4876,7 +4878,9 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
  */
 static int __init cmdline_parse_kernelcore_max_addr(char *p)
 {
-	return cmdline_parse_core(p, &required_kernelcore_max_pfn);
+	cmdline_parse_core(p, &required_kernelcore_max_pfn);
+	memblock_limit = required_kernelcore_max_pfn << PAGE_SHIFT;
+	return 0;
 }
 early_param("kernelcore_max_addr", cmdline_parse_kernelcore_max_addr);
 #endif
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 24/25] memblock: compare current_limit with end variable at memblock_find_in_range_node()
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (22 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 23/25] memblock: limit memory address from memblock Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-06  9:23   ` [RFC V3 PATCH 25/25] mm, memory-hotplug: add online_movable and online_kernel Lai Jiangshan
  2012-08-23  8:22   ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Yasuaki Ishimatsu
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Yasuaki Ishimatsu, Lai Jiangshan, Tejun Heo, Andrew Morton,
	Ingo Molnar, Gavin Shan, linux-mm

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

memblock_find_in_range_node() does not compare memblock.current_limit
with the end argument. Thus, even if memblock.current_limit is smaller
than end, the function can return a memory address that is above
memblock.current_limit.

This patch adds the check to memblock_find_in_range_node().
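
The added check amounts to the following clamp. This is a sketch only:
clamp_search_end() is a hypothetical helper, and SKETCH_ALLOC_ACCESSIBLE is an
assumed stand-in for the MEMBLOCK_ALLOC_ACCESSIBLE sentinel.

```c
#include <assert.h>	/* only needed if you add checks around the helper */
#include <stdint.h>

#define SKETCH_ALLOC_ACCESSIBLE 0	/* assumed sentinel: "use the current limit" */

/* Hypothetical helper: bound the end of a search range by the current limit. */
static uint64_t clamp_search_end(uint64_t end, uint64_t current_limit)
{
	if (end == SKETCH_ALLOC_ACCESSIBLE || end > current_limit)
		end = current_limit;
	return end;
}
```

An explicit end below the limit is left untouched; only the sentinel and
values above the limit are replaced.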

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memblock.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 663b805..ce7fcb6 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -99,11 +99,12 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 					phys_addr_t align, int nid)
 {
 	phys_addr_t this_start, this_end, cand;
+	phys_addr_t current_limit = memblock.current_limit;
 	u64 i;
 
 	/* pump up @end */
-	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
-		end = memblock.current_limit;
+	if ((end == MEMBLOCK_ALLOC_ACCESSIBLE) || (end > current_limit))
+		end = current_limit;
 
 	/* avoid allocating the first page */
 	start = max_t(phys_addr_t, start, PAGE_SIZE);
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [RFC V3 PATCH 25/25] mm, memory-hotplug: add online_movable and online_kernel
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (23 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 24/25] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
@ 2012-08-06  9:23   ` Lai Jiangshan
  2012-08-23  8:22   ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Yasuaki Ishimatsu
  25 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2012-08-06  9:23 UTC (permalink / raw)
  To: Mel Gorman, linux-kernel
  Cc: Lai Jiangshan, Rob Landley, Greg Kroah-Hartman, Paul Gortmaker,
	Andrew Morton, Bjorn Helgaas, David Rientjes, linux-doc,
	linux-mm

When a memory block/memory section is onlined with "online_movable", the
kernel will not hold direct references to the pages of the memory block,
so that memory can be removed at any time when needed.

This makes dynamic hot-add/remove of memory easier, makes better use of
memory, and helps THP.

Current constraint: only a memory block that is adjacent to ZONE_MOVABLE
can be onlined from ZONE_NORMAL to ZONE_MOVABLE.
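
The adjacency constraint can be illustrated with a small user-space model of
two neighbouring zone spans. This is a sketch under assumptions, not the
kernel implementation: struct sketch_span and move_range_right() are
hypothetical, locking and per-page fixups are left out, and the handling of an
empty right-hand zone is an assumption of this sketch.

```c
#include <assert.h>	/* only needed if you add checks around the helper */

/* Hypothetical stand-in for a zone's span: [start_pfn, start_pfn + spanned). */
struct sketch_span {
	unsigned long start_pfn;
	unsigned long spanned_pages;
};

/*
 * Move [start_pfn, end_pfn) from the tail of z1 to the head of z2.
 * Returns 0 on success, -1 if the range is not the rightmost part of z1.
 */
static int move_range_right(struct sketch_span *z1, struct sketch_span *z2,
			    unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long z1_end = z1->start_pfn + z1->spanned_pages;
	unsigned long z2_end;

	/* the range must not start below z1 */
	if (z1->start_pfn > start_pfn)
		return -1;
	/* the range must reach the right edge of z1 */
	if (z1_end > end_pfn)
		return -1;
	/* the range must overlap z1 */
	if (start_pfn >= z1_end)
		return -1;

	/* assumption of this sketch: an empty z2 simply becomes the range */
	if (z2->spanned_pages)
		z2_end = z2->start_pfn + z2->spanned_pages;
	else
		z2_end = end_pfn;

	z1->spanned_pages = start_pfn - z1->start_pfn;
	z2->start_pfn = start_pfn;
	z2->spanned_pages = z2_end - start_pfn;
	return 0;
}
```

A block in the middle of the left zone is rejected because moving it would
leave a hole between the two spans.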

For the opposite onlining behavior, we also introduce "online_kernel", which
changes a memory block from ZONE_MOVABLE to ZONE_NORMAL when it is onlined.
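
How the four state strings might be told apart can be sketched as below. Note
that this sketch uses exact matching with an optional trailing newline, which
is a different technique from the prefix comparison with strncmp() in the
patch; the enum values and function names are hypothetical.

```c
#include <assert.h>	/* only needed if you add checks around the helpers */
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum sketch_action {
	SKETCH_ONLINE_KEEP,
	SKETCH_ONLINE_KERNEL,
	SKETCH_ONLINE_MOVABLE,
	SKETCH_OFFLINE,
	SKETCH_INVALID,
};

/* Exact match of buf[0..count) against word, ignoring one trailing '\n'. */
static int word_matches(const char *buf, size_t count, const char *word)
{
	size_t len = strlen(word);

	if (count > 0 && buf[count - 1] == '\n')
		count--;
	return count == len && memcmp(buf, word, len) == 0;
}

/* Map the text written to the 'state' file to an action. */
static enum sketch_action parse_state(const char *buf, size_t count)
{
	if (word_matches(buf, count, "online_kernel"))
		return SKETCH_ONLINE_KERNEL;
	if (word_matches(buf, count, "online_movable"))
		return SKETCH_ONLINE_MOVABLE;
	if (word_matches(buf, count, "online"))
		return SKETCH_ONLINE_KEEP;
	if (word_matches(buf, count, "offline"))
		return SKETCH_OFFLINE;
	return SKETCH_INVALID;
}
```

Because the match is exact, the order of the comparisons does not matter here,
whereas a prefix comparison has to test the longer names before "online".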

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/memory-hotplug.txt |   14 +++++-
 drivers/base/memory.c            |   19 +++++---
 include/linux/memory_hotplug.h   |   13 +++++-
 mm/memory_hotplug.c              |  101 +++++++++++++++++++++++++++++++++++++-
 4 files changed, 137 insertions(+), 10 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index 70bc1c7..8e5eacb 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -161,7 +161,8 @@ a recent addition and not present on older kernels.
 		    in the memory block.
 'state'           : read-write
                     at read:  contains online/offline state of memory.
-                    at write: user can specify "online", "offline" command
+                    at write: user can specify "online_kernel",
+                    "online_movable", "online", "offline" command
                     which will be performed on al sections in the block.
 'phys_device'     : read-only: designed to show the name of physical memory
                     device.  This is not well implemented now.
@@ -255,6 +256,17 @@ For onlining, you have to write "online" to the section's state file as:
 
 % echo online > /sys/devices/system/memory/memoryXXX/state
 
+This onlining will not change the ZONE type of the target memory section.
+If the memory section is in ZONE_NORMAL, you can change it to ZONE_MOVABLE:
+
+% echo online_movable > /sys/devices/system/memory/memoryXXX/state
+(NOTE: current limit: this memory section must be adjacent to ZONE_MOVABLE)
+
+And if the memory section is in ZONE_MOVABLE, you can change it to ZONE_NORMAL:
+
+% echo online_kernel > /sys/devices/system/memory/memoryXXX/state
+(NOTE: current limit: this memory section must be adjacent to ZONE_NORMAL)
+
 After this, section memoryXXX's state will be 'online' and the amount of
 available memory will be increased.
 
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 7dda4f7..1ad2f48 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -246,7 +246,7 @@ static bool pages_correctly_reserved(unsigned long start_pfn,
  * OK to have direct references to sparsemem variables in here.
  */
 static int
-memory_block_action(unsigned long phys_index, unsigned long action)
+memory_block_action(unsigned long phys_index, unsigned long action, int online_type)
 {
 	unsigned long start_pfn, start_paddr;
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
@@ -262,7 +262,7 @@ memory_block_action(unsigned long phys_index, unsigned long action)
 			if (!pages_correctly_reserved(start_pfn, nr_pages))
 				return -EBUSY;
 
-			ret = online_pages(start_pfn, nr_pages);
+			ret = online_pages(start_pfn, nr_pages, online_type);
 			break;
 		case MEM_OFFLINE:
 			start_paddr = page_to_pfn(first_page) << PAGE_SHIFT;
@@ -279,7 +279,8 @@ memory_block_action(unsigned long phys_index, unsigned long action)
 }
 
 static int memory_block_change_state(struct memory_block *mem,
-		unsigned long to_state, unsigned long from_state_req)
+		unsigned long to_state, unsigned long from_state_req,
+		int online_type)
 {
 	int ret = 0;
 
@@ -293,7 +294,7 @@ static int memory_block_change_state(struct memory_block *mem,
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
 
-	ret = memory_block_action(mem->start_section_nr, to_state);
+	ret = memory_block_action(mem->start_section_nr, to_state, online_type);
 
 	if (ret) {
 		mem->state = from_state_req;
@@ -325,10 +326,14 @@ store_mem_state(struct device *dev,
 
 	mem = container_of(dev, struct memory_block, dev);
 
-	if (!strncmp(buf, "online", min((int)count, 6)))
-		ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
+	if (!strncmp(buf, "online_kernel", min((int)count, 13)))
+		ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE, ONLINE_KERNEL);
+	else if (!strncmp(buf, "online_movable", min((int)count, 14)))
+		ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE, ONLINE_MOVABLE);
+	else if (!strncmp(buf, "online", min((int)count, 6)))
+		ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE, ONLINE_KEEP);
 	else if(!strncmp(buf, "offline", min((int)count, 7)))
-		ret = memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
+		ret = memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
 
 	if (ret)
 		return ret;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 910550f..047cd1d 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -25,6 +25,13 @@ enum {
 	MEMORY_HOTPLUG_MAX_BOOTMEM_TYPE = NODE_INFO,
 };
 
+/* Types for controlling the zone type of onlined memory */
+enum {
+	ONLINE_KEEP,
+	ONLINE_KERNEL,
+	ONLINE_MOVABLE,
+};
+
 /*
  * pgdat resizing functions
  */
@@ -45,6 +52,10 @@ void pgdat_resize_init(struct pglist_data *pgdat)
 }
 /*
  * Zone resizing functions
+ *
+ * Note: any attempt to resize a zone should hold both pgdat_resize_lock()
+ * and zone_span_writelock(). This ensures that the size of a zone
+ * can't be changed while pgdat_resize_lock() is held.
  */
 static inline unsigned zone_span_seqbegin(struct zone *zone)
 {
@@ -70,7 +81,7 @@ extern int zone_grow_free_lists(struct zone *zone, unsigned long new_nr_pages);
 extern int zone_grow_waitqueues(struct zone *zone, unsigned long nr_pages);
 extern int add_one_highpage(struct page *page, int pfn, int bad_ppro);
 /* VM interface that may be used by firmware interface */
-extern int online_pages(unsigned long, unsigned long);
+extern int online_pages(unsigned long, unsigned long, int);
 extern void __offline_isolated_pages(unsigned long, unsigned long);
 
 typedef void (*online_page_callback_t)(struct page *page);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c2c96a4..4e1db0a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -210,6 +210,89 @@ static void grow_zone_span(struct zone *zone, unsigned long start_pfn,
 	zone_span_writeunlock(zone);
 }
 
+static void resize_zone(struct zone *zone, unsigned long start_pfn,
+		unsigned long end_pfn)
+{
+
+	zone_span_writelock(zone);
+
+	zone->zone_start_pfn = start_pfn;
+	zone->spanned_pages = end_pfn - start_pfn;
+
+	zone_span_writeunlock(zone);
+}
+
+static void fix_zone_id(struct zone *zone, unsigned long start_pfn,
+		unsigned long end_pfn)
+{
+	enum zone_type zid = zone_idx(zone);
+	int nid = zone->zone_pgdat->node_id;
+	unsigned long pfn;
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn++)
+		set_page_links(pfn_to_page(pfn), zid, nid, pfn);
+}
+
+static int move_pfn_range_left(struct zone *z1, struct zone *z2,
+		unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long flags;
+
+	pgdat_resize_lock(z1->zone_pgdat, &flags);
+
+	/* can't move pfns which are higher than @z2 */
+	if (end_pfn > z2->zone_start_pfn + z2->spanned_pages)
+		goto out_fail;
+	/* the part moved out must be the leftmost part of @z2 */
+	if (start_pfn > z2->zone_start_pfn)
+		goto out_fail;
+	/* must be included in or overlap @z2 */
+	if (end_pfn <= z2->zone_start_pfn)
+		goto out_fail;
+
+	resize_zone(z1, z1->zone_start_pfn, end_pfn);
+	resize_zone(z2, end_pfn, z2->zone_start_pfn + z2->spanned_pages);
+
+	pgdat_resize_unlock(z1->zone_pgdat, &flags);
+
+	fix_zone_id(z1, start_pfn, end_pfn);
+
+	return 0;
+out_fail:
+	pgdat_resize_unlock(z1->zone_pgdat, &flags);
+	return -1;
+}
+
+static int move_pfn_range_right(struct zone *z1, struct zone *z2,
+		unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long flags;
+
+	pgdat_resize_lock(z1->zone_pgdat, &flags);
+
+	/* can't move pfns which are lower than @z1 */
+	if (z1->zone_start_pfn > start_pfn)
+		goto out_fail;
+	/* the part moved out must be the rightmost part of @z1 */
+	if (z1->zone_start_pfn + z1->spanned_pages >  end_pfn)
+		goto out_fail;
+	/* must be included in or overlap @z1 */
+	if (start_pfn >= z1->zone_start_pfn + z1->spanned_pages)
+		goto out_fail;
+
+	resize_zone(z1, z1->zone_start_pfn, start_pfn);
+	resize_zone(z2, start_pfn, z2->zone_start_pfn + z2->spanned_pages);
+
+	pgdat_resize_unlock(z1->zone_pgdat, &flags);
+
+	fix_zone_id(z2, start_pfn, end_pfn);
+
+	return 0;
+out_fail:
+	pgdat_resize_unlock(z1->zone_pgdat, &flags);
+	return -1;
+}
+
 static void grow_pgdat_span(struct pglist_data *pgdat, unsigned long start_pfn,
 			    unsigned long end_pfn)
 {
@@ -501,7 +584,7 @@ static void set_nodemasks(int node, struct memory_notify *arg)
 }
 
 
-int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
+int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_type)
 {
 	unsigned long onlined_pages = 0;
 	struct zone *zone;
@@ -518,6 +601,22 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
 	 */
 	zone = page_zone(pfn_to_page(pfn));
 
+	if (online_type == ONLINE_KERNEL && zone_idx(zone) == ZONE_MOVABLE) {
+		if (move_pfn_range_left(zone - 1, zone, pfn, pfn + nr_pages)) {
+			unlock_memory_hotplug();
+			return -1;
+		}
+	}
+	if (online_type == ONLINE_MOVABLE && zone_idx(zone) == ZONE_MOVABLE - 1) {
+		if (move_pfn_range_right(zone, zone + 1, pfn, pfn + nr_pages)) {
+			unlock_memory_hotplug();
+			return -1;
+		}
+	}
+
+	/* The code above may have changed the zone of the pfn range */
+	zone = page_zone(pfn_to_page(pfn));
+
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
 	check_nodemasks_changes_online(nr_pages, zone, &arg);
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug
  2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
                     ` (24 preceding siblings ...)
  2012-08-06  9:23   ` [RFC V3 PATCH 25/25] mm, memory-hotplug: add online_movable and online_kernel Lai Jiangshan
@ 2012-08-23  8:22   ` Yasuaki Ishimatsu
  25 siblings, 0 replies; 54+ messages in thread
From: Yasuaki Ishimatsu @ 2012-08-23  8:22 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Mel Gorman, linux-kernel

Hi Lai,

Sorry for the late reply.
I'm trying to apply your patchset to linux-3.6-rc3,
but it fails to apply.
Which kernel is your patchset based on?

Thanks,
Yasuaki Ishimatsu


2012/08/06 18:22, Lai Jiangshan wrote:
> 	A) Introduction:
> 
> This patchset adds MOVABLE-dedicated nodes and online_movable to memory management.
> 
> It is used for anti-fragmentation (hugepages, high-order allocations...) and
> hot-removal of memory (virtualization, power saving, moving memory between systems
> to make better use of memory).
> 
> 	B) User Interface:
> 
> When users (administrators of big systems) need to configure some node/memory as MOVABLE:
> 	1 Use kernelcore_max_addr=XX at boot
> 	2 Use the online_movable hotplug action at run time
> We may introduce some more convenient interfaces, such as
> 	movable_node=NODE_LIST boot option.
> 
> 	C) Patches
> 
> Patch1-3      Fix problems in the current code (all related to hotplug)
> Patch4        Clean up node_state_attr
> Patch5        Introduce N_MEMORY
> Patch6-17     Use N_MEMORY instead of N_HIGH_MEMORY.
>                The patches are separated by subsystem;
>                *these conversions were (and must be) checked carefully*.
>                Patch17 also changes the node_states initialization
> Patch18       Update nodemask management for hotplug
> Patch19       Add config to allow MOVABLE-dedicated nodes
> Patch20-24    Add kernelcore_max_addr
> Patch25       Add online_movable and online_kernel
> 
> 
> 	D) changes
> change V3-V2:
> 	Proper nodemask management
> 
> change V2-V1:
> 
> The original V1 patchset of MOVABLE-dedicated node is here:
> http://comments.gmane.org/gmane.linux.kernel.mm/78122
> 
> The new V2 adds N_MEMORY and a notion of "MOVABLE-dedicated node",
> and fixes some related problems.
> 
> The original V1 patchset of "add online_movable" is here:
> https://lkml.org/lkml/2012/7/4/145
> 
> The new V2 discards the MIGRATE_HOTREMOVE approach and uses a more straightforward
> implementation (only 1 patch).
> 
> 
> Lai Jiangshan (21):
>    page_alloc.c: don't subtract unrelated memmap from zone's present
>      pages
>    memory_hotplug: fix missing nodemask management
>    slub, hotplug: ignore unrelated node's hot-adding and hot-removing
>    node: cleanup node_state_attr
>    node_states: introduce N_MEMORY
>    cpuset: use N_MEMORY instead N_HIGH_MEMORY
>    procfs: use N_MEMORY instead N_HIGH_MEMORY
>    memcontrol: use N_MEMORY instead N_HIGH_MEMORY
>    oom: use N_MEMORY instead N_HIGH_MEMORY
>    mm,migrate: use N_MEMORY instead N_HIGH_MEMORY
>    mempolicy: use N_MEMORY instead N_HIGH_MEMORY
>    hugetlb: use N_MEMORY instead N_HIGH_MEMORY
>    vmstat: use N_MEMORY instead N_HIGH_MEMORY
>    kthread: use N_MEMORY instead N_HIGH_MEMORY
>    init: use N_MEMORY instead N_HIGH_MEMORY
>    vmscan: use N_MEMORY instead N_HIGH_MEMORY
>    page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states
>      initialization
>    hotplug: update nodemasks management
>    numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
>    page_alloc: add kernelcore_max_addr
>    mm, memory-hotplug: add online_movable and online_kernel
> 
> Yasuaki Ishimatsu (4):
>    x86: get pg_data_t's memory from other node
>    x86: use memblock_set_current_limit() to set memblock.current_limit
>    memblock: limit memory address from memblock
>    memblock: compare current_limit with end variable at
>      memblock_find_in_range_node()
> 
>   Documentation/cgroups/cpusets.txt   |    2 +-
>   Documentation/kernel-parameters.txt |    9 ++
>   Documentation/memory-hotplug.txt    |   24 +++-
>   arch/x86/kernel/setup.c             |    4 +-
>   arch/x86/mm/init_64.c               |    4 +-
>   arch/x86/mm/numa.c                  |    8 +-
>   drivers/base/memory.c               |   19 ++-
>   drivers/base/node.c                 |   28 +++--
>   fs/proc/kcore.c                     |    2 +-
>   fs/proc/task_mmu.c                  |    4 +-
>   include/linux/cpuset.h              |    2 +-
>   include/linux/memblock.h            |    1 +
>   include/linux/memory.h              |    2 +
>   include/linux/memory_hotplug.h      |   13 ++-
>   include/linux/nodemask.h            |    5 +
>   init/main.c                         |    2 +-
>   kernel/cpuset.c                     |   32 +++---
>   kernel/kthread.c                    |    2 +-
>   mm/Kconfig                          |    8 ++
>   mm/hugetlb.c                        |   24 ++--
>   mm/memblock.c                       |   10 +-
>   mm/memcontrol.c                     |   18 ++--
>   mm/memory_hotplug.c                 |  232 ++++++++++++++++++++++++++++++++---
>   mm/mempolicy.c                      |   12 +-
>   mm/migrate.c                        |    2 +-
>   mm/oom_kill.c                       |    2 +-
>   mm/page_alloc.c                     |   96 +++++++++------
>   mm/page_cgroup.c                    |    2 +-
>   mm/slub.c                           |    4 +-
>   mm/vmscan.c                         |    4 +-
>   mm/vmstat.c                         |    4 +-
>   31 files changed, 437 insertions(+), 144 deletions(-)
> 




Thread overview: 54+ messages
2012-08-02  6:01 [RFC PATCH 00/23 V2] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 01/23 V2] node_states: introduce N_MEMORY Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 02/23 V2] cpuset: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 03/23 V2] procfs: " Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 04/23 V2] oom: " Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 05/23 V2] mm,migrate: " Lai Jiangshan
2012-08-02 16:09   ` Christoph Lameter
2012-08-02  6:01 ` [RFC PATCH 06/23 V2] mempolicy: " Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 07/23 V2] memcontrol: " Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 08/23 V2] hugetlb: " Lai Jiangshan
2012-08-04 14:02   ` Hillf Danton
2012-08-02  6:01 ` [RFC PATCH 09/23 V2] vmstat: " Lai Jiangshan
2012-08-02 16:09   ` Christoph Lameter
2012-08-02  6:01 ` [RFC PATCH 10/23 V2] kthread: " Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 11/23 V2] init: " Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 12/23 V2] vmscan: " Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 13/23 V2] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 14/23 V2] slub, hotplug: ignore unrelated node's hot-adding and hot-removing Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 15/23 V2] memory_hotplug: fix missing nodemask management Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 16/23 V2] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 17/23 V2] page_alloc.c: don't subtract unrelated memmap from zone's present pages Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 18/23 V2] page_alloc: add kernelcore_max_addr Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 19/23 V2] x86: get pg_data_t's memory from other node Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 20/23 V2] x86: use memblock_set_current_limit() to set memblock.current_limit Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 21/23 V2] memblock: limit memory address from memblock Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 22/23 V2] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
2012-08-02  6:01 ` [RFC PATCH 23/23 V2] mm, memory-hotplug: add online_movable Lai Jiangshan
2012-08-06  9:22 ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Lai Jiangshan
2012-08-06  9:22   ` [RFC V3 PATCH 01/25] page_alloc.c: don't subtract unrelated memmap from zone's present pages Lai Jiangshan
2012-08-06  9:22   ` [RFC V3 PATCH 02/25] memory_hotplug: fix missing nodemask management Lai Jiangshan
2012-08-06  9:22   ` [RFC V3 PATCH 03/25] slub, hotplug: ignore unrelated node's hot-adding and hot-removing Lai Jiangshan
2012-08-06  9:22   ` [RFC V3 PATCH 04/25] node: cleanup node_state_attr Lai Jiangshan
2012-08-06  9:22   ` [RFC V3 PATCH 05/25] node_states: introduce N_MEMORY Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 06/25] cpuset: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 07/25] procfs: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 08/25] memcontrol: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 09/25] oom: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 10/25] mm,migrate: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 11/25] mempolicy: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 12/25] hugetlb: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 13/25] vmstat: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 14/25] kthread: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 15/25] init: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 16/25] vmscan: " Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 17/25] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 18/25] hotplug: update nodemasks management Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 19/25] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 20/25] page_alloc: add kernelcore_max_addr Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 21/25] x86: get pg_data_t's memory from other node Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 22/25] x86: use memblock_set_current_limit() to set memblock.current_limit Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 23/25] memblock: limit memory address from memblock Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 24/25] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
2012-08-06  9:23   ` [RFC V3 PATCH 25/25] mm, memory-hotplug: add online_movable and online_kernel Lai Jiangshan
2012-08-23  8:22   ` [RFC V3 PATCH 00/25] memory,numa: introduce MOVABLE-dedicated node and online_movable for hotplug Yasuaki Ishimatsu
