From: Yang Shi <yang.shi@linux.alibaba.com>
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com,
	hannes@cmpxchg.org, akpm@linux-foundation.org,
	dave.hansen@intel.com, keith.busch@intel.com,
	dan.j.williams@intel.com, fengguang.wu@intel.com,
	fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [v3 PATCH 2/9] mm: Introduce migrate target nodemask
Date: Fri, 14 Jun 2019 07:29:30 +0800
Message-ID: <1560468577-101178-3-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>

As more memory types are invented, a system may have a heterogeneous
memory hierarchy, e.g. DRAM and PMEM.  Some of these types are cheaper
and slower than DRAM and may be good candidates for secondary memory
that stores data which is not recently or frequently used.

Introduce the "migrate target" nodemask for such memory nodes.  A
migrate target can be any memory type that is cheaper and/or slower
than DRAM.  Currently PMEM is one such memory type.

Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
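For illustration only, a minimal userspace sketch of the bookkeeping this
patch adds: a node state is one bit per node in a per-state mask, and the
SRAT hunk below sets the N_MIGRATE_TARGET bit when a memory affinity entry
carries the non-volatile flag.  All model_* and MODEL_* names are
hypothetical stand-ins (the kernel uses nodemask_t and node_set_state()),
the 64-node limit is an assumption so that one word holds a mask, and the
flag values are assumed to match ACPI_SRAT_MEM_ENABLED and
ACPI_SRAT_MEM_NON_VOLATILE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; not kernel code. */
#define MODEL_MAX_NODES			64	/* assumption: one word per mask */
#define MODEL_SRAT_MEM_ENABLED		(1u << 0)
#define MODEL_SRAT_MEM_NON_VOLATILE	(1u << 2)

enum model_node_state {
	MODEL_N_MEMORY,
	MODEL_N_MIGRATE_TARGET,
	MODEL_NR_NODE_STATES
};

struct model_node_states {
	uint64_t mask[MODEL_NR_NODE_STATES];	/* bit N = node N */
};

/*
 * Mirrors the mm/page_alloc.c hunk: with NUMA enabled, every listed
 * state, including the new one, starts with only node 0 set.
 */
#define MODEL_NODE_STATES_INIT \
	{ .mask = { [MODEL_N_MEMORY] = 1, [MODEL_N_MIGRATE_TARGET] = 1 } }

static void model_node_set_state(struct model_node_states *s,
				 unsigned int nid, enum model_node_state st)
{
	assert(nid < MODEL_MAX_NODES);
	s->mask[st] |= UINT64_C(1) << nid;
}

static bool model_node_state(const struct model_node_states *s,
			     unsigned int nid, enum model_node_state st)
{
	assert(nid < MODEL_MAX_NODES);
	return (s->mask[st] >> nid) & 1;
}

/* Mirrors the drivers/acpi/numa.c hunk: non-volatile memory is a target. */
static void model_srat_memory_affinity(struct model_node_states *s,
				       unsigned int nid, uint32_t srat_flags)
{
	if (srat_flags & MODEL_SRAT_MEM_NON_VOLATILE)
		model_node_set_state(s, nid, MODEL_N_MIGRATE_TARGET);
}
```

In this model, feeding an entry with the non-volatile flag for node 2 marks
node 2 as a migrate target, while an entry for node 1 with only the enabled
flag leaves node 1 unmarked.  Node 0 stays marked because of the initial
mask.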
 drivers/acpi/numa.c      | 12 ++++++++++++
 drivers/base/node.c      |  2 ++
 include/linux/nodemask.h |  1 +
 mm/page_alloc.c          |  1 +
 4 files changed, 16 insertions(+)

diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 3099583..f75adba 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -296,6 +296,18 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
 		goto out_err_bad_srat;
 	}
 
+	/*
+	 * The system may have a memory hierarchy in which some memory is
+	 * a good candidate for a migration target, e.g. PMEM.  Mark such
+	 * memory as a migration target.
+	 *
+	 * It may be better to retrieve this information from HMAT, but
+	 * SRAT is good enough for now.  This may switch to HMAT in the
+	 * future.
+	 */
+	if (ma->flags & ACPI_SRAT_MEM_NON_VOLATILE)
+		node_set_state(node, N_MIGRATE_TARGET);
+
 	node_set(node, numa_nodes_parsed);
 
 	pr_info("SRAT: Node %u PXM %u [mem %#010Lx-%#010Lx]%s%s\n",
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 4d80fc8..351b694 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -985,6 +985,7 @@ static ssize_t show_node_state(struct device *dev,
 	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
 	[N_CPU_MEM] = _NODE_ATTR(primary, N_CPU_MEM),
+	[N_MIGRATE_TARGET] = _NODE_ATTR(migrate_target, N_MIGRATE_TARGET),
 };
 
 static struct attribute *node_state_attrs[] = {
@@ -997,6 +998,7 @@ static ssize_t show_node_state(struct device *dev,
 	&node_state_attr[N_MEMORY].attr.attr,
 	&node_state_attr[N_CPU].attr.attr,
 	&node_state_attr[N_CPU_MEM].attr.attr,
+	&node_state_attr[N_MIGRATE_TARGET].attr.attr,
 	NULL
 };
 
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 66a8964..411618c 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -400,6 +400,7 @@ enum node_states {
 	N_MEMORY,		/* The node has memory(regular, high, movable) */
 	N_CPU,			/* The node has one or more cpus */
 	N_CPU_MEM,		/* The node has both cpus and memory */
+	N_MIGRATE_TARGET,	/* The node is a suitable migrate target */
 	NR_NODE_STATES
 };
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 757db89e..3b37c71 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -125,6 +125,7 @@ struct pcpu_drain {
 	[N_MEMORY] = { { [0] = 1UL } },
 	[N_CPU] = { { [0] = 1UL } },
 	[N_CPU_MEM] = { { [0] = 1UL } },
+	[N_MIGRATE_TARGET] = { { [0] = 1UL } },
 #endif	/* NUMA */
 };
 EXPORT_SYMBOL(node_states);
-- 
1.8.3.1
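
The drivers/base/node.c hunk above should expose the new mask as a
migrate_target attribute next to has_memory and has_cpu.  A minimal
userspace sketch of reading it, assuming the file appears as
/sys/devices/system/node/migrate_target on a kernel carrying this patch
and uses the usual node list format such as "0-1,4"; the sketch_* names
and the 64-node limit are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MAX_NODES 64	/* hypothetical limit: one 64-bit word per mask */

static int sketch_is_digit(char c)
{
	return c >= '0' && c <= '9';
}

/*
 * Parse a node list such as "0-1,4\n" into a bitmask (bit N = node N).
 * Returns 0 on success, -1 on malformed input or out-of-range node ids.
 * An empty list (just "\n" or an empty string) yields an empty mask.
 */
static int sketch_parse_nodelist(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	assert(s && mask);
	while (*s != '\0' && *s != '\n') {
		char *end;
		unsigned long lo, hi;

		if (!sketch_is_digit(*s))
			return -1;
		lo = hi = strtoul(s, &end, 10);
		if (*end == '-') {
			if (!sketch_is_digit(end[1]))
				return -1;
			hi = strtoul(end + 1, &end, 10);
		}
		if (hi < lo || hi >= SKETCH_MAX_NODES)
			return -1;
		for (unsigned long n = lo; n <= hi; n++)
			m |= UINT64_C(1) << n;
		if (*end == ',') {
			if (!sketch_is_digit(end[1]))
				return -1;	/* reject trailing or doubled commas */
			end++;
		} else if (*end != '\0' && *end != '\n') {
			return -1;
		}
		s = end;
	}
	*mask = m;
	return 0;
}

/* Read one node list line from path; -1 if the file cannot be opened. */
static int sketch_read_migrate_targets(const char *path, uint64_t *mask)
{
	char buf[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';	/* empty file: treat as an empty list */
	fclose(f);
	return sketch_parse_nodelist(buf, mask);
}
```

Calling sketch_read_migrate_targets() with the assumed sysfs path would
leave bit N set for each node N listed there; on a kernel without this
patch the file does not exist and the call returns -1.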


Thread overview: 11+ messages
2019-06-13 23:29 [v3 RFC PATCH 0/9] Migrate mode for node reclaim with heterogeneous memory hierarchy Yang Shi
2019-06-13 23:29 ` [v3 PATCH 1/9] mm: define N_CPU_MEM node states Yang Shi
2019-06-13 23:29 ` Yang Shi [this message]
2019-06-13 23:29 ` [v3 PATCH 3/9] mm: page_alloc: make find_next_best_node find return migration target node Yang Shi
2019-06-13 23:29 ` [v3 PATCH 4/9] mm: migrate: make migrate_pages() return nr_succeeded Yang Shi
2019-06-13 23:29 ` [v3 PATCH 5/9] mm: vmscan: demote anon DRAM pages to migration target node Yang Shi
2019-06-13 23:29 ` [v3 PATCH 6/9] mm: vmscan: don't demote for memcg reclaim Yang Shi
2019-06-13 23:29 ` [v3 PATCH 7/9] mm: vmscan: check if the demote target node is contended or not Yang Shi
2019-06-13 23:29 ` [v3 PATCH 8/9] mm: vmscan: add page demotion counter Yang Shi
2019-06-13 23:29 ` [v3 PATCH 9/9] mm: numa: add page promotion counter Yang Shi
2019-06-27  2:57 ` [v3 RFC PATCH 0/9] Migrate mode for node reclaim with heterogeneous memory hierarchy Yang Shi
