From: Yang Shi <yang.shi@linux.alibaba.com>
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com,
hannes@cmpxchg.org, akpm@linux-foundation.org,
dave.hansen@intel.com, keith.busch@intel.com,
dan.j.williams@intel.com, fengguang.wu@intel.com,
fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: [v3 PATCH 3/9] mm: page_alloc: make find_next_best_node return migration target node
Date: Fri, 14 Jun 2019 07:29:31 +0800
Message-ID: <1560468577-101178-4-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
We need to find the closest migration target node in order to demote
DRAM pages. Add a "migration" parameter to find_next_best_node() so that
DRAM nodes can be skipped on demand.
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
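Illustration only, not part of the patch: a minimal userspace sketch of
the selection rule described above, assuming a made-up four-node
topology. The names sketch_next_best_node(), distance[], and
is_migrate_target[] are hypothetical stand-ins for the kernel's
find_next_best_node(), node_distance(), and the N_MIGRATE_TARGET node
state. The sketch ranks candidates by distance only, whereas the real
function also weighs CPU presence and node load.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4
#define NO_NODE   (-1)	/* stands in for NUMA_NO_NODE */

/* Hypothetical topology: nodes 0-1 are DRAM, nodes 2-3 are migration targets. */
static const int distance[MAX_NODES][MAX_NODES] = {
	{ 10, 21, 17, 28 },
	{ 21, 10, 28, 17 },
	{ 17, 28, 10, 28 },
	{ 28, 17, 28, 10 },
};
static const bool is_migrate_target[MAX_NODES] = { false, false, true, true };

/* Simplified model: pick the nearest unused node, optionally targets only. */
static int sketch_next_best_node(int node, unsigned int *used_mask,
				 bool migration)
{
	int best = NO_NODE;
	int min_val = INT_MAX;
	int n;

	assert(node >= 0 && node < MAX_NODES);

	/* Use the local node first, unless a migration target is wanted. */
	if (!migration && !(*used_mask & (1u << node))) {
		*used_mask |= 1u << node;
		return node;
	}

	for (n = 0; n < MAX_NODES; n++) {
		/* In migration mode, skip nodes that are not targets. */
		if (migration && !is_migrate_target[n])
			continue;
		/* Don't want a node to appear more than once. */
		if (*used_mask & (1u << n))
			continue;
		if (distance[node][n] < min_val) {
			min_val = distance[node][n];
			best = n;
		}
	}

	if (best != NO_NODE)
		*used_mask |= 1u << best;
	return best;
}
```

With this hypothetical table and an empty mask, repeated calls for node 0
with migration set return 2, then 3, then NO_NODE; with migration clear
the first call returns the local node itself.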
mm/internal.h | 11 +++++++++++
mm/page_alloc.c | 14 ++++++++++----
2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 9eeaf2b..a3181e2 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -292,6 +292,17 @@ static inline bool is_data_mapping(vm_flags_t flags)
return (flags & (VM_WRITE | VM_SHARED | VM_STACK)) == VM_WRITE;
}
+#ifdef CONFIG_NUMA
+extern int find_next_best_node(int node, nodemask_t *used_node_mask,
+ bool migration);
+#else
+static inline int find_next_best_node(int node, nodemask_t *used_node_mask,
+ bool migration)
+{
+ return 0;
+}
+#endif
+
/* mm/util.c */
void __vma_link_list(struct mm_struct *mm, struct vm_area_struct *vma,
struct vm_area_struct *prev, struct rb_node *rb_parent);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3b37c71..917f64d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5425,6 +5425,7 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
* find_next_best_node - find the next node that should appear in a given node's fallback list
* @node: node whose fallback list we're appending
* @used_node_mask: nodemask_t of already used nodes
+ * @migration: find next best migration target node
*
* We use a number of factors to determine which is the next node that should
* appear on a given node's fallback list. The node should not have appeared
@@ -5436,7 +5437,8 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
*
* Return: node id of the found node or %NUMA_NO_NODE if no node is found.
*/
-static int find_next_best_node(int node, nodemask_t *used_node_mask)
+int find_next_best_node(int node, nodemask_t *used_node_mask,
+ bool migration)
{
int n, val;
int min_val = INT_MAX;
@@ -5444,13 +5446,18 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
const struct cpumask *tmp = cpumask_of_node(0);
/* Use the local node if we haven't already */
- if (!node_isset(node, *used_node_mask)) {
+ if (!node_isset(node, *used_node_mask) &&
+ !migration) {
node_set(node, *used_node_mask);
return node;
}
for_each_node_state(n, N_MEMORY) {
+ /* Find next best migration target node */
+ if (migration && !node_state(n, N_MIGRATE_TARGET))
+ continue;
+
/* Don't want a node to appear more than once */
if (node_isset(n, *used_node_mask))
continue;
@@ -5482,7 +5489,6 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
return best_node;
}
-
/*
* Build zonelists ordered by node and zones within node.
* This results in maximum locality--normal zone overflows into local
@@ -5544,7 +5550,7 @@ static void build_zonelists(pg_data_t *pgdat)
nodes_clear(used_mask);
memset(node_order, 0, sizeof(node_order));
- while ((node = find_next_best_node(local_node, &used_mask)) >= 0) {
+ while ((node = find_next_best_node(local_node, &used_mask, false)) >= 0) {
/*
* We don't want to pressure a particular node.
* So adding penalty to the first node in same
--
1.8.3.1
Thread overview: 11+ messages
2019-06-13 23:29 [v3 RFC PATCH 0/9] Migrate mode for node reclaim with heterogeneous memory hierarchy Yang Shi
2019-06-13 23:29 ` [v3 PATCH 1/9] mm: define N_CPU_MEM node states Yang Shi
2019-06-13 23:29 ` [v3 PATCH 2/9] mm: Introduce migrate target nodemask Yang Shi
2019-06-13 23:29 ` Yang Shi [this message]
2019-06-13 23:29 ` [v3 PATCH 4/9] mm: migrate: make migrate_pages() return nr_succeeded Yang Shi
2019-06-13 23:29 ` [v3 PATCH 5/9] mm: vmscan: demote anon DRAM pages to migration target node Yang Shi
2019-06-13 23:29 ` [v3 PATCH 6/9] mm: vmscan: don't demote for memcg reclaim Yang Shi
2019-06-13 23:29 ` [v3 PATCH 7/9] mm: vmscan: check if the demote target node is contended or not Yang Shi
2019-06-13 23:29 ` [v3 PATCH 8/9] mm: vmscan: add page demotion counter Yang Shi
2019-06-13 23:29 ` [v3 PATCH 9/9] mm: numa: add page promotion counter Yang Shi
2019-06-27 2:57 ` [v3 RFC PATCH 0/9] Migrate mode for node reclaim with heterogeneous memory hierarchy Yang Shi