From: Dave Hansen <dave.hansen@linux.intel.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Dave Hansen <dave.hansen@linux.intel.com>, yang.shi@linux.alibaba.com, rientjes@google.com, ying.huang@intel.com, dan.j.williams@intel.com, david@redhat.com, osalvador@suse.de
Subject: [RFC][PATCH 04/13] mm/numa: node demotion data structure and lookup
Date: Mon, 25 Jan 2021 16:34:19 -0800
Message-ID: <20210126003419.43281680@viggo.jf.intel.com>
In-Reply-To: <20210126003411.2AC51464@viggo.jf.intel.com>

From: Dave Hansen <dave.hansen@linux.intel.com>

Prepare for the kernel to auto-migrate pages to other memory nodes
with a user-defined node migration table. This allows creating a single
migration target for each NUMA node to enable the kernel to do NUMA
page migrations instead of simply reclaiming colder pages. A node
with no target is a "terminal node", so reclaim acts normally there.
The migration target does not fundamentally _need_ to be a single node,
but this implementation starts there to limit complexity.

If you consider the migration path as a graph, cycles (loops) in the
graph are disallowed. This avoids wasting resources by constantly
migrating (A->B, B->A, A->B ...). The expectation is that cycles will
never be allowed.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: osalvador <osalvador@suse.de>

--

changes in July 2020:
 - Remove loop from next_demotion_node() and get_online_mems().
   This means that the node returned by next_demotion_node()
   might now be offline, but the worst case is that the
   allocation fails. That's fine since it is transient.
---

 b/mm/migrate.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff -puN mm/migrate.c~0006-node-Define-and-export-memory-migration-path mm/migrate.c
--- a/mm/migrate.c~0006-node-Define-and-export-memory-migration-path	2021-01-25 16:23:09.553866709 -0800
+++ b/mm/migrate.c	2021-01-25 16:23:09.558866709 -0800
@@ -1161,6 +1161,22 @@ out:
 	return rc;
 }
 
+static int node_demotion[MAX_NUMNODES] = {[0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE};
+
+/**
+ * next_demotion_node() - Get the next node in the demotion path
+ * @node: The starting node to lookup the next node
+ *
+ * @returns: node id for next memory node in the demotion path hierarchy
+ * from @node; NUMA_NO_NODE if @node is terminal. This does not keep
+ * @node online or guarantee that it *continues* to be the next demotion
+ * target.
+ */
+int next_demotion_node(int node)
+{
+	return node_demotion[node];
+}
+
 /*
  * Obtain the lock on page, remove all ptes and migrate the page
  * to the newly allocated page in newpage.
_
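
For readers who want to experiment with the table outside the kernel, the
following is a minimal user-space sketch, not part of the patch. It mirrors
node_demotion[] and next_demotion_node() from the diff and adds two
hypothetical helpers, demotion_hops() and demotion_table_has_cycle(), to
illustrate the "no cycles" rule from the changelog. The MAX_NUMNODES value,
the loop-based initializer (used instead of the GNU range designator), and
the bounds assert are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NUMNODES 8		/* assumption: small value for illustration */
#define NUMA_NO_NODE (-1)

/* One demotion target per node; NUMA_NO_NODE marks a terminal node. */
static int node_demotion[MAX_NUMNODES];

/* Hypothetical helper: mark every node as terminal. */
void demotion_table_init(void)
{
	for (int i = 0; i < MAX_NUMNODES; i++)
		node_demotion[i] = NUMA_NO_NODE;
}

/* Same lookup as in the patch, plus a bounds check for user space. */
int next_demotion_node(int node)
{
	assert(node >= 0 && node < MAX_NUMNODES);
	return node_demotion[node];
}

/*
 * Hypothetical helper: count hops from @node to its terminal node.
 * An acyclic path visits each node at most once, so it has fewer than
 * MAX_NUMNODES hops. Reaching MAX_NUMNODES hops means the walk is
 * stuck in a cycle, which is reported as -1.
 */
int demotion_hops(int node)
{
	int hops = 0;

	while (next_demotion_node(node) != NUMA_NO_NODE) {
		node = next_demotion_node(node);
		if (++hops >= MAX_NUMNODES)
			return -1;
	}
	return hops;
}

/* Hypothetical helper: true if any node's demotion path loops. */
bool demotion_table_has_cycle(void)
{
	for (int i = 0; i < MAX_NUMNODES; i++)
		if (demotion_hops(i) < 0)
			return true;
	return false;
}
```

With a chain such as 0->1->2, demotion_hops(0) walks two hops and stops at
terminal node 2. Adding 2->0 closes a loop, and a check along these lines
could be used to reject such a table before it is installed.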