linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Dave Hansen <dave.hansen@linux.intel.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org,Dave Hansen
	<dave.hansen@linux.intel.com>,yang.shi@linux.alibaba.com,rientjes@google.com,ying.huang@intel.com,dan.j.williams@intel.com,david@redhat.com,osalvador@suse.de
Subject: [RFC][PATCH 13/13] mm/migrate: new zone_reclaim_mode to enable reclaim migration
Date: Mon, 25 Jan 2021 16:34:36 -0800	[thread overview]
Message-ID: <20210126003436.80749D77@viggo.jf.intel.com> (raw)
In-Reply-To: <20210126003411.2AC51464@viggo.jf.intel.com>


From: Dave Hansen <dave.hansen@linux.intel.com>

Some method is obviously needed to enable reclaim-based migration.

Just like traditional autonuma, there will be some workloads that
will benefit like workloads with more "static" configurations where
hot pages stay hot and cold pages stay cold.  If pages come and go
from the hot and cold sets, the benefits of this approach will be
more limited.

The benefits are truly workload-based and *not* hardware-based.
We do not believe that there is a viable threshold where certain
hardware configurations should have this mechanism enabled while
others do not.

To be conservative, earlier work defaulted to disable reclaim-
based migration and did not include a mechanism to enable it.
This propses extending the existing "zone_reclaim_mode" (now
now really node_reclaim_mode) as a method to enable it.

We are open to any alternative that allows end users to enable
this mechanism or disable it it workload harm is detected (just
like traditional autonuma).

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: osalvador <osalvador@suse.de>
---

 b/Documentation/admin-guide/sysctl/vm.rst |    9 +++++++++
 b/include/linux/swap.h                    |    3 ++-
 b/include/uapi/linux/mempolicy.h          |    1 +
 b/mm/vmscan.c                             |    6 ++++--
 4 files changed, 16 insertions(+), 3 deletions(-)

diff -puN Documentation/admin-guide/sysctl/vm.rst~RECLAIM_MIGRATE Documentation/admin-guide/sysctl/vm.rst
--- a/Documentation/admin-guide/sysctl/vm.rst~RECLAIM_MIGRATE	2021-01-25 16:23:43.721866624 -0800
+++ b/Documentation/admin-guide/sysctl/vm.rst	2021-01-25 16:23:43.732866624 -0800
@@ -971,6 +971,7 @@ This is value OR'ed together of
 1	Zone reclaim on
 2	Zone reclaim writes dirty pages out
 4	Zone reclaim swaps pages
+8	Zone reclaim migrates pages
 =	===================================
 
 zone_reclaim_mode is disabled by default.  For file servers or workloads
@@ -995,3 +996,11 @@ of other processes running on other node
 Allowing regular swap effectively restricts allocations to the local
 node unless explicitly overridden by memory policies or cpuset
 configurations.
+
+Page migration during reclaim is intended for systems with tiered memory
+configurations.  These systems have multiple types of memory with varied
+performance characteristics instead of plain NUMA systems where the same
+kind of memory is found at varied distances.  Allowing page migration
+during reclaim enables these systems to migrate pages from fast tiers to
+slow tiers when the fast tier is under pressure.  This migration is
+performed before swap.
diff -puN include/linux/swap.h~RECLAIM_MIGRATE include/linux/swap.h
--- a/include/linux/swap.h~RECLAIM_MIGRATE	2021-01-25 16:23:43.723866624 -0800
+++ b/include/linux/swap.h	2021-01-25 16:23:43.732866624 -0800
@@ -384,7 +384,8 @@ extern int sysctl_min_slab_ratio;
 static inline bool node_reclaim_enabled(void)
 {
 	/* Is any node_reclaim_mode bit set? */
-	return node_reclaim_mode & (RECLAIM_ZONE|RECLAIM_WRITE|RECLAIM_UNMAP);
+	return node_reclaim_mode & (RECLAIM_ZONE |RECLAIM_WRITE|
+				    RECLAIM_UNMAP|RECLAIM_MIGRATE);
 }
 
 extern void check_move_unevictable_pages(struct pagevec *pvec);
diff -puN include/uapi/linux/mempolicy.h~RECLAIM_MIGRATE include/uapi/linux/mempolicy.h
--- a/include/uapi/linux/mempolicy.h~RECLAIM_MIGRATE	2021-01-25 16:23:43.725866624 -0800
+++ b/include/uapi/linux/mempolicy.h	2021-01-25 16:23:43.732866624 -0800
@@ -69,5 +69,6 @@ enum {
 #define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
 #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
 #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
+#define RECLAIM_MIGRATE	(1<<3)	/* Migrate to other nodes during reclaim */
 
 #endif /* _UAPI_LINUX_MEMPOLICY_H */
diff -puN mm/vmscan.c~RECLAIM_MIGRATE mm/vmscan.c
--- a/mm/vmscan.c~RECLAIM_MIGRATE	2021-01-25 16:23:43.728866624 -0800
+++ b/mm/vmscan.c	2021-01-25 16:23:43.734866624 -0800
@@ -1075,6 +1075,9 @@ static bool migrate_demote_page_ok(struc
 	VM_BUG_ON_PAGE(PageHuge(page), page);
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 
+	if (!(node_reclaim_mode & RECLAIM_MIGRATE))
+		return false;
+
 	/* It is pointless to do demotion in memcg reclaim */
 	if (cgroup_reclaim(sc))
 		return false;
@@ -1084,8 +1087,7 @@ static bool migrate_demote_page_ok(struc
 	if (PageTransHuge(page) && !thp_migration_supported())
 		return false;
 
-	// FIXME: actually enable this later in the series
-	return false;
+	return true;
 }
 
 
_


  parent reply	other threads:[~2021-01-26  0:42 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-26  0:34 [RFC][PATCH 00/13] [v5] Migrate Pages in lieu of discard Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 01/13] mm/vmscan: restore zone_reclaim_mode ABI Dave Hansen
2021-02-10  9:42   ` Oscar Salvador
2021-01-26  0:34 ` [RFC][PATCH 02/13] mm/vmscan: move RECLAIM* bits to uapi header Dave Hansen
2021-02-10  9:44   ` Oscar Salvador
2021-01-26  0:34 ` [RFC][PATCH 03/13] mm/vmscan: replace implicit RECLAIM_ZONE checks with explicit checks Dave Hansen
2021-01-31  1:10   ` David Rientjes
2021-02-10  9:54   ` Oscar Salvador
2021-01-26  0:34 ` [RFC][PATCH 04/13] mm/numa: node demotion data structure and lookup Dave Hansen
2021-01-31  1:19   ` David Rientjes
2021-02-01 17:49     ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 05/13] mm/numa: automatically generate node migration order Dave Hansen
2021-01-29 20:46   ` Yang Shi
2021-02-01 19:13     ` Dave Hansen
2021-02-02 11:43       ` Oscar Salvador
2021-02-02 17:46       ` Yang Shi
2021-02-03  0:43         ` Dave Hansen
2021-02-04  0:26           ` Yang Shi
2021-01-26  0:34 ` [RFC][PATCH 06/13] mm/migrate: update migration order during on hotplug events Dave Hansen
2021-01-29 20:59   ` Yang Shi
2021-02-02 11:42   ` Oscar Salvador
2021-02-09 23:45     ` Dave Hansen
2021-02-10  8:55       ` Oscar Salvador
2021-01-26  0:34 ` [RFC][PATCH 07/13] mm/migrate: make migrate_pages() return nr_succeeded Dave Hansen
2021-01-29 21:04   ` Yang Shi
2021-02-09 23:41     ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 08/13] mm/migrate: demote pages during reclaim Dave Hansen
2021-02-02 11:55   ` Oscar Salvador
2021-02-02 22:45     ` Yang Shi
2021-02-02 22:56       ` Dave Hansen
2021-02-02 18:22   ` Yang Shi
2021-02-02 18:34     ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 09/13] mm/vmscan: add page demotion counter Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 10/13] mm/vmscan: add helper for querying ability to age anonymous pages Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 11/13] mm/vmscan: Consider anonymous pages without swap Dave Hansen
2021-02-02 18:56   ` Yang Shi
2021-02-02 21:35     ` Dave Hansen
2021-02-02 22:35       ` Yang Shi
2021-01-26  0:34 ` [RFC][PATCH 12/13] mm/vmscan: never demote for memcg reclaim Dave Hansen
2021-01-26  0:34 ` Dave Hansen [this message]
2021-01-31  1:13 ` [RFC][PATCH 00/13] [v5] Migrate Pages in lieu of discard David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210126003436.80749D77@viggo.jf.intel.com \
    --to=dave.hansen@linux.intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=osalvador@suse.de \
    --cc=rientjes@google.com \
    --cc=yang.shi@linux.alibaba.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).