linux-mm.kvack.org archive mirror
From: Yang Shi <shy828301@gmail.com>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	 Yang Shi <yang.shi@linux.alibaba.com>,
	David Rientjes <rientjes@google.com>,
	 Huang Ying <ying.huang@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	 David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH 10/10] mm/migrate: new zone_reclaim_mode to enable reclaim migration
Date: Mon, 8 Mar 2021 16:24:22 -0800
Message-ID: <CAHbLzkqKSOSnyXfqkeW2HDdYk6m+zSZuk5AX1waFVfK-1Vg1=Q@mail.gmail.com>
In-Reply-To: <20210305000009.EDF902E9@viggo.jf.intel.com>

On Thu, Mar 4, 2021 at 4:01 PM Dave Hansen <dave.hansen@linux.intel.com> wrote:
>
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> Some method is obviously needed to enable reclaim-based migration.
>
> Just like traditional autonuma, some workloads will benefit, such
> as those with more "static" configurations where hot pages stay
> hot and cold pages stay cold.  If pages come and go from the hot
> and cold sets, the benefits of this approach will be more
> limited.
>
> The benefits are truly workload-based and *not* hardware-based.
> We do not believe that there is a viable threshold where certain
> hardware configurations should have this mechanism enabled while
> others do not.
>
> To be conservative, earlier work defaulted to disabling
> reclaim-based migration and did not include a mechanism to enable
> it.  This proposes extending the existing "zone_reclaim_mode" (now
> really node_reclaim_mode) as a method to enable it.
>
> We are open to any alternative that allows end users to enable
> this mechanism or disable it if workload harm is detected (just
> like traditional autonuma).
>
> Once this is enabled, page demotion may move data to a NUMA node
> that does not fall into the cpuset of the allocating process.
> This could be construed to violate the guarantees of cpusets.
> However, since this is an opt-in mechanism, the assumption is
> that anyone enabling it is content to relax the guarantees.

I think we'd better put the cpuset violation paragraph along with the
new zone reclaim mode text so that users are aware of the potential
violation. I don't think the commit log is the go-to place for
ordinary users.

>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: osalvador <osalvador@suse.de>
>
> changes since 20200122:
>  * Changelog material about relaxing cpuset constraints
> ---
>
>  b/Documentation/admin-guide/sysctl/vm.rst |    9 +++++++++
>  b/include/linux/swap.h                    |    3 ++-
>  b/include/uapi/linux/mempolicy.h          |    1 +
>  b/mm/vmscan.c                             |    6 ++++--
>  4 files changed, 16 insertions(+), 3 deletions(-)
>
> diff -puN Documentation/admin-guide/sysctl/vm.rst~RECLAIM_MIGRATE Documentation/admin-guide/sysctl/vm.rst
> --- a/Documentation/admin-guide/sysctl/vm.rst~RECLAIM_MIGRATE   2021-03-04 15:36:26.078806355 -0800
> +++ b/Documentation/admin-guide/sysctl/vm.rst   2021-03-04 15:36:26.093806355 -0800
> @@ -976,6 +976,7 @@ This is value OR'ed together of
>  1      Zone reclaim on
>  2      Zone reclaim writes dirty pages out
>  4      Zone reclaim swaps pages
> +8      Zone reclaim migrates pages
>  =      ===================================
>
>  zone_reclaim_mode is disabled by default.  For file servers or workloads
> @@ -1000,3 +1001,11 @@ of other processes running on other node
>  Allowing regular swap effectively restricts allocations to the local
>  node unless explicitly overridden by memory policies or cpuset
>  configurations.
> +
> +Page migration during reclaim is intended for systems with tiered memory
> +configurations.  These systems have multiple types of memory with varied
> +performance characteristics instead of plain NUMA systems where the same
> +kind of memory is found at varied distances.  Allowing page migration
> +during reclaim enables these systems to migrate pages from fast tiers to
> +slow tiers when the fast tier is under pressure.  This migration is
> +performed before swap.
> diff -puN include/linux/swap.h~RECLAIM_MIGRATE include/linux/swap.h
> --- a/include/linux/swap.h~RECLAIM_MIGRATE      2021-03-04 15:36:26.082806355 -0800
> +++ b/include/linux/swap.h      2021-03-04 15:36:26.093806355 -0800
> @@ -382,7 +382,8 @@ extern int sysctl_min_slab_ratio;
>  static inline bool node_reclaim_enabled(void)
>  {
>         /* Is any node_reclaim_mode bit set? */
> -       return node_reclaim_mode & (RECLAIM_ZONE|RECLAIM_WRITE|RECLAIM_UNMAP);
> +       return node_reclaim_mode & (RECLAIM_ZONE|RECLAIM_WRITE|
> +                                   RECLAIM_UNMAP|RECLAIM_MIGRATE);
>  }
>
>  extern void check_move_unevictable_pages(struct pagevec *pvec);
> diff -puN include/uapi/linux/mempolicy.h~RECLAIM_MIGRATE include/uapi/linux/mempolicy.h
> --- a/include/uapi/linux/mempolicy.h~RECLAIM_MIGRATE    2021-03-04 15:36:26.084806355 -0800
> +++ b/include/uapi/linux/mempolicy.h    2021-03-04 15:36:26.094806355 -0800
> @@ -69,5 +69,6 @@ enum {
>  #define RECLAIM_ZONE   (1<<0)  /* Run shrink_inactive_list on the zone */
>  #define RECLAIM_WRITE  (1<<1)  /* Writeout pages during reclaim */
>  #define RECLAIM_UNMAP  (1<<2)  /* Unmap pages during reclaim */
> +#define RECLAIM_MIGRATE        (1<<3)  /* Migrate to other nodes during reclaim */
>
>  #endif /* _UAPI_LINUX_MEMPOLICY_H */
> diff -puN mm/vmscan.c~RECLAIM_MIGRATE mm/vmscan.c
> --- a/mm/vmscan.c~RECLAIM_MIGRATE       2021-03-04 15:36:26.087806355 -0800
> +++ b/mm/vmscan.c       2021-03-04 15:36:26.096806355 -0800
> @@ -1073,6 +1073,9 @@ static bool migrate_demote_page_ok(struc
>         VM_BUG_ON_PAGE(PageHuge(page), page);
>         VM_BUG_ON_PAGE(PageLRU(page), page);
>
> +       if (!(node_reclaim_mode & RECLAIM_MIGRATE))
> +               return false;
> +
>         /* It is pointless to do demotion in memcg reclaim */
>         if (cgroup_reclaim(sc))
>                 return false;
> @@ -1082,8 +1085,7 @@ static bool migrate_demote_page_ok(struc
>         if (PageTransHuge(page) && !thp_migration_supported())
>                 return false;
>
> -       // FIXME: actually enable this later in the series
> -       return false;
> +       return true;
>  }
>
>  /* Check if a page is dirty or under writeback */
> _
>
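
For readers following the bit arithmetic in the quoted hunks, here is a
minimal userspace sketch, not kernel code. The RECLAIM_* values are copied
from the mempolicy.h hunk above. The model_* function names are hypothetical,
and taking the mode as a parameter is an assumption made for illustration
(the kernel reads the global node_reclaim_mode). The second function models
only the new first check in migrate_demote_page_ok(), not the memcg or THP
checks that follow it.

```c
#include <stdbool.h>

/* Bit values as defined in the quoted include/uapi/linux/mempolicy.h hunk. */
#define RECLAIM_ZONE    (1 << 0)  /* Run shrink_inactive_list on the zone */
#define RECLAIM_WRITE   (1 << 1)  /* Writeout pages during reclaim */
#define RECLAIM_UNMAP   (1 << 2)  /* Unmap pages during reclaim */
#define RECLAIM_MIGRATE (1 << 3)  /* Migrate to other nodes during reclaim */

/*
 * Hypothetical model of node_reclaim_enabled() as changed by the patch:
 * true when any of the four mode bits is set.
 */
bool model_node_reclaim_enabled(int mode)
{
	return (mode & (RECLAIM_ZONE | RECLAIM_WRITE |
			RECLAIM_UNMAP | RECLAIM_MIGRATE)) != 0;
}

/*
 * Hypothetical model of the opt-in gate added at the top of
 * migrate_demote_page_ok(): demotion is considered only when the
 * RECLAIM_MIGRATE bit (value 8) is set in the mode.
 */
bool model_demotion_opted_in(int mode)
{
	return (mode & RECLAIM_MIGRATE) != 0;
}
```

Under this model a value of 9 (1 | 8) passes both checks, a value of 7 turns
on node reclaim but leaves demotion off, and a value of 8 alone is enough to
make node_reclaim_enabled() return true.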


