From: Wei Xu <weixugc@google.com>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: "Huang, Ying" <ying.huang@intel.com>,
	Michal Hocko <mhocko@suse.com>,
	Yosry Ahmed <yosryahmed@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>, Tejun Heo <tj@kernel.org>,
	Zefan Li <lizefan.x@bytedance.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Cgroups <cgroups@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, Jonathan Corbet <corbet@lwn.net>,
	Yu Zhao <yuzhao@google.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Greg Thelen <gthelen@google.com>
Subject: Re: [PATCH resend] memcg: introduce per-memcg reclaim interface
Date: Thu, 7 Apr 2022 15:12:19 -0700
Message-ID: <CAAPL-u_aAbDOmATSA8ZvjnfBk_7EoXvLoh0etM0fB0aY1845VQ@mail.gmail.com>
In-Reply-To: <215bd7332aee0ed1092bad4d826a42854ebfd04a.camel@linux.intel.com>

On Thu, Apr 7, 2022 at 2:26 PM Tim Chen <tim.c.chen@linux.intel.com> wrote:
>
> On Wed, 2022-04-06 at 10:49 +0800, Huang, Ying wrote:
> >
> > > > If so,
> > > >
> > > > # echo A > memory.reclaim
> > > >
> > > > means
> > > >
> > > > a) "A" bytes of memory are freed from the memcg, regardless of
> > > >    whether demotion is used or not.
> > > >
> > > > or
> > > >
> > > > b) "A" bytes of memory are reclaimed from the memcg; some of them may
> > > >    be freed, some of them may just be demoted from DRAM to PMEM.  The
> > > >    total number is "A".
> > > >
> > > > For me, a) looks more reasonable.
> > > >
> > >
> > > We can use a DEMOTE flag to control the demotion behavior for
> > > memory.reclaim.  If the flag is not set (the default), then
> > > no_demotion of scan_control can be set to 1, similar to
> > > reclaim_pages().
> >
> > If we have to use a flag to control the behavior, I think it's better to
> > have a separate interface (e.g. memory.demote).  But do we really need b)?
> >
> > > The question is then whether we want to rename memory.reclaim to
> > > something more general.  I think this name is fine if reclaim-based
> > > demotion is an accepted concept.
> >
>
> memory.demote will work for 2 levels of memory tiers.  But when we have 3 levels
> of memory (e.g. high bandwidth memory, DRAM and PMEM),
> it gets ambiguous again as to whether we should demote from high bandwidth memory
> or DRAM.
>
> Will something like this be more general?
>
> echo X > memory_[dram,pmem,hbm].reclaim
>
> So echo X > memory_dram.reclaim
> means that we want to free up X bytes from DRAM for the mem cgroup.
>
> echo demote > memory_dram.reclaim_policy
>
> This means that we prefer demotion for reclaim instead
> of swapping to disk.
>

(resending in plain-text, sorry).

memory.demote can work with any number of memory tiers if a nodemask
argument (or a tier argument, if there is a more explicitly defined,
userspace-visible tiering representation) is provided.  The semantics
can be to demote X bytes from these nodes to their next tier.
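
To make the proposed semantics concrete, here is a rough userspace
model in Python.  It is only a sketch, not kernel code: the demote()
function, the per-node usage table and the next_tier map are all
hypothetical names made up for illustration.  The point is that the
caller names the source nodes, so a 3-tier system is not ambiguous.

```python
from typing import Dict, Iterable, Optional


def demote(usage: Dict[int, int],
           next_tier: Dict[int, Optional[int]],
           nodes: Iterable[int],
           nbytes: int) -> int:
    """Move up to nbytes from the given nodes to each node's next tier.

    usage maps a NUMA node id to the bytes charged to the memcg on it.
    next_tier maps a node id to its demotion target, or None for the
    lowest tier.  Both tables are assumptions of this model.
    Returns the number of bytes actually moved.
    """
    moved = 0
    for node in nodes:
        target = next_tier.get(node)
        if target is None:
            # Lowest tier (or unknown node): nothing to demote to.
            continue
        chunk = min(usage.get(node, 0), nbytes - moved)
        usage[node] = usage.get(node, 0) - chunk
        usage[target] = usage.get(target, 0) + chunk
        moved += chunk
        if moved == nbytes:
            break
    return moved
```

With node 0 as the top tier, node 1 below it and node 2 at the bottom,
demote(usage, {0: 1, 1: 2, 2: None}, [0], X) moves memory from node 0
to node 1 only, while passing [1] instead moves it from node 1 to
node 2.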

Naming the files memory_dram/memory_pmem assumes particular hardware
for each memory tier, which is undesirable.  For example, it is entirely possible that
a slow memory tier is implemented by a lower-cost/lower-performance
DDR device connected via CXL.mem, not by PMEM.  It is better for this
interface to speak in either the NUMA node abstraction or a new tier
abstraction.

It is also desirable to make this interface stateless, i.e. not to
require the setting of memory_dram.reclaim_policy.  Any policy can be
specified as arguments to the request itself and should only affect
that particular request.
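
As an illustration of what "stateless" could look like, here is a small
Python sketch that parses one request line into an immutable request.
The "<bytes> [key=value ...]" syntax and the policy/nodes argument
names are purely hypothetical; nothing in this thread has settled on a
format.  Nothing is stored between calls, so one request cannot change
the behavior of the next.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReclaimRequest:
    nbytes: int
    policy: str = "reclaim"      # hypothetical default policy name
    nodes: Optional[str] = None  # hypothetical nodemask string, e.g. "0-1"


def parse_request(line: str) -> ReclaimRequest:
    """Parse '<bytes> [policy=...] [nodes=...]' into a request.

    Every option travels with the request itself; there is no separate
    policy file and no state kept between calls.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty request")
    opts = {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        if not sep or key not in ("policy", "nodes"):
            raise ValueError(f"bad argument: {field!r}")
        opts[key] = value
    return ReclaimRequest(nbytes=int(fields[0]), **opts)
```

For example, parse_request("4096 policy=demote nodes=0-1") carries its
policy with it, and a later parse_request("4096") falls back to the
defaults rather than inheriting anything from the earlier call.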

Wei
