From: "Huang, Ying" <ying.huang@intel.com>
To: Wei Xu <weixugc@google.com>
Cc: Michal Hocko <mhocko@suse.com>,
	Yosry Ahmed <yosryahmed@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Tejun Heo <tj@kernel.org>,
	Zefan Li <lizefan.x@bytedance.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Cgroups <cgroups@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Yu Zhao <yuzhao@google.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Greg Thelen <gthelen@google.com>
Subject: Re: [PATCH resend] memcg: introduce per-memcg reclaim interface
Date: Wed, 06 Apr 2022 10:49:51 +0800
Message-ID: <87bkxfudrk.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <CAAPL-u_6XqQYtLAMNFvEo+0XU2VR=XYm0T9btL=g6rVVW2h93w@mail.gmail.com> (Wei Xu's message of "Tue, 5 Apr 2022 18:07:49 -0700")

Wei Xu <weixugc@google.com> writes:

> On Tue, Apr 5, 2022 at 5:49 PM Huang, Ying <ying.huang@intel.com> wrote:
>>
>> Wei Xu <weixugc@google.com> writes:
>>
>> > On Sat, Apr 2, 2022 at 1:13 AM Huang, Ying <ying.huang@intel.com> wrote:
>> >>
>> >> Wei Xu <weixugc@google.com> writes:
>> >>
>> >> > On Fri, Apr 1, 2022 at 6:54 AM Michal Hocko <mhocko@suse.com> wrote:
>> >> >>
>> >> >> On Thu 31-03-22 08:41:51, Yosry Ahmed wrote:
>> >> >> > From: Shakeel Butt <shakeelb@google.com>
>> >> >> >
>> >> >> [snip]
>> >> >> > Possible Extensions:
>> >> >> > --------------------
>> >> >> >
>> >> >> > - This interface can be extended with an additional parameter or flags
>> >> >> >   to allow specifying one or more types of memory to reclaim from (e.g.
>> >> >> >   file, anon, ..).
>> >> >> >
>> >> >> > - The interface can also be extended with a node mask to reclaim from
>> >> >> >   specific nodes. This has use cases for reclaim-based demotion in memory
>> >> >> >   tiering systens.
>> >> >> >
>> >> >> > - A similar per-node interface can also be added to support proactive
>> >> >> >   reclaim and reclaim-based demotion in systems without memcg.
>> >> >> >
>> >> >> > For now, let's keep things simple by adding the basic functionality.
>> >> >>
>> >> >> Yes, I am for the simplicity and this really looks like a bare minumum
>> >> >> interface. But it is not really clear who do you want to add flags on
>> >> >> top of it?
>> >> >>
>> >> >> I am not really sure we really need a node aware interface for memcg.
>> >> >> The global reclaim interface will likely need a different node because
>> >> >> we do not want to make this CONFIG_MEMCG constrained.
>> >> >
>> >> > A nodemask argument for memory.reclaim can be useful for memory
>> >> > tiering between NUMA nodes with different performance. Similar to
>> >> > proactive reclaim, it can allow a userspace daemon to drive
>> >> > memcg-based proactive demotion via the reclaim-based demotion
>> >> > mechanism in the kernel.
>> >>
>> >> I am not sure whether nodemask is a good way for demoting pages between
>> >> different types of memory. For example, for a system with DRAM and
>> >> PMEM, if specifying DRAM node in nodemask means demoting to PMEM, what
>> >> is the meaning of specifying PMEM node? reclaiming to disk?
>> >>
>> >> In general, I have no objection to the idea in general. But we should
>> >> have a clear and consistent interface. Per my understanding the default
>> >> memcg interface is for memory, regardless of memory types. The memory
>> >> reclaiming means reduce the memory usage, regardless of memory types.
>> >> We need to either extending the semantics of memory reclaiming (to
>> >> include memory demoting too), or add another interface for memory
>> >> demoting.
>> >
>> > Good point. With the "demote pages during reclaim" patch series,
>> > reclaim is already extended to demote pages as well. For example,
>> > can_reclaim_anon_pages() returns true if demotion is allowed and
>> > shrink_page_list() can demote pages instead of reclaiming pages.
>>
>> These are in-kernel implementation, not the ABI. So we still have
>> the opportunity to define the ABI now.
>>
>> > Currently, demotion is disabled for memcg reclaim, which I think can
>> > be relaxed and also necessary for memcg-based proactive demotion. I'd
>> > like to suggest that we extend the semantics of memory.reclaim to
>> > cover memory demotion as well. A flag can be used to enable/disable
>> > the demotion behavior.
>>
>> If so,
>>
>> # echo A > memory.reclaim
>>
>> means
>>
>> a) "A" bytes memory are freed from the memcg, regardless demoting is
>>    used or not.
>>
>> or
>>
>> b) "A" bytes memory are reclaimed from the memcg, some of them may be
>>    freed, some of them may be just demoted from DRAM to PMEM. The total
>>    number is "A".
>>
>> For me, a) looks more reasonable.
>>
>
> We can use a DEMOTE flag to control the demotion behavior for
> memory.reclaim. If the flag is not set (the default), then
> no_demotion of scan_control can be set to 1, similar to
> reclaim_pages().

If we have to use a flag to control the behavior, I think it's better to
have a separate interface (e.g. memory.demote). But do we really need
b)?

> The question is then whether we want to rename memory.reclaim to
> something more general. I think this name is fine if reclaim-based
> demotion is an accepted concept.

Best Regards,
Huang, Ying