From: CGEL <cgel.zte@gmail.com>
To: Michal Hocko <mhocko@suse.com>
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, willy@infradead.org,
	shy828301@gmail.com, roman.gushchin@linux.dev, shakeelb@google.com,
	linmiaohe@huawei.com, william.kucharski@oracle.com, peterx@redhat.com,
	hughd@google.com, vbabka@suse.cz, songmuchun@bytedance.com,
	surenb@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, Yang Yang <yang.yang29@zte.com.cn>
Subject: Re: [PATCH] mm/memcg: support control THP behaviour in cgroup
Date: Wed, 18 May 2022 05:58:39 +0000
Message-ID: <62848b11.1c69fb81.6ce50.2091@mx.google.com>
In-Reply-To: <YnpqYte2jLdcBiPg@dhcp22.suse.cz>

On Tue, May 10, 2022 at 03:36:34PM +0200, Michal Hocko wrote:
> On Tue 10-05-22 11:52:51, CGEL wrote:
> > On Tue, May 10, 2022 at 12:00:04PM +0200, Michal Hocko wrote:
> > > On Tue 10-05-22 01:43:38, CGEL wrote:
> > > > On Mon, May 09, 2022 at 01:48:39PM +0200, Michal Hocko wrote:
> > > > > On Mon 09-05-22 11:26:43, CGEL wrote:
> > > > > > On Mon, May 09, 2022 at 12:00:28PM +0200, Michal Hocko wrote:
> > > > > > > On Sat 07-05-22 02:05:25, CGEL wrote:
> > > > > > > [...]
> > > > > > > > If there are many containers to run on one host, and some of them have high
> > > > > > > > performance requirements, administrator could turn on thp for them:
> > > > > > > > # docker run -it --thp-enabled=always
> > > > > > > > Then all the processes in those containers will always use thp.
> > > > > > > > While other containers turn off thp by:
> > > > > > > > # docker run -it --thp-enabled=never
> > > > > > >
> > > > > > > I do not know. The THP config space is already too confusing and complex
> > > > > > > and this just adds on top. E.g. is the behavior of the knob
> > > > > > > hierarchical? What is the policy if parent memcg says madivise while
> > > > > > > child says always? How does the per-application configuration aligns
> > > > > > > with all that (e.g. memcg policy madivise but application says never via
> > > > > > > prctl while still uses some madvised - e.g. via library).
> > > > > > >
> > > > > >
> > > > > > The cgroup THP behavior is align to host and totally independent just likes
> > > > > > /sys/fs/cgroup/memory.swappiness. That means if one cgroup config 'always'
> > > > > > for thp, it has no matter with host or other cgroup. This make it simple for
> > > > > > user to understand or control.
> > > > >
> > > > > All controls in cgroup v2 should be hierarchical. This is really
> > > > > required for a proper delegation semantic.
> > > > >
> > > >
> > > > Could we align to the semantic of /sys/fs/cgroup/memory.swappiness?
> > > > Some distributions like Ubuntu is still using cgroup v1.
> > >
> > > cgroup v1 interface is mostly frozen. All new features are added to the
> > > v2 interface.
> > >
> >
> > So what about we add this interface to cgroup v2?
>
> Can you come up with a sane hierarchical behavior?
>
> [...]
> > > > For micro-service architecture, the application in one container is not a
> > > > set of loosely tight processes, it's aim at provide one certain service,
> > > > so different containers means different service, and different service
> > > > has different QoS demand.
> > >
> > > OK, if they are tightly coupled you could apply the same THP policy by
> > > an existing prctl interface. Why is that not feasible. As you are noting
> > > below...
> > >
> > > > 5.containers usually managed by compose software, which treats container as
> > > > base management unit;
> > >
> > > ..so the compose software can easily start up the workload by using prctl
> > > to disable THP for whatever workloads it is not suitable for.
> >
> > prctl(PR_SET_THP_DISABLE..) can not be elegance to support the semantic we
> > need. If only some containers needs THP, other containers and host do not need
> > THP. We must set host THP to always first, and call prctl() to close THP for
> > host tasks and other containers one by one,
>
> It might not be the most elegant solution but it should work.
> Maintaining user interfaces for ever has some cost and the THP
> configuration space is quite large already. So I would rather not add
> more complication in unless that is absolutely necessary.
>

By the way, should we let prctl() support PR_SET_THP_ALWAYS, just like
PR_TASK_PERF_EVENTS_DISABLE and PR_TASK_PERF_EVENTS_ENABLE?
This would make it simpler to let certain processes use THP while others
do not.

> > in this process some tasks that start before we call prctl() may
> > already use THP with no need.
>
> As long as all those processes have a common ancestor I do not see how
> that would be possible.
>
> --
> Michal Hocko
> SUSE Labs
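
For illustration, the prctl()-based launcher approach discussed above could look roughly like the sketch below. The helper name run_without_thp() and the exit codes 126/127 are assumptions made for this example and do not come from the thread. It relies only on the existing PR_SET_THP_DISABLE option, which prctl(2) documents as inherited by children created via fork(2) and preserved across execve(2); it does not use the PR_SET_THP_ALWAYS option proposed above, which is a suggestion and not an existing interface.

```c
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): run argv[0] with THP disabled.
 * Returns the child's exit status, or -1 if fork/waitpid fails or the
 * child was terminated by a signal.
 */
int run_without_thp(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /*
         * Child: set the flag before exec, so the workload and all of
         * its descendants start with THP disabled.
         */
        if (prctl(PR_SET_THP_DISABLE, 1UL, 0UL, 0UL, 0UL) != 0) {
            perror("prctl(PR_SET_THP_DISABLE)");
            _exit(126);
        }
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }

    /* Parent: wait for the workload and report its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A container runtime or compose tool could call such a helper for each workload that should not use THP. Because the flag is set in the common ancestor before exec, no descendant task gets a window in which it runs with THP enabled.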