From: Michal Hocko <mhocko@suse.com>
To: CGEL <cgel.zte@gmail.com>
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org,
	willy@infradead.org, shy828301@gmail.com,
	roman.gushchin@linux.dev, shakeelb@google.com,
	linmiaohe@huawei.com, william.kucharski@oracle.com,
	peterx@redhat.com, hughd@google.com, vbabka@suse.cz,
	songmuchun@bytedance.com, surenb@google.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, Yang Yang <yang.yang29@zte.com.cn>
Subject: Re: [PATCH] mm/memcg: support control THP behaviour in cgroup
Date: Wed, 11 May 2022 09:21:53 +0200	[thread overview]
Message-ID: <YntkEUKPquTbBjMu@dhcp22.suse.cz> (raw)
In-Reply-To: <627b1899.1c69fb81.cd831.12d9@mx.google.com>

On Wed 11-05-22 01:59:52, CGEL wrote:
> On Tue, May 10, 2022 at 03:36:34PM +0200, Michal Hocko wrote:
[...]
> > Can you come up with a sane hierarchical behavior?
> >
> 
> I think this new interface had better be independent, not hierarchical. Especially
> when we treat a container as a lightweight virtual machine.

I suspect you are focusing too much on your usecase and do not realize
the wider consequences of this being a user interface that still has to
be sensible for other usecases. Take delegation of the control to
subgroups as an example. If this is a per-memcg knob (like swappiness)
then children can override the parent's THP policy. This might be less
of an issue for swappiness because the anon/file reclaim balancing should
be mostly an internal thing. But THP policy is different because it has
effects on workloads running outside of the said cgroup - higher
memory demand, higher contention for high-order memory etc.

I do not really see how this could be a sensible per-memcg policy
without being fully hierarchical.

> 
> > [...]
> > > > > For micro-service architecture, the application in one container is not a
> > > > > set of loosely coupled processes; it aims to provide one specific service,
> > > > > so different containers mean different services, and different services
> > > > > have different QoS demands.
> > > > 
> > > > OK, if they are tightly coupled you could apply the same THP policy via
> > > > the existing prctl interface. Why is that not feasible? As you are noting
> > > > below...
> > > > 
> > > > >     5. Containers are usually managed by compose software, which treats the
> > > > > container as the base management unit;
> > > > 
> > > > ..so the compose software can easily start up the workload by using prctl
> > > > to disable THP for whatever workloads it is not suitable for.
> > > 
> > > prctl(PR_SET_THP_DISABLE..) cannot elegantly support the semantics we
> > > need. If only some containers need THP, the other containers and the host do
> > > not. We must first set the host THP mode to always, and then call prctl() to
> > > disable THP for host tasks and the other containers one by one,
> > 
> > It might not be the most elegant solution but it should work.
> 
> So you agree it's reasonable to set the THP policy for processes in a container, right?

Yes, like for any other processes.

> If so, IMHO, when thousands of processes launch and die on the machine,
> it will be horrible to do this by calling prctl() for each one; I don't see how that is reasonable.

Could you be more specific? The usual prctl use would normally be
handled by the launcher, relying on the per-process policy being
inherited down the road.
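[Editor's note: the launcher pattern discussed above can be sketched in C. This is an illustrative sketch, not code from the patch under discussion. PR_SET_THP_DISABLE and PR_GET_THP_DISABLE have been available since Linux 3.15, and on recent kernels the per-process setting is inherited by children across fork(2) and preserved across execve(2), which is what lets a launcher set the policy once before exec'ing the workload. The helper names and workload path are hypothetical.]

```c
#include <stdio.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Disable (1) or re-enable (0) THP for the calling process.
 * Returns 0 on success, -1 on failure (e.g. a pre-3.15 kernel). */
int set_thp_disable(int disable)
{
    return prctl(PR_SET_THP_DISABLE, (unsigned long)disable, 0, 0, 0);
}

/* Query the per-process THP-disable flag: 0 or 1, or -1 on error. */
int get_thp_disable(void)
{
    return prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0);
}

/* Launcher sketch (hypothetical helper): fork, set the policy in the
 * child, then exec the workload.  The flag survives execve(), and every
 * process the workload later forks inherits it, so the workload itself
 * needs no THP-specific code. */
void launch_without_thp(char *const argv[])
{
    if (fork() == 0) {
        set_thp_disable(1);
        execv(argv[0], argv);   /* the flag is preserved across the exec */
        _exit(127);             /* only reached if execv() fails */
    }
}
```

Under this model the compose software calls `set_thp_disable(1)` exactly once per container start, rather than chasing thousands of short-lived processes: descendants pick the setting up automatically.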

-- 
Michal Hocko
SUSE Labs

Thread overview: 47+ messages

2022-05-05  3:38 [PATCH] mm/memcg: support control THP behaviour in cgroup cgel.zte
2022-05-05 12:49 ` kernel test robot
2022-05-05 13:31 ` kernel test robot
2022-05-05 16:09 ` kernel test robot
2022-05-06 13:41 ` Michal Hocko
2022-05-07  2:05   ` CGEL
2022-05-09 10:00     ` Michal Hocko
2022-05-09 11:26       ` CGEL
2022-05-09 11:48         ` Michal Hocko
2022-05-10  1:43           ` CGEL
2022-05-10 10:00             ` Michal Hocko
2022-05-10 11:52               ` CGEL
2022-05-10 13:36                 ` Michal Hocko
2022-05-11  1:59                   ` CGEL
2022-05-11  7:21                     ` Michal Hocko [this message]
2022-05-11  9:47                       ` CGEL
2022-05-18  5:58                   ` CGEL
2022-05-10 19:34             ` Yang Shi
2022-05-11  2:19               ` CGEL
2022-05-11  2:47                 ` Shakeel Butt
2022-05-11  3:11                   ` Roman Gushchin
2022-05-11  3:31                     ` CGEL
2022-05-18  8:14                       ` Balbir Singh
2022-05-11  3:17                   ` CGEL
