From: Peter Zijlstra <peterz@infradead.org>
To: Tejun Heo <tj@kernel.org>
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
	mingo@redhat.com, lizefan@huawei.com, hannes@cmpxchg.org,
	pjt@google.com, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org, linux-api@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: [PATCHSET RFC cgroup/for-4.6] cgroup, sched: implement resource group and PRIO_RGRP
Date: Mon, 14 Mar 2016 12:30:13 +0100
Message-ID: <20160314113013.GM6344@twins.programming.kicks-ass.net>
In-Reply-To: <1457710888-31182-1-git-send-email-tj@kernel.org>

On Fri, Mar 11, 2016 at 10:41:18AM -0500, Tejun Heo wrote:

> * A rgroup is a cgroup which is invisible on and transparent to the
>   system-level cgroupfs interface.
>
> * A rgroup can be created by specifying CLONE_NEWRGRP flag, along with
>   CLONE_THREAD, during clone(2).  A new rgroup is created under the
>   parent thread's cgroup and the new thread is created in it.

This seems overly restrictive. As you well know, there are people moving
threads about after creation. Also, with this interface the whole thing
cannot be used until your libc's pthread_create() has been patched to
allow use of this new flag.

> * A rgroup is automatically destroyed when empty.

Except for zombies, it appears..

> * A top-level rgroup of a process is a rgroup whose parent cgroup is a
>   sgroup.  A process may have multiple top-level rgroups and thus
>   multiple rgroup subtrees under the same parent sgroup.
>
> * Unlike sgroups, rgroups are allowed to compete against peer threads.
>   Each rgroup behaves equivalent to a sibling task.
>
> * rgroup subtrees are local to the process.  When the process forks or
>   execs, its rgroup subtrees are collapsed.
>
> * When a process is migrated to a different cgroup, its rgroup
>   subtrees are preserved.

This all makes it impossible to, say, put a single thread outside of the
hierarchy forced upon it by the process, like putting an RT thread in an
isolated group on the side. Which is a rather common thing to do.

> rgroup lays the foundation for other kernel mechanisms to make use of
> resource controllers while providing proper isolation between system
> management and in-process operations removing the awkward and
> layer-violating requirement for coordination between individual
> applications and system management.  On top of the rgroup mechanism,
> PRIO_RGRP is implemented for {set|get}priority(2).
>
> * PRIO_RGRP can only be used if the target task is already in a
>   rgroup.  If setpriority(2) is used and cpu controller is available,
>   cpu controller is enabled until the target rgroup is covered and the
>   specified nice value is set as the weight of the rgroup.
>
> * The specified nice value has the same meaning as for tasks.  For
>   example, a rgroup and a task competing under the same parent would
>   behave exactly the same as two tasks.
>
> * For top-level rgroups, PRIO_RGRP follows the same rlimit
>   restrictions as PRIO_PROCESS; however, as nested rgroups only
>   distribute CPU cycles which are allocated to the process, no
>   restriction is applied.

While this appears neat, I doubt it will remain so in the face of this:

> * A mechanism that applications can use to publish certain rgroups so
>   that external entities can determine which IDs to use to change
>   rgroup settings.  I already have interface and implementation design
>   mostly pinned down.

So you need some newfangled way to set/query all the other possible
cgroup parameters supported, and then suddenly you have one that has two
possible interfaces. That's way ugly.

While I appreciate the sentiment that having two entities poking at the
cgroup filesystem without coordination is a problem, I don't see this as
the solution. I would much rather just kill the system-wide thing; that
too solves the problem.

IOW, I'm unconvinced this approach will cater to current practices or
even allow similar functionality.
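[Editorial note, not part of the original message: the PRIO_RGRP discussion above leans on nice values being translated into CFS scheduling weights. The rgroup interface itself is only an RFC, but the nice-to-weight mapping it reuses is standard CFS behaviour: each nice step scales the weight by roughly 1.25x, with nice 0 defined as weight 1024. A rough back-of-the-envelope sketch, approximating the kernel's sched_prio_to_weight table rather than reproducing its exact entries:]

```python
def nice_to_weight(nice):
    """Approximate CFS weight for a nice value in [-20, 19].

    Each nice step changes the weight by ~1.25x; nice 0 is weight 1024.
    The kernel uses a precomputed table, so exact values differ slightly.
    """
    return round(1024 / (1.25 ** nice))

def cpu_share(weights):
    """Relative CPU share of entities competing under one parent."""
    total = sum(weights)
    return [w / total for w in weights]

# Under the proposal, a rgroup at nice 0 competing with a sibling task
# at nice 5 behaves exactly like two tasks at those nice values:
w_rgroup = nice_to_weight(0)          # 1024
w_task = nice_to_weight(5)            # roughly a third of the nice-0 weight
shares = cpu_share([w_rgroup, w_task])
print(shares)                         # rgroup gets ~75%, task ~25%
```

[This is why "a rgroup and a task competing under the same parent would behave exactly the same as two tasks": both are reduced to a weight in the same CFS runqueue.]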