From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752458AbcHUFeV (ORCPT ); Sun, 21 Aug 2016 01:34:21 -0400
Received: from bedivere.hansenpartnership.com ([66.63.167.143]:52400
	"EHLO bedivere.hansenpartnership.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751573AbcHUFeT (ORCPT );
	Sun, 21 Aug 2016 01:34:19 -0400
Message-ID: <1471757654.2354.97.camel@HansenPartnership.com>
Subject: Re: [Documentation] State of CPU controller in cgroup v2
From: James Bottomley
To: Andy Lutomirski, Tejun Heo
Cc: Ingo Molnar, Mike Galbraith, "linux-kernel@vger.kernel.org",
	kernel-team@fb.com, "open list:CONTROL GROUP (CGROUP)",
	Andrew Morton, Paul Turner, Li Zefan, Linux API, Peter Zijlstra,
	Johannes Weiner, Linus Torvalds
Date: Sat, 20 Aug 2016 22:34:14 -0700
In-Reply-To:
References: <20160805170752.GK2542@mtj.duckdns.org>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.16.5
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2016-08-17 at 13:18 -0700, Andy Lutomirski wrote:
> On Aug 5, 2016 7:07 PM, "Tejun Heo" wrote:
[...]
> > 2. Disagreements and Arguments
> >
> > There have been several lengthy discussion threads [3][4] on LKML
> > around the structural constraints of cgroup v2. The two that
> > affect the CPU controller are process granularity and no internal
> > process constraint. Both arise primarily from the need for common
> > resource domain definition across different resources.
> >
> > The common resource domain is a powerful concept in cgroup v2 that
> > allows controllers to make basic assumptions about the structural
> > organization of processes and controllers inside the cgroup
> > hierarchy, and thus solve problems spanning multiple types of
> > resources.
> > The prime example for this is page cache writeback:
> > dirty page cache is regulated through throttling buffered writers
> > based on memory availability, and initiating batched write outs to
> > the disk based on IO capacity. Tracking and controlling writeback
> > inside a cgroup thus requires the direct cooperation of the memory
> > and the IO controller.
> >
> > This easily extends to other areas, such as CPU cycles consumed
> > while performing memory reclaim or IO encryption.
> >
> >
> > 2-1. Contentious Restrictions
> >
> > For controllers of different resources to work together, they must
> > agree on a common organization. This uniform model across
> > controllers imposes two contentious restrictions on the CPU
> > controller: process granularity and the no-internal-process
> > constraint.
> >
> >
> > 2-1-1. Process Granularity
> >
> > For memory, because an address space is shared between all threads
> > of a process, the terminal consumer is a process, not a thread.
> > Separating the threads of a single process into different memory
> > control domains doesn't make semantical sense. cgroup v2 ensures
> > that all controller can agree on the same organization by requiring
> > that threads of the same process belong to the same cgroup.
>
> I haven't followed all of the history here, but it seems to me that
> this argument is less accurate than it appears. Linux, for better or
> for worse, has somewhat orthogonal concepts of thread groups
> (processes), mms, and file tables. An mm has VMAs in it, and VMAs
> can reference things (files, etc) that hold resources. (Two mms can
> share resources by mapping the same thing or using fork().) File
> tables hold files, and files can use resources. Both of these are,
> at best, moderately good approximations of what actually holds
> resources. Meanwhile, threads (tasks) do syscalls, take page faults,
> *allocate* resources, etc.
>
> So I think it's not really true to say that the "terminal consumer"
> of anything is a process, not a thread.
>
> While it's certainly easier to think about assigning processes to
> cgroups, and I certainly agree that, in the common case, it's the
> right thing to do, I don't see why requiring it is a good idea. Can
> we turn this around: what actually goes wrong if cgroup v2 were to
> allow assigning individual threads if a user specifically requests
> it?

A similar point from a different consumer: from the unprivileged
containers point of view, I'm interested in a thread-based interface as
well. The principal utility of unprivileged containers is to allow
applications that wish to do so to use container properties
(effectively to become self-containerising). Some that use the
producer/consumer model do use process pools (Apache springs to mind
instantly), but some use thread pools. It is useful to the latter to
preserve the concept of a thread as being the entity inhabiting the
cgroup (but only where the granularity of the cgroup permits threads to
participate) so we can easily modify them to be self-containerising
without forcing them to switch back from a thread pool model to a
process pool model.

I can see that a process-based model is conceptually easier in v2
because you begin with a process tree, but it would really be a pity to
lose the thread-based controls we have now and permanently lose the
ability to create more as we find uses for them. I can't really see how
improving "common resource domain" is a good tradeoff for this.

James