From: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: "Daniel P. Berrange"
	<berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Frederic Weisbecker
	<fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	Daniel Walsh <dwalsh-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	Cgroups <cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Subject: Re: [RFD] Merge task counter into memcg
Date: Thu, 12 Apr 2012 14:13:49 -0300
Message-ID: <4F870D4D.6020405@parallels.com>
In-Reply-To: <20120412163825.GB13069-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

>
> The reason why I asked Frederic whether it would make more sense as
> part of memcg wasn't about flexibility but mostly about the type of
> the resource.  I'll continue below.
>
>>> Agree. Even people aiming for unified hierarchies are okay with an
>>> opt-in/out system, I believe. So the controllers need not to be
>>> active at all times. One way of doing this is what I suggested to
>>> Frederic: If you don't limit, don't account.
>>
>> I don't agree, it's a valid usecase to monitor a workload without
>> limiting it in any way.  I do it all the time.
>
> AFAICS, this seems to be the most valid use case for different
> controllers seeing different part of the hierarchy, even if the
> hierarchies aren't completely separate.  Accounting and control being
> in separate controllers is pretty sucky too as it ends up accounting
> things multiple times.  Maybe all controllers should learn how to do
> accounting w/o applying limits?  Not sure yet.

Well...

* I don't know how blkcgrp applies limits
* the cpu cgroup is limiting by nature, in the sense that it divides
shares in proportion to the number of cgroups in a hierarchy
* memcg has a RESOURCE_MAX default limit that is bigger than anything 
you can possibly count.

So one of the problems is that "limiting" may mean a different thing to
each controller.
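
To make the cpu bullet above concrete, here is a minimal sketch of
proportional sharing. It is a toy model, not kernel code: the function
name cpu_fractions and the group names are made up for illustration,
and 1024 is used as the per-group default weight. Under contention each
group gets CPU time in proportion to its weight, so with equal weights
the split depends only on how many groups there are.

```python
def cpu_fractions(shares):
    """Return each group's fraction of CPU time under full contention.

    `shares` maps a (hypothetical) group name to its weight.
    """
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}


# Four groups left at the same weight: each one ends up with a quarter.
equal = cpu_fractions({"a": 1024, "b": 1024, "c": 1024, "d": 1024})

# Doubling one group's weight shifts the split without any explicit limit.
weighted = cpu_fractions({"a": 2048, "b": 1024, "c": 1024})
```

There is no "unlimited" setting in this model: adding a sibling group
shrinks everyone else's fraction, which is the sense in which the
controller limits by nature.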

I am mostly talking about the memory cgroup here. "Accounting
without limiting" can trivially be done by setting the limit to
RESOURCE_MAX-delta. This won't work once we start having machines with
2^64 bytes of physical memory, but I guess we have some time until that
happens.

The way I see it, this is just a technicality: a way to disable the
accounting of a resource at runtime without filling the hierarchy with
flags.
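
A minimal sketch of how that could look, assuming the "if you don't
limit, don't account" rule is keyed off the default limit. This is a toy
model rather than the kernel's counter code: the Counter class is
hypothetical, and the value given to RESOURCE_MAX here is only a
stand-in for "bigger than anything you can count".

```python
RESOURCE_MAX = 2**63 - 1  # assumed stand-in value, for illustration only


class Counter:
    """Toy byte counter: the default limit doubles as the accounting off switch."""

    def __init__(self, limit=RESOURCE_MAX):
        self.limit = limit
        self.usage = 0

    @property
    def accounting(self):
        # "If you don't limit, don't account": no separate flag is needed.
        return self.limit != RESOURCE_MAX

    def charge(self, nbytes):
        if not self.accounting:
            return True           # nothing is recorded, nothing can fail
        if self.usage + nbytes > self.limit:
            return False          # charge would exceed the limit
        self.usage += nbytes
        return True


GiB = 2**30

unaccounted = Counter()                       # default limit: accounting off
unaccounted.charge(4 * GiB)

monitored = Counter(limit=RESOURCE_MAX - 1)   # RESOURCE_MAX-delta: account, never limit
monitored.charge(4 * GiB)

limited = Counter(limit=1 * GiB)              # a real limit
accepted = limited.charge(2 * GiB)
```

The monitored group is the "accounting without limiting" case: usage is
tracked, but no realistic charge can reach the limit.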


>> To reraise a point from my other email that was ignored: do users
>> actually really care about the number of tasks when they want to
>> prevent forkbombs?  If a task would use neither CPU nor memory, you
>> would not be interested in limiting the number of tasks.
>>
>> Because the number of tasks is not a resource.  CPU and memory are.
>>
>> So again, if we would include the memory impact of tasks properly
>> (structures, kernel stack pages) in the kernel memory counters which
>> we allow to limit, shouldn't this solve our problem?
>
> The task counter is trying to control the *number* of tasks, which is
> purely memory overhead.

No, it is not. As we talk, it is becoming increasingly clear that, given
the use case, the correct term is "translating tasks *back* into the
actual amount of memory".

> Translating #tasks into the actual amount of
> memory isn't too trivial tho - the task stack isn't the only
> allocation and the numbers should somehow make sense to the userland
> in consistent way.  Also, I'm not sure whether this particular limit
> should live in its silo or should be summed up together as part of
> kmem (kmem itself is in its own silo after all apart from user memory,
> right?).


It is accounted together, but limited separately. Setting
memory.kmem.limit > memory.limit is a trivial way to say "don't limit
kmem" (and yet account it).
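
A minimal sketch of "accounted together, limited separately", written as
a toy model and not as the memcg implementation. The MemcgModel class
and its field names are hypothetical. Kernel charges count against both
the kmem counter and the total, so a kmem limit set above the total
limit can never be the one that triggers.

```python
class MemcgModel:
    """Toy model: kernel memory counts toward the total but has its own limit."""

    def __init__(self, mem_limit, kmem_limit):
        self.mem_limit = mem_limit    # limit on user + kernel bytes
        self.kmem_limit = kmem_limit  # limit on kernel bytes only
        self.mem_usage = 0
        self.kmem_usage = 0

    def charge_user(self, nbytes):
        if self.mem_usage + nbytes > self.mem_limit:
            return None               # rejected, no limit name needed here
        self.mem_usage += nbytes
        return "ok"

    def charge_kernel(self, nbytes):
        # Check the kmem-only limit first, then the combined one.
        if self.kmem_usage + nbytes > self.kmem_limit:
            return "kmem limit"
        if self.mem_usage + nbytes > self.mem_limit:
            return "memory limit"
        self.kmem_usage += nbytes
        self.mem_usage += nbytes
        return "ok"


MiB = 2**20

# kmem limit above the total limit: kmem is accounted but effectively unlimited.
group = MemcgModel(mem_limit=100 * MiB, kmem_limit=200 * MiB)
first = group.charge_kernel(60 * MiB)
second = group.charge_kernel(60 * MiB)   # total would be 120 MiB
```

In this model the second charge is stopped by the total limit, while
kmem usage remains visible for monitoring.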

The same thing would go for a stack limit (well, assuming it won't be
merged into kmem itself as well).

> So, if those can be settled, I think protecting against fork
> bombs could fit memcg better in the sense that the whole thing makes
> more sense.

I myself would advise against merging anything that is not byte-based
into memcg.
"task counter" is not byte-based.
"fork bomb preventer" might be.
