* [PATCH 0/2] fix and deprecate use_hierarchy file @ 2012-06-26 15:47 Glauber Costa 2012-06-26 15:47 ` Glauber Costa 2012-06-26 15:47 ` Glauber Costa 0 siblings, 2 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-26 15:47 UTC (permalink / raw) To: cgroups Cc: linux-mm, kamezawa.hiroyu, Michal Hocko, Johannes Weiner, Andrew Morton Hi, I am just bundling my last two patches for use_hierarchy together, so it gets easier to track and apply. After these patches, use_hierarchy will default to true, and will need to be disabled at the root level to fallback to non-hierarchical. Still need, of course, to hear Kame's opinion on this. Thanks Glauber Costa (2): fix bad behavior in use_hierarchy file memcg: first step towards hierarchical controller mm/memcontrol.c | 11 +++++++++++ 1 file changed, 11 insertions(+) -- 1.7.10.2 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH 1/2] fix bad behavior in use_hierarchy file @ 2012-06-26 15:47 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-26 15:47 UTC (permalink / raw) To: cgroups Cc: linux-mm, kamezawa.hiroyu, Michal Hocko, Johannes Weiner, Andrew Morton, Glauber Costa, Dhaval Giani I have an application that does the following: * copy the state of all controllers attached to a hierarchy * replicate it as a child of the current level. I would expect writes to the files to mostly succeed, since they are inheriting sane values from parents. But that is not the case for use_hierarchy. If it is set to 0, we succeed ok. If we're set to 1, the value of the file is automatically set to 1 in the children, but if userspace tries to write the very same 1, it will fail. That same situation happens if we set use_hierarchy, create a child, and then try to write 1 again. Now, there is no reason whatsoever for failing to write a value that is already there. It doesn't even match the comments, that states: /* If parent's use_hierarchy is set, we can't make any modifications * in the child subtrees... since we are not changing anything. The following patch tests the new value against the one we're storing, and automatically return 0 if we're not proposing a change. Signed-off-by: Glauber Costa <glommer@parallels.com> CC: Dhaval Giani <dhaval.giani@gmail.com> CC: Michal Hocko <mhocko@suse.cz> CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> CC: Johannes Weiner <hannes@cmpxchg.org> --- mm/memcontrol.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index df8c9fb..85f7790 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3989,6 +3989,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, parent_memcg = mem_cgroup_from_cont(parent); cgroup_lock(); + + if (memcg->use_hierarchy == val) + goto out; + /* * If parent's use_hierarchy is set, we can't make any modifications * in the child subtrees. If it is unset, then the change can @@ -4005,6 +4009,8 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, retval = -EBUSY; } else retval = -EINVAL; + +out: cgroup_unlock(); return retval; -- 1.7.10.2 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 1/2] fix bad behavior in use_hierarchy file @ 2012-06-26 15:47 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-26 15:47 UTC (permalink / raw) To: cgroups-u79uwXL29TY76Z2rM5mHXA Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Michal Hocko, Johannes Weiner, Andrew Morton, Glauber Costa, Dhaval Giani I have an application that does the following: * copy the state of all controllers attached to a hierarchy * replicate it as a child of the current level. I would expect writes to the files to mostly succeed, since they are inheriting sane values from parents. But that is not the case for use_hierarchy. If it is set to 0, we succeed ok. If we're set to 1, the value of the file is automatically set to 1 in the children, but if userspace tries to write the very same 1, it will fail. That same situation happens if we set use_hierarchy, create a child, and then try to write 1 again. Now, there is no reason whatsoever for failing to write a value that is already there. It doesn't even match the comments, that states: /* If parent's use_hierarchy is set, we can't make any modifications * in the child subtrees... since we are not changing anything. The following patch tests the new value against the one we're storing, and automatically return 0 if we're not proposing a change. Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> CC: Dhaval Giani <dhaval.giani-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> CC: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> CC: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> CC: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> --- mm/memcontrol.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index df8c9fb..85f7790 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3989,6 +3989,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, parent_memcg = mem_cgroup_from_cont(parent); cgroup_lock(); + + if (memcg->use_hierarchy == val) + goto out; + /* * If parent's use_hierarchy is set, we can't make any modifications * in the child subtrees. If it is unset, then the change can @@ -4005,6 +4009,8 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, retval = -EBUSY; } else retval = -EINVAL; + +out: cgroup_unlock(); return retval; -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCH 1/2] fix bad behavior in use_hierarchy file @ 2012-06-26 15:52 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-26 15:52 UTC (permalink / raw) To: Glauber Costa Cc: cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton, Dhaval Giani On Tue 26-06-12 19:47:13, Glauber Costa wrote: > I have an application that does the following: > > * copy the state of all controllers attached to a hierarchy > * replicate it as a child of the current level. > > I would expect writes to the files to mostly succeed, since they > are inheriting sane values from parents. > > But that is not the case for use_hierarchy. If it is set to 0, we > succeed ok. If we're set to 1, the value of the file is automatically > set to 1 in the children, but if userspace tries to write the > very same 1, it will fail. That same situation happens if we > set use_hierarchy, create a child, and then try to write 1 again. > > Now, there is no reason whatsoever for failing to write a value > that is already there. It doesn't even match the comments, that > states: > > /* If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees... > > since we are not changing anything. > > The following patch tests the new value against the one we're storing, > and automatically return 0 if we're not proposing a change. > > Signed-off-by: Glauber Costa <glommer@parallels.com> > CC: Dhaval Giani <dhaval.giani@gmail.com> > CC: Michal Hocko <mhocko@suse.cz> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > CC: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> > --- > mm/memcontrol.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index df8c9fb..85f7790 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3989,6 +3989,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > parent_memcg = mem_cgroup_from_cont(parent); > > cgroup_lock(); > + > + if (memcg->use_hierarchy == val) > + goto out; > + > /* > * If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees. If it is unset, then the change can > @@ -4005,6 +4009,8 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > retval = -EBUSY; > } else > retval = -EINVAL; > + > +out: > cgroup_unlock(); > > return retval; > -- > 1.7.10.2 > > -- > To unsubscribe from this list: send the line "unsubscribe cgroups" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 1/2] fix bad behavior in use_hierarchy file @ 2012-06-26 15:52 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-26 15:52 UTC (permalink / raw) To: Glauber Costa Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton, Dhaval Giani On Tue 26-06-12 19:47:13, Glauber Costa wrote: > I have an application that does the following: > > * copy the state of all controllers attached to a hierarchy > * replicate it as a child of the current level. > > I would expect writes to the files to mostly succeed, since they > are inheriting sane values from parents. > > But that is not the case for use_hierarchy. If it is set to 0, we > succeed ok. If we're set to 1, the value of the file is automatically > set to 1 in the children, but if userspace tries to write the > very same 1, it will fail. That same situation happens if we > set use_hierarchy, create a child, and then try to write 1 again. > > Now, there is no reason whatsoever for failing to write a value > that is already there. It doesn't even match the comments, that > states: > > /* If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees... > > since we are not changing anything. > > The following patch tests the new value against the one we're storing, > and automatically return 0 if we're not proposing a change. > > Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> > CC: Dhaval Giani <dhaval.giani-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > CC: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> > CC: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> Acked-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> > --- > mm/memcontrol.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index df8c9fb..85f7790 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3989,6 +3989,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > parent_memcg = mem_cgroup_from_cont(parent); > > cgroup_lock(); > + > + if (memcg->use_hierarchy == val) > + goto out; > + > /* > * If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees. If it is unset, then the change can > @@ -4005,6 +4009,8 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > retval = -EBUSY; > } else > retval = -EINVAL; > + > +out: > cgroup_unlock(); > > return retval; > -- > 1.7.10.2 > > -- > To unsubscribe from this list: send the line "unsubscribe cgroups" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 1/2] fix bad behavior in use_hierarchy file @ 2012-06-26 15:54 ` Johannes Weiner 0 siblings, 0 replies; 47+ messages in thread From: Johannes Weiner @ 2012-06-26 15:54 UTC (permalink / raw) To: Glauber Costa Cc: cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Andrew Morton, Dhaval Giani On Tue, Jun 26, 2012 at 07:47:13PM +0400, Glauber Costa wrote: > I have an application that does the following: > > * copy the state of all controllers attached to a hierarchy > * replicate it as a child of the current level. > > I would expect writes to the files to mostly succeed, since they > are inheriting sane values from parents. > > But that is not the case for use_hierarchy. If it is set to 0, we > succeed ok. If we're set to 1, the value of the file is automatically > set to 1 in the children, but if userspace tries to write the > very same 1, it will fail. That same situation happens if we > set use_hierarchy, create a child, and then try to write 1 again. > > Now, there is no reason whatsoever for failing to write a value > that is already there. It doesn't even match the comments, that > states: > > /* If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees... > > since we are not changing anything. > > The following patch tests the new value against the one we're storing, > and automatically return 0 if we're not proposing a change. > > Signed-off-by: Glauber Costa <glommer@parallels.com> > CC: Dhaval Giani <dhaval.giani@gmail.com> > CC: Michal Hocko <mhocko@suse.cz> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > CC: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 1/2] fix bad behavior in use_hierarchy file @ 2012-06-26 15:54 ` Johannes Weiner 0 siblings, 0 replies; 47+ messages in thread From: Johannes Weiner @ 2012-06-26 15:54 UTC (permalink / raw) To: Glauber Costa Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Michal Hocko, Andrew Morton, Dhaval Giani On Tue, Jun 26, 2012 at 07:47:13PM +0400, Glauber Costa wrote: > I have an application that does the following: > > * copy the state of all controllers attached to a hierarchy > * replicate it as a child of the current level. > > I would expect writes to the files to mostly succeed, since they > are inheriting sane values from parents. > > But that is not the case for use_hierarchy. If it is set to 0, we > succeed ok. If we're set to 1, the value of the file is automatically > set to 1 in the children, but if userspace tries to write the > very same 1, it will fail. That same situation happens if we > set use_hierarchy, create a child, and then try to write 1 again. > > Now, there is no reason whatsoever for failing to write a value > that is already there. It doesn't even match the comments, that > states: > > /* If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees... > > since we are not changing anything. > > The following patch tests the new value against the one we're storing, > and automatically return 0 if we're not proposing a change. > > Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> > CC: Dhaval Giani <dhaval.giani-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > CC: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> > CC: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 1/2] fix bad behavior in use_hierarchy file @ 2012-06-26 22:25 ` Andrew Morton 0 siblings, 0 replies; 47+ messages in thread From: Andrew Morton @ 2012-06-26 22:25 UTC (permalink / raw) To: Glauber Costa Cc: cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Johannes Weiner, Dhaval Giani, Tejun Heo On Tue, 26 Jun 2012 19:47:13 +0400 Glauber Costa <glommer@parallels.com> wrote: > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3989,6 +3989,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > parent_memcg = mem_cgroup_from_cont(parent); > > cgroup_lock(); > + > + if (memcg->use_hierarchy == val) > + goto out; > + > /* > * If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees. If it is unset, then the change can > @@ -4005,6 +4009,8 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > retval = -EBUSY; > } else > retval = -EINVAL; > + > +out: > cgroup_unlock(); > > return retval; hm. The various .write_u64() implementations go and return zero on success and cgroup_write_X64() sees this and rewrites the return value to `nbytes'. That was a bit naughty of us - it prevents a .write_u64() instance from being able to fully implement a partial write. We can *partially* implement a partial write, by returning a value between 1 and nbytes-1, but we can't return zero. It's a weird interface, it's a surprising interface and it was quite unnecessary to do it this way. Someone please slap Paul. It's hardly a big problem I, but that's why the unix write() interface was designed the way it is. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 1/2] fix bad behavior in use_hierarchy file @ 2012-06-26 22:25 ` Andrew Morton 0 siblings, 0 replies; 47+ messages in thread From: Andrew Morton @ 2012-06-26 22:25 UTC (permalink / raw) To: Glauber Costa Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Michal Hocko, Johannes Weiner, Dhaval Giani, Tejun Heo On Tue, 26 Jun 2012 19:47:13 +0400 Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote: > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3989,6 +3989,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > parent_memcg = mem_cgroup_from_cont(parent); > > cgroup_lock(); > + > + if (memcg->use_hierarchy == val) > + goto out; > + > /* > * If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees. If it is unset, then the change can > @@ -4005,6 +4009,8 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > retval = -EBUSY; > } else > retval = -EINVAL; > + > +out: > cgroup_unlock(); > > return retval; hm. The various .write_u64() implementations go and return zero on success and cgroup_write_X64() sees this and rewrites the return value to `nbytes'. That was a bit naughty of us - it prevents a .write_u64() instance from being able to fully implement a partial write. We can *partially* implement a partial write, by returning a value between 1 and nbytes-1, but we can't return zero. It's a weird interface, it's a surprising interface and it was quite unnecessary to do it this way. Someone please slap Paul. It's hardly a big problem I, but that's why the unix write() interface was designed the way it is. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 1/2] fix bad behavior in use_hierarchy file 2012-06-26 22:25 ` Andrew Morton (?) @ 2012-06-26 22:30 ` Tejun Heo -1 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 22:30 UTC (permalink / raw) To: Andrew Morton Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Johannes Weiner, Dhaval Giani, Li Zefan (cc'ing Li) Hello, Andrew. On Tue, Jun 26, 2012 at 03:25:22PM -0700, Andrew Morton wrote: > hm. The various .write_u64() implementations go and return zero on > success and cgroup_write_X64() sees this and rewrites the return value > to `nbytes'. > > That was a bit naughty of us - it prevents a .write_u64() instance from > being able to fully implement a partial write. We can *partially* > implement a partial write, by returning a value between 1 and nbytes-1, > but we can't return zero. It's a weird interface, it's a surprising > interface and it was quite unnecessary to do it this way. Someone > please slap Paul. > > It's hardly a big problem I, but that's why the unix write() interface > was designed the way it is. The whole file interface is severely over-designed like a lot of other things in cgorup. I'm thinking about consolidating all the different read/write methods into one generic pair, likely based on seq_file and make all others helpers. Thanks. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 15:47 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-26 15:47 UTC (permalink / raw) To: cgroups Cc: linux-mm, kamezawa.hiroyu, Michal Hocko, Johannes Weiner, Andrew Morton, Glauber Costa, Tejun Heo Okay, so after recent discussions, I am proposing the following patch. It won't remove hierarchy, or anything like that. Just default to true in the root cgroup, and print a warning once if you try to set it back to 0. I am not adding it to feature-removal-schedule.txt because I don't view it as a consensus. Rather, changing the default would allow us to give it a time around in the open, and see if people complain and what we can learn about that. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> CC: Michal Hocko <mhocko@suse.cz> CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> CC: Tejun Heo <tj@kernel.org> --- mm/memcontrol.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 85f7790..c37e4c1 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, if (memcg->use_hierarchy == val) goto out; + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, + "Non-hierarchical memcg is considered for deprecation\n" + "Please consider reorganizing your tree to work with hierarchical accounting\n" + "If you have any reason not to, let us know at cgroups@vger.kernel.org\n"); /* * If parent's use_hierarchy is set, we can't make any modifications * in the child subtrees. If it is unset, then the change can @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) INIT_WORK(&stock->work, drain_local_stock); } hotcpu_notifier(memcg_cpu_hotplug_callback, 0); + memcg->use_hierarchy = true; } else { parent = mem_cgroup_from_cont(cont->parent); memcg->use_hierarchy = parent->use_hierarchy; -- 1.7.10.2 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 15:47 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-26 15:47 UTC (permalink / raw) To: cgroups-u79uwXL29TY76Z2rM5mHXA Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Michal Hocko, Johannes Weiner, Andrew Morton, Glauber Costa, Tejun Heo Okay, so after recent discussions, I am proposing the following patch. It won't remove hierarchy, or anything like that. Just default to true in the root cgroup, and print a warning once if you try to set it back to 0. I am not adding it to feature-removal-schedule.txt because I don't view it as a consensus. Rather, changing the default would allow us to give it a time around in the open, and see if people complain and what we can learn about that. Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> CC: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> CC: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> CC: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> --- mm/memcontrol.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 85f7790..c37e4c1 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, if (memcg->use_hierarchy == val) goto out; + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, + "Non-hierarchical memcg is considered for deprecation\n" + "Please consider reorganizing your tree to work with hierarchical accounting\n" + "If you have any reason not to, let us know at cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org\n"); /* * If parent's use_hierarchy is set, we can't make any modifications * in the child subtrees. If it is unset, then the change can @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) INIT_WORK(&stock->work, drain_local_stock); } hotcpu_notifier(memcg_cpu_hotplug_callback, 0); + memcg->use_hierarchy = true; } else { parent = mem_cgroup_from_cont(cont->parent); memcg->use_hierarchy = parent->use_hierarchy; -- 1.7.10.2 ^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 16:15 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-26 16:15 UTC (permalink / raw) To: Glauber Costa Cc: cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton, Tejun Heo On Tue 26-06-12 19:47:14, Glauber Costa wrote: > Okay, so after recent discussions, I am proposing the following > patch. It won't remove hierarchy, or anything like that. Just default > to true in the root cgroup, and print a warning once if you try > to set it back to 0. > > I am not adding it to feature-removal-schedule.txt because I don't > view it as a consensus. Rather, changing the default would allow us > to give it a time around in the open, and see if people complain > and what we can learn about that. > > Signed-off-by: Glauber Costa <glommer@parallels.com> > Acked-by: Johannes Weiner <hannes@cmpxchg.org> > CC: Michal Hocko <mhocko@suse.cz> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > CC: Tejun Heo <tj@kernel.org> > --- > mm/memcontrol.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 85f7790..c37e4c1 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > if (memcg->use_hierarchy == val) > goto out; > > + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, Do you have to test anything here at all? The test above will get you out without doing anything if you are not trying to change anything. The default is true so you have to be trying to disable it. If you omit !parent_memcg test as well you will get a bonus of the early warning even if somebody has cgconfig.conf like this: group a/b/c { memory { memory.use_hierarchy = 0; [...] } } which worked previously... True there is a risk of a "false warning" when somebody just tries to set disable hierarchy when it is (and never was) allowed but I do not think this is that bad. The configuration like this is rather artificial but I guess we should try to be foolproof. > + "Non-hierarchical memcg is considered for deprecation\n" > + "Please consider reorganizing your tree to work with hierarchical accounting\n" > + "If you have any reason not to, let us know at cgroups@vger.kernel.org\n"); > /* > * If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees. If it is unset, then the change can > @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) > INIT_WORK(&stock->work, drain_local_stock); > } > hotcpu_notifier(memcg_cpu_hotplug_callback, 0); > + memcg->use_hierarchy = true; > } else { > parent = mem_cgroup_from_cont(cont->parent); > memcg->use_hierarchy = parent->use_hierarchy; > -- > 1.7.10.2 > > -- > To unsubscribe from this list: send the line "unsubscribe cgroups" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 16:15 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-26 16:15 UTC (permalink / raw) To: Glauber Costa Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton, Tejun Heo On Tue 26-06-12 19:47:14, Glauber Costa wrote: > Okay, so after recent discussions, I am proposing the following > patch. It won't remove hierarchy, or anything like that. Just default > to true in the root cgroup, and print a warning once if you try > to set it back to 0. > > I am not adding it to feature-removal-schedule.txt because I don't > view it as a consensus. Rather, changing the default would allow us > to give it a time around in the open, and see if people complain > and what we can learn about that. > > Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> > Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> > CC: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> > CC: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > --- > mm/memcontrol.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 85f7790..c37e4c1 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > if (memcg->use_hierarchy == val) > goto out; > > + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, Do you have to test anything here at all? The test above will get you out without doing anything if you are not trying to change anything. The default is true so you have to be trying to disable it. If you omit !parent_memcg test as well you will get a bonus of the early warning even if somebody has cgconfig.conf like this: group a/b/c { memory { memory.use_hierarchy = 0; [...] } } which worked previously... True there is a risk of a "false warning" when somebody just tries to set disable hierarchy when it is (and never was) allowed but I do not think this is that bad. The configuration like this is rather artificial but I guess we should try to be foolproof. > + "Non-hierarchical memcg is considered for deprecation\n" > + "Please consider reorganizing your tree to work with hierarchical accounting\n" > + "If you have any reason not to, let us know at cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org\n"); > /* > * If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees. If it is unset, then the change can > @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) > INIT_WORK(&stock->work, drain_local_stock); > } > hotcpu_notifier(memcg_cpu_hotplug_callback, 0); > + memcg->use_hierarchy = true; > } else { > parent = mem_cgroup_from_cont(cont->parent); > memcg->use_hierarchy = parent->use_hierarchy; > -- > 1.7.10.2 > > -- > To unsubscribe from this list: send the line "unsubscribe cgroups" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 16:37 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-26 16:37 UTC (permalink / raw) To: Michal Hocko Cc: cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton, Tejun Heo On 06/26/2012 08:15 PM, Michal Hocko wrote: > On Tue 26-06-12 19:47:14, Glauber Costa wrote: >> Okay, so after recent discussions, I am proposing the following >> patch. It won't remove hierarchy, or anything like that. Just default >> to true in the root cgroup, and print a warning once if you try >> to set it back to 0. >> >> I am not adding it to feature-removal-schedule.txt because I don't >> view it as a consensus. Rather, changing the default would allow us >> to give it a time around in the open, and see if people complain >> and what we can learn about that. >> >> Signed-off-by: Glauber Costa <glommer@parallels.com> >> Acked-by: Johannes Weiner <hannes@cmpxchg.org> >> CC: Michal Hocko <mhocko@suse.cz> >> CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> >> CC: Tejun Heo <tj@kernel.org> >> --- >> mm/memcontrol.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >> index 85f7790..c37e4c1 100644 >> --- a/mm/memcontrol.c >> +++ b/mm/memcontrol.c >> @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, >> if (memcg->use_hierarchy == val) >> goto out; >> >> + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, > > Do you have to test anything here at all? The test above will get you > out without doing anything if you are not trying to change anything. > The default is true so you have to be trying to disable it. > > If you omit !parent_memcg test as well you will get a bonus of the early > warning even if somebody has cgconfig.conf like this: > > group a/b/c { > memory { > memory.use_hierarchy = 0; > [...] > } > } > > which worked previously... > True there is a risk of a "false warning" when somebody just tries to > set disable hierarchy when it is (and never was) allowed but I do not > think this is that bad. Well, a false warning is not that bad. It is better to be vocal. I will wait for Kame to put his comments, and I can resend with that change. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 16:37 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-26 16:37 UTC (permalink / raw) To: Michal Hocko Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton, Tejun Heo On 06/26/2012 08:15 PM, Michal Hocko wrote: > On Tue 26-06-12 19:47:14, Glauber Costa wrote: >> Okay, so after recent discussions, I am proposing the following >> patch. It won't remove hierarchy, or anything like that. Just default >> to true in the root cgroup, and print a warning once if you try >> to set it back to 0. >> >> I am not adding it to feature-removal-schedule.txt because I don't >> view it as a consensus. Rather, changing the default would allow us >> to give it a time around in the open, and see if people complain >> and what we can learn about that. >> >> Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> >> Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> >> CC: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> >> CC: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> >> CC: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> >> --- >> mm/memcontrol.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >> index 85f7790..c37e4c1 100644 >> --- a/mm/memcontrol.c >> +++ b/mm/memcontrol.c >> @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, >> if (memcg->use_hierarchy == val) >> goto out; >> >> + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, > > Do you have to test anything here at all? The test above will get you > out without doing anything if you are not trying to change anything. > The default is true so you have to be trying to disable it. > > If you omit !parent_memcg test as well you will get a bonus of the early > warning even if somebody has cgconfig.conf like this: > > group a/b/c { > memory { > memory.use_hierarchy = 0; > [...] > } > } > > which worked previously... > True there is a risk of a "false warning" when somebody just tries to > set disable hierarchy when it is (and never was) allowed but I do not > think this is that bad. Well, a false warning is not that bad. It is better to be vocal. I will wait for Kame to put his comments, and I can resend with that change. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 17:54 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-26 17:54 UTC (permalink / raw) To: Glauber Costa Cc: cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton, Tejun Heo On Tue 26-06-12 20:37:16, Glauber Costa wrote: > On 06/26/2012 08:15 PM, Michal Hocko wrote: > >On Tue 26-06-12 19:47:14, Glauber Costa wrote: > >>Okay, so after recent discussions, I am proposing the following > >>patch. It won't remove hierarchy, or anything like that. Just default > >>to true in the root cgroup, and print a warning once if you try > >>to set it back to 0. > >> > >>I am not adding it to feature-removal-schedule.txt because I don't > >>view it as a consensus. Rather, changing the default would allow us > >>to give it a time around in the open, and see if people complain > >>and what we can learn about that. > >> > >>Signed-off-by: Glauber Costa <glommer@parallels.com> > >>Acked-by: Johannes Weiner <hannes@cmpxchg.org> > >>CC: Michal Hocko <mhocko@suse.cz> > >>CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > >>CC: Tejun Heo <tj@kernel.org> > >>--- > >> mm/memcontrol.c | 5 +++++ > >> 1 file changed, 5 insertions(+) > >> > >>diff --git a/mm/memcontrol.c b/mm/memcontrol.c > >>index 85f7790..c37e4c1 100644 > >>--- a/mm/memcontrol.c > >>+++ b/mm/memcontrol.c > >>@@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > >> if (memcg->use_hierarchy == val) > >> goto out; > >> > >>+ WARN_ONCE(!parent_memcg && memcg->use_hierarchy, > > > >Do you have to test anything here at all? The test above will get you > >out without doing anything if you are not trying to change anything. > >The default is true so you have to be trying to disable it. > > > >If you omit !parent_memcg test as well you will get a bonus of the early > >warning even if somebody has cgconfig.conf like this: > > > > group a/b/c { > > memory { > > memory.use_hierarchy = 0; > > [...] > > } > > } > > > >which worked previously... > >True there is a risk of a "false warning" when somebody just tries to > >set disable hierarchy when it is (and never was) allowed but I do not > >think this is that bad. > > > Well, a false warning is not that bad. > It is better to be vocal. > > I will wait for Kame to put his comments, and I can resend with that change. OK, you can resend with Acked-by: Michal Hocko <mhocko@suse.cz> And thanks! I will try to push this into OpenSUSE kernels to have a wider audience for testing. I think that the approach is safe because there is a safe fallback I mentioned earlier (maybe you can put it into the changelog) and if there really exists a setting which cannot be converted to non-hierarchical layout then we should better know as soon as possible. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 17:54 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-26 17:54 UTC (permalink / raw) To: Glauber Costa Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton, Tejun Heo On Tue 26-06-12 20:37:16, Glauber Costa wrote: > On 06/26/2012 08:15 PM, Michal Hocko wrote: > >On Tue 26-06-12 19:47:14, Glauber Costa wrote: > >>Okay, so after recent discussions, I am proposing the following > >>patch. It won't remove hierarchy, or anything like that. Just default > >>to true in the root cgroup, and print a warning once if you try > >>to set it back to 0. > >> > >>I am not adding it to feature-removal-schedule.txt because I don't > >>view it as a consensus. Rather, changing the default would allow us > >>to give it a time around in the open, and see if people complain > >>and what we can learn about that. > >> > >>Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> > >>Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> > >>CC: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> > >>CC: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> > >>CC: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > >>--- > >> mm/memcontrol.c | 5 +++++ > >> 1 file changed, 5 insertions(+) > >> > >>diff --git a/mm/memcontrol.c b/mm/memcontrol.c > >>index 85f7790..c37e4c1 100644 > >>--- a/mm/memcontrol.c > >>+++ b/mm/memcontrol.c > >>@@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > >> if (memcg->use_hierarchy == val) > >> goto out; > >> > >>+ WARN_ONCE(!parent_memcg && memcg->use_hierarchy, > > > >Do you have to test anything here at all? The test above will get you > >out without doing anything if you are not trying to change anything. > >The default is true so you have to be trying to disable it. > > > >If you omit !parent_memcg test as well you will get a bonus of the early > >warning even if somebody has cgconfig.conf like this: > > > > group a/b/c { > > memory { > > memory.use_hierarchy = 0; > > [...] > > } > > } > > > >which worked previously... > >True there is a risk of a "false warning" when somebody just tries to > >set disable hierarchy when it is (and never was) allowed but I do not > >think this is that bad. > > > Well, a false warning is not that bad. > It is better to be vocal. > > I will wait for Kame to put his comments, and I can resend with that change. OK, you can resend with Acked-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> And thanks! I will try to push this into OpenSUSE kernels to have a wider audience for testing. I think that the approach is safe because there is a safe fallback I mentioned earlier (maybe you can put it into the changelog) and if there really exists a setting which cannot be converted to non-hierarchical layout then we should better know as soon as possible. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 18:04 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 18:04 UTC (permalink / raw) To: Glauber Costa Cc: cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Johannes Weiner, Andrew Morton On Tue, Jun 26, 2012 at 07:47:14PM +0400, Glauber Costa wrote: > Okay, so after recent discussions, I am proposing the following > patch. It won't remove hierarchy, or anything like that. Just default > to true in the root cgroup, and print a warning once if you try > to set it back to 0. > > I am not adding it to feature-removal-schedule.txt because I don't > view it as a consensus. Rather, changing the default would allow us > to give it a time around in the open, and see if people complain > and what we can learn about that. > > Signed-off-by: Glauber Costa <glommer@parallels.com> > Acked-by: Johannes Weiner <hannes@cmpxchg.org> > CC: Michal Hocko <mhocko@suse.cz> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > CC: Tejun Heo <tj@kernel.org> > --- > mm/memcontrol.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 85f7790..c37e4c1 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > if (memcg->use_hierarchy == val) > goto out; > > + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, > + "Non-hierarchical memcg is considered for deprecation\n" > + "Please consider reorganizing your tree to work with hierarchical accounting\n" > + "If you have any reason not to, let us know at cgroups@vger.kernel.org\n"); > /* > * If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees. If it is unset, then the change can > @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) > INIT_WORK(&stock->work, drain_local_stock); > } > hotcpu_notifier(memcg_cpu_hotplug_callback, 0); > + memcg->use_hierarchy = true; > } else { > parent = mem_cgroup_from_cont(cont->parent); > memcg->use_hierarchy = parent->use_hierarchy; So, ummm, I don't think we can do this. We CAN NOT silently flip the default behavior like this. Hell, no. What we can do is something like the following. 1. Make .use_hierarchy a global property and convert .use_hierarchy file to reject writes to the setting which is different from the global one. Rip out partial hierarchy related code (how little they may be). Note that the default should still be flat hierarchy. 2. Mark flat hierarchy deprecated and produce a warning message if memcg is mounted w/o hierarchy option for a year or two. 3. After the existing users had enough chance to move away from flat hierarchy, rip out flat hierarchy code and error if hierarchy option is not specified. Later on, we may decide to get rid of the hierarchy mount option but I don't think that matters all that much. Thanks. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 18:04 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 18:04 UTC (permalink / raw) To: Glauber Costa Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Michal Hocko, Johannes Weiner, Andrew Morton On Tue, Jun 26, 2012 at 07:47:14PM +0400, Glauber Costa wrote: > Okay, so after recent discussions, I am proposing the following > patch. It won't remove hierarchy, or anything like that. Just default > to true in the root cgroup, and print a warning once if you try > to set it back to 0. > > I am not adding it to feature-removal-schedule.txt because I don't > view it as a consensus. Rather, changing the default would allow us > to give it a time around in the open, and see if people complain > and what we can learn about that. > > Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> > Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> > CC: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> > CC: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > --- > mm/memcontrol.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 85f7790..c37e4c1 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > if (memcg->use_hierarchy == val) > goto out; > > + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, > + "Non-hierarchical memcg is considered for deprecation\n" > + "Please consider reorganizing your tree to work with hierarchical accounting\n" > + "If you have any reason not to, let us know at cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org\n"); > /* > * If parent's use_hierarchy is set, we can't make any modifications > * in the child subtrees. If it is unset, then the change can > @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) > INIT_WORK(&stock->work, drain_local_stock); > } > hotcpu_notifier(memcg_cpu_hotplug_callback, 0); > + memcg->use_hierarchy = true; > } else { > parent = mem_cgroup_from_cont(cont->parent); > memcg->use_hierarchy = parent->use_hierarchy; So, ummm, I don't think we can do this. We CAN NOT silently flip the default behavior like this. Hell, no. What we can do is something like the following. 1. Make .use_hierarchy a global property and convert .use_hierarchy file to reject writes to the setting which is different from the global one. Rip out partial hierarchy related code (how little they may be). Note that the default should still be flat hierarchy. 2. Mark flat hierarchy deprecated and produce a warning message if memcg is mounted w/o hierarchy option for a year or two. 3. After the existing users had enough chance to move away from flat hierarchy, rip out flat hierarchy code and error if hierarchy option is not specified. Later on, we may decide to get rid of the hierarchy mount option but I don't think that matters all that much. Thanks. -- tejun ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller 2012-06-26 18:04 ` Tejun Heo (?) @ 2012-06-26 18:55 ` Johannes Weiner 2012-06-26 19:14 ` Tejun Heo -1 siblings, 1 reply; 47+ messages in thread From: Johannes Weiner @ 2012-06-26 18:55 UTC (permalink / raw) To: Tejun Heo Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Andrew Morton On Tue, Jun 26, 2012 at 11:04:51AM -0700, Tejun Heo wrote: > On Tue, Jun 26, 2012 at 07:47:14PM +0400, Glauber Costa wrote: > > Okay, so after recent discussions, I am proposing the following > > patch. It won't remove hierarchy, or anything like that. Just default > > to true in the root cgroup, and print a warning once if you try > > to set it back to 0. > > > > I am not adding it to feature-removal-schedule.txt because I don't > > view it as a consensus. Rather, changing the default would allow us > > to give it a time around in the open, and see if people complain > > and what we can learn about that. > > > > Signed-off-by: Glauber Costa <glommer@parallels.com> > > Acked-by: Johannes Weiner <hannes@cmpxchg.org> > > CC: Michal Hocko <mhocko@suse.cz> > > CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > > CC: Tejun Heo <tj@kernel.org> > > --- > > mm/memcontrol.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > index 85f7790..c37e4c1 100644 > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -3993,6 +3993,10 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft, > > if (memcg->use_hierarchy == val) > > goto out; > > > > + WARN_ONCE(!parent_memcg && memcg->use_hierarchy, > > + "Non-hierarchical memcg is considered for deprecation\n" > > + "Please consider reorganizing your tree to work with hierarchical accounting\n" > > + "If you have any reason not to, let us know at cgroups@vger.kernel.org\n"); > > /* > > * If parent's use_hierarchy is set, we can't make any modifications > > * in the child subtrees. If it is unset, then the change can > > @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) > > INIT_WORK(&stock->work, drain_local_stock); > > } > > hotcpu_notifier(memcg_cpu_hotplug_callback, 0); > > + memcg->use_hierarchy = true; > > } else { > > parent = mem_cgroup_from_cont(cont->parent); > > memcg->use_hierarchy = parent->use_hierarchy; > > So, ummm, I don't think we can do this. We CAN NOT silently flip the > default behavior like this. Hell, no. Why? This is not flipping a 50/50 switch and forcing half the userbase to something that fits us better right now. It's about changing a default that most setups either don't care about or enable manually. We expect zero setups to functionally rely on nesting the directories without nesting the cgroups they represent. > What we can do is something like the following. > > 1. Make .use_hierarchy a global property and convert .use_hierarchy > file to reject writes to the setting which is different from the > global one. Rip out partial hierarchy related code (how little > they may be). Note that the default should still be flat > hierarchy. > > 2. Mark flat hierarchy deprecated and produce a warning message if > memcg is mounted w/o hierarchy option for a year or two. I think most of us assume that the common case is either not nesting directories or still working with hierarchy support actually enabled. I would hate if people had to jump through hoops to get the only behaviour we want to end up supporting and to not get yelled at, it sends all the wrong signals. > 3. After the existing users had enough chance to move away from flat > hierarchy, rip out flat hierarchy code and error if hierarchy > option is not specified. This description sounds much more sane than what we are actually trying to ban, which is not a flat structure, but treating groups with nested directories as equal siblings. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 19:14 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 19:14 UTC (permalink / raw) To: Johannes Weiner Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Andrew Morton Hello, Johannes. On Tue, Jun 26, 2012 at 08:55:42PM +0200, Johannes Weiner wrote: > > 2. Mark flat hierarchy deprecated and produce a warning message if > > memcg is mounted w/o hierarchy option for a year or two. > > I think most of us assume that the common case is either not nesting > directories or still working with hierarchy support actually enabled. How do we know that? At least from what I see from blkcg usage, people do crazy stuff when given crazy interface and we've been providing completely crazy interface for years. We cannot switch that implicitly - the changed default behavior is drastically different and even could be difficult to chase down. Transitions towards good behavior are good but they have to be explicit. > I would hate if people had to jump through hoops to get the only > behaviour we want to end up supporting and to not get yelled at, it > sends all the wrong signals. It is inconvenient but that's the price that we have to pay for having been stupid. Kernel flipping behavior implicitly is far worse than any such inconvenience. These default behavior flips are something is better handled by distros / admins than kernel itself. They can orchestrate the userland infrastructure and handle and communicate these flips far better than kernel alone can do. We can't send out mails to flat hierarchy users after all. That's the reason why I'm suggesting mount option which can't be flipped (sans remount but that's going away too) once the system is configured by the distro or admin. The kernel should nudge mainline users towards new behavior while providing distros / admins a way to move to the new behavior. The kernel itself can't flip it like that. > > 3. After the existing users had enough chance to move away from flat > > hierarchy, rip out flat hierarchy code and error if hierarchy > > option is not specified. > > This description sounds much more sane than what we are actually > trying to ban, which is not a flat structure, but treating groups with > nested directories as equal siblings. Ah... yeah, flat hierarchy probably is the wrong way to describe it. I don't know. Superficial hierarchy? Thanks. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 19:14 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 19:14 UTC (permalink / raw) To: Johannes Weiner Cc: Glauber Costa, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Michal Hocko, Andrew Morton Hello, Johannes. On Tue, Jun 26, 2012 at 08:55:42PM +0200, Johannes Weiner wrote: > > 2. Mark flat hierarchy deprecated and produce a warning message if > > memcg is mounted w/o hierarchy option for a year or two. > > I think most of us assume that the common case is either not nesting > directories or still working with hierarchy support actually enabled. How do we know that? At least from what I see from blkcg usage, people do crazy stuff when given crazy interface and we've been providing completely crazy interface for years. We cannot switch that implicitly - the changed default behavior is drastically different and even could be difficult to chase down. Transitions towards good behavior are good but they have to be explicit. > I would hate if people had to jump through hoops to get the only > behaviour we want to end up supporting and to not get yelled at, it > sends all the wrong signals. It is inconvenient but that's the price that we have to pay for having been stupid. Kernel flipping behavior implicitly is far worse than any such inconvenience. These default behavior flips are something is better handled by distros / admins than kernel itself. They can orchestrate the userland infrastructure and handle and communicate these flips far better than kernel alone can do. We can't send out mails to flat hierarchy users after all. That's the reason why I'm suggesting mount option which can't be flipped (sans remount but that's going away too) once the system is configured by the distro or admin. The kernel should nudge mainline users towards new behavior while providing distros / admins a way to move to the new behavior. The kernel itself can't flip it like that. > > 3. After the existing users had enough chance to move away from flat > > hierarchy, rip out flat hierarchy code and error if hierarchy > > option is not specified. > > This description sounds much more sane than what we are actually > trying to ban, which is not a flat structure, but treating groups with > nested directories as equal siblings. Ah... yeah, flat hierarchy probably is the wrong way to describe it. I don't know. Superficial hierarchy? Thanks. -- tejun ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 20:59 ` Johannes Weiner 0 siblings, 0 replies; 47+ messages in thread From: Johannes Weiner @ 2012-06-26 20:59 UTC (permalink / raw) To: Tejun Heo Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Andrew Morton On Tue, Jun 26, 2012 at 12:14:50PM -0700, Tejun Heo wrote: > Hello, Johannes. > > On Tue, Jun 26, 2012 at 08:55:42PM +0200, Johannes Weiner wrote: > > > 2. Mark flat hierarchy deprecated and produce a warning message if > > > memcg is mounted w/o hierarchy option for a year or two. > > > > I think most of us assume that the common case is either not nesting > > directories or still working with hierarchy support actually enabled. > > How do we know that? At least from what I see from blkcg usage, > people do crazy stuff when given crazy interface and we've been > providing completely crazy interface for years. We cannot switch that > implicitly - the changed default behavior is drastically different and > even could be difficult to chase down. Transitions towards good > behavior are good but they have to be explicit. I can't really argue against this because this can't be anything but a handwaving contest. We simply have no data. Backward compatibility is a really good idea but I think it depends on the situation: I don't expect such a configuration to happen by accident because this behaviour is against fundamental expectations users have toward a filesystem, which is the hierarchical nature of nested entities. I also don't expect it to happen intentionnaly, simply because there is nothing to gain from a more complicated nested setup. And because there is nothing to gain, it is in addition really trivial to fix the insane setups by simply undoing the nesting, there is no downside for them. The only point where I agree with you is that it may indeed be non-obvious to detect in case you were relying on the filesystem hierarchy not being reflected in the controller hierarchy. But even that depends on the usecase, whether it's a subtle performance regression or a total failure to execute a previously supported workload, which would be pretty damn obvious. > > I would hate if people had to jump through hoops to get the only > > behaviour we want to end up supporting and to not get yelled at, it > > sends all the wrong signals. > > It is inconvenient but that's the price that we have to pay for having > been stupid. Kernel flipping behavior implicitly is far worse than > any such inconvenience. That's a matter of taste. It's not inconvenient for us, but for every normal person trying to use the feature in a sane way, just to support hypothetical crazy people when there is a really trivial solution for the crazy people's problem. > These default behavior flips are something is better handled by > distros / admins than kernel itself. They can orchestrate the > userland infrastructure and handle and communicate these flips far > better than kernel alone can do. We can't send out mails to flat > hierarchy users after all. That's the reason why I'm suggesting mount > option which can't be flipped (sans remount but that's going away too) > once the system is configured by the distro or admin. > > The kernel should nudge mainline users towards new behavior while > providing distros / admins a way to move to the new behavior. The > kernel itself can't flip it like that. I thought it was the kernel's job to provide sensible defaults and the distro's job to ease transitions... > > > 3. After the existing users had enough chance to move away from flat > > > hierarchy, rip out flat hierarchy code and error if hierarchy > > > option is not specified. > > > > This description sounds much more sane than what we are actually > > trying to ban, which is not a flat structure, but treating groups with > > nested directories as equal siblings. > > Ah... yeah, flat hierarchy probably is the wrong way to describe it. > I don't know. Superficial hierarchy? Not sure there really is an appropriate two-word name for this artifact. How about, pardon my German, Cgroupfilesystemcontrollerhierarchysemanticsdiscrepancy? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 20:59 ` Johannes Weiner 0 siblings, 0 replies; 47+ messages in thread From: Johannes Weiner @ 2012-06-26 20:59 UTC (permalink / raw) To: Tejun Heo Cc: Glauber Costa, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Michal Hocko, Andrew Morton On Tue, Jun 26, 2012 at 12:14:50PM -0700, Tejun Heo wrote: > Hello, Johannes. > > On Tue, Jun 26, 2012 at 08:55:42PM +0200, Johannes Weiner wrote: > > > 2. Mark flat hierarchy deprecated and produce a warning message if > > > memcg is mounted w/o hierarchy option for a year or two. > > > > I think most of us assume that the common case is either not nesting > > directories or still working with hierarchy support actually enabled. > > How do we know that? At least from what I see from blkcg usage, > people do crazy stuff when given crazy interface and we've been > providing completely crazy interface for years. We cannot switch that > implicitly - the changed default behavior is drastically different and > even could be difficult to chase down. Transitions towards good > behavior are good but they have to be explicit. I can't really argue against this because this can't be anything but a handwaving contest. We simply have no data. Backward compatibility is a really good idea but I think it depends on the situation: I don't expect such a configuration to happen by accident because this behaviour is against fundamental expectations users have toward a filesystem, which is the hierarchical nature of nested entities. I also don't expect it to happen intentionnaly, simply because there is nothing to gain from a more complicated nested setup. And because there is nothing to gain, it is in addition really trivial to fix the insane setups by simply undoing the nesting, there is no downside for them. The only point where I agree with you is that it may indeed be non-obvious to detect in case you were relying on the filesystem hierarchy not being reflected in the controller hierarchy. But even that depends on the usecase, whether it's a subtle performance regression or a total failure to execute a previously supported workload, which would be pretty damn obvious. > > I would hate if people had to jump through hoops to get the only > > behaviour we want to end up supporting and to not get yelled at, it > > sends all the wrong signals. > > It is inconvenient but that's the price that we have to pay for having > been stupid. Kernel flipping behavior implicitly is far worse than > any such inconvenience. That's a matter of taste. It's not inconvenient for us, but for every normal person trying to use the feature in a sane way, just to support hypothetical crazy people when there is a really trivial solution for the crazy people's problem. > These default behavior flips are something is better handled by > distros / admins than kernel itself. They can orchestrate the > userland infrastructure and handle and communicate these flips far > better than kernel alone can do. We can't send out mails to flat > hierarchy users after all. That's the reason why I'm suggesting mount > option which can't be flipped (sans remount but that's going away too) > once the system is configured by the distro or admin. > > The kernel should nudge mainline users towards new behavior while > providing distros / admins a way to move to the new behavior. The > kernel itself can't flip it like that. I thought it was the kernel's job to provide sensible defaults and the distro's job to ease transitions... > > > 3. After the existing users had enough chance to move away from flat > > > hierarchy, rip out flat hierarchy code and error if hierarchy > > > option is not specified. > > > > This description sounds much more sane than what we are actually > > trying to ban, which is not a flat structure, but treating groups with > > nested directories as equal siblings. > > Ah... yeah, flat hierarchy probably is the wrong way to describe it. > I don't know. Superficial hierarchy? Not sure there really is an appropriate two-word name for this artifact. How about, pardon my German, Cgroupfilesystemcontrollerhierarchysemanticsdiscrepancy? ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 21:19 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 21:19 UTC (permalink / raw) To: Johannes Weiner Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Andrew Morton Hello, Johannes. On Tue, Jun 26, 2012 at 10:59:24PM +0200, Johannes Weiner wrote: > > How do we know that? At least from what I see from blkcg usage, > > people do crazy stuff when given crazy interface and we've been > > providing completely crazy interface for years. We cannot switch that > > implicitly - the changed default behavior is drastically different and > > even could be difficult to chase down. Transitions towards good > > behavior are good but they have to be explicit. > > I can't really argue against this because this can't be anything but a > handwaving contest. We simply have no data. The problem is that we can't know for sure until we make it explode and, once we make it explode, well, it has exploded. > Backward compatibility is a really good idea but I think it depends on > the situation: > > I don't expect such a configuration to happen by accident because > this behaviour is against fundamental expectations users have toward > a filesystem, which is the hierarchical nature of nested entities. > > I also don't expect it to happen intentionnaly, simply because there > is nothing to gain from a more complicated nested setup. I agree it's a feature which no sane person should be using but if you look at the cgroup interface as whole, sadly, sanity is fairly scarce. Nothing prevents one from mounting cpu and mem on the same hierarchy expecting hierarchical control for cpus while using flat enforcement on mem. That was the default behavior after all. > And because there is nothing to gain, it is in addition really > trivial to fix the insane setups by simply undoing the nesting, > there is no downside for them. I have to disagree with that. Deployment sometimes can be very painful. In some cases, even flipping single parameter in sysfs depending on kernel version takes considerable effort. The behavior has been the contract that we offered userland for quite some time now. We shouldn't be changing that underneath them without any clear way for them to notice it. > The only point where I agree with you is that it may indeed be > non-obvious to detect in case you were relying on the filesystem > hierarchy not being reflected in the controller hierarchy. But even > that depends on the usecase, whether it's a subtle performance > regression or a total failure to execute a previously supported > workload, which would be pretty damn obvious. And imagine that happening in serveral thousand machine cluster with fairly complicated cgroup setup and kernel update rolling out for subset of machine types. I would be screaming bloody murder. > > > I would hate if people had to jump through hoops to get the only > > > behaviour we want to end up supporting and to not get yelled at, it > > > sends all the wrong signals. > > > > It is inconvenient but that's the price that we have to pay for having > > been stupid. Kernel flipping behavior implicitly is far worse than > > any such inconvenience. > > That's a matter of taste. It's not inconvenient for us, but for every > normal person trying to use the feature in a sane way, just to support > hypothetical crazy people when there is a really trivial solution for > the crazy people's problem. To me, this doesn't seem to be a matter of taste. This is crossing the line. Note that the said inconvenience is likely to be felt only by distros or admins and what should be done to remedy the situation will be obvious. I agree it isn't ideal but such is life, I suppose. > > The kernel should nudge mainline users towards new behavior while > > providing distros / admins a way to move to the new behavior. The > > kernel itself can't flip it like that. > > I thought it was the kernel's job to provide sensible defaults and the > distro's job to ease transitions... Yeah, sure, and we failed horribly at providing sensible default here. The right thing to do now is to gradually move towards the better default. > > Ah... yeah, flat hierarchy probably is the wrong way to describe it. > > I don't know. Superficial hierarchy? > > Not sure there really is an appropriate two-word name for this > artifact. How about, pardon my German, > Cgroupfilesystemcontrollerhierarchysemanticsdiscrepancy? I'd like more Scheisse in there please. Thanks. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 21:19 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 21:19 UTC (permalink / raw) To: Johannes Weiner Cc: Glauber Costa, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Michal Hocko, Andrew Morton Hello, Johannes. On Tue, Jun 26, 2012 at 10:59:24PM +0200, Johannes Weiner wrote: > > How do we know that? At least from what I see from blkcg usage, > > people do crazy stuff when given crazy interface and we've been > > providing completely crazy interface for years. We cannot switch that > > implicitly - the changed default behavior is drastically different and > > even could be difficult to chase down. Transitions towards good > > behavior are good but they have to be explicit. > > I can't really argue against this because this can't be anything but a > handwaving contest. We simply have no data. The problem is that we can't know for sure until we make it explode and, once we make it explode, well, it has exploded. > Backward compatibility is a really good idea but I think it depends on > the situation: > > I don't expect such a configuration to happen by accident because > this behaviour is against fundamental expectations users have toward > a filesystem, which is the hierarchical nature of nested entities. > > I also don't expect it to happen intentionnaly, simply because there > is nothing to gain from a more complicated nested setup. I agree it's a feature which no sane person should be using but if you look at the cgroup interface as whole, sadly, sanity is fairly scarce. Nothing prevents one from mounting cpu and mem on the same hierarchy expecting hierarchical control for cpus while using flat enforcement on mem. That was the default behavior after all. > And because there is nothing to gain, it is in addition really > trivial to fix the insane setups by simply undoing the nesting, > there is no downside for them. I have to disagree with that. Deployment sometimes can be very painful. In some cases, even flipping single parameter in sysfs depending on kernel version takes considerable effort. The behavior has been the contract that we offered userland for quite some time now. We shouldn't be changing that underneath them without any clear way for them to notice it. > The only point where I agree with you is that it may indeed be > non-obvious to detect in case you were relying on the filesystem > hierarchy not being reflected in the controller hierarchy. But even > that depends on the usecase, whether it's a subtle performance > regression or a total failure to execute a previously supported > workload, which would be pretty damn obvious. And imagine that happening in serveral thousand machine cluster with fairly complicated cgroup setup and kernel update rolling out for subset of machine types. I would be screaming bloody murder. > > > I would hate if people had to jump through hoops to get the only > > > behaviour we want to end up supporting and to not get yelled at, it > > > sends all the wrong signals. > > > > It is inconvenient but that's the price that we have to pay for having > > been stupid. Kernel flipping behavior implicitly is far worse than > > any such inconvenience. > > That's a matter of taste. It's not inconvenient for us, but for every > normal person trying to use the feature in a sane way, just to support > hypothetical crazy people when there is a really trivial solution for > the crazy people's problem. To me, this doesn't seem to be a matter of taste. This is crossing the line. Note that the said inconvenience is likely to be felt only by distros or admins and what should be done to remedy the situation will be obvious. I agree it isn't ideal but such is life, I suppose. > > The kernel should nudge mainline users towards new behavior while > > providing distros / admins a way to move to the new behavior. The > > kernel itself can't flip it like that. > > I thought it was the kernel's job to provide sensible defaults and the > distro's job to ease transitions... Yeah, sure, and we failed horribly at providing sensible default here. The right thing to do now is to gradually move towards the better default. > > Ah... yeah, flat hierarchy probably is the wrong way to describe it. > > I don't know. Superficial hierarchy? > > Not sure there really is an appropriate two-word name for this > artifact. How about, pardon my German, > Cgroupfilesystemcontrollerhierarchysemanticsdiscrepancy? I'd like more Scheifle in there please. Thanks. -- tejun ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller 2012-06-26 21:19 ` Tejun Heo (?) @ 2012-06-27 8:57 ` Glauber Costa 2012-06-27 17:07 ` Tejun Heo -1 siblings, 1 reply; 47+ messages in thread From: Glauber Costa @ 2012-06-27 8:57 UTC (permalink / raw) To: Tejun Heo Cc: Johannes Weiner, cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Andrew Morton >> And because there is nothing to gain, it is in addition really >> trivial to fix the insane setups by simply undoing the nesting, >> there is no downside for them. > > I have to disagree with that. Deployment sometimes can be very > painful. In some cases, even flipping single parameter in sysfs > depending on kernel version takes considerable effort. The behavior > has been the contract that we offered userland for quite some time > now. We shouldn't be changing that underneath them without any clear > way for them to notice it. Yes, and that's why once you deploy, you keep your updates to a minimum. Because hell, even *perfectly legitimate bug fixes* can change your behavior in a way you don't want. And you don't expect people to refrain from fixing bugs because of that. > >> The only point where I agree with you is that it may indeed be >> non-obvious to detect in case you were relying on the filesystem >> hierarchy not being reflected in the controller hierarchy. But even >> that depends on the usecase, whether it's a subtle performance >> regression or a total failure to execute a previously supported >> workload, which would be pretty damn obvious. > > And imagine that happening in serveral thousand machine cluster with > fairly complicated cgroup setup and kernel update rolling out for > subset of machine types. I would be screaming bloody murder. That is precisely why people in serious environments tend to run -stable, distro LTSes, or anything like that. Because they don't want any change, however minor, to potentially affect their stamped behavior. I am not proposing this patch to -stable, btw... -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller 2012-06-27 8:57 ` Glauber Costa @ 2012-06-27 17:07 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-27 17:07 UTC (permalink / raw) To: Glauber Costa Cc: Johannes Weiner, cgroups, linux-mm, kamezawa.hiroyu, Michal Hocko, Andrew Morton On Wed, Jun 27, 2012 at 12:57:12PM +0400, Glauber Costa wrote: > >I have to disagree with that. Deployment sometimes can be very > >painful. In some cases, even flipping single parameter in sysfs > >depending on kernel version takes considerable effort. The behavior > >has been the contract that we offered userland for quite some time > >now. We shouldn't be changing that underneath them without any clear > >way for them to notice it. > > Yes, and that's why once you deploy, you keep your updates to a > minimum. Because hell, even *perfectly legitimate bug fixes* can > change your behavior in a way you don't want. And you don't expect > people to refrain from fixing bugs because of that. Dude, there are numerous organizations running all types of infrastructures. I personally know some running debian + mainline kernel with regular kernel refresh (no, they aren't small). Silent behavior switches like this will be a big glowing fuck-you to those people and I don't wanna do that. And then there are infrastructures where new machines are continuously deployed and you know what? new machines often require new kernels. What are you gonna tell them? Don't cycle-upgrade your machines once you're in production? Please stop trying to argue that silently switching major behavior like this is okay. It simply isn't. This is breaching one of the most basic assumptions that our direct users make. NONONONONONONO. > That is precisely why people in serious environments tend to run > -stable, distro LTSes, or anything like that. Because they don't > want any change, however minor, to potentially affect their stamped > behavior. I am not proposing this patch to -stable, btw... Yeah, because once an environment is in prod, the only update they need is -stable and we can switch behaviors willy-nilly on any release. Ugh...... -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 22:08 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-26 22:08 UTC (permalink / raw) To: Tejun Heo Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton On Tue 26-06-12 11:04:51, Tejun Heo wrote: > On Tue, Jun 26, 2012 at 07:47:14PM +0400, Glauber Costa wrote: [...] > > @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) > > INIT_WORK(&stock->work, drain_local_stock); > > } > > hotcpu_notifier(memcg_cpu_hotplug_callback, 0); > > + memcg->use_hierarchy = true; > > } else { > > parent = mem_cgroup_from_cont(cont->parent); > > memcg->use_hierarchy = parent->use_hierarchy; > > So, ummm, I don't think we can do this. We CAN NOT silently flip the > default behavior like this. Hell, no. What we can do is something > like the following. > > 1. Make .use_hierarchy a global property and convert .use_hierarchy > file to reject writes to the setting which is different from the > global one. Yes, that's how I understood your global knob suggestion and I liked it in the beginning but then I realized that this would just mean "tweak it to work and don't report anything because that's easy to find in forums" for most users, which is not good because we would end up stuck in this half (not)hierarchical state for ever which is not a desirable state - at least from my POV. According to my experience, people usually create deeper subtrees just because they want to have memcg hierarchy together with other controller(s) and the other controller requires a different topology but then they do not care about memory.* attributes in parents. Those cases are not affected by this change because parents are unlimited by default. Deeper subtrees without hierarchy and independent limits are usually mis-configurations, and we would like to hear about those to help to fix them, or they are unfixable usecases which we want to know about as well (because then we have a blocker for the unified cgroup hierarchy, don't we). > Rip out partial hierarchy related code (how little > they may be). I double checked this and it's really a surprisingly small amount of code (I expected much more, to be honest). The only interesting parts are mem_cgroup_swappiness_write and mem_cgroup_oom_control_write which don't allow the value setting if the parent is hierarchical (the values are consistent throughout the hierarchy). This will need to be changed and it is definitely required before this change can be introduced (sorry I should have noticed that sooner). Anyway, both changes should be OK to ignore parent's use_hierarchy because both the reclaim and oom happens on the root of the subtree which hit the limit (and the global vm_swapiness resp. global OOM will be used if we get up to root). So what is the result in the end? We will reduce few annoying use_hierarchy checks in the end but I guess that the more important part is a reasonable semantic. Does it really make sense to build non hierarchical subtrees? How does it work along with other controllers which are hierarchical? The original implementation enabled that, all right, but we know that many things were over-designed in this area (and in cgroups in general) and we should rather fix them. > Note that the default should still be flat hierarchy. > > 2. Mark flat hierarchy deprecated and produce a warning message if > memcg is mounted w/o hierarchy option for a year or two. I would agree with you on this with many kernel configurables but this one doesn't fall in. There is a trivial fallback (set root to use_hierarchy=0) so the mount option seems like an overkill - yet another API to keep for some time... So in short, I do think we should go the sanity path and end up with hierarchical trees and sooner we start the better. > 3. After the existing users had enough chance to move away from flat > hierarchy, rip out flat hierarchy code and error if hierarchy > option is not specified. > > Later on, we may decide to get rid of the hierarchy mount option but I > don't think that matters all that much. > > Thanks. > > -- > tejun -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 22:08 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-26 22:08 UTC (permalink / raw) To: Tejun Heo Cc: Glauber Costa, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton On Tue 26-06-12 11:04:51, Tejun Heo wrote: > On Tue, Jun 26, 2012 at 07:47:14PM +0400, Glauber Costa wrote: [...] > > @@ -5221,6 +5225,7 @@ mem_cgroup_create(struct cgroup *cont) > > INIT_WORK(&stock->work, drain_local_stock); > > } > > hotcpu_notifier(memcg_cpu_hotplug_callback, 0); > > + memcg->use_hierarchy = true; > > } else { > > parent = mem_cgroup_from_cont(cont->parent); > > memcg->use_hierarchy = parent->use_hierarchy; > > So, ummm, I don't think we can do this. We CAN NOT silently flip the > default behavior like this. Hell, no. What we can do is something > like the following. > > 1. Make .use_hierarchy a global property and convert .use_hierarchy > file to reject writes to the setting which is different from the > global one. Yes, that's how I understood your global knob suggestion and I liked it in the beginning but then I realized that this would just mean "tweak it to work and don't report anything because that's easy to find in forums" for most users, which is not good because we would end up stuck in this half (not)hierarchical state for ever which is not a desirable state - at least from my POV. According to my experience, people usually create deeper subtrees just because they want to have memcg hierarchy together with other controller(s) and the other controller requires a different topology but then they do not care about memory.* attributes in parents. Those cases are not affected by this change because parents are unlimited by default. Deeper subtrees without hierarchy and independent limits are usually mis-configurations, and we would like to hear about those to help to fix them, or they are unfixable usecases which we want to know about as well (because then we have a blocker for the unified cgroup hierarchy, don't we). > Rip out partial hierarchy related code (how little > they may be). I double checked this and it's really a surprisingly small amount of code (I expected much more, to be honest). The only interesting parts are mem_cgroup_swappiness_write and mem_cgroup_oom_control_write which don't allow the value setting if the parent is hierarchical (the values are consistent throughout the hierarchy). This will need to be changed and it is definitely required before this change can be introduced (sorry I should have noticed that sooner). Anyway, both changes should be OK to ignore parent's use_hierarchy because both the reclaim and oom happens on the root of the subtree which hit the limit (and the global vm_swapiness resp. global OOM will be used if we get up to root). So what is the result in the end? We will reduce few annoying use_hierarchy checks in the end but I guess that the more important part is a reasonable semantic. Does it really make sense to build non hierarchical subtrees? How does it work along with other controllers which are hierarchical? The original implementation enabled that, all right, but we know that many things were over-designed in this area (and in cgroups in general) and we should rather fix them. > Note that the default should still be flat hierarchy. > > 2. Mark flat hierarchy deprecated and produce a warning message if > memcg is mounted w/o hierarchy option for a year or two. I would agree with you on this with many kernel configurables but this one doesn't fall in. There is a trivial fallback (set root to use_hierarchy=0) so the mount option seems like an overkill - yet another API to keep for some time... So in short, I do think we should go the sanity path and end up with hierarchical trees and sooner we start the better. > 3. After the existing users had enough chance to move away from flat > hierarchy, rip out flat hierarchy code and error if hierarchy > option is not specified. > > Later on, we may decide to get rid of the hierarchy mount option but I > don't think that matters all that much. > > Thanks. > > -- > tejun -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 22:14 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 22:14 UTC (permalink / raw) To: Michal Hocko Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton Hello, Michal. On Wed, Jun 27, 2012 at 12:08:09AM +0200, Michal Hocko wrote: > According to my experience, people usually create deeper subtrees > just because they want to have memcg hierarchy together with other > controller(s) and the other controller requires a different topology > but then they do not care about memory.* attributes in parents. > Those cases are not affected by this change because parents are > unlimited by default. > Deeper subtrees without hierarchy and independent limits are usually > mis-configurations, and we would like to hear about those to help to fix > them, or they are unfixable usecases which we want to know about as well > (because then we have a blocker for the unified cgroup hierarchy, don't > we). Yeah, this is something I'm seriously considering doing from cgroup core. ie. generating a warning message if the user nests cgroups w/ controllers which don't support full hierarchy. > > Note that the default should still be flat hierarchy. > > > > 2. Mark flat hierarchy deprecated and produce a warning message if > > memcg is mounted w/o hierarchy option for a year or two. > > I would agree with you on this with many kernel configurables but > this one doesn't fall in. There is a trivial fallback (set root to > use_hierarchy=0) so the mount option seems like an overkill - yet > another API to keep for some time... Just disallow clearing .use_hierarchy if it was mounted with the option? We can later either make the file RO 1 for compatibility's sake or remove it. > So in short, I do think we should go the sanity path and end up > with hierarchical trees and sooner we start the better. I do agree with you in principle, but I still don't think we can switch the default behavior underneath the users. Thanks. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-26 22:14 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 22:14 UTC (permalink / raw) To: Michal Hocko Cc: Glauber Costa, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton Hello, Michal. On Wed, Jun 27, 2012 at 12:08:09AM +0200, Michal Hocko wrote: > According to my experience, people usually create deeper subtrees > just because they want to have memcg hierarchy together with other > controller(s) and the other controller requires a different topology > but then they do not care about memory.* attributes in parents. > Those cases are not affected by this change because parents are > unlimited by default. > Deeper subtrees without hierarchy and independent limits are usually > mis-configurations, and we would like to hear about those to help to fix > them, or they are unfixable usecases which we want to know about as well > (because then we have a blocker for the unified cgroup hierarchy, don't > we). Yeah, this is something I'm seriously considering doing from cgroup core. ie. generating a warning message if the user nests cgroups w/ controllers which don't support full hierarchy. > > Note that the default should still be flat hierarchy. > > > > 2. Mark flat hierarchy deprecated and produce a warning message if > > memcg is mounted w/o hierarchy option for a year or two. > > I would agree with you on this with many kernel configurables but > this one doesn't fall in. There is a trivial fallback (set root to > use_hierarchy=0) so the mount option seems like an overkill - yet > another API to keep for some time... Just disallow clearing .use_hierarchy if it was mounted with the option? We can later either make the file RO 1 for compatibility's sake or remove it. > So in short, I do think we should go the sanity path and end up > with hierarchical trees and sooner we start the better. I do agree with you in principle, but I still don't think we can switch the default behavior underneath the users. Thanks. -- tejun ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller 2012-06-26 22:14 ` Tejun Heo (?) @ 2012-06-26 22:17 ` Tejun Heo -1 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-26 22:17 UTC (permalink / raw) To: Michal Hocko Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton On Tue, Jun 26, 2012 at 03:14:52PM -0700, Tejun Heo wrote: > Hello, Michal. > > On Wed, Jun 27, 2012 at 12:08:09AM +0200, Michal Hocko wrote: > > According to my experience, people usually create deeper subtrees > > just because they want to have memcg hierarchy together with other > > controller(s) and the other controller requires a different topology > > but then they do not care about memory.* attributes in parents. > > Those cases are not affected by this change because parents are > > unlimited by default. > > Deeper subtrees without hierarchy and independent limits are usually > > mis-configurations, and we would like to hear about those to help to fix > > them, or they are unfixable usecases which we want to know about as well > > (because then we have a blocker for the unified cgroup hierarchy, don't > > we). > > Yeah, this is something I'm seriously considering doing from cgroup > core. ie. generating a warning message if the user nests cgroups w/ > controllers which don't support full hierarchy. BTW, this is another reason I'm suggesting mount time option so that cgroup core can be told that the specific controller is hierarchy-aware. Thanks. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-27 8:52 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-27 8:52 UTC (permalink / raw) To: Tejun Heo Cc: Michal Hocko, cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton On 06/27/2012 02:14 AM, Tejun Heo wrote: > Hello, Michal. > > On Wed, Jun 27, 2012 at 12:08:09AM +0200, Michal Hocko wrote: >> According to my experience, people usually create deeper subtrees >> just because they want to have memcg hierarchy together with other >> controller(s) and the other controller requires a different topology >> but then they do not care about memory.* attributes in parents. >> Those cases are not affected by this change because parents are >> unlimited by default. >> Deeper subtrees without hierarchy and independent limits are usually >> mis-configurations, and we would like to hear about those to help to fix >> them, or they are unfixable usecases which we want to know about as well >> (because then we have a blocker for the unified cgroup hierarchy, don't >> we). > > Yeah, this is something I'm seriously considering doing from cgroup > core. ie. generating a warning message if the user nests cgroups w/ > controllers which don't support full hierarchy. > >>> Note that the default should still be flat hierarchy. >>> >>> 2. Mark flat hierarchy deprecated and produce a warning message if >>> memcg is mounted w/o hierarchy option for a year or two. >> >> I would agree with you on this with many kernel configurables but >> this one doesn't fall in. There is a trivial fallback (set root to >> use_hierarchy=0) so the mount option seems like an overkill - yet >> another API to keep for some time... > > Just disallow clearing .use_hierarchy if it was mounted with the > option? We can later either make the file RO 1 for compatibility's > sake or remove it. How will it buy us anything, if it is clear by default?? >> So in short, I do think we should go the sanity path and end up >> with hierarchical trees and sooner we start the better. > > I do agree with you in principle, but I still don't think we can > switch the default behavior underneath the users. > I think we all agree with that. I can't speak for Johannes here, but I risk saying that he agrees with that as well. The problem is that we may differ in what means "default behavior". It is very clear in a system call, API, or any documented feature. We never made the guarantee, *ever*, that non-hierarchical might be the default. I understand that users may have grown accustomed to it. But users grow accustomed to bugs as well! Bugs change behaviors. In fact, in hardware emulation - where it matters, because it is harder to change it - we have emulator people actually emulating bugs - because that is what software expects. Is this reason for us to keep bugs around, because people grew accustomed to it? Hell no. Well, it might be: If we have a proven user base that is big and solid on top of that, it may be fair to say: "Well, this is unfortunate, but this is how it plays". Here, we're discussing - or handwaving as Hannes stated, about whether or not we have *some* users relying on this behavior. We must certainly agree that this is not by far a solid and big usersbase, or anything at the like. Another analogy I believe it is pertinent: I consider this change much closer to icon or button placement in the Desktop market: No one *ever* said a particular button stays at a particular place. Yet it was there for many releases. If you change it, some people will feel it. So what? People change it anyway. Because that is *not* anything set in stone. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-27 8:52 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-27 8:52 UTC (permalink / raw) To: Tejun Heo Cc: Michal Hocko, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton On 06/27/2012 02:14 AM, Tejun Heo wrote: > Hello, Michal. > > On Wed, Jun 27, 2012 at 12:08:09AM +0200, Michal Hocko wrote: >> According to my experience, people usually create deeper subtrees >> just because they want to have memcg hierarchy together with other >> controller(s) and the other controller requires a different topology >> but then they do not care about memory.* attributes in parents. >> Those cases are not affected by this change because parents are >> unlimited by default. >> Deeper subtrees without hierarchy and independent limits are usually >> mis-configurations, and we would like to hear about those to help to fix >> them, or they are unfixable usecases which we want to know about as well >> (because then we have a blocker for the unified cgroup hierarchy, don't >> we). > > Yeah, this is something I'm seriously considering doing from cgroup > core. ie. generating a warning message if the user nests cgroups w/ > controllers which don't support full hierarchy. > >>> Note that the default should still be flat hierarchy. >>> >>> 2. Mark flat hierarchy deprecated and produce a warning message if >>> memcg is mounted w/o hierarchy option for a year or two. >> >> I would agree with you on this with many kernel configurables but >> this one doesn't fall in. There is a trivial fallback (set root to >> use_hierarchy=0) so the mount option seems like an overkill - yet >> another API to keep for some time... > > Just disallow clearing .use_hierarchy if it was mounted with the > option? We can later either make the file RO 1 for compatibility's > sake or remove it. How will it buy us anything, if it is clear by default?? >> So in short, I do think we should go the sanity path and end up >> with hierarchical trees and sooner we start the better. > > I do agree with you in principle, but I still don't think we can > switch the default behavior underneath the users. > I think we all agree with that. I can't speak for Johannes here, but I risk saying that he agrees with that as well. The problem is that we may differ in what means "default behavior". It is very clear in a system call, API, or any documented feature. We never made the guarantee, *ever*, that non-hierarchical might be the default. I understand that users may have grown accustomed to it. But users grow accustomed to bugs as well! Bugs change behaviors. In fact, in hardware emulation - where it matters, because it is harder to change it - we have emulator people actually emulating bugs - because that is what software expects. Is this reason for us to keep bugs around, because people grew accustomed to it? Hell no. Well, it might be: If we have a proven user base that is big and solid on top of that, it may be fair to say: "Well, this is unfortunate, but this is how it plays". Here, we're discussing - or handwaving as Hannes stated, about whether or not we have *some* users relying on this behavior. We must certainly agree that this is not by far a solid and big usersbase, or anything at the like. Another analogy I believe it is pertinent: I consider this change much closer to icon or button placement in the Desktop market: No one *ever* said a particular button stays at a particular place. Yet it was there for many releases. If you change it, some people will feel it. So what? People change it anyway. Because that is *not* anything set in stone. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-27 16:58 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-27 16:58 UTC (permalink / raw) To: Glauber Costa Cc: Michal Hocko, cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton Hello, Glauber. On Wed, Jun 27, 2012 at 12:52:27PM +0400, Glauber Costa wrote: > >Just disallow clearing .use_hierarchy if it was mounted with the > >option? We can later either make the file RO 1 for compatibility's > >sake or remove it. > > How will it buy us anything, if it is clear by default?? With the mount option specified, why would it be clear by default? > The problem is that we may differ in what means "default behavior". > > It is very clear in a system call, API, or any documented feature. > We never made the guarantee, *ever*, that non-hierarchical might be > the default. > > I understand that users may have grown accustomed to it. But users > grow accustomed to bugs as well! Bugs change behaviors. In fact, in > hardware emulation - where it matters, because it is harder to > change it - we have emulator people actually emulating bugs - > because that is what software expects. > > Is this reason for us to keep bugs around, because people grew > accustomed to it? Hell no. Well, it might be: If we have a proven > user base that is big and solid on top of that, it may be fair to > say: "Well, this is unfortunate, but this is how it plays". You're just playing with semantics now. Hey, who guarantees anything? I don't find anything inscribed in stone or with hundred goverment stamps which legally forbids me from "rm -rf"ing the whole cgroup. Gees... If we've shipped kernel versions with certain major behavior for years, that frigging is the guarantee we've been making to the userland. Of course, nothing is absolute and everything is subject to trade off, but, come on, we're talking about major SILENT behavior switch. No, nobody gets away with that. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-27 16:58 ` Tejun Heo 0 siblings, 0 replies; 47+ messages in thread From: Tejun Heo @ 2012-06-27 16:58 UTC (permalink / raw) To: Glauber Costa Cc: Michal Hocko, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton Hello, Glauber. On Wed, Jun 27, 2012 at 12:52:27PM +0400, Glauber Costa wrote: > >Just disallow clearing .use_hierarchy if it was mounted with the > >option? We can later either make the file RO 1 for compatibility's > >sake or remove it. > > How will it buy us anything, if it is clear by default?? With the mount option specified, why would it be clear by default? > The problem is that we may differ in what means "default behavior". > > It is very clear in a system call, API, or any documented feature. > We never made the guarantee, *ever*, that non-hierarchical might be > the default. > > I understand that users may have grown accustomed to it. But users > grow accustomed to bugs as well! Bugs change behaviors. In fact, in > hardware emulation - where it matters, because it is harder to > change it - we have emulator people actually emulating bugs - > because that is what software expects. > > Is this reason for us to keep bugs around, because people grew > accustomed to it? Hell no. Well, it might be: If we have a proven > user base that is big and solid on top of that, it may be fair to > say: "Well, this is unfortunate, but this is how it plays". You're just playing with semantics now. Hey, who guarantees anything? I don't find anything inscribed in stone or with hundred goverment stamps which legally forbids me from "rm -rf"ing the whole cgroup. Gees... If we've shipped kernel versions with certain major behavior for years, that frigging is the guarantee we've been making to the userland. Of course, nothing is absolute and everything is subject to trade off, but, come on, we're talking about major SILENT behavior switch. No, nobody gets away with that. -- tejun ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-27 12:51 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-27 12:51 UTC (permalink / raw) To: Tejun Heo Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton On Tue 26-06-12 15:14:52, Tejun Heo wrote: > Hello, Michal. > > On Wed, Jun 27, 2012 at 12:08:09AM +0200, Michal Hocko wrote: > > According to my experience, people usually create deeper subtrees > > just because they want to have memcg hierarchy together with other > > controller(s) and the other controller requires a different topology > > but then they do not care about memory.* attributes in parents. > > Those cases are not affected by this change because parents are > > unlimited by default. > > Deeper subtrees without hierarchy and independent limits are usually > > mis-configurations, and we would like to hear about those to help to fix > > them, or they are unfixable usecases which we want to know about as well > > (because then we have a blocker for the unified cgroup hierarchy, don't > > we). > > Yeah, this is something I'm seriously considering doing from cgroup > core. ie. generating a warning message if the user nests cgroups w/ > controllers which don't support full hierarchy. This is a good idea. > > > Note that the default should still be flat hierarchy. > > > > > > 2. Mark flat hierarchy deprecated and produce a warning message if > > > memcg is mounted w/o hierarchy option for a year or two. > > > > I would agree with you on this with many kernel configurables but > > this one doesn't fall in. There is a trivial fallback (set root to > > use_hierarchy=0) so the mount option seems like an overkill - yet > > another API to keep for some time... > > Just disallow clearing .use_hierarchy if it was mounted with the > option? Dunno, mount option just doesn't feel right. We do not offer other attributes to be set by them so it would be just confusing. Besides that it would require an integration into existing tools like cgconfig which is yet another pain just because of something that we never promissed to keep a certain way. There are many people who don't work with mount&fs cgroups directly but rather use libcgroup for that... -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-27 12:51 ` Michal Hocko 0 siblings, 0 replies; 47+ messages in thread From: Michal Hocko @ 2012-06-27 12:51 UTC (permalink / raw) To: Tejun Heo Cc: Glauber Costa, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton On Tue 26-06-12 15:14:52, Tejun Heo wrote: > Hello, Michal. > > On Wed, Jun 27, 2012 at 12:08:09AM +0200, Michal Hocko wrote: > > According to my experience, people usually create deeper subtrees > > just because they want to have memcg hierarchy together with other > > controller(s) and the other controller requires a different topology > > but then they do not care about memory.* attributes in parents. > > Those cases are not affected by this change because parents are > > unlimited by default. > > Deeper subtrees without hierarchy and independent limits are usually > > mis-configurations, and we would like to hear about those to help to fix > > them, or they are unfixable usecases which we want to know about as well > > (because then we have a blocker for the unified cgroup hierarchy, don't > > we). > > Yeah, this is something I'm seriously considering doing from cgroup > core. ie. generating a warning message if the user nests cgroups w/ > controllers which don't support full hierarchy. This is a good idea. > > > Note that the default should still be flat hierarchy. > > > > > > 2. Mark flat hierarchy deprecated and produce a warning message if > > > memcg is mounted w/o hierarchy option for a year or two. > > > > I would agree with you on this with many kernel configurables but > > this one doesn't fall in. There is a trivial fallback (set root to > > use_hierarchy=0) so the mount option seems like an overkill - yet > > another API to keep for some time... > > Just disallow clearing .use_hierarchy if it was mounted with the > option? Dunno, mount option just doesn't feel right. We do not offer other attributes to be set by them so it would be just confusing. Besides that it would require an integration into existing tools like cgconfig which is yet another pain just because of something that we never promissed to keep a certain way. There are many people who don't work with mount&fs cgroups directly but rather use libcgroup for that... -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-27 12:49 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-27 12:49 UTC (permalink / raw) To: Michal Hocko Cc: Tejun Heo, cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton On 06/27/2012 04:51 PM, Michal Hocko wrote: >> > Just disallow clearing .use_hierarchy if it was mounted with the >> > option? > Dunno, mount option just doesn't feel right. We do not offer other > attributes to be set by them so it would be just confusing. Besides that > it would require an integration into existing tools like cgconfig which > is yet another pain just because of something that we never promissed to > keep a certain way. There are many people who don't work with mount&fs > cgroups directly but rather use libcgroup for that... myself included. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-27 12:49 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-27 12:49 UTC (permalink / raw) To: Michal Hocko Cc: Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A, Johannes Weiner, Andrew Morton On 06/27/2012 04:51 PM, Michal Hocko wrote: >> > Just disallow clearing .use_hierarchy if it was mounted with the >> > option? > Dunno, mount option just doesn't feel right. We do not offer other > attributes to be set by them so it would be just confusing. Besides that > it would require an integration into existing tools like cgconfig which > is yet another pain just because of something that we never promissed to > keep a certain way. There are many people who don't work with mount&fs > cgroups directly but rather use libcgroup for that... myself included. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller 2012-06-27 12:51 ` Michal Hocko (?) (?) @ 2012-06-27 17:33 ` Tejun Heo 2012-06-28 8:46 ` Kamezawa Hiroyuki -1 siblings, 1 reply; 47+ messages in thread From: Tejun Heo @ 2012-06-27 17:33 UTC (permalink / raw) To: Michal Hocko Cc: Glauber Costa, cgroups, linux-mm, kamezawa.hiroyu, Johannes Weiner, Andrew Morton Hello, Michal. On Wed, Jun 27, 2012 at 02:51:19PM +0200, Michal Hocko wrote: > > Yeah, this is something I'm seriously considering doing from cgroup > > core. ie. generating a warning message if the user nests cgroups w/ > > controllers which don't support full hierarchy. > > This is a good idea. And I want each controller either to do proper hierarchy or not at all and disallow switching the behavior while mounted - at least disallow switching off hierarchy support dynamically. > > Just disallow clearing .use_hierarchy if it was mounted with the > > option? > > Dunno, mount option just doesn't feel right. We do not offer other > attributes to be set by them so it would be just confusing. Besides that > it would require an integration into existing tools like cgconfig which > is yet another pain just because of something that we never promissed to > keep a certain way. There are many people who don't work with mount&fs > cgroups directly but rather use libcgroup for that... If the default behavior has to be switched without extra input from userland, that should be noisy like hell and slow. e.g. generate warning messages whenever userland does something which is to be deprecated - nesting when .use_hierarchy == 0, mixing .use_hierarchy == 0 / 1, and maybe later on, using .use_hierarchy == 0 at all. Hmm.... we need to switch other controllers over to hierarchical behavior too. We may as well just do it from cgroup core. Once we rule out all users of pseudo hierarchy - nesting with controllers which don't support hierarchy - switching on hierarchy support per-controller shouldn't cause much problem. How about the following then? * I'll add a way for controllers to tell cgroup core that full hierarchy support is supported and a cgroup mount option to enable hierarchy (cgroup core itself already uses a number of mount options and libgroup or whatever should be able to deal with it). cgroup will refuse to mount if the hierarchy option is specified with a controller which doesn't support hierarchy and it will also whine like crazy if the userland tries to nest without the mount option specified. Each controller must enforce hierarchy once so directed by cgroup mount option. * While doing that, all applicable controllers will be updated to support hierarchy. * After sufficient time has passed, nesting without the mount option specified will be failed by cgroup core. As for memcg's .use_hierarchy, make it RO 1 if the cgroup indicates that hierarchy should be used. Otherwise, I don't know but make sure it gets phased out out use somehow. Thanks. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-28 8:46 ` Kamezawa Hiroyuki 0 siblings, 0 replies; 47+ messages in thread From: Kamezawa Hiroyuki @ 2012-06-28 8:46 UTC (permalink / raw) To: Tejun Heo Cc: Michal Hocko, Glauber Costa, cgroups, linux-mm, Johannes Weiner, Andrew Morton (2012/06/28 2:33), Tejun Heo wrote: > Hello, Michal. > > On Wed, Jun 27, 2012 at 02:51:19PM +0200, Michal Hocko wrote: >>> Yeah, this is something I'm seriously considering doing from cgroup >>> core. ie. generating a warning message if the user nests cgroups w/ >>> controllers which don't support full hierarchy. >> >> This is a good idea. > > And I want each controller either to do proper hierarchy or not at all > and disallow switching the behavior while mounted - at least disallow > switching off hierarchy support dynamically. > >>> Just disallow clearing .use_hierarchy if it was mounted with the >>> option? >> >> Dunno, mount option just doesn't feel right. We do not offer other >> attributes to be set by them so it would be just confusing. Besides that >> it would require an integration into existing tools like cgconfig which >> is yet another pain just because of something that we never promissed to >> keep a certain way. There are many people who don't work with mount&fs >> cgroups directly but rather use libcgroup for that... > > If the default behavior has to be switched without extra input from > userland, that should be noisy like hell and slow. e.g. generate > warning messages whenever userland does something which is to be > deprecated - nesting when .use_hierarchy == 0, mixing .use_hierarchy > == 0 / 1, and maybe later on, using .use_hierarchy == 0 at all. > > Hmm.... we need to switch other controllers over to hierarchical > behavior too. We may as well just do it from cgroup core. Once we > rule out all users of pseudo hierarchy - nesting with controllers > which don't support hierarchy - switching on hierarchy support > per-controller shouldn't cause much problem. > > How about the following then? > > * I'll add a way for controllers to tell cgroup core that full > hierarchy support is supported and a cgroup mount option to enable > hierarchy (cgroup core itself already uses a number of mount options > and libgroup or whatever should be able to deal with it). > > cgroup will refuse to mount if the hierarchy option is specified > with a controller which doesn't support hierarchy and it will also > whine like crazy if the userland tries to nest without the mount > option specified. > > Each controller must enforce hierarchy once so directed by cgroup > mount option. > > * While doing that, all applicable controllers will be updated to > support hierarchy. > > * After sufficient time has passed, nesting without the mount option > specified will be failed by cgroup core. > > As for memcg's .use_hierarchy, make it RO 1 if the cgroup indicates > that hierarchy should be used. Otherwise, I don't know but make sure > it gets phased out out use somehow. > The reason for use_hierarchy file was just _performance_, it _was_ terrible. Now it's not very good but not terrible. You all may think this as crazy idea. How about versioning ? Creating 'memory2'(memory cgroup v2) cgroup and mark 'memory' cgroup as deprecated, and put it to feature-removal-list. Of course, memory2 cgroup doesn't have use_hierarchy file and have kmem accounting. We should disallow to use memory and memory2 at the same time. Or, add version file to cgroup subsys ? select v2 as default...user can choose v1 with mount option if necessary, but it will not be maintained. Is it too difficult or messy ? To keep user experience, versioning is a way. And we can see how the changes affects users. Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-28 8:46 ` Kamezawa Hiroyuki 0 siblings, 0 replies; 47+ messages in thread From: Kamezawa Hiroyuki @ 2012-06-28 8:46 UTC (permalink / raw) To: Tejun Heo Cc: Michal Hocko, Glauber Costa, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Johannes Weiner, Andrew Morton (2012/06/28 2:33), Tejun Heo wrote: > Hello, Michal. > > On Wed, Jun 27, 2012 at 02:51:19PM +0200, Michal Hocko wrote: >>> Yeah, this is something I'm seriously considering doing from cgroup >>> core. ie. generating a warning message if the user nests cgroups w/ >>> controllers which don't support full hierarchy. >> >> This is a good idea. > > And I want each controller either to do proper hierarchy or not at all > and disallow switching the behavior while mounted - at least disallow > switching off hierarchy support dynamically. > >>> Just disallow clearing .use_hierarchy if it was mounted with the >>> option? >> >> Dunno, mount option just doesn't feel right. We do not offer other >> attributes to be set by them so it would be just confusing. Besides that >> it would require an integration into existing tools like cgconfig which >> is yet another pain just because of something that we never promissed to >> keep a certain way. There are many people who don't work with mount&fs >> cgroups directly but rather use libcgroup for that... > > If the default behavior has to be switched without extra input from > userland, that should be noisy like hell and slow. e.g. generate > warning messages whenever userland does something which is to be > deprecated - nesting when .use_hierarchy == 0, mixing .use_hierarchy > == 0 / 1, and maybe later on, using .use_hierarchy == 0 at all. > > Hmm.... we need to switch other controllers over to hierarchical > behavior too. We may as well just do it from cgroup core. Once we > rule out all users of pseudo hierarchy - nesting with controllers > which don't support hierarchy - switching on hierarchy support > per-controller shouldn't cause much problem. > > How about the following then? > > * I'll add a way for controllers to tell cgroup core that full > hierarchy support is supported and a cgroup mount option to enable > hierarchy (cgroup core itself already uses a number of mount options > and libgroup or whatever should be able to deal with it). > > cgroup will refuse to mount if the hierarchy option is specified > with a controller which doesn't support hierarchy and it will also > whine like crazy if the userland tries to nest without the mount > option specified. > > Each controller must enforce hierarchy once so directed by cgroup > mount option. > > * While doing that, all applicable controllers will be updated to > support hierarchy. > > * After sufficient time has passed, nesting without the mount option > specified will be failed by cgroup core. > > As for memcg's .use_hierarchy, make it RO 1 if the cgroup indicates > that hierarchy should be used. Otherwise, I don't know but make sure > it gets phased out out use somehow. > The reason for use_hierarchy file was just _performance_, it _was_ terrible. Now it's not very good but not terrible. You all may think this as crazy idea. How about versioning ? Creating 'memory2'(memory cgroup v2) cgroup and mark 'memory' cgroup as deprecated, and put it to feature-removal-list. Of course, memory2 cgroup doesn't have use_hierarchy file and have kmem accounting. We should disallow to use memory and memory2 at the same time. Or, add version file to cgroup subsys ? select v2 as default...user can choose v1 with mount option if necessary, but it will not be maintained. Is it too difficult or messy ? To keep user experience, versioning is a way. And we can see how the changes affects users. Thanks, -Kame ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-28 9:12 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-28 9:12 UTC (permalink / raw) To: Kamezawa Hiroyuki Cc: Tejun Heo, Michal Hocko, cgroups, linux-mm, Johannes Weiner, Andrew Morton, David Rientjes On 06/28/2012 12:46 PM, Kamezawa Hiroyuki wrote: > (2012/06/28 2:33), Tejun Heo wrote: >> Hello, Michal. >> >> On Wed, Jun 27, 2012 at 02:51:19PM +0200, Michal Hocko wrote: >>>> Yeah, this is something I'm seriously considering doing from cgroup >>>> core. ie. generating a warning message if the user nests cgroups w/ >>>> controllers which don't support full hierarchy. >>> >>> This is a good idea. >> >> And I want each controller either to do proper hierarchy or not at all >> and disallow switching the behavior while mounted - at least disallow >> switching off hierarchy support dynamically. >> >>>> Just disallow clearing .use_hierarchy if it was mounted with the >>>> option? >>> >>> Dunno, mount option just doesn't feel right. We do not offer other >>> attributes to be set by them so it would be just confusing. Besides that >>> it would require an integration into existing tools like cgconfig which >>> is yet another pain just because of something that we never promissed to >>> keep a certain way. There are many people who don't work with mount&fs >>> cgroups directly but rather use libcgroup for that... >> >> If the default behavior has to be switched without extra input from >> userland, that should be noisy like hell and slow. e.g. generate >> warning messages whenever userland does something which is to be >> deprecated - nesting when .use_hierarchy == 0, mixing .use_hierarchy >> == 0 / 1, and maybe later on, using .use_hierarchy == 0 at all. >> >> Hmm.... we need to switch other controllers over to hierarchical >> behavior too. We may as well just do it from cgroup core. Once we >> rule out all users of pseudo hierarchy - nesting with controllers >> which don't support hierarchy - switching on hierarchy support >> per-controller shouldn't cause much problem. >> >> How about the following then? >> >> * I'll add a way for controllers to tell cgroup core that full >> hierarchy support is supported and a cgroup mount option to enable >> hierarchy (cgroup core itself already uses a number of mount options >> and libgroup or whatever should be able to deal with it). >> >> cgroup will refuse to mount if the hierarchy option is specified >> with a controller which doesn't support hierarchy and it will also >> whine like crazy if the userland tries to nest without the mount >> option specified. >> >> Each controller must enforce hierarchy once so directed by cgroup >> mount option. >> >> * While doing that, all applicable controllers will be updated to >> support hierarchy. >> >> * After sufficient time has passed, nesting without the mount option >> specified will be failed by cgroup core. >> >> As for memcg's .use_hierarchy, make it RO 1 if the cgroup indicates >> that hierarchy should be used. Otherwise, I don't know but make sure >> it gets phased out out use somehow. >> > > > The reason for use_hierarchy file was just _performance_, it _was_ > terrible. > Now it's not very good but not terrible. > > > You all may think this as crazy idea. How about versioning ? > > Creating 'memory2'(memory cgroup v2) cgroup and mark 'memory' cgroup as > deprecated, > and put it to feature-removal-list. > > Of course, memory2 cgroup doesn't have use_hierarchy file and have kmem > accounting. > We should disallow to use memory and memory2 at the same time. > > Or, add version file to cgroup subsys ? select v2 as default...user can > choose v1 > with mount option if necessary, but it will not be maintained. > Is it too difficult or messy ? > > To keep user experience, versioning is a way. And we can see how the > changes > affects users. > I think it needs more consideration. Let's consider the following points: * We are having a hard time getting rid of a file we know we hurt us. * We can identify at least a bazillion other points in which we suck. Some of them may be user visible, and we'll have an equally hard time fixing it * People like Michal and David are raising points on the lines of "I have this setup working, everything we add or remove may break it" I agree with it sometimes, disagree at others(*). * I really doubt memcg as it is now - even considered kmem in, is "finished". This means more additions are likely to follow, and waiting for it to ever be "finished" would be waiting forever. I don't necessarily agree with versioning. Mainly, because we have no one in the other end negotiating the features. Versioning makes more sense in those environments, where you can actually adapt your application to whatever you see in the version field, and conversely request a version you are safe with. But the point I want to make here is that whatever we agree on, we should work towards something that will allow us to more freely change and fix memcg in the future, without going through this discussion every single time. (*) Please note that I am not saying it is okay to break setups!!! I am only saying that I disagree that some actions will break setups that are used within reasonability. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 2/2] memcg: first step towards hierarchical controller @ 2012-06-28 9:12 ` Glauber Costa 0 siblings, 0 replies; 47+ messages in thread From: Glauber Costa @ 2012-06-28 9:12 UTC (permalink / raw) To: Kamezawa Hiroyuki Cc: Tejun Heo, Michal Hocko, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Johannes Weiner, Andrew Morton, David Rientjes On 06/28/2012 12:46 PM, Kamezawa Hiroyuki wrote: > (2012/06/28 2:33), Tejun Heo wrote: >> Hello, Michal. >> >> On Wed, Jun 27, 2012 at 02:51:19PM +0200, Michal Hocko wrote: >>>> Yeah, this is something I'm seriously considering doing from cgroup >>>> core. ie. generating a warning message if the user nests cgroups w/ >>>> controllers which don't support full hierarchy. >>> >>> This is a good idea. >> >> And I want each controller either to do proper hierarchy or not at all >> and disallow switching the behavior while mounted - at least disallow >> switching off hierarchy support dynamically. >> >>>> Just disallow clearing .use_hierarchy if it was mounted with the >>>> option? >>> >>> Dunno, mount option just doesn't feel right. We do not offer other >>> attributes to be set by them so it would be just confusing. Besides that >>> it would require an integration into existing tools like cgconfig which >>> is yet another pain just because of something that we never promissed to >>> keep a certain way. There are many people who don't work with mount&fs >>> cgroups directly but rather use libcgroup for that... >> >> If the default behavior has to be switched without extra input from >> userland, that should be noisy like hell and slow. e.g. generate >> warning messages whenever userland does something which is to be >> deprecated - nesting when .use_hierarchy == 0, mixing .use_hierarchy >> == 0 / 1, and maybe later on, using .use_hierarchy == 0 at all. >> >> Hmm.... we need to switch other controllers over to hierarchical >> behavior too. We may as well just do it from cgroup core. Once we >> rule out all users of pseudo hierarchy - nesting with controllers >> which don't support hierarchy - switching on hierarchy support >> per-controller shouldn't cause much problem. >> >> How about the following then? >> >> * I'll add a way for controllers to tell cgroup core that full >> hierarchy support is supported and a cgroup mount option to enable >> hierarchy (cgroup core itself already uses a number of mount options >> and libgroup or whatever should be able to deal with it). >> >> cgroup will refuse to mount if the hierarchy option is specified >> with a controller which doesn't support hierarchy and it will also >> whine like crazy if the userland tries to nest without the mount >> option specified. >> >> Each controller must enforce hierarchy once so directed by cgroup >> mount option. >> >> * While doing that, all applicable controllers will be updated to >> support hierarchy. >> >> * After sufficient time has passed, nesting without the mount option >> specified will be failed by cgroup core. >> >> As for memcg's .use_hierarchy, make it RO 1 if the cgroup indicates >> that hierarchy should be used. Otherwise, I don't know but make sure >> it gets phased out out use somehow. >> > > > The reason for use_hierarchy file was just _performance_, it _was_ > terrible. > Now it's not very good but not terrible. > > > You all may think this as crazy idea. How about versioning ? > > Creating 'memory2'(memory cgroup v2) cgroup and mark 'memory' cgroup as > deprecated, > and put it to feature-removal-list. > > Of course, memory2 cgroup doesn't have use_hierarchy file and have kmem > accounting. > We should disallow to use memory and memory2 at the same time. > > Or, add version file to cgroup subsys ? select v2 as default...user can > choose v1 > with mount option if necessary, but it will not be maintained. > Is it too difficult or messy ? > > To keep user experience, versioning is a way. And we can see how the > changes > affects users. > I think it needs more consideration. Let's consider the following points: * We are having a hard time getting rid of a file we know we hurt us. * We can identify at least a bazillion other points in which we suck. Some of them may be user visible, and we'll have an equally hard time fixing it * People like Michal and David are raising points on the lines of "I have this setup working, everything we add or remove may break it" I agree with it sometimes, disagree at others(*). * I really doubt memcg as it is now - even considered kmem in, is "finished". This means more additions are likely to follow, and waiting for it to ever be "finished" would be waiting forever. I don't necessarily agree with versioning. Mainly, because we have no one in the other end negotiating the features. Versioning makes more sense in those environments, where you can actually adapt your application to whatever you see in the version field, and conversely request a version you are safe with. But the point I want to make here is that whatever we agree on, we should work towards something that will allow us to more freely change and fix memcg in the future, without going through this discussion every single time. (*) Please note that I am not saying it is okay to break setups!!! I am only saying that I disagree that some actions will break setups that are used within reasonability. ^ permalink raw reply [flat|nested] 47+ messages in thread
end of thread, other threads:[~2012-06-28 9:14 UTC | newest] Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-06-26 15:47 [PATCH 0/2] fix and deprecate use_hierarchy file Glauber Costa 2012-06-26 15:47 ` [PATCH 1/2] fix bad behavior in " Glauber Costa 2012-06-26 15:47 ` Glauber Costa 2012-06-26 15:52 ` Michal Hocko 2012-06-26 15:52 ` Michal Hocko 2012-06-26 15:54 ` Johannes Weiner 2012-06-26 15:54 ` Johannes Weiner 2012-06-26 22:25 ` Andrew Morton 2012-06-26 22:25 ` Andrew Morton 2012-06-26 22:30 ` Tejun Heo 2012-06-26 15:47 ` [PATCH 2/2] memcg: first step towards hierarchical controller Glauber Costa 2012-06-26 15:47 ` Glauber Costa 2012-06-26 16:15 ` Michal Hocko 2012-06-26 16:15 ` Michal Hocko 2012-06-26 16:37 ` Glauber Costa 2012-06-26 16:37 ` Glauber Costa 2012-06-26 17:54 ` Michal Hocko 2012-06-26 17:54 ` Michal Hocko 2012-06-26 18:04 ` Tejun Heo 2012-06-26 18:04 ` Tejun Heo 2012-06-26 18:55 ` Johannes Weiner 2012-06-26 19:14 ` Tejun Heo 2012-06-26 19:14 ` Tejun Heo 2012-06-26 20:59 ` Johannes Weiner 2012-06-26 20:59 ` Johannes Weiner 2012-06-26 21:19 ` Tejun Heo 2012-06-26 21:19 ` Tejun Heo 2012-06-27 8:57 ` Glauber Costa 2012-06-27 17:07 ` Tejun Heo 2012-06-26 22:08 ` Michal Hocko 2012-06-26 22:08 ` Michal Hocko 2012-06-26 22:14 ` Tejun Heo 2012-06-26 22:14 ` Tejun Heo 2012-06-26 22:17 ` Tejun Heo 2012-06-27 8:52 ` Glauber Costa 2012-06-27 8:52 ` Glauber Costa 2012-06-27 16:58 ` Tejun Heo 2012-06-27 16:58 ` Tejun Heo 2012-06-27 12:51 ` Michal Hocko 2012-06-27 12:51 ` Michal Hocko 2012-06-27 12:49 ` Glauber Costa 2012-06-27 12:49 ` Glauber Costa 2012-06-27 17:33 ` Tejun Heo 2012-06-28 8:46 ` Kamezawa Hiroyuki 2012-06-28 8:46 ` Kamezawa Hiroyuki 2012-06-28 9:12 ` Glauber Costa 2012-06-28 9:12 ` Glauber Costa
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.