* [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods
@ 2011-08-23 22:19 Tejun Heo
  2011-08-23 22:19 ` [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration Tejun Heo
                   ` (14 more replies)
  0 siblings, 15 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw, paul, lizf; +Cc: linux-pm, linux-kernel, containers

Hello,

cgroup has grown quite a number of subsys methods.  Some of them
overlap, are inconsistent with each other, and are called under
different conditions depending on whether they're invoked for a single
task or a whole process.  Unfortunately, the callbacks manage to be
complicated and incomplete at the same time.

* ->attach_task() is called after migration for task attach but
  before migration for process attach.

* Ditto for ->pre_attach().

* ->can_attach_task() is called for every task in the thread group but
  ->attach_task() skips the ones which don't actually change cgroups.

* Task attach becomes a noop if the task isn't actually moving.
  Process attach is always performed.

* ->attach_task() doesn't (or at least isn't supposed to) have access
  to the old cgroup.

* During cancel, there's no way to access the affected tasks.

This patchset introduces cgroup_taskset along with accessors and an
iterator, updates the methods to use it, consolidates usages and drops
superfluous methods.

It contains the following six patches.

 0001-cgroup-subsys-attach_task-should-be-called-after-mig.patch
 0002-cgroup-improve-old-cgroup-handling-in-cgroup_attach_.patch
 0003-cgroup-introduce-cgroup_taskset-and-use-it-in-subsys.patch
 0004-cgroup-don-t-use-subsys-can_attach_task-or-attach_ta.patch
 0005-cgroup-cpuset-don-t-use-ss-pre_attach.patch
 0006-cgroup-kill-subsys-can_attach_task-pre_attach-and-at.patch

and is based on the current linux-pm/pm-freezer (7b5b95b3f5 "freezer:
remove should_send_signal() and update frozen()"), and available in
the following git tree.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git freezer

I based this on top of pm-freezer because cgroup_freezer changes
conflict (easy to resolve but still) and I'm planning on making
further changes to cgroup_freezer which will depend on both freezer
and cgroup changes.  How should we route these changes?

1. As this patchset would affect other cgroup changes, it makes sense
   to route these through the cgroup branch (BTW, where is it?) and
   propagate things there.  In that case, I'll re-spin the patches on
   top of that tree and send a pull request for the merged branch to
   Rafael.

2. Alternatively, if cgroup isn't expected to have too extensive
   changes in this cycle, we can just funnel all these through
   Rafael's tree.

3. Yet another choice would be applying these on Rafael's tree and
   then pull that into cgroup tree as further changes aren't gonna
   affect cgroup all that much.

What do you guys think?

Thank you.

 Documentation/cgroups/cgroups.txt |   46 +++-----
 block/blk-cgroup.c                |   45 +++++---
 include/linux/cgroup.h            |   31 ++++-
 kernel/cgroup.c                   |  200 ++++++++++++++++++++++++--------------
 kernel/cgroup_freezer.c           |   16 ---
 kernel/cpuset.c                   |  105 +++++++++----------
 kernel/events/core.c              |   13 +-
 kernel/sched.c                    |   31 +++--
 mm/memcontrol.c                   |   16 +--
 security/device_cgroup.c          |    7 -
 10 files changed, 289 insertions(+), 221 deletions(-)

--
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration
       [not found] ` <1314138000-2049-1-git-send-email-tj@kernel.org>
@ 2011-08-23 22:19   ` Tejun Heo
  2011-08-23 22:19   ` [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw, paul, lizf; +Cc: Tejun Heo, containers, linux-pm, linux-kernel

cgroup_attach_task() calls subsys->attach_task() after
cgroup_task_migrate(); however, cgroup_attach_proc() calls it before
migration.  This actually affects some of the users.  Update
cgroup_attach_proc() such that ->attach_task() is called after
migration.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/cgroup.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1d2b6ce..a606fa2 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2135,14 +2135,16 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		oldcgrp = task_cgroup_from_root(tsk, root);
 		if (cgrp == oldcgrp)
 			continue;
+
+		/* if the thread is PF_EXITING, it can just get skipped. */
+		retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, true);
+		BUG_ON(retval != 0 && retval != -ESRCH);
+
 		/* attach each task to each subsystem */
 		for_each_subsys(root, ss) {
 			if (ss->attach_task)
 				ss->attach_task(cgrp, tsk);
 		}
-		/* if the thread is PF_EXITING, it can just get skipped. */
-		retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, true);
-		BUG_ON(retval != 0 && retval != -ESRCH);
 	}
 	/* nothing is sensitive to fork() after this point. */
 
-- 
1.7.6

* [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
       [not found] ` <1314138000-2049-1-git-send-email-tj@kernel.org>
  2011-08-23 22:19   ` [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration Tejun Heo
@ 2011-08-23 22:19   ` Tejun Heo
  2011-08-23 22:19   ` [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw, paul, lizf; +Cc: Tejun Heo, containers, linux-pm, linux-kernel

cgroup_attach_proc() behaves differently from cgroup_attach_task() in
the following aspects.

* All hooks are invoked even if no task is actually being moved.

* ->can_attach_task() is called for all tasks in the group whether the
  new cgrp is different from the current cgrp or not; however,
  ->attach_task() is skipped if the new cgrp equals the current one.
  This makes the calls asymmetric.

This patch improves old cgroup handling in cgroup_attach_proc() by
looking up the current cgroup at the head, recording it in the flex
array along with the task itself, and using it to remove the above two
differences.  This will also ease further changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/cgroup.c |   70 ++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 44 insertions(+), 26 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a606fa2..cf5f3e3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1739,6 +1739,11 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 }
 EXPORT_SYMBOL_GPL(cgroup_path);
 
+struct task_and_cgroup {
+	struct task_struct	*task;
+	struct cgroup		*cgrp;
+};
+
 /*
  * cgroup_task_migrate - move a task from one cgroup to another.
  *
@@ -1990,15 +1995,15 @@ static int css_set_prefetch(struct cgroup *cgrp, struct css_set *cg,
  */
 int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 {
-	int retval, i, group_size;
+	int retval, i, group_size, nr_todo;
 	struct cgroup_subsys *ss, *failed_ss = NULL;
 	bool cancel_failed_ss = false;
 	/* guaranteed to be initialized later, but the compiler needs this */
-	struct cgroup *oldcgrp = NULL;
 	struct css_set *oldcg;
 	struct cgroupfs_root *root = cgrp->root;
 	/* threadgroup list cursor and array */
 	struct task_struct *tsk;
+	struct task_and_cgroup *tc;
 	struct flex_array *group;
 	/*
 	 * we need to make sure we have css_sets for all the tasks we're
@@ -2017,8 +2022,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	group_size = get_nr_threads(leader);
 	/* flex_array supports very large thread-groups better than kmalloc. */
-	group = flex_array_alloc(sizeof(struct task_struct *), group_size,
-				 GFP_KERNEL);
+	group = flex_array_alloc(sizeof(*tc), group_size, GFP_KERNEL);
 	if (!group)
 		return -ENOMEM;
 	/* pre-allocate to guarantee space while iterating in rcu read-side. */
@@ -2042,8 +2046,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	}
 	/* take a reference on each task in the group to go in the array. */
 	tsk = leader;
-	i = 0;
+	i = nr_todo = 0;
 	do {
+		struct task_and_cgroup ent;
+
 		/* as per above, nr_threads may decrease, but not increase. */
 		BUG_ON(i >= group_size);
 		get_task_struct(tsk);
@@ -2051,14 +2057,23 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
 		 * earlier, but it's good form to communicate our expectations.
 		 */
-		retval = flex_array_put_ptr(group, i, tsk, GFP_ATOMIC);
+		ent.task = tsk;
+		ent.cgrp = task_cgroup_from_root(tsk, root);
+		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 		BUG_ON(retval != 0);
 		i++;
+		if (ent.cgrp != cgrp)
+			nr_todo++;
 	} while_each_thread(leader, tsk);
 	/* remember the number of threads in the array for later. */
 	group_size = i;
 	rcu_read_unlock();
 
+	/* methods shouldn't be called if no task is actually migrating */
+	retval = 0;
+	if (!nr_todo)
+		goto out_put_tasks;
+
 	/*
 	 * step 1: check that we can legitimately attach to the cgroup.
 	 */
@@ -2074,8 +2089,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		if (ss->can_attach_task) {
 			/* run on each task in the threadgroup. */
 			for (i = 0; i < group_size; i++) {
-				tsk = flex_array_get_ptr(group, i);
-				retval = ss->can_attach_task(cgrp, tsk);
+				tc = flex_array_get(group, i);
+				if (tc->cgrp == cgrp)
+					continue;
+				retval = ss->can_attach_task(cgrp, tc->task);
 				if (retval) {
 					failed_ss = ss;
 					cancel_failed_ss = true;
@@ -2091,23 +2108,22 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	INIT_LIST_HEAD(&newcg_list);
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
+		tc = flex_array_get(group, i);
 		/* nothing to do if this task is already in the cgroup */
-		oldcgrp = task_cgroup_from_root(tsk, root);
-		if (cgrp == oldcgrp)
+		if (tc->cgrp == cgrp)
 			continue;
 		/* get old css_set pointer */
-		task_lock(tsk);
-		if (tsk->flags & PF_EXITING) {
+		task_lock(tc->task);
+		if (tc->task->flags & PF_EXITING) {
 			/* ignore this task if it's going away */
-			task_unlock(tsk);
+			task_unlock(tc->task);
 			continue;
 		}
-		oldcg = tsk->cgroups;
+		oldcg = tc->task->cgroups;
 		get_css_set(oldcg);
-		task_unlock(tsk);
+		task_unlock(tc->task);
 		/* see if the new one for us is already in the list? */
-		if (css_set_check_fetched(cgrp, tsk, oldcg, &newcg_list)) {
+		if (css_set_check_fetched(cgrp, tc->task, oldcg, &newcg_list)) {
 			/* was already there, nothing to do. */
 			put_css_set(oldcg);
 		} else {
@@ -2130,20 +2146,19 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 			ss->pre_attach(cgrp);
 	}
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
+		tc = flex_array_get(group, i);
 		/* leave current thread as it is if it's already there */
-		oldcgrp = task_cgroup_from_root(tsk, root);
-		if (cgrp == oldcgrp)
+		if (tc->cgrp == cgrp)
 			continue;
 
 		/* if the thread is PF_EXITING, it can just get skipped. */
-		retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, true);
+		retval = cgroup_task_migrate(cgrp, tc->cgrp, tc->task, true);
 		BUG_ON(retval != 0 && retval != -ESRCH);
 
 		/* attach each task to each subsystem */
 		for_each_subsys(root, ss) {
 			if (ss->attach_task)
-				ss->attach_task(cgrp, tsk);
+				ss->attach_task(cgrp, tc->task);
 		}
 	}
 	/* nothing is sensitive to fork() after this point. */
@@ -2154,8 +2169,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 * being moved, this call will need to be reworked to communicate that.
 	 */
 	for_each_subsys(root, ss) {
-		if (ss->attach)
-			ss->attach(ss, cgrp, oldcgrp, leader);
+		if (ss->attach) {
+			tc = flex_array_get(group, 0);
+			ss->attach(ss, cgrp, tc->cgrp, tc->task);
+		}
 	}
 
 	/*
@@ -2184,10 +2201,11 @@ out_cancel_attach:
 				ss->cancel_attach(ss, cgrp, leader);
 		}
 	}
+out_put_tasks:
 	/* clean up the array of referenced threads in the group. */
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
-		put_task_struct(tsk);
+		tc = flex_array_get(group, i);
+		put_task_struct(tc->task);
 	}
 out_free_group_list:
 	flex_array_free(group);
-- 
1.7.6

* [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
  2011-08-23 22:19 [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Tejun Heo
                   ` (2 preceding siblings ...)
  2011-08-23 22:19 ` [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
@ 2011-08-23 22:19 ` Tejun Heo
  2011-08-25  8:51   ` Paul Menage
                     ` (4 more replies)
  2011-08-23 22:19 ` [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
                   ` (10 subsequent siblings)
  14 siblings, 5 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw, paul, lizf; +Cc: linux-pm, linux-kernel, containers, Tejun Heo

cgroup_attach_proc() behaves differently from cgroup_attach_task() in
the following aspects.

* All hooks are invoked even if no task is actually being moved.

* ->can_attach_task() is called for all tasks in the group whether the
  new cgrp is different from the current cgrp or not; however,
  ->attach_task() is skipped if new equals new.  This makes the calls
  asymmetric.

This patch improves old cgroup handling in cgroup_attach_proc() by
looking up the current cgroup at the head, recording it in the flex
array along with the task itself, and using it to remove the above two
differences.  This will also ease further changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/cgroup.c |   70 ++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 44 insertions(+), 26 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a606fa2..cf5f3e3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1739,6 +1739,11 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 }
 EXPORT_SYMBOL_GPL(cgroup_path);
 
+struct task_and_cgroup {
+	struct task_struct	*task;
+	struct cgroup		*cgrp;
+};
+
 /*
  * cgroup_task_migrate - move a task from one cgroup to another.
  *
@@ -1990,15 +1995,15 @@ static int css_set_prefetch(struct cgroup *cgrp, struct css_set *cg,
  */
 int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 {
-	int retval, i, group_size;
+	int retval, i, group_size, nr_todo;
 	struct cgroup_subsys *ss, *failed_ss = NULL;
 	bool cancel_failed_ss = false;
 	/* guaranteed to be initialized later, but the compiler needs this */
-	struct cgroup *oldcgrp = NULL;
 	struct css_set *oldcg;
 	struct cgroupfs_root *root = cgrp->root;
 	/* threadgroup list cursor and array */
 	struct task_struct *tsk;
+	struct task_and_cgroup *tc;
 	struct flex_array *group;
 	/*
 	 * we need to make sure we have css_sets for all the tasks we're
@@ -2017,8 +2022,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	group_size = get_nr_threads(leader);
 	/* flex_array supports very large thread-groups better than kmalloc. */
-	group = flex_array_alloc(sizeof(struct task_struct *), group_size,
-				 GFP_KERNEL);
+	group = flex_array_alloc(sizeof(*tc), group_size, GFP_KERNEL);
 	if (!group)
 		return -ENOMEM;
 	/* pre-allocate to guarantee space while iterating in rcu read-side. */
@@ -2042,8 +2046,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	}
 	/* take a reference on each task in the group to go in the array. */
 	tsk = leader;
-	i = 0;
+	i = nr_todo = 0;
 	do {
+		struct task_and_cgroup ent;
+
 		/* as per above, nr_threads may decrease, but not increase. */
 		BUG_ON(i >= group_size);
 		get_task_struct(tsk);
@@ -2051,14 +2057,23 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
 		 * earlier, but it's good form to communicate our expectations.
 		 */
-		retval = flex_array_put_ptr(group, i, tsk, GFP_ATOMIC);
+		ent.task = tsk;
+		ent.cgrp = task_cgroup_from_root(tsk, root);
+		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 		BUG_ON(retval != 0);
 		i++;
+		if (ent.cgrp != cgrp)
+			nr_todo++;
 	} while_each_thread(leader, tsk);
 	/* remember the number of threads in the array for later. */
 	group_size = i;
 	rcu_read_unlock();
 
+	/* methods shouldn't be called if no task is actually migrating */
+	retval = 0;
+	if (!nr_todo)
+		goto out_put_tasks;
+
 	/*
 	 * step 1: check that we can legitimately attach to the cgroup.
 	 */
@@ -2074,8 +2089,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		if (ss->can_attach_task) {
 			/* run on each task in the threadgroup. */
 			for (i = 0; i < group_size; i++) {
-				tsk = flex_array_get_ptr(group, i);
-				retval = ss->can_attach_task(cgrp, tsk);
+				tc = flex_array_get(group, i);
+				if (tc->cgrp == cgrp)
+					continue;
+				retval = ss->can_attach_task(cgrp, tc->task);
 				if (retval) {
 					failed_ss = ss;
 					cancel_failed_ss = true;
@@ -2091,23 +2108,22 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	INIT_LIST_HEAD(&newcg_list);
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
+		tc = flex_array_get(group, i);
 		/* nothing to do if this task is already in the cgroup */
-		oldcgrp = task_cgroup_from_root(tsk, root);
-		if (cgrp == oldcgrp)
+		if (tc->cgrp == cgrp)
 			continue;
 		/* get old css_set pointer */
-		task_lock(tsk);
-		if (tsk->flags & PF_EXITING) {
+		task_lock(tc->task);
+		if (tc->task->flags & PF_EXITING) {
 			/* ignore this task if it's going away */
-			task_unlock(tsk);
+			task_unlock(tc->task);
 			continue;
 		}
-		oldcg = tsk->cgroups;
+		oldcg = tc->task->cgroups;
 		get_css_set(oldcg);
-		task_unlock(tsk);
+		task_unlock(tc->task);
 		/* see if the new one for us is already in the list? */
-		if (css_set_check_fetched(cgrp, tsk, oldcg, &newcg_list)) {
+		if (css_set_check_fetched(cgrp, tc->task, oldcg, &newcg_list)) {
 			/* was already there, nothing to do. */
 			put_css_set(oldcg);
 		} else {
@@ -2130,20 +2146,19 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 			ss->pre_attach(cgrp);
 	}
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
+		tc = flex_array_get(group, i);
 		/* leave current thread as it is if it's already there */
-		oldcgrp = task_cgroup_from_root(tsk, root);
-		if (cgrp == oldcgrp)
+		if (tc->cgrp == cgrp)
 			continue;
 
 		/* if the thread is PF_EXITING, it can just get skipped. */
-		retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, true);
+		retval = cgroup_task_migrate(cgrp, tc->cgrp, tc->task, true);
 		BUG_ON(retval != 0 && retval != -ESRCH);
 
 		/* attach each task to each subsystem */
 		for_each_subsys(root, ss) {
 			if (ss->attach_task)
-				ss->attach_task(cgrp, tsk);
+				ss->attach_task(cgrp, tc->task);
 		}
 	}
 	/* nothing is sensitive to fork() after this point. */
@@ -2154,8 +2169,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 * being moved, this call will need to be reworked to communicate that.
 	 */
 	for_each_subsys(root, ss) {
-		if (ss->attach)
-			ss->attach(ss, cgrp, oldcgrp, leader);
+		if (ss->attach) {
+			tc = flex_array_get(group, 0);
+			ss->attach(ss, cgrp, tc->cgrp, tc->task);
+		}
 	}
 
 	/*
@@ -2184,10 +2201,11 @@ out_cancel_attach:
 				ss->cancel_attach(ss, cgrp, leader);
 		}
 	}
+out_put_tasks:
 	/* clean up the array of referenced threads in the group. */
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
-		put_task_struct(tsk);
+		tc = flex_array_get(group, i);
+		put_task_struct(tc->task);
 	}
 out_free_group_list:
 	flex_array_free(group);
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
  2011-08-23 22:19 [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Tejun Heo
  2011-08-23 22:19 ` [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration Tejun Heo
  2011-08-23 22:19 ` Tejun Heo
@ 2011-08-23 22:19 ` Tejun Heo
  2011-08-23 22:19 ` Tejun Heo
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw, paul, lizf; +Cc: Tejun Heo, containers, linux-pm, linux-kernel

cgroup_attach_proc() behaves differently from cgroup_attach_task() in
the following aspects.

* All hooks are invoked even if no task is actually being moved.

* ->can_attach_task() is called for all tasks in the group whether the
  new cgrp is different from the current cgrp or not; however,
  ->attach_task() is skipped if new equals new.  This makes the calls
  asymmetric.

This patch improves old cgroup handling in cgroup_attach_proc() by
looking up the current cgroup at the head, recording it in the flex
array along with the task itself, and using it to remove the above two
differences.  This will also ease further changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/cgroup.c |   70 ++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 44 insertions(+), 26 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a606fa2..cf5f3e3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1739,6 +1739,11 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 }
 EXPORT_SYMBOL_GPL(cgroup_path);
 
+struct task_and_cgroup {
+	struct task_struct	*task;
+	struct cgroup		*cgrp;
+};
+
 /*
  * cgroup_task_migrate - move a task from one cgroup to another.
  *
@@ -1990,15 +1995,15 @@ static int css_set_prefetch(struct cgroup *cgrp, struct css_set *cg,
  */
 int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 {
-	int retval, i, group_size;
+	int retval, i, group_size, nr_todo;
 	struct cgroup_subsys *ss, *failed_ss = NULL;
 	bool cancel_failed_ss = false;
 	/* guaranteed to be initialized later, but the compiler needs this */
-	struct cgroup *oldcgrp = NULL;
 	struct css_set *oldcg;
 	struct cgroupfs_root *root = cgrp->root;
 	/* threadgroup list cursor and array */
 	struct task_struct *tsk;
+	struct task_and_cgroup *tc;
 	struct flex_array *group;
 	/*
 	 * we need to make sure we have css_sets for all the tasks we're
@@ -2017,8 +2022,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	group_size = get_nr_threads(leader);
 	/* flex_array supports very large thread-groups better than kmalloc. */
-	group = flex_array_alloc(sizeof(struct task_struct *), group_size,
-				 GFP_KERNEL);
+	group = flex_array_alloc(sizeof(*tc), group_size, GFP_KERNEL);
 	if (!group)
 		return -ENOMEM;
 	/* pre-allocate to guarantee space while iterating in rcu read-side. */
@@ -2042,8 +2046,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	}
 	/* take a reference on each task in the group to go in the array. */
 	tsk = leader;
-	i = 0;
+	i = nr_todo = 0;
 	do {
+		struct task_and_cgroup ent;
+
 		/* as per above, nr_threads may decrease, but not increase. */
 		BUG_ON(i >= group_size);
 		get_task_struct(tsk);
@@ -2051,14 +2057,23 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
 		 * earlier, but it's good form to communicate our expectations.
 		 */
-		retval = flex_array_put_ptr(group, i, tsk, GFP_ATOMIC);
+		ent.task = tsk;
+		ent.cgrp = task_cgroup_from_root(tsk, root);
+		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 		BUG_ON(retval != 0);
 		i++;
+		if (ent.cgrp != cgrp)
+			nr_todo++;
 	} while_each_thread(leader, tsk);
 	/* remember the number of threads in the array for later. */
 	group_size = i;
 	rcu_read_unlock();
 
+	/* methods shouldn't be called if no task is actually migrating */
+	retval = 0;
+	if (!nr_todo)
+		goto out_put_tasks;
+
 	/*
 	 * step 1: check that we can legitimately attach to the cgroup.
 	 */
@@ -2074,8 +2089,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		if (ss->can_attach_task) {
 			/* run on each task in the threadgroup. */
 			for (i = 0; i < group_size; i++) {
-				tsk = flex_array_get_ptr(group, i);
-				retval = ss->can_attach_task(cgrp, tsk);
+				tc = flex_array_get(group, i);
+				if (tc->cgrp == cgrp)
+					continue;
+				retval = ss->can_attach_task(cgrp, tc->task);
 				if (retval) {
 					failed_ss = ss;
 					cancel_failed_ss = true;
@@ -2091,23 +2108,22 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	INIT_LIST_HEAD(&newcg_list);
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
+		tc = flex_array_get(group, i);
 		/* nothing to do if this task is already in the cgroup */
-		oldcgrp = task_cgroup_from_root(tsk, root);
-		if (cgrp == oldcgrp)
+		if (tc->cgrp == cgrp)
 			continue;
 		/* get old css_set pointer */
-		task_lock(tsk);
-		if (tsk->flags & PF_EXITING) {
+		task_lock(tc->task);
+		if (tc->task->flags & PF_EXITING) {
 			/* ignore this task if it's going away */
-			task_unlock(tsk);
+			task_unlock(tc->task);
 			continue;
 		}
-		oldcg = tsk->cgroups;
+		oldcg = tc->task->cgroups;
 		get_css_set(oldcg);
-		task_unlock(tsk);
+		task_unlock(tc->task);
 		/* see if the new one for us is already in the list? */
-		if (css_set_check_fetched(cgrp, tsk, oldcg, &newcg_list)) {
+		if (css_set_check_fetched(cgrp, tc->task, oldcg, &newcg_list)) {
 			/* was already there, nothing to do. */
 			put_css_set(oldcg);
 		} else {
@@ -2130,20 +2146,19 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 			ss->pre_attach(cgrp);
 	}
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
+		tc = flex_array_get(group, i);
 		/* leave current thread as it is if it's already there */
-		oldcgrp = task_cgroup_from_root(tsk, root);
-		if (cgrp == oldcgrp)
+		if (tc->cgrp == cgrp)
 			continue;
 
 		/* if the thread is PF_EXITING, it can just get skipped. */
-		retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, true);
+		retval = cgroup_task_migrate(cgrp, tc->cgrp, tc->task, true);
 		BUG_ON(retval != 0 && retval != -ESRCH);
 
 		/* attach each task to each subsystem */
 		for_each_subsys(root, ss) {
 			if (ss->attach_task)
-				ss->attach_task(cgrp, tsk);
+				ss->attach_task(cgrp, tc->task);
 		}
 	}
 	/* nothing is sensitive to fork() after this point. */
@@ -2154,8 +2169,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 * being moved, this call will need to be reworked to communicate that.
 	 */
 	for_each_subsys(root, ss) {
-		if (ss->attach)
-			ss->attach(ss, cgrp, oldcgrp, leader);
+		if (ss->attach) {
+			tc = flex_array_get(group, 0);
+			ss->attach(ss, cgrp, tc->cgrp, tc->task);
+		}
 	}
 
 	/*
@@ -2184,10 +2201,11 @@ out_cancel_attach:
 				ss->cancel_attach(ss, cgrp, leader);
 		}
 	}
+out_put_tasks:
 	/* clean up the array of referenced threads in the group. */
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
-		put_task_struct(tsk);
+		tc = flex_array_get(group, i);
+		put_task_struct(tc->task);
 	}
 out_free_group_list:
 	flex_array_free(group);
-- 
1.7.6


* [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found] ` <1314138000-2049-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2011-08-23 22:19   ` [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration Tejun Heo
  2011-08-23 22:19   ` [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
@ 2011-08-23 22:19   ` Tejun Heo
  2011-08-23 22:19   ` [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw-KKrjLPT3xs0, paul-inf54ven1CmVyaH7bEyXVA,
	lizf-BthXqXjhjHXQFUHtdCDX3A
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Daisuke Nishimura, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Tejun Heo, linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Currently, there's no way to pass multiple tasks to cgroup_subsys
methods, necessitating separate per-process and per-task methods.
This patch introduces cgroup_taskset, which can be used to pass
multiple tasks and their associated cgroups to cgroup_subsys
methods.

Three methods - can_attach(), cancel_attach() and attach() - are
converted to use cgroup_taskset.  This unifies the passed parameters
so that all three methods have access to the same information.  The
conversions in this patchset all follow the same pattern and don't
introduce any behavior change.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Daisuke Nishimura <nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
---
 Documentation/cgroups/cgroups.txt |   26 ++++++----
 include/linux/cgroup.h            |   28 +++++++++-
 kernel/cgroup.c                   |   99 +++++++++++++++++++++++++++++++++----
 kernel/cgroup_freezer.c           |    2 +-
 kernel/cpuset.c                   |   18 ++++---
 mm/memcontrol.c                   |   16 +++---
 security/device_cgroup.c          |    7 ++-
 7 files changed, 153 insertions(+), 43 deletions(-)

diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index cd67e90..2eee7cf 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -594,16 +594,21 @@ rmdir() will fail with it. From this behavior, pre_destroy() can be
 called multiple times against a cgroup.
 
 int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
-	       struct task_struct *task)
+	       struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
 
-Called prior to moving a task into a cgroup; if the subsystem
-returns an error, this will abort the attach operation.  If a NULL
-task is passed, then a successful result indicates that *any*
-unspecified task can be moved into the cgroup. Note that this isn't
+Called prior to moving one or more tasks into a cgroup; if the
+subsystem returns an error, this will abort the attach operation.
+@tset contains the tasks to be attached and is guaranteed to have at
+least one task in it. If there are multiple, it's guaranteed that all
+are from the same thread group, @tset contains all tasks from the
+group whether they're actually switching cgroup or not, and the first
+task is the leader. Each @tset entry also contains the task's old
+cgroup and tasks which aren't switching cgroup can be skipped easily
+using the cgroup_taskset_for_each() iterator. Note that this isn't
 called on a fork. If this method returns 0 (success) then this should
-remain valid while the caller holds cgroup_mutex and it is ensured that either
-attach() or cancel_attach() will be called in future.
+remain valid while the caller holds cgroup_mutex and it is ensured
+that either attach() or cancel_attach() will be called in future.
 
 int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
 (cgroup_mutex held by caller)
@@ -613,14 +618,14 @@ attached (possibly many when using cgroup_attach_proc). Called after
 can_attach.
 
 void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
-	       struct task_struct *task, bool threadgroup)
+		   struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
 
 Called when a task attach operation has failed after can_attach() has succeeded.
 A subsystem whose can_attach() has some side-effects should provide this
 function, so that the subsystem can implement a rollback. If not, not necessary.
 This will be called only about subsystems whose can_attach() operation have
-succeeded.
+succeeded. The parameters are identical to can_attach().
 
 void pre_attach(struct cgroup *cgrp);
 (cgroup_mutex held by caller)
@@ -629,11 +634,12 @@ For any non-per-thread attachment work that needs to happen before
 attach_task. Needed by cpuset.
 
 void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
-	    struct cgroup *old_cgrp, struct task_struct *task)
+	    struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
 
 Called after the task has been attached to the cgroup, to allow any
 post-attachment activity that requires memory allocations or blocking.
+The parameters are identical to can_attach().
 
 void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
 (cgroup_mutex held by caller)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index da7e4bc..2470c8e 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -457,6 +457,28 @@ void cgroup_exclude_rmdir(struct cgroup_subsys_state *css);
 void cgroup_release_and_wakeup_rmdir(struct cgroup_subsys_state *css);
 
 /*
+ * Control Group taskset, used to pass around set of tasks to cgroup_subsys
+ * methods.
+ */
+struct cgroup_taskset;
+struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset);
+struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset);
+struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset);
+int cgroup_taskset_size(struct cgroup_taskset *tset);
+
+/**
+ * cgroup_taskset_for_each - iterate cgroup_taskset
+ * @task: the loop cursor
+ * @skip_cgrp: skip if task's cgroup matches this, %NULL to iterate through all
+ * @tset: taskset to iterate
+ */
+#define cgroup_taskset_for_each(task, skip_cgrp, tset)			\
+	for ((task) = cgroup_taskset_first((tset)); (task);		\
+	     (task) = cgroup_taskset_next((tset)))			\
+		if (!(skip_cgrp) ||					\
+		    cgroup_taskset_cur_cgroup((tset)) != (skip_cgrp))
+
+/*
  * Control Group subsystem type.
  * See Documentation/cgroups/cgroups.txt for details
  */
@@ -467,14 +489,14 @@ struct cgroup_subsys {
 	int (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
 	void (*destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
 	int (*can_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
-			  struct task_struct *tsk);
+			  struct cgroup_taskset *tset);
 	int (*can_attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
 	void (*cancel_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
-			      struct task_struct *tsk);
+			      struct cgroup_taskset *tset);
 	void (*pre_attach)(struct cgroup *cgrp);
 	void (*attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
 	void (*attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
-		       struct cgroup *old_cgrp, struct task_struct *tsk);
+		       struct cgroup_taskset *tset);
 	void (*fork)(struct cgroup_subsys *ss, struct task_struct *task);
 	void (*exit)(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			struct cgroup *old_cgrp, struct task_struct *task);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index cf5f3e3..474674b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1739,11 +1739,85 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 }
 EXPORT_SYMBOL_GPL(cgroup_path);
 
+/*
+ * Control Group taskset
+ */
 struct task_and_cgroup {
 	struct task_struct	*task;
 	struct cgroup		*cgrp;
 };
 
+struct cgroup_taskset {
+	struct task_and_cgroup	single;
+	struct flex_array	*tc_array;
+	int			tc_array_len;
+	int			idx;
+	struct cgroup		*cur_cgrp;
+};
+
+/**
+ * cgroup_taskset_first - reset taskset and return the first task
+ * @tset: taskset of interest
+ *
+ * @tset iteration is initialized and the first task is returned.
+ */
+struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset)
+{
+	if (tset->tc_array) {
+		tset->idx = 0;
+		return cgroup_taskset_next(tset);
+	} else {
+		tset->cur_cgrp = tset->single.cgrp;
+		return tset->single.task;
+	}
+}
+EXPORT_SYMBOL_GPL(cgroup_taskset_first);
+
+/**
+ * cgroup_taskset_next - iterate to the next task in taskset
+ * @tset: taskset of interest
+ *
+ * Return the next task in @tset.  Iteration must have been initialized
+ * with cgroup_taskset_first().
+ */
+struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset)
+{
+	struct task_and_cgroup *tc;
+
+	if (!tset->tc_array || tset->idx >= tset->tc_array_len)
+		return NULL;
+
+	tc = flex_array_get(tset->tc_array, tset->idx++);
+	tset->cur_cgrp = tc->cgrp;
+	return tc->task;
+}
+EXPORT_SYMBOL_GPL(cgroup_taskset_next);
+
+/**
+ * cgroup_taskset_cur_cgroup - return the matching cgroup for the current task
+ * @tset: taskset of interest
+ *
+ * Return the cgroup for the current (last returned) task of @tset.  This
+ * function must be preceded by either cgroup_taskset_first() or
+ * cgroup_taskset_next().
+ */
+struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset)
+{
+	return tset->cur_cgrp;
+}
+EXPORT_SYMBOL_GPL(cgroup_taskset_cur_cgroup);
+
+/**
+ * cgroup_taskset_size - return the number of tasks in taskset
+ * @tset: taskset of interest
+ */
+int cgroup_taskset_size(struct cgroup_taskset *tset)
+{
+	return tset->tc_array ? tset->tc_array_len : 1;
+}
+EXPORT_SYMBOL_GPL(cgroup_taskset_size);
+
+
 /*
  * cgroup_task_migrate - move a task from one cgroup to another.
  *
@@ -1828,15 +1902,19 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
 	struct cgroup_subsys *ss, *failed_ss = NULL;
 	struct cgroup *oldcgrp;
 	struct cgroupfs_root *root = cgrp->root;
+	struct cgroup_taskset tset = { };
 
 	/* Nothing to do if the task is already in that cgroup */
 	oldcgrp = task_cgroup_from_root(tsk, root);
 	if (cgrp == oldcgrp)
 		return 0;
 
+	tset.single.task = tsk;
+	tset.single.cgrp = oldcgrp;
+
 	for_each_subsys(root, ss) {
 		if (ss->can_attach) {
-			retval = ss->can_attach(ss, cgrp, tsk);
+			retval = ss->can_attach(ss, cgrp, &tset);
 			if (retval) {
 				/*
 				 * Remember on which subsystem the can_attach()
@@ -1867,7 +1945,7 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
 		if (ss->attach_task)
 			ss->attach_task(cgrp, tsk);
 		if (ss->attach)
-			ss->attach(ss, cgrp, oldcgrp, tsk);
+			ss->attach(ss, cgrp, &tset);
 	}
 
 	synchronize_rcu();
@@ -1889,7 +1967,7 @@ out:
 				 */
 				break;
 			if (ss->cancel_attach)
-				ss->cancel_attach(ss, cgrp, tsk);
+				ss->cancel_attach(ss, cgrp, &tset);
 		}
 	}
 	return retval;
@@ -2005,6 +2083,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	struct task_struct *tsk;
 	struct task_and_cgroup *tc;
 	struct flex_array *group;
+	struct cgroup_taskset tset = { };
 	/*
 	 * we need to make sure we have css_sets for all the tasks we're
 	 * going to move -before- we actually start moving them, so that in
@@ -2067,6 +2146,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	} while_each_thread(leader, tsk);
 	/* remember the number of threads in the array for later. */
 	group_size = i;
+	tset.tc_array = group;
+	tset.tc_array_len = group_size;
 	rcu_read_unlock();
 
 	/* methods shouldn't be called if no task is actually migrating */
@@ -2079,7 +2160,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	for_each_subsys(root, ss) {
 		if (ss->can_attach) {
-			retval = ss->can_attach(ss, cgrp, leader);
+			retval = ss->can_attach(ss, cgrp, &tset);
 			if (retval) {
 				failed_ss = ss;
 				goto out_cancel_attach;
@@ -2169,10 +2250,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 * being moved, this call will need to be reworked to communicate that.
 	 */
 	for_each_subsys(root, ss) {
-		if (ss->attach) {
-			tc = flex_array_get(group, 0);
-			ss->attach(ss, cgrp, tc->cgrp, tc->task);
-		}
+		if (ss->attach)
+			ss->attach(ss, cgrp, &tset);
 	}
 
 	/*
@@ -2194,11 +2273,11 @@ out_cancel_attach:
 		for_each_subsys(root, ss) {
 			if (ss == failed_ss) {
 				if (cancel_failed_ss && ss->cancel_attach)
-					ss->cancel_attach(ss, cgrp, leader);
+					ss->cancel_attach(ss, cgrp, &tset);
 				break;
 			}
 			if (ss->cancel_attach)
-				ss->cancel_attach(ss, cgrp, leader);
+				ss->cancel_attach(ss, cgrp, &tset);
 		}
 	}
 out_put_tasks:
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 4e82525..a2b0082 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -159,7 +159,7 @@ static void freezer_destroy(struct cgroup_subsys *ss,
  */
 static int freezer_can_attach(struct cgroup_subsys *ss,
 			      struct cgroup *new_cgroup,
-			      struct task_struct *task)
+			      struct cgroup_taskset *tset)
 {
 	struct freezer *freezer;
 
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 10131fd..2e5825b 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1368,10 +1368,10 @@ static int fmeter_getrate(struct fmeter *fmp)
 }
 
 /* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
-static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
-			     struct task_struct *tsk)
+static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			     struct cgroup_taskset *tset)
 {
-	struct cpuset *cs = cgroup_cs(cont);
+	struct cpuset *cs = cgroup_cs(cgrp);
 
 	if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
 		return -ENOSPC;
@@ -1384,7 +1384,7 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
 	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
 	 * be changed.
 	 */
-	if (tsk->flags & PF_THREAD_BOUND)
+	if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
 		return -EINVAL;
 
 	return 0;
@@ -1434,12 +1434,14 @@ static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
 	cpuset_update_task_spread_flag(cs, tsk);
 }
 
-static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cont,
-			  struct cgroup *oldcont, struct task_struct *tsk)
+static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			  struct cgroup_taskset *tset)
 {
 	struct mm_struct *mm;
-	struct cpuset *cs = cgroup_cs(cont);
-	struct cpuset *oldcs = cgroup_cs(oldcont);
+	struct task_struct *tsk = cgroup_taskset_first(tset);
+	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
+	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *oldcs = cgroup_cs(oldcgrp);
 
 	/*
 	 * Change mm, possibly for multiple threads in a threadgroup. This is
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 930de94..b2802cc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5460,8 +5460,9 @@ static void mem_cgroup_clear_mc(void)
 
 static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
+	struct task_struct *p = cgroup_taskset_first(tset);
 	int ret = 0;
 	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
 
@@ -5499,7 +5500,7 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
 
 static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 	mem_cgroup_clear_mc();
 }
@@ -5616,9 +5617,9 @@ retry:
 
 static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 				struct cgroup *cont,
-				struct cgroup *old_cont,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
+	struct task_struct *p = cgroup_taskset_first(tset);
 	struct mm_struct *mm = get_task_mm(p);
 
 	if (mm) {
@@ -5633,19 +5634,18 @@ static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 #else	/* !CONFIG_MMU */
 static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 	return 0;
 }
 static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 }
 static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 				struct cgroup *cont,
-				struct cgroup *old_cont,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 }
 #endif
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 4450fbe..8b5b5d8 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -62,11 +62,12 @@ static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
 struct cgroup_subsys devices_subsys;
 
 static int devcgroup_can_attach(struct cgroup_subsys *ss,
-		struct cgroup *new_cgroup, struct task_struct *task)
+			struct cgroup *new_cgrp, struct cgroup_taskset *set)
 {
-	if (current != task && !capable(CAP_SYS_ADMIN))
-			return -EPERM;
+	struct task_struct *task = cgroup_taskset_first(set);
 
+	if (current != task && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
 	return 0;
 }
 
-- 
1.7.6

 {
+	struct task_struct *p = cgroup_taskset_first(tset);
 	int ret = 0;
 	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
 
@@ -5499,7 +5500,7 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
 
 static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 	mem_cgroup_clear_mc();
 }
@@ -5616,9 +5617,9 @@ retry:
 
 static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 				struct cgroup *cont,
-				struct cgroup *old_cont,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
+	struct task_struct *p = cgroup_taskset_first(tset);
 	struct mm_struct *mm = get_task_mm(p);
 
 	if (mm) {
@@ -5633,19 +5634,18 @@ static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 #else	/* !CONFIG_MMU */
 static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 	return 0;
 }
 static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 }
 static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 				struct cgroup *cont,
-				struct cgroup *old_cont,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 }
 #endif
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 4450fbe..8b5b5d8 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -62,11 +62,12 @@ static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
 struct cgroup_subsys devices_subsys;
 
 static int devcgroup_can_attach(struct cgroup_subsys *ss,
-		struct cgroup *new_cgroup, struct task_struct *task)
+			struct cgroup *new_cgrp, struct cgroup_taskset *set)
 {
-	if (current != task && !capable(CAP_SYS_ADMIN))
-			return -EPERM;
+	struct task_struct *task = cgroup_taskset_first(set);
 
+	if (current != task && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
 	return 0;
 }
 
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw, paul, lizf
  Cc: containers, Daisuke Nishimura, linux-kernel, James Morris,
	Tejun Heo, linux-pm, KAMEZAWA Hiroyuki

Currently, there's no way to pass multiple tasks to cgroup_subsys
methods, which necessitates separate per-process and per-task methods.
This patch introduces cgroup_taskset, which can be used to pass
multiple tasks and their associated cgroups to cgroup_subsys methods.

Three methods - can_attach(), cancel_attach() and attach() - are
converted to use cgroup_taskset.  This unifies the passed parameters
so that all methods have access to all information.  The conversions
in this patch are purely mechanical and don't introduce any behavior
change.
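
For illustration, the iteration contract behind the new accessors can be
sketched in plain userspace C.  The structures below are simplified
stand-ins (the kernel version stores entries in a flex_array and returns
task_struct pointers), and count_migrating() is a hypothetical helper,
not part of the patch:

```c
#include <stddef.h>

struct cgroup { const char *name; };

struct task_and_cgroup {
	const char *comm;		/* stand-in for a task_struct */
	struct cgroup *cgrp;		/* the task's old cgroup */
};

struct cgroup_taskset {
	struct task_and_cgroup *tc_array;
	int tc_array_len;
	int idx;
	struct cgroup *cur_cgrp;
};

/* return the next entry, remembering its old cgroup for cur_cgroup() */
static struct task_and_cgroup *taskset_next(struct cgroup_taskset *tset)
{
	if (tset->idx >= tset->tc_array_len)
		return NULL;
	tset->cur_cgrp = tset->tc_array[tset->idx].cgrp;
	return &tset->tc_array[tset->idx++];
}

/* reset iteration and return the first entry (the thread group leader) */
static struct task_and_cgroup *taskset_first(struct cgroup_taskset *tset)
{
	tset->idx = 0;
	return taskset_next(tset);
}

/* count entries whose old cgroup differs from @dst, i.e. tasks that move */
static int count_migrating(struct cgroup_taskset *tset, struct cgroup *dst)
{
	struct task_and_cgroup *tc;
	int n = 0;

	for (tc = taskset_first(tset); tc; tc = taskset_next(tset))
		if (tset->cur_cgrp != dst)	/* the skip_cgrp test */
			n++;
	return n;
}
```

This is the same first/next/cur_cgroup protocol that the
cgroup_taskset_for_each() macro below wraps, including the ability to
skip tasks that are already in the destination cgroup.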

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: James Morris <jmorris@namei.org>
---
 Documentation/cgroups/cgroups.txt |   26 ++++++----
 include/linux/cgroup.h            |   28 +++++++++-
 kernel/cgroup.c                   |   99 +++++++++++++++++++++++++++++++++----
 kernel/cgroup_freezer.c           |    2 +-
 kernel/cpuset.c                   |   18 ++++---
 mm/memcontrol.c                   |   16 +++---
 security/device_cgroup.c          |    7 ++-
 7 files changed, 153 insertions(+), 43 deletions(-)

diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index cd67e90..2eee7cf 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -594,16 +594,21 @@ rmdir() will fail with it. From this behavior, pre_destroy() can be
 called multiple times against a cgroup.
 
 int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
-	       struct task_struct *task)
+	       struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
 
-Called prior to moving a task into a cgroup; if the subsystem
-returns an error, this will abort the attach operation.  If a NULL
-task is passed, then a successful result indicates that *any*
-unspecified task can be moved into the cgroup. Note that this isn't
+Called prior to moving one or more tasks into a cgroup; if the
+subsystem returns an error, this will abort the attach operation.
+@tset contains the tasks to be attached and is guaranteed to have at
+least one task in it. If there are multiple, it's guaranteed that all
+are from the same thread group, @tset contains all tasks from the
+group whether they're actually switching cgroup or not, and the first
+task is the leader. Each @tset entry also contains the task's old
+cgroup and tasks which aren't switching cgroup can be skipped easily
+using the cgroup_taskset_for_each() iterator. Note that this isn't
 called on a fork. If this method returns 0 (success) then this should
-remain valid while the caller holds cgroup_mutex and it is ensured that either
-attach() or cancel_attach() will be called in future.
+remain valid while the caller holds cgroup_mutex and it is ensured
+that either attach() or cancel_attach() will be called in future.
 
 int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
 (cgroup_mutex held by caller)
@@ -613,14 +618,14 @@ attached (possibly many when using cgroup_attach_proc). Called after
 can_attach.
 
 void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
-	       struct task_struct *task, bool threadgroup)
+		   struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
 
 Called when a task attach operation has failed after can_attach() has succeeded.
 A subsystem whose can_attach() has some side-effects should provide this
 function, so that the subsystem can implement a rollback. If not, not necessary.
 This will be called only about subsystems whose can_attach() operation have
-succeeded.
+succeeded. The parameters are identical to can_attach().
 
 void pre_attach(struct cgroup *cgrp);
 (cgroup_mutex held by caller)
@@ -629,11 +634,12 @@ For any non-per-thread attachment work that needs to happen before
 attach_task. Needed by cpuset.
 
 void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
-	    struct cgroup *old_cgrp, struct task_struct *task)
+	    struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
 
 Called after the task has been attached to the cgroup, to allow any
 post-attachment activity that requires memory allocations or blocking.
+The parameters are identical to can_attach().
 
 void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
 (cgroup_mutex held by caller)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index da7e4bc..2470c8e 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -457,6 +457,28 @@ void cgroup_exclude_rmdir(struct cgroup_subsys_state *css);
 void cgroup_release_and_wakeup_rmdir(struct cgroup_subsys_state *css);
 
 /*
+ * Control Group taskset, used to pass around set of tasks to cgroup_subsys
+ * methods.
+ */
+struct cgroup_taskset;
+struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset);
+struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset);
+struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset);
+int cgroup_taskset_size(struct cgroup_taskset *tset);
+
+/**
+ * cgroup_taskset_for_each - iterate cgroup_taskset
+ * @task: the loop cursor
+ * @skip_cgrp: skip if task's cgroup matches this, %NULL to iterate through all
+ * @tset: taskset to iterate
+ */
+#define cgroup_taskset_for_each(task, skip_cgrp, tset)			\
+	for ((task) = cgroup_taskset_first((tset)); (task);		\
+	     (task) = cgroup_taskset_next((tset)))			\
+		if (!(skip_cgrp) ||					\
+		    cgroup_taskset_cur_cgroup((tset)) != (skip_cgrp))
+
+/*
  * Control Group subsystem type.
  * See Documentation/cgroups/cgroups.txt for details
  */
@@ -467,14 +489,14 @@ struct cgroup_subsys {
 	int (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
 	void (*destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
 	int (*can_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
-			  struct task_struct *tsk);
+			  struct cgroup_taskset *tset);
 	int (*can_attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
 	void (*cancel_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
-			      struct task_struct *tsk);
+			      struct cgroup_taskset *tset);
 	void (*pre_attach)(struct cgroup *cgrp);
 	void (*attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
 	void (*attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
-		       struct cgroup *old_cgrp, struct task_struct *tsk);
+		       struct cgroup_taskset *tset);
 	void (*fork)(struct cgroup_subsys *ss, struct task_struct *task);
 	void (*exit)(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			struct cgroup *old_cgrp, struct task_struct *task);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index cf5f3e3..474674b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1739,11 +1739,85 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 }
 EXPORT_SYMBOL_GPL(cgroup_path);
 
+/*
+ * Control Group taskset
+ */
 struct task_and_cgroup {
 	struct task_struct	*task;
 	struct cgroup		*cgrp;
 };
 
+struct cgroup_taskset {
+	struct task_and_cgroup	single;
+	struct flex_array	*tc_array;
+	int			tc_array_len;
+	int			idx;
+	struct cgroup		*cur_cgrp;
+};
+
+/**
+ * cgroup_taskset_first - reset taskset and return the first task
+ * @tset: taskset of interest
+ *
+ * @tset iteration is initialized and the first task is returned.
+ */
+struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset)
+{
+	if (tset->tc_array) {
+		tset->idx = 0;
+		return cgroup_taskset_next(tset);
+	} else {
+		tset->cur_cgrp = tset->single.cgrp;
+		return tset->single.task;
+	}
+}
+EXPORT_SYMBOL_GPL(cgroup_taskset_first);
+
+/**
+ * cgroup_taskset_next - iterate to the next task in taskset
+ * @tset: taskset of interest
+ *
+ * Return the next task in @tset.  Iteration must have been initialized
+ * with cgroup_taskset_first().
+ */
+struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset)
+{
+	struct task_and_cgroup *tc;
+
+	if (!tset->tc_array || tset->idx >= tset->tc_array_len)
+		return NULL;
+
+	tc = flex_array_get(tset->tc_array, tset->idx++);
+	tset->cur_cgrp = tc->cgrp;
+	return tc->task;
+}
+EXPORT_SYMBOL_GPL(cgroup_taskset_next);
+
+/**
+ * cgroup_taskset_cur_cgroup - return the matching cgroup for the current task
+ * @tset: taskset of interest
+ *
+ * Return the cgroup for the current (last returned) task of @tset.  This
+ * function must be preceded by either cgroup_taskset_first() or
+ * cgroup_taskset_next().
+ */
+struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset)
+{
+	return tset->cur_cgrp;
+}
+EXPORT_SYMBOL_GPL(cgroup_taskset_cur_cgroup);
+
+/**
+ * cgroup_taskset_size - return the number of tasks in taskset
+ * @tset: taskset of interest
+ */
+int cgroup_taskset_size(struct cgroup_taskset *tset)
+{
+	return tset->tc_array ? tset->tc_array_len : 1;
+}
+EXPORT_SYMBOL_GPL(cgroup_taskset_size);
+
+
 /*
  * cgroup_task_migrate - move a task from one cgroup to another.
  *
@@ -1828,15 +1902,19 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
 	struct cgroup_subsys *ss, *failed_ss = NULL;
 	struct cgroup *oldcgrp;
 	struct cgroupfs_root *root = cgrp->root;
+	struct cgroup_taskset tset = { };
 
 	/* Nothing to do if the task is already in that cgroup */
 	oldcgrp = task_cgroup_from_root(tsk, root);
 	if (cgrp == oldcgrp)
 		return 0;
 
+	tset.single.task = tsk;
+	tset.single.cgrp = oldcgrp;
+
 	for_each_subsys(root, ss) {
 		if (ss->can_attach) {
-			retval = ss->can_attach(ss, cgrp, tsk);
+			retval = ss->can_attach(ss, cgrp, &tset);
 			if (retval) {
 				/*
 				 * Remember on which subsystem the can_attach()
@@ -1867,7 +1945,7 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
 		if (ss->attach_task)
 			ss->attach_task(cgrp, tsk);
 		if (ss->attach)
-			ss->attach(ss, cgrp, oldcgrp, tsk);
+			ss->attach(ss, cgrp, &tset);
 	}
 
 	synchronize_rcu();
@@ -1889,7 +1967,7 @@ out:
 				 */
 				break;
 			if (ss->cancel_attach)
-				ss->cancel_attach(ss, cgrp, tsk);
+				ss->cancel_attach(ss, cgrp, &tset);
 		}
 	}
 	return retval;
@@ -2005,6 +2083,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	struct task_struct *tsk;
 	struct task_and_cgroup *tc;
 	struct flex_array *group;
+	struct cgroup_taskset tset = { };
 	/*
 	 * we need to make sure we have css_sets for all the tasks we're
 	 * going to move -before- we actually start moving them, so that in
@@ -2067,6 +2146,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	} while_each_thread(leader, tsk);
 	/* remember the number of threads in the array for later. */
 	group_size = i;
+	tset.tc_array = group;
+	tset.tc_array_len = group_size;
 	rcu_read_unlock();
 
 	/* methods shouldn't be called if no task is actually migrating */
@@ -2079,7 +2160,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	for_each_subsys(root, ss) {
 		if (ss->can_attach) {
-			retval = ss->can_attach(ss, cgrp, leader);
+			retval = ss->can_attach(ss, cgrp, &tset);
 			if (retval) {
 				failed_ss = ss;
 				goto out_cancel_attach;
@@ -2169,10 +2250,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 * being moved, this call will need to be reworked to communicate that.
 	 */
 	for_each_subsys(root, ss) {
-		if (ss->attach) {
-			tc = flex_array_get(group, 0);
-			ss->attach(ss, cgrp, tc->cgrp, tc->task);
-		}
+		if (ss->attach)
+			ss->attach(ss, cgrp, &tset);
 	}
 
 	/*
@@ -2194,11 +2273,11 @@ out_cancel_attach:
 		for_each_subsys(root, ss) {
 			if (ss == failed_ss) {
 				if (cancel_failed_ss && ss->cancel_attach)
-					ss->cancel_attach(ss, cgrp, leader);
+					ss->cancel_attach(ss, cgrp, &tset);
 				break;
 			}
 			if (ss->cancel_attach)
-				ss->cancel_attach(ss, cgrp, leader);
+				ss->cancel_attach(ss, cgrp, &tset);
 		}
 	}
 out_put_tasks:
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 4e82525..a2b0082 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -159,7 +159,7 @@ static void freezer_destroy(struct cgroup_subsys *ss,
  */
 static int freezer_can_attach(struct cgroup_subsys *ss,
 			      struct cgroup *new_cgroup,
-			      struct task_struct *task)
+			      struct cgroup_taskset *tset)
 {
 	struct freezer *freezer;
 
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 10131fd..2e5825b 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1368,10 +1368,10 @@ static int fmeter_getrate(struct fmeter *fmp)
 }
 
 /* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
-static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
-			     struct task_struct *tsk)
+static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			     struct cgroup_taskset *tset)
 {
-	struct cpuset *cs = cgroup_cs(cont);
+	struct cpuset *cs = cgroup_cs(cgrp);
 
 	if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
 		return -ENOSPC;
@@ -1384,7 +1384,7 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
 	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
 	 * be changed.
 	 */
-	if (tsk->flags & PF_THREAD_BOUND)
+	if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
 		return -EINVAL;
 
 	return 0;
@@ -1434,12 +1434,14 @@ static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
 	cpuset_update_task_spread_flag(cs, tsk);
 }
 
-static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cont,
-			  struct cgroup *oldcont, struct task_struct *tsk)
+static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			  struct cgroup_taskset *tset)
 {
 	struct mm_struct *mm;
-	struct cpuset *cs = cgroup_cs(cont);
-	struct cpuset *oldcs = cgroup_cs(oldcont);
+	struct task_struct *tsk = cgroup_taskset_first(tset);
+	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
+	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *oldcs = cgroup_cs(oldcgrp);
 
 	/*
 	 * Change mm, possibly for multiple threads in a threadgroup. This is
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 930de94..b2802cc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5460,8 +5460,9 @@ static void mem_cgroup_clear_mc(void)
 
 static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
+	struct task_struct *p = cgroup_taskset_first(tset);
 	int ret = 0;
 	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
 
@@ -5499,7 +5500,7 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
 
 static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 	mem_cgroup_clear_mc();
 }
@@ -5616,9 +5617,9 @@ retry:
 
 static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 				struct cgroup *cont,
-				struct cgroup *old_cont,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
+	struct task_struct *p = cgroup_taskset_first(tset);
 	struct mm_struct *mm = get_task_mm(p);
 
 	if (mm) {
@@ -5633,19 +5634,18 @@ static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 #else	/* !CONFIG_MMU */
 static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 	return 0;
 }
 static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
 				struct cgroup *cgroup,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 }
 static void mem_cgroup_move_task(struct cgroup_subsys *ss,
 				struct cgroup *cont,
-				struct cgroup *old_cont,
-				struct task_struct *p)
+				struct cgroup_taskset *tset)
 {
 }
 #endif
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 4450fbe..8b5b5d8 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -62,11 +62,12 @@ static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
 struct cgroup_subsys devices_subsys;
 
 static int devcgroup_can_attach(struct cgroup_subsys *ss,
-		struct cgroup *new_cgroup, struct task_struct *task)
+			struct cgroup *new_cgrp, struct cgroup_taskset *set)
 {
-	if (current != task && !capable(CAP_SYS_ADMIN))
-			return -EPERM;
+	struct task_struct *task = cgroup_taskset_first(set);
 
+	if (current != task && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
 	return 0;
 }
 
-- 
1.7.6


* [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task()
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw, paul, lizf
  Cc: containers, Ingo Molnar, Daisuke Nishimura, linux-kernel,
	Tejun Heo, linux-pm

Now that subsys->can_attach() and attach() take @tset instead of
@task, they can handle per-task operations.  Convert
->can_attach_task() and ->attach_task() users to use ->can_attach()
and ->attach() instead.  Most conversions are straightforward.
Noteworthy changes are,

* In cgroup_freezer, remove unnecessary NULL assignments to unused
  methods.  It's useless and very prone to get out of sync, which
  already happened.

* In cpuset, the PF_THREAD_BOUND test is now applied to each task in
  the set.  This doesn't make any practical difference but is
  conceptually cleaner.
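
The conversion pattern used throughout this patch - check every task in
the set up front and abort on the first failure, before anything has
migrated - can be sketched in plain userspace C.  The names and the
flat array below are illustrative stand-ins, not the kernel API:

```c
#include <errno.h>
#include <stddef.h>

struct task { int thread_bound; };	/* stand-in for PF_THREAD_BOUND */

/* per-task predicate, as the removed ->can_attach_task() was */
static int check_one(const struct task *t)
{
	return t->thread_bound ? -EINVAL : 0;
}

/*
 * tset-wide ->can_attach(): every task is validated before any task
 * moves, so the first failure aborts the whole attach and no rollback
 * of partially-applied per-task state is needed.
 */
static int can_attach(const struct task *tasks, int nr)
{
	int i, ret;

	for (i = 0; i < nr; i++) {
		ret = check_one(&tasks[i]);
		if (ret)
			return ret;	/* fail fast; nothing migrated yet */
	}
	return 0;
}
```

Because validation happens entirely in can_attach(), attach() can then
apply its per-task work unconditionally, which is what makes the
separate ->can_attach_task()/->attach_task() hooks redundant.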

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Daisuke Nishimura <nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
Cc: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
---
 block/blk-cgroup.c      |   45 +++++++++++++++++++-----------
 kernel/cgroup_freezer.c |   14 +++-------
 kernel/cpuset.c         |   70 +++++++++++++++++++++-------------------------
 kernel/events/core.c    |   13 +++++---
 kernel/sched.c          |   31 +++++++++++++--------
 5 files changed, 91 insertions(+), 82 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index bcaf16e..99e0bd4 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -30,8 +30,10 @@ EXPORT_SYMBOL_GPL(blkio_root_cgroup);
 
 static struct cgroup_subsys_state *blkiocg_create(struct cgroup_subsys *,
 						  struct cgroup *);
-static int blkiocg_can_attach_task(struct cgroup *, struct task_struct *);
-static void blkiocg_attach_task(struct cgroup *, struct task_struct *);
+static int blkiocg_can_attach(struct cgroup_subsys *, struct cgroup *,
+			      struct cgroup_taskset *);
+static void blkiocg_attach(struct cgroup_subsys *, struct cgroup *,
+			   struct cgroup_taskset *);
 static void blkiocg_destroy(struct cgroup_subsys *, struct cgroup *);
 static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
 
@@ -44,8 +46,8 @@ static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
 struct cgroup_subsys blkio_subsys = {
 	.name = "blkio",
 	.create = blkiocg_create,
-	.can_attach_task = blkiocg_can_attach_task,
-	.attach_task = blkiocg_attach_task,
+	.can_attach = blkiocg_can_attach,
+	.attach = blkiocg_attach,
 	.destroy = blkiocg_destroy,
 	.populate = blkiocg_populate,
 #ifdef CONFIG_BLK_CGROUP
@@ -1614,30 +1616,39 @@ done:
  * of the main cic data structures.  For now we allow a task to change
  * its cgroup only if it's the only owner of its ioc.
  */
-static int blkiocg_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
+static int blkiocg_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			      struct cgroup_taskset *tset)
 {
+	struct task_struct *task;
 	struct io_context *ioc;
 	int ret = 0;
 
 	/* task_lock() is needed to avoid races with exit_io_context() */
-	task_lock(tsk);
-	ioc = tsk->io_context;
-	if (ioc && atomic_read(&ioc->nr_tasks) > 1)
-		ret = -EINVAL;
-	task_unlock(tsk);
-
+	cgroup_taskset_for_each(task, cgrp, tset) {
+		task_lock(task);
+		ioc = task->io_context;
+		if (ioc && atomic_read(&ioc->nr_tasks) > 1)
+			ret = -EINVAL;
+		task_unlock(task);
+		if (ret)
+			break;
+	}
 	return ret;
 }
 
-static void blkiocg_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
+static void blkiocg_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			   struct cgroup_taskset *tset)
 {
+	struct task_struct *task;
 	struct io_context *ioc;
 
-	task_lock(tsk);
-	ioc = tsk->io_context;
-	if (ioc)
-		ioc->cgroup_changed = 1;
-	task_unlock(tsk);
+	cgroup_taskset_for_each(task, cgrp, tset) {
+		task_lock(task);
+		ioc = task->io_context;
+		if (ioc)
+			ioc->cgroup_changed = 1;
+		task_unlock(task);
+	}
 }
 
 void blkio_policy_register(struct blkio_policy_type *blkiop)
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index a2b0082..2cb5e72 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -162,10 +162,14 @@ static int freezer_can_attach(struct cgroup_subsys *ss,
 			      struct cgroup_taskset *tset)
 {
 	struct freezer *freezer;
+	struct task_struct *task;
 
 	/*
 	 * Anything frozen can't move or be moved to/from.
 	 */
+	cgroup_taskset_for_each(task, new_cgroup, tset)
+		if (cgroup_freezing(task))
+			return -EBUSY;
 
 	freezer = cgroup_freezer(new_cgroup);
 	if (freezer->state != CGROUP_THAWED)
@@ -174,11 +178,6 @@ static int freezer_can_attach(struct cgroup_subsys *ss,
 	return 0;
 }
 
-static int freezer_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
-{
-	return cgroup_freezing(tsk) ? -EBUSY : 0;
-}
-
 static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
 {
 	struct freezer *freezer;
@@ -374,10 +373,5 @@ struct cgroup_subsys freezer_subsys = {
 	.populate	= freezer_populate,
 	.subsys_id	= freezer_subsys_id,
 	.can_attach	= freezer_can_attach,
-	.can_attach_task = freezer_can_attach_task,
-	.pre_attach	= NULL,
-	.attach_task	= NULL,
-	.attach		= NULL,
 	.fork		= freezer_fork,
-	.exit		= NULL,
 };
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 2e5825b..472ddd6 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1372,33 +1372,34 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			     struct cgroup_taskset *tset)
 {
 	struct cpuset *cs = cgroup_cs(cgrp);
+	struct task_struct *task;
+	int ret;
 
 	if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
 		return -ENOSPC;
 
-	/*
-	 * Kthreads bound to specific cpus cannot be moved to a new cpuset; we
-	 * cannot change their cpu affinity and isolating such threads by their
-	 * set of allowed nodes is unnecessary.  Thus, cpusets are not
-	 * applicable for such threads.  This prevents checking for success of
-	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
-	 * be changed.
-	 */
-	if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
-		return -EINVAL;
-
+	cgroup_taskset_for_each(task, cgrp, tset) {
+		/*
+		 * Kthreads bound to specific cpus cannot be moved to a new
+		 * cpuset; we cannot change their cpu affinity and
+		 * isolating such threads by their set of allowed nodes is
+		 * unnecessary.  Thus, cpusets are not applicable for such
+		 * threads.  This prevents checking for success of
+		 * set_cpus_allowed_ptr() on all attached tasks before
+		 * cpus_allowed may be changed.
+		 */
+		if (task->flags & PF_THREAD_BOUND)
+			return -EINVAL;
+		if ((ret = security_task_setscheduler(task)))
+			return ret;
+	}
 	return 0;
 }
 
-static int cpuset_can_attach_task(struct cgroup *cgrp, struct task_struct *task)
-{
-	return security_task_setscheduler(task);
-}
-
 /*
  * Protected by cgroup_lock. The nodemasks must be stored globally because
  * dynamically allocating them is not allowed in pre_attach, and they must
- * persist among pre_attach, attach_task, and attach.
+ * persist among pre_attach, and attach.
  */
 static cpumask_var_t cpus_attach;
 static nodemask_t cpuset_attach_nodemask_from;
@@ -1417,39 +1418,34 @@ static void cpuset_pre_attach(struct cgroup *cont)
 	guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
 }
 
-/* Per-thread attachment work. */
-static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
-{
-	int err;
-	struct cpuset *cs = cgroup_cs(cont);
-
-	/*
-	 * can_attach beforehand should guarantee that this doesn't fail.
-	 * TODO: have a better way to handle failure here
-	 */
-	err = set_cpus_allowed_ptr(tsk, cpus_attach);
-	WARN_ON_ONCE(err);
-
-	cpuset_change_task_nodemask(tsk, &cpuset_attach_nodemask_to);
-	cpuset_update_task_spread_flag(cs, tsk);
-}
-
 static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			  struct cgroup_taskset *tset)
 {
 	struct mm_struct *mm;
-	struct task_struct *tsk = cgroup_taskset_first(tset);
+	struct task_struct *task;
+	struct task_struct *leader = cgroup_taskset_first(tset);
 	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
 	struct cpuset *cs = cgroup_cs(cgrp);
 	struct cpuset *oldcs = cgroup_cs(oldcgrp);
 
+	cgroup_taskset_for_each(task, cgrp, tset) {
+		/*
+		 * can_attach beforehand should guarantee that this doesn't
+		 * fail.  TODO: have a better way to handle failure here
+		 */
+		WARN_ON_ONCE(set_cpus_allowed_ptr(task, cpus_attach));
+
+		cpuset_change_task_nodemask(task, &cpuset_attach_nodemask_to);
+		cpuset_update_task_spread_flag(cs, task);
+	}
+
 	/*
 	 * Change mm, possibly for multiple threads in a threadgroup. This is
 	 * expensive and may sleep.
 	 */
 	cpuset_attach_nodemask_from = oldcs->mems_allowed;
 	cpuset_attach_nodemask_to = cs->mems_allowed;
-	mm = get_task_mm(tsk);
+	mm = get_task_mm(leader);
 	if (mm) {
 		mpol_rebind_mm(mm, &cpuset_attach_nodemask_to);
 		if (is_memory_migrate(cs))
@@ -1905,9 +1901,7 @@ struct cgroup_subsys cpuset_subsys = {
 	.create = cpuset_create,
 	.destroy = cpuset_destroy,
 	.can_attach = cpuset_can_attach,
-	.can_attach_task = cpuset_can_attach_task,
 	.pre_attach = cpuset_pre_attach,
-	.attach_task = cpuset_attach_task,
 	.attach = cpuset_attach,
 	.populate = cpuset_populate,
 	.post_clone = cpuset_post_clone,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index b8785e2..95e189d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7000,10 +7000,13 @@ static int __perf_cgroup_move(void *info)
 	return 0;
 }
 
-static void
-perf_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *task)
+static void perf_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			       struct cgroup_taskset *tset)
 {
-	task_function_call(task, __perf_cgroup_move, task);
+	struct task_struct *task;
+
+	cgroup_taskset_for_each(task, cgrp, tset)
+		task_function_call(task, __perf_cgroup_move, task);
 }
 
 static void perf_cgroup_exit(struct cgroup_subsys *ss, struct cgroup *cgrp,
@@ -7017,7 +7020,7 @@ static void perf_cgroup_exit(struct cgroup_subsys *ss, struct cgroup *cgrp,
 	if (!(task->flags & PF_EXITING))
 		return;
 
-	perf_cgroup_attach_task(cgrp, task);
+	task_function_call(task, __perf_cgroup_move, task);
 }
 
 struct cgroup_subsys perf_subsys = {
@@ -7026,6 +7029,6 @@ struct cgroup_subsys perf_subsys = {
 	.create		= perf_cgroup_create,
 	.destroy	= perf_cgroup_destroy,
 	.exit		= perf_cgroup_exit,
-	.attach_task	= perf_cgroup_attach_task,
+	.attach		= perf_cgroup_attach,
 };
 #endif /* CONFIG_CGROUP_PERF */
diff --git a/kernel/sched.c b/kernel/sched.c
index ccacdbd..dd7e460 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -8966,24 +8966,31 @@ cpu_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
 	sched_destroy_group(tg);
 }
 
-static int
-cpu_cgroup_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
+static int cpu_cgroup_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+				 struct cgroup_taskset *tset)
 {
+	struct task_struct *task;
+
+	cgroup_taskset_for_each(task, cgrp, tset) {
 #ifdef CONFIG_RT_GROUP_SCHED
-	if (!sched_rt_can_attach(cgroup_tg(cgrp), tsk))
-		return -EINVAL;
+		if (!sched_rt_can_attach(cgroup_tg(cgrp), task))
+			return -EINVAL;
 #else
-	/* We don't support RT-tasks being in separate groups */
-	if (tsk->sched_class != &fair_sched_class)
-		return -EINVAL;
+		/* We don't support RT-tasks being in separate groups */
+		if (task->sched_class != &fair_sched_class)
+			return -EINVAL;
 #endif
+	}
 	return 0;
 }
 
-static void
-cpu_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
+static void cpu_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			      struct cgroup_taskset *tset)
 {
-	sched_move_task(tsk);
+	struct task_struct *task;
+
+	cgroup_taskset_for_each(task, cgrp, tset)
+		sched_move_task(task);
 }
 
 static void
@@ -9071,8 +9078,8 @@ struct cgroup_subsys cpu_cgroup_subsys = {
 	.name		= "cpu",
 	.create		= cpu_cgroup_create,
 	.destroy	= cpu_cgroup_destroy,
-	.can_attach_task = cpu_cgroup_can_attach_task,
-	.attach_task	= cpu_cgroup_attach_task,
+	.can_attach	= cpu_cgroup_can_attach,
+	.attach		= cpu_cgroup_attach,
 	.exit		= cpu_cgroup_exit,
 	.populate	= cpu_cgroup_populate,
 	.subsys_id	= cpu_cgroup_subsys_id,
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task()
  2011-08-23 22:19 [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Tejun Heo
                   ` (7 preceding siblings ...)
  2011-08-23 22:19 ` [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
@ 2011-08-23 22:19 ` Tejun Heo
  2011-08-24  1:57   ` Matt Helsley
                     ` (4 more replies)
  2011-08-23 22:19 ` [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
                   ` (5 subsequent siblings)
  14 siblings, 5 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw, paul, lizf
  Cc: linux-pm, linux-kernel, containers, Tejun Heo, Balbir Singh,
	Daisuke Nishimura, KAMEZAWA Hiroyuki, James Morris, Ingo Molnar,
	Peter Zijlstra

Now that subsys->can_attach() and attach() take @tset instead of
@task, they can handle per-task operations.  Convert
->can_attach_task() and ->attach_task() users to use ->can_attach()
and attach() instead.  Most conversions are straightforward.
Noteworthy changes are:

* In cgroup_freezer, remove the unnecessary NULL assignments to unused
  methods.  They are useless and very prone to getting out of sync,
  which already happened.

* In cpuset, the PF_THREAD_BOUND test is now performed for each task.
  This doesn't make any practical difference but is conceptually
  cleaner.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: James Morris <jmorris@namei.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 block/blk-cgroup.c      |   45 +++++++++++++++++++-----------
 kernel/cgroup_freezer.c |   14 +++-------
 kernel/cpuset.c         |   70 +++++++++++++++++++++-------------------------
 kernel/events/core.c    |   13 +++++---
 kernel/sched.c          |   31 +++++++++++++--------
 5 files changed, 91 insertions(+), 82 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index bcaf16e..99e0bd4 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -30,8 +30,10 @@ EXPORT_SYMBOL_GPL(blkio_root_cgroup);
 
 static struct cgroup_subsys_state *blkiocg_create(struct cgroup_subsys *,
 						  struct cgroup *);
-static int blkiocg_can_attach_task(struct cgroup *, struct task_struct *);
-static void blkiocg_attach_task(struct cgroup *, struct task_struct *);
+static int blkiocg_can_attach(struct cgroup_subsys *, struct cgroup *,
+			      struct cgroup_taskset *);
+static void blkiocg_attach(struct cgroup_subsys *, struct cgroup *,
+			   struct cgroup_taskset *);
 static void blkiocg_destroy(struct cgroup_subsys *, struct cgroup *);
 static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
 
@@ -44,8 +46,8 @@ static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
 struct cgroup_subsys blkio_subsys = {
 	.name = "blkio",
 	.create = blkiocg_create,
-	.can_attach_task = blkiocg_can_attach_task,
-	.attach_task = blkiocg_attach_task,
+	.can_attach = blkiocg_can_attach,
+	.attach = blkiocg_attach,
 	.destroy = blkiocg_destroy,
 	.populate = blkiocg_populate,
 #ifdef CONFIG_BLK_CGROUP
@@ -1614,30 +1616,39 @@ done:
  * of the main cic data structures.  For now we allow a task to change
  * its cgroup only if it's the only owner of its ioc.
  */
-static int blkiocg_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
+static int blkiocg_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			      struct cgroup_taskset *tset)
 {
+	struct task_struct *task;
 	struct io_context *ioc;
 	int ret = 0;
 
 	/* task_lock() is needed to avoid races with exit_io_context() */
-	task_lock(tsk);
-	ioc = tsk->io_context;
-	if (ioc && atomic_read(&ioc->nr_tasks) > 1)
-		ret = -EINVAL;
-	task_unlock(tsk);
-
+	cgroup_taskset_for_each(task, cgrp, tset) {
+		task_lock(task);
+		ioc = task->io_context;
+		if (ioc && atomic_read(&ioc->nr_tasks) > 1)
+			ret = -EINVAL;
+		task_unlock(task);
+		if (ret)
+			break;
+	}
 	return ret;
 }
 
-static void blkiocg_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
+static void blkiocg_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			   struct cgroup_taskset *tset)
 {
+	struct task_struct *task;
 	struct io_context *ioc;
 
-	task_lock(tsk);
-	ioc = tsk->io_context;
-	if (ioc)
-		ioc->cgroup_changed = 1;
-	task_unlock(tsk);
+	cgroup_taskset_for_each(task, cgrp, tset) {
+		task_lock(task);
+		ioc = task->io_context;
+		if (ioc)
+			ioc->cgroup_changed = 1;
+		task_unlock(task);
+	}
 }
 
 void blkio_policy_register(struct blkio_policy_type *blkiop)
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index a2b0082..2cb5e72 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -162,10 +162,14 @@ static int freezer_can_attach(struct cgroup_subsys *ss,
 			      struct cgroup_taskset *tset)
 {
 	struct freezer *freezer;
+	struct task_struct *task;
 
 	/*
 	 * Anything frozen can't move or be moved to/from.
 	 */
+	cgroup_taskset_for_each(task, new_cgroup, tset)
+		if (cgroup_freezing(task))
+			return -EBUSY;
 
 	freezer = cgroup_freezer(new_cgroup);
 	if (freezer->state != CGROUP_THAWED)
@@ -174,11 +178,6 @@ static int freezer_can_attach(struct cgroup_subsys *ss,
 	return 0;
 }
 
-static int freezer_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
-{
-	return cgroup_freezing(tsk) ? -EBUSY : 0;
-}
-
 static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
 {
 	struct freezer *freezer;
@@ -374,10 +373,5 @@ struct cgroup_subsys freezer_subsys = {
 	.populate	= freezer_populate,
 	.subsys_id	= freezer_subsys_id,
 	.can_attach	= freezer_can_attach,
-	.can_attach_task = freezer_can_attach_task,
-	.pre_attach	= NULL,
-	.attach_task	= NULL,
-	.attach		= NULL,
 	.fork		= freezer_fork,
-	.exit		= NULL,
 };
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 2e5825b..472ddd6 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1372,33 +1372,34 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			     struct cgroup_taskset *tset)
 {
 	struct cpuset *cs = cgroup_cs(cgrp);
+	struct task_struct *task;
+	int ret;
 
 	if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
 		return -ENOSPC;
 
-	/*
-	 * Kthreads bound to specific cpus cannot be moved to a new cpuset; we
-	 * cannot change their cpu affinity and isolating such threads by their
-	 * set of allowed nodes is unnecessary.  Thus, cpusets are not
-	 * applicable for such threads.  This prevents checking for success of
-	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
-	 * be changed.
-	 */
-	if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
-		return -EINVAL;
-
+	cgroup_taskset_for_each(task, cgrp, tset) {
+		/*
+		 * Kthreads bound to specific cpus cannot be moved to a new
+		 * cpuset; we cannot change their cpu affinity and
+		 * isolating such threads by their set of allowed nodes is
+		 * unnecessary.  Thus, cpusets are not applicable for such
+		 * threads.  This prevents checking for success of
+		 * set_cpus_allowed_ptr() on all attached tasks before
+		 * cpus_allowed may be changed.
+		 */
+		if (task->flags & PF_THREAD_BOUND)
+			return -EINVAL;
+		if ((ret = security_task_setscheduler(task)))
+			return ret;
+	}
 	return 0;
 }
 
-static int cpuset_can_attach_task(struct cgroup *cgrp, struct task_struct *task)
-{
-	return security_task_setscheduler(task);
-}
-
 /*
  * Protected by cgroup_lock. The nodemasks must be stored globally because
  * dynamically allocating them is not allowed in pre_attach, and they must
- * persist among pre_attach, attach_task, and attach.
+ * persist among pre_attach, and attach.
  */
 static cpumask_var_t cpus_attach;
 static nodemask_t cpuset_attach_nodemask_from;
@@ -1417,39 +1418,34 @@ static void cpuset_pre_attach(struct cgroup *cont)
 	guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
 }
 
-/* Per-thread attachment work. */
-static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
-{
-	int err;
-	struct cpuset *cs = cgroup_cs(cont);
-
-	/*
-	 * can_attach beforehand should guarantee that this doesn't fail.
-	 * TODO: have a better way to handle failure here
-	 */
-	err = set_cpus_allowed_ptr(tsk, cpus_attach);
-	WARN_ON_ONCE(err);
-
-	cpuset_change_task_nodemask(tsk, &cpuset_attach_nodemask_to);
-	cpuset_update_task_spread_flag(cs, tsk);
-}
-
 static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			  struct cgroup_taskset *tset)
 {
 	struct mm_struct *mm;
-	struct task_struct *tsk = cgroup_taskset_first(tset);
+	struct task_struct *task;
+	struct task_struct *leader = cgroup_taskset_first(tset);
 	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
 	struct cpuset *cs = cgroup_cs(cgrp);
 	struct cpuset *oldcs = cgroup_cs(oldcgrp);
 
+	cgroup_taskset_for_each(task, cgrp, tset) {
+		/*
+		 * can_attach beforehand should guarantee that this doesn't
+		 * fail.  TODO: have a better way to handle failure here
+		 */
+		WARN_ON_ONCE(set_cpus_allowed_ptr(task, cpus_attach));
+
+		cpuset_change_task_nodemask(task, &cpuset_attach_nodemask_to);
+		cpuset_update_task_spread_flag(cs, task);
+	}
+
 	/*
 	 * Change mm, possibly for multiple threads in a threadgroup. This is
 	 * expensive and may sleep.
 	 */
 	cpuset_attach_nodemask_from = oldcs->mems_allowed;
 	cpuset_attach_nodemask_to = cs->mems_allowed;
-	mm = get_task_mm(tsk);
+	mm = get_task_mm(leader);
 	if (mm) {
 		mpol_rebind_mm(mm, &cpuset_attach_nodemask_to);
 		if (is_memory_migrate(cs))
@@ -1905,9 +1901,7 @@ struct cgroup_subsys cpuset_subsys = {
 	.create = cpuset_create,
 	.destroy = cpuset_destroy,
 	.can_attach = cpuset_can_attach,
-	.can_attach_task = cpuset_can_attach_task,
 	.pre_attach = cpuset_pre_attach,
-	.attach_task = cpuset_attach_task,
 	.attach = cpuset_attach,
 	.populate = cpuset_populate,
 	.post_clone = cpuset_post_clone,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index b8785e2..95e189d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7000,10 +7000,13 @@ static int __perf_cgroup_move(void *info)
 	return 0;
 }
 
-static void
-perf_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *task)
+static void perf_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			       struct cgroup_taskset *tset)
 {
-	task_function_call(task, __perf_cgroup_move, task);
+	struct task_struct *task;
+
+	cgroup_taskset_for_each(task, cgrp, tset)
+		task_function_call(task, __perf_cgroup_move, task);
 }
 
 static void perf_cgroup_exit(struct cgroup_subsys *ss, struct cgroup *cgrp,
@@ -7017,7 +7020,7 @@ static void perf_cgroup_exit(struct cgroup_subsys *ss, struct cgroup *cgrp,
 	if (!(task->flags & PF_EXITING))
 		return;
 
-	perf_cgroup_attach_task(cgrp, task);
+	task_function_call(task, __perf_cgroup_move, task);
 }
 
 struct cgroup_subsys perf_subsys = {
@@ -7026,6 +7029,6 @@ struct cgroup_subsys perf_subsys = {
 	.create		= perf_cgroup_create,
 	.destroy	= perf_cgroup_destroy,
 	.exit		= perf_cgroup_exit,
-	.attach_task	= perf_cgroup_attach_task,
+	.attach		= perf_cgroup_attach,
 };
 #endif /* CONFIG_CGROUP_PERF */
diff --git a/kernel/sched.c b/kernel/sched.c
index ccacdbd..dd7e460 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -8966,24 +8966,31 @@ cpu_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
 	sched_destroy_group(tg);
 }
 
-static int
-cpu_cgroup_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
+static int cpu_cgroup_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+				 struct cgroup_taskset *tset)
 {
+	struct task_struct *task;
+
+	cgroup_taskset_for_each(task, cgrp, tset) {
 #ifdef CONFIG_RT_GROUP_SCHED
-	if (!sched_rt_can_attach(cgroup_tg(cgrp), tsk))
-		return -EINVAL;
+		if (!sched_rt_can_attach(cgroup_tg(cgrp), task))
+			return -EINVAL;
 #else
-	/* We don't support RT-tasks being in separate groups */
-	if (tsk->sched_class != &fair_sched_class)
-		return -EINVAL;
+		/* We don't support RT-tasks being in separate groups */
+		if (task->sched_class != &fair_sched_class)
+			return -EINVAL;
 #endif
+	}
 	return 0;
 }
 
-static void
-cpu_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
+static void cpu_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
+			      struct cgroup_taskset *tset)
 {
-	sched_move_task(tsk);
+	struct task_struct *task;
+
+	cgroup_taskset_for_each(task, cgrp, tset)
+		sched_move_task(task);
 }
 
 static void
@@ -9071,8 +9078,8 @@ struct cgroup_subsys cpu_cgroup_subsys = {
 	.name		= "cpu",
 	.create		= cpu_cgroup_create,
 	.destroy	= cpu_cgroup_destroy,
-	.can_attach_task = cpu_cgroup_can_attach_task,
-	.attach_task	= cpu_cgroup_attach_task,
+	.can_attach	= cpu_cgroup_can_attach,
+	.attach		= cpu_cgroup_attach,
 	.exit		= cpu_cgroup_exit,
 	.populate	= cpu_cgroup_populate,
 	.subsys_id	= cpu_cgroup_subsys_id,
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 100+ messages in thread


* [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach()
       [not found] ` <1314138000-2049-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (3 preceding siblings ...)
  2011-08-23 22:19   ` [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
@ 2011-08-23 22:19   ` Tejun Heo
  2011-08-23 22:20   ` [PATCH 6/6] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task() Tejun Heo
  2011-08-24  1:14   ` [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Frederic Weisbecker
  6 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:19 UTC (permalink / raw)
  To: rjw-KKrjLPT3xs0, paul-inf54ven1CmVyaH7bEyXVA,
	lizf-BthXqXjhjHXQFUHtdCDX3A
  Cc: Tejun Heo,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

->pre_attach() is supposed to be called before migration.  Process
migration observes this ordering, but task migration does it the
other way around.  The only ->pre_attach() user is cpuset, which can
do the same operations in ->can_attach().  Collapse
cpuset_pre_attach() into cpuset_can_attach().

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
---
 Documentation/cgroups/cgroups.txt |   20 --------------------
 kernel/cpuset.c                   |   29 ++++++++++++-----------------
 2 files changed, 12 insertions(+), 37 deletions(-)

diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 2eee7cf..afb7cde 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -610,13 +610,6 @@ called on a fork. If this method returns 0 (success) then this should
 remain valid while the caller holds cgroup_mutex and it is ensured
 that either attach() or cancel_attach() will be called in future.
 
-int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
-(cgroup_mutex held by caller)
-
-As can_attach, but for operations that must be run once per task to be
-attached (possibly many when using cgroup_attach_proc). Called after
-can_attach.
-
 void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 		   struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
@@ -627,12 +620,6 @@ function, so that the subsystem can implement a rollback. If not, not necessary.
 This will be called only about subsystems whose can_attach() operation have
 succeeded. The parameters are identical to can_attach().
 
-void pre_attach(struct cgroup *cgrp);
-(cgroup_mutex held by caller)
-
-For any non-per-thread attachment work that needs to happen before
-attach_task. Needed by cpuset.
-
 void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 	    struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
@@ -641,13 +628,6 @@ Called after the task has been attached to the cgroup, to allow any
 post-attachment activity that requires memory allocations or blocking.
 The parameters are identical to can_attach().
 
-void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
-(cgroup_mutex held by caller)
-
-As attach, but for operations that must be run once per task to be attached,
-like can_attach_task. Called before attach. Currently does not support any
-subsystem that might need the old_cgrp for every thread in the group.
-
 void fork(struct cgroup_subsy *ss, struct task_struct *task)
 
 Called when a task is forked into a cgroup.
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 472ddd6..f0b8df3 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1367,6 +1367,15 @@ static int fmeter_getrate(struct fmeter *fmp)
 	return val;
 }
 
+/*
+ * Protected by cgroup_lock. The nodemasks must be stored globally because
+ * dynamically allocating them is not allowed in can_attach, and they must
+ * persist until attach.
+ */
+static cpumask_var_t cpus_attach;
+static nodemask_t cpuset_attach_nodemask_from;
+static nodemask_t cpuset_attach_nodemask_to;
+
 /* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
 static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			     struct cgroup_taskset *tset)
@@ -1393,29 +1402,16 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 		if ((ret = security_task_setscheduler(task)))
 			return ret;
 	}
-	return 0;
-}
-
-/*
- * Protected by cgroup_lock. The nodemasks must be stored globally because
- * dynamically allocating them is not allowed in pre_attach, and they must
- * persist among pre_attach, and attach.
- */
-static cpumask_var_t cpus_attach;
-static nodemask_t cpuset_attach_nodemask_from;
-static nodemask_t cpuset_attach_nodemask_to;
-
-/* Set-up work for before attaching each task. */
-static void cpuset_pre_attach(struct cgroup *cont)
-{
-	struct cpuset *cs = cgroup_cs(cont);
 
+	/* prepare for attach */
 	if (cs == &top_cpuset)
 		cpumask_copy(cpus_attach, cpu_possible_mask);
 	else
 		guarantee_online_cpus(cs, cpus_attach);
 
 	guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
+
+	return 0;
 }
 
 static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
@@ -1901,7 +1897,6 @@ struct cgroup_subsys cpuset_subsys = {
 	.create = cpuset_create,
 	.destroy = cpuset_destroy,
 	.can_attach = cpuset_can_attach,
-	.pre_attach = cpuset_pre_attach,
 	.attach = cpuset_attach,
 	.populate = cpuset_populate,
 	.post_clone = cpuset_post_clone,
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 100+ messages in thread


* [PATCH 6/6] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task()
       [not found] ` <1314138000-2049-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (4 preceding siblings ...)
  2011-08-23 22:19   ` [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
@ 2011-08-23 22:20   ` Tejun Heo
  2011-08-24  1:14   ` [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Frederic Weisbecker
  6 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-23 22:20 UTC (permalink / raw)
  To: rjw, paul, lizf; +Cc: Tejun Heo, containers, linux-pm, linux-kernel

These three methods are no longer used.  Kill them.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
 include/linux/cgroup.h |    3 --
 kernel/cgroup.c        |   53 ++++-------------------------------------------
 2 files changed, 5 insertions(+), 51 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 2470c8e..5659d37 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -490,11 +490,8 @@ struct cgroup_subsys {
 	void (*destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
 	int (*can_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			  struct cgroup_taskset *tset);
-	int (*can_attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
 	void (*cancel_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
 			      struct cgroup_taskset *tset);
-	void (*pre_attach)(struct cgroup *cgrp);
-	void (*attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
 	void (*attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
 		       struct cgroup_taskset *tset);
 	void (*fork)(struct cgroup_subsys *ss, struct task_struct *task);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 474674b..374a4cb 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1926,13 +1926,6 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
 				goto out;
 			}
 		}
-		if (ss->can_attach_task) {
-			retval = ss->can_attach_task(cgrp, tsk);
-			if (retval) {
-				failed_ss = ss;
-				goto out;
-			}
-		}
 	}
 
 	retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, false);
@@ -1940,10 +1933,6 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
 		goto out;
 
 	for_each_subsys(root, ss) {
-		if (ss->pre_attach)
-			ss->pre_attach(cgrp);
-		if (ss->attach_task)
-			ss->attach_task(cgrp, tsk);
 		if (ss->attach)
 			ss->attach(ss, cgrp, &tset);
 	}
@@ -2075,7 +2064,6 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 {
 	int retval, i, group_size, nr_todo;
 	struct cgroup_subsys *ss, *failed_ss = NULL;
-	bool cancel_failed_ss = false;
 	/* guaranteed to be initialized later, but the compiler needs this */
 	struct css_set *oldcg;
 	struct cgroupfs_root *root = cgrp->root;
@@ -2166,21 +2154,6 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 				goto out_cancel_attach;
 			}
 		}
-		/* a callback to be run on every thread in the threadgroup. */
-		if (ss->can_attach_task) {
-			/* run on each task in the threadgroup. */
-			for (i = 0; i < group_size; i++) {
-				tc = flex_array_get(group, i);
-				if (tc->cgrp == cgrp)
-					continue;
-				retval = ss->can_attach_task(cgrp, tc->task);
-				if (retval) {
-					failed_ss = ss;
-					cancel_failed_ss = true;
-					goto out_cancel_attach;
-				}
-			}
-		}
 	}
 
 	/*
@@ -2217,15 +2190,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	}
 
 	/*
-	 * step 3: now that we're guaranteed success wrt the css_sets, proceed
-	 * to move all tasks to the new cgroup, calling ss->attach_task for each
-	 * one along the way. there are no failure cases after here, so this is
-	 * the commit point.
+	 * step 3: now that we're guaranteed success wrt the css_sets,
+	 * proceed to move all tasks to the new cgroup.  There are no
+	 * failure cases after here, so this is the commit point.
 	 */
-	for_each_subsys(root, ss) {
-		if (ss->pre_attach)
-			ss->pre_attach(cgrp);
-	}
 	for (i = 0; i < group_size; i++) {
 		tc = flex_array_get(group, i);
 		/* leave current thread as it is if it's already there */
@@ -2235,19 +2203,11 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		/* if the thread is PF_EXITING, it can just get skipped. */
 		retval = cgroup_task_migrate(cgrp, tc->cgrp, tc->task, true);
 		BUG_ON(retval != 0 && retval != -ESRCH);
-
-		/* attach each task to each subsystem */
-		for_each_subsys(root, ss) {
-			if (ss->attach_task)
-				ss->attach_task(cgrp, tc->task);
-		}
 	}
 	/* nothing is sensitive to fork() after this point. */
 
 	/*
-	 * step 4: do expensive, non-thread-specific subsystem callbacks.
-	 * TODO: if ever a subsystem needs to know the oldcgrp for each task
-	 * being moved, this call will need to be reworked to communicate that.
+	 * step 4: do subsystem attach callbacks.
 	 */
 	for_each_subsys(root, ss) {
 		if (ss->attach)
@@ -2271,11 +2231,8 @@ out_cancel_attach:
 	/* same deal as in cgroup_attach_task */
 	if (retval) {
 		for_each_subsys(root, ss) {
-			if (ss == failed_ss) {
-				if (cancel_failed_ss && ss->cancel_attach)
-					ss->cancel_attach(ss, cgrp, &tset);
+			if (ss == failed_ss)
 				break;
-			}
 			if (ss->cancel_attach)
 				ss->cancel_attach(ss, cgrp, &tset);
 		}
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 100+ messages in thread


* Re: [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration
       [not found]   ` <1314138000-2049-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2011-08-24  0:32     ` Frederic Weisbecker
  0 siblings, 0 replies; 100+ messages in thread
From: Frederic Weisbecker @ 2011-08-24  0:32 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: rjw, paul, lizf, linux-pm, linux-kernel, containers

On Wed, Aug 24, 2011 at 12:19:55AM +0200, Tejun Heo wrote:
> cgroup_attach_task() calls subsys->attach_task() after
> cgroup_task_migrate(); however, cgroup_attach_proc() calls it before
> migration.  This actually affects some of the users.  Update
> cgroup_attach_proc() such that ->attach_task() is called after
> migration.

A patch was posted recently:

        "[PATCH][BUGFIX] cgroups: fix ordering of calls in cgroup_attach_proc"

which not only fixes that ordering but also attaches the task only if the
migration happened correctly (i.e. the task has not exited).

Can somebody queue it for 3.2?

^ permalink raw reply	[flat|nested] 100+ messages in thread


* Re: [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods
       [not found] ` <1314138000-2049-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (5 preceding siblings ...)
  2011-08-23 22:20   ` [PATCH 6/6] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task() Tejun Heo
@ 2011-08-24  1:14   ` Frederic Weisbecker
  6 siblings, 0 replies; 100+ messages in thread
From: Frederic Weisbecker @ 2011-08-24  1:14 UTC (permalink / raw)
  To: Tejun Heo
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	paul-inf54ven1CmVyaH7bEyXVA

On Wed, Aug 24, 2011 at 12:19:54AM +0200, Tejun Heo wrote:
> Hello,
> 
> cgroup has grown quite a number of subsys methods.  Some of them
> are overlapping, inconsistent with each other and called under
> different conditions depending on whether they're called for a single
> task or a whole process.  Unfortunately, these callbacks are complicated
> and incomplete at the same time.
> 
> * ->attach_task() is called after migration for task attach but before
>   for process.
> 
> * Ditto for ->pre_attach().
> 
> * ->can_attach_task() is called for every task in the thread group but
>   ->attach_task() skips the ones which don't actually change cgroups.
> 
> * Task attach becomes noop if the task isn't actually moving.  Process
>   attach is always performed.
> 
> * ->attach_task() doesn't (or at least isn't supposed to) have access
>   to the old cgroup.
> 
> * During cancel, there's no way to access the affected tasks.
> 
> This patchset introduces cgroup_taskset along with some accessors and
> an iterator, updates methods to use it, consolidates usages and drops
> superfluous methods.
> 
> It contains the following six patches.
> 
>  0001-cgroup-subsys-attach_task-should-be-called-after-mig.patch
>  0002-cgroup-improve-old-cgroup-handling-in-cgroup_attach_.patch
>  0003-cgroup-introduce-cgroup_taskset-and-use-it-in-subsys.patch
>  0004-cgroup-don-t-use-subsys-can_attach_task-or-attach_ta.patch
>  0005-cgroup-cpuset-don-t-use-ss-pre_attach.patch
>  0006-cgroup-kill-subsys-can_attach_task-pre_attach-and-at.patch

I don't understand the point of patches 3, 4, 5 and 6.

Why push the task iteration down to the subsystems?

^ permalink raw reply	[flat|nested] 100+ messages in thread


* Re: [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration
  2011-08-24  0:32   ` Frederic Weisbecker
  2011-08-24  1:31     ` Li Zefan
@ 2011-08-24  1:31     ` Li Zefan
  2011-08-24  1:31     ` Li Zefan
  2 siblings, 0 replies; 100+ messages in thread
From: Li Zefan @ 2011-08-24  1:31 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: paul-inf54ven1CmVyaH7bEyXVA, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	rjw-KKrjLPT3xs0, Tejun Heo,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Andrew Morton,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

> There has been a patch posted recently:
> 
>         "[PATCH][BUGFIX] cgroups: fix ordering of calls in cgroup_attach_proc"
> 
> that not only fixes that ordering but also attaches the task only if the
> migration happened correctly (i.e. the task has not exited).
> 
> Can somebody queue it for 3.2 ?
> 

cgroup patches normally go through -mm tree.

^ permalink raw reply	[flat|nested] 100+ messages in thread


* Re: [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task()
       [not found]   ` <1314138000-2049-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2011-08-24  1:57     ` Matt Helsley
  2011-08-25  9:07     ` Paul Menage
  1 sibling, 0 replies; 100+ messages in thread
From: Matt Helsley @ 2011-08-24  1:57 UTC (permalink / raw)
  To: Tejun Heo
  Cc: paul-inf54ven1CmVyaH7bEyXVA, Daisuke Nishimura,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Ingo Molnar,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Wed, Aug 24, 2011 at 12:19:58AM +0200, Tejun Heo wrote:

<snip>

> * In cgroup_freezer, remove unnecessary NULL assignments to unused
>   methods.  They're useless and very prone to getting out of sync, which
>   has already happened.

You could post this part independently -- that might be best since
I guess the taskset bits will need more discussion.

Cheers,
	-Matt Helsley

^ permalink raw reply	[flat|nested] 100+ messages in thread


* Re: [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods
  2011-08-24  1:14 ` [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Frederic Weisbecker
  2011-08-24  7:49   ` Tejun Heo
@ 2011-08-24  7:49   ` Tejun Heo
  2011-08-24  7:49   ` Tejun Heo
  2 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-24  7:49 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	paul-inf54ven1CmVyaH7bEyXVA

Hello, Frederic.

On Wed, Aug 24, 2011 at 03:14:30AM +0200, Frederic Weisbecker wrote:
> >  0001-cgroup-subsys-attach_task-should-be-called-after-mig.patch
> >  0002-cgroup-improve-old-cgroup-handling-in-cgroup_attach_.patch
> >  0003-cgroup-introduce-cgroup_taskset-and-use-it-in-subsys.patch
> >  0004-cgroup-don-t-use-subsys-can_attach_task-or-attach_ta.patch
> >  0005-cgroup-cpuset-don-t-use-ss-pre_attach.patch
> >  0006-cgroup-kill-subsys-can_attach_task-pre_attach-and-at.patch
> 
> I don't understand the point of patches 3, 4, 5 and 6.
> 
> Why push the task iteration down to the subsystems?

I'll try again.

It seems like methods were added to serve the immediate needs of the
particular user at the time, and that in turn led to the addition of
callbacks which were both superfluous and incomplete (the bullet points
in the original message list them).  This seems to have happened
because extra interfaces were added without trying to make the existing
interface complete.

The interface is complicated and cumbersome to use - is
[can_]attach() called first, or [can_]attach_task()?  What about
cancellation?  What if a subsys wants to perform operations across
multiple tasks atomically?

In general, iteration-by-callback is painful to use.  Establishing
common context (be it a synchronization domain or shared variables)
becomes very cumbersome, and the implementation becomes fragmented and
difficult to follow.  For example, imagine what it would be like to use
lists if we had call_for_each_list_entry(func, list_head) instead of
the control-loop style iterators we have now.

So, using iterators makes it possible to provide all relevant information
to each stage of attach so that only one callback is required for each
step - the way it should be.  In addition, it makes it far easier for
subsystems to implement more involved logic in their methods.

I tried to make cgroup_freezer behave better, which requires better
synchronization against the freezer, and, with the current interface,
it's extremely ugly and painful.  The new interface is complete and easy
to understand and use, with far fewer subtleties.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread


* Re: [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task()
       [not found]     ` <20110824015739.GE28444-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
@ 2011-08-24  7:54       ` Tejun Heo
  0 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-24  7:54 UTC (permalink / raw)
  To: Matt Helsley
  Cc: paul-inf54ven1CmVyaH7bEyXVA, Daisuke Nishimura,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Ingo Molnar,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Hello,

On Tue, Aug 23, 2011 at 06:57:39PM -0700, Matt Helsley wrote:
> On Wed, Aug 24, 2011 at 12:19:58AM +0200, Tejun Heo wrote:
> 
> <snip>
> 
> > * In cgroup_freezer, remove unnecessary NULL assignments to unused
> >   methods.  It's useless and very prone to get out of sync, which
> >   already happened.
> 
> You could post this part independently -- that might be best since
> I guess the taskset bits will need more discussion.

Urgh... I really think subsystems which aren't very isolated should
have their own trees.  Going through -mm works well if the changes are
isolated or all inter-dependent changes go through -mm too, but it
stops working as soon as the changes start to span across other git
trees.  If cgroup isn't expected to see a lot of changes in this cycle
(if it is, please set up a git tree; it isn't difficult), the best
solution, I think, would be sticking w/ Rafael's branch.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread


* Re: [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods
       [not found]     ` <20110824074959.GA14170-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
@ 2011-08-24 13:53       ` Frederic Weisbecker
  0 siblings, 0 replies; 100+ messages in thread
From: Frederic Weisbecker @ 2011-08-24 13:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	paul-inf54ven1CmVyaH7bEyXVA

On Wed, Aug 24, 2011 at 09:49:59AM +0200, Tejun Heo wrote:
> Hello, Frederic.
> 
> On Wed, Aug 24, 2011 at 03:14:30AM +0200, Frederic Weisbecker wrote:
> > >  0001-cgroup-subsys-attach_task-should-be-called-after-mig.patch
> > >  0002-cgroup-improve-old-cgroup-handling-in-cgroup_attach_.patch
> > >  0003-cgroup-introduce-cgroup_taskset-and-use-it-in-subsys.patch
> > >  0004-cgroup-don-t-use-subsys-can_attach_task-or-attach_ta.patch
> > >  0005-cgroup-cpuset-don-t-use-ss-pre_attach.patch
> > >  0006-cgroup-kill-subsys-can_attach_task-pre_attach-and-at.patch
> > 
> > I don't understand the point of patches 3, 4, 5 and 6.
> > 
> > Why push the task iteration down to the subsystems?
> 
> I'll try again.
> 
> It seems like methods were added to serve the immediate needs of the
> particular user at the time, and that in turn led to the addition of
> callbacks which were both superfluous and incomplete (the bullet points
> in the original message list them).  This seems to have happened
> because extra interfaces were added without trying to make the existing
> interface complete.
> 
> The interface is complicated and cumbersome to use - is
> [can_]attach() called first, or [can_]attach_task()?  What about
> cancellation?  What if a subsys wants to perform operations across
> multiple tasks atomically?
> 
> In general, iteration-by-callback is painful to use.  Establishing
> common context (be it a synchronization domain or shared variables)
> becomes very cumbersome, and the implementation becomes fragmented and
> difficult to follow.  For example, imagine what it would be like to use
> lists if we had call_for_each_list_entry(func, list_head) instead of
> the control-loop style iterators we have now.
> 
> So, using iterators makes it possible to provide all relevant information
> to each stage of attach so that only one callback is required for each
> step - the way it should be.  In addition, it makes it far easier for
> subsystems to implement more involved logic in their methods.
> 
> I tried to make cgroup_freezer behave better, which requires better
> synchronization against the freezer, and, with the current interface,
> it's extremely ugly and painful.  The new interface is complete and easy
> to understand and use, with far fewer subtleties.

Yeah, it's true that the ordering between [can_]attach()/[can_]attach_task(),
plus the added mess with pre_attach(), was not entirely sane.  The fact
that we have both foo() and foo_task() variants is already a problem.

I guess we indeed need to sacrifice doing the iteration in the cgroup
core for that.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods
  2011-08-24  7:49   ` Tejun Heo
  2011-08-24 13:53     ` Frederic Weisbecker
@ 2011-08-24 13:53     ` Frederic Weisbecker
       [not found]     ` <20110824074959.GA14170-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
  2 siblings, 0 replies; 100+ messages in thread
From: Frederic Weisbecker @ 2011-08-24 13:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: rjw, paul, lizf, linux-pm, linux-kernel, containers

On Wed, Aug 24, 2011 at 09:49:59AM +0200, Tejun Heo wrote:
> Hello, Frederic.
> 
> On Wed, Aug 24, 2011 at 03:14:30AM +0200, Frederic Weisbecker wrote:
> > >  0001-cgroup-subsys-attach_task-should-be-called-after-mig.patch
> > >  0002-cgroup-improve-old-cgroup-handling-in-cgroup_attach_.patch
> > >  0003-cgroup-introduce-cgroup_taskset-and-use-it-in-subsys.patch
> > >  0004-cgroup-don-t-use-subsys-can_attach_task-or-attach_ta.patch
> > >  0005-cgroup-cpuset-don-t-use-ss-pre_attach.patch
> > >  0006-cgroup-kill-subsys-can_attach_task-pre_attach-and-at.patch
> > 
> > I don't understand the point on patches 3,4,5,6
> > 
> > Why pushing the task iterations down to the subsystems?
> 
> I'll try again.
> 
> It seems like methods were added to serve the immediate need of the
> particular user at the time, and that in turn led to the addition of
> callbacks which were both superfluous and incomplete (the bullet points
> in the original message list them).  This seems to have happened
> because extra interface was added without trying to make the existing
> interface complete.
> 
> The interface is complicated and cumbersome to use - are
> [can_]attach() called first or [can_]attach_task()?  What about
> cancelation?  What if a subsys wants to perform operations across
> multiple tasks atomically?
> 
> In general, iteration-by-callback is painful to use.  Establishing
> common context (be it a synchronization domain or shared variables)
> becomes very cumbersome, and the implementation becomes fragmented and
> difficult to follow.  For example, imagine what it would be like to
> use lists if we had call_for_each_list_entry(func, list_head) instead
> of the control-loop style iterators we have now.
> 
> So, using iterators makes all relevant information available to each
> stage of attach so that only one callback is required for each step -
> the way it should be.  In addition, it makes it far easier for
> subsystems to implement more involved logic in their methods.
> 
> I tried to make cgroup_freezer behave better, which requires better
> synchronization against the freezer and, with the current interface,
> it's extremely ugly and painful.  The new interface is complete, easy
> to understand and use, with far fewer subtleties.

Yeah, it's true that the ordering between [can]attach/[can]attach_task,
plus the added mess with pre_attach, was not entirely sane. The fact
that we have both foo and foo_task is already a problem.

I guess we indeed need to sacrifice the iteration from the cgroup core
for that.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found]   ` <1314138000-2049-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2011-08-25  0:39     ` KAMEZAWA Hiroyuki
  2011-08-25  9:14     ` Paul Menage
  2011-08-25  9:32     ` Paul Menage
  2 siblings, 0 replies; 100+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-08-25  0:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	paul-inf54ven1CmVyaH7bEyXVA

On Wed, 24 Aug 2011 00:19:57 +0200
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:

> Currently, there's no way to pass multiple tasks to cgroup_subsys
> methods, necessitating separate per-process and per-task methods.
> This patch introduces cgroup_taskset, which can be used to pass
> multiple tasks and their associated cgroups to cgroup_subsys
> methods.
> 
> Three methods - can_attach(), cancel_attach() and attach() - are
> converted to use cgroup_taskset.  This unifies passed parameters so
> that all methods have access to all information.  Conversions in this
> patchset are identical and don't introduce any behavior change.
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
> Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Daisuke Nishimura <nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>

Thank you for your work. I welcome this!

Some comments around memcg.

> ---
>  Documentation/cgroups/cgroups.txt |   26 ++++++----
>  include/linux/cgroup.h            |   28 +++++++++-
>  kernel/cgroup.c                   |   99 +++++++++++++++++++++++++++++++++----
>  kernel/cgroup_freezer.c           |    2 +-
>  kernel/cpuset.c                   |   18 ++++---
>  mm/memcontrol.c                   |   16 +++---
>  security/device_cgroup.c          |    7 ++-
>  7 files changed, 153 insertions(+), 43 deletions(-)
> 
> diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
> index cd67e90..2eee7cf 100644
> --- a/Documentation/cgroups/cgroups.txt
> +++ b/Documentation/cgroups/cgroups.txt
> @@ -594,16 +594,21 @@ rmdir() will fail with it. From this behavior, pre_destroy() can be
>  called multiple times against a cgroup.
>  
>  int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -	       struct task_struct *task)
> +	       struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
>  
> -Called prior to moving a task into a cgroup; if the subsystem
> -returns an error, this will abort the attach operation.  If a NULL
> -task is passed, then a successful result indicates that *any*
> -unspecified task can be moved into the cgroup. Note that this isn't
> +Called prior to moving one or more tasks into a cgroup; if the
> +subsystem returns an error, this will abort the attach operation.
> +@tset contains the tasks to be attached and is guaranteed to have at
> +least one task in it. If there are multiple, it's guaranteed that all
> +are from the same thread group,


Does this, "If there are multiple, it's guaranteed that all
are from the same thread group", mean that the 'tset' contains
only one mm_struct?

And is it guaranteed that no task in tset will be freed while the
subsystem routine runs?

> @tset contains all tasks from the
> +group whether they're actually switching cgroup or not, and the first
> +task is the leader. Each @tset entry also contains the task's old
> +cgroup and tasks which aren't switching cgroup can be skipped easily
> +using the cgroup_taskset_for_each() iterator. Note that this isn't
>  called on a fork. If this method returns 0 (success) then this should
> -remain valid while the caller holds cgroup_mutex and it is ensured that either
> -attach() or cancel_attach() will be called in future.
> +remain valid while the caller holds cgroup_mutex and it is ensured
> +that either attach() or cancel_attach() will be called in future.
>  
>  int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
>  (cgroup_mutex held by caller)
> @@ -613,14 +618,14 @@ attached (possibly many when using cgroup_attach_proc). Called after
>  can_attach.
>  
>  void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -	       struct task_struct *task, bool threadgroup)
> +		   struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
>  
>  Called when a task attach operation has failed after can_attach() has succeeded.
>  A subsystem whose can_attach() has some side-effects should provide this
>  function, so that the subsystem can implement a rollback. If not, not necessary.
>  This will be called only about subsystems whose can_attach() operation have
> -succeeded.
> +succeeded. The parameters are identical to can_attach().
>  
>  void pre_attach(struct cgroup *cgrp);
>  (cgroup_mutex held by caller)
> @@ -629,11 +634,12 @@ For any non-per-thread attachment work that needs to happen before
>  attach_task. Needed by cpuset.
>  
>  void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -	    struct cgroup *old_cgrp, struct task_struct *task)
> +	    struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
>  
>  Called after the task has been attached to the cgroup, to allow any
>  post-attachment activity that requires memory allocations or blocking.
> +The parameters are identical to can_attach().
>  
>  void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
>  (cgroup_mutex held by caller)
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index da7e4bc..2470c8e 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -457,6 +457,28 @@ void cgroup_exclude_rmdir(struct cgroup_subsys_state *css);
>  void cgroup_release_and_wakeup_rmdir(struct cgroup_subsys_state *css);
>  
>  /*
> + * Control Group taskset, used to pass around set of tasks to cgroup_subsys
> + * methods.
> + */
> +struct cgroup_taskset;
> +struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset);
> +struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset);
> +struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset);
> +int cgroup_taskset_size(struct cgroup_taskset *tset);
> +
> +/**
> + * cgroup_taskset_for_each - iterate cgroup_taskset
> + * @task: the loop cursor
> + * @skip_cgrp: skip if task's cgroup matches this, %NULL to iterate through all
> + * @tset: taskset to iterate
> + */
> +#define cgroup_taskset_for_each(task, skip_cgrp, tset)			\
> +	for ((task) = cgroup_taskset_first((tset)); (task);		\
> +	     (task) = cgroup_taskset_next((tset)))			\
> +		if (!(skip_cgrp) ||					\
> +		    cgroup_taskset_cur_cgroup((tset)) != (skip_cgrp))
> +
> +/*
>   * Control Group subsystem type.
>   * See Documentation/cgroups/cgroups.txt for details
>   */
> @@ -467,14 +489,14 @@ struct cgroup_subsys {
>  	int (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
>  	void (*destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
>  	int (*can_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -			  struct task_struct *tsk);
> +			  struct cgroup_taskset *tset);
>  	int (*can_attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
>  	void (*cancel_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -			      struct task_struct *tsk);
> +			      struct cgroup_taskset *tset);
>  	void (*pre_attach)(struct cgroup *cgrp);
>  	void (*attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
>  	void (*attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -		       struct cgroup *old_cgrp, struct task_struct *tsk);
> +		       struct cgroup_taskset *tset);
>  	void (*fork)(struct cgroup_subsys *ss, struct task_struct *task);
>  	void (*exit)(struct cgroup_subsys *ss, struct cgroup *cgrp,
>  			struct cgroup *old_cgrp, struct task_struct *task);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index cf5f3e3..474674b 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1739,11 +1739,85 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
>  }
>  EXPORT_SYMBOL_GPL(cgroup_path);
>  
> +/*
> + * Control Group taskset
> + */
>  struct task_and_cgroup {
>  	struct task_struct	*task;
>  	struct cgroup		*cgrp;
>  };
>  
> +struct cgroup_taskset {
> +	struct task_and_cgroup	single;
> +	struct flex_array	*tc_array;
> +	int			tc_array_len;
> +	int			idx;
> +	struct cgroup		*cur_cgrp;
> +};
> +
> +/**
> + * cgroup_taskset_first - reset taskset and return the first task
> + * @tset: taskset of interest
> + *
> + * @tset iteration is initialized and the first task is returned.
> + */
> +struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset)
> +{
> +	if (tset->tc_array) {
> +		tset->idx = 0;
> +		return cgroup_taskset_next(tset);
> +	} else {
> +		tset->cur_cgrp = tset->single.cgrp;
> +		return tset->single.task;
> +	}
> +}
> +EXPORT_SYMBOL_GPL(cgroup_taskset_first);
> +
> +/**
> + * cgroup_taskset_next - iterate to the next task in taskset
> + * @tset: taskset of interest
> + *
> + * Return the next task in @tset.  Iteration must have been initialized
> + * with cgroup_taskset_first().
> + */
> +struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset)
> +{
> +	struct task_and_cgroup *tc;
> +
> +	if (!tset->tc_array || tset->idx >= tset->tc_array_len)
> +		return NULL;
> +
> +	tc = flex_array_get(tset->tc_array, tset->idx++);
> +	tset->cur_cgrp = tc->cgrp;
> +	return tc->task;
> +}
> +EXPORT_SYMBOL_GPL(cgroup_taskset_next);
> +
> +/**
> + * cgroup_taskset_cur_cgroup - return the matching cgroup for the current task
> + * @tset: taskset of interest
> + *
> + * Return the cgroup for the current (last returned) task of @tset.  This
> + * function must be preceded by either cgroup_taskset_first() or
> + * cgroup_taskset_next().
> + */
> +struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset)
> +{
> +	return tset->cur_cgrp;
> +}
> +EXPORT_SYMBOL_GPL(cgroup_taskset_cur_cgroup);
> +
> +/**
> + * cgroup_taskset_size - return the number of tasks in taskset
> + * @tset: taskset of interest
> + */
> +int cgroup_taskset_size(struct cgroup_taskset *tset)
> +{
> +	return tset->tc_array ? tset->tc_array_len : 1;
> +}
> +EXPORT_SYMBOL_GPL(cgroup_taskset_size);
> +
> +
>  /*
>   * cgroup_task_migrate - move a task from one cgroup to another.
>   *
> @@ -1828,15 +1902,19 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>  	struct cgroup_subsys *ss, *failed_ss = NULL;
>  	struct cgroup *oldcgrp;
>  	struct cgroupfs_root *root = cgrp->root;
> +	struct cgroup_taskset tset = { };
>  
>  	/* Nothing to do if the task is already in that cgroup */
>  	oldcgrp = task_cgroup_from_root(tsk, root);
>  	if (cgrp == oldcgrp)
>  		return 0;
>  
> +	tset.single.task = tsk;
> +	tset.single.cgrp = oldcgrp;
> +
>  	for_each_subsys(root, ss) {
>  		if (ss->can_attach) {
> -			retval = ss->can_attach(ss, cgrp, tsk);
> +			retval = ss->can_attach(ss, cgrp, &tset);
>  			if (retval) {
>  				/*
>  				 * Remember on which subsystem the can_attach()
> @@ -1867,7 +1945,7 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>  		if (ss->attach_task)
>  			ss->attach_task(cgrp, tsk);
>  		if (ss->attach)
> -			ss->attach(ss, cgrp, oldcgrp, tsk);
> +			ss->attach(ss, cgrp, &tset);
>  	}
>  
>  	synchronize_rcu();
> @@ -1889,7 +1967,7 @@ out:
>  				 */
>  				break;
>  			if (ss->cancel_attach)
> -				ss->cancel_attach(ss, cgrp, tsk);
> +				ss->cancel_attach(ss, cgrp, &tset);
>  		}
>  	}
>  	return retval;
> @@ -2005,6 +2083,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	struct task_struct *tsk;
>  	struct task_and_cgroup *tc;
>  	struct flex_array *group;
> +	struct cgroup_taskset tset = { };
>  	/*
>  	 * we need to make sure we have css_sets for all the tasks we're
>  	 * going to move -before- we actually start moving them, so that in
> @@ -2067,6 +2146,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	} while_each_thread(leader, tsk);
>  	/* remember the number of threads in the array for later. */
>  	group_size = i;
> +	tset.tc_array = group;
> +	tset.tc_array_len = group_size;
>  	rcu_read_unlock();
>  
>  	/* methods shouldn't be called if no task is actually migrating */
> @@ -2079,7 +2160,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	 */
>  	for_each_subsys(root, ss) {
>  		if (ss->can_attach) {
> -			retval = ss->can_attach(ss, cgrp, leader);
> +			retval = ss->can_attach(ss, cgrp, &tset);
>  			if (retval) {
>  				failed_ss = ss;
>  				goto out_cancel_attach;
> @@ -2169,10 +2250,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	 * being moved, this call will need to be reworked to communicate that.
>  	 */
>  	for_each_subsys(root, ss) {
> -		if (ss->attach) {
> -			tc = flex_array_get(group, 0);
> -			ss->attach(ss, cgrp, tc->cgrp, tc->task);
> -		}
> +		if (ss->attach)
> +			ss->attach(ss, cgrp, &tset);
>  	}
>  
>  	/*
> @@ -2194,11 +2273,11 @@ out_cancel_attach:
>  		for_each_subsys(root, ss) {
>  			if (ss == failed_ss) {
>  				if (cancel_failed_ss && ss->cancel_attach)
> -					ss->cancel_attach(ss, cgrp, leader);
> +					ss->cancel_attach(ss, cgrp, &tset);
>  				break;
>  			}
>  			if (ss->cancel_attach)
> -				ss->cancel_attach(ss, cgrp, leader);
> +				ss->cancel_attach(ss, cgrp, &tset);
>  		}
>  	}
>  out_put_tasks:
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index 4e82525..a2b0082 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -159,7 +159,7 @@ static void freezer_destroy(struct cgroup_subsys *ss,
>   */
>  static int freezer_can_attach(struct cgroup_subsys *ss,
>  			      struct cgroup *new_cgroup,
> -			      struct task_struct *task)
> +			      struct cgroup_taskset *tset)
>  {
>  	struct freezer *freezer;
>  
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 10131fd..2e5825b 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1368,10 +1368,10 @@ static int fmeter_getrate(struct fmeter *fmp)
>  }
>  
>  /* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
> -static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
> -			     struct task_struct *tsk)
> +static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +			     struct cgroup_taskset *tset)
>  {
> -	struct cpuset *cs = cgroup_cs(cont);
> +	struct cpuset *cs = cgroup_cs(cgrp);
>  
>  	if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
>  		return -ENOSPC;
> @@ -1384,7 +1384,7 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
>  	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
>  	 * be changed.
>  	 */
> -	if (tsk->flags & PF_THREAD_BOUND)
> +	if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
>  		return -EINVAL;
>  
>  	return 0;
> @@ -1434,12 +1434,14 @@ static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
>  	cpuset_update_task_spread_flag(cs, tsk);
>  }
>  
> -static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cont,
> -			  struct cgroup *oldcont, struct task_struct *tsk)
> +static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +			  struct cgroup_taskset *tset)
>  {
>  	struct mm_struct *mm;
> -	struct cpuset *cs = cgroup_cs(cont);
> -	struct cpuset *oldcs = cgroup_cs(oldcont);
> +	struct task_struct *tsk = cgroup_taskset_first(tset);
> +	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
> +	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *oldcs = cgroup_cs(oldcgrp);
>  
>  	/*
>  	 * Change mm, possibly for multiple threads in a threadgroup. This is
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 930de94..b2802cc 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5460,8 +5460,9 @@ static void mem_cgroup_clear_mc(void)
>  
>  static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
>  				struct cgroup *cgroup,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
> +	struct task_struct *p = cgroup_taskset_first(tset);
>  	int ret = 0;
>  	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
>  

Ah..hmm. I think this doesn't work as expected for memcg.
Maybe code like this will be required.

{
	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
	struct mem_cgroup *from;
	struct task_struct *task, *p = NULL;
	struct mm_struct *mm = NULL;
	int ret = 0;

	/*
	 * memcg only works against the mm owner.  Check whether the
	 * mm owner is in this cgroup.  Because @tset contains only one
	 * thread group, we'll find at most one task matching mm->owner.
	 */
	cgroup_taskset_for_each(task, NULL, tset) {
		mm = get_task_mm(task);
		if (!mm)
			continue;
		if (mm->owner == task) {
			p = task;
			break;
		}
		mmput(mm);
	}
	if (!p)
		return 0;

	from = mem_cgroup_from_task(p);
	mem_cgroup_start_move(from);
	spin_lock(&mc.lock);
	mc.from = from;
	mc.to = mem;
	spin_unlock(&mc.lock);
	/* We set mc.moving_task later */

	ret = mem_cgroup_precharge_mc(mm);
	if (ret)
		mem_cgroup_clear_mc();
	mmput(mm);
	return ret;
}



> @@ -5499,7 +5500,7 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
>  
>  static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
>  				struct cgroup *cgroup,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
>  	mem_cgroup_clear_mc();
>  }
> @@ -5616,9 +5617,9 @@ retry:
>  
>  static void mem_cgroup_move_task(struct cgroup_subsys *ss,
>  				struct cgroup *cont,
> -				struct cgroup *old_cont,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
> +	struct task_struct *p = cgroup_taskset_first(tset);
>  	struct mm_struct *mm = get_task_mm(p);
>  

Similar code to that in can_attach() will be required here as well.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 100+ messages in thread

>  		if (ss->can_attach) {
> -			retval = ss->can_attach(ss, cgrp, tsk);
> +			retval = ss->can_attach(ss, cgrp, &tset);
>  			if (retval) {
>  				/*
>  				 * Remember on which subsystem the can_attach()
> @@ -1867,7 +1945,7 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>  		if (ss->attach_task)
>  			ss->attach_task(cgrp, tsk);
>  		if (ss->attach)
> -			ss->attach(ss, cgrp, oldcgrp, tsk);
> +			ss->attach(ss, cgrp, &tset);
>  	}
>  
>  	synchronize_rcu();
> @@ -1889,7 +1967,7 @@ out:
>  				 */
>  				break;
>  			if (ss->cancel_attach)
> -				ss->cancel_attach(ss, cgrp, tsk);
> +				ss->cancel_attach(ss, cgrp, &tset);
>  		}
>  	}
>  	return retval;
> @@ -2005,6 +2083,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	struct task_struct *tsk;
>  	struct task_and_cgroup *tc;
>  	struct flex_array *group;
> +	struct cgroup_taskset tset = { };
>  	/*
>  	 * we need to make sure we have css_sets for all the tasks we're
>  	 * going to move -before- we actually start moving them, so that in
> @@ -2067,6 +2146,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	} while_each_thread(leader, tsk);
>  	/* remember the number of threads in the array for later. */
>  	group_size = i;
> +	tset.tc_array = group;
> +	tset.tc_array_len = group_size;
>  	rcu_read_unlock();
>  
>  	/* methods shouldn't be called if no task is actually migrating */
> @@ -2079,7 +2160,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	 */
>  	for_each_subsys(root, ss) {
>  		if (ss->can_attach) {
> -			retval = ss->can_attach(ss, cgrp, leader);
> +			retval = ss->can_attach(ss, cgrp, &tset);
>  			if (retval) {
>  				failed_ss = ss;
>  				goto out_cancel_attach;
> @@ -2169,10 +2250,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	 * being moved, this call will need to be reworked to communicate that.
>  	 */
>  	for_each_subsys(root, ss) {
> -		if (ss->attach) {
> -			tc = flex_array_get(group, 0);
> -			ss->attach(ss, cgrp, tc->cgrp, tc->task);
> -		}
> +		if (ss->attach)
> +			ss->attach(ss, cgrp, &tset);
>  	}
>  
>  	/*
> @@ -2194,11 +2273,11 @@ out_cancel_attach:
>  		for_each_subsys(root, ss) {
>  			if (ss == failed_ss) {
>  				if (cancel_failed_ss && ss->cancel_attach)
> -					ss->cancel_attach(ss, cgrp, leader);
> +					ss->cancel_attach(ss, cgrp, &tset);
>  				break;
>  			}
>  			if (ss->cancel_attach)
> -				ss->cancel_attach(ss, cgrp, leader);
> +				ss->cancel_attach(ss, cgrp, &tset);
>  		}
>  	}
>  out_put_tasks:
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index 4e82525..a2b0082 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -159,7 +159,7 @@ static void freezer_destroy(struct cgroup_subsys *ss,
>   */
>  static int freezer_can_attach(struct cgroup_subsys *ss,
>  			      struct cgroup *new_cgroup,
> -			      struct task_struct *task)
> +			      struct cgroup_taskset *tset)
>  {
>  	struct freezer *freezer;
>  
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 10131fd..2e5825b 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1368,10 +1368,10 @@ static int fmeter_getrate(struct fmeter *fmp)
>  }
>  
>  /* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
> -static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
> -			     struct task_struct *tsk)
> +static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +			     struct cgroup_taskset *tset)
>  {
> -	struct cpuset *cs = cgroup_cs(cont);
> +	struct cpuset *cs = cgroup_cs(cgrp);
>  
>  	if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
>  		return -ENOSPC;
> @@ -1384,7 +1384,7 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
>  	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
>  	 * be changed.
>  	 */
> -	if (tsk->flags & PF_THREAD_BOUND)
> +	if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
>  		return -EINVAL;
>  
>  	return 0;
> @@ -1434,12 +1434,14 @@ static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
>  	cpuset_update_task_spread_flag(cs, tsk);
>  }
>  
> -static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cont,
> -			  struct cgroup *oldcont, struct task_struct *tsk)
> +static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +			  struct cgroup_taskset *tset)
>  {
>  	struct mm_struct *mm;
> -	struct cpuset *cs = cgroup_cs(cont);
> -	struct cpuset *oldcs = cgroup_cs(oldcont);
> +	struct task_struct *tsk = cgroup_taskset_first(tset);
> +	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
> +	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *oldcs = cgroup_cs(oldcgrp);
>  
>  	/*
>  	 * Change mm, possibly for multiple threads in a threadgroup. This is
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 930de94..b2802cc 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5460,8 +5460,9 @@ static void mem_cgroup_clear_mc(void)
>  
>  static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
>  				struct cgroup *cgroup,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
> +	struct task_struct *p = cgroup_taskset_first(tset);
>  	int ret = 0;
>  	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
>  

Ah..hmm. I think this doesn't work as expected for memcg.
Maybe code like this will be required.

{
	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
        struct mem_cgroup *from = NULL;
	struct task_struct *p;
	int ret = 0;
	/*
 	 * memcg just works against mm-owner. Check mm-owner is in this cgroup.
	 * Because tset contains only one thread-group, we'll find a task of
	 * mm->owner, at most.
 	 */
	for_cgroup_taskset_for_each(task, NULL, tset) {
		struct mm_struct *mm;

		mm = get_task_mm(task);
		if (!mm)
			continue;
		if (mm->owner == task) {
			p = task;
			break;
		}
		mmput(mm);
	}
	if (!p)
		return ret;
        from = mem_cgroup_from_task(p);
        mem_cgroup_start_move(from);
        spin_lock(&mc.lock);
	mc.from = from;
        mc.to = mem;
        spin_unlock(&mc.lock);
        /* We set mc.moving_task later */

        ret = mem_cgroup_precharge_mc(mm);
        if (ret)
             mem_cgroup_clear_mc();
	mm_put(mm);
        return ret;
} 



> @@ -5499,7 +5500,7 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
>  
>  static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
>  				struct cgroup *cgroup,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
>  	mem_cgroup_clear_mc();
>  }
> @@ -5616,9 +5617,9 @@ retry:
>  
>  static void mem_cgroup_move_task(struct cgroup_subsys *ss,
>  				struct cgroup *cont,
> -				struct cgroup *old_cont,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
> +	struct task_struct *p = cgroup_taskset_first(tset);
>  	struct mm_struct *mm = get_task_mm(p);
>  

Similar code with can_attach() will be required.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
  2011-08-23 22:19 ` [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
@ 2011-08-25  0:39   ` KAMEZAWA Hiroyuki
  2011-08-25  0:39   ` KAMEZAWA Hiroyuki
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 100+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-08-25  0:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Daisuke Nishimura, containers, lizf, linux-kernel, James Morris,
	linux-pm, paul

On Wed, 24 Aug 2011 00:19:57 +0200
Tejun Heo <tj@kernel.org> wrote:

> Currently, there's no way to pass multiple tasks to cgroup_subsys
> methods necessitating the need for separate per-process and per-task
> methods.  This patch introduces cgroup_taskset which can be used to
> pass multiple tasks and their associated cgroups to cgroup_subsys
> methods.
> 
> Three methods - can_attach(), cancel_attach() and attach() - are
> converted to use cgroup_taskset.  This unifies passed parameters so
> that all methods have access to all information.  Conversions in this
> patchset are identical and don't introduce any behavior change.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Paul Menage <paul@paulmenage.org>
> Cc: Li Zefan <lizf@cn.fujitsu.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: James Morris <jmorris@namei.org>

Thank you for your work. I welcome this!

Some comments around memcg.

> ---
>  Documentation/cgroups/cgroups.txt |   26 ++++++----
>  include/linux/cgroup.h            |   28 +++++++++-
>  kernel/cgroup.c                   |   99 +++++++++++++++++++++++++++++++++----
>  kernel/cgroup_freezer.c           |    2 +-
>  kernel/cpuset.c                   |   18 ++++---
>  mm/memcontrol.c                   |   16 +++---
>  security/device_cgroup.c          |    7 ++-
>  7 files changed, 153 insertions(+), 43 deletions(-)
> 
> diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
> index cd67e90..2eee7cf 100644
> --- a/Documentation/cgroups/cgroups.txt
> +++ b/Documentation/cgroups/cgroups.txt
> @@ -594,16 +594,21 @@ rmdir() will fail with it. From this behavior, pre_destroy() can be
>  called multiple times against a cgroup.
>  
>  int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -	       struct task_struct *task)
> +	       struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
>  
> -Called prior to moving a task into a cgroup; if the subsystem
> -returns an error, this will abort the attach operation.  If a NULL
> -task is passed, then a successful result indicates that *any*
> -unspecified task can be moved into the cgroup. Note that this isn't
> +Called prior to moving one or more tasks into a cgroup; if the
> +subsystem returns an error, this will abort the attach operation.
> +@tset contains the tasks to be attached and is guaranteed to have at
> +least one task in it. If there are multiple, it's guaranteed that all
> +are from the same thread group,


Does this, "If there are multiple, it's guaranteed that all
are from the same thread group", mean that the 'tset' contains
only one mm_struct?

And is it guaranteed that no task in tset will be freed while the
subsystem routine runs?

> @tset contains all tasks from the
> +group whether they're actually switching cgroup or not, and the first
> +task is the leader. Each @tset entry also contains the task's old
> +cgroup and tasks which aren't switching cgroup can be skipped easily
> +using the cgroup_taskset_for_each() iterator. Note that this isn't
>  called on a fork. If this method returns 0 (success) then this should
> -remain valid while the caller holds cgroup_mutex and it is ensured that either
> -attach() or cancel_attach() will be called in future.
> +remain valid while the caller holds cgroup_mutex and it is ensured
> +that either attach() or cancel_attach() will be called in future.
>  
>  int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
>  (cgroup_mutex held by caller)
> @@ -613,14 +618,14 @@ attached (possibly many when using cgroup_attach_proc). Called after
>  can_attach.
>  
>  void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -	       struct task_struct *task, bool threadgroup)
> +		   struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
>  
>  Called when a task attach operation has failed after can_attach() has succeeded.
>  A subsystem whose can_attach() has some side-effects should provide this
>  function, so that the subsystem can implement a rollback. If not, not necessary.
>  This will be called only about subsystems whose can_attach() operation have
> -succeeded.
> +succeeded. The parameters are identical to can_attach().
>  
>  void pre_attach(struct cgroup *cgrp);
>  (cgroup_mutex held by caller)
> @@ -629,11 +634,12 @@ For any non-per-thread attachment work that needs to happen before
>  attach_task. Needed by cpuset.
>  
>  void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -	    struct cgroup *old_cgrp, struct task_struct *task)
> +	    struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
>  
>  Called after the task has been attached to the cgroup, to allow any
>  post-attachment activity that requires memory allocations or blocking.
> +The parameters are identical to can_attach().
>  
>  void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
>  (cgroup_mutex held by caller)
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index da7e4bc..2470c8e 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -457,6 +457,28 @@ void cgroup_exclude_rmdir(struct cgroup_subsys_state *css);
>  void cgroup_release_and_wakeup_rmdir(struct cgroup_subsys_state *css);
>  
>  /*
> + * Control Group taskset, used to pass around set of tasks to cgroup_subsys
> + * methods.
> + */
> +struct cgroup_taskset;
> +struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset);
> +struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset);
> +struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset);
> +int cgroup_taskset_size(struct cgroup_taskset *tset);
> +
> +/**
> + * cgroup_taskset_for_each - iterate cgroup_taskset
> + * @task: the loop cursor
> + * @skip_cgrp: skip if task's cgroup matches this, %NULL to iterate through all
> + * @tset: taskset to iterate
> + */
> +#define cgroup_taskset_for_each(task, skip_cgrp, tset)			\
> +	for ((task) = cgroup_taskset_first((tset)); (task);		\
> +	     (task) = cgroup_taskset_next((tset)))			\
> +		if (!(skip_cgrp) ||					\
> +		    cgroup_taskset_cur_cgroup((tset)) != (skip_cgrp))
> +
> +/*
>   * Control Group subsystem type.
>   * See Documentation/cgroups/cgroups.txt for details
>   */
> @@ -467,14 +489,14 @@ struct cgroup_subsys {
>  	int (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
>  	void (*destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
>  	int (*can_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -			  struct task_struct *tsk);
> +			  struct cgroup_taskset *tset);
>  	int (*can_attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
>  	void (*cancel_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -			      struct task_struct *tsk);
> +			      struct cgroup_taskset *tset);
>  	void (*pre_attach)(struct cgroup *cgrp);
>  	void (*attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
>  	void (*attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
> -		       struct cgroup *old_cgrp, struct task_struct *tsk);
> +		       struct cgroup_taskset *tset);
>  	void (*fork)(struct cgroup_subsys *ss, struct task_struct *task);
>  	void (*exit)(struct cgroup_subsys *ss, struct cgroup *cgrp,
>  			struct cgroup *old_cgrp, struct task_struct *task);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index cf5f3e3..474674b 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1739,11 +1739,85 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
>  }
>  EXPORT_SYMBOL_GPL(cgroup_path);
>  
> +/*
> + * Control Group taskset
> + */
>  struct task_and_cgroup {
>  	struct task_struct	*task;
>  	struct cgroup		*cgrp;
>  };
>  
> +struct cgroup_taskset {
> +	struct task_and_cgroup	single;
> +	struct flex_array	*tc_array;
> +	int			tc_array_len;
> +	int			idx;
> +	struct cgroup		*cur_cgrp;
> +};
> +
> +/**
> + * cgroup_taskset_first - reset taskset and return the first task
> + * @tset: taskset of interest
> + *
> + * @tset iteration is initialized and the first task is returned.
> + */
> +struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset)
> +{
> +	if (tset->tc_array) {
> +		tset->idx = 0;
> +		return cgroup_taskset_next(tset);
> +	} else {
> +		tset->cur_cgrp = tset->single.cgrp;
> +		return tset->single.task;
> +	}
> +}
> +EXPORT_SYMBOL_GPL(cgroup_taskset_first);
> +
> +/**
> + * cgroup_taskset_next - iterate to the next task in taskset
> + * @tset: taskset of interest
> + *
> + * Return the next task in @tset.  Iteration must have been initialized
> + * with cgroup_taskset_first().
> + */
> +struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset)
> +{
> +	struct task_and_cgroup *tc;
> +
> +	if (!tset->tc_array || tset->idx >= tset->tc_array_len)
> +		return NULL;
> +
> +	tc = flex_array_get(tset->tc_array, tset->idx++);
> +	tset->cur_cgrp = tc->cgrp;
> +	return tc->task;
> +}
> +EXPORT_SYMBOL_GPL(cgroup_taskset_next);
> +
> +/**
> + * cgroup_taskset_cur_cgroup - return the matching cgroup for the current task
> + * @tset: taskset of interest
> + *
> + * Return the cgroup for the current (last returned) task of @tset.  This
> + * function must be preceded by either cgroup_taskset_first() or
> + * cgroup_taskset_next().
> + */
> +struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset)
> +{
> +	return tset->cur_cgrp;
> +}
> +EXPORT_SYMBOL_GPL(cgroup_taskset_cur_cgroup);
> +
> +/**
> + * cgroup_taskset_size - return the number of tasks in taskset
> + * @tset: taskset of interest
> + */
> +int cgroup_taskset_size(struct cgroup_taskset *tset)
> +{
> +	return tset->tc_array ? tset->tc_array_len : 1;
> +}
> +EXPORT_SYMBOL_GPL(cgroup_taskset_size);
> +
> +
>  /*
>   * cgroup_task_migrate - move a task from one cgroup to another.
>   *
> @@ -1828,15 +1902,19 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>  	struct cgroup_subsys *ss, *failed_ss = NULL;
>  	struct cgroup *oldcgrp;
>  	struct cgroupfs_root *root = cgrp->root;
> +	struct cgroup_taskset tset = { };
>  
>  	/* Nothing to do if the task is already in that cgroup */
>  	oldcgrp = task_cgroup_from_root(tsk, root);
>  	if (cgrp == oldcgrp)
>  		return 0;
>  
> +	tset.single.task = tsk;
> +	tset.single.cgrp = oldcgrp;
> +
>  	for_each_subsys(root, ss) {
>  		if (ss->can_attach) {
> -			retval = ss->can_attach(ss, cgrp, tsk);
> +			retval = ss->can_attach(ss, cgrp, &tset);
>  			if (retval) {
>  				/*
>  				 * Remember on which subsystem the can_attach()
> @@ -1867,7 +1945,7 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>  		if (ss->attach_task)
>  			ss->attach_task(cgrp, tsk);
>  		if (ss->attach)
> -			ss->attach(ss, cgrp, oldcgrp, tsk);
> +			ss->attach(ss, cgrp, &tset);
>  	}
>  
>  	synchronize_rcu();
> @@ -1889,7 +1967,7 @@ out:
>  				 */
>  				break;
>  			if (ss->cancel_attach)
> -				ss->cancel_attach(ss, cgrp, tsk);
> +				ss->cancel_attach(ss, cgrp, &tset);
>  		}
>  	}
>  	return retval;
> @@ -2005,6 +2083,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	struct task_struct *tsk;
>  	struct task_and_cgroup *tc;
>  	struct flex_array *group;
> +	struct cgroup_taskset tset = { };
>  	/*
>  	 * we need to make sure we have css_sets for all the tasks we're
>  	 * going to move -before- we actually start moving them, so that in
> @@ -2067,6 +2146,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	} while_each_thread(leader, tsk);
>  	/* remember the number of threads in the array for later. */
>  	group_size = i;
> +	tset.tc_array = group;
> +	tset.tc_array_len = group_size;
>  	rcu_read_unlock();
>  
>  	/* methods shouldn't be called if no task is actually migrating */
> @@ -2079,7 +2160,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	 */
>  	for_each_subsys(root, ss) {
>  		if (ss->can_attach) {
> -			retval = ss->can_attach(ss, cgrp, leader);
> +			retval = ss->can_attach(ss, cgrp, &tset);
>  			if (retval) {
>  				failed_ss = ss;
>  				goto out_cancel_attach;
> @@ -2169,10 +2250,8 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  	 * being moved, this call will need to be reworked to communicate that.
>  	 */
>  	for_each_subsys(root, ss) {
> -		if (ss->attach) {
> -			tc = flex_array_get(group, 0);
> -			ss->attach(ss, cgrp, tc->cgrp, tc->task);
> -		}
> +		if (ss->attach)
> +			ss->attach(ss, cgrp, &tset);
>  	}
>  
>  	/*
> @@ -2194,11 +2273,11 @@ out_cancel_attach:
>  		for_each_subsys(root, ss) {
>  			if (ss == failed_ss) {
>  				if (cancel_failed_ss && ss->cancel_attach)
> -					ss->cancel_attach(ss, cgrp, leader);
> +					ss->cancel_attach(ss, cgrp, &tset);
>  				break;
>  			}
>  			if (ss->cancel_attach)
> -				ss->cancel_attach(ss, cgrp, leader);
> +				ss->cancel_attach(ss, cgrp, &tset);
>  		}
>  	}
>  out_put_tasks:
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index 4e82525..a2b0082 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -159,7 +159,7 @@ static void freezer_destroy(struct cgroup_subsys *ss,
>   */
>  static int freezer_can_attach(struct cgroup_subsys *ss,
>  			      struct cgroup *new_cgroup,
> -			      struct task_struct *task)
> +			      struct cgroup_taskset *tset)
>  {
>  	struct freezer *freezer;
>  
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 10131fd..2e5825b 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1368,10 +1368,10 @@ static int fmeter_getrate(struct fmeter *fmp)
>  }
>  
>  /* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
> -static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
> -			     struct task_struct *tsk)
> +static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +			     struct cgroup_taskset *tset)
>  {
> -	struct cpuset *cs = cgroup_cs(cont);
> +	struct cpuset *cs = cgroup_cs(cgrp);
>  
>  	if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
>  		return -ENOSPC;
> @@ -1384,7 +1384,7 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
>  	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
>  	 * be changed.
>  	 */
> -	if (tsk->flags & PF_THREAD_BOUND)
> +	if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
>  		return -EINVAL;
>  
>  	return 0;
> @@ -1434,12 +1434,14 @@ static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
>  	cpuset_update_task_spread_flag(cs, tsk);
>  }
>  
> -static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cont,
> -			  struct cgroup *oldcont, struct task_struct *tsk)
> +static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +			  struct cgroup_taskset *tset)
>  {
>  	struct mm_struct *mm;
> -	struct cpuset *cs = cgroup_cs(cont);
> -	struct cpuset *oldcs = cgroup_cs(oldcont);
> +	struct task_struct *tsk = cgroup_taskset_first(tset);
> +	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
> +	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *oldcs = cgroup_cs(oldcgrp);
>  
>  	/*
>  	 * Change mm, possibly for multiple threads in a threadgroup. This is
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 930de94..b2802cc 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5460,8 +5460,9 @@ static void mem_cgroup_clear_mc(void)
>  
>  static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
>  				struct cgroup *cgroup,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
> +	struct task_struct *p = cgroup_taskset_first(tset);
>  	int ret = 0;
>  	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
>  

Ah, hmm. I think this doesn't work as expected for memcg.
Code like the following will probably be required:

{
	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
	struct mem_cgroup *from;
	struct task_struct *task, *p = NULL;
	struct mm_struct *mm = NULL;
	int ret = 0;

	/*
	 * memcg works only against the mm-owner.  Check that the mm-owner is
	 * in this cgroup.  Because @tset contains only one thread group,
	 * we'll find at most one task which is mm->owner.
	 */
	cgroup_taskset_for_each(task, NULL, tset) {
		mm = get_task_mm(task);
		if (!mm)
			continue;
		if (mm->owner == task) {
			p = task;
			break;
		}
		mmput(mm);
	}
	if (!p)
		return 0;

	from = mem_cgroup_from_task(p);
	mem_cgroup_start_move(from);
	spin_lock(&mc.lock);
	mc.from = from;
	mc.to = mem;
	spin_unlock(&mc.lock);
	/* We set mc.moving_task later */

	ret = mem_cgroup_precharge_mc(mm);
	if (ret)
		mem_cgroup_clear_mc();
	mmput(mm);
	return ret;
}



> @@ -5499,7 +5500,7 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
>  
>  static void mem_cgroup_cancel_attach(struct cgroup_subsys *ss,
>  				struct cgroup *cgroup,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
>  	mem_cgroup_clear_mc();
>  }
> @@ -5616,9 +5617,9 @@ retry:
>  
>  static void mem_cgroup_move_task(struct cgroup_subsys *ss,
>  				struct cgroup *cont,
> -				struct cgroup *old_cont,
> -				struct task_struct *p)
> +				struct cgroup_taskset *tset)
>  {
> +	struct task_struct *p = cgroup_taskset_first(tset);
>  	struct mm_struct *mm = get_task_mm(p);
>  

Similar code to that in can_attach() will be required.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
  2011-08-25  0:39   ` KAMEZAWA Hiroyuki
  2011-08-25  8:20     ` Tejun Heo
@ 2011-08-25  8:20     ` Tejun Heo
       [not found]       ` <20110825082049.GC3286-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
                         ` (2 more replies)
       [not found]     ` <20110825093958.75b95bd8.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
  2 siblings, 3 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-25  8:20 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: rjw, paul, lizf, linux-pm, linux-kernel, containers,
	Balbir Singh, Daisuke Nishimura, James Morris

Hello, KAMEZAWA.

On Thu, Aug 25, 2011 at 09:39:58AM +0900, KAMEZAWA Hiroyuki wrote:
> > +Called prior to moving one or more tasks into a cgroup; if the
> > +subsystem returns an error, this will abort the attach operation.
> > +@tset contains the tasks to be attached and is guaranteed to have at
> > +least one task in it. If there are multiple, it's guaranteed that all
> > +are from the same thread group,
> 
> 
> Do this, "If there are multiple, it's guaranteed that all
> are from the same thread group ", means the 'tset' contains
> only one mm_struct ?

Yes, CLONE_THREAD requires CLONE_SIGHAND which in turn requires
CLONE_VM, so they'll all have the same mm.

> And is it guaranteed that any task in tset will not be freed while
> subsystem routine runs ?

Yeap, that one is guaranteed.  It might die but the task_struct
itself won't be released.  However, I think it might be better to block
task exits during migration too.

> > @@ -5460,8 +5460,9 @@ static void mem_cgroup_clear_mc(void)
> >  
> >  static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
> >  				struct cgroup *cgroup,
> > -				struct task_struct *p)
> > +				struct cgroup_taskset *tset)
> >  {
> > +	struct task_struct *p = cgroup_taskset_first(tset);
> >  	int ret = 0;
> >  	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
> >  
> 
> Ah..hmm. I think this doesn't work as expected for memcg.
> Maybe code like this will be required.

Hmmm... the above is basically identity transformation of the existing
code.  If the above is broken, the current code is broken too.  Is it?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found]       ` <20110825082049.GC3286-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
@ 2011-08-25  8:21         ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 100+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-08-25  8:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	paul-inf54ven1CmVyaH7bEyXVA

On Thu, 25 Aug 2011 10:20:49 +0200
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:

> Hello, KAMEZAWA.
> 

> > > @@ -5460,8 +5460,9 @@ static void mem_cgroup_clear_mc(void)
> > >  
> > >  static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
> > >  				struct cgroup *cgroup,
> > > -				struct task_struct *p)
> > > +				struct cgroup_taskset *tset)
> > >  {
> > > +	struct task_struct *p = cgroup_taskset_first(tset);
> > >  	int ret = 0;
> > >  	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
> > >  
> > 
> > Ah..hmm. I think this doesn't work as expected for memcg.
> > Maybe code like this will be required.
> 
> Hmmm... the above is basically identity transformation of the existing
> code.  If the above is broken, the current code is broken too.  Is it?
> 
Current code is not broken.

mem_cgroup_can_attach(....., task) needs to do its real work only when
task->mm->owner == task. With this modification, you pass a set of tasks
at once, so mem_cgroup_can_attach() needs to check _all_ tasks in tset
rather than just the first one. Please scan the tset and find mm->owner.

Anyway, if you merge this into mm-tree first, I think I can find time to
write a fixup if it turns out to be complicated.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found]           ` <CAOS58YPM=cuWjAF+VJ4QJ8bnRcVtaDCVXBJCpdWg+2=2GmnKrA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-08-25  8:37             ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 100+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-08-25  8:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	paul-inf54ven1CmVyaH7bEyXVA

On Thu, 25 Aug 2011 10:40:06 +0200
Tejun Heo <tj@kernel.org> wrote:

> Hello,
> 
> On Thu, Aug 25, 2011 at 10:21 AM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> >> Hmmm... the above is basically identity transformation of the existing
> >> code.  If the above is broken, the current code is broken too.  Is it?
> >>
> > Current code is not broken.
> 
> Trust me.  If the posted code is broken, the current code is too. It
> is an IDENTITY transformation.
> 
> > mem_cgroup_can_attach(....., task) need to do real job only when task->mm->owner
> > == task. In this modification, you pass a set of task at once.
> 
> Before the change, cgroup would migrate multiple tasks all the same
> but memcg wouldn't have noticed it unless it opted in explicitly using
> [can_]attach_task(). When multiple tasks were moving, [can_]attach()
> would only be called with the leader whether the leader actually is
> changing cgroup or not. The interface sucked and it wasn't properly
> documented but that's what was happening. The interface change is just
> making the breakage obvious - +1 for the new interface. :)
> 
Thank you for clarification. Ok, current code is broken.


> > So, mem_cgroup_can_attach() need to check _all_ tasks in tset rather than a
> > first task in tset. please scan and find mm->owner.
> >
> > Anyway, if you merge this onto mm-tree first, I think I can have time to
> > write a fix up if complicated.
> 
> As for specific merging order, it shouldn't matter all that much but
> if you wanna backport fixes for -stable, maybe it would make more
> sense to sequence the fix before this change.
> 
> Thank you.

Sure. IIUC, the case thread_leader != mm->owner is uncommon.
I'll consider a fix on top of your fix first.

I'll consider a fix for the stable tree if someone requests one.

Thanks,
-Kame

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found]         ` <20110825172140.eb34809f.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2011-08-25  8:40           ` Tejun Heo
  0 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-25  8:40 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	paul-inf54ven1CmVyaH7bEyXVA

Hello,

On Thu, Aug 25, 2011 at 10:21 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
>> Hmmm... the above is basically identity transformation of the existing
>> code.  If the above is broken, the current code is broken too.  Is it?
>>
> Current code is not broken.

Trust me.  If the posted code is broken, the current code is too. It
is an IDENTITY transformation.

> mem_cgroup_can_attach(....., task) need to do real job only when task->mm->owner
> == task. In this modification, you pass a set of task at once.

Before the change, cgroup would migrate multiple tasks all the same
but memcg wouldn't have noticed it unless it opted in explicitly using
[can_]attach_task(). When multiple tasks were moving, [can_]attach()
would only be called with the leader, whether or not the leader was
actually changing cgroups. The interface sucked and it wasn't properly
documented but that's what was happening. The interface change is just
making the breakage obvious - +1 for the new interface. :)

> So, mem_cgroup_can_attach() need to check _all_ tasks in tset rather than a
> first task in tset. please scan and find mm->owner.
>
> Anyway, if you merge this onto mm-tree first, I think I can have time to
> write a fix up if complicated.

As for specific merging order, it shouldn't matter all that much but
if you wanna backport fixes for -stable, maybe it would make more
sense to sequence the fix before this change.

Thank you.

-- 
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
       [not found]   ` <1314138000-2049-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2011-08-25  8:51     ` Paul Menage
  2011-08-25  9:42     ` Paul Menage
  1 sibling, 0 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  8:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: rjw-KKrjLPT3xs0,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> cgroup_attach_proc() behaves differently from cgroup_attach_task() in
> the following aspects.
>
> * All hooks are invoked even if no task is actually being moved.
>
> * ->can_attach_task() is called for all tasks in the group whether the
>  new cgrp is different from the current cgrp or not; however,
>  ->attach_task() is skipped if new equals old.  This makes the calls
>  asymmetric.
>
> This patch improves old cgroup handling in cgroup_attach_proc() by
> looking up the current cgroup at the head, recording it in the flex
> array along with the task itself, and using it to remove the above two
> differences.  This will also ease further changes.

While I'm all in favour of making things more consistent, do we need
such a big change?

In particular, making the group flex-array entries contain both a task
and a cgroup appears to be only necessary in order to skip tasks where
new_cgroup == old_cgroup. Can't we get the same effect by simply
leaving all such tasks out of the flex-array in the first place?

Paul

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach()
       [not found]   ` <1314138000-2049-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2011-08-25  8:53     ` Paul Menage
  0 siblings, 0 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  8:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: rjw-KKrjLPT3xs0,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> ->pre_attach() is supposed to be called before migration, which is
> observed during process migration but task migration does it the other
> way around.  The only ->pre_attach() user is cpuset which can do the
> same operations in ->can_attach().  Collapse cpuset_pre_attach() into
> cpuset_can_attach().
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Acked-by: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>

Code looks good, but I think that some of the Documentation
changes slipped in here by mistake.

Paul

> Cc: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
> Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> ---
>  Documentation/cgroups/cgroups.txt |   20 --------------------
>  kernel/cpuset.c                   |   29 ++++++++++++-----------------
>  2 files changed, 12 insertions(+), 37 deletions(-)
>
> diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
> index 2eee7cf..afb7cde 100644
> --- a/Documentation/cgroups/cgroups.txt
> +++ b/Documentation/cgroups/cgroups.txt
> @@ -610,13 +610,6 @@ called on a fork. If this method returns 0 (success) then this should
>  remain valid while the caller holds cgroup_mutex and it is ensured
>  that either attach() or cancel_attach() will be called in future.
>
> -int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
> -(cgroup_mutex held by caller)
> -
> -As can_attach, but for operations that must be run once per task to be
> -attached (possibly many when using cgroup_attach_proc). Called after
> -can_attach.
> -
>  void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                   struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
> @@ -627,12 +620,6 @@ function, so that the subsystem can implement a rollback. If not, not necessary.
>  This will be called only about subsystems whose can_attach() operation have
>  succeeded. The parameters are identical to can_attach().
>
> -void pre_attach(struct cgroup *cgrp);
> -(cgroup_mutex held by caller)
> -
> -For any non-per-thread attachment work that needs to happen before
> -attach_task. Needed by cpuset.
> -
>  void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>            struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
> @@ -641,13 +628,6 @@ Called after the task has been attached to the cgroup, to allow any
>  post-attachment activity that requires memory allocations or blocking.
>  The parameters are identical to can_attach().
>
> -void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
> -(cgroup_mutex held by caller)
> -
> -As attach, but for operations that must be run once per task to be attached,
> -like can_attach_task. Called before attach. Currently does not support any
> -subsystem that might need the old_cgrp for every thread in the group.
> -
>  void fork(struct cgroup_subsy *ss, struct task_struct *task)
>
>  Called when a task is forked into a cgroup.
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 472ddd6..f0b8df3 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1367,6 +1367,15 @@ static int fmeter_getrate(struct fmeter *fmp)
>        return val;
>  }
>
> +/*
> + * Protected by cgroup_lock. The nodemasks must be stored globally because
> + * dynamically allocating them is not allowed in can_attach, and they must
> + * persist until attach.
> + */
> +static cpumask_var_t cpus_attach;
> +static nodemask_t cpuset_attach_nodemask_from;
> +static nodemask_t cpuset_attach_nodemask_to;
> +
>  /* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
>  static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                             struct cgroup_taskset *tset)
> @@ -1393,29 +1402,16 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                if ((ret = security_task_setscheduler(task)))
>                        return ret;
>        }
> -       return 0;
> -}
> -
> -/*
> - * Protected by cgroup_lock. The nodemasks must be stored globally because
> - * dynamically allocating them is not allowed in pre_attach, and they must
> - * persist among pre_attach, and attach.
> - */
> -static cpumask_var_t cpus_attach;
> -static nodemask_t cpuset_attach_nodemask_from;
> -static nodemask_t cpuset_attach_nodemask_to;
> -
> -/* Set-up work for before attaching each task. */
> -static void cpuset_pre_attach(struct cgroup *cont)
> -{
> -       struct cpuset *cs = cgroup_cs(cont);
>
> +       /* prepare for attach */
>        if (cs == &top_cpuset)
>                cpumask_copy(cpus_attach, cpu_possible_mask);
>        else
>                guarantee_online_cpus(cs, cpus_attach);
>
>        guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
> +
> +       return 0;
>  }
>
>  static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> @@ -1901,7 +1897,6 @@ struct cgroup_subsys cpuset_subsys = {
>        .create = cpuset_create,
>        .destroy = cpuset_destroy,
>        .can_attach = cpuset_can_attach,
> -       .pre_attach = cpuset_pre_attach,
>        .attach = cpuset_attach,
>        .populate = cpuset_populate,
>        .post_clone = cpuset_post_clone,
> --
> 1.7.6
>
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach()
  2011-08-23 22:19 ` [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
@ 2011-08-25  8:53   ` Paul Menage
  2011-08-25  9:06     ` Tejun Heo
                       ` (2 more replies)
  2011-08-25  8:53   ` Paul Menage
       [not found]   ` <1314138000-2049-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2 siblings, 3 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  8:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: rjw, lizf, linux-pm, linux-kernel, containers

On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj@kernel.org> wrote:
> ->pre_attach() is supposed to be called before migration, which is
> observed during process migration but task migration does it the other
> way around.  The only ->pre_attach() user is cpuset which can do the
> same operations in ->can_attach().  Collapse cpuset_pre_attach() into
> cpuset_can_attach().
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Acked-by: Paul Menage <paul@paulmenage.org>

Code looks good, but I think that some of the Documentation
changes slipped in here by mistake.

Paul

> Cc: Paul Menage <paul@paulmenage.org>
> Cc: Li Zefan <lizf@cn.fujitsu.com>
> ---
>  Documentation/cgroups/cgroups.txt |   20 --------------------
>  kernel/cpuset.c                   |   29 ++++++++++++-----------------
>  2 files changed, 12 insertions(+), 37 deletions(-)
>
> diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
> index 2eee7cf..afb7cde 100644
> --- a/Documentation/cgroups/cgroups.txt
> +++ b/Documentation/cgroups/cgroups.txt
> @@ -610,13 +610,6 @@ called on a fork. If this method returns 0 (success) then this should
>  remain valid while the caller holds cgroup_mutex and it is ensured
>  that either attach() or cancel_attach() will be called in future.
>
> -int can_attach_task(struct cgroup *cgrp, struct task_struct *tsk);
> -(cgroup_mutex held by caller)
> -
> -As can_attach, but for operations that must be run once per task to be
> -attached (possibly many when using cgroup_attach_proc). Called after
> -can_attach.
> -
>  void cancel_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                   struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
> @@ -627,12 +620,6 @@ function, so that the subsystem can implement a rollback. If not, not necessary.
>  This will be called only about subsystems whose can_attach() operation have
>  succeeded. The parameters are identical to can_attach().
>
> -void pre_attach(struct cgroup *cgrp);
> -(cgroup_mutex held by caller)
> -
> -For any non-per-thread attachment work that needs to happen before
> -attach_task. Needed by cpuset.
> -
>  void attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>            struct cgroup_taskset *tset)
>  (cgroup_mutex held by caller)
> @@ -641,13 +628,6 @@ Called after the task has been attached to the cgroup, to allow any
>  post-attachment activity that requires memory allocations or blocking.
>  The parameters are identical to can_attach().
>
> -void attach_task(struct cgroup *cgrp, struct task_struct *tsk);
> -(cgroup_mutex held by caller)
> -
> -As attach, but for operations that must be run once per task to be attached,
> -like can_attach_task. Called before attach. Currently does not support any
> -subsystem that might need the old_cgrp for every thread in the group.
> -
>  void fork(struct cgroup_subsy *ss, struct task_struct *task)
>
>  Called when a task is forked into a cgroup.
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 472ddd6..f0b8df3 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1367,6 +1367,15 @@ static int fmeter_getrate(struct fmeter *fmp)
>        return val;
>  }
>
> +/*
> + * Protected by cgroup_lock. The nodemasks must be stored globally because
> + * dynamically allocating them is not allowed in can_attach, and they must
> + * persist until attach.
> + */
> +static cpumask_var_t cpus_attach;
> +static nodemask_t cpuset_attach_nodemask_from;
> +static nodemask_t cpuset_attach_nodemask_to;
> +
>  /* Called by cgroups to determine if a cpuset is usable; cgroup_mutex held */
>  static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                             struct cgroup_taskset *tset)
> @@ -1393,29 +1402,16 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                if ((ret = security_task_setscheduler(task)))
>                        return ret;
>        }
> -       return 0;
> -}
> -
> -/*
> - * Protected by cgroup_lock. The nodemasks must be stored globally because
> - * dynamically allocating them is not allowed in pre_attach, and they must
> - * persist among pre_attach, and attach.
> - */
> -static cpumask_var_t cpus_attach;
> -static nodemask_t cpuset_attach_nodemask_from;
> -static nodemask_t cpuset_attach_nodemask_to;
> -
> -/* Set-up work for before attaching each task. */
> -static void cpuset_pre_attach(struct cgroup *cont)
> -{
> -       struct cpuset *cs = cgroup_cs(cont);
>
> +       /* prepare for attach */
>        if (cs == &top_cpuset)
>                cpumask_copy(cpus_attach, cpu_possible_mask);
>        else
>                guarantee_online_cpus(cs, cpus_attach);
>
>        guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
> +
> +       return 0;
>  }
>
>  static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> @@ -1901,7 +1897,6 @@ struct cgroup_subsys cpuset_subsys = {
>        .create = cpuset_create,
>        .destroy = cpuset_destroy,
>        .can_attach = cpuset_can_attach,
> -       .pre_attach = cpuset_pre_attach,
>        .attach = cpuset_attach,
>        .populate = cpuset_populate,
>        .post_clone = cpuset_post_clone,
> --
> 1.7.6
>
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
       [not found]     ` <CALdu-PAj1ZUmB2ixxA6yeppB8MerBGk1cSeQadobH0H4cRSe7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-08-25  9:03       ` Tejun Heo
  0 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-25  9:03 UTC (permalink / raw)
  To: Paul Menage
  Cc: rjw-KKrjLPT3xs0,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hello,

On Thu, Aug 25, 2011 at 01:51:39AM -0700, Paul Menage wrote:
> In particular, making the group flex-array entries contain both a task
> and a cgroup appears to be only necessary in order to skip tasks where
> new_cgroup == old_cgroup. Can't we get the same effect by simply
> leaving all such tasks out of the flex-array in the first place?

In general, the interface *should* give full information to subsys
methods at each stage, including both the old cgroup each task is
migrating from and the new cgroup; otherwise, subsystems soon end up
developing weird acrobatics to work around shortcomings in the
interface, or are outright buggy.  So let's please look past the fixes
which are necessary immediately and think about what a proper
interface should look like.

I mean, seriously, why did ->attach_task() take @new_cgroup when it's
called after migration happened while ->attach() had access to the old
cgroup of the last iterated task in the group?  What the hell does
that even mean?

And, why is this a big change?  The big part is change of interface
but that we need to do anyway.  This one is just adding an entry to
the flex array.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach()
       [not found]     ` <CALdu-PD5EbFJBRHf-iehPwb6vyJTYUTWZniihARFDZ7xRZ8_nQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-08-25  9:06       ` Tejun Heo
  0 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-25  9:06 UTC (permalink / raw)
  To: Paul Menage
  Cc: rjw-KKrjLPT3xs0,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Thu, Aug 25, 2011 at 01:53:57AM -0700, Paul Menage wrote:
> On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> > ->pre_attach() is supposed to be called before migration, which is
> > observed during process migration but task migration does it the other
> > way around.  The only ->pre_attach() user is cpuset which can do the
> > same operations in ->can_attach().  Collapse cpuset_pre_attach() into
> > cpuset_can_attach().
> >
> > Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> 
> Acked-by: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
> 
> Code looks good, but I think that some of the Documentation
> changes slipped in here by mistake.

Oops, indeed.  Will relocate them.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task()
       [not found]   ` <1314138000-2049-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2011-08-24  1:57     ` Matt Helsley
@ 2011-08-25  9:07     ` Paul Menage
  1 sibling, 0 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  9:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Ingo Molnar, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> Now that subsys->can_attach() and attach() take @tset instead of
> @task, they can handle per-task operations.  Convert
> ->can_attach_task() and ->attach_task() users to use ->can_attach()
> and attach() instead.  Most converions are straight-forward.
> Noteworthy changes are,
>
> * In cgroup_freezer, remove unnecessary NULL assignments to unused
>  methods.  It's useless and very prone to get out of sync, which
>  already happened.
>
> * In cpuset, PF_THREAD_BOUND test is checked for each task.  This
>  doesn't make any practical difference but is conceptually cleaner.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
> Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Daisuke Nishimura <nishimura-YQH0OdQVrdy45+QrQBaojngSJqDPrsil@public.gmane.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>
> Cc: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> ---
>  block/blk-cgroup.c      |   45 +++++++++++++++++++-----------
>  kernel/cgroup_freezer.c |   14 +++-------
>  kernel/cpuset.c         |   70 +++++++++++++++++++++-------------------------
>  kernel/events/core.c    |   13 +++++---
>  kernel/sched.c          |   31 +++++++++++++--------
>  5 files changed, 91 insertions(+), 82 deletions(-)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index bcaf16e..99e0bd4 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -30,8 +30,10 @@ EXPORT_SYMBOL_GPL(blkio_root_cgroup);
>
>  static struct cgroup_subsys_state *blkiocg_create(struct cgroup_subsys *,
>                                                  struct cgroup *);
> -static int blkiocg_can_attach_task(struct cgroup *, struct task_struct *);
> -static void blkiocg_attach_task(struct cgroup *, struct task_struct *);
> +static int blkiocg_can_attach(struct cgroup_subsys *, struct cgroup *,
> +                             struct cgroup_taskset *);
> +static void blkiocg_attach(struct cgroup_subsys *, struct cgroup *,
> +                          struct cgroup_taskset *);
>  static void blkiocg_destroy(struct cgroup_subsys *, struct cgroup *);
>  static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
>
> @@ -44,8 +46,8 @@ static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
>  struct cgroup_subsys blkio_subsys = {
>        .name = "blkio",
>        .create = blkiocg_create,
> -       .can_attach_task = blkiocg_can_attach_task,
> -       .attach_task = blkiocg_attach_task,
> +       .can_attach = blkiocg_can_attach,
> +       .attach = blkiocg_attach,
>        .destroy = blkiocg_destroy,
>        .populate = blkiocg_populate,
>  #ifdef CONFIG_BLK_CGROUP
> @@ -1614,30 +1616,39 @@ done:
>  * of the main cic data structures.  For now we allow a task to change
>  * its cgroup only if it's the only owner of its ioc.
>  */
> -static int blkiocg_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static int blkiocg_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                             struct cgroup_taskset *tset)
>  {
> +       struct task_struct *task;
>        struct io_context *ioc;
>        int ret = 0;
>
>        /* task_lock() is needed to avoid races with exit_io_context() */
> -       task_lock(tsk);
> -       ioc = tsk->io_context;
> -       if (ioc && atomic_read(&ioc->nr_tasks) > 1)
> -               ret = -EINVAL;
> -       task_unlock(tsk);
> -
> +       cgroup_taskset_for_each(task, cgrp, tset) {
> +               task_lock(task);
> +               ioc = task->io_context;
> +               if (ioc && atomic_read(&ioc->nr_tasks) > 1)
> +                       ret = -EINVAL;
> +               task_unlock(task);
> +               if (ret)
> +                       break;
> +       }

Doesn't the other part of this patchset, which avoids calling the
*attach() methods for tasks that aren't moving, eliminate the need
for skip_cgrp here (and elsewhere)?  When do we actually need to pass
a non-NULL skip_cgrp to cgroup_taskset_for_each()?

Paul

>        return ret;
>  }
>
> -static void blkiocg_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static void blkiocg_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                          struct cgroup_taskset *tset)
>  {
> +       struct task_struct *task;
>        struct io_context *ioc;
>
> -       task_lock(tsk);
> -       ioc = tsk->io_context;
> -       if (ioc)
> -               ioc->cgroup_changed = 1;
> -       task_unlock(tsk);
> +       cgroup_taskset_for_each(task, cgrp, tset) {
> +               task_lock(task);
> +               ioc = task->io_context;
> +               if (ioc)
> +                       ioc->cgroup_changed = 1;
> +               task_unlock(task);
> +       }
>  }
>
>  void blkio_policy_register(struct blkio_policy_type *blkiop)
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index a2b0082..2cb5e72 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -162,10 +162,14 @@ static int freezer_can_attach(struct cgroup_subsys *ss,
>                              struct cgroup_taskset *tset)
>  {
>        struct freezer *freezer;
> +       struct task_struct *task;
>
>        /*
>         * Anything frozen can't move or be moved to/from.
>         */
> +       cgroup_taskset_for_each(task, new_cgroup, tset)
> +               if (cgroup_freezing(task))
> +                       return -EBUSY;
>
>        freezer = cgroup_freezer(new_cgroup);
>        if (freezer->state != CGROUP_THAWED)
> @@ -174,11 +178,6 @@ static int freezer_can_attach(struct cgroup_subsys *ss,
>        return 0;
>  }
>
> -static int freezer_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> -{
> -       return cgroup_freezing(tsk) ? -EBUSY : 0;
> -}
> -
>  static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
>  {
>        struct freezer *freezer;
> @@ -374,10 +373,5 @@ struct cgroup_subsys freezer_subsys = {
>        .populate       = freezer_populate,
>        .subsys_id      = freezer_subsys_id,
>        .can_attach     = freezer_can_attach,
> -       .can_attach_task = freezer_can_attach_task,
> -       .pre_attach     = NULL,
> -       .attach_task    = NULL,
> -       .attach         = NULL,
>        .fork           = freezer_fork,
> -       .exit           = NULL,
>  };
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 2e5825b..472ddd6 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1372,33 +1372,34 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                             struct cgroup_taskset *tset)
>  {
>        struct cpuset *cs = cgroup_cs(cgrp);
> +       struct task_struct *task;
> +       int ret;
>
>        if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
>                return -ENOSPC;
>
> -       /*
> -        * Kthreads bound to specific cpus cannot be moved to a new cpuset; we
> -        * cannot change their cpu affinity and isolating such threads by their
> -        * set of allowed nodes is unnecessary.  Thus, cpusets are not
> -        * applicable for such threads.  This prevents checking for success of
> -        * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
> -        * be changed.
> -        */
> -       if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
> -               return -EINVAL;
> -
> +       cgroup_taskset_for_each(task, cgrp, tset) {
> +               /*
> +                * Kthreads bound to specific cpus cannot be moved to a new
> +                * cpuset; we cannot change their cpu affinity and
> +                * isolating such threads by their set of allowed nodes is
> +                * unnecessary.  Thus, cpusets are not applicable for such
> +                * threads.  This prevents checking for success of
> +                * set_cpus_allowed_ptr() on all attached tasks before
> +                * cpus_allowed may be changed.
> +                */
> +               if (task->flags & PF_THREAD_BOUND)
> +                       return -EINVAL;
> +               if ((ret = security_task_setscheduler(task)))
> +                       return ret;
> +       }
>        return 0;
>  }
>
> -static int cpuset_can_attach_task(struct cgroup *cgrp, struct task_struct *task)
> -{
> -       return security_task_setscheduler(task);
> -}
> -
>  /*
>  * Protected by cgroup_lock. The nodemasks must be stored globally because
>  * dynamically allocating them is not allowed in pre_attach, and they must
> - * persist among pre_attach, attach_task, and attach.
> + * persist among pre_attach, and attach.
>  */
>  static cpumask_var_t cpus_attach;
>  static nodemask_t cpuset_attach_nodemask_from;
> @@ -1417,39 +1418,34 @@ static void cpuset_pre_attach(struct cgroup *cont)
>        guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
>  }
>
> -/* Per-thread attachment work. */
> -static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
> -{
> -       int err;
> -       struct cpuset *cs = cgroup_cs(cont);
> -
> -       /*
> -        * can_attach beforehand should guarantee that this doesn't fail.
> -        * TODO: have a better way to handle failure here
> -        */
> -       err = set_cpus_allowed_ptr(tsk, cpus_attach);
> -       WARN_ON_ONCE(err);
> -
> -       cpuset_change_task_nodemask(tsk, &cpuset_attach_nodemask_to);
> -       cpuset_update_task_spread_flag(cs, tsk);
> -}
> -
>  static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                          struct cgroup_taskset *tset)
>  {
>        struct mm_struct *mm;
> -       struct task_struct *tsk = cgroup_taskset_first(tset);
> +       struct task_struct *task;
> +       struct task_struct *leader = cgroup_taskset_first(tset);
>        struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
>        struct cpuset *cs = cgroup_cs(cgrp);
>        struct cpuset *oldcs = cgroup_cs(oldcgrp);
>
> +       cgroup_taskset_for_each(task, cgrp, tset) {
> +               /*
> +                * can_attach beforehand should guarantee that this doesn't
> +                * fail.  TODO: have a better way to handle failure here
> +                */
> +               WARN_ON_ONCE(set_cpus_allowed_ptr(task, cpus_attach));
> +
> +               cpuset_change_task_nodemask(task, &cpuset_attach_nodemask_to);
> +               cpuset_update_task_spread_flag(cs, task);
> +       }
> +
>        /*
>         * Change mm, possibly for multiple threads in a threadgroup. This is
>         * expensive and may sleep.
>         */
>        cpuset_attach_nodemask_from = oldcs->mems_allowed;
>        cpuset_attach_nodemask_to = cs->mems_allowed;
> -       mm = get_task_mm(tsk);
> +       mm = get_task_mm(leader);
>        if (mm) {
>                mpol_rebind_mm(mm, &cpuset_attach_nodemask_to);
>                if (is_memory_migrate(cs))
> @@ -1905,9 +1901,7 @@ struct cgroup_subsys cpuset_subsys = {
>        .create = cpuset_create,
>        .destroy = cpuset_destroy,
>        .can_attach = cpuset_can_attach,
> -       .can_attach_task = cpuset_can_attach_task,
>        .pre_attach = cpuset_pre_attach,
> -       .attach_task = cpuset_attach_task,
>        .attach = cpuset_attach,
>        .populate = cpuset_populate,
>        .post_clone = cpuset_post_clone,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index b8785e2..95e189d 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7000,10 +7000,13 @@ static int __perf_cgroup_move(void *info)
>        return 0;
>  }
>
> -static void
> -perf_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *task)
> +static void perf_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                              struct cgroup_taskset *tset)
>  {
> -       task_function_call(task, __perf_cgroup_move, task);
> +       struct task_struct *task;
> +
> +       cgroup_taskset_for_each(task, cgrp, tset)
> +               task_function_call(task, __perf_cgroup_move, task);
>  }
>
>  static void perf_cgroup_exit(struct cgroup_subsys *ss, struct cgroup *cgrp,
> @@ -7017,7 +7020,7 @@ static void perf_cgroup_exit(struct cgroup_subsys *ss, struct cgroup *cgrp,
>        if (!(task->flags & PF_EXITING))
>                return;
>
> -       perf_cgroup_attach_task(cgrp, task);
> +       task_function_call(task, __perf_cgroup_move, task);
>  }
>
>  struct cgroup_subsys perf_subsys = {
> @@ -7026,6 +7029,6 @@ struct cgroup_subsys perf_subsys = {
>        .create         = perf_cgroup_create,
>        .destroy        = perf_cgroup_destroy,
>        .exit           = perf_cgroup_exit,
> -       .attach_task    = perf_cgroup_attach_task,
> +       .attach         = perf_cgroup_attach,
>  };
>  #endif /* CONFIG_CGROUP_PERF */
> diff --git a/kernel/sched.c b/kernel/sched.c
> index ccacdbd..dd7e460 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -8966,24 +8966,31 @@ cpu_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
>        sched_destroy_group(tg);
>  }
>
> -static int
> -cpu_cgroup_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static int cpu_cgroup_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                                struct cgroup_taskset *tset)
>  {
> +       struct task_struct *task;
> +
> +       cgroup_taskset_for_each(task, cgrp, tset) {
>  #ifdef CONFIG_RT_GROUP_SCHED
> -       if (!sched_rt_can_attach(cgroup_tg(cgrp), tsk))
> -               return -EINVAL;
> +               if (!sched_rt_can_attach(cgroup_tg(cgrp), task))
> +                       return -EINVAL;
>  #else
> -       /* We don't support RT-tasks being in separate groups */
> -       if (tsk->sched_class != &fair_sched_class)
> -               return -EINVAL;
> +               /* We don't support RT-tasks being in separate groups */
> +               if (task->sched_class != &fair_sched_class)
> +                       return -EINVAL;
>  #endif
> +       }
>        return 0;
>  }
>
> -static void
> -cpu_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static void cpu_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                             struct cgroup_taskset *tset)
>  {
> -       sched_move_task(tsk);
> +       struct task_struct *task;
> +
> +       cgroup_taskset_for_each(task, cgrp, tset)
> +               sched_move_task(task);
>  }
>
>  static void
> @@ -9071,8 +9078,8 @@ struct cgroup_subsys cpu_cgroup_subsys = {
>        .name           = "cpu",
>        .create         = cpu_cgroup_create,
>        .destroy        = cpu_cgroup_destroy,
> -       .can_attach_task = cpu_cgroup_can_attach_task,
> -       .attach_task    = cpu_cgroup_attach_task,
> +       .can_attach     = cpu_cgroup_can_attach,
> +       .attach         = cpu_cgroup_attach,
>        .exit           = cpu_cgroup_exit,
>        .populate       = cpu_cgroup_populate,
>        .subsys_id      = cpu_cgroup_subsys_id,
> --
> 1.7.6
>
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task()
  2011-08-23 22:19 ` Tejun Heo
                     ` (3 preceding siblings ...)
  2011-08-25  9:07   ` Paul Menage
@ 2011-08-25  9:07   ` Paul Menage
  2011-08-25  9:12     ` Tejun Heo
                       ` (2 more replies)
  4 siblings, 3 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  9:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: rjw, lizf, linux-pm, linux-kernel, containers, Balbir Singh,
	Daisuke Nishimura, KAMEZAWA Hiroyuki, James Morris, Ingo Molnar,
	Peter Zijlstra

On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj@kernel.org> wrote:
> Now that subsys->can_attach() and attach() take @tset instead of
> @task, they can handle per-task operations.  Convert
> ->can_attach_task() and ->attach_task() users to use ->can_attach()
> and attach() instead.  Most conversions are straightforward.
> Noteworthy changes are,
>
> * In cgroup_freezer, remove unnecessary NULL assignments to unused
>  methods.  They're useless and very prone to getting out of sync,
>  which has already happened.
>
> * In cpuset, the PF_THREAD_BOUND test is now applied to each task.  This
>  doesn't make any practical difference but is conceptually cleaner.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Paul Menage <paul@paulmenage.org>
> Cc: Li Zefan <lizf@cn.fujitsu.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: James Morris <jmorris@namei.org>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Peter Zijlstra <peterz@infradead.org>
> ---
>  block/blk-cgroup.c      |   45 +++++++++++++++++++-----------
>  kernel/cgroup_freezer.c |   14 +++-------
>  kernel/cpuset.c         |   70 +++++++++++++++++++++-------------------------
>  kernel/events/core.c    |   13 +++++---
>  kernel/sched.c          |   31 +++++++++++++--------
>  5 files changed, 91 insertions(+), 82 deletions(-)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index bcaf16e..99e0bd4 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -30,8 +30,10 @@ EXPORT_SYMBOL_GPL(blkio_root_cgroup);
>
>  static struct cgroup_subsys_state *blkiocg_create(struct cgroup_subsys *,
>                                                  struct cgroup *);
> -static int blkiocg_can_attach_task(struct cgroup *, struct task_struct *);
> -static void blkiocg_attach_task(struct cgroup *, struct task_struct *);
> +static int blkiocg_can_attach(struct cgroup_subsys *, struct cgroup *,
> +                             struct cgroup_taskset *);
> +static void blkiocg_attach(struct cgroup_subsys *, struct cgroup *,
> +                          struct cgroup_taskset *);
>  static void blkiocg_destroy(struct cgroup_subsys *, struct cgroup *);
>  static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
>
> @@ -44,8 +46,8 @@ static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
>  struct cgroup_subsys blkio_subsys = {
>        .name = "blkio",
>        .create = blkiocg_create,
> -       .can_attach_task = blkiocg_can_attach_task,
> -       .attach_task = blkiocg_attach_task,
> +       .can_attach = blkiocg_can_attach,
> +       .attach = blkiocg_attach,
>        .destroy = blkiocg_destroy,
>        .populate = blkiocg_populate,
>  #ifdef CONFIG_BLK_CGROUP
> @@ -1614,30 +1616,39 @@ done:
>  * of the main cic data structures.  For now we allow a task to change
>  * its cgroup only if it's the only owner of its ioc.
>  */
> -static int blkiocg_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static int blkiocg_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                             struct cgroup_taskset *tset)
>  {
> +       struct task_struct *task;
>        struct io_context *ioc;
>        int ret = 0;
>
>        /* task_lock() is needed to avoid races with exit_io_context() */
> -       task_lock(tsk);
> -       ioc = tsk->io_context;
> -       if (ioc && atomic_read(&ioc->nr_tasks) > 1)
> -               ret = -EINVAL;
> -       task_unlock(tsk);
> -
> +       cgroup_taskset_for_each(task, cgrp, tset) {
> +               task_lock(task);
> +               ioc = task->io_context;
> +               if (ioc && atomic_read(&ioc->nr_tasks) > 1)
> +                       ret = -EINVAL;
> +               task_unlock(task);
> +               if (ret)
> +                       break;
> +       }

Doesn't the other part of this patchset, which avoids calling the
*attach() methods for tasks that aren't moving, eliminate the need for
skip_cgrp here (and elsewhere)? When do we actually need to pass a
non-NULL skip_cgrp to cgroup_taskset_for_each()?

Paul

>        return ret;
>  }
>
> -static void blkiocg_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static void blkiocg_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                          struct cgroup_taskset *tset)
>  {
> +       struct task_struct *task;
>        struct io_context *ioc;
>
> -       task_lock(tsk);
> -       ioc = tsk->io_context;
> -       if (ioc)
> -               ioc->cgroup_changed = 1;
> -       task_unlock(tsk);
> +       cgroup_taskset_for_each(task, cgrp, tset) {
> +               task_lock(task);
> +               ioc = task->io_context;
> +               if (ioc)
> +                       ioc->cgroup_changed = 1;
> +               task_unlock(task);
> +       }
>  }
>
>  void blkio_policy_register(struct blkio_policy_type *blkiop)
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index a2b0082..2cb5e72 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -162,10 +162,14 @@ static int freezer_can_attach(struct cgroup_subsys *ss,
>                              struct cgroup_taskset *tset)
>  {
>        struct freezer *freezer;
> +       struct task_struct *task;
>
>        /*
>         * Anything frozen can't move or be moved to/from.
>         */
> +       cgroup_taskset_for_each(task, new_cgroup, tset)
> +               if (cgroup_freezing(task))
> +                       return -EBUSY;
>
>        freezer = cgroup_freezer(new_cgroup);
>        if (freezer->state != CGROUP_THAWED)
> @@ -174,11 +178,6 @@ static int freezer_can_attach(struct cgroup_subsys *ss,
>        return 0;
>  }
>
> -static int freezer_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> -{
> -       return cgroup_freezing(tsk) ? -EBUSY : 0;
> -}
> -
>  static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
>  {
>        struct freezer *freezer;
> @@ -374,10 +373,5 @@ struct cgroup_subsys freezer_subsys = {
>        .populate       = freezer_populate,
>        .subsys_id      = freezer_subsys_id,
>        .can_attach     = freezer_can_attach,
> -       .can_attach_task = freezer_can_attach_task,
> -       .pre_attach     = NULL,
> -       .attach_task    = NULL,
> -       .attach         = NULL,
>        .fork           = freezer_fork,
> -       .exit           = NULL,
>  };
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 2e5825b..472ddd6 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1372,33 +1372,34 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                             struct cgroup_taskset *tset)
>  {
>        struct cpuset *cs = cgroup_cs(cgrp);
> +       struct task_struct *task;
> +       int ret;
>
>        if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
>                return -ENOSPC;
>
> -       /*
> -        * Kthreads bound to specific cpus cannot be moved to a new cpuset; we
> -        * cannot change their cpu affinity and isolating such threads by their
> -        * set of allowed nodes is unnecessary.  Thus, cpusets are not
> -        * applicable for such threads.  This prevents checking for success of
> -        * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
> -        * be changed.
> -        */
> -       if (cgroup_taskset_first(tset)->flags & PF_THREAD_BOUND)
> -               return -EINVAL;
> -
> +       cgroup_taskset_for_each(task, cgrp, tset) {
> +               /*
> +                * Kthreads bound to specific cpus cannot be moved to a new
> +                * cpuset; we cannot change their cpu affinity and
> +                * isolating such threads by their set of allowed nodes is
> +                * unnecessary.  Thus, cpusets are not applicable for such
> +                * threads.  This prevents checking for success of
> +                * set_cpus_allowed_ptr() on all attached tasks before
> +                * cpus_allowed may be changed.
> +                */
> +               if (task->flags & PF_THREAD_BOUND)
> +                       return -EINVAL;
> +               if ((ret = security_task_setscheduler(task)))
> +                       return ret;
> +       }
>        return 0;
>  }
>
> -static int cpuset_can_attach_task(struct cgroup *cgrp, struct task_struct *task)
> -{
> -       return security_task_setscheduler(task);
> -}
> -
>  /*
>  * Protected by cgroup_lock. The nodemasks must be stored globally because
>  * dynamically allocating them is not allowed in pre_attach, and they must
> - * persist among pre_attach, attach_task, and attach.
> + * persist between pre_attach and attach.
>  */
>  static cpumask_var_t cpus_attach;
>  static nodemask_t cpuset_attach_nodemask_from;
> @@ -1417,39 +1418,34 @@ static void cpuset_pre_attach(struct cgroup *cont)
>        guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
>  }
>
> -/* Per-thread attachment work. */
> -static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
> -{
> -       int err;
> -       struct cpuset *cs = cgroup_cs(cont);
> -
> -       /*
> -        * can_attach beforehand should guarantee that this doesn't fail.
> -        * TODO: have a better way to handle failure here
> -        */
> -       err = set_cpus_allowed_ptr(tsk, cpus_attach);
> -       WARN_ON_ONCE(err);
> -
> -       cpuset_change_task_nodemask(tsk, &cpuset_attach_nodemask_to);
> -       cpuset_update_task_spread_flag(cs, tsk);
> -}
> -
>  static void cpuset_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                          struct cgroup_taskset *tset)
>  {
>        struct mm_struct *mm;
> -       struct task_struct *tsk = cgroup_taskset_first(tset);
> +       struct task_struct *task;
> +       struct task_struct *leader = cgroup_taskset_first(tset);
>        struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
>        struct cpuset *cs = cgroup_cs(cgrp);
>        struct cpuset *oldcs = cgroup_cs(oldcgrp);
>
> +       cgroup_taskset_for_each(task, cgrp, tset) {
> +               /*
> +                * can_attach beforehand should guarantee that this doesn't
> +                * fail.  TODO: have a better way to handle failure here
> +                */
> +               WARN_ON_ONCE(set_cpus_allowed_ptr(task, cpus_attach));
> +
> +               cpuset_change_task_nodemask(task, &cpuset_attach_nodemask_to);
> +               cpuset_update_task_spread_flag(cs, task);
> +       }
> +
>        /*
>         * Change mm, possibly for multiple threads in a threadgroup. This is
>         * expensive and may sleep.
>         */
>        cpuset_attach_nodemask_from = oldcs->mems_allowed;
>        cpuset_attach_nodemask_to = cs->mems_allowed;
> -       mm = get_task_mm(tsk);
> +       mm = get_task_mm(leader);
>        if (mm) {
>                mpol_rebind_mm(mm, &cpuset_attach_nodemask_to);
>                if (is_memory_migrate(cs))
> @@ -1905,9 +1901,7 @@ struct cgroup_subsys cpuset_subsys = {
>        .create = cpuset_create,
>        .destroy = cpuset_destroy,
>        .can_attach = cpuset_can_attach,
> -       .can_attach_task = cpuset_can_attach_task,
>        .pre_attach = cpuset_pre_attach,
> -       .attach_task = cpuset_attach_task,
>        .attach = cpuset_attach,
>        .populate = cpuset_populate,
>        .post_clone = cpuset_post_clone,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index b8785e2..95e189d 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7000,10 +7000,13 @@ static int __perf_cgroup_move(void *info)
>        return 0;
>  }
>
> -static void
> -perf_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *task)
> +static void perf_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                              struct cgroup_taskset *tset)
>  {
> -       task_function_call(task, __perf_cgroup_move, task);
> +       struct task_struct *task;
> +
> +       cgroup_taskset_for_each(task, cgrp, tset)
> +               task_function_call(task, __perf_cgroup_move, task);
>  }
>
>  static void perf_cgroup_exit(struct cgroup_subsys *ss, struct cgroup *cgrp,
> @@ -7017,7 +7020,7 @@ static void perf_cgroup_exit(struct cgroup_subsys *ss, struct cgroup *cgrp,
>        if (!(task->flags & PF_EXITING))
>                return;
>
> -       perf_cgroup_attach_task(cgrp, task);
> +       task_function_call(task, __perf_cgroup_move, task);
>  }
>
>  struct cgroup_subsys perf_subsys = {
> @@ -7026,6 +7029,6 @@ struct cgroup_subsys perf_subsys = {
>        .create         = perf_cgroup_create,
>        .destroy        = perf_cgroup_destroy,
>        .exit           = perf_cgroup_exit,
> -       .attach_task    = perf_cgroup_attach_task,
> +       .attach         = perf_cgroup_attach,
>  };
>  #endif /* CONFIG_CGROUP_PERF */
> diff --git a/kernel/sched.c b/kernel/sched.c
> index ccacdbd..dd7e460 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -8966,24 +8966,31 @@ cpu_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
>        sched_destroy_group(tg);
>  }
>
> -static int
> -cpu_cgroup_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static int cpu_cgroup_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                                struct cgroup_taskset *tset)
>  {
> +       struct task_struct *task;
> +
> +       cgroup_taskset_for_each(task, cgrp, tset) {
>  #ifdef CONFIG_RT_GROUP_SCHED
> -       if (!sched_rt_can_attach(cgroup_tg(cgrp), tsk))
> -               return -EINVAL;
> +               if (!sched_rt_can_attach(cgroup_tg(cgrp), task))
> +                       return -EINVAL;
>  #else
> -       /* We don't support RT-tasks being in separate groups */
> -       if (tsk->sched_class != &fair_sched_class)
> -               return -EINVAL;
> +               /* We don't support RT-tasks being in separate groups */
> +               if (task->sched_class != &fair_sched_class)
> +                       return -EINVAL;
>  #endif
> +       }
>        return 0;
>  }
>
> -static void
> -cpu_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static void cpu_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                             struct cgroup_taskset *tset)
>  {
> -       sched_move_task(tsk);
> +       struct task_struct *task;
> +
> +       cgroup_taskset_for_each(task, cgrp, tset)
> +               sched_move_task(task);
>  }
>
>  static void
> @@ -9071,8 +9078,8 @@ struct cgroup_subsys cpu_cgroup_subsys = {
>        .name           = "cpu",
>        .create         = cpu_cgroup_create,
>        .destroy        = cpu_cgroup_destroy,
> -       .can_attach_task = cpu_cgroup_can_attach_task,
> -       .attach_task    = cpu_cgroup_attach_task,
> +       .can_attach     = cpu_cgroup_can_attach,
> +       .attach         = cpu_cgroup_attach,
>        .exit           = cpu_cgroup_exit,
>        .populate       = cpu_cgroup_populate,
>        .subsys_id      = cpu_cgroup_subsys_id,
> --
> 1.7.6
>
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

>        if (!(task->flags & PF_EXITING))
>                return;
>
> -       perf_cgroup_attach_task(cgrp, task);
> +       task_function_call(task, __perf_cgroup_move, task);
>  }
>
>  struct cgroup_subsys perf_subsys = {
> @@ -7026,6 +7029,6 @@ struct cgroup_subsys perf_subsys = {
>        .create         = perf_cgroup_create,
>        .destroy        = perf_cgroup_destroy,
>        .exit           = perf_cgroup_exit,
> -       .attach_task    = perf_cgroup_attach_task,
> +       .attach         = perf_cgroup_attach,
>  };
>  #endif /* CONFIG_CGROUP_PERF */
> diff --git a/kernel/sched.c b/kernel/sched.c
> index ccacdbd..dd7e460 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -8966,24 +8966,31 @@ cpu_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
>        sched_destroy_group(tg);
>  }
>
> -static int
> -cpu_cgroup_can_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static int cpu_cgroup_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                                struct cgroup_taskset *tset)
>  {
> +       struct task_struct *task;
> +
> +       cgroup_taskset_for_each(task, cgrp, tset) {
>  #ifdef CONFIG_RT_GROUP_SCHED
> -       if (!sched_rt_can_attach(cgroup_tg(cgrp), tsk))
> -               return -EINVAL;
> +               if (!sched_rt_can_attach(cgroup_tg(cgrp), task))
> +                       return -EINVAL;
>  #else
> -       /* We don't support RT-tasks being in separate groups */
> -       if (tsk->sched_class != &fair_sched_class)
> -               return -EINVAL;
> +               /* We don't support RT-tasks being in separate groups */
> +               if (task->sched_class != &fair_sched_class)
> +                       return -EINVAL;
>  #endif
> +       }
>        return 0;
>  }
>
> -static void
> -cpu_cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +static void cpu_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
> +                             struct cgroup_taskset *tset)
>  {
> -       sched_move_task(tsk);
> +       struct task_struct *task;
> +
> +       cgroup_taskset_for_each(task, cgrp, tset)
> +               sched_move_task(task);
>  }
>
>  static void
> @@ -9071,8 +9078,8 @@ struct cgroup_subsys cpu_cgroup_subsys = {
>        .name           = "cpu",
>        .create         = cpu_cgroup_create,
>        .destroy        = cpu_cgroup_destroy,
> -       .can_attach_task = cpu_cgroup_can_attach_task,
> -       .attach_task    = cpu_cgroup_attach_task,
> +       .can_attach     = cpu_cgroup_can_attach,
> +       .attach         = cpu_cgroup_attach,
>        .exit           = cpu_cgroup_exit,
>        .populate       = cpu_cgroup_populate,
>        .subsys_id      = cpu_cgroup_subsys_id,
> --
> 1.7.6
>
>

^ permalink raw reply	[flat|nested] 100+ messages in thread
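
The conversions in the patch quoted above all follow one pattern: the per-task ->can_attach_task()/->attach_task() callbacks fold into a single ->can_attach()/->attach() that loops over the taskset itself. A minimal userspace sketch of that pattern (all names are illustrative, not the real kernel API):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the conversion: per-task callbacks become one loop
 * over the taskset inside the per-process method. */
struct task { int is_rt; int moved; };
struct taskset { struct task **tasks; size_t n; };

/* Was cpu_cgroup_can_attach_task(); now a single loop that can
 * reject the whole thread group on the first ineligible task. */
static int can_attach(struct taskset *tset)
{
	size_t i;

	for (i = 0; i < tset->n; i++)
		if (tset->tasks[i]->is_rt)
			return -22;	/* -EINVAL */
	return 0;
}

/* Was cpu_cgroup_attach_task(); per-task work now lives in ->attach(). */
static void attach(struct taskset *tset)
{
	size_t i;

	for (i = 0; i < tset->n; i++)
		tset->tasks[i]->moved = 1;	/* stands in for sched_move_task() */
}
```

The design point is that the subsystem, not the cgroup core, drives the iteration, so both methods see the full set of tasks at once.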

* Re: [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task()
       [not found]     ` <CALdu-PCc2RzedXubReF9huamL6W+5qGCfXNNvqS2yUk3QTHRng-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-08-25  9:12       ` Tejun Heo
  0 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-25  9:12 UTC (permalink / raw)
  To: Paul Menage
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Ingo Molnar, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Hello, Paul.

On Thu, Aug 25, 2011 at 02:07:35AM -0700, Paul Menage wrote:
> Doesn't the other part of this patch set, that avoids calling the
> *attach() methods for tasks that aren't moving, eliminate the need for
> the usage of skip_cgrp here (and elsewhere)? When do we actually need
> to pass a non-NULL skip_cgrp to cgroup_taskset_for_each()?

If any task is moving, ->*attach() should be called.  Whether the @tset
passed in should contain tasks which aren't changing cgroups is
debatable.  The operation is guaranteed to be for an entire thread
group and it makes sense to make at least the leader always available
even if it's not moving.  Given that the operation is defined to be
per-thread-group, I think it's better to pass in the whole thread
group with an easy way to skip the ones which aren't moving.  For
example, memcg seems to need to find the mm->owner and it's possible
that the mm->owner might not be changing cgroups.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread
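
The skip_cgrp semantics described above can be sketched in plain C. This is a userspace toy, not the kernel API; the struct and function names are illustrative, with a negative skip_cgrp standing in for passing NULL:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of cgroup_taskset_for_each() with a skip cgroup: the set
 * records each task's current cgroup, and a non-negative skip_cgrp
 * skips tasks that aren't actually changing cgroups. */
struct tc { const char *name; int cur_cgrp; };

static int count_visited(const struct tc *set, size_t n, int skip_cgrp)
{
	size_t i;
	int visited = 0;

	for (i = 0; i < n; i++) {
		if (skip_cgrp >= 0 && set[i].cur_cgrp == skip_cgrp)
			continue;	/* already in the target cgroup */
		visited++;		/* per-task work would go here */
	}
	return visited;
}
```

With skip_cgrp set to the destination cgroup only the actually-moving tasks are visited; with it unset the whole thread group, leader included, is visited.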

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found]   ` <1314138000-2049-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2011-08-25  0:39     ` KAMEZAWA Hiroyuki
@ 2011-08-25  9:14     ` Paul Menage
  2011-08-25  9:32     ` Paul Menage
  2 siblings, 0 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  9:14 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> Currently, there's no way to pass multiple tasks to cgroup_subsys
> methods, necessitating separate per-process and per-task
> methods.  This patch introduces cgroup_taskset which can be used to
> pass multiple tasks and their associated cgroups to cgroup_subsys
> methods.
>
> Three methods - can_attach(), cancel_attach() and attach() - are
> converted to use cgroup_taskset.  This unifies passed parameters so
> that all methods have access to all information.  The conversions in
> this patchset are mechanical and don't introduce any behavior change.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

The general idea of passing consistent information to all *attach
methods seems good, but isn't it simpler to just fix up the various
method signatures?

The whole point of having *attach() and *attach_task() was to minimize
the amount of boilerplate (in this case, iterating across a new
cgroup_taskset abstraction) in the subsystems, leaving that to the
cgroups framework.

Paul

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found]     ` <CALdu-PDAgqeRJt5vqTB9wddwz70Yn+Jf-Pb0dDKDBD_q37tHQg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-08-25  9:20       ` Tejun Heo
  0 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-25  9:20 UTC (permalink / raw)
  To: Paul Menage
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Hello,

On Thu, Aug 25, 2011 at 02:14:12AM -0700, Paul Menage wrote:
> The general idea of passing consistent information to all *attach
> methods seems good, but isn't it simpler to just fix up the various
> method signatures?

I think having separate ->attach() and ->attach_task() is inherently
broken.  Look at the memcg discussion I had in this thread for
reference: as soon as we need to do something across the tasks
being migrated, iteration-by-callback becomes very painful.
e.g. let's say memcg wants to find the mm->owner and wants to print
a warning or fail if that doesn't work out.  How would that be
implemented if it's iterating by callback?

> The whole point of having *attach() and *attach_task() was to minimize
> the amount of boilerplate (in this case, iterating across a new
> cgroup_taskset abstraction) in the subsystems, leaving that to the
> cgroups framework.

Yeah, I agree with making things easier for subsystems but I violently
disagree that iteration-by-callback is helpful in any way.  If
control-loop style iterator is at all possible, it's almost always
better to go that way.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 100+ messages in thread
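
The mm->owner example above illustrates why a control-loop iterator beats iteration-by-callback: cross-task logic with early exit and a whole-group failure code is just a loop. A hedged userspace sketch (the struct, function, and error values are illustrative, not memcg's actual code):

```c
#include <assert.h>
#include <stddef.h>

/* "Find the mm->owner among the migrating tasks, fail the attach if
 * it isn't there" as a control loop.  With per-task callbacks the
 * subsystem would have to carry this state across invocations. */
struct task { int pid; };

static int find_owner(const struct task *set, size_t n, int owner_pid,
		      const struct task **owner)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (set[i].pid == owner_pid) {
			*owner = &set[i];
			return 0;	/* found: stop iterating */
		}
	}
	*owner = NULL;
	return -3;			/* -ESRCH: fail the whole attach */
}
```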

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found]       ` <20110825092045.GG3286-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
@ 2011-08-25  9:32         ` Paul Menage
  0 siblings, 0 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  9:32 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Thu, Aug 25, 2011 at 2:20 AM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>
> I think having separate ->attach() and ->attach_task() is inherently
> broken.  Look at the memcg discussion I had in this thread for
> reference and as soon as we need to do something across the tasks
> being migrated, iteration-by-callback becomes very painful.
> e.g. let's say memcg wants to find the mm->owner and wants to print
> warning or fail if that doesn't work out.  How would that be
> implemented if it's iterating by callback.

OK, fair point. See my other email for patch comments.

Paul

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
       [not found]   ` <1314138000-2049-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2011-08-25  0:39     ` KAMEZAWA Hiroyuki
  2011-08-25  9:14     ` Paul Menage
@ 2011-08-25  9:32     ` Paul Menage
  2 siblings, 0 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  9:32 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Daisuke Nishimura,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> +Called prior to moving one or more tasks into a cgroup; if the
> +subsystem returns an error, this will abort the attach operation.
> +@tset contains the tasks to be attached and is guaranteed to have at
> +least one task in it. If there are multiple, it's guaranteed that all
> +are from the same thread group, @tset contains all tasks from the
> +group whether they're actually switching cgroup or not, and the first
> +task is the leader.

I'd reformat this a bit for clarity:

If there are multiple tasks in the taskset, then:
  - it's guaranteed that all are from the same thread group
  - @tset contains all tasks from the thread group whether or not
they're switching cgroup
  - the first task is the leader

Acked-by: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>

^ permalink raw reply	[flat|nested] 100+ messages in thread
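
The @tset guarantees being documented above can also be written down as executable checks. A toy userspace model, with illustrative field and function names rather than the kernel's:

```c
#include <assert.h>
#include <stddef.h>

/* The documented taskset invariants: at least one task, all from the
 * same thread group, and the leader is first. */
struct task { int tgid; int is_leader; };

static int tset_is_valid(const struct task *set, size_t n)
{
	size_t i;

	if (n == 0 || !set[0].is_leader)
		return 0;		/* non-empty, leader first */
	for (i = 1; i < n; i++)
		if (set[i].tgid != set[0].tgid)
			return 0;	/* one thread group only */
	return 1;
}
```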

* Re: [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
       [not found]   ` <1314138000-2049-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2011-08-25  8:51     ` Paul Menage
@ 2011-08-25  9:42     ` Paul Menage
  1 sibling, 0 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  9:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: rjw-KKrjLPT3xs0,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Aug 23, 2011 at 3:19 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> cgroup_attach_proc() behaves differently from cgroup_attach_task() in
> the following aspects.
>
> * All hooks are invoked even if no task is actually being moved.
>
> * ->can_attach_task() is called for all tasks in the group whether the
>  new cgrp is different from the current cgrp or not; however,
>  ->attach_task() is skipped if the new cgroup equals the old.  This
>  makes the calls asymmetric.
>
> This patch improves old cgroup handling in cgroup_attach_proc() by
> looking up the current cgroup at the head, recording it in the flex
> array along with the task itself, and using it to remove the above two
> differences.  This will also ease further changes.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Acked-by: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>

With the later cgroup_taskset changes making use of the same flex
array, I guess I agree that leaving all the tasks in the array makes
sense.

> +       int retval, i, group_size, nr_todo;

I'd be inclined to call "nr_todo" something like "nr_migrating_tasks"
for clarity.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
       [not found]     ` <CALdu-PBr-tu1qzScvncr-N4EpPaQ7sTdHf28GhEv_MZLbo1eSg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-08-25  9:44       ` Tejun Heo
  0 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-25  9:44 UTC (permalink / raw)
  To: Paul Menage
  Cc: rjw-KKrjLPT3xs0,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Thu, Aug 25, 2011 at 02:42:46AM -0700, Paul Menage wrote:
> > +       int retval, i, group_size, nr_todo;
> 
> I'd be inclined to call "nr_todo" something like "nr_migrating_tasks"
> for clarity.

Will update.

Thanks.

-- 
tejun


* Re: [PATCH 6/6] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task()
       [not found]   ` <1314138000-2049-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2011-08-25  9:45     ` Paul Menage
  0 siblings, 0 replies; 100+ messages in thread
From: Paul Menage @ 2011-08-25  9:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: rjw-KKrjLPT3xs0,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Aug 23, 2011 at 3:20 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> These three methods are no longer used.  Kill them.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Acked-by: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>

Overall, I think this patch set is a great improvement in code
simplicity and clarity.

Thanks,
Paul

> Cc: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
> Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> ---
>  include/linux/cgroup.h |    3 --
>  kernel/cgroup.c        |   53 ++++-------------------------------------------
>  2 files changed, 5 insertions(+), 51 deletions(-)
>
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 2470c8e..5659d37 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -490,11 +490,8 @@ struct cgroup_subsys {
>        void (*destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
>        int (*can_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                          struct cgroup_taskset *tset);
> -       int (*can_attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
>        void (*cancel_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                              struct cgroup_taskset *tset);
> -       void (*pre_attach)(struct cgroup *cgrp);
> -       void (*attach_task)(struct cgroup *cgrp, struct task_struct *tsk);
>        void (*attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
>                       struct cgroup_taskset *tset);
>        void (*fork)(struct cgroup_subsys *ss, struct task_struct *task);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 474674b..374a4cb 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1926,13 +1926,6 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>                                goto out;
>                        }
>                }
> -               if (ss->can_attach_task) {
> -                       retval = ss->can_attach_task(cgrp, tsk);
> -                       if (retval) {
> -                               failed_ss = ss;
> -                               goto out;
> -                       }
> -               }
>        }
>
>        retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, false);
> @@ -1940,10 +1933,6 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>                goto out;
>
>        for_each_subsys(root, ss) {
> -               if (ss->pre_attach)
> -                       ss->pre_attach(cgrp);
> -               if (ss->attach_task)
> -                       ss->attach_task(cgrp, tsk);
>                if (ss->attach)
>                        ss->attach(ss, cgrp, &tset);
>        }
> @@ -2075,7 +2064,6 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>  {
>        int retval, i, group_size, nr_todo;
>        struct cgroup_subsys *ss, *failed_ss = NULL;
> -       bool cancel_failed_ss = false;
>        /* guaranteed to be initialized later, but the compiler needs this */
>        struct css_set *oldcg;
>        struct cgroupfs_root *root = cgrp->root;
> @@ -2166,21 +2154,6 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>                                goto out_cancel_attach;
>                        }
>                }
> -               /* a callback to be run on every thread in the threadgroup. */
> -               if (ss->can_attach_task) {
> -                       /* run on each task in the threadgroup. */
> -                       for (i = 0; i < group_size; i++) {
> -                               tc = flex_array_get(group, i);
> -                               if (tc->cgrp == cgrp)
> -                                       continue;
> -                               retval = ss->can_attach_task(cgrp, tc->task);
> -                               if (retval) {
> -                                       failed_ss = ss;
> -                                       cancel_failed_ss = true;
> -                                       goto out_cancel_attach;
> -                               }
> -                       }
> -               }
>        }
>
>        /*
> @@ -2217,15 +2190,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>        }
>
>        /*
> -        * step 3: now that we're guaranteed success wrt the css_sets, proceed
> -        * to move all tasks to the new cgroup, calling ss->attach_task for each
> -        * one along the way. there are no failure cases after here, so this is
> -        * the commit point.
> +        * step 3: now that we're guaranteed success wrt the css_sets,
> +        * proceed to move all tasks to the new cgroup.  There are no
> +        * failure cases after here, so this is the commit point.
>         */
> -       for_each_subsys(root, ss) {
> -               if (ss->pre_attach)
> -                       ss->pre_attach(cgrp);
> -       }
>        for (i = 0; i < group_size; i++) {
>                tc = flex_array_get(group, i);
>                /* leave current thread as it is if it's already there */
> @@ -2235,19 +2203,11 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
>                /* if the thread is PF_EXITING, it can just get skipped. */
>                retval = cgroup_task_migrate(cgrp, tc->cgrp, tc->task, true);
>                BUG_ON(retval != 0 && retval != -ESRCH);
> -
> -               /* attach each task to each subsystem */
> -               for_each_subsys(root, ss) {
> -                       if (ss->attach_task)
> -                               ss->attach_task(cgrp, tc->task);
> -               }
>        }
>        /* nothing is sensitive to fork() after this point. */
>
>        /*
> -        * step 4: do expensive, non-thread-specific subsystem callbacks.
> -        * TODO: if ever a subsystem needs to know the oldcgrp for each task
> -        * being moved, this call will need to be reworked to communicate that.
> +        * step 4: do subsystem attach callbacks.
>         */
>        for_each_subsys(root, ss) {
>                if (ss->attach)
> @@ -2271,11 +2231,8 @@ out_cancel_attach:
>        /* same deal as in cgroup_attach_task */
>        if (retval) {
>                for_each_subsys(root, ss) {
> -                       if (ss == failed_ss) {
> -                               if (cancel_failed_ss && ss->cancel_attach)
> -                                       ss->cancel_attach(ss, cgrp, &tset);
> +                       if (ss == failed_ss)
>                                break;
> -                       }
>                        if (ss->cancel_attach)
>                                ss->cancel_attach(ss, cgrp, &tset);
>                }
> --
> 1.7.6
>
>

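The simplified out_cancel_attach path in the patch quoted above can be modeled as a small userspace C sketch: with cancel_failed_ss gone, the subsystem whose ->can_attach() failed never sees ->cancel_attach(); only the subsystems that already passed are rolled back. This is an illustration of the rollback pattern, not the kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a cgroup subsystem: "fail" makes its
 * can_attach step fail, "cancelled" records a cancel_attach call. */
struct subsys {
	int fail;
	int cancelled;
};

/* Run can_attach over all subsystems; on failure, roll back by
 * calling cancel_attach only on the subsystems that already passed,
 * stopping before the one that failed (same deal as the patch's
 * out_cancel_attach loop once cancel_failed_ss is removed). */
static int attach_all(struct subsys *ss, int n)
{
	struct subsys *failed_ss = NULL;
	int i, retval = 0;

	for (i = 0; i < n; i++) {
		if (ss[i].fail) {		/* can_attach() failed */
			retval = -1;
			failed_ss = &ss[i];
			break;
		}
	}

	if (retval) {
		for (i = 0; i < n; i++) {
			if (&ss[i] == failed_ss)
				break;
			ss[i].cancelled = 1;	/* cancel_attach() */
		}
	}
	return retval;
}
```

Only the subsystems preceding the failed one are rolled back; the failing subsystem is expected to have cleaned up after itself before returning the error.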

* Re: [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
       [not found]   ` <1314312192-26885-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2011-08-26  4:13     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 100+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-08-26  4:13 UTC (permalink / raw)
  To: Tejun Heo
  Cc: fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-KKrjLPT3xs0,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	paul-inf54ven1CmVyaH7bEyXVA

On Fri, 26 Aug 2011 00:43:08 +0200
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:

> cgroup_attach_proc() behaves differently from cgroup_attach_task() in
> the following aspects.
> 
> * All hooks are invoked even if no task is actually being moved.
> 
> * ->can_attach_task() is called for all tasks in the group whether the
>   new cgrp is different from the current cgrp or not; however,
>   ->attach_task() is skipped if the new cgrp equals the current one.
>   This makes the calls asymmetric.
> 
> This patch improves old cgroup handling in cgroup_attach_proc() by
> looking up the current cgroup at the head, recording it in the flex
> array along with the task itself, and using it to remove the above two
> differences.  This will also ease further changes.
> 
> -v2: nr_todo renamed to nr_migrating_tasks as per Paul Menage's
>      suggestion.
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Acked-by: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
> Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>


* [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc()
       [not found] ` <1314312192-26885-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2011-08-25 22:43   ` Tejun Heo
  0 siblings, 0 replies; 100+ messages in thread
From: Tejun Heo @ 2011-08-25 22:43 UTC (permalink / raw)
  To: rjw-KKrjLPT3xs0, paul-inf54ven1CmVyaH7bEyXVA,
	lizf-BthXqXjhjHXQFUHtdCDX3A
  Cc: fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

cgroup_attach_proc() behaves differently from cgroup_attach_task() in
the following aspects.

* All hooks are invoked even if no task is actually being moved.

* ->can_attach_task() is called for all tasks in the group whether the
  new cgrp is different from the current cgrp or not; however,
  ->attach_task() is skipped if new equals old.  This makes the calls
  asymmetric.

This patch improves old cgroup handling in cgroup_attach_proc() by
looking up the current cgroup at the head, recording it in the flex
array along with the task itself, and using it to remove the above two
differences.  This will also ease further changes.

-v2: nr_todo renamed to nr_migrating_tasks as per Paul Menage's
     suggestion.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Paul Menage <paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org>
Cc: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
---
 kernel/cgroup.c |   70 ++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 44 insertions(+), 26 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a606fa2..8a47380 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1739,6 +1739,11 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 }
 EXPORT_SYMBOL_GPL(cgroup_path);
 
+struct task_and_cgroup {
+	struct task_struct	*task;
+	struct cgroup		*cgrp;
+};
+
 /*
  * cgroup_task_migrate - move a task from one cgroup to another.
  *
@@ -1990,15 +1995,15 @@ static int css_set_prefetch(struct cgroup *cgrp, struct css_set *cg,
  */
 int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 {
-	int retval, i, group_size;
+	int retval, i, group_size, nr_migrating_tasks;
 	struct cgroup_subsys *ss, *failed_ss = NULL;
 	bool cancel_failed_ss = false;
 	/* guaranteed to be initialized later, but the compiler needs this */
-	struct cgroup *oldcgrp = NULL;
 	struct css_set *oldcg;
 	struct cgroupfs_root *root = cgrp->root;
 	/* threadgroup list cursor and array */
 	struct task_struct *tsk;
+	struct task_and_cgroup *tc;
 	struct flex_array *group;
 	/*
 	 * we need to make sure we have css_sets for all the tasks we're
@@ -2017,8 +2022,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	group_size = get_nr_threads(leader);
 	/* flex_array supports very large thread-groups better than kmalloc. */
-	group = flex_array_alloc(sizeof(struct task_struct *), group_size,
-				 GFP_KERNEL);
+	group = flex_array_alloc(sizeof(*tc), group_size, GFP_KERNEL);
 	if (!group)
 		return -ENOMEM;
 	/* pre-allocate to guarantee space while iterating in rcu read-side. */
@@ -2042,8 +2046,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	}
 	/* take a reference on each task in the group to go in the array. */
 	tsk = leader;
-	i = 0;
+	i = nr_migrating_tasks = 0;
 	do {
+		struct task_and_cgroup ent;
+
 		/* as per above, nr_threads may decrease, but not increase. */
 		BUG_ON(i >= group_size);
 		get_task_struct(tsk);
@@ -2051,14 +2057,23 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		 * saying GFP_ATOMIC has no effect here because we did prealloc
 		 * earlier, but it's good form to communicate our expectations.
 		 */
-		retval = flex_array_put_ptr(group, i, tsk, GFP_ATOMIC);
+		ent.task = tsk;
+		ent.cgrp = task_cgroup_from_root(tsk, root);
+		retval = flex_array_put(group, i, &ent, GFP_ATOMIC);
 		BUG_ON(retval != 0);
 		i++;
+		if (ent.cgrp != cgrp)
+			nr_migrating_tasks++;
 	} while_each_thread(leader, tsk);
 	/* remember the number of threads in the array for later. */
 	group_size = i;
 	rcu_read_unlock();
 
+	/* methods shouldn't be called if no task is actually migrating */
+	retval = 0;
+	if (!nr_migrating_tasks)
+		goto out_put_tasks;
+
 	/*
 	 * step 1: check that we can legitimately attach to the cgroup.
 	 */
@@ -2074,8 +2089,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 		if (ss->can_attach_task) {
 			/* run on each task in the threadgroup. */
 			for (i = 0; i < group_size; i++) {
-				tsk = flex_array_get_ptr(group, i);
-				retval = ss->can_attach_task(cgrp, tsk);
+				tc = flex_array_get(group, i);
+				if (tc->cgrp == cgrp)
+					continue;
+				retval = ss->can_attach_task(cgrp, tc->task);
 				if (retval) {
 					failed_ss = ss;
 					cancel_failed_ss = true;
@@ -2091,23 +2108,22 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 */
 	INIT_LIST_HEAD(&newcg_list);
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
+		tc = flex_array_get(group, i);
 		/* nothing to do if this task is already in the cgroup */
-		oldcgrp = task_cgroup_from_root(tsk, root);
-		if (cgrp == oldcgrp)
+		if (tc->cgrp == cgrp)
 			continue;
 		/* get old css_set pointer */
-		task_lock(tsk);
-		if (tsk->flags & PF_EXITING) {
+		task_lock(tc->task);
+		if (tc->task->flags & PF_EXITING) {
 			/* ignore this task if it's going away */
-			task_unlock(tsk);
+			task_unlock(tc->task);
 			continue;
 		}
-		oldcg = tsk->cgroups;
+		oldcg = tc->task->cgroups;
 		get_css_set(oldcg);
-		task_unlock(tsk);
+		task_unlock(tc->task);
 		/* see if the new one for us is already in the list? */
-		if (css_set_check_fetched(cgrp, tsk, oldcg, &newcg_list)) {
+		if (css_set_check_fetched(cgrp, tc->task, oldcg, &newcg_list)) {
 			/* was already there, nothing to do. */
 			put_css_set(oldcg);
 		} else {
@@ -2130,20 +2146,19 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 			ss->pre_attach(cgrp);
 	}
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
+		tc = flex_array_get(group, i);
 		/* leave current thread as it is if it's already there */
-		oldcgrp = task_cgroup_from_root(tsk, root);
-		if (cgrp == oldcgrp)
+		if (tc->cgrp == cgrp)
 			continue;
 
 		/* if the thread is PF_EXITING, it can just get skipped. */
-		retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, true);
+		retval = cgroup_task_migrate(cgrp, tc->cgrp, tc->task, true);
 		BUG_ON(retval != 0 && retval != -ESRCH);
 
 		/* attach each task to each subsystem */
 		for_each_subsys(root, ss) {
 			if (ss->attach_task)
-				ss->attach_task(cgrp, tsk);
+				ss->attach_task(cgrp, tc->task);
 		}
 	}
 	/* nothing is sensitive to fork() after this point. */
@@ -2154,8 +2169,10 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
 	 * being moved, this call will need to be reworked to communicate that.
 	 */
 	for_each_subsys(root, ss) {
-		if (ss->attach)
-			ss->attach(ss, cgrp, oldcgrp, leader);
+		if (ss->attach) {
+			tc = flex_array_get(group, 0);
+			ss->attach(ss, cgrp, tc->cgrp, tc->task);
+		}
 	}
 
 	/*
@@ -2184,10 +2201,11 @@ out_cancel_attach:
 				ss->cancel_attach(ss, cgrp, leader);
 		}
 	}
+out_put_tasks:
 	/* clean up the array of referenced threads in the group. */
 	for (i = 0; i < group_size; i++) {
-		tsk = flex_array_get_ptr(group, i);
-		put_task_struct(tsk);
+		tc = flex_array_get(group, i);
+		put_task_struct(tc->task);
 	}
 out_free_group_list:
 	flex_array_free(group);
-- 
1.7.6




end of thread, other threads:[~2011-08-26  4:21 UTC | newest]

Thread overview: 100+ messages
2011-08-23 22:19 [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Tejun Heo
2011-08-23 22:19 ` [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration Tejun Heo
2011-08-24  0:32   ` Frederic Weisbecker
2011-08-24  0:32   ` Frederic Weisbecker
2011-08-24  1:31     ` Li Zefan
2011-08-24  1:31     ` Li Zefan
2011-08-24  1:31     ` Li Zefan
     [not found]   ` <1314138000-2049-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-24  0:32     ` Frederic Weisbecker
2011-08-23 22:19 ` Tejun Heo
2011-08-23 22:19 ` [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
2011-08-23 22:19 ` Tejun Heo
2011-08-25  8:51   ` Paul Menage
     [not found]     ` <CALdu-PAj1ZUmB2ixxA6yeppB8MerBGk1cSeQadobH0H4cRSe7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-08-25  9:03       ` Tejun Heo
2011-08-25  9:03     ` Tejun Heo
2011-08-25  9:03     ` Tejun Heo
2011-08-25  8:51   ` Paul Menage
     [not found]   ` <1314138000-2049-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-25  8:51     ` Paul Menage
2011-08-25  9:42     ` Paul Menage
2011-08-25  9:42   ` Paul Menage
2011-08-25  9:42   ` Paul Menage
     [not found]     ` <CALdu-PBr-tu1qzScvncr-N4EpPaQ7sTdHf28GhEv_MZLbo1eSg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-08-25  9:44       ` Tejun Heo
2011-08-25  9:44     ` Tejun Heo
2011-08-25  9:44     ` Tejun Heo
2011-08-23 22:19 ` [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
2011-08-25  0:39   ` KAMEZAWA Hiroyuki
2011-08-25  0:39   ` KAMEZAWA Hiroyuki
2011-08-25  8:20     ` Tejun Heo
2011-08-25  8:20     ` Tejun Heo
     [not found]       ` <20110825082049.GC3286-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-08-25  8:21         ` KAMEZAWA Hiroyuki
2011-08-25  8:21       ` KAMEZAWA Hiroyuki
2011-08-25  8:40         ` Tejun Heo
     [not found]         ` <20110825172140.eb34809f.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2011-08-25  8:40           ` Tejun Heo
2011-08-25  8:40         ` Tejun Heo
2011-08-25  8:37           ` KAMEZAWA Hiroyuki
     [not found]           ` <CAOS58YPM=cuWjAF+VJ4QJ8bnRcVtaDCVXBJCpdWg+2=2GmnKrA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-08-25  8:37             ` KAMEZAWA Hiroyuki
2011-08-25  8:37           ` KAMEZAWA Hiroyuki
2011-08-25  8:21       ` KAMEZAWA Hiroyuki
     [not found]     ` <20110825093958.75b95bd8.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2011-08-25  8:20       ` Tejun Heo
2011-08-25  9:14   ` Paul Menage
2011-08-25  9:20     ` Tejun Heo
2011-08-25  9:32       ` Paul Menage
2011-08-25  9:32       ` Paul Menage
     [not found]       ` <20110825092045.GG3286-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-08-25  9:32         ` Paul Menage
     [not found]     ` <CALdu-PDAgqeRJt5vqTB9wddwz70Yn+Jf-Pb0dDKDBD_q37tHQg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-08-25  9:20       ` Tejun Heo
2011-08-25  9:20     ` Tejun Heo
     [not found]   ` <1314138000-2049-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-25  0:39     ` KAMEZAWA Hiroyuki
2011-08-25  9:14     ` Paul Menage
2011-08-25  9:32     ` Paul Menage
2011-08-25  9:14   ` Paul Menage
2011-08-25  9:32   ` Paul Menage
2011-08-25  9:32   ` Paul Menage
     [not found] ` <1314138000-2049-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-23 22:19   ` [PATCH 1/6] cgroup: subsys->attach_task() should be called after migration Tejun Heo
2011-08-23 22:19   ` [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
2011-08-23 22:19   ` [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
2011-08-23 22:19   ` [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
2011-08-23 22:19   ` [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
2011-08-23 22:20   ` [PATCH 6/6] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task() Tejun Heo
2011-08-24  1:14   ` [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Frederic Weisbecker
2011-08-23 22:19 ` [PATCH 3/6] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
2011-08-23 22:19 ` [PATCH 4/6] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
2011-08-23 22:19 ` Tejun Heo
2011-08-24  1:57   ` Matt Helsley
2011-08-24  1:57   ` Matt Helsley
     [not found]     ` <20110824015739.GE28444-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-08-24  7:54       ` Tejun Heo
2011-08-24  7:54     ` Tejun Heo
2011-08-24  7:54     ` Tejun Heo
     [not found]   ` <1314138000-2049-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-24  1:57     ` Matt Helsley
2011-08-25  9:07     ` Paul Menage
2011-08-25  9:07   ` Paul Menage
2011-08-25  9:07   ` Paul Menage
2011-08-25  9:12     ` Tejun Heo
2011-08-25  9:12     ` Tejun Heo
     [not found]     ` <CALdu-PCc2RzedXubReF9huamL6W+5qGCfXNNvqS2yUk3QTHRng-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-08-25  9:12       ` Tejun Heo
2011-08-23 22:19 ` [PATCH 5/6] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
2011-08-25  8:53   ` Paul Menage
2011-08-25  9:06     ` Tejun Heo
2011-08-25  9:06     ` Tejun Heo
     [not found]     ` <CALdu-PD5EbFJBRHf-iehPwb6vyJTYUTWZniihARFDZ7xRZ8_nQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-08-25  9:06       ` Tejun Heo
2011-08-25  8:53   ` Paul Menage
     [not found]   ` <1314138000-2049-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-25  8:53     ` Paul Menage
2011-08-23 22:19 ` Tejun Heo
2011-08-23 22:20 ` [PATCH 6/6] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task() Tejun Heo
2011-08-23 22:20 ` Tejun Heo
2011-08-25  9:45   ` Paul Menage
     [not found]   ` <1314138000-2049-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-25  9:45     ` Paul Menage
2011-08-25  9:45   ` Paul Menage
2011-08-24  1:14 ` [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods Frederic Weisbecker
2011-08-24  7:49   ` Tejun Heo
2011-08-24  7:49   ` Tejun Heo
2011-08-24  7:49   ` Tejun Heo
2011-08-24 13:53     ` Frederic Weisbecker
2011-08-24 13:53     ` Frederic Weisbecker
     [not found]     ` <20110824074959.GA14170-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-08-24 13:53       ` Frederic Weisbecker
2011-08-24  1:14 ` Frederic Weisbecker
2011-08-25 22:43 [PATCHSET] cgroup: introduce cgroup_taskset and consolidate subsys methods, take#2 Tejun Heo
2011-08-25 22:43 ` [PATCH 2/6] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
2011-08-26  4:13   ` KAMEZAWA Hiroyuki
     [not found]   ` <1314312192-26885-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-26  4:13     ` KAMEZAWA Hiroyuki
2011-08-26  4:13   ` KAMEZAWA Hiroyuki
2011-08-25 22:43 ` Tejun Heo
     [not found] ` <1314312192-26885-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-08-25 22:43   ` Tejun Heo
