linux-kernel.vger.kernel.org archive mirror
* [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle
@ 2013-08-01 21:49 Tejun Heo
  2013-08-01 21:49 ` [PATCH 01/23] cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/ Tejun Heo
                   ` (24 more replies)
  0 siblings, 25 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel

Hello,

Currently, struct cgroup * is used as the main interface handle
between cgroup core and its subsystems, which works but is a bit
clunky because subsystems usually care a lot more about css's
(cgroup_subsys_state) than about cgroups, which is natural as a css
is the intersection between a cgroup and a subsystem.
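
As a rough illustration of that relationship, here is a userspace
sketch (not kernel code).  The demo_* names and the single-entry
subsystem table are assumptions made up for the example; only the
general shape, a per-cgroup array of css pointers and a controller
embedding its css, follows what the patches in this series show.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; the real definitions live in include/linux/cgroup.h. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum { DEMO_FREEZER_ID, DEMO_SUBSYS_COUNT };	/* hypothetical subsystem IDs */

struct cgroup;

struct cgroup_subsys_state {
	struct cgroup *cgroup;		/* the cgroup this css belongs to */
};

struct cgroup {
	/* one css slot per subsystem: the (cgroup, subsystem) intersection */
	struct cgroup_subsys_state *subsys[DEMO_SUBSYS_COUNT];
};

/* A controller embeds the css in its own per-cgroup state (hypothetical). */
struct demo_freezer {
	struct cgroup_subsys_state css;
	int state;
};

/* cgroup -> css lookup, the step subsystems currently do themselves */
static inline struct cgroup_subsys_state *cgroup_css(struct cgroup *cgrp,
						     int subsys_id)
{
	return cgrp->subsys[subsys_id];
}

/* css -> controller state; this is all a subsystem needs once it is handed a css */
static inline struct demo_freezer *css_freezer(struct cgroup_subsys_state *css)
{
	return container_of(css, struct demo_freezer, css);
}
```

A subsystem that is handed a css can go straight to css_freezer();
one that is handed a cgroup has to go through cgroup_css() first.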

In addition to being a bit clunky, dealing with cgroups directly poses
a bit of trouble for the planned unified hierarchy support on two
fronts.  First, most iterations become subsystem dependent as task
membership is affected by which subtree has the specific subsystem
enabled and thus require specifying which subsystem the iteration is
for, which is automatically achieved if the interfaces deal with css's
instead of cgroups.

Second, css's may be created, attached, detached and destroyed
dynamically multiple times across the lifetime of a given cgroup as
they're enabled and disabled, which makes cgroup -> css mapping much
more difficult to synchronize.  Giving out cgroups to subsystems and
then requiring them to take the extra steps to deal with their css's
coming and going dynamically is a lot more fragile than cgroup core
proper handling it internally and giving out the resulting css's to
subsystems.

So, this patchset converts all cgroup subsystem APIs to deal with
css's instead of cgroups.  The patchset is fairly large but most of
the conversions, while being tedious, aren't complex.  At the end of
the series, subsystems no longer make cgroup -> css mapping themselves
and cgroup_css() - formerly cgroup_subsys_state() - is made internal
to cgroup core proper.

This is a rather large update to the interface and is likely to act as
a barrier when porting commits, which is inconvenient but also provides
an opportunity to clean up the API where we can as doing so won't
significantly raise the level of inconvenience.  As such, this
patchset contains some API cleanups and I'll probably follow up with
further API updates that I've been meaning to do and, if you have a
good idea for cleaning up the cgroup internal API, this is probably a
good time to submit it.

This patchset contains the following 23 patches.

 0001-cgroup-s-cgroup_subsys_state-cgroup_css-s-task_subsy.patch
 0002-cpuset-drop-const-qualifiers-from-struct-cpuset-inst.patch
 0003-netprio_cgroup-pass-around-css-instead-of-cgroup-and.patch
 0004-hugetlb_cgroup-pass-around-hugetlb_cgroup-instead-of.patch
 0005-cgroup-add-subsystem-pointer-to-cgroup_subsys_state.patch
 0006-cgroup-add-update-accessors-which-obtain-subsys-spec.patch
 0007-cgroup-add-css_parent.patch
 0008-cgroup-pass-around-cgroup_subsys_state-instead-of-cg.patch
 0009-cgroup-add-subsys-backlink-pointer-to-cftype.patch
 0010-cgroup-pin-cgroup_subsys_state-when-opening-a-cgroup.patch
 0011-cgroup-add-cgroup-dummy_css.patch
 0012-cgroup-pass-around-cgroup_subsys_state-instead-of-cg.patch
 0013-cgroup-convert-cgroup_next_sibling-to-cgroup_next_ch.patch
 0014-cgroup-always-use-cgroup_next_child-to-walk-the-chil.patch
 0015-cgroup-make-hierarchy-iterators-deal-with-cgroup_sub.patch
 0016-cgroup-relocate-cgroup_advance_iter.patch
 0017-cgroup-rename-cgroup_iter-to-cgroup_task_iter.patch
 0018-cgroup-make-cgroup_task_iter-remember-the-cgroup-bei.patch
 0019-cgroup-remove-struct-cgroup_scanner.patch
 0020-cgroup-make-task-iterators-deal-with-cgroup_subsys_s.patch
 0021-cgroup-make-cftype-un-register_event-deal-with-cgrou.patch
 0022-cgroup-make-cgroup_taskset-deal-with-cgroup_subsys_s.patch
 0023-cgroup-unexport-cgroup_css.patch

0001-0007 prepare for the css-based interface.  Individual subsystems
are updated so that they're easier to convert and css_parent() is added
so that upward traversal can be done without going through cgroup.
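
The body of css_parent() is not part of this excerpt, so the following
is only a guess at its shape, written as a userspace sketch.  The
subsys_id field is a stand-in for the subsystem back-pointer that 0005
adds, and css_depth() is a hypothetical caller added for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SUBSYS_COUNT 1	/* hypothetical: a single subsystem */

struct cgroup;

struct cgroup_subsys_state {
	struct cgroup *cgroup;	/* owning cgroup */
	int subsys_id;		/* assumed stand-in for the 0005 back-pointer */
};

struct cgroup {
	struct cgroup *parent;	/* NULL for the root cgroup */
	struct cgroup_subsys_state *subsys[DEMO_SUBSYS_COUNT];
};

/* Return the same subsystem's css in the parent cgroup, or NULL at the root. */
static inline struct cgroup_subsys_state *
css_parent(struct cgroup_subsys_state *css)
{
	struct cgroup *parent = css->cgroup->parent;

	return parent ? parent->subsys[css->subsys_id] : NULL;
}

/* Hypothetical caller: walks upward without naming struct cgroup at all. */
static inline int css_depth(struct cgroup_subsys_state *css)
{
	int depth = 0;

	while ((css = css_parent(css)))
		depth++;
	return depth;
}
```

The point of the helper is visible in css_depth(): the caller only
ever holds css pointers, so the cgroup -> css step stays in one place.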

0008 updates all subsystem methods.

0009-0012 update file operations.

0013-0015 update hierarchy iterators.

0016-0020 update task iterators.

0021 updates events.

0022 updates task_set handling.

0023 unexports cgroup_css().

This patchset is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cssfy

diffstat follows.

 block/blk-cgroup.c           |   41 +-
 block/blk-cgroup.h           |   34 -
 block/blk-throttle.c         |   40 +-
 block/cfq-iosched.c          |   90 ++--
 fs/bio.c                     |    2
 include/linux/cgroup.h       |  268 +++++++-------
 include/linux/memcontrol.h   |    2
 include/linux/vmpressure.h   |    6
 include/net/cls_cgroup.h     |    4
 include/net/netprio_cgroup.h |    8
 kernel/cgroup.c              |  798 ++++++++++++++++++++++++-------------------
 kernel/cgroup_freezer.c      |  128 +++---
 kernel/cpuset.c              |  265 ++++++--------
 kernel/events/core.c         |   22 -
 kernel/sched/core.c          |  113 +++---
 kernel/sched/cpuacct.c       |   51 +-
 kernel/sched/sched.h         |    6
 mm/hugetlb_cgroup.c          |   69 +--
 mm/memcontrol.c              |  216 +++++------
 mm/vmpressure.c              |   25 -
 net/core/netprio_cgroup.c    |   72 +--
 net/ipv4/tcp_memcontrol.c    |   12
 net/sched/cls_cgroup.c       |   39 +-
 security/device_cgroup.c     |   63 +--
 24 files changed, 1210 insertions(+), 1164 deletions(-)

Thanks.

--
tejun

* [PATCH 01/23] cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 02/23] cpuset: drop "const" qualifiers from struct cpuset instances Tejun Heo
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

The names of the two struct cgroup_subsys_state accessors -
cgroup_subsys_state() and task_subsys_state() - are somewhat awkward.
The former clashes with the type name and the latter doesn't even
indicate it's somehow related to cgroup.

We're about to revamp a large portion of the cgroup API, so let's
rename them so that they're less awkward.  Most per-controller usages
of the accessors are localized in accessor wrappers and given the
amount of scheduled changes, this isn't gonna add any noticeable
headache.

Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state()
to task_css().  This patch is a pure rename.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-cgroup.h           |  5 ++---
 fs/bio.c                     |  2 +-
 include/linux/cgroup.h       | 31 +++++++++++++++++++------------
 include/net/cls_cgroup.h     |  4 ++--
 include/net/netprio_cgroup.h |  4 ++--
 kernel/cgroup.c              |  2 +-
 kernel/cgroup_freezer.c      |  4 ++--
 kernel/cpuset.c              |  6 +++---
 kernel/events/core.c         |  6 +++---
 kernel/sched/core.c          |  4 ++--
 kernel/sched/cpuacct.c       |  4 ++--
 kernel/sched/sched.h         |  6 +++---
 mm/hugetlb_cgroup.c          |  6 ++----
 mm/memcontrol.c              |  5 ++---
 mm/vmpressure.c              |  2 +-
 net/core/netprio_cgroup.c    |  2 +-
 net/sched/cls_cgroup.c       |  4 ++--
 security/device_cgroup.c     |  4 ++--
 18 files changed, 52 insertions(+), 49 deletions(-)

diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 8056c03..628e50f 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -181,14 +181,13 @@ void blkg_conf_finish(struct blkg_conf_ctx *ctx);
 
 static inline struct blkcg *cgroup_to_blkcg(struct cgroup *cgroup)
 {
-	return container_of(cgroup_subsys_state(cgroup, blkio_subsys_id),
+	return container_of(cgroup_css(cgroup, blkio_subsys_id),
 			    struct blkcg, css);
 }
 
 static inline struct blkcg *task_blkcg(struct task_struct *tsk)
 {
-	return container_of(task_subsys_state(tsk, blkio_subsys_id),
-			    struct blkcg, css);
+	return container_of(task_css(tsk, blkio_subsys_id), struct blkcg, css);
 }
 
 static inline struct blkcg *bio_blkcg(struct bio *bio)
diff --git a/fs/bio.c b/fs/bio.c
index 94bbc04..8e0348f 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1946,7 +1946,7 @@ int bio_associate_current(struct bio *bio)
 
 	/* associate blkcg if exists */
 	rcu_read_lock();
-	css = task_subsys_state(current, blkio_subsys_id);
+	css = task_css(current, blkio_subsys_id);
 	if (css && css_tryget(css))
 		bio->bi_css = css;
 	rcu_read_unlock();
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index bbf4d89..1938292 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -648,8 +648,15 @@ struct cgroup_subsys {
 #undef IS_SUBSYS_ENABLED
 #undef SUBSYS
 
-static inline struct cgroup_subsys_state *cgroup_subsys_state(
-	struct cgroup *cgrp, int subsys_id)
+/**
+ * cgroup_css - obtain a cgroup's css for the specified subsystem
+ * @cgrp: the cgroup of interest
+ * @subsys_id: the subsystem of interest
+ *
+ * Return @cgrp's css (cgroup_subsys_state) associated with @subsys_id.
+ */
+static inline struct cgroup_subsys_state *cgroup_css(struct cgroup *cgrp,
+						     int subsys_id)
 {
 	return cgrp->subsys[subsys_id];
 }
@@ -679,7 +686,7 @@ extern struct mutex cgroup_mutex;
 #endif
 
 /**
- * task_subsys_state_check - obtain css for (task, subsys) w/ extra access conds
+ * task_css_check - obtain css for (task, subsys) w/ extra access conds
  * @task: the target task
  * @subsys_id: the target subsystem ID
  * @__c: extra condition expression to be passed to rcu_dereference_check()
@@ -687,7 +694,7 @@ extern struct mutex cgroup_mutex;
  * Return the cgroup_subsys_state for the (@task, @subsys_id) pair.  The
  * synchronization rules are the same as task_css_set_check().
  */
-#define task_subsys_state_check(task, subsys_id, __c)			\
+#define task_css_check(task, subsys_id, __c)				\
 	task_css_set_check((task), (__c))->subsys[(subsys_id)]
 
 /**
@@ -702,22 +709,22 @@ static inline struct css_set *task_css_set(struct task_struct *task)
 }
 
 /**
- * task_subsys_state - obtain css for (task, subsys)
+ * task_css - obtain css for (task, subsys)
  * @task: the target task
  * @subsys_id: the target subsystem ID
  *
- * See task_subsys_state_check().
+ * See task_css_check().
  */
-static inline struct cgroup_subsys_state *
-task_subsys_state(struct task_struct *task, int subsys_id)
+static inline struct cgroup_subsys_state *task_css(struct task_struct *task,
+						   int subsys_id)
 {
-	return task_subsys_state_check(task, subsys_id, false);
+	return task_css_check(task, subsys_id, false);
 }
 
-static inline struct cgroup* task_cgroup(struct task_struct *task,
-					       int subsys_id)
+static inline struct cgroup *task_cgroup(struct task_struct *task,
+					 int subsys_id)
 {
-	return task_subsys_state(task, subsys_id)->cgroup;
+	return task_css(task, subsys_id)->cgroup;
 }
 
 /**
diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
index 0fee061..52adaa7 100644
--- a/include/net/cls_cgroup.h
+++ b/include/net/cls_cgroup.h
@@ -35,7 +35,7 @@ static inline u32 task_cls_classid(struct task_struct *p)
 		return 0;
 
 	rcu_read_lock();
-	classid = container_of(task_subsys_state(p, net_cls_subsys_id),
+	classid = container_of(task_css(p, net_cls_subsys_id),
 			       struct cgroup_cls_state, css)->classid;
 	rcu_read_unlock();
 
@@ -51,7 +51,7 @@ static inline u32 task_cls_classid(struct task_struct *p)
 		return 0;
 
 	rcu_read_lock();
-	css = task_subsys_state(p, net_cls_subsys_id);
+	css = task_css(p, net_cls_subsys_id);
 	if (css)
 		classid = container_of(css,
 				       struct cgroup_cls_state, css)->classid;
diff --git a/include/net/netprio_cgroup.h b/include/net/netprio_cgroup.h
index 50ab8c2..8110fa7 100644
--- a/include/net/netprio_cgroup.h
+++ b/include/net/netprio_cgroup.h
@@ -39,7 +39,7 @@ static inline u32 task_netprioidx(struct task_struct *p)
 	u32 idx;
 
 	rcu_read_lock();
-	css = task_subsys_state(p, net_prio_subsys_id);
+	css = task_css(p, net_prio_subsys_id);
 	idx = css->cgroup->id;
 	rcu_read_unlock();
 	return idx;
@@ -53,7 +53,7 @@ static inline u32 task_netprioidx(struct task_struct *p)
 	u32 idx = 0;
 
 	rcu_read_lock();
-	css = task_subsys_state(p, net_prio_subsys_id);
+	css = task_css(p, net_prio_subsys_id);
 	if (css)
 		idx = css->cgroup->id;
 	rcu_read_unlock();
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 9420662..9d5af91 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -81,7 +81,7 @@
  */
 #ifdef CONFIG_PROVE_RCU
 DEFINE_MUTEX(cgroup_mutex);
-EXPORT_SYMBOL_GPL(cgroup_mutex);	/* only for task_subsys_state_check() */
+EXPORT_SYMBOL_GPL(cgroup_mutex);	/* only for lockdep */
 #else
 static DEFINE_MUTEX(cgroup_mutex);
 #endif
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 75dda1e..9d3f615 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -47,13 +47,13 @@ struct freezer {
 
 static inline struct freezer *cgroup_freezer(struct cgroup *cgroup)
 {
-	return container_of(cgroup_subsys_state(cgroup, freezer_subsys_id),
+	return container_of(cgroup_css(cgroup, freezer_subsys_id),
 			    struct freezer, css);
 }
 
 static inline struct freezer *task_freezer(struct task_struct *task)
 {
-	return container_of(task_subsys_state(task, freezer_subsys_id),
+	return container_of(task_css(task, freezer_subsys_id),
 			    struct freezer, css);
 }
 
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 1b9c315..be4512b 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -117,14 +117,14 @@ struct cpuset {
 /* Retrieve the cpuset for a cgroup */
 static inline struct cpuset *cgroup_cs(struct cgroup *cgrp)
 {
-	return container_of(cgroup_subsys_state(cgrp, cpuset_subsys_id),
+	return container_of(cgroup_css(cgrp, cpuset_subsys_id),
 			    struct cpuset, css);
 }
 
 /* Retrieve the cpuset for a task */
 static inline struct cpuset *task_cs(struct task_struct *task)
 {
-	return container_of(task_subsys_state(task, cpuset_subsys_id),
+	return container_of(task_css(task, cpuset_subsys_id),
 			    struct cpuset, css);
 }
 
@@ -2724,7 +2724,7 @@ int proc_cpuset_show(struct seq_file *m, void *unused_v)
 		goto out_free;
 
 	rcu_read_lock();
-	css = task_subsys_state(tsk, cpuset_subsys_id);
+	css = task_css(tsk, cpuset_subsys_id);
 	retval = cgroup_path(css->cgroup, buf, PAGE_SIZE);
 	rcu_read_unlock();
 	if (retval < 0)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1833bc5..414c61f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -340,8 +340,8 @@ struct perf_cgroup {
 static inline struct perf_cgroup *
 perf_cgroup_from_task(struct task_struct *task)
 {
-	return container_of(task_subsys_state(task, perf_subsys_id),
-			struct perf_cgroup, css);
+	return container_of(task_css(task, perf_subsys_id),
+			    struct perf_cgroup, css);
 }
 
 static inline bool
@@ -7798,7 +7798,7 @@ static struct cgroup_subsys_state *perf_cgroup_css_alloc(struct cgroup *cont)
 static void perf_cgroup_css_free(struct cgroup *cont)
 {
 	struct perf_cgroup *jc;
-	jc = container_of(cgroup_subsys_state(cont, perf_subsys_id),
+	jc = container_of(cgroup_css(cont, perf_subsys_id),
 			  struct perf_cgroup, css);
 	free_percpu(jc->info);
 	kfree(jc);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9b1f2e5..323d907 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6761,7 +6761,7 @@ void sched_move_task(struct task_struct *tsk)
 	if (unlikely(running))
 		tsk->sched_class->put_prev_task(rq, tsk);
 
-	tg = container_of(task_subsys_state_check(tsk, cpu_cgroup_subsys_id,
+	tg = container_of(task_css_check(tsk, cpu_cgroup_subsys_id,
 				lockdep_is_held(&tsk->sighand->siglock)),
 			  struct task_group, css);
 	tg = autogroup_task_group(tsk, tg);
@@ -7086,7 +7086,7 @@ int sched_rt_handler(struct ctl_table *table, int write,
 /* return corresponding task_group object of a cgroup */
 static inline struct task_group *cgroup_tg(struct cgroup *cgrp)
 {
-	return container_of(cgroup_subsys_state(cgrp, cpu_cgroup_subsys_id),
+	return container_of(cgroup_css(cgrp, cpu_cgroup_subsys_id),
 			    struct task_group, css);
 }
 
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index dbb7e2c..4a210fa 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -36,14 +36,14 @@ struct cpuacct {
 /* return cpu accounting group corresponding to this container */
 static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
 {
-	return container_of(cgroup_subsys_state(cgrp, cpuacct_subsys_id),
+	return container_of(cgroup_css(cgrp, cpuacct_subsys_id),
 			    struct cpuacct, css);
 }
 
 /* return cpu accounting group to which this task belongs */
 static inline struct cpuacct *task_ca(struct task_struct *tsk)
 {
-	return container_of(task_subsys_state(tsk, cpuacct_subsys_id),
+	return container_of(task_css(tsk, cpuacct_subsys_id),
 			    struct cpuacct, css);
 }
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ef0a7b2..471a56d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -665,9 +665,9 @@ extern int group_balance_cpu(struct sched_group *sg);
 /*
  * Return the group to which this tasks belongs.
  *
- * We cannot use task_subsys_state() and friends because the cgroup
- * subsystem changes that value before the cgroup_subsys::attach() method
- * is called, therefore we cannot pin it and might observe the wrong value.
+ * We cannot use task_css() and friends because the cgroup subsystem
+ * changes that value before the cgroup_subsys::attach() method is called,
+ * therefore we cannot pin it and might observe the wrong value.
  *
  * The same is true for autogroup's p->signal->autogroup->tg, the autogroup
  * core changes this before calling sched_move_task().
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 9cea7de..50f213f 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -42,15 +42,13 @@ struct hugetlb_cgroup *hugetlb_cgroup_from_css(struct cgroup_subsys_state *s)
 static inline
 struct hugetlb_cgroup *hugetlb_cgroup_from_cgroup(struct cgroup *cgroup)
 {
-	return hugetlb_cgroup_from_css(cgroup_subsys_state(cgroup,
-							   hugetlb_subsys_id));
+	return hugetlb_cgroup_from_css(cgroup_css(cgroup, hugetlb_subsys_id));
 }
 
 static inline
 struct hugetlb_cgroup *hugetlb_cgroup_from_task(struct task_struct *task)
 {
-	return hugetlb_cgroup_from_css(task_subsys_state(task,
-							 hugetlb_subsys_id));
+	return hugetlb_cgroup_from_css(task_css(task, hugetlb_subsys_id));
 }
 
 static inline bool hugetlb_cgroup_is_root(struct hugetlb_cgroup *h_cg)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d12ca6f..b47bd3a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1037,8 +1037,7 @@ static void memcg_check_events(struct mem_cgroup *memcg, struct page *page)
 
 struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont)
 {
-	return mem_cgroup_from_css(
-		cgroup_subsys_state(cont, mem_cgroup_subsys_id));
+	return mem_cgroup_from_css(cgroup_css(cont, mem_cgroup_subsys_id));
 }
 
 struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
@@ -1051,7 +1050,7 @@ struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
 	if (unlikely(!p))
 		return NULL;
 
-	return mem_cgroup_from_css(task_subsys_state(p, mem_cgroup_subsys_id));
+	return mem_cgroup_from_css(task_css(p, mem_cgroup_subsys_id));
 }
 
 struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 736a601..7f1654d 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -76,7 +76,7 @@ static struct vmpressure *work_to_vmpressure(struct work_struct *work)
 
 static struct vmpressure *cg_to_vmpressure(struct cgroup *cg)
 {
-	return css_to_vmpressure(cgroup_subsys_state(cg, mem_cgroup_subsys_id));
+	return css_to_vmpressure(cgroup_css(cg, mem_cgroup_subsys_id));
 }
 
 static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr)
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index e533259..ccf8523 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -31,7 +31,7 @@
 
 static inline struct cgroup_netprio_state *cgrp_netprio_state(struct cgroup *cgrp)
 {
-	return container_of(cgroup_subsys_state(cgrp, net_prio_subsys_id),
+	return container_of(cgroup_css(cgrp, net_prio_subsys_id),
 			    struct cgroup_netprio_state, css);
 }
 
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 3a294eb..5ee72a0 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -25,13 +25,13 @@
 
 static inline struct cgroup_cls_state *cgrp_cls_state(struct cgroup *cgrp)
 {
-	return container_of(cgroup_subsys_state(cgrp, net_cls_subsys_id),
+	return container_of(cgroup_css(cgrp, net_cls_subsys_id),
 			    struct cgroup_cls_state, css);
 }
 
 static inline struct cgroup_cls_state *task_cls_state(struct task_struct *p)
 {
-	return container_of(task_subsys_state(p, net_cls_subsys_id),
+	return container_of(task_css(p, net_cls_subsys_id),
 			    struct cgroup_cls_state, css);
 }
 
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index e8aad69..87a0a03 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -58,12 +58,12 @@ static inline struct dev_cgroup *css_to_devcgroup(struct cgroup_subsys_state *s)
 
 static inline struct dev_cgroup *cgroup_to_devcgroup(struct cgroup *cgroup)
 {
-	return css_to_devcgroup(cgroup_subsys_state(cgroup, devices_subsys_id));
+	return css_to_devcgroup(cgroup_css(cgroup, devices_subsys_id));
 }
 
 static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
 {
-	return css_to_devcgroup(task_subsys_state(task, devices_subsys_id));
+	return css_to_devcgroup(task_css(task, devices_subsys_id));
 }
 
 struct cgroup_subsys devices_subsys;
-- 
1.8.3.1


* [PATCH 02/23] cpuset: drop "const" qualifiers from struct cpuset instances
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
  2013-08-01 21:49 ` [PATCH 01/23] cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/ Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 03/23] netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state Tejun Heo
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

cpuset uses "const" qualifiers on struct cpuset in some functions;
however, it doesn't work well when a value derived from a returned
const pointer has to be passed to an accessor.  It's C after all.

Drop the "const" qualifiers except for the trivially leaf ones.  This
patch doesn't make any functional changes.
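
To make the problem concrete, here is a small userspace sketch.  All
demo_* names are hypothetical stand-ins, not the kernel's definitions;
the accessor simply models a helper that takes a non-const pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct demo_css {
	struct demo_css *parent;
};

struct demo_cpuset {
	struct demo_css css;	/* embedded, like cpuset's css member */
	int cpus;
};

#define css_to_cs(p) \
	((struct demo_cpuset *)((char *)(p) - offsetof(struct demo_cpuset, css)))

/* An accessor that takes a non-const pointer. */
static inline struct demo_css *demo_css_parent(struct demo_css *css)
{
	return css->parent;
}

/*
 * If @cs were "const struct demo_cpuset *", &cs->css would have type
 * "const struct demo_css *" and the call below would discard the
 * qualifier: a compiler warning, or a cast to silence it.
 */
static inline struct demo_cpuset *demo_parent_cs(struct demo_cpuset *cs)
{
	struct demo_css *pcss = demo_css_parent(&cs->css);

	return pcss ? css_to_cs(pcss) : NULL;
}
```

C has no way to say "const in, const out" for such an accessor without
duplicating it, which is why dropping the qualifier on the non-leaf
functions is the simpler option.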

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
---
 kernel/cpuset.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index be4512b..f737134 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -128,7 +128,7 @@ static inline struct cpuset *task_cs(struct task_struct *task)
 			    struct cpuset, css);
 }
 
-static inline struct cpuset *parent_cs(const struct cpuset *cs)
+static inline struct cpuset *parent_cs(struct cpuset *cs)
 {
 	struct cgroup *pcgrp = cs->css.cgroup->parent;
 
@@ -319,8 +319,7 @@ static struct file_system_type cpuset_fs_type = {
  *
  * Call with callback_mutex held.
  */
-static void guarantee_online_cpus(const struct cpuset *cs,
-				  struct cpumask *pmask)
+static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
 {
 	while (!cpumask_intersects(cs->cpus_allowed, cpu_online_mask))
 		cs = parent_cs(cs);
@@ -338,7 +337,7 @@ static void guarantee_online_cpus(const struct cpuset *cs,
  *
  * Call with callback_mutex held.
  */
-static void guarantee_online_mems(const struct cpuset *cs, nodemask_t *pmask)
+static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
 {
 	while (!nodes_intersects(cs->mems_allowed, node_states[N_MEMORY]))
 		cs = parent_cs(cs);
@@ -383,7 +382,7 @@ static int is_cpuset_subset(const struct cpuset *p, const struct cpuset *q)
  * alloc_trial_cpuset - allocate a trial cpuset
  * @cs: the cpuset that the trial cpuset duplicates
  */
-static struct cpuset *alloc_trial_cpuset(const struct cpuset *cs)
+static struct cpuset *alloc_trial_cpuset(struct cpuset *cs)
 {
 	struct cpuset *trial;
 
@@ -430,7 +429,7 @@ static void free_trial_cpuset(struct cpuset *trial)
  * Return 0 if valid, -errno if not.
  */
 
-static int validate_change(const struct cpuset *cur, const struct cpuset *trial)
+static int validate_change(struct cpuset *cur, struct cpuset *trial)
 {
 	struct cgroup *cgrp;
 	struct cpuset *c, *par;
@@ -2343,7 +2342,7 @@ void cpuset_cpus_allowed(struct task_struct *tsk, struct cpumask *pmask)
 
 void cpuset_cpus_allowed_fallback(struct task_struct *tsk)
 {
-	const struct cpuset *cpus_cs;
+	struct cpuset *cpus_cs;
 
 	rcu_read_lock();
 	cpus_cs = effective_cpumask_cpuset(task_cs(tsk));
@@ -2416,7 +2415,7 @@ int cpuset_nodemask_valid_mems_allowed(nodemask_t *nodemask)
  * callback_mutex.  If no ancestor is mem_exclusive or mem_hardwall
  * (an unusual configuration), then returns the root cpuset.
  */
-static const struct cpuset *nearest_hardwall_ancestor(const struct cpuset *cs)
+static struct cpuset *nearest_hardwall_ancestor(struct cpuset *cs)
 {
 	while (!(is_mem_exclusive(cs) || is_mem_hardwall(cs)) && parent_cs(cs))
 		cs = parent_cs(cs);
@@ -2486,7 +2485,7 @@ static const struct cpuset *nearest_hardwall_ancestor(const struct cpuset *cs)
  */
 int __cpuset_node_allowed_softwall(int node, gfp_t gfp_mask)
 {
-	const struct cpuset *cs;	/* current cpuset ancestors */
+	struct cpuset *cs;		/* current cpuset ancestors */
 	int allowed;			/* is allocation in zone z allowed? */
 
 	if (in_interrupt() || (gfp_mask & __GFP_THISNODE))
-- 
1.8.3.1


* [PATCH 03/23] netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
  2013-08-01 21:49 ` [PATCH 01/23] cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/ Tejun Heo
  2013-08-01 21:49 ` [PATCH 02/23] cpuset: drop "const" qualifiers from struct cpuset instances Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 22:07   ` David Miller
  2013-08-02 11:42   ` Neil Horman
  2013-08-01 21:49 ` [PATCH 04/23] hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup Tejun Heo
                   ` (21 subsequent siblings)
  24 siblings, 2 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Neil Horman,
	David S. Miller

cgroup controller API will be converted to primarily use struct
cgroup_subsys_state instead of struct cgroup.  In preparation, make
the internal functions of netprio_cgroup pass around @css instead of
@cgrp.

While at it, kill struct cgroup_netprio_state which only contained
struct cgroup_subsys_state without serving any purpose.  All functions
are converted to deal with @css directly.

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: David S. Miller <davem@davemloft.net>
---
 include/net/netprio_cgroup.h |  4 ----
 net/core/netprio_cgroup.c    | 56 ++++++++++++++++++++++----------------------
 2 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/include/net/netprio_cgroup.h b/include/net/netprio_cgroup.h
index 8110fa7..a24f8bb 100644
--- a/include/net/netprio_cgroup.h
+++ b/include/net/netprio_cgroup.h
@@ -25,10 +25,6 @@ struct netprio_map {
 	u32 priomap[];
 };
 
-struct cgroup_netprio_state {
-	struct cgroup_subsys_state css;
-};
-
 extern void sock_update_netprioidx(struct sock *sk);
 
 #if IS_BUILTIN(CONFIG_NETPRIO_CGROUP)
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index ccf8523..5dfac88 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -29,12 +29,6 @@
 
 #define PRIOMAP_MIN_SZ		128
 
-static inline struct cgroup_netprio_state *cgrp_netprio_state(struct cgroup *cgrp)
-{
-	return container_of(cgroup_css(cgrp, net_prio_subsys_id),
-			    struct cgroup_netprio_state, css);
-}
-
 /*
  * Extend @dev->priomap so that it's large enough to accomodate
  * @target_idx.  @dev->priomap.priomap_len > @target_idx after successful
@@ -87,68 +81,72 @@ static int extend_netdev_table(struct net_device *dev, u32 target_idx)
 
 /**
  * netprio_prio - return the effective netprio of a cgroup-net_device pair
- * @cgrp: cgroup part of the target pair
+ * @css: css part of the target pair
  * @dev: net_device part of the target pair
  *
  * Should be called under RCU read or rtnl lock.
  */
-static u32 netprio_prio(struct cgroup *cgrp, struct net_device *dev)
+static u32 netprio_prio(struct cgroup_subsys_state *css, struct net_device *dev)
 {
 	struct netprio_map *map = rcu_dereference_rtnl(dev->priomap);
+	int id = css->cgroup->id;
 
-	if (map && cgrp->id < map->priomap_len)
-		return map->priomap[cgrp->id];
+	if (map && id < map->priomap_len)
+		return map->priomap[id];
 	return 0;
 }
 
 /**
  * netprio_set_prio - set netprio on a cgroup-net_device pair
- * @cgrp: cgroup part of the target pair
+ * @css: css part of the target pair
  * @dev: net_device part of the target pair
  * @prio: prio to set
  *
- * Set netprio to @prio on @cgrp-@dev pair.  Should be called under rtnl
+ * Set netprio to @prio on @css-@dev pair.  Should be called under rtnl
  * lock and may fail under memory pressure for non-zero @prio.
  */
-static int netprio_set_prio(struct cgroup *cgrp, struct net_device *dev,
-			    u32 prio)
+static int netprio_set_prio(struct cgroup_subsys_state *css,
+			    struct net_device *dev, u32 prio)
 {
 	struct netprio_map *map;
+	int id = css->cgroup->id;
 	int ret;
 
 	/* avoid extending priomap for zero writes */
 	map = rtnl_dereference(dev->priomap);
-	if (!prio && (!map || map->priomap_len <= cgrp->id))
+	if (!prio && (!map || map->priomap_len <= id))
 		return 0;
 
-	ret = extend_netdev_table(dev, cgrp->id);
+	ret = extend_netdev_table(dev, id);
 	if (ret)
 		return ret;
 
 	map = rtnl_dereference(dev->priomap);
-	map->priomap[cgrp->id] = prio;
+	map->priomap[id] = prio;
 	return 0;
 }
 
 static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
 {
-	struct cgroup_netprio_state *cs;
+	struct cgroup_subsys_state *css;
 
-	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
-	if (!cs)
+	css = kzalloc(sizeof(*css), GFP_KERNEL);
+	if (!css)
 		return ERR_PTR(-ENOMEM);
 
-	return &cs->css;
+	return css;
 }
 
 static int cgrp_css_online(struct cgroup *cgrp)
 {
-	struct cgroup *parent = cgrp->parent;
+	struct cgroup_subsys_state *css = cgroup_css(cgrp, net_prio_subsys_id);
+	struct cgroup_subsys_state *parent_css;
 	struct net_device *dev;
 	int ret = 0;
 
-	if (!parent)
+	if (!cgrp->parent)
 		return 0;
+	parent_css = cgroup_css(cgrp->parent, net_prio_subsys_id);
 
 	rtnl_lock();
 	/*
@@ -156,9 +154,9 @@ static int cgrp_css_online(struct cgroup *cgrp)
 	 * onlining, there is no need to clear them on offline.
 	 */
 	for_each_netdev(&init_net, dev) {
-		u32 prio = netprio_prio(parent, dev);
+		u32 prio = netprio_prio(parent_css, dev);
 
-		ret = netprio_set_prio(cgrp, dev, prio);
+		ret = netprio_set_prio(css, dev, prio);
 		if (ret)
 			break;
 	}
@@ -168,7 +166,7 @@ static int cgrp_css_online(struct cgroup *cgrp)
 
 static void cgrp_css_free(struct cgroup *cgrp)
 {
-	kfree(cgrp_netprio_state(cgrp));
+	kfree(cgroup_css(cgrp, net_prio_subsys_id));
 }
 
 static u64 read_prioidx(struct cgroup *cgrp, struct cftype *cft)
@@ -179,11 +177,12 @@ static u64 read_prioidx(struct cgroup *cgrp, struct cftype *cft)
 static int read_priomap(struct cgroup *cont, struct cftype *cft,
 			struct cgroup_map_cb *cb)
 {
+	struct cgroup_subsys_state *css = cgroup_css(cont, net_prio_subsys_id);
 	struct net_device *dev;
 
 	rcu_read_lock();
 	for_each_netdev_rcu(&init_net, dev)
-		cb->fill(cb, dev->name, netprio_prio(cont, dev));
+		cb->fill(cb, dev->name, netprio_prio(css, dev));
 	rcu_read_unlock();
 	return 0;
 }
@@ -191,6 +190,7 @@ static int read_priomap(struct cgroup *cont, struct cftype *cft,
 static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 			 const char *buffer)
 {
+	struct cgroup_subsys_state *css = cgroup_css(cgrp, net_prio_subsys_id);
 	char devname[IFNAMSIZ + 1];
 	struct net_device *dev;
 	u32 prio;
@@ -205,7 +205,7 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 
 	rtnl_lock();
 
-	ret = netprio_set_prio(cgrp, dev, prio);
+	ret = netprio_set_prio(css, dev, prio);
 
 	rtnl_unlock();
 	dev_put(dev);
-- 
1.8.3.1



* [PATCH 04/23] hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (2 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 03/23] netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02  4:35   ` Aneesh Kumar K.V
  2013-08-02 13:10   ` Michal Hocko
  2013-08-01 21:49 ` [PATCH 05/23] cgroup: add subsystem pointer to cgroup_subsys_state Tejun Heo
                   ` (20 subsequent siblings)
  24 siblings, 2 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Aneesh Kumar K.V,
	KAMEZAWA Hiroyuki, Michal Hocko, Johannes Weiner

cgroup controller API will be converted to primarily use struct
cgroup_subsys_state instead of struct cgroup.  In preparation, make
hugetlb_cgroup functions pass around struct hugetlb_cgroup instead of
struct cgroup.

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/hugetlb_cgroup.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 50f213f..d2f9fc0 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -56,17 +56,19 @@ static inline bool hugetlb_cgroup_is_root(struct hugetlb_cgroup *h_cg)
 	return (h_cg == root_h_cgroup);
 }
 
-static inline struct hugetlb_cgroup *parent_hugetlb_cgroup(struct cgroup *cg)
+static inline struct hugetlb_cgroup *
+parent_hugetlb_cgroup(struct hugetlb_cgroup *h_cg)
 {
-	if (!cg->parent)
+	struct cgroup *parent = h_cg->css.cgroup->parent;
+
+	if (!parent)
 		return NULL;
-	return hugetlb_cgroup_from_cgroup(cg->parent);
+	return hugetlb_cgroup_from_cgroup(parent);
 }
 
-static inline bool hugetlb_cgroup_have_usage(struct cgroup *cg)
+static inline bool hugetlb_cgroup_have_usage(struct hugetlb_cgroup *h_cg)
 {
 	int idx;
-	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cg);
 
 	for (idx = 0; idx < hugetlb_max_hstate; idx++) {
 		if ((res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE)) > 0)
@@ -115,15 +117,14 @@ static void hugetlb_cgroup_css_free(struct cgroup *cgroup)
  * page reference and test for page active here. This function
  * cannot fail.
  */
-static void hugetlb_cgroup_move_parent(int idx, struct cgroup *cgroup,
+static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
 				       struct page *page)
 {
 	int csize;
 	struct res_counter *counter;
 	struct res_counter *fail_res;
 	struct hugetlb_cgroup *page_hcg;
-	struct hugetlb_cgroup *h_cg   = hugetlb_cgroup_from_cgroup(cgroup);
-	struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(cgroup);
+	struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(h_cg);
 
 	page_hcg = hugetlb_cgroup_from_page(page);
 	/*
@@ -155,6 +156,7 @@ out:
  */
 static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
 {
+	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
 	struct hstate *h;
 	struct page *page;
 	int idx = 0;
@@ -163,13 +165,13 @@ static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
 		for_each_hstate(h) {
 			spin_lock(&hugetlb_lock);
 			list_for_each_entry(page, &h->hugepage_activelist, lru)
-				hugetlb_cgroup_move_parent(idx, cgroup, page);
+				hugetlb_cgroup_move_parent(idx, h_cg, page);
 
 			spin_unlock(&hugetlb_lock);
 			idx++;
 		}
 		cond_resched();
-	} while (hugetlb_cgroup_have_usage(cgroup));
+	} while (hugetlb_cgroup_have_usage(h_cg));
 }
 
 int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
-- 
1.8.3.1



* [PATCH 05/23] cgroup: add subsystem pointer to cgroup_subsys_state
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (3 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 04/23] hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 06/23] cgroup: add/update accessors which obtain subsys specific data from css Tejun Heo
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

Currently, given a cgroup_subsys_state, there's no way to find out
which subsystem the css is for, which we'll need to convert the cgroup
controller API to primarily use @css instead of @cgroup.  This patch
adds cgroup_subsys_state->ss which points to the subsystem the @css
belongs to.

While at it, remove the comment about accessing @css->cgroup to
determine the hierarchy.  cgroup core will provide an API to traverse
the hierarchy of css's and we don't want subsystems to directly walk
cgroup hierarchies anymore.
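
As a rough illustration of what the back-pointer buys, the following
userspace sketch shows a css identifying its own subsystem without any
help from the caller.  The struct layouts are cut-down stand-ins, and
init_css() and css_subsys_name() are hypothetical helpers that are not
part of this patch; only the css->ss assignment mirrors the change made
to init_cgroup_css().

```c
#include <stddef.h>

/* Cut-down userspace stand-ins; not the kernel definitions. */
struct cgroup;

struct cgroup_subsys {
	const char *name;
	int subsys_id;
};

struct cgroup_subsys_state {
	struct cgroup *cgroup;		/* the cgroup this css is attached to */
	struct cgroup_subsys *ss;	/* the subsystem this css belongs to */
};

/* Hypothetical helper mirroring the assignment added to init_cgroup_css(). */
static void init_css(struct cgroup_subsys_state *css,
		     struct cgroup_subsys *ss, struct cgroup *cgrp)
{
	css->cgroup = cgrp;
	css->ss = ss;
}

/* Hypothetical helper: with the back-pointer, a css alone is enough. */
static const char *css_subsys_name(const struct cgroup_subsys_state *css)
{
	return css->ss->name;
}
```

Before this change a caller holding only a css would have had to be
told the subsystem separately.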

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h | 9 ++++-----
 kernel/cgroup.c        | 1 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 1938292..9c2f9d7 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -66,13 +66,12 @@ enum cgroup_subsys_id {
 
 /* Per-subsystem/per-cgroup state maintained by the system. */
 struct cgroup_subsys_state {
-	/*
-	 * The cgroup that this subsystem is attached to. Useful
-	 * for subsystems that want to know about the cgroup
-	 * hierarchy structure
-	 */
+	/* the cgroup that this css is attached to */
 	struct cgroup *cgroup;
 
+	/* the cgroup subsystem that this css is attached to */
+	struct cgroup_subsys *ss;
+
 	/* reference count - access via css_[try]get() and css_put() */
 	struct percpu_ref refcnt;
 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 9d5af91..fad5498 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4179,6 +4179,7 @@ static void init_cgroup_css(struct cgroup_subsys_state *css,
 			       struct cgroup *cgrp)
 {
 	css->cgroup = cgrp;
+	css->ss = ss;
 	css->flags = 0;
 	css->id = NULL;
 	if (cgrp == cgroup_dummy_top)
-- 
1.8.3.1



* [PATCH 06/23] cgroup: add/update accessors which obtain subsys specific data from css
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (4 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 05/23] cgroup: add subsystem pointer to cgroup_subsys_state Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 07/23] cgroup: add css_parent() Tejun Heo
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

css (cgroup_subsys_state) is usually embedded in a subsys specific
data structure.  Subsystems either use container_of() directly to cast
from css to such data structure or have an accessor function wrapping
such cast.  As cgroup as a whole is moving towards using css as the
main interface handle, add and update such accessors to ease dealing
with css's.

All accessors explicitly handle NULL input and return NULL in those
cases.  While this looks like an extra branch in the code, as all
controller specific data structures have css as the first field, the
casting doesn't involve any offsetting and the compiler can trivially
optimize out the branch.
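
The accessor pattern can be sketched in userspace as below.  This is
an illustration only: container_of() here is a simplified version of
the kernel macro, and struct demo_state and css_demo() are made-up
names standing in for a controller's state and its accessor.

```c
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct cgroup_subsys_state {
	int dummy;	/* placeholder; the real struct has more fields */
};

/* Hypothetical controller state with css as its first member. */
struct demo_state {
	struct cgroup_subsys_state css;
	int value;
};

/* NULL-safe accessor in the style this patch adds. */
static inline struct demo_state *css_demo(struct cgroup_subsys_state *css)
{
	return css ? container_of(css, struct demo_state, css) : NULL;
}
```

Because css sits at offset zero, container_of() subtracts nothing and
the conditional reduces to returning the input pointer, which is why
the explicit NULL test is expected to cost nothing after optimization.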

* blkio, freezer, cpuset, cpu, cpuacct and net_cls didn't have such
  accessor.  Added.

* memory, hugetlb and devices already had one but didn't explicitly
  handle NULL input.  Updated.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-cgroup.h       | 12 ++++++++----
 kernel/cgroup_freezer.c  | 11 +++++++----
 kernel/cpuset.c          | 11 +++++++----
 kernel/sched/core.c      |  8 ++++++--
 kernel/sched/cpuacct.c   | 11 +++++++----
 mm/hugetlb_cgroup.c      |  2 +-
 mm/memcontrol.c          |  2 +-
 net/sched/cls_cgroup.c   | 11 +++++++----
 security/device_cgroup.c |  2 +-
 9 files changed, 45 insertions(+), 25 deletions(-)

diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 628e50f..8e5863e 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -179,21 +179,25 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 void blkg_conf_finish(struct blkg_conf_ctx *ctx);
 
 
+static inline struct blkcg *css_to_blkcg(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct blkcg, css) : NULL;
+}
+
 static inline struct blkcg *cgroup_to_blkcg(struct cgroup *cgroup)
 {
-	return container_of(cgroup_css(cgroup, blkio_subsys_id),
-			    struct blkcg, css);
+	return css_to_blkcg(cgroup_css(cgroup, blkio_subsys_id));
 }
 
 static inline struct blkcg *task_blkcg(struct task_struct *tsk)
 {
-	return container_of(task_css(tsk, blkio_subsys_id), struct blkcg, css);
+	return css_to_blkcg(task_css(tsk, blkio_subsys_id));
 }
 
 static inline struct blkcg *bio_blkcg(struct bio *bio)
 {
 	if (bio && bio->bi_css)
-		return container_of(bio->bi_css, struct blkcg, css);
+		return css_to_blkcg(bio->bi_css);
 	return task_blkcg(current);
 }
 
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 9d3f615..1db686e 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -45,16 +45,19 @@ struct freezer {
 	spinlock_t			lock;
 };
 
+static inline struct freezer *css_freezer(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct freezer, css) : NULL;
+}
+
 static inline struct freezer *cgroup_freezer(struct cgroup *cgroup)
 {
-	return container_of(cgroup_css(cgroup, freezer_subsys_id),
-			    struct freezer, css);
+	return css_freezer(cgroup_css(cgroup, freezer_subsys_id));
 }
 
 static inline struct freezer *task_freezer(struct task_struct *task)
 {
-	return container_of(task_css(task, freezer_subsys_id),
-			    struct freezer, css);
+	return css_freezer(task_css(task, freezer_subsys_id));
 }
 
 static struct freezer *parent_freezer(struct freezer *freezer)
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index f737134..6e9cbdd 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -114,18 +114,21 @@ struct cpuset {
 	int relax_domain_level;
 };
 
+static inline struct cpuset *css_cs(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct cpuset, css) : NULL;
+}
+
 /* Retrieve the cpuset for a cgroup */
 static inline struct cpuset *cgroup_cs(struct cgroup *cgrp)
 {
-	return container_of(cgroup_css(cgrp, cpuset_subsys_id),
-			    struct cpuset, css);
+	return css_cs(cgroup_css(cgrp, cpuset_subsys_id));
 }
 
 /* Retrieve the cpuset for a task */
 static inline struct cpuset *task_cs(struct task_struct *task)
 {
-	return container_of(task_css(task, cpuset_subsys_id),
-			    struct cpuset, css);
+	return css_cs(task_css(task, cpuset_subsys_id));
 }
 
 static inline struct cpuset *parent_cs(struct cpuset *cs)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 323d907..5bccb02 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7083,11 +7083,15 @@ int sched_rt_handler(struct ctl_table *table, int write,
 
 #ifdef CONFIG_CGROUP_SCHED
 
+static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct task_group, css) : NULL;
+}
+
 /* return corresponding task_group object of a cgroup */
 static inline struct task_group *cgroup_tg(struct cgroup *cgrp)
 {
-	return container_of(cgroup_css(cgrp, cpu_cgroup_subsys_id),
-			    struct task_group, css);
+	return css_tg(cgroup_css(cgrp, cpu_cgroup_subsys_id));
 }
 
 static struct cgroup_subsys_state *cpu_cgroup_css_alloc(struct cgroup *cgrp)
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 4a210fa..8ccfa10 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -33,18 +33,21 @@ struct cpuacct {
 	struct kernel_cpustat __percpu *cpustat;
 };
 
+static inline struct cpuacct *css_ca(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct cpuacct, css) : NULL;
+}
+
 /* return cpu accounting group corresponding to this container */
 static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
 {
-	return container_of(cgroup_css(cgrp, cpuacct_subsys_id),
-			    struct cpuacct, css);
+	return css_ca(cgroup_css(cgrp, cpuacct_subsys_id));
 }
 
 /* return cpu accounting group to which this task belongs */
 static inline struct cpuacct *task_ca(struct task_struct *tsk)
 {
-	return container_of(task_css(tsk, cpuacct_subsys_id),
-			    struct cpuacct, css);
+	return css_ca(task_css(tsk, cpuacct_subsys_id));
 }
 
 static inline struct cpuacct *__parent_ca(struct cpuacct *ca)
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index d2f9fc0..95585a0 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -36,7 +36,7 @@ static struct hugetlb_cgroup *root_h_cgroup __read_mostly;
 static inline
 struct hugetlb_cgroup *hugetlb_cgroup_from_css(struct cgroup_subsys_state *s)
 {
-	return container_of(s, struct hugetlb_cgroup, css);
+	return s ? container_of(s, struct hugetlb_cgroup, css) : NULL;
 }
 
 static inline
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b47bd3a..11d659e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -486,7 +486,7 @@ static DEFINE_MUTEX(memcg_create_mutex);
 static inline
 struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *s)
 {
-	return container_of(s, struct mem_cgroup, css);
+	return s ? container_of(s, struct mem_cgroup, css) : NULL;
 }
 
 /* Some nice accessors for the vmpressure. */
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 5ee72a0..af412ab 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -23,16 +23,19 @@
 #include <net/sock.h>
 #include <net/cls_cgroup.h>
 
+static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct cgroup_cls_state, css) : NULL;
+}
+
 static inline struct cgroup_cls_state *cgrp_cls_state(struct cgroup *cgrp)
 {
-	return container_of(cgroup_css(cgrp, net_cls_subsys_id),
-			    struct cgroup_cls_state, css);
+	return css_cls_state(cgroup_css(cgrp, net_cls_subsys_id));
 }
 
 static inline struct cgroup_cls_state *task_cls_state(struct task_struct *p)
 {
-	return container_of(task_css(p, net_cls_subsys_id),
-			    struct cgroup_cls_state, css);
+	return css_cls_state(task_css(p, net_cls_subsys_id));
 }
 
 static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 87a0a03..9095364 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -53,7 +53,7 @@ struct dev_cgroup {
 
 static inline struct dev_cgroup *css_to_devcgroup(struct cgroup_subsys_state *s)
 {
-	return container_of(s, struct dev_cgroup, css);
+	return s ? container_of(s, struct dev_cgroup, css) : NULL;
 }
 
 static inline struct dev_cgroup *cgroup_to_devcgroup(struct cgroup *cgroup)
-- 
1.8.3.1



* [PATCH 07/23] cgroup: add css_parent()
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (5 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 06/23] cgroup: add/update accessors which obtain subsys specific data from css Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

Currently, controllers have to explicitly follow the cgroup hierarchy
to find the parent of a given css.  cgroup is moving towards using
cgroup_subsys_state as the main controller interface construct, so
let's provide a way to climb the hierarchy using just csses.

This patch implements css_parent() which, given a css, returns its
parent.  The function is guaranteed to return a valid non-NULL parent
css as long as the target css is not at the top of the hierarchy.
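
A minimal userspace sketch of climbing the hierarchy with css's only
follows.  css_parent() has the same body as the helper added to
include/linux/cgroup.h by this patch; the struct definitions are
cut-down stand-ins, and DEMO_SUBSYS_COUNT and css_depth() are
hypothetical names used purely for illustration.

```c
#include <stddef.h>

#define DEMO_SUBSYS_COUNT 2	/* hypothetical array size for this sketch */

struct cgroup_subsys_state;

struct cgroup_subsys {
	int subsys_id;
};

struct cgroup {
	struct cgroup *parent;
	struct cgroup_subsys_state *subsys[DEMO_SUBSYS_COUNT];
};

struct cgroup_subsys_state {
	struct cgroup *cgroup;
	struct cgroup_subsys *ss;
};

/* Same logic as the css_parent() added by this patch. */
static inline struct cgroup_subsys_state *
css_parent(struct cgroup_subsys_state *css)
{
	struct cgroup *parent_cgrp = css->cgroup->parent;

	return parent_cgrp ? parent_cgrp->subsys[css->ss->subsys_id] : NULL;
}

/* Hypothetical walker: number of ancestors between @css and the root. */
static int css_depth(struct cgroup_subsys_state *css)
{
	int depth = 0;

	while ((css = css_parent(css)))
		depth++;
	return depth;
}
```

The walker never touches cgroup->parent itself, which is the property
the controller conversions in this patch rely on.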

freezer, cpuset, cpu, cpuacct, hugetlb, memory, net_cls and devices
are converted to use css_parent() instead of accessing cgroup->parent
directly.

* __parent_ca() is dropped from cpuacct and its usage is replaced with
  parent_ca().  The only difference between the two was the NULL test
  on cgroup->parent which is now embedded in css_parent() making the
  distinction moot.  Note that eventually a css->parent field will be
  added to css and the NULL check in css_parent() will go away.

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-cgroup.h       |  4 +---
 include/linux/cgroup.h   | 15 +++++++++++++++
 kernel/cgroup_freezer.c  |  8 ++------
 kernel/cpuset.c          |  6 +-----
 kernel/sched/core.c      |  9 +++------
 kernel/sched/cpuacct.c   | 11 ++---------
 mm/hugetlb_cgroup.c      |  6 +-----
 mm/memcontrol.c          | 39 +++++++++++----------------------------
 net/sched/cls_cgroup.c   |  8 +++++---
 security/device_cgroup.c | 18 +++++-------------
 10 files changed, 46 insertions(+), 78 deletions(-)

diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 8e5863e..b6802c4 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -209,9 +209,7 @@ static inline struct blkcg *bio_blkcg(struct bio *bio)
  */
 static inline struct blkcg *blkcg_parent(struct blkcg *blkcg)
 {
-	struct cgroup *pcg = blkcg->css.cgroup->parent;
-
-	return pcg ? cgroup_to_blkcg(pcg) : NULL;
+	return css_to_blkcg(css_parent(&blkcg->css));
 }
 
 /**
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 9c2f9d7..b65f6b5 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -648,6 +648,21 @@ struct cgroup_subsys {
 #undef SUBSYS
 
 /**
+ * css_parent - find the parent css
+ * @css: the target cgroup_subsys_state
+ *
+ * Return the parent css of @css.  This function is guaranteed to return
+ * non-NULL parent as long as @css isn't the root.
+ */
+static inline
+struct cgroup_subsys_state *css_parent(struct cgroup_subsys_state *css)
+{
+	struct cgroup *parent_cgrp = css->cgroup->parent;
+
+	return parent_cgrp ? parent_cgrp->subsys[css->ss->subsys_id] : NULL;
+}
+
+/**
  * cgroup_css - obtain a cgroup's css for the specified subsystem
  * @cgrp: the cgroup of interest
  * @subsys_id: the subsystem of interest
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 1db686e..657a73c 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -62,11 +62,7 @@ static inline struct freezer *task_freezer(struct task_struct *task)
 
 static struct freezer *parent_freezer(struct freezer *freezer)
 {
-	struct cgroup *pcg = freezer->css.cgroup->parent;
-
-	if (pcg)
-		return cgroup_freezer(pcg);
-	return NULL;
+	return css_freezer(css_parent(&freezer->css));
 }
 
 bool cgroup_freezing(struct task_struct *task)
@@ -234,7 +230,7 @@ static void freezer_fork(struct task_struct *task)
 	 * The root cgroup is non-freezable, so we can skip the
 	 * following check.
 	 */
-	if (!freezer->css.cgroup->parent)
+	if (!parent_freezer(freezer))
 		goto out;
 
 	spin_lock_irq(&freezer->lock);
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 6e9cbdd..259a4af 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -133,11 +133,7 @@ static inline struct cpuset *task_cs(struct task_struct *task)
 
 static inline struct cpuset *parent_cs(struct cpuset *cs)
 {
-	struct cgroup *pcgrp = cs->css.cgroup->parent;
-
-	if (pcgrp)
-		return cgroup_cs(pcgrp);
-	return NULL;
+	return css_cs(css_parent(&cs->css));
 }
 
 #ifdef CONFIG_NUMA
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5bccb02..7a10742 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7114,13 +7114,10 @@ static struct cgroup_subsys_state *cpu_cgroup_css_alloc(struct cgroup *cgrp)
 static int cpu_cgroup_css_online(struct cgroup *cgrp)
 {
 	struct task_group *tg = cgroup_tg(cgrp);
-	struct task_group *parent;
+	struct task_group *parent = css_tg(css_parent(&tg->css));
 
-	if (!cgrp->parent)
-		return 0;
-
-	parent = cgroup_tg(cgrp->parent);
-	sched_online_group(tg, parent);
+	if (parent)
+		sched_online_group(tg, parent);
 	return 0;
 }
 
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 8ccfa10..f6926a1 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -50,16 +50,9 @@ static inline struct cpuacct *task_ca(struct task_struct *tsk)
 	return css_ca(task_css(tsk, cpuacct_subsys_id));
 }
 
-static inline struct cpuacct *__parent_ca(struct cpuacct *ca)
-{
-	return cgroup_ca(ca->css.cgroup->parent);
-}
-
 static inline struct cpuacct *parent_ca(struct cpuacct *ca)
 {
-	if (!ca->css.cgroup->parent)
-		return NULL;
-	return cgroup_ca(ca->css.cgroup->parent);
+	return css_ca(css_parent(&ca->css));
 }
 
 static DEFINE_PER_CPU(u64, root_cpuacct_cpuusage);
@@ -284,7 +277,7 @@ void cpuacct_account_field(struct task_struct *p, int index, u64 val)
 	while (ca != &root_cpuacct) {
 		kcpustat = this_cpu_ptr(ca->cpustat);
 		kcpustat->cpustat[index] += val;
-		ca = __parent_ca(ca);
+		ca = parent_ca(ca);
 	}
 	rcu_read_unlock();
 }
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 95585a0..57ecb5d 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -59,11 +59,7 @@ static inline bool hugetlb_cgroup_is_root(struct hugetlb_cgroup *h_cg)
 static inline struct hugetlb_cgroup *
 parent_hugetlb_cgroup(struct hugetlb_cgroup *h_cg)
 {
-	struct cgroup *parent = h_cg->css.cgroup->parent;
-
-	if (!parent)
-		return NULL;
-	return hugetlb_cgroup_from_cgroup(parent);
+	return hugetlb_cgroup_from_css(css_parent(&h_cg->css));
 }
 
 static inline bool hugetlb_cgroup_have_usage(struct hugetlb_cgroup *h_cg)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 11d659e..69b3e52 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1524,10 +1524,8 @@ static unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)
 
 int mem_cgroup_swappiness(struct mem_cgroup *memcg)
 {
-	struct cgroup *cgrp = memcg->css.cgroup;
-
 	/* root ? */
-	if (cgrp->parent == NULL)
+	if (!css_parent(&memcg->css))
 		return vm_swappiness;
 
 	return memcg->swappiness;
@@ -5026,11 +5024,7 @@ static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft,
 {
 	int retval = 0;
 	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
-	struct cgroup *parent = cont->parent;
-	struct mem_cgroup *parent_memcg = NULL;
-
-	if (parent)
-		parent_memcg = mem_cgroup_from_cont(parent);
+	struct mem_cgroup *parent_memcg = mem_cgroup_from_css(css_parent(&memcg->css));
 
 	mutex_lock(&memcg_create_mutex);
 
@@ -5282,18 +5276,15 @@ static int mem_cgroup_write(struct cgroup *cont, struct cftype *cft,
 static void memcg_get_hierarchical_limit(struct mem_cgroup *memcg,
 		unsigned long long *mem_limit, unsigned long long *memsw_limit)
 {
-	struct cgroup *cgroup;
 	unsigned long long min_limit, min_memsw_limit, tmp;
 
 	min_limit = res_counter_read_u64(&memcg->res, RES_LIMIT);
 	min_memsw_limit = res_counter_read_u64(&memcg->memsw, RES_LIMIT);
-	cgroup = memcg->css.cgroup;
 	if (!memcg->use_hierarchy)
 		goto out;
 
-	while (cgroup->parent) {
-		cgroup = cgroup->parent;
-		memcg = mem_cgroup_from_cont(cgroup);
+	while (css_parent(&memcg->css)) {
+		memcg = mem_cgroup_from_css(css_parent(&memcg->css));
 		if (!memcg->use_hierarchy)
 			break;
 		tmp = res_counter_read_u64(&memcg->res, RES_LIMIT);
@@ -5523,16 +5514,11 @@ static int mem_cgroup_swappiness_write(struct cgroup *cgrp, struct cftype *cft,
 				       u64 val)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
-	struct mem_cgroup *parent;
-
-	if (val > 100)
-		return -EINVAL;
+	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
 
-	if (cgrp->parent == NULL)
+	if (val > 100 || !parent)
 		return -EINVAL;
 
-	parent = mem_cgroup_from_cont(cgrp->parent);
-
 	mutex_lock(&memcg_create_mutex);
 
 	/* If under hierarchy, only empty-root can set this value */
@@ -5861,14 +5847,12 @@ static int mem_cgroup_oom_control_write(struct cgroup *cgrp,
 	struct cftype *cft, u64 val)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
-	struct mem_cgroup *parent;
+	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
 
 	/* cannot set to root cgroup and only 0 and 1 are allowed */
-	if (!cgrp->parent || !((val == 0) || (val == 1)))
+	if (!parent || !((val == 0) || (val == 1)))
 		return -EINVAL;
 
-	parent = mem_cgroup_from_cont(cgrp->parent);
-
 	mutex_lock(&memcg_create_mutex);
 	/* oom-kill-disable is a flag for subhierarchy. */
 	if ((parent->use_hierarchy) || memcg_has_children(memcg)) {
@@ -6266,15 +6250,14 @@ free_out:
 static int
 mem_cgroup_css_online(struct cgroup *cont)
 {
-	struct mem_cgroup *memcg, *parent;
+	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
 	int error = 0;
 
-	if (!cont->parent)
+	if (!parent)
 		return 0;
 
 	mutex_lock(&memcg_create_mutex);
-	memcg = mem_cgroup_from_cont(cont);
-	parent = mem_cgroup_from_cont(cont->parent);
 
 	memcg->use_hierarchy = parent->use_hierarchy;
 	memcg->oom_kill_disable = parent->oom_kill_disable;
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index af412ab..9e6b75e 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -50,9 +50,11 @@ static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
 
 static int cgrp_css_online(struct cgroup *cgrp)
 {
-	if (cgrp->parent)
-		cgrp_cls_state(cgrp)->classid =
-			cgrp_cls_state(cgrp->parent)->classid;
+	struct cgroup_cls_state *cs = cgrp_cls_state(cgrp);
+	struct cgroup_cls_state *parent = css_cls_state(css_parent(&cs->css));
+
+	if (parent)
+		cs->classid = parent->classid;
 	return 0;
 }
 
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 9095364..635a49d 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -198,13 +198,11 @@ static inline bool is_devcg_online(const struct dev_cgroup *devcg)
  */
 static int devcgroup_online(struct cgroup *cgroup)
 {
-	struct dev_cgroup *dev_cgroup, *parent_dev_cgroup = NULL;
+	struct dev_cgroup *dev_cgroup = cgroup_to_devcgroup(cgroup);
+	struct dev_cgroup *parent_dev_cgroup = css_to_devcgroup(css_parent(&dev_cgroup->css));
 	int ret = 0;
 
 	mutex_lock(&devcgroup_mutex);
-	dev_cgroup = cgroup_to_devcgroup(cgroup);
-	if (cgroup->parent)
-		parent_dev_cgroup = cgroup_to_devcgroup(cgroup->parent);
 
 	if (parent_dev_cgroup == NULL)
 		dev_cgroup->behavior = DEVCG_DEFAULT_ALLOW;
@@ -394,12 +392,10 @@ static bool may_access(struct dev_cgroup *dev_cgroup,
 static int parent_has_perm(struct dev_cgroup *childcg,
 				  struct dev_exception_item *ex)
 {
-	struct cgroup *pcg = childcg->css.cgroup->parent;
-	struct dev_cgroup *parent;
+	struct dev_cgroup *parent = css_to_devcgroup(css_parent(&childcg->css));
 
-	if (!pcg)
+	if (!parent)
 		return 1;
-	parent = cgroup_to_devcgroup(pcg);
 	return may_access(parent, ex, childcg->behavior);
 }
 
@@ -524,15 +520,11 @@ static int devcgroup_update_access(struct dev_cgroup *devcgroup,
 	char temp[12];		/* 11 + 1 characters needed for a u32 */
 	int count, rc = 0;
 	struct dev_exception_item ex;
-	struct cgroup *p = devcgroup->css.cgroup;
-	struct dev_cgroup *parent = NULL;
+	struct dev_cgroup *parent = css_to_devcgroup(css_parent(&devcgroup->css));
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	if (p->parent)
-		parent = cgroup_to_devcgroup(p->parent);
-
 	memset(&ex, 0, sizeof(ex));
 	b = buffer;
 
-- 
1.8.3.1



* [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (6 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 07/23] cgroup: add css_parent() Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02  3:54   ` Li Zefan
                     ` (5 more replies)
  2013-08-01 21:49 ` [PATCH 09/23] cgroup: add subsys backlink pointer to cftype Tejun Heo
                   ` (16 subsequent siblings)
  24 siblings, 6 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Michal Hocko, Balbir Singh,
	Aristeu Rozanski, Matt Helsley, Daniel Wagner, Vivek Goyal,
	Jens Axboe, Steven Rostedt

cgroup is currently in the process of transitioning to using struct
cgroup_subsys_state * as the primary handle instead of struct cgroup *
in subsystem implementations for the following reasons.

* With unified hierarchy, subsystems will be dynamically bound and
  unbound from cgroups and thus css's (cgroup_subsys_state) may be
  created and destroyed dynamically over the lifetime of a cgroup,
  which is different from the current state where all css's are
  allocated and destroyed together with the associated cgroup.  This
  in turn means that cgroup_css() should be synchronized and may
  return NULL, making it more cumbersome to use.

* Differing levels of per-subsystem granularity in the unified
  hierarchy means that the task and descendant iterators should behave
  differently depending on the specific subsystem the iteration is
  being performed for.

* In the majority of cases, a subsystem only cares about its part in
  the cgroup hierarchy - ie. the hierarchy of css's.  Subsystem
  methods often obtain the matching css pointer from the cgroup and
  don't bother with the cgroup pointer itself.  Passing around css
  fits much better.

This patch converts all cgroup_subsys methods to take @css instead of
@cgroup.  The conversions are mostly straightforward.  A few
noteworthy changes are:

* ->css_alloc() now takes the css of the parent cgroup rather than the
  pointer to the new cgroup, as the css for the new cgroup doesn't
  exist yet.  Knowing the parent css is enough for all the existing
  subsystems.

* In kernel/cgroup.c::offline_css(), an unnecessary open-coded css
  dereference is replaced with local variable access.
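
To make the ->css_alloc() change concrete, here is a minimal
userspace sketch of an allocator that takes the parent css, treats a
NULL parent as the root case, and inherits from the parent otherwise.
demo_state, demo_root, demo_css_alloc() and demo_css_free() are
hypothetical names, calloc()/free() stand in for kzalloc()/kfree(),
and a NULL return stands in for ERR_PTR(-ENOMEM).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Pared-down stand-in for the kernel structure. */
struct cgroup_subsys_state {
	unsigned long flags;
};

/* Hypothetical subsystem state that embeds its css. */
struct demo_state {
	int depth;	/* illustrative value inherited from the parent */
	struct cgroup_subsys_state css;
};

static struct demo_state demo_root;	/* statically allocated root state */

static struct cgroup_subsys_state *
demo_css_alloc(struct cgroup_subsys_state *parent_css)
{
	struct demo_state *parent, *st;

	/* No parent css means the root is being set up. */
	if (!parent_css)
		return &demo_root.css;

	parent = container_of(parent_css, struct demo_state, css);
	st = calloc(1, sizeof(*st));
	if (!st)
		return NULL;	/* kernel code would return ERR_PTR(-ENOMEM) */

	st->depth = parent->depth + 1;
	return &st->css;
}

static void demo_css_free(struct cgroup_subsys_state *css)
{
	struct demo_state *st;

	assert(css);
	st = container_of(css, struct demo_state, css);
	if (st != &demo_root)	/* the root is static and never freed */
		free(st);
}
```

The free side takes the css itself, mirroring the converted
->css_free() methods in the diff below.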

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 block/blk-cgroup.c        | 25 +++++++++++-----------
 include/linux/cgroup.h    | 24 ++++++++++++---------
 kernel/cgroup.c           | 53 ++++++++++++++++++++++++++++-------------------
 kernel/cgroup_freezer.c   | 40 ++++++++++++++++++-----------------
 kernel/cpuset.c           | 39 ++++++++++++++++++----------------
 kernel/events/core.c      | 18 +++++++++-------
 kernel/sched/core.c       | 39 +++++++++++++++++-----------------
 kernel/sched/cpuacct.c    |  9 ++++----
 mm/hugetlb_cgroup.c       | 19 ++++++++---------
 mm/memcontrol.c           | 38 ++++++++++++++++-----------------
 net/core/netprio_cgroup.c | 20 +++++++++---------
 net/sched/cls_cgroup.c    | 18 +++++++++-------
 security/device_cgroup.c  | 22 ++++++++++----------
 13 files changed, 195 insertions(+), 169 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 290792a..79fd9f4 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -765,18 +765,18 @@ struct cftype blkcg_files[] = {
 
 /**
  * blkcg_css_offline - cgroup css_offline callback
- * @cgroup: cgroup of interest
+ * @css: css of interest
  *
- * This function is called when @cgroup is about to go away and responsible
- * for shooting down all blkgs associated with @cgroup.  blkgs should be
+ * This function is called when @css is about to go away and responsible
+ * for shooting down all blkgs associated with @css.  blkgs should be
  * removed while holding both q and blkcg locks.  As blkcg lock is nested
  * inside q lock, this function performs reverse double lock dancing.
  *
  * This is the blkcg counterpart of ioc_release_fn().
  */
-static void blkcg_css_offline(struct cgroup *cgroup)
+static void blkcg_css_offline(struct cgroup_subsys_state *css)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgroup);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	spin_lock_irq(&blkcg->lock);
 
@@ -798,21 +798,21 @@ static void blkcg_css_offline(struct cgroup *cgroup)
 	spin_unlock_irq(&blkcg->lock);
 }
 
-static void blkcg_css_free(struct cgroup *cgroup)
+static void blkcg_css_free(struct cgroup_subsys_state *css)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgroup);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	if (blkcg != &blkcg_root)
 		kfree(blkcg);
 }
 
-static struct cgroup_subsys_state *blkcg_css_alloc(struct cgroup *cgroup)
+static struct cgroup_subsys_state *
+blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	static atomic64_t id_seq = ATOMIC64_INIT(0);
 	struct blkcg *blkcg;
-	struct cgroup *parent = cgroup->parent;
 
-	if (!parent) {
+	if (!parent_css) {
 		blkcg = &blkcg_root;
 		goto done;
 	}
@@ -883,14 +883,15 @@ void blkcg_exit_queue(struct request_queue *q)
  * of the main cic data structures.  For now we allow a task to change
  * its cgroup only if it's the only owner of its ioc.
  */
-static int blkcg_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static int blkcg_can_attach(struct cgroup_subsys_state *css,
+			    struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 	struct io_context *ioc;
 	int ret = 0;
 
 	/* task_lock() is needed to avoid races with exit_io_context() */
-	cgroup_taskset_for_each(task, cgrp, tset) {
+	cgroup_taskset_for_each(task, css->cgroup, tset) {
 		task_lock(task);
 		ioc = task->io_context;
 		if (ioc && atomic_read(&ioc->nr_tasks) > 1)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b65f6b5..69b33f9 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -580,18 +580,22 @@ int cgroup_taskset_size(struct cgroup_taskset *tset);
  */
 
 struct cgroup_subsys {
-	struct cgroup_subsys_state *(*css_alloc)(struct cgroup *cgrp);
-	int (*css_online)(struct cgroup *cgrp);
-	void (*css_offline)(struct cgroup *cgrp);
-	void (*css_free)(struct cgroup *cgrp);
-
-	int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
-	void (*cancel_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
-	void (*attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
+	struct cgroup_subsys_state *(*css_alloc)(struct cgroup_subsys_state *parent_css);
+	int (*css_online)(struct cgroup_subsys_state *css);
+	void (*css_offline)(struct cgroup_subsys_state *css);
+	void (*css_free)(struct cgroup_subsys_state *css);
+
+	int (*can_attach)(struct cgroup_subsys_state *css,
+			  struct cgroup_taskset *tset);
+	void (*cancel_attach)(struct cgroup_subsys_state *css,
+			      struct cgroup_taskset *tset);
+	void (*attach)(struct cgroup_subsys_state *css,
+		       struct cgroup_taskset *tset);
 	void (*fork)(struct task_struct *task);
-	void (*exit)(struct cgroup *cgrp, struct cgroup *old_cgrp,
+	void (*exit)(struct cgroup_subsys_state *css,
+		     struct cgroup_subsys_state *old_css,
 		     struct task_struct *task);
-	void (*bind)(struct cgroup *root);
+	void (*bind)(struct cgroup_subsys_state *root_css);
 
 	int subsys_id;
 	int disabled;
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index fad5498..fae11e3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -853,8 +853,11 @@ static void cgroup_free_fn(struct work_struct *work)
 	/*
 	 * Release the subsystem state objects.
 	 */
-	for_each_root_subsys(cgrp->root, ss)
-		ss->css_free(cgrp);
+	for_each_root_subsys(cgrp->root, ss) {
+		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
+
+		ss->css_free(css);
+	}
 
 	cgrp->root->number_of_cgroups--;
 	mutex_unlock(&cgroup_mutex);
@@ -1056,7 +1059,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
 			list_move(&ss->sibling, &root->subsys_list);
 			ss->root = root;
 			if (ss->bind)
-				ss->bind(cgrp);
+				ss->bind(cgrp->subsys[i]);
 
 			/* refcount was already taken, and we're keeping it */
 			root->subsys_mask |= bit;
@@ -1066,7 +1069,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
 			BUG_ON(cgrp->subsys[i]->cgroup != cgrp);
 
 			if (ss->bind)
-				ss->bind(cgroup_dummy_top);
+				ss->bind(cgroup_dummy_top->subsys[i]);
 			cgroup_dummy_top->subsys[i]->cgroup = cgroup_dummy_top;
 			cgrp->subsys[i] = NULL;
 			cgroup_subsys[i]->root = &cgroup_dummy_root;
@@ -2042,8 +2045,10 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 	 * step 1: check that we can legitimately attach to the cgroup.
 	 */
 	for_each_root_subsys(root, ss) {
+		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
+
 		if (ss->can_attach) {
-			retval = ss->can_attach(cgrp, &tset);
+			retval = ss->can_attach(css, &tset);
 			if (retval) {
 				failed_ss = ss;
 				goto out_cancel_attach;
@@ -2082,8 +2087,10 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
 	 * step 4: do subsystem attach callbacks.
 	 */
 	for_each_root_subsys(root, ss) {
+		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
+
 		if (ss->attach)
-			ss->attach(cgrp, &tset);
+			ss->attach(css, &tset);
 	}
 
 	/*
@@ -2102,10 +2109,12 @@ out_put_css_set_refs:
 out_cancel_attach:
 	if (retval) {
 		for_each_root_subsys(root, ss) {
+			struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
+
 			if (ss == failed_ss)
 				break;
 			if (ss->cancel_attach)
-				ss->cancel_attach(cgrp, &tset);
+				ss->cancel_attach(css, &tset);
 		}
 	}
 out_free_group_list:
@@ -4199,12 +4208,13 @@ static void init_cgroup_css(struct cgroup_subsys_state *css,
 /* invoke ->css_online() on a new CSS and mark it online if successful */
 static int online_css(struct cgroup_subsys *ss, struct cgroup *cgrp)
 {
+	struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
 	int ret = 0;
 
 	lockdep_assert_held(&cgroup_mutex);
 
 	if (ss->css_online)
-		ret = ss->css_online(cgrp);
+		ret = ss->css_online(css);
 	if (!ret)
 		cgrp->subsys[ss->subsys_id]->flags |= CSS_ONLINE;
 	return ret;
@@ -4221,9 +4231,9 @@ static void offline_css(struct cgroup_subsys *ss, struct cgroup *cgrp)
 		return;
 
 	if (ss->css_offline)
-		ss->css_offline(cgrp);
+		ss->css_offline(css);
 
-	cgrp->subsys[ss->subsys_id]->flags &= ~CSS_ONLINE;
+	css->flags &= ~CSS_ONLINE;
 }
 
 /*
@@ -4298,7 +4308,7 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 	for_each_root_subsys(root, ss) {
 		struct cgroup_subsys_state *css;
 
-		css = ss->css_alloc(cgrp);
+		css = ss->css_alloc(parent->subsys[ss->subsys_id]);
 		if (IS_ERR(css)) {
 			err = PTR_ERR(css);
 			goto err_free_all;
@@ -4377,7 +4387,7 @@ err_free_all:
 
 		if (css) {
 			percpu_ref_cancel_init(&css->refcnt);
-			ss->css_free(cgrp);
+			ss->css_free(css);
 		}
 	}
 	mutex_unlock(&cgroup_mutex);
@@ -4632,7 +4642,7 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
 	/* Create the top cgroup state for this subsystem */
 	list_add(&ss->sibling, &cgroup_dummy_root.subsys_list);
 	ss->root = &cgroup_dummy_root;
-	css = ss->css_alloc(cgroup_dummy_top);
+	css = ss->css_alloc(cgroup_dummy_top->subsys[ss->subsys_id]);
 	/* We don't handle early failures gracefully */
 	BUG_ON(IS_ERR(css));
 	init_cgroup_css(css, ss, cgroup_dummy_top);
@@ -4711,7 +4721,7 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
 	 * struct, so this can happen first (i.e. before the dummy root
 	 * attachment).
 	 */
-	css = ss->css_alloc(cgroup_dummy_top);
+	css = ss->css_alloc(cgroup_dummy_top->subsys[ss->subsys_id]);
 	if (IS_ERR(css)) {
 		/* failure case - need to deassign the cgroup_subsys[] slot. */
 		cgroup_subsys[ss->subsys_id] = NULL;
@@ -4827,7 +4837,7 @@ void cgroup_unload_subsys(struct cgroup_subsys *ss)
 	 * the cgrp->subsys pointer to find their state. note that this
 	 * also takes care of freeing the css_id.
 	 */
-	ss->css_free(cgroup_dummy_top);
+	ss->css_free(cgroup_dummy_top->subsys[ss->subsys_id]);
 	cgroup_dummy_top->subsys[ss->subsys_id] = NULL;
 
 	mutex_unlock(&cgroup_mutex);
@@ -5183,10 +5193,10 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
 		 */
 		for_each_builtin_subsys(ss, i) {
 			if (ss->exit) {
-				struct cgroup *old_cgrp = cset->subsys[i]->cgroup;
-				struct cgroup *cgrp = task_cgroup(tsk, i);
+				struct cgroup_subsys_state *old_css = cset->subsys[i];
+				struct cgroup_subsys_state *css = task_css(tsk, i);
 
-				ss->exit(cgrp, old_cgrp, tsk);
+				ss->exit(css, old_css, tsk);
 			}
 		}
 	}
@@ -5520,7 +5530,8 @@ struct cgroup_subsys_state *cgroup_css_from_dir(struct file *f, int id)
 }
 
 #ifdef CONFIG_CGROUP_DEBUG
-static struct cgroup_subsys_state *debug_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+debug_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cgroup_subsys_state *css = kzalloc(sizeof(*css), GFP_KERNEL);
 
@@ -5530,9 +5541,9 @@ static struct cgroup_subsys_state *debug_css_alloc(struct cgroup *cgrp)
 	return css;
 }
 
-static void debug_css_free(struct cgroup *cgrp)
+static void debug_css_free(struct cgroup_subsys_state *css)
 {
-	kfree(cgrp->subsys[debug_subsys_id]);
+	kfree(css);
 }
 
 static u64 debug_taskcount_read(struct cgroup *cgrp, struct cftype *cft)
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 657a73c..f03a857 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -91,7 +91,8 @@ static const char *freezer_state_strs(unsigned int state)
 
 struct cgroup_subsys freezer_subsys;
 
-static struct cgroup_subsys_state *freezer_css_alloc(struct cgroup *cgroup)
+static struct cgroup_subsys_state *
+freezer_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct freezer *freezer;
 
@@ -104,16 +105,16 @@ static struct cgroup_subsys_state *freezer_css_alloc(struct cgroup *cgroup)
 }
 
 /**
- * freezer_css_online - commit creation of a freezer cgroup
- * @cgroup: cgroup being created
+ * freezer_css_online - commit creation of a freezer css
+ * @css: css being created
  *
- * We're committing to creation of @cgroup.  Mark it online and inherit
+ * We're committing to creation of @css.  Mark it online and inherit
  * parent's freezing state while holding both parent's and our
  * freezer->lock.
  */
-static int freezer_css_online(struct cgroup *cgroup)
+static int freezer_css_online(struct cgroup_subsys_state *css)
 {
-	struct freezer *freezer = cgroup_freezer(cgroup);
+	struct freezer *freezer = css_freezer(css);
 	struct freezer *parent = parent_freezer(freezer);
 
 	/*
@@ -140,15 +141,15 @@ static int freezer_css_online(struct cgroup *cgroup)
 }
 
 /**
- * freezer_css_offline - initiate destruction of @cgroup
- * @cgroup: cgroup being destroyed
+ * freezer_css_offline - initiate destruction of a freezer css
+ * @css: css being destroyed
  *
- * @cgroup is going away.  Mark it dead and decrement system_freezing_count
- * if it was holding one.
+ * @css is going away.  Mark it dead and decrement system_freezing_count if
+ * it was holding one.
  */
-static void freezer_css_offline(struct cgroup *cgroup)
+static void freezer_css_offline(struct cgroup_subsys_state *css)
 {
-	struct freezer *freezer = cgroup_freezer(cgroup);
+	struct freezer *freezer = css_freezer(css);
 
 	spin_lock_irq(&freezer->lock);
 
@@ -160,9 +161,9 @@ static void freezer_css_offline(struct cgroup *cgroup)
 	spin_unlock_irq(&freezer->lock);
 }
 
-static void freezer_css_free(struct cgroup *cgroup)
+static void freezer_css_free(struct cgroup_subsys_state *css)
 {
-	kfree(cgroup_freezer(cgroup));
+	kfree(css_freezer(css));
 }
 
 /*
@@ -174,25 +175,26 @@ static void freezer_css_free(struct cgroup *cgroup)
  * @freezer->lock.  freezer_attach() makes the new tasks conform to the
  * current state and all following state changes can see the new tasks.
  */
-static void freezer_attach(struct cgroup *new_cgrp, struct cgroup_taskset *tset)
+static void freezer_attach(struct cgroup_subsys_state *new_css,
+			   struct cgroup_taskset *tset)
 {
-	struct freezer *freezer = cgroup_freezer(new_cgrp);
+	struct freezer *freezer = css_freezer(new_css);
 	struct task_struct *task;
 	bool clear_frozen = false;
 
 	spin_lock_irq(&freezer->lock);
 
 	/*
-	 * Make the new tasks conform to the current state of @new_cgrp.
+	 * Make the new tasks conform to the current state of @new_css.
 	 * For simplicity, when migrating any task to a FROZEN cgroup, we
 	 * revert it to FREEZING and let update_if_frozen() determine the
 	 * correct state later.
 	 *
-	 * Tasks in @tset are on @new_cgrp but may not conform to its
+	 * Tasks in @tset are on @new_css but may not conform to its
 	 * current state before executing the following - !frozen tasks may
 	 * be visible in a FROZEN cgroup and frozen tasks in a THAWED one.
 	 */
-	cgroup_taskset_for_each(task, new_cgrp, tset) {
+	cgroup_taskset_for_each(task, new_css->cgroup, tset) {
 		if (!(freezer->state & CGROUP_FREEZING)) {
 			__thaw_task(task);
 		} else {
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 259a4af..8ce3fdc 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1455,9 +1455,10 @@ static int fmeter_getrate(struct fmeter *fmp)
 }
 
 /* Called by cgroups to determine if a cpuset is usable; cpuset_mutex held */
-static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static int cpuset_can_attach(struct cgroup_subsys_state *css,
+			     struct cgroup_taskset *tset)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	struct task_struct *task;
 	int ret;
 
@@ -1468,11 +1469,11 @@ static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 	 * flag is set.
 	 */
 	ret = -ENOSPC;
-	if (!cgroup_sane_behavior(cgrp) &&
+	if (!cgroup_sane_behavior(css->cgroup) &&
 	    (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)))
 		goto out_unlock;
 
-	cgroup_taskset_for_each(task, cgrp, tset) {
+	cgroup_taskset_for_each(task, css->cgroup, tset) {
 		/*
 		 * Kthreads which disallow setaffinity shouldn't be moved
 		 * to a new cpuset; we don't want to change their cpu
@@ -1501,11 +1502,11 @@ out_unlock:
 	return ret;
 }
 
-static void cpuset_cancel_attach(struct cgroup *cgrp,
+static void cpuset_cancel_attach(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	mutex_lock(&cpuset_mutex);
-	cgroup_cs(cgrp)->attach_in_progress--;
+	css_cs(css)->attach_in_progress--;
 	mutex_unlock(&cpuset_mutex);
 }
 
@@ -1516,7 +1517,8 @@ static void cpuset_cancel_attach(struct cgroup *cgrp,
  */
 static cpumask_var_t cpus_attach;
 
-static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static void cpuset_attach(struct cgroup_subsys_state *css,
+			  struct cgroup_taskset *tset)
 {
 	/* static buf protected by cpuset_mutex */
 	static nodemask_t cpuset_attach_nodemask_to;
@@ -1524,7 +1526,7 @@ static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 	struct task_struct *task;
 	struct task_struct *leader = cgroup_taskset_first(tset);
 	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	struct cpuset *oldcs = cgroup_cs(oldcgrp);
 	struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
 	struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
@@ -1539,7 +1541,7 @@ static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 
 	guarantee_online_mems(mems_cs, &cpuset_attach_nodemask_to);
 
-	cgroup_taskset_for_each(task, cgrp, tset) {
+	cgroup_taskset_for_each(task, css->cgroup, tset) {
 		/*
 		 * can_attach beforehand should guarantee that this doesn't
 		 * fail.  TODO: have a better way to handle failure here
@@ -1940,11 +1942,12 @@ static struct cftype files[] = {
  *	cgrp:	control group that the new cpuset will be part of
  */
 
-static struct cgroup_subsys_state *cpuset_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cpuset *cs;
 
-	if (!cgrp->parent)
+	if (!parent_css)
 		return &top_cpuset.css;
 
 	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
@@ -1964,9 +1967,9 @@ static struct cgroup_subsys_state *cpuset_css_alloc(struct cgroup *cgrp)
 	return &cs->css;
 }
 
-static int cpuset_css_online(struct cgroup *cgrp)
+static int cpuset_css_online(struct cgroup_subsys_state *css)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	struct cpuset *parent = parent_cs(cs);
 	struct cpuset *tmp_cs;
 	struct cgroup *pos_cgrp;
@@ -1984,7 +1987,7 @@ static int cpuset_css_online(struct cgroup *cgrp)
 
 	number_of_cpusets++;
 
-	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags))
+	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
 		goto out_unlock;
 
 	/*
@@ -2024,9 +2027,9 @@ out_unlock:
  * will call rebuild_sched_domains_locked().
  */
 
-static void cpuset_css_offline(struct cgroup *cgrp)
+static void cpuset_css_offline(struct cgroup_subsys_state *css)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 
 	mutex_lock(&cpuset_mutex);
 
@@ -2039,9 +2042,9 @@ static void cpuset_css_offline(struct cgroup *cgrp)
 	mutex_unlock(&cpuset_mutex);
 }
 
-static void cpuset_css_free(struct cgroup *cgrp)
+static void cpuset_css_free(struct cgroup_subsys_state *css)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 
 	free_cpumask_var(cs->cpus_allowed);
 	kfree(cs);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 414c61f..9705a0e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7778,7 +7778,8 @@ unlock:
 device_initcall(perf_event_sysfs_init);
 
 #ifdef CONFIG_CGROUP_PERF
-static struct cgroup_subsys_state *perf_cgroup_css_alloc(struct cgroup *cont)
+static struct cgroup_subsys_state *
+perf_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct perf_cgroup *jc;
 
@@ -7795,11 +7796,10 @@ static struct cgroup_subsys_state *perf_cgroup_css_alloc(struct cgroup *cont)
 	return &jc->css;
 }
 
-static void perf_cgroup_css_free(struct cgroup *cont)
+static void perf_cgroup_css_free(struct cgroup_subsys_state *css)
 {
-	struct perf_cgroup *jc;
-	jc = container_of(cgroup_css(cont, perf_subsys_id),
-			  struct perf_cgroup, css);
+	struct perf_cgroup *jc = container_of(css, struct perf_cgroup, css);
+
 	free_percpu(jc->info);
 	kfree(jc);
 }
@@ -7811,15 +7811,17 @@ static int __perf_cgroup_move(void *info)
 	return 0;
 }
 
-static void perf_cgroup_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static void perf_cgroup_attach(struct cgroup_subsys_state *css,
+			       struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, cgrp, tset)
+	cgroup_taskset_for_each(task, css->cgroup, tset)
 		task_function_call(task, __perf_cgroup_move, task);
 }
 
-static void perf_cgroup_exit(struct cgroup *cgrp, struct cgroup *old_cgrp,
+static void perf_cgroup_exit(struct cgroup_subsys_state *css,
+			     struct cgroup_subsys_state *old_css,
 			     struct task_struct *task)
 {
 	/*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7a10742..622b7ef 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7094,16 +7094,17 @@ static inline struct task_group *cgroup_tg(struct cgroup *cgrp)
 	return css_tg(cgroup_css(cgrp, cpu_cgroup_subsys_id));
 }
 
-static struct cgroup_subsys_state *cpu_cgroup_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
-	struct task_group *tg, *parent;
+	struct task_group *parent = css_tg(parent_css);
+	struct task_group *tg;
 
-	if (!cgrp->parent) {
+	if (!parent) {
 		/* This is early initialization for the top cgroup */
 		return &root_task_group.css;
 	}
 
-	parent = cgroup_tg(cgrp->parent);
 	tg = sched_create_group(parent);
 	if (IS_ERR(tg))
 		return ERR_PTR(-ENOMEM);
@@ -7111,38 +7112,38 @@ static struct cgroup_subsys_state *cpu_cgroup_css_alloc(struct cgroup *cgrp)
 	return &tg->css;
 }
 
-static int cpu_cgroup_css_online(struct cgroup *cgrp)
+static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
 {
-	struct task_group *tg = cgroup_tg(cgrp);
-	struct task_group *parent = css_tg(css_parent(&tg->css));
+	struct task_group *tg = css_tg(css);
+	struct task_group *parent = css_tg(css_parent(css));
 
 	if (parent)
 		sched_online_group(tg, parent);
 	return 0;
 }
 
-static void cpu_cgroup_css_free(struct cgroup *cgrp)
+static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
 {
-	struct task_group *tg = cgroup_tg(cgrp);
+	struct task_group *tg = css_tg(css);
 
 	sched_destroy_group(tg);
 }
 
-static void cpu_cgroup_css_offline(struct cgroup *cgrp)
+static void cpu_cgroup_css_offline(struct cgroup_subsys_state *css)
 {
-	struct task_group *tg = cgroup_tg(cgrp);
+	struct task_group *tg = css_tg(css);
 
 	sched_offline_group(tg);
 }
 
-static int cpu_cgroup_can_attach(struct cgroup *cgrp,
+static int cpu_cgroup_can_attach(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, cgrp, tset) {
+	cgroup_taskset_for_each(task, css->cgroup, tset) {
 #ifdef CONFIG_RT_GROUP_SCHED
-		if (!sched_rt_can_attach(cgroup_tg(cgrp), task))
+		if (!sched_rt_can_attach(css_tg(css), task))
 			return -EINVAL;
 #else
 		/* We don't support RT-tasks being in separate groups */
@@ -7153,18 +7154,18 @@ static int cpu_cgroup_can_attach(struct cgroup *cgrp,
 	return 0;
 }
 
-static void cpu_cgroup_attach(struct cgroup *cgrp,
+static void cpu_cgroup_attach(struct cgroup_subsys_state *css,
 			      struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, cgrp, tset)
+	cgroup_taskset_for_each(task, css->cgroup, tset)
 		sched_move_task(task);
 }
 
-static void
-cpu_cgroup_exit(struct cgroup *cgrp, struct cgroup *old_cgrp,
-		struct task_struct *task)
+static void cpu_cgroup_exit(struct cgroup_subsys_state *css,
+			    struct cgroup_subsys_state *old_css,
+			    struct task_struct *task)
 {
 	/*
 	 * cgroup_exit() is called in the copy_process() failure path.
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index f6926a1..1b784d9 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -62,11 +62,12 @@ static struct cpuacct root_cpuacct = {
 };
 
 /* create a new cpu accounting group */
-static struct cgroup_subsys_state *cpuacct_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cpuacct_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cpuacct *ca;
 
-	if (!cgrp->parent)
+	if (!parent_css)
 		return &root_cpuacct.css;
 
 	ca = kzalloc(sizeof(*ca), GFP_KERNEL);
@@ -92,9 +93,9 @@ out:
 }
 
 /* destroy an existing cpu accounting group */
-static void cpuacct_css_free(struct cgroup *cgrp)
+static void cpuacct_css_free(struct cgroup_subsys_state *css)
 {
-	struct cpuacct *ca = cgroup_ca(cgrp);
+	struct cpuacct *ca = css_ca(css);
 
 	free_percpu(ca->cpustat);
 	free_percpu(ca->cpuusage);
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 57ecb5d..e213243 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -73,19 +73,18 @@ static inline bool hugetlb_cgroup_have_usage(struct hugetlb_cgroup *h_cg)
 	return false;
 }
 
-static struct cgroup_subsys_state *hugetlb_cgroup_css_alloc(struct cgroup *cgroup)
+static struct cgroup_subsys_state *
+hugetlb_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
+	struct hugetlb_cgroup *parent_h_cgroup = hugetlb_cgroup_from_css(parent_css);
+	struct hugetlb_cgroup *h_cgroup;
 	int idx;
-	struct cgroup *parent_cgroup;
-	struct hugetlb_cgroup *h_cgroup, *parent_h_cgroup;
 
 	h_cgroup = kzalloc(sizeof(*h_cgroup), GFP_KERNEL);
 	if (!h_cgroup)
 		return ERR_PTR(-ENOMEM);
 
-	parent_cgroup = cgroup->parent;
-	if (parent_cgroup) {
-		parent_h_cgroup = hugetlb_cgroup_from_cgroup(parent_cgroup);
+	if (parent_h_cgroup) {
 		for (idx = 0; idx < HUGE_MAX_HSTATE; idx++)
 			res_counter_init(&h_cgroup->hugepage[idx],
 					 &parent_h_cgroup->hugepage[idx]);
@@ -97,11 +96,11 @@ static struct cgroup_subsys_state *hugetlb_cgroup_css_alloc(struct cgroup *cgrou
 	return &h_cgroup->css;
 }
 
-static void hugetlb_cgroup_css_free(struct cgroup *cgroup)
+static void hugetlb_cgroup_css_free(struct cgroup_subsys_state *css)
 {
 	struct hugetlb_cgroup *h_cgroup;
 
-	h_cgroup = hugetlb_cgroup_from_cgroup(cgroup);
+	h_cgroup = hugetlb_cgroup_from_css(css);
 	kfree(h_cgroup);
 }
 
@@ -150,9 +149,9 @@ out:
  * Force the hugetlb cgroup to empty the hugetlb resources by moving them to
  * the parent cgroup.
  */
-static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
+static void hugetlb_cgroup_css_offline(struct cgroup_subsys_state *css)
 {
-	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
+	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
 	struct hstate *h;
 	struct page *page;
 	int idx = 0;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 69b3e52..32cca0f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6211,7 +6211,7 @@ static void __init mem_cgroup_soft_limit_tree_init(void)
 }
 
 static struct cgroup_subsys_state * __ref
-mem_cgroup_css_alloc(struct cgroup *cont)
+mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct mem_cgroup *memcg;
 	long error = -ENOMEM;
@@ -6226,7 +6226,7 @@ mem_cgroup_css_alloc(struct cgroup *cont)
 			goto free_out;
 
 	/* root ? */
-	if (cont->parent == NULL) {
+	if (parent_css == NULL) {
 		root_mem_cgroup = memcg;
 		res_counter_init(&memcg->res, NULL);
 		res_counter_init(&memcg->memsw, NULL);
@@ -6248,10 +6248,10 @@ free_out:
 }
 
 static int
-mem_cgroup_css_online(struct cgroup *cont)
+mem_cgroup_css_online(struct cgroup_subsys_state *css)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
-	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(css));
 	int error = 0;
 
 	if (!parent)
@@ -6308,9 +6308,9 @@ static void mem_cgroup_invalidate_reclaim_iterators(struct mem_cgroup *memcg)
 		mem_cgroup_iter_invalidate(root_mem_cgroup);
 }
 
-static void mem_cgroup_css_offline(struct cgroup *cont)
+static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	kmem_cgroup_css_offline(memcg);
 
@@ -6319,9 +6319,9 @@ static void mem_cgroup_css_offline(struct cgroup *cont)
 	mem_cgroup_destroy_all_caches(memcg);
 }
 
-static void mem_cgroup_css_free(struct cgroup *cont)
+static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	memcg_destroy_kmem(memcg);
 	__mem_cgroup_free(memcg);
@@ -6691,12 +6691,12 @@ static void mem_cgroup_clear_mc(void)
 	mem_cgroup_end_move(from);
 }
 
-static int mem_cgroup_can_attach(struct cgroup *cgroup,
+static int mem_cgroup_can_attach(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	struct task_struct *p = cgroup_taskset_first(tset);
 	int ret = 0;
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgroup);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	unsigned long move_charge_at_immigrate;
 
 	/*
@@ -6738,7 +6738,7 @@ static int mem_cgroup_can_attach(struct cgroup *cgroup,
 	return ret;
 }
 
-static void mem_cgroup_cancel_attach(struct cgroup *cgroup,
+static void mem_cgroup_cancel_attach(struct cgroup_subsys_state *css,
 				     struct cgroup_taskset *tset)
 {
 	mem_cgroup_clear_mc();
@@ -6886,7 +6886,7 @@ retry:
 	up_read(&mm->mmap_sem);
 }
 
-static void mem_cgroup_move_task(struct cgroup *cont,
+static void mem_cgroup_move_task(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	struct task_struct *p = cgroup_taskset_first(tset);
@@ -6901,16 +6901,16 @@ static void mem_cgroup_move_task(struct cgroup *cont,
 		mem_cgroup_clear_mc();
 }
 #else	/* !CONFIG_MMU */
-static int mem_cgroup_can_attach(struct cgroup *cgroup,
+static int mem_cgroup_can_attach(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	return 0;
 }
-static void mem_cgroup_cancel_attach(struct cgroup *cgroup,
+static void mem_cgroup_cancel_attach(struct cgroup_subsys_state *css,
 				     struct cgroup_taskset *tset)
 {
 }
-static void mem_cgroup_move_task(struct cgroup *cont,
+static void mem_cgroup_move_task(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 }
@@ -6920,15 +6920,15 @@ static void mem_cgroup_move_task(struct cgroup *cont,
  * Cgroup retains root cgroups across [un]mount cycles making it necessary
  * to verify sane_behavior flag on each mount attempt.
  */
-static void mem_cgroup_bind(struct cgroup *root)
+static void mem_cgroup_bind(struct cgroup_subsys_state *root_css)
 {
 	/*
 	 * use_hierarchy is forced with sane_behavior.  cgroup core
 	 * guarantees that @root doesn't have any children, so turning it
 	 * on for the root memcg is enough.
 	 */
-	if (cgroup_sane_behavior(root))
-		mem_cgroup_from_cont(root)->use_hierarchy = true;
+	if (cgroup_sane_behavior(root_css->cgroup))
+		mem_cgroup_from_css(root_css)->use_hierarchy = true;
 }
 
 struct cgroup_subsys mem_cgroup_subsys = {
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 5dfac88..8d095b4 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -126,7 +126,8 @@ static int netprio_set_prio(struct cgroup_subsys_state *css,
 	return 0;
 }
 
-static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cgrp_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cgroup_subsys_state *css;
 
@@ -137,16 +138,14 @@ static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
 	return css;
 }
 
-static int cgrp_css_online(struct cgroup *cgrp)
+static int cgrp_css_online(struct cgroup_subsys_state *css)
 {
-	struct cgroup_subsys_state *css = cgroup_css(cgrp, net_prio_subsys_id);
-	struct cgroup_subsys_state *parent_css;
+	struct cgroup_subsys_state *parent_css = css_parent(css);
 	struct net_device *dev;
 	int ret = 0;
 
-	if (!cgrp->parent)
+	if (!parent_css)
 		return 0;
-	parent_css = cgroup_css(cgrp->parent, net_prio_subsys_id);
 
 	rtnl_lock();
 	/*
@@ -164,9 +163,9 @@ static int cgrp_css_online(struct cgroup *cgrp)
 	return ret;
 }
 
-static void cgrp_css_free(struct cgroup *cgrp)
+static void cgrp_css_free(struct cgroup_subsys_state *css)
 {
-	kfree(cgroup_css(cgrp, net_prio_subsys_id));
+	kfree(css);
 }
 
 static u64 read_prioidx(struct cgroup *cgrp, struct cftype *cft)
@@ -221,12 +220,13 @@ static int update_netprio(const void *v, struct file *file, unsigned n)
 	return 0;
 }
 
-static void net_prio_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static void net_prio_attach(struct cgroup_subsys_state *css,
+			    struct cgroup_taskset *tset)
 {
 	struct task_struct *p;
 	void *v;
 
-	cgroup_taskset_for_each(p, cgrp, tset) {
+	cgroup_taskset_for_each(p, css->cgroup, tset) {
 		task_lock(p);
 		v = (void *)(unsigned long)task_netprioidx(p);
 		iterate_fd(p->files, 0, update_netprio, v);
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 9e6b75e..dc39838 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -38,7 +38,8 @@ static inline struct cgroup_cls_state *task_cls_state(struct task_struct *p)
 	return css_cls_state(task_css(p, net_cls_subsys_id));
 }
 
-static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cgrp_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cgroup_cls_state *cs;
 
@@ -48,19 +49,19 @@ static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
 	return &cs->css;
 }
 
-static int cgrp_css_online(struct cgroup *cgrp)
+static int cgrp_css_online(struct cgroup_subsys_state *css)
 {
-	struct cgroup_cls_state *cs = cgrp_cls_state(cgrp);
-	struct cgroup_cls_state *parent = css_cls_state(css_parent(&cs->css));
+	struct cgroup_cls_state *cs = css_cls_state(css);
+	struct cgroup_cls_state *parent = css_cls_state(css_parent(css));
 
 	if (parent)
 		cs->classid = parent->classid;
 	return 0;
 }
 
-static void cgrp_css_free(struct cgroup *cgrp)
+static void cgrp_css_free(struct cgroup_subsys_state *css)
 {
-	kfree(cgrp_cls_state(cgrp));
+	kfree(css_cls_state(css));
 }
 
 static int update_classid(const void *v, struct file *file, unsigned n)
@@ -72,12 +73,13 @@ static int update_classid(const void *v, struct file *file, unsigned n)
 	return 0;
 }
 
-static void cgrp_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static void cgrp_attach(struct cgroup_subsys_state *css,
+			struct cgroup_taskset *tset)
 {
 	struct task_struct *p;
 	void *v;
 
-	cgroup_taskset_for_each(p, cgrp, tset) {
+	cgroup_taskset_for_each(p, css->cgroup, tset) {
 		task_lock(p);
 		v = (void *)(unsigned long)task_cls_classid(p);
 		iterate_fd(p->files, 0, update_classid, v);
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 635a49d..7293ac4 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -68,7 +68,7 @@ static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
 
 struct cgroup_subsys devices_subsys;
 
-static int devcgroup_can_attach(struct cgroup *new_cgrp,
+static int devcgroup_can_attach(struct cgroup_subsys_state *new_css,
 				struct cgroup_taskset *set)
 {
 	struct task_struct *task = cgroup_taskset_first(set);
@@ -193,13 +193,13 @@ static inline bool is_devcg_online(const struct dev_cgroup *devcg)
 /**
  * devcgroup_online - initializes devcgroup's behavior and exceptions based on
  * 		      parent's
- * @cgroup: cgroup getting online
+ * @css: css getting online
  * returns 0 in case of success, error code otherwise
  */
-static int devcgroup_online(struct cgroup *cgroup)
+static int devcgroup_online(struct cgroup_subsys_state *css)
 {
-	struct dev_cgroup *dev_cgroup = cgroup_to_devcgroup(cgroup);
-	struct dev_cgroup *parent_dev_cgroup = css_to_devcgroup(css_parent(&dev_cgroup->css));
+	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
+	struct dev_cgroup *parent_dev_cgroup = css_to_devcgroup(css_parent(css));
 	int ret = 0;
 
 	mutex_lock(&devcgroup_mutex);
@@ -217,9 +217,9 @@ static int devcgroup_online(struct cgroup *cgroup)
 	return ret;
 }
 
-static void devcgroup_offline(struct cgroup *cgroup)
+static void devcgroup_offline(struct cgroup_subsys_state *css)
 {
-	struct dev_cgroup *dev_cgroup = cgroup_to_devcgroup(cgroup);
+	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
 
 	mutex_lock(&devcgroup_mutex);
 	dev_cgroup->behavior = DEVCG_DEFAULT_NONE;
@@ -229,7 +229,8 @@ static void devcgroup_offline(struct cgroup *cgroup)
 /*
  * called from kernel/cgroup.c with cgroup_lock() held.
  */
-static struct cgroup_subsys_state *devcgroup_css_alloc(struct cgroup *cgroup)
+static struct cgroup_subsys_state *
+devcgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct dev_cgroup *dev_cgroup;
 
@@ -242,11 +243,10 @@ static struct cgroup_subsys_state *devcgroup_css_alloc(struct cgroup *cgroup)
 	return &dev_cgroup->css;
 }
 
-static void devcgroup_css_free(struct cgroup *cgroup)
+static void devcgroup_css_free(struct cgroup_subsys_state *css)
 {
-	struct dev_cgroup *dev_cgroup;
+	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
 
-	dev_cgroup = cgroup_to_devcgroup(cgroup);
 	__dev_exception_clean(dev_cgroup);
 	kfree(dev_cgroup);
 }
-- 
1.8.3.1



* [PATCH 09/23] cgroup: add subsys backlink pointer to cftype
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (7 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-05 12:49   ` Vivek Goyal
  2013-08-01 21:49 ` [PATCH 10/23] cgroup: pin cgroup_subsys_state when opening a cgroupfs file Tejun Heo
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Vivek Goyal, Jens Axboe

cgroup is transitioning to using css (cgroup_subsys_state) instead of
cgroup as the primary subsystem handle.  The cgroupfs file interface
will be converted to use css's which requires finding out the
subsystem from cftype so that the matching css can be determined from
the cgroup.

This patch adds cftype->ss which points to the subsystem the file
belongs to.  The field is initialized while a cftype is being
registered.  This makes it unnecessary to explicitly specify the
subsystem for other cftype handling functions.  @ss argument dropped
from various cftype handling functions.
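
As a rough userspace illustration of the backlink (not kernel code), the
sketch below stamps each entry of a name-terminated array with its
subsystem and then derives the prefixed file name from the cftype alone.
The demo_* names and the simplified structs are assumptions made up for
this example, and the CGRP_ROOT_NOPREFIX check is left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct demo_subsys {
	const char *name;
};

struct demo_cftype {
	char name[64];
	struct demo_subsys *ss;	/* backlink; stays NULL for core files */
};

/* Same shape as the registration loop: stamp entries up to the terminator. */
static void demo_init_cfts(struct demo_subsys *ss, struct demo_cftype *cfts)
{
	struct demo_cftype *cft;

	for (cft = cfts; cft->name[0] != '\0'; cft++)
		cft->ss = ss;
}

/* Build "<subsys>.<file>" from the cftype alone; no separate @ss needed. */
static void demo_file_name(const struct demo_cftype *cft, char *buf, size_t len)
{
	if (cft->ss)
		snprintf(buf, len, "%s.%s", cft->ss->name, cft->name);
	else
		snprintf(buf, len, "%s", cft->name);
}
```

The zero-length name terminator is never stamped, so its ->ss stays NULL.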

This patch shouldn't introduce any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
---
 block/blk-cgroup.c     |  2 +-
 include/linux/cgroup.h |  8 +++++-
 kernel/cgroup.c        | 78 ++++++++++++++++++++++++++++----------------------
 3 files changed, 51 insertions(+), 37 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 79fd9f4..3406373 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1128,7 +1128,7 @@ void blkcg_policy_unregister(struct blkcg_policy *pol)
 
 	/* kill the intf files first */
 	if (pol->cftypes)
-		cgroup_rm_cftypes(&blkio_subsys, pol->cftypes);
+		cgroup_rm_cftypes(pol->cftypes);
 
 	/* unregister and update blkgs */
 	blkcg_policy[pol->plid] = NULL;
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 69b33f9..2e0e9e2 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -429,6 +429,12 @@ struct cftype {
 	/* CFTYPE_* flags */
 	unsigned int flags;
 
+	/*
+	 * The subsys this file belongs to.  Initialized automatically
+	 * during registration.  NULL for cgroup core files.
+	 */
+	struct cgroup_subsys *ss;
+
 	int (*open)(struct inode *inode, struct file *file);
 	ssize_t (*read)(struct cgroup *cgrp, struct cftype *cft,
 			struct file *file,
@@ -542,7 +548,7 @@ static inline const char *cgroup_name(const struct cgroup *cgrp)
 }
 
 int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts);
-int cgroup_rm_cftypes(struct cgroup_subsys *ss, struct cftype *cfts);
+int cgroup_rm_cftypes(struct cftype *cfts);
 
 bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor);
 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index fae11e3..f1fc4d8 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -219,8 +219,8 @@ static struct cftype cgroup_base_files[];
 
 static void cgroup_offline_fn(struct work_struct *work);
 static int cgroup_destroy_locked(struct cgroup *cgrp);
-static int cgroup_addrm_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
-			      struct cftype cfts[], bool is_add);
+static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
+			      bool is_add);
 
 /* convenient tests for these bits */
 static inline bool cgroup_is_dead(const struct cgroup *cgrp)
@@ -974,7 +974,7 @@ static void cgroup_clear_dir(struct cgroup *cgrp, unsigned long subsys_mask)
 		if (!test_bit(i, &subsys_mask))
 			continue;
 		list_for_each_entry(set, &ss->cftsets, node)
-			cgroup_addrm_files(cgrp, NULL, set->cfts, false);
+			cgroup_addrm_files(cgrp, set->cfts, false);
 	}
 }
 
@@ -1623,7 +1623,7 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 		 */
 		cred = override_creds(&init_cred);
 
-		ret = cgroup_addrm_files(root_cgrp, NULL, cgroup_base_files, true);
+		ret = cgroup_addrm_files(root_cgrp, cgroup_base_files, true);
 		if (ret)
 			goto rm_base_files;
 
@@ -1681,7 +1681,7 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 
  rm_base_files:
 	free_cgrp_cset_links(&tmp_links);
-	cgroup_addrm_files(&root->top_cgroup, NULL, cgroup_base_files, false);
+	cgroup_addrm_files(&root->top_cgroup, cgroup_base_files, false);
 	revert_creds(cred);
  unlock_drop:
 	cgroup_exit_root_id(root);
@@ -2687,8 +2687,7 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
 	return mode;
 }
 
-static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
-			   struct cftype *cft)
+static int cgroup_add_file(struct cgroup *cgrp, struct cftype *cft)
 {
 	struct dentry *dir = cgrp->dentry;
 	struct cgroup *parent = __d_cgrp(dir);
@@ -2698,8 +2697,8 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
 	umode_t mode;
 	char name[MAX_CGROUP_TYPE_NAMELEN + MAX_CFTYPE_NAME + 2] = { 0 };
 
-	if (subsys && !(cgrp->root->flags & CGRP_ROOT_NOPREFIX)) {
-		strcpy(name, subsys->name);
+	if (cft->ss && !(cgrp->root->flags & CGRP_ROOT_NOPREFIX)) {
+		strcpy(name, cft->ss->name);
 		strcat(name, ".");
 	}
 	strcat(name, cft->name);
@@ -2736,17 +2735,16 @@ out:
 /**
  * cgroup_addrm_files - add or remove files to a cgroup directory
  * @cgrp: the target cgroup
- * @subsys: the subsystem of files to be added
  * @cfts: array of cftypes to be added
  * @is_add: whether to add or remove
  *
  * Depending on @is_add, add or remove files defined by @cfts on @cgrp.
- * All @cfts should belong to @subsys.  For removals, this function never
- * fails.  If addition fails, this function doesn't remove files already
- * added.  The caller is responsible for cleaning up.
+ * For removals, this function never fails.  If addition fails, this
+ * function doesn't remove files already added.  The caller is responsible
+ * for cleaning up.
  */
-static int cgroup_addrm_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
-			      struct cftype cfts[], bool is_add)
+static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
+			      bool is_add)
 {
 	struct cftype *cft;
 	int ret;
@@ -2764,7 +2762,7 @@ static int cgroup_addrm_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
 			continue;
 
 		if (is_add) {
-			ret = cgroup_add_file(cgrp, subsys, cft);
+			ret = cgroup_add_file(cgrp, cft);
 			if (ret) {
 				pr_warn("cgroup_addrm_files: failed to add %s, err=%d\n",
 					cft->name, ret);
@@ -2789,11 +2787,11 @@ static void cgroup_cfts_prepare(void)
 	mutex_lock(&cgroup_mutex);
 }
 
-static int cgroup_cfts_commit(struct cgroup_subsys *ss,
-			      struct cftype *cfts, bool is_add)
+static int cgroup_cfts_commit(struct cftype *cfts, bool is_add)
 	__releases(&cgroup_mutex)
 {
 	LIST_HEAD(pending);
+	struct cgroup_subsys *ss = cfts[0].ss;
 	struct cgroup *cgrp, *root = &ss->root->top_cgroup;
 	struct super_block *sb = ss->root->sb;
 	struct dentry *prev = NULL;
@@ -2821,7 +2819,7 @@ static int cgroup_cfts_commit(struct cgroup_subsys *ss,
 	inode = root->dentry->d_inode;
 	mutex_lock(&inode->i_mutex);
 	mutex_lock(&cgroup_mutex);
-	ret = cgroup_addrm_files(root, ss, cfts, is_add);
+	ret = cgroup_addrm_files(root, cfts, is_add);
 	mutex_unlock(&cgroup_mutex);
 	mutex_unlock(&inode->i_mutex);
 
@@ -2844,7 +2842,7 @@ static int cgroup_cfts_commit(struct cgroup_subsys *ss,
 		mutex_lock(&inode->i_mutex);
 		mutex_lock(&cgroup_mutex);
 		if (cgrp->serial_nr < update_before && !cgroup_is_dead(cgrp))
-			ret = cgroup_addrm_files(cgrp, ss, cfts, is_add);
+			ret = cgroup_addrm_files(cgrp, cfts, is_add);
 		mutex_unlock(&cgroup_mutex);
 		mutex_unlock(&inode->i_mutex);
 
@@ -2876,51 +2874,56 @@ out_deact:
 int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
 {
 	struct cftype_set *set;
+	struct cftype *cft;
 	int ret;
 
 	set = kzalloc(sizeof(*set), GFP_KERNEL);
 	if (!set)
 		return -ENOMEM;
 
+	for (cft = cfts; cft->name[0] != '\0'; cft++)
+		cft->ss = ss;
+
 	cgroup_cfts_prepare();
 	set->cfts = cfts;
 	list_add_tail(&set->node, &ss->cftsets);
-	ret = cgroup_cfts_commit(ss, cfts, true);
+	ret = cgroup_cfts_commit(cfts, true);
 	if (ret)
-		cgroup_rm_cftypes(ss, cfts);
+		cgroup_rm_cftypes(cfts);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cgroup_add_cftypes);
 
 /**
  * cgroup_rm_cftypes - remove an array of cftypes from a subsystem
- * @ss: target cgroup subsystem
  * @cfts: zero-length name terminated array of cftypes
  *
- * Unregister @cfts from @ss.  Files described by @cfts are removed from
- * all existing cgroups to which @ss is attached and all future cgroups
- * won't have them either.  This function can be called anytime whether @ss
- * is attached or not.
+ * Unregister @cfts.  Files described by @cfts are removed from all
+ * existing cgroups and all future cgroups won't have them either.  This
+ * function can be called anytime whether @cfts' subsys is attached or not.
  *
  * Returns 0 on successful unregistration, -ENOENT if @cfts is not
- * registered with @ss.
+ * registered.
  */
-int cgroup_rm_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
+int cgroup_rm_cftypes(struct cftype *cfts)
 {
 	struct cftype_set *set;
 
+	if (!cfts || !cfts[0].ss)
+		return -ENOENT;
+
 	cgroup_cfts_prepare();
 
-	list_for_each_entry(set, &ss->cftsets, node) {
+	list_for_each_entry(set, &cfts[0].ss->cftsets, node) {
 		if (set->cfts == cfts) {
 			list_del(&set->node);
 			kfree(set);
-			cgroup_cfts_commit(ss, cfts, false);
+			cgroup_cfts_commit(cfts, false);
 			return 0;
 		}
 	}
 
-	cgroup_cfts_commit(ss, NULL, false);
+	cgroup_cfts_commit(NULL, false);
 	return -ENOENT;
 }
 
@@ -4141,7 +4144,7 @@ static int cgroup_populate_dir(struct cgroup *cgrp, unsigned long subsys_mask)
 			continue;
 
 		list_for_each_entry(set, &ss->cftsets, node) {
-			ret = cgroup_addrm_files(cgrp, ss, set->cfts, true);
+			ret = cgroup_addrm_files(cgrp, set->cfts, true);
 			if (ret < 0)
 				goto err;
 		}
@@ -4368,7 +4371,7 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 
 	idr_replace(&root->cgroup_idr, cgrp, cgrp->id);
 
-	err = cgroup_addrm_files(cgrp, NULL, cgroup_base_files, true);
+	err = cgroup_addrm_files(cgrp, cgroup_base_files, true);
 	if (err)
 		goto err_destroy;
 
@@ -4529,7 +4532,7 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 	 * but we aren't quite done with @cgrp yet, so hold onto it.
 	 */
 	cgroup_clear_dir(cgrp, cgrp->root->subsys_mask);
-	cgroup_addrm_files(cgrp, NULL, cgroup_base_files, false);
+	cgroup_addrm_files(cgrp, cgroup_base_files, false);
 	dget(d);
 	cgroup_d_remove_dir(d);
 
@@ -4623,6 +4626,11 @@ static void __init_or_module cgroup_init_cftsets(struct cgroup_subsys *ss)
 	 * deregistration.
 	 */
 	if (ss->base_cftypes) {
+		struct cftype *cft;
+
+		for (cft = ss->base_cftypes; cft->name[0] != '\0'; cft++)
+			cft->ss = ss;
+
 		ss->base_cftset.cfts = ss->base_cftypes;
 		list_add_tail(&ss->base_cftset.node, &ss->cftsets);
 	}
-- 
1.8.3.1



* [PATCH 10/23] cgroup: pin cgroup_subsys_state when opening a cgroupfs file
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (8 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 09/23] cgroup: add subsys backlink pointer to cftype Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 11/23] cgroup: add cgroup->dummy_css Tejun Heo
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

Previously, each file read/write operation relied on the inode
reference count pinning the cgroup and simply checked whether the
cgroup was marked dead before proceeding to invoke the per-subsystem
callback.  This was rather silly as it didn't have any synchronization
or css pinning around the check and the cgroup may be removed and all
css refs drained between the DEAD check and actual method invocation.

This patch pins the css between open() and release() so that it is
guaranteed to be alive for all file operations and removes the silly
DEAD checks from cgroup_file_read/write().
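
The pin/unpin pairing can be modelled in userspace roughly as below.
This is only a sketch: the demo_* names are made up, a plain integer
stands in for the real css reference count, @hook_err stands in for
whatever the file's own open hook returns, and there is no locking or
atomicity here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy css: a plain counter instead of the kernel's reference counting. */
struct demo_css {
	int refcnt;
	bool dying;	/* set once the css is being torn down */
};

/* Take a reference unless the css is already on its way out. */
static bool demo_css_tryget(struct demo_css *css)
{
	if (css->dying)
		return false;
	css->refcnt++;
	return true;
}

static void demo_css_put(struct demo_css *css)
{
	css->refcnt--;
}

/* open(): pin first, then drop the pin again if the open hook failed. */
static int demo_file_open(struct demo_css *css, int hook_err)
{
	if (css && !demo_css_tryget(css))
		return -ENODEV;

	if (css && hook_err)
		demo_css_put(css);
	return hook_err;
}

/* release(): drop the pin taken by a successful open(). */
static int demo_file_release(struct demo_css *css)
{
	if (css)
		demo_css_put(css);
	return 0;
}
```

A successful open leaves exactly one reference held until release, while
a failed open leaves the count unchanged.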

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 43 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 32 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index f1fc4d8..b413e22 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2270,6 +2270,17 @@ static int cgroup_sane_behavior_show(struct cgroup *cgrp, struct cftype *cft,
 	return 0;
 }
 
+/* return the css for the given cgroup file */
+static struct cgroup_subsys_state *cgroup_file_css(struct cfent *cfe)
+{
+	struct cftype *cft = cfe->type;
+	struct cgroup *cgrp = __d_cgrp(cfe->dentry->d_parent);
+
+	if (cft->ss)
+		return cgrp->subsys[cft->ss->subsys_id];
+	return NULL;
+}
+
 /* A buffer size big enough for numbers or short strings */
 #define CGROUP_LOCAL_BUFFER_SIZE 64
 
@@ -2347,8 +2358,6 @@ static ssize_t cgroup_file_write(struct file *file, const char __user *buf,
 	struct cftype *cft = __d_cft(file->f_dentry);
 	struct cgroup *cgrp = __d_cgrp(file->f_dentry->d_parent);
 
-	if (cgroup_is_dead(cgrp))
-		return -ENODEV;
 	if (cft->write)
 		return cft->write(cgrp, cft, file, buf, nbytes, ppos);
 	if (cft->write_u64 || cft->write_s64)
@@ -2392,9 +2401,6 @@ static ssize_t cgroup_file_read(struct file *file, char __user *buf,
 	struct cftype *cft = __d_cft(file->f_dentry);
 	struct cgroup *cgrp = __d_cgrp(file->f_dentry->d_parent);
 
-	if (cgroup_is_dead(cgrp))
-		return -ENODEV;
-
 	if (cft->read)
 		return cft->read(cgrp, cft, file, buf, nbytes, ppos);
 	if (cft->read_u64)
@@ -2440,15 +2446,22 @@ static const struct file_operations cgroup_seqfile_operations = {
 
 static int cgroup_file_open(struct inode *inode, struct file *file)
 {
+	struct cfent *cfe = __d_cfe(file->f_dentry);
+	struct cftype *cft = __d_cft(file->f_dentry);
+	struct cgroup_subsys_state *css = cgroup_file_css(cfe);
 	int err;
-	struct cfent *cfe;
-	struct cftype *cft;
 
 	err = generic_file_open(inode, file);
 	if (err)
 		return err;
-	cfe = __d_cfe(file->f_dentry);
-	cft = cfe->type;
+
+	/*
+	 * If the file belongs to a subsystem, pin the css.  Will be
+	 * unpinned either on open failure or release.  This ensures that
+	 * @css stays alive for all file operations.
+	 */
+	if (css && !css_tryget(css))
+		return -ENODEV;
 
 	if (cft->read_map || cft->read_seq_string) {
 		file->f_op = &cgroup_seqfile_operations;
@@ -2457,15 +2470,23 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 		err = cft->open(inode, file);
 	}
 
+	if (css && err)
+		css_put(css);
 	return err;
 }
 
 static int cgroup_file_release(struct inode *inode, struct file *file)
 {
+	struct cfent *cfe = __d_cfe(file->f_dentry);
 	struct cftype *cft = __d_cft(file->f_dentry);
+	struct cgroup_subsys_state *css = cgroup_file_css(cfe);
+	int ret = 0;
+
 	if (cft->release)
-		return cft->release(inode, file);
-	return 0;
+		ret = cft->release(inode, file);
+	if (css)
+		css_put(css);
+	return ret;
 }
 
 /*
-- 
1.8.3.1



* [PATCH 11/23] cgroup: add cgroup->dummy_css
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (9 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 10/23] cgroup: pin cgroup_subsys_state when opening a cgroupfs file Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods Tejun Heo
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

The cgroup subsystem API is being converted to use css
(cgroup_subsys_state) as the main handle, which makes things a bit
awkward for subsystem agnostic core features - the "cgroup.*"
interface files and various iterations - as they don't have a css to
use.

This patch adds cgroup->dummy_css which has NULL ->ss and whose only
role is pointing back to the cgroup.  This will be used to support
subsystem agnostic features on the coming css based API.

css_parent() is updated to handle dummy_css's.  Note that css will
soon grow its own ->parent field and css_parent() will be made
trivial.
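
The parent lookup described above can be sketched in userspace as
follows.  The demo_* types are simplified assumptions for illustration
only; the branch structure follows the css_parent() change in this
patch.

```c
#include <stddef.h>

#define DEMO_SUBSYS_COUNT 2

struct demo_cgroup;

struct demo_subsys {
	int subsys_id;
};

struct demo_css {
	struct demo_cgroup *cgroup;
	struct demo_subsys *ss;	/* NULL marks a dummy css */
};

struct demo_cgroup {
	struct demo_cgroup *parent;
	struct demo_css *subsys[DEMO_SUBSYS_COUNT];
	struct demo_css dummy_css;	/* only points back to this cgroup */
};

/* Set up the back pointer, like the init_cgroup_housekeeping() hunk. */
static void demo_cgroup_init(struct demo_cgroup *cgrp,
			     struct demo_cgroup *parent)
{
	int i;

	cgrp->parent = parent;
	for (i = 0; i < DEMO_SUBSYS_COUNT; i++)
		cgrp->subsys[i] = NULL;
	cgrp->dummy_css.cgroup = cgrp;
	cgrp->dummy_css.ss = NULL;
}

/* A real css maps to the parent's css; a dummy maps to the parent's dummy. */
static struct demo_css *demo_css_parent(struct demo_css *css)
{
	struct demo_cgroup *parent_cgrp = css->cgroup->parent;

	if (!parent_cgrp)
		return NULL;

	if (css->ss)
		return parent_cgrp->subsys[css->ss->subsys_id];
	return &parent_cgrp->dummy_css;
}
```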

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h | 11 ++++++++++-
 kernel/cgroup.c        |  9 +++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 2e0e9e2..085ca93 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -225,6 +225,9 @@ struct cgroup {
 	struct list_head pidlists;
 	struct mutex pidlist_mutex;
 
+	/* dummy css with NULL ->ss, points back to this cgroup */
+	struct cgroup_subsys_state dummy_css;
+
 	/* For css percpu_ref killing and RCU-protected deletion */
 	struct rcu_head rcu_head;
 	struct work_struct destroy_work;
@@ -669,7 +672,13 @@ struct cgroup_subsys_state *css_parent(struct cgroup_subsys_state *css)
 {
 	struct cgroup *parent_cgrp = css->cgroup->parent;
 
-	return parent_cgrp ? parent_cgrp->subsys[css->ss->subsys_id] : NULL;
+	if (!parent_cgrp)
+		return NULL;
+
+	if (css->ss)
+		return parent_cgrp->subsys[css->ss->subsys_id];
+	else
+		return &parent_cgrp->dummy_css;
 }
 
 /**
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index b413e22..bb87c9f 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1365,6 +1365,7 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 	INIT_LIST_HEAD(&cgrp->release_list);
 	INIT_LIST_HEAD(&cgrp->pidlists);
 	mutex_init(&cgrp->pidlist_mutex);
+	cgrp->dummy_css.cgroup = cgrp;
 	INIT_LIST_HEAD(&cgrp->event_list);
 	spin_lock_init(&cgrp->event_list_lock);
 	simple_xattrs_init(&cgrp->xattrs);
@@ -2278,7 +2279,7 @@ static struct cgroup_subsys_state *cgroup_file_css(struct cfent *cfe)
 
 	if (cft->ss)
 		return cgrp->subsys[cft->ss->subsys_id];
-	return NULL;
+	return &cgrp->dummy_css;
 }
 
 /* A buffer size big enough for numbers or short strings */
@@ -2460,7 +2461,7 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 	 * unpinned either on open failure or release.  This ensures that
 	 * @css stays alive for all file operations.
 	 */
-	if (css && !css_tryget(css))
+	if (css->ss && !css_tryget(css))
 		return -ENODEV;
 
 	if (cft->read_map || cft->read_seq_string) {
@@ -2470,7 +2471,7 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 		err = cft->open(inode, file);
 	}
 
-	if (css && err)
+	if (css->ss && err)
 		css_put(css);
 	return err;
 }
@@ -2484,7 +2485,7 @@ static int cgroup_file_release(struct inode *inode, struct file *file)
 
 	if (cft->release)
 		ret = cft->release(inode, file);
-	if (css)
+	if (css->ss)
 		css_put(css);
 	return ret;
 }
-- 
1.8.3.1



* [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (10 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 11/23] cgroup: add cgroup->dummy_css Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02 13:27   ` Michal Hocko
                     ` (3 more replies)
  2013-08-01 21:49 ` [PATCH 13/23] cgroup: convert cgroup_next_sibling() to cgroup_next_child() Tejun Heo
                   ` (12 subsequent siblings)
  24 siblings, 4 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Michal Hocko, Balbir Singh,
	Aristeu Rozanski, Matt Helsley, Daniel Wagner, Vivek Goyal,
	Jens Axboe, Steven Rostedt

cgroup is currently in the process of transitioning to using struct
cgroup_subsys_state * as the primary handle instead of struct cgroup.
Please see the previous commit which converts the subsystem methods
for rationale.

This patch converts all cftype file operations to take @css instead of
@cgroup.  cftypes for the cgroup core files don't have their subsystem
pointer set.  These will automatically use the dummy_css added by the
previous patch and can be converted the same way.
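
As a userspace sketch of what a converted file method looks like, the
handler below receives the css directly and recovers its
subsystem-private state from the embedded member.  All demo_* names are
hypothetical, and the container_of-style helper is an assumption about
how accessors such as css_to_blkcg() are typically written rather than
something shown in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace equivalent of the usual container_of() idiom. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_css {
	void *cgroup;	/* placeholder; the real css carries more state */
};

/* Subsystem-private state embedding its css, loosely modelled on blkcg. */
struct demo_blkcg {
	unsigned int cfq_weight;
	struct demo_css css;
};

/* css -> subsystem state; NULL in, NULL out. */
static struct demo_blkcg *demo_css_to_blkcg(struct demo_css *css)
{
	return css ? demo_container_of(css, struct demo_blkcg, css) : NULL;
}

/* A read_u64-style handler that takes @css instead of a cgroup. */
static uint64_t demo_read_weight(struct demo_css *css)
{
	return demo_css_to_blkcg(css)->cfq_weight;
}
```

With the css handed in by the core, the handler does no cgroup -> css
lookup of its own.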

Most subsystem conversions are straightforward but there are some
interesting ones.

* freezer: update_if_frozen() is also converted to take @css instead
  of @cgroup for consistency.  This will make the code look simpler
  too once iterators are converted to use css.

* memory/vmpressure: mem_cgroup_from_css() needs to be exported to
  vmpressure while mem_cgroup_from_cont() can be made static.
  Updated accordingly.

* cpu: cgroup_tg() doesn't have any user left.  Removed.

* cpuacct: cgroup_ca() doesn't have any user left.  Removed.

* hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
  Removed.

* net_cls: cgrp_cls_state() doesn't have any user left.  Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 block/blk-cgroup.c         |   6 +-
 block/blk-throttle.c       |  32 ++++-----
 block/cfq-iosched.c        |  90 ++++++++++++-------------
 include/linux/cgroup.h     |  24 ++++---
 include/linux/memcontrol.h |   2 +-
 kernel/cgroup.c            | 162 +++++++++++++++++++++++----------------------
 kernel/cgroup_freezer.c    |  40 +++++------
 kernel/cpuset.c            |  35 +++++-----
 kernel/sched/core.c        |  65 +++++++++---------
 kernel/sched/cpuacct.c     |  28 +++-----
 mm/hugetlb_cgroup.c        |  26 +++-----
 mm/memcontrol.c            |  88 ++++++++++++------------
 mm/vmpressure.c            |   4 +-
 net/core/netprio_cgroup.c  |  10 ++-
 net/ipv4/tcp_memcontrol.c  |  12 ++--
 net/sched/cls_cgroup.c     |  14 ++--
 security/device_cgroup.c   |  12 ++--
 17 files changed, 322 insertions(+), 328 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 3406373..f46f3c6 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -437,10 +437,10 @@ struct request_list *__blk_queue_next_rl(struct request_list *rl,
 	return &blkg->rl;
 }
 
-static int blkcg_reset_stats(struct cgroup *cgroup, struct cftype *cftype,
-			     u64 val)
+static int blkcg_reset_stats(struct cgroup_subsys_state *css,
+			     struct cftype *cftype, u64 val)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgroup);
+	struct blkcg *blkcg = css_to_blkcg(css);
 	struct blkcg_gq *blkg;
 	int i;
 
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 08a32df..88bcfb6 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1293,10 +1293,10 @@ static u64 tg_prfill_cpu_rwstat(struct seq_file *sf,
 	return __blkg_prfill_rwstat(sf, pd, &rwstat);
 }
 
-static int tg_print_cpu_rwstat(struct cgroup *cgrp, struct cftype *cft,
-			       struct seq_file *sf)
+static int tg_print_cpu_rwstat(struct cgroup_subsys_state *css,
+			       struct cftype *cft, struct seq_file *sf)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	blkcg_print_blkgs(sf, blkcg, tg_prfill_cpu_rwstat, &blkcg_policy_throtl,
 			  cft->private, true);
@@ -1325,26 +1325,26 @@ static u64 tg_prfill_conf_uint(struct seq_file *sf, struct blkg_policy_data *pd,
 	return __blkg_prfill_u64(sf, pd, v);
 }
 
-static int tg_print_conf_u64(struct cgroup *cgrp, struct cftype *cft,
-			     struct seq_file *sf)
+static int tg_print_conf_u64(struct cgroup_subsys_state *css,
+			     struct cftype *cft, struct seq_file *sf)
 {
-	blkcg_print_blkgs(sf, cgroup_to_blkcg(cgrp), tg_prfill_conf_u64,
+	blkcg_print_blkgs(sf, css_to_blkcg(css), tg_prfill_conf_u64,
 			  &blkcg_policy_throtl, cft->private, false);
 	return 0;
 }
 
-static int tg_print_conf_uint(struct cgroup *cgrp, struct cftype *cft,
-			      struct seq_file *sf)
+static int tg_print_conf_uint(struct cgroup_subsys_state *css,
+			      struct cftype *cft, struct seq_file *sf)
 {
-	blkcg_print_blkgs(sf, cgroup_to_blkcg(cgrp), tg_prfill_conf_uint,
+	blkcg_print_blkgs(sf, css_to_blkcg(css), tg_prfill_conf_uint,
 			  &blkcg_policy_throtl, cft->private, false);
 	return 0;
 }
 
-static int tg_set_conf(struct cgroup *cgrp, struct cftype *cft, const char *buf,
-		       bool is_u64)
+static int tg_set_conf(struct cgroup_subsys_state *css, struct cftype *cft,
+		       const char *buf, bool is_u64)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 	struct blkg_conf_ctx ctx;
 	struct throtl_grp *tg;
 	struct throtl_service_queue *sq;
@@ -1403,16 +1403,16 @@ static int tg_set_conf(struct cgroup *cgrp, struct cftype *cft, const char *buf,
 	return 0;
 }
 
-static int tg_set_conf_u64(struct cgroup *cgrp, struct cftype *cft,
+static int tg_set_conf_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 			   const char *buf)
 {
-	return tg_set_conf(cgrp, cft, buf, true);
+	return tg_set_conf(css, cft, buf, true);
 }
 
-static int tg_set_conf_uint(struct cgroup *cgrp, struct cftype *cft,
+static int tg_set_conf_uint(struct cgroup_subsys_state *css, struct cftype *cft,
 			    const char *buf)
 {
-	return tg_set_conf(cgrp, cft, buf, false);
+	return tg_set_conf(css, cft, buf, false);
 }
 
 static struct cftype throtl_files[] = {
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index d5bbdcf..dabb9d0 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1607,12 +1607,11 @@ static u64 cfqg_prfill_weight_device(struct seq_file *sf,
 	return __blkg_prfill_u64(sf, pd, cfqg->dev_weight);
 }
 
-static int cfqg_print_weight_device(struct cgroup *cgrp, struct cftype *cft,
-				    struct seq_file *sf)
+static int cfqg_print_weight_device(struct cgroup_subsys_state *css,
+				    struct cftype *cft, struct seq_file *sf)
 {
-	blkcg_print_blkgs(sf, cgroup_to_blkcg(cgrp),
-			  cfqg_prfill_weight_device, &blkcg_policy_cfq, 0,
-			  false);
+	blkcg_print_blkgs(sf, css_to_blkcg(css), cfqg_prfill_weight_device,
+			  &blkcg_policy_cfq, 0, false);
 	return 0;
 }
 
@@ -1626,35 +1625,34 @@ static u64 cfqg_prfill_leaf_weight_device(struct seq_file *sf,
 	return __blkg_prfill_u64(sf, pd, cfqg->dev_leaf_weight);
 }
 
-static int cfqg_print_leaf_weight_device(struct cgroup *cgrp,
+static int cfqg_print_leaf_weight_device(struct cgroup_subsys_state *css,
 					 struct cftype *cft,
 					 struct seq_file *sf)
 {
-	blkcg_print_blkgs(sf, cgroup_to_blkcg(cgrp),
-			  cfqg_prfill_leaf_weight_device, &blkcg_policy_cfq, 0,
-			  false);
+	blkcg_print_blkgs(sf, css_to_blkcg(css), cfqg_prfill_leaf_weight_device,
+			  &blkcg_policy_cfq, 0, false);
 	return 0;
 }
 
-static int cfq_print_weight(struct cgroup *cgrp, struct cftype *cft,
+static int cfq_print_weight(struct cgroup_subsys_state *css, struct cftype *cft,
 			    struct seq_file *sf)
 {
-	seq_printf(sf, "%u\n", cgroup_to_blkcg(cgrp)->cfq_weight);
+	seq_printf(sf, "%u\n", css_to_blkcg(css)->cfq_weight);
 	return 0;
 }
 
-static int cfq_print_leaf_weight(struct cgroup *cgrp, struct cftype *cft,
-				 struct seq_file *sf)
+static int cfq_print_leaf_weight(struct cgroup_subsys_state *css,
+				 struct cftype *cft, struct seq_file *sf)
 {
-	seq_printf(sf, "%u\n",
-		   cgroup_to_blkcg(cgrp)->cfq_leaf_weight);
+	seq_printf(sf, "%u\n", css_to_blkcg(css)->cfq_leaf_weight);
 	return 0;
 }
 
-static int __cfqg_set_weight_device(struct cgroup *cgrp, struct cftype *cft,
-				    const char *buf, bool is_leaf_weight)
+static int __cfqg_set_weight_device(struct cgroup_subsys_state *css,
+				    struct cftype *cft, const char *buf,
+				    bool is_leaf_weight)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 	struct blkg_conf_ctx ctx;
 	struct cfq_group *cfqg;
 	int ret;
@@ -1680,22 +1678,22 @@ static int __cfqg_set_weight_device(struct cgroup *cgrp, struct cftype *cft,
 	return ret;
 }
 
-static int cfqg_set_weight_device(struct cgroup *cgrp, struct cftype *cft,
-				  const char *buf)
+static int cfqg_set_weight_device(struct cgroup_subsys_state *css,
+				  struct cftype *cft, const char *buf)
 {
-	return __cfqg_set_weight_device(cgrp, cft, buf, false);
+	return __cfqg_set_weight_device(css, cft, buf, false);
 }
 
-static int cfqg_set_leaf_weight_device(struct cgroup *cgrp, struct cftype *cft,
-				       const char *buf)
+static int cfqg_set_leaf_weight_device(struct cgroup_subsys_state *css,
+				       struct cftype *cft, const char *buf)
 {
-	return __cfqg_set_weight_device(cgrp, cft, buf, true);
+	return __cfqg_set_weight_device(css, cft, buf, true);
 }
 
-static int __cfq_set_weight(struct cgroup *cgrp, struct cftype *cft, u64 val,
-			    bool is_leaf_weight)
+static int __cfq_set_weight(struct cgroup_subsys_state *css, struct cftype *cft,
+			    u64 val, bool is_leaf_weight)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 	struct blkcg_gq *blkg;
 
 	if (val < CFQ_WEIGHT_MIN || val > CFQ_WEIGHT_MAX)
@@ -1727,30 +1725,32 @@ static int __cfq_set_weight(struct cgroup *cgrp, struct cftype *cft, u64 val,
 	return 0;
 }
 
-static int cfq_set_weight(struct cgroup *cgrp, struct cftype *cft, u64 val)
+static int cfq_set_weight(struct cgroup_subsys_state *css, struct cftype *cft,
+			  u64 val)
 {
-	return __cfq_set_weight(cgrp, cft, val, false);
+	return __cfq_set_weight(css, cft, val, false);
 }
 
-static int cfq_set_leaf_weight(struct cgroup *cgrp, struct cftype *cft, u64 val)
+static int cfq_set_leaf_weight(struct cgroup_subsys_state *css,
+			       struct cftype *cft, u64 val)
 {
-	return __cfq_set_weight(cgrp, cft, val, true);
+	return __cfq_set_weight(css, cft, val, true);
 }
 
-static int cfqg_print_stat(struct cgroup *cgrp, struct cftype *cft,
+static int cfqg_print_stat(struct cgroup_subsys_state *css, struct cftype *cft,
 			   struct seq_file *sf)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	blkcg_print_blkgs(sf, blkcg, blkg_prfill_stat, &blkcg_policy_cfq,
 			  cft->private, false);
 	return 0;
 }
 
-static int cfqg_print_rwstat(struct cgroup *cgrp, struct cftype *cft,
-			     struct seq_file *sf)
+static int cfqg_print_rwstat(struct cgroup_subsys_state *css,
+			     struct cftype *cft, struct seq_file *sf)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	blkcg_print_blkgs(sf, blkcg, blkg_prfill_rwstat, &blkcg_policy_cfq,
 			  cft->private, true);
@@ -1773,20 +1773,20 @@ static u64 cfqg_prfill_rwstat_recursive(struct seq_file *sf,
 	return __blkg_prfill_rwstat(sf, pd, &sum);
 }
 
-static int cfqg_print_stat_recursive(struct cgroup *cgrp, struct cftype *cft,
-				     struct seq_file *sf)
+static int cfqg_print_stat_recursive(struct cgroup_subsys_state *css,
+				     struct cftype *cft, struct seq_file *sf)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_stat_recursive,
 			  &blkcg_policy_cfq, cft->private, false);
 	return 0;
 }
 
-static int cfqg_print_rwstat_recursive(struct cgroup *cgrp, struct cftype *cft,
-				       struct seq_file *sf)
+static int cfqg_print_rwstat_recursive(struct cgroup_subsys_state *css,
+				       struct cftype *cft, struct seq_file *sf)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_rwstat_recursive,
 			  &blkcg_policy_cfq, cft->private, true);
@@ -1810,10 +1810,10 @@ static u64 cfqg_prfill_avg_queue_size(struct seq_file *sf,
 }
 
 /* print avg_queue_size */
-static int cfqg_print_avg_queue_size(struct cgroup *cgrp, struct cftype *cft,
-				     struct seq_file *sf)
+static int cfqg_print_avg_queue_size(struct cgroup_subsys_state *css,
+				     struct cftype *cft, struct seq_file *sf)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_avg_queue_size,
 			  &blkcg_policy_cfq, 0, false);
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 085ca93..9749d63 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -439,34 +439,34 @@ struct cftype {
 	struct cgroup_subsys *ss;
 
 	int (*open)(struct inode *inode, struct file *file);
-	ssize_t (*read)(struct cgroup *cgrp, struct cftype *cft,
+	ssize_t (*read)(struct cgroup_subsys_state *css, struct cftype *cft,
 			struct file *file,
 			char __user *buf, size_t nbytes, loff_t *ppos);
 	/*
 	 * read_u64() is a shortcut for the common case of returning a
 	 * single integer. Use it in place of read()
 	 */
-	u64 (*read_u64)(struct cgroup *cgrp, struct cftype *cft);
+	u64 (*read_u64)(struct cgroup_subsys_state *css, struct cftype *cft);
 	/*
 	 * read_s64() is a signed version of read_u64()
 	 */
-	s64 (*read_s64)(struct cgroup *cgrp, struct cftype *cft);
+	s64 (*read_s64)(struct cgroup_subsys_state *css, struct cftype *cft);
 	/*
 	 * read_map() is used for defining a map of key/value
 	 * pairs. It should call cb->fill(cb, key, value) for each
 	 * entry. The key/value pairs (and their ordering) should not
 	 * change between reboots.
 	 */
-	int (*read_map)(struct cgroup *cgrp, struct cftype *cft,
+	int (*read_map)(struct cgroup_subsys_state *css, struct cftype *cft,
 			struct cgroup_map_cb *cb);
 	/*
 	 * read_seq_string() is used for outputting a simple sequence
 	 * using seqfile.
 	 */
-	int (*read_seq_string)(struct cgroup *cgrp, struct cftype *cft,
-			       struct seq_file *m);
+	int (*read_seq_string)(struct cgroup_subsys_state *css,
+			       struct cftype *cft, struct seq_file *m);
 
-	ssize_t (*write)(struct cgroup *cgrp, struct cftype *cft,
+	ssize_t (*write)(struct cgroup_subsys_state *css, struct cftype *cft,
 			 struct file *file,
 			 const char __user *buf, size_t nbytes, loff_t *ppos);
 
@@ -475,18 +475,20 @@ struct cftype {
 	 * a single integer (as parsed by simple_strtoull) from
 	 * userspace. Use in place of write(); return 0 or error.
 	 */
-	int (*write_u64)(struct cgroup *cgrp, struct cftype *cft, u64 val);
+	int (*write_u64)(struct cgroup_subsys_state *css, struct cftype *cft,
+			 u64 val);
 	/*
 	 * write_s64() is a signed version of write_u64()
 	 */
-	int (*write_s64)(struct cgroup *cgrp, struct cftype *cft, s64 val);
+	int (*write_s64)(struct cgroup_subsys_state *css, struct cftype *cft,
+			 s64 val);
 
 	/*
 	 * write_string() is passed a nul-terminated kernelspace
 	 * buffer of maximum length determined by max_write_len.
 	 * Returns 0 or -ve error code.
 	 */
-	int (*write_string)(struct cgroup *cgrp, struct cftype *cft,
+	int (*write_string)(struct cgroup_subsys_state *css, struct cftype *cft,
 			    const char *buffer);
 	/*
 	 * trigger() callback can be used to get some kick from the
@@ -494,7 +496,7 @@ struct cftype {
 	 * at all. The private field can be used to determine the
 	 * kick type for multiplexing.
 	 */
-	int (*trigger)(struct cgroup *cgrp, unsigned int event);
+	int (*trigger)(struct cgroup_subsys_state *css, unsigned int event);
 
 	int (*release)(struct inode *inode, struct file *file);
 
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 7b4d9d7..6c41609 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -85,7 +85,7 @@ extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
 extern struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm);
 
 extern struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg);
-extern struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont);
+extern struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css);
 
 static inline
 bool mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *memcg)
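
Illustrative note, not part of the patch: the cftype callbacks converted
above now receive a css, and helpers such as css_to_blkcg() or
mem_cgroup_from_css() map it to the subsystem's own state.  The following
is a minimal user-space sketch of that pattern.  The struct layouts, the
demo_* names, and the weight range are assumptions made up for the
example; they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; layouts are assumed. */
struct cgroup { unsigned long flags; };

struct cgroup_subsys_state {
	struct cgroup *cgroup;	/* back pointer, mirrors css->cgroup in the patch */
};

struct cftype;
struct cftype {
	uint64_t (*read_u64)(struct cgroup_subsys_state *css,
			     struct cftype *cft);
	int (*write_u64)(struct cgroup_subsys_state *css, struct cftype *cft,
			 uint64_t val);
};

/* Toy per-subsystem state embedding a css, the way blkcg or task_group do. */
struct demo_state {
	uint64_t weight;
	struct cgroup_subsys_state css;
};

/* Local equivalent of the kernel's container_of(), built on offsetof(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as css_tg(): NULL in, NULL out. */
static struct demo_state *css_to_demo(struct cgroup_subsys_state *css)
{
	return css ? demo_container_of(css, struct demo_state, css) : NULL;
}

static uint64_t demo_weight_read(struct cgroup_subsys_state *css,
				 struct cftype *cft)
{
	(void)cft;
	assert(css != NULL);
	return css_to_demo(css)->weight;
}

static int demo_weight_write(struct cgroup_subsys_state *css,
			     struct cftype *cft, uint64_t val)
{
	(void)cft;
	assert(css != NULL);
	if (val < 10 || val > 1000)	/* made-up range for the example */
		return -1;		/* stands in for -EINVAL */
	css_to_demo(css)->weight = val;
	return 0;
}
```

A handler written this way never needs the cgroup -> css lookup; code that
still wants the cgroup can follow css->cgroup, as the core handlers in
kernel/cgroup.c do.
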
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index bb87c9f..6c68192 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2228,34 +2228,38 @@ int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
 }
 EXPORT_SYMBOL_GPL(cgroup_attach_task_all);
 
-static int cgroup_tasks_write(struct cgroup *cgrp, struct cftype *cft, u64 pid)
+static int cgroup_tasks_write(struct cgroup_subsys_state *css,
+			      struct cftype *cft, u64 pid)
 {
-	return attach_task_by_pid(cgrp, pid, false);
+	return attach_task_by_pid(css->cgroup, pid, false);
 }
 
-static int cgroup_procs_write(struct cgroup *cgrp, struct cftype *cft, u64 tgid)
+static int cgroup_procs_write(struct cgroup_subsys_state *css,
+			      struct cftype *cft, u64 tgid)
 {
-	return attach_task_by_pid(cgrp, tgid, true);
+	return attach_task_by_pid(css->cgroup, tgid, true);
 }
 
-static int cgroup_release_agent_write(struct cgroup *cgrp, struct cftype *cft,
-				      const char *buffer)
+static int cgroup_release_agent_write(struct cgroup_subsys_state *css,
+				      struct cftype *cft, const char *buffer)
 {
-	BUILD_BUG_ON(sizeof(cgrp->root->release_agent_path) < PATH_MAX);
+	BUILD_BUG_ON(sizeof(css->cgroup->root->release_agent_path) < PATH_MAX);
 	if (strlen(buffer) >= PATH_MAX)
 		return -EINVAL;
-	if (!cgroup_lock_live_group(cgrp))
+	if (!cgroup_lock_live_group(css->cgroup))
 		return -ENODEV;
 	mutex_lock(&cgroup_root_mutex);
-	strcpy(cgrp->root->release_agent_path, buffer);
+	strcpy(css->cgroup->root->release_agent_path, buffer);
 	mutex_unlock(&cgroup_root_mutex);
 	mutex_unlock(&cgroup_mutex);
 	return 0;
 }
 
-static int cgroup_release_agent_show(struct cgroup *cgrp, struct cftype *cft,
-				     struct seq_file *seq)
+static int cgroup_release_agent_show(struct cgroup_subsys_state *css,
+				     struct cftype *cft, struct seq_file *seq)
 {
+	struct cgroup *cgrp = css->cgroup;
+
 	if (!cgroup_lock_live_group(cgrp))
 		return -ENODEV;
 	seq_puts(seq, cgrp->root->release_agent_path);
@@ -2264,10 +2268,10 @@ static int cgroup_release_agent_show(struct cgroup *cgrp, struct cftype *cft,
 	return 0;
 }
 
-static int cgroup_sane_behavior_show(struct cgroup *cgrp, struct cftype *cft,
-				     struct seq_file *seq)
+static int cgroup_sane_behavior_show(struct cgroup_subsys_state *css,
+				     struct cftype *cft, struct seq_file *seq)
 {
-	seq_printf(seq, "%d\n", cgroup_sane_behavior(cgrp));
+	seq_printf(seq, "%d\n", cgroup_sane_behavior(css->cgroup));
 	return 0;
 }
 
@@ -2285,10 +2289,10 @@ static struct cgroup_subsys_state *cgroup_file_css(struct cfent *cfe)
 /* A buffer size big enough for numbers or short strings */
 #define CGROUP_LOCAL_BUFFER_SIZE 64
 
-static ssize_t cgroup_write_X64(struct cgroup *cgrp, struct cftype *cft,
-				struct file *file,
-				const char __user *userbuf,
-				size_t nbytes, loff_t *unused_ppos)
+static ssize_t cgroup_write_X64(struct cgroup_subsys_state *css,
+				struct cftype *cft, struct file *file,
+				const char __user *userbuf, size_t nbytes,
+				loff_t *unused_ppos)
 {
 	char buffer[CGROUP_LOCAL_BUFFER_SIZE];
 	int retval = 0;
@@ -2306,22 +2310,22 @@ static ssize_t cgroup_write_X64(struct cgroup *cgrp, struct cftype *cft,
 		u64 val = simple_strtoull(strstrip(buffer), &end, 0);
 		if (*end)
 			return -EINVAL;
-		retval = cft->write_u64(cgrp, cft, val);
+		retval = cft->write_u64(css, cft, val);
 	} else {
 		s64 val = simple_strtoll(strstrip(buffer), &end, 0);
 		if (*end)
 			return -EINVAL;
-		retval = cft->write_s64(cgrp, cft, val);
+		retval = cft->write_s64(css, cft, val);
 	}
 	if (!retval)
 		retval = nbytes;
 	return retval;
 }
 
-static ssize_t cgroup_write_string(struct cgroup *cgrp, struct cftype *cft,
-				   struct file *file,
-				   const char __user *userbuf,
-				   size_t nbytes, loff_t *unused_ppos)
+static ssize_t cgroup_write_string(struct cgroup_subsys_state *css,
+				   struct cftype *cft, struct file *file,
+				   const char __user *userbuf, size_t nbytes,
+				   loff_t *unused_ppos)
 {
 	char local_buffer[CGROUP_LOCAL_BUFFER_SIZE];
 	int retval = 0;
@@ -2344,7 +2348,7 @@ static ssize_t cgroup_write_string(struct cgroup *cgrp, struct cftype *cft,
 	}
 
 	buffer[nbytes] = 0;     /* nul-terminate */
-	retval = cft->write_string(cgrp, cft, strstrip(buffer));
+	retval = cft->write_string(css, cft, strstrip(buffer));
 	if (!retval)
 		retval = nbytes;
 out:
@@ -2354,60 +2358,60 @@ out:
 }
 
 static ssize_t cgroup_file_write(struct file *file, const char __user *buf,
-						size_t nbytes, loff_t *ppos)
+				 size_t nbytes, loff_t *ppos)
 {
+	struct cfent *cfe = __d_cfe(file->f_dentry);
 	struct cftype *cft = __d_cft(file->f_dentry);
-	struct cgroup *cgrp = __d_cgrp(file->f_dentry->d_parent);
+	struct cgroup_subsys_state *css = cgroup_file_css(cfe);
 
 	if (cft->write)
-		return cft->write(cgrp, cft, file, buf, nbytes, ppos);
+		return cft->write(css, cft, file, buf, nbytes, ppos);
 	if (cft->write_u64 || cft->write_s64)
-		return cgroup_write_X64(cgrp, cft, file, buf, nbytes, ppos);
+		return cgroup_write_X64(css, cft, file, buf, nbytes, ppos);
 	if (cft->write_string)
-		return cgroup_write_string(cgrp, cft, file, buf, nbytes, ppos);
+		return cgroup_write_string(css, cft, file, buf, nbytes, ppos);
 	if (cft->trigger) {
-		int ret = cft->trigger(cgrp, (unsigned int)cft->private);
+		int ret = cft->trigger(css, (unsigned int)cft->private);
 		return ret ? ret : nbytes;
 	}
 	return -EINVAL;
 }
 
-static ssize_t cgroup_read_u64(struct cgroup *cgrp, struct cftype *cft,
-			       struct file *file,
-			       char __user *buf, size_t nbytes,
-			       loff_t *ppos)
+static ssize_t cgroup_read_u64(struct cgroup_subsys_state *css,
+			       struct cftype *cft, struct file *file,
+			       char __user *buf, size_t nbytes, loff_t *ppos)
 {
 	char tmp[CGROUP_LOCAL_BUFFER_SIZE];
-	u64 val = cft->read_u64(cgrp, cft);
+	u64 val = cft->read_u64(css, cft);
 	int len = sprintf(tmp, "%llu\n", (unsigned long long) val);
 
 	return simple_read_from_buffer(buf, nbytes, ppos, tmp, len);
 }
 
-static ssize_t cgroup_read_s64(struct cgroup *cgrp, struct cftype *cft,
-			       struct file *file,
-			       char __user *buf, size_t nbytes,
-			       loff_t *ppos)
+static ssize_t cgroup_read_s64(struct cgroup_subsys_state *css,
+			       struct cftype *cft, struct file *file,
+			       char __user *buf, size_t nbytes, loff_t *ppos)
 {
 	char tmp[CGROUP_LOCAL_BUFFER_SIZE];
-	s64 val = cft->read_s64(cgrp, cft);
+	s64 val = cft->read_s64(css, cft);
 	int len = sprintf(tmp, "%lld\n", (long long) val);
 
 	return simple_read_from_buffer(buf, nbytes, ppos, tmp, len);
 }
 
 static ssize_t cgroup_file_read(struct file *file, char __user *buf,
-				   size_t nbytes, loff_t *ppos)
+				size_t nbytes, loff_t *ppos)
 {
+	struct cfent *cfe = __d_cfe(file->f_dentry);
 	struct cftype *cft = __d_cft(file->f_dentry);
-	struct cgroup *cgrp = __d_cgrp(file->f_dentry->d_parent);
+	struct cgroup_subsys_state *css = cgroup_file_css(cfe);
 
 	if (cft->read)
-		return cft->read(cgrp, cft, file, buf, nbytes, ppos);
+		return cft->read(css, cft, file, buf, nbytes, ppos);
 	if (cft->read_u64)
-		return cgroup_read_u64(cgrp, cft, file, buf, nbytes, ppos);
+		return cgroup_read_u64(css, cft, file, buf, nbytes, ppos);
 	if (cft->read_s64)
-		return cgroup_read_s64(cgrp, cft, file, buf, nbytes, ppos);
+		return cgroup_read_s64(css, cft, file, buf, nbytes, ppos);
 	return -EINVAL;
 }
 
@@ -2426,16 +2430,16 @@ static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 {
 	struct cfent *cfe = m->private;
 	struct cftype *cft = cfe->type;
-	struct cgroup *cgrp = __d_cgrp(cfe->dentry->d_parent);
+	struct cgroup_subsys_state *css = cgroup_file_css(cfe);
 
 	if (cft->read_map) {
 		struct cgroup_map_cb cb = {
 			.fill = cgroup_map_add,
 			.state = m,
 		};
-		return cft->read_map(cgrp, cft, &cb);
+		return cft->read_map(css, cft, &cb);
 	}
-	return cft->read_seq_string(cgrp, cft, m);
+	return cft->read_seq_string(css, cft, m);
 }
 
 static const struct file_operations cgroup_seqfile_operations = {
@@ -3853,21 +3857,20 @@ static int cgroup_procs_open(struct inode *unused, struct file *file)
 	return cgroup_pidlist_open(file, CGROUP_FILE_PROCS);
 }
 
-static u64 cgroup_read_notify_on_release(struct cgroup *cgrp,
-					    struct cftype *cft)
+static u64 cgroup_read_notify_on_release(struct cgroup_subsys_state *css,
+					 struct cftype *cft)
 {
-	return notify_on_release(cgrp);
+	return notify_on_release(css->cgroup);
 }
 
-static int cgroup_write_notify_on_release(struct cgroup *cgrp,
-					  struct cftype *cft,
-					  u64 val)
+static int cgroup_write_notify_on_release(struct cgroup_subsys_state *css,
+					  struct cftype *cft, u64 val)
 {
-	clear_bit(CGRP_RELEASABLE, &cgrp->flags);
+	clear_bit(CGRP_RELEASABLE, &css->cgroup->flags);
 	if (val)
-		set_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);
+		set_bit(CGRP_NOTIFY_ON_RELEASE, &css->cgroup->flags);
 	else
-		clear_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);
+		clear_bit(CGRP_NOTIFY_ON_RELEASE, &css->cgroup->flags);
 	return 0;
 }
 
@@ -3965,9 +3968,10 @@ static void cgroup_event_ptable_queue_proc(struct file *file,
  * Input must be in format '<event_fd> <control_fd> <args>'.
  * Interpretation of args is defined by control file implementation.
  */
-static int cgroup_write_event_control(struct cgroup *cgrp, struct cftype *cft,
-				      const char *buffer)
+static int cgroup_write_event_control(struct cgroup_subsys_state *css,
+				      struct cftype *cft, const char *buffer)
 {
+	struct cgroup *cgrp = css->cgroup;
 	struct cgroup_event *event;
 	struct cgroup *cgrp_cfile;
 	unsigned int efd, cfd;
@@ -4075,20 +4079,19 @@ out_kfree:
 	return ret;
 }
 
-static u64 cgroup_clone_children_read(struct cgroup *cgrp,
-				    struct cftype *cft)
+static u64 cgroup_clone_children_read(struct cgroup_subsys_state *css,
+				      struct cftype *cft)
 {
-	return test_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags);
+	return test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags);
 }
 
-static int cgroup_clone_children_write(struct cgroup *cgrp,
-				     struct cftype *cft,
-				     u64 val)
+static int cgroup_clone_children_write(struct cgroup_subsys_state *css,
+				       struct cftype *cft, u64 val)
 {
 	if (val)
-		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags);
+		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags);
 	else
-		clear_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags);
+		clear_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags);
 	return 0;
 }
 
@@ -5576,17 +5579,19 @@ static void debug_css_free(struct cgroup_subsys_state *css)
 	kfree(css);
 }
 
-static u64 debug_taskcount_read(struct cgroup *cgrp, struct cftype *cft)
+static u64 debug_taskcount_read(struct cgroup_subsys_state *css,
+				struct cftype *cft)
 {
-	return cgroup_task_count(cgrp);
+	return cgroup_task_count(css->cgroup);
 }
 
-static u64 current_css_set_read(struct cgroup *cgrp, struct cftype *cft)
+static u64 current_css_set_read(struct cgroup_subsys_state *css,
+				struct cftype *cft)
 {
 	return (u64)(unsigned long)current->cgroups;
 }
 
-static u64 current_css_set_refcount_read(struct cgroup *cgrp,
+static u64 current_css_set_refcount_read(struct cgroup_subsys_state *css,
 					 struct cftype *cft)
 {
 	u64 count;
@@ -5597,7 +5602,7 @@ static u64 current_css_set_refcount_read(struct cgroup *cgrp,
 	return count;
 }
 
-static int current_css_set_cg_links_read(struct cgroup *cgrp,
+static int current_css_set_cg_links_read(struct cgroup_subsys_state *css,
 					 struct cftype *cft,
 					 struct seq_file *seq)
 {
@@ -5624,14 +5629,13 @@ static int current_css_set_cg_links_read(struct cgroup *cgrp,
 }
 
 #define MAX_TASKS_SHOWN_PER_CSS 25
-static int cgroup_css_links_read(struct cgroup *cgrp,
-				 struct cftype *cft,
-				 struct seq_file *seq)
+static int cgroup_css_links_read(struct cgroup_subsys_state *css,
+				 struct cftype *cft, struct seq_file *seq)
 {
 	struct cgrp_cset_link *link;
 
 	read_lock(&css_set_lock);
-	list_for_each_entry(link, &cgrp->cset_links, cset_link) {
+	list_for_each_entry(link, &css->cgroup->cset_links, cset_link) {
 		struct css_set *cset = link->cset;
 		struct task_struct *task;
 		int count = 0;
@@ -5650,9 +5654,9 @@ static int cgroup_css_links_read(struct cgroup *cgrp,
 	return 0;
 }
 
-static u64 releasable_read(struct cgroup *cgrp, struct cftype *cft)
+static u64 releasable_read(struct cgroup_subsys_state *css, struct cftype *cft)
 {
-	return test_bit(CGRP_RELEASABLE, &cgrp->flags);
+	return test_bit(CGRP_RELEASABLE, &css->cgroup->flags);
 }
 
 static struct cftype debug_files[] =  {
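
Illustrative note, not part of the patch: in the kernel/cgroup.c hunks
above, cgroup_file_read() resolves the css once through
cgroup_file_css(cfe) and then passes it to whichever handler the cftype
provides.  The sketch below models only that dispatch order in user
space.  All demo_* names are hypothetical, the buffer handling is
simplified to snprintf(), and -1 stands in for -EINVAL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real structs carry far more state. */
struct cgroup_subsys_state { int id; };

struct cftype;
struct cftype {
	uint64_t (*read_u64)(struct cgroup_subsys_state *css,
			     struct cftype *cft);
	int64_t (*read_s64)(struct cgroup_subsys_state *css,
			    struct cftype *cft);
};

/* Two toy handlers with the post-patch signatures. */
static uint64_t demo_u64(struct cgroup_subsys_state *css, struct cftype *cft)
{
	(void)css; (void)cft;
	return 42;
}

static int64_t demo_s64(struct cgroup_subsys_state *css, struct cftype *cft)
{
	(void)css; (void)cft;
	return -7;
}

/*
 * Dispatch in the same order as cgroup_file_read(): read_u64 first, then
 * read_s64.  Returns the formatted length, or -1 if no handler is set.
 */
static int demo_file_read(struct cgroup_subsys_state *css, struct cftype *cft,
			  char *buf, size_t len)
{
	if (cft->read_u64)
		return snprintf(buf, len, "%llu\n",
				(unsigned long long)cft->read_u64(css, cft));
	if (cft->read_s64)
		return snprintf(buf, len, "%lld\n",
				(long long)cft->read_s64(css, cft));
	return -1;
}
```

The point of the sketch is that the css is looked up in one place by the
core and the handlers only ever see the css, which is what lets the core
own the cgroup -> css mapping.
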
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index f03a857..19613ba 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -245,7 +245,7 @@ out:
 
 /**
  * update_if_frozen - update whether a cgroup finished freezing
- * @cgroup: cgroup of interest
+ * @css: css of interest
  *
  * Once FREEZING is initiated, transition to FROZEN is lazily updated by
  * calling this function.  If the current state is FREEZING but not FROZEN,
@@ -256,12 +256,12 @@ out:
  * update_if_frozen() on all descendants prior to invoking this function.
  *
  * Task states and freezer state might disagree while tasks are being
- * migrated into or out of @cgroup, so we can't verify task states against
+ * migrated into or out of @css, so we can't verify task states against
  * @freezer state here.  See freezer_attach() for details.
  */
-static void update_if_frozen(struct cgroup *cgroup)
+static void update_if_frozen(struct cgroup_subsys_state *css)
 {
-	struct freezer *freezer = cgroup_freezer(cgroup);
+	struct freezer *freezer = css_freezer(css);
 	struct cgroup *pos;
 	struct cgroup_iter it;
 	struct task_struct *task;
@@ -275,7 +275,7 @@ static void update_if_frozen(struct cgroup *cgroup)
 		goto out_unlock;
 
 	/* are all (live) children frozen? */
-	cgroup_for_each_child(pos, cgroup) {
+	cgroup_for_each_child(pos, css->cgroup) {
 		struct freezer *child = cgroup_freezer(pos);
 
 		if ((child->state & CGROUP_FREEZER_ONLINE) &&
@@ -284,9 +284,9 @@ static void update_if_frozen(struct cgroup *cgroup)
 	}
 
 	/* are all tasks frozen? */
-	cgroup_iter_start(cgroup, &it);
+	cgroup_iter_start(css->cgroup, &it);
 
-	while ((task = cgroup_iter_next(cgroup, &it))) {
+	while ((task = cgroup_iter_next(css->cgroup, &it))) {
 		if (freezing(task)) {
 			/*
 			 * freezer_should_skip() indicates that the task
@@ -301,12 +301,12 @@ static void update_if_frozen(struct cgroup *cgroup)
 
 	freezer->state |= CGROUP_FROZEN;
 out_iter_end:
-	cgroup_iter_end(cgroup, &it);
+	cgroup_iter_end(css->cgroup, &it);
 out_unlock:
 	spin_unlock_irq(&freezer->lock);
 }
 
-static int freezer_read(struct cgroup *cgroup, struct cftype *cft,
+static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
 			struct seq_file *m)
 {
 	struct cgroup *pos;
@@ -314,13 +314,13 @@ static int freezer_read(struct cgroup *cgroup, struct cftype *cft,
 	rcu_read_lock();
 
 	/* update states bottom-up */
-	cgroup_for_each_descendant_post(pos, cgroup)
-		update_if_frozen(pos);
-	update_if_frozen(cgroup);
+	cgroup_for_each_descendant_post(pos, css->cgroup)
+		update_if_frozen(cgroup_css(pos, freezer_subsys_id));
+	update_if_frozen(css);
 
 	rcu_read_unlock();
 
-	seq_puts(m, freezer_state_strs(cgroup_freezer(cgroup)->state));
+	seq_puts(m, freezer_state_strs(css_freezer(css)->state));
 	seq_putc(m, '\n');
 	return 0;
 }
@@ -426,7 +426,7 @@ static void freezer_change_state(struct freezer *freezer, bool freeze)
 	rcu_read_unlock();
 }
 
-static int freezer_write(struct cgroup *cgroup, struct cftype *cft,
+static int freezer_write(struct cgroup_subsys_state *css, struct cftype *cft,
 			 const char *buffer)
 {
 	bool freeze;
@@ -438,20 +438,22 @@ static int freezer_write(struct cgroup *cgroup, struct cftype *cft,
 	else
 		return -EINVAL;
 
-	freezer_change_state(cgroup_freezer(cgroup), freeze);
+	freezer_change_state(css_freezer(css), freeze);
 	return 0;
 }
 
-static u64 freezer_self_freezing_read(struct cgroup *cgroup, struct cftype *cft)
+static u64 freezer_self_freezing_read(struct cgroup_subsys_state *css,
+				      struct cftype *cft)
 {
-	struct freezer *freezer = cgroup_freezer(cgroup);
+	struct freezer *freezer = css_freezer(css);
 
 	return (bool)(freezer->state & CGROUP_FREEZING_SELF);
 }
 
-static u64 freezer_parent_freezing_read(struct cgroup *cgroup, struct cftype *cft)
+static u64 freezer_parent_freezing_read(struct cgroup_subsys_state *css,
+					struct cftype *cft)
 {
-	struct freezer *freezer = cgroup_freezer(cgroup);
+	struct freezer *freezer = css_freezer(css);
 
 	return (bool)(freezer->state & CGROUP_FREEZING_PARENT);
 }
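
Illustrative note, not part of the patch: freezer_read() above still walks
cgroups with cgroup_for_each_descendant_post() and converts each position
to a css with cgroup_css(pos, freezer_subsys_id) before calling
update_if_frozen().  The sketch below is a simplified user-space model of
that bottom-up update.  It uses plain recursion instead of the kernel
iterator, ignores locking and RCU, and every demo_* name and struct layout
is an assumption for the example.

```c
#include <stddef.h>

enum { DEMO_FREEZER_ID, DEMO_SUBSYS_COUNT };

struct cgroup;

/* Hypothetical css carrying the freezer flags directly. */
struct cgroup_subsys_state {
	struct cgroup *cgroup;
	int freezing;		/* freezing was requested */
	int tasks_frozen;	/* all tasks in this cgroup are frozen */
	int frozen;		/* result, set lazily */
};

struct cgroup {
	struct cgroup_subsys_state *subsys[DEMO_SUBSYS_COUNT];
	struct cgroup *children[4];
	int nr_children;
};

/* Models cgroup_css(cgrp, subsys_id): index into the per-cgroup array. */
static struct cgroup_subsys_state *demo_cgroup_css(struct cgroup *cgrp, int id)
{
	return cgrp->subsys[id];
}

/* Mark @css frozen only if it is freezing and children and tasks are frozen. */
static void demo_update_if_frozen(struct cgroup_subsys_state *css)
{
	struct cgroup *cgrp = css->cgroup;
	int i;

	if (!css->freezing || css->frozen)
		return;
	for (i = 0; i < cgrp->nr_children; i++)
		if (!demo_cgroup_css(cgrp->children[i], DEMO_FREEZER_ID)->frozen)
			return;
	if (!css->tasks_frozen)
		return;
	css->frozen = 1;
}

/* Post-order walk: update every descendant before the css itself. */
static void demo_freezer_read(struct cgroup_subsys_state *css)
{
	struct cgroup *cgrp = css->cgroup;
	int i;

	for (i = 0; i < cgrp->nr_children; i++)
		demo_freezer_read(demo_cgroup_css(cgrp->children[i],
						  DEMO_FREEZER_ID));
	demo_update_if_frozen(css);
}
```
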
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 8ce3fdc..89b76e1 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1603,9 +1603,10 @@ typedef enum {
 	FILE_SPREAD_SLAB,
 } cpuset_filetype_t;
 
-static int cpuset_write_u64(struct cgroup *cgrp, struct cftype *cft, u64 val)
+static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
+			    u64 val)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	cpuset_filetype_t type = cft->private;
 	int retval = -ENODEV;
 
@@ -1650,9 +1651,10 @@ out_unlock:
 	return retval;
 }
 
-static int cpuset_write_s64(struct cgroup *cgrp, struct cftype *cft, s64 val)
+static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
+			    s64 val)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	cpuset_filetype_t type = cft->private;
 	int retval = -ENODEV;
 
@@ -1676,10 +1678,10 @@ out_unlock:
 /*
  * Common handling for a write to a "cpus" or "mems" file.
  */
-static int cpuset_write_resmask(struct cgroup *cgrp, struct cftype *cft,
-				const char *buf)
+static int cpuset_write_resmask(struct cgroup_subsys_state *css,
+				struct cftype *cft, const char *buf)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	struct cpuset *trialcs;
 	int retval = -ENODEV;
 
@@ -1758,13 +1760,12 @@ static size_t cpuset_sprintf_memlist(char *page, struct cpuset *cs)
 	return count;
 }
 
-static ssize_t cpuset_common_file_read(struct cgroup *cgrp,
-				       struct cftype *cft,
-				       struct file *file,
-				       char __user *buf,
-				       size_t nbytes, loff_t *ppos)
+static ssize_t cpuset_common_file_read(struct cgroup_subsys_state *css,
+				       struct cftype *cft, struct file *file,
+				       char __user *buf, size_t nbytes,
+				       loff_t *ppos)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	cpuset_filetype_t type = cft->private;
 	char *page;
 	ssize_t retval = 0;
@@ -1794,9 +1795,9 @@ out:
 	return retval;
 }
 
-static u64 cpuset_read_u64(struct cgroup *cgrp, struct cftype *cft)
+static u64 cpuset_read_u64(struct cgroup_subsys_state *css, struct cftype *cft)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	cpuset_filetype_t type = cft->private;
 	switch (type) {
 	case FILE_CPU_EXCLUSIVE:
@@ -1825,9 +1826,9 @@ static u64 cpuset_read_u64(struct cgroup *cgrp, struct cftype *cft)
 	return 0;
 }
 
-static s64 cpuset_read_s64(struct cgroup *cgrp, struct cftype *cft)
+static s64 cpuset_read_s64(struct cgroup_subsys_state *css, struct cftype *cft)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	cpuset_filetype_t type = cft->private;
 	switch (type) {
 	case FILE_SCHED_RELAX_DOMAIN_LEVEL:
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 622b7ef..cc9a492 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7088,12 +7088,6 @@ static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
 	return css ? container_of(css, struct task_group, css) : NULL;
 }
 
-/* return corresponding task_group object of a cgroup */
-static inline struct task_group *cgroup_tg(struct cgroup *cgrp)
-{
-	return css_tg(cgroup_css(cgrp, cpu_cgroup_subsys_id));
-}
-
 static struct cgroup_subsys_state *
 cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
@@ -7179,15 +7173,16 @@ static void cpu_cgroup_exit(struct cgroup_subsys_state *css,
 }
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-static int cpu_shares_write_u64(struct cgroup *cgrp, struct cftype *cftype,
-				u64 shareval)
+static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
+				struct cftype *cftype, u64 shareval)
 {
-	return sched_group_set_shares(cgroup_tg(cgrp), scale_load(shareval));
+	return sched_group_set_shares(css_tg(css), scale_load(shareval));
 }
 
-static u64 cpu_shares_read_u64(struct cgroup *cgrp, struct cftype *cft)
+static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
+			       struct cftype *cft)
 {
-	struct task_group *tg = cgroup_tg(cgrp);
+	struct task_group *tg = css_tg(css);
 
 	return (u64) scale_load_down(tg->shares);
 }
@@ -7309,26 +7304,28 @@ long tg_get_cfs_period(struct task_group *tg)
 	return cfs_period_us;
 }
 
-static s64 cpu_cfs_quota_read_s64(struct cgroup *cgrp, struct cftype *cft)
+static s64 cpu_cfs_quota_read_s64(struct cgroup_subsys_state *css,
+				  struct cftype *cft)
 {
-	return tg_get_cfs_quota(cgroup_tg(cgrp));
+	return tg_get_cfs_quota(css_tg(css));
 }
 
-static int cpu_cfs_quota_write_s64(struct cgroup *cgrp, struct cftype *cftype,
-				s64 cfs_quota_us)
+static int cpu_cfs_quota_write_s64(struct cgroup_subsys_state *css,
+				   struct cftype *cftype, s64 cfs_quota_us)
 {
-	return tg_set_cfs_quota(cgroup_tg(cgrp), cfs_quota_us);
+	return tg_set_cfs_quota(css_tg(css), cfs_quota_us);
 }
 
-static u64 cpu_cfs_period_read_u64(struct cgroup *cgrp, struct cftype *cft)
+static u64 cpu_cfs_period_read_u64(struct cgroup_subsys_state *css,
+				   struct cftype *cft)
 {
-	return tg_get_cfs_period(cgroup_tg(cgrp));
+	return tg_get_cfs_period(css_tg(css));
 }
 
-static int cpu_cfs_period_write_u64(struct cgroup *cgrp, struct cftype *cftype,
-				u64 cfs_period_us)
+static int cpu_cfs_period_write_u64(struct cgroup_subsys_state *css,
+				    struct cftype *cftype, u64 cfs_period_us)
 {
-	return tg_set_cfs_period(cgroup_tg(cgrp), cfs_period_us);
+	return tg_set_cfs_period(css_tg(css), cfs_period_us);
 }
 
 struct cfs_schedulable_data {
@@ -7409,10 +7406,10 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
 	return ret;
 }
 
-static int cpu_stats_show(struct cgroup *cgrp, struct cftype *cft,
+static int cpu_stats_show(struct cgroup_subsys_state *css, struct cftype *cft,
 		struct cgroup_map_cb *cb)
 {
-	struct task_group *tg = cgroup_tg(cgrp);
+	struct task_group *tg = css_tg(css);
 	struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
 
 	cb->fill(cb, "nr_periods", cfs_b->nr_periods);
@@ -7425,26 +7422,28 @@ static int cpu_stats_show(struct cgroup *cgrp, struct cftype *cft,
 #endif /* CONFIG_FAIR_GROUP_SCHED */
 
 #ifdef CONFIG_RT_GROUP_SCHED
-static int cpu_rt_runtime_write(struct cgroup *cgrp, struct cftype *cft,
-				s64 val)
+static int cpu_rt_runtime_write(struct cgroup_subsys_state *css,
+				struct cftype *cft, s64 val)
 {
-	return sched_group_set_rt_runtime(cgroup_tg(cgrp), val);
+	return sched_group_set_rt_runtime(css_tg(css), val);
 }
 
-static s64 cpu_rt_runtime_read(struct cgroup *cgrp, struct cftype *cft)
+static s64 cpu_rt_runtime_read(struct cgroup_subsys_state *css,
+			       struct cftype *cft)
 {
-	return sched_group_rt_runtime(cgroup_tg(cgrp));
+	return sched_group_rt_runtime(css_tg(css));
 }
 
-static int cpu_rt_period_write_uint(struct cgroup *cgrp, struct cftype *cftype,
-		u64 rt_period_us)
+static int cpu_rt_period_write_uint(struct cgroup_subsys_state *css,
+				    struct cftype *cftype, u64 rt_period_us)
 {
-	return sched_group_set_rt_period(cgroup_tg(cgrp), rt_period_us);
+	return sched_group_set_rt_period(css_tg(css), rt_period_us);
 }
 
-static u64 cpu_rt_period_read_uint(struct cgroup *cgrp, struct cftype *cft)
+static u64 cpu_rt_period_read_uint(struct cgroup_subsys_state *css,
+				   struct cftype *cft)
 {
-	return sched_group_rt_period(cgroup_tg(cgrp));
+	return sched_group_rt_period(css_tg(css));
 }
 #endif /* CONFIG_RT_GROUP_SCHED */
 
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 1b784d9..f64722f 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -38,12 +38,6 @@ static inline struct cpuacct *css_ca(struct cgroup_subsys_state *css)
 	return css ? container_of(css, struct cpuacct, css) : NULL;
 }
 
-/* return cpu accounting group corresponding to this container */
-static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
-{
-	return css_ca(cgroup_css(cgrp, cpuacct_subsys_id));
-}
-
 /* return cpu accounting group to which this task belongs */
 static inline struct cpuacct *task_ca(struct task_struct *tsk)
 {
@@ -138,9 +132,9 @@ static void cpuacct_cpuusage_write(struct cpuacct *ca, int cpu, u64 val)
 }
 
 /* return total cpu usage (in nanoseconds) of a group */
-static u64 cpuusage_read(struct cgroup *cgrp, struct cftype *cft)
+static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft)
 {
-	struct cpuacct *ca = cgroup_ca(cgrp);
+	struct cpuacct *ca = css_ca(css);
 	u64 totalcpuusage = 0;
 	int i;
 
@@ -150,10 +144,10 @@ static u64 cpuusage_read(struct cgroup *cgrp, struct cftype *cft)
 	return totalcpuusage;
 }
 
-static int cpuusage_write(struct cgroup *cgrp, struct cftype *cftype,
-								u64 reset)
+static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft,
+			  u64 reset)
 {
-	struct cpuacct *ca = cgroup_ca(cgrp);
+	struct cpuacct *ca = css_ca(css);
 	int err = 0;
 	int i;
 
@@ -169,10 +163,10 @@ out:
 	return err;
 }
 
-static int cpuacct_percpu_seq_read(struct cgroup *cgroup, struct cftype *cft,
-				   struct seq_file *m)
+static int cpuacct_percpu_seq_read(struct cgroup_subsys_state *css,
+				   struct cftype *cft, struct seq_file *m)
 {
-	struct cpuacct *ca = cgroup_ca(cgroup);
+	struct cpuacct *ca = css_ca(css);
 	u64 percpu;
 	int i;
 
@@ -189,10 +183,10 @@ static const char * const cpuacct_stat_desc[] = {
 	[CPUACCT_STAT_SYSTEM] = "system",
 };
 
-static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
-			      struct cgroup_map_cb *cb)
+static int cpuacct_stats_show(struct cgroup_subsys_state *css,
+			      struct cftype *cft, struct cgroup_map_cb *cb)
 {
-	struct cpuacct *ca = cgroup_ca(cgrp);
+	struct cpuacct *ca = css_ca(css);
 	int cpu;
 	s64 val = 0;
 
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index e213243..bda8e44 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -40,12 +40,6 @@ struct hugetlb_cgroup *hugetlb_cgroup_from_css(struct cgroup_subsys_state *s)
 }
 
 static inline
-struct hugetlb_cgroup *hugetlb_cgroup_from_cgroup(struct cgroup *cgroup)
-{
-	return hugetlb_cgroup_from_css(cgroup_css(cgroup, hugetlb_subsys_id));
-}
-
-static inline
 struct hugetlb_cgroup *hugetlb_cgroup_from_task(struct task_struct *task)
 {
 	return hugetlb_cgroup_from_css(task_css(task, hugetlb_subsys_id));
@@ -248,14 +242,15 @@ void hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
 	return;
 }
 
-static ssize_t hugetlb_cgroup_read(struct cgroup *cgroup, struct cftype *cft,
-				   struct file *file, char __user *buf,
-				   size_t nbytes, loff_t *ppos)
+static ssize_t hugetlb_cgroup_read(struct cgroup_subsys_state *css,
+				   struct cftype *cft, struct file *file,
+				   char __user *buf, size_t nbytes,
+				   loff_t *ppos)
 {
 	u64 val;
 	char str[64];
 	int idx, name, len;
-	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
+	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
 
 	idx = MEMFILE_IDX(cft->private);
 	name = MEMFILE_ATTR(cft->private);
@@ -265,12 +260,12 @@ static ssize_t hugetlb_cgroup_read(struct cgroup *cgroup, struct cftype *cft,
 	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
 }
 
-static int hugetlb_cgroup_write(struct cgroup *cgroup, struct cftype *cft,
-				const char *buffer)
+static int hugetlb_cgroup_write(struct cgroup_subsys_state *css,
+				struct cftype *cft, const char *buffer)
 {
 	int idx, name, ret;
 	unsigned long long val;
-	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
+	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
 
 	idx = MEMFILE_IDX(cft->private);
 	name = MEMFILE_ATTR(cft->private);
@@ -295,10 +290,11 @@ static int hugetlb_cgroup_write(struct cgroup *cgroup, struct cftype *cft,
 	return ret;
 }
 
-static int hugetlb_cgroup_reset(struct cgroup *cgroup, unsigned int event)
+static int hugetlb_cgroup_reset(struct cgroup_subsys_state *css,
+				unsigned int event)
 {
 	int idx, name, ret = 0;
-	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
+	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
 
 	idx = MEMFILE_IDX(event);
 	name = MEMFILE_ATTR(event);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 32cca0f..ab64dfc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -483,7 +483,6 @@ enum res_type {
  */
 static DEFINE_MUTEX(memcg_create_mutex);
 
-static inline
 struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *s)
 {
 	return s ? container_of(s, struct mem_cgroup, css) : NULL;
@@ -1035,7 +1034,7 @@ static void memcg_check_events(struct mem_cgroup *memcg, struct page *page)
 		preempt_enable();
 }
 
-struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont)
+static inline struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont)
 {
 	return mem_cgroup_from_css(cgroup_css(cont, mem_cgroup_subsys_id));
 }
@@ -2951,10 +2950,10 @@ static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p)
 }
 
 #ifdef CONFIG_SLABINFO
-static int mem_cgroup_slabinfo_read(struct cgroup *cont, struct cftype *cft,
-					struct seq_file *m)
+static int mem_cgroup_slabinfo_read(struct cgroup_subsys_state *css,
+				    struct cftype *cft, struct seq_file *m)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct memcg_cache_params *params;
 
 	if (!memcg_can_account_kmem(memcg))
@@ -4999,9 +4998,10 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 	return 0;
 }
 
-static int mem_cgroup_force_empty_write(struct cgroup *cont, unsigned int event)
+static int mem_cgroup_force_empty_write(struct cgroup_subsys_state *css,
+					unsigned int event)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	int ret;
 
 	if (mem_cgroup_is_root(memcg))
@@ -5014,16 +5014,17 @@ static int mem_cgroup_force_empty_write(struct cgroup *cont, unsigned int event)
 }
 
 
-static u64 mem_cgroup_hierarchy_read(struct cgroup *cont, struct cftype *cft)
+static u64 mem_cgroup_hierarchy_read(struct cgroup_subsys_state *css,
+				     struct cftype *cft)
 {
-	return mem_cgroup_from_cont(cont)->use_hierarchy;
+	return mem_cgroup_from_css(css)->use_hierarchy;
 }
 
-static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft,
-					u64 val)
+static int mem_cgroup_hierarchy_write(struct cgroup_subsys_state *css,
+				      struct cftype *cft, u64 val)
 {
 	int retval = 0;
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup *parent_memcg = mem_cgroup_from_css(css_parent(&memcg->css));
 
 	mutex_lock(&memcg_create_mutex);
@@ -5094,11 +5095,11 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
 	return val << PAGE_SHIFT;
 }
 
-static ssize_t mem_cgroup_read(struct cgroup *cont, struct cftype *cft,
-			       struct file *file, char __user *buf,
-			       size_t nbytes, loff_t *ppos)
+static ssize_t mem_cgroup_read(struct cgroup_subsys_state *css,
+			       struct cftype *cft, struct file *file,
+			       char __user *buf, size_t nbytes, loff_t *ppos)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	char str[64];
 	u64 val;
 	int name, len;
@@ -5131,11 +5132,11 @@ static ssize_t mem_cgroup_read(struct cgroup *cont, struct cftype *cft,
 	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
 }
 
-static int memcg_update_kmem_limit(struct cgroup *cont, u64 val)
+static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
 {
 	int ret = -EINVAL;
 #ifdef CONFIG_MEMCG_KMEM
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	/*
 	 * For simplicity, we won't allow this to be disabled.  It also can't
 	 * be changed if the cgroup has children already, or if tasks had
@@ -5151,7 +5152,7 @@ static int memcg_update_kmem_limit(struct cgroup *cont, u64 val)
 	mutex_lock(&memcg_create_mutex);
 	mutex_lock(&set_limit_mutex);
 	if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
-		if (cgroup_task_count(cont) || memcg_has_children(memcg)) {
+		if (cgroup_task_count(css->cgroup) || memcg_has_children(memcg)) {
 			ret = -EBUSY;
 			goto out;
 		}
@@ -5221,10 +5222,10 @@ out:
  * The user of this function is...
  * RES_LIMIT.
  */
-static int mem_cgroup_write(struct cgroup *cont, struct cftype *cft,
+static int mem_cgroup_write(struct cgroup_subsys_state *css, struct cftype *cft,
 			    const char *buffer)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	enum res_type type;
 	int name;
 	unsigned long long val;
@@ -5248,7 +5249,7 @@ static int mem_cgroup_write(struct cgroup *cont, struct cftype *cft,
 		else if (type == _MEMSWAP)
 			ret = mem_cgroup_resize_memsw_limit(memcg, val);
 		else if (type == _KMEM)
-			ret = memcg_update_kmem_limit(cont, val);
+			ret = memcg_update_kmem_limit(css, val);
 		else
 			return -EINVAL;
 		break;
@@ -5297,9 +5298,9 @@ out:
 	*memsw_limit = min_memsw_limit;
 }
 
-static int mem_cgroup_reset(struct cgroup *cont, unsigned int event)
+static int mem_cgroup_reset(struct cgroup_subsys_state *css, unsigned int event)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	int name;
 	enum res_type type;
 
@@ -5332,17 +5333,17 @@ static int mem_cgroup_reset(struct cgroup *cont, unsigned int event)
 	return 0;
 }
 
-static u64 mem_cgroup_move_charge_read(struct cgroup *cgrp,
+static u64 mem_cgroup_move_charge_read(struct cgroup_subsys_state *css,
 					struct cftype *cft)
 {
-	return mem_cgroup_from_cont(cgrp)->move_charge_at_immigrate;
+	return mem_cgroup_from_css(css)->move_charge_at_immigrate;
 }
 
 #ifdef CONFIG_MMU
-static int mem_cgroup_move_charge_write(struct cgroup *cgrp,
+static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css,
 					struct cftype *cft, u64 val)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	if (val >= (1 << NR_MOVE_TYPE))
 		return -EINVAL;
@@ -5357,7 +5358,7 @@ static int mem_cgroup_move_charge_write(struct cgroup *cgrp,
 	return 0;
 }
 #else
-static int mem_cgroup_move_charge_write(struct cgroup *cgrp,
+static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css,
 					struct cftype *cft, u64 val)
 {
 	return -ENOSYS;
@@ -5365,13 +5366,13 @@ static int mem_cgroup_move_charge_write(struct cgroup *cgrp,
 #endif
 
 #ifdef CONFIG_NUMA
-static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
-				      struct seq_file *m)
+static int memcg_numa_stat_show(struct cgroup_subsys_state *css,
+				struct cftype *cft, struct seq_file *m)
 {
 	int nid;
 	unsigned long total_nr, file_nr, anon_nr, unevictable_nr;
 	unsigned long node_nr;
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	total_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL);
 	seq_printf(m, "total=%lu", total_nr);
@@ -5416,10 +5417,10 @@ static inline void mem_cgroup_lru_names_not_uptodate(void)
 	BUILD_BUG_ON(ARRAY_SIZE(mem_cgroup_lru_names) != NR_LRU_LISTS);
 }
 
-static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
+static int memcg_stat_show(struct cgroup_subsys_state *css, struct cftype *cft,
 				 struct seq_file *m)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup *mi;
 	unsigned int i;
 
@@ -5503,17 +5504,18 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
 	return 0;
 }
 
-static u64 mem_cgroup_swappiness_read(struct cgroup *cgrp, struct cftype *cft)
+static u64 mem_cgroup_swappiness_read(struct cgroup_subsys_state *css,
+				      struct cftype *cft)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	return mem_cgroup_swappiness(memcg);
 }
 
-static int mem_cgroup_swappiness_write(struct cgroup *cgrp, struct cftype *cft,
-				       u64 val)
+static int mem_cgroup_swappiness_write(struct cgroup_subsys_state *css,
+				       struct cftype *cft, u64 val)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
 
 	if (val > 100 || !parent)
@@ -5829,10 +5831,10 @@ static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp,
 	spin_unlock(&memcg_oom_lock);
 }
 
-static int mem_cgroup_oom_control_read(struct cgroup *cgrp,
+static int mem_cgroup_oom_control_read(struct cgroup_subsys_state *css,
 	struct cftype *cft,  struct cgroup_map_cb *cb)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	cb->fill(cb, "oom_kill_disable", memcg->oom_kill_disable);
 
@@ -5843,10 +5845,10 @@ static int mem_cgroup_oom_control_read(struct cgroup *cgrp,
 	return 0;
 }
 
-static int mem_cgroup_oom_control_write(struct cgroup *cgrp,
+static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
 	struct cftype *cft, u64 val)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
 
 	/* cannot set to root cgroup and only 0 and 1 are allowed */
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 7f1654d..2a8a736 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -81,8 +81,8 @@ static struct vmpressure *cg_to_vmpressure(struct cgroup *cg)
 
 static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr)
 {
-	struct cgroup *cg = vmpressure_to_css(vmpr)->cgroup;
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cg);
+	struct cgroup_subsys_state *css = vmpressure_to_css(vmpr);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	memcg = parent_mem_cgroup(memcg);
 	if (!memcg)
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 8d095b4..e00f60e 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -168,15 +168,14 @@ static void cgrp_css_free(struct cgroup_subsys_state *css)
 	kfree(css);
 }
 
-static u64 read_prioidx(struct cgroup *cgrp, struct cftype *cft)
+static u64 read_prioidx(struct cgroup_subsys_state *css, struct cftype *cft)
 {
-	return cgrp->id;
+	return css->cgroup->id;
 }
 
-static int read_priomap(struct cgroup *cont, struct cftype *cft,
+static int read_priomap(struct cgroup_subsys_state *css, struct cftype *cft,
 			struct cgroup_map_cb *cb)
 {
-	struct cgroup_subsys_state *css = cgroup_css(cont, net_prio_subsys_id);
 	struct net_device *dev;
 
 	rcu_read_lock();
@@ -186,10 +185,9 @@ static int read_priomap(struct cgroup *cont, struct cftype *cft,
 	return 0;
 }
 
-static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
+static int write_priomap(struct cgroup_subsys_state *css, struct cftype *cft,
 			 const char *buffer)
 {
-	struct cgroup_subsys_state *css = cgroup_css(cgrp, net_prio_subsys_id);
 	char devname[IFNAMSIZ + 1];
 	struct net_device *dev;
 	u32 prio;
diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
index da14436..8a57d79 100644
--- a/net/ipv4/tcp_memcontrol.c
+++ b/net/ipv4/tcp_memcontrol.c
@@ -132,10 +132,10 @@ static int tcp_update_limit(struct mem_cgroup *memcg, u64 val)
 	return 0;
 }
 
-static int tcp_cgroup_write(struct cgroup *cont, struct cftype *cft,
+static int tcp_cgroup_write(struct cgroup_subsys_state *css, struct cftype *cft,
 			    const char *buffer)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	unsigned long long val;
 	int ret = 0;
 
@@ -180,9 +180,9 @@ static u64 tcp_read_usage(struct mem_cgroup *memcg)
 	return res_counter_read_u64(&tcp->tcp_memory_allocated, RES_USAGE);
 }
 
-static u64 tcp_cgroup_read(struct cgroup *cont, struct cftype *cft)
+static u64 tcp_cgroup_read(struct cgroup_subsys_state *css, struct cftype *cft)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	u64 val;
 
 	switch (cft->private) {
@@ -202,13 +202,13 @@ static u64 tcp_cgroup_read(struct cgroup *cont, struct cftype *cft)
 	return val;
 }
 
-static int tcp_cgroup_reset(struct cgroup *cont, unsigned int event)
+static int tcp_cgroup_reset(struct cgroup_subsys_state *css, unsigned int event)
 {
 	struct mem_cgroup *memcg;
 	struct tcp_memcontrol *tcp;
 	struct cg_proto *cg_proto;
 
-	memcg = mem_cgroup_from_cont(cont);
+	memcg = mem_cgroup_from_css(css);
 	cg_proto = tcp_prot.proto_cgroup(memcg);
 	if (!cg_proto)
 		return 0;
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index dc39838..8ea1184 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -28,11 +28,6 @@ static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state
 	return css ? container_of(css, struct cgroup_cls_state, css) : NULL;
 }
 
-static inline struct cgroup_cls_state *cgrp_cls_state(struct cgroup *cgrp)
-{
-	return css_cls_state(cgroup_css(cgrp, net_cls_subsys_id));
-}
-
 static inline struct cgroup_cls_state *task_cls_state(struct task_struct *p)
 {
 	return css_cls_state(task_css(p, net_cls_subsys_id));
@@ -87,14 +82,15 @@ static void cgrp_attach(struct cgroup_subsys_state *css,
 	}
 }
 
-static u64 read_classid(struct cgroup *cgrp, struct cftype *cft)
+static u64 read_classid(struct cgroup_subsys_state *css, struct cftype *cft)
 {
-	return cgrp_cls_state(cgrp)->classid;
+	return css_cls_state(css)->classid;
 }
 
-static int write_classid(struct cgroup *cgrp, struct cftype *cft, u64 value)
+static int write_classid(struct cgroup_subsys_state *css, struct cftype *cft,
+			 u64 value)
 {
-	cgrp_cls_state(cgrp)->classid = (u32) value;
+	css_cls_state(css)->classid = (u32) value;
 	return 0;
 }
 
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 7293ac4..e0ca464 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -289,10 +289,10 @@ static void set_majmin(char *str, unsigned m)
 		sprintf(str, "%u", m);
 }
 
-static int devcgroup_seq_read(struct cgroup *cgroup, struct cftype *cft,
-				struct seq_file *m)
+static int devcgroup_seq_read(struct cgroup_subsys_state *css,
+			      struct cftype *cft, struct seq_file *m)
 {
-	struct dev_cgroup *devcgroup = cgroup_to_devcgroup(cgroup);
+	struct dev_cgroup *devcgroup = css_to_devcgroup(css);
 	struct dev_exception_item *ex;
 	char maj[MAJMINLEN], min[MAJMINLEN], acc[ACCLEN];
 
@@ -669,13 +669,13 @@ static int devcgroup_update_access(struct dev_cgroup *devcgroup,
 	return rc;
 }
 
-static int devcgroup_access_write(struct cgroup *cgrp, struct cftype *cft,
-				  const char *buffer)
+static int devcgroup_access_write(struct cgroup_subsys_state *css,
+				  struct cftype *cft, const char *buffer)
 {
 	int retval;
 
 	mutex_lock(&devcgroup_mutex);
-	retval = devcgroup_update_access(cgroup_to_devcgroup(cgrp),
+	retval = devcgroup_update_access(css_to_devcgroup(css),
 					 cft->private, buffer);
 	mutex_unlock(&devcgroup_mutex);
 	return retval;
-- 
1.8.3.1


* [PATCH 13/23] cgroup: convert cgroup_next_sibling() to cgroup_next_child()
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (11 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 14/23] cgroup: always use cgroup_next_child() to walk the children list Tejun Heo
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

cgroup is transitioning to using css (cgroup_subsys_state) as the main
subsys interface handle instead of cgroup, and the iterators will be
updated to use css too.  The iterators need to walk the cgroup
hierarchy and return the css's matching the origin css, which is
cumbersome to open-code.

This patch converts cgroup_next_sibling() to cgroup_next_child() so
that it can handle all steps of direct child iteration, including
starting a walk when @pos is %NULL.  This will be used to update the
iterators to take @css instead of @cgrp.  In addition to the new
iteration init handling, cgroup_next_child() is restructured so that
the different branches share the end-of-iteration condition check.

This patch doesn't change any behavior.
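
As an illustration only, the three branches of cgroup_next_child()
(start of walk, live @pos, dead @pos) can be modelled in userspace.
This is a sketch under stated assumptions, not the kernel code:
struct node, node_add_child(), node_remove() and node_next_child() are
hypothetical names, the sibling list is a plain NULL-terminated list
rather than an RCU-protected circular list, and clearing the stale
link in node_remove() merely stands in for "the next pointer of a dead
cgroup can't be dereferenced".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct cgroup; not the kernel type. */
struct node {
	struct node *parent;
	struct node *first_child;	/* oldest child, or NULL */
	struct node *next_sibling;	/* next younger sibling, or NULL */
	unsigned long serial_nr;	/* strictly increasing in creation order */
	bool dead;			/* models cgroup_is_dead() */
};

static unsigned long next_serial = 1;

/* Append @child as the youngest child of @parent. */
static void node_add_child(struct node *parent, struct node *child)
{
	struct node **link;

	assert(parent && child);
	link = &parent->first_child;
	while (*link)
		link = &(*link)->next_sibling;

	child->parent = parent;
	child->first_child = NULL;
	child->next_sibling = NULL;
	child->serial_nr = next_serial++;
	child->dead = false;
	*link = child;
}

/* Unlink @child, which must currently be on its parent's list. */
static void node_remove(struct node *child)
{
	struct node **link;

	assert(child && child->parent);
	link = &child->parent->first_child;
	while (*link != child)
		link = &(*link)->next_sibling;

	*link = child->next_sibling;
	child->next_sibling = NULL;	/* the stale link must not be followed */
	child->dead = true;
}

/* Return the child of @parent that follows @pos, or NULL at the end. */
static struct node *node_next_child(struct node *pos, struct node *parent)
{
	struct node *next;

	assert(parent);
	if (!pos) {
		/* start of iteration */
		next = parent->first_child;
	} else if (!pos->dead) {
		/* fast path: @pos is still linked */
		next = pos->next_sibling;
	} else {
		/* slow path: resume at the first younger sibling */
		for (next = parent->first_child; next;
		     next = next->next_sibling)
			if (next->serial_nr > pos->serial_nr)
				break;
	}
	return next;
}
```

All three branches fall through to a single return, which mirrors the
shared end-of-iteration check described above.  In the model the check
is simply "next is NULL"; the patch compares against the list head
instead.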

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |  4 ++--
 kernel/cgroup.c        | 59 +++++++++++++++++++++++++-------------------------
 2 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 9749d63..a91c304 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -780,7 +780,7 @@ static inline struct cgroup *cgroup_from_id(struct cgroup_subsys *ss, int id)
 	return idr_find(&ss->root->cgroup_idr, id);
 }
 
-struct cgroup *cgroup_next_sibling(struct cgroup *pos);
+struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp);
 
 /**
  * cgroup_for_each_child - iterate through children of a cgroup
@@ -803,7 +803,7 @@ struct cgroup *cgroup_next_sibling(struct cgroup *pos);
 #define cgroup_for_each_child(pos, cgrp)				\
 	for ((pos) = list_first_or_null_rcu(&(cgrp)->children,		\
 					    struct cgroup, sibling);	\
-	     (pos); (pos) = cgroup_next_sibling((pos)))
+	     (pos); (pos) = cgroup_next_child((pos), (cgrp)))
 
 struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos,
 					  struct cgroup *cgroup);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 6c68192..e88b50e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3030,15 +3030,16 @@ static void cgroup_enable_task_cg_lists(void)
 }
 
 /**
- * cgroup_next_sibling - find the next sibling of a given cgroup
- * @pos: the current cgroup
+ * cgroup_next_child - find the next child of a given cgroup
+ * @pos: the current position (%NULL to initiate traversal)
+ * @cgrp: cgroup whose descendants to walk
  *
- * This function returns the next sibling of @pos and should be called
- * under RCU read lock.  The only requirement is that @pos is accessible.
- * The next sibling is guaranteed to be returned regardless of @pos's
- * state.
+ * This function returns the next child of @cgrp and should be called under
+ * RCU read lock.  The only requirement is that @cgrp and @pos are
+ * accessible.  The next sibling is guaranteed to be returned regardless of
+ * their states.
  */
-struct cgroup *cgroup_next_sibling(struct cgroup *pos)
+struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp)
 {
 	struct cgroup *next;
 
@@ -3054,30 +3055,30 @@ struct cgroup *cgroup_next_sibling(struct cgroup *pos)
 	 * safe to dereference from this RCU critical section.  If
 	 * ->sibling.next is inaccessible, cgroup_is_dead() is guaranteed
 	 * to be visible as %true here.
+	 *
+	 * If @pos is dead, its next pointer can't be dereferenced;
+	 * however, as each cgroup is given a monotonically increasing
+	 * unique serial number and always appended to the sibling list,
+	 * the next one can be found by walking the parent's children until
+	 * we see a cgroup with higher serial number than @pos's.  While
+	 * this path can be slower, it's taken only when either the current
+	 * cgroup is removed or iteration and removal race.
 	 */
-	if (likely(!cgroup_is_dead(pos))) {
+	if (!pos) {
+		next = list_entry_rcu(cgrp->children.next, struct cgroup, sibling);
+	} else if (likely(!cgroup_is_dead(pos))) {
 		next = list_entry_rcu(pos->sibling.next, struct cgroup, sibling);
-		if (&next->sibling != &pos->parent->children)
-			return next;
-		return NULL;
+	} else {
+		list_for_each_entry_rcu(next, &cgrp->children, sibling)
+			if (next->serial_nr > pos->serial_nr)
+				break;
 	}
 
-	/*
-	 * Can't dereference the next pointer.  Each cgroup is given a
-	 * monotonically increasing unique serial number and always
-	 * appended to the sibling list, so the next one can be found by
-	 * walking the parent's children until we see a cgroup with higher
-	 * serial number than @pos's.
-	 *
-	 * While this path can be slow, it's taken only when either the
-	 * current cgroup is removed or iteration and removal race.
-	 */
-	list_for_each_entry_rcu(next, &pos->parent->children, sibling)
-		if (next->serial_nr > pos->serial_nr)
-			return next;
+	if (&next->sibling != &cgrp->children)
+		return next;
 	return NULL;
 }
-EXPORT_SYMBOL_GPL(cgroup_next_sibling);
+EXPORT_SYMBOL_GPL(cgroup_next_child);
 
 /**
  * cgroup_next_descendant_pre - find the next descendant for pre-order walk
@@ -3110,7 +3111,7 @@ struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos,
 
 	/* no child, visit my or the closest ancestor's next sibling */
 	while (pos != cgroup) {
-		next = cgroup_next_sibling(pos);
+		next = cgroup_next_child(pos, pos->parent);
 		if (next)
 			return next;
 		pos = pos->parent;
@@ -3191,7 +3192,7 @@ struct cgroup *cgroup_next_descendant_post(struct cgroup *pos,
 	}
 
 	/* if there's an unvisited sibling, visit its leftmost descendant */
-	next = cgroup_next_sibling(pos);
+	next = cgroup_next_child(pos, pos->parent);
 	if (next)
 		return cgroup_leftmost_descendant(next);
 
@@ -4540,9 +4541,9 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 	/*
 	 * Mark @cgrp dead.  This prevents further task migration and child
 	 * creation by disabling cgroup_lock_live_group().  Note that
-	 * CGRP_DEAD assertion is depended upon by cgroup_next_sibling() to
+	 * CGRP_DEAD assertion is depended upon by cgroup_next_child() to
 	 * resume iteration after dropping RCU read lock.  See
-	 * cgroup_next_sibling() for details.
+	 * cgroup_next_child() for details.
 	 */
 	set_bit(CGRP_DEAD, &cgrp->flags);
 
-- 
1.8.3.1


* [PATCH 14/23] cgroup: always use cgroup_next_child() to walk the children list
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (12 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 13/23] cgroup: convert cgroup_next_sibling() to cgroup_next_child() Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

There are several places where the children list is accessed directly.
This patch converts those places to use cgroup_next_child().  This
will help update the hierarchy iterators to use @css instead of
@cgrp.

While cgroup_next_child() can be heavy in pathological cases, e.g.
when there are a lot of dead children, this shouldn't cause any
noticeable behavior differences.
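
The shape of the conversion can be sketched in userspace as follows.
This is an illustrative model, not the kernel implementation: struct
node and the node_*() helpers are hypothetical names, there is no RCU
and no dead-child handling, and the list is NULL-terminated.  The
point it shows is that the for-each macro and both descendant helpers
reach the children list only through the single next-child helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct cgroup; not the kernel type. */
struct node {
	struct node *first_child;	/* oldest child, or NULL */
	struct node *next_sibling;	/* next younger sibling, or NULL */
};

/* Single entry point for touching the children list. */
static struct node *node_next_child(struct node *pos, struct node *parent)
{
	assert(parent);
	return pos ? pos->next_sibling : parent->first_child;
}

/* Start with a NULL position instead of peeking at the list directly. */
#define node_for_each_child(pos, parent)				\
	for ((pos) = node_next_child(NULL, (parent)); (pos);		\
	     (pos) = node_next_child((pos), (parent)))

/* Follow first children down to a leaf. */
static struct node *node_leftmost_descendant(struct node *pos)
{
	struct node *last;

	do {
		last = pos;
		pos = node_next_child(NULL, pos);
	} while (pos);

	return last;
}

/* Follow last children down to a leaf, walking forward only. */
static struct node *node_rightmost_descendant(struct node *pos)
{
	struct node *last, *tmp;

	do {
		last = pos;
		pos = NULL;
		node_for_each_child(tmp, last)
			pos = tmp;
	} while (pos);

	return last;
}
```

Funnelling every access through one helper means that a later change
to how children are found only has to be made in that helper.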

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h | 5 ++---
 kernel/cgroup.c        | 7 +++----
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index a91c304..df6ab19 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -801,9 +801,8 @@ struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp);
  * the start of the next iteration by, for example, bumping the css refcnt.
  */
 #define cgroup_for_each_child(pos, cgrp)				\
-	for ((pos) = list_first_or_null_rcu(&(cgrp)->children,		\
-					    struct cgroup, sibling);	\
-	     (pos); (pos) = cgroup_next_child((pos), (cgrp)))
+	for ((pos) = cgroup_next_child(NULL, (cgrp)); (pos);		\
+	     (pos) = cgroup_next_child((pos), (cgrp)))
 
 struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos,
 					  struct cgroup *cgroup);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index e88b50e..7b53b58 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3105,7 +3105,7 @@ struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos,
 		pos = cgroup;
 
 	/* visit the first child if exists */
-	next = list_first_or_null_rcu(&pos->children, struct cgroup, sibling);
+	next = cgroup_next_child(NULL, pos);
 	if (next)
 		return next;
 
@@ -3144,7 +3144,7 @@ struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos)
 		last = pos;
 		/* ->prev isn't RCU safe, walk ->next till the end */
 		pos = NULL;
-		list_for_each_entry_rcu(tmp, &last->children, sibling)
+		cgroup_for_each_child(tmp, last)
 			pos = tmp;
 	} while (pos);
 
@@ -3158,8 +3158,7 @@ static struct cgroup *cgroup_leftmost_descendant(struct cgroup *pos)
 
 	do {
 		last = pos;
-		pos = list_first_or_null_rcu(&pos->children, struct cgroup,
-					     sibling);
+		pos = cgroup_next_child(NULL, pos);
 	} while (pos);
 
 	return last;
-- 
1.8.3.1



* [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (13 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 14/23] cgroup: always use cgroup_next_child() to walk the children list Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02 13:32   ` Michal Hocko
                     ` (2 more replies)
  2013-08-01 21:49 ` [PATCH 16/23] cgroup: relocate cgroup_advance_iter() Tejun Heo
                   ` (9 subsequent siblings)
  24 siblings, 3 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Johannes Weiner,
	Michal Hocko, Balbir Singh, Aristeu Rozanski, Matt Helsley,
	Vivek Goyal, Jens Axboe

cgroup is currently in the process of transitioning to using css
(cgroup_subsys_state) as the primary handle instead of cgroup in the
subsystem API.  For hierarchy iterators, this is beneficial because:

* In most cases, css is the only thing subsystems care about anyway.

* On the planned unified hierarchy, iterations for different
  subsystems will need to skip over different subtrees of the
  hierarchy depending on which subsystems are enabled on each cgroup.
  Passing around css makes it unnecessary to explicitly specify the
  subsystem in question, as a css is the intersection between a cgroup
  and a subsystem.

* For the planned unified hierarchy, css's would need to be created
  and destroyed dynamically, independently of the cgroup hierarchy.  Having
  cgroup core manage css iteration makes enforcing deref rules a lot
  easier.
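
To make the "css is the intersection between a cgroup and a subsystem"
point concrete, here is a small user-space sketch.  It is only a model
under simplifying assumptions: struct group, struct state, NR_SUBSYS
and the helper names are all hypothetical, every subsystem is assumed
to be enabled on every group, and no locking or RCU is modelled.  The
point it shows is that the child iterator never takes a subsystem
argument; it derives the subsystem from the parent state it is handed.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SUBSYS 2	/* hypothetical number of subsystems */

struct group;

/* Toy css: the state of one subsystem on one group. */
struct state {
	struct group *group;	/* owning group */
	int ss_id;		/* subsystem this state belongs to */
};

/* Toy cgroup: one state slot per subsystem. */
struct group {
	struct group *first_child;
	struct group *next_sibling;
	struct state subsys[NR_SUBSYS];
};

/* Initialize @grp with no children and one state per subsystem. */
static void group_init(struct group *grp)
{
	int i;

	grp->first_child = NULL;
	grp->next_sibling = NULL;
	for (i = 0; i < NR_SUBSYS; i++) {
		grp->subsys[i].group = grp;
		grp->subsys[i].ss_id = i;
	}
}

/* Append @child to @parent's children list. */
static void group_add_child(struct group *parent, struct group *child)
{
	struct group **link = &parent->first_child;

	while (*link)
		link = &(*link)->next_sibling;
	*link = child;
}

/*
 * Return the state following @pos among the children of @parent, or the
 * first one when @pos is NULL.  The result always belongs to the same
 * subsystem as @parent, so the caller never names the subsystem.
 */
static struct state *state_next_child(struct state *pos, struct state *parent)
{
	struct group *next;

	assert(!pos || pos->ss_id == parent->ss_id);
	next = pos ? pos->group->next_sibling : parent->group->first_child;
	return next ? &next->subsys[parent->ss_id] : NULL;
}
```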

Most subsystem conversions are straightforward.  Noteworthy changes
are:

* blkio: cgroup_to_blkcg() is no longer used.  Removed.

* freezer: cgroup_freezer() is no longer used.  Removed.

* devices: cgroup_to_devcgroup() is no longer used.  Removed.
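
For readers unfamiliar with how the descendant iterators are used, the
following user-space sketch models a pre-order walk built on a
next-child helper, plus the "jump to the rightmost descendant" trick
that callers use to skip a subtree.  It follows the shape of the
functions in this patch but is not the kernel code: struct node and all
function names are hypothetical, and RCU, liveness checks and
refcounting are omitted.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int id;
	struct node *parent, *first_child, *next_sibling;
};

/* Append @child to @parent's children list. */
static void add_child(struct node *parent, struct node *child)
{
	struct node **link = &parent->first_child;

	while (*link)
		link = &(*link)->next_sibling;
	child->parent = parent;
	*link = child;
}

/* Next child of @parent after @pos; first child when @pos is NULL. */
static struct node *next_child(struct node *pos, struct node *parent)
{
	assert(!pos || pos->parent == parent);
	return pos ? pos->next_sibling : parent->first_child;
}

/* Next node in a pre-order walk of @root's descendants (@root excluded). */
static struct node *next_descendant_pre(struct node *pos, struct node *root)
{
	struct node *next;

	if (!pos)		/* first call: pretend @root was just visited */
		pos = root;

	next = next_child(NULL, pos);	/* descend if there is a child */
	if (next)
		return next;

	while (pos != root) {	/* else my or an ancestor's next sibling */
		next = next_child(pos, pos->parent);
		if (next)
			return next;
		pos = pos->parent;
	}
	return NULL;
}

/* Last node of @pos's subtree in pre-order; @pos itself if it is a leaf. */
static struct node *rightmost_descendant(struct node *pos)
{
	struct node *last, *tmp;

	do {
		last = pos;
		pos = NULL;
		for (tmp = next_child(NULL, last); tmp;
		     tmp = next_child(tmp, last))
			pos = tmp;
	} while (pos);
	return last;
}

/* Record visited ids in @out; do not descend below the node @skip_id. */
static int walk_pre(struct node *root, int skip_id, int *out, int max)
{
	struct node *pos;
	int n = 0;

	for (pos = next_descendant_pre(NULL, root); pos;
	     pos = next_descendant_pre(pos, root)) {
		if (n < max)
			out[n++] = pos->id;
		if (pos->id == skip_id)	/* skip the subtree below @pos */
			pos = rightmost_descendant(pos);
	}
	return n;
}
```

Moving the cursor to the rightmost descendant makes the next step of
the walk continue after the whole subtree, which is how the cpuset
callers in this patch skip subtrees.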

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
---
 block/blk-cgroup.c       |   8 +--
 block/blk-cgroup.h       |  25 ++++-----
 block/blk-throttle.c     |   8 +--
 include/linux/cgroup.h   |  88 ++++++++++++++++---------------
 kernel/cgroup.c          | 131 ++++++++++++++++++++++++++---------------------
 kernel/cgroup_freezer.c  |  25 ++++-----
 kernel/cpuset.c          |  58 ++++++++++-----------
 mm/memcontrol.c          |  20 ++++----
 security/device_cgroup.c |  11 ++--
 9 files changed, 187 insertions(+), 187 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index f46f3c6..4b40640 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -614,7 +614,7 @@ u64 blkg_stat_recursive_sum(struct blkg_policy_data *pd, int off)
 {
 	struct blkcg_policy *pol = blkcg_policy[pd->plid];
 	struct blkcg_gq *pos_blkg;
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 	u64 sum;
 
 	lockdep_assert_held(pd->blkg->q->queue_lock);
@@ -622,7 +622,7 @@ u64 blkg_stat_recursive_sum(struct blkg_policy_data *pd, int off)
 	sum = blkg_stat_read((void *)pd + off);
 
 	rcu_read_lock();
-	blkg_for_each_descendant_pre(pos_blkg, pos_cgrp, pd_to_blkg(pd)) {
+	blkg_for_each_descendant_pre(pos_blkg, pos_css, pd_to_blkg(pd)) {
 		struct blkg_policy_data *pos_pd = blkg_to_pd(pos_blkg, pol);
 		struct blkg_stat *stat = (void *)pos_pd + off;
 
@@ -649,7 +649,7 @@ struct blkg_rwstat blkg_rwstat_recursive_sum(struct blkg_policy_data *pd,
 {
 	struct blkcg_policy *pol = blkcg_policy[pd->plid];
 	struct blkcg_gq *pos_blkg;
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 	struct blkg_rwstat sum;
 	int i;
 
@@ -658,7 +658,7 @@ struct blkg_rwstat blkg_rwstat_recursive_sum(struct blkg_policy_data *pd,
 	sum = blkg_rwstat_read((void *)pd + off);
 
 	rcu_read_lock();
-	blkg_for_each_descendant_pre(pos_blkg, pos_cgrp, pd_to_blkg(pd)) {
+	blkg_for_each_descendant_pre(pos_blkg, pos_css, pd_to_blkg(pd)) {
 		struct blkg_policy_data *pos_pd = blkg_to_pd(pos_blkg, pol);
 		struct blkg_rwstat *rwstat = (void *)pos_pd + off;
 		struct blkg_rwstat tmp;
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index b6802c4..8555386 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -184,11 +184,6 @@ static inline struct blkcg *css_to_blkcg(struct cgroup_subsys_state *css)
 	return css ? container_of(css, struct blkcg, css) : NULL;
 }
 
-static inline struct blkcg *cgroup_to_blkcg(struct cgroup *cgroup)
-{
-	return css_to_blkcg(cgroup_css(cgroup, blkio_subsys_id));
-}
-
 static inline struct blkcg *task_blkcg(struct task_struct *tsk)
 {
 	return css_to_blkcg(task_css(tsk, blkio_subsys_id));
@@ -289,32 +284,31 @@ struct blkcg_gq *__blkg_lookup(struct blkcg *blkcg, struct request_queue *q,
 /**
  * blkg_for_each_descendant_pre - pre-order walk of a blkg's descendants
  * @d_blkg: loop cursor pointing to the current descendant
- * @pos_cgrp: used for iteration
+ * @pos_css: used for iteration
  * @p_blkg: target blkg to walk descendants of
  *
  * Walk @c_blkg through the descendants of @p_blkg.  Must be used with RCU
  * read locked.  If called under either blkcg or queue lock, the iteration
  * is guaranteed to include all and only online blkgs.  The caller may
- * update @pos_cgrp by calling cgroup_rightmost_descendant() to skip
- * subtree.
+ * update @pos_css by calling css_rightmost_descendant() to skip subtree.
  */
-#define blkg_for_each_descendant_pre(d_blkg, pos_cgrp, p_blkg)		\
-	cgroup_for_each_descendant_pre((pos_cgrp), (p_blkg)->blkcg->css.cgroup) \
-		if (((d_blkg) = __blkg_lookup(cgroup_to_blkcg(pos_cgrp), \
+#define blkg_for_each_descendant_pre(d_blkg, pos_css, p_blkg)		\
+	css_for_each_descendant_pre((pos_css), &(p_blkg)->blkcg->css)	\
+		if (((d_blkg) = __blkg_lookup(css_to_blkcg(pos_css),	\
 					      (p_blkg)->q, false)))
 
 /**
  * blkg_for_each_descendant_post - post-order walk of a blkg's descendants
  * @d_blkg: loop cursor pointing to the current descendant
- * @pos_cgrp: used for iteration
+ * @pos_css: used for iteration
  * @p_blkg: target blkg to walk descendants of
  *
  * Similar to blkg_for_each_descendant_pre() but performs post-order
  * traversal instead.  Synchronization rules are the same.
  */
-#define blkg_for_each_descendant_post(d_blkg, pos_cgrp, p_blkg)		\
-	cgroup_for_each_descendant_post((pos_cgrp), (p_blkg)->blkcg->css.cgroup) \
-		if (((d_blkg) = __blkg_lookup(cgroup_to_blkcg(pos_cgrp), \
+#define blkg_for_each_descendant_post(d_blkg, pos_css, p_blkg)		\
+	css_for_each_descendant_post((pos_css), &(p_blkg)->blkcg->css)	\
+		if (((d_blkg) = __blkg_lookup(css_to_blkcg(pos_css),	\
 					      (p_blkg)->q, false)))
 
 /**
@@ -577,7 +571,6 @@ static inline int blkcg_activate_policy(struct request_queue *q,
 static inline void blkcg_deactivate_policy(struct request_queue *q,
 					   const struct blkcg_policy *pol) { }
 
-static inline struct blkcg *cgroup_to_blkcg(struct cgroup *cgroup) { return NULL; }
 static inline struct blkcg *bio_blkcg(struct bio *bio) { return NULL; }
 
 static inline struct blkg_policy_data *blkg_to_pd(struct blkcg_gq *blkg,
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 88bcfb6..8cefa7f 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1349,7 +1349,7 @@ static int tg_set_conf(struct cgroup_subsys_state *css, struct cftype *cft,
 	struct throtl_grp *tg;
 	struct throtl_service_queue *sq;
 	struct blkcg_gq *blkg;
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 	int ret;
 
 	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, buf, &ctx);
@@ -1380,7 +1380,7 @@ static int tg_set_conf(struct cgroup_subsys_state *css, struct cftype *cft,
 	 * blk-throttle.
 	 */
 	tg_update_has_rules(tg);
-	blkg_for_each_descendant_pre(blkg, pos_cgrp, ctx.blkg)
+	blkg_for_each_descendant_pre(blkg, pos_css, ctx.blkg)
 		tg_update_has_rules(blkg_to_tg(blkg));
 
 	/*
@@ -1623,7 +1623,7 @@ void blk_throtl_drain(struct request_queue *q)
 {
 	struct throtl_data *td = q->td;
 	struct blkcg_gq *blkg;
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 	struct bio *bio;
 	int rw;
 
@@ -1636,7 +1636,7 @@ void blk_throtl_drain(struct request_queue *q)
 	 * better to walk service_queue tree directly but blkg walk is
 	 * easier.
 	 */
-	blkg_for_each_descendant_post(blkg, pos_cgrp, td->queue->root_blkg)
+	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg)
 		tg_drain_bios(&blkg_to_tg(blkg)->service_queue);
 
 	tg_drain_bios(&td_root_tg(td)->service_queue);
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index df6ab19..7fba0d0 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -780,68 +780,72 @@ static inline struct cgroup *cgroup_from_id(struct cgroup_subsys *ss, int id)
 	return idr_find(&ss->root->cgroup_idr, id);
 }
 
-struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp);
+struct cgroup_subsys_state *css_next_child(struct cgroup_subsys_state *pos,
+					   struct cgroup_subsys_state *parent);
 
 /**
- * cgroup_for_each_child - iterate through children of a cgroup
- * @pos: the cgroup * to use as the loop cursor
- * @cgrp: cgroup whose children to walk
+ * css_for_each_child - iterate through children of a css
+ * @pos: the css * to use as the loop cursor
+ * @parent: css whose children to walk
  *
- * Walk @cgrp's children.  Must be called under rcu_read_lock().  A child
- * cgroup which hasn't finished ->css_online() or already has finished
+ * Walk @parent's children.  Must be called under rcu_read_lock().  A child
+ * css which hasn't finished ->css_online() or already has finished
  * ->css_offline() may show up during traversal and it's each subsystem's
  * responsibility to verify that each @pos is alive.
  *
  * If a subsystem synchronizes against the parent in its ->css_online() and
- * before starting iterating, a cgroup which finished ->css_online() is
+ * before starting iterating, a css which finished ->css_online() is
  * guaranteed to be visible in the future iterations.
  *
  * It is allowed to temporarily drop RCU read lock during iteration.  The
  * caller is responsible for ensuring that @pos remains accessible until
  * the start of the next iteration by, for example, bumping the css refcnt.
  */
-#define cgroup_for_each_child(pos, cgrp)				\
-	for ((pos) = cgroup_next_child(NULL, (cgrp)); (pos);		\
-	     (pos) = cgroup_next_child((pos), (cgrp)))
+#define css_for_each_child(pos, parent)					\
+	for ((pos) = css_next_child(NULL, (parent)); (pos);		\
+	     (pos) = css_next_child((pos), (parent)))
 
-struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos,
-					  struct cgroup *cgroup);
-struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos);
+struct cgroup_subsys_state *
+css_next_descendant_pre(struct cgroup_subsys_state *pos,
+			struct cgroup_subsys_state *css);
+
+struct cgroup_subsys_state *
+css_rightmost_descendant(struct cgroup_subsys_state *pos);
 
 /**
- * cgroup_for_each_descendant_pre - pre-order walk of a cgroup's descendants
- * @pos: the cgroup * to use as the loop cursor
- * @cgroup: cgroup whose descendants to walk
+ * css_for_each_descendant_pre - pre-order walk of a css's descendants
+ * @pos: the css * to use as the loop cursor
+ * @css: css whose descendants to walk
  *
- * Walk @cgroup's descendants.  Must be called under rcu_read_lock().  A
- * descendant cgroup which hasn't finished ->css_online() or already has
+ * Walk @css's descendants.  Must be called under rcu_read_lock().  A
+ * descendant css which hasn't finished ->css_online() or already has
  * finished ->css_offline() may show up during traversal and it's each
  * subsystem's responsibility to verify that each @pos is alive.
  *
  * If a subsystem synchronizes against the parent in its ->css_online() and
  * before starting iterating, and synchronizes against @pos on each
- * iteration, any descendant cgroup which finished ->css_online() is
+ * iteration, any descendant css which finished ->css_online() is
  * guaranteed to be visible in the future iterations.
  *
  * In other words, the following guarantees that a descendant can't escape
  * state updates of its ancestors.
  *
- * my_online(@cgrp)
+ * my_online(@css)
  * {
- *	Lock @cgrp->parent and @cgrp;
- *	Inherit state from @cgrp->parent;
+ *	Lock @css's parent and @css;
+ *	Inherit state from the parent;
  *	Unlock both.
  * }
  *
- * my_update_state(@cgrp)
+ * my_update_state(@css)
  * {
- *	Lock @cgrp;
- *	Update @cgrp's state;
- *	Unlock @cgrp;
+ *	Lock @css;
+ *	Update @css's state;
+ *	Unlock @css;
  *
- *	cgroup_for_each_descendant_pre(@pos, @cgrp) {
+ *	css_for_each_descendant_pre(@pos, @css) {
  *		Lock @pos;
- *		Verify @pos is alive and inherit state from @pos->parent;
+ *		Verify @pos is alive and inherit state from @pos's parent;
  *		Unlock @pos;
  *	}
  * }
@@ -852,8 +856,7 @@ struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos);
  * visible by walking order and, as long as inheriting operations to the
  * same @pos are atomic to each other, multiple updates racing each other
  * still result in the correct state.  It's guaranateed that at least one
- * inheritance happens for any cgroup after the latest update to its
- * parent.
+ * inheritance happens for any css after the latest update to its parent.
  *
  * If checking parent's state requires locking the parent, each inheriting
  * iteration should lock and unlock both @pos->parent and @pos.
@@ -866,25 +869,26 @@ struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos);
  * caller is responsible for ensuring that @pos remains accessible until
  * the start of the next iteration by, for example, bumping the css refcnt.
  */
-#define cgroup_for_each_descendant_pre(pos, cgroup)			\
-	for (pos = cgroup_next_descendant_pre(NULL, (cgroup)); (pos);	\
-	     pos = cgroup_next_descendant_pre((pos), (cgroup)))
+#define css_for_each_descendant_pre(pos, css)				\
+	for ((pos) = css_next_descendant_pre(NULL, (css)); (pos);	\
+	     (pos) = css_next_descendant_pre((pos), (css)))
 
-struct cgroup *cgroup_next_descendant_post(struct cgroup *pos,
-					   struct cgroup *cgroup);
+struct cgroup_subsys_state *
+css_next_descendant_post(struct cgroup_subsys_state *pos,
+			 struct cgroup_subsys_state *css);
 
 /**
- * cgroup_for_each_descendant_post - post-order walk of a cgroup's descendants
- * @pos: the cgroup * to use as the loop cursor
- * @cgroup: cgroup whose descendants to walk
+ * css_for_each_descendant_post - post-order walk of a css's descendants
+ * @pos: the css * to use as the loop cursor
+ * @css: css whose descendants to walk
  *
- * Similar to cgroup_for_each_descendant_pre() but performs post-order
+ * Similar to css_for_each_descendant_pre() but performs post-order
  * traversal instead.  Note that the walk visibility guarantee described in
  * pre-order walk doesn't apply the same to post-order walks.
  */
-#define cgroup_for_each_descendant_post(pos, cgroup)			\
-	for (pos = cgroup_next_descendant_post(NULL, (cgroup)); (pos);	\
-	     pos = cgroup_next_descendant_post((pos), (cgroup)))
+#define css_for_each_descendant_post(pos, css)				\
+	for ((pos) = css_next_descendant_post(NULL, (css)); (pos);	\
+	     (pos) = css_next_descendant_post((pos), (css)))
 
 /* A cgroup_iter should be treated as an opaque object */
 struct cgroup_iter {
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7b53b58..850ad87 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2807,8 +2807,8 @@ static void cgroup_cfts_prepare(void)
 	/*
 	 * Thanks to the entanglement with vfs inode locking, we can't walk
 	 * the existing cgroups under cgroup_mutex and create files.
-	 * Instead, we use cgroup_for_each_descendant_pre() and drop RCU
-	 * read lock before calling cgroup_addrm_files().
+	 * Instead, we use css_for_each_descendant_pre() and drop RCU read
+	 * lock before calling cgroup_addrm_files().
 	 */
 	mutex_lock(&cgroup_mutex);
 }
@@ -2818,10 +2818,11 @@ static int cgroup_cfts_commit(struct cftype *cfts, bool is_add)
 {
 	LIST_HEAD(pending);
 	struct cgroup_subsys *ss = cfts[0].ss;
-	struct cgroup *cgrp, *root = &ss->root->top_cgroup;
+	struct cgroup *root = &ss->root->top_cgroup;
 	struct super_block *sb = ss->root->sb;
 	struct dentry *prev = NULL;
 	struct inode *inode;
+	struct cgroup_subsys_state *css;
 	u64 update_before;
 	int ret = 0;
 
@@ -2854,7 +2855,9 @@ static int cgroup_cfts_commit(struct cftype *cfts, bool is_add)
 
 	/* add/rm files for all cgroups created before */
 	rcu_read_lock();
-	cgroup_for_each_descendant_pre(cgrp, root) {
+	css_for_each_descendant_pre(css, cgroup_css(root, ss->subsys_id)) {
+		struct cgroup *cgrp = css->cgroup;
+
 		if (cgroup_is_dead(cgrp))
 			continue;
 
@@ -3030,17 +3033,21 @@ static void cgroup_enable_task_cg_lists(void)
 }
 
 /**
- * cgroup_next_child - find the next child of a given cgroup
- * @pos: the current position (%NULL to initiate traversal)
- * @cgrp: cgroup whose descendants to walk
+ * css_next_child - find the next child of a given css
+ * @pos_css: the current position (%NULL to initiate traversal)
+ * @parent_css: css whose children to walk
  *
- * This function returns the next child of @cgrp and should be called under
- * RCU read lock.  The only requirement is that @cgrp and @pos are
- * accessible.  The next sibling is guaranteed to be returned regardless of
- * their states.
+ * This function returns the next child of @parent_css and should be called
+ * under RCU read lock.  The only requirement is that @parent_css and
+ * @pos_css are accessible.  The next sibling is guaranteed to be returned
+ * regardless of their states.
  */
-struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp)
+struct cgroup_subsys_state *
+css_next_child(struct cgroup_subsys_state *pos_css,
+	       struct cgroup_subsys_state *parent_css)
 {
+	struct cgroup *pos = pos_css ? pos_css->cgroup : NULL;
+	struct cgroup *cgrp = parent_css->cgroup;
 	struct cgroup *next;
 
 	WARN_ON_ONCE(!rcu_read_lock_held());
@@ -3074,59 +3081,64 @@ struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp)
 				break;
 	}
 
-	if (&next->sibling != &cgrp->children)
-		return next;
-	return NULL;
+	if (&next->sibling == &cgrp->children)
+		return NULL;
+
+	if (parent_css->ss)
+		return cgroup_css(next, parent_css->ss->subsys_id);
+	else
+		return &next->dummy_css;
 }
-EXPORT_SYMBOL_GPL(cgroup_next_child);
+EXPORT_SYMBOL_GPL(css_next_child);
 
 /**
- * cgroup_next_descendant_pre - find the next descendant for pre-order walk
+ * css_next_descendant_pre - find the next descendant for pre-order walk
  * @pos: the current position (%NULL to initiate traversal)
- * @cgroup: cgroup whose descendants to walk
+ * @root: css whose descendants to walk
  *
- * To be used by cgroup_for_each_descendant_pre().  Find the next
- * descendant to visit for pre-order traversal of @cgroup's descendants.
+ * To be used by css_for_each_descendant_pre().  Find the next descendant
+ * to visit for pre-order traversal of @root's descendants.
  *
  * While this function requires RCU read locking, it doesn't require the
  * whole traversal to be contained in a single RCU critical section.  This
  * function will return the correct next descendant as long as both @pos
- * and @cgroup are accessible and @pos is a descendant of @cgroup.
+ * and @root are accessible and @pos is a descendant of @root.
  */
-struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos,
-					  struct cgroup *cgroup)
+struct cgroup_subsys_state *
+css_next_descendant_pre(struct cgroup_subsys_state *pos,
+			struct cgroup_subsys_state *root)
 {
-	struct cgroup *next;
+	struct cgroup_subsys_state *next;
 
 	WARN_ON_ONCE(!rcu_read_lock_held());
 
-	/* if first iteration, pretend we just visited @cgroup */
+	/* if first iteration, pretend we just visited @root */
 	if (!pos)
-		pos = cgroup;
+		pos = root;
 
 	/* visit the first child if exists */
-	next = cgroup_next_child(NULL, pos);
+	next = css_next_child(NULL, pos);
 	if (next)
 		return next;
 
 	/* no child, visit my or the closest ancestor's next sibling */
-	while (pos != cgroup) {
-		next = cgroup_next_child(pos, pos->parent);
+	while (pos != root) {
+		next = css_next_child(pos, css_parent(pos));
 		if (next)
 			return next;
-		pos = pos->parent;
+		pos = css_parent(pos);
 	}
 
 	return NULL;
 }
-EXPORT_SYMBOL_GPL(cgroup_next_descendant_pre);
+EXPORT_SYMBOL_GPL(css_next_descendant_pre);
 
 /**
- * cgroup_rightmost_descendant - return the rightmost descendant of a cgroup
- * @pos: cgroup of interest
+ * css_rightmost_descendant - return the rightmost descendant of a css
+ * @pos: css of interest
  *
- * Return the rightmost descendant of @pos.  If there's no descendant,
- * @pos is returned.  This can be used during pre-order traversal to skip
+ * Return the rightmost descendant of @pos.  If there's no descendant, @pos
+ * is returned.  This can be used during pre-order traversal to skip
  * subtree of @pos.
  *
  * While this function requires RCU read locking, it doesn't require the
@@ -3134,9 +3146,10 @@ EXPORT_SYMBOL_GPL(cgroup_next_descendant_pre);
  * function will return the correct rightmost descendant as long as @pos is
  * accessible.
  */
-struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos)
+struct cgroup_subsys_state *
+css_rightmost_descendant(struct cgroup_subsys_state *pos)
 {
-	struct cgroup *last, *tmp;
+	struct cgroup_subsys_state *last, *tmp;
 
 	WARN_ON_ONCE(!rcu_read_lock_held());
 
@@ -3144,62 +3157,64 @@ struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos)
 		last = pos;
 		/* ->prev isn't RCU safe, walk ->next till the end */
 		pos = NULL;
-		cgroup_for_each_child(tmp, last)
+		css_for_each_child(tmp, last)
 			pos = tmp;
 	} while (pos);
 
 	return last;
 }
-EXPORT_SYMBOL_GPL(cgroup_rightmost_descendant);
+EXPORT_SYMBOL_GPL(css_rightmost_descendant);
 
-static struct cgroup *cgroup_leftmost_descendant(struct cgroup *pos)
+static struct cgroup_subsys_state *
+css_leftmost_descendant(struct cgroup_subsys_state *pos)
 {
-	struct cgroup *last;
+	struct cgroup_subsys_state *last;
 
 	do {
 		last = pos;
-		pos = cgroup_next_child(NULL, pos);
+		pos = css_next_child(NULL, pos);
 	} while (pos);
 
 	return last;
 }
 
 /**
- * cgroup_next_descendant_post - find the next descendant for post-order walk
+ * css_next_descendant_post - find the next descendant for post-order walk
  * @pos: the current position (%NULL to initiate traversal)
- * @cgroup: cgroup whose descendants to walk
+ * @root: css whose descendants to walk
  *
- * To be used by cgroup_for_each_descendant_post().  Find the next
- * descendant to visit for post-order traversal of @cgroup's descendants.
+ * To be used by css_for_each_descendant_post().  Find the next descendant
+ * to visit for post-order traversal of @root's descendants.
  *
  * While this function requires RCU read locking, it doesn't require the
  * whole traversal to be contained in a single RCU critical section.  This
  * function will return the correct next descendant as long as both @pos
  * and @cgroup are accessible and @pos is a descendant of @cgroup.
  */
-struct cgroup *cgroup_next_descendant_post(struct cgroup *pos,
-					   struct cgroup *cgroup)
+struct cgroup_subsys_state *
+css_next_descendant_post(struct cgroup_subsys_state *pos,
+			 struct cgroup_subsys_state *root)
 {
-	struct cgroup *next;
+	struct cgroup_subsys_state *next;
 
 	WARN_ON_ONCE(!rcu_read_lock_held());
 
 	/* if first iteration, visit the leftmost descendant */
 	if (!pos) {
-		next = cgroup_leftmost_descendant(cgroup);
-		return next != cgroup ? next : NULL;
+		next = css_leftmost_descendant(root);
+		return next != root ? next : NULL;
 	}
 
 	/* if there's an unvisited sibling, visit its leftmost descendant */
-	next = cgroup_next_child(pos, pos->parent);
+	next = css_next_child(pos, css_parent(pos));
 	if (next)
-		return cgroup_leftmost_descendant(next);
+		return css_leftmost_descendant(next);
 
 	/* no sibling left, visit parent */
-	next = pos->parent;
-	return next != cgroup ? next : NULL;
+	next = css_parent(pos);
+	return next != root ? next : NULL;
 }
-EXPORT_SYMBOL_GPL(cgroup_next_descendant_post);
+EXPORT_SYMBOL_GPL(css_next_descendant_post);
 
 void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it)
 	__acquires(css_set_lock)
@@ -4540,9 +4555,9 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 	/*
 	 * Mark @cgrp dead.  This prevents further task migration and child
 	 * creation by disabling cgroup_lock_live_group().  Note that
-	 * CGRP_DEAD assertion is depended upon by cgroup_next_child() to
+	 * CGRP_DEAD assertion is depended upon by css_next_child() to
 	 * resume iteration after dropping RCU read lock.  See
-	 * cgroup_next_child() for details.
+	 * css_next_child() for details.
 	 */
 	set_bit(CGRP_DEAD, &cgrp->flags);
 
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 19613ba..98ca48d 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -50,11 +50,6 @@ static inline struct freezer *css_freezer(struct cgroup_subsys_state *css)
 	return css ? container_of(css, struct freezer, css) : NULL;
 }
 
-static inline struct freezer *cgroup_freezer(struct cgroup *cgroup)
-{
-	return css_freezer(cgroup_css(cgroup, freezer_subsys_id));
-}
-
 static inline struct freezer *task_freezer(struct task_struct *task)
 {
 	return css_freezer(task_css(task, freezer_subsys_id));
@@ -120,7 +115,7 @@ static int freezer_css_online(struct cgroup_subsys_state *css)
 	/*
 	 * The following double locking and freezing state inheritance
 	 * guarantee that @cgroup can never escape ancestors' freezing
-	 * states.  See cgroup_for_each_descendant_pre() for details.
+	 * states.  See css_for_each_descendant_pre() for details.
 	 */
 	if (parent)
 		spin_lock_irq(&parent->lock);
@@ -262,7 +257,7 @@ out:
 static void update_if_frozen(struct cgroup_subsys_state *css)
 {
 	struct freezer *freezer = css_freezer(css);
-	struct cgroup *pos;
+	struct cgroup_subsys_state *pos;
 	struct cgroup_iter it;
 	struct task_struct *task;
 
@@ -275,8 +270,8 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 		goto out_unlock;
 
 	/* are all (live) children frozen? */
-	cgroup_for_each_child(pos, css->cgroup) {
-		struct freezer *child = cgroup_freezer(pos);
+	css_for_each_child(pos, css) {
+		struct freezer *child = css_freezer(pos);
 
 		if ((child->state & CGROUP_FREEZER_ONLINE) &&
 		    !(child->state & CGROUP_FROZEN))
@@ -309,13 +304,13 @@ out_unlock:
 static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
 			struct seq_file *m)
 {
-	struct cgroup *pos;
+	struct cgroup_subsys_state *pos;
 
 	rcu_read_lock();
 
 	/* update states bottom-up */
-	cgroup_for_each_descendant_post(pos, css->cgroup)
-		update_if_frozen(cgroup_css(pos, freezer_subsys_id));
+	css_for_each_descendant_post(pos, css)
+		update_if_frozen(pos);
 	update_if_frozen(css);
 
 	rcu_read_unlock();
@@ -396,7 +391,7 @@ static void freezer_apply_state(struct freezer *freezer, bool freeze,
  */
 static void freezer_change_state(struct freezer *freezer, bool freeze)
 {
-	struct cgroup *pos;
+	struct cgroup_subsys_state *pos;
 
 	/* update @freezer */
 	spin_lock_irq(&freezer->lock);
@@ -409,8 +404,8 @@ static void freezer_change_state(struct freezer *freezer, bool freeze)
 	 * CGROUP_FREEZING_PARENT.
 	 */
 	rcu_read_lock();
-	cgroup_for_each_descendant_pre(pos, freezer->css.cgroup) {
-		struct freezer *pos_f = cgroup_freezer(pos);
+	css_for_each_descendant_pre(pos, &freezer->css) {
+		struct freezer *pos_f = css_freezer(pos);
 		struct freezer *parent = parent_freezer(pos_f);
 
 		/*
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 89b76e1..be4f503 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -210,29 +210,29 @@ static struct cpuset top_cpuset = {
 /**
  * cpuset_for_each_child - traverse online children of a cpuset
  * @child_cs: loop cursor pointing to the current child
- * @pos_cgrp: used for iteration
+ * @pos_css: used for iteration
  * @parent_cs: target cpuset to walk children of
  *
  * Walk @child_cs through the online children of @parent_cs.  Must be used
  * with RCU read locked.
  */
-#define cpuset_for_each_child(child_cs, pos_cgrp, parent_cs)		\
-	cgroup_for_each_child((pos_cgrp), (parent_cs)->css.cgroup)	\
-		if (is_cpuset_online(((child_cs) = cgroup_cs((pos_cgrp)))))
+#define cpuset_for_each_child(child_cs, pos_css, parent_cs)		\
+	css_for_each_child((pos_css), &(parent_cs)->css)		\
+		if (is_cpuset_online(((child_cs) = css_cs((pos_css)))))
 
 /**
  * cpuset_for_each_descendant_pre - pre-order walk of a cpuset's descendants
  * @des_cs: loop cursor pointing to the current descendant
- * @pos_cgrp: used for iteration
+ * @pos_css: used for iteration
  * @root_cs: target cpuset to walk ancestor of
  *
  * Walk @des_cs through the online descendants of @root_cs.  Must be used
- * with RCU read locked.  The caller may modify @pos_cgrp by calling
- * cgroup_rightmost_descendant() to skip subtree.
+ * with RCU read locked.  The caller may modify @pos_css by calling
+ * css_rightmost_descendant() to skip subtree.
  */
-#define cpuset_for_each_descendant_pre(des_cs, pos_cgrp, root_cs)	\
-	cgroup_for_each_descendant_pre((pos_cgrp), (root_cs)->css.cgroup) \
-		if (is_cpuset_online(((des_cs) = cgroup_cs((pos_cgrp)))))
+#define cpuset_for_each_descendant_pre(des_cs, pos_css, root_cs)	\
+	css_for_each_descendant_pre((pos_css), &(root_cs)->css)		\
+		if (is_cpuset_online(((des_cs) = css_cs((pos_css)))))
 
 /*
  * There are two global mutexes guarding cpuset structures - cpuset_mutex
@@ -430,7 +430,7 @@ static void free_trial_cpuset(struct cpuset *trial)
 
 static int validate_change(struct cpuset *cur, struct cpuset *trial)
 {
-	struct cgroup *cgrp;
+	struct cgroup_subsys_state *css;
 	struct cpuset *c, *par;
 	int ret;
 
@@ -438,7 +438,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 
 	/* Each of our child cpusets must be a subset of us */
 	ret = -EBUSY;
-	cpuset_for_each_child(c, cgrp, cur)
+	cpuset_for_each_child(c, css, cur)
 		if (!is_cpuset_subset(c, trial))
 			goto out;
 
@@ -459,7 +459,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 	 * overlap
 	 */
 	ret = -EINVAL;
-	cpuset_for_each_child(c, cgrp, par) {
+	cpuset_for_each_child(c, css, par) {
 		if ((is_cpu_exclusive(trial) || is_cpu_exclusive(c)) &&
 		    c != cur &&
 		    cpumask_intersects(trial->cpus_allowed, c->cpus_allowed))
@@ -508,13 +508,13 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
 				    struct cpuset *root_cs)
 {
 	struct cpuset *cp;
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 
 	rcu_read_lock();
-	cpuset_for_each_descendant_pre(cp, pos_cgrp, root_cs) {
+	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
 		/* skip the whole subtree if @cp doesn't have any CPU */
 		if (cpumask_empty(cp->cpus_allowed)) {
-			pos_cgrp = cgroup_rightmost_descendant(pos_cgrp);
+			pos_css = css_rightmost_descendant(pos_css);
 			continue;
 		}
 
@@ -589,7 +589,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	struct sched_domain_attr *dattr;  /* attributes for custom domains */
 	int ndoms = 0;		/* number of sched domains in result */
 	int nslot;		/* next empty doms[] struct cpumask slot */
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 
 	doms = NULL;
 	dattr = NULL;
@@ -618,7 +618,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	csn = 0;
 
 	rcu_read_lock();
-	cpuset_for_each_descendant_pre(cp, pos_cgrp, &top_cpuset) {
+	cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) {
 		/*
 		 * Continue traversing beyond @cp iff @cp has some CPUs and
 		 * isn't load balancing.  The former is obvious.  The
@@ -635,7 +635,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
 			csa[csn++] = cp;
 
 		/* skip @cp's subtree */
-		pos_cgrp = cgroup_rightmost_descendant(pos_cgrp);
+		pos_css = css_rightmost_descendant(pos_css);
 	}
 	rcu_read_unlock();
 
@@ -886,16 +886,16 @@ static void update_tasks_cpumask_hier(struct cpuset *root_cs,
 				      bool update_root, struct ptr_heap *heap)
 {
 	struct cpuset *cp;
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 
 	if (update_root)
 		update_tasks_cpumask(root_cs, heap);
 
 	rcu_read_lock();
-	cpuset_for_each_descendant_pre(cp, pos_cgrp, root_cs) {
+	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
 		/* skip the whole subtree if @cp have some CPU */
 		if (!cpumask_empty(cp->cpus_allowed)) {
-			pos_cgrp = cgroup_rightmost_descendant(pos_cgrp);
+			pos_css = css_rightmost_descendant(pos_css);
 			continue;
 		}
 		if (!css_tryget(&cp->css))
@@ -1143,16 +1143,16 @@ static void update_tasks_nodemask_hier(struct cpuset *root_cs,
 				       bool update_root, struct ptr_heap *heap)
 {
 	struct cpuset *cp;
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 
 	if (update_root)
 		update_tasks_nodemask(root_cs, heap);
 
 	rcu_read_lock();
-	cpuset_for_each_descendant_pre(cp, pos_cgrp, root_cs) {
+	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
 		/* skip the whole subtree if @cp have some CPU */
 		if (!nodes_empty(cp->mems_allowed)) {
-			pos_cgrp = cgroup_rightmost_descendant(pos_cgrp);
+			pos_css = css_rightmost_descendant(pos_css);
 			continue;
 		}
 		if (!css_tryget(&cp->css))
@@ -1973,7 +1973,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 	struct cpuset *cs = css_cs(css);
 	struct cpuset *parent = parent_cs(cs);
 	struct cpuset *tmp_cs;
-	struct cgroup *pos_cgrp;
+	struct cgroup_subsys_state *pos_css;
 
 	if (!parent)
 		return 0;
@@ -2005,7 +2005,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 	 * (and likewise for mems) to the new cgroup.
 	 */
 	rcu_read_lock();
-	cpuset_for_each_child(tmp_cs, pos_cgrp, parent) {
+	cpuset_for_each_child(tmp_cs, pos_css, parent) {
 		if (is_mem_exclusive(tmp_cs) || is_cpu_exclusive(tmp_cs)) {
 			rcu_read_unlock();
 			goto out_unlock;
@@ -2252,10 +2252,10 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 	/* if cpus or mems changed, we need to propagate to descendants */
 	if (cpus_updated || mems_updated) {
 		struct cpuset *cs;
-		struct cgroup *pos_cgrp;
+		struct cgroup_subsys_state *pos_css;
 
 		rcu_read_lock();
-		cpuset_for_each_descendant_pre(cs, pos_cgrp, &top_cpuset) {
+		cpuset_for_each_descendant_pre(cs, pos_css, &top_cpuset) {
 			if (!css_tryget(&cs->css))
 				continue;
 			rcu_read_unlock();
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ab64dfc..2285319 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1082,7 +1082,7 @@ struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
 static struct mem_cgroup *__mem_cgroup_iter_next(struct mem_cgroup *root,
 		struct mem_cgroup *last_visited)
 {
-	struct cgroup *prev_cgroup, *next_cgroup;
+	struct cgroup_subsys_state *prev_css, *next_css;
 
 	/*
 	 * Root is not visited by cgroup iterators so it needs an
@@ -1091,11 +1091,9 @@ static struct mem_cgroup *__mem_cgroup_iter_next(struct mem_cgroup *root,
 	if (!last_visited)
 		return root;
 
-	prev_cgroup = (last_visited == root) ? NULL
-		: last_visited->css.cgroup;
+	prev_css = (last_visited == root) ? NULL : &last_visited->css;
 skip_node:
-	next_cgroup = cgroup_next_descendant_pre(
-			prev_cgroup, root->css.cgroup);
+	next_css = css_next_descendant_pre(prev_css, &root->css);
 
 	/*
 	 * Even if we found a group we have to make sure it is
@@ -1104,13 +1102,13 @@ skip_node:
 	 * last_visited css is safe to use because it is
 	 * protected by css_get and the tree walk is rcu safe.
 	 */
-	if (next_cgroup) {
-		struct mem_cgroup *mem = mem_cgroup_from_cont(
-				next_cgroup);
+	if (next_css) {
+		struct mem_cgroup *mem = mem_cgroup_from_css(next_css);
+
 		if (css_tryget(&mem->css))
 			return mem;
 		else {
-			prev_cgroup = next_cgroup;
+			prev_css = next_css;
 			goto skip_node;
 		}
 	}
@@ -4939,10 +4937,10 @@ static void mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
  */
 static inline bool __memcg_has_children(struct mem_cgroup *memcg)
 {
-	struct cgroup *pos;
+	struct cgroup_subsys_state *pos;
 
 	/* bounce at first found */
-	cgroup_for_each_child(pos, memcg->css.cgroup)
+	css_for_each_child(pos, &memcg->css)
 		return true;
 	return false;
 }
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index e0ca464..9bf230a 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -56,11 +56,6 @@ static inline struct dev_cgroup *css_to_devcgroup(struct cgroup_subsys_state *s)
 	return s ? container_of(s, struct dev_cgroup, css) : NULL;
 }
 
-static inline struct dev_cgroup *cgroup_to_devcgroup(struct cgroup *cgroup)
-{
-	return css_to_devcgroup(cgroup_css(cgroup, devices_subsys_id));
-}
-
 static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
 {
 	return css_to_devcgroup(task_css(task, devices_subsys_id));
@@ -447,13 +442,13 @@ static void revalidate_active_exceptions(struct dev_cgroup *devcg)
 static int propagate_exception(struct dev_cgroup *devcg_root,
 			       struct dev_exception_item *ex)
 {
-	struct cgroup *root = devcg_root->css.cgroup, *pos;
+	struct cgroup_subsys_state *pos;
 	int rc = 0;
 
 	rcu_read_lock();
 
-	cgroup_for_each_descendant_pre(pos, root) {
-		struct dev_cgroup *devcg = cgroup_to_devcgroup(pos);
+	css_for_each_descendant_pre(pos, &devcg_root->css) {
+		struct dev_cgroup *devcg = css_to_devcgroup(pos);
 
 		/*
 		 * Because devcgroup_mutex is held, no devcg will become
-- 
1.8.3.1



* [PATCH 16/23] cgroup: relocate cgroup_advance_iter()
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (14 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02  3:25   ` Li Zefan
  2013-08-01 21:49 ` [PATCH 17/23] cgroup: rename cgroup_iter to cgroup_task_iter Tejun Heo
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

For some reason, cgroup_advance_iter() is standing lonely, far away
from its iter comrades.  Relocate it next to them.

This is cosmetic.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 48 ++++++++++++++++++++++++------------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 850ad87..1085439 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2975,30 +2975,6 @@ int cgroup_task_count(const struct cgroup *cgrp)
 }
 
 /*
- * Advance a list_head iterator.  The iterator should be positioned at
- * the start of a css_set
- */
-static void cgroup_advance_iter(struct cgroup *cgrp, struct cgroup_iter *it)
-{
-	struct list_head *l = it->cset_link;
-	struct cgrp_cset_link *link;
-	struct css_set *cset;
-
-	/* Advance to the next non-empty css_set */
-	do {
-		l = l->next;
-		if (l == &cgrp->cset_links) {
-			it->cset_link = NULL;
-			return;
-		}
-		link = list_entry(l, struct cgrp_cset_link, cset_link);
-		cset = link->cset;
-	} while (list_empty(&cset->tasks));
-	it->cset_link = l;
-	it->task = cset->tasks.next;
-}
-
-/*
  * To reduce the fork() overhead for systems that are not actually
  * using their cgroups capability, we don't maintain the lists running
  * through each css_set to its tasks until we see the list actually
@@ -3216,6 +3192,30 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
 }
 EXPORT_SYMBOL_GPL(css_next_descendant_post);
 
+/*
+ * Advance a list_head iterator.  The iterator should be positioned at
+ * the start of a css_set
+ */
+static void cgroup_advance_iter(struct cgroup *cgrp, struct cgroup_iter *it)
+{
+	struct list_head *l = it->cset_link;
+	struct cgrp_cset_link *link;
+	struct css_set *cset;
+
+	/* Advance to the next non-empty css_set */
+	do {
+		l = l->next;
+		if (l == &cgrp->cset_links) {
+			it->cset_link = NULL;
+			return;
+		}
+		link = list_entry(l, struct cgrp_cset_link, cset_link);
+		cset = link->cset;
+	} while (list_empty(&cset->tasks));
+	it->cset_link = l;
+	it->task = cset->tasks.next;
+}
+
 void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it)
 	__acquires(css_set_lock)
 {
-- 
1.8.3.1



* [PATCH 17/23] cgroup: rename cgroup_iter to cgroup_task_iter
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (15 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 16/23] cgroup: relocate cgroup_advance_iter() Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02 13:35   ` Michal Hocko
  2013-08-01 21:49 ` [PATCH 18/23] cgroup: make cgroup_task_iter remember the cgroup being iterated Tejun Heo
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Matt Helsley,
	Johannes Weiner, Michal Hocko, Balbir Singh

cgroup now has multiple iterators and it's quite confusing to have
something which walks over the tasks of a single cgroup named
cgroup_iter.  Let's rename it to cgroup_task_iter.

While at it, reformat / update comments and replace the overview
comment above the interface function decls with proper function
comments.  Such an overview can be useful but function comments should
be more than enough here.

This is a pure rename and doesn't introduce any functional changes.
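
For readers unfamiliar with the start/next/end protocol that the renamed
functions follow, here is a minimal userspace sketch.  It is not the kernel
code: every model_* name is hypothetical, arrays stand in for the kernel's
linked lists, and a plain flag stands in for css_set_lock.  It only
illustrates how the iterator walks the tasks of one cgroup while skipping
empty css_sets, and that the lock is held from start until end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's structures. */
struct model_cset {
	const int *tasks;	/* task ids belonging to this set */
	size_t nr_tasks;	/* may be zero */
};

struct model_cgroup {
	const struct model_cset *csets;
	size_t nr_csets;
};

struct model_task_iter {
	size_t cset;		/* index of the current set */
	size_t task;		/* index of the next task in that set */
};

static int model_lock_held;	/* stands in for css_set_lock */

/* Move to the first non-empty set at or after it->cset. */
static void model_advance(const struct model_cgroup *cg,
			  struct model_task_iter *it)
{
	while (it->cset < cg->nr_csets && cg->csets[it->cset].nr_tasks == 0)
		it->cset++;
	it->task = 0;
}

/* Take the "lock" and position the iterator on the first task. */
static void model_task_iter_start(const struct model_cgroup *cg,
				  struct model_task_iter *it)
{
	assert(!model_lock_held);
	model_lock_held = 1;
	it->cset = 0;
	model_advance(cg, it);
}

/* Return the next task id, or -1 once every set has been walked. */
static int model_task_iter_next(const struct model_cgroup *cg,
				struct model_task_iter *it)
{
	const struct model_cset *cs;
	int res;

	assert(model_lock_held);
	if (it->cset >= cg->nr_csets)
		return -1;

	cs = &cg->csets[it->cset];
	res = cs->tasks[it->task++];
	if (it->task == cs->nr_tasks) {
		/* end of this set's tasks, move on to the next set */
		it->cset++;
		model_advance(cg, it);
	}
	return res;
}

/* Drop the "lock"; must pair with model_task_iter_start(). */
static void model_task_iter_end(const struct model_cgroup *cg,
				struct model_task_iter *it)
{
	(void)cg;
	(void)it;
	assert(model_lock_held);
	model_lock_held = 0;
}
```

A caller would invoke model_task_iter_start(), loop on
model_task_iter_next() until it returns -1, and then call
model_task_iter_end(), mirroring the call sites converted in this patch.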

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
---
 include/linux/cgroup.h  |  31 ++++---------
 kernel/cgroup.c         | 114 ++++++++++++++++++++++++++++++++----------------
 kernel/cgroup_freezer.c |  24 +++++-----
 mm/memcontrol.c         |  10 ++---
 4 files changed, 102 insertions(+), 77 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 7fba0d0..4478336 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -890,31 +890,16 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
 	for ((pos) = css_next_descendant_post(NULL, (css)); (pos);	\
 	     (pos) = css_next_descendant_post((pos), (css)))
 
-/* A cgroup_iter should be treated as an opaque object */
-struct cgroup_iter {
-	struct list_head *cset_link;
-	struct list_head *task;
+/* A cgroup_task_iter should be treated as an opaque object */
+struct cgroup_task_iter {
+	struct list_head		*cset_link;
+	struct list_head		*task;
 };
 
-/*
- * To iterate across the tasks in a cgroup:
- *
- * 1) call cgroup_iter_start to initialize an iterator
- *
- * 2) call cgroup_iter_next() to retrieve member tasks until it
- *    returns NULL or until you want to end the iteration
- *
- * 3) call cgroup_iter_end() to destroy the iterator.
- *
- * Or, call cgroup_scan_tasks() to iterate through every task in a
- * cgroup - cgroup_scan_tasks() holds the css_set_lock when calling
- * the test_task() callback, but not while calling the process_task()
- * callback.
- */
-void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it);
-struct task_struct *cgroup_iter_next(struct cgroup *cgrp,
-					struct cgroup_iter *it);
-void cgroup_iter_end(struct cgroup *cgrp, struct cgroup_iter *it);
+void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it);
+struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
+					  struct cgroup_task_iter *it);
+void cgroup_task_iter_end(struct cgroup *cgrp, struct cgroup_task_iter *it);
 int cgroup_scan_tasks(struct cgroup_scanner *scan);
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
 int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1085439..7a4f89b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -367,9 +367,11 @@ static struct cgrp_cset_link init_cgrp_cset_link;
 static int cgroup_init_idr(struct cgroup_subsys *ss,
 			   struct cgroup_subsys_state *css);
 
-/* css_set_lock protects the list of css_set objects, and the
- * chain of tasks off each css_set.  Nests outside task->alloc_lock
- * due to cgroup_iter_start() */
+/*
+ * css_set_lock protects the list of css_set objects, and the chain of
+ * tasks off each css_set.  Nests outside task->alloc_lock due to
+ * cgroup_task_iter_start().
+ */
 static DEFINE_RWLOCK(css_set_lock);
 static int css_set_count;
 
@@ -394,10 +396,12 @@ static unsigned long css_set_hash(struct cgroup_subsys_state *css[])
 	return key;
 }
 
-/* We don't maintain the lists running through each css_set to its
- * task until after the first call to cgroup_iter_start(). This
- * reduces the fork()/exit() overhead for people who have cgroups
- * compiled into their kernel but not actually in use */
+/*
+ * We don't maintain the lists running through each css_set to its task
+ * until after the first call to cgroup_task_iter_start().  This reduces
+ * the fork()/exit() overhead for people who have cgroups compiled into
+ * their kernel but not actually in use.
+ */
 static int use_task_css_set_links __read_mostly;
 
 static void __put_css_set(struct css_set *cset, int taskexit)
@@ -2975,10 +2979,10 @@ int cgroup_task_count(const struct cgroup *cgrp)
 }
 
 /*
- * To reduce the fork() overhead for systems that are not actually
- * using their cgroups capability, we don't maintain the lists running
- * through each css_set to its tasks until we see the list actually
- * used - in other words after the first call to cgroup_iter_start().
+ * To reduce the fork() overhead for systems that are not actually using
+ * their cgroups capability, we don't maintain the lists running through
+ * each css_set to its tasks until we see the list actually used - in other
+ * words after the first call to cgroup_task_iter_start().
  */
 static void cgroup_enable_task_cg_lists(void)
 {
@@ -3192,11 +3196,15 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
 }
 EXPORT_SYMBOL_GPL(css_next_descendant_post);
 
-/*
- * Advance a list_head iterator.  The iterator should be positioned at
- * the start of a css_set
+/**
+ * cgroup_advance_task_iter - advance a task itererator to the next css_set
+ * @cgrp: the cgroup to walk tasks of
+ * @it: the iterator to advance
+ *
+ * Advance @it to the next css_set to walk.
  */
-static void cgroup_advance_iter(struct cgroup *cgrp, struct cgroup_iter *it)
+static void cgroup_advance_task_iter(struct cgroup *cgrp,
+				     struct cgroup_task_iter *it)
 {
 	struct list_head *l = it->cset_link;
 	struct cgrp_cset_link *link;
@@ -3216,7 +3224,21 @@ static void cgroup_advance_iter(struct cgroup *cgrp, struct cgroup_iter *it)
 	it->task = cset->tasks.next;
 }
 
-void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it)
+/**
+ * cgroup_task_iter_start - initiate task iteration
+ * @cgrp: the cgroup to walk tasks of
+ * @it: the task iterator to use
+ *
+ * Initiate iteration through the tasks of @cgrp.  The caller can call
+ * cgroup_task_iter_next() to walk through the tasks until the function
+ * returns NULL.  On completion of iteration, cgroup_task_iter_end() must
+ * be called.
+ *
+ * Note that this function acquires a lock which is released when the
+ * iteration finishes.  The caller can't sleep while iteration is in
+ * progress.
+ */
+void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it)
 	__acquires(css_set_lock)
 {
 	/*
@@ -3229,11 +3251,20 @@ void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it)
 
 	read_lock(&css_set_lock);
 	it->cset_link = &cgrp->cset_links;
-	cgroup_advance_iter(cgrp, it);
+	cgroup_advance_task_iter(cgrp, it);
 }
 
-struct task_struct *cgroup_iter_next(struct cgroup *cgrp,
-					struct cgroup_iter *it)
+/**
+ * cgroup_task_iter_next - return the next task for the iterator
+ * @cgrp: the cgroup to walk tasks of
+ * @it: the task iterator being iterated
+ *
+ * The "next" function for task iteration.  @it should have been
+ * initialized via cgroup_task_iter_start().  Returns NULL when the
+ * iteration reaches the end.
+ */
+struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
+					  struct cgroup_task_iter *it)
 {
 	struct task_struct *res;
 	struct list_head *l = it->task;
@@ -3247,16 +3278,25 @@ struct task_struct *cgroup_iter_next(struct cgroup *cgrp,
 	l = l->next;
 	link = list_entry(it->cset_link, struct cgrp_cset_link, cset_link);
 	if (l == &link->cset->tasks) {
-		/* We reached the end of this task list - move on to
-		 * the next cg_cgroup_link */
-		cgroup_advance_iter(cgrp, it);
+		/*
+		 * We reached the end of this task list - move on to the
+		 * next cgrp_cset_link.
+		 */
+		cgroup_advance_task_iter(cgrp, it);
 	} else {
 		it->task = l;
 	}
 	return res;
 }
 
-void cgroup_iter_end(struct cgroup *cgrp, struct cgroup_iter *it)
+/**
+ * cgroup_task_iter_end - finish task iteration
+ * @cgrp: the cgroup to walk tasks of
+ * @it: the task iterator to finish
+ *
+ * Finish task iteration started by cgroup_task_iter_start().
+ */
+void cgroup_task_iter_end(struct cgroup *cgrp, struct cgroup_task_iter *it)
 	__releases(css_set_lock)
 {
 	read_unlock(&css_set_lock);
@@ -3305,7 +3345,7 @@ static inline int started_after(void *p1, void *p2)
  * Iterate through all the tasks in a cgroup, calling test_task() for each,
  * and if it returns true, call process_task() for it also.
  * The test_task pointer may be NULL, meaning always true (select all tasks).
- * Effectively duplicates cgroup_iter_{start,next,end}()
+ * Effectively duplicates cgroup_task_iter_{start,next,end}()
  * but does not lock css_set_lock for the call to process_task().
  * The struct cgroup_scanner may be embedded in any structure of the caller's
  * creation.
@@ -3326,7 +3366,7 @@ static inline int started_after(void *p1, void *p2)
 int cgroup_scan_tasks(struct cgroup_scanner *scan)
 {
 	int retval, i;
-	struct cgroup_iter it;
+	struct cgroup_task_iter it;
 	struct task_struct *p, *dropped;
 	/* Never dereference latest_task, since it's not refcounted */
 	struct task_struct *latest_task = NULL;
@@ -3361,8 +3401,8 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
 	 * guarantees forward progress and that we don't miss any tasks.
 	 */
 	heap->size = 0;
-	cgroup_iter_start(scan->cgrp, &it);
-	while ((p = cgroup_iter_next(scan->cgrp, &it))) {
+	cgroup_task_iter_start(scan->cgrp, &it);
+	while ((p = cgroup_task_iter_next(scan->cgrp, &it))) {
 		/*
 		 * Only affect tasks that qualify per the caller's callback,
 		 * if he provided one
@@ -3395,7 +3435,7 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
 		 * the heap and wasn't inserted
 		 */
 	}
-	cgroup_iter_end(scan->cgrp, &it);
+	cgroup_task_iter_end(scan->cgrp, &it);
 
 	if (heap->size) {
 		for (i = 0; i < heap->size; i++) {
@@ -3601,7 +3641,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
 	pid_t *array;
 	int length;
 	int pid, n = 0; /* used for populating the array */
-	struct cgroup_iter it;
+	struct cgroup_task_iter it;
 	struct task_struct *tsk;
 	struct cgroup_pidlist *l;
 
@@ -3616,8 +3656,8 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
 	if (!array)
 		return -ENOMEM;
 	/* now, populate the array */
-	cgroup_iter_start(cgrp, &it);
-	while ((tsk = cgroup_iter_next(cgrp, &it))) {
+	cgroup_task_iter_start(cgrp, &it);
+	while ((tsk = cgroup_task_iter_next(cgrp, &it))) {
 		if (unlikely(n == length))
 			break;
 		/* get tgid or pid for procs or tasks file respectively */
@@ -3628,7 +3668,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
 		if (pid > 0) /* make sure to only use valid results */
 			array[n++] = pid;
 	}
-	cgroup_iter_end(cgrp, &it);
+	cgroup_task_iter_end(cgrp, &it);
 	length = n;
 	/* now sort & (if procs) strip out duplicates */
 	sort(array, length, sizeof(pid_t), cmppid, NULL);
@@ -3662,7 +3702,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 {
 	int ret = -EINVAL;
 	struct cgroup *cgrp;
-	struct cgroup_iter it;
+	struct cgroup_task_iter it;
 	struct task_struct *tsk;
 
 	/*
@@ -3676,8 +3716,8 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 	ret = 0;
 	cgrp = dentry->d_fsdata;
 
-	cgroup_iter_start(cgrp, &it);
-	while ((tsk = cgroup_iter_next(cgrp, &it))) {
+	cgroup_task_iter_start(cgrp, &it);
+	while ((tsk = cgroup_task_iter_next(cgrp, &it))) {
 		switch (tsk->state) {
 		case TASK_RUNNING:
 			stats->nr_running++;
@@ -3697,7 +3737,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 			break;
 		}
 	}
-	cgroup_iter_end(cgrp, &it);
+	cgroup_task_iter_end(cgrp, &it);
 
 err:
 	return ret;
@@ -5128,7 +5168,7 @@ void cgroup_fork(struct task_struct *child)
  * Adds the task to the list running through its css_set if necessary and
  * call the subsystem fork() callbacks.  Has to be after the task is
  * visible on the task list in case we race with the first call to
- * cgroup_iter_start() - to guarantee that the new task ends up on its
+ * cgroup_task_iter_start() - to guarantee that the new task ends up on its
  * list.
  */
 void cgroup_post_fork(struct task_struct *child)
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 98ca48d..c9177f8 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -258,7 +258,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 {
 	struct freezer *freezer = css_freezer(css);
 	struct cgroup_subsys_state *pos;
-	struct cgroup_iter it;
+	struct cgroup_task_iter it;
 	struct task_struct *task;
 
 	WARN_ON_ONCE(!rcu_read_lock_held());
@@ -279,9 +279,9 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 	}
 
 	/* are all tasks frozen? */
-	cgroup_iter_start(css->cgroup, &it);
+	cgroup_task_iter_start(css->cgroup, &it);
 
-	while ((task = cgroup_iter_next(css->cgroup, &it))) {
+	while ((task = cgroup_task_iter_next(css->cgroup, &it))) {
 		if (freezing(task)) {
 			/*
 			 * freezer_should_skip() indicates that the task
@@ -296,7 +296,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 
 	freezer->state |= CGROUP_FROZEN;
 out_iter_end:
-	cgroup_iter_end(css->cgroup, &it);
+	cgroup_task_iter_end(css->cgroup, &it);
 out_unlock:
 	spin_unlock_irq(&freezer->lock);
 }
@@ -323,25 +323,25 @@ static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
 static void freeze_cgroup(struct freezer *freezer)
 {
 	struct cgroup *cgroup = freezer->css.cgroup;
-	struct cgroup_iter it;
+	struct cgroup_task_iter it;
 	struct task_struct *task;
 
-	cgroup_iter_start(cgroup, &it);
-	while ((task = cgroup_iter_next(cgroup, &it)))
+	cgroup_task_iter_start(cgroup, &it);
+	while ((task = cgroup_task_iter_next(cgroup, &it)))
 		freeze_task(task);
-	cgroup_iter_end(cgroup, &it);
+	cgroup_task_iter_end(cgroup, &it);
 }
 
 static void unfreeze_cgroup(struct freezer *freezer)
 {
 	struct cgroup *cgroup = freezer->css.cgroup;
-	struct cgroup_iter it;
+	struct cgroup_task_iter it;
 	struct task_struct *task;
 
-	cgroup_iter_start(cgroup, &it);
-	while ((task = cgroup_iter_next(cgroup, &it)))
+	cgroup_task_iter_start(cgroup, &it);
+	while ((task = cgroup_task_iter_next(cgroup, &it)))
 		__thaw_task(task);
-	cgroup_iter_end(cgroup, &it);
+	cgroup_task_iter_end(cgroup, &it);
 }
 
 /**
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2285319..00b055d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1800,11 +1800,11 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	totalpages = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT ? : 1;
 	for_each_mem_cgroup_tree(iter, memcg) {
 		struct cgroup *cgroup = iter->css.cgroup;
-		struct cgroup_iter it;
+		struct cgroup_task_iter it;
 		struct task_struct *task;
 
-		cgroup_iter_start(cgroup, &it);
-		while ((task = cgroup_iter_next(cgroup, &it))) {
+		cgroup_task_iter_start(cgroup, &it);
+		while ((task = cgroup_task_iter_next(cgroup, &it))) {
 			switch (oom_scan_process_thread(task, totalpages, NULL,
 							false)) {
 			case OOM_SCAN_SELECT:
@@ -1817,7 +1817,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 			case OOM_SCAN_CONTINUE:
 				continue;
 			case OOM_SCAN_ABORT:
-				cgroup_iter_end(cgroup, &it);
+				cgroup_task_iter_end(cgroup, &it);
 				mem_cgroup_iter_break(memcg, iter);
 				if (chosen)
 					put_task_struct(chosen);
@@ -1834,7 +1834,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 				get_task_struct(chosen);
 			}
 		}
-		cgroup_iter_end(cgroup, &it);
+		cgroup_task_iter_end(cgroup, &it);
 	}
 
 	if (!chosen)
-- 
1.8.3.1



* [PATCH 18/23] cgroup: make cgroup_task_iter remember the cgroup being iterated
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (16 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 17/23] cgroup: rename cgroup_iter to cgroup_task_iter Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02 13:38   ` Michal Hocko
  2013-08-01 21:49 ` [PATCH 19/23] cgroup: remove struct cgroup_scanner Tejun Heo
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Matt Helsley,
	Johannes Weiner, Michal Hocko, Balbir Singh

Currently all cgroup_task_iter functions require @cgrp to be passed
in, which is superfluous and increases the chance of usage errors.
Make cgroup_task_iter remember the cgroup being iterated and drop the
@cgrp argument from the next and end functions.

This patch doesn't introduce any behavior differences.
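
As a rough illustration of the idea, the following userspace sketch shows
an iterator that records its origin at start time, so that next and end
need only the iterator itself.  All demo_* names are hypothetical and the
circular list is a simplified stand-in for the kernel's list_head; the
point is only that the end-of-list check compares against the sentinel
reached through the remembered origin, much like the check against
it->origin_cgrp->cset_links in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified circular list; not the kernel's list_head. */
struct demo_node {
	struct demo_node *next;
	int id;
};

struct demo_group {
	struct demo_node head;		/* sentinel of the circular list */
};

struct demo_iter {
	struct demo_group *origin;	/* remembered by demo_iter_start() */
	struct demo_node *pos;		/* next node to hand out */
};

static void demo_group_init(struct demo_group *g)
{
	g->head.next = &g->head;
	g->head.id = -1;
}

/* Insert @n right after the sentinel. */
static void demo_group_add(struct demo_group *g, struct demo_node *n, int id)
{
	n->id = id;
	n->next = g->head.next;
	g->head.next = n;
}

/* Only start() needs the group; it stores it in the iterator. */
static void demo_iter_start(struct demo_group *g, struct demo_iter *it)
{
	assert(g != NULL);
	it->origin = g;
	it->pos = g->head.next;
}

/* Return the next node, or NULL once the walk is back at the sentinel. */
static struct demo_node *demo_iter_next(struct demo_iter *it)
{
	struct demo_node *res = it->pos;

	if (res == &it->origin->head)
		return NULL;
	it->pos = res->next;
	return res;
}

/* Invalidate the iterator; no group argument is required. */
static void demo_iter_end(struct demo_iter *it)
{
	it->origin = NULL;
	it->pos = NULL;
}
```

With this shape a caller cannot pass a different group to next or end than
the one given to start, which is the class of usage error the description
above refers to.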

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
---
 include/linux/cgroup.h  |  6 +++---
 kernel/cgroup.c         | 32 +++++++++++++++-----------------
 kernel/cgroup_freezer.c | 12 ++++++------
 mm/memcontrol.c         |  6 +++---
 4 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 4478336..2b10152 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -892,14 +892,14 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
 
 /* A cgroup_task_iter should be treated as an opaque object */
 struct cgroup_task_iter {
+	struct cgroup			*origin_cgrp;
 	struct list_head		*cset_link;
 	struct list_head		*task;
 };
 
 void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it);
-struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
-					  struct cgroup_task_iter *it);
-void cgroup_task_iter_end(struct cgroup *cgrp, struct cgroup_task_iter *it);
+struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it);
+void cgroup_task_iter_end(struct cgroup_task_iter *it);
 int cgroup_scan_tasks(struct cgroup_scanner *scan);
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
 int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7a4f89b..7adaaa6 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3198,13 +3198,11 @@ EXPORT_SYMBOL_GPL(css_next_descendant_post);
 
 /**
  * cgroup_advance_task_iter - advance a task itererator to the next css_set
- * @cgrp: the cgroup to walk tasks of
  * @it: the iterator to advance
  *
  * Advance @it to the next css_set to walk.
  */
-static void cgroup_advance_task_iter(struct cgroup *cgrp,
-				     struct cgroup_task_iter *it)
+static void cgroup_advance_task_iter(struct cgroup_task_iter *it)
 {
 	struct list_head *l = it->cset_link;
 	struct cgrp_cset_link *link;
@@ -3213,7 +3211,7 @@ static void cgroup_advance_task_iter(struct cgroup *cgrp,
 	/* Advance to the next non-empty css_set */
 	do {
 		l = l->next;
-		if (l == &cgrp->cset_links) {
+		if (l == &it->origin_cgrp->cset_links) {
 			it->cset_link = NULL;
 			return;
 		}
@@ -3250,21 +3248,22 @@ void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it)
 		cgroup_enable_task_cg_lists();
 
 	read_lock(&css_set_lock);
+
+	it->origin_cgrp = cgrp;
 	it->cset_link = &cgrp->cset_links;
-	cgroup_advance_task_iter(cgrp, it);
+
+	cgroup_advance_task_iter(it);
 }
 
 /**
  * cgroup_task_iter_next - return the next task for the iterator
- * @cgrp: the cgroup to walk tasks of
  * @it: the task iterator being iterated
  *
  * The "next" function for task iteration.  @it should have been
  * initialized via cgroup_task_iter_start().  Returns NULL when the
  * iteration reaches the end.
  */
-struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
-					  struct cgroup_task_iter *it)
+struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it)
 {
 	struct task_struct *res;
 	struct list_head *l = it->task;
@@ -3282,7 +3281,7 @@ struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
 		 * We reached the end of this task list - move on to the
 		 * next cgrp_cset_link.
 		 */
-		cgroup_advance_task_iter(cgrp, it);
+		cgroup_advance_task_iter(it);
 	} else {
 		it->task = l;
 	}
@@ -3291,12 +3290,11 @@ struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
 
 /**
  * cgroup_task_iter_end - finish task iteration
- * @cgrp: the cgroup to walk tasks of
  * @it: the task iterator to finish
  *
  * Finish task iteration started by cgroup_task_iter_start().
  */
-void cgroup_task_iter_end(struct cgroup *cgrp, struct cgroup_task_iter *it)
+void cgroup_task_iter_end(struct cgroup_task_iter *it)
 	__releases(css_set_lock)
 {
 	read_unlock(&css_set_lock);
@@ -3402,7 +3400,7 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
 	 */
 	heap->size = 0;
 	cgroup_task_iter_start(scan->cgrp, &it);
-	while ((p = cgroup_task_iter_next(scan->cgrp, &it))) {
+	while ((p = cgroup_task_iter_next(&it))) {
 		/*
 		 * Only affect tasks that qualify per the caller's callback,
 		 * if he provided one
@@ -3435,7 +3433,7 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
 		 * the heap and wasn't inserted
 		 */
 	}
-	cgroup_task_iter_end(scan->cgrp, &it);
+	cgroup_task_iter_end(&it);
 
 	if (heap->size) {
 		for (i = 0; i < heap->size; i++) {
@@ -3657,7 +3655,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
 		return -ENOMEM;
 	/* now, populate the array */
 	cgroup_task_iter_start(cgrp, &it);
-	while ((tsk = cgroup_task_iter_next(cgrp, &it))) {
+	while ((tsk = cgroup_task_iter_next(&it))) {
 		if (unlikely(n == length))
 			break;
 		/* get tgid or pid for procs or tasks file respectively */
@@ -3668,7 +3666,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
 		if (pid > 0) /* make sure to only use valid results */
 			array[n++] = pid;
 	}
-	cgroup_task_iter_end(cgrp, &it);
+	cgroup_task_iter_end(&it);
 	length = n;
 	/* now sort & (if procs) strip out duplicates */
 	sort(array, length, sizeof(pid_t), cmppid, NULL);
@@ -3717,7 +3715,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 	cgrp = dentry->d_fsdata;
 
 	cgroup_task_iter_start(cgrp, &it);
-	while ((tsk = cgroup_task_iter_next(cgrp, &it))) {
+	while ((tsk = cgroup_task_iter_next(&it))) {
 		switch (tsk->state) {
 		case TASK_RUNNING:
 			stats->nr_running++;
@@ -3737,7 +3735,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 			break;
 		}
 	}
-	cgroup_task_iter_end(cgrp, &it);
+	cgroup_task_iter_end(&it);
 
 err:
 	return ret;
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index c9177f8..e0ab9bf 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -281,7 +281,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 	/* are all tasks frozen? */
 	cgroup_task_iter_start(css->cgroup, &it);
 
-	while ((task = cgroup_task_iter_next(css->cgroup, &it))) {
+	while ((task = cgroup_task_iter_next(&it))) {
 		if (freezing(task)) {
 			/*
 			 * freezer_should_skip() indicates that the task
@@ -296,7 +296,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 
 	freezer->state |= CGROUP_FROZEN;
 out_iter_end:
-	cgroup_task_iter_end(css->cgroup, &it);
+	cgroup_task_iter_end(&it);
 out_unlock:
 	spin_unlock_irq(&freezer->lock);
 }
@@ -327,9 +327,9 @@ static void freeze_cgroup(struct freezer *freezer)
 	struct task_struct *task;
 
 	cgroup_task_iter_start(cgroup, &it);
-	while ((task = cgroup_task_iter_next(cgroup, &it)))
+	while ((task = cgroup_task_iter_next(&it)))
 		freeze_task(task);
-	cgroup_task_iter_end(cgroup, &it);
+	cgroup_task_iter_end(&it);
 }
 
 static void unfreeze_cgroup(struct freezer *freezer)
@@ -339,9 +339,9 @@ static void unfreeze_cgroup(struct freezer *freezer)
 	struct task_struct *task;
 
 	cgroup_task_iter_start(cgroup, &it);
-	while ((task = cgroup_task_iter_next(cgroup, &it)))
+	while ((task = cgroup_task_iter_next(&it)))
 		__thaw_task(task);
-	cgroup_task_iter_end(cgroup, &it);
+	cgroup_task_iter_end(&it);
 }
 
 /**
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 00b055d..5a5f4dc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1804,7 +1804,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 		struct task_struct *task;
 
 		cgroup_task_iter_start(cgroup, &it);
-		while ((task = cgroup_task_iter_next(cgroup, &it))) {
+		while ((task = cgroup_task_iter_next(&it))) {
 			switch (oom_scan_process_thread(task, totalpages, NULL,
 							false)) {
 			case OOM_SCAN_SELECT:
@@ -1817,7 +1817,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 			case OOM_SCAN_CONTINUE:
 				continue;
 			case OOM_SCAN_ABORT:
-				cgroup_task_iter_end(cgroup, &it);
+				cgroup_task_iter_end(&it);
 				mem_cgroup_iter_break(memcg, iter);
 				if (chosen)
 					put_task_struct(chosen);
@@ -1834,7 +1834,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 				get_task_struct(chosen);
 			}
 		}
-		cgroup_task_iter_end(cgroup, &it);
+		cgroup_task_iter_end(&it);
 	}
 
 	if (!chosen)
-- 
1.8.3.1



* [PATCH 19/23] cgroup: remove struct cgroup_scanner
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (17 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 18/23] cgroup: make cgroup_task_iter remember the cgroup being iterated Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-01 21:49 ` [PATCH 20/23] cgroup: make task iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

cgroup_scan_tasks() takes a pointer to struct cgroup_scanner as its
sole argument, and the only function of that struct is to pack the
five arguments of the function call.  It's not too unusual to pack
parameters into a struct when the number of arguments gets excessive
or the whole set needs to be passed around a lot, but neither holds
here, which makes it just weird.

Drop struct cgroup_scanner and pass the params directly to
cgroup_scan_tasks().  Note that struct cpuset_change_nodemask_arg is
added to cpuset.c to pass both the ->cs and ->newmems pointers to
cpuset_change_nodemask() using a single data pointer.
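
As a rough illustration of the callback shape described above, here is
a minimal userspace sketch.  It is not the kernel code: struct task,
scan_tasks(), struct change_node_arg and move_all() are hypothetical
stand-ins, and the locking and heap handling of the real function are
left out.  It only shows how a callback that needs two values can
receive them through one void * data pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct. */
struct task {
	int pid;
	int node;
};

/* Same callback shape as the new interface: context arrives as void *. */
typedef bool (*test_fn)(struct task *, void *);
typedef void (*process_fn)(struct task *, void *);

/* Simplified scan: no locking, no heap, just the test/process protocol. */
static int scan_tasks(struct task *tasks, size_t n,
		      test_fn test, process_fn process, void *data)
{
	int handled = 0;

	assert(process != NULL);	/* only @test is optional */

	for (size_t i = 0; i < n; i++) {
		/* A NULL test means "select every task". */
		if (test && !test(&tasks[i], data))
			continue;
		process(&tasks[i], data);
		handled++;
	}
	return handled;
}

/* Two values bundled behind the single data pointer. */
struct change_node_arg {
	int old_node;
	int new_node;
};

static void change_node(struct task *t, void *data)
{
	struct change_node_arg *arg = data;

	if (t->node == arg->old_node)
		t->node = arg->new_node;
}

/* Caller side: build the argument on the stack and pass its address. */
static int move_all(struct task *tasks, size_t n, int old_node, int new_node)
{
	struct change_node_arg arg = { .old_node = old_node,
				       .new_node = new_node };

	return scan_tasks(tasks, n, NULL, change_node, &arg);
}
```

A callback that needs only one value can pass it directly as the data
pointer and skip the wrapper struct.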

This doesn't make any functional difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h | 16 ++++-----
 kernel/cgroup.c        | 93 +++++++++++++++++++++++---------------------------
 kernel/cpuset.c        | 63 ++++++++++++++--------------------
 3 files changed, 75 insertions(+), 97 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 2b10152..2e9a799 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -528,15 +528,6 @@ struct cftype_set {
 	struct cftype			*cfts;
 };
 
-struct cgroup_scanner {
-	struct cgroup *cgrp;
-	int (*test_task)(struct task_struct *p, struct cgroup_scanner *scan);
-	void (*process_task)(struct task_struct *p,
-			struct cgroup_scanner *scan);
-	struct ptr_heap *heap;
-	void *data;
-};
-
 /*
  * See the comment above CGRP_ROOT_SANE_BEHAVIOR for details.  This
  * function can be called as long as @cgrp is accessible.
@@ -900,7 +891,12 @@ struct cgroup_task_iter {
 void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it);
 struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it);
 void cgroup_task_iter_end(struct cgroup_task_iter *it);
-int cgroup_scan_tasks(struct cgroup_scanner *scan);
+
+int cgroup_scan_tasks(struct cgroup *cgrp,
+		      bool (*test)(struct task_struct *, void *),
+		      void (*process)(struct task_struct *, void *),
+		      void *data, struct ptr_heap *heap);
+
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
 int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7adaaa6..4e354b59 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3336,32 +3336,37 @@ static inline int started_after(void *p1, void *p2)
 
 /**
  * cgroup_scan_tasks - iterate though all the tasks in a cgroup
- * @scan: struct cgroup_scanner containing arguments for the scan
+ * @cgrp: the cgroup to iterate tasks of
+ * @test: optional test callback
+ * @process: process callback
+ * @data: data passed to @test and @process
+ * @heap: optional pre-allocated heap used for task iteration
  *
- * Arguments include pointers to callback functions test_task() and
- * process_task().
- * Iterate through all the tasks in a cgroup, calling test_task() for each,
- * and if it returns true, call process_task() for it also.
- * The test_task pointer may be NULL, meaning always true (select all tasks).
- * Effectively duplicates cgroup_task_iter_{start,next,end}()
- * but does not lock css_set_lock for the call to process_task().
- * The struct cgroup_scanner may be embedded in any structure of the caller's
- * creation.
- * It is guaranteed that process_task() will act on every task that
- * is a member of the cgroup for the duration of this call. This
- * function may or may not call process_task() for tasks that exit
- * or move to a different cgroup during the call, or are forked or
- * move into the cgroup during the call.
+ * Iterate through all the tasks in a cgroup, calling @test for each, and
+ * if it returns %true, call @process for it also.
  *
- * Note that test_task() may be called with locks held, and may in some
- * situations be called multiple times for the same task, so it should
- * be cheap.
- * If the heap pointer in the struct cgroup_scanner is non-NULL, a heap has been
- * pre-allocated and will be used for heap operations (and its "gt" member will
- * be overwritten), else a temporary heap will be used (allocation of which
- * may cause this function to fail).
+ * @test may be NULL, meaning always true (select all tasks), which
+ * effectively duplicates cgroup_task_iter_{start,next,end}() but does not
+ * lock css_set_lock for the call to @process.
+ *
+ * It is guaranteed that @process will act on every task that is a member
+ * of @cgrp for the duration of this call.  This function may or may not
+ * call @process for tasks that exit or move to a different cgroup during
+ * the call, or are forked or move into the cgroup during the call.
+ *
+ * Note that @test may be called with locks held, and may in some
+ * situations be called multiple times for the same task, so it should be
+ * cheap.
+ *
+ * If @heap is non-NULL, a heap has been pre-allocated and will be used for
+ * heap operations (and its "gt" member will be overwritten), else a
+ * temporary heap will be used (allocation of which may cause this function
+ * to fail).
  */
-int cgroup_scan_tasks(struct cgroup_scanner *scan)
+int cgroup_scan_tasks(struct cgroup *cgrp,
+		      bool (*test)(struct task_struct *, void *),
+		      void (*process)(struct task_struct *, void *),
+		      void *data, struct ptr_heap *heap)
 {
 	int retval, i;
 	struct cgroup_task_iter it;
@@ -3369,12 +3374,10 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
 	/* Never dereference latest_task, since it's not refcounted */
 	struct task_struct *latest_task = NULL;
 	struct ptr_heap tmp_heap;
-	struct ptr_heap *heap;
 	struct timespec latest_time = { 0, 0 };
 
-	if (scan->heap) {
+	if (heap) {
 		/* The caller supplied our heap and pre-allocated its memory */
-		heap = scan->heap;
 		heap->gt = &started_after;
 	} else {
 		/* We need to allocate our own heap memory */
@@ -3387,25 +3390,24 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
 
  again:
 	/*
-	 * Scan tasks in the cgroup, using the scanner's "test_task" callback
-	 * to determine which are of interest, and using the scanner's
-	 * "process_task" callback to process any of them that need an update.
-	 * Since we don't want to hold any locks during the task updates,
-	 * gather tasks to be processed in a heap structure.
-	 * The heap is sorted by descending task start time.
-	 * If the statically-sized heap fills up, we overflow tasks that
-	 * started later, and in future iterations only consider tasks that
-	 * started after the latest task in the previous pass. This
+	 * Scan tasks in the cgroup, using the @test callback to determine
+	 * which are of interest, and invoking @process callback on the
+	 * ones which need an update.  Since we don't want to hold any
+	 * locks during the task updates, gather tasks to be processed in a
+	 * heap structure.  The heap is sorted by descending task start
+	 * time.  If the statically-sized heap fills up, we overflow tasks
+	 * that started later, and in future iterations only consider tasks
+	 * that started after the latest task in the previous pass. This
 	 * guarantees forward progress and that we don't miss any tasks.
 	 */
 	heap->size = 0;
-	cgroup_task_iter_start(scan->cgrp, &it);
+	cgroup_task_iter_start(cgrp, &it);
 	while ((p = cgroup_task_iter_next(&it))) {
 		/*
 		 * Only affect tasks that qualify per the caller's callback,
 		 * if he provided one
 		 */
-		if (scan->test_task && !scan->test_task(p, scan))
+		if (test && !test(p, data))
 			continue;
 		/*
 		 * Only process tasks that started after the last task
@@ -3443,7 +3445,7 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
 				latest_task = q;
 			}
 			/* Process the task per the caller's callback */
-			scan->process_task(q, scan);
+			process(q, data);
 			put_task_struct(q);
 		}
 		/*
@@ -3460,10 +3462,9 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
 	return 0;
 }
 
-static void cgroup_transfer_one_task(struct task_struct *task,
-				     struct cgroup_scanner *scan)
+static void cgroup_transfer_one_task(struct task_struct *task, void *data)
 {
-	struct cgroup *new_cgroup = scan->data;
+	struct cgroup *new_cgroup = data;
 
 	mutex_lock(&cgroup_mutex);
 	cgroup_attach_task(new_cgroup, task, false);
@@ -3477,15 +3478,7 @@ static void cgroup_transfer_one_task(struct task_struct *task,
  */
 int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
 {
-	struct cgroup_scanner scan;
-
-	scan.cgrp = from;
-	scan.test_task = NULL; /* select all tasks in cgroup */
-	scan.process_task = cgroup_transfer_one_task;
-	scan.heap = NULL;
-	scan.data = to;
-
-	return cgroup_scan_tasks(&scan);
+	return cgroup_scan_tasks(from, NULL, cgroup_transfer_one_task, to, NULL);
 }
 
 /*
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index be4f503..6fe23f2 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -830,7 +830,7 @@ static struct cpuset *effective_nodemask_cpuset(struct cpuset *cs)
 /**
  * cpuset_change_cpumask - make a task's cpus_allowed the same as its cpuset's
  * @tsk: task to test
- * @scan: struct cgroup_scanner containing the cgroup of the task
+ * @data: cpuset to @tsk belongs to
  *
  * Called by cgroup_scan_tasks() for each task in a cgroup whose
  * cpus_allowed mask needs to be changed.
@@ -838,12 +838,11 @@ static struct cpuset *effective_nodemask_cpuset(struct cpuset *cs)
  * We don't need to re-check for the cgroup/cpuset membership, since we're
  * holding cpuset_mutex at this point.
  */
-static void cpuset_change_cpumask(struct task_struct *tsk,
-				  struct cgroup_scanner *scan)
+static void cpuset_change_cpumask(struct task_struct *tsk, void *data)
 {
-	struct cpuset *cpus_cs;
+	struct cpuset *cs = data;
+	struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
 
-	cpus_cs = effective_cpumask_cpuset(cgroup_cs(scan->cgrp));
 	set_cpus_allowed_ptr(tsk, cpus_cs->cpus_allowed);
 }
 
@@ -862,13 +861,8 @@ static void cpuset_change_cpumask(struct task_struct *tsk,
  */
 static void update_tasks_cpumask(struct cpuset *cs, struct ptr_heap *heap)
 {
-	struct cgroup_scanner scan;
-
-	scan.cgrp = cs->css.cgroup;
-	scan.test_task = NULL;
-	scan.process_task = cpuset_change_cpumask;
-	scan.heap = heap;
-	cgroup_scan_tasks(&scan);
+	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_cpumask, cs,
+			  heap);
 }
 
 /*
@@ -1052,20 +1046,24 @@ static void cpuset_change_task_nodemask(struct task_struct *tsk,
 	task_unlock(tsk);
 }
 
+struct cpuset_change_nodemask_arg {
+	struct cpuset		*cs;
+	nodemask_t		*newmems;
+};
+
 /*
  * Update task's mems_allowed and rebind its mempolicy and vmas' mempolicy
  * of it to cpuset's new mems_allowed, and migrate pages to new nodes if
  * memory_migrate flag is set. Called with cpuset_mutex held.
  */
-static void cpuset_change_nodemask(struct task_struct *p,
-				   struct cgroup_scanner *scan)
+static void cpuset_change_nodemask(struct task_struct *p, void *data)
 {
-	struct cpuset *cs = cgroup_cs(scan->cgrp);
+	struct cpuset_change_nodemask_arg *arg = data;
+	struct cpuset *cs = arg->cs;
 	struct mm_struct *mm;
 	int migrate;
-	nodemask_t *newmems = scan->data;
 
-	cpuset_change_task_nodemask(p, newmems);
+	cpuset_change_task_nodemask(p, arg->newmems);
 
 	mm = get_task_mm(p);
 	if (!mm)
@@ -1075,7 +1073,7 @@ static void cpuset_change_nodemask(struct task_struct *p,
 
 	mpol_rebind_mm(mm, &cs->mems_allowed);
 	if (migrate)
-		cpuset_migrate_mm(mm, &cs->old_mems_allowed, newmems);
+		cpuset_migrate_mm(mm, &cs->old_mems_allowed, arg->newmems);
 	mmput(mm);
 }
 
@@ -1093,19 +1091,14 @@ static void *cpuset_being_rebound;
 static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
 {
 	static nodemask_t newmems;	/* protected by cpuset_mutex */
-	struct cgroup_scanner scan;
 	struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
+	struct cpuset_change_nodemask_arg arg = { .cs = cs,
+						  .newmems = &newmems };
 
 	cpuset_being_rebound = cs;		/* causes mpol_dup() rebind */
 
 	guarantee_online_mems(mems_cs, &newmems);
 
-	scan.cgrp = cs->css.cgroup;
-	scan.test_task = NULL;
-	scan.process_task = cpuset_change_nodemask;
-	scan.heap = heap;
-	scan.data = &newmems;
-
 	/*
 	 * The mpol_rebind_mm() call takes mmap_sem, which we couldn't
 	 * take while holding tasklist_lock.  Forks can happen - the
@@ -1116,7 +1109,8 @@ static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
 	 * It's ok if we rebind the same mm twice; mpol_rebind_mm()
 	 * is idempotent.  Also migrate pages in each mm to new nodes.
 	 */
-	cgroup_scan_tasks(&scan);
+	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_nodemask, &arg,
+			  heap);
 
 	/*
 	 * All the tasks' nodemasks have been updated, update
@@ -1263,17 +1257,18 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val)
 /*
  * cpuset_change_flag - make a task's spread flags the same as its cpuset's
  * @tsk: task to be updated
- * @scan: struct cgroup_scanner containing the cgroup of the task
+ * @data: cpuset to @tsk belongs to
  *
  * Called by cgroup_scan_tasks() for each task in a cgroup.
  *
  * We don't need to re-check for the cgroup/cpuset membership, since we're
  * holding cpuset_mutex at this point.
  */
-static void cpuset_change_flag(struct task_struct *tsk,
-				struct cgroup_scanner *scan)
+static void cpuset_change_flag(struct task_struct *tsk, void *data)
 {
-	cpuset_update_task_spread_flag(cgroup_cs(scan->cgrp), tsk);
+	struct cpuset *cs = data;
+
+	cpuset_update_task_spread_flag(cs, tsk);
 }
 
 /*
@@ -1291,13 +1286,7 @@ static void cpuset_change_flag(struct task_struct *tsk,
  */
 static void update_tasks_flags(struct cpuset *cs, struct ptr_heap *heap)
 {
-	struct cgroup_scanner scan;
-
-	scan.cgrp = cs->css.cgroup;
-	scan.test_task = NULL;
-	scan.process_task = cpuset_change_flag;
-	scan.heap = heap;
-	cgroup_scan_tasks(&scan);
+	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_flag, cs, heap);
 }
 
 /*
-- 
1.8.3.1



* [PATCH 20/23] cgroup: make task iterators deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (18 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 19/23] cgroup: remove struct cgroup_scanner Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02 13:40   ` Michal Hocko
  2013-08-01 21:49 ` [PATCH 21/23] cgroup: make cftype->[un]register_event() " Tejun Heo
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Johannes Weiner,
	Michal Hocko, Balbir Singh, Matt Helsley

cgroup is in the process of converting from cgroup to css
(cgroup_subsys_state) as the principal subsystem interface handle.
This is mostly to prepare for the unified hierarchy support, where
css's will be created and destroyed dynamically, but it also helps
clean up subsystem implementations as css is usually what they are
interested in anyway.

This patch converts task iterators to deal with css instead of cgroup.
Note that under unified hierarchy, different sets of tasks will be
considered to belong to a given cgroup depending on the subsystem in
question, and making the iterators deal with css instead of cgroup
provides them with enough information about the iteration.
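
The start/next/end calling pattern can be sketched in userspace as
follows.  This is only a toy model under stated assumptions: struct
task, struct group, struct state and the task_iter_*() helpers are
hypothetical names, a flat array replaces the css_set lists, and there
is no locking.  What it keeps from the patch is that the iterator is
started from the per-subsystem state, remembers it as its origin, and
reaches the task list through the state's back-pointer to its group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct. */
struct task {
	int pid;
};

/* Stand-in for a cgroup: here it simply owns a flat task array. */
struct group {
	struct task *tasks;
	size_t nr_tasks;
};

/* Stand-in for a css: per-subsystem state pointing back at its group. */
struct state {
	struct group *group;
};

/* The iterator remembers the state it was started on. */
struct task_iter {
	struct state *origin;
	size_t pos;
};

static void task_iter_start(struct state *st, struct task_iter *it)
{
	it->origin = st;
	it->pos = 0;
}

/* Returns the next task, or NULL once the iteration reaches the end. */
static struct task *task_iter_next(struct task_iter *it)
{
	struct group *grp;

	assert(it->origin != NULL);	/* must be between start and end */

	grp = it->origin->group;
	if (it->pos >= grp->nr_tasks)
		return NULL;
	return &grp->tasks[it->pos++];
}

static void task_iter_end(struct task_iter *it)
{
	it->origin = NULL;
}

/* Typical caller: only start() needs to be told what to walk. */
static int sum_pids(struct state *st)
{
	struct task_iter it;
	struct task *t;
	int sum = 0;

	task_iter_start(st, &it);
	while ((t = task_iter_next(&it)))
		sum += t->pid;
	task_iter_end(&it);
	return sum;
}
```

In the patch itself the iterators still walk css->cgroup->cset_links,
so every css of a given cgroup currently yields the same tasks.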

While at it, fix several function comment formats in cpuset.c.

This patch doesn't introduce any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
---
 include/linux/cgroup.h  |  21 ++++-----
 kernel/cgroup.c         | 112 ++++++++++++++++++++++++------------------------
 kernel/cgroup_freezer.c |  26 ++++++-----
 kernel/cpuset.c         |  41 ++++++++----------
 mm/memcontrol.c         |  11 +++--
 5 files changed, 104 insertions(+), 107 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 2e9a799..6f6d87b 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -881,21 +881,22 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
 	for ((pos) = css_next_descendant_post(NULL, (css)); (pos);	\
 	     (pos) = css_next_descendant_post((pos), (css)))
 
-/* A cgroup_task_iter should be treated as an opaque object */
-struct cgroup_task_iter {
-	struct cgroup			*origin_cgrp;
+/* A css_task_iter should be treated as an opaque object */
+struct css_task_iter {
+	struct cgroup_subsys_state	*origin_css;
 	struct list_head		*cset_link;
 	struct list_head		*task;
 };
 
-void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it);
-struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it);
-void cgroup_task_iter_end(struct cgroup_task_iter *it);
+void css_task_iter_start(struct cgroup_subsys_state *css,
+			 struct css_task_iter *it);
+struct task_struct *css_task_iter_next(struct css_task_iter *it);
+void css_task_iter_end(struct css_task_iter *it);
 
-int cgroup_scan_tasks(struct cgroup *cgrp,
-		      bool (*test)(struct task_struct *, void *),
-		      void (*process)(struct task_struct *, void *),
-		      void *data, struct ptr_heap *heap);
+int css_scan_tasks(struct cgroup_subsys_state *css,
+		   bool (*test)(struct task_struct *, void *),
+		   void (*process)(struct task_struct *, void *),
+		   void *data, struct ptr_heap *heap);
 
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
 int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 4e354b59..c61b24f 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -370,7 +370,7 @@ static int cgroup_init_idr(struct cgroup_subsys *ss,
 /*
  * css_set_lock protects the list of css_set objects, and the chain of
  * tasks off each css_set.  Nests outside task->alloc_lock due to
- * cgroup_task_iter_start().
+ * css_task_iter_start().
  */
 static DEFINE_RWLOCK(css_set_lock);
 static int css_set_count;
@@ -398,9 +398,9 @@ static unsigned long css_set_hash(struct cgroup_subsys_state *css[])
 
 /*
  * We don't maintain the lists running through each css_set to its task
- * until after the first call to cgroup_task_iter_start().  This reduces
- * the fork()/exit() overhead for people who have cgroups compiled into
- * their kernel but not actually in use.
+ * until after the first call to css_task_iter_start().  This reduces the
+ * fork()/exit() overhead for people who have cgroups compiled into their
+ * kernel but not actually in use.
  */
 static int use_task_css_set_links __read_mostly;
 
@@ -2982,7 +2982,7 @@ int cgroup_task_count(const struct cgroup *cgrp)
  * To reduce the fork() overhead for systems that are not actually using
  * their cgroups capability, we don't maintain the lists running through
  * each css_set to its tasks until we see the list actually used - in other
- * words after the first call to cgroup_task_iter_start().
+ * words after the first call to css_task_iter_start().
  */
 static void cgroup_enable_task_cg_lists(void)
 {
@@ -3197,12 +3197,12 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
 EXPORT_SYMBOL_GPL(css_next_descendant_post);
 
 /**
- * cgroup_advance_task_iter - advance a task itererator to the next css_set
+ * css_advance_task_iter - advance a task itererator to the next css_set
  * @it: the iterator to advance
  *
  * Advance @it to the next css_set to walk.
  */
-static void cgroup_advance_task_iter(struct cgroup_task_iter *it)
+static void css_advance_task_iter(struct css_task_iter *it)
 {
 	struct list_head *l = it->cset_link;
 	struct cgrp_cset_link *link;
@@ -3211,7 +3211,7 @@ static void cgroup_advance_task_iter(struct cgroup_task_iter *it)
 	/* Advance to the next non-empty css_set */
 	do {
 		l = l->next;
-		if (l == &it->origin_cgrp->cset_links) {
+		if (l == &it->origin_css->cgroup->cset_links) {
 			it->cset_link = NULL;
 			return;
 		}
@@ -3223,47 +3223,48 @@ static void cgroup_advance_task_iter(struct cgroup_task_iter *it)
 }
 
 /**
- * cgroup_task_iter_start - initiate task iteration
- * @cgrp: the cgroup to walk tasks of
+ * css_task_iter_start - initiate task iteration
+ * @css: the css to walk tasks of
  * @it: the task iterator to use
  *
- * Initiate iteration through the tasks of @cgrp.  The caller can call
- * cgroup_task_iter_next() to walk through the tasks until the function
- * returns NULL.  On completion of iteration, cgroup_task_iter_end() must
- * be called.
+ * Initiate iteration through the tasks of @css.  The caller can call
+ * css_task_iter_next() to walk through the tasks until the function
+ * returns NULL.  On completion of iteration, css_task_iter_end() must be
+ * called.
  *
  * Note that this function acquires a lock which is released when the
  * iteration finishes.  The caller can't sleep while iteration is in
  * progress.
  */
-void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it)
+void css_task_iter_start(struct cgroup_subsys_state *css,
+			 struct css_task_iter *it)
 	__acquires(css_set_lock)
 {
 	/*
-	 * The first time anyone tries to iterate across a cgroup,
-	 * we need to enable the list linking each css_set to its
-	 * tasks, and fix up all existing tasks.
+	 * The first time anyone tries to iterate across a css, we need to
+	 * enable the list linking each css_set to its tasks, and fix up
+	 * all existing tasks.
 	 */
 	if (!use_task_css_set_links)
 		cgroup_enable_task_cg_lists();
 
 	read_lock(&css_set_lock);
 
-	it->origin_cgrp = cgrp;
-	it->cset_link = &cgrp->cset_links;
+	it->origin_css = css;
+	it->cset_link = &css->cgroup->cset_links;
 
-	cgroup_advance_task_iter(it);
+	css_advance_task_iter(it);
 }
 
 /**
- * cgroup_task_iter_next - return the next task for the iterator
+ * css_task_iter_next - return the next task for the iterator
  * @it: the task iterator being iterated
  *
  * The "next" function for task iteration.  @it should have been
- * initialized via cgroup_task_iter_start().  Returns NULL when the
- * iteration reaches the end.
+ * initialized via css_task_iter_start().  Returns NULL when the iteration
+ * reaches the end.
  */
-struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it)
+struct task_struct *css_task_iter_next(struct css_task_iter *it)
 {
 	struct task_struct *res;
 	struct list_head *l = it->task;
@@ -3281,7 +3282,7 @@ struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it)
 		 * We reached the end of this task list - move on to the
 		 * next cgrp_cset_link.
 		 */
-		cgroup_advance_task_iter(it);
+		css_advance_task_iter(it);
 	} else {
 		it->task = l;
 	}
@@ -3289,12 +3290,12 @@ struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it)
 }
 
 /**
- * cgroup_task_iter_end - finish task iteration
+ * css_task_iter_end - finish task iteration
  * @it: the task iterator to finish
  *
- * Finish task iteration started by cgroup_task_iter_start().
+ * Finish task iteration started by css_task_iter_start().
  */
-void cgroup_task_iter_end(struct cgroup_task_iter *it)
+void css_task_iter_end(struct css_task_iter *it)
 	__releases(css_set_lock)
 {
 	read_unlock(&css_set_lock);
@@ -3335,24 +3336,24 @@ static inline int started_after(void *p1, void *p2)
 }
 
 /**
- * cgroup_scan_tasks - iterate though all the tasks in a cgroup
- * @cgrp: the cgroup to iterate tasks of
+ * css_scan_tasks - iterate though all the tasks in a css
+ * @css: the css to iterate tasks of
  * @test: optional test callback
  * @process: process callback
  * @data: data passed to @test and @process
  * @heap: optional pre-allocated heap used for task iteration
  *
- * Iterate through all the tasks in a cgroup, calling @test for each, and
- * if it returns %true, call @process for it also.
+ * Iterate through all the tasks in @css, calling @test for each, and if it
+ * returns %true, call @process for it also.
  *
  * @test may be NULL, meaning always true (select all tasks), which
- * effectively duplicates cgroup_task_iter_{start,next,end}() but does not
+ * effectively duplicates css_task_iter_{start,next,end}() but does not
  * lock css_set_lock for the call to @process.
  *
  * It is guaranteed that @process will act on every task that is a member
- * of @cgrp for the duration of this call.  This function may or may not
- * call @process for tasks that exit or move to a different cgroup during
- * the call, or are forked or move into the cgroup during the call.
+ * of @css for the duration of this call.  This function may or may not
+ * call @process for tasks that exit or move to a different css during the
+ * call, or are forked or move into the css during the call.
  *
  * Note that @test may be called with locks held, and may in some
  * situations be called multiple times for the same task, so it should be
@@ -3363,13 +3364,13 @@ static inline int started_after(void *p1, void *p2)
  * temporary heap will be used (allocation of which may cause this function
  * to fail).
  */
-int cgroup_scan_tasks(struct cgroup *cgrp,
-		      bool (*test)(struct task_struct *, void *),
-		      void (*process)(struct task_struct *, void *),
-		      void *data, struct ptr_heap *heap)
+int css_scan_tasks(struct cgroup_subsys_state *css,
+		   bool (*test)(struct task_struct *, void *),
+		   void (*process)(struct task_struct *, void *),
+		   void *data, struct ptr_heap *heap)
 {
 	int retval, i;
-	struct cgroup_task_iter it;
+	struct css_task_iter it;
 	struct task_struct *p, *dropped;
 	/* Never dereference latest_task, since it's not refcounted */
 	struct task_struct *latest_task = NULL;
@@ -3390,7 +3391,7 @@ int cgroup_scan_tasks(struct cgroup *cgrp,
 
  again:
 	/*
-	 * Scan tasks in the cgroup, using the @test callback to determine
+	 * Scan tasks in the css, using the @test callback to determine
 	 * which are of interest, and invoking @process callback on the
 	 * ones which need an update.  Since we don't want to hold any
 	 * locks during the task updates, gather tasks to be processed in a
@@ -3401,8 +3402,8 @@ int cgroup_scan_tasks(struct cgroup *cgrp,
 	 * guarantees forward progress and that we don't miss any tasks.
 	 */
 	heap->size = 0;
-	cgroup_task_iter_start(cgrp, &it);
-	while ((p = cgroup_task_iter_next(&it))) {
+	css_task_iter_start(css, &it);
+	while ((p = css_task_iter_next(&it))) {
 		/*
 		 * Only affect tasks that qualify per the caller's callback,
 		 * if he provided one
@@ -3435,7 +3436,7 @@ int cgroup_scan_tasks(struct cgroup *cgrp,
 		 * the heap and wasn't inserted
 		 */
 	}
-	cgroup_task_iter_end(&it);
+	css_task_iter_end(&it);
 
 	if (heap->size) {
 		for (i = 0; i < heap->size; i++) {
@@ -3478,7 +3479,8 @@ static void cgroup_transfer_one_task(struct task_struct *task, void *data)
  */
 int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
 {
-	return cgroup_scan_tasks(from, NULL, cgroup_transfer_one_task, to, NULL);
+	return css_scan_tasks(&from->dummy_css, NULL, cgroup_transfer_one_task,
+			      to, NULL);
 }
 
 /*
@@ -3632,7 +3634,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
 	pid_t *array;
 	int length;
 	int pid, n = 0; /* used for populating the array */
-	struct cgroup_task_iter it;
+	struct css_task_iter it;
 	struct task_struct *tsk;
 	struct cgroup_pidlist *l;
 
@@ -3647,8 +3649,8 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
 	if (!array)
 		return -ENOMEM;
 	/* now, populate the array */
-	cgroup_task_iter_start(cgrp, &it);
-	while ((tsk = cgroup_task_iter_next(&it))) {
+	css_task_iter_start(&cgrp->dummy_css, &it);
+	while ((tsk = css_task_iter_next(&it))) {
 		if (unlikely(n == length))
 			break;
 		/* get tgid or pid for procs or tasks file respectively */
@@ -3659,7 +3661,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
 		if (pid > 0) /* make sure to only use valid results */
 			array[n++] = pid;
 	}
-	cgroup_task_iter_end(&it);
+	css_task_iter_end(&it);
 	length = n;
 	/* now sort & (if procs) strip out duplicates */
 	sort(array, length, sizeof(pid_t), cmppid, NULL);
@@ -3693,7 +3695,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 {
 	int ret = -EINVAL;
 	struct cgroup *cgrp;
-	struct cgroup_task_iter it;
+	struct css_task_iter it;
 	struct task_struct *tsk;
 
 	/*
@@ -3707,8 +3709,8 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 	ret = 0;
 	cgrp = dentry->d_fsdata;
 
-	cgroup_task_iter_start(cgrp, &it);
-	while ((tsk = cgroup_task_iter_next(&it))) {
+	css_task_iter_start(&cgrp->dummy_css, &it);
+	while ((tsk = css_task_iter_next(&it))) {
 		switch (tsk->state) {
 		case TASK_RUNNING:
 			stats->nr_running++;
@@ -3728,7 +3730,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
 			break;
 		}
 	}
-	cgroup_task_iter_end(&it);
+	css_task_iter_end(&it);
 
 err:
 	return ret;
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index e0ab9bf..5cd2b6d 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -258,7 +258,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 {
 	struct freezer *freezer = css_freezer(css);
 	struct cgroup_subsys_state *pos;
-	struct cgroup_task_iter it;
+	struct css_task_iter it;
 	struct task_struct *task;
 
 	WARN_ON_ONCE(!rcu_read_lock_held());
@@ -279,9 +279,9 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 	}
 
 	/* are all tasks frozen? */
-	cgroup_task_iter_start(css->cgroup, &it);
+	css_task_iter_start(css, &it);
 
-	while ((task = cgroup_task_iter_next(&it))) {
+	while ((task = css_task_iter_next(&it))) {
 		if (freezing(task)) {
 			/*
 			 * freezer_should_skip() indicates that the task
@@ -296,7 +296,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
 
 	freezer->state |= CGROUP_FROZEN;
 out_iter_end:
-	cgroup_task_iter_end(&it);
+	css_task_iter_end(&it);
 out_unlock:
 	spin_unlock_irq(&freezer->lock);
 }
@@ -322,26 +322,24 @@ static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
 
 static void freeze_cgroup(struct freezer *freezer)
 {
-	struct cgroup *cgroup = freezer->css.cgroup;
-	struct cgroup_task_iter it;
+	struct css_task_iter it;
 	struct task_struct *task;
 
-	cgroup_task_iter_start(cgroup, &it);
-	while ((task = cgroup_task_iter_next(&it)))
+	css_task_iter_start(&freezer->css, &it);
+	while ((task = css_task_iter_next(&it)))
 		freeze_task(task);
-	cgroup_task_iter_end(&it);
+	css_task_iter_end(&it);
 }
 
 static void unfreeze_cgroup(struct freezer *freezer)
 {
-	struct cgroup *cgroup = freezer->css.cgroup;
-	struct cgroup_task_iter it;
+	struct css_task_iter it;
 	struct task_struct *task;
 
-	cgroup_task_iter_start(cgroup, &it);
-	while ((task = cgroup_task_iter_next(&it)))
+	css_task_iter_start(&freezer->css, &it);
+	while ((task = css_task_iter_next(&it)))
 		__thaw_task(task);
-	cgroup_task_iter_end(&it);
+	css_task_iter_end(&it);
 }
 
 /**
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 6fe23f2..39e5217 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -832,8 +832,8 @@ static struct cpuset *effective_nodemask_cpuset(struct cpuset *cs)
  * @tsk: task to test
  * @data: cpuset to @tsk belongs to
  *
- * Called by cgroup_scan_tasks() for each task in a cgroup whose
- * cpus_allowed mask needs to be changed.
+ * Called by css_scan_tasks() for each task in a cgroup whose cpus_allowed
+ * mask needs to be changed.
  *
  * We don't need to re-check for the cgroup/cpuset membership, since we're
  * holding cpuset_mutex at this point.
@@ -849,27 +849,26 @@ static void cpuset_change_cpumask(struct task_struct *tsk, void *data)
 /**
  * update_tasks_cpumask - Update the cpumasks of tasks in the cpuset.
  * @cs: the cpuset in which each task's cpus_allowed mask needs to be changed
- * @heap: if NULL, defer allocating heap memory to cgroup_scan_tasks()
+ * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
  *
  * Called with cpuset_mutex held
  *
- * The cgroup_scan_tasks() function will scan all the tasks in a cgroup,
+ * The css_scan_tasks() function will scan all the tasks in a cgroup,
  * calling callback functions for each.
  *
- * No return value. It's guaranteed that cgroup_scan_tasks() always returns 0
+ * No return value. It's guaranteed that css_scan_tasks() always returns 0
  * if @heap != NULL.
  */
 static void update_tasks_cpumask(struct cpuset *cs, struct ptr_heap *heap)
 {
-	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_cpumask, cs,
-			  heap);
+	css_scan_tasks(&cs->css, NULL, cpuset_change_cpumask, cs, heap);
 }
 
 /*
  * update_tasks_cpumask_hier - Update the cpumasks of tasks in the hierarchy.
  * @root_cs: the root cpuset of the hierarchy
  * @update_root: update root cpuset or not?
- * @heap: the heap used by cgroup_scan_tasks()
+ * @heap: the heap used by css_scan_tasks()
  *
  * This will update cpumasks of tasks in @root_cs and all other empty cpusets
  * which take on cpumask of @root_cs.
@@ -1082,11 +1081,10 @@ static void *cpuset_being_rebound;
 /**
  * update_tasks_nodemask - Update the nodemasks of tasks in the cpuset.
  * @cs: the cpuset in which each task's mems_allowed mask needs to be changed
- * @heap: if NULL, defer allocating heap memory to cgroup_scan_tasks()
+ * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
  *
- * Called with cpuset_mutex held
- * No return value. It's guaranteed that cgroup_scan_tasks() always returns 0
- * if @heap != NULL.
+ * Called with cpuset_mutex held.  No return value. It's guaranteed that
+ * css_scan_tasks() always returns 0 if @heap != NULL.
  */
 static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
 {
@@ -1109,8 +1107,7 @@ static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
 	 * It's ok if we rebind the same mm twice; mpol_rebind_mm()
 	 * is idempotent.  Also migrate pages in each mm to new nodes.
 	 */
-	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_nodemask, &arg,
-			  heap);
+	css_scan_tasks(&cs->css, NULL, cpuset_change_nodemask, &arg, heap);
 
 	/*
 	 * All the tasks' nodemasks have been updated, update
@@ -1126,7 +1123,7 @@ static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
  * update_tasks_nodemask_hier - Update the nodemasks of tasks in the hierarchy.
  * @cs: the root cpuset of the hierarchy
  * @update_root: update the root cpuset or not?
- * @heap: the heap used by cgroup_scan_tasks()
+ * @heap: the heap used by css_scan_tasks()
  *
  * This will update nodemasks of tasks in @root_cs and all other empty cpusets
  * which take on nodemask of @root_cs.
@@ -1254,12 +1251,12 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val)
 	return 0;
 }
 
-/*
+/**
  * cpuset_change_flag - make a task's spread flags the same as its cpuset's
  * @tsk: task to be updated
  * @data: cpuset to @tsk belongs to
  *
- * Called by cgroup_scan_tasks() for each task in a cgroup.
+ * Called by css_scan_tasks() for each task in a cgroup.
  *
  * We don't need to re-check for the cgroup/cpuset membership, since we're
  * holding cpuset_mutex at this point.
@@ -1271,22 +1268,22 @@ static void cpuset_change_flag(struct task_struct *tsk, void *data)
 	cpuset_update_task_spread_flag(cs, tsk);
 }
 
-/*
+/**
  * update_tasks_flags - update the spread flags of tasks in the cpuset.
  * @cs: the cpuset in which each task's spread flags needs to be changed
- * @heap: if NULL, defer allocating heap memory to cgroup_scan_tasks()
+ * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
  *
  * Called with cpuset_mutex held
  *
- * The cgroup_scan_tasks() function will scan all the tasks in a cgroup,
+ * The css_scan_tasks() function will scan all the tasks in a cgroup,
  * calling callback functions for each.
  *
- * No return value. It's guaranteed that cgroup_scan_tasks() always returns 0
+ * No return value. It's guaranteed that css_scan_tasks() always returns 0
  * if @heap != NULL.
  */
 static void update_tasks_flags(struct cpuset *cs, struct ptr_heap *heap)
 {
-	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_flag, cs, heap);
+	css_scan_tasks(&cs->css, NULL, cpuset_change_flag, cs, heap);
 }
 
 /*
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5a5f4dc..95106a9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1799,12 +1799,11 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL);
 	totalpages = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT ? : 1;
 	for_each_mem_cgroup_tree(iter, memcg) {
-		struct cgroup *cgroup = iter->css.cgroup;
-		struct cgroup_task_iter it;
+		struct css_task_iter it;
 		struct task_struct *task;
 
-		cgroup_task_iter_start(cgroup, &it);
-		while ((task = cgroup_task_iter_next(&it))) {
+		css_task_iter_start(&iter->css, &it);
+		while ((task = css_task_iter_next(&it))) {
 			switch (oom_scan_process_thread(task, totalpages, NULL,
 							false)) {
 			case OOM_SCAN_SELECT:
@@ -1817,7 +1816,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 			case OOM_SCAN_CONTINUE:
 				continue;
 			case OOM_SCAN_ABORT:
-				cgroup_task_iter_end(&it);
+				css_task_iter_end(&it);
 				mem_cgroup_iter_break(memcg, iter);
 				if (chosen)
 					put_task_struct(chosen);
@@ -1834,7 +1833,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 				get_task_struct(chosen);
 			}
 		}
-		cgroup_task_iter_end(&it);
+		css_task_iter_end(&it);
 	}
 
 	if (!chosen)
-- 
1.8.3.1
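
Every call site converted above follows the same start/next/end shape,
with only the handle type changing.  A rough userspace sketch of that
shape follows.  Everything prefixed mock_ is a stand-in invented for
illustration and is not the kernel's definition; the real iterator also
has locking that this model leaves out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins only; none of these are the kernel's types. */
struct mock_task {
	int pid;
	int frozen;
	struct mock_task *next;
};

struct mock_css {			/* models cgroup_subsys_state */
	struct mock_task *tasks;
};

struct mock_css_task_iter {		/* models css_task_iter */
	struct mock_task *pos;
};

static void mock_css_task_iter_start(struct mock_css *css,
				     struct mock_css_task_iter *it)
{
	assert(css != NULL);
	it->pos = css->tasks;
}

static struct mock_task *mock_css_task_iter_next(struct mock_css_task_iter *it)
{
	struct mock_task *task = it->pos;

	if (task)
		it->pos = task->next;
	return task;
}

static void mock_css_task_iter_end(struct mock_css_task_iter *it)
{
	it->pos = NULL;
}

/* Same shape as freeze_cgroup(): visit every task of a single css. */
static int mock_freeze_all(struct mock_css *css)
{
	struct mock_css_task_iter it;
	struct mock_task *task;
	int nr = 0;

	mock_css_task_iter_start(css, &it);
	while ((task = mock_css_task_iter_next(&it))) {
		task->frozen = 1;
		nr++;
	}
	mock_css_task_iter_end(&it);
	return nr;
}
```

The caller hands over the css it already owns (&freezer->css in the
freezer case) rather than first walking back to the cgroup.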



* [PATCH 21/23] cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (19 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 20/23] cgroup: make task iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
@ 2013-08-01 21:49 ` Tejun Heo
  2013-08-02  4:08   ` Li Zefan
                     ` (2 more replies)
  2013-08-01 21:50 ` [PATCH 22/23] cgroup: make cgroup_taskset " Tejun Heo
                   ` (3 subsequent siblings)
  24 siblings, 3 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:49 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Johannes Weiner,
	Michal Hocko, Balbir Singh

cgroup is in the process of converting to css (cgroup_subsys_state)
from cgroup as the principal subsystem interface handle.  This is
mostly to prepare for the unified hierarchy support where css's will
be created and destroyed dynamically but also helps clean up
subsystem implementations as css is usually what they are interested
in anyway.

cftype->[un]register_event() is one of the couple of remaining interfaces
which still use struct cgroup.  Convert it to cgroup_subsys_state.
The conversion is mostly mechanical and removes the last users of
mem_cgroup_from_cont() and cg_to_vmpressure(), which are removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
---
 include/linux/cgroup.h     |  8 +++++---
 include/linux/vmpressure.h |  6 ++++--
 kernel/cgroup.c            | 15 ++++++++-------
 mm/memcontrol.c            | 21 ++++++++-------------
 mm/vmpressure.c            | 21 +++++++++------------
 5 files changed, 34 insertions(+), 37 deletions(-)
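
For illustration only, a rough userspace model of what the conversion
means for a callback: it receives the css and reaches its own state
with container_of(), instead of receiving the cgroup and looking the
css up by subsystem id.  All mock_ names are hypothetical stand-ins and
the signatures are simplified (no cftype or eventfd arguments).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace version of the usual container_of() idiom. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_css {			/* models cgroup_subsys_state */
	int subsys_id;
};

struct mock_memcg {			/* models a subsystem's private state */
	int nr_events;
	struct mock_css css;		/* embedded, as in struct mem_cgroup */
};

/* Simplified callback type: the handle is the css, not the cgroup. */
typedef int (*mock_register_event_fn)(struct mock_css *css, const char *args);

static struct mock_memcg *mock_memcg_from_css(struct mock_css *css)
{
	return css ? mock_container_of(css, struct mock_memcg, css) : NULL;
}

static int mock_usage_register_event(struct mock_css *css, const char *args)
{
	struct mock_memcg *memcg = mock_memcg_from_css(css);

	if (!memcg || !args)
		return -EINVAL;

	/* The round trip css -> memcg -> css must land on the same object. */
	assert(&memcg->css == css);
	memcg->nr_events++;
	return 0;
}
```

With the css as the handle, no per-subsystem cgroup-to-state helper is
needed, which is why mem_cgroup_from_cont() and cg_to_vmpressure() can
go away.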

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 6f6d87b..8f44411 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -506,15 +506,17 @@ struct cftype {
 	 * you want to provide this functionality. Use eventfd_signal()
 	 * on eventfd to send notification to userspace.
 	 */
-	int (*register_event)(struct cgroup *cgrp, struct cftype *cft,
-			struct eventfd_ctx *eventfd, const char *args);
+	int (*register_event)(struct cgroup_subsys_state *css,
+			      struct cftype *cft, struct eventfd_ctx *eventfd,
+			      const char *args);
 	/*
 	 * unregister_event() callback will be called when userspace
 	 * closes the eventfd or on cgroup removing.
 	 * This callback must be implemented, if you want provide
 	 * notification functionality.
 	 */
-	void (*unregister_event)(struct cgroup *cgrp, struct cftype *cft,
+	void (*unregister_event)(struct cgroup_subsys_state *css,
+				 struct cftype *cft,
 			struct eventfd_ctx *eventfd);
 };
 
diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h
index 76be077..b239482 100644
--- a/include/linux/vmpressure.h
+++ b/include/linux/vmpressure.h
@@ -33,10 +33,12 @@ extern void vmpressure_init(struct vmpressure *vmpr);
 extern struct vmpressure *memcg_to_vmpressure(struct mem_cgroup *memcg);
 extern struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr);
 extern struct vmpressure *css_to_vmpressure(struct cgroup_subsys_state *css);
-extern int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
+extern int vmpressure_register_event(struct cgroup_subsys_state *css,
+				     struct cftype *cft,
 				     struct eventfd_ctx *eventfd,
 				     const char *args);
-extern void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
+extern void vmpressure_unregister_event(struct cgroup_subsys_state *css,
+					struct cftype *cft,
 					struct eventfd_ctx *eventfd);
 #else
 static inline void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index c61b24f..e0ef58e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -159,9 +159,9 @@ struct css_id {
  */
 struct cgroup_event {
 	/*
-	 * Cgroup which the event belongs to.
+	 * css which the event belongs to.
 	 */
-	struct cgroup *cgrp;
+	struct cgroup_subsys_state *css;
 	/*
 	 * Control file which the event associated.
 	 */
@@ -3948,11 +3948,12 @@ static void cgroup_event_remove(struct work_struct *work)
 {
 	struct cgroup_event *event = container_of(work, struct cgroup_event,
 			remove);
-	struct cgroup *cgrp = event->cgrp;
+	struct cgroup_subsys_state *css = event->css;
+	struct cgroup *cgrp = css->cgroup;
 
 	remove_wait_queue(event->wqh, &event->wait);
 
-	event->cft->unregister_event(cgrp, event->cft, event->eventfd);
+	event->cft->unregister_event(css, event->cft, event->eventfd);
 
 	/* Notify userspace the event is going away. */
 	eventfd_signal(event->eventfd, 1);
@@ -3972,7 +3973,7 @@ static int cgroup_event_wake(wait_queue_t *wait, unsigned mode,
 {
 	struct cgroup_event *event = container_of(wait,
 			struct cgroup_event, wait);
-	struct cgroup *cgrp = event->cgrp;
+	struct cgroup *cgrp = event->css->cgroup;
 	unsigned long flags = (unsigned long)key;
 
 	if (flags & POLLHUP) {
@@ -4041,7 +4042,7 @@ static int cgroup_write_event_control(struct cgroup_subsys_state *css,
 	event = kzalloc(sizeof(*event), GFP_KERNEL);
 	if (!event)
 		return -ENOMEM;
-	event->cgrp = cgrp;
+	event->css = css;
 	INIT_LIST_HEAD(&event->list);
 	init_poll_funcptr(&event->pt, cgroup_event_ptable_queue_proc);
 	init_waitqueue_func_entry(&event->wait, cgroup_event_wake);
@@ -4092,7 +4093,7 @@ static int cgroup_write_event_control(struct cgroup_subsys_state *css,
 		goto out_put_cfile;
 	}
 
-	ret = event->cft->register_event(cgrp, event->cft,
+	ret = event->cft->register_event(css, event->cft,
 			event->eventfd, buffer);
 	if (ret)
 		goto out_put_cfile;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 95106a9..2885e3e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1034,11 +1034,6 @@ static void memcg_check_events(struct mem_cgroup *memcg, struct page *page)
 		preempt_enable();
 }
 
-static inline struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont)
-{
-	return mem_cgroup_from_css(cgroup_css(cont, mem_cgroup_subsys_id));
-}
-
 struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
 {
 	/*
@@ -5620,10 +5615,10 @@ static void mem_cgroup_oom_notify(struct mem_cgroup *memcg)
 		mem_cgroup_oom_notify_cb(iter);
 }
 
-static int mem_cgroup_usage_register_event(struct cgroup *cgrp,
+static int mem_cgroup_usage_register_event(struct cgroup_subsys_state *css,
 	struct cftype *cft, struct eventfd_ctx *eventfd, const char *args)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_thresholds *thresholds;
 	struct mem_cgroup_threshold_ary *new;
 	enum res_type type = MEMFILE_TYPE(cft->private);
@@ -5703,10 +5698,10 @@ unlock:
 	return ret;
 }
 
-static void mem_cgroup_usage_unregister_event(struct cgroup *cgrp,
+static void mem_cgroup_usage_unregister_event(struct cgroup_subsys_state *css,
 	struct cftype *cft, struct eventfd_ctx *eventfd)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_thresholds *thresholds;
 	struct mem_cgroup_threshold_ary *new;
 	enum res_type type = MEMFILE_TYPE(cft->private);
@@ -5782,10 +5777,10 @@ unlock:
 	mutex_unlock(&memcg->thresholds_lock);
 }
 
-static int mem_cgroup_oom_register_event(struct cgroup *cgrp,
+static int mem_cgroup_oom_register_event(struct cgroup_subsys_state *css,
 	struct cftype *cft, struct eventfd_ctx *eventfd, const char *args)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_eventfd_list *event;
 	enum res_type type = MEMFILE_TYPE(cft->private);
 
@@ -5807,10 +5802,10 @@ static int mem_cgroup_oom_register_event(struct cgroup *cgrp,
 	return 0;
 }
 
-static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp,
+static void mem_cgroup_oom_unregister_event(struct cgroup_subsys_state *css,
 	struct cftype *cft, struct eventfd_ctx *eventfd)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_eventfd_list *ev, *tmp;
 	enum res_type type = MEMFILE_TYPE(cft->private);
 
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 2a8a736..13489b1 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -74,11 +74,6 @@ static struct vmpressure *work_to_vmpressure(struct work_struct *work)
 	return container_of(work, struct vmpressure, work);
 }
 
-static struct vmpressure *cg_to_vmpressure(struct cgroup *cg)
-{
-	return css_to_vmpressure(cgroup_css(cg, mem_cgroup_subsys_id));
-}
-
 static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr)
 {
 	struct cgroup_subsys_state *css = vmpressure_to_css(vmpr);
@@ -283,7 +278,7 @@ void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio)
 
 /**
  * vmpressure_register_event() - Bind vmpressure notifications to an eventfd
- * @cg:		cgroup that is interested in vmpressure notifications
+ * @css:	css that is interested in vmpressure notifications
  * @cft:	cgroup control files handle
  * @eventfd:	eventfd context to link notifications with
  * @args:	event arguments (used to set up a pressure level threshold)
@@ -298,10 +293,11 @@ void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio)
  * cftype).register_event, and then cgroup core will handle everything by
  * itself.
  */
-int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
-			      struct eventfd_ctx *eventfd, const char *args)
+int vmpressure_register_event(struct cgroup_subsys_state *css,
+			      struct cftype *cft, struct eventfd_ctx *eventfd,
+			      const char *args)
 {
-	struct vmpressure *vmpr = cg_to_vmpressure(cg);
+	struct vmpressure *vmpr = css_to_vmpressure(css);
 	struct vmpressure_event *ev;
 	int level;
 
@@ -329,7 +325,7 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
 
 /**
  * vmpressure_unregister_event() - Unbind eventfd from vmpressure
- * @cg:		cgroup handle
+ * @css:	css handle
  * @cft:	cgroup control files handle
  * @eventfd:	eventfd context that was used to link vmpressure with the @cg
  *
@@ -341,10 +337,11 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
  * cftype).unregister_event, and then cgroup core will handle everything
  * by itself.
  */
-void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
+void vmpressure_unregister_event(struct cgroup_subsys_state *css,
+				 struct cftype *cft,
 				 struct eventfd_ctx *eventfd)
 {
-	struct vmpressure *vmpr = cg_to_vmpressure(cg);
+	struct vmpressure *vmpr = css_to_vmpressure(css);
 	struct vmpressure_event *ev;
 
 	mutex_lock(&vmpr->events_lock);
-- 
1.8.3.1



* [PATCH 22/23] cgroup: make cgroup_taskset deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (20 preceding siblings ...)
  2013-08-01 21:49 ` [PATCH 21/23] cgroup: make cftype->[un]register_event() " Tejun Heo
@ 2013-08-01 21:50 ` Tejun Heo
  2013-08-06  6:53   ` Daniel Wagner
  2013-08-01 21:50 ` [PATCH 23/23] cgroup: unexport cgroup_css() Tejun Heo
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:50 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, Ingo Molnar,
	Matt Helsley, Daniel Wagner, Steven Rostedt

cgroup is in the process of converting to css (cgroup_subsys_state)
from cgroup as the principal subsystem interface handle.  This is
mostly to prepare for the unified hierarchy support where css's will
be created and destroyed dynamically but also helps clean up
subsystem implementations as css is usually what they are interested
in anyway.

cgroup_taskset, which is used by the subsystem attach methods, is the
last cgroup subsystem API which isn't using css as the handle.  Update
cgroup_taskset_cur_cgroup() to cgroup_taskset_cur_css() and
cgroup_taskset_for_each() to take @skip_css instead of @skip_cgrp.

The conversions are pretty mechanical.  One exception is
cpuset::cgroup_cs(), which lost its last user and got removed.

This patch shouldn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 block/blk-cgroup.c        |  2 +-
 include/linux/cgroup.h    | 12 +++++++-----
 kernel/cgroup.c           | 16 +++++++++-------
 kernel/cgroup_freezer.c   |  2 +-
 kernel/cpuset.c           | 15 +++++----------
 kernel/events/core.c      |  2 +-
 kernel/sched/core.c       |  4 ++--
 net/core/netprio_cgroup.c |  2 +-
 net/sched/cls_cgroup.c    |  2 +-
 9 files changed, 28 insertions(+), 29 deletions(-)
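
To make the @skip_css semantics concrete, here is a rough userspace
model of the iteration macro.  The mock_ types are invented for
illustration; in particular the model stores subsys_id directly in the
css, whereas the real macro goes through (skip_css)->ss->subsys_id.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_SUBSYS_COUNT 2

struct mock_css { int subsys_id; };
struct mock_cgroup { struct mock_css *subsys[MOCK_SUBSYS_COUNT]; };
struct mock_task { int pid; };

/* One migrating task together with the cgroup it is coming from. */
struct mock_entry { struct mock_task *task; struct mock_cgroup *cgrp; };
struct mock_taskset { struct mock_entry *entries; int len; int idx; };

static struct mock_task *mock_taskset_first(struct mock_taskset *tset)
{
	tset->idx = 0;
	return tset->len > 0 ? tset->entries[0].task : NULL;
}

static struct mock_task *mock_taskset_next(struct mock_taskset *tset)
{
	if (tset->idx + 1 >= tset->len)
		return NULL;
	return tset->entries[++tset->idx].task;
}

/* css of the current (last returned) task for the given subsystem. */
static struct mock_css *mock_taskset_cur_css(struct mock_taskset *tset,
					     int subsys_id)
{
	assert(tset->idx < tset->len);
	return tset->entries[tset->idx].cgrp->subsys[subsys_id];
}

/* Skip tasks whose current css already equals @skip_css; NULL visits all. */
#define mock_taskset_for_each(task, skip_css, tset)			\
	for ((task) = mock_taskset_first((tset)); (task);		\
	     (task) = mock_taskset_next((tset)))			\
		if (!(skip_css) ||					\
		    mock_taskset_cur_css((tset),			\
			(skip_css)->subsys_id) != (skip_css))

/* Count the tasks an attach callback for @dst would actually visit. */
static int mock_count_visited(struct mock_css *dst, struct mock_taskset *tset)
{
	struct mock_task *task;
	int nr = 0;

	mock_taskset_for_each(task, dst, tset)
		nr++;
	return nr;
}
```

Passing the destination css, as the converted callers do, skips tasks
that are already associated with that css for the subsystem in
question.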

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 4b40640..54ad002 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -891,7 +891,7 @@ static int blkcg_can_attach(struct cgroup_subsys_state *css,
 	int ret = 0;
 
 	/* task_lock() is needed to avoid races with exit_io_context() */
-	cgroup_taskset_for_each(task, css->cgroup, tset) {
+	cgroup_taskset_for_each(task, css, tset) {
 		task_lock(task);
 		ioc = task->io_context;
 		if (ioc && atomic_read(&ioc->nr_tasks) > 1)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 8f44411..28e21f9 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -563,20 +563,22 @@ int cgroup_task_count(const struct cgroup *cgrp);
 struct cgroup_taskset;
 struct task_struct *cgroup_taskset_first(struct cgroup_taskset *tset);
 struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset);
-struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset);
+struct cgroup_subsys_state *cgroup_taskset_cur_css(struct cgroup_taskset *tset,
+						   int subsys_id);
 int cgroup_taskset_size(struct cgroup_taskset *tset);
 
 /**
  * cgroup_taskset_for_each - iterate cgroup_taskset
  * @task: the loop cursor
- * @skip_cgrp: skip if task's cgroup matches this, %NULL to iterate through all
+ * @skip_css: skip if task's css matches this, %NULL to iterate through all
  * @tset: taskset to iterate
  */
-#define cgroup_taskset_for_each(task, skip_cgrp, tset)			\
+#define cgroup_taskset_for_each(task, skip_css, tset)			\
 	for ((task) = cgroup_taskset_first((tset)); (task);		\
 	     (task) = cgroup_taskset_next((tset)))			\
-		if (!(skip_cgrp) ||					\
-		    cgroup_taskset_cur_cgroup((tset)) != (skip_cgrp))
+		if (!(skip_css) ||					\
+		    cgroup_taskset_cur_css((tset),			\
+			(skip_css)->ss->subsys_id) != (skip_css))
 
 /*
  * Control Group subsystem type.
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index e0ef58e..ead0088 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1900,18 +1900,20 @@ struct task_struct *cgroup_taskset_next(struct cgroup_taskset *tset)
 EXPORT_SYMBOL_GPL(cgroup_taskset_next);
 
 /**
- * cgroup_taskset_cur_cgroup - return the matching cgroup for the current task
+ * cgroup_taskset_cur_css - return the matching css for the current task
  * @tset: taskset of interest
+ * @subsys_id: the ID of the target subsystem
  *
- * Return the cgroup for the current (last returned) task of @tset.  This
- * function must be preceded by either cgroup_taskset_first() or
- * cgroup_taskset_next().
+ * Return the css for the current (last returned) task of @tset for
+ * subsystem specified by @subsys_id.  This function must be preceded by
+ * either cgroup_taskset_first() or cgroup_taskset_next().
  */
-struct cgroup *cgroup_taskset_cur_cgroup(struct cgroup_taskset *tset)
+struct cgroup_subsys_state *cgroup_taskset_cur_css(struct cgroup_taskset *tset,
+						   int subsys_id)
 {
-	return tset->cur_cgrp;
+	return cgroup_css(tset->cur_cgrp, subsys_id);
 }
-EXPORT_SYMBOL_GPL(cgroup_taskset_cur_cgroup);
+EXPORT_SYMBOL_GPL(cgroup_taskset_cur_css);
 
 /**
  * cgroup_taskset_size - return the number of tasks in taskset
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index 5cd2b6d..224da9a 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -189,7 +189,7 @@ static void freezer_attach(struct cgroup_subsys_state *new_css,
 	 * current state before executing the following - !frozen tasks may
 	 * be visible in a FROZEN cgroup and frozen tasks in a THAWED one.
 	 */
-	cgroup_taskset_for_each(task, new_css->cgroup, tset) {
+	cgroup_taskset_for_each(task, new_css, tset) {
 		if (!(freezer->state & CGROUP_FREEZING)) {
 			__thaw_task(task);
 		} else {
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 39e5217..bf69717 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -119,12 +119,6 @@ static inline struct cpuset *css_cs(struct cgroup_subsys_state *css)
 	return css ? container_of(css, struct cpuset, css) : NULL;
 }
 
-/* Retrieve the cpuset for a cgroup */
-static inline struct cpuset *cgroup_cs(struct cgroup *cgrp)
-{
-	return css_cs(cgroup_css(cgrp, cpuset_subsys_id));
-}
-
 /* Retrieve the cpuset for a task */
 static inline struct cpuset *task_cs(struct task_struct *task)
 {
@@ -1459,7 +1453,7 @@ static int cpuset_can_attach(struct cgroup_subsys_state *css,
 	    (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)))
 		goto out_unlock;
 
-	cgroup_taskset_for_each(task, css->cgroup, tset) {
+	cgroup_taskset_for_each(task, css, tset) {
 		/*
 		 * Kthreads which disallow setaffinity shouldn't be moved
 		 * to a new cpuset; we don't want to change their cpu
@@ -1511,9 +1505,10 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
 	struct mm_struct *mm;
 	struct task_struct *task;
 	struct task_struct *leader = cgroup_taskset_first(tset);
-	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
+	struct cgroup_subsys_state *oldcss = cgroup_taskset_cur_css(tset,
+							cpuset_subsys_id);
 	struct cpuset *cs = css_cs(css);
-	struct cpuset *oldcs = cgroup_cs(oldcgrp);
+	struct cpuset *oldcs = css_cs(oldcss);
 	struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
 	struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
 
@@ -1527,7 +1522,7 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
 
 	guarantee_online_mems(mems_cs, &cpuset_attach_nodemask_to);
 
-	cgroup_taskset_for_each(task, css->cgroup, tset) {
+	cgroup_taskset_for_each(task, css, tset) {
 		/*
 		 * can_attach beforehand should guarantee that this doesn't
 		 * fail.  TODO: have a better way to handle failure here
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9705a0e..c199c4f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7816,7 +7816,7 @@ static void perf_cgroup_attach(struct cgroup_subsys_state *css,
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, css->cgroup, tset)
+	cgroup_taskset_for_each(task, css, tset)
 		task_function_call(task, __perf_cgroup_move, task);
 }
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cc9a492..a7122d5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7135,7 +7135,7 @@ static int cpu_cgroup_can_attach(struct cgroup_subsys_state *css,
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, css->cgroup, tset) {
+	cgroup_taskset_for_each(task, css, tset) {
 #ifdef CONFIG_RT_GROUP_SCHED
 		if (!sched_rt_can_attach(css_tg(css), task))
 			return -EINVAL;
@@ -7153,7 +7153,7 @@ static void cpu_cgroup_attach(struct cgroup_subsys_state *css,
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, css->cgroup, tset)
+	cgroup_taskset_for_each(task, css, tset)
 		sched_move_task(task);
 }
 
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index e00f60e..d9cd627 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -224,7 +224,7 @@ static void net_prio_attach(struct cgroup_subsys_state *css,
 	struct task_struct *p;
 	void *v;
 
-	cgroup_taskset_for_each(p, css->cgroup, tset) {
+	cgroup_taskset_for_each(p, css, tset) {
 		task_lock(p);
 		v = (void *)(unsigned long)task_netprioidx(p);
 		iterate_fd(p->files, 0, update_netprio, v);
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 8ea1184..867b4a3 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -74,7 +74,7 @@ static void cgrp_attach(struct cgroup_subsys_state *css,
 	struct task_struct *p;
 	void *v;
 
-	cgroup_taskset_for_each(p, css->cgroup, tset) {
+	cgroup_taskset_for_each(p, css, tset) {
 		task_lock(p);
 		v = (void *)(unsigned long)task_cls_classid(p);
 		iterate_fd(p->files, 0, update_classid, v);
-- 
1.8.3.1



* [PATCH 23/23] cgroup: unexport cgroup_css()
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (21 preceding siblings ...)
  2013-08-01 21:50 ` [PATCH 22/23] cgroup: make cgroup_taskset " Tejun Heo
@ 2013-08-01 21:50 ` Tejun Heo
  2013-08-02  3:24 ` [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Li Zefan
  2013-08-09  0:12 ` Tejun Heo
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-01 21:50 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel, Tejun Heo

cgroup_css() no longer has any users left outside cgroup.c proper and
we don't want subsystems to grow new usages of the function.  cgroup
core should always provide the css to use to the subsystems, which
will make dynamic creation and destruction of css's across the
lifetime of a cgroup much more manageable than exposing the cgroup
directly to subsystems and letting them dereference css's from it.

Make cgroup_css() a static function in cgroup.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h | 13 -------------
 kernel/cgroup.c        | 13 +++++++++++++
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 28e21f9..1d81d25 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -679,19 +679,6 @@ struct cgroup_subsys_state *css_parent(struct cgroup_subsys_state *css)
 }
 
 /**
- * cgroup_css - obtain a cgroup's css for the specified subsystem
- * @cgrp: the cgroup of interest
- * @subsys_id: the subsystem of interest
- *
- * Return @cgrp's css (cgroup_subsys_state) associated with @subsys_id.
- */
-static inline struct cgroup_subsys_state *cgroup_css(struct cgroup *cgrp,
-						     int subsys_id)
-{
-	return cgrp->subsys[subsys_id];
-}
-
-/**
  * task_css_set_check - obtain a task's css_set with extra access conditions
  * @task: the task to obtain css_set for
  * @__c: extra condition expression to be passed to rcu_dereference_check()
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index ead0088..f8739c7 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -222,6 +222,19 @@ static int cgroup_destroy_locked(struct cgroup *cgrp);
 static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
 			      bool is_add);
 
+/**
+ * cgroup_css - obtain a cgroup's css for the specified subsystem
+ * @cgrp: the cgroup of interest
+ * @subsys_id: the subsystem of interest
+ *
+ * Return @cgrp's css (cgroup_subsys_state) associated with @subsys_id.
+ */
+static struct cgroup_subsys_state *cgroup_css(struct cgroup *cgrp,
+					      int subsys_id)
+{
+	return cgrp->subsys[subsys_id];
+}
+
 /* convenient tests for these bits */
 static inline bool cgroup_is_dead(const struct cgroup *cgrp)
 {
-- 
1.8.3.1
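
As an illustration of where the series ends up, the following is a minimal userspace sketch (not kernel code) of the calling convention: only the core maps a cgroup to a css, and the subsystem callback receives the css and recovers its own state with container_of(). The structures are simplified stand-ins, and `demo_state`, `demo_css_online()`, `core_online()` and `DEMO_SUBSYS_COUNT` are hypothetical names that do not appear in the patch.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel structures (assumptions). */
#define DEMO_SUBSYS_COUNT 2

struct cgroup;

struct cgroup_subsys_state {
	struct cgroup *cgroup;		/* back-pointer to the owning cgroup */
	unsigned long flags;
};

struct cgroup {
	struct cgroup_subsys_state *subsys[DEMO_SUBSYS_COUNT];
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Core-private lookup, mirroring the now-static cgroup_css(). */
static struct cgroup_subsys_state *cgroup_css(struct cgroup *cgrp,
					      int subsys_id)
{
	return cgrp->subsys[subsys_id];
}

/* Hypothetical subsystem state that embeds a css. */
struct demo_state {
	struct cgroup_subsys_state css;
	int online_calls;
};

/* Subsystem callback: it is handed its css and never touches the cgroup. */
int demo_css_online(struct cgroup_subsys_state *css)
{
	struct demo_state *st = container_of(css, struct demo_state, css);

	st->online_calls++;
	return 0;
}

/* Hypothetical core helper: the cgroup -> css mapping happens only here. */
int core_online(struct cgroup *cgrp, int subsys_id,
		int (*online)(struct cgroup_subsys_state *css))
{
	return online(cgroup_css(cgrp, subsys_id));
}
```

With this split, a subsystem has no reason to call cgroup_css() itself, which is what allows the function to become static in cgroup.c.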



* Re: [PATCH 03/23] netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state
  2013-08-01 21:49 ` [PATCH 03/23] netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state Tejun Heo
@ 2013-08-01 22:07   ` David Miller
  2013-08-02 11:42   ` Neil Horman
  1 sibling, 0 replies; 60+ messages in thread
From: David Miller @ 2013-08-01 22:07 UTC (permalink / raw)
  To: tj; +Cc: lizefan, containers, cgroups, linux-kernel, nhorman

From: Tejun Heo <tj@kernel.org>
Date: Thu,  1 Aug 2013 17:49:41 -0400

> cgroup controller API will be converted to primarily use struct
> cgroup_subsys_state instead of struct cgroup.  In preparation, make
> the internal functions of netprio_cgroup pass around @css instead of
> @cgrp.
> 
> While at it, kill struct cgroup_netprio_state which only contained
> struct cgroup_subsys_state without serving any purpose.  All functions
> are converted to deal with @css directly.
> 
> This patch shouldn't cause any behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>

Acked-by: David S. Miller <davem@davemloft.net>


* Re: [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (22 preceding siblings ...)
  2013-08-01 21:50 ` [PATCH 23/23] cgroup: unexport cgroup_css() Tejun Heo
@ 2013-08-02  3:24 ` Li Zefan
  2013-08-09  0:12 ` Tejun Heo
  24 siblings, 0 replies; 60+ messages in thread
From: Li Zefan @ 2013-08-02  3:24 UTC (permalink / raw)
  To: Tejun Heo; +Cc: containers, cgroups, linux-kernel

On 2013/8/2 5:49, Tejun Heo wrote:
> Hello,
> 
> Currently, struct cgroup * is used as the main interface handle
> between cgroup core and its subsystems, which works but is a bit
> clunky because subsystems usually care much more about css's
> (cgroup_subsys_state) a lot more than cgroups, which is natural as a
> css is the intersection between a cgroup and a subsystem.
> 
> In addition to being a bit clunky, dealing with cgroups directly pose
> a bit of trouble for the planned unified hierarchy support on two
> fronts.  First, most iterations become subsystem dependent as task
> membership is affected by which subtree has the specific subsystem
> enabled and thus require specifying which subsystem the iteration is
> for, which is automatically achieved if the interfaces deal with css's
> instead of cgroups.
> 
> Second, as css's may be created, attached, detached and destroyed
> dynamically multiple times across the lifetime of a given cgroup as
> they're enabled and disabled, which makes cgroup -> css mapping much
> more difficult to synchronize.  Giving out cgroup to subsystems and
> then requiring them to take the extra steps to deal with their css's
> coming and going dynamically is a lot more fragile than cgroup core
> proper handling it internally and giving out the resulting css's to
> subsystems.
> 
> So, this patchset converts all cgroup subsystem APIs to deal with
> css's instead of cgroups.  The patchset is fairly large but most of
> the conversions, while being tedious, aren't complex.  At the end of
> series, subsystems no longer make cgroup -> css mapping themselves and
> cgroup_css() - formerly cgroup_subsys_state() - is made internal to
> cgroup core proper.
> 
> This is a rather large update to the interface and likely to play as a
> barrier when porting commits, which is inconvenient but also provides
> an opportunity to clean up the API where we can as doing so won't
> significantly raise the level of inconvenience.  As such, this
> patchset contains some API cleanups and I'll probably follow up with
> further API updates that I've been meaning to do and, if you have some
> good idea to clean up cgroup internal API, this probably is a good
> time to submit it.
> 
> This patchset contains the following 23 patches.
> 
>  0001-cgroup-s-cgroup_subsys_state-cgroup_css-s-task_subsy.patch
>  0002-cpuset-drop-const-qualifiers-from-struct-cpuset-inst.patch
>  0003-netprio_cgroup-pass-around-css-instead-of-cgroup-and.patch
>  0004-hugetlb_cgroup-pass-around-hugetlb_cgroup-instead-of.patch
>  0005-cgroup-add-subsystem-pointer-to-cgroup_subsys_state.patch
>  0006-cgroup-add-update-accessors-which-obtain-subsys-spec.patch
>  0007-cgroup-add-css_parent.patch
>  0008-cgroup-pass-around-cgroup_subsys_state-instead-of-cg.patch
>  0009-cgroup-add-subsys-backlink-pointer-to-cftype.patch
>  0010-cgroup-pin-cgroup_subsys_state-when-opening-a-cgroup.patch
>  0011-cgroup-add-cgroup-dummy_css.patch
>  0012-cgroup-pass-around-cgroup_subsys_state-instead-of-cg.patch
>  0013-cgroup-convert-cgroup_next_sibling-to-cgroup_next_ch.patch
>  0014-cgroup-always-use-cgroup_next_child-to-walk-the-chil.patch
>  0015-cgroup-make-hierarchy-iterators-deal-with-cgroup_sub.patch
>  0016-cgroup-relocate-cgroup_advance_iter.patch
>  0017-cgroup-rename-cgroup_iter-to-cgroup_task_iter.patch
>  0018-cgroup-make-cgroup_task_iter-remember-the-cgroup-bei.patch
>  0019-cgroup-remove-struct-cgroup_scanner.patch
>  0020-cgroup-make-task-iterators-deal-with-cgroup_subsys_s.patch
>  0021-cgroup-make-cftype-un-register_event-deal-with-cgrou.patch
>  0022-cgroup-make-cgroup_taskset-deal-with-cgroup_subsys_s.patch
>  0023-cgroup-unexport-cgroup_css.patch
> 

Looks good to me!

Acked-by: Li Zefan <lizefan@huawei.com>



* Re: [PATCH 16/23] cgroup: relocate cgroup_advance_iter()
  2013-08-01 21:49 ` [PATCH 16/23] cgroup: relocate cgroup_advance_iter() Tejun Heo
@ 2013-08-02  3:25   ` Li Zefan
  2013-08-02 19:35     ` Tejun Heo
  0 siblings, 1 reply; 60+ messages in thread
From: Li Zefan @ 2013-08-02  3:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: containers, cgroups, linux-kernel

On 2013/8/2 5:49, Tejun Heo wrote:
> For some reason, cgroup_advance_iter() is standing lonely all away
> from its iter comrades.  Relocate it.
> 

There are some other functions in the same situation. Do you think
it's better to relocate them too, or just leave them as they are?



* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
@ 2013-08-02  3:54   ` Li Zefan
  2013-08-02 19:36     ` Tejun Heo
  2013-08-02  4:02   ` Li Zefan
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 60+ messages in thread
From: Li Zefan @ 2013-08-02  3:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: containers, cgroups, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Johannes Weiner, Michal Hocko, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

> @@ -4298,7 +4308,7 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
>  	for_each_root_subsys(root, ss) {
>  		struct cgroup_subsys_state *css;
>  
> -		css = ss->css_alloc(cgrp);
> +		css = ss->css_alloc(parent->subsys[ss->subsys_id]);

This patchset is based on the for-3.12 branch, which lacks the fix in for-3.11,
so the css_alloc() call in that bug fix is not converted.
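
For context on the conversion being discussed: after this patch, ->css_alloc() receives the parent's css (NULL when allocating for the root) instead of the new cgroup. A minimal userspace sketch of that convention follows, loosely modeled on the blkcg_css_alloc() hunk in the same patch. The types are simplified stand-ins, and `demo_state`, `demo_root`, `demo_css_alloc()` and `demo_css_free()` are hypothetical names; real kernel code would return ERR_PTR(-ENOMEM) rather than NULL on allocation failure.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel structure (assumption). */
struct cgroup_subsys_state {
	unsigned long flags;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical subsystem state; depth is inherited from the parent. */
struct demo_state {
	struct cgroup_subsys_state css;
	int depth;
};

/* Statically allocated root state, as blkcg does with blkcg_root. */
static struct demo_state demo_root;

/* New-style css_alloc(): takes the parent's css, NULL means "root". */
struct cgroup_subsys_state *
demo_css_alloc(struct cgroup_subsys_state *parent_css)
{
	struct demo_state *parent, *st;

	if (!parent_css)
		return &demo_root.css;

	parent = container_of(parent_css, struct demo_state, css);

	st = calloc(1, sizeof(*st));
	if (!st)
		return NULL;	/* kernel code would use ERR_PTR(-ENOMEM) */

	st->depth = parent->depth + 1;
	return &st->css;
}

/* New-style css_free(): takes the css being released. */
void demo_css_free(struct cgroup_subsys_state *css)
{
	struct demo_state *st = container_of(css, struct demo_state, css);

	if (st != &demo_root)
		free(st);
}
```

Any caller that still passes the new cgroup, such as the one in the unconverted bug fix, would need the same treatment as the hunk quoted above: look up the parent's css and pass that instead.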

>  		if (IS_ERR(css)) {
>  			err = PTR_ERR(css);
>  			goto err_free_all;
> @@ -4377,7 +4387,7 @@ err_free_all:
>  
>  		if (css) {
>  			percpu_ref_cancel_init(&css->refcnt);
> -			ss->css_free(cgrp);
> +			ss->css_free(css);
>  		}
>  	}
>  	mutex_unlock(&cgroup_mutex);



* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
  2013-08-02  3:54   ` Li Zefan
@ 2013-08-02  4:02   ` Li Zefan
  2013-08-02 19:41     ` Tejun Heo
  2013-08-02 13:19   ` Michal Hocko
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 60+ messages in thread
From: Li Zefan @ 2013-08-02  4:02 UTC (permalink / raw)
  To: Tejun Heo
  Cc: containers, cgroups, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Johannes Weiner, Michal Hocko, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

> @@ -4199,12 +4208,13 @@ static void init_cgroup_css(struct cgroup_subsys_state *css,
>  /* invoke ->css_online() on a new CSS and mark it online if successful */
>  static int online_css(struct cgroup_subsys *ss, struct cgroup *cgrp)
>  {
> +	struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
>  	int ret = 0;
>  
>  	lockdep_assert_held(&cgroup_mutex);
>  
>  	if (ss->css_online)
> -		ret = ss->css_online(cgrp);
> +		ret = ss->css_online(css);
>  	if (!ret)
>  		cgrp->subsys[ss->subsys_id]->flags |= CSS_ONLINE;

Then this can be changed to css->flags |= CSS_ONLINE.
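
The suggestion can be sketched as a self-contained userspace model: once the css is held in a local variable, the flag update goes through it instead of repeating the array lookup. The types are simplified stand-ins, the CSS_ONLINE value is a placeholder, lockdep_assert_held() is omitted, and the `demo_online_*()` callbacks are hypothetical. This is not the actual follow-up patch.

```c
#include <stddef.h>

#define CSS_ONLINE (1UL << 1)	/* placeholder bit; the kernel value may differ */

struct cgroup;

struct cgroup_subsys_state {
	struct cgroup *cgroup;
	unsigned long flags;
};

struct cgroup {
	struct cgroup_subsys_state *subsys[2];
};

struct cgroup_subsys {
	int (*css_online)(struct cgroup_subsys_state *css);
	int subsys_id;
};

/* invoke ->css_online() on a new css and mark it online if successful */
int online_css(struct cgroup_subsys *ss, struct cgroup *cgrp)
{
	struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
	int ret = 0;

	if (ss->css_online)
		ret = ss->css_online(css);
	if (!ret)
		css->flags |= CSS_ONLINE;	/* uses the local, no second lookup */
	return ret;
}

/* Hypothetical callbacks used to exercise both outcomes. */
int demo_online_ok(struct cgroup_subsys_state *css)
{
	(void)css;
	return 0;
}

int demo_online_fail(struct cgroup_subsys_state *css)
{
	(void)css;
	return -1;
}
```

This mirrors what the same patch already does in offline_css(), where the open-coded dereference is replaced with the local variable.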

>  	return ret;



* Re: [PATCH 21/23] cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 ` [PATCH 21/23] cgroup: make cftype->[un]register_event() " Tejun Heo
@ 2013-08-02  4:08   ` Li Zefan
  2013-08-02 19:44     ` Tejun Heo
  2013-08-02 13:42   ` Michal Hocko
  2013-08-02 20:24   ` [PATCH v2 " Tejun Heo
  2 siblings, 1 reply; 60+ messages in thread
From: Li Zefan @ 2013-08-02  4:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: containers, cgroups, linux-kernel, Johannes Weiner, Michal Hocko,
	Balbir Singh

> @@ -506,15 +506,17 @@ struct cftype {
>  	 * you want to provide this functionality. Use eventfd_signal()
>  	 * on eventfd to send notification to userspace.
>  	 */
> -	int (*register_event)(struct cgroup *cgrp, struct cftype *cft,
> -			struct eventfd_ctx *eventfd, const char *args);
> +	int (*register_event)(struct cgroup_subsys_state *css,
> +			      struct cftype *cft, struct eventfd_ctx *eventfd,
> +			      const char *args);
>  	/*
>  	 * unregister_event() callback will be called when userspace
>  	 * closes the eventfd or on cgroup removing.
>  	 * This callback must be implemented, if you want provide
>  	 * notification functionality.
>  	 */
> -	void (*unregister_event)(struct cgroup *cgrp, struct cftype *cft,
> +	void (*unregister_event)(struct cgroup_subsys_state *css,
> +				 struct cftype *cft,
>  			struct eventfd_ctx *eventfd);

align this line?

>  };



* Re: [PATCH 04/23] hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup
  2013-08-01 21:49 ` [PATCH 04/23] hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup Tejun Heo
@ 2013-08-02  4:35   ` Aneesh Kumar K.V
  2013-08-02 13:10   ` Michal Hocko
  1 sibling, 0 replies; 60+ messages in thread
From: Aneesh Kumar K.V @ 2013-08-02  4:35 UTC (permalink / raw)
  To: Tejun Heo, lizefan
  Cc: containers, cgroups, linux-kernel, Tejun Heo, KAMEZAWA Hiroyuki,
	Michal Hocko, Johannes Weiner

Tejun Heo <tj@kernel.org> writes:

> cgroup controller API will be converted to primarily use struct
> cgroup_subsys_state instead of struct cgroup.  In preparation, make
> hugetlb_cgroup functions pass around struct hugetlb_cgroup instead of
> struct cgroup.
>
> This patch shouldn't cause any behavior differences.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>


> ---
>  mm/hugetlb_cgroup.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index 50f213f..d2f9fc0 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -56,17 +56,19 @@ static inline bool hugetlb_cgroup_is_root(struct hugetlb_cgroup *h_cg)
>  	return (h_cg == root_h_cgroup);
>  }
>
> -static inline struct hugetlb_cgroup *parent_hugetlb_cgroup(struct cgroup *cg)
> +static inline struct hugetlb_cgroup *
> +parent_hugetlb_cgroup(struct hugetlb_cgroup *h_cg)
>  {
> -	if (!cg->parent)
> +	struct cgroup *parent = h_cg->css.cgroup->parent;
> +
> +	if (!parent)
>  		return NULL;
> -	return hugetlb_cgroup_from_cgroup(cg->parent);
> +	return hugetlb_cgroup_from_cgroup(parent);
>  }
>
> -static inline bool hugetlb_cgroup_have_usage(struct cgroup *cg)
> +static inline bool hugetlb_cgroup_have_usage(struct hugetlb_cgroup *h_cg)
>  {
>  	int idx;
> -	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cg);
>
>  	for (idx = 0; idx < hugetlb_max_hstate; idx++) {
>  		if ((res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE)) > 0)
> @@ -115,15 +117,14 @@ static void hugetlb_cgroup_css_free(struct cgroup *cgroup)
>   * page reference and test for page active here. This function
>   * cannot fail.
>   */
> -static void hugetlb_cgroup_move_parent(int idx, struct cgroup *cgroup,
> +static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
>  				       struct page *page)
>  {
>  	int csize;
>  	struct res_counter *counter;
>  	struct res_counter *fail_res;
>  	struct hugetlb_cgroup *page_hcg;
> -	struct hugetlb_cgroup *h_cg   = hugetlb_cgroup_from_cgroup(cgroup);
> -	struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(cgroup);
> +	struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(h_cg);
>
>  	page_hcg = hugetlb_cgroup_from_page(page);
>  	/*
> @@ -155,6 +156,7 @@ out:
>   */
>  static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
>  {
> +	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
>  	struct hstate *h;
>  	struct page *page;
>  	int idx = 0;
> @@ -163,13 +165,13 @@ static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
>  		for_each_hstate(h) {
>  			spin_lock(&hugetlb_lock);
>  			list_for_each_entry(page, &h->hugepage_activelist, lru)
> -				hugetlb_cgroup_move_parent(idx, cgroup, page);
> +				hugetlb_cgroup_move_parent(idx, h_cg, page);
>
>  			spin_unlock(&hugetlb_lock);
>  			idx++;
>  		}
>  		cond_resched();
> -	} while (hugetlb_cgroup_have_usage(cgroup));
> +	} while (hugetlb_cgroup_have_usage(h_cg));
>  }
>
>  int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
> -- 
> 1.8.3.1



* Re: [PATCH 03/23] netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state
  2013-08-01 21:49 ` [PATCH 03/23] netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state Tejun Heo
  2013-08-01 22:07   ` David Miller
@ 2013-08-02 11:42   ` Neil Horman
  1 sibling, 0 replies; 60+ messages in thread
From: Neil Horman @ 2013-08-02 11:42 UTC (permalink / raw)
  To: Tejun Heo; +Cc: lizefan, containers, cgroups, linux-kernel, David S. Miller

On Thu, Aug 01, 2013 at 05:49:41PM -0400, Tejun Heo wrote:
> cgroup controller API will be converted to primarily use struct
> cgroup_subsys_state instead of struct cgroup.  In preparation, make
> the internal functions of netprio_cgroup pass around @css instead of
> @cgrp.
> 
> While at it, kill struct cgroup_netprio_state which only contained
> struct cgroup_subsys_state without serving any purpose.  All functions
> are converted to deal with @css directly.
> 
> This patch shouldn't cause any behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> Cc: David S. Miller <davem@davemloft.net>
Acked-by: Neil Horman <nhorman@tuxdriver.com>



* Re: [PATCH 04/23] hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup
  2013-08-01 21:49 ` [PATCH 04/23] hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup Tejun Heo
  2013-08-02  4:35   ` Aneesh Kumar K.V
@ 2013-08-02 13:10   ` Michal Hocko
  1 sibling, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Aneesh Kumar K.V,
	KAMEZAWA Hiroyuki, Johannes Weiner

On Thu 01-08-13 17:49:42, Tejun Heo wrote:
> cgroup controller API will be converted to primarily use struct
> cgroup_subsys_state instead of struct cgroup.  In preparation, make
> hugetlb_cgroup functions pass around struct hugetlb_cgroup instead of
> struct cgroup.
> 
> This patch shouldn't cause any behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  mm/hugetlb_cgroup.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index 50f213f..d2f9fc0 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -56,17 +56,19 @@ static inline bool hugetlb_cgroup_is_root(struct hugetlb_cgroup *h_cg)
>  	return (h_cg == root_h_cgroup);
>  }
>  
> -static inline struct hugetlb_cgroup *parent_hugetlb_cgroup(struct cgroup *cg)
> +static inline struct hugetlb_cgroup *
> +parent_hugetlb_cgroup(struct hugetlb_cgroup *h_cg)
>  {
> -	if (!cg->parent)
> +	struct cgroup *parent = h_cg->css.cgroup->parent;
> +
> +	if (!parent)
>  		return NULL;
> -	return hugetlb_cgroup_from_cgroup(cg->parent);
> +	return hugetlb_cgroup_from_cgroup(parent);
>  }
>  
> -static inline bool hugetlb_cgroup_have_usage(struct cgroup *cg)
> +static inline bool hugetlb_cgroup_have_usage(struct hugetlb_cgroup *h_cg)
>  {
>  	int idx;
> -	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cg);
>  
>  	for (idx = 0; idx < hugetlb_max_hstate; idx++) {
>  		if ((res_counter_read_u64(&h_cg->hugepage[idx], RES_USAGE)) > 0)
> @@ -115,15 +117,14 @@ static void hugetlb_cgroup_css_free(struct cgroup *cgroup)
>   * page reference and test for page active here. This function
>   * cannot fail.
>   */
> -static void hugetlb_cgroup_move_parent(int idx, struct cgroup *cgroup,
> +static void hugetlb_cgroup_move_parent(int idx, struct hugetlb_cgroup *h_cg,
>  				       struct page *page)
>  {
>  	int csize;
>  	struct res_counter *counter;
>  	struct res_counter *fail_res;
>  	struct hugetlb_cgroup *page_hcg;
> -	struct hugetlb_cgroup *h_cg   = hugetlb_cgroup_from_cgroup(cgroup);
> -	struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(cgroup);
> +	struct hugetlb_cgroup *parent = parent_hugetlb_cgroup(h_cg);
>  
>  	page_hcg = hugetlb_cgroup_from_page(page);
>  	/*
> @@ -155,6 +156,7 @@ out:
>   */
>  static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
>  {
> +	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
>  	struct hstate *h;
>  	struct page *page;
>  	int idx = 0;
> @@ -163,13 +165,13 @@ static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
>  		for_each_hstate(h) {
>  			spin_lock(&hugetlb_lock);
>  			list_for_each_entry(page, &h->hugepage_activelist, lru)
> -				hugetlb_cgroup_move_parent(idx, cgroup, page);
> +				hugetlb_cgroup_move_parent(idx, h_cg, page);
>  
>  			spin_unlock(&hugetlb_lock);
>  			idx++;
>  		}
>  		cond_resched();
> -	} while (hugetlb_cgroup_have_usage(cgroup));
> +	} while (hugetlb_cgroup_have_usage(h_cg));
>  }
>  
>  int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
  2013-08-02  3:54   ` Li Zefan
  2013-08-02  4:02   ` Li Zefan
@ 2013-08-02 13:19   ` Michal Hocko
  2013-08-02 13:43     ` Michal Hocko
  2013-08-02 19:38     ` Tejun Heo
  2013-08-02 20:24   ` [PATCH v2 " Tejun Heo
                     ` (2 subsequent siblings)
  5 siblings, 2 replies; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Thu 01-08-13 17:49:46, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup *
> in subsystem implementations for the following reasons.
> 
> * With unified hierarchy, subsystems will be dynamically bound and
>   unbound from cgroups and thus css's (cgroup_subsys_state) may be
>   created and destroyed dynamically over the lifetime of a cgroup,
>   which is different from the current state where all css's are
>   allocated and destroyed together with the associated cgroup.  This
>   in turn means that cgroup_css() should be synchronized and may
>   return NULL, making it more cumbersome to use.
> 
> * Differing levels of per-subsystem granularity in the unified
>   hierarchy means that the task and descendant iterators should behave
>   differently depending on the specific subsystem the iteration is
>   being performed for.
> 
> * In majority of the cases, subsystems only care about its part in the
>   cgroup hierarchy - ie. the hierarchy of css's.  Subsystem methods
>   often obtain the matching css pointer from the cgroup and don't
>   bother with the cgroup pointer itself.  Passing around css fits
>   much better.
> 
> This patch converts all cgroup_subsys methods to take @css instead of
> @cgroup.  The conversions are mostly straight-forward.  A few
> noteworthy changes are
> 
> * ->css_alloc() now takes css of the parent cgroup rather than the
>   pointer to the new cgroup as the css for the new cgroup doesn't
>   exist yet.  Knowing the parent css is enough for all the existing
>   subsystems.
> 
> * In kernel/cgroup.c::offline_css(), unnecessary open coded css
>   dereference is replaced with local variable access.
> 
> This patch shouldn't cause any behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Steven Rostedt <rostedt@goodmis.org>

For the memcg part:
Acked-by: Michal Hocko <mhocko@suse.cz>

mem_cgroup_from_cont can go away now as well. Do you plan to remove it
in the series or later on?

> ---
>  block/blk-cgroup.c        | 25 +++++++++++-----------
>  include/linux/cgroup.h    | 24 ++++++++++++---------
>  kernel/cgroup.c           | 53 ++++++++++++++++++++++++++++-------------------
>  kernel/cgroup_freezer.c   | 40 ++++++++++++++++++-----------------
>  kernel/cpuset.c           | 39 ++++++++++++++++++----------------
>  kernel/events/core.c      | 18 +++++++++-------
>  kernel/sched/core.c       | 39 +++++++++++++++++-----------------
>  kernel/sched/cpuacct.c    |  9 ++++----
>  mm/hugetlb_cgroup.c       | 19 ++++++++---------
>  mm/memcontrol.c           | 38 ++++++++++++++++-----------------
>  net/core/netprio_cgroup.c | 20 +++++++++---------
>  net/sched/cls_cgroup.c    | 18 +++++++++-------
>  security/device_cgroup.c  | 22 ++++++++++----------
>  13 files changed, 195 insertions(+), 169 deletions(-)
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 290792a..79fd9f4 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -765,18 +765,18 @@ struct cftype blkcg_files[] = {
>  
>  /**
>   * blkcg_css_offline - cgroup css_offline callback
> - * @cgroup: cgroup of interest
> + * @css: css of interest
>   *
> - * This function is called when @cgroup is about to go away and responsible
> - * for shooting down all blkgs associated with @cgroup.  blkgs should be
> + * This function is called when @css is about to go away and responsible
> + * for shooting down all blkgs associated with @css.  blkgs should be
>   * removed while holding both q and blkcg locks.  As blkcg lock is nested
>   * inside q lock, this function performs reverse double lock dancing.
>   *
>   * This is the blkcg counterpart of ioc_release_fn().
>   */
> -static void blkcg_css_offline(struct cgroup *cgroup)
> +static void blkcg_css_offline(struct cgroup_subsys_state *css)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgroup);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  
>  	spin_lock_irq(&blkcg->lock);
>  
> @@ -798,21 +798,21 @@ static void blkcg_css_offline(struct cgroup *cgroup)
>  	spin_unlock_irq(&blkcg->lock);
>  }
>  
> -static void blkcg_css_free(struct cgroup *cgroup)
> +static void blkcg_css_free(struct cgroup_subsys_state *css)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgroup);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  
>  	if (blkcg != &blkcg_root)
>  		kfree(blkcg);
>  }
>  
> -static struct cgroup_subsys_state *blkcg_css_alloc(struct cgroup *cgroup)
> +static struct cgroup_subsys_state *
> +blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	static atomic64_t id_seq = ATOMIC64_INIT(0);
>  	struct blkcg *blkcg;
> -	struct cgroup *parent = cgroup->parent;
>  
> -	if (!parent) {
> +	if (!parent_css) {
>  		blkcg = &blkcg_root;
>  		goto done;
>  	}
> @@ -883,14 +883,15 @@ void blkcg_exit_queue(struct request_queue *q)
>   * of the main cic data structures.  For now we allow a task to change
>   * its cgroup only if it's the only owner of its ioc.
>   */
> -static int blkcg_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
> +static int blkcg_can_attach(struct cgroup_subsys_state *css,
> +			    struct cgroup_taskset *tset)
>  {
>  	struct task_struct *task;
>  	struct io_context *ioc;
>  	int ret = 0;
>  
>  	/* task_lock() is needed to avoid races with exit_io_context() */
> -	cgroup_taskset_for_each(task, cgrp, tset) {
> +	cgroup_taskset_for_each(task, css->cgroup, tset) {
>  		task_lock(task);
>  		ioc = task->io_context;
>  		if (ioc && atomic_read(&ioc->nr_tasks) > 1)
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index b65f6b5..69b33f9 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -580,18 +580,22 @@ int cgroup_taskset_size(struct cgroup_taskset *tset);
>   */
>  
>  struct cgroup_subsys {
> -	struct cgroup_subsys_state *(*css_alloc)(struct cgroup *cgrp);
> -	int (*css_online)(struct cgroup *cgrp);
> -	void (*css_offline)(struct cgroup *cgrp);
> -	void (*css_free)(struct cgroup *cgrp);
> -
> -	int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
> -	void (*cancel_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
> -	void (*attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
> +	struct cgroup_subsys_state *(*css_alloc)(struct cgroup_subsys_state *parent_css);
> +	int (*css_online)(struct cgroup_subsys_state *css);
> +	void (*css_offline)(struct cgroup_subsys_state *css);
> +	void (*css_free)(struct cgroup_subsys_state *css);
> +
> +	int (*can_attach)(struct cgroup_subsys_state *css,
> +			  struct cgroup_taskset *tset);
> +	void (*cancel_attach)(struct cgroup_subsys_state *css,
> +			      struct cgroup_taskset *tset);
> +	void (*attach)(struct cgroup_subsys_state *css,
> +		       struct cgroup_taskset *tset);
>  	void (*fork)(struct task_struct *task);
> -	void (*exit)(struct cgroup *cgrp, struct cgroup *old_cgrp,
> +	void (*exit)(struct cgroup_subsys_state *css,
> +		     struct cgroup_subsys_state *old_css,
>  		     struct task_struct *task);
> -	void (*bind)(struct cgroup *root);
> +	void (*bind)(struct cgroup_subsys_state *root_css);
>  
>  	int subsys_id;
>  	int disabled;
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index fad5498..fae11e3 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -853,8 +853,11 @@ static void cgroup_free_fn(struct work_struct *work)
>  	/*
>  	 * Release the subsystem state objects.
>  	 */
> -	for_each_root_subsys(cgrp->root, ss)
> -		ss->css_free(cgrp);
> +	for_each_root_subsys(cgrp->root, ss) {
> +		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
> +
> +		ss->css_free(css);
> +	}
>  
>  	cgrp->root->number_of_cgroups--;
>  	mutex_unlock(&cgroup_mutex);
> @@ -1056,7 +1059,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
>  			list_move(&ss->sibling, &root->subsys_list);
>  			ss->root = root;
>  			if (ss->bind)
> -				ss->bind(cgrp);
> +				ss->bind(cgrp->subsys[i]);
>  
>  			/* refcount was already taken, and we're keeping it */
>  			root->subsys_mask |= bit;
> @@ -1066,7 +1069,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
>  			BUG_ON(cgrp->subsys[i]->cgroup != cgrp);
>  
>  			if (ss->bind)
> -				ss->bind(cgroup_dummy_top);
> +				ss->bind(cgroup_dummy_top->subsys[i]);
>  			cgroup_dummy_top->subsys[i]->cgroup = cgroup_dummy_top;
>  			cgrp->subsys[i] = NULL;
>  			cgroup_subsys[i]->root = &cgroup_dummy_root;
> @@ -2042,8 +2045,10 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
>  	 * step 1: check that we can legitimately attach to the cgroup.
>  	 */
>  	for_each_root_subsys(root, ss) {
> +		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
> +
>  		if (ss->can_attach) {
> -			retval = ss->can_attach(cgrp, &tset);
> +			retval = ss->can_attach(css, &tset);
>  			if (retval) {
>  				failed_ss = ss;
>  				goto out_cancel_attach;
> @@ -2082,8 +2087,10 @@ static int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk,
>  	 * step 4: do subsystem attach callbacks.
>  	 */
>  	for_each_root_subsys(root, ss) {
> +		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
> +
>  		if (ss->attach)
> -			ss->attach(cgrp, &tset);
> +			ss->attach(css, &tset);
>  	}
>  
>  	/*
> @@ -2102,10 +2109,12 @@ out_put_css_set_refs:
>  out_cancel_attach:
>  	if (retval) {
>  		for_each_root_subsys(root, ss) {
> +			struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
> +
>  			if (ss == failed_ss)
>  				break;
>  			if (ss->cancel_attach)
> -				ss->cancel_attach(cgrp, &tset);
> +				ss->cancel_attach(css, &tset);
>  		}
>  	}
>  out_free_group_list:
> @@ -4199,12 +4208,13 @@ static void init_cgroup_css(struct cgroup_subsys_state *css,
>  /* invoke ->css_online() on a new CSS and mark it online if successful */
>  static int online_css(struct cgroup_subsys *ss, struct cgroup *cgrp)
>  {
> +	struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
>  	int ret = 0;
>  
>  	lockdep_assert_held(&cgroup_mutex);
>  
>  	if (ss->css_online)
> -		ret = ss->css_online(cgrp);
> +		ret = ss->css_online(css);
>  	if (!ret)
>  		cgrp->subsys[ss->subsys_id]->flags |= CSS_ONLINE;
>  	return ret;
> @@ -4221,9 +4231,9 @@ static void offline_css(struct cgroup_subsys *ss, struct cgroup *cgrp)
>  		return;
>  
>  	if (ss->css_offline)
> -		ss->css_offline(cgrp);
> +		ss->css_offline(css);
>  
> -	cgrp->subsys[ss->subsys_id]->flags &= ~CSS_ONLINE;
> +	css->flags &= ~CSS_ONLINE;
>  }
>  
>  /*
> @@ -4298,7 +4308,7 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
>  	for_each_root_subsys(root, ss) {
>  		struct cgroup_subsys_state *css;
>  
> -		css = ss->css_alloc(cgrp);
> +		css = ss->css_alloc(parent->subsys[ss->subsys_id]);
>  		if (IS_ERR(css)) {
>  			err = PTR_ERR(css);
>  			goto err_free_all;
> @@ -4377,7 +4387,7 @@ err_free_all:
>  
>  		if (css) {
>  			percpu_ref_cancel_init(&css->refcnt);
> -			ss->css_free(cgrp);
> +			ss->css_free(css);
>  		}
>  	}
>  	mutex_unlock(&cgroup_mutex);
> @@ -4632,7 +4642,7 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
>  	/* Create the top cgroup state for this subsystem */
>  	list_add(&ss->sibling, &cgroup_dummy_root.subsys_list);
>  	ss->root = &cgroup_dummy_root;
> -	css = ss->css_alloc(cgroup_dummy_top);
> +	css = ss->css_alloc(cgroup_dummy_top->subsys[ss->subsys_id]);
>  	/* We don't handle early failures gracefully */
>  	BUG_ON(IS_ERR(css));
>  	init_cgroup_css(css, ss, cgroup_dummy_top);
> @@ -4711,7 +4721,7 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
>  	 * struct, so this can happen first (i.e. before the dummy root
>  	 * attachment).
>  	 */
> -	css = ss->css_alloc(cgroup_dummy_top);
> +	css = ss->css_alloc(cgroup_dummy_top->subsys[ss->subsys_id]);
>  	if (IS_ERR(css)) {
>  		/* failure case - need to deassign the cgroup_subsys[] slot. */
>  		cgroup_subsys[ss->subsys_id] = NULL;
> @@ -4827,7 +4837,7 @@ void cgroup_unload_subsys(struct cgroup_subsys *ss)
>  	 * the cgrp->subsys pointer to find their state. note that this
>  	 * also takes care of freeing the css_id.
>  	 */
> -	ss->css_free(cgroup_dummy_top);
> +	ss->css_free(cgroup_dummy_top->subsys[ss->subsys_id]);
>  	cgroup_dummy_top->subsys[ss->subsys_id] = NULL;
>  
>  	mutex_unlock(&cgroup_mutex);
> @@ -5183,10 +5193,10 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
>  		 */
>  		for_each_builtin_subsys(ss, i) {
>  			if (ss->exit) {
> -				struct cgroup *old_cgrp = cset->subsys[i]->cgroup;
> -				struct cgroup *cgrp = task_cgroup(tsk, i);
> +				struct cgroup_subsys_state *old_css = cset->subsys[i];
> +				struct cgroup_subsys_state *css = task_css(tsk, i);
>  
> -				ss->exit(cgrp, old_cgrp, tsk);
> +				ss->exit(css, old_css, tsk);
>  			}
>  		}
>  	}
> @@ -5520,7 +5530,8 @@ struct cgroup_subsys_state *cgroup_css_from_dir(struct file *f, int id)
>  }
>  
>  #ifdef CONFIG_CGROUP_DEBUG
> -static struct cgroup_subsys_state *debug_css_alloc(struct cgroup *cgrp)
> +static struct cgroup_subsys_state *
> +debug_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct cgroup_subsys_state *css = kzalloc(sizeof(*css), GFP_KERNEL);
>  
> @@ -5530,9 +5541,9 @@ static struct cgroup_subsys_state *debug_css_alloc(struct cgroup *cgrp)
>  	return css;
>  }
>  
> -static void debug_css_free(struct cgroup *cgrp)
> +static void debug_css_free(struct cgroup_subsys_state *css)
>  {
> -	kfree(cgrp->subsys[debug_subsys_id]);
> +	kfree(css);
>  }
>  
>  static u64 debug_taskcount_read(struct cgroup *cgrp, struct cftype *cft)
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index 657a73c..f03a857 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -91,7 +91,8 @@ static const char *freezer_state_strs(unsigned int state)
>  
>  struct cgroup_subsys freezer_subsys;
>  
> -static struct cgroup_subsys_state *freezer_css_alloc(struct cgroup *cgroup)
> +static struct cgroup_subsys_state *
> +freezer_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct freezer *freezer;
>  
> @@ -104,16 +105,16 @@ static struct cgroup_subsys_state *freezer_css_alloc(struct cgroup *cgroup)
>  }
>  
>  /**
> - * freezer_css_online - commit creation of a freezer cgroup
> - * @cgroup: cgroup being created
> + * freezer_css_online - commit creation of a freezer css
> + * @css: css being created
>   *
> - * We're committing to creation of @cgroup.  Mark it online and inherit
> + * We're committing to creation of @css.  Mark it online and inherit
>   * parent's freezing state while holding both parent's and our
>   * freezer->lock.
>   */
> -static int freezer_css_online(struct cgroup *cgroup)
> +static int freezer_css_online(struct cgroup_subsys_state *css)
>  {
> -	struct freezer *freezer = cgroup_freezer(cgroup);
> +	struct freezer *freezer = css_freezer(css);
>  	struct freezer *parent = parent_freezer(freezer);
>  
>  	/*
> @@ -140,15 +141,15 @@ static int freezer_css_online(struct cgroup *cgroup)
>  }
>  
>  /**
> - * freezer_css_offline - initiate destruction of @cgroup
> - * @cgroup: cgroup being destroyed
> + * freezer_css_offline - initiate destruction of a freezer css
> + * @css: css being destroyed
>   *
> - * @cgroup is going away.  Mark it dead and decrement system_freezing_count
> - * if it was holding one.
> + * @css is going away.  Mark it dead and decrement system_freezing_count if
> + * it was holding one.
>   */
> -static void freezer_css_offline(struct cgroup *cgroup)
> +static void freezer_css_offline(struct cgroup_subsys_state *css)
>  {
> -	struct freezer *freezer = cgroup_freezer(cgroup);
> +	struct freezer *freezer = css_freezer(css);
>  
>  	spin_lock_irq(&freezer->lock);
>  
> @@ -160,9 +161,9 @@ static void freezer_css_offline(struct cgroup *cgroup)
>  	spin_unlock_irq(&freezer->lock);
>  }
>  
> -static void freezer_css_free(struct cgroup *cgroup)
> +static void freezer_css_free(struct cgroup_subsys_state *css)
>  {
> -	kfree(cgroup_freezer(cgroup));
> +	kfree(css_freezer(css));
>  }
>  
>  /*
> @@ -174,25 +175,26 @@ static void freezer_css_free(struct cgroup *cgroup)
>   * @freezer->lock.  freezer_attach() makes the new tasks conform to the
>   * current state and all following state changes can see the new tasks.
>   */
> -static void freezer_attach(struct cgroup *new_cgrp, struct cgroup_taskset *tset)
> +static void freezer_attach(struct cgroup_subsys_state *new_css,
> +			   struct cgroup_taskset *tset)
>  {
> -	struct freezer *freezer = cgroup_freezer(new_cgrp);
> +	struct freezer *freezer = css_freezer(new_css);
>  	struct task_struct *task;
>  	bool clear_frozen = false;
>  
>  	spin_lock_irq(&freezer->lock);
>  
>  	/*
> -	 * Make the new tasks conform to the current state of @new_cgrp.
> +	 * Make the new tasks conform to the current state of @new_css.
>  	 * For simplicity, when migrating any task to a FROZEN cgroup, we
>  	 * revert it to FREEZING and let update_if_frozen() determine the
>  	 * correct state later.
>  	 *
> -	 * Tasks in @tset are on @new_cgrp but may not conform to its
> +	 * Tasks in @tset are on @new_css but may not conform to its
>  	 * current state before executing the following - !frozen tasks may
>  	 * be visible in a FROZEN cgroup and frozen tasks in a THAWED one.
>  	 */
> -	cgroup_taskset_for_each(task, new_cgrp, tset) {
> +	cgroup_taskset_for_each(task, new_css->cgroup, tset) {
>  		if (!(freezer->state & CGROUP_FREEZING)) {
>  			__thaw_task(task);
>  		} else {
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 259a4af..8ce3fdc 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1455,9 +1455,10 @@ static int fmeter_getrate(struct fmeter *fmp)
>  }
>  
>  /* Called by cgroups to determine if a cpuset is usable; cpuset_mutex held */
> -static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
> +static int cpuset_can_attach(struct cgroup_subsys_state *css,
> +			     struct cgroup_taskset *tset)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	struct task_struct *task;
>  	int ret;
>  
> @@ -1468,11 +1469,11 @@ static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
>  	 * flag is set.
>  	 */
>  	ret = -ENOSPC;
> -	if (!cgroup_sane_behavior(cgrp) &&
> +	if (!cgroup_sane_behavior(css->cgroup) &&
>  	    (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)))
>  		goto out_unlock;
>  
> -	cgroup_taskset_for_each(task, cgrp, tset) {
> +	cgroup_taskset_for_each(task, css->cgroup, tset) {
>  		/*
>  		 * Kthreads which disallow setaffinity shouldn't be moved
>  		 * to a new cpuset; we don't want to change their cpu
> @@ -1501,11 +1502,11 @@ out_unlock:
>  	return ret;
>  }
>  
> -static void cpuset_cancel_attach(struct cgroup *cgrp,
> +static void cpuset_cancel_attach(struct cgroup_subsys_state *css,
>  				 struct cgroup_taskset *tset)
>  {
>  	mutex_lock(&cpuset_mutex);
> -	cgroup_cs(cgrp)->attach_in_progress--;
> +	css_cs(css)->attach_in_progress--;
>  	mutex_unlock(&cpuset_mutex);
>  }
>  
> @@ -1516,7 +1517,8 @@ static void cpuset_cancel_attach(struct cgroup *cgrp,
>   */
>  static cpumask_var_t cpus_attach;
>  
> -static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
> +static void cpuset_attach(struct cgroup_subsys_state *css,
> +			  struct cgroup_taskset *tset)
>  {
>  	/* static buf protected by cpuset_mutex */
>  	static nodemask_t cpuset_attach_nodemask_to;
> @@ -1524,7 +1526,7 @@ static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
>  	struct task_struct *task;
>  	struct task_struct *leader = cgroup_taskset_first(tset);
>  	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	struct cpuset *oldcs = cgroup_cs(oldcgrp);
>  	struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
>  	struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
> @@ -1539,7 +1541,7 @@ static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
>  
>  	guarantee_online_mems(mems_cs, &cpuset_attach_nodemask_to);
>  
> -	cgroup_taskset_for_each(task, cgrp, tset) {
> +	cgroup_taskset_for_each(task, css->cgroup, tset) {
>  		/*
>  		 * can_attach beforehand should guarantee that this doesn't
>  		 * fail.  TODO: have a better way to handle failure here
> @@ -1940,11 +1942,12 @@ static struct cftype files[] = {
>   *	cgrp:	control group that the new cpuset will be part of
>   */
>  
> -static struct cgroup_subsys_state *cpuset_css_alloc(struct cgroup *cgrp)
> +static struct cgroup_subsys_state *
> +cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct cpuset *cs;
>  
> -	if (!cgrp->parent)
> +	if (!parent_css)
>  		return &top_cpuset.css;
>  
>  	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
> @@ -1964,9 +1967,9 @@ static struct cgroup_subsys_state *cpuset_css_alloc(struct cgroup *cgrp)
>  	return &cs->css;
>  }
>  
> -static int cpuset_css_online(struct cgroup *cgrp)
> +static int cpuset_css_online(struct cgroup_subsys_state *css)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	struct cpuset *parent = parent_cs(cs);
>  	struct cpuset *tmp_cs;
>  	struct cgroup *pos_cgrp;
> @@ -1984,7 +1987,7 @@ static int cpuset_css_online(struct cgroup *cgrp)
>  
>  	number_of_cpusets++;
>  
> -	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags))
> +	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
>  		goto out_unlock;
>  
>  	/*
> @@ -2024,9 +2027,9 @@ out_unlock:
>   * will call rebuild_sched_domains_locked().
>   */
>  
> -static void cpuset_css_offline(struct cgroup *cgrp)
> +static void cpuset_css_offline(struct cgroup_subsys_state *css)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  
>  	mutex_lock(&cpuset_mutex);
>  
> @@ -2039,9 +2042,9 @@ static void cpuset_css_offline(struct cgroup *cgrp)
>  	mutex_unlock(&cpuset_mutex);
>  }
>  
> -static void cpuset_css_free(struct cgroup *cgrp)
> +static void cpuset_css_free(struct cgroup_subsys_state *css)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  
>  	free_cpumask_var(cs->cpus_allowed);
>  	kfree(cs);
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 414c61f..9705a0e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7778,7 +7778,8 @@ unlock:
>  device_initcall(perf_event_sysfs_init);
>  
>  #ifdef CONFIG_CGROUP_PERF
> -static struct cgroup_subsys_state *perf_cgroup_css_alloc(struct cgroup *cont)
> +static struct cgroup_subsys_state *
> +perf_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct perf_cgroup *jc;
>  
> @@ -7795,11 +7796,10 @@ static struct cgroup_subsys_state *perf_cgroup_css_alloc(struct cgroup *cont)
>  	return &jc->css;
>  }
>  
> -static void perf_cgroup_css_free(struct cgroup *cont)
> +static void perf_cgroup_css_free(struct cgroup_subsys_state *css)
>  {
> -	struct perf_cgroup *jc;
> -	jc = container_of(cgroup_css(cont, perf_subsys_id),
> -			  struct perf_cgroup, css);
> +	struct perf_cgroup *jc = container_of(css, struct perf_cgroup, css);
> +
>  	free_percpu(jc->info);
>  	kfree(jc);
>  }
> @@ -7811,15 +7811,17 @@ static int __perf_cgroup_move(void *info)
>  	return 0;
>  }
>  
> -static void perf_cgroup_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
> +static void perf_cgroup_attach(struct cgroup_subsys_state *css,
> +			       struct cgroup_taskset *tset)
>  {
>  	struct task_struct *task;
>  
> -	cgroup_taskset_for_each(task, cgrp, tset)
> +	cgroup_taskset_for_each(task, css->cgroup, tset)
>  		task_function_call(task, __perf_cgroup_move, task);
>  }
>  
> -static void perf_cgroup_exit(struct cgroup *cgrp, struct cgroup *old_cgrp,
> +static void perf_cgroup_exit(struct cgroup_subsys_state *css,
> +			     struct cgroup_subsys_state *old_css,
>  			     struct task_struct *task)
>  {
>  	/*
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7a10742..622b7ef 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7094,16 +7094,17 @@ static inline struct task_group *cgroup_tg(struct cgroup *cgrp)
>  	return css_tg(cgroup_css(cgrp, cpu_cgroup_subsys_id));
>  }
>  
> -static struct cgroup_subsys_state *cpu_cgroup_css_alloc(struct cgroup *cgrp)
> +static struct cgroup_subsys_state *
> +cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
> -	struct task_group *tg, *parent;
> +	struct task_group *parent = css_tg(parent_css);
> +	struct task_group *tg;
>  
> -	if (!cgrp->parent) {
> +	if (!parent) {
>  		/* This is early initialization for the top cgroup */
>  		return &root_task_group.css;
>  	}
>  
> -	parent = cgroup_tg(cgrp->parent);
>  	tg = sched_create_group(parent);
>  	if (IS_ERR(tg))
>  		return ERR_PTR(-ENOMEM);
> @@ -7111,38 +7112,38 @@ static struct cgroup_subsys_state *cpu_cgroup_css_alloc(struct cgroup *cgrp)
>  	return &tg->css;
>  }
>  
> -static int cpu_cgroup_css_online(struct cgroup *cgrp)
> +static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
>  {
> -	struct task_group *tg = cgroup_tg(cgrp);
> -	struct task_group *parent = css_tg(css_parent(&tg->css));
> +	struct task_group *tg = css_tg(css);
> +	struct task_group *parent = css_tg(css_parent(css));
>  
>  	if (parent)
>  		sched_online_group(tg, parent);
>  	return 0;
>  }
>  
> -static void cpu_cgroup_css_free(struct cgroup *cgrp)
> +static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
>  {
> -	struct task_group *tg = cgroup_tg(cgrp);
> +	struct task_group *tg = css_tg(css);
>  
>  	sched_destroy_group(tg);
>  }
>  
> -static void cpu_cgroup_css_offline(struct cgroup *cgrp)
> +static void cpu_cgroup_css_offline(struct cgroup_subsys_state *css)
>  {
> -	struct task_group *tg = cgroup_tg(cgrp);
> +	struct task_group *tg = css_tg(css);
>  
>  	sched_offline_group(tg);
>  }
>  
> -static int cpu_cgroup_can_attach(struct cgroup *cgrp,
> +static int cpu_cgroup_can_attach(struct cgroup_subsys_state *css,
>  				 struct cgroup_taskset *tset)
>  {
>  	struct task_struct *task;
>  
> -	cgroup_taskset_for_each(task, cgrp, tset) {
> +	cgroup_taskset_for_each(task, css->cgroup, tset) {
>  #ifdef CONFIG_RT_GROUP_SCHED
> -		if (!sched_rt_can_attach(cgroup_tg(cgrp), task))
> +		if (!sched_rt_can_attach(css_tg(css), task))
>  			return -EINVAL;
>  #else
>  		/* We don't support RT-tasks being in separate groups */
> @@ -7153,18 +7154,18 @@ static int cpu_cgroup_can_attach(struct cgroup *cgrp,
>  	return 0;
>  }
>  
> -static void cpu_cgroup_attach(struct cgroup *cgrp,
> +static void cpu_cgroup_attach(struct cgroup_subsys_state *css,
>  			      struct cgroup_taskset *tset)
>  {
>  	struct task_struct *task;
>  
> -	cgroup_taskset_for_each(task, cgrp, tset)
> +	cgroup_taskset_for_each(task, css->cgroup, tset)
>  		sched_move_task(task);
>  }
>  
> -static void
> -cpu_cgroup_exit(struct cgroup *cgrp, struct cgroup *old_cgrp,
> -		struct task_struct *task)
> +static void cpu_cgroup_exit(struct cgroup_subsys_state *css,
> +			    struct cgroup_subsys_state *old_css,
> +			    struct task_struct *task)
>  {
>  	/*
>  	 * cgroup_exit() is called in the copy_process() failure path.
> diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> index f6926a1..1b784d9 100644
> --- a/kernel/sched/cpuacct.c
> +++ b/kernel/sched/cpuacct.c
> @@ -62,11 +62,12 @@ static struct cpuacct root_cpuacct = {
>  };
>  
>  /* create a new cpu accounting group */
> -static struct cgroup_subsys_state *cpuacct_css_alloc(struct cgroup *cgrp)
> +static struct cgroup_subsys_state *
> +cpuacct_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct cpuacct *ca;
>  
> -	if (!cgrp->parent)
> +	if (!parent_css)
>  		return &root_cpuacct.css;
>  
>  	ca = kzalloc(sizeof(*ca), GFP_KERNEL);
> @@ -92,9 +93,9 @@ out:
>  }
>  
>  /* destroy an existing cpu accounting group */
> -static void cpuacct_css_free(struct cgroup *cgrp)
> +static void cpuacct_css_free(struct cgroup_subsys_state *css)
>  {
> -	struct cpuacct *ca = cgroup_ca(cgrp);
> +	struct cpuacct *ca = css_ca(css);
>  
>  	free_percpu(ca->cpustat);
>  	free_percpu(ca->cpuusage);
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index 57ecb5d..e213243 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -73,19 +73,18 @@ static inline bool hugetlb_cgroup_have_usage(struct hugetlb_cgroup *h_cg)
>  	return false;
>  }
>  
> -static struct cgroup_subsys_state *hugetlb_cgroup_css_alloc(struct cgroup *cgroup)
> +static struct cgroup_subsys_state *
> +hugetlb_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
> +	struct hugetlb_cgroup *parent_h_cgroup = hugetlb_cgroup_from_css(parent_css);
> +	struct hugetlb_cgroup *h_cgroup;
>  	int idx;
> -	struct cgroup *parent_cgroup;
> -	struct hugetlb_cgroup *h_cgroup, *parent_h_cgroup;
>  
>  	h_cgroup = kzalloc(sizeof(*h_cgroup), GFP_KERNEL);
>  	if (!h_cgroup)
>  		return ERR_PTR(-ENOMEM);
>  
> -	parent_cgroup = cgroup->parent;
> -	if (parent_cgroup) {
> -		parent_h_cgroup = hugetlb_cgroup_from_cgroup(parent_cgroup);
> +	if (parent_h_cgroup) {
>  		for (idx = 0; idx < HUGE_MAX_HSTATE; idx++)
>  			res_counter_init(&h_cgroup->hugepage[idx],
>  					 &parent_h_cgroup->hugepage[idx]);
> @@ -97,11 +96,11 @@ static struct cgroup_subsys_state *hugetlb_cgroup_css_alloc(struct cgroup *cgrou
>  	return &h_cgroup->css;
>  }
>  
> -static void hugetlb_cgroup_css_free(struct cgroup *cgroup)
> +static void hugetlb_cgroup_css_free(struct cgroup_subsys_state *css)
>  {
>  	struct hugetlb_cgroup *h_cgroup;
>  
> -	h_cgroup = hugetlb_cgroup_from_cgroup(cgroup);
> +	h_cgroup = hugetlb_cgroup_from_css(css);
>  	kfree(h_cgroup);
>  }
>  
> @@ -150,9 +149,9 @@ out:
>   * Force the hugetlb cgroup to empty the hugetlb resources by moving them to
>   * the parent cgroup.
>   */
> -static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
> +static void hugetlb_cgroup_css_offline(struct cgroup_subsys_state *css)
>  {
> -	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
> +	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
>  	struct hstate *h;
>  	struct page *page;
>  	int idx = 0;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 69b3e52..32cca0f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6211,7 +6211,7 @@ static void __init mem_cgroup_soft_limit_tree_init(void)
>  }
>  
>  static struct cgroup_subsys_state * __ref
> -mem_cgroup_css_alloc(struct cgroup *cont)
> +mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct mem_cgroup *memcg;
>  	long error = -ENOMEM;
> @@ -6226,7 +6226,7 @@ mem_cgroup_css_alloc(struct cgroup *cont)
>  			goto free_out;
>  
>  	/* root ? */
> -	if (cont->parent == NULL) {
> +	if (parent_css == NULL) {
>  		root_mem_cgroup = memcg;
>  		res_counter_init(&memcg->res, NULL);
>  		res_counter_init(&memcg->memsw, NULL);
> @@ -6248,10 +6248,10 @@ free_out:
>  }
>  
>  static int
> -mem_cgroup_css_online(struct cgroup *cont)
> +mem_cgroup_css_online(struct cgroup_subsys_state *css)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> -	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(css));
>  	int error = 0;
>  
>  	if (!parent)
> @@ -6308,9 +6308,9 @@ static void mem_cgroup_invalidate_reclaim_iterators(struct mem_cgroup *memcg)
>  		mem_cgroup_iter_invalidate(root_mem_cgroup);
>  }
>  
> -static void mem_cgroup_css_offline(struct cgroup *cont)
> +static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
>  	kmem_cgroup_css_offline(memcg);
>  
> @@ -6319,9 +6319,9 @@ static void mem_cgroup_css_offline(struct cgroup *cont)
>  	mem_cgroup_destroy_all_caches(memcg);
>  }
>  
> -static void mem_cgroup_css_free(struct cgroup *cont)
> +static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
>  	memcg_destroy_kmem(memcg);
>  	__mem_cgroup_free(memcg);
> @@ -6691,12 +6691,12 @@ static void mem_cgroup_clear_mc(void)
>  	mem_cgroup_end_move(from);
>  }
>  
> -static int mem_cgroup_can_attach(struct cgroup *cgroup,
> +static int mem_cgroup_can_attach(struct cgroup_subsys_state *css,
>  				 struct cgroup_taskset *tset)
>  {
>  	struct task_struct *p = cgroup_taskset_first(tset);
>  	int ret = 0;
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgroup);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	unsigned long move_charge_at_immigrate;
>  
>  	/*
> @@ -6738,7 +6738,7 @@ static int mem_cgroup_can_attach(struct cgroup *cgroup,
>  	return ret;
>  }
>  
> -static void mem_cgroup_cancel_attach(struct cgroup *cgroup,
> +static void mem_cgroup_cancel_attach(struct cgroup_subsys_state *css,
>  				     struct cgroup_taskset *tset)
>  {
>  	mem_cgroup_clear_mc();
> @@ -6886,7 +6886,7 @@ retry:
>  	up_read(&mm->mmap_sem);
>  }
>  
> -static void mem_cgroup_move_task(struct cgroup *cont,
> +static void mem_cgroup_move_task(struct cgroup_subsys_state *css,
>  				 struct cgroup_taskset *tset)
>  {
>  	struct task_struct *p = cgroup_taskset_first(tset);
> @@ -6901,16 +6901,16 @@ static void mem_cgroup_move_task(struct cgroup *cont,
>  		mem_cgroup_clear_mc();
>  }
>  #else	/* !CONFIG_MMU */
> -static int mem_cgroup_can_attach(struct cgroup *cgroup,
> +static int mem_cgroup_can_attach(struct cgroup_subsys_state *css,
>  				 struct cgroup_taskset *tset)
>  {
>  	return 0;
>  }
> -static void mem_cgroup_cancel_attach(struct cgroup *cgroup,
> +static void mem_cgroup_cancel_attach(struct cgroup_subsys_state *css,
>  				     struct cgroup_taskset *tset)
>  {
>  }
> -static void mem_cgroup_move_task(struct cgroup *cont,
> +static void mem_cgroup_move_task(struct cgroup_subsys_state *css,
>  				 struct cgroup_taskset *tset)
>  {
>  }
> @@ -6920,15 +6920,15 @@ static void mem_cgroup_move_task(struct cgroup *cont,
>   * Cgroup retains root cgroups across [un]mount cycles making it necessary
>   * to verify sane_behavior flag on each mount attempt.
>   */
> -static void mem_cgroup_bind(struct cgroup *root)
> +static void mem_cgroup_bind(struct cgroup_subsys_state *root_css)
>  {
>  	/*
>  	 * use_hierarchy is forced with sane_behavior.  cgroup core
>  	 * guarantees that @root doesn't have any children, so turning it
>  	 * on for the root memcg is enough.
>  	 */
> -	if (cgroup_sane_behavior(root))
> -		mem_cgroup_from_cont(root)->use_hierarchy = true;
> +	if (cgroup_sane_behavior(root_css->cgroup))
> +		mem_cgroup_from_css(root_css)->use_hierarchy = true;
>  }
>  
>  struct cgroup_subsys mem_cgroup_subsys = {
> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> index 5dfac88..8d095b4 100644
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -126,7 +126,8 @@ static int netprio_set_prio(struct cgroup_subsys_state *css,
>  	return 0;
>  }
>  
> -static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
> +static struct cgroup_subsys_state *
> +cgrp_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct cgroup_subsys_state *css;
>  
> @@ -137,16 +138,14 @@ static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
>  	return css;
>  }
>  
> -static int cgrp_css_online(struct cgroup *cgrp)
> +static int cgrp_css_online(struct cgroup_subsys_state *css)
>  {
> -	struct cgroup_subsys_state *css = cgroup_css(cgrp, net_prio_subsys_id);
> -	struct cgroup_subsys_state *parent_css;
> +	struct cgroup_subsys_state *parent_css = css_parent(css);
>  	struct net_device *dev;
>  	int ret = 0;
>  
> -	if (!cgrp->parent)
> +	if (!parent_css)
>  		return 0;
> -	parent_css = cgroup_css(cgrp->parent, net_prio_subsys_id);
>  
>  	rtnl_lock();
>  	/*
> @@ -164,9 +163,9 @@ static int cgrp_css_online(struct cgroup *cgrp)
>  	return ret;
>  }
>  
> -static void cgrp_css_free(struct cgroup *cgrp)
> +static void cgrp_css_free(struct cgroup_subsys_state *css)
>  {
> -	kfree(cgroup_css(cgrp, net_prio_subsys_id));
> +	kfree(css);
>  }
>  
>  static u64 read_prioidx(struct cgroup *cgrp, struct cftype *cft)
> @@ -221,12 +220,13 @@ static int update_netprio(const void *v, struct file *file, unsigned n)
>  	return 0;
>  }
>  
> -static void net_prio_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
> +static void net_prio_attach(struct cgroup_subsys_state *css,
> +			    struct cgroup_taskset *tset)
>  {
>  	struct task_struct *p;
>  	void *v;
>  
> -	cgroup_taskset_for_each(p, cgrp, tset) {
> +	cgroup_taskset_for_each(p, css->cgroup, tset) {
>  		task_lock(p);
>  		v = (void *)(unsigned long)task_netprioidx(p);
>  		iterate_fd(p->files, 0, update_netprio, v);
> diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
> index 9e6b75e..dc39838 100644
> --- a/net/sched/cls_cgroup.c
> +++ b/net/sched/cls_cgroup.c
> @@ -38,7 +38,8 @@ static inline struct cgroup_cls_state *task_cls_state(struct task_struct *p)
>  	return css_cls_state(task_css(p, net_cls_subsys_id));
>  }
>  
> -static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
> +static struct cgroup_subsys_state *
> +cgrp_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct cgroup_cls_state *cs;
>  
> @@ -48,19 +49,19 @@ static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
>  	return &cs->css;
>  }
>  
> -static int cgrp_css_online(struct cgroup *cgrp)
> +static int cgrp_css_online(struct cgroup_subsys_state *css)
>  {
> -	struct cgroup_cls_state *cs = cgrp_cls_state(cgrp);
> -	struct cgroup_cls_state *parent = css_cls_state(css_parent(&cs->css));
> +	struct cgroup_cls_state *cs = css_cls_state(css);
> +	struct cgroup_cls_state *parent = css_cls_state(css_parent(css));
>  
>  	if (parent)
>  		cs->classid = parent->classid;
>  	return 0;
>  }
>  
> -static void cgrp_css_free(struct cgroup *cgrp)
> +static void cgrp_css_free(struct cgroup_subsys_state *css)
>  {
> -	kfree(cgrp_cls_state(cgrp));
> +	kfree(css_cls_state(css));
>  }
>  
>  static int update_classid(const void *v, struct file *file, unsigned n)
> @@ -72,12 +73,13 @@ static int update_classid(const void *v, struct file *file, unsigned n)
>  	return 0;
>  }
>  
> -static void cgrp_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
> +static void cgrp_attach(struct cgroup_subsys_state *css,
> +			struct cgroup_taskset *tset)
>  {
>  	struct task_struct *p;
>  	void *v;
>  
> -	cgroup_taskset_for_each(p, cgrp, tset) {
> +	cgroup_taskset_for_each(p, css->cgroup, tset) {
>  		task_lock(p);
>  		v = (void *)(unsigned long)task_cls_classid(p);
>  		iterate_fd(p->files, 0, update_classid, v);
> diff --git a/security/device_cgroup.c b/security/device_cgroup.c
> index 635a49d..7293ac4 100644
> --- a/security/device_cgroup.c
> +++ b/security/device_cgroup.c
> @@ -68,7 +68,7 @@ static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
>  
>  struct cgroup_subsys devices_subsys;
>  
> -static int devcgroup_can_attach(struct cgroup *new_cgrp,
> +static int devcgroup_can_attach(struct cgroup_subsys_state *new_css,
>  				struct cgroup_taskset *set)
>  {
>  	struct task_struct *task = cgroup_taskset_first(set);
> @@ -193,13 +193,13 @@ static inline bool is_devcg_online(const struct dev_cgroup *devcg)
>  /**
>   * devcgroup_online - initializes devcgroup's behavior and exceptions based on
>   * 		      parent's
> - * @cgroup: cgroup getting online
> + * @css: css getting online
>   * returns 0 in case of success, error code otherwise
>   */
> -static int devcgroup_online(struct cgroup *cgroup)
> +static int devcgroup_online(struct cgroup_subsys_state *css)
>  {
> -	struct dev_cgroup *dev_cgroup = cgroup_to_devcgroup(cgroup);
> -	struct dev_cgroup *parent_dev_cgroup = css_to_devcgroup(css_parent(&dev_cgroup->css));
> +	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
> +	struct dev_cgroup *parent_dev_cgroup = css_to_devcgroup(css_parent(css));
>  	int ret = 0;
>  
>  	mutex_lock(&devcgroup_mutex);
> @@ -217,9 +217,9 @@ static int devcgroup_online(struct cgroup *cgroup)
>  	return ret;
>  }
>  
> -static void devcgroup_offline(struct cgroup *cgroup)
> +static void devcgroup_offline(struct cgroup_subsys_state *css)
>  {
> -	struct dev_cgroup *dev_cgroup = cgroup_to_devcgroup(cgroup);
> +	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
>  
>  	mutex_lock(&devcgroup_mutex);
>  	dev_cgroup->behavior = DEVCG_DEFAULT_NONE;
> @@ -229,7 +229,8 @@ static void devcgroup_offline(struct cgroup *cgroup)
>  /*
>   * called from kernel/cgroup.c with cgroup_lock() held.
>   */
> -static struct cgroup_subsys_state *devcgroup_css_alloc(struct cgroup *cgroup)
> +static struct cgroup_subsys_state *
> +devcgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
>  	struct dev_cgroup *dev_cgroup;
>  
> @@ -242,11 +243,10 @@ static struct cgroup_subsys_state *devcgroup_css_alloc(struct cgroup *cgroup)
>  	return &dev_cgroup->css;
>  }
>  
> -static void devcgroup_css_free(struct cgroup *cgroup)
> +static void devcgroup_css_free(struct cgroup_subsys_state *css)
>  {
> -	struct dev_cgroup *dev_cgroup;
> +	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
>  
> -	dev_cgroup = cgroup_to_devcgroup(cgroup);
>  	__dev_exception_clean(dev_cgroup);
>  	kfree(dev_cgroup);
>  }
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods
  2013-08-01 21:49 ` [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods Tejun Heo
@ 2013-08-02 13:27   ` Michal Hocko
  2013-08-05 14:19   ` Vivek Goyal
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:27 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Thu 01-08-13 17:49:50, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup.
> Please see the previous commit which converts the subsystem methods
> for rationale.
> 
> This patch converts all cftype file operations to take @css instead of
> @cgroup.  cftypes for the cgroup core files don't have their subsystem
> pointer set.  These will automatically use the dummy_css added by the
> previous patch and can be converted the same way.
> 
> Most subsystem conversions are straightforward but there are some
> interesting ones.
> 
> * freezer: update_if_frozen() is also converted to take @css instead
>   of @cgroup for consistency.  This will make the code look simpler
>   too once iterators are converted to use css.
> 
> * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
>   vmpressure while mem_cgroup_from_cont() can be made static.
>   Updated accordingly.
> 
> * cpu: cgroup_tg() doesn't have any user left.  Removed.
> 
> * cpuacct: cgroup_ca() doesn't have any user left.  Removed.
> 
> * hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
>   Removed.
> 
> * net_cls: cgrp_cls_state() doesn't have any user left.  Removed.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Steven Rostedt <rostedt@goodmis.org>

Looks good to me. For memcg parts
Acked-by: Michal Hocko <mhocko@suse.cz>
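
For reference, the calling convention this patch moves to can be sketched
in userspace C.  This is a minimal sketch, not kernel code: the struct
definitions are cut-down stand-ins for the ones in include/linux/cgroup.h,
and demo_state, css_to_demo(), demo_container_of() and the demo_*()
methods are hypothetical names invented for illustration.  It only shows
that a file method now receives the css, recovers its own state from it
the way css_to_blkcg() or css_freezer() do in the diff, and reaches the
cgroup through css->cgroup when it needs it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types (assumption: heavily trimmed). */
struct cgroup {
	unsigned long flags;
};

struct cgroup_subsys_state {
	struct cgroup *cgroup;		/* back-pointer to the owning cgroup */
};

/* Only two cftype methods, using the css-based signatures from the patch. */
struct cftype {
	const char *name;
	uint64_t (*read_u64)(struct cgroup_subsys_state *css, struct cftype *cft);
	int (*write_u64)(struct cgroup_subsys_state *css, struct cftype *cft,
			 uint64_t val);
};

/* Hypothetical subsystem state that embeds its css. */
struct demo_state {
	struct cgroup_subsys_state css;
	uint64_t weight;
};

/* Local equivalent of the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct demo_state *css_to_demo(struct cgroup_subsys_state *css)
{
	return css ? demo_container_of(css, struct demo_state, css) : NULL;
}

/* Subsystem file methods take @css directly; no cgroup -> css lookup. */
static uint64_t demo_weight_read(struct cgroup_subsys_state *css,
				 struct cftype *cft)
{
	(void)cft;
	return css_to_demo(css)->weight;
}

static int demo_weight_write(struct cgroup_subsys_state *css,
			     struct cftype *cft, uint64_t val)
{
	(void)cft;
	if (val < 1 || val > 1000)	/* arbitrary demo range */
		return -EINVAL;
	css_to_demo(css)->weight = val;
	return 0;
}

/* A method that only cares about the cgroup goes through css->cgroup. */
static uint64_t demo_flags_read(struct cgroup_subsys_state *css,
				struct cftype *cft)
{
	(void)cft;
	return css->cgroup->flags;
}

/* Write @val through the cftype, then read the stored weight back. */
uint64_t demo_roundtrip(uint64_t val)
{
	struct cgroup cgrp = { .flags = 0x2 };
	struct demo_state st = { .css = { .cgroup = &cgrp }, .weight = 500 };
	struct cftype cft = {
		.name = "demo.weight",
		.read_u64 = demo_weight_read,
		.write_u64 = demo_weight_write,
	};

	assert(demo_flags_read(&st.css, &cft) == 0x2);
	(void)cft.write_u64(&st.css, &cft, val);	/* rejected values keep 500 */
	return cft.read_u64(&st.css, &cft);
}
```

Calling demo_roundtrip() with an in-range value returns that value, while
an out-of-range value is rejected by the write method and the initial
weight is returned instead.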

> ---
>  block/blk-cgroup.c         |   6 +-
>  block/blk-throttle.c       |  32 ++++-----
>  block/cfq-iosched.c        |  90 ++++++++++++-------------
>  include/linux/cgroup.h     |  24 ++++---
>  include/linux/memcontrol.h |   2 +-
>  kernel/cgroup.c            | 162 +++++++++++++++++++++++----------------------
>  kernel/cgroup_freezer.c    |  40 +++++------
>  kernel/cpuset.c            |  35 +++++-----
>  kernel/sched/core.c        |  65 +++++++++---------
>  kernel/sched/cpuacct.c     |  28 +++-----
>  mm/hugetlb_cgroup.c        |  26 +++-----
>  mm/memcontrol.c            |  88 ++++++++++++------------
>  mm/vmpressure.c            |   4 +-
>  net/core/netprio_cgroup.c  |  10 ++-
>  net/ipv4/tcp_memcontrol.c  |  12 ++--
>  net/sched/cls_cgroup.c     |  14 ++--
>  security/device_cgroup.c   |  12 ++--
>  17 files changed, 322 insertions(+), 328 deletions(-)
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 3406373..f46f3c6 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -437,10 +437,10 @@ struct request_list *__blk_queue_next_rl(struct request_list *rl,
>  	return &blkg->rl;
>  }
>  
> -static int blkcg_reset_stats(struct cgroup *cgroup, struct cftype *cftype,
> -			     u64 val)
> +static int blkcg_reset_stats(struct cgroup_subsys_state *css,
> +			     struct cftype *cftype, u64 val)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgroup);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  	struct blkcg_gq *blkg;
>  	int i;
>  
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 08a32df..88bcfb6 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -1293,10 +1293,10 @@ static u64 tg_prfill_cpu_rwstat(struct seq_file *sf,
>  	return __blkg_prfill_rwstat(sf, pd, &rwstat);
>  }
>  
> -static int tg_print_cpu_rwstat(struct cgroup *cgrp, struct cftype *cft,
> -			       struct seq_file *sf)
> +static int tg_print_cpu_rwstat(struct cgroup_subsys_state *css,
> +			       struct cftype *cft, struct seq_file *sf)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  
>  	blkcg_print_blkgs(sf, blkcg, tg_prfill_cpu_rwstat, &blkcg_policy_throtl,
>  			  cft->private, true);
> @@ -1325,26 +1325,26 @@ static u64 tg_prfill_conf_uint(struct seq_file *sf, struct blkg_policy_data *pd,
>  	return __blkg_prfill_u64(sf, pd, v);
>  }
>  
> -static int tg_print_conf_u64(struct cgroup *cgrp, struct cftype *cft,
> -			     struct seq_file *sf)
> +static int tg_print_conf_u64(struct cgroup_subsys_state *css,
> +			     struct cftype *cft, struct seq_file *sf)
>  {
> -	blkcg_print_blkgs(sf, cgroup_to_blkcg(cgrp), tg_prfill_conf_u64,
> +	blkcg_print_blkgs(sf, css_to_blkcg(css), tg_prfill_conf_u64,
>  			  &blkcg_policy_throtl, cft->private, false);
>  	return 0;
>  }
>  
> -static int tg_print_conf_uint(struct cgroup *cgrp, struct cftype *cft,
> -			      struct seq_file *sf)
> +static int tg_print_conf_uint(struct cgroup_subsys_state *css,
> +			      struct cftype *cft, struct seq_file *sf)
>  {
> -	blkcg_print_blkgs(sf, cgroup_to_blkcg(cgrp), tg_prfill_conf_uint,
> +	blkcg_print_blkgs(sf, css_to_blkcg(css), tg_prfill_conf_uint,
>  			  &blkcg_policy_throtl, cft->private, false);
>  	return 0;
>  }
>  
> -static int tg_set_conf(struct cgroup *cgrp, struct cftype *cft, const char *buf,
> -		       bool is_u64)
> +static int tg_set_conf(struct cgroup_subsys_state *css, struct cftype *cft,
> +		       const char *buf, bool is_u64)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  	struct blkg_conf_ctx ctx;
>  	struct throtl_grp *tg;
>  	struct throtl_service_queue *sq;
> @@ -1403,16 +1403,16 @@ static int tg_set_conf(struct cgroup *cgrp, struct cftype *cft, const char *buf,
>  	return 0;
>  }
>  
> -static int tg_set_conf_u64(struct cgroup *cgrp, struct cftype *cft,
> +static int tg_set_conf_u64(struct cgroup_subsys_state *css, struct cftype *cft,
>  			   const char *buf)
>  {
> -	return tg_set_conf(cgrp, cft, buf, true);
> +	return tg_set_conf(css, cft, buf, true);
>  }
>  
> -static int tg_set_conf_uint(struct cgroup *cgrp, struct cftype *cft,
> +static int tg_set_conf_uint(struct cgroup_subsys_state *css, struct cftype *cft,
>  			    const char *buf)
>  {
> -	return tg_set_conf(cgrp, cft, buf, false);
> +	return tg_set_conf(css, cft, buf, false);
>  }
>  
>  static struct cftype throtl_files[] = {
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index d5bbdcf..dabb9d0 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -1607,12 +1607,11 @@ static u64 cfqg_prfill_weight_device(struct seq_file *sf,
>  	return __blkg_prfill_u64(sf, pd, cfqg->dev_weight);
>  }
>  
> -static int cfqg_print_weight_device(struct cgroup *cgrp, struct cftype *cft,
> -				    struct seq_file *sf)
> +static int cfqg_print_weight_device(struct cgroup_subsys_state *css,
> +				    struct cftype *cft, struct seq_file *sf)
>  {
> -	blkcg_print_blkgs(sf, cgroup_to_blkcg(cgrp),
> -			  cfqg_prfill_weight_device, &blkcg_policy_cfq, 0,
> -			  false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(css), cfqg_prfill_weight_device,
> +			  &blkcg_policy_cfq, 0, false);
>  	return 0;
>  }
>  
> @@ -1626,35 +1625,34 @@ static u64 cfqg_prfill_leaf_weight_device(struct seq_file *sf,
>  	return __blkg_prfill_u64(sf, pd, cfqg->dev_leaf_weight);
>  }
>  
> -static int cfqg_print_leaf_weight_device(struct cgroup *cgrp,
> +static int cfqg_print_leaf_weight_device(struct cgroup_subsys_state *css,
>  					 struct cftype *cft,
>  					 struct seq_file *sf)
>  {
> -	blkcg_print_blkgs(sf, cgroup_to_blkcg(cgrp),
> -			  cfqg_prfill_leaf_weight_device, &blkcg_policy_cfq, 0,
> -			  false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(css), cfqg_prfill_leaf_weight_device,
> +			  &blkcg_policy_cfq, 0, false);
>  	return 0;
>  }
>  
> -static int cfq_print_weight(struct cgroup *cgrp, struct cftype *cft,
> +static int cfq_print_weight(struct cgroup_subsys_state *css, struct cftype *cft,
>  			    struct seq_file *sf)
>  {
> -	seq_printf(sf, "%u\n", cgroup_to_blkcg(cgrp)->cfq_weight);
> +	seq_printf(sf, "%u\n", css_to_blkcg(css)->cfq_weight);
>  	return 0;
>  }
>  
> -static int cfq_print_leaf_weight(struct cgroup *cgrp, struct cftype *cft,
> -				 struct seq_file *sf)
> +static int cfq_print_leaf_weight(struct cgroup_subsys_state *css,
> +				 struct cftype *cft, struct seq_file *sf)
>  {
> -	seq_printf(sf, "%u\n",
> -		   cgroup_to_blkcg(cgrp)->cfq_leaf_weight);
> +	seq_printf(sf, "%u\n", css_to_blkcg(css)->cfq_leaf_weight);
>  	return 0;
>  }
>  
> -static int __cfqg_set_weight_device(struct cgroup *cgrp, struct cftype *cft,
> -				    const char *buf, bool is_leaf_weight)
> +static int __cfqg_set_weight_device(struct cgroup_subsys_state *css,
> +				    struct cftype *cft, const char *buf,
> +				    bool is_leaf_weight)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  	struct blkg_conf_ctx ctx;
>  	struct cfq_group *cfqg;
>  	int ret;
> @@ -1680,22 +1678,22 @@ static int __cfqg_set_weight_device(struct cgroup *cgrp, struct cftype *cft,
>  	return ret;
>  }
>  
> -static int cfqg_set_weight_device(struct cgroup *cgrp, struct cftype *cft,
> -				  const char *buf)
> +static int cfqg_set_weight_device(struct cgroup_subsys_state *css,
> +				  struct cftype *cft, const char *buf)
>  {
> -	return __cfqg_set_weight_device(cgrp, cft, buf, false);
> +	return __cfqg_set_weight_device(css, cft, buf, false);
>  }
>  
> -static int cfqg_set_leaf_weight_device(struct cgroup *cgrp, struct cftype *cft,
> -				       const char *buf)
> +static int cfqg_set_leaf_weight_device(struct cgroup_subsys_state *css,
> +				       struct cftype *cft, const char *buf)
>  {
> -	return __cfqg_set_weight_device(cgrp, cft, buf, true);
> +	return __cfqg_set_weight_device(css, cft, buf, true);
>  }
>  
> -static int __cfq_set_weight(struct cgroup *cgrp, struct cftype *cft, u64 val,
> -			    bool is_leaf_weight)
> +static int __cfq_set_weight(struct cgroup_subsys_state *css, struct cftype *cft,
> +			    u64 val, bool is_leaf_weight)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  	struct blkcg_gq *blkg;
>  
>  	if (val < CFQ_WEIGHT_MIN || val > CFQ_WEIGHT_MAX)
> @@ -1727,30 +1725,32 @@ static int __cfq_set_weight(struct cgroup *cgrp, struct cftype *cft, u64 val,
>  	return 0;
>  }
>  
> -static int cfq_set_weight(struct cgroup *cgrp, struct cftype *cft, u64 val)
> +static int cfq_set_weight(struct cgroup_subsys_state *css, struct cftype *cft,
> +			  u64 val)
>  {
> -	return __cfq_set_weight(cgrp, cft, val, false);
> +	return __cfq_set_weight(css, cft, val, false);
>  }
>  
> -static int cfq_set_leaf_weight(struct cgroup *cgrp, struct cftype *cft, u64 val)
> +static int cfq_set_leaf_weight(struct cgroup_subsys_state *css,
> +			       struct cftype *cft, u64 val)
>  {
> -	return __cfq_set_weight(cgrp, cft, val, true);
> +	return __cfq_set_weight(css, cft, val, true);
>  }
>  
> -static int cfqg_print_stat(struct cgroup *cgrp, struct cftype *cft,
> +static int cfqg_print_stat(struct cgroup_subsys_state *css, struct cftype *cft,
>  			   struct seq_file *sf)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  
>  	blkcg_print_blkgs(sf, blkcg, blkg_prfill_stat, &blkcg_policy_cfq,
>  			  cft->private, false);
>  	return 0;
>  }
>  
> -static int cfqg_print_rwstat(struct cgroup *cgrp, struct cftype *cft,
> -			     struct seq_file *sf)
> +static int cfqg_print_rwstat(struct cgroup_subsys_state *css,
> +			     struct cftype *cft, struct seq_file *sf)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  
>  	blkcg_print_blkgs(sf, blkcg, blkg_prfill_rwstat, &blkcg_policy_cfq,
>  			  cft->private, true);
> @@ -1773,20 +1773,20 @@ static u64 cfqg_prfill_rwstat_recursive(struct seq_file *sf,
>  	return __blkg_prfill_rwstat(sf, pd, &sum);
>  }
>  
> -static int cfqg_print_stat_recursive(struct cgroup *cgrp, struct cftype *cft,
> -				     struct seq_file *sf)
> +static int cfqg_print_stat_recursive(struct cgroup_subsys_state *css,
> +				     struct cftype *cft, struct seq_file *sf)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  
>  	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_stat_recursive,
>  			  &blkcg_policy_cfq, cft->private, false);
>  	return 0;
>  }
>  
> -static int cfqg_print_rwstat_recursive(struct cgroup *cgrp, struct cftype *cft,
> -				       struct seq_file *sf)
> +static int cfqg_print_rwstat_recursive(struct cgroup_subsys_state *css,
> +				       struct cftype *cft, struct seq_file *sf)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  
>  	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_rwstat_recursive,
>  			  &blkcg_policy_cfq, cft->private, true);
> @@ -1810,10 +1810,10 @@ static u64 cfqg_prfill_avg_queue_size(struct seq_file *sf,
>  }
>  
>  /* print avg_queue_size */
> -static int cfqg_print_avg_queue_size(struct cgroup *cgrp, struct cftype *cft,
> -				     struct seq_file *sf)
> +static int cfqg_print_avg_queue_size(struct cgroup_subsys_state *css,
> +				     struct cftype *cft, struct seq_file *sf)
>  {
> -	struct blkcg *blkcg = cgroup_to_blkcg(cgrp);
> +	struct blkcg *blkcg = css_to_blkcg(css);
>  
>  	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_avg_queue_size,
>  			  &blkcg_policy_cfq, 0, false);
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 085ca93..9749d63 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -439,34 +439,34 @@ struct cftype {
>  	struct cgroup_subsys *ss;
>  
>  	int (*open)(struct inode *inode, struct file *file);
> -	ssize_t (*read)(struct cgroup *cgrp, struct cftype *cft,
> +	ssize_t (*read)(struct cgroup_subsys_state *css, struct cftype *cft,
>  			struct file *file,
>  			char __user *buf, size_t nbytes, loff_t *ppos);
>  	/*
>  	 * read_u64() is a shortcut for the common case of returning a
>  	 * single integer. Use it in place of read()
>  	 */
> -	u64 (*read_u64)(struct cgroup *cgrp, struct cftype *cft);
> +	u64 (*read_u64)(struct cgroup_subsys_state *css, struct cftype *cft);
>  	/*
>  	 * read_s64() is a signed version of read_u64()
>  	 */
> -	s64 (*read_s64)(struct cgroup *cgrp, struct cftype *cft);
> +	s64 (*read_s64)(struct cgroup_subsys_state *css, struct cftype *cft);
>  	/*
>  	 * read_map() is used for defining a map of key/value
>  	 * pairs. It should call cb->fill(cb, key, value) for each
>  	 * entry. The key/value pairs (and their ordering) should not
>  	 * change between reboots.
>  	 */
> -	int (*read_map)(struct cgroup *cgrp, struct cftype *cft,
> +	int (*read_map)(struct cgroup_subsys_state *css, struct cftype *cft,
>  			struct cgroup_map_cb *cb);
>  	/*
>  	 * read_seq_string() is used for outputting a simple sequence
>  	 * using seqfile.
>  	 */
> -	int (*read_seq_string)(struct cgroup *cgrp, struct cftype *cft,
> -			       struct seq_file *m);
> +	int (*read_seq_string)(struct cgroup_subsys_state *css,
> +			       struct cftype *cft, struct seq_file *m);
>  
> -	ssize_t (*write)(struct cgroup *cgrp, struct cftype *cft,
> +	ssize_t (*write)(struct cgroup_subsys_state *css, struct cftype *cft,
>  			 struct file *file,
>  			 const char __user *buf, size_t nbytes, loff_t *ppos);
>  
> @@ -475,18 +475,20 @@ struct cftype {
>  	 * a single integer (as parsed by simple_strtoull) from
>  	 * userspace. Use in place of write(); return 0 or error.
>  	 */
> -	int (*write_u64)(struct cgroup *cgrp, struct cftype *cft, u64 val);
> +	int (*write_u64)(struct cgroup_subsys_state *css, struct cftype *cft,
> +			 u64 val);
>  	/*
>  	 * write_s64() is a signed version of write_u64()
>  	 */
> -	int (*write_s64)(struct cgroup *cgrp, struct cftype *cft, s64 val);
> +	int (*write_s64)(struct cgroup_subsys_state *css, struct cftype *cft,
> +			 s64 val);
>  
>  	/*
>  	 * write_string() is passed a nul-terminated kernelspace
>  	 * buffer of maximum length determined by max_write_len.
>  	 * Returns 0 or -ve error code.
>  	 */
> -	int (*write_string)(struct cgroup *cgrp, struct cftype *cft,
> +	int (*write_string)(struct cgroup_subsys_state *css, struct cftype *cft,
>  			    const char *buffer);
>  	/*
>  	 * trigger() callback can be used to get some kick from the
> @@ -494,7 +496,7 @@ struct cftype {
>  	 * at all. The private field can be used to determine the
>  	 * kick type for multiplexing.
>  	 */
> -	int (*trigger)(struct cgroup *cgrp, unsigned int event);
> +	int (*trigger)(struct cgroup_subsys_state *css, unsigned int event);
>  
>  	int (*release)(struct inode *inode, struct file *file);
>  
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 7b4d9d7..6c41609 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -85,7 +85,7 @@ extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
>  extern struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm);
>  
>  extern struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg);
> -extern struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont);
> +extern struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css);
>  
>  static inline
>  bool mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *memcg)
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index bb87c9f..6c68192 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2228,34 +2228,38 @@ int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
>  }
>  EXPORT_SYMBOL_GPL(cgroup_attach_task_all);
>  
> -static int cgroup_tasks_write(struct cgroup *cgrp, struct cftype *cft, u64 pid)
> +static int cgroup_tasks_write(struct cgroup_subsys_state *css,
> +			      struct cftype *cft, u64 pid)
>  {
> -	return attach_task_by_pid(cgrp, pid, false);
> +	return attach_task_by_pid(css->cgroup, pid, false);
>  }
>  
> -static int cgroup_procs_write(struct cgroup *cgrp, struct cftype *cft, u64 tgid)
> +static int cgroup_procs_write(struct cgroup_subsys_state *css,
> +			      struct cftype *cft, u64 tgid)
>  {
> -	return attach_task_by_pid(cgrp, tgid, true);
> +	return attach_task_by_pid(css->cgroup, tgid, true);
>  }
>  
> -static int cgroup_release_agent_write(struct cgroup *cgrp, struct cftype *cft,
> -				      const char *buffer)
> +static int cgroup_release_agent_write(struct cgroup_subsys_state *css,
> +				      struct cftype *cft, const char *buffer)
>  {
> -	BUILD_BUG_ON(sizeof(cgrp->root->release_agent_path) < PATH_MAX);
> +	BUILD_BUG_ON(sizeof(css->cgroup->root->release_agent_path) < PATH_MAX);
>  	if (strlen(buffer) >= PATH_MAX)
>  		return -EINVAL;
> -	if (!cgroup_lock_live_group(cgrp))
> +	if (!cgroup_lock_live_group(css->cgroup))
>  		return -ENODEV;
>  	mutex_lock(&cgroup_root_mutex);
> -	strcpy(cgrp->root->release_agent_path, buffer);
> +	strcpy(css->cgroup->root->release_agent_path, buffer);
>  	mutex_unlock(&cgroup_root_mutex);
>  	mutex_unlock(&cgroup_mutex);
>  	return 0;
>  }
>  
> -static int cgroup_release_agent_show(struct cgroup *cgrp, struct cftype *cft,
> -				     struct seq_file *seq)
> +static int cgroup_release_agent_show(struct cgroup_subsys_state *css,
> +				     struct cftype *cft, struct seq_file *seq)
>  {
> +	struct cgroup *cgrp = css->cgroup;
> +
>  	if (!cgroup_lock_live_group(cgrp))
>  		return -ENODEV;
>  	seq_puts(seq, cgrp->root->release_agent_path);
> @@ -2264,10 +2268,10 @@ static int cgroup_release_agent_show(struct cgroup *cgrp, struct cftype *cft,
>  	return 0;
>  }
>  
> -static int cgroup_sane_behavior_show(struct cgroup *cgrp, struct cftype *cft,
> -				     struct seq_file *seq)
> +static int cgroup_sane_behavior_show(struct cgroup_subsys_state *css,
> +				     struct cftype *cft, struct seq_file *seq)
>  {
> -	seq_printf(seq, "%d\n", cgroup_sane_behavior(cgrp));
> +	seq_printf(seq, "%d\n", cgroup_sane_behavior(css->cgroup));
>  	return 0;
>  }
>  
> @@ -2285,10 +2289,10 @@ static struct cgroup_subsys_state *cgroup_file_css(struct cfent *cfe)
>  /* A buffer size big enough for numbers or short strings */
>  #define CGROUP_LOCAL_BUFFER_SIZE 64
>  
> -static ssize_t cgroup_write_X64(struct cgroup *cgrp, struct cftype *cft,
> -				struct file *file,
> -				const char __user *userbuf,
> -				size_t nbytes, loff_t *unused_ppos)
> +static ssize_t cgroup_write_X64(struct cgroup_subsys_state *css,
> +				struct cftype *cft, struct file *file,
> +				const char __user *userbuf, size_t nbytes,
> +				loff_t *unused_ppos)
>  {
>  	char buffer[CGROUP_LOCAL_BUFFER_SIZE];
>  	int retval = 0;
> @@ -2306,22 +2310,22 @@ static ssize_t cgroup_write_X64(struct cgroup *cgrp, struct cftype *cft,
>  		u64 val = simple_strtoull(strstrip(buffer), &end, 0);
>  		if (*end)
>  			return -EINVAL;
> -		retval = cft->write_u64(cgrp, cft, val);
> +		retval = cft->write_u64(css, cft, val);
>  	} else {
>  		s64 val = simple_strtoll(strstrip(buffer), &end, 0);
>  		if (*end)
>  			return -EINVAL;
> -		retval = cft->write_s64(cgrp, cft, val);
> +		retval = cft->write_s64(css, cft, val);
>  	}
>  	if (!retval)
>  		retval = nbytes;
>  	return retval;
>  }
>  
> -static ssize_t cgroup_write_string(struct cgroup *cgrp, struct cftype *cft,
> -				   struct file *file,
> -				   const char __user *userbuf,
> -				   size_t nbytes, loff_t *unused_ppos)
> +static ssize_t cgroup_write_string(struct cgroup_subsys_state *css,
> +				   struct cftype *cft, struct file *file,
> +				   const char __user *userbuf, size_t nbytes,
> +				   loff_t *unused_ppos)
>  {
>  	char local_buffer[CGROUP_LOCAL_BUFFER_SIZE];
>  	int retval = 0;
> @@ -2344,7 +2348,7 @@ static ssize_t cgroup_write_string(struct cgroup *cgrp, struct cftype *cft,
>  	}
>  
>  	buffer[nbytes] = 0;     /* nul-terminate */
> -	retval = cft->write_string(cgrp, cft, strstrip(buffer));
> +	retval = cft->write_string(css, cft, strstrip(buffer));
>  	if (!retval)
>  		retval = nbytes;
>  out:
> @@ -2354,60 +2358,60 @@ out:
>  }
>  
>  static ssize_t cgroup_file_write(struct file *file, const char __user *buf,
> -						size_t nbytes, loff_t *ppos)
> +				 size_t nbytes, loff_t *ppos)
>  {
> +	struct cfent *cfe = __d_cfe(file->f_dentry);
>  	struct cftype *cft = __d_cft(file->f_dentry);
> -	struct cgroup *cgrp = __d_cgrp(file->f_dentry->d_parent);
> +	struct cgroup_subsys_state *css = cgroup_file_css(cfe);
>  
>  	if (cft->write)
> -		return cft->write(cgrp, cft, file, buf, nbytes, ppos);
> +		return cft->write(css, cft, file, buf, nbytes, ppos);
>  	if (cft->write_u64 || cft->write_s64)
> -		return cgroup_write_X64(cgrp, cft, file, buf, nbytes, ppos);
> +		return cgroup_write_X64(css, cft, file, buf, nbytes, ppos);
>  	if (cft->write_string)
> -		return cgroup_write_string(cgrp, cft, file, buf, nbytes, ppos);
> +		return cgroup_write_string(css, cft, file, buf, nbytes, ppos);
>  	if (cft->trigger) {
> -		int ret = cft->trigger(cgrp, (unsigned int)cft->private);
> +		int ret = cft->trigger(css, (unsigned int)cft->private);
>  		return ret ? ret : nbytes;
>  	}
>  	return -EINVAL;
>  }
>  
> -static ssize_t cgroup_read_u64(struct cgroup *cgrp, struct cftype *cft,
> -			       struct file *file,
> -			       char __user *buf, size_t nbytes,
> -			       loff_t *ppos)
> +static ssize_t cgroup_read_u64(struct cgroup_subsys_state *css,
> +			       struct cftype *cft, struct file *file,
> +			       char __user *buf, size_t nbytes, loff_t *ppos)
>  {
>  	char tmp[CGROUP_LOCAL_BUFFER_SIZE];
> -	u64 val = cft->read_u64(cgrp, cft);
> +	u64 val = cft->read_u64(css, cft);
>  	int len = sprintf(tmp, "%llu\n", (unsigned long long) val);
>  
>  	return simple_read_from_buffer(buf, nbytes, ppos, tmp, len);
>  }
>  
> -static ssize_t cgroup_read_s64(struct cgroup *cgrp, struct cftype *cft,
> -			       struct file *file,
> -			       char __user *buf, size_t nbytes,
> -			       loff_t *ppos)
> +static ssize_t cgroup_read_s64(struct cgroup_subsys_state *css,
> +			       struct cftype *cft, struct file *file,
> +			       char __user *buf, size_t nbytes, loff_t *ppos)
>  {
>  	char tmp[CGROUP_LOCAL_BUFFER_SIZE];
> -	s64 val = cft->read_s64(cgrp, cft);
> +	s64 val = cft->read_s64(css, cft);
>  	int len = sprintf(tmp, "%lld\n", (long long) val);
>  
>  	return simple_read_from_buffer(buf, nbytes, ppos, tmp, len);
>  }
>  
>  static ssize_t cgroup_file_read(struct file *file, char __user *buf,
> -				   size_t nbytes, loff_t *ppos)
> +				size_t nbytes, loff_t *ppos)
>  {
> +	struct cfent *cfe = __d_cfe(file->f_dentry);
>  	struct cftype *cft = __d_cft(file->f_dentry);
> -	struct cgroup *cgrp = __d_cgrp(file->f_dentry->d_parent);
> +	struct cgroup_subsys_state *css = cgroup_file_css(cfe);
>  
>  	if (cft->read)
> -		return cft->read(cgrp, cft, file, buf, nbytes, ppos);
> +		return cft->read(css, cft, file, buf, nbytes, ppos);
>  	if (cft->read_u64)
> -		return cgroup_read_u64(cgrp, cft, file, buf, nbytes, ppos);
> +		return cgroup_read_u64(css, cft, file, buf, nbytes, ppos);
>  	if (cft->read_s64)
> -		return cgroup_read_s64(cgrp, cft, file, buf, nbytes, ppos);
> +		return cgroup_read_s64(css, cft, file, buf, nbytes, ppos);
>  	return -EINVAL;
>  }
>  
> @@ -2426,16 +2430,16 @@ static int cgroup_seqfile_show(struct seq_file *m, void *arg)
>  {
>  	struct cfent *cfe = m->private;
>  	struct cftype *cft = cfe->type;
> -	struct cgroup *cgrp = __d_cgrp(cfe->dentry->d_parent);
> +	struct cgroup_subsys_state *css = cgroup_file_css(cfe);
>  
>  	if (cft->read_map) {
>  		struct cgroup_map_cb cb = {
>  			.fill = cgroup_map_add,
>  			.state = m,
>  		};
> -		return cft->read_map(cgrp, cft, &cb);
> +		return cft->read_map(css, cft, &cb);
>  	}
> -	return cft->read_seq_string(cgrp, cft, m);
> +	return cft->read_seq_string(css, cft, m);
>  }
>  
>  static const struct file_operations cgroup_seqfile_operations = {
> @@ -3853,21 +3857,20 @@ static int cgroup_procs_open(struct inode *unused, struct file *file)
>  	return cgroup_pidlist_open(file, CGROUP_FILE_PROCS);
>  }
>  
> -static u64 cgroup_read_notify_on_release(struct cgroup *cgrp,
> -					    struct cftype *cft)
> +static u64 cgroup_read_notify_on_release(struct cgroup_subsys_state *css,
> +					 struct cftype *cft)
>  {
> -	return notify_on_release(cgrp);
> +	return notify_on_release(css->cgroup);
>  }
>  
> -static int cgroup_write_notify_on_release(struct cgroup *cgrp,
> -					  struct cftype *cft,
> -					  u64 val)
> +static int cgroup_write_notify_on_release(struct cgroup_subsys_state *css,
> +					  struct cftype *cft, u64 val)
>  {
> -	clear_bit(CGRP_RELEASABLE, &cgrp->flags);
> +	clear_bit(CGRP_RELEASABLE, &css->cgroup->flags);
>  	if (val)
> -		set_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);
> +		set_bit(CGRP_NOTIFY_ON_RELEASE, &css->cgroup->flags);
>  	else
> -		clear_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);
> +		clear_bit(CGRP_NOTIFY_ON_RELEASE, &css->cgroup->flags);
>  	return 0;
>  }
>  
> @@ -3965,9 +3968,10 @@ static void cgroup_event_ptable_queue_proc(struct file *file,
>   * Input must be in format '<event_fd> <control_fd> <args>'.
>   * Interpretation of args is defined by control file implementation.
>   */
> -static int cgroup_write_event_control(struct cgroup *cgrp, struct cftype *cft,
> -				      const char *buffer)
> +static int cgroup_write_event_control(struct cgroup_subsys_state *css,
> +				      struct cftype *cft, const char *buffer)
>  {
> +	struct cgroup *cgrp = css->cgroup;
>  	struct cgroup_event *event;
>  	struct cgroup *cgrp_cfile;
>  	unsigned int efd, cfd;
> @@ -4075,20 +4079,19 @@ out_kfree:
>  	return ret;
>  }
>  
> -static u64 cgroup_clone_children_read(struct cgroup *cgrp,
> -				    struct cftype *cft)
> +static u64 cgroup_clone_children_read(struct cgroup_subsys_state *css,
> +				      struct cftype *cft)
>  {
> -	return test_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags);
> +	return test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags);
>  }
>  
> -static int cgroup_clone_children_write(struct cgroup *cgrp,
> -				     struct cftype *cft,
> -				     u64 val)
> +static int cgroup_clone_children_write(struct cgroup_subsys_state *css,
> +				       struct cftype *cft, u64 val)
>  {
>  	if (val)
> -		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags);
> +		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags);
>  	else
> -		clear_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags);
> +		clear_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags);
>  	return 0;
>  }
>  
> @@ -5576,17 +5579,19 @@ static void debug_css_free(struct cgroup_subsys_state *css)
>  	kfree(css);
>  }
>  
> -static u64 debug_taskcount_read(struct cgroup *cgrp, struct cftype *cft)
> +static u64 debug_taskcount_read(struct cgroup_subsys_state *css,
> +				struct cftype *cft)
>  {
> -	return cgroup_task_count(cgrp);
> +	return cgroup_task_count(css->cgroup);
>  }
>  
> -static u64 current_css_set_read(struct cgroup *cgrp, struct cftype *cft)
> +static u64 current_css_set_read(struct cgroup_subsys_state *css,
> +				struct cftype *cft)
>  {
>  	return (u64)(unsigned long)current->cgroups;
>  }
>  
> -static u64 current_css_set_refcount_read(struct cgroup *cgrp,
> +static u64 current_css_set_refcount_read(struct cgroup_subsys_state *css,
>  					 struct cftype *cft)
>  {
>  	u64 count;
> @@ -5597,7 +5602,7 @@ static u64 current_css_set_refcount_read(struct cgroup *cgrp,
>  	return count;
>  }
>  
> -static int current_css_set_cg_links_read(struct cgroup *cgrp,
> +static int current_css_set_cg_links_read(struct cgroup_subsys_state *css,
>  					 struct cftype *cft,
>  					 struct seq_file *seq)
>  {
> @@ -5624,14 +5629,13 @@ static int current_css_set_cg_links_read(struct cgroup *cgrp,
>  }
>  
>  #define MAX_TASKS_SHOWN_PER_CSS 25
> -static int cgroup_css_links_read(struct cgroup *cgrp,
> -				 struct cftype *cft,
> -				 struct seq_file *seq)
> +static int cgroup_css_links_read(struct cgroup_subsys_state *css,
> +				 struct cftype *cft, struct seq_file *seq)
>  {
>  	struct cgrp_cset_link *link;
>  
>  	read_lock(&css_set_lock);
> -	list_for_each_entry(link, &cgrp->cset_links, cset_link) {
> +	list_for_each_entry(link, &css->cgroup->cset_links, cset_link) {
>  		struct css_set *cset = link->cset;
>  		struct task_struct *task;
>  		int count = 0;
> @@ -5650,9 +5654,9 @@ static int cgroup_css_links_read(struct cgroup *cgrp,
>  	return 0;
>  }
>  
> -static u64 releasable_read(struct cgroup *cgrp, struct cftype *cft)
> +static u64 releasable_read(struct cgroup_subsys_state *css, struct cftype *cft)
>  {
> -	return test_bit(CGRP_RELEASABLE, &cgrp->flags);
> +	return test_bit(CGRP_RELEASABLE, &css->cgroup->flags);
>  }
>  
>  static struct cftype debug_files[] =  {
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index f03a857..19613ba 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -245,7 +245,7 @@ out:
>  
>  /**
>   * update_if_frozen - update whether a cgroup finished freezing
> - * @cgroup: cgroup of interest
> + * @css: css of interest
>   *
>   * Once FREEZING is initiated, transition to FROZEN is lazily updated by
>   * calling this function.  If the current state is FREEZING but not FROZEN,
> @@ -256,12 +256,12 @@ out:
>   * update_if_frozen() on all descendants prior to invoking this function.
>   *
>   * Task states and freezer state might disagree while tasks are being
> - * migrated into or out of @cgroup, so we can't verify task states against
> + * migrated into or out of @css, so we can't verify task states against
>   * @freezer state here.  See freezer_attach() for details.
>   */
> -static void update_if_frozen(struct cgroup *cgroup)
> +static void update_if_frozen(struct cgroup_subsys_state *css)
>  {
> -	struct freezer *freezer = cgroup_freezer(cgroup);
> +	struct freezer *freezer = css_freezer(css);
>  	struct cgroup *pos;
>  	struct cgroup_iter it;
>  	struct task_struct *task;
> @@ -275,7 +275,7 @@ static void update_if_frozen(struct cgroup *cgroup)
>  		goto out_unlock;
>  
>  	/* are all (live) children frozen? */
> -	cgroup_for_each_child(pos, cgroup) {
> +	cgroup_for_each_child(pos, css->cgroup) {
>  		struct freezer *child = cgroup_freezer(pos);
>  
>  		if ((child->state & CGROUP_FREEZER_ONLINE) &&
> @@ -284,9 +284,9 @@ static void update_if_frozen(struct cgroup *cgroup)
>  	}
>  
>  	/* are all tasks frozen? */
> -	cgroup_iter_start(cgroup, &it);
> +	cgroup_iter_start(css->cgroup, &it);
>  
> -	while ((task = cgroup_iter_next(cgroup, &it))) {
> +	while ((task = cgroup_iter_next(css->cgroup, &it))) {
>  		if (freezing(task)) {
>  			/*
>  			 * freezer_should_skip() indicates that the task
> @@ -301,12 +301,12 @@ static void update_if_frozen(struct cgroup *cgroup)
>  
>  	freezer->state |= CGROUP_FROZEN;
>  out_iter_end:
> -	cgroup_iter_end(cgroup, &it);
> +	cgroup_iter_end(css->cgroup, &it);
>  out_unlock:
>  	spin_unlock_irq(&freezer->lock);
>  }
>  
> -static int freezer_read(struct cgroup *cgroup, struct cftype *cft,
> +static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
>  			struct seq_file *m)
>  {
>  	struct cgroup *pos;
> @@ -314,13 +314,13 @@ static int freezer_read(struct cgroup *cgroup, struct cftype *cft,
>  	rcu_read_lock();
>  
>  	/* update states bottom-up */
> -	cgroup_for_each_descendant_post(pos, cgroup)
> -		update_if_frozen(pos);
> -	update_if_frozen(cgroup);
> +	cgroup_for_each_descendant_post(pos, css->cgroup)
> +		update_if_frozen(cgroup_css(pos, freezer_subsys_id));
> +	update_if_frozen(css);
>  
>  	rcu_read_unlock();
>  
> -	seq_puts(m, freezer_state_strs(cgroup_freezer(cgroup)->state));
> +	seq_puts(m, freezer_state_strs(css_freezer(css)->state));
>  	seq_putc(m, '\n');
>  	return 0;
>  }
> @@ -426,7 +426,7 @@ static void freezer_change_state(struct freezer *freezer, bool freeze)
>  	rcu_read_unlock();
>  }
>  
> -static int freezer_write(struct cgroup *cgroup, struct cftype *cft,
> +static int freezer_write(struct cgroup_subsys_state *css, struct cftype *cft,
>  			 const char *buffer)
>  {
>  	bool freeze;
> @@ -438,20 +438,22 @@ static int freezer_write(struct cgroup *cgroup, struct cftype *cft,
>  	else
>  		return -EINVAL;
>  
> -	freezer_change_state(cgroup_freezer(cgroup), freeze);
> +	freezer_change_state(css_freezer(css), freeze);
>  	return 0;
>  }
>  
> -static u64 freezer_self_freezing_read(struct cgroup *cgroup, struct cftype *cft)
> +static u64 freezer_self_freezing_read(struct cgroup_subsys_state *css,
> +				      struct cftype *cft)
>  {
> -	struct freezer *freezer = cgroup_freezer(cgroup);
> +	struct freezer *freezer = css_freezer(css);
>  
>  	return (bool)(freezer->state & CGROUP_FREEZING_SELF);
>  }
>  
> -static u64 freezer_parent_freezing_read(struct cgroup *cgroup, struct cftype *cft)
> +static u64 freezer_parent_freezing_read(struct cgroup_subsys_state *css,
> +					struct cftype *cft)
>  {
> -	struct freezer *freezer = cgroup_freezer(cgroup);
> +	struct freezer *freezer = css_freezer(css);
>  
>  	return (bool)(freezer->state & CGROUP_FREEZING_PARENT);
>  }
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 8ce3fdc..89b76e1 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1603,9 +1603,10 @@ typedef enum {
>  	FILE_SPREAD_SLAB,
>  } cpuset_filetype_t;
>  
> -static int cpuset_write_u64(struct cgroup *cgrp, struct cftype *cft, u64 val)
> +static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
> +			    u64 val)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	cpuset_filetype_t type = cft->private;
>  	int retval = -ENODEV;
>  
> @@ -1650,9 +1651,10 @@ out_unlock:
>  	return retval;
>  }
>  
> -static int cpuset_write_s64(struct cgroup *cgrp, struct cftype *cft, s64 val)
> +static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
> +			    s64 val)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	cpuset_filetype_t type = cft->private;
>  	int retval = -ENODEV;
>  
> @@ -1676,10 +1678,10 @@ out_unlock:
>  /*
>   * Common handling for a write to a "cpus" or "mems" file.
>   */
> -static int cpuset_write_resmask(struct cgroup *cgrp, struct cftype *cft,
> -				const char *buf)
> +static int cpuset_write_resmask(struct cgroup_subsys_state *css,
> +				struct cftype *cft, const char *buf)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	struct cpuset *trialcs;
>  	int retval = -ENODEV;
>  
> @@ -1758,13 +1760,12 @@ static size_t cpuset_sprintf_memlist(char *page, struct cpuset *cs)
>  	return count;
>  }
>  
> -static ssize_t cpuset_common_file_read(struct cgroup *cgrp,
> -				       struct cftype *cft,
> -				       struct file *file,
> -				       char __user *buf,
> -				       size_t nbytes, loff_t *ppos)
> +static ssize_t cpuset_common_file_read(struct cgroup_subsys_state *css,
> +				       struct cftype *cft, struct file *file,
> +				       char __user *buf, size_t nbytes,
> +				       loff_t *ppos)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	cpuset_filetype_t type = cft->private;
>  	char *page;
>  	ssize_t retval = 0;
> @@ -1794,9 +1795,9 @@ out:
>  	return retval;
>  }
>  
> -static u64 cpuset_read_u64(struct cgroup *cgrp, struct cftype *cft)
> +static u64 cpuset_read_u64(struct cgroup_subsys_state *css, struct cftype *cft)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	cpuset_filetype_t type = cft->private;
>  	switch (type) {
>  	case FILE_CPU_EXCLUSIVE:
> @@ -1825,9 +1826,9 @@ static u64 cpuset_read_u64(struct cgroup *cgrp, struct cftype *cft)
>  	return 0;
>  }
>  
> -static s64 cpuset_read_s64(struct cgroup *cgrp, struct cftype *cft)
> +static s64 cpuset_read_s64(struct cgroup_subsys_state *css, struct cftype *cft)
>  {
> -	struct cpuset *cs = cgroup_cs(cgrp);
> +	struct cpuset *cs = css_cs(css);
>  	cpuset_filetype_t type = cft->private;
>  	switch (type) {
>  	case FILE_SCHED_RELAX_DOMAIN_LEVEL:
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 622b7ef..cc9a492 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7088,12 +7088,6 @@ static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
>  	return css ? container_of(css, struct task_group, css) : NULL;
>  }
>  
> -/* return corresponding task_group object of a cgroup */
> -static inline struct task_group *cgroup_tg(struct cgroup *cgrp)
> -{
> -	return css_tg(cgroup_css(cgrp, cpu_cgroup_subsys_id));
> -}
> -
>  static struct cgroup_subsys_state *
>  cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
>  {
> @@ -7179,15 +7173,16 @@ static void cpu_cgroup_exit(struct cgroup_subsys_state *css,
>  }
>  
>  #ifdef CONFIG_FAIR_GROUP_SCHED
> -static int cpu_shares_write_u64(struct cgroup *cgrp, struct cftype *cftype,
> -				u64 shareval)
> +static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
> +				struct cftype *cftype, u64 shareval)
>  {
> -	return sched_group_set_shares(cgroup_tg(cgrp), scale_load(shareval));
> +	return sched_group_set_shares(css_tg(css), scale_load(shareval));
>  }
>  
> -static u64 cpu_shares_read_u64(struct cgroup *cgrp, struct cftype *cft)
> +static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
> +			       struct cftype *cft)
>  {
> -	struct task_group *tg = cgroup_tg(cgrp);
> +	struct task_group *tg = css_tg(css);
>  
>  	return (u64) scale_load_down(tg->shares);
>  }
> @@ -7309,26 +7304,28 @@ long tg_get_cfs_period(struct task_group *tg)
>  	return cfs_period_us;
>  }
>  
> -static s64 cpu_cfs_quota_read_s64(struct cgroup *cgrp, struct cftype *cft)
> +static s64 cpu_cfs_quota_read_s64(struct cgroup_subsys_state *css,
> +				  struct cftype *cft)
>  {
> -	return tg_get_cfs_quota(cgroup_tg(cgrp));
> +	return tg_get_cfs_quota(css_tg(css));
>  }
>  
> -static int cpu_cfs_quota_write_s64(struct cgroup *cgrp, struct cftype *cftype,
> -				s64 cfs_quota_us)
> +static int cpu_cfs_quota_write_s64(struct cgroup_subsys_state *css,
> +				   struct cftype *cftype, s64 cfs_quota_us)
>  {
> -	return tg_set_cfs_quota(cgroup_tg(cgrp), cfs_quota_us);
> +	return tg_set_cfs_quota(css_tg(css), cfs_quota_us);
>  }
>  
> -static u64 cpu_cfs_period_read_u64(struct cgroup *cgrp, struct cftype *cft)
> +static u64 cpu_cfs_period_read_u64(struct cgroup_subsys_state *css,
> +				   struct cftype *cft)
>  {
> -	return tg_get_cfs_period(cgroup_tg(cgrp));
> +	return tg_get_cfs_period(css_tg(css));
>  }
>  
> -static int cpu_cfs_period_write_u64(struct cgroup *cgrp, struct cftype *cftype,
> -				u64 cfs_period_us)
> +static int cpu_cfs_period_write_u64(struct cgroup_subsys_state *css,
> +				    struct cftype *cftype, u64 cfs_period_us)
>  {
> -	return tg_set_cfs_period(cgroup_tg(cgrp), cfs_period_us);
> +	return tg_set_cfs_period(css_tg(css), cfs_period_us);
>  }
>  
>  struct cfs_schedulable_data {
> @@ -7409,10 +7406,10 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
>  	return ret;
>  }
>  
> -static int cpu_stats_show(struct cgroup *cgrp, struct cftype *cft,
> +static int cpu_stats_show(struct cgroup_subsys_state *css, struct cftype *cft,
>  		struct cgroup_map_cb *cb)
>  {
> -	struct task_group *tg = cgroup_tg(cgrp);
> +	struct task_group *tg = css_tg(css);
>  	struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
>  
>  	cb->fill(cb, "nr_periods", cfs_b->nr_periods);
> @@ -7425,26 +7422,28 @@ static int cpu_stats_show(struct cgroup *cgrp, struct cftype *cft,
>  #endif /* CONFIG_FAIR_GROUP_SCHED */
>  
>  #ifdef CONFIG_RT_GROUP_SCHED
> -static int cpu_rt_runtime_write(struct cgroup *cgrp, struct cftype *cft,
> -				s64 val)
> +static int cpu_rt_runtime_write(struct cgroup_subsys_state *css,
> +				struct cftype *cft, s64 val)
>  {
> -	return sched_group_set_rt_runtime(cgroup_tg(cgrp), val);
> +	return sched_group_set_rt_runtime(css_tg(css), val);
>  }
>  
> -static s64 cpu_rt_runtime_read(struct cgroup *cgrp, struct cftype *cft)
> +static s64 cpu_rt_runtime_read(struct cgroup_subsys_state *css,
> +			       struct cftype *cft)
>  {
> -	return sched_group_rt_runtime(cgroup_tg(cgrp));
> +	return sched_group_rt_runtime(css_tg(css));
>  }
>  
> -static int cpu_rt_period_write_uint(struct cgroup *cgrp, struct cftype *cftype,
> -		u64 rt_period_us)
> +static int cpu_rt_period_write_uint(struct cgroup_subsys_state *css,
> +				    struct cftype *cftype, u64 rt_period_us)
>  {
> -	return sched_group_set_rt_period(cgroup_tg(cgrp), rt_period_us);
> +	return sched_group_set_rt_period(css_tg(css), rt_period_us);
>  }
>  
> -static u64 cpu_rt_period_read_uint(struct cgroup *cgrp, struct cftype *cft)
> +static u64 cpu_rt_period_read_uint(struct cgroup_subsys_state *css,
> +				   struct cftype *cft)
>  {
> -	return sched_group_rt_period(cgroup_tg(cgrp));
> +	return sched_group_rt_period(css_tg(css));
>  }
>  #endif /* CONFIG_RT_GROUP_SCHED */
>  
> diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> index 1b784d9..f64722f 100644
> --- a/kernel/sched/cpuacct.c
> +++ b/kernel/sched/cpuacct.c
> @@ -38,12 +38,6 @@ static inline struct cpuacct *css_ca(struct cgroup_subsys_state *css)
>  	return css ? container_of(css, struct cpuacct, css) : NULL;
>  }
>  
> -/* return cpu accounting group corresponding to this container */
> -static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
> -{
> -	return css_ca(cgroup_css(cgrp, cpuacct_subsys_id));
> -}
> -
>  /* return cpu accounting group to which this task belongs */
>  static inline struct cpuacct *task_ca(struct task_struct *tsk)
>  {
> @@ -138,9 +132,9 @@ static void cpuacct_cpuusage_write(struct cpuacct *ca, int cpu, u64 val)
>  }
>  
>  /* return total cpu usage (in nanoseconds) of a group */
> -static u64 cpuusage_read(struct cgroup *cgrp, struct cftype *cft)
> +static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft)
>  {
> -	struct cpuacct *ca = cgroup_ca(cgrp);
> +	struct cpuacct *ca = css_ca(css);
>  	u64 totalcpuusage = 0;
>  	int i;
>  
> @@ -150,10 +144,10 @@ static u64 cpuusage_read(struct cgroup *cgrp, struct cftype *cft)
>  	return totalcpuusage;
>  }
>  
> -static int cpuusage_write(struct cgroup *cgrp, struct cftype *cftype,
> -								u64 reset)
> +static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft,
> +			  u64 reset)
>  {
> -	struct cpuacct *ca = cgroup_ca(cgrp);
> +	struct cpuacct *ca = css_ca(css);
>  	int err = 0;
>  	int i;
>  
> @@ -169,10 +163,10 @@ out:
>  	return err;
>  }
>  
> -static int cpuacct_percpu_seq_read(struct cgroup *cgroup, struct cftype *cft,
> -				   struct seq_file *m)
> +static int cpuacct_percpu_seq_read(struct cgroup_subsys_state *css,
> +				   struct cftype *cft, struct seq_file *m)
>  {
> -	struct cpuacct *ca = cgroup_ca(cgroup);
> +	struct cpuacct *ca = css_ca(css);
>  	u64 percpu;
>  	int i;
>  
> @@ -189,10 +183,10 @@ static const char * const cpuacct_stat_desc[] = {
>  	[CPUACCT_STAT_SYSTEM] = "system",
>  };
>  
> -static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
> -			      struct cgroup_map_cb *cb)
> +static int cpuacct_stats_show(struct cgroup_subsys_state *css,
> +			      struct cftype *cft, struct cgroup_map_cb *cb)
>  {
> -	struct cpuacct *ca = cgroup_ca(cgrp);
> +	struct cpuacct *ca = css_ca(css);
>  	int cpu;
>  	s64 val = 0;
>  
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index e213243..bda8e44 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -40,12 +40,6 @@ struct hugetlb_cgroup *hugetlb_cgroup_from_css(struct cgroup_subsys_state *s)
>  }
>  
>  static inline
> -struct hugetlb_cgroup *hugetlb_cgroup_from_cgroup(struct cgroup *cgroup)
> -{
> -	return hugetlb_cgroup_from_css(cgroup_css(cgroup, hugetlb_subsys_id));
> -}
> -
> -static inline
>  struct hugetlb_cgroup *hugetlb_cgroup_from_task(struct task_struct *task)
>  {
>  	return hugetlb_cgroup_from_css(task_css(task, hugetlb_subsys_id));
> @@ -248,14 +242,15 @@ void hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
>  	return;
>  }
>  
> -static ssize_t hugetlb_cgroup_read(struct cgroup *cgroup, struct cftype *cft,
> -				   struct file *file, char __user *buf,
> -				   size_t nbytes, loff_t *ppos)
> +static ssize_t hugetlb_cgroup_read(struct cgroup_subsys_state *css,
> +				   struct cftype *cft, struct file *file,
> +				   char __user *buf, size_t nbytes,
> +				   loff_t *ppos)
>  {
>  	u64 val;
>  	char str[64];
>  	int idx, name, len;
> -	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
> +	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
>  
>  	idx = MEMFILE_IDX(cft->private);
>  	name = MEMFILE_ATTR(cft->private);
> @@ -265,12 +260,12 @@ static ssize_t hugetlb_cgroup_read(struct cgroup *cgroup, struct cftype *cft,
>  	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
>  }
>  
> -static int hugetlb_cgroup_write(struct cgroup *cgroup, struct cftype *cft,
> -				const char *buffer)
> +static int hugetlb_cgroup_write(struct cgroup_subsys_state *css,
> +				struct cftype *cft, const char *buffer)
>  {
>  	int idx, name, ret;
>  	unsigned long long val;
> -	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
> +	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
>  
>  	idx = MEMFILE_IDX(cft->private);
>  	name = MEMFILE_ATTR(cft->private);
> @@ -295,10 +290,11 @@ static int hugetlb_cgroup_write(struct cgroup *cgroup, struct cftype *cft,
>  	return ret;
>  }
>  
> -static int hugetlb_cgroup_reset(struct cgroup *cgroup, unsigned int event)
> +static int hugetlb_cgroup_reset(struct cgroup_subsys_state *css,
> +				unsigned int event)
>  {
>  	int idx, name, ret = 0;
> -	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
> +	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
>  
>  	idx = MEMFILE_IDX(event);
>  	name = MEMFILE_ATTR(event);
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 32cca0f..ab64dfc 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -483,7 +483,6 @@ enum res_type {
>   */
>  static DEFINE_MUTEX(memcg_create_mutex);
>  
> -static inline
>  struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *s)
>  {
>  	return s ? container_of(s, struct mem_cgroup, css) : NULL;
> @@ -1035,7 +1034,7 @@ static void memcg_check_events(struct mem_cgroup *memcg, struct page *page)
>  		preempt_enable();
>  }
>  
> -struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont)
> +static inline struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont)
>  {
>  	return mem_cgroup_from_css(cgroup_css(cont, mem_cgroup_subsys_id));
>  }
> @@ -2951,10 +2950,10 @@ static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p)
>  }
>  
>  #ifdef CONFIG_SLABINFO
> -static int mem_cgroup_slabinfo_read(struct cgroup *cont, struct cftype *cft,
> -					struct seq_file *m)
> +static int mem_cgroup_slabinfo_read(struct cgroup_subsys_state *css,
> +				    struct cftype *cft, struct seq_file *m)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct memcg_cache_params *params;
>  
>  	if (!memcg_can_account_kmem(memcg))
> @@ -4999,9 +4998,10 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
>  	return 0;
>  }
>  
> -static int mem_cgroup_force_empty_write(struct cgroup *cont, unsigned int event)
> +static int mem_cgroup_force_empty_write(struct cgroup_subsys_state *css,
> +					unsigned int event)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	int ret;
>  
>  	if (mem_cgroup_is_root(memcg))
> @@ -5014,16 +5014,17 @@ static int mem_cgroup_force_empty_write(struct cgroup *cont, unsigned int event)
>  }
>  
>  
> -static u64 mem_cgroup_hierarchy_read(struct cgroup *cont, struct cftype *cft)
> +static u64 mem_cgroup_hierarchy_read(struct cgroup_subsys_state *css,
> +				     struct cftype *cft)
>  {
> -	return mem_cgroup_from_cont(cont)->use_hierarchy;
> +	return mem_cgroup_from_css(css)->use_hierarchy;
>  }
>  
> -static int mem_cgroup_hierarchy_write(struct cgroup *cont, struct cftype *cft,
> -					u64 val)
> +static int mem_cgroup_hierarchy_write(struct cgroup_subsys_state *css,
> +				      struct cftype *cft, u64 val)
>  {
>  	int retval = 0;
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup *parent_memcg = mem_cgroup_from_css(css_parent(&memcg->css));
>  
>  	mutex_lock(&memcg_create_mutex);
> @@ -5094,11 +5095,11 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
>  	return val << PAGE_SHIFT;
>  }
>  
> -static ssize_t mem_cgroup_read(struct cgroup *cont, struct cftype *cft,
> -			       struct file *file, char __user *buf,
> -			       size_t nbytes, loff_t *ppos)
> +static ssize_t mem_cgroup_read(struct cgroup_subsys_state *css,
> +			       struct cftype *cft, struct file *file,
> +			       char __user *buf, size_t nbytes, loff_t *ppos)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	char str[64];
>  	u64 val;
>  	int name, len;
> @@ -5131,11 +5132,11 @@ static ssize_t mem_cgroup_read(struct cgroup *cont, struct cftype *cft,
>  	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
>  }
>  
> -static int memcg_update_kmem_limit(struct cgroup *cont, u64 val)
> +static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
>  {
>  	int ret = -EINVAL;
>  #ifdef CONFIG_MEMCG_KMEM
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	/*
>  	 * For simplicity, we won't allow this to be disabled.  It also can't
>  	 * be changed if the cgroup has children already, or if tasks had
> @@ -5151,7 +5152,7 @@ static int memcg_update_kmem_limit(struct cgroup *cont, u64 val)
>  	mutex_lock(&memcg_create_mutex);
>  	mutex_lock(&set_limit_mutex);
>  	if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
> -		if (cgroup_task_count(cont) || memcg_has_children(memcg)) {
> +		if (cgroup_task_count(css->cgroup) || memcg_has_children(memcg)) {
>  			ret = -EBUSY;
>  			goto out;
>  		}
> @@ -5221,10 +5222,10 @@ out:
>   * The user of this function is...
>   * RES_LIMIT.
>   */
> -static int mem_cgroup_write(struct cgroup *cont, struct cftype *cft,
> +static int mem_cgroup_write(struct cgroup_subsys_state *css, struct cftype *cft,
>  			    const char *buffer)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	enum res_type type;
>  	int name;
>  	unsigned long long val;
> @@ -5248,7 +5249,7 @@ static int mem_cgroup_write(struct cgroup *cont, struct cftype *cft,
>  		else if (type == _MEMSWAP)
>  			ret = mem_cgroup_resize_memsw_limit(memcg, val);
>  		else if (type == _KMEM)
> -			ret = memcg_update_kmem_limit(cont, val);
> +			ret = memcg_update_kmem_limit(css, val);
>  		else
>  			return -EINVAL;
>  		break;
> @@ -5297,9 +5298,9 @@ out:
>  	*memsw_limit = min_memsw_limit;
>  }
>  
> -static int mem_cgroup_reset(struct cgroup *cont, unsigned int event)
> +static int mem_cgroup_reset(struct cgroup_subsys_state *css, unsigned int event)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	int name;
>  	enum res_type type;
>  
> @@ -5332,17 +5333,17 @@ static int mem_cgroup_reset(struct cgroup *cont, unsigned int event)
>  	return 0;
>  }
>  
> -static u64 mem_cgroup_move_charge_read(struct cgroup *cgrp,
> +static u64 mem_cgroup_move_charge_read(struct cgroup_subsys_state *css,
>  					struct cftype *cft)
>  {
> -	return mem_cgroup_from_cont(cgrp)->move_charge_at_immigrate;
> +	return mem_cgroup_from_css(css)->move_charge_at_immigrate;
>  }
>  
>  #ifdef CONFIG_MMU
> -static int mem_cgroup_move_charge_write(struct cgroup *cgrp,
> +static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css,
>  					struct cftype *cft, u64 val)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
>  	if (val >= (1 << NR_MOVE_TYPE))
>  		return -EINVAL;
> @@ -5357,7 +5358,7 @@ static int mem_cgroup_move_charge_write(struct cgroup *cgrp,
>  	return 0;
>  }
>  #else
> -static int mem_cgroup_move_charge_write(struct cgroup *cgrp,
> +static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css,
>  					struct cftype *cft, u64 val)
>  {
>  	return -ENOSYS;
> @@ -5365,13 +5366,13 @@ static int mem_cgroup_move_charge_write(struct cgroup *cgrp,
>  #endif
>  
>  #ifdef CONFIG_NUMA
> -static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
> -				      struct seq_file *m)
> +static int memcg_numa_stat_show(struct cgroup_subsys_state *css,
> +				struct cftype *cft, struct seq_file *m)
>  {
>  	int nid;
>  	unsigned long total_nr, file_nr, anon_nr, unevictable_nr;
>  	unsigned long node_nr;
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
>  	total_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL);
>  	seq_printf(m, "total=%lu", total_nr);
> @@ -5416,10 +5417,10 @@ static inline void mem_cgroup_lru_names_not_uptodate(void)
>  	BUILD_BUG_ON(ARRAY_SIZE(mem_cgroup_lru_names) != NR_LRU_LISTS);
>  }
>  
> -static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
> +static int memcg_stat_show(struct cgroup_subsys_state *css, struct cftype *cft,
>  				 struct seq_file *m)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup *mi;
>  	unsigned int i;
>  
> @@ -5503,17 +5504,18 @@ static int memcg_stat_show(struct cgroup *cont, struct cftype *cft,
>  	return 0;
>  }
>  
> -static u64 mem_cgroup_swappiness_read(struct cgroup *cgrp, struct cftype *cft)
> +static u64 mem_cgroup_swappiness_read(struct cgroup_subsys_state *css,
> +				      struct cftype *cft)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
>  	return mem_cgroup_swappiness(memcg);
>  }
>  
> -static int mem_cgroup_swappiness_write(struct cgroup *cgrp, struct cftype *cft,
> -				       u64 val)
> +static int mem_cgroup_swappiness_write(struct cgroup_subsys_state *css,
> +				       struct cftype *cft, u64 val)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
>  
>  	if (val > 100 || !parent)
> @@ -5829,10 +5831,10 @@ static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp,
>  	spin_unlock(&memcg_oom_lock);
>  }
>  
> -static int mem_cgroup_oom_control_read(struct cgroup *cgrp,
> +static int mem_cgroup_oom_control_read(struct cgroup_subsys_state *css,
>  	struct cftype *cft,  struct cgroup_map_cb *cb)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
>  	cb->fill(cb, "oom_kill_disable", memcg->oom_kill_disable);
>  
> @@ -5843,10 +5845,10 @@ static int mem_cgroup_oom_control_read(struct cgroup *cgrp,
>  	return 0;
>  }
>  
> -static int mem_cgroup_oom_control_write(struct cgroup *cgrp,
> +static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
>  	struct cftype *cft, u64 val)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
>  
>  	/* cannot set to root cgroup and only 0 and 1 are allowed */
> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> index 7f1654d..2a8a736 100644
> --- a/mm/vmpressure.c
> +++ b/mm/vmpressure.c
> @@ -81,8 +81,8 @@ static struct vmpressure *cg_to_vmpressure(struct cgroup *cg)
>  
>  static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr)
>  {
> -	struct cgroup *cg = vmpressure_to_css(vmpr)->cgroup;
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cg);
> +	struct cgroup_subsys_state *css = vmpressure_to_css(vmpr);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
>  	memcg = parent_mem_cgroup(memcg);
>  	if (!memcg)
> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> index 8d095b4..e00f60e 100644
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -168,15 +168,14 @@ static void cgrp_css_free(struct cgroup_subsys_state *css)
>  	kfree(css);
>  }
>  
> -static u64 read_prioidx(struct cgroup *cgrp, struct cftype *cft)
> +static u64 read_prioidx(struct cgroup_subsys_state *css, struct cftype *cft)
>  {
> -	return cgrp->id;
> +	return css->cgroup->id;
>  }
>  
> -static int read_priomap(struct cgroup *cont, struct cftype *cft,
> +static int read_priomap(struct cgroup_subsys_state *css, struct cftype *cft,
>  			struct cgroup_map_cb *cb)
>  {
> -	struct cgroup_subsys_state *css = cgroup_css(cont, net_prio_subsys_id);
>  	struct net_device *dev;
>  
>  	rcu_read_lock();
> @@ -186,10 +185,9 @@ static int read_priomap(struct cgroup *cont, struct cftype *cft,
>  	return 0;
>  }
>  
> -static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
> +static int write_priomap(struct cgroup_subsys_state *css, struct cftype *cft,
>  			 const char *buffer)
>  {
> -	struct cgroup_subsys_state *css = cgroup_css(cgrp, net_prio_subsys_id);
>  	char devname[IFNAMSIZ + 1];
>  	struct net_device *dev;
>  	u32 prio;
> diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
> index da14436..8a57d79 100644
> --- a/net/ipv4/tcp_memcontrol.c
> +++ b/net/ipv4/tcp_memcontrol.c
> @@ -132,10 +132,10 @@ static int tcp_update_limit(struct mem_cgroup *memcg, u64 val)
>  	return 0;
>  }
>  
> -static int tcp_cgroup_write(struct cgroup *cont, struct cftype *cft,
> +static int tcp_cgroup_write(struct cgroup_subsys_state *css, struct cftype *cft,
>  			    const char *buffer)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	unsigned long long val;
>  	int ret = 0;
>  
> @@ -180,9 +180,9 @@ static u64 tcp_read_usage(struct mem_cgroup *memcg)
>  	return res_counter_read_u64(&tcp->tcp_memory_allocated, RES_USAGE);
>  }
>  
> -static u64 tcp_cgroup_read(struct cgroup *cont, struct cftype *cft)
> +static u64 tcp_cgroup_read(struct cgroup_subsys_state *css, struct cftype *cft)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	u64 val;
>  
>  	switch (cft->private) {
> @@ -202,13 +202,13 @@ static u64 tcp_cgroup_read(struct cgroup *cont, struct cftype *cft)
>  	return val;
>  }
>  
> -static int tcp_cgroup_reset(struct cgroup *cont, unsigned int event)
> +static int tcp_cgroup_reset(struct cgroup_subsys_state *css, unsigned int event)
>  {
>  	struct mem_cgroup *memcg;
>  	struct tcp_memcontrol *tcp;
>  	struct cg_proto *cg_proto;
>  
> -	memcg = mem_cgroup_from_cont(cont);
> +	memcg = mem_cgroup_from_css(css);
>  	cg_proto = tcp_prot.proto_cgroup(memcg);
>  	if (!cg_proto)
>  		return 0;
> diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
> index dc39838..8ea1184 100644
> --- a/net/sched/cls_cgroup.c
> +++ b/net/sched/cls_cgroup.c
> @@ -28,11 +28,6 @@ static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state
>  	return css ? container_of(css, struct cgroup_cls_state, css) : NULL;
>  }
>  
> -static inline struct cgroup_cls_state *cgrp_cls_state(struct cgroup *cgrp)
> -{
> -	return css_cls_state(cgroup_css(cgrp, net_cls_subsys_id));
> -}
> -
>  static inline struct cgroup_cls_state *task_cls_state(struct task_struct *p)
>  {
>  	return css_cls_state(task_css(p, net_cls_subsys_id));
> @@ -87,14 +82,15 @@ static void cgrp_attach(struct cgroup_subsys_state *css,
>  	}
>  }
>  
> -static u64 read_classid(struct cgroup *cgrp, struct cftype *cft)
> +static u64 read_classid(struct cgroup_subsys_state *css, struct cftype *cft)
>  {
> -	return cgrp_cls_state(cgrp)->classid;
> +	return css_cls_state(css)->classid;
>  }
>  
> -static int write_classid(struct cgroup *cgrp, struct cftype *cft, u64 value)
> +static int write_classid(struct cgroup_subsys_state *css, struct cftype *cft,
> +			 u64 value)
>  {
> -	cgrp_cls_state(cgrp)->classid = (u32) value;
> +	css_cls_state(css)->classid = (u32) value;
>  	return 0;
>  }
>  
> diff --git a/security/device_cgroup.c b/security/device_cgroup.c
> index 7293ac4..e0ca464 100644
> --- a/security/device_cgroup.c
> +++ b/security/device_cgroup.c
> @@ -289,10 +289,10 @@ static void set_majmin(char *str, unsigned m)
>  		sprintf(str, "%u", m);
>  }
>  
> -static int devcgroup_seq_read(struct cgroup *cgroup, struct cftype *cft,
> -				struct seq_file *m)
> +static int devcgroup_seq_read(struct cgroup_subsys_state *css,
> +			      struct cftype *cft, struct seq_file *m)
>  {
> -	struct dev_cgroup *devcgroup = cgroup_to_devcgroup(cgroup);
> +	struct dev_cgroup *devcgroup = css_to_devcgroup(css);
>  	struct dev_exception_item *ex;
>  	char maj[MAJMINLEN], min[MAJMINLEN], acc[ACCLEN];
>  
> @@ -669,13 +669,13 @@ static int devcgroup_update_access(struct dev_cgroup *devcgroup,
>  	return rc;
>  }
>  
> -static int devcgroup_access_write(struct cgroup *cgrp, struct cftype *cft,
> -				  const char *buffer)
> +static int devcgroup_access_write(struct cgroup_subsys_state *css,
> +				  struct cftype *cft, const char *buffer)
>  {
>  	int retval;
>  
>  	mutex_lock(&devcgroup_mutex);
> -	retval = devcgroup_update_access(cgroup_to_devcgroup(cgrp),
> +	retval = devcgroup_update_access(css_to_devcgroup(css),
>  					 cft->private, buffer);
>  	mutex_unlock(&devcgroup_mutex);
>  	return retval;
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 ` [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
@ 2013-08-02 13:32   ` Michal Hocko
  2013-08-05 14:25   ` Vivek Goyal
  2013-08-05 18:10   ` Aristeu Rozanski
  2 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:32 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Johannes Weiner,
	Balbir Singh, Aristeu Rozanski, Matt Helsley, Vivek Goyal,
	Jens Axboe

On Thu 01-08-13 17:49:53, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using css
> (cgroup_subsys_state) as the primary handle instead of cgroup in
> subsystem API.  For hierarchy iterators, this is beneficial because
> 
> * In most cases, css is the only thing subsystems care about anyway.
> 
> * On the planned unified hierarchy, iterations for different
>   subsystems will need to skip over different subtrees of the
>   hierarchy depending on which subsystems are enabled on each cgroup.
>   Passing around css makes it unnecessary to explicitly specify the
>   subsystem in question as css is the intersection between cgroup and
>   subsystem.
> 
> * For the planned unified hierarchy, css's would need to be created
>   and destroyed dynamically independent from cgroup hierarchy.  Having
>   cgroup core manage css iteration makes enforcing deref rules a lot
>   easier.
> 
> Most subsystem conversions are straight-forward.  Noteworthy changes
> are
> 
> * blkio: cgroup_to_blkcg() is no longer used.  Removed.
> 
> * freezer: cgroup_freezer() is no longer used.  Removed.
> 
> * devices: cgroup_to_devcgroup() is no longer used.  Removed.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
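
To make the iterator change described above concrete, here is a minimal
user-space sketch of the pre-order descendant walk, modeled on the shape of
css_next_descendant_pre() in the quoted patch.  It is not kernel code:
struct node, next_child(), next_descendant_pre(), add_child() and
demo_preorder() are hypothetical stand-ins, and RCU, locking and liveness
checks are left out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-in for a css; only the tree links matter here. */
struct node {
	char name;
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/* Next child of @parent after @pos; a NULL @pos starts at the first child. */
static struct node *next_child(struct node *pos, struct node *parent)
{
	return pos ? pos->next_sibling : parent->first_child;
}

/* Next node in a pre-order walk of @root's descendants (@root excluded). */
static struct node *next_descendant_pre(struct node *pos, struct node *root)
{
	struct node *next;

	/* if first iteration, pretend we just visited @root */
	if (!pos)
		pos = root;

	/* visit the first child if it exists */
	next = next_child(NULL, pos);
	if (next)
		return next;

	/* no child, visit my or the closest ancestor's next sibling */
	while (pos != root) {
		next = next_child(pos, pos->parent);
		if (next)
			return next;
		pos = pos->parent;
	}

	return NULL;
}

#define for_each_descendant_pre(pos, root)				\
	for ((pos) = next_descendant_pre(NULL, (root)); (pos);		\
	     (pos) = next_descendant_pre((pos), (root)))

/* Append @child to the end of @parent's child list. */
static void add_child(struct node *parent, struct node *child)
{
	struct node **link = &parent->first_child;

	child->parent = parent;
	child->next_sibling = NULL;
	while (*link)
		link = &(*link)->next_sibling;
	*link = child;
}

/* Build A -> {B -> {D, E}, C} and record the visit order into @out. */
static void demo_preorder(char *out)
{
	struct node a = { 'A' }, b = { 'B' }, c = { 'C' };
	struct node d = { 'D' }, e = { 'E' };
	struct node *pos;

	add_child(&a, &b);
	add_child(&a, &c);
	add_child(&b, &d);
	add_child(&b, &e);

	for_each_descendant_pre(pos, &a)
		*out++ = pos->name;
	*out = '\0';
}
```

For the tree built in demo_preorder(), the walk records B, D, E, C.  As in
the quoted patch, the root itself is not visited, which is why callers such
as freezer_read() handle the root separately.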

For the memcg part:
Acked-by: Michal Hocko <mhocko@suse.cz>
(I hated the additional css.cgroup step anyway)
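
For readers unfamiliar with why the extra hop is unnecessary, here is a
small user-space sketch of the container_of() pattern that helpers such as
css_freezer() in the quoted diff rely on.  The struct and function names
(css_sketch, freezer_sketch, css_to_freezer_sketch) are hypothetical, and
the macro is a simplified version of the kernel's container_of() without
its type checking.

```c
#include <stddef.h>

/* Simplified container_of(): step back from a member to its container. */
#define container_of(ptr, type, member)					\
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct cgroup_subsys_state. */
struct css_sketch {
	int refcnt;
};

/* Hypothetical subsystem state that embeds its css. */
struct freezer_sketch {
	unsigned int state;
	struct css_sketch css;
};

/* Go from the embedded css straight to the subsystem state. */
static struct freezer_sketch *css_to_freezer_sketch(struct css_sketch *css)
{
	return css ? container_of(css, struct freezer_sketch, css) : NULL;
}
```

Because the subsystem state embeds the css, a callback that is handed a css
can reach its own data with pointer arithmetic alone; no lookup through the
cgroup and the subsystem id is needed.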

> ---
>  block/blk-cgroup.c       |   8 +--
>  block/blk-cgroup.h       |  25 ++++-----
>  block/blk-throttle.c     |   8 +--
>  include/linux/cgroup.h   |  88 ++++++++++++++++---------------
>  kernel/cgroup.c          | 131 ++++++++++++++++++++++++++---------------------
>  kernel/cgroup_freezer.c  |  25 ++++-----
>  kernel/cpuset.c          |  58 ++++++++++-----------
>  mm/memcontrol.c          |  20 ++++----
>  security/device_cgroup.c |  11 ++--
>  9 files changed, 187 insertions(+), 187 deletions(-)
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index f46f3c6..4b40640 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -614,7 +614,7 @@ u64 blkg_stat_recursive_sum(struct blkg_policy_data *pd, int off)
>  {
>  	struct blkcg_policy *pol = blkcg_policy[pd->plid];
>  	struct blkcg_gq *pos_blkg;
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  	u64 sum;
>  
>  	lockdep_assert_held(pd->blkg->q->queue_lock);
> @@ -622,7 +622,7 @@ u64 blkg_stat_recursive_sum(struct blkg_policy_data *pd, int off)
>  	sum = blkg_stat_read((void *)pd + off);
>  
>  	rcu_read_lock();
> -	blkg_for_each_descendant_pre(pos_blkg, pos_cgrp, pd_to_blkg(pd)) {
> +	blkg_for_each_descendant_pre(pos_blkg, pos_css, pd_to_blkg(pd)) {
>  		struct blkg_policy_data *pos_pd = blkg_to_pd(pos_blkg, pol);
>  		struct blkg_stat *stat = (void *)pos_pd + off;
>  
> @@ -649,7 +649,7 @@ struct blkg_rwstat blkg_rwstat_recursive_sum(struct blkg_policy_data *pd,
>  {
>  	struct blkcg_policy *pol = blkcg_policy[pd->plid];
>  	struct blkcg_gq *pos_blkg;
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  	struct blkg_rwstat sum;
>  	int i;
>  
> @@ -658,7 +658,7 @@ struct blkg_rwstat blkg_rwstat_recursive_sum(struct blkg_policy_data *pd,
>  	sum = blkg_rwstat_read((void *)pd + off);
>  
>  	rcu_read_lock();
> -	blkg_for_each_descendant_pre(pos_blkg, pos_cgrp, pd_to_blkg(pd)) {
> +	blkg_for_each_descendant_pre(pos_blkg, pos_css, pd_to_blkg(pd)) {
>  		struct blkg_policy_data *pos_pd = blkg_to_pd(pos_blkg, pol);
>  		struct blkg_rwstat *rwstat = (void *)pos_pd + off;
>  		struct blkg_rwstat tmp;
> diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
> index b6802c4..8555386 100644
> --- a/block/blk-cgroup.h
> +++ b/block/blk-cgroup.h
> @@ -184,11 +184,6 @@ static inline struct blkcg *css_to_blkcg(struct cgroup_subsys_state *css)
>  	return css ? container_of(css, struct blkcg, css) : NULL;
>  }
>  
> -static inline struct blkcg *cgroup_to_blkcg(struct cgroup *cgroup)
> -{
> -	return css_to_blkcg(cgroup_css(cgroup, blkio_subsys_id));
> -}
> -
>  static inline struct blkcg *task_blkcg(struct task_struct *tsk)
>  {
>  	return css_to_blkcg(task_css(tsk, blkio_subsys_id));
> @@ -289,32 +284,31 @@ struct blkcg_gq *__blkg_lookup(struct blkcg *blkcg, struct request_queue *q,
>  /**
>   * blkg_for_each_descendant_pre - pre-order walk of a blkg's descendants
>   * @d_blkg: loop cursor pointing to the current descendant
> - * @pos_cgrp: used for iteration
> + * @pos_css: used for iteration
>   * @p_blkg: target blkg to walk descendants of
>   *
>   * Walk @c_blkg through the descendants of @p_blkg.  Must be used with RCU
>   * read locked.  If called under either blkcg or queue lock, the iteration
>   * is guaranteed to include all and only online blkgs.  The caller may
> - * update @pos_cgrp by calling cgroup_rightmost_descendant() to skip
> - * subtree.
> + * update @pos_css by calling css_rightmost_descendant() to skip subtree.
>   */
> -#define blkg_for_each_descendant_pre(d_blkg, pos_cgrp, p_blkg)		\
> -	cgroup_for_each_descendant_pre((pos_cgrp), (p_blkg)->blkcg->css.cgroup) \
> -		if (((d_blkg) = __blkg_lookup(cgroup_to_blkcg(pos_cgrp), \
> +#define blkg_for_each_descendant_pre(d_blkg, pos_css, p_blkg)		\
> +	css_for_each_descendant_pre((pos_css), &(p_blkg)->blkcg->css)	\
> +		if (((d_blkg) = __blkg_lookup(css_to_blkcg(pos_css),	\
>  					      (p_blkg)->q, false)))
>  
>  /**
>   * blkg_for_each_descendant_post - post-order walk of a blkg's descendants
>   * @d_blkg: loop cursor pointing to the current descendant
> - * @pos_cgrp: used for iteration
> + * @pos_css: used for iteration
>   * @p_blkg: target blkg to walk descendants of
>   *
>   * Similar to blkg_for_each_descendant_pre() but performs post-order
>   * traversal instead.  Synchronization rules are the same.
>   */
> -#define blkg_for_each_descendant_post(d_blkg, pos_cgrp, p_blkg)		\
> -	cgroup_for_each_descendant_post((pos_cgrp), (p_blkg)->blkcg->css.cgroup) \
> -		if (((d_blkg) = __blkg_lookup(cgroup_to_blkcg(pos_cgrp), \
> +#define blkg_for_each_descendant_post(d_blkg, pos_css, p_blkg)		\
> +	css_for_each_descendant_post((pos_css), &(p_blkg)->blkcg->css)	\
> +		if (((d_blkg) = __blkg_lookup(css_to_blkcg(pos_css),	\
>  					      (p_blkg)->q, false)))
>  
>  /**
> @@ -577,7 +571,6 @@ static inline int blkcg_activate_policy(struct request_queue *q,
>  static inline void blkcg_deactivate_policy(struct request_queue *q,
>  					   const struct blkcg_policy *pol) { }
>  
> -static inline struct blkcg *cgroup_to_blkcg(struct cgroup *cgroup) { return NULL; }
>  static inline struct blkcg *bio_blkcg(struct bio *bio) { return NULL; }
>  
>  static inline struct blkg_policy_data *blkg_to_pd(struct blkcg_gq *blkg,
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 88bcfb6..8cefa7f 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -1349,7 +1349,7 @@ static int tg_set_conf(struct cgroup_subsys_state *css, struct cftype *cft,
>  	struct throtl_grp *tg;
>  	struct throtl_service_queue *sq;
>  	struct blkcg_gq *blkg;
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  	int ret;
>  
>  	ret = blkg_conf_prep(blkcg, &blkcg_policy_throtl, buf, &ctx);
> @@ -1380,7 +1380,7 @@ static int tg_set_conf(struct cgroup_subsys_state *css, struct cftype *cft,
>  	 * blk-throttle.
>  	 */
>  	tg_update_has_rules(tg);
> -	blkg_for_each_descendant_pre(blkg, pos_cgrp, ctx.blkg)
> +	blkg_for_each_descendant_pre(blkg, pos_css, ctx.blkg)
>  		tg_update_has_rules(blkg_to_tg(blkg));
>  
>  	/*
> @@ -1623,7 +1623,7 @@ void blk_throtl_drain(struct request_queue *q)
>  {
>  	struct throtl_data *td = q->td;
>  	struct blkcg_gq *blkg;
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  	struct bio *bio;
>  	int rw;
>  
> @@ -1636,7 +1636,7 @@ void blk_throtl_drain(struct request_queue *q)
>  	 * better to walk service_queue tree directly but blkg walk is
>  	 * easier.
>  	 */
> -	blkg_for_each_descendant_post(blkg, pos_cgrp, td->queue->root_blkg)
> +	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg)
>  		tg_drain_bios(&blkg_to_tg(blkg)->service_queue);
>  
>  	tg_drain_bios(&td_root_tg(td)->service_queue);
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index df6ab19..7fba0d0 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -780,68 +780,72 @@ static inline struct cgroup *cgroup_from_id(struct cgroup_subsys *ss, int id)
>  	return idr_find(&ss->root->cgroup_idr, id);
>  }
>  
> -struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp);
> +struct cgroup_subsys_state *css_next_child(struct cgroup_subsys_state *pos,
> +					   struct cgroup_subsys_state *parent);
>  
>  /**
> - * cgroup_for_each_child - iterate through children of a cgroup
> - * @pos: the cgroup * to use as the loop cursor
> - * @cgrp: cgroup whose children to walk
> + * css_for_each_child - iterate through children of a css
> + * @pos: the css * to use as the loop cursor
> + * @parent: css whose children to walk
>   *
> - * Walk @cgrp's children.  Must be called under rcu_read_lock().  A child
> - * cgroup which hasn't finished ->css_online() or already has finished
> + * Walk @parent's children.  Must be called under rcu_read_lock().  A child
> + * css which hasn't finished ->css_online() or already has finished
>   * ->css_offline() may show up during traversal and it's each subsystem's
>   * responsibility to verify that each @pos is alive.
>   *
>   * If a subsystem synchronizes against the parent in its ->css_online() and
> - * before starting iterating, a cgroup which finished ->css_online() is
> + * before starting iterating, a css which finished ->css_online() is
>   * guaranteed to be visible in the future iterations.
>   *
>   * It is allowed to temporarily drop RCU read lock during iteration.  The
>   * caller is responsible for ensuring that @pos remains accessible until
>   * the start of the next iteration by, for example, bumping the css refcnt.
>   */
> -#define cgroup_for_each_child(pos, cgrp)				\
> -	for ((pos) = cgroup_next_child(NULL, (cgrp)); (pos);		\
> -	     (pos) = cgroup_next_child((pos), (cgrp)))
> +#define css_for_each_child(pos, parent)					\
> +	for ((pos) = css_next_child(NULL, (parent)); (pos);		\
> +	     (pos) = css_next_child((pos), (parent)))
>  
> -struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos,
> -					  struct cgroup *cgroup);
> -struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos);
> +struct cgroup_subsys_state *
> +css_next_descendant_pre(struct cgroup_subsys_state *pos,
> +			struct cgroup_subsys_state *css);
> +
> +struct cgroup_subsys_state *
> +css_rightmost_descendant(struct cgroup_subsys_state *pos);
>  
>  /**
> - * cgroup_for_each_descendant_pre - pre-order walk of a cgroup's descendants
> - * @pos: the cgroup * to use as the loop cursor
> - * @cgroup: cgroup whose descendants to walk
> + * css_for_each_descendant_pre - pre-order walk of a css's descendants
> + * @pos: the css * to use as the loop cursor
> + * @root: css whose descendants to walk
>   *
> - * Walk @cgroup's descendants.  Must be called under rcu_read_lock().  A
> - * descendant cgroup which hasn't finished ->css_online() or already has
> + * Walk @root's descendants.  Must be called under rcu_read_lock().  A
> + * descendant css which hasn't finished ->css_online() or already has
>   * finished ->css_offline() may show up during traversal and it's each
>   * subsystem's responsibility to verify that each @pos is alive.
>   *
>   * If a subsystem synchronizes against the parent in its ->css_online() and
>   * before starting iterating, and synchronizes against @pos on each
> - * iteration, any descendant cgroup which finished ->css_online() is
> + * iteration, any descendant css which finished ->css_online() is
>   * guaranteed to be visible in the future iterations.
>   *
>   * In other words, the following guarantees that a descendant can't escape
>   * state updates of its ancestors.
>   *
> - * my_online(@cgrp)
> + * my_online(@css)
>   * {
> - *	Lock @cgrp->parent and @cgrp;
> - *	Inherit state from @cgrp->parent;
> + *	Lock @css's parent and @css;
> + *	Inherit state from the parent;
>   *	Unlock both.
>   * }
>   *
> - * my_update_state(@cgrp)
> + * my_update_state(@css)
>   * {
> - *	Lock @cgrp;
> - *	Update @cgrp's state;
> - *	Unlock @cgrp;
> + *	Lock @css;
> + *	Update @css's state;
> + *	Unlock @css;
>   *
> - *	cgroup_for_each_descendant_pre(@pos, @cgrp) {
> + *	css_for_each_descendant_pre(@pos, @css) {
>   *		Lock @pos;
> - *		Verify @pos is alive and inherit state from @pos->parent;
> + *		Verify @pos is alive and inherit state from @pos's parent;
>   *		Unlock @pos;
>   *	}
>   * }
> @@ -852,8 +856,7 @@ struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos);
>   * visible by walking order and, as long as inheriting operations to the
>   * same @pos are atomic to each other, multiple updates racing each other
>   * still result in the correct state.  It's guaranateed that at least one
> - * inheritance happens for any cgroup after the latest update to its
> - * parent.
> + * inheritance happens for any css after the latest update to its parent.
>   *
>   * If checking parent's state requires locking the parent, each inheriting
>   * iteration should lock and unlock both @pos->parent and @pos.
> @@ -866,25 +869,26 @@ struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos);
>   * caller is responsible for ensuring that @pos remains accessible until
>   * the start of the next iteration by, for example, bumping the css refcnt.
>   */
> -#define cgroup_for_each_descendant_pre(pos, cgroup)			\
> -	for (pos = cgroup_next_descendant_pre(NULL, (cgroup)); (pos);	\
> -	     pos = cgroup_next_descendant_pre((pos), (cgroup)))
> +#define css_for_each_descendant_pre(pos, css)				\
> +	for ((pos) = css_next_descendant_pre(NULL, (css)); (pos);	\
> +	     (pos) = css_next_descendant_pre((pos), (css)))
>  
> -struct cgroup *cgroup_next_descendant_post(struct cgroup *pos,
> -					   struct cgroup *cgroup);
> +struct cgroup_subsys_state *
> +css_next_descendant_post(struct cgroup_subsys_state *pos,
> +			 struct cgroup_subsys_state *css);
>  
>  /**
> - * cgroup_for_each_descendant_post - post-order walk of a cgroup's descendants
> - * @pos: the cgroup * to use as the loop cursor
> - * @cgroup: cgroup whose descendants to walk
> + * css_for_each_descendant_post - post-order walk of a css's descendants
> + * @pos: the css * to use as the loop cursor
> + * @css: css whose descendants to walk
>   *
> - * Similar to cgroup_for_each_descendant_pre() but performs post-order
> + * Similar to css_for_each_descendant_pre() but performs post-order
>   * traversal instead.  Note that the walk visibility guarantee described in
>   * pre-order walk doesn't apply the same to post-order walks.
>   */
> -#define cgroup_for_each_descendant_post(pos, cgroup)			\
> -	for (pos = cgroup_next_descendant_post(NULL, (cgroup)); (pos);	\
> -	     pos = cgroup_next_descendant_post((pos), (cgroup)))
> +#define css_for_each_descendant_post(pos, css)				\
> +	for ((pos) = css_next_descendant_post(NULL, (css)); (pos);	\
> +	     (pos) = css_next_descendant_post((pos), (css)))
>  
>  /* A cgroup_iter should be treated as an opaque object */
>  struct cgroup_iter {
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 7b53b58..850ad87 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2807,8 +2807,8 @@ static void cgroup_cfts_prepare(void)
>  	/*
>  	 * Thanks to the entanglement with vfs inode locking, we can't walk
>  	 * the existing cgroups under cgroup_mutex and create files.
> -	 * Instead, we use cgroup_for_each_descendant_pre() and drop RCU
> -	 * read lock before calling cgroup_addrm_files().
> +	 * Instead, we use css_for_each_descendant_pre() and drop RCU read
> +	 * lock before calling cgroup_addrm_files().
>  	 */
>  	mutex_lock(&cgroup_mutex);
>  }
> @@ -2818,10 +2818,11 @@ static int cgroup_cfts_commit(struct cftype *cfts, bool is_add)
>  {
>  	LIST_HEAD(pending);
>  	struct cgroup_subsys *ss = cfts[0].ss;
> -	struct cgroup *cgrp, *root = &ss->root->top_cgroup;
> +	struct cgroup *root = &ss->root->top_cgroup;
>  	struct super_block *sb = ss->root->sb;
>  	struct dentry *prev = NULL;
>  	struct inode *inode;
> +	struct cgroup_subsys_state *css;
>  	u64 update_before;
>  	int ret = 0;
>  
> @@ -2854,7 +2855,9 @@ static int cgroup_cfts_commit(struct cftype *cfts, bool is_add)
>  
>  	/* add/rm files for all cgroups created before */
>  	rcu_read_lock();
> -	cgroup_for_each_descendant_pre(cgrp, root) {
> +	css_for_each_descendant_pre(css, cgroup_css(root, ss->subsys_id)) {
> +		struct cgroup *cgrp = css->cgroup;
> +
>  		if (cgroup_is_dead(cgrp))
>  			continue;
>  
> @@ -3030,17 +3033,21 @@ static void cgroup_enable_task_cg_lists(void)
>  }
>  
>  /**
> - * cgroup_next_child - find the next child of a given cgroup
> - * @pos: the current position (%NULL to initiate traversal)
> - * @cgrp: cgroup whose descendants to walk
> + * css_next_child - find the next child of a given css
> + * @pos_css: the current position (%NULL to initiate traversal)
> + * @parent_css: css whose children to walk
>   *
> - * This function returns the next child of @cgrp and should be called under
> - * RCU read lock.  The only requirement is that @cgrp and @pos are
> - * accessible.  The next sibling is guaranteed to be returned regardless of
> - * their states.
> + * This function returns the next child of @parent_css and should be called
> + * under RCU read lock.  The only requirement is that @parent_css and
> + * @pos_css are accessible.  The next sibling is guaranteed to be returned
> + * regardless of their states.
>   */
> -struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp)
> +struct cgroup_subsys_state *
> +css_next_child(struct cgroup_subsys_state *pos_css,
> +	       struct cgroup_subsys_state *parent_css)
>  {
> +	struct cgroup *pos = pos_css ? pos_css->cgroup : NULL;
> +	struct cgroup *cgrp = parent_css->cgroup;
>  	struct cgroup *next;
>  
>  	WARN_ON_ONCE(!rcu_read_lock_held());
> @@ -3074,59 +3081,64 @@ struct cgroup *cgroup_next_child(struct cgroup *pos, struct cgroup *cgrp)
>  				break;
>  	}
>  
> -	if (&next->sibling != &cgrp->children)
> -		return next;
> -	return NULL;
> +	if (&next->sibling == &cgrp->children)
> +		return NULL;
> +
> +	if (parent_css->ss)
> +		return cgroup_css(next, parent_css->ss->subsys_id);
> +	else
> +		return &next->dummy_css;
>  }
> -EXPORT_SYMBOL_GPL(cgroup_next_child);
> +EXPORT_SYMBOL_GPL(css_next_child);
>  
>  /**
> - * cgroup_next_descendant_pre - find the next descendant for pre-order walk
> + * css_next_descendant_pre - find the next descendant for pre-order walk
>   * @pos: the current position (%NULL to initiate traversal)
> - * @cgroup: cgroup whose descendants to walk
> + * @root: css whose descendants to walk
>   *
> - * To be used by cgroup_for_each_descendant_pre().  Find the next
> - * descendant to visit for pre-order traversal of @cgroup's descendants.
> + * To be used by css_for_each_descendant_pre().  Find the next descendant
> + * to visit for pre-order traversal of @root's descendants.
>   *
>   * While this function requires RCU read locking, it doesn't require the
>   * whole traversal to be contained in a single RCU critical section.  This
>   * function will return the correct next descendant as long as both @pos
> - * and @cgroup are accessible and @pos is a descendant of @cgroup.
> + * and @root are accessible and @pos is a descendant of @root.
>   */
> -struct cgroup *cgroup_next_descendant_pre(struct cgroup *pos,
> -					  struct cgroup *cgroup)
> +struct cgroup_subsys_state *
> +css_next_descendant_pre(struct cgroup_subsys_state *pos,
> +			struct cgroup_subsys_state *root)
>  {
> -	struct cgroup *next;
> +	struct cgroup_subsys_state *next;
>  
>  	WARN_ON_ONCE(!rcu_read_lock_held());
>  
> -	/* if first iteration, pretend we just visited @cgroup */
> +	/* if first iteration, pretend we just visited @root */
>  	if (!pos)
> -		pos = cgroup;
> +		pos = root;
>  
>  	/* visit the first child if exists */
> -	next = cgroup_next_child(NULL, pos);
> +	next = css_next_child(NULL, pos);
>  	if (next)
>  		return next;
>  
>  	/* no child, visit my or the closest ancestor's next sibling */
> -	while (pos != cgroup) {
> -		next = cgroup_next_child(pos, pos->parent);
> +	while (pos != root) {
> +		next = css_next_child(pos, css_parent(pos));
>  		if (next)
>  			return next;
> -		pos = pos->parent;
> +		pos = css_parent(pos);
>  	}
>  
>  	return NULL;
>  }
> -EXPORT_SYMBOL_GPL(cgroup_next_descendant_pre);
> +EXPORT_SYMBOL_GPL(css_next_descendant_pre);
>  
>  /**
> - * cgroup_rightmost_descendant - return the rightmost descendant of a cgroup
> - * @pos: cgroup of interest
> + * css_rightmost_descendant - return the rightmost descendant of a css
> + * @pos: css of interest
>   *
> - * Return the rightmost descendant of @pos.  If there's no descendant,
> - * @pos is returned.  This can be used during pre-order traversal to skip
> + * Return the rightmost descendant of @pos.  If there's no descendant, @pos
> + * is returned.  This can be used during pre-order traversal to skip
>   * subtree of @pos.
>   *
>   * While this function requires RCU read locking, it doesn't require the
> @@ -3134,9 +3146,10 @@ EXPORT_SYMBOL_GPL(cgroup_next_descendant_pre);
>   * function will return the correct rightmost descendant as long as @pos is
>   * accessible.
>   */
> -struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos)
> +struct cgroup_subsys_state *
> +css_rightmost_descendant(struct cgroup_subsys_state *pos)
>  {
> -	struct cgroup *last, *tmp;
> +	struct cgroup_subsys_state *last, *tmp;
>  
>  	WARN_ON_ONCE(!rcu_read_lock_held());
>  
> @@ -3144,62 +3157,64 @@ struct cgroup *cgroup_rightmost_descendant(struct cgroup *pos)
>  		last = pos;
>  		/* ->prev isn't RCU safe, walk ->next till the end */
>  		pos = NULL;
> -		cgroup_for_each_child(tmp, last)
> +		css_for_each_child(tmp, last)
>  			pos = tmp;
>  	} while (pos);
>  
>  	return last;
>  }
> -EXPORT_SYMBOL_GPL(cgroup_rightmost_descendant);
> +EXPORT_SYMBOL_GPL(css_rightmost_descendant);
>  
> -static struct cgroup *cgroup_leftmost_descendant(struct cgroup *pos)
> +static struct cgroup_subsys_state *
> +css_leftmost_descendant(struct cgroup_subsys_state *pos)
>  {
> -	struct cgroup *last;
> +	struct cgroup_subsys_state *last;
>  
>  	do {
>  		last = pos;
> -		pos = cgroup_next_child(NULL, pos);
> +		pos = css_next_child(NULL, pos);
>  	} while (pos);
>  
>  	return last;
>  }
>  
>  /**
> - * cgroup_next_descendant_post - find the next descendant for post-order walk
> + * css_next_descendant_post - find the next descendant for post-order walk
>   * @pos: the current position (%NULL to initiate traversal)
> - * @cgroup: cgroup whose descendants to walk
> + * @root: css whose descendants to walk
>   *
> - * To be used by cgroup_for_each_descendant_post().  Find the next
> - * descendant to visit for post-order traversal of @cgroup's descendants.
> + * To be used by css_for_each_descendant_post().  Find the next descendant
> + * to visit for post-order traversal of @root's descendants.
>   *
>   * While this function requires RCU read locking, it doesn't require the
>   * whole traversal to be contained in a single RCU critical section.  This
>   * function will return the correct next descendant as long as both @pos
>   * and @cgroup are accessible and @pos is a descendant of @cgroup.
>   */
> -struct cgroup *cgroup_next_descendant_post(struct cgroup *pos,
> -					   struct cgroup *cgroup)
> +struct cgroup_subsys_state *
> +css_next_descendant_post(struct cgroup_subsys_state *pos,
> +			 struct cgroup_subsys_state *root)
>  {
> -	struct cgroup *next;
> +	struct cgroup_subsys_state *next;
>  
>  	WARN_ON_ONCE(!rcu_read_lock_held());
>  
>  	/* if first iteration, visit the leftmost descendant */
>  	if (!pos) {
> -		next = cgroup_leftmost_descendant(cgroup);
> -		return next != cgroup ? next : NULL;
> +		next = css_leftmost_descendant(root);
> +		return next != root ? next : NULL;
>  	}
>  
>  	/* if there's an unvisited sibling, visit its leftmost descendant */
> -	next = cgroup_next_child(pos, pos->parent);
> +	next = css_next_child(pos, css_parent(pos));
>  	if (next)
> -		return cgroup_leftmost_descendant(next);
> +		return css_leftmost_descendant(next);
>  
>  	/* no sibling left, visit parent */
> -	next = pos->parent;
> -	return next != cgroup ? next : NULL;
> +	next = css_parent(pos);
> +	return next != root ? next : NULL;
>  }
> -EXPORT_SYMBOL_GPL(cgroup_next_descendant_post);
> +EXPORT_SYMBOL_GPL(css_next_descendant_post);
>  
>  void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it)
>  	__acquires(css_set_lock)
> @@ -4540,9 +4555,9 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
>  	/*
>  	 * Mark @cgrp dead.  This prevents further task migration and child
>  	 * creation by disabling cgroup_lock_live_group().  Note that
> -	 * CGRP_DEAD assertion is depended upon by cgroup_next_child() to
> +	 * CGRP_DEAD assertion is depended upon by css_next_child() to
>  	 * resume iteration after dropping RCU read lock.  See
> -	 * cgroup_next_child() for details.
> +	 * css_next_child() for details.
>  	 */
>  	set_bit(CGRP_DEAD, &cgrp->flags);
>  
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index 19613ba..98ca48d 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -50,11 +50,6 @@ static inline struct freezer *css_freezer(struct cgroup_subsys_state *css)
>  	return css ? container_of(css, struct freezer, css) : NULL;
>  }
>  
> -static inline struct freezer *cgroup_freezer(struct cgroup *cgroup)
> -{
> -	return css_freezer(cgroup_css(cgroup, freezer_subsys_id));
> -}
> -
>  static inline struct freezer *task_freezer(struct task_struct *task)
>  {
>  	return css_freezer(task_css(task, freezer_subsys_id));
> @@ -120,7 +115,7 @@ static int freezer_css_online(struct cgroup_subsys_state *css)
>  	/*
>  	 * The following double locking and freezing state inheritance
>  	 * guarantee that @cgroup can never escape ancestors' freezing
> -	 * states.  See cgroup_for_each_descendant_pre() for details.
> +	 * states.  See css_for_each_descendant_pre() for details.
>  	 */
>  	if (parent)
>  		spin_lock_irq(&parent->lock);
> @@ -262,7 +257,7 @@ out:
>  static void update_if_frozen(struct cgroup_subsys_state *css)
>  {
>  	struct freezer *freezer = css_freezer(css);
> -	struct cgroup *pos;
> +	struct cgroup_subsys_state *pos;
>  	struct cgroup_iter it;
>  	struct task_struct *task;
>  
> @@ -275,8 +270,8 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  		goto out_unlock;
>  
>  	/* are all (live) children frozen? */
> -	cgroup_for_each_child(pos, css->cgroup) {
> -		struct freezer *child = cgroup_freezer(pos);
> +	css_for_each_child(pos, css) {
> +		struct freezer *child = css_freezer(pos);
>  
>  		if ((child->state & CGROUP_FREEZER_ONLINE) &&
>  		    !(child->state & CGROUP_FROZEN))
> @@ -309,13 +304,13 @@ out_unlock:
>  static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
>  			struct seq_file *m)
>  {
> -	struct cgroup *pos;
> +	struct cgroup_subsys_state *pos;
>  
>  	rcu_read_lock();
>  
>  	/* update states bottom-up */
> -	cgroup_for_each_descendant_post(pos, css->cgroup)
> -		update_if_frozen(cgroup_css(pos, freezer_subsys_id));
> +	css_for_each_descendant_post(pos, css)
> +		update_if_frozen(pos);
>  	update_if_frozen(css);
>  
>  	rcu_read_unlock();
> @@ -396,7 +391,7 @@ static void freezer_apply_state(struct freezer *freezer, bool freeze,
>   */
>  static void freezer_change_state(struct freezer *freezer, bool freeze)
>  {
> -	struct cgroup *pos;
> +	struct cgroup_subsys_state *pos;
>  
>  	/* update @freezer */
>  	spin_lock_irq(&freezer->lock);
> @@ -409,8 +404,8 @@ static void freezer_change_state(struct freezer *freezer, bool freeze)
>  	 * CGROUP_FREEZING_PARENT.
>  	 */
>  	rcu_read_lock();
> -	cgroup_for_each_descendant_pre(pos, freezer->css.cgroup) {
> -		struct freezer *pos_f = cgroup_freezer(pos);
> +	css_for_each_descendant_pre(pos, &freezer->css) {
> +		struct freezer *pos_f = css_freezer(pos);
>  		struct freezer *parent = parent_freezer(pos_f);
>  
>  		/*
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 89b76e1..be4f503 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -210,29 +210,29 @@ static struct cpuset top_cpuset = {
>  /**
>   * cpuset_for_each_child - traverse online children of a cpuset
>   * @child_cs: loop cursor pointing to the current child
> - * @pos_cgrp: used for iteration
> + * @pos_css: used for iteration
>   * @parent_cs: target cpuset to walk children of
>   *
>   * Walk @child_cs through the online children of @parent_cs.  Must be used
>   * with RCU read locked.
>   */
> -#define cpuset_for_each_child(child_cs, pos_cgrp, parent_cs)		\
> -	cgroup_for_each_child((pos_cgrp), (parent_cs)->css.cgroup)	\
> -		if (is_cpuset_online(((child_cs) = cgroup_cs((pos_cgrp)))))
> +#define cpuset_for_each_child(child_cs, pos_css, parent_cs)		\
> +	css_for_each_child((pos_css), &(parent_cs)->css)		\
> +		if (is_cpuset_online(((child_cs) = css_cs((pos_css)))))
>  
>  /**
>   * cpuset_for_each_descendant_pre - pre-order walk of a cpuset's descendants
>   * @des_cs: loop cursor pointing to the current descendant
> - * @pos_cgrp: used for iteration
> + * @pos_css: used for iteration
>   * @root_cs: target cpuset to walk ancestor of
>   *
>   * Walk @des_cs through the online descendants of @root_cs.  Must be used
> - * with RCU read locked.  The caller may modify @pos_cgrp by calling
> - * cgroup_rightmost_descendant() to skip subtree.
> + * with RCU read locked.  The caller may modify @pos_css by calling
> + * css_rightmost_descendant() to skip subtree.
>   */
> -#define cpuset_for_each_descendant_pre(des_cs, pos_cgrp, root_cs)	\
> -	cgroup_for_each_descendant_pre((pos_cgrp), (root_cs)->css.cgroup) \
> -		if (is_cpuset_online(((des_cs) = cgroup_cs((pos_cgrp)))))
> +#define cpuset_for_each_descendant_pre(des_cs, pos_css, root_cs)	\
> +	css_for_each_descendant_pre((pos_css), &(root_cs)->css)		\
> +		if (is_cpuset_online(((des_cs) = css_cs((pos_css)))))
>  
>  /*
>   * There are two global mutexes guarding cpuset structures - cpuset_mutex
> @@ -430,7 +430,7 @@ static void free_trial_cpuset(struct cpuset *trial)
>  
>  static int validate_change(struct cpuset *cur, struct cpuset *trial)
>  {
> -	struct cgroup *cgrp;
> +	struct cgroup_subsys_state *css;
>  	struct cpuset *c, *par;
>  	int ret;
>  
> @@ -438,7 +438,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
>  
>  	/* Each of our child cpusets must be a subset of us */
>  	ret = -EBUSY;
> -	cpuset_for_each_child(c, cgrp, cur)
> +	cpuset_for_each_child(c, css, cur)
>  		if (!is_cpuset_subset(c, trial))
>  			goto out;
>  
> @@ -459,7 +459,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
>  	 * overlap
>  	 */
>  	ret = -EINVAL;
> -	cpuset_for_each_child(c, cgrp, par) {
> +	cpuset_for_each_child(c, css, par) {
>  		if ((is_cpu_exclusive(trial) || is_cpu_exclusive(c)) &&
>  		    c != cur &&
>  		    cpumask_intersects(trial->cpus_allowed, c->cpus_allowed))
> @@ -508,13 +508,13 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
>  				    struct cpuset *root_cs)
>  {
>  	struct cpuset *cp;
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  
>  	rcu_read_lock();
> -	cpuset_for_each_descendant_pre(cp, pos_cgrp, root_cs) {
> +	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
>  		/* skip the whole subtree if @cp doesn't have any CPU */
>  		if (cpumask_empty(cp->cpus_allowed)) {
> -			pos_cgrp = cgroup_rightmost_descendant(pos_cgrp);
> +			pos_css = css_rightmost_descendant(pos_css);
>  			continue;
>  		}
>  
> @@ -589,7 +589,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
>  	struct sched_domain_attr *dattr;  /* attributes for custom domains */
>  	int ndoms = 0;		/* number of sched domains in result */
>  	int nslot;		/* next empty doms[] struct cpumask slot */
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  
>  	doms = NULL;
>  	dattr = NULL;
> @@ -618,7 +618,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
>  	csn = 0;
>  
>  	rcu_read_lock();
> -	cpuset_for_each_descendant_pre(cp, pos_cgrp, &top_cpuset) {
> +	cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) {
>  		/*
>  		 * Continue traversing beyond @cp iff @cp has some CPUs and
>  		 * isn't load balancing.  The former is obvious.  The
> @@ -635,7 +635,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
>  			csa[csn++] = cp;
>  
>  		/* skip @cp's subtree */
> -		pos_cgrp = cgroup_rightmost_descendant(pos_cgrp);
> +		pos_css = css_rightmost_descendant(pos_css);
>  	}
>  	rcu_read_unlock();
>  
> @@ -886,16 +886,16 @@ static void update_tasks_cpumask_hier(struct cpuset *root_cs,
>  				      bool update_root, struct ptr_heap *heap)
>  {
>  	struct cpuset *cp;
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  
>  	if (update_root)
>  		update_tasks_cpumask(root_cs, heap);
>  
>  	rcu_read_lock();
> -	cpuset_for_each_descendant_pre(cp, pos_cgrp, root_cs) {
> +	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
>  		/* skip the whole subtree if @cp have some CPU */
>  		if (!cpumask_empty(cp->cpus_allowed)) {
> -			pos_cgrp = cgroup_rightmost_descendant(pos_cgrp);
> +			pos_css = css_rightmost_descendant(pos_css);
>  			continue;
>  		}
>  		if (!css_tryget(&cp->css))
> @@ -1143,16 +1143,16 @@ static void update_tasks_nodemask_hier(struct cpuset *root_cs,
>  				       bool update_root, struct ptr_heap *heap)
>  {
>  	struct cpuset *cp;
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  
>  	if (update_root)
>  		update_tasks_nodemask(root_cs, heap);
>  
>  	rcu_read_lock();
> -	cpuset_for_each_descendant_pre(cp, pos_cgrp, root_cs) {
> +	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
>  		/* skip the whole subtree if @cp have some CPU */
>  		if (!nodes_empty(cp->mems_allowed)) {
> -			pos_cgrp = cgroup_rightmost_descendant(pos_cgrp);
> +			pos_css = css_rightmost_descendant(pos_css);
>  			continue;
>  		}
>  		if (!css_tryget(&cp->css))
> @@ -1973,7 +1973,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
>  	struct cpuset *cs = css_cs(css);
>  	struct cpuset *parent = parent_cs(cs);
>  	struct cpuset *tmp_cs;
> -	struct cgroup *pos_cgrp;
> +	struct cgroup_subsys_state *pos_css;
>  
>  	if (!parent)
>  		return 0;
> @@ -2005,7 +2005,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
>  	 * (and likewise for mems) to the new cgroup.
>  	 */
>  	rcu_read_lock();
> -	cpuset_for_each_child(tmp_cs, pos_cgrp, parent) {
> +	cpuset_for_each_child(tmp_cs, pos_css, parent) {
>  		if (is_mem_exclusive(tmp_cs) || is_cpu_exclusive(tmp_cs)) {
>  			rcu_read_unlock();
>  			goto out_unlock;
> @@ -2252,10 +2252,10 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
>  	/* if cpus or mems changed, we need to propagate to descendants */
>  	if (cpus_updated || mems_updated) {
>  		struct cpuset *cs;
> -		struct cgroup *pos_cgrp;
> +		struct cgroup_subsys_state *pos_css;
>  
>  		rcu_read_lock();
> -		cpuset_for_each_descendant_pre(cs, pos_cgrp, &top_cpuset) {
> +		cpuset_for_each_descendant_pre(cs, pos_css, &top_cpuset) {
>  			if (!css_tryget(&cs->css))
>  				continue;
>  			rcu_read_unlock();
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ab64dfc..2285319 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1082,7 +1082,7 @@ struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
>  static struct mem_cgroup *__mem_cgroup_iter_next(struct mem_cgroup *root,
>  		struct mem_cgroup *last_visited)
>  {
> -	struct cgroup *prev_cgroup, *next_cgroup;
> +	struct cgroup_subsys_state *prev_css, *next_css;
>  
>  	/*
>  	 * Root is not visited by cgroup iterators so it needs an
> @@ -1091,11 +1091,9 @@ static struct mem_cgroup *__mem_cgroup_iter_next(struct mem_cgroup *root,
>  	if (!last_visited)
>  		return root;
>  
> -	prev_cgroup = (last_visited == root) ? NULL
> -		: last_visited->css.cgroup;
> +	prev_css = (last_visited == root) ? NULL : &last_visited->css;
>  skip_node:
> -	next_cgroup = cgroup_next_descendant_pre(
> -			prev_cgroup, root->css.cgroup);
> +	next_css = css_next_descendant_pre(prev_css, &root->css);
>  
>  	/*
>  	 * Even if we found a group we have to make sure it is
> @@ -1104,13 +1102,13 @@ skip_node:
>  	 * last_visited css is safe to use because it is
>  	 * protected by css_get and the tree walk is rcu safe.
>  	 */
> -	if (next_cgroup) {
> -		struct mem_cgroup *mem = mem_cgroup_from_cont(
> -				next_cgroup);
> +	if (next_css) {
> +		struct mem_cgroup *mem = mem_cgroup_from_css(next_css);
> +
>  		if (css_tryget(&mem->css))
>  			return mem;
>  		else {
> -			prev_cgroup = next_cgroup;
> +			prev_css = next_css;
>  			goto skip_node;
>  		}
>  	}
> @@ -4939,10 +4937,10 @@ static void mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
>   */
>  static inline bool __memcg_has_children(struct mem_cgroup *memcg)
>  {
> -	struct cgroup *pos;
> +	struct cgroup_subsys_state *pos;
>  
>  	/* bounce at first found */
> -	cgroup_for_each_child(pos, memcg->css.cgroup)
> +	css_for_each_child(pos, &memcg->css)
>  		return true;
>  	return false;
>  }
> diff --git a/security/device_cgroup.c b/security/device_cgroup.c
> index e0ca464..9bf230a 100644
> --- a/security/device_cgroup.c
> +++ b/security/device_cgroup.c
> @@ -56,11 +56,6 @@ static inline struct dev_cgroup *css_to_devcgroup(struct cgroup_subsys_state *s)
>  	return s ? container_of(s, struct dev_cgroup, css) : NULL;
>  }
>  
> -static inline struct dev_cgroup *cgroup_to_devcgroup(struct cgroup *cgroup)
> -{
> -	return css_to_devcgroup(cgroup_css(cgroup, devices_subsys_id));
> -}
> -
>  static inline struct dev_cgroup *task_devcgroup(struct task_struct *task)
>  {
>  	return css_to_devcgroup(task_css(task, devices_subsys_id));
> @@ -447,13 +442,13 @@ static void revalidate_active_exceptions(struct dev_cgroup *devcg)
>  static int propagate_exception(struct dev_cgroup *devcg_root,
>  			       struct dev_exception_item *ex)
>  {
> -	struct cgroup *root = devcg_root->css.cgroup, *pos;
> +	struct cgroup_subsys_state *pos;
>  	int rc = 0;
>  
>  	rcu_read_lock();
>  
> -	cgroup_for_each_descendant_pre(pos, root) {
> -		struct dev_cgroup *devcg = cgroup_to_devcgroup(pos);
> +	css_for_each_descendant_pre(pos, &devcg_root->css) {
> +		struct dev_cgroup *devcg = css_to_devcgroup(pos);
>  
>  		/*
>  		 * Because devcgroup_mutex is held, no devcg will become
> -- 
> 1.8.3.1
> 
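
For readers following the cpuset hunks above: cpuset_for_each_child() and
cpuset_for_each_descendant_pre() both wrap a css iterator and attach a
trailing "if" that assigns the cursor and filters on is_cpuset_online().
Below is a minimal userspace sketch of that macro shape. All names here
(struct node, node_for_each_child, node_for_each_online_child,
sum_online_child_ids) are hypothetical stand-ins, not kernel API, and the
sibling list is a plain singly linked list rather than the RCU-protected
list the kernel uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a css-embedding structure (not struct cpuset). */
struct node {
	int online;			/* plays the role of is_cpuset_online() */
	int id;
	struct node *first_child;
	struct node *next_sibling;
};

/* Plain child walk, similar in shape to css_for_each_child(). */
#define node_for_each_child(pos, parent)				\
	for ((pos) = (parent)->first_child; (pos);			\
	     (pos) = (pos)->next_sibling)

/*
 * Filtered walk: the trailing "if" assigns the typed cursor and lets the
 * loop body run only for online children, as the cpuset macros do.
 */
#define node_for_each_online_child(child, pos, parent)			\
	node_for_each_child((pos), (parent))				\
		if (((child) = (pos))->online)

/* Sum the ids of the online children of @parent. */
static int sum_online_child_ids(struct node *parent)
{
	struct node *child, *pos;
	int sum = 0;

	node_for_each_online_child(child, pos, parent)
		sum += child->id;

	/* Offline children were skipped, so the sum cannot be negative here. */
	assert(sum >= 0);
	return sum;
}
```

Because the macro ends in a bare "if", a caller that follows the loop body
with an "else" would bind it to the hidden "if"; the sketch keeps that
limitation as is.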

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 17/23] cgroup: rename cgroup_iter to cgroup_task_iter
  2013-08-01 21:49 ` [PATCH 17/23] cgroup: rename cgroup_iter to cgroup_task_iter Tejun Heo
@ 2013-08-02 13:35   ` Michal Hocko
  0 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:35 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Matt Helsley,
	Johannes Weiner, Balbir Singh

On Thu 01-08-13 17:49:55, Tejun Heo wrote:
> cgroup now has multiple iterators and it's quite confusing to have
> something which walks over tasks of a single cgroup named cgroup_iter.
> Let's rename it to cgroup_task_iter.
> 
> While at it, reformat / update comments and replace the overview
> comment above the interface function decls with proper function
> comments.  Such overview can be useful but function comments should be
> more than enough here.
> 
> This is a pure rename and doesn't introduce any functional changes.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>

Makes sense
Acked-by: Michal Hocko <mhocko@suse.cz>

> ---
>  include/linux/cgroup.h  |  31 ++++---------
>  kernel/cgroup.c         | 114 ++++++++++++++++++++++++++++++++----------------
>  kernel/cgroup_freezer.c |  24 +++++-----
>  mm/memcontrol.c         |  10 ++---
>  4 files changed, 102 insertions(+), 77 deletions(-)
> 
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 7fba0d0..4478336 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -890,31 +890,16 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
>  	for ((pos) = css_next_descendant_post(NULL, (css)); (pos);	\
>  	     (pos) = css_next_descendant_post((pos), (css)))
>  
> -/* A cgroup_iter should be treated as an opaque object */
> -struct cgroup_iter {
> -	struct list_head *cset_link;
> -	struct list_head *task;
> +/* A cgroup_task_iter should be treated as an opaque object */
> +struct cgroup_task_iter {
> +	struct list_head		*cset_link;
> +	struct list_head		*task;
>  };
>  
> -/*
> - * To iterate across the tasks in a cgroup:
> - *
> - * 1) call cgroup_iter_start to initialize an iterator
> - *
> - * 2) call cgroup_iter_next() to retrieve member tasks until it
> - *    returns NULL or until you want to end the iteration
> - *
> - * 3) call cgroup_iter_end() to destroy the iterator.
> - *
> - * Or, call cgroup_scan_tasks() to iterate through every task in a
> - * cgroup - cgroup_scan_tasks() holds the css_set_lock when calling
> - * the test_task() callback, but not while calling the process_task()
> - * callback.
> - */
> -void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it);
> -struct task_struct *cgroup_iter_next(struct cgroup *cgrp,
> -					struct cgroup_iter *it);
> -void cgroup_iter_end(struct cgroup *cgrp, struct cgroup_iter *it);
> +void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it);
> +struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
> +					  struct cgroup_task_iter *it);
> +void cgroup_task_iter_end(struct cgroup *cgrp, struct cgroup_task_iter *it);
>  int cgroup_scan_tasks(struct cgroup_scanner *scan);
>  int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
>  int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 1085439..7a4f89b 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -367,9 +367,11 @@ static struct cgrp_cset_link init_cgrp_cset_link;
>  static int cgroup_init_idr(struct cgroup_subsys *ss,
>  			   struct cgroup_subsys_state *css);
>  
> -/* css_set_lock protects the list of css_set objects, and the
> - * chain of tasks off each css_set.  Nests outside task->alloc_lock
> - * due to cgroup_iter_start() */
> +/*
> + * css_set_lock protects the list of css_set objects, and the chain of
> + * tasks off each css_set.  Nests outside task->alloc_lock due to
> + * cgroup_task_iter_start().
> + */
>  static DEFINE_RWLOCK(css_set_lock);
>  static int css_set_count;
>  
> @@ -394,10 +396,12 @@ static unsigned long css_set_hash(struct cgroup_subsys_state *css[])
>  	return key;
>  }
>  
> -/* We don't maintain the lists running through each css_set to its
> - * task until after the first call to cgroup_iter_start(). This
> - * reduces the fork()/exit() overhead for people who have cgroups
> - * compiled into their kernel but not actually in use */
> +/*
> + * We don't maintain the lists running through each css_set to its task
> + * until after the first call to cgroup_task_iter_start().  This reduces
> + * the fork()/exit() overhead for people who have cgroups compiled into
> + * their kernel but not actually in use.
> + */
>  static int use_task_css_set_links __read_mostly;
>  
>  static void __put_css_set(struct css_set *cset, int taskexit)
> @@ -2975,10 +2979,10 @@ int cgroup_task_count(const struct cgroup *cgrp)
>  }
>  
>  /*
> - * To reduce the fork() overhead for systems that are not actually
> - * using their cgroups capability, we don't maintain the lists running
> - * through each css_set to its tasks until we see the list actually
> - * used - in other words after the first call to cgroup_iter_start().
> + * To reduce the fork() overhead for systems that are not actually using
> + * their cgroups capability, we don't maintain the lists running through
> + * each css_set to its tasks until we see the list actually used - in other
> + * words after the first call to cgroup_task_iter_start().
>   */
>  static void cgroup_enable_task_cg_lists(void)
>  {
> @@ -3192,11 +3196,15 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
>  }
>  EXPORT_SYMBOL_GPL(css_next_descendant_post);
>  
> -/*
> - * Advance a list_head iterator.  The iterator should be positioned at
> - * the start of a css_set
> +/**
> + * cgroup_advance_task_iter - advance a task iterator to the next css_set
> + * @cgrp: the cgroup to walk tasks of
> + * @it: the iterator to advance
> + *
> + * Advance @it to the next css_set to walk.
>   */
> -static void cgroup_advance_iter(struct cgroup *cgrp, struct cgroup_iter *it)
> +static void cgroup_advance_task_iter(struct cgroup *cgrp,
> +				     struct cgroup_task_iter *it)
>  {
>  	struct list_head *l = it->cset_link;
>  	struct cgrp_cset_link *link;
> @@ -3216,7 +3224,21 @@ static void cgroup_advance_iter(struct cgroup *cgrp, struct cgroup_iter *it)
>  	it->task = cset->tasks.next;
>  }
>  
> -void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it)
> +/**
> + * cgroup_task_iter_start - initiate task iteration
> + * @cgrp: the cgroup to walk tasks of
> + * @it: the task iterator to use
> + *
> + * Initiate iteration through the tasks of @cgrp.  The caller can call
> + * cgroup_task_iter_next() to walk through the tasks until the function
> + * returns NULL.  On completion of iteration, cgroup_task_iter_end() must
> + * be called.
> + *
> + * Note that this function acquires a lock which is released when the
> + * iteration finishes.  The caller can't sleep while iteration is in
> + * progress.
> + */
> +void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it)
>  	__acquires(css_set_lock)
>  {
>  	/*
> @@ -3229,11 +3251,20 @@ void cgroup_iter_start(struct cgroup *cgrp, struct cgroup_iter *it)
>  
>  	read_lock(&css_set_lock);
>  	it->cset_link = &cgrp->cset_links;
> -	cgroup_advance_iter(cgrp, it);
> +	cgroup_advance_task_iter(cgrp, it);
>  }
>  
> -struct task_struct *cgroup_iter_next(struct cgroup *cgrp,
> -					struct cgroup_iter *it)
> +/**
> + * cgroup_task_iter_next - return the next task for the iterator
> + * @cgrp: the cgroup to walk tasks of
> + * @it: the task iterator being iterated
> + *
> + * The "next" function for task iteration.  @it should have been
> + * initialized via cgroup_task_iter_start().  Returns NULL when the
> + * iteration reaches the end.
> + */
> +struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
> +					  struct cgroup_task_iter *it)
>  {
>  	struct task_struct *res;
>  	struct list_head *l = it->task;
> @@ -3247,16 +3278,25 @@ struct task_struct *cgroup_iter_next(struct cgroup *cgrp,
>  	l = l->next;
>  	link = list_entry(it->cset_link, struct cgrp_cset_link, cset_link);
>  	if (l == &link->cset->tasks) {
> -		/* We reached the end of this task list - move on to
> -		 * the next cg_cgroup_link */
> -		cgroup_advance_iter(cgrp, it);
> +		/*
> +		 * We reached the end of this task list - move on to the
> +		 * next cgrp_cset_link.
> +		 */
> +		cgroup_advance_task_iter(cgrp, it);
>  	} else {
>  		it->task = l;
>  	}
>  	return res;
>  }
>  
> -void cgroup_iter_end(struct cgroup *cgrp, struct cgroup_iter *it)
> +/**
> + * cgroup_task_iter_end - finish task iteration
> + * @cgrp: the cgroup to walk tasks of
> + * @it: the task iterator to finish
> + *
> + * Finish task iteration started by cgroup_task_iter_start().
> + */
> +void cgroup_task_iter_end(struct cgroup *cgrp, struct cgroup_task_iter *it)
>  	__releases(css_set_lock)
>  {
>  	read_unlock(&css_set_lock);
> @@ -3305,7 +3345,7 @@ static inline int started_after(void *p1, void *p2)
>   * Iterate through all the tasks in a cgroup, calling test_task() for each,
>   * and if it returns true, call process_task() for it also.
>   * The test_task pointer may be NULL, meaning always true (select all tasks).
> - * Effectively duplicates cgroup_iter_{start,next,end}()
> + * Effectively duplicates cgroup_task_iter_{start,next,end}()
>   * but does not lock css_set_lock for the call to process_task().
>   * The struct cgroup_scanner may be embedded in any structure of the caller's
>   * creation.
> @@ -3326,7 +3366,7 @@ static inline int started_after(void *p1, void *p2)
>  int cgroup_scan_tasks(struct cgroup_scanner *scan)
>  {
>  	int retval, i;
> -	struct cgroup_iter it;
> +	struct cgroup_task_iter it;
>  	struct task_struct *p, *dropped;
>  	/* Never dereference latest_task, since it's not refcounted */
>  	struct task_struct *latest_task = NULL;
> @@ -3361,8 +3401,8 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
>  	 * guarantees forward progress and that we don't miss any tasks.
>  	 */
>  	heap->size = 0;
> -	cgroup_iter_start(scan->cgrp, &it);
> -	while ((p = cgroup_iter_next(scan->cgrp, &it))) {
> +	cgroup_task_iter_start(scan->cgrp, &it);
> +	while ((p = cgroup_task_iter_next(scan->cgrp, &it))) {
>  		/*
>  		 * Only affect tasks that qualify per the caller's callback,
>  		 * if he provided one
> @@ -3395,7 +3435,7 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
>  		 * the heap and wasn't inserted
>  		 */
>  	}
> -	cgroup_iter_end(scan->cgrp, &it);
> +	cgroup_task_iter_end(scan->cgrp, &it);
>  
>  	if (heap->size) {
>  		for (i = 0; i < heap->size; i++) {
> @@ -3601,7 +3641,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
>  	pid_t *array;
>  	int length;
>  	int pid, n = 0; /* used for populating the array */
> -	struct cgroup_iter it;
> +	struct cgroup_task_iter it;
>  	struct task_struct *tsk;
>  	struct cgroup_pidlist *l;
>  
> @@ -3616,8 +3656,8 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
>  	if (!array)
>  		return -ENOMEM;
>  	/* now, populate the array */
> -	cgroup_iter_start(cgrp, &it);
> -	while ((tsk = cgroup_iter_next(cgrp, &it))) {
> +	cgroup_task_iter_start(cgrp, &it);
> +	while ((tsk = cgroup_task_iter_next(cgrp, &it))) {
>  		if (unlikely(n == length))
>  			break;
>  		/* get tgid or pid for procs or tasks file respectively */
> @@ -3628,7 +3668,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
>  		if (pid > 0) /* make sure to only use valid results */
>  			array[n++] = pid;
>  	}
> -	cgroup_iter_end(cgrp, &it);
> +	cgroup_task_iter_end(cgrp, &it);
>  	length = n;
>  	/* now sort & (if procs) strip out duplicates */
>  	sort(array, length, sizeof(pid_t), cmppid, NULL);
> @@ -3662,7 +3702,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
>  {
>  	int ret = -EINVAL;
>  	struct cgroup *cgrp;
> -	struct cgroup_iter it;
> +	struct cgroup_task_iter it;
>  	struct task_struct *tsk;
>  
>  	/*
> @@ -3676,8 +3716,8 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
>  	ret = 0;
>  	cgrp = dentry->d_fsdata;
>  
> -	cgroup_iter_start(cgrp, &it);
> -	while ((tsk = cgroup_iter_next(cgrp, &it))) {
> +	cgroup_task_iter_start(cgrp, &it);
> +	while ((tsk = cgroup_task_iter_next(cgrp, &it))) {
>  		switch (tsk->state) {
>  		case TASK_RUNNING:
>  			stats->nr_running++;
> @@ -3697,7 +3737,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
>  			break;
>  		}
>  	}
> -	cgroup_iter_end(cgrp, &it);
> +	cgroup_task_iter_end(cgrp, &it);
>  
>  err:
>  	return ret;
> @@ -5128,7 +5168,7 @@ void cgroup_fork(struct task_struct *child)
>   * Adds the task to the list running through its css_set if necessary and
>   * call the subsystem fork() callbacks.  Has to be after the task is
>   * visible on the task list in case we race with the first call to
> - * cgroup_iter_start() - to guarantee that the new task ends up on its
> + * cgroup_task_iter_start() - to guarantee that the new task ends up on its
>   * list.
>   */
>  void cgroup_post_fork(struct task_struct *child)
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index 98ca48d..c9177f8 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -258,7 +258,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  {
>  	struct freezer *freezer = css_freezer(css);
>  	struct cgroup_subsys_state *pos;
> -	struct cgroup_iter it;
> +	struct cgroup_task_iter it;
>  	struct task_struct *task;
>  
>  	WARN_ON_ONCE(!rcu_read_lock_held());
> @@ -279,9 +279,9 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  	}
>  
>  	/* are all tasks frozen? */
> -	cgroup_iter_start(css->cgroup, &it);
> +	cgroup_task_iter_start(css->cgroup, &it);
>  
> -	while ((task = cgroup_iter_next(css->cgroup, &it))) {
> +	while ((task = cgroup_task_iter_next(css->cgroup, &it))) {
>  		if (freezing(task)) {
>  			/*
>  			 * freezer_should_skip() indicates that the task
> @@ -296,7 +296,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  
>  	freezer->state |= CGROUP_FROZEN;
>  out_iter_end:
> -	cgroup_iter_end(css->cgroup, &it);
> +	cgroup_task_iter_end(css->cgroup, &it);
>  out_unlock:
>  	spin_unlock_irq(&freezer->lock);
>  }
> @@ -323,25 +323,25 @@ static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
>  static void freeze_cgroup(struct freezer *freezer)
>  {
>  	struct cgroup *cgroup = freezer->css.cgroup;
> -	struct cgroup_iter it;
> +	struct cgroup_task_iter it;
>  	struct task_struct *task;
>  
> -	cgroup_iter_start(cgroup, &it);
> -	while ((task = cgroup_iter_next(cgroup, &it)))
> +	cgroup_task_iter_start(cgroup, &it);
> +	while ((task = cgroup_task_iter_next(cgroup, &it)))
>  		freeze_task(task);
> -	cgroup_iter_end(cgroup, &it);
> +	cgroup_task_iter_end(cgroup, &it);
>  }
>  
>  static void unfreeze_cgroup(struct freezer *freezer)
>  {
>  	struct cgroup *cgroup = freezer->css.cgroup;
> -	struct cgroup_iter it;
> +	struct cgroup_task_iter it;
>  	struct task_struct *task;
>  
> -	cgroup_iter_start(cgroup, &it);
> -	while ((task = cgroup_iter_next(cgroup, &it)))
> +	cgroup_task_iter_start(cgroup, &it);
> +	while ((task = cgroup_task_iter_next(cgroup, &it)))
>  		__thaw_task(task);
> -	cgroup_iter_end(cgroup, &it);
> +	cgroup_task_iter_end(cgroup, &it);
>  }
>  
>  /**
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2285319..00b055d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1800,11 +1800,11 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	totalpages = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT ? : 1;
>  	for_each_mem_cgroup_tree(iter, memcg) {
>  		struct cgroup *cgroup = iter->css.cgroup;
> -		struct cgroup_iter it;
> +		struct cgroup_task_iter it;
>  		struct task_struct *task;
>  
> -		cgroup_iter_start(cgroup, &it);
> -		while ((task = cgroup_iter_next(cgroup, &it))) {
> +		cgroup_task_iter_start(cgroup, &it);
> +		while ((task = cgroup_task_iter_next(cgroup, &it))) {
>  			switch (oom_scan_process_thread(task, totalpages, NULL,
>  							false)) {
>  			case OOM_SCAN_SELECT:
> @@ -1817,7 +1817,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  			case OOM_SCAN_CONTINUE:
>  				continue;
>  			case OOM_SCAN_ABORT:
> -				cgroup_iter_end(cgroup, &it);
> +				cgroup_task_iter_end(cgroup, &it);
>  				mem_cgroup_iter_break(memcg, iter);
>  				if (chosen)
>  					put_task_struct(chosen);
> @@ -1834,7 +1834,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  				get_task_struct(chosen);
>  			}
>  		}
> -		cgroup_iter_end(cgroup, &it);
> +		cgroup_task_iter_end(cgroup, &it);
>  	}
>  
>  	if (!chosen)
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 18/23] cgroup: make cgroup_task_iter remember the cgroup being iterated
  2013-08-01 21:49 ` [PATCH 18/23] cgroup: make cgroup_task_iter remember the cgroup being iterated Tejun Heo
@ 2013-08-02 13:38   ` Michal Hocko
  0 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:38 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Matt Helsley,
	Johannes Weiner, Balbir Singh

On Thu 01-08-13 17:49:56, Tejun Heo wrote:
> Currently all cgroup_task_iter functions require @cgrp to be passed
> in, which is superfluous and increases the chance of usage errors.  Make
> cgroup_task_iter remember the cgroup being iterated and drop @cgrp
> argument from next and end functions.
> 
> This patch doesn't introduce any behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>

For the memcg part
Acked-by: Michal Hocko <mhocko@suse.cz>
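
To illustrate what the changelog describes, here is a minimal userspace
sketch of an iterator that records its origin at start time, so that the
next and end calls take only the iterator. Everything in it is a
hypothetical mock (struct list_node, struct mock_task, struct mock_cgroup,
mock_lock_depth and the mock_* functions); the counter merely stands in for
css_set_lock, and the task list is a simple circular list with a sentinel
head rather than the css_set links the kernel walks.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular singly linked list node (hypothetical, not list_head). */
struct list_node {
	struct list_node *next;
};

struct mock_task {
	struct list_node link;		/* must stay first: see the cast in next() */
	int pid;
};

struct mock_cgroup {
	struct list_node tasks;		/* sentinel head of the circular list */
};

struct mock_task_iter {
	struct mock_cgroup *origin;	/* plays the role of origin_cgrp */
	struct list_node *pos;
};

static int mock_lock_depth;		/* stands in for css_set_lock */

static void mock_task_iter_start(struct mock_cgroup *cgrp,
				 struct mock_task_iter *it)
{
	mock_lock_depth++;
	it->origin = cgrp;		/* remembered for later calls */
	it->pos = cgrp->tasks.next;
}

static struct mock_task *mock_task_iter_next(struct mock_task_iter *it)
{
	struct mock_task *res;

	/* The end of the walk is detected via the remembered origin. */
	if (it->pos == &it->origin->tasks)
		return NULL;

	res = (struct mock_task *)it->pos;	/* link is the first member */
	it->pos = it->pos->next;
	return res;
}

static void mock_task_iter_end(struct mock_task_iter *it)
{
	(void)it;
	assert(mock_lock_depth > 0);
	mock_lock_depth--;
}

/* Typical caller: start, loop on next until NULL, then end. */
static int mock_sum_pids(struct mock_cgroup *cgrp)
{
	struct mock_task_iter it;
	struct mock_task *task;
	int sum = 0;

	mock_task_iter_start(cgrp, &it);
	while ((task = mock_task_iter_next(&it)))
		sum += task->pid;
	mock_task_iter_end(&it);
	return sum;
}
```

With this shape a caller cannot pass one cgroup to start and a different
one to next or end, which is the usage error the changelog refers to.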

> ---
>  include/linux/cgroup.h  |  6 +++---
>  kernel/cgroup.c         | 32 +++++++++++++++-----------------
>  kernel/cgroup_freezer.c | 12 ++++++------
>  mm/memcontrol.c         |  6 +++---
>  4 files changed, 27 insertions(+), 29 deletions(-)
> 
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 4478336..2b10152 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -892,14 +892,14 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
>  
>  /* A cgroup_task_iter should be treated as an opaque object */
>  struct cgroup_task_iter {
> +	struct cgroup			*origin_cgrp;
>  	struct list_head		*cset_link;
>  	struct list_head		*task;
>  };
>  
>  void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it);
> -struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
> -					  struct cgroup_task_iter *it);
> -void cgroup_task_iter_end(struct cgroup *cgrp, struct cgroup_task_iter *it);
> +struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it);
> +void cgroup_task_iter_end(struct cgroup_task_iter *it);
>  int cgroup_scan_tasks(struct cgroup_scanner *scan);
>  int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
>  int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 7a4f89b..7adaaa6 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -3198,13 +3198,11 @@ EXPORT_SYMBOL_GPL(css_next_descendant_post);
>  
>  /**
>   * cgroup_advance_task_iter - advance a task iterator to the next css_set
> - * @cgrp: the cgroup to walk tasks of
>   * @it: the iterator to advance
>   *
>   * Advance @it to the next css_set to walk.
>   */
> -static void cgroup_advance_task_iter(struct cgroup *cgrp,
> -				     struct cgroup_task_iter *it)
> +static void cgroup_advance_task_iter(struct cgroup_task_iter *it)
>  {
>  	struct list_head *l = it->cset_link;
>  	struct cgrp_cset_link *link;
> @@ -3213,7 +3211,7 @@ static void cgroup_advance_task_iter(struct cgroup *cgrp,
>  	/* Advance to the next non-empty css_set */
>  	do {
>  		l = l->next;
> -		if (l == &cgrp->cset_links) {
> +		if (l == &it->origin_cgrp->cset_links) {
>  			it->cset_link = NULL;
>  			return;
>  		}
> @@ -3250,21 +3248,22 @@ void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it)
>  		cgroup_enable_task_cg_lists();
>  
>  	read_lock(&css_set_lock);
> +
> +	it->origin_cgrp = cgrp;
>  	it->cset_link = &cgrp->cset_links;
> -	cgroup_advance_task_iter(cgrp, it);
> +
> +	cgroup_advance_task_iter(it);
>  }
>  
>  /**
>   * cgroup_task_iter_next - return the next task for the iterator
> - * @cgrp: the cgroup to walk tasks of
>   * @it: the task iterator being iterated
>   *
>   * The "next" function for task iteration.  @it should have been
>   * initialized via cgroup_task_iter_start().  Returns NULL when the
>   * iteration reaches the end.
>   */
> -struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
> -					  struct cgroup_task_iter *it)
> +struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it)
>  {
>  	struct task_struct *res;
>  	struct list_head *l = it->task;
> @@ -3282,7 +3281,7 @@ struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
>  		 * We reached the end of this task list - move on to the
>  		 * next cgrp_cset_link.
>  		 */
> -		cgroup_advance_task_iter(cgrp, it);
> +		cgroup_advance_task_iter(it);
>  	} else {
>  		it->task = l;
>  	}
> @@ -3291,12 +3290,11 @@ struct task_struct *cgroup_task_iter_next(struct cgroup *cgrp,
>  
>  /**
>   * cgroup_task_iter_end - finish task iteration
> - * @cgrp: the cgroup to walk tasks of
>   * @it: the task iterator to finish
>   *
>   * Finish task iteration started by cgroup_task_iter_start().
>   */
> -void cgroup_task_iter_end(struct cgroup *cgrp, struct cgroup_task_iter *it)
> +void cgroup_task_iter_end(struct cgroup_task_iter *it)
>  	__releases(css_set_lock)
>  {
>  	read_unlock(&css_set_lock);
> @@ -3402,7 +3400,7 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
>  	 */
>  	heap->size = 0;
>  	cgroup_task_iter_start(scan->cgrp, &it);
> -	while ((p = cgroup_task_iter_next(scan->cgrp, &it))) {
> +	while ((p = cgroup_task_iter_next(&it))) {
>  		/*
>  		 * Only affect tasks that qualify per the caller's callback,
>  		 * if he provided one
> @@ -3435,7 +3433,7 @@ int cgroup_scan_tasks(struct cgroup_scanner *scan)
>  		 * the heap and wasn't inserted
>  		 */
>  	}
> -	cgroup_task_iter_end(scan->cgrp, &it);
> +	cgroup_task_iter_end(&it);
>  
>  	if (heap->size) {
>  		for (i = 0; i < heap->size; i++) {
> @@ -3657,7 +3655,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
>  		return -ENOMEM;
>  	/* now, populate the array */
>  	cgroup_task_iter_start(cgrp, &it);
> -	while ((tsk = cgroup_task_iter_next(cgrp, &it))) {
> +	while ((tsk = cgroup_task_iter_next(&it))) {
>  		if (unlikely(n == length))
>  			break;
>  		/* get tgid or pid for procs or tasks file respectively */
> @@ -3668,7 +3666,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
>  		if (pid > 0) /* make sure to only use valid results */
>  			array[n++] = pid;
>  	}
> -	cgroup_task_iter_end(cgrp, &it);
> +	cgroup_task_iter_end(&it);
>  	length = n;
>  	/* now sort & (if procs) strip out duplicates */
>  	sort(array, length, sizeof(pid_t), cmppid, NULL);
> @@ -3717,7 +3715,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
>  	cgrp = dentry->d_fsdata;
>  
>  	cgroup_task_iter_start(cgrp, &it);
> -	while ((tsk = cgroup_task_iter_next(cgrp, &it))) {
> +	while ((tsk = cgroup_task_iter_next(&it))) {
>  		switch (tsk->state) {
>  		case TASK_RUNNING:
>  			stats->nr_running++;
> @@ -3737,7 +3735,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
>  			break;
>  		}
>  	}
> -	cgroup_task_iter_end(cgrp, &it);
> +	cgroup_task_iter_end(&it);
>  
>  err:
>  	return ret;
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index c9177f8..e0ab9bf 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -281,7 +281,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  	/* are all tasks frozen? */
>  	cgroup_task_iter_start(css->cgroup, &it);
>  
> -	while ((task = cgroup_task_iter_next(css->cgroup, &it))) {
> +	while ((task = cgroup_task_iter_next(&it))) {
>  		if (freezing(task)) {
>  			/*
>  			 * freezer_should_skip() indicates that the task
> @@ -296,7 +296,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  
>  	freezer->state |= CGROUP_FROZEN;
>  out_iter_end:
> -	cgroup_task_iter_end(css->cgroup, &it);
> +	cgroup_task_iter_end(&it);
>  out_unlock:
>  	spin_unlock_irq(&freezer->lock);
>  }
> @@ -327,9 +327,9 @@ static void freeze_cgroup(struct freezer *freezer)
>  	struct task_struct *task;
>  
>  	cgroup_task_iter_start(cgroup, &it);
> -	while ((task = cgroup_task_iter_next(cgroup, &it)))
> +	while ((task = cgroup_task_iter_next(&it)))
>  		freeze_task(task);
> -	cgroup_task_iter_end(cgroup, &it);
> +	cgroup_task_iter_end(&it);
>  }
>  
>  static void unfreeze_cgroup(struct freezer *freezer)
> @@ -339,9 +339,9 @@ static void unfreeze_cgroup(struct freezer *freezer)
>  	struct task_struct *task;
>  
>  	cgroup_task_iter_start(cgroup, &it);
> -	while ((task = cgroup_task_iter_next(cgroup, &it)))
> +	while ((task = cgroup_task_iter_next(&it)))
>  		__thaw_task(task);
> -	cgroup_task_iter_end(cgroup, &it);
> +	cgroup_task_iter_end(&it);
>  }
>  
>  /**
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 00b055d..5a5f4dc 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1804,7 +1804,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  		struct task_struct *task;
>  
>  		cgroup_task_iter_start(cgroup, &it);
> -		while ((task = cgroup_task_iter_next(cgroup, &it))) {
> +		while ((task = cgroup_task_iter_next(&it))) {
>  			switch (oom_scan_process_thread(task, totalpages, NULL,
>  							false)) {
>  			case OOM_SCAN_SELECT:
> @@ -1817,7 +1817,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  			case OOM_SCAN_CONTINUE:
>  				continue;
>  			case OOM_SCAN_ABORT:
> -				cgroup_task_iter_end(cgroup, &it);
> +				cgroup_task_iter_end(&it);
>  				mem_cgroup_iter_break(memcg, iter);
>  				if (chosen)
>  					put_task_struct(chosen);
> @@ -1834,7 +1834,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  				get_task_struct(chosen);
>  			}
>  		}
> -		cgroup_task_iter_end(cgroup, &it);
> +		cgroup_task_iter_end(&it);
>  	}
>  
>  	if (!chosen)
> -- 
> 1.8.3.1
> 
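
For readers following along: the hunks quoted above can drop the cgroup argument from the next/end helpers because the iterator records where it started. Below is a minimal userspace sketch of that pattern, not kernel code. Every toy_* name is hypothetical, and a plain counter stands in for css_set_lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct toy_task {
	int pid;
	struct toy_task *next;
};

struct toy_group {
	struct toy_task *tasks;	/* member tasks, singly linked */
	int lock_depth;		/* stands in for css_set_lock */
};

/* Callers treat this as opaque, like the iterator in the patch. */
struct toy_task_iter {
	struct toy_group *origin;	/* recorded once by _start() */
	struct toy_task *pos;		/* next task to hand out */
};

static void toy_task_iter_start(struct toy_group *grp,
				struct toy_task_iter *it)
{
	grp->lock_depth++;		/* "lock" is held until _end() */
	it->origin = grp;
	it->pos = grp->tasks;
}

/* Only the iterator is needed; the group is reachable via it->origin. */
static struct toy_task *toy_task_iter_next(struct toy_task_iter *it)
{
	struct toy_task *res = it->pos;

	if (res)
		it->pos = res->next;
	return res;
}

static void toy_task_iter_end(struct toy_task_iter *it)
{
	assert(it->origin->lock_depth > 0);
	it->origin->lock_depth--;
}

/* Typical caller: the group is named exactly once, at start. */
static int toy_count_tasks(struct toy_group *grp)
{
	struct toy_task_iter it;
	int n = 0;

	toy_task_iter_start(grp, &it);
	while (toy_task_iter_next(&it))
		n++;
	toy_task_iter_end(&it);
	return n;
}
```

Naming the group only at start time means a caller cannot pass a different group to a later call by mistake, which is the same reason the quoted call sites shrink to a single argument.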

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 20/23] cgroup: make task iterators deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 ` [PATCH 20/23] cgroup: make task iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
@ 2013-08-02 13:40   ` Michal Hocko
  0 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:40 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Johannes Weiner,
	Balbir Singh, Matt Helsley

On Thu 01-08-13 17:49:58, Tejun Heo wrote:
> cgroup is in the process of converting to css (cgroup_subsys_state)
> from cgroup as the principal subsystem interface handle.  This is
> mostly to prepare for the unified hierarchy support where css's will
> be created and destroyed dynamically but also helps cleaning up
> subsystem implementations as css is usually what they are interested
> in anyway.
> 
> This patch converts task iterators to deal with css instead of cgroup.
> Note that under unified hierarchy, different sets of tasks will be
> considered belonging to a given cgroup depending on the subsystem in
> question and making the iterators deal with css instead of cgroup
> provides them with enough information about the iteration.
> 
> While at it, fix several function comment formats in cpuset.c.
> 
> This patch doesn't introduce any behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>

For the memcg part:
Acked-by: Michal Hocko <mhocko@suse.cz>

> ---
>  include/linux/cgroup.h  |  21 ++++-----
>  kernel/cgroup.c         | 112 ++++++++++++++++++++++++------------------------
>  kernel/cgroup_freezer.c |  26 ++++++-----
>  kernel/cpuset.c         |  41 ++++++++----------
>  mm/memcontrol.c         |  11 +++--
>  5 files changed, 104 insertions(+), 107 deletions(-)
> 
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 2e9a799..6f6d87b 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -881,21 +881,22 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
>  	for ((pos) = css_next_descendant_post(NULL, (css)); (pos);	\
>  	     (pos) = css_next_descendant_post((pos), (css)))
>  
> -/* A cgroup_task_iter should be treated as an opaque object */
> -struct cgroup_task_iter {
> -	struct cgroup			*origin_cgrp;
> +/* A css_task_iter should be treated as an opaque object */
> +struct css_task_iter {
> +	struct cgroup_subsys_state	*origin_css;
>  	struct list_head		*cset_link;
>  	struct list_head		*task;
>  };
>  
> -void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it);
> -struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it);
> -void cgroup_task_iter_end(struct cgroup_task_iter *it);
> +void css_task_iter_start(struct cgroup_subsys_state *css,
> +			 struct css_task_iter *it);
> +struct task_struct *css_task_iter_next(struct css_task_iter *it);
> +void css_task_iter_end(struct css_task_iter *it);
>  
> -int cgroup_scan_tasks(struct cgroup *cgrp,
> -		      bool (*test)(struct task_struct *, void *),
> -		      void (*process)(struct task_struct *, void *),
> -		      void *data, struct ptr_heap *heap);
> +int css_scan_tasks(struct cgroup_subsys_state *css,
> +		   bool (*test)(struct task_struct *, void *),
> +		   void (*process)(struct task_struct *, void *),
> +		   void *data, struct ptr_heap *heap);
>  
>  int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
>  int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 4e354b59..c61b24f 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -370,7 +370,7 @@ static int cgroup_init_idr(struct cgroup_subsys *ss,
>  /*
>   * css_set_lock protects the list of css_set objects, and the chain of
>   * tasks off each css_set.  Nests outside task->alloc_lock due to
> - * cgroup_task_iter_start().
> + * css_task_iter_start().
>   */
>  static DEFINE_RWLOCK(css_set_lock);
>  static int css_set_count;
> @@ -398,9 +398,9 @@ static unsigned long css_set_hash(struct cgroup_subsys_state *css[])
>  
>  /*
>   * We don't maintain the lists running through each css_set to its task
> - * until after the first call to cgroup_task_iter_start().  This reduces
> - * the fork()/exit() overhead for people who have cgroups compiled into
> - * their kernel but not actually in use.
> + * until after the first call to css_task_iter_start().  This reduces the
> + * fork()/exit() overhead for people who have cgroups compiled into their
> + * kernel but not actually in use.
>   */
>  static int use_task_css_set_links __read_mostly;
>  
> @@ -2982,7 +2982,7 @@ int cgroup_task_count(const struct cgroup *cgrp)
>   * To reduce the fork() overhead for systems that are not actually using
>   * their cgroups capability, we don't maintain the lists running through
>   * each css_set to its tasks until we see the list actually used - in other
> - * words after the first call to cgroup_task_iter_start().
> + * words after the first call to css_task_iter_start().
>   */
>  static void cgroup_enable_task_cg_lists(void)
>  {
> @@ -3197,12 +3197,12 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
>  EXPORT_SYMBOL_GPL(css_next_descendant_post);
>  
>  /**
> - * cgroup_advance_task_iter - advance a task itererator to the next css_set
> + * css_advance_task_iter - advance a task iterator to the next css_set
>   * @it: the iterator to advance
>   *
>   * Advance @it to the next css_set to walk.
>   */
> -static void cgroup_advance_task_iter(struct cgroup_task_iter *it)
> +static void css_advance_task_iter(struct css_task_iter *it)
>  {
>  	struct list_head *l = it->cset_link;
>  	struct cgrp_cset_link *link;
> @@ -3211,7 +3211,7 @@ static void cgroup_advance_task_iter(struct cgroup_task_iter *it)
>  	/* Advance to the next non-empty css_set */
>  	do {
>  		l = l->next;
> -		if (l == &it->origin_cgrp->cset_links) {
> +		if (l == &it->origin_css->cgroup->cset_links) {
>  			it->cset_link = NULL;
>  			return;
>  		}
> @@ -3223,47 +3223,48 @@ static void cgroup_advance_task_iter(struct cgroup_task_iter *it)
>  }
>  
>  /**
> - * cgroup_task_iter_start - initiate task iteration
> - * @cgrp: the cgroup to walk tasks of
> + * css_task_iter_start - initiate task iteration
> + * @css: the css to walk tasks of
>   * @it: the task iterator to use
>   *
> - * Initiate iteration through the tasks of @cgrp.  The caller can call
> - * cgroup_task_iter_next() to walk through the tasks until the function
> - * returns NULL.  On completion of iteration, cgroup_task_iter_end() must
> - * be called.
> + * Initiate iteration through the tasks of @css.  The caller can call
> + * css_task_iter_next() to walk through the tasks until the function
> + * returns NULL.  On completion of iteration, css_task_iter_end() must be
> + * called.
>   *
>   * Note that this function acquires a lock which is released when the
>   * iteration finishes.  The caller can't sleep while iteration is in
>   * progress.
>   */
> -void cgroup_task_iter_start(struct cgroup *cgrp, struct cgroup_task_iter *it)
> +void css_task_iter_start(struct cgroup_subsys_state *css,
> +			 struct css_task_iter *it)
>  	__acquires(css_set_lock)
>  {
>  	/*
> -	 * The first time anyone tries to iterate across a cgroup,
> -	 * we need to enable the list linking each css_set to its
> -	 * tasks, and fix up all existing tasks.
> +	 * The first time anyone tries to iterate across a css, we need to
> +	 * enable the list linking each css_set to its tasks, and fix up
> +	 * all existing tasks.
>  	 */
>  	if (!use_task_css_set_links)
>  		cgroup_enable_task_cg_lists();
>  
>  	read_lock(&css_set_lock);
>  
> -	it->origin_cgrp = cgrp;
> -	it->cset_link = &cgrp->cset_links;
> +	it->origin_css = css;
> +	it->cset_link = &css->cgroup->cset_links;
>  
> -	cgroup_advance_task_iter(it);
> +	css_advance_task_iter(it);
>  }
>  
>  /**
> - * cgroup_task_iter_next - return the next task for the iterator
> + * css_task_iter_next - return the next task for the iterator
>   * @it: the task iterator being iterated
>   *
>   * The "next" function for task iteration.  @it should have been
> - * initialized via cgroup_task_iter_start().  Returns NULL when the
> - * iteration reaches the end.
> + * initialized via css_task_iter_start().  Returns NULL when the iteration
> + * reaches the end.
>   */
> -struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it)
> +struct task_struct *css_task_iter_next(struct css_task_iter *it)
>  {
>  	struct task_struct *res;
>  	struct list_head *l = it->task;
> @@ -3281,7 +3282,7 @@ struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it)
>  		 * We reached the end of this task list - move on to the
>  		 * next cgrp_cset_link.
>  		 */
> -		cgroup_advance_task_iter(it);
> +		css_advance_task_iter(it);
>  	} else {
>  		it->task = l;
>  	}
> @@ -3289,12 +3290,12 @@ struct task_struct *cgroup_task_iter_next(struct cgroup_task_iter *it)
>  }
>  
>  /**
> - * cgroup_task_iter_end - finish task iteration
> + * css_task_iter_end - finish task iteration
>   * @it: the task iterator to finish
>   *
> - * Finish task iteration started by cgroup_task_iter_start().
> + * Finish task iteration started by css_task_iter_start().
>   */
> -void cgroup_task_iter_end(struct cgroup_task_iter *it)
> +void css_task_iter_end(struct css_task_iter *it)
>  	__releases(css_set_lock)
>  {
>  	read_unlock(&css_set_lock);
> @@ -3335,24 +3336,24 @@ static inline int started_after(void *p1, void *p2)
>  }
>  
>  /**
> - * cgroup_scan_tasks - iterate though all the tasks in a cgroup
> - * @cgrp: the cgroup to iterate tasks of
> + * css_scan_tasks - iterate through all the tasks in a css
> + * @css: the css to iterate tasks of
>   * @test: optional test callback
>   * @process: process callback
>   * @data: data passed to @test and @process
>   * @heap: optional pre-allocated heap used for task iteration
>   *
> - * Iterate through all the tasks in a cgroup, calling @test for each, and
> - * if it returns %true, call @process for it also.
> + * Iterate through all the tasks in @css, calling @test for each, and if it
> + * returns %true, call @process for it also.
>   *
>   * @test may be NULL, meaning always true (select all tasks), which
> - * effectively duplicates cgroup_task_iter_{start,next,end}() but does not
> + * effectively duplicates css_task_iter_{start,next,end}() but does not
>   * lock css_set_lock for the call to @process.
>   *
>   * It is guaranteed that @process will act on every task that is a member
> - * of @cgrp for the duration of this call.  This function may or may not
> - * call @process for tasks that exit or move to a different cgroup during
> - * the call, or are forked or move into the cgroup during the call.
> + * of @css for the duration of this call.  This function may or may not
> + * call @process for tasks that exit or move to a different css during the
> + * call, or are forked or move into the css during the call.
>   *
>   * Note that @test may be called with locks held, and may in some
>   * situations be called multiple times for the same task, so it should be
> @@ -3363,13 +3364,13 @@ static inline int started_after(void *p1, void *p2)
>   * temporary heap will be used (allocation of which may cause this function
>   * to fail).
>   */
> -int cgroup_scan_tasks(struct cgroup *cgrp,
> -		      bool (*test)(struct task_struct *, void *),
> -		      void (*process)(struct task_struct *, void *),
> -		      void *data, struct ptr_heap *heap)
> +int css_scan_tasks(struct cgroup_subsys_state *css,
> +		   bool (*test)(struct task_struct *, void *),
> +		   void (*process)(struct task_struct *, void *),
> +		   void *data, struct ptr_heap *heap)
>  {
>  	int retval, i;
> -	struct cgroup_task_iter it;
> +	struct css_task_iter it;
>  	struct task_struct *p, *dropped;
>  	/* Never dereference latest_task, since it's not refcounted */
>  	struct task_struct *latest_task = NULL;
> @@ -3390,7 +3391,7 @@ int cgroup_scan_tasks(struct cgroup *cgrp,
>  
>   again:
>  	/*
> -	 * Scan tasks in the cgroup, using the @test callback to determine
> +	 * Scan tasks in the css, using the @test callback to determine
>  	 * which are of interest, and invoking @process callback on the
>  	 * ones which need an update.  Since we don't want to hold any
>  	 * locks during the task updates, gather tasks to be processed in a
> @@ -3401,8 +3402,8 @@ int cgroup_scan_tasks(struct cgroup *cgrp,
>  	 * guarantees forward progress and that we don't miss any tasks.
>  	 */
>  	heap->size = 0;
> -	cgroup_task_iter_start(cgrp, &it);
> -	while ((p = cgroup_task_iter_next(&it))) {
> +	css_task_iter_start(css, &it);
> +	while ((p = css_task_iter_next(&it))) {
>  		/*
>  		 * Only affect tasks that qualify per the caller's callback,
>  		 * if he provided one
> @@ -3435,7 +3436,7 @@ int cgroup_scan_tasks(struct cgroup *cgrp,
>  		 * the heap and wasn't inserted
>  		 */
>  	}
> -	cgroup_task_iter_end(&it);
> +	css_task_iter_end(&it);
>  
>  	if (heap->size) {
>  		for (i = 0; i < heap->size; i++) {
> @@ -3478,7 +3479,8 @@ static void cgroup_transfer_one_task(struct task_struct *task, void *data)
>   */
>  int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
>  {
> -	return cgroup_scan_tasks(from, NULL, cgroup_transfer_one_task, to, NULL);
> +	return css_scan_tasks(&from->dummy_css, NULL, cgroup_transfer_one_task,
> +			      to, NULL);
>  }
>  
>  /*
> @@ -3632,7 +3634,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
>  	pid_t *array;
>  	int length;
>  	int pid, n = 0; /* used for populating the array */
> -	struct cgroup_task_iter it;
> +	struct css_task_iter it;
>  	struct task_struct *tsk;
>  	struct cgroup_pidlist *l;
>  
> @@ -3647,8 +3649,8 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
>  	if (!array)
>  		return -ENOMEM;
>  	/* now, populate the array */
> -	cgroup_task_iter_start(cgrp, &it);
> -	while ((tsk = cgroup_task_iter_next(&it))) {
> +	css_task_iter_start(&cgrp->dummy_css, &it);
> +	while ((tsk = css_task_iter_next(&it))) {
>  		if (unlikely(n == length))
>  			break;
>  		/* get tgid or pid for procs or tasks file respectively */
> @@ -3659,7 +3661,7 @@ static int pidlist_array_load(struct cgroup *cgrp, enum cgroup_filetype type,
>  		if (pid > 0) /* make sure to only use valid results */
>  			array[n++] = pid;
>  	}
> -	cgroup_task_iter_end(&it);
> +	css_task_iter_end(&it);
>  	length = n;
>  	/* now sort & (if procs) strip out duplicates */
>  	sort(array, length, sizeof(pid_t), cmppid, NULL);
> @@ -3693,7 +3695,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
>  {
>  	int ret = -EINVAL;
>  	struct cgroup *cgrp;
> -	struct cgroup_task_iter it;
> +	struct css_task_iter it;
>  	struct task_struct *tsk;
>  
>  	/*
> @@ -3707,8 +3709,8 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
>  	ret = 0;
>  	cgrp = dentry->d_fsdata;
>  
> -	cgroup_task_iter_start(cgrp, &it);
> -	while ((tsk = cgroup_task_iter_next(&it))) {
> +	css_task_iter_start(&cgrp->dummy_css, &it);
> +	while ((tsk = css_task_iter_next(&it))) {
>  		switch (tsk->state) {
>  		case TASK_RUNNING:
>  			stats->nr_running++;
> @@ -3728,7 +3730,7 @@ int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry)
>  			break;
>  		}
>  	}
> -	cgroup_task_iter_end(&it);
> +	css_task_iter_end(&it);
>  
>  err:
>  	return ret;
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index e0ab9bf..5cd2b6d 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -258,7 +258,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  {
>  	struct freezer *freezer = css_freezer(css);
>  	struct cgroup_subsys_state *pos;
> -	struct cgroup_task_iter it;
> +	struct css_task_iter it;
>  	struct task_struct *task;
>  
>  	WARN_ON_ONCE(!rcu_read_lock_held());
> @@ -279,9 +279,9 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  	}
>  
>  	/* are all tasks frozen? */
> -	cgroup_task_iter_start(css->cgroup, &it);
> +	css_task_iter_start(css, &it);
>  
> -	while ((task = cgroup_task_iter_next(&it))) {
> +	while ((task = css_task_iter_next(&it))) {
>  		if (freezing(task)) {
>  			/*
>  			 * freezer_should_skip() indicates that the task
> @@ -296,7 +296,7 @@ static void update_if_frozen(struct cgroup_subsys_state *css)
>  
>  	freezer->state |= CGROUP_FROZEN;
>  out_iter_end:
> -	cgroup_task_iter_end(&it);
> +	css_task_iter_end(&it);
>  out_unlock:
>  	spin_unlock_irq(&freezer->lock);
>  }
> @@ -322,26 +322,24 @@ static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
>  
>  static void freeze_cgroup(struct freezer *freezer)
>  {
> -	struct cgroup *cgroup = freezer->css.cgroup;
> -	struct cgroup_task_iter it;
> +	struct css_task_iter it;
>  	struct task_struct *task;
>  
> -	cgroup_task_iter_start(cgroup, &it);
> -	while ((task = cgroup_task_iter_next(&it)))
> +	css_task_iter_start(&freezer->css, &it);
> +	while ((task = css_task_iter_next(&it)))
>  		freeze_task(task);
> -	cgroup_task_iter_end(&it);
> +	css_task_iter_end(&it);
>  }
>  
>  static void unfreeze_cgroup(struct freezer *freezer)
>  {
> -	struct cgroup *cgroup = freezer->css.cgroup;
> -	struct cgroup_task_iter it;
> +	struct css_task_iter it;
>  	struct task_struct *task;
>  
> -	cgroup_task_iter_start(cgroup, &it);
> -	while ((task = cgroup_task_iter_next(&it)))
> +	css_task_iter_start(&freezer->css, &it);
> +	while ((task = css_task_iter_next(&it)))
>  		__thaw_task(task);
> -	cgroup_task_iter_end(&it);
> +	css_task_iter_end(&it);
>  }
>  
>  /**
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 6fe23f2..39e5217 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -832,8 +832,8 @@ static struct cpuset *effective_nodemask_cpuset(struct cpuset *cs)
>   * @tsk: task to test
>   * @data: cpuset to @tsk belongs to
>   *
> - * Called by cgroup_scan_tasks() for each task in a cgroup whose
> - * cpus_allowed mask needs to be changed.
> + * Called by css_scan_tasks() for each task in a cgroup whose cpus_allowed
> + * mask needs to be changed.
>   *
>   * We don't need to re-check for the cgroup/cpuset membership, since we're
>   * holding cpuset_mutex at this point.
> @@ -849,27 +849,26 @@ static void cpuset_change_cpumask(struct task_struct *tsk, void *data)
>  /**
>   * update_tasks_cpumask - Update the cpumasks of tasks in the cpuset.
>   * @cs: the cpuset in which each task's cpus_allowed mask needs to be changed
> - * @heap: if NULL, defer allocating heap memory to cgroup_scan_tasks()
> + * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
>   *
>   * Called with cpuset_mutex held
>   *
> - * The cgroup_scan_tasks() function will scan all the tasks in a cgroup,
> + * The css_scan_tasks() function will scan all the tasks in a cgroup,
>   * calling callback functions for each.
>   *
> - * No return value. It's guaranteed that cgroup_scan_tasks() always returns 0
> + * No return value. It's guaranteed that css_scan_tasks() always returns 0
>   * if @heap != NULL.
>   */
>  static void update_tasks_cpumask(struct cpuset *cs, struct ptr_heap *heap)
>  {
> -	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_cpumask, cs,
> -			  heap);
> +	css_scan_tasks(&cs->css, NULL, cpuset_change_cpumask, cs, heap);
>  }
>  
>  /*
>   * update_tasks_cpumask_hier - Update the cpumasks of tasks in the hierarchy.
>   * @root_cs: the root cpuset of the hierarchy
>   * @update_root: update root cpuset or not?
> - * @heap: the heap used by cgroup_scan_tasks()
> + * @heap: the heap used by css_scan_tasks()
>   *
>   * This will update cpumasks of tasks in @root_cs and all other empty cpusets
>   * which take on cpumask of @root_cs.
> @@ -1082,11 +1081,10 @@ static void *cpuset_being_rebound;
>  /**
>   * update_tasks_nodemask - Update the nodemasks of tasks in the cpuset.
>   * @cs: the cpuset in which each task's mems_allowed mask needs to be changed
> - * @heap: if NULL, defer allocating heap memory to cgroup_scan_tasks()
> + * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
>   *
> - * Called with cpuset_mutex held
> - * No return value. It's guaranteed that cgroup_scan_tasks() always returns 0
> - * if @heap != NULL.
> + * Called with cpuset_mutex held.  No return value. It's guaranteed that
> + * css_scan_tasks() always returns 0 if @heap != NULL.
>   */
>  static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
>  {
> @@ -1109,8 +1107,7 @@ static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
>  	 * It's ok if we rebind the same mm twice; mpol_rebind_mm()
>  	 * is idempotent.  Also migrate pages in each mm to new nodes.
>  	 */
> -	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_nodemask, &arg,
> -			  heap);
> +	css_scan_tasks(&cs->css, NULL, cpuset_change_nodemask, &arg, heap);
>  
>  	/*
>  	 * All the tasks' nodemasks have been updated, update
> @@ -1126,7 +1123,7 @@ static void update_tasks_nodemask(struct cpuset *cs, struct ptr_heap *heap)
>   * update_tasks_nodemask_hier - Update the nodemasks of tasks in the hierarchy.
>   * @cs: the root cpuset of the hierarchy
>   * @update_root: update the root cpuset or not?
> - * @heap: the heap used by cgroup_scan_tasks()
> + * @heap: the heap used by css_scan_tasks()
>   *
>   * This will update nodemasks of tasks in @root_cs and all other empty cpusets
>   * which take on nodemask of @root_cs.
> @@ -1254,12 +1251,12 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val)
>  	return 0;
>  }
>  
> -/*
> +/**
>   * cpuset_change_flag - make a task's spread flags the same as its cpuset's
>   * @tsk: task to be updated
>   * @data: cpuset to @tsk belongs to
>   *
> - * Called by cgroup_scan_tasks() for each task in a cgroup.
> + * Called by css_scan_tasks() for each task in a cgroup.
>   *
>   * We don't need to re-check for the cgroup/cpuset membership, since we're
>   * holding cpuset_mutex at this point.
> @@ -1271,22 +1268,22 @@ static void cpuset_change_flag(struct task_struct *tsk, void *data)
>  	cpuset_update_task_spread_flag(cs, tsk);
>  }
>  
> -/*
> +/**
>   * update_tasks_flags - update the spread flags of tasks in the cpuset.
>   * @cs: the cpuset in which each task's spread flags needs to be changed
> - * @heap: if NULL, defer allocating heap memory to cgroup_scan_tasks()
> + * @heap: if NULL, defer allocating heap memory to css_scan_tasks()
>   *
>   * Called with cpuset_mutex held
>   *
> - * The cgroup_scan_tasks() function will scan all the tasks in a cgroup,
> + * The css_scan_tasks() function will scan all the tasks in a cgroup,
>   * calling callback functions for each.
>   *
> - * No return value. It's guaranteed that cgroup_scan_tasks() always returns 0
> + * No return value. It's guaranteed that css_scan_tasks() always returns 0
>   * if @heap != NULL.
>   */
>  static void update_tasks_flags(struct cpuset *cs, struct ptr_heap *heap)
>  {
> -	cgroup_scan_tasks(cs->css.cgroup, NULL, cpuset_change_flag, cs, heap);
> +	css_scan_tasks(&cs->css, NULL, cpuset_change_flag, cs, heap);
>  }
>  
>  /*
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5a5f4dc..95106a9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1799,12 +1799,11 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL);
>  	totalpages = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT ? : 1;
>  	for_each_mem_cgroup_tree(iter, memcg) {
> -		struct cgroup *cgroup = iter->css.cgroup;
> -		struct cgroup_task_iter it;
> +		struct css_task_iter it;
>  		struct task_struct *task;
>  
> -		cgroup_task_iter_start(cgroup, &it);
> -		while ((task = cgroup_task_iter_next(&it))) {
> +		css_task_iter_start(&iter->css, &it);
> +		while ((task = css_task_iter_next(&it))) {
>  			switch (oom_scan_process_thread(task, totalpages, NULL,
>  							false)) {
>  			case OOM_SCAN_SELECT:
> @@ -1817,7 +1816,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  			case OOM_SCAN_CONTINUE:
>  				continue;
>  			case OOM_SCAN_ABORT:
> -				cgroup_task_iter_end(&it);
> +				css_task_iter_end(&it);
>  				mem_cgroup_iter_break(memcg, iter);
>  				if (chosen)
>  					put_task_struct(chosen);
> @@ -1834,7 +1833,7 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  				get_task_struct(chosen);
>  			}
>  		}
> -		cgroup_task_iter_end(&it);
> +		css_task_iter_end(&it);
>  	}
>  
>  	if (!chosen)
> -- 
> 1.8.3.1
> 
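
To illustrate the calling convention in the patch quoted above: a subsystem hands over the css embedded in its own state (as the freezer hunks do with &freezer->css), while core code that has no subsystem at hand uses the cgroup's dummy_css. The following is a rough userspace sketch under those assumptions, not kernel code. All toy_* names are hypothetical, a flat pid array replaces the css_set lists, and locking is left out.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cgroup;

/* Hypothetical stand-in for cgroup_subsys_state. */
struct toy_css {
	struct toy_cgroup *cgroup;	/* back pointer, like css->cgroup */
};

struct toy_cgroup {
	struct toy_css dummy_css;	/* handle for code with no subsystem */
	const int *pids;		/* flat array instead of css_set lists */
	size_t nr_pids;
};

/* A subsystem embeds the css in its own per-cgroup state. */
struct toy_freezer {
	struct toy_css css;
	unsigned int state;
};

struct toy_css_task_iter {
	struct toy_css *origin_css;
	size_t idx;
};

static void toy_css_task_iter_start(struct toy_css *css,
				    struct toy_css_task_iter *it)
{
	it->origin_css = css;
	it->idx = 0;
}

/* Returns NULL at the end, mirroring the NULL-terminated walk. */
static const int *toy_css_task_iter_next(struct toy_css_task_iter *it)
{
	struct toy_cgroup *cgrp = it->origin_css->cgroup;

	if (it->idx >= cgrp->nr_pids)
		return NULL;
	return &cgrp->pids[it->idx++];
}

static size_t toy_css_count_tasks(struct toy_css *css)
{
	struct toy_css_task_iter it;
	size_t n = 0;

	toy_css_task_iter_start(css, &it);
	while (toy_css_task_iter_next(&it))
		n++;
	return n;
}

/* Subsystem caller: passes its own css, never touches the cgroup. */
static size_t toy_freezer_count_tasks(struct toy_freezer *freezer)
{
	return toy_css_count_tasks(&freezer->css);
}

/* Core caller: only has a cgroup, so it goes through dummy_css. */
static size_t toy_cgroup_count_tasks(struct toy_cgroup *cgrp)
{
	return toy_css_count_tasks(&cgrp->dummy_css);
}
```

In this sketch both paths resolve to the same task list. Per the commit message, that need not hold under unified hierarchy, which is why the iterator keeps the css rather than the cgroup.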

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 21/23] cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 ` [PATCH 21/23] cgroup: make cftype->[un]register_event() " Tejun Heo
  2013-08-02  4:08   ` Li Zefan
@ 2013-08-02 13:42   ` Michal Hocko
  2013-08-02 20:24   ` [PATCH v2 " Tejun Heo
  2 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Johannes Weiner,
	Balbir Singh

On Thu 01-08-13 17:49:59, Tejun Heo wrote:
> cgroup is in the process of converting to css (cgroup_subsys_state)
> from cgroup as the principal subsystem interface handle.  This is
> mostly to prepare for the unified hierarchy support where css's will
> be created and destroyed dynamically but also helps cleaning up
> subsystem implementations as css is usually what they are interested
> in anyway.
> 
> cftype->[un]register_event() is among the remaining couple interfaces
> which still use struct cgroup.  Convert it to cgroup_subsys_state.
> The conversion is mostly mechanical and removes the last users of
> mem_cgroup_from_cont() and cg_to_vmpressure(), which are removed.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>

Acked-by: Michal Hocko <mhocko@suse.cz>
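
As a rough illustration of what the conversion means for a subsystem: once the callbacks receive a css, the subsystem can recover its own state directly from it instead of mapping from a cgroup first. The sketch below is userspace C under simplifying assumptions, not the kernel interface. Every toy_* name is hypothetical, the eventfd argument is omitted, and the "threshold" field is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local equivalent of the kernel's container_of(), for this sketch only. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_css {
	int id;
};

struct toy_cftype {
	/* Callbacks take the css; the eventfd argument is omitted here. */
	int (*register_event)(struct toy_css *css, struct toy_cftype *cft,
			      const char *args);
	void (*unregister_event)(struct toy_css *css, struct toy_cftype *cft);
};

/* Hypothetical subsystem state with an embedded css. */
struct toy_memcg {
	long threshold;
	struct toy_css css;
};

/* Core-side record: stores the css, as the converted event struct does. */
struct toy_event {
	struct toy_css *css;
	struct toy_cftype *cft;
};

static struct toy_memcg *toy_memcg_from_css(struct toy_css *css)
{
	return toy_container_of(css, struct toy_memcg, css);
}

static int toy_memcg_register_event(struct toy_css *css,
				    struct toy_cftype *cft, const char *args)
{
	struct toy_memcg *memcg = toy_memcg_from_css(css);
	char *end;
	long val = strtol(args, &end, 10);

	(void)cft;
	if (end == args || *end != '\0')
		return -1;	/* not a plain decimal number */
	memcg->threshold = val;
	return 0;
}

static void toy_memcg_unregister_event(struct toy_css *css,
				       struct toy_cftype *cft)
{
	(void)cft;
	toy_memcg_from_css(css)->threshold = 0;
}

/* Core side: remember the css and pass it straight to the callback. */
static int toy_event_add(struct toy_event *event, struct toy_css *css,
			 struct toy_cftype *cft, const char *args)
{
	assert(cft->register_event && cft->unregister_event);
	event->css = css;
	event->cft = cft;
	return cft->register_event(css, cft, args);
}

static void toy_event_remove(struct toy_event *event)
{
	event->cft->unregister_event(event->css, event->cft);
}
```

The subsystem side needs only a container_of()-style step to get from the css to its own structure, which is the reason the cgroup-based lookup helpers named in the commit message lose their last users.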

> ---
>  include/linux/cgroup.h     |  8 +++++---
>  include/linux/vmpressure.h |  6 ++++--
>  kernel/cgroup.c            | 15 ++++++++-------
>  mm/memcontrol.c            | 21 ++++++++-------------
>  mm/vmpressure.c            | 21 +++++++++------------
>  5 files changed, 34 insertions(+), 37 deletions(-)
> 
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 6f6d87b..8f44411 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -506,15 +506,17 @@ struct cftype {
>  	 * you want to provide this functionality. Use eventfd_signal()
>  	 * on eventfd to send notification to userspace.
>  	 */
> -	int (*register_event)(struct cgroup *cgrp, struct cftype *cft,
> -			struct eventfd_ctx *eventfd, const char *args);
> +	int (*register_event)(struct cgroup_subsys_state *css,
> +			      struct cftype *cft, struct eventfd_ctx *eventfd,
> +			      const char *args);
>  	/*
>  	 * unregister_event() callback will be called when userspace
>  	 * closes the eventfd or on cgroup removing.
>  	 * This callback must be implemented, if you want provide
>  	 * notification functionality.
>  	 */
> -	void (*unregister_event)(struct cgroup *cgrp, struct cftype *cft,
> +	void (*unregister_event)(struct cgroup_subsys_state *css,
> +				 struct cftype *cft,
>  			struct eventfd_ctx *eventfd);
>  };
>  
> diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h
> index 76be077..b239482 100644
> --- a/include/linux/vmpressure.h
> +++ b/include/linux/vmpressure.h
> @@ -33,10 +33,12 @@ extern void vmpressure_init(struct vmpressure *vmpr);
>  extern struct vmpressure *memcg_to_vmpressure(struct mem_cgroup *memcg);
>  extern struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr);
>  extern struct vmpressure *css_to_vmpressure(struct cgroup_subsys_state *css);
> -extern int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
> +extern int vmpressure_register_event(struct cgroup_subsys_state *css,
> +				     struct cftype *cft,
>  				     struct eventfd_ctx *eventfd,
>  				     const char *args);
> -extern void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
> +extern void vmpressure_unregister_event(struct cgroup_subsys_state *css,
> +					struct cftype *cft,
>  					struct eventfd_ctx *eventfd);
>  #else
>  static inline void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index c61b24f..e0ef58e 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -159,9 +159,9 @@ struct css_id {
>   */
>  struct cgroup_event {
>  	/*
> -	 * Cgroup which the event belongs to.
> +	 * css which the event belongs to.
>  	 */
> -	struct cgroup *cgrp;
> +	struct cgroup_subsys_state *css;
>  	/*
>  	 * Control file which the event associated.
>  	 */
> @@ -3948,11 +3948,12 @@ static void cgroup_event_remove(struct work_struct *work)
>  {
>  	struct cgroup_event *event = container_of(work, struct cgroup_event,
>  			remove);
> -	struct cgroup *cgrp = event->cgrp;
> +	struct cgroup_subsys_state *css = event->css;
> +	struct cgroup *cgrp = css->cgroup;
>  
>  	remove_wait_queue(event->wqh, &event->wait);
>  
> -	event->cft->unregister_event(cgrp, event->cft, event->eventfd);
> +	event->cft->unregister_event(css, event->cft, event->eventfd);
>  
>  	/* Notify userspace the event is going away. */
>  	eventfd_signal(event->eventfd, 1);
> @@ -3972,7 +3973,7 @@ static int cgroup_event_wake(wait_queue_t *wait, unsigned mode,
>  {
>  	struct cgroup_event *event = container_of(wait,
>  			struct cgroup_event, wait);
> -	struct cgroup *cgrp = event->cgrp;
> +	struct cgroup *cgrp = event->css->cgroup;
>  	unsigned long flags = (unsigned long)key;
>  
>  	if (flags & POLLHUP) {
> @@ -4041,7 +4042,7 @@ static int cgroup_write_event_control(struct cgroup_subsys_state *css,
>  	event = kzalloc(sizeof(*event), GFP_KERNEL);
>  	if (!event)
>  		return -ENOMEM;
> -	event->cgrp = cgrp;
> +	event->css = css;
>  	INIT_LIST_HEAD(&event->list);
>  	init_poll_funcptr(&event->pt, cgroup_event_ptable_queue_proc);
>  	init_waitqueue_func_entry(&event->wait, cgroup_event_wake);
> @@ -4092,7 +4093,7 @@ static int cgroup_write_event_control(struct cgroup_subsys_state *css,
>  		goto out_put_cfile;
>  	}
>  
> -	ret = event->cft->register_event(cgrp, event->cft,
> +	ret = event->cft->register_event(css, event->cft,
>  			event->eventfd, buffer);
>  	if (ret)
>  		goto out_put_cfile;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 95106a9..2885e3e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1034,11 +1034,6 @@ static void memcg_check_events(struct mem_cgroup *memcg, struct page *page)
>  		preempt_enable();
>  }
>  
> -static inline struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont)
> -{
> -	return mem_cgroup_from_css(cgroup_css(cont, mem_cgroup_subsys_id));
> -}
> -
>  struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
>  {
>  	/*
> @@ -5620,10 +5615,10 @@ static void mem_cgroup_oom_notify(struct mem_cgroup *memcg)
>  		mem_cgroup_oom_notify_cb(iter);
>  }
>  
> -static int mem_cgroup_usage_register_event(struct cgroup *cgrp,
> +static int mem_cgroup_usage_register_event(struct cgroup_subsys_state *css,
>  	struct cftype *cft, struct eventfd_ctx *eventfd, const char *args)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup_thresholds *thresholds;
>  	struct mem_cgroup_threshold_ary *new;
>  	enum res_type type = MEMFILE_TYPE(cft->private);
> @@ -5703,10 +5698,10 @@ unlock:
>  	return ret;
>  }
>  
> -static void mem_cgroup_usage_unregister_event(struct cgroup *cgrp,
> +static void mem_cgroup_usage_unregister_event(struct cgroup_subsys_state *css,
>  	struct cftype *cft, struct eventfd_ctx *eventfd)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup_thresholds *thresholds;
>  	struct mem_cgroup_threshold_ary *new;
>  	enum res_type type = MEMFILE_TYPE(cft->private);
> @@ -5782,10 +5777,10 @@ unlock:
>  	mutex_unlock(&memcg->thresholds_lock);
>  }
>  
> -static int mem_cgroup_oom_register_event(struct cgroup *cgrp,
> +static int mem_cgroup_oom_register_event(struct cgroup_subsys_state *css,
>  	struct cftype *cft, struct eventfd_ctx *eventfd, const char *args)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup_eventfd_list *event;
>  	enum res_type type = MEMFILE_TYPE(cft->private);
>  
> @@ -5807,10 +5802,10 @@ static int mem_cgroup_oom_register_event(struct cgroup *cgrp,
>  	return 0;
>  }
>  
> -static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp,
> +static void mem_cgroup_oom_unregister_event(struct cgroup_subsys_state *css,
>  	struct cftype *cft, struct eventfd_ctx *eventfd)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  	struct mem_cgroup_eventfd_list *ev, *tmp;
>  	enum res_type type = MEMFILE_TYPE(cft->private);
>  
> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> index 2a8a736..13489b1 100644
> --- a/mm/vmpressure.c
> +++ b/mm/vmpressure.c
> @@ -74,11 +74,6 @@ static struct vmpressure *work_to_vmpressure(struct work_struct *work)
>  	return container_of(work, struct vmpressure, work);
>  }
>  
> -static struct vmpressure *cg_to_vmpressure(struct cgroup *cg)
> -{
> -	return css_to_vmpressure(cgroup_css(cg, mem_cgroup_subsys_id));
> -}
> -
>  static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr)
>  {
>  	struct cgroup_subsys_state *css = vmpressure_to_css(vmpr);
> @@ -283,7 +278,7 @@ void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio)
>  
>  /**
>   * vmpressure_register_event() - Bind vmpressure notifications to an eventfd
> - * @cg:		cgroup that is interested in vmpressure notifications
> + * @css:	css that is interested in vmpressure notifications
>   * @cft:	cgroup control files handle
>   * @eventfd:	eventfd context to link notifications with
>   * @args:	event arguments (used to set up a pressure level threshold)
> @@ -298,10 +293,11 @@ void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio)
>   * cftype).register_event, and then cgroup core will handle everything by
>   * itself.
>   */
> -int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
> -			      struct eventfd_ctx *eventfd, const char *args)
> +int vmpressure_register_event(struct cgroup_subsys_state *css,
> +			      struct cftype *cft, struct eventfd_ctx *eventfd,
> +			      const char *args)
>  {
> -	struct vmpressure *vmpr = cg_to_vmpressure(cg);
> +	struct vmpressure *vmpr = css_to_vmpressure(css);
>  	struct vmpressure_event *ev;
>  	int level;
>  
> @@ -329,7 +325,7 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
>  
>  /**
>   * vmpressure_unregister_event() - Unbind eventfd from vmpressure
> - * @cg:		cgroup handle
> + * @css:	css handle
>   * @cft:	cgroup control files handle
>   * @eventfd:	eventfd context that was used to link vmpressure with the @cg
>   *
> @@ -341,10 +337,11 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
>   * cftype).unregister_event, and then cgroup core will handle everything
>   * by itself.
>   */
> -void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
> +void vmpressure_unregister_event(struct cgroup_subsys_state *css,
> +				 struct cftype *cft,
>  				 struct eventfd_ctx *eventfd)
>  {
> -	struct vmpressure *vmpr = cg_to_vmpressure(cg);
> +	struct vmpressure *vmpr = css_to_vmpressure(css);
>  	struct vmpressure_event *ev;
>  
>  	mutex_lock(&vmpr->events_lock);
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-02 13:19   ` Michal Hocko
@ 2013-08-02 13:43     ` Michal Hocko
  2013-08-02 19:52       ` Tejun Heo
  2013-08-02 19:38     ` Tejun Heo
  1 sibling, 1 reply; 60+ messages in thread
From: Michal Hocko @ 2013-08-02 13:43 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Fri 02-08-13 15:19:01, Michal Hocko wrote:
[...]
> mem_cgroup_from_cont can go away now as well. Do you plan to remove it
> in the series or later on?

Ohh, it goes in 21/23. Good

Thanks!
-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 16/23] cgroup: relocate cgroup_advance_iter()
  2013-08-02  3:25   ` Li Zefan
@ 2013-08-02 19:35     ` Tejun Heo
  0 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-02 19:35 UTC (permalink / raw)
  To: Li Zefan; +Cc: containers, cgroups, linux-kernel

Hello, Li.

On Fri, Aug 02, 2013 at 11:25:58AM +0800, Li Zefan wrote:
> On 2013/8/2 5:49, Tejun Heo wrote:
> > For some reason, cgroup_advance_iter() is standing lonely all away
> > from its iter comrades.  Relocate it.
> > 
> 
> There're some other functions that are in the same situation. Do you
> think it's better to relocate them, or just leave it as it is?

Hmm... I really don't wanna do one huge reorganize-everything patch as
they tend to be difficult to verify and the benefit to disturbance
ratio usually isn't that high, but if the respective part of code is
going through changes and thus will be disturbed anyway, I think it
makes sense to reorganize a bit.

Thanks.

-- 
tejun

* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-02  3:54   ` Li Zefan
@ 2013-08-02 19:36     ` Tejun Heo
  0 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-02 19:36 UTC (permalink / raw)
  To: Li Zefan
  Cc: containers, cgroups, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Johannes Weiner, Michal Hocko, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Fri, Aug 02, 2013 at 11:54:24AM +0800, Li Zefan wrote:
> > @@ -4298,7 +4308,7 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
> >  	for_each_root_subsys(root, ss) {
> >  		struct cgroup_subsys_state *css;
> >  
> > -		css = ss->css_alloc(cgrp);
> > +		css = ss->css_alloc(parent->subsys[ss->subsys_id]);
> 
> As this patchset is based on for-3.12 branch, which lacks the fix in for-3.11,
> so the css_alloc() in that bug fix is not converted.

Hmm... I'll pull for-3.11-fixes into for-3.12 and rebase this series
on top of it.

Thanks.

-- 
tejun

* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-02 13:19   ` Michal Hocko
  2013-08-02 13:43     ` Michal Hocko
@ 2013-08-02 19:38     ` Tejun Heo
  1 sibling, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-02 19:38 UTC (permalink / raw)
  To: Michal Hocko
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Fri, Aug 02, 2013 at 03:19:01PM +0200, Michal Hocko wrote:
> mem_cgroup_from_cont can go away now as well. Do you plan to remove it
> in the series or later on?

IIRC, vmpressure still uses the accessor.  It'll get removed later
when all the usages are gone.

Thanks.

-- 
tejun

* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-02  4:02   ` Li Zefan
@ 2013-08-02 19:41     ` Tejun Heo
  0 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-02 19:41 UTC (permalink / raw)
  To: Li Zefan
  Cc: containers, cgroups, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Johannes Weiner, Michal Hocko, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Fri, Aug 02, 2013 at 12:02:05PM +0800, Li Zefan wrote:
> > @@ -4199,12 +4208,13 @@ static void init_cgroup_css(struct cgroup_subsys_state *css,
> >  /* invoke ->css_online() on a new CSS and mark it online if successful */
> >  static int online_css(struct cgroup_subsys *ss, struct cgroup *cgrp)
> >  {
> > +	struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
> >  	int ret = 0;
> >  
> >  	lockdep_assert_held(&cgroup_mutex);
> >  
> >  	if (ss->css_online)
> > -		ret = ss->css_online(cgrp);
> > +		ret = ss->css_online(css);
> >  	if (!ret)
> >  		cgrp->subsys[ss->subsys_id]->flags |= CSS_ONLINE;
> 
> Then this can be changed to css->flags |= CSS_ONLINE.

Aye aye.

-- 
tejun

* Re: [PATCH 21/23] cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state instead of cgroup
  2013-08-02  4:08   ` Li Zefan
@ 2013-08-02 19:44     ` Tejun Heo
  0 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-02 19:44 UTC (permalink / raw)
  To: Li Zefan
  Cc: containers, cgroups, linux-kernel, Johannes Weiner, Michal Hocko,
	Balbir Singh

On Fri, Aug 02, 2013 at 12:08:51PM +0800, Li Zefan wrote:
> > @@ -506,15 +506,17 @@ struct cftype {
> >  	 * you want to provide this functionality. Use eventfd_signal()
> >  	 * on eventfd to send notification to userspace.
> >  	 */
> > -	int (*register_event)(struct cgroup *cgrp, struct cftype *cft,
> > -			struct eventfd_ctx *eventfd, const char *args);
> > +	int (*register_event)(struct cgroup_subsys_state *css,
> > +			      struct cftype *cft, struct eventfd_ctx *eventfd,
> > +			      const char *args);
> >  	/*
> >  	 * unregister_event() callback will be called when userspace
> >  	 * closes the eventfd or on cgroup removing.
> >  	 * This callback must be implemented, if you want provide
> >  	 * notification functionality.
> >  	 */
> > -	void (*unregister_event)(struct cgroup *cgrp, struct cftype *cft,
> > +	void (*unregister_event)(struct cgroup_subsys_state *css,
> > +				 struct cftype *cft,
> >  			struct eventfd_ctx *eventfd);
> 
> align this line?

Done.

-- 
tejun

* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-02 13:43     ` Michal Hocko
@ 2013-08-02 19:52       ` Tejun Heo
  0 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-02 19:52 UTC (permalink / raw)
  To: Michal Hocko
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Fri, Aug 02, 2013 at 03:43:05PM +0200, Michal Hocko wrote:
> On Fri 02-08-13 15:19:01, Michal Hocko wrote:
> [...]
> > mem_cgroup_from_cont can go away now as well. Do you plan to remove it
> > in the series or later on?
> 
> Ohh, it goes in 21/23. Good

Heh, yeah, that one.

Thanks.

-- 
tejun

* [PATCH v2 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
                     ` (2 preceding siblings ...)
  2013-08-02 13:19   ` Michal Hocko
@ 2013-08-02 20:24   ` Tejun Heo
  2013-08-06  7:19     ` Daniel Wagner
  2013-08-05 12:44   ` [PATCH " Vivek Goyal
  2013-08-05 17:57   ` Aristeu Rozanski
  5 siblings, 1 reply; 60+ messages in thread
From: Tejun Heo @ 2013-08-02 20:24 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Johannes Weiner, Michal Hocko, Balbir Singh, Aristeu Rozanski,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

cgroup is currently in the process of transitioning to using struct
cgroup_subsys_state * as the primary handle instead of struct cgroup *
in subsystem implementations for the following reasons.

* With unified hierarchy, subsystems will be dynamically bound and
  unbound from cgroups and thus css's (cgroup_subsys_state) may be
  created and destroyed dynamically over the lifetime of a cgroup,
  which is different from the current state where all css's are
  allocated and destroyed together with the associated cgroup.  This
  in turn means that cgroup_css() should be synchronized and may
  return NULL, making it more cumbersome to use.

* Differing levels of per-subsystem granularity in the unified
  hierarchy mean that the task and descendant iterators should behave
  differently depending on the specific subsystem the iteration is
  being performed for.

* In the majority of cases, a subsystem only cares about its part in
  the cgroup hierarchy - i.e. the hierarchy of css's.  Subsystem methods
  often obtain the matching css pointer from the cgroup and don't
  bother with the cgroup pointer itself.  Passing around css fits
  much better.
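
The last point can be illustrated with a small userspace model.  This
is only a sketch: every model_* name below is hypothetical and the
structs are cut-down stand-ins, not the kernel definitions.  It shows
the two-step lookup a method needs when it is handed a cgroup versus
the single container_of-style step it needs when it is handed the css.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: cut-down stand-ins for the kernel structs. */
#define MODEL_SUBSYS_COUNT 2
enum { MODEL_FREEZER_ID = 0, MODEL_BLKIO_ID = 1 };

struct model_cgroup;

struct model_css {
	struct model_cgroup *cgroup;	/* back pointer, like css->cgroup */
	unsigned long flags;
};

struct model_cgroup {
	struct model_css *subsys[MODEL_SUBSYS_COUNT];	/* like cgrp->subsys[] */
};

/* A subsystem embeds the css in its own per-cgroup state. */
struct model_freezer {
	struct model_css css;
	unsigned int state;
};

/* Same idea as the kernel's container_of(), written with offsetof(). */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* css-based style: the method is handed the css and needs one step. */
static struct model_freezer *model_css_freezer(struct model_css *css)
{
	return css ? model_container_of(css, struct model_freezer, css) : NULL;
}

/* cgroup-based style: look the css up in the cgroup first, then convert. */
static struct model_freezer *model_cgroup_freezer(struct model_cgroup *cgrp)
{
	assert(cgrp != NULL);
	return model_css_freezer(cgrp->subsys[MODEL_FREEZER_ID]);
}
```

In the model, as in the description above, the cgroup stays reachable
through css->cgroup for the few callers that still need it.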

This patch converts all cgroup_subsys methods to take @css instead of
@cgroup.  The conversions are mostly straightforward.  A few
noteworthy changes are

* ->css_alloc() now takes css of the parent cgroup rather than the
  pointer to the new cgroup as the css for the new cgroup doesn't
  exist yet.  Knowing the parent css is enough for all the existing
  subsystems.

* In kernel/cgroup.c::offline_css(), unnecessary open coded css
  dereference is replaced with local variable access.
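
The two changes above can be sketched in a userspace model.  All
model_* names are hypothetical, and the model returns NULL on
allocation failure where a real ->css_alloc() would return an
ERR_PTR() value.  The allocator receives the parent's css, with NULL
standing for "no parent, this is the root", and the online helper keeps
the css in a local variable instead of dereferencing the subsys array
a second time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_CSS_ONLINE 0x1UL	/* stand-in for CSS_ONLINE */

struct model_css {
	struct model_css *parent;	/* kept for the model only */
	unsigned long flags;
};

struct model_subsys {
	/* ->css_alloc() receives the parent's css; NULL means "root". */
	struct model_css *(*css_alloc)(struct model_css *parent_css);
	int (*css_online)(struct model_css *css);
	void (*css_free)(struct model_css *css);
};

/* Statically allocated root state, in the spirit of blkcg_root. */
static struct model_css model_root_css;

static struct model_css *model_css_alloc(struct model_css *parent_css)
{
	struct model_css *css;

	if (!parent_css)	/* no parent: hand back the static root state */
		return &model_root_css;

	css = calloc(1, sizeof(*css));
	if (css)
		css->parent = parent_css;
	return css;		/* NULL on allocation failure in this model */
}

static void model_css_free(struct model_css *css)
{
	if (css != &model_root_css)
		free(css);
}

/* Same shape as online_css(): use the css directly for call and flag. */
static int model_online_css(struct model_subsys *ss, struct model_css *css)
{
	int ret = 0;

	assert(ss && css);
	if (ss->css_online)
		ret = ss->css_online(css);
	if (!ret)
		css->flags |= MODEL_CSS_ONLINE;
	return ret;
}
```

Knowing only the parent is enough here because, at allocation time, the
new css does not exist yet and nothing else about the new cgroup is
consulted.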

This patch shouldn't cause any behavior differences.

v2: Unnecessary explicit cgrp->subsys[] deref in online_css() replaced
    with local variable @css as suggested by Li Zefan.

    Rebased on top of new for-3.12 which includes for-3.11-fixes so
    that ->css_free() invocation added by da0a12caff ("cgroup: fix a
    leak when percpu_ref_init() fails") is converted too.  Suggested
    by Li Zefan.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 block/blk-cgroup.c        |   25 ++++++++++----------
 include/linux/cgroup.h    |   22 ++++++++++-------
 kernel/cgroup.c           |   57 +++++++++++++++++++++++++++-------------------
 kernel/cgroup_freezer.c   |   40 ++++++++++++++++----------------
 kernel/cpuset.c           |   39 ++++++++++++++++---------------
 kernel/events/core.c      |   18 ++++++++------
 kernel/sched/core.c       |   39 ++++++++++++++++---------------
 kernel/sched/cpuacct.c    |    9 ++++---
 mm/hugetlb_cgroup.c       |   19 +++++++--------
 mm/memcontrol.c           |   38 +++++++++++++++---------------
 net/core/netprio_cgroup.c |   20 ++++++++--------
 net/sched/cls_cgroup.c    |   18 ++++++++------
 security/device_cgroup.c  |   22 ++++++++---------
 13 files changed, 196 insertions(+), 170 deletions(-)

--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -765,18 +765,18 @@ struct cftype blkcg_files[] = {
 
 /**
  * blkcg_css_offline - cgroup css_offline callback
- * @cgroup: cgroup of interest
+ * @css: css of interest
  *
- * This function is called when @cgroup is about to go away and responsible
- * for shooting down all blkgs associated with @cgroup.  blkgs should be
+ * This function is called when @css is about to go away and responsible
+ * for shooting down all blkgs associated with @css.  blkgs should be
  * removed while holding both q and blkcg locks.  As blkcg lock is nested
  * inside q lock, this function performs reverse double lock dancing.
  *
  * This is the blkcg counterpart of ioc_release_fn().
  */
-static void blkcg_css_offline(struct cgroup *cgroup)
+static void blkcg_css_offline(struct cgroup_subsys_state *css)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgroup);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	spin_lock_irq(&blkcg->lock);
 
@@ -798,21 +798,21 @@ static void blkcg_css_offline(struct cgr
 	spin_unlock_irq(&blkcg->lock);
 }
 
-static void blkcg_css_free(struct cgroup *cgroup)
+static void blkcg_css_free(struct cgroup_subsys_state *css)
 {
-	struct blkcg *blkcg = cgroup_to_blkcg(cgroup);
+	struct blkcg *blkcg = css_to_blkcg(css);
 
 	if (blkcg != &blkcg_root)
 		kfree(blkcg);
 }
 
-static struct cgroup_subsys_state *blkcg_css_alloc(struct cgroup *cgroup)
+static struct cgroup_subsys_state *
+blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	static atomic64_t id_seq = ATOMIC64_INIT(0);
 	struct blkcg *blkcg;
-	struct cgroup *parent = cgroup->parent;
 
-	if (!parent) {
+	if (!parent_css) {
 		blkcg = &blkcg_root;
 		goto done;
 	}
@@ -883,14 +883,15 @@ void blkcg_exit_queue(struct request_que
  * of the main cic data structures.  For now we allow a task to change
  * its cgroup only if it's the only owner of its ioc.
  */
-static int blkcg_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static int blkcg_can_attach(struct cgroup_subsys_state *css,
+			    struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 	struct io_context *ioc;
 	int ret = 0;
 
 	/* task_lock() is needed to avoid races with exit_io_context() */
-	cgroup_taskset_for_each(task, cgrp, tset) {
+	cgroup_taskset_for_each(task, css->cgroup, tset) {
 		task_lock(task);
 		ioc = task->io_context;
 		if (ioc && atomic_read(&ioc->nr_tasks) > 1)
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -579,18 +579,22 @@ int cgroup_taskset_size(struct cgroup_ta
  */
 
 struct cgroup_subsys {
-	struct cgroup_subsys_state *(*css_alloc)(struct cgroup *cgrp);
-	int (*css_online)(struct cgroup *cgrp);
-	void (*css_offline)(struct cgroup *cgrp);
-	void (*css_free)(struct cgroup *cgrp);
+	struct cgroup_subsys_state *(*css_alloc)(struct cgroup_subsys_state *parent_css);
+	int (*css_online)(struct cgroup_subsys_state *css);
+	void (*css_offline)(struct cgroup_subsys_state *css);
+	void (*css_free)(struct cgroup_subsys_state *css);
 
-	int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
-	void (*cancel_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
-	void (*attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
+	int (*can_attach)(struct cgroup_subsys_state *css,
+			  struct cgroup_taskset *tset);
+	void (*cancel_attach)(struct cgroup_subsys_state *css,
+			      struct cgroup_taskset *tset);
+	void (*attach)(struct cgroup_subsys_state *css,
+		       struct cgroup_taskset *tset);
 	void (*fork)(struct task_struct *task);
-	void (*exit)(struct cgroup *cgrp, struct cgroup *old_cgrp,
+	void (*exit)(struct cgroup_subsys_state *css,
+		     struct cgroup_subsys_state *old_css,
 		     struct task_struct *task);
-	void (*bind)(struct cgroup *root);
+	void (*bind)(struct cgroup_subsys_state *root_css);
 
 	int subsys_id;
 	int disabled;
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -853,8 +853,11 @@ static void cgroup_free_fn(struct work_s
 	/*
 	 * Release the subsystem state objects.
 	 */
-	for_each_root_subsys(cgrp->root, ss)
-		ss->css_free(cgrp);
+	for_each_root_subsys(cgrp->root, ss) {
+		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
+
+		ss->css_free(css);
+	}
 
 	cgrp->root->number_of_cgroups--;
 	mutex_unlock(&cgroup_mutex);
@@ -1056,7 +1059,7 @@ static int rebind_subsystems(struct cgro
 			list_move(&ss->sibling, &root->subsys_list);
 			ss->root = root;
 			if (ss->bind)
-				ss->bind(cgrp);
+				ss->bind(cgrp->subsys[i]);
 
 			/* refcount was already taken, and we're keeping it */
 			root->subsys_mask |= bit;
@@ -1066,7 +1069,7 @@ static int rebind_subsystems(struct cgro
 			BUG_ON(cgrp->subsys[i]->cgroup != cgrp);
 
 			if (ss->bind)
-				ss->bind(cgroup_dummy_top);
+				ss->bind(cgroup_dummy_top->subsys[i]);
 			cgroup_dummy_top->subsys[i]->cgroup = cgroup_dummy_top;
 			cgrp->subsys[i] = NULL;
 			cgroup_subsys[i]->root = &cgroup_dummy_root;
@@ -2049,8 +2052,10 @@ static int cgroup_attach_task(struct cgr
 	 * step 1: check that we can legitimately attach to the cgroup.
 	 */
 	for_each_root_subsys(root, ss) {
+		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
+
 		if (ss->can_attach) {
-			retval = ss->can_attach(cgrp, &tset);
+			retval = ss->can_attach(css, &tset);
 			if (retval) {
 				failed_ss = ss;
 				goto out_cancel_attach;
@@ -2089,8 +2094,10 @@ static int cgroup_attach_task(struct cgr
 	 * step 4: do subsystem attach callbacks.
 	 */
 	for_each_root_subsys(root, ss) {
+		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
+
 		if (ss->attach)
-			ss->attach(cgrp, &tset);
+			ss->attach(css, &tset);
 	}
 
 	/*
@@ -2109,10 +2116,12 @@ out_put_css_set_refs:
 out_cancel_attach:
 	if (retval) {
 		for_each_root_subsys(root, ss) {
+			struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
+
 			if (ss == failed_ss)
 				break;
 			if (ss->cancel_attach)
-				ss->cancel_attach(cgrp, &tset);
+				ss->cancel_attach(css, &tset);
 		}
 	}
 out_free_group_list:
@@ -4206,14 +4215,15 @@ static void init_cgroup_css(struct cgrou
 /* invoke ->css_online() on a new CSS and mark it online if successful */
 static int online_css(struct cgroup_subsys *ss, struct cgroup *cgrp)
 {
+	struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
 	int ret = 0;
 
 	lockdep_assert_held(&cgroup_mutex);
 
 	if (ss->css_online)
-		ret = ss->css_online(cgrp);
+		ret = ss->css_online(css);
 	if (!ret)
-		cgrp->subsys[ss->subsys_id]->flags |= CSS_ONLINE;
+		css->flags |= CSS_ONLINE;
 	return ret;
 }
 
@@ -4228,9 +4238,9 @@ static void offline_css(struct cgroup_su
 		return;
 
 	if (ss->css_offline)
-		ss->css_offline(cgrp);
+		ss->css_offline(css);
 
-	cgrp->subsys[ss->subsys_id]->flags &= ~CSS_ONLINE;
+	css->flags &= ~CSS_ONLINE;
 }
 
 /*
@@ -4305,7 +4315,7 @@ static long cgroup_create(struct cgroup
 	for_each_root_subsys(root, ss) {
 		struct cgroup_subsys_state *css;
 
-		css = ss->css_alloc(cgrp);
+		css = ss->css_alloc(parent->subsys[ss->subsys_id]);
 		if (IS_ERR(css)) {
 			err = PTR_ERR(css);
 			goto err_free_all;
@@ -4313,7 +4323,7 @@ static long cgroup_create(struct cgroup
 
 		err = percpu_ref_init(&css->refcnt, css_release);
 		if (err) {
-			ss->css_free(cgrp);
+			ss->css_free(css);
 			goto err_free_all;
 		}
 
@@ -4386,7 +4396,7 @@ err_free_all:
 
 		if (css) {
 			percpu_ref_cancel_init(&css->refcnt);
-			ss->css_free(cgrp);
+			ss->css_free(css);
 		}
 	}
 	mutex_unlock(&cgroup_mutex);
@@ -4641,7 +4651,7 @@ static void __init cgroup_init_subsys(st
 	/* Create the top cgroup state for this subsystem */
 	list_add(&ss->sibling, &cgroup_dummy_root.subsys_list);
 	ss->root = &cgroup_dummy_root;
-	css = ss->css_alloc(cgroup_dummy_top);
+	css = ss->css_alloc(cgroup_dummy_top->subsys[ss->subsys_id]);
 	/* We don't handle early failures gracefully */
 	BUG_ON(IS_ERR(css));
 	init_cgroup_css(css, ss, cgroup_dummy_top);
@@ -4720,7 +4730,7 @@ int __init_or_module cgroup_load_subsys(
 	 * struct, so this can happen first (i.e. before the dummy root
 	 * attachment).
 	 */
-	css = ss->css_alloc(cgroup_dummy_top);
+	css = ss->css_alloc(cgroup_dummy_top->subsys[ss->subsys_id]);
 	if (IS_ERR(css)) {
 		/* failure case - need to deassign the cgroup_subsys[] slot. */
 		cgroup_subsys[ss->subsys_id] = NULL;
@@ -4836,7 +4846,7 @@ void cgroup_unload_subsys(struct cgroup_
 	 * the cgrp->subsys pointer to find their state. note that this
 	 * also takes care of freeing the css_id.
 	 */
-	ss->css_free(cgroup_dummy_top);
+	ss->css_free(cgroup_dummy_top->subsys[ss->subsys_id]);
 	cgroup_dummy_top->subsys[ss->subsys_id] = NULL;
 
 	mutex_unlock(&cgroup_mutex);
@@ -5192,10 +5202,10 @@ void cgroup_exit(struct task_struct *tsk
 		 */
 		for_each_builtin_subsys(ss, i) {
 			if (ss->exit) {
-				struct cgroup *old_cgrp = cset->subsys[i]->cgroup;
-				struct cgroup *cgrp = task_cgroup(tsk, i);
+				struct cgroup_subsys_state *old_css = cset->subsys[i];
+				struct cgroup_subsys_state *css = task_css(tsk, i);
 
-				ss->exit(cgrp, old_cgrp, tsk);
+				ss->exit(css, old_css, tsk);
 			}
 		}
 	}
@@ -5529,7 +5539,8 @@ struct cgroup_subsys_state *cgroup_css_f
 }
 
 #ifdef CONFIG_CGROUP_DEBUG
-static struct cgroup_subsys_state *debug_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+debug_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cgroup_subsys_state *css = kzalloc(sizeof(*css), GFP_KERNEL);
 
@@ -5539,9 +5550,9 @@ static struct cgroup_subsys_state *debug
 	return css;
 }
 
-static void debug_css_free(struct cgroup *cgrp)
+static void debug_css_free(struct cgroup_subsys_state *css)
 {
-	kfree(cgrp->subsys[debug_subsys_id]);
+	kfree(css);
 }
 
 static u64 debug_taskcount_read(struct cgroup *cgrp, struct cftype *cft)
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -91,7 +91,8 @@ static const char *freezer_state_strs(un
 
 struct cgroup_subsys freezer_subsys;
 
-static struct cgroup_subsys_state *freezer_css_alloc(struct cgroup *cgroup)
+static struct cgroup_subsys_state *
+freezer_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct freezer *freezer;
 
@@ -104,16 +105,16 @@ static struct cgroup_subsys_state *freez
 }
 
 /**
- * freezer_css_online - commit creation of a freezer cgroup
- * @cgroup: cgroup being created
+ * freezer_css_online - commit creation of a freezer css
+ * @css: css being created
  *
- * We're committing to creation of @cgroup.  Mark it online and inherit
+ * We're committing to creation of @css.  Mark it online and inherit
  * parent's freezing state while holding both parent's and our
  * freezer->lock.
  */
-static int freezer_css_online(struct cgroup *cgroup)
+static int freezer_css_online(struct cgroup_subsys_state *css)
 {
-	struct freezer *freezer = cgroup_freezer(cgroup);
+	struct freezer *freezer = css_freezer(css);
 	struct freezer *parent = parent_freezer(freezer);
 
 	/*
@@ -140,15 +141,15 @@ static int freezer_css_online(struct cgr
 }
 
 /**
- * freezer_css_offline - initiate destruction of @cgroup
- * @cgroup: cgroup being destroyed
+ * freezer_css_offline - initiate destruction of a freezer css
+ * @css: css being destroyed
  *
- * @cgroup is going away.  Mark it dead and decrement system_freezing_count
- * if it was holding one.
+ * @css is going away.  Mark it dead and decrement system_freezing_count if
+ * it was holding one.
  */
-static void freezer_css_offline(struct cgroup *cgroup)
+static void freezer_css_offline(struct cgroup_subsys_state *css)
 {
-	struct freezer *freezer = cgroup_freezer(cgroup);
+	struct freezer *freezer = css_freezer(css);
 
 	spin_lock_irq(&freezer->lock);
 
@@ -160,9 +161,9 @@ static void freezer_css_offline(struct c
 	spin_unlock_irq(&freezer->lock);
 }
 
-static void freezer_css_free(struct cgroup *cgroup)
+static void freezer_css_free(struct cgroup_subsys_state *css)
 {
-	kfree(cgroup_freezer(cgroup));
+	kfree(css_freezer(css));
 }
 
 /*
@@ -174,25 +175,26 @@ static void freezer_css_free(struct cgro
  * @freezer->lock.  freezer_attach() makes the new tasks conform to the
  * current state and all following state changes can see the new tasks.
  */
-static void freezer_attach(struct cgroup *new_cgrp, struct cgroup_taskset *tset)
+static void freezer_attach(struct cgroup_subsys_state *new_css,
+			   struct cgroup_taskset *tset)
 {
-	struct freezer *freezer = cgroup_freezer(new_cgrp);
+	struct freezer *freezer = css_freezer(new_css);
 	struct task_struct *task;
 	bool clear_frozen = false;
 
 	spin_lock_irq(&freezer->lock);
 
 	/*
-	 * Make the new tasks conform to the current state of @new_cgrp.
+	 * Make the new tasks conform to the current state of @new_css.
 	 * For simplicity, when migrating any task to a FROZEN cgroup, we
 	 * revert it to FREEZING and let update_if_frozen() determine the
 	 * correct state later.
 	 *
-	 * Tasks in @tset are on @new_cgrp but may not conform to its
+	 * Tasks in @tset are on @new_css but may not conform to its
 	 * current state before executing the following - !frozen tasks may
 	 * be visible in a FROZEN cgroup and frozen tasks in a THAWED one.
 	 */
-	cgroup_taskset_for_each(task, new_cgrp, tset) {
+	cgroup_taskset_for_each(task, new_css->cgroup, tset) {
 		if (!(freezer->state & CGROUP_FREEZING)) {
 			__thaw_task(task);
 		} else {
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1455,9 +1455,10 @@ static int fmeter_getrate(struct fmeter
 }
 
 /* Called by cgroups to determine if a cpuset is usable; cpuset_mutex held */
-static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static int cpuset_can_attach(struct cgroup_subsys_state *css,
+			     struct cgroup_taskset *tset)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	struct task_struct *task;
 	int ret;
 
@@ -1468,11 +1469,11 @@ static int cpuset_can_attach(struct cgro
 	 * flag is set.
 	 */
 	ret = -ENOSPC;
-	if (!cgroup_sane_behavior(cgrp) &&
+	if (!cgroup_sane_behavior(css->cgroup) &&
 	    (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)))
 		goto out_unlock;
 
-	cgroup_taskset_for_each(task, cgrp, tset) {
+	cgroup_taskset_for_each(task, css->cgroup, tset) {
 		/*
 		 * Kthreads which disallow setaffinity shouldn't be moved
 		 * to a new cpuset; we don't want to change their cpu
@@ -1501,11 +1502,11 @@ out_unlock:
 	return ret;
 }
 
-static void cpuset_cancel_attach(struct cgroup *cgrp,
+static void cpuset_cancel_attach(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	mutex_lock(&cpuset_mutex);
-	cgroup_cs(cgrp)->attach_in_progress--;
+	css_cs(css)->attach_in_progress--;
 	mutex_unlock(&cpuset_mutex);
 }
 
@@ -1516,7 +1517,8 @@ static void cpuset_cancel_attach(struct
  */
 static cpumask_var_t cpus_attach;
 
-static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static void cpuset_attach(struct cgroup_subsys_state *css,
+			  struct cgroup_taskset *tset)
 {
 	/* static buf protected by cpuset_mutex */
 	static nodemask_t cpuset_attach_nodemask_to;
@@ -1524,7 +1526,7 @@ static void cpuset_attach(struct cgroup
 	struct task_struct *task;
 	struct task_struct *leader = cgroup_taskset_first(tset);
 	struct cgroup *oldcgrp = cgroup_taskset_cur_cgroup(tset);
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	struct cpuset *oldcs = cgroup_cs(oldcgrp);
 	struct cpuset *cpus_cs = effective_cpumask_cpuset(cs);
 	struct cpuset *mems_cs = effective_nodemask_cpuset(cs);
@@ -1539,7 +1541,7 @@ static void cpuset_attach(struct cgroup
 
 	guarantee_online_mems(mems_cs, &cpuset_attach_nodemask_to);
 
-	cgroup_taskset_for_each(task, cgrp, tset) {
+	cgroup_taskset_for_each(task, css->cgroup, tset) {
 		/*
 		 * can_attach beforehand should guarantee that this doesn't
 		 * fail.  TODO: have a better way to handle failure here
@@ -1940,11 +1942,12 @@ static struct cftype files[] = {
  *	cgrp:	control group that the new cpuset will be part of
  */
 
-static struct cgroup_subsys_state *cpuset_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cpuset *cs;
 
-	if (!cgrp->parent)
+	if (!parent_css)
 		return &top_cpuset.css;
 
 	cs = kzalloc(sizeof(*cs), GFP_KERNEL);
@@ -1964,9 +1967,9 @@ static struct cgroup_subsys_state *cpuse
 	return &cs->css;
 }
 
-static int cpuset_css_online(struct cgroup *cgrp)
+static int cpuset_css_online(struct cgroup_subsys_state *css)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 	struct cpuset *parent = parent_cs(cs);
 	struct cpuset *tmp_cs;
 	struct cgroup *pos_cgrp;
@@ -1984,7 +1987,7 @@ static int cpuset_css_online(struct cgro
 
 	number_of_cpusets++;
 
-	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &cgrp->flags))
+	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
 		goto out_unlock;
 
 	/*
@@ -2024,9 +2027,9 @@ out_unlock:
  * will call rebuild_sched_domains_locked().
  */
 
-static void cpuset_css_offline(struct cgroup *cgrp)
+static void cpuset_css_offline(struct cgroup_subsys_state *css)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 
 	mutex_lock(&cpuset_mutex);
 
@@ -2039,9 +2042,9 @@ static void cpuset_css_offline(struct cg
 	mutex_unlock(&cpuset_mutex);
 }
 
-static void cpuset_css_free(struct cgroup *cgrp)
+static void cpuset_css_free(struct cgroup_subsys_state *css)
 {
-	struct cpuset *cs = cgroup_cs(cgrp);
+	struct cpuset *cs = css_cs(css);
 
 	free_cpumask_var(cs->cpus_allowed);
 	kfree(cs);
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7778,7 +7778,8 @@ unlock:
 device_initcall(perf_event_sysfs_init);
 
 #ifdef CONFIG_CGROUP_PERF
-static struct cgroup_subsys_state *perf_cgroup_css_alloc(struct cgroup *cont)
+static struct cgroup_subsys_state *
+perf_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct perf_cgroup *jc;
 
@@ -7795,11 +7796,10 @@ static struct cgroup_subsys_state *perf_
 	return &jc->css;
 }
 
-static void perf_cgroup_css_free(struct cgroup *cont)
+static void perf_cgroup_css_free(struct cgroup_subsys_state *css)
 {
-	struct perf_cgroup *jc;
-	jc = container_of(cgroup_css(cont, perf_subsys_id),
-			  struct perf_cgroup, css);
+	struct perf_cgroup *jc = container_of(css, struct perf_cgroup, css);
+
 	free_percpu(jc->info);
 	kfree(jc);
 }
@@ -7811,15 +7811,17 @@ static int __perf_cgroup_move(void *info
 	return 0;
 }
 
-static void perf_cgroup_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static void perf_cgroup_attach(struct cgroup_subsys_state *css,
+			       struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, cgrp, tset)
+	cgroup_taskset_for_each(task, css->cgroup, tset)
 		task_function_call(task, __perf_cgroup_move, task);
 }
 
-static void perf_cgroup_exit(struct cgroup *cgrp, struct cgroup *old_cgrp,
+static void perf_cgroup_exit(struct cgroup_subsys_state *css,
+			     struct cgroup_subsys_state *old_css,
 			     struct task_struct *task)
 {
 	/*
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7094,16 +7094,17 @@ static inline struct task_group *cgroup_
 	return css_tg(cgroup_css(cgrp, cpu_cgroup_subsys_id));
 }
 
-static struct cgroup_subsys_state *cpu_cgroup_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
-	struct task_group *tg, *parent;
+	struct task_group *parent = css_tg(parent_css);
+	struct task_group *tg;
 
-	if (!cgrp->parent) {
+	if (!parent) {
 		/* This is early initialization for the top cgroup */
 		return &root_task_group.css;
 	}
 
-	parent = cgroup_tg(cgrp->parent);
 	tg = sched_create_group(parent);
 	if (IS_ERR(tg))
 		return ERR_PTR(-ENOMEM);
@@ -7111,38 +7112,38 @@ static struct cgroup_subsys_state *cpu_c
 	return &tg->css;
 }
 
-static int cpu_cgroup_css_online(struct cgroup *cgrp)
+static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
 {
-	struct task_group *tg = cgroup_tg(cgrp);
-	struct task_group *parent = css_tg(css_parent(&tg->css));
+	struct task_group *tg = css_tg(css);
+	struct task_group *parent = css_tg(css_parent(css));
 
 	if (parent)
 		sched_online_group(tg, parent);
 	return 0;
 }
 
-static void cpu_cgroup_css_free(struct cgroup *cgrp)
+static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
 {
-	struct task_group *tg = cgroup_tg(cgrp);
+	struct task_group *tg = css_tg(css);
 
 	sched_destroy_group(tg);
 }
 
-static void cpu_cgroup_css_offline(struct cgroup *cgrp)
+static void cpu_cgroup_css_offline(struct cgroup_subsys_state *css)
 {
-	struct task_group *tg = cgroup_tg(cgrp);
+	struct task_group *tg = css_tg(css);
 
 	sched_offline_group(tg);
 }
 
-static int cpu_cgroup_can_attach(struct cgroup *cgrp,
+static int cpu_cgroup_can_attach(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, cgrp, tset) {
+	cgroup_taskset_for_each(task, css->cgroup, tset) {
 #ifdef CONFIG_RT_GROUP_SCHED
-		if (!sched_rt_can_attach(cgroup_tg(cgrp), task))
+		if (!sched_rt_can_attach(css_tg(css), task))
 			return -EINVAL;
 #else
 		/* We don't support RT-tasks being in separate groups */
@@ -7153,18 +7154,18 @@ static int cpu_cgroup_can_attach(struct
 	return 0;
 }
 
-static void cpu_cgroup_attach(struct cgroup *cgrp,
+static void cpu_cgroup_attach(struct cgroup_subsys_state *css,
 			      struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 
-	cgroup_taskset_for_each(task, cgrp, tset)
+	cgroup_taskset_for_each(task, css->cgroup, tset)
 		sched_move_task(task);
 }
 
-static void
-cpu_cgroup_exit(struct cgroup *cgrp, struct cgroup *old_cgrp,
-		struct task_struct *task)
+static void cpu_cgroup_exit(struct cgroup_subsys_state *css,
+			    struct cgroup_subsys_state *old_css,
+			    struct task_struct *task)
 {
 	/*
 	 * cgroup_exit() is called in the copy_process() failure path.
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -62,11 +62,12 @@ static struct cpuacct root_cpuacct = {
 };
 
 /* create a new cpu accounting group */
-static struct cgroup_subsys_state *cpuacct_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cpuacct_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cpuacct *ca;
 
-	if (!cgrp->parent)
+	if (!parent_css)
 		return &root_cpuacct.css;
 
 	ca = kzalloc(sizeof(*ca), GFP_KERNEL);
@@ -92,9 +93,9 @@ out:
 }
 
 /* destroy an existing cpu accounting group */
-static void cpuacct_css_free(struct cgroup *cgrp)
+static void cpuacct_css_free(struct cgroup_subsys_state *css)
 {
-	struct cpuacct *ca = cgroup_ca(cgrp);
+	struct cpuacct *ca = css_ca(css);
 
 	free_percpu(ca->cpustat);
 	free_percpu(ca->cpuusage);
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -73,19 +73,18 @@ static inline bool hugetlb_cgroup_have_u
 	return false;
 }
 
-static struct cgroup_subsys_state *hugetlb_cgroup_css_alloc(struct cgroup *cgroup)
+static struct cgroup_subsys_state *
+hugetlb_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
+	struct hugetlb_cgroup *parent_h_cgroup = hugetlb_cgroup_from_css(parent_css);
+	struct hugetlb_cgroup *h_cgroup;
 	int idx;
-	struct cgroup *parent_cgroup;
-	struct hugetlb_cgroup *h_cgroup, *parent_h_cgroup;
 
 	h_cgroup = kzalloc(sizeof(*h_cgroup), GFP_KERNEL);
 	if (!h_cgroup)
 		return ERR_PTR(-ENOMEM);
 
-	parent_cgroup = cgroup->parent;
-	if (parent_cgroup) {
-		parent_h_cgroup = hugetlb_cgroup_from_cgroup(parent_cgroup);
+	if (parent_h_cgroup) {
 		for (idx = 0; idx < HUGE_MAX_HSTATE; idx++)
 			res_counter_init(&h_cgroup->hugepage[idx],
 					 &parent_h_cgroup->hugepage[idx]);
@@ -97,11 +96,11 @@ static struct cgroup_subsys_state *huget
 	return &h_cgroup->css;
 }
 
-static void hugetlb_cgroup_css_free(struct cgroup *cgroup)
+static void hugetlb_cgroup_css_free(struct cgroup_subsys_state *css)
 {
 	struct hugetlb_cgroup *h_cgroup;
 
-	h_cgroup = hugetlb_cgroup_from_cgroup(cgroup);
+	h_cgroup = hugetlb_cgroup_from_css(css);
 	kfree(h_cgroup);
 }
 
@@ -150,9 +149,9 @@ out:
  * Force the hugetlb cgroup to empty the hugetlb resources by moving them to
  * the parent cgroup.
  */
-static void hugetlb_cgroup_css_offline(struct cgroup *cgroup)
+static void hugetlb_cgroup_css_offline(struct cgroup_subsys_state *css)
 {
-	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_cgroup(cgroup);
+	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
 	struct hstate *h;
 	struct page *page;
 	int idx = 0;
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6211,7 +6211,7 @@ static void __init mem_cgroup_soft_limit
 }
 
 static struct cgroup_subsys_state * __ref
-mem_cgroup_css_alloc(struct cgroup *cont)
+mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct mem_cgroup *memcg;
 	long error = -ENOMEM;
@@ -6226,7 +6226,7 @@ mem_cgroup_css_alloc(struct cgroup *cont
 			goto free_out;
 
 	/* root ? */
-	if (cont->parent == NULL) {
+	if (parent_css == NULL) {
 		root_mem_cgroup = memcg;
 		res_counter_init(&memcg->res, NULL);
 		res_counter_init(&memcg->memsw, NULL);
@@ -6248,10 +6248,10 @@ free_out:
 }
 
 static int
-mem_cgroup_css_online(struct cgroup *cont)
+mem_cgroup_css_online(struct cgroup_subsys_state *css)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
-	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(&memcg->css));
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+	struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(css));
 	int error = 0;
 
 	if (!parent)
@@ -6308,9 +6308,9 @@ static void mem_cgroup_invalidate_reclai
 		mem_cgroup_iter_invalidate(root_mem_cgroup);
 }
 
-static void mem_cgroup_css_offline(struct cgroup *cont)
+static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	kmem_cgroup_css_offline(memcg);
 
@@ -6319,9 +6319,9 @@ static void mem_cgroup_css_offline(struc
 	mem_cgroup_destroy_all_caches(memcg);
 }
 
-static void mem_cgroup_css_free(struct cgroup *cont)
+static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
 	memcg_destroy_kmem(memcg);
 	__mem_cgroup_free(memcg);
@@ -6691,12 +6691,12 @@ static void mem_cgroup_clear_mc(void)
 	mem_cgroup_end_move(from);
 }
 
-static int mem_cgroup_can_attach(struct cgroup *cgroup,
+static int mem_cgroup_can_attach(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	struct task_struct *p = cgroup_taskset_first(tset);
 	int ret = 0;
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgroup);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	unsigned long move_charge_at_immigrate;
 
 	/*
@@ -6738,7 +6738,7 @@ static int mem_cgroup_can_attach(struct
 	return ret;
 }
 
-static void mem_cgroup_cancel_attach(struct cgroup *cgroup,
+static void mem_cgroup_cancel_attach(struct cgroup_subsys_state *css,
 				     struct cgroup_taskset *tset)
 {
 	mem_cgroup_clear_mc();
@@ -6886,7 +6886,7 @@ retry:
 	up_read(&mm->mmap_sem);
 }
 
-static void mem_cgroup_move_task(struct cgroup *cont,
+static void mem_cgroup_move_task(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	struct task_struct *p = cgroup_taskset_first(tset);
@@ -6901,16 +6901,16 @@ static void mem_cgroup_move_task(struct
 		mem_cgroup_clear_mc();
 }
 #else	/* !CONFIG_MMU */
-static int mem_cgroup_can_attach(struct cgroup *cgroup,
+static int mem_cgroup_can_attach(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 	return 0;
 }
-static void mem_cgroup_cancel_attach(struct cgroup *cgroup,
+static void mem_cgroup_cancel_attach(struct cgroup_subsys_state *css,
 				     struct cgroup_taskset *tset)
 {
 }
-static void mem_cgroup_move_task(struct cgroup *cont,
+static void mem_cgroup_move_task(struct cgroup_subsys_state *css,
 				 struct cgroup_taskset *tset)
 {
 }
@@ -6920,15 +6920,15 @@ static void mem_cgroup_move_task(struct
  * Cgroup retains root cgroups across [un]mount cycles making it necessary
  * to verify sane_behavior flag on each mount attempt.
  */
-static void mem_cgroup_bind(struct cgroup *root)
+static void mem_cgroup_bind(struct cgroup_subsys_state *root_css)
 {
 	/*
 	 * use_hierarchy is forced with sane_behavior.  cgroup core
 	 * guarantees that @root doesn't have any children, so turning it
 	 * on for the root memcg is enough.
 	 */
-	if (cgroup_sane_behavior(root))
-		mem_cgroup_from_cont(root)->use_hierarchy = true;
+	if (cgroup_sane_behavior(root_css->cgroup))
+		mem_cgroup_from_css(root_css)->use_hierarchy = true;
 }
 
 struct cgroup_subsys mem_cgroup_subsys = {
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -126,7 +126,8 @@ static int netprio_set_prio(struct cgrou
 	return 0;
 }
 
-static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cgrp_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cgroup_subsys_state *css;
 
@@ -137,16 +138,14 @@ static struct cgroup_subsys_state *cgrp_
 	return css;
 }
 
-static int cgrp_css_online(struct cgroup *cgrp)
+static int cgrp_css_online(struct cgroup_subsys_state *css)
 {
-	struct cgroup_subsys_state *css = cgroup_css(cgrp, net_prio_subsys_id);
-	struct cgroup_subsys_state *parent_css;
+	struct cgroup_subsys_state *parent_css = css_parent(css);
 	struct net_device *dev;
 	int ret = 0;
 
-	if (!cgrp->parent)
+	if (!parent_css)
 		return 0;
-	parent_css = cgroup_css(cgrp->parent, net_prio_subsys_id);
 
 	rtnl_lock();
 	/*
@@ -164,9 +163,9 @@ static int cgrp_css_online(struct cgroup
 	return ret;
 }
 
-static void cgrp_css_free(struct cgroup *cgrp)
+static void cgrp_css_free(struct cgroup_subsys_state *css)
 {
-	kfree(cgroup_css(cgrp, net_prio_subsys_id));
+	kfree(css);
 }
 
 static u64 read_prioidx(struct cgroup *cgrp, struct cftype *cft)
@@ -221,12 +220,13 @@ static int update_netprio(const void *v,
 	return 0;
 }
 
-static void net_prio_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static void net_prio_attach(struct cgroup_subsys_state *css,
+			    struct cgroup_taskset *tset)
 {
 	struct task_struct *p;
 	void *v;
 
-	cgroup_taskset_for_each(p, cgrp, tset) {
+	cgroup_taskset_for_each(p, css->cgroup, tset) {
 		task_lock(p);
 		v = (void *)(unsigned long)task_netprioidx(p);
 		iterate_fd(p->files, 0, update_netprio, v);
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -38,7 +38,8 @@ static inline struct cgroup_cls_state *t
 	return css_cls_state(task_css(p, net_cls_subsys_id));
 }
 
-static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
+static struct cgroup_subsys_state *
+cgrp_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct cgroup_cls_state *cs;
 
@@ -48,19 +49,19 @@ static struct cgroup_subsys_state *cgrp_
 	return &cs->css;
 }
 
-static int cgrp_css_online(struct cgroup *cgrp)
+static int cgrp_css_online(struct cgroup_subsys_state *css)
 {
-	struct cgroup_cls_state *cs = cgrp_cls_state(cgrp);
-	struct cgroup_cls_state *parent = css_cls_state(css_parent(&cs->css));
+	struct cgroup_cls_state *cs = css_cls_state(css);
+	struct cgroup_cls_state *parent = css_cls_state(css_parent(css));
 
 	if (parent)
 		cs->classid = parent->classid;
 	return 0;
 }
 
-static void cgrp_css_free(struct cgroup *cgrp)
+static void cgrp_css_free(struct cgroup_subsys_state *css)
 {
-	kfree(cgrp_cls_state(cgrp));
+	kfree(css_cls_state(css));
 }
 
 static int update_classid(const void *v, struct file *file, unsigned n)
@@ -72,12 +73,13 @@ static int update_classid(const void *v,
 	return 0;
 }
 
-static void cgrp_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+static void cgrp_attach(struct cgroup_subsys_state *css,
+			struct cgroup_taskset *tset)
 {
 	struct task_struct *p;
 	void *v;
 
-	cgroup_taskset_for_each(p, cgrp, tset) {
+	cgroup_taskset_for_each(p, css->cgroup, tset) {
 		task_lock(p);
 		v = (void *)(unsigned long)task_cls_classid(p);
 		iterate_fd(p->files, 0, update_classid, v);
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -68,7 +68,7 @@ static inline struct dev_cgroup *task_de
 
 struct cgroup_subsys devices_subsys;
 
-static int devcgroup_can_attach(struct cgroup *new_cgrp,
+static int devcgroup_can_attach(struct cgroup_subsys_state *new_css,
 				struct cgroup_taskset *set)
 {
 	struct task_struct *task = cgroup_taskset_first(set);
@@ -193,13 +193,13 @@ static inline bool is_devcg_online(const
 /**
  * devcgroup_online - initializes devcgroup's behavior and exceptions based on
  * 		      parent's
- * @cgroup: cgroup getting online
+ * @css: css getting online
  * returns 0 in case of success, error code otherwise
  */
-static int devcgroup_online(struct cgroup *cgroup)
+static int devcgroup_online(struct cgroup_subsys_state *css)
 {
-	struct dev_cgroup *dev_cgroup = cgroup_to_devcgroup(cgroup);
-	struct dev_cgroup *parent_dev_cgroup = css_to_devcgroup(css_parent(&dev_cgroup->css));
+	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
+	struct dev_cgroup *parent_dev_cgroup = css_to_devcgroup(css_parent(css));
 	int ret = 0;
 
 	mutex_lock(&devcgroup_mutex);
@@ -217,9 +217,9 @@ static int devcgroup_online(struct cgrou
 	return ret;
 }
 
-static void devcgroup_offline(struct cgroup *cgroup)
+static void devcgroup_offline(struct cgroup_subsys_state *css)
 {
-	struct dev_cgroup *dev_cgroup = cgroup_to_devcgroup(cgroup);
+	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
 
 	mutex_lock(&devcgroup_mutex);
 	dev_cgroup->behavior = DEVCG_DEFAULT_NONE;
@@ -229,7 +229,8 @@ static void devcgroup_offline(struct cgr
 /*
  * called from kernel/cgroup.c with cgroup_lock() held.
  */
-static struct cgroup_subsys_state *devcgroup_css_alloc(struct cgroup *cgroup)
+static struct cgroup_subsys_state *
+devcgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct dev_cgroup *dev_cgroup;
 
@@ -242,11 +243,10 @@ static struct cgroup_subsys_state *devcg
 	return &dev_cgroup->css;
 }
 
-static void devcgroup_css_free(struct cgroup *cgroup)
+static void devcgroup_css_free(struct cgroup_subsys_state *css)
 {
-	struct dev_cgroup *dev_cgroup;
+	struct dev_cgroup *dev_cgroup = css_to_devcgroup(css);
 
-	dev_cgroup = cgroup_to_devcgroup(cgroup);
 	__dev_exception_clean(dev_cgroup);
 	kfree(dev_cgroup);
 }

* [PATCH v2 21/23] cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 ` [PATCH 21/23] cgroup: make cftype->[un]register_event() " Tejun Heo
  2013-08-02  4:08   ` Li Zefan
  2013-08-02 13:42   ` Michal Hocko
@ 2013-08-02 20:24   ` Tejun Heo
  2 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-02 20:24 UTC (permalink / raw)
  To: lizefan
  Cc: containers, cgroups, linux-kernel, Johannes Weiner, Michal Hocko,
	Balbir Singh

cgroup is in the process of converting to css (cgroup_subsys_state)
from cgroup as the principal subsystem interface handle.  This is
mostly to prepare for the unified hierarchy support where css's will
be created and destroyed dynamically but also helps clean up
subsystem implementations as css is usually what they are interested
in anyway.

cftype->[un]register_event() is among the couple of remaining interfaces
which still use struct cgroup.  Convert it to cgroup_subsys_state.
The conversion is mostly mechanical and removes the last users of
mem_cgroup_from_cont() and cg_to_vmpressure(), which are removed.
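
For readers following along, the shape of the converted callbacks can
be sketched in userspace C.  This is a minimal sketch under stated
assumptions, not kernel code: struct demo_state, css_demo() and the
demo_*_event() functions are hypothetical names, the cgroup structs
are cut-down stand-ins, and the callback signatures are shortened.  It
only illustrates that a callback handed a css can reach its own state
through container_of() without looking anything up via the cgroup.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down userspace stand-ins for the kernel structs (assumption). */
struct cgroup {
	int id;
};

struct cgroup_subsys_state {
	struct cgroup *cgroup;	/* back-pointer, as used by css->cgroup */
};

/* Simplified container_of(); the kernel macro also type-checks ptr. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical subsystem state that embeds its css. */
struct demo_state {
	int nr_events;
	struct cgroup_subsys_state css;
};

/* css -> subsystem state; a NULL css maps to NULL (assumed convention). */
static struct demo_state *css_demo(struct cgroup_subsys_state *css)
{
	return css ? container_of(css, struct demo_state, css) : NULL;
}

/* Callback in the converted shape: it receives the css directly. */
static int demo_register_event(struct cgroup_subsys_state *css,
			       const char *args)
{
	struct demo_state *st = css_demo(css);

	(void)args;		/* unused in this sketch */
	if (!st)
		return -1;
	st->nr_events++;
	return 0;
}

/* Counterpart of the above; undoes one registration. */
static void demo_unregister_event(struct cgroup_subsys_state *css)
{
	struct demo_state *st = css_demo(css);

	assert(st && st->nr_events > 0);
	st->nr_events--;
}
```

The cgroup remains reachable as css->cgroup for the few places that
still need it, which is how the core code in this patch obtains cgrp
from event->css.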

v2: indentation update as suggested by Li Zefan.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
---
 include/linux/cgroup.h     |   10 ++++++----
 include/linux/vmpressure.h |    6 ++++--
 kernel/cgroup.c            |   15 ++++++++-------
 mm/memcontrol.c            |   21 ++++++++-------------
 mm/vmpressure.c            |   21 +++++++++------------
 5 files changed, 35 insertions(+), 38 deletions(-)

--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -506,16 +506,18 @@ struct cftype {
 	 * you want to provide this functionality. Use eventfd_signal()
 	 * on eventfd to send notification to userspace.
 	 */
-	int (*register_event)(struct cgroup *cgrp, struct cftype *cft,
-			struct eventfd_ctx *eventfd, const char *args);
+	int (*register_event)(struct cgroup_subsys_state *css,
+			      struct cftype *cft, struct eventfd_ctx *eventfd,
+			      const char *args);
 	/*
 	 * unregister_event() callback will be called when userspace
 	 * closes the eventfd or on cgroup removing.
 	 * This callback must be implemented, if you want provide
 	 * notification functionality.
 	 */
-	void (*unregister_event)(struct cgroup *cgrp, struct cftype *cft,
-			struct eventfd_ctx *eventfd);
+	void (*unregister_event)(struct cgroup_subsys_state *css,
+				 struct cftype *cft,
+				 struct eventfd_ctx *eventfd);
 };
 
 /*
--- a/include/linux/vmpressure.h
+++ b/include/linux/vmpressure.h
@@ -33,10 +33,12 @@ extern void vmpressure_init(struct vmpre
 extern struct vmpressure *memcg_to_vmpressure(struct mem_cgroup *memcg);
 extern struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr);
 extern struct vmpressure *css_to_vmpressure(struct cgroup_subsys_state *css);
-extern int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
+extern int vmpressure_register_event(struct cgroup_subsys_state *css,
+				     struct cftype *cft,
 				     struct eventfd_ctx *eventfd,
 				     const char *args);
-extern void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
+extern void vmpressure_unregister_event(struct cgroup_subsys_state *css,
+					struct cftype *cft,
 					struct eventfd_ctx *eventfd);
 #else
 static inline void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -159,9 +159,9 @@ struct css_id {
  */
 struct cgroup_event {
 	/*
-	 * Cgroup which the event belongs to.
+	 * css which the event belongs to.
 	 */
-	struct cgroup *cgrp;
+	struct cgroup_subsys_state *css;
 	/*
 	 * Control file which the event associated.
 	 */
@@ -3955,11 +3955,12 @@ static void cgroup_event_remove(struct w
 {
 	struct cgroup_event *event = container_of(work, struct cgroup_event,
 			remove);
-	struct cgroup *cgrp = event->cgrp;
+	struct cgroup_subsys_state *css = event->css;
+	struct cgroup *cgrp = css->cgroup;
 
 	remove_wait_queue(event->wqh, &event->wait);
 
-	event->cft->unregister_event(cgrp, event->cft, event->eventfd);
+	event->cft->unregister_event(css, event->cft, event->eventfd);
 
 	/* Notify userspace the event is going away. */
 	eventfd_signal(event->eventfd, 1);
@@ -3979,7 +3980,7 @@ static int cgroup_event_wake(wait_queue_
 {
 	struct cgroup_event *event = container_of(wait,
 			struct cgroup_event, wait);
-	struct cgroup *cgrp = event->cgrp;
+	struct cgroup *cgrp = event->css->cgroup;
 	unsigned long flags = (unsigned long)key;
 
 	if (flags & POLLHUP) {
@@ -4048,7 +4049,7 @@ static int cgroup_write_event_control(st
 	event = kzalloc(sizeof(*event), GFP_KERNEL);
 	if (!event)
 		return -ENOMEM;
-	event->cgrp = cgrp;
+	event->css = css;
 	INIT_LIST_HEAD(&event->list);
 	init_poll_funcptr(&event->pt, cgroup_event_ptable_queue_proc);
 	init_waitqueue_func_entry(&event->wait, cgroup_event_wake);
@@ -4099,7 +4100,7 @@ static int cgroup_write_event_control(st
 		goto out_put_cfile;
 	}
 
-	ret = event->cft->register_event(cgrp, event->cft,
+	ret = event->cft->register_event(css, event->cft,
 			event->eventfd, buffer);
 	if (ret)
 		goto out_put_cfile;
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1034,11 +1034,6 @@ static void memcg_check_events(struct me
 		preempt_enable();
 }
 
-static inline struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont)
-{
-	return mem_cgroup_from_css(cgroup_css(cont, mem_cgroup_subsys_id));
-}
-
 struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
 {
 	/*
@@ -5620,10 +5615,10 @@ static void mem_cgroup_oom_notify(struct
 		mem_cgroup_oom_notify_cb(iter);
 }
 
-static int mem_cgroup_usage_register_event(struct cgroup *cgrp,
+static int mem_cgroup_usage_register_event(struct cgroup_subsys_state *css,
 	struct cftype *cft, struct eventfd_ctx *eventfd, const char *args)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_thresholds *thresholds;
 	struct mem_cgroup_threshold_ary *new;
 	enum res_type type = MEMFILE_TYPE(cft->private);
@@ -5703,10 +5698,10 @@ unlock:
 	return ret;
 }
 
-static void mem_cgroup_usage_unregister_event(struct cgroup *cgrp,
+static void mem_cgroup_usage_unregister_event(struct cgroup_subsys_state *css,
 	struct cftype *cft, struct eventfd_ctx *eventfd)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_thresholds *thresholds;
 	struct mem_cgroup_threshold_ary *new;
 	enum res_type type = MEMFILE_TYPE(cft->private);
@@ -5782,10 +5777,10 @@ unlock:
 	mutex_unlock(&memcg->thresholds_lock);
 }
 
-static int mem_cgroup_oom_register_event(struct cgroup *cgrp,
+static int mem_cgroup_oom_register_event(struct cgroup_subsys_state *css,
 	struct cftype *cft, struct eventfd_ctx *eventfd, const char *args)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_eventfd_list *event;
 	enum res_type type = MEMFILE_TYPE(cft->private);
 
@@ -5807,10 +5802,10 @@ static int mem_cgroup_oom_register_event
 	return 0;
 }
 
-static void mem_cgroup_oom_unregister_event(struct cgroup *cgrp,
+static void mem_cgroup_oom_unregister_event(struct cgroup_subsys_state *css,
 	struct cftype *cft, struct eventfd_ctx *eventfd)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_eventfd_list *ev, *tmp;
 	enum res_type type = MEMFILE_TYPE(cft->private);
 
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -74,11 +74,6 @@ static struct vmpressure *work_to_vmpres
 	return container_of(work, struct vmpressure, work);
 }
 
-static struct vmpressure *cg_to_vmpressure(struct cgroup *cg)
-{
-	return css_to_vmpressure(cgroup_css(cg, mem_cgroup_subsys_id));
-}
-
 static struct vmpressure *vmpressure_parent(struct vmpressure *vmpr)
 {
 	struct cgroup_subsys_state *css = vmpressure_to_css(vmpr);
@@ -283,7 +278,7 @@ void vmpressure_prio(gfp_t gfp, struct m
 
 /**
  * vmpressure_register_event() - Bind vmpressure notifications to an eventfd
- * @cg:		cgroup that is interested in vmpressure notifications
+ * @css:	css that is interested in vmpressure notifications
  * @cft:	cgroup control files handle
  * @eventfd:	eventfd context to link notifications with
  * @args:	event arguments (used to set up a pressure level threshold)
@@ -298,10 +293,11 @@ void vmpressure_prio(gfp_t gfp, struct m
  * cftype).register_event, and then cgroup core will handle everything by
  * itself.
  */
-int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
-			      struct eventfd_ctx *eventfd, const char *args)
+int vmpressure_register_event(struct cgroup_subsys_state *css,
+			      struct cftype *cft, struct eventfd_ctx *eventfd,
+			      const char *args)
 {
-	struct vmpressure *vmpr = cg_to_vmpressure(cg);
+	struct vmpressure *vmpr = css_to_vmpressure(css);
 	struct vmpressure_event *ev;
 	int level;
 
@@ -329,7 +325,7 @@ int vmpressure_register_event(struct cgr
 
 /**
  * vmpressure_unregister_event() - Unbind eventfd from vmpressure
- * @cg:		cgroup handle
+ * @css:	css handle
  * @cft:	cgroup control files handle
  * @eventfd:	eventfd context that was used to link vmpressure with the @cg
  *
@@ -341,10 +337,11 @@ int vmpressure_register_event(struct cgr
  * cftype).unregister_event, and then cgroup core will handle everything
  * by itself.
  */
-void vmpressure_unregister_event(struct cgroup *cg, struct cftype *cft,
+void vmpressure_unregister_event(struct cgroup_subsys_state *css,
+				 struct cftype *cft,
 				 struct eventfd_ctx *eventfd)
 {
-	struct vmpressure *vmpr = cg_to_vmpressure(cg);
+	struct vmpressure *vmpr = css_to_vmpressure(css);
 	struct vmpressure_event *ev;
 
 	mutex_lock(&vmpr->events_lock);

* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
                     ` (3 preceding siblings ...)
  2013-08-02 20:24   ` [PATCH v2 " Tejun Heo
@ 2013-08-05 12:44   ` Vivek Goyal
  2013-08-05 17:57   ` Aristeu Rozanski
  5 siblings, 0 replies; 60+ messages in thread
From: Vivek Goyal @ 2013-08-05 12:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Michal Hocko, Balbir Singh,
	Aristeu Rozanski, Matt Helsley, Daniel Wagner, Jens Axboe,
	Steven Rostedt

On Thu, Aug 01, 2013 at 05:49:46PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup *
> in subsystem implementations for the following reasons.
> 
> * With unified hierarchy, subsystems will be dynamically bound and
>   unbound from cgroups and thus css's (cgroup_subsys_state) may be
>   created and destroyed dynamically over the lifetime of a cgroup,
>   which is different from the current state where all css's are
>   allocated and destroyed together with the associated cgroup.  This
>   in turn means that cgroup_css() should be synchronized and may
>   return NULL, making it more cumbersome to use.
> 
> * Differing levels of per-subsystem granularity in the unified
>   hierarchy means that the task and descendant iterators should behave
>   differently depending on the specific subsystem the iteration is
>   being performed for.
> 
> * In the majority of cases, a subsystem only cares about its part in
>   the cgroup hierarchy, i.e. the hierarchy of css's.  Subsystem methods
>   often obtain the matching css pointer from the cgroup and don't
>   bother with the cgroup pointer itself.  Passing around css fits
>   much better.
> 
> This patch converts all cgroup_subsys methods to take @css instead of
> @cgroup.  The conversions are mostly straight-forward.  A few
> noteworthy changes are
> 
> * ->css_alloc() now takes css of the parent cgroup rather than the
>   pointer to the new cgroup as the css for the new cgroup doesn't
>   exist yet.  Knowing the parent css is enough for all the existing
>   subsystems.
> 
> * In kernel/cgroup.c::offline_css(), unnecessary open coded css
>   dereference is replaced with local variable access.
> 
> This patch shouldn't cause any behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> ---
>  block/blk-cgroup.c        | 25 +++++++++++-----------

blk-cgroup changes look good to me.

Acked-by: Vivek Goyal <vgoyal@redhat.com>

Vivek

^ permalink raw reply	[flat|nested] 60+ messages in thread
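
Note on the ->css_alloc() change quoted above: the sketch below illustrates why the parent css is enough for a typical subsystem. It is a user-space mock under stated assumptions, not kernel code. The names mock_state, css_to_mock and mock_css_alloc, the inherited "limit" field and the root default of 100 are all made up for illustration, and a NULL parent is assumed to mean "allocating the root".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel's css. */
struct cgroup_subsys_state { int unused; };

/* Hypothetical subsystem state that inherits a setting from its parent. */
struct mock_state {
	struct cgroup_subsys_state css;	/* must stay the first member */
	int limit;
};

static struct mock_state *css_to_mock(struct cgroup_subsys_state *css)
{
	/* css is the first member, so the pointers coincide. */
	return css ? (struct mock_state *)css : NULL;
}

/*
 * Modelled on the converted ->css_alloc(): the new cgroup has no css yet,
 * so the only handle available is the parent's css.
 */
static struct cgroup_subsys_state *
mock_css_alloc(struct cgroup_subsys_state *parent_css)
{
	struct mock_state *parent = css_to_mock(parent_css);
	struct mock_state *st = calloc(1, sizeof(*st));

	if (!st)
		return NULL;	/* a kernel subsystem would return ERR_PTR(-ENOMEM) */
	st->limit = parent ? parent->limit : 100;	/* 100: made-up root default */
	return &st->css;
}
```

Everything the new state needs at allocation time (here, one inherited value) is reachable from the parent css, which matches the claim in the commit message.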

* Re: [PATCH 09/23] cgroup: add subsys backlink pointer to cftype
  2013-08-01 21:49 ` [PATCH 09/23] cgroup: add subsys backlink pointer to cftype Tejun Heo
@ 2013-08-05 12:49   ` Vivek Goyal
  0 siblings, 0 replies; 60+ messages in thread
From: Vivek Goyal @ 2013-08-05 12:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: lizefan, containers, cgroups, linux-kernel, Jens Axboe

On Thu, Aug 01, 2013 at 05:49:47PM -0400, Tejun Heo wrote:
> cgroup is transitioning to using css (cgroup_subsys_state) instead of
> cgroup as the primary subsystem handle.  The cgroupfs file interface
> will be converted to use css's which requires finding out the
> subsystem from cftype so that the matching css can be determined from
> the cgroup.
> 
> This patch adds cftype->ss which points to the subsystem the file
> belongs to.  The field is initialized while a cftype is being
> registered.  This makes it unnecessary to explicitly specify the
> subsystem for other cftype handling functions.  The @ss argument is
> dropped from various cftype handling functions.
> 
> This patch shouldn't introduce any behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> ---
>  block/blk-cgroup.c     |  2 +-

blk-cgroup bits are simple here. Ack for these.

Acked-by: Vivek Goyal <vgoyal@redhat.com>

Vivek

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods
  2013-08-01 21:49 ` [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods Tejun Heo
  2013-08-02 13:27   ` Michal Hocko
@ 2013-08-05 14:19   ` Vivek Goyal
  2013-08-05 18:04   ` Aristeu Rozanski
  2013-08-06  6:48   ` Daniel Wagner
  3 siblings, 0 replies; 60+ messages in thread
From: Vivek Goyal @ 2013-08-05 14:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Michal Hocko, Balbir Singh,
	Aristeu Rozanski, Matt Helsley, Daniel Wagner, Jens Axboe,
	Steven Rostedt

On Thu, Aug 01, 2013 at 05:49:50PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup.
> Please see the previous commit which converts the subsystem methods
> for rationale.
> 
> This patch converts all cftype file operations to take @css instead of
> @cgroup.  cftypes for the cgroup core files don't have their subsystem
> pointer set.  These will automatically use the dummy_css added by the
> previous patch and can be converted the same way.
> 
> Most subsystem conversions are straightforward but there are some
> interesting ones.
> 
> * freezer: update_if_frozen() is also converted to take @css instead
>   of @cgroup for consistency.  This will make the code look simpler
>   too once iterators are converted to use css.
> 
> * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
>   vmpressure while mem_cgroup_from_cont() can be made static.
>   Updated accordingly.
> 
> * cpu: cgroup_tg() doesn't have any user left.  Removed.
> 
> * cpuacct: cgroup_ca() doesn't have any user left.  Removed.
> 
> * hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
>   Removed.
> 
> * net_cls: cgrp_cls_state() doesn't have any user left.  Removed.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> ---
>  block/blk-cgroup.c         |   6 +-
>  block/blk-throttle.c       |  32 ++++-----
>  block/cfq-iosched.c        |  90 ++++++++++++-------------

blk-cgroup.c, blk-throttle.c and cfq-iosched.c bits look good to me.

Acked-by: Vivek Goyal <vgoyal@redhat.com>

Vivek

^ permalink raw reply	[flat|nested] 60+ messages in thread
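
Note on patches 09 and 12 as quoted above: taken together they say that a cftype carries a backlink to its subsystem, and that core files, which have no subsystem, fall back to the cgroup's dummy_css. The following is a user-space sketch of that selection logic only. The struct layouts, MOCK_SUBSYS_COUNT and the mock_* function names are assumptions for illustration. The sketch also passes an explicit count, which is a simplification not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_SUBSYS_COUNT 2	/* made-up number of subsystems */

struct cgroup_subsys_state { int id; };

struct cgroup_subsys {
	int subsys_id;			/* index into cgroup->subsys[] */
};

struct cftype {
	const char *name;
	struct cgroup_subsys *ss;	/* backlink; NULL for cgroup core files */
};

struct cgroup {
	struct cgroup_subsys_state dummy_css;	/* used when cft->ss is NULL */
	struct cgroup_subsys_state *subsys[MOCK_SUBSYS_COUNT];
};

/* Registration fills in the backlink once, so later callers need no @ss. */
static void mock_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts,
			     size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		cfts[i].ss = ss;
}

/* Pick the css that a file operation on @cft in @cgrp should receive. */
static struct cgroup_subsys_state *mock_file_css(struct cgroup *cgrp,
						 const struct cftype *cft)
{
	assert(cgrp != NULL && cft != NULL);
	if (!cft->ss)
		return &cgrp->dummy_css;	/* core file: no subsystem */
	return cgrp->subsys[cft->ss->subsys_id];
}
```

With the backlink in place, the file method signature can take a css uniformly, whether the file belongs to a subsystem or to cgroup core.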

* Re: [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 ` [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
  2013-08-02 13:32   ` Michal Hocko
@ 2013-08-05 14:25   ` Vivek Goyal
  2013-08-05 18:10   ` Aristeu Rozanski
  2 siblings, 0 replies; 60+ messages in thread
From: Vivek Goyal @ 2013-08-05 14:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Johannes Weiner,
	Michal Hocko, Balbir Singh, Aristeu Rozanski, Matt Helsley,
	Jens Axboe

On Thu, Aug 01, 2013 at 05:49:53PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using css
> (cgroup_subsys_state) as the primary handle instead of cgroup in
> subsystem API.  For hierarchy iterators, this is beneficial because
> 
> * In most cases, css is the only thing subsystems care about anyway.
> 
> * On the planned unified hierarchy, iterations for different
>   subsystems will need to skip over different subtrees of the
>   hierarchy depending on which subsystems are enabled on each cgroup.
>   Passing around css makes it unnecessary to explicitly specify the
>   subsystem in question as a css is the intersection between a cgroup
>   and a subsystem.
> 
> * For the planned unified hierarchy, css's would need to be created
>   and destroyed dynamically independent from cgroup hierarchy.  Having
>   cgroup core manage css iteration makes enforcing deref rules a lot
>   easier.
> 
> Most subsystem conversions are straight-forward.  Noteworthy changes
> are
> 
> * blkio: cgroup_to_blkcg() is no longer used.  Removed.
> 
> * freezer: cgroup_freezer() is no longer used.  Removed.
> 
> * devices: cgroup_to_devcgroup() is no longer used.  Removed.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> ---
>  block/blk-cgroup.c       |   8 +--
>  block/blk-cgroup.h       |  25 ++++-----
>  block/blk-throttle.c     |   8 +--

Block bits look good to me.

Acked-by: Vivek Goyal <vgoyal@redhat.com>

Vivek

^ permalink raw reply	[flat|nested] 60+ messages in thread
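
Note on the iterator rationale quoted above: the second bullet describes planned unified-hierarchy behaviour, where an iteration skips cgroups on which the subsystem is not enabled. The sketch below models that idea in user space. It is not the code from this patch. The struct layouts and the names mock_css_next_child and mock_count_children are hypothetical, and a NULL entry in subsys[] is assumed to mean "subsystem not enabled on this cgroup".

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_SUBSYS_COUNT 2	/* made-up number of subsystems */

struct cgroup;

struct cgroup_subsys_state {
	struct cgroup *cgroup;	/* owning cgroup */
	int subsys_id;		/* subsystem this css belongs to */
};

struct cgroup {
	struct cgroup *first_child;
	struct cgroup *next_sibling;
	struct cgroup_subsys_state *subsys[MOCK_SUBSYS_COUNT];	/* NULL: disabled */
};

/*
 * Return the child css following @pos under @parent_css, or the first one
 * when @pos is NULL.  The subsystem is implied by @parent_css, so children
 * without a css for that subsystem are skipped without the caller naming it.
 */
static struct cgroup_subsys_state *
mock_css_next_child(struct cgroup_subsys_state *pos,
		    struct cgroup_subsys_state *parent_css)
{
	struct cgroup *next;

	assert(parent_css != NULL);
	next = pos ? pos->cgroup->next_sibling : parent_css->cgroup->first_child;
	for (; next; next = next->next_sibling) {
		struct cgroup_subsys_state *css = next->subsys[parent_css->subsys_id];

		if (css)
			return css;
	}
	return NULL;
}

/* Count the children visible to the subsystem that @parent_css belongs to. */
static int mock_count_children(struct cgroup_subsys_state *parent_css)
{
	struct cgroup_subsys_state *pos;
	int n = 0;

	for (pos = mock_css_next_child(NULL, parent_css); pos;
	     pos = mock_css_next_child(pos, parent_css))
		n++;
	return n;
}
```

Two subsystems walking the same cgroup tree can therefore see different sets of children, which is the situation the css-based iterator interface is meant to handle.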

* Re: [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
                     ` (4 preceding siblings ...)
  2013-08-05 12:44   ` [PATCH " Vivek Goyal
@ 2013-08-05 17:57   ` Aristeu Rozanski
  5 siblings, 0 replies; 60+ messages in thread
From: Aristeu Rozanski @ 2013-08-05 17:57 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Michal Hocko, Balbir Singh,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Thu, Aug 01, 2013 at 05:49:46PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup *
> in subsystem implementations for the following reasons.
> 
> * With unified hierarchy, subsystems will be dynamically bound and
>   unbound from cgroups and thus css's (cgroup_subsys_state) may be
>   created and destroyed dynamically over the lifetime of a cgroup,
>   which is different from the current state where all css's are
>   allocated and destroyed together with the associated cgroup.  This
>   in turn means that cgroup_css() should be synchronized and may
>   return NULL, making it more cumbersome to use.
> 
> * Differing levels of per-subsystem granularity in the unified
>   hierarchy means that the task and descendant iterators should behave
>   differently depending on the specific subsystem the iteration is
>   being performed for.
> 
> * In the majority of cases, a subsystem only cares about its part in
>   the cgroup hierarchy, i.e. the hierarchy of css's.  Subsystem methods
>   often obtain the matching css pointer from the cgroup and don't
>   bother with the cgroup pointer itself.  Passing around css fits
>   much better.
> 
> This patch converts all cgroup_subsys methods to take @css instead of
> @cgroup.  The conversions are mostly straight-forward.  A few
> noteworthy changes are
> 
> * ->css_alloc() now takes css of the parent cgroup rather than the
>   pointer to the new cgroup as the css for the new cgroup doesn't
>   exist yet.  Knowing the parent css is enough for all the existing
>   subsystems.
> 
> * In kernel/cgroup.c::offline_css(), unnecessary open coded css
>   dereference is replaced with local variable access.
> 
> This patch shouldn't cause any behavior differences.

looks fine on device_cgroup.c bit

Acked-by: Aristeu Rozanski <aris@redhat.com>

-- 
Aristeu


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods
  2013-08-01 21:49 ` [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods Tejun Heo
  2013-08-02 13:27   ` Michal Hocko
  2013-08-05 14:19   ` Vivek Goyal
@ 2013-08-05 18:04   ` Aristeu Rozanski
  2013-08-06  6:48   ` Daniel Wagner
  3 siblings, 0 replies; 60+ messages in thread
From: Aristeu Rozanski @ 2013-08-05 18:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Michal Hocko, Balbir Singh,
	Matt Helsley, Daniel Wagner, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On Thu, Aug 01, 2013 at 05:49:50PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup.
> Please see the previous commit which converts the subsystem methods
> for rationale.
> 
> This patch converts all cftype file operations to take @css instead of
> @cgroup.  cftypes for the cgroup core files don't have their subsystem
> pointer set.  These will automatically use the dummy_css added by the
> previous patch and can be converted the same way.
> 
> Most subsystem conversions are straightforward but there are some
> interesting ones.
> 
> * freezer: update_if_frozen() is also converted to take @css instead
>   of @cgroup for consistency.  This will make the code look simpler
>   too once iterators are converted to use css.
> 
> * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
>   vmpressure while mem_cgroup_from_cont() can be made static.
>   Updated accordingly.
> 
> * cpu: cgroup_tg() doesn't have any user left.  Removed.
> 
> * cpuacct: cgroup_ca() doesn't have any user left.  Removed.
> 
> * hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
>   Removed.
> 
> * net_cls: cgrp_cls_state() doesn't have any user left.  Removed.

Also looks good on devcg part

Acked-by: Aristeu Rozanski <aris@redhat.com>

-- 
Aristeu


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:49 ` [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
  2013-08-02 13:32   ` Michal Hocko
  2013-08-05 14:25   ` Vivek Goyal
@ 2013-08-05 18:10   ` Aristeu Rozanski
  2 siblings, 0 replies; 60+ messages in thread
From: Aristeu Rozanski @ 2013-08-05 18:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Johannes Weiner,
	Michal Hocko, Balbir Singh, Matt Helsley, Vivek Goyal,
	Jens Axboe

On Thu, Aug 01, 2013 at 05:49:53PM -0400, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using css
> (cgroup_subsys_state) as the primary handle instead of cgroup in
> subsystem API.  For hierarchy iterators, this is beneficial because
> 
> * In most cases, css is the only thing subsystems care about anyway.
> 
> * On the planned unified hierarchy, iterations for different
>   subsystems will need to skip over different subtrees of the
>   hierarchy depending on which subsystems are enabled on each cgroup.
>   Passing around css makes it unnecessary to explicitly specify the
>   subsystem in question as a css is the intersection between a cgroup
>   and a subsystem.
> 
> * For the planned unified hierarchy, css's would need to be created
>   and destroyed dynamically independent from cgroup hierarchy.  Having
>   cgroup core manage css iteration makes enforcing deref rules a lot
>   easier.
> 
> Most subsystem conversions are straight-forward.  Noteworthy changes
> are
> 
> * blkio: cgroup_to_blkcg() is no longer used.  Removed.
> 
> * freezer: cgroup_freezer() is no longer used.  Removed.
> 
> * devices: cgroup_to_devcgroup() is no longer used.  Removed.

Acked-by: Aristeu Rozanski <aris@redhat.com>

-- 
Aristeu


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods
  2013-08-01 21:49 ` [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods Tejun Heo
                     ` (2 preceding siblings ...)
  2013-08-05 18:04   ` Aristeu Rozanski
@ 2013-08-06  6:48   ` Daniel Wagner
  3 siblings, 0 replies; 60+ messages in thread
From: Daniel Wagner @ 2013-08-06  6:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Michal Hocko, Balbir Singh,
	Aristeu Rozanski, Matt Helsley, Vivek Goyal, Jens Axboe,
	Steven Rostedt

Hi Tejun,

On 08/01/2013 11:49 PM, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup.
> Please see the previous commit which converts the subsystem methods
> for rationale.
>
> This patch converts all cftype file operations to take @css instead of
> @cgroup.  cftypes for the cgroup core files don't have their subsystem
> pointer set.  These will automatically use the dummy_css added by the
> previous patch and can be converted the same way.
>
> Most subsystem conversions are straightforward but there are some
> interesting ones.
>
> * freezer: update_if_frozen() is also converted to take @css instead
>    of @cgroup for consistency.  This will make the code look simpler
>    too once iterators are converted to use css.
>
> * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
>    vmpressure while mem_cgroup_from_cont() can be made static.
>    Updated accordingly.
>
> * cpu: cgroup_tg() doesn't have any user left.  Removed.
>
> * cpuacct: cgroup_ca() doesn't have any user left.  Removed.
>
> * hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
>    Removed.
>
> * net_cls: cgrp_cls_state() doesn't have any user left.  Removed.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Steven Rostedt <rostedt@goodmis.org>

I guess I ended up on Cc because I made some changes to netprio_cgroup
and cls_cgroup.

Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>

cheers,
daniel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 22/23] cgroup: make cgroup_taskset deal with cgroup_subsys_state instead of cgroup
  2013-08-01 21:50 ` [PATCH 22/23] cgroup: make cgroup_taskset " Tejun Heo
@ 2013-08-06  6:53   ` Daniel Wagner
  0 siblings, 0 replies; 60+ messages in thread
From: Daniel Wagner @ 2013-08-06  6:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Ingo Molnar,
	Matt Helsley, Steven Rostedt

Hi Tejun,

On 08/01/2013 11:50 PM, Tejun Heo wrote:
> cgroup is in the process of converting to css (cgroup_subsys_state)
> from cgroup as the principal subsystem interface handle.  This is
> mostly to prepare for the unified hierarchy support where css's will
> be created and destroyed dynamically but also helps cleaning up
> subsystem implementations as css is usually what they are interested
> in anyway.
>
> cgroup_taskset which is used by the subsystem attach methods is the
> last cgroup subsystem API which isn't using css as the handle.  Update
> cgroup_taskset_cur_cgroup() to cgroup_taskset_cur_css() and
> cgroup_taskset_for_each() to take @skip_css instead of @skip_cgrp.
>
> The conversions are pretty mechanical.  One exception is
> cpuset::cgroup_cs(), which lost its last user and got removed.
>
> This patch shouldn't introduce any functional changes.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>

Nice cleanup.

Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>

cheers,
daniel

^ permalink raw reply	[flat|nested] 60+ messages in thread
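
Note on the cgroup_taskset change quoted above: the commit message says cgroup_taskset_for_each() now takes @skip_css instead of @skip_cgrp. The sketch below shows the filtering idea in user space, assuming the iteration skips tasks whose current css is already @skip_css and that a NULL @skip_css skips nothing. The types mock_task_entry and mock_taskset and the function mock_taskset_collect are hypothetical; the kernel interface is a macro over an opaque taskset, not this function.

```c
#include <assert.h>
#include <stddef.h>

struct cgroup_subsys_state { int id; };

/* Hypothetical entry: a task being migrated plus the css it currently has. */
struct mock_task_entry {
	int pid;
	struct cgroup_subsys_state *cur_css;
};

struct mock_taskset {
	struct mock_task_entry *entries;
	size_t nr;
};

/*
 * Store in @pids (at most @max entries) every task in @tset whose current
 * css differs from @skip_css, and return how many were stored.
 */
static size_t mock_taskset_collect(const struct mock_taskset *tset,
				   const struct cgroup_subsys_state *skip_css,
				   int *pids, size_t max)
{
	size_t i, n = 0;

	assert(tset != NULL && pids != NULL);
	for (i = 0; i < tset->nr && n < max; i++) {
		if (skip_css && tset->entries[i].cur_css == skip_css)
			continue;	/* task already sits in @skip_css */
		pids[n++] = tset->entries[i].pid;
	}
	return n;
}
```

An attach method would typically pass its destination css as @skip_css so that tasks that are not actually changing css are left alone.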

* Re: [PATCH v2 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods
  2013-08-02 20:24   ` [PATCH v2 " Tejun Heo
@ 2013-08-06  7:19     ` Daniel Wagner
  0 siblings, 0 replies; 60+ messages in thread
From: Daniel Wagner @ 2013-08-06  7:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, containers, cgroups, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Johannes Weiner, Michal Hocko, Balbir Singh,
	Aristeu Rozanski, Matt Helsley, Vivek Goyal, Jens Axboe,
	Steven Rostedt

On 08/02/2013 10:24 PM, Tejun Heo wrote:
> cgroup is currently in the process of transitioning to using struct
> cgroup_subsys_state * as the primary handle instead of struct cgroup *
> in subsystem implementations for the following reasons.
>
> * With unified hierarchy, subsystems will be dynamically bound and
>    unbound from cgroups and thus css's (cgroup_subsys_state) may be
>    created and destroyed dynamically over the lifetime of a cgroup,
>    which is different from the current state where all css's are
>    allocated and destroyed together with the associated cgroup.  This
>    in turn means that cgroup_css() should be synchronized and may
>    return NULL, making it more cumbersome to use.
>
> * Differing levels of per-subsystem granularity in the unified
>    hierarchy means that the task and descendant iterators should behave
>    differently depending on the specific subsystem the iteration is
>    being performed for.
>
> * In the majority of cases, a subsystem only cares about its part in
>    the cgroup hierarchy, i.e. the hierarchy of css's.  Subsystem methods
>    often obtain the matching css pointer from the cgroup and don't
>    bother with the cgroup pointer itself.  Passing around css fits
>    much better.
>
> This patch converts all cgroup_subsys methods to take @css instead of
> @cgroup.  The conversions are mostly straight-forward.  A few
> noteworthy changes are
>
> * ->css_alloc() now takes css of the parent cgroup rather than the
>    pointer to the new cgroup as the css for the new cgroup doesn't
>    exist yet.  Knowing the parent css is enough for all the existing
>    subsystems.
>
> * In kernel/cgroup.c::offline_css(), unnecessary open coded css
>    dereference is replaced with local variable access.
>
> This patch shouldn't cause any behavior differences.
>
> v2: Unnecessary explicit cgrp->subsys[] deref in css_online() replaced
>      with local variable @css as suggested by Li Zefan.
>
>      Rebased on top of new for-3.12 which includes for-3.11-fixes so
>      that ->css_free() invocation added by da0a12caff ("cgroup: fix a
>      leak when percpu_ref_init() fails") is converted too.  Suggested
>      by Li Zefan.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Acked-by: Li Zefan <lizefan@huawei.com>
> Acked-by: Michal Hocko <mhocko@suse.cz>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Aristeu Rozanski <aris@redhat.com>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Steven Rostedt <rostedt@goodmis.org>

netprio and cls part:

Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>

cheers,
daniel

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle
  2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
                   ` (23 preceding siblings ...)
  2013-08-02  3:24 ` [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Li Zefan
@ 2013-08-09  0:12 ` Tejun Heo
  24 siblings, 0 replies; 60+ messages in thread
From: Tejun Heo @ 2013-08-09  0:12 UTC (permalink / raw)
  To: lizefan; +Cc: containers, cgroups, linux-kernel

On Thu, Aug 01, 2013 at 05:49:38PM -0400, Tejun Heo wrote:
> So, this patchset converts all cgroup subsystem APIs to deal with
> css's instead of cgroups.  The patchset is fairly large but most of
> the conversions, while being tedious, aren't complex.  At the end of
> series, subsystems no longer make cgroup -> css mapping themselves and
> cgroup_css() - formerly cgroup_subsys_state() - is made internal to
> cgroup core proper.

Applied to cgroup/for-3.12.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2013-08-09  0:12 UTC | newest]

Thread overview: 60+ messages
2013-08-01 21:49 [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Tejun Heo
2013-08-01 21:49 ` [PATCH 01/23] cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/ Tejun Heo
2013-08-01 21:49 ` [PATCH 02/23] cpuset: drop "const" qualifiers from struct cpuset instances Tejun Heo
2013-08-01 21:49 ` [PATCH 03/23] netprio_cgroup: pass around @css instead of @cgroup and kill struct cgroup_netprio_state Tejun Heo
2013-08-01 22:07   ` David Miller
2013-08-02 11:42   ` Neil Horman
2013-08-01 21:49 ` [PATCH 04/23] hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup Tejun Heo
2013-08-02  4:35   ` Aneesh Kumar K.V
2013-08-02 13:10   ` Michal Hocko
2013-08-01 21:49 ` [PATCH 05/23] cgroup: add subsystem pointer to cgroup_subsys_state Tejun Heo
2013-08-01 21:49 ` [PATCH 06/23] cgroup: add/update accessors which obtain subsys specific data from css Tejun Heo
2013-08-01 21:49 ` [PATCH 07/23] cgroup: add css_parent() Tejun Heo
2013-08-01 21:49 ` [PATCH 08/23] cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods Tejun Heo
2013-08-02  3:54   ` Li Zefan
2013-08-02 19:36     ` Tejun Heo
2013-08-02  4:02   ` Li Zefan
2013-08-02 19:41     ` Tejun Heo
2013-08-02 13:19   ` Michal Hocko
2013-08-02 13:43     ` Michal Hocko
2013-08-02 19:52       ` Tejun Heo
2013-08-02 19:38     ` Tejun Heo
2013-08-02 20:24   ` [PATCH v2 " Tejun Heo
2013-08-06  7:19     ` Daniel Wagner
2013-08-05 12:44   ` [PATCH " Vivek Goyal
2013-08-05 17:57   ` Aristeu Rozanski
2013-08-01 21:49 ` [PATCH 09/23] cgroup: add subsys backlink pointer to cftype Tejun Heo
2013-08-05 12:49   ` Vivek Goyal
2013-08-01 21:49 ` [PATCH 10/23] cgroup: pin cgroup_subsys_state when opening a cgroupfs file Tejun Heo
2013-08-01 21:49 ` [PATCH 11/23] cgroup: add cgroup->dummy_css Tejun Heo
2013-08-01 21:49 ` [PATCH 12/23] cgroup: pass around cgroup_subsys_state instead of cgroup in file methods Tejun Heo
2013-08-02 13:27   ` Michal Hocko
2013-08-05 14:19   ` Vivek Goyal
2013-08-05 18:04   ` Aristeu Rozanski
2013-08-06  6:48   ` Daniel Wagner
2013-08-01 21:49 ` [PATCH 13/23] cgroup: convert cgroup_next_sibling() to cgroup_next_child() Tejun Heo
2013-08-01 21:49 ` [PATCH 14/23] cgroup: always use cgroup_next_child() to walk the children list Tejun Heo
2013-08-01 21:49 ` [PATCH 15/23] cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
2013-08-02 13:32   ` Michal Hocko
2013-08-05 14:25   ` Vivek Goyal
2013-08-05 18:10   ` Aristeu Rozanski
2013-08-01 21:49 ` [PATCH 16/23] cgroup: relocate cgroup_advance_iter() Tejun Heo
2013-08-02  3:25   ` Li Zefan
2013-08-02 19:35     ` Tejun Heo
2013-08-01 21:49 ` [PATCH 17/23] cgroup: rename cgroup_iter to cgroup_task_iter Tejun Heo
2013-08-02 13:35   ` Michal Hocko
2013-08-01 21:49 ` [PATCH 18/23] cgroup: make cgroup_task_iter remember the cgroup being iterated Tejun Heo
2013-08-02 13:38   ` Michal Hocko
2013-08-01 21:49 ` [PATCH 19/23] cgroup: remove struct cgroup_scanner Tejun Heo
2013-08-01 21:49 ` [PATCH 20/23] cgroup: make task iterators deal with cgroup_subsys_state instead of cgroup Tejun Heo
2013-08-02 13:40   ` Michal Hocko
2013-08-01 21:49 ` [PATCH 21/23] cgroup: make cftype->[un]register_event() " Tejun Heo
2013-08-02  4:08   ` Li Zefan
2013-08-02 19:44     ` Tejun Heo
2013-08-02 13:42   ` Michal Hocko
2013-08-02 20:24   ` [PATCH v2 " Tejun Heo
2013-08-01 21:50 ` [PATCH 22/23] cgroup: make cgroup_taskset " Tejun Heo
2013-08-06  6:53   ` Daniel Wagner
2013-08-01 21:50 ` [PATCH 23/23] cgroup: unexport cgroup_css() Tejun Heo
2013-08-02  3:24 ` [PATCHSET cgroup/for-3.12] cgroup: use cgroup_subsys_state as the primary subsystem interface handle Li Zefan
2013-08-09  0:12 ` Tejun Heo
