linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy
@ 2014-06-28  1:03 Tejun Heo
  2014-06-28  1:03 ` [PATCH 1/6] cgroup: reorganize cgroup_subtree_control_write() Tejun Heo
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Tejun Heo @ 2014-06-28  1:03 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe

Hello, guys.

Currently, the blkio subsystem attributes all of writeback IOs to the
root.  One of the issues is that there's no way to tell who originated
a writeback IO from block layer.  Those IOs are usually issued
asynchronously from a task which didn't have anything to do with
actually generating the dirty pages.  The memory subsystem, when
enabled, already keeps track of the ownership of each dirty page and
it's desirable for blkio to piggyback instead of adding its own
per-page tag.

This can be achieved on the unified hierarchy without too much
difficulty.  This patchset implements a dependency mechanism in the
cgroup such that a subsystem can depends on other subsystems.  If
available, the depended-upon subsystems are enabled together
implicitly when the subsystem is turned on.  Implicitly enabled
subsystems are invisible and the dependencies are transparent to
userland.

This patchset implements the dependency mechanism in cgroup core and
make blkcg depend on memcg.  This doesn't actually solve the writeback
problem yet but is an important step.

This patchset contains the following six patches.

 0001-cgroup-reorganize-cgroup_subtree_control_write.patch
 0002-cgroup-introduce-cgroup-subtree_control.patch
 0003-cgroup-make-interface-files-visible-iff-enabled-on-c.patch
 0004-cgroup-implement-cgroup_subsys-css_reset.patch
 0005-cgroup-implement-cgroup_subsys-depends_on.patch
 0006-blkcg-memcg-make-blkcg-depend-on-memcg-on-the-defaul.patch

0001-0005 gradually implement the dependency mechanism.

0006 makes blkcg depend on memcg.

This patchset is on top of a497c3ba1d97 ("Linux 3.16-rc2") and
available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cgroup-dependency

diffstat follows.  Thanks.

 Documentation/cgroups/cgroups.txt           |   14 +
 Documentation/cgroups/unified-hierarchy.txt |   23 ++-
 block/blk-cgroup.c                          |    7
 include/linux/cgroup.h                      |   20 ++
 kernel/cgroup.c                             |  201 ++++++++++++++++++++++------
 mm/memcontrol.c                             |   24 +++
 6 files changed, 243 insertions(+), 46 deletions(-)

--
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] cgroup: reorganize cgroup_subtree_control_write()
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
@ 2014-06-28  1:03 ` Tejun Heo
  2014-06-28  1:03 ` [PATCH 2/6] cgroup: introduce cgroup->subtree_control Tejun Heo
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2014-06-28  1:03 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe, Tejun Heo

Make the following two reorganizations to
cgroup_subtree_control_write().  These are to prepare for future
changes and shouldn't cause any functional difference.

* Move availability above css offlining wait.

* Move cgrp->child_subsys_mask update above new css creation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7868fc3..a46d7e2 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2613,6 +2613,14 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 				continue;
 			}
 
+			/* unavailable or not enabled on the parent? */
+			if (!(cgrp_dfl_root.subsys_mask & (1 << ssid)) ||
+			    (cgroup_parent(cgrp) &&
+			     !(cgroup_parent(cgrp)->child_subsys_mask & (1 << ssid)))) {
+				ret = -ENOENT;
+				goto out_unlock;
+			}
+
 			/*
 			 * Because css offlining is asynchronous, userland
 			 * might try to re-enable the same controller while
@@ -2635,14 +2643,6 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 
 				return restart_syscall();
 			}
-
-			/* unavailable or not enabled on the parent? */
-			if (!(cgrp_dfl_root.subsys_mask & (1 << ssid)) ||
-			    (cgroup_parent(cgrp) &&
-			     !(cgroup_parent(cgrp)->child_subsys_mask & (1 << ssid)))) {
-				ret = -ENOENT;
-				goto out_unlock;
-			}
 		} else if (disable & (1 << ssid)) {
 			if (!(cgrp->child_subsys_mask & (1 << ssid))) {
 				disable &= ~(1 << ssid);
@@ -2673,12 +2673,10 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 		goto out_unlock;
 	}
 
-	/*
-	 * Create csses for enables and update child_subsys_mask.  This
-	 * changes cgroup_e_css() results which in turn makes the
-	 * subsequent cgroup_update_dfl_csses() associate all tasks in the
-	 * subtree to the updated csses.
-	 */
+	cgrp->child_subsys_mask |= enable;
+	cgrp->child_subsys_mask &= ~disable;
+
+	/* create new csses */
 	for_each_subsys(ss, ssid) {
 		if (!(enable & (1 << ssid)))
 			continue;
@@ -2690,9 +2688,11 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 		}
 	}
 
-	cgrp->child_subsys_mask |= enable;
-	cgrp->child_subsys_mask &= ~disable;
-
+	/*
+	 * At this point, cgroup_e_css() results reflect the new csses
+	 * making the following cgroup_update_dfl_csses() properly update
+	 * css associations of all tasks in the subtree.
+	 */
 	ret = cgroup_update_dfl_csses(cgrp);
 	if (ret)
 		goto err_undo_css;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/6] cgroup: introduce cgroup->subtree_control
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
  2014-06-28  1:03 ` [PATCH 1/6] cgroup: reorganize cgroup_subtree_control_write() Tejun Heo
@ 2014-06-28  1:03 ` Tejun Heo
  2014-06-28  1:03 ` [PATCH 3/6] cgroup: make interface files visible iff enabled on cgroup->subtree_control Tejun Heo
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2014-06-28  1:03 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe, Tejun Heo

cgroup is implementing support for subsystem dependency which would
require a way to enable a subsystem even when it's not directly
configured through "cgroup.subtree_control".

Previously, cgroup->child_subsys_mask directly reflected
"cgroup.subtree_control" and the enabled subsystems in the child
cgroups.  This patch adds cgroup->subtree_control which
"cgroup.subtree_control" operates on.  cgroup->child_subsys_mask is
now calculated from cgroup->subtree_control by
cgroup_refresh_child_subsys_mask(), which sets it identical to
cgroup->subtree_control for now.

This will allow using cgroup->child_subsys_mask for all the enabled
subsystems including the implicit ones and ->subtree_control for
tracking the explicitly requested ones.  This patch keeps the two
masks identical and doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |  8 +++++++-
 kernel/cgroup.c        | 46 +++++++++++++++++++++++++++++-----------------
 2 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 8a111dd..8d52c8e 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -203,7 +203,13 @@ struct cgroup {
 	struct kernfs_node *kn;		/* cgroup kernfs entry */
 	struct kernfs_node *populated_kn; /* kn for "cgroup.subtree_populated" */
 
-	/* the bitmask of subsystems enabled on the child cgroups */
+	/*
+	 * The bitmask of subsystems enabled on the child cgroups.
+	 * ->subtree_control is the one configured through
+	 * "cgroup.subtree_control" while ->child_subsys_mask is the
+	 * effective one which may have more subsystems enabled.
+	 */
+	unsigned int subtree_control;
 	unsigned int child_subsys_mask;
 
 	/* Private pointers for each registered subsystem */
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a46d7e2..14a9d88 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1036,6 +1036,11 @@ static void cgroup_put(struct cgroup *cgrp)
 	css_put(&cgrp->self);
 }
 
+static void cgroup_refresh_child_subsys_mask(struct cgroup *cgrp)
+{
+	cgrp->child_subsys_mask = cgrp->subtree_control;
+}
+
 /**
  * cgroup_kn_unlock - unlocking helper for cgroup kernfs methods
  * @kn: the kernfs_node being serviced
@@ -1208,12 +1213,15 @@ static int rebind_subsystems(struct cgroup_root *dst_root, unsigned int ss_mask)
 		up_write(&css_set_rwsem);
 
 		src_root->subsys_mask &= ~(1 << ssid);
-		src_root->cgrp.child_subsys_mask &= ~(1 << ssid);
+		src_root->cgrp.subtree_control &= ~(1 << ssid);
+		cgroup_refresh_child_subsys_mask(&src_root->cgrp);
 
 		/* default hierarchy doesn't enable controllers by default */
 		dst_root->subsys_mask |= 1 << ssid;
-		if (dst_root != &cgrp_dfl_root)
-			dst_root->cgrp.child_subsys_mask |= 1 << ssid;
+		if (dst_root != &cgrp_dfl_root) {
+			dst_root->cgrp.subtree_control |= 1 << ssid;
+			cgroup_refresh_child_subsys_mask(&dst_root->cgrp);
+		}
 
 		if (ss->bind)
 			ss->bind(css);
@@ -2454,7 +2462,7 @@ static int cgroup_controllers_show(struct seq_file *seq, void *v)
 {
 	struct cgroup *cgrp = seq_css(seq)->cgroup;
 
-	cgroup_print_ss_mask(seq, cgroup_parent(cgrp)->child_subsys_mask);
+	cgroup_print_ss_mask(seq, cgroup_parent(cgrp)->subtree_control);
 	return 0;
 }
 
@@ -2463,7 +2471,7 @@ static int cgroup_subtree_control_show(struct seq_file *seq, void *v)
 {
 	struct cgroup *cgrp = seq_css(seq)->cgroup;
 
-	cgroup_print_ss_mask(seq, cgrp->child_subsys_mask);
+	cgroup_print_ss_mask(seq, cgrp->subtree_control);
 	return 0;
 }
 
@@ -2608,7 +2616,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 
 	for_each_subsys(ss, ssid) {
 		if (enable & (1 << ssid)) {
-			if (cgrp->child_subsys_mask & (1 << ssid)) {
+			if (cgrp->subtree_control & (1 << ssid)) {
 				enable &= ~(1 << ssid);
 				continue;
 			}
@@ -2616,7 +2624,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 			/* unavailable or not enabled on the parent? */
 			if (!(cgrp_dfl_root.subsys_mask & (1 << ssid)) ||
 			    (cgroup_parent(cgrp) &&
-			     !(cgroup_parent(cgrp)->child_subsys_mask & (1 << ssid)))) {
+			     !(cgroup_parent(cgrp)->subtree_control & (1 << ssid)))) {
 				ret = -ENOENT;
 				goto out_unlock;
 			}
@@ -2644,14 +2652,14 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 				return restart_syscall();
 			}
 		} else if (disable & (1 << ssid)) {
-			if (!(cgrp->child_subsys_mask & (1 << ssid))) {
+			if (!(cgrp->subtree_control & (1 << ssid))) {
 				disable &= ~(1 << ssid);
 				continue;
 			}
 
 			/* a child has it enabled? */
 			cgroup_for_each_live_child(child, cgrp) {
-				if (child->child_subsys_mask & (1 << ssid)) {
+				if (child->subtree_control & (1 << ssid)) {
 					ret = -EBUSY;
 					goto out_unlock;
 				}
@@ -2665,7 +2673,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 	}
 
 	/*
-	 * Except for the root, child_subsys_mask must be zero for a cgroup
+	 * Except for the root, subtree_control must be zero for a cgroup
 	 * with tasks so that child cgroups don't compete against tasks.
 	 */
 	if (enable && cgroup_parent(cgrp) && !list_empty(&cgrp->cset_links)) {
@@ -2673,8 +2681,9 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 		goto out_unlock;
 	}
 
-	cgrp->child_subsys_mask |= enable;
-	cgrp->child_subsys_mask &= ~disable;
+	cgrp->subtree_control |= enable;
+	cgrp->subtree_control &= ~disable;
+	cgroup_refresh_child_subsys_mask(cgrp);
 
 	/* create new csses */
 	for_each_subsys(ss, ssid) {
@@ -2713,8 +2722,9 @@ out_unlock:
 	return ret ?: nbytes;
 
 err_undo_css:
-	cgrp->child_subsys_mask &= ~enable;
-	cgrp->child_subsys_mask |= disable;
+	cgrp->subtree_control &= ~enable;
+	cgrp->subtree_control |= disable;
+	cgroup_refresh_child_subsys_mask(cgrp);
 
 	for_each_subsys(ss, ssid) {
 		if (!(enable & (1 << ssid)))
@@ -4428,10 +4438,12 @@ static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 
 	/*
 	 * On the default hierarchy, a child doesn't automatically inherit
-	 * child_subsys_mask from the parent.  Each is configured manually.
+	 * subtree_control from the parent.  Each is configured manually.
 	 */
-	if (!cgroup_on_dfl(cgrp))
-		cgrp->child_subsys_mask = parent->child_subsys_mask;
+	if (!cgroup_on_dfl(cgrp)) {
+		cgrp->subtree_control = parent->subtree_control;
+		cgroup_refresh_child_subsys_mask(cgrp);
+	}
 
 	kernfs_activate(kn);
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/6] cgroup: make interface files visible iff enabled on cgroup->subtree_control
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
  2014-06-28  1:03 ` [PATCH 1/6] cgroup: reorganize cgroup_subtree_control_write() Tejun Heo
  2014-06-28  1:03 ` [PATCH 2/6] cgroup: introduce cgroup->subtree_control Tejun Heo
@ 2014-06-28  1:03 ` Tejun Heo
  2014-06-28  1:03 ` [PATCH 4/6] cgroup: implement cgroup_subsys->css_reset() Tejun Heo
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2014-06-28  1:03 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe, Tejun Heo

cgroup is implementing support for subsystem dependency which would
require a way to enable a subsystem even when it's not directly
configured through "cgroup.subtree_control".

The preceding patch distinguished cgroup->subtree_control and
->child_subsys_mask where the former is the subsystems explicitly
configured by the userland and the latter is all enabled subsystems
currently is equal to the former but will include subsystems
implicitly enabled through dependency.

Subsystems which are enabled due to dependency shouldn't be visible to
userland.  This patch updates cgroup_subtree_control_write() and
create_css() such that interface files are not created for implicitly
enabled subsytems.

* @visible paramter is added to create_css().  Interface files are
  created only when true.

* If an already implicitly enabled subsystem is turned on through
  "cgroup.subtree_control", the existing css should be used.  css
  draining is skipped.

* cgroup_subtree_control_write() computes the new target
  cgroup->child_subsys_mask and create/kill or show/hide csses
  accordingly.

As the two subsystem masks are still kept identical, this patch
doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |  2 ++
 kernel/cgroup.c        | 78 +++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 8d52c8e..5287f93 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -208,6 +208,8 @@ struct cgroup {
 	 * ->subtree_control is the one configured through
 	 * "cgroup.subtree_control" while ->child_subsys_mask is the
 	 * effective one which may have more subsystems enabled.
+	 * Controller knobs are made available iff it's enabled in
+	 * ->subtree_control.
 	 */
 	unsigned int subtree_control;
 	unsigned int child_subsys_mask;
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 14a9d88..331fa296 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -186,7 +186,8 @@ static void cgroup_put(struct cgroup *cgrp);
 static int rebind_subsystems(struct cgroup_root *dst_root,
 			     unsigned int ss_mask);
 static int cgroup_destroy_locked(struct cgroup *cgrp);
-static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss);
+static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss,
+		      bool visible);
 static void css_release(struct percpu_ref *ref);
 static void kill_css(struct cgroup_subsys_state *css);
 static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
@@ -2577,6 +2578,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 					    loff_t off)
 {
 	unsigned int enable = 0, disable = 0;
+	unsigned int css_enable, css_disable, old_ctrl, new_ctrl;
 	struct cgroup *cgrp, *child;
 	struct cgroup_subsys *ss;
 	char *tok;
@@ -2630,6 +2632,13 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 			}
 
 			/*
+			 * @ss is already enabled through dependency and
+			 * we'll just make it visible.  Skip draining.
+			 */
+			if (cgrp->child_subsys_mask & (1 << ssid))
+				continue;
+
+			/*
 			 * Because css offlining is asynchronous, userland
 			 * might try to re-enable the same controller while
 			 * the previous instance is still around.  In such
@@ -2681,17 +2690,39 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 		goto out_unlock;
 	}
 
+	/*
+	 * Update subsys masks and calculate what needs to be done.  More
+	 * subsystems than specified may need to be enabled or disabled
+	 * depending on subsystem dependencies.
+	 */
 	cgrp->subtree_control |= enable;
 	cgrp->subtree_control &= ~disable;
+
+	old_ctrl = cgrp->child_subsys_mask;
 	cgroup_refresh_child_subsys_mask(cgrp);
+	new_ctrl = cgrp->child_subsys_mask;
+
+	css_enable = ~old_ctrl & new_ctrl;
+	css_disable = old_ctrl & ~new_ctrl;
+	enable |= css_enable;
+	disable |= css_disable;
 
-	/* create new csses */
+	/*
+	 * Create new csses or make the existing ones visible.  A css is
+	 * created invisible if it's being implicitly enabled through
+	 * dependency.  An invisible css is made visible when the userland
+	 * explicitly enables it.
+	 */
 	for_each_subsys(ss, ssid) {
 		if (!(enable & (1 << ssid)))
 			continue;
 
 		cgroup_for_each_live_child(child, cgrp) {
-			ret = create_css(child, ss);
+			if (css_enable & (1 << ssid))
+				ret = create_css(child, ss,
+					cgrp->subtree_control & (1 << ssid));
+			else
+				ret = cgroup_populate_dir(child, 1 << ssid);
 			if (ret)
 				goto err_undo_css;
 		}
@@ -2706,13 +2737,21 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 	if (ret)
 		goto err_undo_css;
 
-	/* all tasks are now migrated away from the old csses, kill them */
+	/*
+	 * All tasks are migrated out of disabled csses.  Kill or hide
+	 * them.  A css is hidden when the userland requests it to be
+	 * disabled while other subsystems are still depending on it.
+	 */
 	for_each_subsys(ss, ssid) {
 		if (!(disable & (1 << ssid)))
 			continue;
 
-		cgroup_for_each_live_child(child, cgrp)
-			kill_css(cgroup_css(child, ss));
+		cgroup_for_each_live_child(child, cgrp) {
+			if (css_disable & (1 << ssid))
+				kill_css(cgroup_css(child, ss));
+			else
+				cgroup_clear_dir(child, 1 << ssid);
+		}
 	}
 
 	kernfs_activate(cgrp->kn);
@@ -2732,8 +2771,14 @@ err_undo_css:
 
 		cgroup_for_each_live_child(child, cgrp) {
 			struct cgroup_subsys_state *css = cgroup_css(child, ss);
-			if (css)
+
+			if (!css)
+				continue;
+
+			if (css_enable & (1 << ssid))
 				kill_css(css);
+			else
+				cgroup_clear_dir(child, 1 << ssid);
 		}
 	}
 	goto out_unlock;
@@ -4282,12 +4327,14 @@ static void offline_css(struct cgroup_subsys_state *css)
  * create_css - create a cgroup_subsys_state
  * @cgrp: the cgroup new css will be associated with
  * @ss: the subsys of new css
+ * @visible: whether to create control knobs for the new css or not
  *
  * Create a new css associated with @cgrp - @ss pair.  On success, the new
- * css is online and installed in @cgrp with all interface files created.
- * Returns 0 on success, -errno on failure.
+ * css is online and installed in @cgrp with all interface files created if
+ * @visible.  Returns 0 on success, -errno on failure.
  */
-static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss)
+static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss,
+		      bool visible)
 {
 	struct cgroup *parent = cgroup_parent(cgrp);
 	struct cgroup_subsys_state *parent_css = cgroup_css(parent, ss);
@@ -4311,9 +4358,11 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss)
 		goto err_free_percpu_ref;
 	css->id = err;
 
-	err = cgroup_populate_dir(cgrp, 1 << ss->id);
-	if (err)
-		goto err_free_id;
+	if (visible) {
+		err = cgroup_populate_dir(cgrp, 1 << ss->id);
+		if (err)
+			goto err_free_id;
+	}
 
 	/* @css is ready to be brought online now, make it visible */
 	list_add_tail_rcu(&css->sibling, &parent_css->children);
@@ -4430,7 +4479,8 @@ static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 	/* let's create and online css's */
 	for_each_subsys(ss, ssid) {
 		if (parent->child_subsys_mask & (1 << ssid)) {
-			ret = create_css(cgrp, ss);
+			ret = create_css(cgrp, ss,
+					 parent->subtree_control & (1 << ssid));
 			if (ret)
 				goto out_destroy;
 		}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/6] cgroup: implement cgroup_subsys->css_reset()
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
                   ` (2 preceding siblings ...)
  2014-06-28  1:03 ` [PATCH 3/6] cgroup: make interface files visible iff enabled on cgroup->subtree_control Tejun Heo
@ 2014-06-28  1:03 ` Tejun Heo
  2014-06-28  1:03 ` [PATCH 5/6] cgroup: implement cgroup_subsys->depends_on Tejun Heo
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2014-06-28  1:03 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe, Tejun Heo

cgroup is implementing support for subsystem dependency which would
require a way to enable a subsystem even when it's not directly
configured through "cgroup.subtree_control".

The previous patches added support for explicitly and implicitly
enabled subsystems and showing/hiding their interface files.  An
explicitly enabled subsystem may become implicitly enabled if it's
turned off through "cgroup.subtree_control" but there are subsystems
depending on it.  In such cases, the subsystem, as it's turned off
when seen from userland, shouldn't enforce any resource control.
Also, the subsystem may be explicitly turned on later again and its
interface files should be as close to the intial state as possible.

This patch adds cgroup_subsys->css_reset() which is invoked when a css
is hidden.  The callback should disable resource control and reset the
state to the vanilla state.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/cgroups/cgroups.txt | 14 ++++++++++++++
 include/linux/cgroup.h            |  1 +
 kernel/cgroup.c                   | 16 ++++++++++++----
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 821de56..10c949b 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -599,6 +599,20 @@ fork. If this method returns 0 (success) then this should remain valid
 while the caller holds cgroup_mutex and it is ensured that either
 attach() or cancel_attach() will be called in future.
 
+void css_reset(struct cgroup_subsys_state *css)
+(cgroup_mutex held by caller)
+
+An optional operation which should restore @css's configuration to the
+initial state.  This is currently only used on the unified hierarchy
+when a subsystem is disabled on a cgroup through
+"cgroup.subtree_control" but should remain enabled because other
+subsystems depend on it.  cgroup core makes such a css invisible by
+removing the associated interface files and invokes this callback so
+that the hidden subsystem can return to the initial neutral state.
+This prevents unexpected resource control from a hidden css and
+ensures that the configuration is in the initial state when it is made
+visible again later.
+
 void cancel_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 (cgroup_mutex held by caller)
 
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 5287f93..db99e3b 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -642,6 +642,7 @@ struct cgroup_subsys {
 	int (*css_online)(struct cgroup_subsys_state *css);
 	void (*css_offline)(struct cgroup_subsys_state *css);
 	void (*css_free)(struct cgroup_subsys_state *css);
+	void (*css_reset)(struct cgroup_subsys_state *css);
 
 	int (*can_attach)(struct cgroup_subsys_state *css,
 			  struct cgroup_taskset *tset);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 331fa296..3a6b77d 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2740,17 +2740,25 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 	/*
 	 * All tasks are migrated out of disabled csses.  Kill or hide
 	 * them.  A css is hidden when the userland requests it to be
-	 * disabled while other subsystems are still depending on it.
+	 * disabled while other subsystems are still depending on it.  The
+	 * css must not actively control resources and be in the vanilla
+	 * state if it's made visible again later.  Controllers which may
+	 * be depended upon should provide ->css_reset() for this purpose.
 	 */
 	for_each_subsys(ss, ssid) {
 		if (!(disable & (1 << ssid)))
 			continue;
 
 		cgroup_for_each_live_child(child, cgrp) {
-			if (css_disable & (1 << ssid))
-				kill_css(cgroup_css(child, ss));
-			else
+			struct cgroup_subsys_state *css = cgroup_css(child, ss);
+
+			if (css_disable & (1 << ssid)) {
+				kill_css(css);
+			} else {
 				cgroup_clear_dir(child, 1 << ssid);
+				if (ss->css_reset)
+					ss->css_reset(css);
+			}
 		}
 	}
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/6] cgroup: implement cgroup_subsys->depends_on
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
                   ` (3 preceding siblings ...)
  2014-06-28  1:03 ` [PATCH 4/6] cgroup: implement cgroup_subsys->css_reset() Tejun Heo
@ 2014-06-28  1:03 ` Tejun Heo
  2014-06-28  1:03 ` [PATCH 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy Tejun Heo
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2014-06-28  1:03 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe, Tejun Heo

Currently, the blkio subsystem attributes all of writeback IOs to the
root.  One of the issues is that there's no way to tell who originated
a writeback IO from block layer.  Those IOs are usually issued
asynchronously from a task which didn't have anything to do with
actually generating the dirty pages.  The memory subsystem, when
enabled, already keeps track of the ownership of each dirty page and
it's desirable for blkio to piggyback instead of adding its own
per-page tag.

blkio piggybacking on memory is an implementation detail which
preferably should be handled automatically without requiring explicit
userland action.  To achieve that, this patch implements
cgroup_subsys->depends_on which contains the mask of subsystems which
should be enabled together when the subsystem is enabled.

The previous patches already implemented the support for enabled but
invisible subsystems and cgroup_subsys->depends_on can be easily
implemented by updating cgroup_refresh_child_subsys_mask() so that it
calculates cgroup->child_subsys_mask considering
cgroup_subsys->depends_on of the explicitly enabled subsystems.

Documentation/cgroups/unified-hierarchy.txt is updated to explain that
subsystems may not become immediately available after being unused
from userland and that dependency could be a factor in it.  As
subsystems may already keep residual references, this doesn't
significantly change how subsystem rebinding can be used.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/cgroups/unified-hierarchy.txt | 23 ++++++++++++--
 include/linux/cgroup.h                      |  9 ++++++
 kernel/cgroup.c                             | 49 ++++++++++++++++++++++++++++-
 3 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/Documentation/cgroups/unified-hierarchy.txt b/Documentation/cgroups/unified-hierarchy.txt
index 324b182..a7a2205 100644
--- a/Documentation/cgroups/unified-hierarchy.txt
+++ b/Documentation/cgroups/unified-hierarchy.txt
@@ -97,9 +97,26 @@ change soon.
 All controllers which are not bound to other hierarchies are
 automatically bound to unified hierarchy and show up at the root of
 it.  Controllers which are enabled only in the root of unified
-hierarchy can be bound to other hierarchies at any time.  This allows
-mixing unified hierarchy with the traditional multiple hierarchies in
-a fully backward compatible way.
+hierarchy can be bound to other hierarchies.  This allows mixing
+unified hierarchy with the traditional multiple hierarchies in a fully
+backward compatible way.
+
+A controller can be moved across hierarchies only after the controller
+is no longer referenced in its current hierarchy.  Because per-cgroup
+controller states are destroyed asynchronously and controllers may
+have lingering references, a controller may not show up immediately on
+the unified hierarchy after the final umount of the previous
+hierarchy.  Similarly, a controller should be fully disabled to be
+moved out of the unified hierarchy and it may take some time for the
+disabled controller to become available for other hierarchies;
+furthermore, due to dependencies among controllers, other controllers
+may need to be disabled too.
+
+While useful for development and manual configurations, dynamically
+moving controllers between the unified and other hierarchies is
+strongly discouraged for production use.  It is recommended to decide
+the hierarchies and controller associations before starting using the
+controllers.
 
 
 2-2. cgroup.subtree_control
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index db99e3b..28853e7 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -693,6 +693,15 @@ struct cgroup_subsys {
 
 	/* base cftypes, automatically registered with subsys itself */
 	struct cftype *base_cftypes;
+
+	/*
+	 * A subsystem may depend on other subsystems.  When such subsystem
+	 * is enabled on a cgroup, the depended-upon subsystems are enabled
+	 * together if available.  Subsystems enabled due to dependency are
+	 * not visible to userland until explicitly enabled.  The following
+	 * specifies the mask of subsystems that this one depends on.
+	 */
+	unsigned int depends_on;
 };
 
 #define SUBSYS(_x) extern struct cgroup_subsys _x ## _cgrp_subsys;
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3a6b77d..cd02e99 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1037,9 +1037,56 @@ static void cgroup_put(struct cgroup *cgrp)
 	css_put(&cgrp->self);
 }
 
+/**
+ * cgroup_refresh_child_subsys_mask - update child_subsys_mask
+ * @cgrp: the target cgroup
+ *
+ * On the default hierarchy, a subsystem may request other subsystems to be
+ * enabled together through its ->depends_on mask.  In such cases, more
+ * subsystems than specified in "cgroup.subtree_control" may be enabled.
+ *
+ * This function determines which subsystems need to be enabled given the
+ * current @cgrp->subtree_control and records it in
+ * @cgrp->child_subsys_mask.  The resulting mask is always a superset of
+ * @cgrp->subtree_control and follows the usual hierarchy rules.
+ */
 static void cgroup_refresh_child_subsys_mask(struct cgroup *cgrp)
 {
-	cgrp->child_subsys_mask = cgrp->subtree_control;
+	struct cgroup *parent = cgroup_parent(cgrp);
+	unsigned int cur_ss_mask = cgrp->subtree_control;
+	struct cgroup_subsys *ss;
+	int ssid;
+
+	lockdep_assert_held(&cgroup_mutex);
+
+	if (!cgroup_on_dfl(cgrp)) {
+		cgrp->child_subsys_mask = cur_ss_mask;
+		return;
+	}
+
+	while (true) {
+		unsigned int new_ss_mask = cur_ss_mask;
+
+		for_each_subsys(ss, ssid)
+			if (cur_ss_mask & (1 << ssid))
+				new_ss_mask |= ss->depends_on;
+
+		/*
+		 * Mask out subsystems which aren't available.  This can
+		 * happen only if some depended-upon subsystems were bound
+		 * to non-default hierarchies.
+		 */
+		if (parent)
+			new_ss_mask &= parent->child_subsys_mask;
+		else
+			new_ss_mask &= cgrp->root->subsys_mask;
+
+		if (new_ss_mask == cur_ss_mask)
+			break;
+		cur_ss_mask = new_ss_mask;
+	}
+
+	cgrp->child_subsys_mask = cur_ss_mask;
 }
 
 /**
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
                   ` (4 preceding siblings ...)
  2014-06-28  1:03 ` [PATCH 5/6] cgroup: implement cgroup_subsys->depends_on Tejun Heo
@ 2014-06-28  1:03 ` Tejun Heo
  2014-06-28 11:49   ` [PATCH v2 " Tejun Heo
  2014-07-04  6:29 ` [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Li Zefan
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2014-06-28  1:03 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe, Tejun Heo

Currently, the blkio subsystem attributes all of writeback IOs to the
root.  One of the issues is that there's no way to tell who originated
a writeback IO from block layer.  Those IOs are usually issued
asynchronously from a task which didn't have anything to do with
actually generating the dirty pages.  The memory subsystem, when
enabled, already keeps track of the ownership of each dirty page and
it's desirable for blkio to piggyback instead of adding its own
per-page tag.

cgroup now has a mechanism to express such dependency -
cgroup_subsys->depends_on.  This patch declares that blkcg depends on
memcg so that memcg is enabled automatically on the default hierarchy
when available.  Future changes will make blkcg map the memcg tag to
find out the cgroup to blame for writeback IOs.

As this means that a memcg may be made invisible, this patch also
implements css_reset() for memcg which resets its basic
configurations.  This implementation will probably need to be expanded
to cover other states which are used in the default hierarchy.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
---
 block/blk-cgroup.c |  7 +++++++
 mm/memcontrol.c    | 24 ++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 069bc20..c9f7547 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -925,6 +925,13 @@ struct cgroup_subsys blkio_cgrp_subsys = {
 	.css_free = blkcg_css_free,
 	.can_attach = blkcg_can_attach,
 	.base_cftypes = blkcg_files,
+
+	/*
+	 * This ensures that, if available, memcg is automatically enabled
+	 * together on the default hierarchy so that the owner cgroup can
+	 * be retrieved from writeback pages.
+	 */
+	.depends_on = 1 << memory_cgrp_id,
 };
 EXPORT_SYMBOL_GPL(blkio_cgrp_subsys);
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a2c7bcb..db536e9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6407,6 +6407,29 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 	__mem_cgroup_free(memcg);
 }
 
+/**
+ * mem_cgroup_css_reset - reset the states of a mem_cgroup
+ * @css: the target css
+ *
+ * Reset the states of the mem_cgroup associated with @css.  This is
+ * invoked when the userland requests disabling on the default hierarchy
+ * but the memcg is pinned through dependency.  The memcg should stop
+ * applying policies and should revert to the vanilla state as it may be
+ * made visible again.
+ *
+ * The current implementation only resets the essential configurations.
+ * This needs to be expanded to cover all the visible parts.
+ */
+static void mem_cgroup_css_reset(struct cgroup_subsys_state *css)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+
+	mem_cgroup_resize_limit(memcg, ULLONG_MAX);
+	mem_cgroup_resize_memsw_limit(memcg, ULLONG_MAX);
+	memcg_update_kmem_limit(memcg, ULLONG_MAX);
+	res_counter_set_soft_limit(&memcg->res, ULLONG_MAX);
+}
+
 #ifdef CONFIG_MMU
 /* Handlers for move charge at task migration. */
 #define PRECHARGE_COUNT_AT_ONCE	256
@@ -7019,6 +7042,7 @@ struct cgroup_subsys memory_cgrp_subsys = {
 	.css_online = mem_cgroup_css_online,
 	.css_offline = mem_cgroup_css_offline,
 	.css_free = mem_cgroup_css_free,
+	.css_reset = mem_cgroup_css_reset,
 	.can_attach = mem_cgroup_can_attach,
 	.cancel_attach = mem_cgroup_cancel_attach,
 	.attach = mem_cgroup_move_task,
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy
  2014-06-28  1:03 ` [PATCH 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy Tejun Heo
@ 2014-06-28 11:49   ` Tejun Heo
  2014-07-08 19:42     ` Vivek Goyal
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2014-06-28 11:49 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe

>From 8e67ad03ab03839456816e922c57a7ab3bcf5474 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Sat, 28 Jun 2014 07:47:57 -0400

Currently, the blkio subsystem attributes all of writeback IOs to the
root.  One of the issues is that there's no way to tell who originated
a writeback IO from block layer.  Those IOs are usually issued
asynchronously from a task which didn't have anything to do with
actually generating the dirty pages.  The memory subsystem, when
enabled, already keeps track of the ownership of each dirty page and
it's desirable for blkio to piggyback instead of adding its own
per-page tag.

cgroup now has a mechanism to express such dependency -
cgroup_subsys->depends_on.  This patch declares that blkcg depends on
memcg so that memcg is enabled automatically on the default hierarchy
when available.  Future changes will make blkcg map the memcg tag to
find out the cgroup to blame for writeback IOs.

As this means that a memcg may be made invisible, this patch also
implements css_reset() for memcg which resets its basic
configurations.  This implementation will probably need to be expanded
to cover other states which are used in the default hierarchy.

v2: blkcg's dependency on memcg is wrapped with CONFIG_MEMCG to avoid
    build failure.  Reported by kbuild test robot.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
---
 block/blk-cgroup.c |  8 ++++++++
 mm/memcontrol.c    | 24 ++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 069bc20..63c3cd4 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -925,6 +925,14 @@ struct cgroup_subsys blkio_cgrp_subsys = {
 	.css_free = blkcg_css_free,
 	.can_attach = blkcg_can_attach,
 	.base_cftypes = blkcg_files,
+#ifdef CONFIG_MEMCG
+	/*
+	 * This ensures that, if available, memcg is automatically enabled
+	 * together on the default hierarchy so that the owner cgroup can
+	 * be retrieved from writeback pages.
+	 */
+	.depends_on = 1 << memory_cgrp_id,
+#endif
 };
 EXPORT_SYMBOL_GPL(blkio_cgrp_subsys);
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a2c7bcb..db536e9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6407,6 +6407,29 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 	__mem_cgroup_free(memcg);
 }
 
+/**
+ * mem_cgroup_css_reset - reset the states of a mem_cgroup
+ * @css: the target css
+ *
+ * Reset the states of the mem_cgroup associated with @css.  This is
+ * invoked when the userland requests disabling on the default hierarchy
+ * but the memcg is pinned through dependency.  The memcg should stop
+ * applying policies and should revert to the vanilla state as it may be
+ * made visible again.
+ *
+ * The current implementation only resets the essential configurations.
+ * This needs to be expanded to cover all the visible parts.
+ */
+static void mem_cgroup_css_reset(struct cgroup_subsys_state *css)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+
+	mem_cgroup_resize_limit(memcg, ULLONG_MAX);
+	mem_cgroup_resize_memsw_limit(memcg, ULLONG_MAX);
+	memcg_update_kmem_limit(memcg, ULLONG_MAX);
+	res_counter_set_soft_limit(&memcg->res, ULLONG_MAX);
+}
+
 #ifdef CONFIG_MMU
 /* Handlers for move charge at task migration. */
 #define PRECHARGE_COUNT_AT_ONCE	256
@@ -7019,6 +7042,7 @@ struct cgroup_subsys memory_cgrp_subsys = {
 	.css_online = mem_cgroup_css_online,
 	.css_offline = mem_cgroup_css_offline,
 	.css_free = mem_cgroup_css_free,
+	.css_reset = mem_cgroup_css_reset,
 	.can_attach = mem_cgroup_can_attach,
 	.cancel_attach = mem_cgroup_cancel_attach,
 	.attach = mem_cgroup_move_task,
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
                   ` (5 preceding siblings ...)
  2014-06-28  1:03 ` [PATCH 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy Tejun Heo
@ 2014-07-04  6:29 ` Li Zefan
  2014-07-07 18:33 ` Johannes Weiner
  2014-07-08 22:03 ` Tejun Heo
  8 siblings, 0 replies; 14+ messages in thread
From: Li Zefan @ 2014-07-04  6:29 UTC (permalink / raw)
  To: Tejun Heo; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe

Hi Tejun,

On 2014/6/28 9:03, Tejun Heo wrote:
> Hello, guys.
> 
> Currently, the blkio subsystem attributes all of writeback IOs to the
> root.  One of the issues is that there's no way to tell who originated
> a writeback IO from block layer.  Those IOs are usually issued
> asynchronously from a task which didn't have anything to do with
> actually generating the dirty pages.  The memory subsystem, when
> enabled, already keeps track of the ownership of each dirty page and
> it's desirable for blkio to piggyback instead of adding its own
> per-page tag.

It's great to see this being worked on!

> 
> This can be achieved on the unified hierarchy without too much
> difficulty.  This patchset implements a dependency mechanism in the
> cgroup such that a subsystem can depends on other subsystems.  If
> available, the depended-upon subsystems are enabled together
> implicitly when the subsystem is turned on.  Implicitly enabled
> subsystems are invisible and the dependencies are transparent to
> userland.
> 
> This patchset implements the dependency mechanism in cgroup core and
> make blkcg depend on memcg.  This doesn't actually solve the writeback
> problem yet but is an important step.
> 
> This patchset contains the following six patches.
> 
>  0001-cgroup-reorganize-cgroup_subtree_control_write.patch
>  0002-cgroup-introduce-cgroup-subtree_control.patch
>  0003-cgroup-make-interface-files-visible-iff-enabled-on-c.patch
>  0004-cgroup-implement-cgroup_subsys-css_reset.patch
>  0005-cgroup-implement-cgroup_subsys-depends_on.patch
>  0006-blkcg-memcg-make-blkcg-depend-on-memcg-on-the-defaul.patch
> 
> 0001-0005 gradually implement the dependency mechanism.
> 
> 0006 makes blkcg depend on memcg.
> 
> This patchset is on top of a497c3ba1d97 ("Linux 3.16-rc2") and
> available in the following git branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cgroup-dependency
> 
> diffstat follows.  Thanks.
> 
>  Documentation/cgroups/cgroups.txt           |   14 +
>  Documentation/cgroups/unified-hierarchy.txt |   23 ++-
>  block/blk-cgroup.c                          |    7
>  include/linux/cgroup.h                      |   20 ++
>  kernel/cgroup.c                             |  201 ++++++++++++++++++++++------
>  mm/memcontrol.c                             |   24 +++
>  6 files changed, 243 insertions(+), 46 deletions(-)
> 

Acked-by: Li Zefan <lizefan@huawei.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
                   ` (6 preceding siblings ...)
  2014-07-04  6:29 ` [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Li Zefan
@ 2014-07-07 18:33 ` Johannes Weiner
  2014-07-08 22:03 ` Tejun Heo
  8 siblings, 0 replies; 14+ messages in thread
From: Johannes Weiner @ 2014-07-07 18:33 UTC (permalink / raw)
  To: Tejun Heo; +Cc: lizefan, cgroups, linux-kernel, mhocko, vgoyal, axboe

On Fri, Jun 27, 2014 at 09:03:06PM -0400, Tejun Heo wrote:
> Hello, guys.
> 
> Currently, the blkio subsystem attributes all of writeback IOs to the
> root.  One of the issues is that there's no way to tell who originated
> a writeback IO from block layer.  Those IOs are usually issued
> asynchronously from a task which didn't have anything to do with
> actually generating the dirty pages.  The memory subsystem, when
> enabled, already keeps track of the ownership of each dirty page and
> it's desirable for blkio to piggyback instead of adding its own
> per-page tag.
> 
> This can be achieved on the unified hierarchy without too much
> difficulty.  This patchset implements a dependency mechanism in the
> cgroup such that a subsystem can depends on other subsystems.  If
> available, the depended-upon subsystems are enabled together
> implicitly when the subsystem is turned on.  Implicitly enabled
> subsystems are invisible and the dependencies are transparent to
> userland.
>
> This patchset implements the dependency mechanism in cgroup core and
> make blkcg depend on memcg.  This doesn't actually solve the writeback
> problem yet but is an important step.
> 
> This patchset contains the following six patches.
> 
>  0001-cgroup-reorganize-cgroup_subtree_control_write.patch
>  0002-cgroup-introduce-cgroup-subtree_control.patch
>  0003-cgroup-make-interface-files-visible-iff-enabled-on-c.patch
>  0004-cgroup-implement-cgroup_subsys-css_reset.patch
>  0005-cgroup-implement-cgroup_subsys-depends_on.patch
>  0006-blkcg-memcg-make-blkcg-depend-on-memcg-on-the-defaul.patch

These look good to me.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

We can update memcg's ->css_reset() as we re-populate the interface in
the default hierarchy.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy
  2014-06-28 11:49   ` [PATCH v2 " Tejun Heo
@ 2014-07-08 19:42     ` Vivek Goyal
  2014-07-08 21:53       ` Tejun Heo
  0 siblings, 1 reply; 14+ messages in thread
From: Vivek Goyal @ 2014-07-08 19:42 UTC (permalink / raw)
  To: Tejun Heo; +Cc: lizefan, cgroups, linux-kernel, hannes, mhocko, axboe

On Sat, Jun 28, 2014 at 07:49:07AM -0400, Tejun Heo wrote:
> >From 8e67ad03ab03839456816e922c57a7ab3bcf5474 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@kernel.org>
> Date: Sat, 28 Jun 2014 07:47:57 -0400
> 
> Currently, the blkio subsystem attributes all of writeback IOs to the
> root.  One of the issues is that there's no way to tell who originated
> a writeback IO from block layer.  Those IOs are usually issued
> asynchronously from a task which didn't have anything to do with
> actually generating the dirty pages.  The memory subsystem, when
> enabled, already keeps track of the ownership of each dirty page and
> it's desirable for blkio to piggyback instead of adding its own
> per-page tag.
> 
> cgroup now has a mechanism to express such dependency -
> cgroup_subsys->depends_on.  This patch declares that blkcg depends on
> memcg so that memcg is enabled automatically on the default hierarchy
> when available.  Future changes will make blkcg map the memcg tag to
> find out the cgroup to blame for writeback IOs.
> 
> As this means that a memcg may be made invisible, this patch also
> implements css_reset() for memcg which resets its basic
> configurations.  This implementation will probably need to be expanded
> to cover other states which are used in the default hierarchy.
> 
> v2: blkcg's dependency on memcg is wrapped with CONFIG_MEMCG to avoid
>     build failure.  Reported by kbuild test robot.
> 

Hi Tejun,

I have couple questions about new semantics. Following is my
understanding. Is it right?

- So after this change one can not use blkio controller on unified
  hiearchy if memory controller is mounted on some other hierarchy
  and is not available for mounting unified hiearchy.

- If blkio controller is enabled on unified hiearchy (memory controller
  implicitly enabled), then one can't mount memory controller on other
  hierarchies without first disabling blkio controller on unified hiearchy.

I think this change makes sense. memory controller already knows which
cgroup dirtied the page and making use of same data by forcing memory
and blkio to be mounted together should work.

There are some cases where people don't want to use blkio because of
overhead, and in that case, they can simply not enable blkio controller
on children (which memory controller can still be enabled on children).

Thanks
Vivek

> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> ---
>  block/blk-cgroup.c |  8 ++++++++
>  mm/memcontrol.c    | 24 ++++++++++++++++++++++++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 069bc20..63c3cd4 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -925,6 +925,14 @@ struct cgroup_subsys blkio_cgrp_subsys = {
>  	.css_free = blkcg_css_free,
>  	.can_attach = blkcg_can_attach,
>  	.base_cftypes = blkcg_files,
> +#ifdef CONFIG_MEMCG
> +	/*
> +	 * This ensures that, if available, memcg is automatically enabled
> +	 * together on the default hierarchy so that the owner cgroup can
> +	 * be retrieved from writeback pages.
> +	 */
> +	.depends_on = 1 << memory_cgrp_id,
> +#endif
>  };
>  EXPORT_SYMBOL_GPL(blkio_cgrp_subsys);
>  
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a2c7bcb..db536e9 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6407,6 +6407,29 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
>  	__mem_cgroup_free(memcg);
>  }
>  
> +/**
> + * mem_cgroup_css_reset - reset the states of a mem_cgroup
> + * @css: the target css
> + *
> + * Reset the states of the mem_cgroup associated with @css.  This is
> + * invoked when the userland requests disabling on the default hierarchy
> + * but the memcg is pinned through dependency.  The memcg should stop
> + * applying policies and should revert to the vanilla state as it may be
> + * made visible again.
> + *
> + * The current implementation only resets the essential configurations.
> + * This needs to be expanded to cover all the visible parts.
> + */
> +static void mem_cgroup_css_reset(struct cgroup_subsys_state *css)
> +{
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +
> +	mem_cgroup_resize_limit(memcg, ULLONG_MAX);
> +	mem_cgroup_resize_memsw_limit(memcg, ULLONG_MAX);
> +	memcg_update_kmem_limit(memcg, ULLONG_MAX);
> +	res_counter_set_soft_limit(&memcg->res, ULLONG_MAX);
> +}
> +
>  #ifdef CONFIG_MMU
>  /* Handlers for move charge at task migration. */
>  #define PRECHARGE_COUNT_AT_ONCE	256
> @@ -7019,6 +7042,7 @@ struct cgroup_subsys memory_cgrp_subsys = {
>  	.css_online = mem_cgroup_css_online,
>  	.css_offline = mem_cgroup_css_offline,
>  	.css_free = mem_cgroup_css_free,
> +	.css_reset = mem_cgroup_css_reset,
>  	.can_attach = mem_cgroup_can_attach,
>  	.cancel_attach = mem_cgroup_cancel_attach,
>  	.attach = mem_cgroup_move_task,
> -- 
> 1.9.3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy
  2014-07-08 19:42     ` Vivek Goyal
@ 2014-07-08 21:53       ` Tejun Heo
  2014-07-09 11:57         ` Vivek Goyal
  0 siblings, 1 reply; 14+ messages in thread
From: Tejun Heo @ 2014-07-08 21:53 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: lizefan, cgroups, linux-kernel, hannes, mhocko, axboe

Hello, Vivek.

On Tue, Jul 08, 2014 at 03:42:26PM -0400, Vivek Goyal wrote:
> I have couple questions about new semantics. Following is my
> understanding. Is it right?
> 
> - So after this change one can not use blkio controller on unified
>   hiearchy if memory controller is mounted on some other hierarchy
>   and is not available for mounting unified hiearchy.

Hmmm?  No, the only behavior which changes is when both blkcg and
memcg are mounted on the unified hierarchy.  Nothing else changes.
The dependency behavior kicks in iff memcg is available on the unified
hierarchy.

> - If blkio controller is enabled on unified hiearchy (memory controller
>   implicitly enabled), then one can't mount memory controller on other
>   hierarchies without first disabling blkio controller on unified hiearchy.

Yes, blkio needs to be disabled to the root for memcg to be able to
become free.  This is an extra restriction but I don't think it's
anything drastic.  Once a controller starts being actively used on any
hierarchy, nothing has been guaranteed about when the controller would
become free again even if the whole hierarchy is reduced to the root.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy
  2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
                   ` (7 preceding siblings ...)
  2014-07-07 18:33 ` Johannes Weiner
@ 2014-07-08 22:03 ` Tejun Heo
  8 siblings, 0 replies; 14+ messages in thread
From: Tejun Heo @ 2014-07-08 22:03 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, hannes, mhocko, vgoyal, axboe

Applied to cgroup/for-3.17.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy
  2014-07-08 21:53       ` Tejun Heo
@ 2014-07-09 11:57         ` Vivek Goyal
  0 siblings, 0 replies; 14+ messages in thread
From: Vivek Goyal @ 2014-07-09 11:57 UTC (permalink / raw)
  To: Tejun Heo; +Cc: lizefan, cgroups, linux-kernel, hannes, mhocko, axboe

On Tue, Jul 08, 2014 at 05:53:51PM -0400, Tejun Heo wrote:
> Hello, Vivek.
> 
> On Tue, Jul 08, 2014 at 03:42:26PM -0400, Vivek Goyal wrote:
> > I have couple questions about new semantics. Following is my
> > understanding. Is it right?
> > 
> > - So after this change one can not use blkio controller on unified
> >   hiearchy if memory controller is mounted on some other hierarchy
> >   and is not available for mounting unified hiearchy.
> 
> Hmmm?  No, the only behavior which changes is when both blkcg and
> memcg are mounted on the unified hierarchy.  Nothing else changes.
> The dependency behavior kicks in iff memcg is available on the unified
> hierarchy.

Ok, good to know that dependency kicks in only if controlle being depended
on is available on the hierarchy.
> 
> > - If blkio controller is enabled on unified hiearchy (memory controller
> >   implicitly enabled), then one can't mount memory controller on other
> >   hierarchies without first disabling blkio controller on unified hiearchy.
> 
> Yes, blkio needs to be disabled to the root for memcg to be able to
> become free.  This is an extra restriction but I don't think it's
> anything drastic.  Once a controller starts being actively used on any
> hierarchy, nothing has been guaranteed about when the controller would
> become free again even if the whole hierarchy is reduced to the root.

Agreed. Thanks for the clarification.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-07-09 11:58 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-28  1:03 [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Tejun Heo
2014-06-28  1:03 ` [PATCH 1/6] cgroup: reorganize cgroup_subtree_control_write() Tejun Heo
2014-06-28  1:03 ` [PATCH 2/6] cgroup: introduce cgroup->subtree_control Tejun Heo
2014-06-28  1:03 ` [PATCH 3/6] cgroup: make interface files visible iff enabled on cgroup->subtree_control Tejun Heo
2014-06-28  1:03 ` [PATCH 4/6] cgroup: implement cgroup_subsys->css_reset() Tejun Heo
2014-06-28  1:03 ` [PATCH 5/6] cgroup: implement cgroup_subsys->depends_on Tejun Heo
2014-06-28  1:03 ` [PATCH 6/6] blkcg, memcg: make blkcg depend on memcg on the default hierarchy Tejun Heo
2014-06-28 11:49   ` [PATCH v2 " Tejun Heo
2014-07-08 19:42     ` Vivek Goyal
2014-07-08 21:53       ` Tejun Heo
2014-07-09 11:57         ` Vivek Goyal
2014-07-04  6:29 ` [PATCHSET cgroup/for-3.17] cgroup, blkcg, memcg: make blkcg depend on memcg on unified hierarchy Li Zefan
2014-07-07 18:33 ` Johannes Weiner
2014-07-08 22:03 ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).