linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET] cgroup: cftype based file interface, take #2
@ 2012-03-21 22:17 Tejun Heo
  2012-03-21 22:17 ` [PATCH 01/12] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
                   ` (15 more replies)
  0 siblings, 16 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups; +Cc: linux-kernel, fweisbec, rni, ctalbott

Hello,

This is the second take at improving cgroup file addition/removal
interface.  Changes from the last take[L] are,

* CGROUP_SUBSYS_CFTYPES[_COND]() dropped and
  cgroup_subsys->base_cftypes is added instead.  This works better for
  subsystems implemented as module.  tcp_memcontrol now uses an
  explicit __initcall() to register its cftypes.

* 0005-cgroup-relocate-cftype-and-cgroup_subsys-definitions.patch
  added as cleanup.

* 0007-memcg-always-create-memsw-files-if-CONFIG_CGROUP_MEM.patch
  added so that memsw files are always created if enabled in config as
  suggested by KAMEZAWA.

This patch replaces cgroup file interface with cftype based one which
allows dynamic additions and removals of cftype arrays whether the
target subsystem is attached or not.  This can be used to make subsys
rebinding via remount work properly but I intentionally avoided doing
that at the moment.

This makes cgroup population simpler for controllers and will be used
to allow controllers to be more dynamic.  e.g. blkio subsys has
sub-policies which may come and go while blkio subsys is attached and
it currently uses fixed set of files which stays blank if not in use.
This will also be useful for implementing unified hierarchy.

This patchset contains the following patches.

 0001-cgroup-move-cgroup_clear_directory-call-out-of-cgrou.patch
 0002-cgroup-build-list-of-all-cgroups-under-a-given-cgrou.patch
 0003-cgroup-implement-cgroup_add_cftypes-and-friends.patch
 0004-cgroup-merge-cft_release_agent-cftype-array-into-the.patch
 0005-cgroup-relocate-cftype-and-cgroup_subsys-definitions.patch
 0006-cgroup-convert-all-non-memcg-controllers-to-the-new-.patch
 0007-memcg-always-create-memsw-files-if-CONFIG_CGROUP_MEM.patch
 0008-cgroup-convert-memcg-controller-to-the-new-cftype-in.patch
 0009-cgroup-remove-cgroup_add_file-s.patch
 0010-cgroup-relocate-__d_cgrp-and-__d_cft.patch
 0011-cgroup-introduce-struct-cfent.patch
 0012-cgroup-implement-cgroup_rm_cftypes.patch

and is on top of

  cgroup/for-3.4 3ce3230a0cff484e5130153f244d4fb8a56b3a8b
+ [1] cgroup: deprecate remount option changes mount option

and is also available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git cgroup-cftype

Glauber, I think the cftypes registration for tcp_memcontrol is fine
as is in this patchset and we just need to move the init to the tail
of ->create().  Let's worry about memcg for module-loadable protocols
whey they are added.  Can you please send a patch to move init to
->create()?

If nobody objects && after Li comes back and acks the changes, I'll
route these through cgroup/for-3.5.

diffstat follows.

 block/blk-cgroup.c        |   45 ++---
 include/linux/cgroup.h    |   51 ++++--
 kernel/cgroup.c           |  359 +++++++++++++++++++++++++++++++++++-----------
 kernel/cgroup_freezer.c   |   11 -
 kernel/cpuset.c           |   31 +--
 kernel/sched/core.c       |   16 --
 mm/memcontrol.c           |   77 ++++-----
 net/core/netprio_cgroup.c |   30 +--
 net/ipv4/tcp_memcontrol.c |   71 ++++-----
 net/sched/cls_cgroup.c    |   31 +--
 security/device_cgroup.c  |   10 -
 11 files changed, 437 insertions(+), 295 deletions(-)

Thanks.

--
tejun

[L] http://thread.gmane.org/gmane.linux.kernel.containers/22623
[1] http://thread.gmane.org/gmane.linux.kernel.cgroups/1192/focus=22611

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 01/12] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir()
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-21 22:17 ` [PATCH 02/12] cgroup: build list of all cgroups under a given cgroupfs_root Tejun Heo
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

cgroup_populate_dir() currently clears all files and then repopulate
the directory; however, the clearing part is only useful when it's
called from cgroup_remount().  Relocate the invocation to
cgroup_remount().

This is to prepare for further cgroup file handling updates.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index deeab87..2e2f93b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1313,7 +1313,8 @@ static int cgroup_remount(struct super_block *sb, int *flags, char *data)
 		goto out_unlock;
 	}
 
-	/* (re)populate subsystem files */
+	/* clear out any existing files and repopulate subsystem files */
+	cgroup_clear_directory(cgrp->dentry);
 	cgroup_populate_dir(cgrp);
 
 	if (opts.release_agent)
@@ -3648,9 +3649,6 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 	int err;
 	struct cgroup_subsys *ss;
 
-	/* First clear out any existing files */
-	cgroup_clear_directory(cgrp->dentry);
-
 	err = cgroup_add_files(cgrp, NULL, files, ARRAY_SIZE(files));
 	if (err < 0)
 		return err;
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 02/12] cgroup: build list of all cgroups under a given cgroupfs_root
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
  2012-03-21 22:17 ` [PATCH 01/12] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-21 22:17 ` [PATCH 03/12] cgroup: implement cgroup_add_cftypes() and friends Tejun Heo
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Build a list of all cgroups anchored at cgroupfs_root->allcg_list and
going through cgroup->allcg_node.  The list is protected by
cgroup_mutex and will be used to improve cgroup file handling.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |    2 ++
 kernel/cgroup.c        |   10 ++++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 501adb1..908f26f 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -191,6 +191,8 @@ struct cgroup {
 	 */
 	struct list_head css_sets;
 
+	struct list_head allcg_node;	/* cgroupfs_root->allcg_list */
+
 	/*
 	 * Linked list running through all cgroups that can
 	 * potentially be reaped by the release agent. Protected by
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2e2f93b..fceee52 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -127,6 +127,9 @@ struct cgroupfs_root {
 	/* A list running through the active hierarchies */
 	struct list_head root_list;
 
+	/* All cgroups on this root, cgroup_mutex protected */
+	struct list_head allcg_list;
+
 	/* Hierarchy-specific flags */
 	unsigned long flags;
 
@@ -1350,11 +1353,14 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 static void init_cgroup_root(struct cgroupfs_root *root)
 {
 	struct cgroup *cgrp = &root->top_cgroup;
+
 	INIT_LIST_HEAD(&root->subsys_list);
 	INIT_LIST_HEAD(&root->root_list);
+	INIT_LIST_HEAD(&root->allcg_list);
 	root->number_of_cgroups = 1;
 	cgrp->root = root;
 	cgrp->top_cgroup = cgrp;
+	list_add_tail(&cgrp->allcg_node, &root->allcg_list);
 	init_cgroup_housekeeping(cgrp);
 }
 
@@ -3794,6 +3800,8 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 	/* The cgroup directory was pre-locked for us */
 	BUG_ON(!mutex_is_locked(&cgrp->dentry->d_inode->i_mutex));
 
+	list_add_tail(&cgrp->allcg_node, &root->allcg_list);
+
 	err = cgroup_populate_dir(cgrp);
 	/* If err < 0, we have a half-filled directory - oh well ;) */
 
@@ -4002,6 +4010,8 @@ again:
 	list_del_init(&cgrp->sibling);
 	cgroup_unlock_hierarchy(cgrp->root);
 
+	list_del_init(&cgrp->allcg_node);
+
 	d = dget(cgrp->dentry);
 
 	cgroup_d_remove_dir(d);
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 03/12] cgroup: implement cgroup_add_cftypes() and friends
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
  2012-03-21 22:17 ` [PATCH 01/12] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
  2012-03-21 22:17 ` [PATCH 02/12] cgroup: build list of all cgroups under a given cgroupfs_root Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-21 22:17 ` [PATCH 04/12] cgroup: merge cft_release_agent cftype array into the base files array Tejun Heo
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Currently, cgroup directories are populated by subsys->populate()
callback explicitly creating files on each cgroup creation.  This
level of flexibility isn't needed or desirable.  It provides largely
unused flexibility which call for abuses while severely limiting what
the core layer can do through the lack of structure and conventions.

Per each cgroup file type, the only distinction that cgroup users is
making is whether a cgroup is root or not, which can easily be
expressed with flags.

This patch introduces cgroup_add_cftypes().  These deal with cftypes
instead of individual files - controllers indicate that certain types
of files exist for certain subsystem.  Newly added CFTYPE_*_ON_ROOT
flags indicate whether a cftype should be excluded or created only on
the root cgroup.

cgroup_add_cftypes() can be called any time whether the target
subsystem is currently attached or not.  cgroup core will create files
on the existing cgroups as necessary.

Also, cgroup_subsys->base_cftypes is added to ease registration of the
base files for the subsystem.  If non-NULL on subsys init, the cftypes
pointed to by ->base_cftypes are automatically registered on subsys
init / load.

Further patches will convert the existing users and remove the file
based interface.  Note that this interface allows dynamic addition of
files to an active controller.  This will be used for sub-controller
modularity and unified hierarchy in the longer term.

This patch implements the new mechanism but doesn't apply it to any
user.

v2: replaced DECLARE_CGROUP_CFTYPES[_COND]() with
    cgroup_subsys->base_cftypes, which works better for cgroup_subsys
    which is loaded as module.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |   33 +++++++++++-
 kernel/cgroup.c        |  132 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 162 insertions(+), 3 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 908f26f..ff5b764 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -192,6 +192,7 @@ struct cgroup {
 	struct list_head css_sets;
 
 	struct list_head allcg_node;	/* cgroupfs_root->allcg_list */
+	struct list_head cft_q_node;	/* used during cftype add/rm */
 
 	/*
 	 * Linked list running through all cgroups that can
@@ -277,11 +278,17 @@ struct cgroup_map_cb {
  *	- the 'cftype' of the file is file->f_dentry->d_fsdata
  */
 
-#define MAX_CFTYPE_NAME 64
+/* cftype->flags */
+#define CFTYPE_ONLY_ON_ROOT	(1U << 0)	/* only create on root cg */
+#define CFTYPE_NOT_ON_ROOT	(1U << 1)	/* don't create onp root cg */
+
+#define MAX_CFTYPE_NAME		64
+
 struct cftype {
 	/*
 	 * By convention, the name should begin with the name of the
-	 * subsystem, followed by a period
+	 * subsystem, followed by a period.  Zero length string indicates
+	 * end of cftype array.
 	 */
 	char name[MAX_CFTYPE_NAME];
 	int private;
@@ -297,6 +304,9 @@ struct cftype {
 	 */
 	size_t max_write_len;
 
+	/* CFTYPE_* flags */
+	unsigned int flags;
+
 	int (*open)(struct inode *inode, struct file *file);
 	ssize_t (*read)(struct cgroup *cgrp, struct cftype *cft,
 			struct file *file,
@@ -375,6 +385,16 @@ struct cftype {
 			struct eventfd_ctx *eventfd);
 };
 
+/*
+ * cftype_sets describe cftypes belonging to a subsystem and are chained at
+ * cgroup_subsys->cftsets.  Each cftset points to an array of cftypes
+ * terminated by zero length name.
+ */
+struct cftype_set {
+	struct list_head		node;	/* chained at subsys->cftsets */
+	const struct cftype		*cfts;
+};
+
 struct cgroup_scanner {
 	struct cgroup *cg;
 	int (*test_task)(struct task_struct *p, struct cgroup_scanner *scan);
@@ -400,6 +420,8 @@ int cgroup_add_files(struct cgroup *cgrp,
 			const struct cftype cft[],
 			int count);
 
+int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts);
+
 int cgroup_is_removed(const struct cgroup *cgrp);
 
 int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen);
@@ -502,6 +524,13 @@ struct cgroup_subsys {
 	struct idr idr;
 	rwlock_t id_lock;
 
+	/* list of cftype_sets */
+	struct list_head cftsets;
+
+	/* base cftypes, automatically [de]registered with subsys itself */
+	struct cftype *base_cftypes;
+	struct cftype_set base_cftset;
+
 	/* should be defined only by modular subsystems */
 	struct module *module;
 };
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index fceee52..fe7a5c2 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2627,8 +2627,14 @@ int cgroup_add_file(struct cgroup *cgrp,
 	struct dentry *dentry;
 	int error;
 	umode_t mode;
-
 	char name[MAX_CGROUP_TYPE_NAMELEN + MAX_CFTYPE_NAME + 2] = { 0 };
+
+	/* does @cft->flags tell us to skip creation on @cgrp? */
+	if ((cft->flags & CFTYPE_NOT_ON_ROOT) && !cgrp->parent)
+		return 0;
+	if ((cft->flags & CFTYPE_ONLY_ON_ROOT) && cgrp->parent)
+		return 0;
+
 	if (subsys && !test_bit(ROOT_NOPREFIX, &cgrp->root->flags)) {
 		strcpy(name, subsys->name);
 		strcat(name, ".");
@@ -2664,6 +2670,95 @@ int cgroup_add_files(struct cgroup *cgrp,
 }
 EXPORT_SYMBOL_GPL(cgroup_add_files);
 
+static DEFINE_MUTEX(cgroup_cft_mutex);
+
+static void cgroup_cfts_prepare(void)
+	__acquires(&cgroup_cft_mutex) __acquires(&cgroup_mutex)
+{
+	/*
+	 * Thanks to the entanglement with vfs inode locking, we can't walk
+	 * the existing cgroups under cgroup_mutex and create files.
+	 * Instead, we increment reference on all cgroups and build list of
+	 * them using @cgrp->cft_q_node.  Grab cgroup_cft_mutex to ensure
+	 * exclusive access to the field.
+	 */
+	mutex_lock(&cgroup_cft_mutex);
+	mutex_lock(&cgroup_mutex);
+}
+
+static void cgroup_cfts_commit(struct cgroup_subsys *ss,
+			       const struct cftype *cfts)
+	__releases(&cgroup_mutex) __releases(&cgroup_cft_mutex)
+{
+	LIST_HEAD(pending);
+	struct cgroup *cgrp, *n;
+	int count = 0;
+
+	while (cfts[count].name[0] != '\0')
+		count++;
+
+	/* %NULL @cfts indicates abort and don't bother if @ss isn't attached */
+	if (cfts && ss->root != &rootnode) {
+		list_for_each_entry(cgrp, &ss->root->allcg_list, allcg_node) {
+			dget(cgrp->dentry);
+			list_add_tail(&cgrp->cft_q_node, &pending);
+		}
+	}
+
+	mutex_unlock(&cgroup_mutex);
+
+	/*
+	 * All new cgroups will see @cfts update on @ss->cftsets.  Add/rm
+	 * files for all cgroups which were created before.
+	 */
+	list_for_each_entry_safe(cgrp, n, &pending, cft_q_node) {
+		struct inode *inode = cgrp->dentry->d_inode;
+
+		mutex_lock(&inode->i_mutex);
+		mutex_lock(&cgroup_mutex);
+		if (!cgroup_is_removed(cgrp))
+			cgroup_add_files(cgrp, ss, cfts, count);
+		mutex_unlock(&cgroup_mutex);
+		mutex_unlock(&inode->i_mutex);
+
+		list_del_init(&cgrp->cft_q_node);
+		dput(cgrp->dentry);
+	}
+
+	mutex_unlock(&cgroup_cft_mutex);
+}
+
+/**
+ * cgroup_add_cftypes - add an array of cftypes to a subsystem
+ * @ss: target cgroup subsystem
+ * @cfts: zero-length name terminated array of cftypes
+ *
+ * Register @cfts to @ss.  Files described by @cfts are created for all
+ * existing cgroups to which @ss is attached and all future cgroups will
+ * have them too.  This function can be called anytime whether @ss is
+ * attached or not.
+ *
+ * Returns 0 on successful registration, -errno on failure.  Note that this
+ * function currently returns 0 as long as @cfts registration is successful
+ * even if some file creation attempts on existing cgroups fail.
+ */
+int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts)
+{
+	struct cftype_set *set;
+
+	set = kzalloc(sizeof(*set), GFP_KERNEL);
+	if (!set)
+		return -ENOMEM;
+
+	cgroup_cfts_prepare();
+	set->cfts = cfts;
+	list_add_tail(&set->node, &ss->cftsets);
+	cgroup_cfts_commit(ss, cfts);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cgroup_add_cftypes);
+
 /**
  * cgroup_task_count - count the number of tasks in a cgroup.
  * @cgrp: the cgroup in question
@@ -3664,10 +3759,25 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 			return err;
 	}
 
+	/* process cftsets of each subsystem */
 	for_each_subsys(cgrp->root, ss) {
+		struct cftype_set *set;
+
 		if (ss->populate && (err = ss->populate(ss, cgrp)) < 0)
 			return err;
+
+		list_for_each_entry(set, &ss->cftsets, node) {
+			const struct cftype *cft;
+
+			for (cft = set->cfts; cft->name[0] != '\0'; cft++) {
+				err = cgroup_add_file(cgrp, ss, cft);
+				if (err)
+					pr_warning("cgroup_populate_dir: failed to create %s, err=%d\n",
+						   cft->name, err);
+			}
+		}
 	}
+
 	/* This cgroup is ready now */
 	for_each_subsys(cgrp->root, ss) {
 		struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
@@ -4038,12 +4148,29 @@ again:
 	return 0;
 }
 
+static void __init_or_module cgroup_init_cftsets(struct cgroup_subsys *ss)
+{
+	INIT_LIST_HEAD(&ss->cftsets);
+
+	/*
+	 * base_cftset is embedded in subsys itself, no need to worry about
+	 * deregistration.
+	 */
+	if (ss->base_cftypes) {
+		ss->base_cftset.cfts = ss->base_cftypes;
+		list_add_tail(&ss->base_cftset.node, &ss->cftsets);
+	}
+}
+
 static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
 {
 	struct cgroup_subsys_state *css;
 
 	printk(KERN_INFO "Initializing cgroup subsys %s\n", ss->name);
 
+	/* init base cftset */
+	cgroup_init_cftsets(ss);
+
 	/* Create the top cgroup state for this subsystem */
 	list_add(&ss->sibling, &rootnode.subsys_list);
 	ss->root = &rootnode;
@@ -4113,6 +4240,9 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
 		return 0;
 	}
 
+	/* init base cftset */
+	cgroup_init_cftsets(ss);
+
 	/*
 	 * need to register a subsys id before anything else - for example,
 	 * init_cgroup_css needs it.
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 04/12] cgroup: merge cft_release_agent cftype array into the base files array
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (2 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 03/12] cgroup: implement cgroup_add_cftypes() and friends Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-21 22:17 ` [PATCH 05/12] cgroup: relocate cftype and cgroup_subsys definitions in controllers Tejun Heo
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Now that cftype can express whether a file should only be on root,
cft_release_agent can be merged into the base files cftypes array.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c |   19 +++++++------------
 1 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index fe7a5c2..45657d1 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3736,13 +3736,13 @@ static struct cftype files[] = {
 		.read_u64 = cgroup_clone_children_read,
 		.write_u64 = cgroup_clone_children_write,
 	},
-};
-
-static struct cftype cft_release_agent = {
-	.name = "release_agent",
-	.read_seq_string = cgroup_release_agent_show,
-	.write_string = cgroup_release_agent_write,
-	.max_write_len = PATH_MAX,
+	{
+		.name = "release_agent",
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.read_seq_string = cgroup_release_agent_show,
+		.write_string = cgroup_release_agent_write,
+		.max_write_len = PATH_MAX,
+	},
 };
 
 static int cgroup_populate_dir(struct cgroup *cgrp)
@@ -3754,11 +3754,6 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 	if (err < 0)
 		return err;
 
-	if (cgrp == cgrp->top_cgroup) {
-		if ((err = cgroup_add_file(cgrp, NULL, &cft_release_agent)) < 0)
-			return err;
-	}
-
 	/* process cftsets of each subsystem */
 	for_each_subsys(cgrp->root, ss) {
 		struct cftype_set *set;
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 05/12] cgroup: relocate cftype and cgroup_subsys definitions in controllers
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (3 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 04/12] cgroup: merge cft_release_agent cftype array into the base files array Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-21 22:17 ` [PATCH 06/12] cgroup: convert all non-memcg controllers to the new cftype interface Tejun Heo
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

blk-cgroup, netprio_cgroup, cls_cgroup and tcp_memcontrol
unnecessarily define cftype array and cgroup_subsys structures at the
top of the file, which is unconventional and necessiates forward
declaration of methods.

This patch relocates those below the definitions of the methods and
removes the forward declarations.  Note that forward declaration of
tcp_files[] is added in tcp_memcontrol.c for tcp_init_cgroup().  This
will be removed soon by another patch.

This patch doesn't introduce any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 block/blk-cgroup.c        |   38 ++++++++++++-----------------
 net/core/netprio_cgroup.c |   26 ++++++++------------
 net/ipv4/tcp_memcontrol.c |   57 +++++++++++++++++++++-----------------------
 net/sched/cls_cgroup.c    |   27 ++++++++------------
 4 files changed, 65 insertions(+), 83 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 1359d63..c0cbe8b 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -28,34 +28,12 @@ static LIST_HEAD(blkio_list);
 struct blkio_cgroup blkio_root_cgroup = { .weight = 2*BLKIO_WEIGHT_DEFAULT };
 EXPORT_SYMBOL_GPL(blkio_root_cgroup);
 
-static struct cgroup_subsys_state *blkiocg_create(struct cgroup *);
-static int blkiocg_can_attach(struct cgroup *, struct cgroup_taskset *);
-static void blkiocg_attach(struct cgroup *, struct cgroup_taskset *);
-static void blkiocg_destroy(struct cgroup *);
-static int blkiocg_populate(struct cgroup_subsys *, struct cgroup *);
-
 /* for encoding cft->private value on file */
 #define BLKIOFILE_PRIVATE(x, val)	(((x) << 16) | (val))
 /* What policy owns the file, proportional or throttle */
 #define BLKIOFILE_POLICY(val)		(((val) >> 16) & 0xffff)
 #define BLKIOFILE_ATTR(val)		((val) & 0xffff)
 
-struct cgroup_subsys blkio_subsys = {
-	.name = "blkio",
-	.create = blkiocg_create,
-	.can_attach = blkiocg_can_attach,
-	.attach = blkiocg_attach,
-	.destroy = blkiocg_destroy,
-	.populate = blkiocg_populate,
-#ifdef CONFIG_BLK_CGROUP
-	/* note: blkio_subsys_id is otherwise defined in blk-cgroup.h */
-	.subsys_id = blkio_subsys_id,
-#endif
-	.use_id = 1,
-	.module = THIS_MODULE,
-};
-EXPORT_SYMBOL_GPL(blkio_subsys);
-
 static inline void blkio_policy_insert_node(struct blkio_cgroup *blkcg,
 					    struct blkio_policy_node *pn)
 {
@@ -1658,6 +1636,22 @@ static void blkiocg_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 	}
 }
 
+struct cgroup_subsys blkio_subsys = {
+	.name = "blkio",
+	.create = blkiocg_create,
+	.can_attach = blkiocg_can_attach,
+	.attach = blkiocg_attach,
+	.destroy = blkiocg_destroy,
+	.populate = blkiocg_populate,
+#ifdef CONFIG_BLK_CGROUP
+	/* note: blkio_subsys_id is otherwise defined in blk-cgroup.h */
+	.subsys_id = blkio_subsys_id,
+#endif
+	.use_id = 1,
+	.module = THIS_MODULE,
+};
+EXPORT_SYMBOL_GPL(blkio_subsys);
+
 void blkio_policy_register(struct blkio_policy_type *blkiop)
 {
 	spin_lock(&blkio_list_lock);
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 22036ab..c802613 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -23,21 +23,6 @@
 #include <net/sock.h>
 #include <net/netprio_cgroup.h>
 
-static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp);
-static void cgrp_destroy(struct cgroup *cgrp);
-static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp);
-
-struct cgroup_subsys net_prio_subsys = {
-	.name		= "net_prio",
-	.create		= cgrp_create,
-	.destroy	= cgrp_destroy,
-	.populate	= cgrp_populate,
-#ifdef CONFIG_NETPRIO_CGROUP
-	.subsys_id	= net_prio_subsys_id,
-#endif
-	.module		= THIS_MODULE
-};
-
 #define PRIOIDX_SZ 128
 
 static unsigned long prioidx_map[PRIOIDX_SZ];
@@ -263,6 +248,17 @@ static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
 	return cgroup_add_files(cgrp, ss, ss_files, ARRAY_SIZE(ss_files));
 }
 
+struct cgroup_subsys net_prio_subsys = {
+	.name		= "net_prio",
+	.create		= cgrp_create,
+	.destroy	= cgrp_destroy,
+	.populate	= cgrp_populate,
+#ifdef CONFIG_NETPRIO_CGROUP
+	.subsys_id	= net_prio_subsys_id,
+#endif
+	.module		= THIS_MODULE
+};
+
 static int netprio_device_event(struct notifier_block *unused,
 				unsigned long event, void *ptr)
 {
diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
index e714c68..e0c5b4e 100644
--- a/net/ipv4/tcp_memcontrol.c
+++ b/net/ipv4/tcp_memcontrol.c
@@ -6,36 +6,7 @@
 #include <linux/memcontrol.h>
 #include <linux/module.h>
 
-static u64 tcp_cgroup_read(struct cgroup *cont, struct cftype *cft);
-static int tcp_cgroup_write(struct cgroup *cont, struct cftype *cft,
-			    const char *buffer);
-static int tcp_cgroup_reset(struct cgroup *cont, unsigned int event);
-
-static struct cftype tcp_files[] = {
-	{
-		.name = "kmem.tcp.limit_in_bytes",
-		.write_string = tcp_cgroup_write,
-		.read_u64 = tcp_cgroup_read,
-		.private = RES_LIMIT,
-	},
-	{
-		.name = "kmem.tcp.usage_in_bytes",
-		.read_u64 = tcp_cgroup_read,
-		.private = RES_USAGE,
-	},
-	{
-		.name = "kmem.tcp.failcnt",
-		.private = RES_FAILCNT,
-		.trigger = tcp_cgroup_reset,
-		.read_u64 = tcp_cgroup_read,
-	},
-	{
-		.name = "kmem.tcp.max_usage_in_bytes",
-		.private = RES_MAX_USAGE,
-		.trigger = tcp_cgroup_reset,
-		.read_u64 = tcp_cgroup_read,
-	},
-};
+static struct cftype tcp_files[4];	/* XXX: will be removed soon */
 
 static inline struct tcp_memcontrol *tcp_from_cgproto(struct cg_proto *cg_proto)
 {
@@ -270,3 +241,29 @@ void tcp_prot_mem(struct mem_cgroup *memcg, long val, int idx)
 
 	tcp->tcp_prot_mem[idx] = val;
 }
+
+static struct cftype tcp_files[] = {
+	{
+		.name = "kmem.tcp.limit_in_bytes",
+		.write_string = tcp_cgroup_write,
+		.read_u64 = tcp_cgroup_read,
+		.private = RES_LIMIT,
+	},
+	{
+		.name = "kmem.tcp.usage_in_bytes",
+		.read_u64 = tcp_cgroup_read,
+		.private = RES_USAGE,
+	},
+	{
+		.name = "kmem.tcp.failcnt",
+		.private = RES_FAILCNT,
+		.trigger = tcp_cgroup_reset,
+		.read_u64 = tcp_cgroup_read,
+	},
+	{
+		.name = "kmem.tcp.max_usage_in_bytes",
+		.private = RES_MAX_USAGE,
+		.trigger = tcp_cgroup_reset,
+		.read_u64 = tcp_cgroup_read,
+	},
+};
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 1afaa28..024df8a 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -22,22 +22,6 @@
 #include <net/sock.h>
 #include <net/cls_cgroup.h>
 
-static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp);
-static void cgrp_destroy(struct cgroup *cgrp);
-static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp);
-
-struct cgroup_subsys net_cls_subsys = {
-	.name		= "net_cls",
-	.create		= cgrp_create,
-	.destroy	= cgrp_destroy,
-	.populate	= cgrp_populate,
-#ifdef CONFIG_NET_CLS_CGROUP
-	.subsys_id	= net_cls_subsys_id,
-#endif
-	.module		= THIS_MODULE,
-};
-
-
 static inline struct cgroup_cls_state *cgrp_cls_state(struct cgroup *cgrp)
 {
 	return container_of(cgroup_subsys_state(cgrp, net_cls_subsys_id),
@@ -93,6 +77,17 @@ static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
 	return cgroup_add_files(cgrp, ss, ss_files, ARRAY_SIZE(ss_files));
 }
 
+struct cgroup_subsys net_cls_subsys = {
+	.name		= "net_cls",
+	.create		= cgrp_create,
+	.destroy	= cgrp_destroy,
+	.populate	= cgrp_populate,
+#ifdef CONFIG_NET_CLS_CGROUP
+	.subsys_id	= net_cls_subsys_id,
+#endif
+	.module		= THIS_MODULE,
+};
+
 struct cls_cgroup_head {
 	u32			handle;
 	struct tcf_exts		exts;
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 06/12] cgroup: convert all non-memcg controllers to the new cftype interface
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (4 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 05/12] cgroup: relocate cftype and cgroup_subsys definitions in controllers Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-21 22:17 ` [PATCH 07/12] memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP Tejun Heo
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo, Paul Menage,
	Ingo Molnar, Peter Zijlstra, David S. Miller, Vivek Goyal

Convert debug, freezer, cpuset, cpu_cgroup, cpuacct, net_prio, blkio,
net_cls and device controllers to use the new cftype based interface.
Termination entry is added to cftype arrays and populate callbacks are
replaced with cgroup_subsys->base_cftypes initializations.

This is functionally identical transformation.  There shouldn't be any
visible behavior change.

memcg is rather special and will be converted separately.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
 block/blk-cgroup.c        |    9 ++-------
 kernel/cgroup.c           |   10 +++-------
 kernel/cgroup_freezer.c   |   11 +++--------
 kernel/cpuset.c           |   31 ++++++++++---------------------
 kernel/sched/core.c       |   16 ++++------------
 net/core/netprio_cgroup.c |    8 ++------
 net/sched/cls_cgroup.c    |    8 ++------
 security/device_cgroup.c  |   10 ++--------
 8 files changed, 28 insertions(+), 75 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index c0cbe8b..a13e189 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1515,14 +1515,9 @@ struct cftype blkio_files[] = {
 		.read_map = blkiocg_file_read_map,
 	},
 #endif
+	{ }	/* terminate */
 };
 
-static int blkiocg_populate(struct cgroup_subsys *subsys, struct cgroup *cgroup)
-{
-	return cgroup_add_files(cgroup, subsys, blkio_files,
-				ARRAY_SIZE(blkio_files));
-}
-
 static void blkiocg_destroy(struct cgroup *cgroup)
 {
 	struct blkio_cgroup *blkcg = cgroup_to_blkio_cgroup(cgroup);
@@ -1642,11 +1637,11 @@ struct cgroup_subsys blkio_subsys = {
 	.can_attach = blkiocg_can_attach,
 	.attach = blkiocg_attach,
 	.destroy = blkiocg_destroy,
-	.populate = blkiocg_populate,
 #ifdef CONFIG_BLK_CGROUP
 	/* note: blkio_subsys_id is otherwise defined in blk-cgroup.h */
 	.subsys_id = blkio_subsys_id,
 #endif
+	.base_cftypes = blkio_files,
 	.use_id = 1,
 	.module = THIS_MODULE,
 };
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 45657d1..d6fe34a 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5354,19 +5354,15 @@ static struct cftype debug_files[] =  {
 		.name = "releasable",
 		.read_u64 = releasable_read,
 	},
-};
 
-static int debug_populate(struct cgroup_subsys *ss, struct cgroup *cont)
-{
-	return cgroup_add_files(cont, ss, debug_files,
-				ARRAY_SIZE(debug_files));
-}
+	{ }	/* terminate */
+};
 
 struct cgroup_subsys debug_subsys = {
 	.name = "debug",
 	.create = debug_create,
 	.destroy = debug_destroy,
-	.populate = debug_populate,
 	.subsys_id = debug_subsys_id,
+	.base_cftypes = debug_files,
 };
 #endif /* CONFIG_CGROUP_DEBUG */
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index f86e939..3649fc6 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -358,24 +358,19 @@ static int freezer_write(struct cgroup *cgroup,
 static struct cftype files[] = {
 	{
 		.name = "state",
+		.flags = CFTYPE_NOT_ON_ROOT,
 		.read_seq_string = freezer_read,
 		.write_string = freezer_write,
 	},
+	{ }	/* terminate */
 };
 
-static int freezer_populate(struct cgroup_subsys *ss, struct cgroup *cgroup)
-{
-	if (!cgroup->parent)
-		return 0;
-	return cgroup_add_files(cgroup, ss, files, ARRAY_SIZE(files));
-}
-
 struct cgroup_subsys freezer_subsys = {
 	.name		= "freezer",
 	.create		= freezer_create,
 	.destroy	= freezer_destroy,
-	.populate	= freezer_populate,
 	.subsys_id	= freezer_subsys_id,
 	.can_attach	= freezer_can_attach,
 	.fork		= freezer_fork,
+	.base_cftypes	= files,
 };
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 5d57583..81cf9b7 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1792,28 +1792,17 @@ static struct cftype files[] = {
 		.write_u64 = cpuset_write_u64,
 		.private = FILE_SPREAD_SLAB,
 	},
-};
-
-static struct cftype cft_memory_pressure_enabled = {
-	.name = "memory_pressure_enabled",
-	.read_u64 = cpuset_read_u64,
-	.write_u64 = cpuset_write_u64,
-	.private = FILE_MEMORY_PRESSURE_ENABLED,
-};
 
-static int cpuset_populate(struct cgroup_subsys *ss, struct cgroup *cont)
-{
-	int err;
+	{
+		.name = "memory_pressure_enabled",
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.read_u64 = cpuset_read_u64,
+		.write_u64 = cpuset_write_u64,
+		.private = FILE_MEMORY_PRESSURE_ENABLED,
+	},
 
-	err = cgroup_add_files(cont, ss, files, ARRAY_SIZE(files));
-	if (err)
-		return err;
-	/* memory_pressure_enabled is in root cpuset only */
-	if (!cont->parent)
-		err = cgroup_add_file(cont, ss,
-				      &cft_memory_pressure_enabled);
-	return err;
-}
+	{ }	/* terminate */
+};
 
 /*
  * post_clone() is called during cgroup_create() when the
@@ -1914,9 +1903,9 @@ struct cgroup_subsys cpuset_subsys = {
 	.destroy = cpuset_destroy,
 	.can_attach = cpuset_can_attach,
 	.attach = cpuset_attach,
-	.populate = cpuset_populate,
 	.post_clone = cpuset_post_clone,
 	.subsys_id = cpuset_subsys_id,
+	.base_cftypes = files,
 	.early_init = 1,
 };
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ff12f72..90815b5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7902,13 +7902,9 @@ static struct cftype cpu_files[] = {
 		.write_u64 = cpu_rt_period_write_uint,
 	},
 #endif
+	{ }	/* terminate */
 };
 
-static int cpu_cgroup_populate(struct cgroup_subsys *ss, struct cgroup *cont)
-{
-	return cgroup_add_files(cont, ss, cpu_files, ARRAY_SIZE(cpu_files));
-}
-
 struct cgroup_subsys cpu_cgroup_subsys = {
 	.name		= "cpu",
 	.create		= cpu_cgroup_create,
@@ -7916,8 +7912,8 @@ struct cgroup_subsys cpu_cgroup_subsys = {
 	.can_attach	= cpu_cgroup_can_attach,
 	.attach		= cpu_cgroup_attach,
 	.exit		= cpu_cgroup_exit,
-	.populate	= cpu_cgroup_populate,
 	.subsys_id	= cpu_cgroup_subsys_id,
+	.base_cftypes	= cpu_files,
 	.early_init	= 1,
 };
 
@@ -8102,13 +8098,9 @@ static struct cftype files[] = {
 		.name = "stat",
 		.read_map = cpuacct_stats_show,
 	},
+	{ }	/* terminate */
 };
 
-static int cpuacct_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
-{
-	return cgroup_add_files(cgrp, ss, files, ARRAY_SIZE(files));
-}
-
 /*
  * charge this task's execution time to its accounting group.
  *
@@ -8140,7 +8132,7 @@ struct cgroup_subsys cpuacct_subsys = {
 	.name = "cpuacct",
 	.create = cpuacct_create,
 	.destroy = cpuacct_destroy,
-	.populate = cpuacct_populate,
 	.subsys_id = cpuacct_subsys_id,
+	.base_cftypes = files,
 };
 #endif	/* CONFIG_CGROUP_CPUACCT */
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index c802613..aa1d391 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -241,21 +241,17 @@ static struct cftype ss_files[] = {
 		.read_map = read_priomap,
 		.write_string = write_priomap,
 	},
+	{ }	/* terminate */
 };
 
-static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
-{
-	return cgroup_add_files(cgrp, ss, ss_files, ARRAY_SIZE(ss_files));
-}
-
 struct cgroup_subsys net_prio_subsys = {
 	.name		= "net_prio",
 	.create		= cgrp_create,
 	.destroy	= cgrp_destroy,
-	.populate	= cgrp_populate,
 #ifdef CONFIG_NETPRIO_CGROUP
 	.subsys_id	= net_prio_subsys_id,
 #endif
+	.base_cftypes	= ss_files,
 	.module		= THIS_MODULE
 };
 
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 024df8a..7743ea8 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -70,21 +70,17 @@ static struct cftype ss_files[] = {
 		.read_u64 = read_classid,
 		.write_u64 = write_classid,
 	},
+	{ }	/* terminate */
 };
 
-static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
-{
-	return cgroup_add_files(cgrp, ss, ss_files, ARRAY_SIZE(ss_files));
-}
-
 struct cgroup_subsys net_cls_subsys = {
 	.name		= "net_cls",
 	.create		= cgrp_create,
 	.destroy	= cgrp_destroy,
-	.populate	= cgrp_populate,
 #ifdef CONFIG_NET_CLS_CGROUP
 	.subsys_id	= net_cls_subsys_id,
 #endif
+	.base_cftypes	= ss_files,
 	.module		= THIS_MODULE,
 };
 
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index c43a332..442204c 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -447,22 +447,16 @@ static struct cftype dev_cgroup_files[] = {
 		.read_seq_string = devcgroup_seq_read,
 		.private = DEVCG_LIST,
 	},
+	{ }	/* terminate */
 };
 
-static int devcgroup_populate(struct cgroup_subsys *ss,
-				struct cgroup *cgroup)
-{
-	return cgroup_add_files(cgroup, ss, dev_cgroup_files,
-					ARRAY_SIZE(dev_cgroup_files));
-}
-
 struct cgroup_subsys devices_subsys = {
 	.name = "devices",
 	.can_attach = devcgroup_can_attach,
 	.create = devcgroup_create,
 	.destroy = devcgroup_destroy,
-	.populate = devcgroup_populate,
 	.subsys_id = devices_subsys_id,
+	.base_cftypes = dev_cgroup_files,
 };
 
 int __devcgroup_inode_permission(struct inode *inode, int mask)
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 07/12] memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (5 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 06/12] cgroup: convert all non-memcg controllers to the new cftype interface Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-22  0:23   ` KAMEZAWA Hiroyuki
  2012-03-21 22:17 ` [PATCH 08/12] cgroup: convert memcg controller to the new cftype interface Tejun Heo
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo, KAMEZAWA Hiroyuki

Instead of conditioning creation of memsw files on do_swap_account,
always create the files if compiled-in and fail read/write attempts
with -EOPNOTSUPP if !do_swap_account.

This is suggested by KAMEZAWA to simplify memcg file creation so that
it can use cgroup->subsys_cftypes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |   65 ++++++++++++++++++++++++++----------------------------
 1 files changed, 31 insertions(+), 34 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ae2f0a8..1e2a9d1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3818,14 +3818,21 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
 	return val << PAGE_SHIFT;
 }
 
-static u64 mem_cgroup_read(struct cgroup *cont, struct cftype *cft)
+static ssize_t mem_cgroup_read(struct cgroup *cont, struct cftype *cft,
+			       struct file *file, char __user *buf,
+			       size_t nbytes, loff_t *ppos)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	char str[64];
 	u64 val;
-	int type, name;
+	int type, name, len;
 
 	type = MEMFILE_TYPE(cft->private);
 	name = MEMFILE_ATTR(cft->private);
+
+	if (!do_swap_account && type == _MEMSWAP)
+		return -EOPNOTSUPP;
+
 	switch (type) {
 	case _MEM:
 		if (name == RES_USAGE)
@@ -3843,7 +3850,9 @@ static u64 mem_cgroup_read(struct cgroup *cont, struct cftype *cft)
 		BUG();
 		break;
 	}
-	return val;
+
+	len = scnprintf(str, sizeof(str), "%llu\n", (unsigned long long)val);
+	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
 }
 /*
  * The user of this function is...
@@ -3859,6 +3868,10 @@ static int mem_cgroup_write(struct cgroup *cont, struct cftype *cft,
 
 	type = MEMFILE_TYPE(cft->private);
 	name = MEMFILE_ATTR(cft->private);
+
+	if (!do_swap_account && type == _MEMSWAP)
+		return -EOPNOTSUPP;
+
 	switch (name) {
 	case RES_LIMIT:
 		if (mem_cgroup_is_root(memcg)) { /* Can't set limit on root */
@@ -3925,12 +3938,15 @@ out:
 
 static int mem_cgroup_reset(struct cgroup *cont, unsigned int event)
 {
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
 	int type, name;
 
-	memcg = mem_cgroup_from_cont(cont);
 	type = MEMFILE_TYPE(event);
 	name = MEMFILE_ATTR(event);
+
+	if (!do_swap_account && type == _MEMSWAP)
+		return -EOPNOTSUPP;
+
 	switch (name) {
 	case RES_MAX_USAGE:
 		if (type == _MEM)
@@ -4599,7 +4615,7 @@ static struct cftype mem_cgroup_files[] = {
 	{
 		.name = "usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_USAGE),
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 		.register_event = mem_cgroup_usage_register_event,
 		.unregister_event = mem_cgroup_usage_unregister_event,
 	},
@@ -4607,25 +4623,25 @@ static struct cftype mem_cgroup_files[] = {
 		.name = "max_usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_MAX_USAGE),
 		.trigger = mem_cgroup_reset,
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 	},
 	{
 		.name = "limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_LIMIT),
 		.write_string = mem_cgroup_write,
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 	},
 	{
 		.name = "soft_limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_SOFT_LIMIT),
 		.write_string = mem_cgroup_write,
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 	},
 	{
 		.name = "failcnt",
 		.private = MEMFILE_PRIVATE(_MEM, RES_FAILCNT),
 		.trigger = mem_cgroup_reset,
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 	},
 	{
 		.name = "stat",
@@ -4665,14 +4681,11 @@ static struct cftype mem_cgroup_files[] = {
 		.mode = S_IRUGO,
 	},
 #endif
-};
-
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
-static struct cftype memsw_cgroup_files[] = {
 	{
 		.name = "memsw.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_USAGE),
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 		.register_event = mem_cgroup_usage_register_event,
 		.unregister_event = mem_cgroup_usage_unregister_event,
 	},
@@ -4680,35 +4693,22 @@ static struct cftype memsw_cgroup_files[] = {
 		.name = "memsw.max_usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_MAX_USAGE),
 		.trigger = mem_cgroup_reset,
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 	},
 	{
 		.name = "memsw.limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_LIMIT),
 		.write_string = mem_cgroup_write,
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 	},
 	{
 		.name = "memsw.failcnt",
 		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_FAILCNT),
 		.trigger = mem_cgroup_reset,
-		.read_u64 = mem_cgroup_read,
+		.read = mem_cgroup_read,
 	},
-};
-
-static int register_memsw_files(struct cgroup *cont, struct cgroup_subsys *ss)
-{
-	if (!do_swap_account)
-		return 0;
-	return cgroup_add_files(cont, ss, memsw_cgroup_files,
-				ARRAY_SIZE(memsw_cgroup_files));
-};
-#else
-static int register_memsw_files(struct cgroup *cont, struct cgroup_subsys *ss)
-{
-	return 0;
-}
 #endif
+};
 
 static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
 {
@@ -4969,9 +4969,6 @@ static int mem_cgroup_populate(struct cgroup_subsys *ss,
 				ARRAY_SIZE(mem_cgroup_files));
 
 	if (!ret)
-		ret = register_memsw_files(cont, ss);
-
-	if (!ret)
 		ret = register_kmem_files(cont, ss);
 
 	return ret;
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 08/12] cgroup: convert memcg controller to the new cftype interface
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (6 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 07/12] memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-22  0:27   ` KAMEZAWA Hiroyuki
  2012-03-21 22:17 ` [PATCH 09/12] cgroup: remove cgroup_add_file[s]() Tejun Heo
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo,
	Johannes Weiner, Michal Hocko, Balbir Singh, KAMEZAWA Hiroyuki,
	Hugh Dickins, Greg Thelen

Convert memcg to use the new cftype based interface.  kmem support
abuses ->populate() for mem_cgroup_sockets_init() so it can't be
removed at the moment.

tcp_memcontrol is updated so that tcp_files[] is registered via a
__initcall.  This change also allows removing the forward declaration
of tcp_files[].  Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Greg Thelen <gthelen@google.com>
---
 mm/memcontrol.c           |   12 +++---------
 net/ipv4/tcp_memcontrol.c |   16 ++++++++++------
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1e2a9d1..a3eb82f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4708,6 +4708,7 @@ static struct cftype mem_cgroup_files[] = {
 		.read = mem_cgroup_read,
 	},
 #endif
+	{ },	/* terminate */
 };
 
 static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
@@ -4963,15 +4964,7 @@ static void mem_cgroup_destroy(struct cgroup *cont)
 static int mem_cgroup_populate(struct cgroup_subsys *ss,
 				struct cgroup *cont)
 {
-	int ret;
-
-	ret = cgroup_add_files(cont, ss, mem_cgroup_files,
-				ARRAY_SIZE(mem_cgroup_files));
-
-	if (!ret)
-		ret = register_kmem_files(cont, ss);
-
-	return ret;
+	return register_kmem_files(cont, ss);
 }
 
 #ifdef CONFIG_MMU
@@ -5485,6 +5478,7 @@ struct cgroup_subsys mem_cgroup_subsys = {
 	.can_attach = mem_cgroup_can_attach,
 	.cancel_attach = mem_cgroup_cancel_attach,
 	.attach = mem_cgroup_move_task,
+	.base_cftypes = mem_cgroup_files,
 	.early_init = 0,
 	.use_id = 1,
 };
diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
index e0c5b4e..73ba5fd 100644
--- a/net/ipv4/tcp_memcontrol.c
+++ b/net/ipv4/tcp_memcontrol.c
@@ -6,8 +6,6 @@
 #include <linux/memcontrol.h>
 #include <linux/module.h>
 
-static struct cftype tcp_files[4];	/* XXX: will be removed soon */
-
 static inline struct tcp_memcontrol *tcp_from_cgproto(struct cg_proto *cg_proto)
 {
 	return container_of(cg_proto, struct tcp_memcontrol, cg_proto);
@@ -36,7 +34,7 @@ int tcp_init_cgroup(struct cgroup *cgrp, struct cgroup_subsys *ss)
 
 	cg_proto = tcp_prot.proto_cgroup(memcg);
 	if (!cg_proto)
-		goto create_files;
+		return 0;
 
 	tcp = tcp_from_cgproto(cg_proto);
 
@@ -59,9 +57,7 @@ int tcp_init_cgroup(struct cgroup *cgrp, struct cgroup_subsys *ss)
 	cg_proto->sockets_allocated = &tcp->tcp_sockets_allocated;
 	cg_proto->memcg = memcg;
 
-create_files:
-	return cgroup_add_files(cgrp, ss, tcp_files,
-				ARRAY_SIZE(tcp_files));
+	return 0;
 }
 EXPORT_SYMBOL(tcp_init_cgroup);
 
@@ -266,4 +262,12 @@ static struct cftype tcp_files[] = {
 		.trigger = tcp_cgroup_reset,
 		.read_u64 = tcp_cgroup_read,
 	},
+	{ }	/* terminate */
 };
+
+static int __init tcp_memcontrol_init(void)
+{
+	WARN_ON(cgroup_add_cftypes(&mem_cgroup_subsys, tcp_files));
+	return 0;
+}
+__initcall(tcp_memcontrol_init);
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 09/12] cgroup: remove cgroup_add_file[s]()
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (7 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 08/12] cgroup: convert memcg controller to the new cftype interface Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-21 22:17 ` [PATCH 10/12] cgroup: relocate __d_cgrp() and __d_cft() Tejun Heo
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

No controller is using cgroup_add_files[s]().  Unexport them, and
convert cgroup_add_files() to handle NULL entry terminated array
instead of taking count explicitly and continue creation on failure
for internal use.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |   16 ---------------
 kernel/cgroup.c        |   51 ++++++++++++++++++-----------------------------
 2 files changed, 20 insertions(+), 47 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index ff5b764..45869d7 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -404,22 +404,6 @@ struct cgroup_scanner {
 	void *data;
 };
 
-/*
- * Add a new file to the given cgroup directory. Should only be
- * called by subsystems from within a populate() method
- */
-int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
-		       const struct cftype *cft);
-
-/*
- * Add a set of new files to the given cgroup directory. Should
- * only be called by subsystems from within a populate() method
- */
-int cgroup_add_files(struct cgroup *cgrp,
-			struct cgroup_subsys *subsys,
-			const struct cftype cft[],
-			int count);
-
 int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts);
 
 int cgroup_is_removed(const struct cgroup *cgrp);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index d6fe34a..02e131e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2619,9 +2619,8 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
 	return mode;
 }
 
-int cgroup_add_file(struct cgroup *cgrp,
-		       struct cgroup_subsys *subsys,
-		       const struct cftype *cft)
+static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
+			   const struct cftype *cft)
 {
 	struct dentry *dir = cgrp->dentry;
 	struct dentry *dentry;
@@ -2653,22 +2652,23 @@ int cgroup_add_file(struct cgroup *cgrp,
 		error = PTR_ERR(dentry);
 	return error;
 }
-EXPORT_SYMBOL_GPL(cgroup_add_file);
 
-int cgroup_add_files(struct cgroup *cgrp,
-			struct cgroup_subsys *subsys,
-			const struct cftype cft[],
-			int count)
+static int cgroup_add_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
+			    const struct cftype cfts[])
 {
-	int i, err;
-	for (i = 0; i < count; i++) {
-		err = cgroup_add_file(cgrp, subsys, &cft[i]);
-		if (err)
-			return err;
+	const struct cftype *cft;
+	int err, ret = 0;
+
+	for (cft = cfts; cft->name[0] != '\0'; cft++) {
+		err = cgroup_add_file(cgrp, subsys, cft);
+		if (err) {
+			pr_warning("cgroup_add_files: failed to create %s, err=%d\n",
+				   cft->name, err);
+			ret = err;
+		}
 	}
-	return 0;
+	return ret;
 }
-EXPORT_SYMBOL_GPL(cgroup_add_files);
 
 static DEFINE_MUTEX(cgroup_cft_mutex);
 
@@ -2692,10 +2692,6 @@ static void cgroup_cfts_commit(struct cgroup_subsys *ss,
 {
 	LIST_HEAD(pending);
 	struct cgroup *cgrp, *n;
-	int count = 0;
-
-	while (cfts[count].name[0] != '\0')
-		count++;
 
 	/* %NULL @cfts indicates abort and don't bother if @ss isn't attached */
 	if (cfts && ss->root != &rootnode) {
@@ -2717,7 +2713,7 @@ static void cgroup_cfts_commit(struct cgroup_subsys *ss,
 		mutex_lock(&inode->i_mutex);
 		mutex_lock(&cgroup_mutex);
 		if (!cgroup_is_removed(cgrp))
-			cgroup_add_files(cgrp, ss, cfts, count);
+			cgroup_add_files(cgrp, ss, cfts);
 		mutex_unlock(&cgroup_mutex);
 		mutex_unlock(&inode->i_mutex);
 
@@ -3743,6 +3739,7 @@ static struct cftype files[] = {
 		.write_string = cgroup_release_agent_write,
 		.max_write_len = PATH_MAX,
 	},
+	{ }	/* terminate */
 };
 
 static int cgroup_populate_dir(struct cgroup *cgrp)
@@ -3750,7 +3747,7 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 	int err;
 	struct cgroup_subsys *ss;
 
-	err = cgroup_add_files(cgrp, NULL, files, ARRAY_SIZE(files));
+	err = cgroup_add_files(cgrp, NULL, files);
 	if (err < 0)
 		return err;
 
@@ -3761,16 +3758,8 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 		if (ss->populate && (err = ss->populate(ss, cgrp)) < 0)
 			return err;
 
-		list_for_each_entry(set, &ss->cftsets, node) {
-			const struct cftype *cft;
-
-			for (cft = set->cfts; cft->name[0] != '\0'; cft++) {
-				err = cgroup_add_file(cgrp, ss, cft);
-				if (err)
-					pr_warning("cgroup_populate_dir: failed to create %s, err=%d\n",
-						   cft->name, err);
-			}
-		}
+		list_for_each_entry(set, &ss->cftsets, node)
+			cgroup_add_files(cgrp, ss, set->cfts);
 	}
 
 	/* This cgroup is ready now */
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 10/12] cgroup: relocate __d_cgrp() and __d_cft()
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (8 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 09/12] cgroup: remove cgroup_add_file[s]() Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-21 22:17 ` [PATCH 11/12] cgroup: introduce struct cfent Tejun Heo
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Move the two macros upwards as they'll be used earlier in the file.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 02e131e..d64f16d 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -282,6 +282,16 @@ list_for_each_entry(_ss, &_root->subsys_list, sibling)
 #define for_each_active_root(_root) \
 list_for_each_entry(_root, &roots, root_list)
 
+static inline struct cgroup *__d_cgrp(struct dentry *dentry)
+{
+	return dentry->d_fsdata;
+}
+
+static inline struct cftype *__d_cft(struct dentry *dentry)
+{
+	return dentry->d_fsdata;
+}
+
 /* the list of cgroups eligible for automatic release. Protected by
  * release_list_lock */
 static LIST_HEAD(release_list);
@@ -1708,16 +1718,6 @@ static struct file_system_type cgroup_fs_type = {
 
 static struct kobject *cgroup_kobj;
 
-static inline struct cgroup *__d_cgrp(struct dentry *dentry)
-{
-	return dentry->d_fsdata;
-}
-
-static inline struct cftype *__d_cft(struct dentry *dentry)
-{
-	return dentry->d_fsdata;
-}
-
 /**
  * cgroup_path - generate the path of a cgroup
  * @cgrp: the cgroup in question
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 11/12] cgroup: introduce struct cfent
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (9 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 10/12] cgroup: relocate __d_cgrp() and __d_cft() Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-30 20:42   ` [PATCH UPDATED " Tejun Heo
  2012-03-21 22:17 ` [PATCH 12/12] cgroup: implement cgroup_rm_cftypes() Tejun Heo
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

This patch adds cfent (cgroup file entry) which is the association
between a cgroup and a file.  This is in-cgroup representation of
files under a cgroup directory.  This simplifies walking walking
cgroup files and thus cgroup_clear_directory(), which is now
implemented in two parts - cgroup_rm_file() and a loop around it.

cgroup_rm_file() will be used to implement cftype removal and cfent is
scheduled to serve cgroup specific per-file data (e.g. for sysfs-like
"sever" semantics).

v2: - cfe was freed from cgroup_rm_file() which led to use-after-free
      if the file had openers at the time of removal.  Moved to
      cgroup_diput().

    - cgroup_clear_directory() triggered WARN_ON_ONCE() if d_subdirs
      wasn't empty after removing all files.  This triggered
      spuriously if some files were open during directory clearing.
      Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Glauber Costa <glommer@parallels.com>
---
 include/linux/cgroup.h |    1 +
 kernel/cgroup.c        |  111 ++++++++++++++++++++++++++++++++---------------
 2 files changed, 76 insertions(+), 36 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 45869d7..bfe5470 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -175,6 +175,7 @@ struct cgroup {
 	 */
 	struct list_head sibling;	/* my parent's children */
 	struct list_head children;	/* my children */
+	struct list_head files;		/* my files */
 
 	struct cgroup *parent;		/* my parent */
 	struct dentry __rcu *dentry;	/* cgroup fs entry, RCU protected */
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index d64f16d..67daea9 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -148,6 +148,15 @@ struct cgroupfs_root {
 static struct cgroupfs_root rootnode;
 
 /*
+ * cgroupfs file entry, pointed to from leaf dentry->d_fsdata.
+ */
+struct cfent {
+	struct list_head		node;
+	struct dentry			*dentry;
+	struct cftype			*type;
+};
+
+/*
  * CSS ID -- ID per subsys's Cgroup Subsys State(CSS). used only when
  * cgroup_subsys->use_id != 0.
  */
@@ -287,11 +296,16 @@ static inline struct cgroup *__d_cgrp(struct dentry *dentry)
 	return dentry->d_fsdata;
 }
 
-static inline struct cftype *__d_cft(struct dentry *dentry)
+static inline struct cfent *__d_cfe(struct dentry *dentry)
 {
 	return dentry->d_fsdata;
 }
 
+static inline struct cftype *__d_cft(struct dentry *dentry)
+{
+	return __d_cfe(dentry)->type;
+}
+
 /* the list of cgroups eligible for automatic release. Protected by
  * release_list_lock */
 static LIST_HEAD(release_list);
@@ -877,6 +891,12 @@ static void cgroup_diput(struct dentry *dentry, struct inode *inode)
 		BUG_ON(!list_empty(&cgrp->pidlists));
 
 		kfree_rcu(cgrp, rcu_head);
+	} else {
+		struct cfent *cfe = __d_cfe(dentry);
+
+		WARN_ONCE(!list_empty(&cfe->node),
+			  "cfe still linked for %s\n", cfe->type->name);
+		kfree(cfe);
 	}
 	iput(inode);
 }
@@ -895,34 +915,36 @@ static void remove_dir(struct dentry *d)
 	dput(parent);
 }
 
-static void cgroup_clear_directory(struct dentry *dentry)
-{
-	struct list_head *node;
-
-	BUG_ON(!mutex_is_locked(&dentry->d_inode->i_mutex));
-	spin_lock(&dentry->d_lock);
-	node = dentry->d_subdirs.next;
-	while (node != &dentry->d_subdirs) {
-		struct dentry *d = list_entry(node, struct dentry, d_u.d_child);
-
-		spin_lock_nested(&d->d_lock, DENTRY_D_LOCK_NESTED);
-		list_del_init(node);
-		if (d->d_inode) {
-			/* This should never be called on a cgroup
-			 * directory with child cgroups */
-			BUG_ON(d->d_inode->i_mode & S_IFDIR);
-			dget_dlock(d);
-			spin_unlock(&d->d_lock);
-			spin_unlock(&dentry->d_lock);
-			d_delete(d);
-			simple_unlink(dentry->d_inode, d);
-			dput(d);
-			spin_lock(&dentry->d_lock);
-		} else
-			spin_unlock(&d->d_lock);
-		node = dentry->d_subdirs.next;
+static int cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
+{
+	struct cfent *cfe;
+
+	lockdep_assert_held(&cgrp->dentry->d_inode->i_mutex);
+	lockdep_assert_held(&cgroup_mutex);
+
+	list_for_each_entry(cfe, &cgrp->files, node) {
+		struct dentry *d = cfe->dentry;
+
+		if (cft && cfe->type != cft)
+			continue;
+
+		dget(d);
+		d_delete(d);
+		simple_unlink(d->d_inode, d);
+		list_del_init(&cfe->node);
+		dput(d);
+
+		return 0;
 	}
-	spin_unlock(&dentry->d_lock);
+	return -ENOENT;
+}
+
+static void cgroup_clear_directory(struct dentry *dir)
+{
+	struct cgroup *cgrp = __d_cgrp(dir);
+
+	while (!list_empty(&cgrp->files))
+		cgroup_rm_file(cgrp, NULL);
 }
 
 /*
@@ -1352,6 +1374,7 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 {
 	INIT_LIST_HEAD(&cgrp->sibling);
 	INIT_LIST_HEAD(&cgrp->children);
+	INIT_LIST_HEAD(&cgrp->files);
 	INIT_LIST_HEAD(&cgrp->css_sets);
 	INIT_LIST_HEAD(&cgrp->release_list);
 	INIT_LIST_HEAD(&cgrp->pidlists);
@@ -2623,7 +2646,9 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
 			   const struct cftype *cft)
 {
 	struct dentry *dir = cgrp->dentry;
+	struct cgroup *parent = __d_cgrp(dir);
 	struct dentry *dentry;
+	struct cfent *cfe;
 	int error;
 	umode_t mode;
 	char name[MAX_CGROUP_TYPE_NAMELEN + MAX_CFTYPE_NAME + 2] = { 0 };
@@ -2639,17 +2664,31 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cgroup_subsys *subsys,
 		strcat(name, ".");
 	}
 	strcat(name, cft->name);
+
 	BUG_ON(!mutex_is_locked(&dir->d_inode->i_mutex));
+
+	cfe = kzalloc(sizeof(*cfe), GFP_KERNEL);
+	if (!cfe)
+		return -ENOMEM;
+
 	dentry = lookup_one_len(name, dir, strlen(name));
-	if (!IS_ERR(dentry)) {
-		mode = cgroup_file_mode(cft);
-		error = cgroup_create_file(dentry, mode | S_IFREG,
-						cgrp->root->sb);
-		if (!error)
-			dentry->d_fsdata = (void *)cft;
-		dput(dentry);
-	} else
+	if (IS_ERR(dentry)) {
 		error = PTR_ERR(dentry);
+		goto out;
+	}
+
+	mode = cgroup_file_mode(cft);
+	error = cgroup_create_file(dentry, mode | S_IFREG, cgrp->root->sb);
+	if (!error) {
+		cfe->type = (void *)cft;
+		cfe->dentry = dentry;
+		dentry->d_fsdata = cfe;
+		list_add_tail(&cfe->node, &parent->files);
+		cfe = NULL;
+	}
+	dput(dentry);
+out:
+	kfree(cfe);
 	return error;
 }
 
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 12/12] cgroup: implement cgroup_rm_cftypes()
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (10 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 11/12] cgroup: introduce struct cfent Tejun Heo
@ 2012-03-21 22:17 ` Tejun Heo
  2012-03-22  9:04 ` [PATCHSET] cgroup: cftype based file interface, take #2 Glauber Costa
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-21 22:17 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups
  Cc: linux-kernel, fweisbec, rni, ctalbott, Tejun Heo

Implement cgroup_rm_cftypes() which removes an array of cftypes from a
subsystem.  It can be called whether the target subsys is attached or
not.  cgroup core will remove the specified file from all existing
cgroups.

This will be used to improve sub-subsys modularity and will be helpful
for unified hierarchy.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup.h |    1 +
 kernel/cgroup.c        |   54 +++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index bfe5470..7e523e2 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -406,6 +406,7 @@ struct cgroup_scanner {
 };
 
 int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts);
+int cgroup_rm_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts);
 
 int cgroup_is_removed(const struct cgroup *cgrp);
 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 67daea9..e87c4c4 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2692,17 +2692,20 @@ out:
 	return error;
 }
 
-static int cgroup_add_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
-			    const struct cftype cfts[])
+static int cgroup_addrm_files(struct cgroup *cgrp, struct cgroup_subsys *subsys,
+			      const struct cftype cfts[], bool is_add)
 {
 	const struct cftype *cft;
 	int err, ret = 0;
 
 	for (cft = cfts; cft->name[0] != '\0'; cft++) {
-		err = cgroup_add_file(cgrp, subsys, cft);
+		if (is_add)
+			err = cgroup_add_file(cgrp, subsys, cft);
+		else
+			err = cgroup_rm_file(cgrp, cft);
 		if (err) {
-			pr_warning("cgroup_add_files: failed to create %s, err=%d\n",
-				   cft->name, err);
+			pr_warning("cgroup_addrm_files: failed to %s %s, err=%d\n",
+				   is_add ? "add" : "remove", cft->name, err);
 			ret = err;
 		}
 	}
@@ -2726,7 +2729,7 @@ static void cgroup_cfts_prepare(void)
 }
 
 static void cgroup_cfts_commit(struct cgroup_subsys *ss,
-			       const struct cftype *cfts)
+			       const struct cftype *cfts, bool is_add)
 	__releases(&cgroup_mutex) __releases(&cgroup_cft_mutex)
 {
 	LIST_HEAD(pending);
@@ -2752,7 +2755,7 @@ static void cgroup_cfts_commit(struct cgroup_subsys *ss,
 		mutex_lock(&inode->i_mutex);
 		mutex_lock(&cgroup_mutex);
 		if (!cgroup_is_removed(cgrp))
-			cgroup_add_files(cgrp, ss, cfts);
+			cgroup_addrm_files(cgrp, ss, cfts, is_add);
 		mutex_unlock(&cgroup_mutex);
 		mutex_unlock(&inode->i_mutex);
 
@@ -2788,13 +2791,44 @@ int cgroup_add_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts)
 	cgroup_cfts_prepare();
 	set->cfts = cfts;
 	list_add_tail(&set->node, &ss->cftsets);
-	cgroup_cfts_commit(ss, cfts);
+	cgroup_cfts_commit(ss, cfts, true);
 
 	return 0;
 }
 EXPORT_SYMBOL_GPL(cgroup_add_cftypes);
 
 /**
+ * cgroup_rm_cftypes - remove an array of cftypes from a subsystem
+ * @ss: target cgroup subsystem
+ * @cfts: zero-length name terminated array of cftypes
+ *
+ * Unregister @cfts from @ss.  Files described by @cfts are removed from
+ * all existing cgroups to which @ss is attached and all future cgroups
+ * won't have them either.  This function can be called anytime whether @ss
+ * is attached or not.
+ *
+ * Returns 0 on successful unregistration, -ENOENT if @cfts is not
+ * registered with @ss.
+ */
+int cgroup_rm_cftypes(struct cgroup_subsys *ss, const struct cftype *cfts)
+{
+	struct cftype_set *set;
+
+	cgroup_cfts_prepare();
+
+	list_for_each_entry(set, &ss->cftsets, node) {
+		if (set->cfts == cfts) {
+			list_del_init(&set->node);
+			cgroup_cfts_commit(ss, cfts, false);
+			return 0;
+		}
+	}
+
+	cgroup_cfts_commit(ss, NULL, false);
+	return -ENOENT;
+}
+
+/**
  * cgroup_task_count - count the number of tasks in a cgroup.
  * @cgrp: the cgroup in question
  *
@@ -3786,7 +3820,7 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 	int err;
 	struct cgroup_subsys *ss;
 
-	err = cgroup_add_files(cgrp, NULL, files);
+	err = cgroup_addrm_files(cgrp, NULL, files, true);
 	if (err < 0)
 		return err;
 
@@ -3798,7 +3832,7 @@ static int cgroup_populate_dir(struct cgroup *cgrp)
 			return err;
 
 		list_for_each_entry(set, &ss->cftsets, node)
-			cgroup_add_files(cgrp, ss, set->cfts);
+			cgroup_addrm_files(cgrp, ss, set->cfts, true);
 	}
 
 	/* This cgroup is ready now */
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 07/12] memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP
  2012-03-21 22:17 ` [PATCH 07/12] memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP Tejun Heo
@ 2012-03-22  0:23   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 29+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-22  0:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott

(2012/03/22 7:17), Tejun Heo wrote:

> Instead of conditioning creation of memsw files on do_swap_account,
> always create the files if compiled-in and fail read/write attempts
> with -EOPNOTSUPP if !do_swap_account.
> 
> This is suggested by KAMEZAWA to simplify memcg file creation so that
> it can use cgroup->subsys_cftypes.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


seems O.K. to me.

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 08/12] cgroup: convert memcg controller to the new cftype interface
  2012-03-21 22:17 ` [PATCH 08/12] cgroup: convert memcg controller to the new cftype interface Tejun Heo
@ 2012-03-22  0:27   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 29+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-22  0:27 UTC (permalink / raw)
  To: Tejun Heo
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott, Johannes Weiner, Michal Hocko, Balbir Singh,
	Hugh Dickins, Greg Thelen

(2012/03/22 7:17), Tejun Heo wrote:

> Convert memcg to use the new cftype based interface.  kmem support
> abuses ->populate() for mem_cgroup_sockets_init() so it can't be
> removed at the moment.
> 
> tcp_memcontrol is updated so that tcp_files[] is registered via a
> __initcall.  This change also allows removing the forward declaration
> of tcp_files[].  Removed.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Glauber Costa <glommer@parallels.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Greg Thelen <gthelen@google.com>


Ok, then, files will be created by cgroup core not by memcg.
base_cftypes seems very good.

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Thank you.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (11 preceding siblings ...)
  2012-03-21 22:17 ` [PATCH 12/12] cgroup: implement cgroup_rm_cftypes() Tejun Heo
@ 2012-03-22  9:04 ` Glauber Costa
  2012-03-30 12:42 ` Li Zefan
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Glauber Costa @ 2012-03-22  9:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 03/22/2012 02:17 AM, Tejun Heo wrote:
> Can you please send a patch to move init to
> ->create()?

Yes.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (12 preceding siblings ...)
  2012-03-22  9:04 ` [PATCHSET] cgroup: cftype based file interface, take #2 Glauber Costa
@ 2012-03-30 12:42 ` Li Zefan
  2012-03-30 15:42   ` Tejun Heo
  2012-03-30 22:29 ` Tejun Heo
  2012-03-31 16:47 ` Tejun Heo
  15 siblings, 1 reply; 29+ messages in thread
From: Li Zefan @ 2012-03-30 12:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott

Tejun Heo wrote:

> Hello,
> 
> This is the second take at improving cgroup file addition/removal
> interface.  Changes from the last take[L] are,
> 
> * CGROUP_SUBSYS_CFTYPES[_COND]() dropped and
>   cgroup_subsys->base_cftypes is added instead.  This works better for
>   subsystems implemented as module.  tcp_memcontrol now uses an
>   explicit __initcall() to register its cftypes.
> 
> * 0005-cgroup-relocate-cftype-and-cgroup_subsys-definitions.patch
>   added as cleanup.
> 
> * 0007-memcg-always-create-memsw-files-if-CONFIG_CGROUP_MEM.patch
>   added so that memsw files are always created if enabled in config as
>   suggested by KAMEZAWA.
> 
> This patch replaces cgroup file interface with cftype based one which
> allows dynamic additions and removals of cftype arrays whether the
> target subsystem is attached or not.  This can be used to make subsys
> rebinding via remount work properly but I intentionally avoided doing
> that at the moment.
> 


What's the problem with remount? and is it important enough that it
should be fixed even the feature is marked as deprecated?

> This makes cgroup population simpler for controllers and will be used
> to allow controllers to be more dynamic.  e.g. blkio subsys has
> sub-policies which may come and go while blkio subsys is attached and
> it currently uses fixed set of files which stays blank if not in use.


Dynamic cgroup files was mentioned before. The scenario in mind was blkio
control files can be added/removed automatically as devices come and ago.

So this time blkio subsystem is really going to be made more dynamic
soon?

> This will also be useful for implementing unified hierarchy.
> 

> This patchset contains the following patches.

> 


I'll look into the patches tomorrow.

>  0001-cgroup-move-cgroup_clear_directory-call-out-of-cgrou.patch
>  0002-cgroup-build-list-of-all-cgroups-under-a-given-cgrou.patch
>  0003-cgroup-implement-cgroup_add_cftypes-and-friends.patch
>  0004-cgroup-merge-cft_release_agent-cftype-array-into-the.patch
>  0005-cgroup-relocate-cftype-and-cgroup_subsys-definitions.patch
>  0006-cgroup-convert-all-non-memcg-controllers-to-the-new-.patch
>  0007-memcg-always-create-memsw-files-if-CONFIG_CGROUP_MEM.patch
>  0008-cgroup-convert-memcg-controller-to-the-new-cftype-in.patch
>  0009-cgroup-remove-cgroup_add_file-s.patch
>  0010-cgroup-relocate-__d_cgrp-and-__d_cft.patch
>  0011-cgroup-introduce-struct-cfent.patch
>  0012-cgroup-implement-cgroup_rm_cftypes.patch
> 
> and is on top of
> 
>   cgroup/for-3.4 3ce3230a0cff484e5130153f244d4fb8a56b3a8b
> + [1] cgroup: deprecate remount option changes mount option
> 
> and is also available in the following git branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git cgroup-cftype
> 
> Glauber, I think the cftypes registration for tcp_memcontrol is fine
> as is in this patchset and we just need to move the init to the tail
> of ->create().  Let's worry about memcg for module-loadable protocols
> whey they are added.  Can you please send a patch to move init to
> ->create()?
> 
> If nobody objects && after Li comes back and acks the changes, I'll
> route these through cgroup/for-3.5.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-30 12:42 ` Li Zefan
@ 2012-03-30 15:42   ` Tejun Heo
  2012-03-31 12:56     ` Li Zefan
  0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2012-03-30 15:42 UTC (permalink / raw)
  To: Li Zefan
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott

Hey, Li.

On Fri, Mar 30, 2012 at 08:42:07PM +0800, Li Zefan wrote:
> > This patch replaces cgroup file interface with cftype based one which
> > allows dynamic additions and removals of cftype arrays whether the
> > target subsystem is attached or not.  This can be used to make subsys
> > rebinding via remount work properly but I intentionally avoided doing
> > that at the moment.
> 
> What's the problem with remount?

Subsys can't be changed if any non-root cgroup exists.

> and is it important enough that it should be fixed even the feature
> is marked as deprecated?

I'm not sure.  We *might* need it during multi-mount -> single-mount
transition depending on how that's implemented, so the "at the moment"
qualifier.  It probably won't be fixed but I'm not fully sure.

> > This makes cgroup population simpler for controllers and will be used
> > to allow controllers to be more dynamic.  e.g. blkio subsys has
> > sub-policies which may come and go while blkio subsys is attached and
> > it currently uses fixed set of files which stays blank if not in use.
> 
> Dynamic cgroup files was mentioned before. The scenario in mind was blkio
> control files can be added/removed automatically as devices come and ago.
> 
> So this time blkio subsystem is really going to be made more dynamic
> soon?

Patchset already posted.

  http://thread.gmane.org/gmane.linux.kernel.cgroups/1376

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH UPDATED 11/12] cgroup: introduce struct cfent
  2012-03-21 22:17 ` [PATCH 11/12] cgroup: introduce struct cfent Tejun Heo
@ 2012-03-30 20:42   ` Tejun Heo
  0 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-30 20:42 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups; +Cc: linux-kernel, fweisbec, rni, ctalbott

This patch adds cfent (cgroup file entry) which is the association
between a cgroup and a file.  This is in-cgroup representation of
files under a cgroup directory.  This simplifies walking walking
cgroup files and thus cgroup_clear_directory(), which is now
implemented in two parts - cgroup_rm_file() and a loop around it.

cgroup_rm_file() will be used to implement cftype removal and cfent is
scheduled to serve cgroup specific per-file data (e.g. for sysfs-like
"sever" semantics).

v2: - cfe was freed from cgroup_rm_file() which led to use-after-free
      if the file had openers at the time of removal.  Moved to
      cgroup_diput().

    - cgroup_clear_directory() triggered WARN_ON_ONCE() if d_subdirs
      wasn't empty after removing all files.  This triggered
      spuriously if some files were open during directory clearing.
      Removed.

v3: - In cgroup_diput(), WARN_ONCE(!list_empty(&cfe->node)) could be
      spuriously triggered for root cgroups because they don't go
      through cgroup_clear_directory() on unmount.  Don't trigger WARN
      for root cgroups.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Glauber Costa <glommer@parallels.com>
---
cgroup_diput() may trigger WARN_ONCE() spuriously on unmount.  Fixed.

 include/linux/cgroup.h |    1 
 kernel/cgroup.c        |  109 +++++++++++++++++++++++++++++++++----------------
 2 files changed, 76 insertions(+), 34 deletions(-)

Index: work/include/linux/cgroup.h
===================================================================
--- work.orig/include/linux/cgroup.h
+++ work/include/linux/cgroup.h
@@ -175,6 +175,7 @@ struct cgroup {
 	 */
 	struct list_head sibling;	/* my parent's children */
 	struct list_head children;	/* my children */
+	struct list_head files;		/* my files */
 
 	struct cgroup *parent;		/* my parent */
 	struct dentry __rcu *dentry;	/* cgroup fs entry, RCU protected */
Index: work/kernel/cgroup.c
===================================================================
--- work.orig/kernel/cgroup.c
+++ work/kernel/cgroup.c
@@ -148,6 +148,15 @@ struct cgroupfs_root {
 static struct cgroupfs_root rootnode;
 
 /*
+ * cgroupfs file entry, pointed to from leaf dentry->d_fsdata.
+ */
+struct cfent {
+	struct list_head		node;
+	struct dentry			*dentry;
+	struct cftype			*type;
+};
+
+/*
  * CSS ID -- ID per subsys's Cgroup Subsys State(CSS). used only when
  * cgroup_subsys->use_id != 0.
  */
@@ -287,11 +296,16 @@ static inline struct cgroup *__d_cgrp(st
 	return dentry->d_fsdata;
 }
 
-static inline struct cftype *__d_cft(struct dentry *dentry)
+static inline struct cfent *__d_cfe(struct dentry *dentry)
 {
 	return dentry->d_fsdata;
 }
 
+static inline struct cftype *__d_cft(struct dentry *dentry)
+{
+	return __d_cfe(dentry)->type;
+}
+
 /* the list of cgroups eligible for automatic release. Protected by
  * release_list_lock */
 static LIST_HEAD(release_list);
@@ -877,6 +891,14 @@ static void cgroup_diput(struct dentry *
 		BUG_ON(!list_empty(&cgrp->pidlists));
 
 		kfree_rcu(cgrp, rcu_head);
+	} else {
+		struct cfent *cfe = __d_cfe(dentry);
+		struct cgroup *cgrp = dentry->d_parent->d_fsdata;
+
+		WARN_ONCE(!list_empty(&cfe->node) &&
+			  cgrp != &cgrp->root->top_cgroup,
+			  "cfe still linked for %s\n", cfe->type->name);
+		kfree(cfe);
 	}
 	iput(inode);
 }
@@ -895,34 +917,36 @@ static void remove_dir(struct dentry *d)
 	dput(parent);
 }
 
-static void cgroup_clear_directory(struct dentry *dentry)
+static int cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
 {
-	struct list_head *node;
+	struct cfent *cfe;
+
+	lockdep_assert_held(&cgrp->dentry->d_inode->i_mutex);
+	lockdep_assert_held(&cgroup_mutex);
+
+	list_for_each_entry(cfe, &cgrp->files, node) {
+		struct dentry *d = cfe->dentry;
 
-	BUG_ON(!mutex_is_locked(&dentry->d_inode->i_mutex));
-	spin_lock(&dentry->d_lock);
-	node = dentry->d_subdirs.next;
-	while (node != &dentry->d_subdirs) {
-		struct dentry *d = list_entry(node, struct dentry, d_u.d_child);
-
-		spin_lock_nested(&d->d_lock, DENTRY_D_LOCK_NESTED);
-		list_del_init(node);
-		if (d->d_inode) {
-			/* This should never be called on a cgroup
-			 * directory with child cgroups */
-			BUG_ON(d->d_inode->i_mode & S_IFDIR);
-			dget_dlock(d);
-			spin_unlock(&d->d_lock);
-			spin_unlock(&dentry->d_lock);
-			d_delete(d);
-			simple_unlink(dentry->d_inode, d);
-			dput(d);
-			spin_lock(&dentry->d_lock);
-		} else
-			spin_unlock(&d->d_lock);
-		node = dentry->d_subdirs.next;
+		if (cft && cfe->type != cft)
+			continue;
+
+		dget(d);
+		d_delete(d);
+		simple_unlink(d->d_inode, d);
+		list_del_init(&cfe->node);
+		dput(d);
+
+		return 0;
 	}
-	spin_unlock(&dentry->d_lock);
+	return -ENOENT;
+}
+
+static void cgroup_clear_directory(struct dentry *dir)
+{
+	struct cgroup *cgrp = __d_cgrp(dir);
+
+	while (!list_empty(&cgrp->files))
+		cgroup_rm_file(cgrp, NULL);
 }
 
 /*
@@ -1352,6 +1376,7 @@ static void init_cgroup_housekeeping(str
 {
 	INIT_LIST_HEAD(&cgrp->sibling);
 	INIT_LIST_HEAD(&cgrp->children);
+	INIT_LIST_HEAD(&cgrp->files);
 	INIT_LIST_HEAD(&cgrp->css_sets);
 	INIT_LIST_HEAD(&cgrp->release_list);
 	INIT_LIST_HEAD(&cgrp->pidlists);
@@ -2623,7 +2648,9 @@ static int cgroup_add_file(struct cgroup
 			   const struct cftype *cft)
 {
 	struct dentry *dir = cgrp->dentry;
+	struct cgroup *parent = __d_cgrp(dir);
 	struct dentry *dentry;
+	struct cfent *cfe;
 	int error;
 	umode_t mode;
 	char name[MAX_CGROUP_TYPE_NAMELEN + MAX_CFTYPE_NAME + 2] = { 0 };
@@ -2639,17 +2666,31 @@ static int cgroup_add_file(struct cgroup
 		strcat(name, ".");
 	}
 	strcat(name, cft->name);
+
 	BUG_ON(!mutex_is_locked(&dir->d_inode->i_mutex));
+
+	cfe = kzalloc(sizeof(*cfe), GFP_KERNEL);
+	if (!cfe)
+		return -ENOMEM;
+
 	dentry = lookup_one_len(name, dir, strlen(name));
-	if (!IS_ERR(dentry)) {
-		mode = cgroup_file_mode(cft);
-		error = cgroup_create_file(dentry, mode | S_IFREG,
-						cgrp->root->sb);
-		if (!error)
-			dentry->d_fsdata = (void *)cft;
-		dput(dentry);
-	} else
+	if (IS_ERR(dentry)) {
 		error = PTR_ERR(dentry);
+		goto out;
+	}
+
+	mode = cgroup_file_mode(cft);
+	error = cgroup_create_file(dentry, mode | S_IFREG, cgrp->root->sb);
+	if (!error) {
+		cfe->type = (void *)cft;
+		cfe->dentry = dentry;
+		dentry->d_fsdata = cfe;
+		list_add_tail(&cfe->node, &parent->files);
+		cfe = NULL;
+	}
+	dput(dentry);
+out:
+	kfree(cfe);
 	return error;
 }
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (13 preceding siblings ...)
  2012-03-30 12:42 ` Li Zefan
@ 2012-03-30 22:29 ` Tejun Heo
  2012-03-31 12:44   ` Li Zefan
  2012-04-03  3:22   ` Glauber Costa
  2012-03-31 16:47 ` Tejun Heo
  15 siblings, 2 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-30 22:29 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups; +Cc: linux-kernel, fweisbec, rni, ctalbott

On Wed, Mar 21, 2012 at 03:17:33PM -0700, Tejun Heo wrote:
> and is also available in the following git branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git cgroup-cftype

Branch has been refreshed on top of the current mainline and with the
updated patch.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-30 22:29 ` Tejun Heo
@ 2012-03-31 12:44   ` Li Zefan
  2012-03-31 16:31     ` Tejun Heo
  2012-04-03  3:22   ` Glauber Costa
  1 sibling, 1 reply; 29+ messages in thread
From: Li Zefan @ 2012-03-31 12:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: glommer, lizf, containers, cgroups, fweisbec, ctalbott,
	linux-kernel, rni

Tejun Heo wrote:

> On Wed, Mar 21, 2012 at 03:17:33PM -0700, Tejun Heo wrote:
>> and is also available in the following git branch.
>>
>>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git cgroup-cftype
> 
> Branch has been refreshed on top of the current mainline and with the
> updated patch.
> 


I've gone through the whole patchset, feel free to add my acks.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-30 15:42   ` Tejun Heo
@ 2012-03-31 12:56     ` Li Zefan
  2012-03-31 16:30       ` Tejun Heo
  0 siblings, 1 reply; 29+ messages in thread
From: Li Zefan @ 2012-03-31 12:56 UTC (permalink / raw)
  To: Tejun Heo
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott

Tejun Heo wrote:

> Hey, Li.
> 
> On Fri, Mar 30, 2012 at 08:42:07PM +0800, Li Zefan wrote:
>>> This patch replaces cgroup file interface with cftype based one which
>>> allows dynamic additions and removals of cftype arrays whether the
>>> target subsystem is attached or not.  This can be used to make subsys
>>> rebinding via remount work properly but I intentionally avoided doing
>>> that at the moment.
>>
>> What's the problem with remount?
> 
> Subsys can't be changed if any non-root cgroup exists.
> 


Ah, this is a long-standing defect, which is documented in Doc/../cgroups.txt.
It's not trivial to fix it, and I never heard someone request it to be fixed.

>> and is it important enough that it should be fixed even the feature
>> is marked as deprecated?
> 
> I'm not sure.  We *might* need it during multi-mount -> single-mount
> transition depending on how that's implemented, so the "at the moment"
> qualifier.  It probably won't be fixed but I'm not fully sure.
> 


fine.

>>> This makes cgroup population simpler for controllers and will be used
>>> to allow controllers to be more dynamic.  e.g. blkio subsys has
>>> sub-policies which may come and go while blkio subsys is attached and
>>> it currently uses fixed set of files which stays blank if not in use.
>>
>> Dynamic cgroup files was mentioned before. The scenario in mind was blkio
>> control files can be added/removed automatically as devices come and ago.
>>
>> So this time blkio subsystem is really going to be made more dynamic
>> soon?
> 
> Patchset already posted.
> 
>   http://thread.gmane.org/gmane.linux.kernel.cgroups/1376
> 


Thanks!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-31 12:56     ` Li Zefan
@ 2012-03-31 16:30       ` Tejun Heo
  0 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-31 16:30 UTC (permalink / raw)
  To: Li Zefan
  Cc: glommer, lizf, containers, cgroups, linux-kernel, fweisbec, rni,
	ctalbott

Hello,

On Sat, Mar 31, 2012 at 08:56:36PM +0800, Li Zefan wrote:
> > Subsys can't be changed if any non-root cgroup exists.
> 
> Ah, this is a long-standing defect, which is documented in Doc/../cgroups.txt.
> It's not trivial to fix it, and I never heard someone request it to be fixed.

Yeah, it was a bit of head scratcher.  If it can be done only when
there's no cgroup configured, one might as well unmount and mount w/
new subsys instead of remounting.  There's the minute difference that
existing root cgroup state is preserved but that hardly justifies the
crippled feature.  IMHO, lack of fix request indicates the feature not
being used rather than anything else, so the deprecation proposal.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-31 12:44   ` Li Zefan
@ 2012-03-31 16:31     ` Tejun Heo
  0 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-31 16:31 UTC (permalink / raw)
  To: Li Zefan
  Cc: glommer, lizf, containers, cgroups, fweisbec, ctalbott,
	linux-kernel, rni

Hello,

On Sat, Mar 31, 2012 at 08:44:23PM +0800, Li Zefan wrote:
> I've gone through the whole patchset, feel free to add my acks.

I'll add acked-by's to all patches including the remount deprecation
patch and start cgroup/for-3.5 branch.  If there's any disagreement,
please scream.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
                   ` (14 preceding siblings ...)
  2012-03-30 22:29 ` Tejun Heo
@ 2012-03-31 16:47 ` Tejun Heo
  15 siblings, 0 replies; 29+ messages in thread
From: Tejun Heo @ 2012-03-31 16:47 UTC (permalink / raw)
  To: glommer, lizf, containers, cgroups; +Cc: linux-kernel, fweisbec, rni, ctalbott

On Wed, Mar 21, 2012 at 03:17:33PM -0700, Tejun Heo wrote:
> This is the second take at improving cgroup file addition/removal
> interface.  Changes from the last take[L] are,

Applied to cgroup/for-3.5 w/ Li's acks added.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-03-30 22:29 ` Tejun Heo
  2012-03-31 12:44   ` Li Zefan
@ 2012-04-03  3:22   ` Glauber Costa
  2012-04-03 18:47     ` Tejun Heo
  1 sibling, 1 reply; 29+ messages in thread
From: Glauber Costa @ 2012-04-03  3:22 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 03/31/2012 02:29 AM, Tejun Heo wrote:
> On Wed, Mar 21, 2012 at 03:17:33PM -0700, Tejun Heo wrote:
>> and is also available in the following git branch.
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git cgroup-cftype
>
> Branch has been refreshed on top of the current mainline and with the
> updated patch.
>
> Thanks.
>
Do I need to send you a patch to remove memcg's populate on top of this ?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-04-03  3:22   ` Glauber Costa
@ 2012-04-03 18:47     ` Tejun Heo
  2012-04-03 18:52       ` Tejun Heo
  0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2012-04-03 18:47 UTC (permalink / raw)
  To: Glauber Costa
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

Hello, Glauber.

On Tue, Apr 03, 2012 at 07:22:59AM +0400, Glauber Costa wrote:
> Do I need to send you a patch to remove memcg's populate on top of this ?

populate is already gone.  My memory is a bit fuzzy now.  Can you spot
something broken / to be improved re. memcg cftypes in the current
cgroup/for-3.5 branch?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-04-03 18:47     ` Tejun Heo
@ 2012-04-03 18:52       ` Tejun Heo
  2012-04-03 20:37         ` Glauber Costa
  0 siblings, 1 reply; 29+ messages in thread
From: Tejun Heo @ 2012-04-03 18:52 UTC (permalink / raw)
  To: Glauber Costa
  Cc: lizf, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On Tue, Apr 03, 2012 at 11:47:19AM -0700, Tejun Heo wrote:
> Hello, Glauber.
> 
> On Tue, Apr 03, 2012 at 07:22:59AM +0400, Glauber Costa wrote:
> > Do I need to send you a patch to remove memcg's populate on top of this ?
> 
> populate is already gone.  My memory is a bit fuzzy now.  Can you spot
> something broken / to be improved re. memcg cftypes in the current
> cgroup/for-3.5 branch?

Ugh... brainfart.  Please disregard the previous message.  Can you
please resend the patches on top of cgroup/for-3.5 and use Li's new
address - lizefan@huawei.com ?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCHSET] cgroup: cftype based file interface, take #2
  2012-04-03 18:52       ` Tejun Heo
@ 2012-04-03 20:37         ` Glauber Costa
  0 siblings, 0 replies; 29+ messages in thread
From: Glauber Costa @ 2012-04-03 20:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Li Zefan, containers, cgroups, linux-kernel, fweisbec, rni, ctalbott

On 04/03/2012 10:52 PM, Tejun Heo wrote:
> On Tue, Apr 03, 2012 at 11:47:19AM -0700, Tejun Heo wrote:
>> Hello, Glauber.
>>
>> On Tue, Apr 03, 2012 at 07:22:59AM +0400, Glauber Costa wrote:
>>> Do I need to send you a patch to remove memcg's populate on top of this ?
>>
>> populate is already gone.  My memory is a bit fuzzy now.  Can you spot
>> something broken / to be improved re. memcg cftypes in the current
>> cgroup/for-3.5 branch?
>
> Ugh... brainfart.  Please disregard the previous message.  Can you
> please resend the patches on top of cgroup/for-3.5 and use Li's new
> address - lizefan@huawei.com ?
>
> Thanks.
>
All right.
I'll give a new test to make sure it is still okay, and send shortly

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2012-04-03 20:39 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-21 22:17 [PATCHSET] cgroup: cftype based file interface, take #2 Tejun Heo
2012-03-21 22:17 ` [PATCH 01/12] cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() Tejun Heo
2012-03-21 22:17 ` [PATCH 02/12] cgroup: build list of all cgroups under a given cgroupfs_root Tejun Heo
2012-03-21 22:17 ` [PATCH 03/12] cgroup: implement cgroup_add_cftypes() and friends Tejun Heo
2012-03-21 22:17 ` [PATCH 04/12] cgroup: merge cft_release_agent cftype array into the base files array Tejun Heo
2012-03-21 22:17 ` [PATCH 05/12] cgroup: relocate cftype and cgroup_subsys definitions in controllers Tejun Heo
2012-03-21 22:17 ` [PATCH 06/12] cgroup: convert all non-memcg controllers to the new cftype interface Tejun Heo
2012-03-21 22:17 ` [PATCH 07/12] memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP Tejun Heo
2012-03-22  0:23   ` KAMEZAWA Hiroyuki
2012-03-21 22:17 ` [PATCH 08/12] cgroup: convert memcg controller to the new cftype interface Tejun Heo
2012-03-22  0:27   ` KAMEZAWA Hiroyuki
2012-03-21 22:17 ` [PATCH 09/12] cgroup: remove cgroup_add_file[s]() Tejun Heo
2012-03-21 22:17 ` [PATCH 10/12] cgroup: relocate __d_cgrp() and __d_cft() Tejun Heo
2012-03-21 22:17 ` [PATCH 11/12] cgroup: introduce struct cfent Tejun Heo
2012-03-30 20:42   ` [PATCH UPDATED " Tejun Heo
2012-03-21 22:17 ` [PATCH 12/12] cgroup: implement cgroup_rm_cftypes() Tejun Heo
2012-03-22  9:04 ` [PATCHSET] cgroup: cftype based file interface, take #2 Glauber Costa
2012-03-30 12:42 ` Li Zefan
2012-03-30 15:42   ` Tejun Heo
2012-03-31 12:56     ` Li Zefan
2012-03-31 16:30       ` Tejun Heo
2012-03-30 22:29 ` Tejun Heo
2012-03-31 12:44   ` Li Zefan
2012-03-31 16:31     ` Tejun Heo
2012-04-03  3:22   ` Glauber Costa
2012-04-03 18:47     ` Tejun Heo
2012-04-03 18:52       ` Tejun Heo
2012-04-03 20:37         ` Glauber Costa
2012-03-31 16:47 ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).