linux-kernel.vger.kernel.org archive mirror
* [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events"
@ 2015-08-11 17:58 Tejun Heo
  2015-08-11 17:58 ` [PATCH 1/8] cgroup: replace "cgroup.populated" with "cgroup.events" Tejun Heo
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel

Hello,

This patchset establishes conventions on low frequency events,
converts "cgroup.populated" to "cgroup.events" accordingly,
generalizes event handling and enables notifications for
"memory.events".

This patchset contains the following eight patches.

 0001-cgroup-replace-cgroup.populated-with-cgroup.events.patch
 0002-cgroup-replace-cftype-mode-with-CFTYPE_WORLD_WRITABL.patch
 0003-cgroup-relocate-cgroup_populate_dir.patch
 0004-cgroup-make-cgroup_addrm_files-clean-up-after-itself.patch
 0005-cgroup-cosmetic-updates-to-rebind_subsystems.patch
 0006-cgroup-restructure-file-creation-removal-handling.patch
 0007-cgroup-generalize-obtaining-the-handles-of-and-notif.patch
 0008-memcg-generate-file-modified-notifications-on-memory.patch

0001 replaces "cgroup.populated" with "cgroup.events".  0002-0006 are
prep patches.  0007 generalizes event notification.  0008 hooks up
event notifications for "memory.events".

This patchset is on top of cgroup/for-4.3 e753531991b8 ("Merge branch
'for-4.3-unified-base' into for-4.3") and available in the following
git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-events

diffstat follows.  Thanks.

 Documentation/cgroups/unified-hierarchy.txt |   15 +
 include/linux/cgroup-defs.h                 |   32 ++-
 include/linux/cgroup.h                      |   13 +
 kernel/cgroup.c                             |  264 ++++++++++++++--------------
 kernel/cpuset.c                             |    6 
 mm/memcontrol.c                             |    8 
 6 files changed, 194 insertions(+), 144 deletions(-)

--
tejun


* [PATCH 1/8] cgroup: replace "cgroup.populated" with "cgroup.events"
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
@ 2015-08-11 17:58 ` Tejun Heo
  2015-08-11 17:58 ` [PATCH 2/8] cgroup: replace cftype->mode with CFTYPE_WORLD_WRITABLE Tejun Heo
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel, Tejun Heo

memcg already uses "memory.events" for event reporting and other
controllers may need event reporting too.  Let's standardize on
"$SUBSYS.events" interface file for reporting events which don't
happen too frequently and thus can share event notification.

"cgroup.populated" is replaced with the "populated" field in
"cgroup.events" and the documentation is updated accordingly.
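
As a rough userspace illustration of the "key value" layout described
above (the patch prints "populated %d"), a reader of "cgroup.events"
could pick out one field as follows.  This is only a sketch:
events_lookup() is a hypothetical helper, not part of this patchset,
and it assumes one "key value" pair per line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up @key in a buffer of "key value" lines, e.g. "populated 1\n".
 * Returns 0 and stores the value in *val on success, -1 if the key is
 * missing or its value cannot be parsed.  Hypothetical helper.
 */
static int events_lookup(const char *buf, const char *key, long *val)
{
	size_t klen;
	const char *line = buf;

	assert(buf && key && val);
	klen = strlen(key);

	while (line && *line) {
		/* the key must be followed by a single space */
		if (!strncmp(line, key, klen) && line[klen] == ' ') {
			if (sscanf(line + klen + 1, "%ld", val) == 1)
				return 0;
			return -1;
		}
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}
```

A monitor would presumably read the file once, wait for a file
modified notification through poll or inotify, and then re-read the
file and call the helper again.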

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 Documentation/cgroups/unified-hierarchy.txt | 15 ++++++++++-----
 include/linux/cgroup-defs.h                 |  2 +-
 kernel/cgroup.c                             | 17 +++++++++--------
 3 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/Documentation/cgroups/unified-hierarchy.txt b/Documentation/cgroups/unified-hierarchy.txt
index 1ee9caf..7ea2bc1 100644
--- a/Documentation/cgroups/unified-hierarchy.txt
+++ b/Documentation/cgroups/unified-hierarchy.txt
@@ -341,11 +341,11 @@ is riddled with issues.
   unnecessarily complicated and probably done this way because event
   delivery itself was expensive.
 
-Unified hierarchy implements an interface file "cgroup.populated"
-which can be used to monitor whether the cgroup's subhierarchy has
-tasks in it or not.  Its value is 0 if there is no task in the cgroup
-and its descendants; otherwise, 1.  poll and [id]notify events are
-triggered when the value changes.
+Unified hierarchy implements "populated" field in "cgroup.events"
+interface file which can be used to monitor whether the cgroup's
+subhierarchy has tasks in it or not.  Its value is 0 if there is no
+task in the cgroup and its descendants; otherwise, 1.  poll and
+[id]notify events are triggered when the value changes.
 
 This is significantly lighter and simpler and trivially allows
 delegating management of subhierarchy - subhierarchy monitoring can
@@ -435,6 +435,11 @@ may be specified in any order and not all pairs have to be specified.
   the first entry in the file.  Specific entries can use "default" as
   its value to indicate inheritance of the default value.
 
+- For events which are not very high frequency, an interface file
+  "events" should be created which lists event key value pairs.
+  Whenever a notifiable event happens, file modified event should be
+  generated on the file.
+
 
 5-4. Per-Controller Changes
 
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 5294f1f..74d241d 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -226,7 +226,7 @@ struct cgroup {
 
 	struct kernfs_node *kn;		/* cgroup kernfs entry */
 	struct kernfs_node *procs_kn;	/* kn for "cgroup.procs" */
-	struct kernfs_node *populated_kn; /* kn for "cgroup.subtree_populated" */
+	struct kernfs_node *events_kn;	/* kn for "cgroup.events" */
 
 	/*
 	 * The bitmask of subsystems enabled on the child cgroups.
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index c4d94a5..43535fc 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -519,8 +519,8 @@ static void cgroup_update_populated(struct cgroup *cgrp, bool populated)
 		if (!trigger)
 			break;
 
-		if (cgrp->populated_kn)
-			kernfs_notify(cgrp->populated_kn);
+		if (cgrp->events_kn)
+			kernfs_notify(cgrp->events_kn);
 		cgrp = cgroup_parent(cgrp);
 	} while (cgrp);
 }
@@ -2944,9 +2944,10 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 	goto out_unlock;
 }
 
-static int cgroup_populated_show(struct seq_file *seq, void *v)
+static int cgroup_events_show(struct seq_file *seq, void *v)
 {
-	seq_printf(seq, "%d\n", (bool)seq_css(seq)->cgroup->populated_cnt);
+	seq_printf(seq, "populated %d\n",
+		   (bool)seq_css(seq)->cgroup->populated_cnt);
 	return 0;
 }
 
@@ -3113,8 +3114,8 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cftype *cft)
 
 	if (cft->write == cgroup_procs_write)
 		cgrp->procs_kn = kn;
-	else if (cft->seq_show == cgroup_populated_show)
-		cgrp->populated_kn = kn;
+	else if (cft->seq_show == cgroup_events_show)
+		cgrp->events_kn = kn;
 	return 0;
 }
 
@@ -4287,9 +4288,9 @@ static struct cftype cgroup_dfl_base_files[] = {
 		.write = cgroup_subtree_control_write,
 	},
 	{
-		.name = "cgroup.populated",
+		.name = "cgroup.events",
 		.flags = CFTYPE_NOT_ON_ROOT,
-		.seq_show = cgroup_populated_show,
+		.seq_show = cgroup_events_show,
 	},
 	{ }	/* terminate */
 };
-- 
2.4.3



* [PATCH 2/8] cgroup: replace cftype->mode with CFTYPE_WORLD_WRITABLE
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
  2015-08-11 17:58 ` [PATCH 1/8] cgroup: replace "cgroup.populated" with "cgroup.events" Tejun Heo
@ 2015-08-11 17:58 ` Tejun Heo
  2015-08-11 17:58 ` [PATCH 3/8] cgroup: relocate cgroup_populate_dir() Tejun Heo
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel, Tejun Heo

cftype->mode allows controllers to give arbitrary permissions to
interface knobs.  Except for "cgroup.event_control", the existing uses
are spurious.

* Some explicitly specify S_IRUGO | S_IWUSR even though that's the
  default.

* "cpuset.memory_pressure" specifies S_IRUGO while also setting a
  write callback which returns -EACCES.  All it needs to do is simply
  not set a write callback.

"cgroup.event_control" uses cftype->mode to make the file
world-writable.  It's a misdesigned interface and we don't want
controllers to be tweaking interface file permissions in general.
This patch removes cftype->mode and all its spurious uses and
implements CFTYPE_WORLD_WRITABLE for "cgroup.event_control" which is
marked as compatibility-only.
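
The resulting mode rule can be sketched in userspace C as below.  This
is an illustration only: SKETCH_IRUGO and SKETCH_IWUGO are stand-ins
for the kernel's S_IRUGO and S_IWUGO shorthands, and sketch_file_mode()
is a hypothetical name that takes plain flags instead of a cftype.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Userspace stand-ins for the kernel's S_IRUGO / S_IWUGO shorthands. */
#define SKETCH_IRUGO	(S_IRUSR | S_IRGRP | S_IROTH)
#define SKETCH_IWUGO	(S_IWUSR | S_IWGRP | S_IWOTH)

static_assert(SKETCH_IRUGO == 0444, "read bits for user, group, other");
static_assert(SKETCH_IWUGO == 0222, "write bits for user, group, other");

/*
 * Readable files get S_IRUGO.  Writable files get S_IWUSR unless they
 * are flagged world-writable, in which case they get S_IWUGO.
 */
static mode_t sketch_file_mode(int has_read, int has_write,
			       int world_writable)
{
	mode_t mode = 0;

	if (has_read)
		mode |= SKETCH_IRUGO;

	if (has_write)
		mode |= world_writable ? SKETCH_IWUGO : S_IWUSR;

	return mode;
}
```

Note that the world-writable flag has no effect without a write
handler, which mirrors how the flag is only checked inside the write
branch in the patch.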

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/cgroup-defs.h |  6 +-----
 kernel/cgroup.c             | 19 +++++++------------
 kernel/cpuset.c             |  6 ------
 mm/memcontrol.c             |  3 +--
 4 files changed, 9 insertions(+), 25 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 74d241d..93f48ca 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -76,6 +76,7 @@ enum {
 	CFTYPE_ONLY_ON_ROOT	= (1 << 0),	/* only create on root cgrp */
 	CFTYPE_NOT_ON_ROOT	= (1 << 1),	/* don't create on root cgrp */
 	CFTYPE_NO_PREFIX	= (1 << 3),	/* (DON'T USE FOR NEW FILES) no subsys prefix */
+	CFTYPE_WORLD_WRITABLE	= (1 << 4),	/* (DON'T USE FOR NEW FILES) S_IWUGO */
 
 	/* internal flags, do not use outside cgroup core proper */
 	__CFTYPE_ONLY_ON_DFL	= (1 << 16),	/* only on default hierarchy */
@@ -324,11 +325,6 @@ struct cftype {
 	 */
 	char name[MAX_CFTYPE_NAME];
 	unsigned long private;
-	/*
-	 * If not 0, file mode is set to this value, otherwise it will
-	 * be figured out automatically
-	 */
-	umode_t mode;
 
 	/*
 	 * The maximum length of string, excluding trailing nul, that can
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 43535fc..a909e4d 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1044,23 +1044,21 @@ static char *cgroup_file_name(struct cgroup *cgrp, const struct cftype *cft,
  * cgroup_file_mode - deduce file mode of a control file
  * @cft: the control file in question
  *
- * returns cft->mode if ->mode is not 0
- * returns S_IRUGO|S_IWUSR if it has both a read and a write handler
- * returns S_IRUGO if it has only a read handler
- * returns S_IWUSR if it has only a write hander
+ * S_IRUGO for read, S_IWUSR for write.
  */
 static umode_t cgroup_file_mode(const struct cftype *cft)
 {
 	umode_t mode = 0;
 
-	if (cft->mode)
-		return cft->mode;
-
 	if (cft->read_u64 || cft->read_s64 || cft->seq_show)
 		mode |= S_IRUGO;
 
-	if (cft->write_u64 || cft->write_s64 || cft->write)
-		mode |= S_IWUSR;
+	if (cft->write_u64 || cft->write_s64 || cft->write) {
+		if (cft->flags & CFTYPE_WORLD_WRITABLE)
+			mode |= S_IWUGO;
+		else
+			mode |= S_IWUSR;
+	}
 
 	return mode;
 }
@@ -4270,7 +4268,6 @@ static struct cftype cgroup_dfl_base_files[] = {
 		.seq_show = cgroup_pidlist_show,
 		.private = CGROUP_FILE_PROCS,
 		.write = cgroup_procs_write,
-		.mode = S_IRUGO | S_IWUSR,
 	},
 	{
 		.name = "cgroup.controllers",
@@ -4305,7 +4302,6 @@ static struct cftype cgroup_legacy_base_files[] = {
 		.seq_show = cgroup_pidlist_show,
 		.private = CGROUP_FILE_PROCS,
 		.write = cgroup_procs_write,
-		.mode = S_IRUGO | S_IWUSR,
 	},
 	{
 		.name = "cgroup.clone_children",
@@ -4325,7 +4321,6 @@ static struct cftype cgroup_legacy_base_files[] = {
 		.seq_show = cgroup_pidlist_show,
 		.private = CGROUP_FILE_TASKS,
 		.write = cgroup_tasks_write,
-		.mode = S_IRUGO | S_IWUSR,
 	},
 	{
 		.name = "notify_on_release",
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index ee14e3a..4da3f45 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1594,9 +1594,6 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	case FILE_MEMORY_PRESSURE_ENABLED:
 		cpuset_memory_pressure_enabled = !!val;
 		break;
-	case FILE_MEMORY_PRESSURE:
-		retval = -EACCES;
-		break;
 	case FILE_SPREAD_PAGE:
 		retval = update_flag(CS_SPREAD_PAGE, cs, val);
 		break;
@@ -1863,9 +1860,6 @@ static struct cftype files[] = {
 	{
 		.name = "memory_pressure",
 		.read_u64 = cpuset_read_u64,
-		.write_u64 = cpuset_write_u64,
-		.private = FILE_MEMORY_PRESSURE,
-		.mode = S_IRUGO,
 	},
 
 	{
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index acb93c5..78ba418 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4360,8 +4360,7 @@ static struct cftype mem_cgroup_legacy_files[] = {
 	{
 		.name = "cgroup.event_control",		/* XXX: for compat */
 		.write = memcg_write_event_control,
-		.flags = CFTYPE_NO_PREFIX,
-		.mode = S_IWUGO,
+		.flags = CFTYPE_NO_PREFIX | CFTYPE_WORLD_WRITABLE,
 	},
 	{
 		.name = "swappiness",
-- 
2.4.3



* [PATCH 3/8] cgroup: relocate cgroup_populate_dir()
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
  2015-08-11 17:58 ` [PATCH 1/8] cgroup: replace "cgroup.populated" with "cgroup.events" Tejun Heo
  2015-08-11 17:58 ` [PATCH 2/8] cgroup: replace cftype->mode with CFTYPE_WORLD_WRITABLE Tejun Heo
@ 2015-08-11 17:58 ` Tejun Heo
  2015-08-11 17:58 ` [PATCH 4/8] cgroup: make cgroup_addrm_files() clean up after itself on failures Tejun Heo
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel, Tejun Heo

Move it upwards so that it's right below cgroup_clear_dir() and the
forward declaration is unnecessary.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 kernel/cgroup.c | 63 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 31 insertions(+), 32 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a909e4d..92b8cc7 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1024,7 +1024,6 @@ static struct cgroup *task_cgroup_from_root(struct task_struct *task,
  * update of a tasks cgroup pointer by cgroup_attach_task()
  */
 
-static int cgroup_populate_dir(struct cgroup *cgrp, unsigned long subsys_mask);
 static struct kernfs_syscall_ops cgroup_kf_syscall_ops;
 static const struct file_operations proc_cgroupstats_operations;
 
@@ -1238,6 +1237,37 @@ static void cgroup_clear_dir(struct cgroup *cgrp, unsigned long subsys_mask)
 	}
 }
 
+/**
+ * cgroup_populate_dir - create subsys files in a cgroup directory
+ * @cgrp: target cgroup
+ * @subsys_mask: mask of the subsystem ids whose files should be added
+ *
+ * On failure, no file is added.
+ */
+static int cgroup_populate_dir(struct cgroup *cgrp, unsigned long subsys_mask)
+{
+	struct cgroup_subsys *ss;
+	int i, ret = 0;
+
+	/* process cftsets of each subsystem */
+	for_each_subsys(ss, i) {
+		struct cftype *cfts;
+
+		if (!(subsys_mask & (1 << i)))
+			continue;
+
+		list_for_each_entry(cfts, &ss->cfts, node) {
+			ret = cgroup_addrm_files(cgrp, cfts, true);
+			if (ret < 0)
+				goto err;
+		}
+	}
+	return 0;
+err:
+	cgroup_clear_dir(cgrp, subsys_mask);
+	return ret;
+}
+
 static int rebind_subsystems(struct cgroup_root *dst_root,
 			     unsigned long ss_mask)
 {
@@ -4337,37 +4367,6 @@ static struct cftype cgroup_legacy_base_files[] = {
 	{ }	/* terminate */
 };
 
-/**
- * cgroup_populate_dir - create subsys files in a cgroup directory
- * @cgrp: target cgroup
- * @subsys_mask: mask of the subsystem ids whose files should be added
- *
- * On failure, no file is added.
- */
-static int cgroup_populate_dir(struct cgroup *cgrp, unsigned long subsys_mask)
-{
-	struct cgroup_subsys *ss;
-	int i, ret = 0;
-
-	/* process cftsets of each subsystem */
-	for_each_subsys(ss, i) {
-		struct cftype *cfts;
-
-		if (!(subsys_mask & (1 << i)))
-			continue;
-
-		list_for_each_entry(cfts, &ss->cfts, node) {
-			ret = cgroup_addrm_files(cgrp, cfts, true);
-			if (ret < 0)
-				goto err;
-		}
-	}
-	return 0;
-err:
-	cgroup_clear_dir(cgrp, subsys_mask);
-	return ret;
-}
-
 /*
  * css destruction is four-stage process.
  *
-- 
2.4.3



* [PATCH 4/8] cgroup: make cgroup_addrm_files() clean up after itself on failures
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
                   ` (2 preceding siblings ...)
  2015-08-11 17:58 ` [PATCH 3/8] cgroup: relocate cgroup_populate_dir() Tejun Heo
@ 2015-08-11 17:58 ` Tejun Heo
  2015-08-11 17:58 ` [PATCH 5/8] cgroup: cosmetic updates to rebind_subsystems() Tejun Heo
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel, Tejun Heo

After a file creation failure, cgroup_addrm_files() didn't remove
the files which had already been created.  When cgroup_populate_dir()
is the caller, this is fine as the caller performs cleanup; however,
for other callers, this may leave unactivated dangling files behind.
As kernfs directory removals are recursive, this doesn't lead to
permanent memory leak but it can, for example, fail future attempts to
create those files again.

There's no point in keeping around this sort of subtlety and it gets
in the way of planned updates to file handling.  This patch makes
cgroup_addrm_files() clean up after itself on failures.
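
The rollback idea, re-walking the same array in removal mode up to the
entry that failed, can be sketched in isolation as below.  The struct
and function names are hypothetical and file creation is simulated
with a flag.  Unlike the hunk shown in this mail, the sketch also
makes the propagated error explicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_file {
	const char *name;	/* "" terminates the array, like cftype */
	bool fail;		/* pretend creation fails for this entry */
	bool present;		/* is the file currently created? */
};

/* Add or remove all entries; a failed add undoes the earlier adds. */
static int sketch_addrm(struct sketch_file files[], bool is_add)
{
	struct sketch_file *f, *f_end = NULL;
	int ret = 0;

	assert(files != NULL);
restart:
	for (f = files; f != f_end && f->name[0] != '\0'; f++) {
		if (is_add) {
			if (f->fail) {
				/* switch to removal, stop before @f */
				ret = -1;
				f_end = f;
				is_add = false;
				goto restart;
			}
			f->present = true;
		} else {
			f->present = false;
		}
	}
	return ret;
}
```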

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 kernel/cgroup.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 92b8cc7..5e5a4e0 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3154,19 +3154,18 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cftype *cft)
  * @is_add: whether to add or remove
  *
  * Depending on @is_add, add or remove files defined by @cfts on @cgrp.
- * For removals, this function never fails.  If addition fails, this
- * function doesn't remove files already added.  The caller is responsible
- * for cleaning up.
+ * For removals, this function never fails.
  */
 static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
 			      bool is_add)
 {
-	struct cftype *cft;
+	struct cftype *cft, *cft_end = NULL;
 	int ret;
 
 	lockdep_assert_held(&cgroup_mutex);
 
-	for (cft = cfts; cft->name[0] != '\0'; cft++) {
+restart:
+	for (cft = cfts; cft != cft_end && cft->name[0] != '\0'; cft++) {
 		/* does cft->flags tell us to skip this file on @cgrp? */
 		if ((cft->flags & __CFTYPE_ONLY_ON_DFL) && !cgroup_on_dfl(cgrp))
 			continue;
@@ -3182,7 +3181,9 @@ static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
 			if (ret) {
 				pr_warn("%s: failed to add %s, err=%d\n",
 					__func__, cft->name, ret);
-				return ret;
+				cft_end = cft;
+				is_add = false;
+				goto restart;
 			}
 		} else {
 			cgroup_rm_file(cgrp, cft);
-- 
2.4.3



* [PATCH 5/8] cgroup: cosmetic updates to rebind_subsystems()
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
                   ` (3 preceding siblings ...)
  2015-08-11 17:58 ` [PATCH 4/8] cgroup: make cgroup_addrm_files() clean up after itself on failures Tejun Heo
@ 2015-08-11 17:58 ` Tejun Heo
  2015-08-11 17:58 ` [PATCH 6/8] cgroup: restructure file creation / removal handling Tejun Heo
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel, Tejun Heo

* Use local variables @scgrp and @dcgrp for @src_root->cgrp and
  @dst_root->cgrp respectively.

* Use initializers to set @src_root and @css in the inner bind loop.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 kernel/cgroup.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 5e5a4e0..67d2ba3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1271,6 +1271,7 @@ static int cgroup_populate_dir(struct cgroup *cgrp, unsigned long subsys_mask)
 static int rebind_subsystems(struct cgroup_root *dst_root,
 			     unsigned long ss_mask)
 {
+	struct cgroup *dcgrp = &dst_root->cgrp;
 	struct cgroup_subsys *ss;
 	unsigned long tmp_ss_mask;
 	int ssid, i, ret;
@@ -1292,7 +1293,7 @@ static int rebind_subsystems(struct cgroup_root *dst_root,
 	if (dst_root == &cgrp_dfl_root)
 		tmp_ss_mask &= ~cgrp_dfl_root_inhibit_ss_mask;
 
-	ret = cgroup_populate_dir(&dst_root->cgrp, tmp_ss_mask);
+	ret = cgroup_populate_dir(dcgrp, tmp_ss_mask);
 	if (ret) {
 		if (dst_root != &cgrp_dfl_root)
 			return ret;
@@ -1318,42 +1319,40 @@ static int rebind_subsystems(struct cgroup_root *dst_root,
 		cgroup_clear_dir(&ss->root->cgrp, 1 << ssid);
 
 	for_each_subsys_which(ss, ssid, &ss_mask) {
-		struct cgroup_root *src_root;
-		struct cgroup_subsys_state *css;
+		struct cgroup_root *src_root = ss->root;
+		struct cgroup *scgrp = &src_root->cgrp;
+		struct cgroup_subsys_state *css = cgroup_css(scgrp, ss);
 		struct css_set *cset;
 
-		src_root = ss->root;
-		css = cgroup_css(&src_root->cgrp, ss);
-
-		WARN_ON(!css || cgroup_css(&dst_root->cgrp, ss));
+		WARN_ON(!css || cgroup_css(dcgrp, ss));
 
-		RCU_INIT_POINTER(src_root->cgrp.subsys[ssid], NULL);
-		rcu_assign_pointer(dst_root->cgrp.subsys[ssid], css);
+		RCU_INIT_POINTER(scgrp->subsys[ssid], NULL);
+		rcu_assign_pointer(dcgrp->subsys[ssid], css);
 		ss->root = dst_root;
-		css->cgroup = &dst_root->cgrp;
+		css->cgroup = dcgrp;
 
 		down_write(&css_set_rwsem);
 		hash_for_each(css_set_table, i, cset, hlist)
 			list_move_tail(&cset->e_cset_node[ss->id],
-				       &dst_root->cgrp.e_csets[ss->id]);
+				       &dcgrp->e_csets[ss->id]);
 		up_write(&css_set_rwsem);
 
 		src_root->subsys_mask &= ~(1 << ssid);
-		src_root->cgrp.subtree_control &= ~(1 << ssid);
-		cgroup_refresh_child_subsys_mask(&src_root->cgrp);
+		scgrp->subtree_control &= ~(1 << ssid);
+		cgroup_refresh_child_subsys_mask(scgrp);
 
 		/* default hierarchy doesn't enable controllers by default */
 		dst_root->subsys_mask |= 1 << ssid;
 		if (dst_root != &cgrp_dfl_root) {
-			dst_root->cgrp.subtree_control |= 1 << ssid;
-			cgroup_refresh_child_subsys_mask(&dst_root->cgrp);
+			dcgrp->subtree_control |= 1 << ssid;
+			cgroup_refresh_child_subsys_mask(dcgrp);
 		}
 
 		if (ss->bind)
 			ss->bind(css);
 	}
 
-	kernfs_activate(dst_root->cgrp.kn);
+	kernfs_activate(dcgrp->kn);
 	return 0;
 }
 
-- 
2.4.3



* [PATCH 6/8] cgroup: restructure file creation / removal handling
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
                   ` (4 preceding siblings ...)
  2015-08-11 17:58 ` [PATCH 5/8] cgroup: cosmetic updates to rebind_subsystems() Tejun Heo
@ 2015-08-11 17:58 ` Tejun Heo
  2015-08-11 17:58 ` [PATCH 7/8] cgroup: generalize obtaining the handles of and notifying cgroup files Tejun Heo
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel, Tejun Heo

The file creation / removal path has always been a bit icky and the
planned notification update requires css during file creation.
Restructure as follows.

* cgroup_addrm_files() now takes both @css and @cgrp and is only
  called directly by other file handling functions.

* cgroup_populate/clear_dir() are replaced with
  css_populate/clear_dir() taking @css and @cgrp_override.
  @cgrp_override is used only when files need to be created on /
  removed from a cgroup which isn't attached to @css which happens
  during subsystem rebinds.  Subsystem loops are moved to the callers.

* cgroup_add_file() now takes both @css and @cgrp.  @css isn't used
  yet but will be used by the planned notification update.
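
The @cgrp_override convention boils down to "use the override if one
is given, otherwise fall back to css->cgroup".  A minimal standalone
sketch, using hypothetical stand-in structs rather than the kernel
types and spelling out the GNU "?:" shorthand portably:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct cgroup and cgroup_subsys_state. */
struct sketch_cgroup {
	const char *name;
};

struct sketch_css {
	struct sketch_cgroup *cgroup;
};

/* Pick the cgroup whose directory should be operated on. */
static struct sketch_cgroup *sketch_target(struct sketch_css *css,
					   struct sketch_cgroup *override)
{
	assert(css != NULL);

	/* same result as the GNU C form: override ?: css->cgroup */
	return override ? override : css->cgroup;
}
```

Callers that operate on the css's own cgroup pass NULL, which is what
most call sites in this patch do.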

This patch doesn't cause any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 kernel/cgroup.c | 143 ++++++++++++++++++++++++++++++--------------------------
 1 file changed, 76 insertions(+), 67 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 67d2ba3..b287522 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -200,7 +200,8 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss,
 		      bool visible);
 static void css_release(struct percpu_ref *ref);
 static void kill_css(struct cgroup_subsys_state *css);
-static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
+static int cgroup_addrm_files(struct cgroup_subsys_state *css,
+			      struct cgroup *cgrp, struct cftype cfts[],
 			      bool is_add);
 
 /* IDR wrappers which synchronize using cgroup_idr_lock */
@@ -1218,53 +1219,57 @@ static void cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
 }
 
 /**
- * cgroup_clear_dir - remove subsys files in a cgroup directory
- * @cgrp: target cgroup
- * @subsys_mask: mask of the subsystem ids whose files should be removed
+ * css_clear_dir - remove subsys files in a cgroup directory
+ * @css: target css
+ * @cgrp_override: specify if target cgroup is different from css->cgroup
  */
-static void cgroup_clear_dir(struct cgroup *cgrp, unsigned long subsys_mask)
+static void css_clear_dir(struct cgroup_subsys_state *css,
+			  struct cgroup *cgrp_override)
 {
-	struct cgroup_subsys *ss;
-	int i;
+	struct cgroup *cgrp = cgrp_override ?: css->cgroup;
+	struct cftype *cfts;
 
-	for_each_subsys(ss, i) {
-		struct cftype *cfts;
-
-		if (!(subsys_mask & (1 << i)))
-			continue;
-		list_for_each_entry(cfts, &ss->cfts, node)
-			cgroup_addrm_files(cgrp, cfts, false);
-	}
+	list_for_each_entry(cfts, &css->ss->cfts, node)
+		cgroup_addrm_files(css, cgrp, cfts, false);
 }
 
 /**
- * cgroup_populate_dir - create subsys files in a cgroup directory
- * @cgrp: target cgroup
- * @subsys_mask: mask of the subsystem ids whose files should be added
+ * css_populate_dir - create subsys files in a cgroup directory
+ * @css: target css
+ * @cgrp_override: specify if target cgroup is different from css->cgroup
  *
  * On failure, no file is added.
  */
-static int cgroup_populate_dir(struct cgroup *cgrp, unsigned long subsys_mask)
+static int css_populate_dir(struct cgroup_subsys_state *css,
+			    struct cgroup *cgrp_override)
 {
-	struct cgroup_subsys *ss;
-	int i, ret = 0;
+	struct cgroup *cgrp = cgrp_override ?: css->cgroup;
+	struct cftype *cfts, *failed_cfts;
+	int ret;
 
-	/* process cftsets of each subsystem */
-	for_each_subsys(ss, i) {
-		struct cftype *cfts;
+	if (!css->ss) {
+		if (cgroup_on_dfl(cgrp))
+			cfts = cgroup_dfl_base_files;
+		else
+			cfts = cgroup_legacy_base_files;
 
-		if (!(subsys_mask & (1 << i)))
-			continue;
+		return cgroup_addrm_files(&cgrp->self, cgrp, cfts, true);
+	}
 
-		list_for_each_entry(cfts, &ss->cfts, node) {
-			ret = cgroup_addrm_files(cgrp, cfts, true);
-			if (ret < 0)
-				goto err;
+	list_for_each_entry(cfts, &css->ss->cfts, node) {
+		ret = cgroup_addrm_files(css, cgrp, cfts, true);
+		if (ret < 0) {
+			failed_cfts = cfts;
+			goto err;
 		}
 	}
 	return 0;
 err:
-	cgroup_clear_dir(cgrp, subsys_mask);
+	list_for_each_entry(cfts, &css->ss->cfts, node) {
+		if (cfts == failed_cfts)
+			break;
+		cgroup_addrm_files(css, cgrp, cfts, false);
+	}
 	return ret;
 }
 
@@ -1293,10 +1298,13 @@ static int rebind_subsystems(struct cgroup_root *dst_root,
 	if (dst_root == &cgrp_dfl_root)
 		tmp_ss_mask &= ~cgrp_dfl_root_inhibit_ss_mask;
 
-	ret = cgroup_populate_dir(dcgrp, tmp_ss_mask);
-	if (ret) {
-		if (dst_root != &cgrp_dfl_root)
-			return ret;
+	for_each_subsys_which(ss, ssid, &tmp_ss_mask) {
+		struct cgroup *scgrp = &ss->root->cgrp;
+		int tssid;
+
+		ret = css_populate_dir(cgroup_css(scgrp, ss), dcgrp);
+		if (!ret)
+			continue;
 
 		/*
 		 * Rebinding back to the default root is not allowed to
@@ -1304,20 +1312,27 @@ static int rebind_subsystems(struct cgroup_root *dst_root,
 		 * be rare.  Moving subsystems back and forth even more so.
 		 * Just warn about it and continue.
 		 */
-		if (cgrp_dfl_root_visible) {
-			pr_warn("failed to create files (%d) while rebinding 0x%lx to default root\n",
-				ret, ss_mask);
-			pr_warn("you may retry by moving them to a different hierarchy and unbinding\n");
+		if (dst_root == &cgrp_dfl_root) {
+			if (cgrp_dfl_root_visible) {
+				pr_warn("failed to create files (%d) while rebinding 0x%lx to default root\n",
+					ret, ss_mask);
+				pr_warn("you may retry by moving them to a different hierarchy and unbinding\n");
+			}
+			continue;
 		}
+
+		for_each_subsys_which(ss, tssid, &tmp_ss_mask) {
+			if (tssid == ssid)
+				break;
+			css_clear_dir(cgroup_css(scgrp, ss), dcgrp);
+		}
+		return ret;
 	}
 
 	/*
 	 * Nothing can fail from this point on.  Remove files for the
 	 * removed subsystems and rebind each subsystem.
 	 */
-	for_each_subsys_which(ss, ssid, &ss_mask)
-		cgroup_clear_dir(&ss->root->cgrp, 1 << ssid);
-
 	for_each_subsys_which(ss, ssid, &ss_mask) {
 		struct cgroup_root *src_root = ss->root;
 		struct cgroup *scgrp = &src_root->cgrp;
@@ -1326,6 +1341,8 @@ static int rebind_subsystems(struct cgroup_root *dst_root,
 
 		WARN_ON(!css || cgroup_css(dcgrp, ss));
 
+		css_clear_dir(css, NULL);
+
 		RCU_INIT_POINTER(scgrp->subsys[ssid], NULL);
 		rcu_assign_pointer(dcgrp->subsys[ssid], css);
 		ss->root = dst_root;
@@ -1691,7 +1708,6 @@ static int cgroup_setup_root(struct cgroup_root *root, unsigned long ss_mask)
 {
 	LIST_HEAD(tmp_links);
 	struct cgroup *root_cgrp = &root->cgrp;
-	struct cftype *base_files;
 	struct css_set *cset;
 	int i, ret;
 
@@ -1730,12 +1746,7 @@ static int cgroup_setup_root(struct cgroup_root *root, unsigned long ss_mask)
 	}
 	root_cgrp->kn = root->kf_root->kn;
 
-	if (root == &cgrp_dfl_root)
-		base_files = cgroup_dfl_base_files;
-	else
-		base_files = cgroup_legacy_base_files;
-
-	ret = cgroup_addrm_files(root_cgrp, base_files, true);
+	ret = css_populate_dir(&root_cgrp->self, NULL);
 	if (ret)
 		goto destroy_root;
 
@@ -2884,7 +2895,8 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 				ret = create_css(child, ss,
 					cgrp->subtree_control & (1 << ssid));
 			else
-				ret = cgroup_populate_dir(child, 1 << ssid);
+				ret = css_populate_dir(cgroup_css(child, ss),
+						       NULL);
 			if (ret)
 				goto err_undo_css;
 		}
@@ -2917,7 +2929,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 			if (css_disable & (1 << ssid)) {
 				kill_css(css);
 			} else {
-				cgroup_clear_dir(child, 1 << ssid);
+				css_clear_dir(css, NULL);
 				if (ss->css_reset)
 					ss->css_reset(css);
 			}
@@ -2965,7 +2977,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 			if (css_enable & (1 << ssid))
 				kill_css(css);
 			else
-				cgroup_clear_dir(child, 1 << ssid);
+				css_clear_dir(css, NULL);
 		}
 	}
 	goto out_unlock;
@@ -3117,7 +3129,8 @@ static int cgroup_kn_set_ugid(struct kernfs_node *kn)
 	return kernfs_setattr(kn, &iattr);
 }
 
-static int cgroup_add_file(struct cgroup *cgrp, struct cftype *cft)
+static int cgroup_add_file(struct cgroup_subsys_state *css, struct cgroup *cgrp,
+			   struct cftype *cft)
 {
 	char name[CGROUP_FILE_NAME_MAX];
 	struct kernfs_node *kn;
@@ -3148,14 +3161,16 @@ static int cgroup_add_file(struct cgroup *cgrp, struct cftype *cft)
 
 /**
  * cgroup_addrm_files - add or remove files to a cgroup directory
- * @cgrp: the target cgroup
+ * @css: the target css
+ * @cgrp: the target cgroup (usually css->cgroup)
  * @cfts: array of cftypes to be added
  * @is_add: whether to add or remove
  *
  * Depending on @is_add, add or remove files defined by @cfts on @cgrp.
  * For removals, this function never fails.
  */
-static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
+static int cgroup_addrm_files(struct cgroup_subsys_state *css,
+			      struct cgroup *cgrp, struct cftype cfts[],
 			      bool is_add)
 {
 	struct cftype *cft, *cft_end = NULL;
@@ -3176,7 +3191,7 @@ static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
 			continue;
 
 		if (is_add) {
-			ret = cgroup_add_file(cgrp, cft);
+			ret = cgroup_add_file(css, cgrp, cft);
 			if (ret) {
 				pr_warn("%s: failed to add %s, err=%d\n",
 					__func__, cft->name, ret);
@@ -3208,7 +3223,7 @@ static int cgroup_apply_cftypes(struct cftype *cfts, bool is_add)
 		if (cgroup_is_dead(cgrp))
 			continue;
 
-		ret = cgroup_addrm_files(cgrp, cfts, is_add);
+		ret = cgroup_addrm_files(css, cgrp, cfts, is_add);
 		if (ret)
 			break;
 	}
@@ -4584,7 +4599,7 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss,
 	css->id = err;
 
 	if (visible) {
-		err = cgroup_populate_dir(cgrp, 1 << ss->id);
+		err = css_populate_dir(css, NULL);
 		if (err)
 			goto err_free_id;
 	}
@@ -4610,7 +4625,7 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss,
 
 err_list_del:
 	list_del_rcu(&css->sibling);
-	cgroup_clear_dir(css->cgroup, 1 << css->ss->id);
+	css_clear_dir(css, NULL);
 err_free_id:
 	cgroup_idr_remove(&ss->css_idr, css->id);
 err_free_percpu_ref:
@@ -4627,7 +4642,6 @@ static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 	struct cgroup_root *root;
 	struct cgroup_subsys *ss;
 	struct kernfs_node *kn;
-	struct cftype *base_files;
 	int ssid, ret;
 
 	/* Do not accept '\n' to prevent making /proc/<pid>/cgroup unparsable.
@@ -4703,12 +4717,7 @@ static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 	if (ret)
 		goto out_destroy;
 
-	if (cgroup_on_dfl(cgrp))
-		base_files = cgroup_dfl_base_files;
-	else
-		base_files = cgroup_legacy_base_files;
-
-	ret = cgroup_addrm_files(cgrp, base_files, true);
+	ret = css_populate_dir(&cgrp->self, NULL);
 	if (ret)
 		goto out_destroy;
 
@@ -4795,7 +4804,7 @@ static void kill_css(struct cgroup_subsys_state *css)
 	 * This must happen before css is disassociated with its cgroup.
 	 * See seq_css() for details.
 	 */
-	cgroup_clear_dir(css->cgroup, 1 << css->ss->id);
+	css_clear_dir(css, NULL);
 
 	/*
 	 * Killing would put the base ref, but we need to keep it alive
-- 
2.4.3


* [PATCH 7/8] cgroup: generalize obtaining the handles of and notifying cgroup files
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
                   ` (5 preceding siblings ...)
  2015-08-11 17:58 ` [PATCH 6/8] cgroup: restructure file creation / removal handling Tejun Heo
@ 2015-08-11 17:58 ` Tejun Heo
  2015-08-11 17:58 ` [PATCH 8/8] memcg: generate file modified notifications on "memory.events" Tejun Heo
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC (permalink / raw)
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel, Tejun Heo

cgroup core handles creations and removals of cgroup interface files
as described by cftypes.  There are cases where the handle for a given
file instance is necessary, for example, to generate a file modified
event.  Currently, this is handled by explicitly matching the callback
method pointer and storing the file handle manually in
cgroup_add_file().  While this simple approach works for cgroup core
files, it doesn't work for controller interface files.

This patch generalizes cgroup interface file handle handling.  struct
cgroup_file is defined and each cftype can optionally tell cgroup core
to store the file handle by setting ->file_offset.  A file handle
remains accessible as long as the containing css is accessible.

Both "cgroup.procs" and "cgroup.events" are converted to use the new
generic mechanism instead of hooking directly into cgroup_add_file().
Also, cgroup_file_notify(), which takes a struct cgroup_file and
generates a file modified event on it, is added and replaces explicit
kernfs_notify() invocations.

This generalizes cgroup file handle handling and allows controllers to
generate file modified notifications.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/cgroup-defs.h | 26 ++++++++++++++++++++++++--
 include/linux/cgroup.h      | 13 +++++++++++++
 kernel/cgroup.c             | 26 +++++++++++++++++++-------
 3 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 93f48ca..cc5898a 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -84,6 +84,17 @@ enum {
 };
 
 /*
+ * cgroup_file is the handle for a file instance created in a cgroup which
+ * is used, for example, to generate file changed notifications.  This can
+ * be obtained by setting cftype->file_offset.
+ */
+struct cgroup_file {
+	/* do not access any fields from outside cgroup core */
+	struct list_head node;			/* anchored at css->files */
+	struct kernfs_node *kn;
+};
+
+/*
  * Per-subsystem/per-cgroup state maintained by the system.  This is the
  * fundamental structural building block that controllers deal with.
  *
@@ -123,6 +134,9 @@ struct cgroup_subsys_state {
 	 */
 	u64 serial_nr;
 
+	/* all cgroup_files associated with this css */
+	struct list_head files;
+
 	/* percpu_ref killing and RCU release */
 	struct rcu_head rcu_head;
 	struct work_struct destroy_work;
@@ -226,8 +240,8 @@ struct cgroup {
 	int populated_cnt;
 
 	struct kernfs_node *kn;		/* cgroup kernfs entry */
-	struct kernfs_node *procs_kn;	/* kn for "cgroup.procs" */
-	struct kernfs_node *events_kn;	/* kn for "cgroup.events" */
+	struct cgroup_file procs_file;	/* handle for "cgroup.procs" */
+	struct cgroup_file events_file;	/* handle for "cgroup.events" */
 
 	/*
 	 * The bitmask of subsystems enabled on the child cgroups.
@@ -336,6 +350,14 @@ struct cftype {
 	unsigned int flags;
 
 	/*
+	 * If non-zero, should contain the offset from the start of css to
+	 * a struct cgroup_file field.  cgroup will record the handle of
+	 * the created file into it.  The recorded handle can be used as
+	 * long as the containing css remains accessible.
+	 */
+	unsigned int file_offset;
+
+	/*
 	 * Fields used for internal bookkeeping.  Initialized automatically
 	 * during registration.
 	 */
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index eb7ca55..00ddf3c 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -527,6 +527,19 @@ static inline void pr_cont_cgroup_path(struct cgroup *cgrp)
 	pr_cont_kernfs_path(cgrp->kn);
 }
 
+/**
+ * cgroup_file_notify - generate a file modified event for a cgroup_file
+ * @cfile: target cgroup_file
+ *
+ * @cfile must have been obtained by setting cftype->file_offset.
+ */
+static inline void cgroup_file_notify(struct cgroup_file *cfile)
+{
+	/* might not have been created due to one of the CFTYPE selector flags */
+	if (cfile->kn)
+		kernfs_notify(cfile->kn);
+}
+
 #else /* !CONFIG_CGROUPS */
 
 struct cgroup_subsys_state;
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index b287522..4d0d522 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -520,8 +520,8 @@ static void cgroup_update_populated(struct cgroup *cgrp, bool populated)
 		if (!trigger)
 			break;
 
-		if (cgrp->events_kn)
-			kernfs_notify(cgrp->events_kn);
+		cgroup_file_notify(&cgrp->events_file);
+
 		cgrp = cgroup_parent(cgrp);
 	} while (cgrp);
 }
@@ -1671,6 +1671,7 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
 
 	INIT_LIST_HEAD(&cgrp->self.sibling);
 	INIT_LIST_HEAD(&cgrp->self.children);
+	INIT_LIST_HEAD(&cgrp->self.files);
 	INIT_LIST_HEAD(&cgrp->cset_links);
 	INIT_LIST_HEAD(&cgrp->pidlists);
 	mutex_init(&cgrp->pidlist_mutex);
@@ -2462,7 +2463,7 @@ static int cgroup_procs_write_permission(struct task_struct *task,
 			cgrp = cgroup_parent(cgrp);
 
 		ret = -ENOMEM;
-		inode = kernfs_get_inode(sb, cgrp->procs_kn);
+		inode = kernfs_get_inode(sb, cgrp->procs_file.kn);
 		if (inode) {
 			ret = inode_permission(inode, MAY_WRITE);
 			iput(inode);
@@ -3152,10 +3153,14 @@ static int cgroup_add_file(struct cgroup_subsys_state *css, struct cgroup *cgrp,
 		return ret;
 	}
 
-	if (cft->write == cgroup_procs_write)
-		cgrp->procs_kn = kn;
-	else if (cft->seq_show == cgroup_events_show)
-		cgrp->events_kn = kn;
+	if (cft->file_offset) {
+		struct cgroup_file *cfile = (void *)css + cft->file_offset;
+
+		kernfs_get(kn);
+		cfile->kn = kn;
+		list_add(&cfile->node, &css->files);
+	}
+
 	return 0;
 }
 
@@ -4307,6 +4312,7 @@ static int cgroup_clone_children_write(struct cgroup_subsys_state *css,
 static struct cftype cgroup_dfl_base_files[] = {
 	{
 		.name = "cgroup.procs",
+		.file_offset = offsetof(struct cgroup, procs_file),
 		.seq_start = cgroup_pidlist_start,
 		.seq_next = cgroup_pidlist_next,
 		.seq_stop = cgroup_pidlist_stop,
@@ -4332,6 +4338,7 @@ static struct cftype cgroup_dfl_base_files[] = {
 	{
 		.name = "cgroup.events",
 		.flags = CFTYPE_NOT_ON_ROOT,
+		.file_offset = offsetof(struct cgroup, events_file),
 		.seq_show = cgroup_events_show,
 	},
 	{ }	/* terminate */
@@ -4410,9 +4417,13 @@ static void css_free_work_fn(struct work_struct *work)
 		container_of(work, struct cgroup_subsys_state, destroy_work);
 	struct cgroup_subsys *ss = css->ss;
 	struct cgroup *cgrp = css->cgroup;
+	struct cgroup_file *cfile;
 
 	percpu_ref_exit(&css->refcnt);
 
+	list_for_each_entry(cfile, &css->files, node)
+		kernfs_put(cfile->kn);
+
 	if (ss) {
 		/* css free path */
 		int id = css->id;
@@ -4517,6 +4528,7 @@ static void init_and_link_css(struct cgroup_subsys_state *css,
 	css->ss = ss;
 	INIT_LIST_HEAD(&css->sibling);
 	INIT_LIST_HEAD(&css->children);
+	INIT_LIST_HEAD(&css->files);
 	css->serial_nr = css_serial_nr_next++;
 
 	if (cgroup_parent(cgrp)) {
-- 
2.4.3
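
The ->file_offset mechanism in this patch can be modelled outside the
kernel.  What follows is a minimal userspace sketch, not the kernel
code: every fake_* name, record_handle() and handle_notify() is
hypothetical, and the list linkage through css->files is left out.  It
assumes, as offsetof(struct cgroup, procs_file) in the patch implies,
that the css is embedded as the first member of the structure
containing the handle, so an offset taken from the container is also
valid from the css pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a kernfs node; only refcount and notifications are modelled. */
struct fake_kn {
	int refs;
	int notified;
};

/* Mirrors the shape of struct cgroup_file, minus the list linkage. */
struct file_handle {
	struct fake_kn *kn;	/* NULL until the file is actually created */
};

/* Generic per-controller state (hypothetical stand-in for a css). */
struct fake_css {
	int id;
};

/* Controller-private structure embedding the css and one handle. */
struct fake_memcg {
	struct fake_css css;	/* first member: same address as the memcg */
	long events;
	struct file_handle events_file;
};

/* File description: file_offset == 0 means "do not record a handle". */
struct fake_cftype {
	const char *name;
	size_t file_offset;
};

/* Store @kn at css + file_offset, following cgroup_add_file() in the patch. */
static void record_handle(struct fake_css *css, const struct fake_cftype *cft,
			  struct fake_kn *kn)
{
	struct file_handle *h;

	if (!cft->file_offset)
		return;

	/* a handle can never overlap the css itself */
	assert(cft->file_offset >= sizeof(*css));

	h = (struct file_handle *)((char *)css + cft->file_offset);
	kn->refs++;		/* the handle pins the node */
	h->kn = kn;
}

/* Notify only if the file exists, following cgroup_file_notify(). */
static void handle_notify(struct file_handle *h)
{
	if (h->kn)
		h->kn->notified++;
}
```

A controller would describe its file with something like
{ "events", offsetof(struct fake_memcg, events_file) } and later call
handle_notify(&memcg->events_file); the call is a no-op while the
handle is still empty, which matches the NULL check in
cgroup_file_notify().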


* [PATCH 8/8] memcg: generate file modified notifications on "memory.events"
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
                   ` (6 preceding siblings ...)
  2015-08-11 17:58 ` [PATCH 7/8] cgroup: generalize obtaining the handles of and notifying cgroup files Tejun Heo
@ 2015-08-11 17:58 ` Tejun Heo
  2015-08-11 18:02   ` Tejun Heo
                     ` (2 more replies)
  2015-08-17 21:29 ` [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable " Johannes Weiner
  2015-09-18 21:40 ` Tejun Heo
  9 siblings, 3 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 17:58 UTC (permalink / raw)
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel, Tejun Heo

cgroup core only recently grew generic notification support.  Wire up
"memory.events" so that it triggers a file modified event whenever its
content changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
---
 mm/memcontrol.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 78ba418..10db5f1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -295,6 +295,9 @@ struct mem_cgroup {
 	/* OOM-Killer disable */
 	int		oom_kill_disable;
 
+	/* handle for "memory.events" */
+	struct cgroup_file events_file;
+
 	/* protect arrays of thresholds */
 	struct mutex thresholds_lock;
 
@@ -5499,6 +5502,7 @@ static struct cftype memory_files[] = {
 	{
 		.name = "events",
 		.flags = CFTYPE_NOT_ON_ROOT,
+		.file_offset = offsetof(struct mem_cgroup, events_file),
 		.seq_show = memory_events_show,
 	},
 	{ }	/* terminate */
@@ -5530,6 +5534,7 @@ void mem_cgroup_events(struct mem_cgroup *memcg,
 		       unsigned int nr)
 {
 	this_cpu_add(memcg->stat->events[idx], nr);
+	cgroup_file_notify(&memcg->events_file);
 }
 
 /**
-- 
2.4.3
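
For context, a userspace consumer of such a file modified event might
look like the sketch below.  It is an illustration under assumptions,
not part of the patch: wait_for_modified() is a hypothetical helper,
and it assumes the usual kernfs behaviour where kernfs_notify() wakes
pollers with POLLPRI and where a reader has to read the file before
polling so that only later changes are reported.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Wait until @path reports a file modified event or @timeout_ms elapses.
 * Returns 1 on an event, 0 on timeout, -1 on error.
 */
static int wait_for_modified(const char *path, int timeout_ms)
{
	char buf[256];
	struct pollfd pfd;
	int fd, ret;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* Read the current content so only later changes wake us up. */
	while (read(fd, buf, sizeof(buf)) > 0)
		;

	pfd.fd = fd;
	pfd.events = POLLPRI;	/* assumed kernfs convention for "modified" */
	pfd.revents = 0;

	ret = poll(&pfd, 1, timeout_ms);
	close(fd);

	if (ret < 0)
		return -1;
	return (ret > 0 && (pfd.revents & POLLPRI)) ? 1 : 0;
}
```

A monitor could call this in a loop on the "memory.events" file of the
cgroup it watches (the exact path depends on where the hierarchy is
mounted) and re-read the file each time the function returns 1.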


* Re: [PATCH 8/8] memcg: generate file modified notifications on "memory.events"
  2015-08-11 17:58 ` [PATCH 8/8] memcg: generate file modified notifications on "memory.events" Tejun Heo
@ 2015-08-11 18:02   ` Tejun Heo
  2015-08-17 14:30     ` Michal Hocko
  2015-09-18 22:01   ` [PATCH v2 " Tejun Heo
  2015-09-21 19:16   ` [PATCH " Tejun Heo
  2 siblings, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2015-08-11 18:02 UTC (permalink / raw)
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel

On Tue, Aug 11, 2015 at 01:58:09PM -0400, Tejun Heo wrote:
> cgroup core only recently grew generic notification support.  Wire up
> "memory.events" so that it triggers a file modified event whenever its
> content changes.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>

So, this won't apply to the current -mm.  Once the earlier part of the
series gets applied to cgroup/for-4.3, I'll refresh this patch on top
of -mm.

Thanks.

-- 
tejun

* Re: [PATCH 8/8] memcg: generate file modified notifications on "memory.events"
  2015-08-11 18:02   ` Tejun Heo
@ 2015-08-17 14:30     ` Michal Hocko
  2015-08-17 19:51       ` Tejun Heo
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2015-08-17 14:30 UTC (permalink / raw)
  To: Tejun Heo; +Cc: hannes, lizefan, cgroups, linux-kernel

[Oops, this has been sitting in my to-be-posted queue since last week - sorry about that]

On Tue 11-08-15 14:02:36, Tejun Heo wrote:
> On Tue, Aug 11, 2015 at 01:58:09PM -0400, Tejun Heo wrote:
> > cgroup core only recently grew generic notification support.  Wire up
> > "memory.events" so that it triggers a file modified event whenever its
> > content changes.
> > 
> > Signed-off-by: Tejun Heo <tj@kernel.org>
> > Cc: Li Zefan <lizefan@huawei.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Michal Hocko <mhocko@kernel.org>

I cannot say I would be fond of the offset logic but whatever suits the
cgroup core...

Acked-by: Michal Hocko <mhocko@suse.com>

> So, this won't apply to the current -mm.  Once the earlier part of the
> series gets applied to cgroup/for-4.3, I'll refresh this patch on top
> of -mm.

I think you can route it via the same tree.

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 8/8] memcg: generate file modified notifications on "memory.events"
  2015-08-17 14:30     ` Michal Hocko
@ 2015-08-17 19:51       ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-17 19:51 UTC (permalink / raw)
  To: Michal Hocko; +Cc: hannes, lizefan, cgroups, linux-kernel

Hello, Michal.

On Mon, Aug 17, 2015 at 04:30:57PM +0200, Michal Hocko wrote:
> I cannot say I would be fond of the offset logic but whatever suits the
> cgroup core...

I don't particularly like it either but couldn't think of anything
prettier. :(

Thanks.

-- 
tejun

* Re: [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events"
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
                   ` (7 preceding siblings ...)
  2015-08-11 17:58 ` [PATCH 8/8] memcg: generate file modified notifications on "memory.events" Tejun Heo
@ 2015-08-17 21:29 ` Johannes Weiner
  2015-08-17 21:32   ` Tejun Heo
  2015-09-18 21:40 ` Tejun Heo
  9 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2015-08-17 21:29 UTC (permalink / raw)
  To: Tejun Heo; +Cc: lizefan, mhocko, cgroups, linux-kernel

On Tue, Aug 11, 2015 at 01:58:01PM -0400, Tejun Heo wrote:
> Hello,
> 
> This patchset establishes conventions on low frequency events,
> converts "cgroup.populated" to "cgroup.events" accordingly,
> generalizes event handling and enable notifications for
> "memory.events".
> 
> This patchset contains the following eight patches.
> 
>  0001-cgroup-replace-cgroup.populated-with-cgroup.events.patch
>  0002-cgroup-replace-cftype-mode-with-CFTYPE_WORLD_WRITABL.patch
>  0003-cgroup-relocate-cgroup_populate_dir.patch
>  0004-cgroup-make-cgroup_addrm_files-clean-up-after-itself.patch
>  0005-cgroup-cosmetic-updates-to-rebind_subsystems.patch
>  0006-cgroup-restructure-file-creation-removal-handling.patch
>  0007-cgroup-generalize-obtaining-the-handles-of-and-notif.patch
>  0008-memcg-generate-file-modified-notifications-on-memory.patch
> 
> 0001 replaces "cgroup.populated" with "cgroup.events".  0002-0006 are
> prep patches.  0007 generalizes event notification.  0008 hook up
> event notifications for "memory.events".

These look good to me.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Out of curiosity, do you envision additional entries for cgroup.events
in the near future?

* Re: [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events"
  2015-08-17 21:29 ` [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable " Johannes Weiner
@ 2015-08-17 21:32   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-08-17 21:32 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: lizefan, mhocko, cgroups, linux-kernel

Hello,

On Mon, Aug 17, 2015 at 11:29:20PM +0200, Johannes Weiner wrote:
> Out of curiosity, do you envision additional entries for cgroup.events
> in the near future?

I don't have anything specific I can think of right now.  I primarily
want to establish an interface convention for low-frequency event
delivery, and the one memory.events uses seemed simple and extensible.

Thanks.

-- 
tejun

* Re: [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events"
  2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
                   ` (8 preceding siblings ...)
  2015-08-17 21:29 ` [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable " Johannes Weiner
@ 2015-09-18 21:40 ` Tejun Heo
  9 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-09-18 21:40 UTC (permalink / raw)
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel

On Tue, Aug 11, 2015 at 01:58:01PM -0400, Tejun Heo wrote:
> Hello,
> 
> This patchset establishes conventions on low frequency events,
> converts "cgroup.populated" to "cgroup.events" accordingly,
> generalizes event handling and enable notifications for
> "memory.events".
> 
> This patchset contains the following eight patches.
> 
>  0001-cgroup-replace-cgroup.populated-with-cgroup.events.patch
>  0002-cgroup-replace-cftype-mode-with-CFTYPE_WORLD_WRITABL.patch
>  0003-cgroup-relocate-cgroup_populate_dir.patch
>  0004-cgroup-make-cgroup_addrm_files-clean-up-after-itself.patch
>  0005-cgroup-cosmetic-updates-to-rebind_subsystems.patch
>  0006-cgroup-restructure-file-creation-removal-handling.patch
>  0007-cgroup-generalize-obtaining-the-handles-of-and-notif.patch
>  0008-memcg-generate-file-modified-notifications-on-memory.patch

Applying 1-7 to cgroup/for-4.4.  Will post refreshed 0008 soon.

Thanks.

-- 
tejun

* [PATCH v2 8/8] memcg: generate file modified notifications on "memory.events"
  2015-08-11 17:58 ` [PATCH 8/8] memcg: generate file modified notifications on "memory.events" Tejun Heo
  2015-08-11 18:02   ` Tejun Heo
@ 2015-09-18 22:01   ` Tejun Heo
  2015-09-19 10:15     ` Johannes Weiner
  2015-09-19 16:21     ` Michal Hocko
  2015-09-21 19:16   ` [PATCH " Tejun Heo
  2 siblings, 2 replies; 19+ messages in thread
From: Tejun Heo @ 2015-09-18 22:01 UTC (permalink / raw)
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel

cgroup core only recently grew generic notification support.  Wire up
"memory.events" so that it triggers a file modified event whenever its
content changes.

v2: Refreshed on top of mem_cgroup relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
Hello,

Once cgroup/for-4.4 gets propagated to mm, this should apply cleanly.
Alternatively, I can apply it through the cgroup tree.  Which would
you prefer?

Thanks.

 include/linux/memcontrol.h |    4 ++++
 mm/memcontrol.c            |    1 +
 2 files changed, 5 insertions(+)

--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -213,6 +213,9 @@ struct mem_cgroup {
 	/* OOM-Killer disable */
 	int		oom_kill_disable;
 
+	/* handle for "memory.events" */
+	struct cgroup_file events_file;
+
 	/* protect arrays of thresholds */
 	struct mutex thresholds_lock;
 
@@ -286,6 +289,7 @@ static inline void mem_cgroup_events(str
 		       unsigned int nr)
 {
 	this_cpu_add(memcg->stat->events[idx], nr);
+	cgroup_file_notify(&memcg->events_file);
 }
 
 bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg);
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5202,6 +5202,7 @@ static struct cftype memory_files[] = {
 	{
 		.name = "events",
 		.flags = CFTYPE_NOT_ON_ROOT,
+		.file_offset = offsetof(struct mem_cgroup, events_file),
 		.seq_show = memory_events_show,
 	},
 	{ }	/* terminate */

* Re: [PATCH v2 8/8] memcg: generate file modified notifications on "memory.events"
  2015-09-18 22:01   ` [PATCH v2 " Tejun Heo
@ 2015-09-19 10:15     ` Johannes Weiner
  2015-09-19 16:21     ` Michal Hocko
  1 sibling, 0 replies; 19+ messages in thread
From: Johannes Weiner @ 2015-09-19 10:15 UTC (permalink / raw)
  To: Tejun Heo; +Cc: lizefan, mhocko, cgroups, linux-kernel

On Fri, Sep 18, 2015 at 06:01:59PM -0400, Tejun Heo wrote:
> cgroup core only recently grew generic notification support.  Wire up
> "memory.events" so that it triggers a file modified event whenever its
> content changes.
> 
> v2: Refreshed on top of mem_cgroup relocation.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Acked-by: Michal Hocko <mhocko@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

> Once cgroup/for-4.4 gets propagated to mm, this should apply cleanly.
> Alternatively, I can apply it through the cgroup tree.  Which would
> you prefer?

cgroup tree works for me.

* Re: [PATCH v2 8/8] memcg: generate file modified notifications on "memory.events"
  2015-09-18 22:01   ` [PATCH v2 " Tejun Heo
  2015-09-19 10:15     ` Johannes Weiner
@ 2015-09-19 16:21     ` Michal Hocko
  1 sibling, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2015-09-19 16:21 UTC (permalink / raw)
  To: Tejun Heo; +Cc: hannes, lizefan, cgroups, linux-kernel

On Fri 18-09-15 18:01:59, Tejun Heo wrote:
> Hello,
> 
> Once cgroup/for-4.4 gets propagated to mm, this should apply cleanly.
> Alternatively, I can apply it through the cgroup tree.  Which would
> you prefer?

I am OK with it going via your tree as well.
-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 8/8] memcg: generate file modified notifications on "memory.events"
  2015-08-11 17:58 ` [PATCH 8/8] memcg: generate file modified notifications on "memory.events" Tejun Heo
  2015-08-11 18:02   ` Tejun Heo
  2015-09-18 22:01   ` [PATCH v2 " Tejun Heo
@ 2015-09-21 19:16   ` Tejun Heo
  2 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2015-09-21 19:16 UTC (permalink / raw)
  To: hannes, lizefan; +Cc: mhocko, cgroups, linux-kernel

On Tue, Aug 11, 2015 at 01:58:09PM -0400, Tejun Heo wrote:
> cgroup core only recently grew generic notification support.  Wire up
> "memory.events" so that it triggers a file modified event whenever its
> content changes.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>

Applied to cgroup/for-4.4.

Thanks.

-- 
tejun

Thread overview: 19+ messages
2015-08-11 17:58 [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable notifications on "memory.events" Tejun Heo
2015-08-11 17:58 ` [PATCH 1/8] cgroup: replace "cgroup.populated" with "cgroup.events" Tejun Heo
2015-08-11 17:58 ` [PATCH 2/8] cgroup: replace cftype->mode with CFTYPE_WORLD_WRITABLE Tejun Heo
2015-08-11 17:58 ` [PATCH 3/8] cgroup: relocate cgroup_populate_dir() Tejun Heo
2015-08-11 17:58 ` [PATCH 4/8] cgroup: make cgroup_addrm_files() clean up after itself on failures Tejun Heo
2015-08-11 17:58 ` [PATCH 5/8] cgroup: cosmetic updates to rebind_subsystems() Tejun Heo
2015-08-11 17:58 ` [PATCH 6/8] cgroup: restructure file creation / removal handling Tejun Heo
2015-08-11 17:58 ` [PATCH 7/8] cgroup: generalize obtaining the handles of and notifying cgroup files Tejun Heo
2015-08-11 17:58 ` [PATCH 8/8] memcg: generate file modified notifications on "memory.events" Tejun Heo
2015-08-11 18:02   ` Tejun Heo
2015-08-17 14:30     ` Michal Hocko
2015-08-17 19:51       ` Tejun Heo
2015-09-18 22:01   ` [PATCH v2 " Tejun Heo
2015-09-19 10:15     ` Johannes Weiner
2015-09-19 16:21     ` Michal Hocko
2015-09-21 19:16   ` [PATCH " Tejun Heo
2015-08-17 21:29 ` [PATCHSET cgroup/for-4.3] cgroup,memcg: generalize event handling and enable " Johannes Weiner
2015-08-17 21:32   ` Tejun Heo
2015-09-18 21:40 ` Tejun Heo