linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy
@ 2016-02-24 22:12 Tejun Heo
  2016-02-24 22:12 ` [PATCH 1/2] cgroup: implement cgroup_subsys->implicit_on_dfl Tejun Heo
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Tejun Heo @ 2016-02-24 22:12 UTC (permalink / raw)
  To: lizefan, hannes, a.p.zijlstra, mingo, acme
  Cc: linux-kernel, cgroups, kernel-team

Hello,

Changes from v1[1] are

* cgroup prep patches are applied to cgroup/for-4.6.

* Implicit controller support is reimplemented on top of the modular
  control mask update[2] which makes it simpler and better behaved -
  perf_event can now be returned to v2 hierarchy anytime.

perf_event is special in that while it primarily is there to identify
cgroup membership of tasks, it also keeps some per-cgroup state, so
unlike other controllers which are purely for identification, it can't
be replaced by v2 hierarchy membership tests.  At the same time, as it
is primarily used for tagging, doesn't incur meaningful runtime
overhead and doesn't have any interface, it's awkward to control it
via the usual "cgroup.subtree_control" mechanism on the v2 hierarchy.

This patchset makes perf_event implicitly enabled on the v2 hierarchy
so that cgroup v2 path automatically works with "perf record --cgroup"
as long as perf_event controller is available and not mounted on a
legacy hierarchy.  "perf record" is updated so that it searches for
both v1 hierarchy w/ perf_event on it and v2 hierarchy.  The v1
hierarchy is used if exists; otherwise, v2 is used, making the
transition transparent to users.

This patchset contains the following two patches.

 0001-cgroup-implement-cgroup_subsys-implicit_on_dfl.patch
 0002-cgroup-perf_event-make-perf_event-controller-work-on.patch

This patchset is on top of cgroup/for-4.6 and available in the
following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cgroup2-perf_event

As the changes are heavily dependent on updates in cgroup core and
relatively well isolated on perf side, once acked, I think it'd be
best to route this through cgroup tree.

diffstat follows.

 Documentation/cgroup-v2.txt |   12 ++++++++++++
 include/linux/cgroup-defs.h |   13 +++++++++++++
 kernel/cgroup.c             |   38 +++++++++++++++++++++++++++++++-------
 kernel/events/core.c        |    6 ++++++
 tools/perf/util/cgroup.c    |   26 +++++++++++++++++++-------
 5 files changed, 81 insertions(+), 14 deletions(-)

Thanks.

--
tejun

[1] http://lkml.kernel.org/g/1452205790-21331-1-git-send-email-tj@kernel.org
[2] http://lkml.kernel.org/g/1456351368-786-1-git-send-email-tj@kernel.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] cgroup: implement cgroup_subsys->implicit_on_dfl
  2016-02-24 22:12 [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy Tejun Heo
@ 2016-02-24 22:12 ` Tejun Heo
  2016-03-08 15:59   ` Tejun Heo
  2016-02-24 22:12 ` [PATCH 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy Tejun Heo
  2016-03-03 15:06 ` [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy Tejun Heo
  2 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2016-02-24 22:12 UTC (permalink / raw)
  To: lizefan, hannes, a.p.zijlstra, mingo, acme
  Cc: linux-kernel, cgroups, kernel-team, Tejun Heo

Some controllers, perf_event for now and possibly freezer in the
future, don't really make sense to control explicitly through
"cgroup.subtree_control".  For example, the primary role of perf_event
is identifying the cgroups of tasks; however, because the controller
also keeps a small amount of state per cgroup, it can't be replaced
with simple cgroup membership tests.

This patch implements cgroup_subsys->implicit_on_dfl flag.  When set,
the controller is implicitly enabled on all cgroups on the v2
hierarchy so that utility type controllers such as perf_event can be
enabled and function transparently.

An implicit controller doesn't show up in "cgroup.controllers" or
"cgroup.subtree_control", is exempt from no internal process rule and
can be stolen from the default hierarchy even if there are non-root
csses.

v2: Reimplemented on top of the recent updates to css handling and
    subsystem rebinding.  Rebinding implicit subsystems is now a
    simple matter of exempting it from the busy subsystem check.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup-defs.h | 13 +++++++++++++
 kernel/cgroup.c             | 38 +++++++++++++++++++++++++++++++-------
 2 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index aae8c94..0af8dc4 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -450,6 +450,19 @@ struct cgroup_subsys {
 	bool early_init:1;
 
 	/*
+	 * If %true, the controller, on the default hierarchy, doesn't show
+	 * up in "cgroup.controllers" or "cgroup.subtree_control", is
+	 * implicitly enabled on all cgroups on the default hierarchy, and
+	 * bypasses the "no internal process" constraint.  This is for
+	 * utility type controllers which is transparent to userland.
+	 *
+	 * An implicit controller can be stolen from the default hierarchy
+	 * anytime and thus must be okay with offline csses from previous
+	 * hierarchies coexisting with csses for the current one.
+	 */
+	bool implicit_on_dfl:1;
+
+	/*
 	 * If %false, this subsystem is properly hierarchical -
 	 * configuration, resource accounting and restriction on a parent
 	 * cgroup cover those of its children.  If %true, hierarchy support
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a44c123..0f17c65 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -186,6 +186,9 @@ static u16 cgroup_no_v1_mask;
 /* some controllers are not supported in the default hierarchy */
 static u16 cgrp_dfl_inhibit_ss_mask;
 
+/* some controllers are implicitly enabled on the default hierarchy */
+static unsigned long cgrp_dfl_implicit_ss_mask;
+
 /* The list of hierarchy roots */
 
 static LIST_HEAD(cgroup_roots);
@@ -359,8 +362,8 @@ static u16 cgroup_control(struct cgroup *cgrp)
 		return parent->subtree_control;
 
 	if (cgroup_on_dfl(cgrp))
-		root_ss_mask &= ~cgrp_dfl_inhibit_ss_mask;
-
+		root_ss_mask &= ~(cgrp_dfl_inhibit_ss_mask |
+				  cgrp_dfl_implicit_ss_mask);
 	return root_ss_mask;
 }
 
@@ -1326,6 +1329,8 @@ static u16 cgroup_calc_subtree_ss_mask(u16 subtree_control, u16 this_ss_mask)
 
 	lockdep_assert_held(&cgroup_mutex);
 
+	cur_ss_mask |= cgrp_dfl_implicit_ss_mask;
+
 	while (true) {
 		u16 new_ss_mask = cur_ss_mask;
 
@@ -1511,8 +1516,13 @@ static int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask)
 	lockdep_assert_held(&cgroup_mutex);
 
 	do_each_subsys_mask(ss, ssid, ss_mask) {
-		/* if @ss has non-root csses attached to it, can't move */
-		if (css_next_child(NULL, cgroup_css(&ss->root->cgrp, ss)))
+		/*
+		 * If @ss has non-root csses attached to it, can't move.
+		 * If @ss is an implicit controller, it is exempt from this
+		 * rule and can be stolen.
+		 */
+		if (css_next_child(NULL, cgroup_css(&ss->root->cgrp, ss)) &&
+		    !ss->implicit_on_dfl)
 			return -EBUSY;
 
 		/* can't move between two non-dummy roots either */
@@ -3032,6 +3042,18 @@ static void cgroup_restore_control(struct cgroup *cgrp)
 	}
 }
 
+static bool css_visible(struct cgroup_subsys_state *css)
+{
+	struct cgroup_subsys *ss = css->ss;
+	struct cgroup *cgrp = css->cgroup;
+
+	if (cgroup_control(cgrp) & (1 << ss->id))
+		return true;
+	if (!(cgroup_ss_mask(cgrp) & (1 << ss->id)))
+		return false;
+	return cgroup_on_dfl(cgrp) && ss->implicit_on_dfl;
+}
+
 /**
  * cgroup_apply_control_enable - enable or show csses according to control
  * @cgrp: root of the target subtree
@@ -3067,7 +3089,7 @@ static int cgroup_apply_control_enable(struct cgroup *cgrp)
 					return PTR_ERR(css);
 			}
 
-			if (cgroup_control(dsct) & (1 << ss->id)) {
+			if (css_visible(css)) {
 				ret = css_populate_dir(css);
 				if (ret)
 					return ret;
@@ -3110,7 +3132,7 @@ static void cgroup_apply_control_disable(struct cgroup *cgrp)
 			if (css->parent &&
 			    !(cgroup_ss_mask(dsct) & (1 << ss->id))) {
 				kill_css(css);
-			} else if (!(cgroup_control(dsct) & (1 << ss->id))) {
+			} else if (!css_visible(css)) {
 				css_clear_dir(css);
 				if (ss->css_reset)
 					ss->css_reset(css);
@@ -5440,7 +5462,9 @@ int __init cgroup_init(void)
 
 		cgrp_dfl_root.subsys_mask |= 1 << ss->id;
 
-		if (!ss->dfl_cftypes)
+		if (ss->implicit_on_dfl)
+			cgrp_dfl_implicit_ss_mask |= 1 << ss->id;
+		else if (!ss->dfl_cftypes)
 			cgrp_dfl_inhibit_ss_mask |= 1 << ss->id;
 
 		if (ss->dfl_cftypes == ss->legacy_cftypes) {
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
  2016-02-24 22:12 [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy Tejun Heo
  2016-02-24 22:12 ` [PATCH 1/2] cgroup: implement cgroup_subsys->implicit_on_dfl Tejun Heo
@ 2016-02-24 22:12 ` Tejun Heo
  2016-03-08 10:11   ` Peter Zijlstra
  2017-01-29 19:35   ` [PATCH REPOST " Tejun Heo
  2016-03-03 15:06 ` [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy Tejun Heo
  2 siblings, 2 replies; 11+ messages in thread
From: Tejun Heo @ 2016-02-24 22:12 UTC (permalink / raw)
  To: lizefan, hannes, a.p.zijlstra, mingo, acme
  Cc: linux-kernel, cgroups, kernel-team, Tejun Heo

perf_event is a utility controller whose primary role is identifying
cgroup membership to filter perf events; however, because it also
tracks some per-css state, it can't be replaced by pure cgroup
membership test.  Mark the controller as implicitly enabled on the
default hierarchy so that perf events can always be filtered based on
cgroup v2 path as long as the controller is not mounted on a legacy
hierarchy.

"perf record" is updated accordingly so that it searches for both v1
and v2 hierarchies.  A v1 hierarchy is used if perf_event is mounted
on it; otherwise, it uses the v2 hierarchy.

v2: Doc updated to reflect more flexible rebinding behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
---
 Documentation/cgroup-v2.txt | 12 ++++++++++++
 kernel/events/core.c        |  6 ++++++
 tools/perf/util/cgroup.c    | 26 +++++++++++++++++++-------
 3 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 3922ae1..c49bb10 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -47,6 +47,8 @@ CONTENTS
   5-3. IO
     5-3-1. IO Interface Files
     5-3-2. Writeback
+  5-4. Misc
+    5-4-1. perf_event
 P. Information on Kernel Programming
   P-1. Filesystem Support for Writeback
 D. Deprecated v1 Core Features
@@ -1091,6 +1093,16 @@ writeback as follows.
 	vm.dirty[_background]_ratio.
 
 
+5-4. Misc
+
+5-4-1. perf_event
+
+perf_event controller, if not mounted on a legacy hierarchy, is
+automatically enabled on the v2 hierarchy so that perf events can
+always be filtered by cgroup v2 path.  The controller can still be
+moved to a legacy hierarchy after v2 hierarchy is populated.
+
+
 P. Information on Kernel Programming
 
 This section contains kernel programming information in the areas
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c095741..e77fa4f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9416,5 +9416,11 @@ struct cgroup_subsys perf_event_cgrp_subsys = {
 	.css_alloc	= perf_cgroup_css_alloc,
 	.css_free	= perf_cgroup_css_free,
 	.attach		= perf_cgroup_attach,
+	/*
+	 * Implicitly enable on dfl hierarchy so that perf events can
+	 * always be filtered by cgroup2 path as long as perf_event
+	 * controller is not mounted on a legacy hierarchy.
+	 */
+	.implicit_on_dfl = true,
 };
 #endif /* CONFIG_CGROUP_PERF */
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index 90aa1b4..5df6235 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -12,8 +12,8 @@ cgroupfs_find_mountpoint(char *buf, size_t maxlen)
 {
 	FILE *fp;
 	char mountpoint[PATH_MAX + 1], tokens[PATH_MAX + 1], type[PATH_MAX + 1];
+	char path_v1[PATH_MAX + 1], path_v2[PATH_MAX + 2], *path;
 	char *token, *saved_ptr = NULL;
-	int found = 0;
 
 	fp = fopen("/proc/mounts", "r");
 	if (!fp)
@@ -24,31 +24,43 @@ cgroupfs_find_mountpoint(char *buf, size_t maxlen)
 	 * and inspect every cgroupfs mount point to find one that has
 	 * perf_event subsystem
 	 */
+	path_v1[0] = '\0';
+	path_v2[0] = '\0';
+
 	while (fscanf(fp, "%*s %"STR(PATH_MAX)"s %"STR(PATH_MAX)"s %"
 				STR(PATH_MAX)"s %*d %*d\n",
 				mountpoint, type, tokens) == 3) {
 
-		if (!strcmp(type, "cgroup")) {
+		if (!path_v1[0] && !strcmp(type, "cgroup")) {
 
 			token = strtok_r(tokens, ",", &saved_ptr);
 
 			while (token != NULL) {
 				if (!strcmp(token, "perf_event")) {
-					found = 1;
+					strcpy(path_v1, mountpoint);
 					break;
 				}
 				token = strtok_r(NULL, ",", &saved_ptr);
 			}
 		}
-		if (found)
+
+		if (!path_v2[0] && !strcmp(type, "cgroup2"))
+			strcpy(path_v2, mountpoint);
+
+		if (path_v1[0] && path_v2[0])
 			break;
 	}
 	fclose(fp);
-	if (!found)
+
+	if (path_v1[0])
+		path = path_v1;
+	else if (path_v2[0])
+		path = path_v2;
+	else
 		return -1;
 
-	if (strlen(mountpoint) < maxlen) {
-		strcpy(buf, mountpoint);
+	if (strlen(path) < maxlen) {
+		strcpy(buf, path);
 		return 0;
 	}
 	return -1;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy
  2016-02-24 22:12 [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy Tejun Heo
  2016-02-24 22:12 ` [PATCH 1/2] cgroup: implement cgroup_subsys->implicit_on_dfl Tejun Heo
  2016-02-24 22:12 ` [PATCH 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy Tejun Heo
@ 2016-03-03 15:06 ` Tejun Heo
  2016-03-08  1:44   ` Tejun Heo
  2 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2016-03-03 15:06 UTC (permalink / raw)
  To: lizefan, hannes, a.p.zijlstra, mingo, acme
  Cc: linux-kernel, cgroups, kernel-team

Hello,

On Wed, Feb 24, 2016 at 05:12:53PM -0500, Tejun Heo wrote:
> This patchset makes perf_event implicitly enabled on the v2 hierarchy
> so that cgroup v2 path automatically works with "perf record --cgroup"
> as long as perf_event controller is available and not mounted on a
> legacy hierarchy.  "perf record" is updated so that it searches for
> both v1 hierarchy w/ perf_event on it and v2 hierarchy.  The v1
> hierarchy is used if exists; otherwise, v2 is used, making the
> transition transparent to users.

All the prerequisite patches are now available in cgroup/for-4.6
branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-4.6

If the perf_event changes look okay, we can either apply the first
patch to cgroup/for-4.6, pull that into perf tree and aply the perf
change there or route both through cgroup tree.  Ideas?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy
  2016-03-03 15:06 ` [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy Tejun Heo
@ 2016-03-08  1:44   ` Tejun Heo
  0 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2016-03-08  1:44 UTC (permalink / raw)
  To: lizefan, hannes, a.p.zijlstra, mingo, acme
  Cc: linux-kernel, cgroups, kernel-team

On Thu, Mar 03, 2016 at 10:06:54AM -0500, Tejun Heo wrote:
> All the prerequisite patches are now available in cgroup/for-4.6
> branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-4.6
> 
> If the perf_event changes look okay, we can either apply the first
> patch to cgroup/for-4.6, pull that into perf tree and aply the perf
> change there or route both through cgroup tree.  Ideas?

Ping.

-- 
tejun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
  2016-02-24 22:12 ` [PATCH 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy Tejun Heo
@ 2016-03-08 10:11   ` Peter Zijlstra
  2017-01-29 19:35   ` [PATCH REPOST " Tejun Heo
  1 sibling, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2016-03-08 10:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, hannes, mingo, acme, linux-kernel, cgroups, kernel-team

On Wed, Feb 24, 2016 at 05:12:55PM -0500, Tejun Heo wrote:
> perf_event is a utility controller whose primary role is identifying
> cgroup membership to filter perf events; however, because it also
> tracks some per-css state, it can't be replaced by pure cgroup
> membership test.  Mark the controller as implicitly enabled on the
> default hierarchy so that perf events can always be filtered based on
> cgroup v2 path as long as the controller is not mounted on a legacy
> hierarchy.
> 
> "perf record" is updated accordingly so that it searches for both v1
> and v2 hierarchies.  A v1 hierarchy is used if perf_event is mounted
> on it; otherwise, it uses the v2 hierarchy.
> 
> v2: Doc updated to reflect more flexible rebinding behavior.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>

Arnaldo, do the userspace bits look ok?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] cgroup: implement cgroup_subsys->implicit_on_dfl
  2016-02-24 22:12 ` [PATCH 1/2] cgroup: implement cgroup_subsys->implicit_on_dfl Tejun Heo
@ 2016-03-08 15:59   ` Tejun Heo
  0 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2016-03-08 15:59 UTC (permalink / raw)
  To: lizefan, hannes, a.p.zijlstra, mingo, acme
  Cc: linux-kernel, cgroups, kernel-team

On Wed, Feb 24, 2016 at 05:12:54PM -0500, Tejun Heo wrote:
> Some controllers, perf_event for now and possibly freezer in the
> future, don't really make sense to control explicitly through
> "cgroup.subtree_control".  For example, the primary role of perf_event
> is identifying the cgroups of tasks; however, because the controller
> also keeps a small amount of state per cgroup, it can't be replaced
> with simple cgroup membership tests.
> 
> This patch implements cgroup_subsys->implicit_on_dfl flag.  When set,
> the controller is implicitly enabled on all cgroups on the v2
> hierarchy so that utility type controllers such as perf_event can be
> enabled and function transparently.
> 
> An implicit controller doesn't show up in "cgroup.controllers" or
> "cgroup.subtree_control", is exempt from no internal process rule and
> can be stolen from the default hierarchy even if there are non-root
> csses.
> 
> v2: Reimplemented on top of the recent updates to css handling and
>     subsystem rebinding.  Rebinding implicit subsystems is now a
>     simple matter of exempting it from the busy subsystem check.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>

Applying this one to cgroup/for-4.6.  Waiting for the review of
userland part on the second patch.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH REPOST 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
  2016-02-24 22:12 ` [PATCH 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy Tejun Heo
  2016-03-08 10:11   ` Peter Zijlstra
@ 2017-01-29 19:35   ` Tejun Heo
  2017-01-30 14:05     ` Arnaldo Carvalho de Melo
  2017-02-02 18:47     ` Tejun Heo
  1 sibling, 2 replies; 11+ messages in thread
From: Tejun Heo @ 2017-01-29 19:35 UTC (permalink / raw)
  To: lizefan, hannes, a.p.zijlstra, mingo, acme
  Cc: linux-kernel, cgroups, kernel-team

perf_event is a utility controller whose primary role is identifying
cgroup membership to filter perf events; however, because it also
tracks some per-css state, it can't be replaced by pure cgroup
membership test.  Mark the controller as implicitly enabled on the
default hierarchy so that perf events can always be filtered based on
cgroup v2 path as long as the controller is not mounted on a legacy
hierarchy.

"perf record" is updated accordingly so that it searches for both v1
and v2 hierarchies.  A v1 hierarchy is used if perf_event is mounted
on it; otherwise, it uses the v2 hierarchy.

v2: Doc updated to reflect more flexible rebinding behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
---
Hello,

This was posted months ago and acked by Peter.  I thought it was
applied but apparently wasn't.  Peter asked Arnaldo whether the
userspace part looked which didn't get replied and that probably was
how it slipped through the crack.

Peter, Arnanldo, how should we proceed?  Should I route this through
the cgroup tree?

Thanks.

 Documentation/cgroup-v2.txt |   12 ++++++++++++
 kernel/events/core.c        |    6 ++++++
 tools/perf/util/cgroup.c    |   26 +++++++++++++++++++-------
 3 files changed, 37 insertions(+), 7 deletions(-)

--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -49,6 +49,8 @@ CONTENTS
     5-3-2. Writeback
   5-4. PID
     5-4-1. PID Interface Files
+  5-5. Misc
+    5-5-1. perf_event
 6. Namespace
   6-1. Basics
   6-2. The Root and Views
@@ -1160,6 +1162,16 @@ through fork() or clone(). These will re
 of a new process would cause a cgroup policy to be violated.
 
 
+5-5. Misc
+
+5-5-1. perf_event
+
+perf_event controller, if not mounted on a legacy hierarchy, is
+automatically enabled on the v2 hierarchy so that perf events can
+always be filtered by cgroup v2 path.  The controller can still be
+moved to a legacy hierarchy after v2 hierarchy is populated.
+
+
 6. Namespace
 
 6-1. Basics
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10792,5 +10792,11 @@ struct cgroup_subsys perf_event_cgrp_sub
 	.css_alloc	= perf_cgroup_css_alloc,
 	.css_free	= perf_cgroup_css_free,
 	.attach		= perf_cgroup_attach,
+	/*
+	 * Implicitly enable on dfl hierarchy so that perf events can
+	 * always be filtered by cgroup2 path as long as perf_event
+	 * controller is not mounted on a legacy hierarchy.
+	 */
+	.implicit_on_dfl = true,
 };
 #endif /* CONFIG_CGROUP_PERF */
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -12,8 +12,8 @@ cgroupfs_find_mountpoint(char *buf, size
 {
 	FILE *fp;
 	char mountpoint[PATH_MAX + 1], tokens[PATH_MAX + 1], type[PATH_MAX + 1];
+	char path_v1[PATH_MAX + 1], path_v2[PATH_MAX + 2], *path;
 	char *token, *saved_ptr = NULL;
-	int found = 0;
 
 	fp = fopen("/proc/mounts", "r");
 	if (!fp)
@@ -24,31 +24,43 @@ cgroupfs_find_mountpoint(char *buf, size
 	 * and inspect every cgroupfs mount point to find one that has
 	 * perf_event subsystem
 	 */
+	path_v1[0] = '\0';
+	path_v2[0] = '\0';
+
 	while (fscanf(fp, "%*s %"STR(PATH_MAX)"s %"STR(PATH_MAX)"s %"
 				STR(PATH_MAX)"s %*d %*d\n",
 				mountpoint, type, tokens) == 3) {
 
-		if (!strcmp(type, "cgroup")) {
+		if (!path_v1[0] && !strcmp(type, "cgroup")) {
 
 			token = strtok_r(tokens, ",", &saved_ptr);
 
 			while (token != NULL) {
 				if (!strcmp(token, "perf_event")) {
-					found = 1;
+					strcpy(path_v1, mountpoint);
 					break;
 				}
 				token = strtok_r(NULL, ",", &saved_ptr);
 			}
 		}
-		if (found)
+
+		if (!path_v2[0] && !strcmp(type, "cgroup2"))
+			strcpy(path_v2, mountpoint);
+
+		if (path_v1[0] && path_v2[0])
 			break;
 	}
 	fclose(fp);
-	if (!found)
+
+	if (path_v1[0])
+		path = path_v1;
+	else if (path_v2[0])
+		path = path_v2;
+	else
 		return -1;
 
-	if (strlen(mountpoint) < maxlen) {
-		strcpy(buf, mountpoint);
+	if (strlen(path) < maxlen) {
+		strcpy(buf, path);
 		return 0;
 	}
 	return -1;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH REPOST 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
  2017-01-29 19:35   ` [PATCH REPOST " Tejun Heo
@ 2017-01-30 14:05     ` Arnaldo Carvalho de Melo
  2017-01-30 15:07       ` Tejun Heo
  2017-02-02 18:47     ` Tejun Heo
  1 sibling, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-01-30 14:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan, hannes, a.p.zijlstra, mingo, linux-kernel, cgroups, kernel-team

Em Sun, Jan 29, 2017 at 02:35:20PM -0500, Tejun Heo escreveu:
> perf_event is a utility controller whose primary role is identifying
> cgroup membership to filter perf events; however, because it also
> tracks some per-css state, it can't be replaced by pure cgroup
> membership test.  Mark the controller as implicitly enabled on the
> default hierarchy so that perf events can always be filtered based on
> cgroup v2 path as long as the controller is not mounted on a legacy
> hierarchy.
> 
> "perf record" is updated accordingly so that it searches for both v1
> and v2 hierarchies.  A v1 hierarchy is used if perf_event is mounted
> on it; otherwise, it uses the v2 hierarchy.
> 
> v2: Doc updated to reflect more flexible rebinding behavior.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> ---
> Hello,
> 
> This was posted months ago and acked by Peter.  I thought it was
> applied but apparently wasn't.  Peter asked Arnaldo whether the
> userspace part looked which didn't get replied and that probably was
> how it slipped through the crack.
> 
> Peter, Arnanldo, how should we proceed?  Should I route this through
> the cgroup tree?

It looks ok, haven't tested it tho. I don't have a problem for it to go
thru the cgroup tree. The change is small and restrictred to cgroup
glue in tools/perf/.

- Arnaldo
 
> Thanks.
> 
>  Documentation/cgroup-v2.txt |   12 ++++++++++++
>  kernel/events/core.c        |    6 ++++++
>  tools/perf/util/cgroup.c    |   26 +++++++++++++++++++-------
>  3 files changed, 37 insertions(+), 7 deletions(-)
> 
> --- a/Documentation/cgroup-v2.txt
> +++ b/Documentation/cgroup-v2.txt
> @@ -49,6 +49,8 @@ CONTENTS
>      5-3-2. Writeback
>    5-4. PID
>      5-4-1. PID Interface Files
> +  5-5. Misc
> +    5-5-1. perf_event
>  6. Namespace
>    6-1. Basics
>    6-2. The Root and Views
> @@ -1160,6 +1162,16 @@ through fork() or clone(). These will re
>  of a new process would cause a cgroup policy to be violated.
>  
>  
> +5-5. Misc
> +
> +5-5-1. perf_event
> +
> +perf_event controller, if not mounted on a legacy hierarchy, is
> +automatically enabled on the v2 hierarchy so that perf events can
> +always be filtered by cgroup v2 path.  The controller can still be
> +moved to a legacy hierarchy after v2 hierarchy is populated.
> +
> +
>  6. Namespace
>  
>  6-1. Basics
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10792,5 +10792,11 @@ struct cgroup_subsys perf_event_cgrp_sub
>  	.css_alloc	= perf_cgroup_css_alloc,
>  	.css_free	= perf_cgroup_css_free,
>  	.attach		= perf_cgroup_attach,
> +	/*
> +	 * Implicitly enable on dfl hierarchy so that perf events can
> +	 * always be filtered by cgroup2 path as long as perf_event
> +	 * controller is not mounted on a legacy hierarchy.
> +	 */
> +	.implicit_on_dfl = true,
>  };
>  #endif /* CONFIG_CGROUP_PERF */
> --- a/tools/perf/util/cgroup.c
> +++ b/tools/perf/util/cgroup.c
> @@ -12,8 +12,8 @@ cgroupfs_find_mountpoint(char *buf, size
>  {
>  	FILE *fp;
>  	char mountpoint[PATH_MAX + 1], tokens[PATH_MAX + 1], type[PATH_MAX + 1];
> +	char path_v1[PATH_MAX + 1], path_v2[PATH_MAX + 2], *path;
>  	char *token, *saved_ptr = NULL;
> -	int found = 0;
>  
>  	fp = fopen("/proc/mounts", "r");
>  	if (!fp)
> @@ -24,31 +24,43 @@ cgroupfs_find_mountpoint(char *buf, size
>  	 * and inspect every cgroupfs mount point to find one that has
>  	 * perf_event subsystem
>  	 */
> +	path_v1[0] = '\0';
> +	path_v2[0] = '\0';
> +
>  	while (fscanf(fp, "%*s %"STR(PATH_MAX)"s %"STR(PATH_MAX)"s %"
>  				STR(PATH_MAX)"s %*d %*d\n",
>  				mountpoint, type, tokens) == 3) {
>  
> -		if (!strcmp(type, "cgroup")) {
> +		if (!path_v1[0] && !strcmp(type, "cgroup")) {
>  
>  			token = strtok_r(tokens, ",", &saved_ptr);
>  
>  			while (token != NULL) {
>  				if (!strcmp(token, "perf_event")) {
> -					found = 1;
> +					strcpy(path_v1, mountpoint);
>  					break;
>  				}
>  				token = strtok_r(NULL, ",", &saved_ptr);
>  			}
>  		}
> -		if (found)
> +
> +		if (!path_v2[0] && !strcmp(type, "cgroup2"))
> +			strcpy(path_v2, mountpoint);
> +
> +		if (path_v1[0] && path_v2[0])
>  			break;
>  	}
>  	fclose(fp);
> -	if (!found)
> +
> +	if (path_v1[0])
> +		path = path_v1;
> +	else if (path_v2[0])
> +		path = path_v2;
> +	else
>  		return -1;
>  
> -	if (strlen(mountpoint) < maxlen) {
> -		strcpy(buf, mountpoint);
> +	if (strlen(path) < maxlen) {
> +		strcpy(buf, path);
>  		return 0;
>  	}
>  	return -1;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH REPOST 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
  2017-01-30 14:05     ` Arnaldo Carvalho de Melo
@ 2017-01-30 15:07       ` Tejun Heo
  0 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2017-01-30 15:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lizefan, hannes, a.p.zijlstra, mingo, linux-kernel, cgroups, kernel-team

Hello,

On Mon, Jan 30, 2017 at 11:05:41AM -0300, Arnaldo Carvalho de Melo wrote:
> > This was posted months ago and acked by Peter.  I thought it was
> > applied but apparently wasn't.  Peter asked Arnaldo whether the
> > userspace part looked which didn't get replied and that probably was
> > how it slipped through the crack.
> > 
> > Peter, Arnanldo, how should we proceed?  Should I route this through
> > the cgroup tree?
> 
> It looks ok, haven't tested it tho. I don't have a problem for it to go
> thru the cgroup tree. The change is small and restrictred to cgroup
> glue in tools/perf/.

Alright, thanks.  Will apply it to cgroup/for-4.11 in a couple days.
If there's any objection, please holler.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH REPOST 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
  2017-01-29 19:35   ` [PATCH REPOST " Tejun Heo
  2017-01-30 14:05     ` Arnaldo Carvalho de Melo
@ 2017-02-02 18:47     ` Tejun Heo
  1 sibling, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2017-02-02 18:47 UTC (permalink / raw)
  To: lizefan, hannes, a.p.zijlstra, mingo, acme
  Cc: linux-kernel, cgroups, kernel-team

On Sun, Jan 29, 2017 at 02:35:20PM -0500, Tejun Heo wrote:
> perf_event is a utility controller whose primary role is identifying
> cgroup membership to filter perf events; however, because it also
> tracks some per-css state, it can't be replaced by pure cgroup
> membership test.  Mark the controller as implicitly enabled on the
> default hierarchy so that perf events can always be filtered based on
> cgroup v2 path as long as the controller is not mounted on a legacy
> hierarchy.
> 
> "perf record" is updated accordingly so that it searches for both v1
> and v2 hierarchies.  A v1 hierarchy is used if perf_event is mounted
> on it; otherwise, it uses the v2 hierarchy.
> 
> v2: Doc updated to reflect more flexible rebinding behavior.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>

Applied to cgroup/for-4.11.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-02-02 18:47 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-24 22:12 [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy Tejun Heo
2016-02-24 22:12 ` [PATCH 1/2] cgroup: implement cgroup_subsys->implicit_on_dfl Tejun Heo
2016-03-08 15:59   ` Tejun Heo
2016-02-24 22:12 ` [PATCH 2/2] cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy Tejun Heo
2016-03-08 10:11   ` Peter Zijlstra
2017-01-29 19:35   ` [PATCH REPOST " Tejun Heo
2017-01-30 14:05     ` Arnaldo Carvalho de Melo
2017-01-30 15:07       ` Tejun Heo
2017-02-02 18:47     ` Tejun Heo
2016-03-03 15:06 ` [PATCHSET v2 cgroup/for-4.6] cgroup, perf_event: make perf_event work on v2 hierarchy Tejun Heo
2016-03-08  1:44   ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).