[RFC PATCH] cgroup/pids: Make pids.events notifications affine to pids.max

All of lore.kernel.org
 help / color / mirror / Atom feed

* [RFC PATCH] cgroup/pids: Make pids.events notifications affine to pids.max
@ 2019-11-28 17:26 ` Michal Koutný
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2019-11-28 17:26 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, linux-kernel

Currently, when pids.max limit is breached in the hierarchy, the event
is counted and reported in the cgroup where the forking task resides.

The proper hierarchical behavior is to count and report the event in the
cgroup whose limit is being exceeded. Apply this behavior in the default
hierarchy.

Reasons for RFC:

1) If anyone has adjusted their readings to this behavior, this is a BC
   break.

2) This solves no reported bug, just a spotted inconsistency.

3) One step further would be to distinguish pids.events and
   pids.events.local for proper hierarchical counting. (The current
   behavior wouldn't match neither though.)

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/pids.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 138059eb730d..5fc34d8b8f60 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -140,7 +140,7 @@ static void pids_charge(struct pids_cgroup *pids, int num)
  * the new value to exceed the hierarchical limit. Returns 0 if the charge
  * succeeded, otherwise -EAGAIN.
  */
-static int pids_try_charge(struct pids_cgroup *pids, int num)
+static int pids_try_charge(struct pids_cgroup *pids, int num, struct pids_cgroup **fail)
 {
 	struct pids_cgroup *p, *q;
 
@@ -153,8 +153,10 @@ static int pids_try_charge(struct pids_cgroup *pids, int num)
 		 * p->limit is %PIDS_MAX then we know that this test will never
 		 * fail.
 		 */
-		if (new > limit)
+		if (new > limit) {
+			*fail = p;
 			goto revert;
+		}
 	}
 
 	return 0;
@@ -217,20 +219,25 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
 static int pids_can_fork(struct task_struct *task)
 {
 	struct cgroup_subsys_state *css;
-	struct pids_cgroup *pids;
+	struct pids_cgroup *pids, *pids_over_limit;
 	int err;
 
 	css = task_css_check(current, pids_cgrp_id, true);
 	pids = css_pids(css);
-	err = pids_try_charge(pids, 1);
+	err = pids_try_charge(pids, 1, &pids_over_limit);
 	if (err) {
+		/* Backwards compatibility on v1 where events were notified in
+		 * leaves. */
+		if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
+			pids_over_limit = pids;
+
 		/* Only log the first time events_limit is incremented. */
-		if (atomic64_inc_return(&pids->events_limit) == 1) {
+		if (atomic64_inc_return(&pids_over_limit->events_limit) == 1) {
 			pr_info("cgroup: fork rejected by pids controller in ");
-			pr_cont_cgroup_path(css->cgroup);
+			pr_cont_cgroup_path(pids_over_limit->css.cgroup);
 			pr_cont("\n");
 		}
-		cgroup_file_notify(&pids->events_file);
+		cgroup_file_notify(&pids_over_limit->events_file);
 	}
 	return err;
 }
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH] cgroup/pids: Make pids.events notifications affine to pids.max
@ 2019-11-28 17:26 ` Michal Koutný
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2019-11-28 17:26 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, linux-kernel

Currently, when pids.max limit is breached in the hierarchy, the event
is counted and reported in the cgroup where the forking task resides.

The proper hierarchical behavior is to count and report the event in the
cgroup whose limit is being exceeded. Apply this behavior in the default
hierarchy.

Reasons for RFC:

1) If anyone has adjusted their readings to this behavior, this is a BC
   break.

2) This solves no reported bug, just a spotted inconsistency.

3) One step further would be to distinguish pids.events and
   pids.events.local for proper hierarchical counting. (The current
   behavior wouldn't match neither though.)

Signed-off-by: Michal Koutn√Ω <mkoutny@suse.com>
---
 kernel/cgroup/pids.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 138059eb730d..5fc34d8b8f60 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -140,7 +140,7 @@ static void pids_charge(struct pids_cgroup *pids, int num)
  * the new value to exceed the hierarchical limit. Returns 0 if the charge
  * succeeded, otherwise -EAGAIN.
  */
-static int pids_try_charge(struct pids_cgroup *pids, int num)
+static int pids_try_charge(struct pids_cgroup *pids, int num, struct pids_cgroup **fail)
 {
 	struct pids_cgroup *p, *q;
 
@@ -153,8 +153,10 @@ static int pids_try_charge(struct pids_cgroup *pids, int num)
 		 * p->limit is %PIDS_MAX then we know that this test will never
 		 * fail.
 		 */
-		if (new > limit)
+		if (new > limit) {
+			*fail = p;
 			goto revert;
+		}
 	}
 
 	return 0;
@@ -217,20 +219,25 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
 static int pids_can_fork(struct task_struct *task)
 {
 	struct cgroup_subsys_state *css;
-	struct pids_cgroup *pids;
+	struct pids_cgroup *pids, *pids_over_limit;
 	int err;
 
 	css = task_css_check(current, pids_cgrp_id, true);
 	pids = css_pids(css);
-	err = pids_try_charge(pids, 1);
+	err = pids_try_charge(pids, 1, &pids_over_limit);
 	if (err) {
+		/* Backwards compatibility on v1 where events were notified in
+		 * leaves. */
+		if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
+			pids_over_limit = pids;
+
 		/* Only log the first time events_limit is incremented. */
-		if (atomic64_inc_return(&pids->events_limit) == 1) {
+		if (atomic64_inc_return(&pids_over_limit->events_limit) == 1) {
 			pr_info("cgroup: fork rejected by pids controller in ");
-			pr_cont_cgroup_path(css->cgroup);
+			pr_cont_cgroup_path(pids_over_limit->css.cgroup);
 			pr_cont("\n");
 		}
-		cgroup_file_notify(&pids->events_file);
+		cgroup_file_notify(&pids_over_limit->events_file);
 	}
 	return err;
 }
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH] cgroup/pids: Make pids.events notifications affine to pids.max
  2019-11-28 17:26 ` Michal Koutný
  (?)
@ 2019-12-02 19:11 ` Tejun Heo
  -1 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2019-12-02 19:11 UTC (permalink / raw)
  To: Michal Koutný; +Cc: cgroups, Li Zefan, Johannes Weiner, linux-kernel

Hello,

On Thu, Nov 28, 2019 at 06:26:12PM +0100, Michal Koutný wrote:
> Currently, when pids.max limit is breached in the hierarchy, the event
> is counted and reported in the cgroup where the forking task resides.
> 
> The proper hierarchical behavior is to count and report the event in the
> cgroup whose limit is being exceeded. Apply this behavior in the default
> hierarchy.
> 
> Reasons for RFC:
> 
> 1) If anyone has adjusted their readings to this behavior, this is a BC
>    break.
> 
> 2) This solves no reported bug, just a spotted inconsistency.
> 
> 3) One step further would be to distinguish pids.events and
>    pids.events.local for proper hierarchical counting. (The current
>    behavior wouldn't match neither though.)

Yeah this is incosistent with memcg but there max / high events are
essentially useless because that doesn't indicate actual limit breach.
Both events are interesting - which cgroup's limit was reached and who
suffered because of that.

So, maybe sth like the following?

1. Make max event propagate hierarchically.  This is a behavior change
   but also an obvious bug fix.  Given that internal cgroups don't
   have processes in cgroup2, maybe it's safe enough?

2. Add another (hierarchical, of course) event which counts the number
   of fork rejects.  I can't think of a good name.  Any ideas?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 0/3] cgroup/pids: Make pids.events notifications affine to pids.max
@ 2020-02-05 13:44   ` Michal Koutný
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2020-02-05 13:44 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, Aleksa Sarai, linux-kernel

Currently, when pids.max limit is breached in the hierarchy, the event
is counted and reported in the cgroup where the forking task resides. This
isn't hierarchical neither binds event to its limit

Reasons for RFC:

1) Introduction of new event type.

2) Missing one step further would be to distinguish pids.events and
   pids.events.local.

Changes from v1:  https://lkml.kernel.org/r/20191128172612.10259-1-mkoutny@suse.com
- introduce two separate types of events
- make event counting hierarchical

Michal Koutný (3):
  cgroup/pids: Separate semantics of pids.events related to pids.max
  cgroup/pids: Make event counters hierarchical
  selftests: cgroup: Add basic tests for pids controller

 Documentation/admin-guide/cgroup-v1/pids.rst |   3 +-
 Documentation/admin-guide/cgroup-v2.rst      |  14 ++
 kernel/cgroup/pids.c                         |  92 +++++++--
 tools/testing/selftests/cgroup/Makefile      |   8 +-
 tools/testing/selftests/cgroup/test_pids.c   | 188 +++++++++++++++++++
 5 files changed, 288 insertions(+), 17 deletions(-)
 create mode 100644 tools/testing/selftests/cgroup/test_pids.c

-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 0/3] cgroup/pids: Make pids.events notifications affine to pids.max
@ 2020-02-05 13:44   ` Michal Koutný
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2020-02-05 13:44 UTC (permalink / raw)
  To: cgroups-u79uwXL29TY76Z2rM5mHXA
  Cc: Tejun Heo, Li Zefan, Johannes Weiner, Aleksa Sarai,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Currently, when pids.max limit is breached in the hierarchy, the event
is counted and reported in the cgroup where the forking task resides. This
isn't hierarchical neither binds event to its limit

Reasons for RFC:

1) Introduction of new event type.

2) Missing one step further would be to distinguish pids.events and
   pids.events.local.

Changes from v1:  https://lkml.kernel.org/r/20191128172612.10259-1-mkoutny-IBi9RG/b67k@public.gmane.org
- introduce two separate types of events
- make event counting hierarchical

Michal Koutn√Ω (3):
  cgroup/pids: Separate semantics of pids.events related to pids.max
  cgroup/pids: Make event counters hierarchical
  selftests: cgroup: Add basic tests for pids controller

 Documentation/admin-guide/cgroup-v1/pids.rst |   3 +-
 Documentation/admin-guide/cgroup-v2.rst      |  14 ++
 kernel/cgroup/pids.c                         |  92 +++++++--
 tools/testing/selftests/cgroup/Makefile      |   8 +-
 tools/testing/selftests/cgroup/test_pids.c   | 188 +++++++++++++++++++
 5 files changed, 288 insertions(+), 17 deletions(-)
 create mode 100644 tools/testing/selftests/cgroup/test_pids.c

-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 1/3] cgroup/pids: Separate semantics of pids.events related to pids.max
  2020-02-05 13:44   ` Michal Koutný
@ 2020-02-05 13:44     ` Michal Koutný
  -1 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2020-02-05 13:44 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, Aleksa Sarai, linux-kernel

Currently, when pids.max limit is breached in the hierarchy, the event
is counted and reported in the cgroup where the forking task resides.

This decouples the limit and the notification caused by the limit making
it hard to detect when the actual limit was effected.

Let's introduce new events:
	  max
		The number of times the limit of the cgroup was hit.

	  max.imposed
		The number of times fork failed in the cgroup because of self
		or ancestor limit.

Since it changes semantics of the original "max" event, we introduce
this change only in the v2 API of the controller.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 Documentation/admin-guide/cgroup-v1/pids.rst |  3 +-
 Documentation/admin-guide/cgroup-v2.rst      | 12 ++++
 kernel/cgroup/pids.c                         | 72 +++++++++++++++++---
 3 files changed, 77 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/pids.rst b/Documentation/admin-guide/cgroup-v1/pids.rst
index 6acebd9e72c8..0f9f9a7b1f6c 100644
--- a/Documentation/admin-guide/cgroup-v1/pids.rst
+++ b/Documentation/admin-guide/cgroup-v1/pids.rst
@@ -36,7 +36,8 @@ superset of parent/child/pids.current.
 
 The pids.events file contains event counters:
 
-  - max: Number of times fork failed because limit was hit.
+  - max: Number of times fork failed in the cgroup because limit was hit in
+    self or ancestors.
 
 Example
 -------
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 3f801461f0f3..38edeb79c2d8 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1816,6 +1816,18 @@ PID Interface Files
 	The number of processes currently in the cgroup and its
 	descendants.
 
+  pids.events
+	A read-only flat-keyed file which exists on non-root cgroups.  Unless
+	specified otherwise, a value change in this file generates a file modified
+	event. The following entries are defined.
+
+	  max
+		The number of times the limit of the cgroup was hit.
+
+	  max.imposed
+		The number of times fork failed in the cgroup because of self
+		or ancestor limit.
+
 Organisational operations are not blocked by cgroup policies, so it is
 possible to have pids.current > pids.max.  This can be done by either
 setting the limit to be smaller than pids.current, or attaching enough
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 138059eb730d..bbfb2fb56029 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -50,8 +50,17 @@ struct pids_cgroup {
 	/* Handle for "pids.events" */
 	struct cgroup_file		events_file;
 
-	/* Number of times fork failed because limit was hit. */
+	/*
+	 * Number of times fork failed in subtree because this pids_cgroup
+	 * limit was hit.
+	 */
 	atomic64_t			events_limit;
+
+	/*
+	 * Number of times fork failed in this pids_cgroup because ancestor
+	 * limit was hit.
+	 */
+	atomic64_t			events_limit_imposed;
 };
 
 static struct pids_cgroup *css_pids(struct cgroup_subsys_state *css)
@@ -76,6 +85,7 @@ pids_css_alloc(struct cgroup_subsys_state *parent)
 	atomic64_set(&pids->counter, 0);
 	atomic64_set(&pids->limit, PIDS_MAX);
 	atomic64_set(&pids->events_limit, 0);
+	atomic64_set(&pids->events_limit_imposed, 0);
 	return &pids->css;
 }
 
@@ -140,7 +150,8 @@ static void pids_charge(struct pids_cgroup *pids, int num)
  * the new value to exceed the hierarchical limit. Returns 0 if the charge
  * succeeded, otherwise -EAGAIN.
  */
-static int pids_try_charge(struct pids_cgroup *pids, int num)
+static int pids_try_charge(struct pids_cgroup *pids, int num,
+			   struct pids_cgroup **fail)
 {
 	struct pids_cgroup *p, *q;
 
@@ -153,8 +164,10 @@ static int pids_try_charge(struct pids_cgroup *pids, int num)
 		 * p->limit is %PIDS_MAX then we know that this test will never
 		 * fail.
 		 */
-		if (new > limit)
+		if (new > limit) {
+			*fail = p;
 			goto revert;
+		}
 	}
 
 	return 0;
@@ -217,20 +230,29 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
 static int pids_can_fork(struct task_struct *task)
 {
 	struct cgroup_subsys_state *css;
-	struct pids_cgroup *pids;
+	struct pids_cgroup *pids, *pids_over_limit;
 	int err;
 
 	css = task_css_check(current, pids_cgrp_id, true);
 	pids = css_pids(css);
-	err = pids_try_charge(pids, 1);
+	err = pids_try_charge(pids, 1, &pids_over_limit);
 	if (err) {
-		/* Only log the first time events_limit is incremented. */
-		if (atomic64_inc_return(&pids->events_limit) == 1) {
+		/* Backwards compatibility on v1 where events were notified in
+		 * leaves. */
+		if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
+			pids_over_limit = pids;
+
+		/* Only log the first time events_limit_imposed is incremented. */
+		if (atomic64_inc_return(&pids->events_limit_imposed) == 1) {
 			pr_info("cgroup: fork rejected by pids controller in ");
-			pr_cont_cgroup_path(css->cgroup);
+			pr_cont_cgroup_path(pids->css.cgroup);
 			pr_cont("\n");
 		}
+		atomic64_inc(&pids_over_limit->events_limit);
+
 		cgroup_file_notify(&pids->events_file);
+		if (pids_over_limit != pids)
+			cgroup_file_notify(&pids_over_limit->events_file);
 	}
 	return err;
 }
@@ -309,6 +331,17 @@ static int pids_events_show(struct seq_file *sf, void *v)
 	struct pids_cgroup *pids = css_pids(seq_css(sf));
 
 	seq_printf(sf, "max %lld\n", (s64)atomic64_read(&pids->events_limit));
+	seq_printf(sf, "max.imposed %lld\n",
+		   (s64)atomic64_read(&pids->events_limit_imposed));
+	return 0;
+}
+
+static int pids_events_show_legacy(struct seq_file *sf, void *v)
+{
+	struct pids_cgroup *pids = css_pids(seq_css(sf));
+
+	seq_printf(sf, "max %lld\n",
+		   (s64)atomic64_read(&pids->events_limit_imposed));
 	return 0;
 }
 
@@ -333,6 +366,27 @@ static struct cftype pids_files[] = {
 	{ }	/* terminate */
 };
 
+static struct cftype pids_files_legacy[] = {
+	{
+		.name = "max",
+		.write = pids_max_write,
+		.seq_show = pids_max_show,
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
+	{
+		.name = "current",
+		.read_s64 = pids_current_read,
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
+	{
+		.name = "events",
+		.seq_show = pids_events_show_legacy,
+		.file_offset = offsetof(struct pids_cgroup, events_file),
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
+	{ }	/* terminate */
+};
+
 struct cgroup_subsys pids_cgrp_subsys = {
 	.css_alloc	= pids_css_alloc,
 	.css_free	= pids_css_free,
@@ -341,7 +395,7 @@ struct cgroup_subsys pids_cgrp_subsys = {
 	.can_fork	= pids_can_fork,
 	.cancel_fork	= pids_cancel_fork,
 	.release	= pids_release,
-	.legacy_cftypes	= pids_files,
 	.dfl_cftypes	= pids_files,
+	.legacy_cftypes	= pids_files_legacy,
 	.threaded	= true,
 };
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 1/3] cgroup/pids: Separate semantics of pids.events related to pids.max
@ 2020-02-05 13:44     ` Michal Koutný
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2020-02-05 13:44 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, Aleksa Sarai, linux-kernel

Currently, when pids.max limit is breached in the hierarchy, the event
is counted and reported in the cgroup where the forking task resides.

This decouples the limit and the notification caused by the limit making
it hard to detect when the actual limit was effected.

Let's introduce new events:
	  max
		The number of times the limit of the cgroup was hit.

	  max.imposed
		The number of times fork failed in the cgroup because of self
		or ancestor limit.

Since it changes semantics of the original "max" event, we introduce
this change only in the v2 API of the controller.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Michal Koutn√Ω <mkoutny@suse.com>
---
 Documentation/admin-guide/cgroup-v1/pids.rst |  3 +-
 Documentation/admin-guide/cgroup-v2.rst      | 12 ++++
 kernel/cgroup/pids.c                         | 72 +++++++++++++++++---
 3 files changed, 77 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/pids.rst b/Documentation/admin-guide/cgroup-v1/pids.rst
index 6acebd9e72c8..0f9f9a7b1f6c 100644
--- a/Documentation/admin-guide/cgroup-v1/pids.rst
+++ b/Documentation/admin-guide/cgroup-v1/pids.rst
@@ -36,7 +36,8 @@ superset of parent/child/pids.current.
 
 The pids.events file contains event counters:
 
-  - max: Number of times fork failed because limit was hit.
+  - max: Number of times fork failed in the cgroup because limit was hit in
+    self or ancestors.
 
 Example
 -------
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 3f801461f0f3..38edeb79c2d8 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1816,6 +1816,18 @@ PID Interface Files
 	The number of processes currently in the cgroup and its
 	descendants.
 
+  pids.events
+	A read-only flat-keyed file which exists on non-root cgroups.  Unless
+	specified otherwise, a value change in this file generates a file modified
+	event. The following entries are defined.
+
+	  max
+		The number of times the limit of the cgroup was hit.
+
+	  max.imposed
+		The number of times fork failed in the cgroup because of self
+		or ancestor limit.
+
 Organisational operations are not blocked by cgroup policies, so it is
 possible to have pids.current > pids.max.  This can be done by either
 setting the limit to be smaller than pids.current, or attaching enough
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 138059eb730d..bbfb2fb56029 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -50,8 +50,17 @@ struct pids_cgroup {
 	/* Handle for "pids.events" */
 	struct cgroup_file		events_file;
 
-	/* Number of times fork failed because limit was hit. */
+	/*
+	 * Number of times fork failed in subtree because this pids_cgroup
+	 * limit was hit.
+	 */
 	atomic64_t			events_limit;
+
+	/*
+	 * Number of times fork failed in this pids_cgroup because ancestor
+	 * limit was hit.
+	 */
+	atomic64_t			events_limit_imposed;
 };
 
 static struct pids_cgroup *css_pids(struct cgroup_subsys_state *css)
@@ -76,6 +85,7 @@ pids_css_alloc(struct cgroup_subsys_state *parent)
 	atomic64_set(&pids->counter, 0);
 	atomic64_set(&pids->limit, PIDS_MAX);
 	atomic64_set(&pids->events_limit, 0);
+	atomic64_set(&pids->events_limit_imposed, 0);
 	return &pids->css;
 }
 
@@ -140,7 +150,8 @@ static void pids_charge(struct pids_cgroup *pids, int num)
  * the new value to exceed the hierarchical limit. Returns 0 if the charge
  * succeeded, otherwise -EAGAIN.
  */
-static int pids_try_charge(struct pids_cgroup *pids, int num)
+static int pids_try_charge(struct pids_cgroup *pids, int num,
+			   struct pids_cgroup **fail)
 {
 	struct pids_cgroup *p, *q;
 
@@ -153,8 +164,10 @@ static int pids_try_charge(struct pids_cgroup *pids, int num)
 		 * p->limit is %PIDS_MAX then we know that this test will never
 		 * fail.
 		 */
-		if (new > limit)
+		if (new > limit) {
+			*fail = p;
 			goto revert;
+		}
 	}
 
 	return 0;
@@ -217,20 +230,29 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
 static int pids_can_fork(struct task_struct *task)
 {
 	struct cgroup_subsys_state *css;
-	struct pids_cgroup *pids;
+	struct pids_cgroup *pids, *pids_over_limit;
 	int err;
 
 	css = task_css_check(current, pids_cgrp_id, true);
 	pids = css_pids(css);
-	err = pids_try_charge(pids, 1);
+	err = pids_try_charge(pids, 1, &pids_over_limit);
 	if (err) {
-		/* Only log the first time events_limit is incremented. */
-		if (atomic64_inc_return(&pids->events_limit) == 1) {
+		/* Backwards compatibility on v1 where events were notified in
+		 * leaves. */
+		if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
+			pids_over_limit = pids;
+
+		/* Only log the first time events_limit_imposed is incremented. */
+		if (atomic64_inc_return(&pids->events_limit_imposed) == 1) {
 			pr_info("cgroup: fork rejected by pids controller in ");
-			pr_cont_cgroup_path(css->cgroup);
+			pr_cont_cgroup_path(pids->css.cgroup);
 			pr_cont("\n");
 		}
+		atomic64_inc(&pids_over_limit->events_limit);
+
 		cgroup_file_notify(&pids->events_file);
+		if (pids_over_limit != pids)
+			cgroup_file_notify(&pids_over_limit->events_file);
 	}
 	return err;
 }
@@ -309,6 +331,17 @@ static int pids_events_show(struct seq_file *sf, void *v)
 	struct pids_cgroup *pids = css_pids(seq_css(sf));
 
 	seq_printf(sf, "max %lld\n", (s64)atomic64_read(&pids->events_limit));
+	seq_printf(sf, "max.imposed %lld\n",
+		   (s64)atomic64_read(&pids->events_limit_imposed));
+	return 0;
+}
+
+static int pids_events_show_legacy(struct seq_file *sf, void *v)
+{
+	struct pids_cgroup *pids = css_pids(seq_css(sf));
+
+	seq_printf(sf, "max %lld\n",
+		   (s64)atomic64_read(&pids->events_limit_imposed));
 	return 0;
 }
 
@@ -333,6 +366,27 @@ static struct cftype pids_files[] = {
 	{ }	/* terminate */
 };
 
+static struct cftype pids_files_legacy[] = {
+	{
+		.name = "max",
+		.write = pids_max_write,
+		.seq_show = pids_max_show,
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
+	{
+		.name = "current",
+		.read_s64 = pids_current_read,
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
+	{
+		.name = "events",
+		.seq_show = pids_events_show_legacy,
+		.file_offset = offsetof(struct pids_cgroup, events_file),
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
+	{ }	/* terminate */
+};
+
 struct cgroup_subsys pids_cgrp_subsys = {
 	.css_alloc	= pids_css_alloc,
 	.css_free	= pids_css_free,
@@ -341,7 +395,7 @@ struct cgroup_subsys pids_cgrp_subsys = {
 	.can_fork	= pids_can_fork,
 	.cancel_fork	= pids_cancel_fork,
 	.release	= pids_release,
-	.legacy_cftypes	= pids_files,
 	.dfl_cftypes	= pids_files,
+	.legacy_cftypes	= pids_files_legacy,
 	.threaded	= true,
 };
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 2/3] cgroup/pids: Make event counters hierarchical
  2020-02-05 13:44   ` Michal Koutný
@ 2020-02-05 13:44     ` Michal Koutný
  -1 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2020-02-05 13:44 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, Aleksa Sarai, linux-kernel

The pids.events file should honor the hierarchy, so make the events
propagate from their origin up to the root on the unified hierarchy. The
legacy hierarchy behavior remains the same.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 Documentation/admin-guide/cgroup-v2.rst |  4 ++-
 kernel/cgroup/pids.c                    | 44 ++++++++++++++++---------
 2 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 38edeb79c2d8..5dda08f268b7 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1819,7 +1819,9 @@ PID Interface Files
   pids.events
 	A read-only flat-keyed file which exists on non-root cgroups.  Unless
 	specified otherwise, a value change in this file generates a file modified
-	event. The following entries are defined.
+	event. Fields in this file are hierarchical and the file modified event
+	can be generated due to an event down the hierarchy. The following
+	entries are defined.
 
 	  max
 		The number of times the limit of the cgroup was hit.
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index bbfb2fb56029..5d65b36931cd 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -223,6 +223,33 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
 	}
 }
 
+static void pids_event(struct pids_cgroup *pids_forking,
+		       struct pids_cgroup *pids_over_limit)
+{
+	struct pids_cgroup *p;
+	bool limit = false;
+
+	for (p = pids_forking; parent_pids(p); p = parent_pids(p)) {
+		/* Only log the first time events_limit_imposed is incremented. */
+		if (atomic64_inc_return(&p->events_limit_imposed) == 1 &&
+		    p == pids_forking) {
+			pr_info("cgroup: fork rejected by pids controller in ");
+			pr_cont_cgroup_path(p->css.cgroup);
+			pr_cont("\n");
+		}
+
+		if (p == pids_over_limit)
+			limit = true;
+		if (limit)
+			atomic64_inc(&p->events_limit);
+
+		cgroup_file_notify(&p->events_file);
+		/* Events are only notified in pids_forking on v1 */
+		if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
+			break;
+	}
+}
+
 /*
  * task_css_check(true) in pids_can_fork() and pids_cancel_fork() relies
  * on cgroup_threadgroup_change_begin() held by the copy_process().
@@ -237,22 +264,7 @@ static int pids_can_fork(struct task_struct *task)
 	pids = css_pids(css);
 	err = pids_try_charge(pids, 1, &pids_over_limit);
 	if (err) {
-		/* Backwards compatibility on v1 where events were notified in
-		 * leaves. */
-		if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
-			pids_over_limit = pids;
-
-		/* Only log the first time events_limit_imposed is incremented. */
-		if (atomic64_inc_return(&pids->events_limit_imposed) == 1) {
-			pr_info("cgroup: fork rejected by pids controller in ");
-			pr_cont_cgroup_path(pids->css.cgroup);
-			pr_cont("\n");
-		}
-		atomic64_inc(&pids_over_limit->events_limit);
-
-		cgroup_file_notify(&pids->events_file);
-		if (pids_over_limit != pids)
-			cgroup_file_notify(&pids_over_limit->events_file);
+		pids_event(pids, pids_over_limit);
 	}
 	return err;
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 2/3] cgroup/pids: Make event counters hierarchical
@ 2020-02-05 13:44     ` Michal Koutný
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2020-02-05 13:44 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, Aleksa Sarai, linux-kernel

The pids.events file should honor the hierarchy, so make the events
propagate from their origin up to the root on the unified hierarchy. The
legacy hierarchy behavior remains the same.

Signed-off-by: Michal Koutn√Ω <mkoutny@suse.com>
---
 Documentation/admin-guide/cgroup-v2.rst |  4 ++-
 kernel/cgroup/pids.c                    | 44 ++++++++++++++++---------
 2 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 38edeb79c2d8..5dda08f268b7 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1819,7 +1819,9 @@ PID Interface Files
   pids.events
 	A read-only flat-keyed file which exists on non-root cgroups.  Unless
 	specified otherwise, a value change in this file generates a file modified
-	event. The following entries are defined.
+	event. Fields in this file are hierarchical and the file modified event
+	can be generated due to an event down the hierarchy. The following
+	entries are defined.
 
 	  max
 		The number of times the limit of the cgroup was hit.
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index bbfb2fb56029..5d65b36931cd 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -223,6 +223,33 @@ static void pids_cancel_attach(struct cgroup_taskset *tset)
 	}
 }
 
+static void pids_event(struct pids_cgroup *pids_forking,
+		       struct pids_cgroup *pids_over_limit)
+{
+	struct pids_cgroup *p;
+	bool limit = false;
+
+	for (p = pids_forking; parent_pids(p); p = parent_pids(p)) {
+		/* Only log the first time events_limit_imposed is incremented. */
+		if (atomic64_inc_return(&p->events_limit_imposed) == 1 &&
+		    p == pids_forking) {
+			pr_info("cgroup: fork rejected by pids controller in ");
+			pr_cont_cgroup_path(p->css.cgroup);
+			pr_cont("\n");
+		}
+
+		if (p == pids_over_limit)
+			limit = true;
+		if (limit)
+			atomic64_inc(&p->events_limit);
+
+		cgroup_file_notify(&p->events_file);
+		/* Events are only notified in pids_forking on v1 */
+		if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
+			break;
+	}
+}
+
 /*
  * task_css_check(true) in pids_can_fork() and pids_cancel_fork() relies
  * on cgroup_threadgroup_change_begin() held by the copy_process().
@@ -237,22 +264,7 @@ static int pids_can_fork(struct task_struct *task)
 	pids = css_pids(css);
 	err = pids_try_charge(pids, 1, &pids_over_limit);
 	if (err) {
-		/* Backwards compatibility on v1 where events were notified in
-		 * leaves. */
-		if (!cgroup_subsys_on_dfl(pids_cgrp_subsys))
-			pids_over_limit = pids;
-
-		/* Only log the first time events_limit_imposed is incremented. */
-		if (atomic64_inc_return(&pids->events_limit_imposed) == 1) {
-			pr_info("cgroup: fork rejected by pids controller in ");
-			pr_cont_cgroup_path(pids->css.cgroup);
-			pr_cont("\n");
-		}
-		atomic64_inc(&pids_over_limit->events_limit);
-
-		cgroup_file_notify(&pids->events_file);
-		if (pids_over_limit != pids)
-			cgroup_file_notify(&pids_over_limit->events_file);
+		pids_event(pids, pids_over_limit);
 	}
 	return err;
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 3/3] selftests: cgroup: Add basic tests for pids controller
  2020-02-05 13:44   ` Michal Koutný
@ 2020-02-05 13:44     ` Michal Koutný
  -1 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2020-02-05 13:44 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, Aleksa Sarai, linux-kernel

This commit adds (and wires in) new test program for checking basic pids
controller functionality -- restricting tasks in a cgroup and correct
event counting.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 tools/testing/selftests/cgroup/Makefile    |   8 +-
 tools/testing/selftests/cgroup/test_pids.c | 188 +++++++++++++++++++++
 2 files changed, 193 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/cgroup/test_pids.c

diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
index 66aafe1f5746..15a8578f40b4 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -5,12 +5,14 @@ all:
 
 TEST_FILES     := with_stress.sh
 TEST_PROGS     := test_stress.sh
-TEST_GEN_PROGS = test_memcontrol
-TEST_GEN_PROGS += test_core
+TEST_GEN_PROGS = test_core
 TEST_GEN_PROGS += test_freezer
+TEST_GEN_PROGS += test_memcontrol
+TEST_GEN_PROGS += test_pids
 
 include ../lib.mk
 
-$(OUTPUT)/test_memcontrol: cgroup_util.c
 $(OUTPUT)/test_core: cgroup_util.c
 $(OUTPUT)/test_freezer: cgroup_util.c
+$(OUTPUT)/test_memcontrol: cgroup_util.c
+$(OUTPUT)/test_pids: cgroup_util.c
diff --git a/tools/testing/selftests/cgroup/test_pids.c b/tools/testing/selftests/cgroup/test_pids.c
new file mode 100644
index 000000000000..6545d81f2838
--- /dev/null
+++ b/tools/testing/selftests/cgroup/test_pids.c
@@ -0,0 +1,188 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+
+#include <errno.h>
+#include <linux/limits.h>
+#include <signal.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "../kselftest.h"
+#include "cgroup_util.h"
+
+static int run_success(const char *cgroup, void *arg)
+{
+	return 0;
+}
+
+static int run_pause(const char *cgroup, void *arg)
+{
+	return pause();
+}
+
+/*
+ * This test checks that pids.max prevents forking new children above the
+ * specified limit in the cgroup.
+ */
+static int test_pids_max(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *cg_pids;
+	int pid;
+
+
+	cg_pids = cg_name(root, "pids_test");
+	if (!cg_pids)
+		goto cleanup;
+
+	if (cg_create(cg_pids))
+		goto cleanup;
+
+	if (cg_read_strcmp(cg_pids, "pids.max", "max\n"))
+		goto cleanup;
+
+	if (cg_write(cg_pids, "pids.max", "2"))
+		goto cleanup;
+
+	if (cg_enter_current(cg_pids))
+		goto cleanup;
+
+	pid = cg_run_nowait(cg_pids, run_pause, NULL);
+	if (pid < 0)
+		goto cleanup;
+
+	if (cg_run_nowait(cg_pids, run_success, NULL) != -1 || errno != EAGAIN)
+		goto cleanup;
+
+	if (kill(pid, SIGINT))
+		goto cleanup;
+
+	ret = KSFT_PASS;
+
+cleanup:
+	cg_enter_current(root);
+	cg_destroy(cg_pids);
+	free(cg_pids);
+
+	return ret;
+}
+
+/*
+ * This test checks that while pids.max prevents forking new children above the
+ * specified limit in the cgroup appropriate events are generated in the
+ * hiearchy.
+ */
+static int test_pids_events(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *cg_parent = NULL, *cg_child = NULL;
+	int pid;
+
+
+	cg_parent = cg_name(root, "pids_parent");
+	cg_child = cg_name(cg_parent, "pids_child");
+	if (!cg_parent || !cg_child)
+		goto cleanup;
+
+	if (cg_create(cg_parent))
+		goto cleanup;
+	if (cg_write(cg_parent, "cgroup.subtree_control", "+pids"))
+		goto cleanup;
+	if (cg_create(cg_child))
+		goto cleanup;
+
+	if (cg_write(cg_parent, "pids.max", "2"))
+		goto cleanup;
+
+	if (cg_read_strcmp(cg_child, "pids.max", "max\n"))
+		goto cleanup;
+
+	if (cg_enter_current(cg_child))
+		goto cleanup;
+
+	pid = cg_run_nowait(cg_child, run_pause, NULL);
+	if (pid < 0)
+		goto cleanup;
+
+	if (cg_run_nowait(cg_child, run_success, NULL) != -1 || errno != EAGAIN)
+		goto cleanup;
+
+	if (kill(pid, SIGINT))
+		goto cleanup;
+
+
+	if (cg_read_key_long(cg_child, "pids.events", "max ") != 0)
+		goto cleanup;
+	if (cg_read_key_long(cg_child, "pids.events", "max.imposed ") != 1)
+		goto cleanup;
+
+	if (cg_read_key_long(cg_parent, "pids.events", "max ") != 1)
+		goto cleanup;
+	if (cg_read_key_long(cg_parent, "pids.events", "max.imposed ") != 1)
+		goto cleanup;
+
+
+	ret = KSFT_PASS;
+
+cleanup:
+	cg_enter_current(root);
+	if (cg_child)
+		cg_destroy(cg_child);
+	if (cg_parent)
+		cg_destroy(cg_parent);
+	free(cg_child);
+	free(cg_parent);
+
+	return ret;
+}
+
+
+
+#define T(x) { x, #x }
+struct pids_test {
+	int (*fn)(const char *root);
+	const char *name;
+} tests[] = {
+	T(test_pids_max),
+	T(test_pids_events),
+};
+#undef T
+
+int main(int argc, char **argv)
+{
+	char root[PATH_MAX];
+	int i, ret = EXIT_SUCCESS;
+
+	if (cg_find_unified_root(root, sizeof(root)))
+		ksft_exit_skip("cgroup v2 isn't mounted\n");
+
+	/*
+	 * Check that pids controller is available:
+	 * pids is listed in cgroup.controllers
+	 */
+	if (cg_read_strstr(root, "cgroup.controllers", "pids"))
+		ksft_exit_skip("pids controller isn't available\n");
+
+	if (cg_read_strstr(root, "cgroup.subtree_control", "pids"))
+		if (cg_write(root, "cgroup.subtree_control", "+pids"))
+			ksft_exit_skip("Failed to set pids controller\n");
+
+	for (i = 0; i < ARRAY_SIZE(tests); i++) {
+		switch (tests[i].fn(root)) {
+		case KSFT_PASS:
+			ksft_test_result_pass("%s\n", tests[i].name);
+			break;
+		case KSFT_SKIP:
+			ksft_test_result_skip("%s\n", tests[i].name);
+			break;
+		default:
+			ret = EXIT_FAILURE;
+			ksft_test_result_fail("%s\n", tests[i].name);
+			break;
+		}
+	}
+
+	return ret;
+}
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH v2 3/3] selftests: cgroup: Add basic tests for pids controller
@ 2020-02-05 13:44     ` Michal Koutný
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2020-02-05 13:44 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, Li Zefan, Johannes Weiner, Aleksa Sarai, linux-kernel

This commit adds (and wires in) new test program for checking basic pids
controller functionality -- restricting tasks in a cgroup and correct
event counting.

Signed-off-by: Michal Koutn√Ω <mkoutny@suse.com>
---
 tools/testing/selftests/cgroup/Makefile    |   8 +-
 tools/testing/selftests/cgroup/test_pids.c | 188 +++++++++++++++++++++
 2 files changed, 193 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/cgroup/test_pids.c

diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
index 66aafe1f5746..15a8578f40b4 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -5,12 +5,14 @@ all:
 
 TEST_FILES     := with_stress.sh
 TEST_PROGS     := test_stress.sh
-TEST_GEN_PROGS = test_memcontrol
-TEST_GEN_PROGS += test_core
+TEST_GEN_PROGS = test_core
 TEST_GEN_PROGS += test_freezer
+TEST_GEN_PROGS += test_memcontrol
+TEST_GEN_PROGS += test_pids
 
 include ../lib.mk
 
-$(OUTPUT)/test_memcontrol: cgroup_util.c
 $(OUTPUT)/test_core: cgroup_util.c
 $(OUTPUT)/test_freezer: cgroup_util.c
+$(OUTPUT)/test_memcontrol: cgroup_util.c
+$(OUTPUT)/test_pids: cgroup_util.c
diff --git a/tools/testing/selftests/cgroup/test_pids.c b/tools/testing/selftests/cgroup/test_pids.c
new file mode 100644
index 000000000000..6545d81f2838
--- /dev/null
+++ b/tools/testing/selftests/cgroup/test_pids.c
@@ -0,0 +1,188 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+
+#include <errno.h>
+#include <linux/limits.h>
+#include <signal.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "../kselftest.h"
+#include "cgroup_util.h"
+
+static int run_success(const char *cgroup, void *arg)
+{
+	return 0;
+}
+
+static int run_pause(const char *cgroup, void *arg)
+{
+	return pause();
+}
+
+/*
+ * This test checks that pids.max prevents forking new children above the
+ * specified limit in the cgroup.
+ */
+static int test_pids_max(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *cg_pids;
+	int pid;
+
+
+	cg_pids = cg_name(root, "pids_test");
+	if (!cg_pids)
+		goto cleanup;
+
+	if (cg_create(cg_pids))
+		goto cleanup;
+
+	if (cg_read_strcmp(cg_pids, "pids.max", "max\n"))
+		goto cleanup;
+
+	if (cg_write(cg_pids, "pids.max", "2"))
+		goto cleanup;
+
+	if (cg_enter_current(cg_pids))
+		goto cleanup;
+
+	pid = cg_run_nowait(cg_pids, run_pause, NULL);
+	if (pid < 0)
+		goto cleanup;
+
+	if (cg_run_nowait(cg_pids, run_success, NULL) != -1 || errno != EAGAIN)
+		goto cleanup;
+
+	if (kill(pid, SIGINT))
+		goto cleanup;
+
+	ret = KSFT_PASS;
+
+cleanup:
+	cg_enter_current(root);
+	cg_destroy(cg_pids);
+	free(cg_pids);
+
+	return ret;
+}
+
+/*
+ * This test checks that while pids.max prevents forking new children above the
+ * specified limit in the cgroup appropriate events are generated in the
+ * hiearchy.
+ */
+static int test_pids_events(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *cg_parent = NULL, *cg_child = NULL;
+	int pid;
+
+
+	cg_parent = cg_name(root, "pids_parent");
+	cg_child = cg_name(cg_parent, "pids_child");
+	if (!cg_parent || !cg_child)
+		goto cleanup;
+
+	if (cg_create(cg_parent))
+		goto cleanup;
+	if (cg_write(cg_parent, "cgroup.subtree_control", "+pids"))
+		goto cleanup;
+	if (cg_create(cg_child))
+		goto cleanup;
+
+	if (cg_write(cg_parent, "pids.max", "2"))
+		goto cleanup;
+
+	if (cg_read_strcmp(cg_child, "pids.max", "max\n"))
+		goto cleanup;
+
+	if (cg_enter_current(cg_child))
+		goto cleanup;
+
+	pid = cg_run_nowait(cg_child, run_pause, NULL);
+	if (pid < 0)
+		goto cleanup;
+
+	if (cg_run_nowait(cg_child, run_success, NULL) != -1 || errno != EAGAIN)
+		goto cleanup;
+
+	if (kill(pid, SIGINT))
+		goto cleanup;
+
+
+	if (cg_read_key_long(cg_child, "pids.events", "max ") != 0)
+		goto cleanup;
+	if (cg_read_key_long(cg_child, "pids.events", "max.imposed ") != 1)
+		goto cleanup;
+
+	if (cg_read_key_long(cg_parent, "pids.events", "max ") != 1)
+		goto cleanup;
+	if (cg_read_key_long(cg_parent, "pids.events", "max.imposed ") != 1)
+		goto cleanup;
+
+
+	ret = KSFT_PASS;
+
+cleanup:
+	cg_enter_current(root);
+	if (cg_child)
+		cg_destroy(cg_child);
+	if (cg_parent)
+		cg_destroy(cg_parent);
+	free(cg_child);
+	free(cg_parent);
+
+	return ret;
+}
+
+
+
+#define T(x) { x, #x }
+struct pids_test {
+	int (*fn)(const char *root);
+	const char *name;
+} tests[] = {
+	T(test_pids_max),
+	T(test_pids_events),
+};
+#undef T
+
+int main(int argc, char **argv)
+{
+	char root[PATH_MAX];
+	int i, ret = EXIT_SUCCESS;
+
+	if (cg_find_unified_root(root, sizeof(root)))
+		ksft_exit_skip("cgroup v2 isn't mounted\n");
+
+	/*
+	 * Check that pids controller is available:
+	 * pids is listed in cgroup.controllers
+	 */
+	if (cg_read_strstr(root, "cgroup.controllers", "pids"))
+		ksft_exit_skip("pids controller isn't available\n");
+
+	if (cg_read_strstr(root, "cgroup.subtree_control", "pids"))
+		if (cg_write(root, "cgroup.subtree_control", "+pids"))
+			ksft_exit_skip("Failed to set pids controller\n");
+
+	for (i = 0; i < ARRAY_SIZE(tests); i++) {
+		switch (tests[i].fn(root)) {
+		case KSFT_PASS:
+			ksft_test_result_pass("%s\n", tests[i].name);
+			break;
+		case KSFT_SKIP:
+			ksft_test_result_skip("%s\n", tests[i].name);
+			break;
+		default:
+			ret = EXIT_FAILURE;
+			ksft_test_result_fail("%s\n", tests[i].name);
+			break;
+		}
+	}
+
+	return ret;
+}
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2 1/3] cgroup/pids: Separate semantics of pids.events related to pids.max
@ 2020-02-12 23:03       ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2020-02-12 23:03 UTC (permalink / raw)
  To: Michal Koutný
  Cc: cgroups, Li Zefan, Johannes Weiner, Aleksa Sarai, linux-kernel

On Wed, Feb 05, 2020 at 02:44:24PM +0100, Michal Koutný wrote:
> Currently, when pids.max limit is breached in the hierarchy, the event
> is counted and reported in the cgroup where the forking task resides.
> 
> This decouples the limit and the notification caused by the limit making
> it hard to detect when the actual limit was effected.
> 
> Let's introduce new events:
> 	  max
> 		The number of times the limit of the cgroup was hit.
> 
> 	  max.imposed
> 		The number of times fork failed in the cgroup because of self
> 		or ancestor limit.

Can you please follow the same convention as memory.events and
memory.events.local?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2 1/3] cgroup/pids: Separate semantics of pids.events related to pids.max
@ 2020-02-12 23:03       ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2020-02-12 23:03 UTC (permalink / raw)
  To: Michal Koutný
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Li Zefan, Johannes Weiner,
	Aleksa Sarai, linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Wed, Feb 05, 2020 at 02:44:24PM +0100, Michal Koutný wrote:
> Currently, when pids.max limit is breached in the hierarchy, the event
> is counted and reported in the cgroup where the forking task resides.
> 
> This decouples the limit and the notification caused by the limit making
> it hard to detect when the actual limit was effected.
> 
> Let's introduce new events:
> 	  max
> 		The number of times the limit of the cgroup was hit.
> 
> 	  max.imposed
> 		The number of times fork failed in the cgroup because of self
> 		or ancestor limit.

Can you please follow the same convention as memory.events and
memory.events.local?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-02-12 23:03 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-28 17:26 [RFC PATCH] cgroup/pids: Make pids.events notifications affine to pids.max Michal Koutný
2019-11-28 17:26 ` Michal Koutný
2019-12-02 19:11 ` Tejun Heo
2020-02-05 13:44 ` [RFC PATCH v2 0/3] " Michal Koutný
2020-02-05 13:44   ` Michal Koutný
2020-02-05 13:44   ` [RFC PATCH v2 1/3] cgroup/pids: Separate semantics of pids.events related " Michal Koutný
2020-02-05 13:44     ` Michal Koutný
2020-02-12 23:03     ` Tejun Heo
2020-02-12 23:03       ` Tejun Heo
2020-02-05 13:44   ` [RFC PATCH v2 2/3] cgroup/pids: Make event counters hierarchical Michal Koutný
2020-02-05 13:44     ` Michal Koutný
2020-02-05 13:44   ` [RFC PATCH v2 3/3] selftests: cgroup: Add basic tests for pids controller Michal Koutný
2020-02-05 13:44     ` Michal Koutný

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.