linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Optimize single thread migration
@ 2019-10-04 10:57 Michal Koutný
  2019-10-04 10:57 ` [PATCH 1/5] cgroup: Update comments about task exit path Michal Koutný
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Michal Koutný @ 2019-10-04 10:57 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, linux-kernel, Li Zefan, Johannes Weiner

Hello.

The important part is the patch 02 where the reasoning is.

The rest is mostly auxiliar and split out into separate commits for
better readability.

The patches are based on v5.3. 

Michal


Michal Koutný (5):
  cgroup: Update comments about task exit path
  cgroup: Optimize single thread migration
  selftests: cgroup: Simplify task self migration
  selftests: cgroup: Add task migration tests
  selftests: cgroup: Run test_core under interfering stress

 kernel/cgroup/cgroup-internal.h               |   5 +-
 kernel/cgroup/cgroup-v1.c                     |   5 +-
 kernel/cgroup/cgroup.c                        |  68 ++++----
 tools/testing/selftests/cgroup/Makefile       |   4 +-
 tools/testing/selftests/cgroup/cgroup_util.c  |  42 ++++-
 tools/testing/selftests/cgroup/cgroup_util.h  |   6 +-
 tools/testing/selftests/cgroup/test_core.c    | 146 ++++++++++++++++++
 tools/testing/selftests/cgroup/test_freezer.c |   2 +-
 tools/testing/selftests/cgroup/test_stress.sh |   4 +
 tools/testing/selftests/cgroup/with_stress.sh | 101 ++++++++++++
 10 files changed, 342 insertions(+), 41 deletions(-)
 create mode 100755 tools/testing/selftests/cgroup/test_stress.sh
 create mode 100755 tools/testing/selftests/cgroup/with_stress.sh

-- 
2.21.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/5] cgroup: Update comments about task exit path
  2019-10-04 10:57 [PATCH 0/5] Optimize single thread migration Michal Koutný
@ 2019-10-04 10:57 ` Michal Koutný
  2019-10-04 10:57 ` [PATCH 2/5] cgroup: Optimize single thread migration Michal Koutný
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Michal Koutný @ 2019-10-04 10:57 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, linux-kernel, Li Zefan, Johannes Weiner

We no longer take cgroup_mutex in cgroup_exit and the exiting tasks are
not moved to init_css_set, reflect that in several comments to prevent
confusion.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup.c | 29 +++++++++--------------------
 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 753afbca549f..1488bb732902 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -899,8 +899,7 @@ static void css_set_move_task(struct task_struct *task,
 		/*
 		 * We are synchronized through cgroup_threadgroup_rwsem
 		 * against PF_EXITING setting such that we can't race
-		 * against cgroup_exit() changing the css_set to
-		 * init_css_set and dropping the old one.
+		 * against cgroup_exit()/cgroup_free() dropping the css_set.
 		 */
 		WARN_ON_ONCE(task->flags & PF_EXITING);
 
@@ -1430,9 +1429,8 @@ struct cgroup *task_cgroup_from_root(struct task_struct *task,
 				     struct cgroup_root *root)
 {
 	/*
-	 * No need to lock the task - since we hold cgroup_mutex the
-	 * task can't change groups, so the only thing that can happen
-	 * is that it exits and its css is set back to init_css_set.
+	 * No need to lock the task - since we hold css_set_lock the
+	 * task can't change groups.
 	 */
 	return cset_cgroup_from_root(task_css_set(task), root);
 }
@@ -6020,7 +6018,7 @@ void cgroup_post_fork(struct task_struct *child)
 		struct css_set *cset;
 
 		spin_lock_irq(&css_set_lock);
-		cset = task_css_set(current);
+		cset = task_css_set(current); /* current is @child's parent */
 		if (list_empty(&child->cg_list)) {
 			get_css_set(cset);
 			cset->nr_tasks++;
@@ -6063,20 +6061,8 @@ void cgroup_post_fork(struct task_struct *child)
  * cgroup_exit - detach cgroup from exiting task
  * @tsk: pointer to task_struct of exiting process
  *
- * Description: Detach cgroup from @tsk and release it.
- *
- * Note that cgroups marked notify_on_release force every task in
- * them to take the global cgroup_mutex mutex when exiting.
- * This could impact scaling on very large systems.  Be reluctant to
- * use notify_on_release cgroups where very high task exit scaling
- * is required on large systems.
+ * Description: Detach cgroup from @tsk.
  *
- * We set the exiting tasks cgroup to the root cgroup (top_cgroup).  We
- * call cgroup_exit() while the task is still competent to handle
- * notify_on_release(), then leave the task attached to the root cgroup in
- * each hierarchy for the remainder of its exit.  No need to bother with
- * init_css_set refcnting.  init_css_set never goes away and we can't race
- * with migration path - PF_EXITING is visible to migration path.
  */
 void cgroup_exit(struct task_struct *tsk)
 {
@@ -6086,7 +6072,8 @@ void cgroup_exit(struct task_struct *tsk)
 
 	/*
 	 * Unlink from @tsk from its css_set.  As migration path can't race
-	 * with us, we can check css_set and cg_list without synchronization.
+	 * with us (thanks to cgroup_threadgroup_rwsem), we can check css_set
+	 * and cg_list without synchronization.
 	 */
 	cset = task_css_set(tsk);
 
@@ -6102,6 +6089,8 @@ void cgroup_exit(struct task_struct *tsk)
 
 		spin_unlock_irq(&css_set_lock);
 	} else {
+		/* Take reference to avoid freeing init_css_set in cgroup_free,
+		 * see cgroup_fork(). */
 		get_css_set(cset);
 	}
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/5] cgroup: Optimize single thread migration
  2019-10-04 10:57 [PATCH 0/5] Optimize single thread migration Michal Koutný
  2019-10-04 10:57 ` [PATCH 1/5] cgroup: Update comments about task exit path Michal Koutný
@ 2019-10-04 10:57 ` Michal Koutný
  2019-10-04 10:57 ` [PATCH 3/5] selftests: cgroup: Simplify task self migration Michal Koutný
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Michal Koutný @ 2019-10-04 10:57 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, linux-kernel, Li Zefan, Johannes Weiner

There are reports of users who use thread migrations between cgroups and
they report performance drop after d59cfc09c32a ("sched, cgroup: replace
signal_struct->group_rwsem with a global percpu_rwsem"). The effect is
pronounced on machines with more CPUs.

The migration is affected by forking noise happening in the background,
after the mentioned commit a migrating thread must wait for all
(forking) processes on the system, not only of its threadgroup.

There are several places that need to synchronize with migration:
	a) do_exit,
	b) de_thread,
	c) copy_process,
	d) cgroup_update_dfl_csses,
	e) parallel migration (cgroup_{proc,thread}s_write).

In the case of self-migrating thread, we relax the synchronization on
cgroup_threadgroup_rwsem to avoid the cost of waiting. d) and e) are
excluded with cgroup_mutex, c) does not matter in case of single thread
migration and the executing thread cannot exec(2) or exit(2) while it is
writing into cgroup.threads. In case of do_exit because of signal
delivery, we either exit before the migration or finish the migration
(of not yet PF_EXITING thread) and die afterwards.

This patch handles only the case of self-migration by writing "0" into
cgroup.threads. For simplicity, we always take cgroup_threadgroup_rwsem
with numeric PIDs.

This change improves migration dependent workload performance similar
to per-signal_struct state.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup-internal.h |  5 +++--
 kernel/cgroup/cgroup-v1.c       |  5 +++--
 kernel/cgroup/cgroup.c          | 39 +++++++++++++++++++++++++--------
 3 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index 809e34a3c017..90d1710fef6c 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -231,9 +231,10 @@ int cgroup_migrate(struct task_struct *leader, bool threadgroup,
 
 int cgroup_attach_task(struct cgroup *dst_cgrp, struct task_struct *leader,
 		       bool threadgroup);
-struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup)
+struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup,
+					     bool *locked)
 	__acquires(&cgroup_threadgroup_rwsem);
-void cgroup_procs_write_finish(struct task_struct *task)
+void cgroup_procs_write_finish(struct task_struct *task, bool locked)
 	__releases(&cgroup_threadgroup_rwsem);
 
 void cgroup_lock_and_drain_offline(struct cgroup *cgrp);
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 88006be40ea3..dd0b263430d5 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -514,12 +514,13 @@ static ssize_t __cgroup1_procs_write(struct kernfs_open_file *of,
 	struct task_struct *task;
 	const struct cred *cred, *tcred;
 	ssize_t ret;
+	bool locked;
 
 	cgrp = cgroup_kn_lock_live(of->kn, false);
 	if (!cgrp)
 		return -ENODEV;
 
-	task = cgroup_procs_write_start(buf, threadgroup);
+	task = cgroup_procs_write_start(buf, threadgroup, &locked);
 	ret = PTR_ERR_OR_ZERO(task);
 	if (ret)
 		goto out_unlock;
@@ -541,7 +542,7 @@ static ssize_t __cgroup1_procs_write(struct kernfs_open_file *of,
 	ret = cgroup_attach_task(cgrp, task, threadgroup);
 
 out_finish:
-	cgroup_procs_write_finish(task);
+	cgroup_procs_write_finish(task, locked);
 out_unlock:
 	cgroup_kn_unlock(of->kn);
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 1488bb732902..3f806dbe279b 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2822,7 +2822,8 @@ int cgroup_attach_task(struct cgroup *dst_cgrp, struct task_struct *leader,
 	return ret;
 }
 
-struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup)
+struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup,
+					     bool *locked)
 	__acquires(&cgroup_threadgroup_rwsem)
 {
 	struct task_struct *tsk;
@@ -2831,7 +2832,21 @@ struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup)
 	if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
 		return ERR_PTR(-EINVAL);
 
-	percpu_down_write(&cgroup_threadgroup_rwsem);
+	/*
+	 * If we migrate a single thread, we don't care about threadgroup
+	 * stability. If the thread is `current`, it won't exit(2) under our
+	 * hands or change PID through exec(2). We exclude
+	 * cgroup_update_dfl_csses and other cgroup_{proc,thread}s_write
+	 * callers by cgroup_mutex.
+	 * Therefore, we can skip the global lock.
+	 */
+	lockdep_assert_held(&cgroup_mutex);
+	if (pid || threadgroup) {
+		percpu_down_write(&cgroup_threadgroup_rwsem);
+		*locked = true;
+	} else {
+		*locked = false;
+	}
 
 	rcu_read_lock();
 	if (pid) {
@@ -2862,13 +2877,16 @@ struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup)
 	goto out_unlock_rcu;
 
 out_unlock_threadgroup:
-	percpu_up_write(&cgroup_threadgroup_rwsem);
+	if (*locked) {
+		percpu_up_write(&cgroup_threadgroup_rwsem);
+		*locked = false;
+	}
 out_unlock_rcu:
 	rcu_read_unlock();
 	return tsk;
 }
 
-void cgroup_procs_write_finish(struct task_struct *task)
+void cgroup_procs_write_finish(struct task_struct *task, bool locked)
 	__releases(&cgroup_threadgroup_rwsem)
 {
 	struct cgroup_subsys *ss;
@@ -2877,7 +2895,8 @@ void cgroup_procs_write_finish(struct task_struct *task)
 	/* release reference from cgroup_procs_write_start() */
 	put_task_struct(task);
 
-	percpu_up_write(&cgroup_threadgroup_rwsem);
+	if (locked)
+		percpu_up_write(&cgroup_threadgroup_rwsem);
 	for_each_subsys(ss, ssid)
 		if (ss->post_attach)
 			ss->post_attach();
@@ -4752,12 +4771,13 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
 	struct cgroup *src_cgrp, *dst_cgrp;
 	struct task_struct *task;
 	ssize_t ret;
+	bool locked;
 
 	dst_cgrp = cgroup_kn_lock_live(of->kn, false);
 	if (!dst_cgrp)
 		return -ENODEV;
 
-	task = cgroup_procs_write_start(buf, true);
+	task = cgroup_procs_write_start(buf, true, &locked);
 	ret = PTR_ERR_OR_ZERO(task);
 	if (ret)
 		goto out_unlock;
@@ -4775,7 +4795,7 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
 	ret = cgroup_attach_task(dst_cgrp, task, true);
 
 out_finish:
-	cgroup_procs_write_finish(task);
+	cgroup_procs_write_finish(task, locked);
 out_unlock:
 	cgroup_kn_unlock(of->kn);
 
@@ -4793,6 +4813,7 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of,
 	struct cgroup *src_cgrp, *dst_cgrp;
 	struct task_struct *task;
 	ssize_t ret;
+	bool locked;
 
 	buf = strstrip(buf);
 
@@ -4800,7 +4821,7 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of,
 	if (!dst_cgrp)
 		return -ENODEV;
 
-	task = cgroup_procs_write_start(buf, false);
+	task = cgroup_procs_write_start(buf, false, &locked);
 	ret = PTR_ERR_OR_ZERO(task);
 	if (ret)
 		goto out_unlock;
@@ -4824,7 +4845,7 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of,
 	ret = cgroup_attach_task(dst_cgrp, task, false);
 
 out_finish:
-	cgroup_procs_write_finish(task);
+	cgroup_procs_write_finish(task, locked);
 out_unlock:
 	cgroup_kn_unlock(of->kn);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/5] selftests: cgroup: Simplify task self migration
  2019-10-04 10:57 [PATCH 0/5] Optimize single thread migration Michal Koutný
  2019-10-04 10:57 ` [PATCH 1/5] cgroup: Update comments about task exit path Michal Koutný
  2019-10-04 10:57 ` [PATCH 2/5] cgroup: Optimize single thread migration Michal Koutný
@ 2019-10-04 10:57 ` Michal Koutný
  2019-10-04 10:57 ` [PATCH 4/5] selftests: cgroup: Add task migration tests Michal Koutný
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Michal Koutný @ 2019-10-04 10:57 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, linux-kernel, Li Zefan, Johannes Weiner

Simplify task migration by being oblivious about its PID during
migration. This allows to easily migrate individual threads as well.
This change brings no functional change and prepares grounds for thread
granularity migrating tests.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 tools/testing/selftests/cgroup/cgroup_util.c  | 16 +++++++++++-----
 tools/testing/selftests/cgroup/cgroup_util.h  |  4 +++-
 tools/testing/selftests/cgroup/test_freezer.c |  2 +-
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c
index bdb69599c4bd..f6573eac1365 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -282,10 +282,12 @@ int cg_enter(const char *cgroup, int pid)
 
 int cg_enter_current(const char *cgroup)
 {
-	char pidbuf[64];
+	return cg_write(cgroup, "cgroup.procs", "0");
+}
 
-	snprintf(pidbuf, sizeof(pidbuf), "%d", getpid());
-	return cg_write(cgroup, "cgroup.procs", pidbuf);
+int cg_enter_current_thread(const char *cgroup)
+{
+	return cg_write(cgroup, "cgroup.threads", "0");
 }
 
 int cg_run(const char *cgroup,
@@ -410,11 +412,15 @@ int set_oom_adj_score(int pid, int score)
 	return 0;
 }
 
-char proc_read_text(int pid, const char *item, char *buf, size_t size)
+ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size)
 {
 	char path[PATH_MAX];
 
-	snprintf(path, sizeof(path), "/proc/%d/%s", pid, item);
+	if (!pid)
+		snprintf(path, sizeof(path), "/proc/%s/%s",
+			 thread ? "thread-self" : "self", item);
+	else
+		snprintf(path, sizeof(path), "/proc/%d/%s", pid, item);
 
 	return read_text(path, buf, size);
 }
diff --git a/tools/testing/selftests/cgroup/cgroup_util.h b/tools/testing/selftests/cgroup/cgroup_util.h
index c72f28046bfa..27ff21d82af1 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/cgroup_util.h
@@ -1,4 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+#include <stdbool.h>
 #include <stdlib.h>
 
 #define PAGE_SIZE 4096
@@ -35,6 +36,7 @@ extern int cg_run(const char *cgroup,
 		  void *arg);
 extern int cg_enter(const char *cgroup, int pid);
 extern int cg_enter_current(const char *cgroup);
+extern int cg_enter_current_thread(const char *cgroup);
 extern int cg_run_nowait(const char *cgroup,
 			 int (*fn)(const char *cgroup, void *arg),
 			 void *arg);
@@ -45,4 +47,4 @@ extern int is_swap_enabled(void);
 extern int set_oom_adj_score(int pid, int score);
 extern int cg_wait_for_proc_count(const char *cgroup, int count);
 extern int cg_killall(const char *cgroup);
-extern char proc_read_text(int pid, const char *item, char *buf, size_t size);
+extern ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size);
diff --git a/tools/testing/selftests/cgroup/test_freezer.c b/tools/testing/selftests/cgroup/test_freezer.c
index 8219a30853d2..5896e35168cd 100644
--- a/tools/testing/selftests/cgroup/test_freezer.c
+++ b/tools/testing/selftests/cgroup/test_freezer.c
@@ -648,7 +648,7 @@ static int proc_check_stopped(int pid)
 	char buf[PAGE_SIZE];
 	int len;
 
-	len = proc_read_text(pid, "stat", buf, sizeof(buf));
+	len = proc_read_text(pid, 0, "stat", buf, sizeof(buf));
 	if (len == -1) {
 		debug("Can't get %d stat\n", pid);
 		return -1;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/5] selftests: cgroup: Add task migration tests
  2019-10-04 10:57 [PATCH 0/5] Optimize single thread migration Michal Koutný
                   ` (2 preceding siblings ...)
  2019-10-04 10:57 ` [PATCH 3/5] selftests: cgroup: Simplify task self migration Michal Koutný
@ 2019-10-04 10:57 ` Michal Koutný
  2019-10-04 10:57 ` [PATCH 5/5] selftests: cgroup: Run test_core under interfering stress Michal Koutný
  2019-10-07 14:12 ` [PATCH 0/5] Optimize single thread migration Tejun Heo
  5 siblings, 0 replies; 7+ messages in thread
From: Michal Koutný @ 2019-10-04 10:57 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, linux-kernel, Li Zefan, Johannes Weiner

Add two new tests that verify that thread and threadgroup migrations
work as expected.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 tools/testing/selftests/cgroup/Makefile      |   2 +-
 tools/testing/selftests/cgroup/cgroup_util.c |  26 ++++
 tools/testing/selftests/cgroup/cgroup_util.h |   2 +
 tools/testing/selftests/cgroup/test_core.c   | 146 +++++++++++++++++++
 4 files changed, 175 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
index 8d369b6a2069..1c9179400be0 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-CFLAGS += -Wall
+CFLAGS += -Wall -pthread
 
 all:
 
diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c
index f6573eac1365..8f7131dcf1ff 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -158,6 +158,22 @@ long cg_read_key_long(const char *cgroup, const char *control, const char *key)
 	return atol(ptr + strlen(key));
 }
 
+long cg_read_lc(const char *cgroup, const char *control)
+{
+	char buf[PAGE_SIZE];
+	const char delim[] = "\n";
+	char *line;
+	long cnt = 0;
+
+	if (cg_read(cgroup, control, buf, sizeof(buf)))
+		return -1;
+
+	for (line = strtok(buf, delim); line; line = strtok(NULL, delim))
+		cnt++;
+
+	return cnt;
+}
+
 int cg_write(const char *cgroup, const char *control, char *buf)
 {
 	char path[PATH_MAX];
@@ -424,3 +440,13 @@ ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t
 
 	return read_text(path, buf, size);
 }
+
+int proc_read_strstr(int pid, bool thread, const char *item, const char *needle)
+{
+	char buf[PAGE_SIZE];
+
+	if (proc_read_text(pid, thread, item, buf, sizeof(buf)) < 0)
+		return -1;
+
+	return strstr(buf, needle) ? 0 : -1;
+}
diff --git a/tools/testing/selftests/cgroup/cgroup_util.h b/tools/testing/selftests/cgroup/cgroup_util.h
index 27ff21d82af1..49c54fbdb229 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/cgroup_util.h
@@ -30,6 +30,7 @@ extern int cg_read_strstr(const char *cgroup, const char *control,
 			  const char *needle);
 extern long cg_read_long(const char *cgroup, const char *control);
 long cg_read_key_long(const char *cgroup, const char *control, const char *key);
+extern long cg_read_lc(const char *cgroup, const char *control);
 extern int cg_write(const char *cgroup, const char *control, char *buf);
 extern int cg_run(const char *cgroup,
 		  int (*fn)(const char *cgroup, void *arg),
@@ -48,3 +49,4 @@ extern int set_oom_adj_score(int pid, int score);
 extern int cg_wait_for_proc_count(const char *cgroup, int count);
 extern int cg_killall(const char *cgroup);
 extern ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size);
+extern int proc_read_strstr(int pid, bool thread, const char *item, const char *needle);
diff --git a/tools/testing/selftests/cgroup/test_core.c b/tools/testing/selftests/cgroup/test_core.c
index 79053a4f4783..c5ca669feb2b 100644
--- a/tools/testing/selftests/cgroup/test_core.c
+++ b/tools/testing/selftests/cgroup/test_core.c
@@ -5,6 +5,9 @@
 #include <unistd.h>
 #include <stdio.h>
 #include <errno.h>
+#include <signal.h>
+#include <string.h>
+#include <pthread.h>
 
 #include "../kselftest.h"
 #include "cgroup_util.h"
@@ -354,6 +357,147 @@ static int test_cgcore_internal_process_constraint(const char *root)
 	return ret;
 }
 
+static void *dummy_thread_fn(void *arg)
+{
+	return (void *)(size_t)pause();
+}
+
+/*
+ * Test threadgroup migration.
+ * All threads of a process are migrated together.
+ */
+static int test_cgcore_proc_migration(const char *root)
+{
+	int ret = KSFT_FAIL;
+	int t, c_threads, n_threads = 13;
+	char *src = NULL, *dst = NULL;
+	pthread_t threads[n_threads];
+
+	src = cg_name(root, "cg_src");
+	dst = cg_name(root, "cg_dst");
+	if (!src || !dst)
+		goto cleanup;
+
+	if (cg_create(src))
+		goto cleanup;
+	if (cg_create(dst))
+		goto cleanup;
+
+	if (cg_enter_current(src))
+		goto cleanup;
+
+	for (c_threads = 0; c_threads < n_threads; ++c_threads) {
+		if (pthread_create(&threads[c_threads], NULL, dummy_thread_fn, NULL))
+			goto cleanup;
+	}
+
+	cg_enter_current(dst);
+	if (cg_read_lc(dst, "cgroup.threads") != n_threads + 1)
+		goto cleanup;
+
+	ret = KSFT_PASS;
+
+cleanup:
+	for (t = 0; t < c_threads; ++t) {
+		pthread_cancel(threads[t]);
+	}
+
+	for (t = 0; t < c_threads; ++t) {
+		pthread_join(threads[t], NULL);
+	}
+
+	cg_enter_current(root);
+
+	if (dst)
+		cg_destroy(dst);
+	if (src)
+		cg_destroy(src);
+	free(dst);
+	free(src);
+	return ret;
+}
+
+static void *migrating_thread_fn(void *arg)
+{
+	int g, i, n_iterations = 1000;
+	char **grps = arg;
+	char lines[3][PATH_MAX];
+
+	for (g = 1; g < 3; ++g)
+		snprintf(lines[g], sizeof(lines[g]), "0::%s", grps[g] + strlen(grps[0]));
+
+	for (i = 0; i < n_iterations; ++i) {
+		cg_enter_current_thread(grps[(i % 2) + 1]);
+
+		if (proc_read_strstr(0, 1, "cgroup", lines[(i % 2) + 1]))
+			return (void *)-1;
+	}
+	return NULL;
+}
+
+/*
+ * Test single thread migration.
+ * Threaded cgroups allow successful migration of a thread.
+ */
+static int test_cgcore_thread_migration(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *dom = NULL;
+	char line[PATH_MAX];
+	char *grps[3] = { (char *)root, NULL, NULL };
+	pthread_t thr;
+	void *retval;
+
+	dom = cg_name(root, "cg_dom");
+	grps[1] = cg_name(root, "cg_dom/cg_src");
+	grps[2] = cg_name(root, "cg_dom/cg_dst");
+	if (!grps[1] || !grps[2] || !dom)
+		goto cleanup;
+
+	if (cg_create(dom))
+		goto cleanup;
+	if (cg_create(grps[1]))
+		goto cleanup;
+	if (cg_create(grps[2]))
+		goto cleanup;
+
+	if (cg_write(grps[1], "cgroup.type", "threaded"))
+		goto cleanup;
+	if (cg_write(grps[2], "cgroup.type", "threaded"))
+		goto cleanup;
+
+	if (cg_enter_current(grps[1]))
+		goto cleanup;
+
+	if (pthread_create(&thr, NULL, migrating_thread_fn, grps))
+		goto cleanup;
+
+	if (pthread_join(thr, &retval))
+		goto cleanup;
+
+	if (retval)
+		goto cleanup;
+
+	snprintf(line, sizeof(line), "0::%s", grps[1] + strlen(grps[0]));
+	if (proc_read_strstr(0, 1, "cgroup", line))
+		goto cleanup;
+
+	ret = KSFT_PASS;
+
+cleanup:
+	cg_enter_current(root);
+	if (grps[2])
+		cg_destroy(grps[2]);
+	if (grps[1])
+		cg_destroy(grps[1]);
+	if (dom)
+		cg_destroy(dom);
+	free(grps[2]);
+	free(grps[1]);
+	free(dom);
+	return ret;
+}
+
 #define T(x) { x, #x }
 struct corecg_test {
 	int (*fn)(const char *root);
@@ -366,6 +510,8 @@ struct corecg_test {
 	T(test_cgcore_parent_becomes_threaded),
 	T(test_cgcore_invalid_domain),
 	T(test_cgcore_populated),
+	T(test_cgcore_proc_migration),
+	T(test_cgcore_thread_migration),
 };
 #undef T
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 5/5] selftests: cgroup: Run test_core under interfering stress
  2019-10-04 10:57 [PATCH 0/5] Optimize single thread migration Michal Koutný
                   ` (3 preceding siblings ...)
  2019-10-04 10:57 ` [PATCH 4/5] selftests: cgroup: Add task migration tests Michal Koutný
@ 2019-10-04 10:57 ` Michal Koutný
  2019-10-07 14:12 ` [PATCH 0/5] Optimize single thread migration Tejun Heo
  5 siblings, 0 replies; 7+ messages in thread
From: Michal Koutný @ 2019-10-04 10:57 UTC (permalink / raw)
  To: cgroups; +Cc: Tejun Heo, linux-kernel, Li Zefan, Johannes Weiner

test_core tests various cgroup creation/removal and task migration
paths. Run the tests repeatedly with interfering noise (for lockdep
checks). Currently, forking noise and subsystem enabled/disabled
switching are the implemented noises.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 tools/testing/selftests/cgroup/Makefile       |   2 +
 tools/testing/selftests/cgroup/test_stress.sh |   4 +
 tools/testing/selftests/cgroup/with_stress.sh | 101 ++++++++++++++++++
 3 files changed, 107 insertions(+)
 create mode 100755 tools/testing/selftests/cgroup/test_stress.sh
 create mode 100755 tools/testing/selftests/cgroup/with_stress.sh

diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
index 1c9179400be0..66aafe1f5746 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -3,6 +3,8 @@ CFLAGS += -Wall -pthread
 
 all:
 
+TEST_FILES     := with_stress.sh
+TEST_PROGS     := test_stress.sh
 TEST_GEN_PROGS = test_memcontrol
 TEST_GEN_PROGS += test_core
 TEST_GEN_PROGS += test_freezer
diff --git a/tools/testing/selftests/cgroup/test_stress.sh b/tools/testing/selftests/cgroup/test_stress.sh
new file mode 100755
index 000000000000..15d9d5896394
--- /dev/null
+++ b/tools/testing/selftests/cgroup/test_stress.sh
@@ -0,0 +1,4 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+./with_stress.sh -s subsys -s fork ./test_core
diff --git a/tools/testing/selftests/cgroup/with_stress.sh b/tools/testing/selftests/cgroup/with_stress.sh
new file mode 100755
index 000000000000..e28c35008f5b
--- /dev/null
+++ b/tools/testing/selftests/cgroup/with_stress.sh
@@ -0,0 +1,101 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+stress_fork()
+{
+	while true ; do
+		/usr/bin/true
+		sleep 0.01
+	done
+}
+
+stress_subsys()
+{
+	local verb=+
+	while true ; do
+		echo $verb$subsys_ctrl >$sysfs/cgroup.subtree_control
+		[ $verb = "+" ] && verb=- || verb=+
+		# incommensurable period with other stresses
+		sleep 0.011
+	done
+}
+
+init_and_check()
+{
+	sysfs=`mount -t cgroup2 | head -1 | awk '{ print $3 }'`
+	if [ ! -d "$sysfs" ]; then
+		echo "Skipping: cgroup2 is not mounted" >&2
+		exit $ksft_skip
+	fi
+
+	if ! echo +$subsys_ctrl >$sysfs/cgroup.subtree_control ; then
+		echo "Skipping: cannot enable $subsys_ctrl in $sysfs" >&2
+		exit $ksft_skip
+	fi
+
+	if ! echo -$subsys_ctrl >$sysfs/cgroup.subtree_control ; then
+		echo "Skipping: cannot disable $subsys_ctrl in $sysfs" >&2
+		exit $ksft_skip
+	fi
+}
+
+declare -a stresses
+declare -a stress_pids
+duration=5
+rc=0
+subsys_ctrl=cpuset
+sysfs=
+
+while getopts c:d:hs: opt; do
+	case $opt in
+	c)
+		subsys_ctrl=$OPTARG
+		;;
+	d)
+		duration=$OPTARG
+		;;
+	h)
+		echo "Usage $0 [ -s stress ] ... [ -d duration ] [-c controller] cmd args .."
+		echo -e "\t default duration $duration seconds"
+		echo -e "\t default controller $subsys_ctrl"
+		exit
+		;;
+	s)
+		func=stress_$OPTARG
+		if [ "x$(type -t $func)" != "xfunction" ] ; then
+			echo "Unknown stress $OPTARG"
+			exit 1
+		fi
+		stresses+=($func)
+		;;
+	esac
+done
+shift $((OPTIND - 1))
+
+init_and_check
+
+for s in ${stresses[*]} ; do
+	$s &
+	stress_pids+=($!)
+done
+
+
+time=0
+start=$(date +%s)
+
+while [ $time -lt $duration ] ; do
+	$*
+	rc=$?
+	[ $rc -eq 0 ] || break
+	time=$(($(date +%s) - $start))
+done
+
+for pid in ${stress_pids[*]} ; do
+	kill -SIGTERM $pid
+	wait $pid
+done
+
+exit $rc
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/5] Optimize single thread migration
  2019-10-04 10:57 [PATCH 0/5] Optimize single thread migration Michal Koutný
                   ` (4 preceding siblings ...)
  2019-10-04 10:57 ` [PATCH 5/5] selftests: cgroup: Run test_core under interfering stress Michal Koutný
@ 2019-10-07 14:12 ` Tejun Heo
  5 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2019-10-07 14:12 UTC (permalink / raw)
  To: Michal Koutný; +Cc: cgroups, linux-kernel, Li Zefan, Johannes Weiner

On Fri, Oct 04, 2019 at 12:57:38PM +0200, Michal Koutný wrote:
> Hello.
> 
> The important part is the patch 02 where the reasoning is.
> 
> The rest is mostly auxiliar and split out into separate commits for
> better readability.
> 
> The patches are based on v5.3. 

This is great.  Applied to the series to cgroup/for-5.5

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-10-07 14:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-04 10:57 [PATCH 0/5] Optimize single thread migration Michal Koutný
2019-10-04 10:57 ` [PATCH 1/5] cgroup: Update comments about task exit path Michal Koutný
2019-10-04 10:57 ` [PATCH 2/5] cgroup: Optimize single thread migration Michal Koutný
2019-10-04 10:57 ` [PATCH 3/5] selftests: cgroup: Simplify task self migration Michal Koutný
2019-10-04 10:57 ` [PATCH 4/5] selftests: cgroup: Add task migration tests Michal Koutný
2019-10-04 10:57 ` [PATCH 5/5] selftests: cgroup: Run test_core under interfering stress Michal Koutný
2019-10-07 14:12 ` [PATCH 0/5] Optimize single thread migration Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).