From: Tejun Heo <tj@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Petr Mladek <pmladek@suse.com>,
cgroups@vger.kernel.org, Cyril Hrubis <chrubis@suse.cz>,
linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback
Date: Thu, 21 Apr 2016 19:06:48 -0400 [thread overview]
Message-ID: <20160421230648.GI4775@htj.duckdns.org> (raw)
In-Reply-To: <20160417120747.GC21757@dhcp22.suse.cz>
Since e93ad19d0564 ("cpuset: make mm migration asynchronous"), cpuset
kicks off asynchronous NUMA node migration if necessary during task
migration and flushes it from cpuset_post_attach_flush() which is
called at the end of __cgroup_procs_write(). This is to avoid
performing migration with cgroup_threadgroup_rwsem write-locked which
can lead to deadlock through dependency on kworker creation.
memcg has a similar issue with charge moving, so let's convert it to
an official callback rather than the current one-off cpuset specific
function. This patch adds cgroup_subsys->post_attach callback and
makes cpuset register cpuset_post_attach_flush() as its ->post_attach.
The conversion is mostly one-to-one except that the new callback is
called under cgroup_mutex. This is to guarantee that no other
migration operations are started before ->post_attach callbacks are
finished. cgroup_mutex is one of the outermost mutex in the system
and has never been and shouldn't be a problem. We can add specialized
synchronization around __cgroup_procs_write() but I don't think
there's any noticeable benefit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org> # 4.4+ prerequisite for the next patch
---
include/linux/cgroup-defs.h | 1 +
include/linux/cpuset.h | 6 ------
kernel/cgroup.c | 7 +++++--
kernel/cpuset.c | 4 ++--
4 files changed, 8 insertions(+), 10 deletions(-)
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -444,6 +444,7 @@ struct cgroup_subsys {
int (*can_attach)(struct cgroup_taskset *tset);
void (*cancel_attach)(struct cgroup_taskset *tset);
void (*attach)(struct cgroup_taskset *tset);
+ void (*post_attach)(void);
int (*can_fork)(struct task_struct *task);
void (*cancel_fork)(struct task_struct *task);
void (*fork)(struct task_struct *task);
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -137,8 +137,6 @@ static inline void set_mems_allowed(node
task_unlock(current);
}
-extern void cpuset_post_attach_flush(void);
-
#else /* !CONFIG_CPUSETS */
static inline bool cpusets_enabled(void) { return false; }
@@ -245,10 +243,6 @@ static inline bool read_mems_allowed_ret
return false;
}
-static inline void cpuset_post_attach_flush(void)
-{
-}
-
#endif /* !CONFIG_CPUSETS */
#endif /* _LINUX_CPUSET_H */
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2825,9 +2825,10 @@ static ssize_t __cgroup_procs_write(stru
size_t nbytes, loff_t off, bool threadgroup)
{
struct task_struct *tsk;
+ struct cgroup_subsys *ss;
struct cgroup *cgrp;
pid_t pid;
- int ret;
+ int ssid, ret;
if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
return -EINVAL;
@@ -2875,8 +2876,10 @@ out_unlock_rcu:
rcu_read_unlock();
out_unlock_threadgroup:
percpu_up_write(&cgroup_threadgroup_rwsem);
+ for_each_subsys(ss, ssid)
+ if (ss->post_attach)
+ ss->post_attach();
cgroup_kn_unlock(of->kn);
- cpuset_post_attach_flush();
return ret ?: nbytes;
}
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -58,7 +58,6 @@
#include <asm/uaccess.h>
#include <linux/atomic.h>
#include <linux/mutex.h>
-#include <linux/workqueue.h>
#include <linux/cgroup.h>
#include <linux/wait.h>
@@ -1016,7 +1015,7 @@ static void cpuset_migrate_mm(struct mm_
}
}
-void cpuset_post_attach_flush(void)
+static void cpuset_post_attach(void)
{
flush_workqueue(cpuset_migrate_mm_wq);
}
@@ -2087,6 +2086,7 @@ struct cgroup_subsys cpuset_cgrp_subsys
.can_attach = cpuset_can_attach,
.cancel_attach = cpuset_cancel_attach,
.attach = cpuset_attach,
+ .post_attach = cpuset_post_attach,
.bind = cpuset_bind,
.legacy_cftypes = files,
.early_init = true,
next prev parent reply other threads:[~2016-04-21 23:06 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-13 9:42 [BUG] cgroup/workques/fork: deadlock when moving cgroups Petr Mladek
2016-04-13 18:33 ` Tejun Heo
2016-04-13 18:57 ` Tejun Heo
2016-04-13 19:23 ` Michal Hocko
2016-04-13 19:28 ` Michal Hocko
2016-04-13 19:37 ` Tejun Heo
2016-04-13 19:48 ` Michal Hocko
2016-04-14 7:06 ` Michal Hocko
2016-04-14 15:32 ` Tejun Heo
2016-04-14 17:50 ` Johannes Weiner
2016-04-15 7:06 ` Michal Hocko
2016-04-15 14:38 ` Tejun Heo
2016-04-15 15:08 ` Michal Hocko
2016-04-15 15:25 ` Tejun Heo
2016-04-17 12:00 ` Michal Hocko
2016-04-18 14:40 ` Petr Mladek
2016-04-19 14:01 ` Michal Hocko
2016-04-19 15:39 ` Petr Mladek
2016-04-15 19:17 ` [PATCH for-4.6-fixes] memcg: remove lru_add_drain_all() invocation from mem_cgroup_move_charge() Tejun Heo
2016-04-17 12:07 ` Michal Hocko
2016-04-20 21:29 ` Tejun Heo
2016-04-21 3:27 ` Michal Hocko
2016-04-21 15:00 ` Petr Mladek
2016-04-21 15:51 ` Tejun Heo
2016-04-21 23:06 ` Tejun Heo [this message]
2016-04-21 23:09 ` [PATCH 2/2] memcg: relocate charge moving from ->attach to ->post_attach Tejun Heo
2016-04-22 13:57 ` Petr Mladek
2016-04-25 8:25 ` Michal Hocko
2016-04-25 19:42 ` Tejun Heo
2016-04-25 19:44 ` Tejun Heo
2016-04-21 23:11 ` [PATCH 1/2] cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback Tejun Heo
2016-04-21 15:56 ` [PATCH for-4.6-fixes] memcg: remove lru_add_drain_all() invocation from mem_cgroup_move_charge() Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160421230648.GI4775@htj.duckdns.org \
--to=tj@kernel.org \
--cc=cgroups@vger.kernel.org \
--cc=chrubis@suse.cz \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mhocko@kernel.org \
--cc=pmladek@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).