From: Michal Hocko <mhocko@kernel.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>, Johannes Weiner <hannes@cmpxchg.org>,
	cgroups@vger.kernel.org, Cyril Hrubis <chrubis@suse.cz>,
	linux-kernel@vger.kernel.org
Subject: Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups
Date: Tue, 19 Apr 2016 10:01:21 -0400
Message-ID: <20160419140120.GA4126@dhcp22.suse.cz>
In-Reply-To: <20160418144023.GG6862@pathway.suse.cz>

On Mon 18-04-16 16:40:23, Petr Mladek wrote:
> On Fri 2016-04-15 10:38:15, Tejun Heo wrote:
> > > Anyway, before we go that way, can we at least consider the possibility
> > > of removing the kworker creation dependency on the global rwsem? AFAIU
> > > this locking was added because of the pid controller. Do we even care
> > > about something as volatile as kworkers in the pid controller?
> >
> > It's not just pid controller and the global percpu locking has lower
> > hotpath overhead.  We can try to exclude kworkers out of the locking
> > but that can get really nasty and there are already attempts to add
> > cgroup support to workqueue.  Will think more about it.
> 
> I have played with this idea on Friday. Please, find below a POC.
> The worker detection works and the deadlock is removed. But workers
> do not appear in the root cgroups. I am not familiar with the cgroups
> stuff, so this part is much more difficult for me.
> 
> I send it because it might give you an idea when discussing it
> on LSF. Please, let me know if I should continue on this way or
> if it looks too crazy already now.
> 
> 
> >From ca1420926f990892a914d64046ee8d273b876f30 Mon Sep 17 00:00:00 2001
> From: Petr Mladek <pmladek@suse.com>
> Date: Mon, 18 Apr 2016 14:17:17 +0200
> Subject: [POC PATCH] cgroups/workqueus: Do not block forking workqueues by cgroups
>  lock
> 
> This is a POC how to delay cgroups operations when forking workqueue
> workers.  Workqueues are used by some cgroups operations, for example,
> lru_add_drain_all().  Creating new worker might help to avoid a deadlock.
> 
> This patch adds a way to detect whether a workqueue worker is being
> forked, see detect_wq_worker().  For this it needs to make struct
> kthread_create_info and the main worker_thread public.  As a consequence,
> it renames worker_thread to wq_worker_thread.
> 
> Next, cgroups_fork() just initializes the cgroups fields in task_struct.
> It does not really need to be guarded by cgroup_threadgroup_rwsem.
> 
> cgroup_threadgroup_rwsem is taken later when we check if the fork
> is allowed and when we copy the cgroups setting.  But these two
> operations are skipped for workers.
> 
> The result is that workers are not blocked by any cgroups operation
> but they do not appear in the root cgroups.
> 
> There is a preparation of cgroup_delayed_post_fork() that might put
> the workers into the root cgroups.  It is just a copy of
> cgroup_post_fork() where "kthreadd_task" is hardcoded.  It is not yet
> called.  Also it is not protected against other cgroups operations.
> ---
>  include/linux/kthread.h   | 14 +++++++++++++
>  include/linux/workqueue.h |  1 +
>  kernel/cgroup.c           | 53 +++++++++++++++++++++++++++++++++++++++++++++++
>  kernel/fork.c             | 36 +++++++++++++++++++++++---------
>  kernel/kthread.c          | 14 -------------
>  kernel/workqueue.c        |  9 ++++----
>  6 files changed, 98 insertions(+), 29 deletions(-)

This feels overcomplicated. Can we simply drop the locking in
copy_process if current == kthreadd? Would something actually
break?
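
To make the question concrete, here is a minimal userspace sketch of the
idea, not kernel code. It assumes that every kernel thread, kworkers
included, is forked by kthreadd, so comparing the forking task against
kthreadd_task is enough to recognize the case. The struct definitions,
fork_needs_cgroup_lock() and copy_process_model() are hypothetical
stand-ins invented for illustration, and the reader count only models
taking cgroup_threadgroup_rwsem for read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; these are NOT the kernel's real types. */
struct task {
	const char *comm;
};

struct fake_rwsem {
	int readers;	/* number of read-side holders */
};

static struct task kthreadd = { "kthreadd" };
/* Named after the kernel symbol; here it is just a plain pointer. */
static struct task *kthreadd_task = &kthreadd;
/* Models cgroup_threadgroup_rwsem (assumption: read side only). */
static struct fake_rwsem threadgroup_rwsem;

/* Hypothetical helper: skip the lock when kthreadd is the forking task. */
static bool fork_needs_cgroup_lock(const struct task *parent)
{
	return parent != kthreadd_task;
}

/*
 * Model of the fork path. Returns true if the read lock was held
 * while the (elided) copy work would have run.
 */
static bool copy_process_model(const struct task *parent)
{
	bool locked = fork_needs_cgroup_lock(parent);
	bool held_during_copy;

	if (locked)
		threadgroup_rwsem.readers++;

	/* The real copy_process() work would happen here. */
	held_during_copy = threadgroup_rwsem.readers > 0;

	if (locked) {
		assert(threadgroup_rwsem.readers > 0);
		threadgroup_rwsem.readers--;
	}
	return held_during_copy;
}
```

Calling copy_process_model() with an ordinary task takes and drops the
lock, while calling it with kthreadd_task never touches it. Whether the
real fork path can tolerate that is exactly the open question above.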
-- 
Michal Hocko
SUSE Labs
