linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Petr Mladek <pmladek@suse.com>,
	cgroups@vger.kernel.org, Cyril Hrubis <chrubis@suse.cz>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH for-4.6-fixes] memcg: remove lru_add_drain_all() invocation from mem_cgroup_move_charge()
Date: Sun, 17 Apr 2016 08:07:48 -0400	[thread overview]
Message-ID: <20160417120747.GC21757@dhcp22.suse.cz> (raw)
In-Reply-To: <20160415191719.GK12583@htj.duckdns.org>

On Fri 15-04-16 15:17:19, Tejun Heo wrote:
> mem_cgroup_move_charge() invokes lru_add_drain_all() so that the pvec
> pages can be moved too.  lru_add_drain_all() schedules and flushes
> work items on system_wq which depends on being able to create new
> kworkers to make forward progress.  Since 1ed1328792ff ("sched,
> cgroup: replace signal_struct->group_rwsem with a global
> percpu_rwsem"), a new task can't be created while in the cgroup
> migration path and the described lru_add_drain_all() invocation can
> easily lead to a deadlock.
> 
> Charge moving is best-effort and whether the pvec pages are migrated
> or not doesn't really matter.  Don't call it during charge moving.
> Eventually, we want to move the actual charge moving outside the
> migration path.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Johannes Weiner <hannes@cmpxchg.org>

I guess
Debugged-by: Petr Mladek <pmladek@suse.com>
Reported-by: Cyril Hrubis <chrubis@suse.cz>

would be due

> Suggested-by: Michal Hocko <mhocko@kernel.org>
> Fixes: 1ed1328792ff ("sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem")
> Cc: stable@vger.kernel.org
> ---
> Hello,
> 
> So, this deadlock seems pretty easy to trigger.  We'll make the charge
> moving asynchronous eventually but let's not hold off fixing an
> immediate problem.

Although this looks rather straightforward and it fixes the immediate
problem I am little bit nervous about it. As already pointed out in
other email mem_cgroup_move_charge still depends on mmap_sem for
read and we might hit an even more subtle lockup if the current holder
of the mmap_sem for write depends on the task creation (e.g. some of the
direct reclaim path uses WQ which is really hard to rule out and I even
think that some shrinkers do this).

I liked your proposal when mem_cgroup_move_charge would be called from a
context which doesn't hold the problematic rwsem much more. Would that
be too intrusive for the stable backport?

> Thanks.
> 
>  mm/memcontrol.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 36db05f..56060c7 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4859,7 +4859,6 @@ static void mem_cgroup_move_charge(struct mm_struct *mm)
>  		.mm = mm,
>  	};
>  
> -	lru_add_drain_all();
>  	/*
>  	 * Signal lock_page_memcg() to take the memcg's move_lock
>  	 * while we're moving its pages to another memcg. Then wait

-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2016-04-17 12:07 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-13  9:42 [BUG] cgroup/workques/fork: deadlock when moving cgroups Petr Mladek
2016-04-13 18:33 ` Tejun Heo
2016-04-13 18:57   ` Tejun Heo
2016-04-13 19:23   ` Michal Hocko
2016-04-13 19:28     ` Michal Hocko
2016-04-13 19:37     ` Tejun Heo
2016-04-13 19:48       ` Michal Hocko
2016-04-14  7:06         ` Michal Hocko
2016-04-14 15:32           ` Tejun Heo
2016-04-14 17:50     ` Johannes Weiner
2016-04-15  7:06       ` Michal Hocko
2016-04-15 14:38         ` Tejun Heo
2016-04-15 15:08           ` Michal Hocko
2016-04-15 15:25             ` Tejun Heo
2016-04-17 12:00               ` Michal Hocko
2016-04-18 14:40           ` Petr Mladek
2016-04-19 14:01             ` Michal Hocko
2016-04-19 15:39               ` Petr Mladek
2016-04-15 19:17       ` [PATCH for-4.6-fixes] memcg: remove lru_add_drain_all() invocation from mem_cgroup_move_charge() Tejun Heo
2016-04-17 12:07         ` Michal Hocko [this message]
2016-04-20 21:29           ` Tejun Heo
2016-04-21  3:27             ` Michal Hocko
2016-04-21 15:00               ` Petr Mladek
2016-04-21 15:51                 ` Tejun Heo
2016-04-21 23:06           ` [PATCH 1/2] cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback Tejun Heo
2016-04-21 23:09             ` [PATCH 2/2] memcg: relocate charge moving from ->attach to ->post_attach Tejun Heo
2016-04-22 13:57               ` Petr Mladek
2016-04-25  8:25               ` Michal Hocko
2016-04-25 19:42                 ` Tejun Heo
2016-04-25 19:44               ` Tejun Heo
2016-04-21 23:11             ` [PATCH 1/2] cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback Tejun Heo
2016-04-21 15:56         ` [PATCH for-4.6-fixes] memcg: remove lru_add_drain_all() invocation from mem_cgroup_move_charge() Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160417120747.GC21757@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=chrubis@suse.cz \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).