From: David Rientjes <rientjes@google.com>
To: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Minchan Kim <minchan@kernel.org>,
	linux-mm@kvack.org, cgroups@vger.kernel.org
Subject: Re: [rfc][patch 3/3] mm, memcg: introduce own oom handler to iterate only over its own threads
Date: Tue, 26 Jun 2012 22:35:39 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1206262229380.32567@chino.kir.corp.google.com>
In-Reply-To: <alpine.DEB.2.00.1206261323260.8673@chino.kir.corp.google.com>

On Tue, 26 Jun 2012, David Rientjes wrote:

> It's still not a perfect solution for the above reason.  We need 
> tasklist_lock for oom_kill_process() for a few reasons:
> 
>  (1) if /proc/sys/vm/oom_dump_tasks is enabled, which is the default, 
>      to iterate the tasklist
> 
>  (2) to iterate the selected process's children, and
> 
>  (3) to iterate the tasklist to kill all other processes sharing the 
>      same memory.
> 
> I'm hoping we can avoid taking tasklist_lock entirely for memcg ooms to 
> avoid the starvation problem at all.  We definitely still need to do (3) 
> to avoid mm->mmap_sem deadlock if another thread sharing the same memory 
> is holding the semaphore trying to allocate memory and waiting for current 
> to exit, which needs the semaphore itself.  That can be done with 
> rcu_read_lock(), however, and doesn't require tasklist_lock.
> 
> (1) can be done with rcu_read_lock() as well but I'm wondering if there 
> would be a significant advantage doing this by a cgroup iterator as well.  
> It may not be worth it just for the sanity of the code.
> 
> We can do (2) if we change to list_for_each_entry_rcu().
> 

It turns out that task->children is not an rcu-protected list so this 
doesn't work.  Both (1) and (3) can be accomplished with 
rcu_read_{lock,unlock}() that can nest inside the tasklist_lock for the 
global oom killer.  (We could even split the global oom killer tasklist 
locking and optimize it separately from this patchset.)

So we have a couple of options:

 - allow oom_kill_process() to do

	if (memcg)
		read_lock(&tasklist_lock);
	...
	if (memcg)
		read_unlock(&tasklist_lock);

   around the iteration over the victim's children.  This should solve the 
   issue since any other iteration over the entire tasklist would have 
   triggered the same starvation if it were that bad, or

 - suppress the iteration for memcg ooms and just kill the parent instead.
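
To make the difference between the two options concrete, here is a rough
userspace sketch, not kernel code.  Everything in it is a stand-in made up
for illustration: struct task, the points field, the tasklist_readers
counter that plays the role of the read side of tasklist_lock, and the
function names.  It assumes the global oom path already holds the lock in
its caller, so only the memcg path takes it here, and it ignores details
such as skipping children that share the parent's mm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task {
	unsigned long points;	/* assumed badness score */
	struct task *children;	/* first child, NULL if none */
	struct task *sibling;	/* next child of the same parent */
};

/* Counter standing in for read_lock()/read_unlock() on tasklist_lock. */
static int tasklist_readers;

static void tasklist_read_lock(void)   { tasklist_readers++; }
static void tasklist_read_unlock(void) { tasklist_readers--; }

/* Pick the child with the highest score, falling back to the parent. */
static struct task *best_child_or_parent(struct task *parent)
{
	struct task *victim = parent;
	unsigned long victim_points = 0;
	struct task *child;

	for (child = parent->children; child; child = child->sibling) {
		if (child->points > victim_points) {
			victim = child;
			victim_points = child->points;
		}
	}
	return victim;
}

/* Option 1: memcg ooms take the lock only around the children walk. */
static struct task *select_victim_locked(struct task *parent, bool memcg)
{
	struct task *victim;

	assert(parent != NULL);
	if (memcg)
		tasklist_read_lock();
	victim = best_child_or_parent(parent);
	if (memcg)
		tasklist_read_unlock();
	return victim;
}

/* Option 2: memcg ooms skip the children walk and kill the parent. */
static struct task *select_victim_parent_only(struct task *parent, bool memcg)
{
	assert(parent != NULL);
	if (memcg)
		return parent;
	return best_child_or_parent(parent);
}
```

With the first variant a memcg oom still prefers a child over the selected
process, at the cost of a short read-side hold of the lock.  With the
second, the memcg path never touches the lock but loses that preference.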

Comments?

