linux-kernel.vger.kernel.org archive mirror
* [PATCH] oom: avoid killing init if it assume the oom killed thread's mm
@ 2013-09-23  9:45 Ming Liu
  2013-09-25  2:34 ` David Rientjes
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Liu @ 2013-09-23  9:45 UTC (permalink / raw)
  To: akpm, rientjes, mhocko, rusty, hannes; +Cc: linux-mm, linux-kernel

After selecting a task to kill, the oom killer iterates all processes and
kills all other user threads that share the same mm_struct in different
thread groups.

But in some extreme cases, the selected task happens to be a vfork child
of the init process and shares its mm_struct, so the loop also kills init
and the kernel panics. This panic was observed on a system where busybox
itself is init, with a kthread that keeps consuming memory.

Signed-off-by: Ming Liu <ming.liu@windriver.com>
---
 mm/oom_kill.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 314e9d2..7db4881 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -479,17 +479,17 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 	task_unlock(victim);
 
 	/*
-	 * Kill all user processes sharing victim->mm in other thread groups, if
-	 * any.  They don't get access to memory reserves, though, to avoid
-	 * depletion of all memory.  This prevents mm->mmap_sem livelock when an
-	 * oom killed thread cannot exit because it requires the semaphore and
-	 * its contended by another thread trying to allocate memory itself.
-	 * That thread will now get access to memory reserves since it has a
-	 * pending fatal signal.
+	 * Kill all user processes except init sharing victim->mm in other
+	 * thread groups, if any.  They don't get access to memory reserves,
+	 * though, to avoid depletion of all memory.  This prevents mm->mmap_sem
+	 * livelock when an oom killed thread cannot exit because it requires
+	 * the semaphore and its contended by another thread trying to allocate
+	 * memory itself. That thread will now get access to memory reserves
+	 * since it has a pending fatal signal.
 	 */
 	for_each_process(p)
 		if (p->mm == mm && !same_thread_group(p, victim) &&
-		    !(p->flags & PF_KTHREAD)) {
+		    !(p->flags & PF_KTHREAD) && !is_global_init(p)) {
 			if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
 				continue;
 
-- 
1.7.0.4
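
The loop changed by the patch can be modelled in userspace. The sketch
below is not kernel code: struct model_task, model_is_global_init() and
model_kill_mm_sharers() are hypothetical stand-ins, mm_id replaces the
mm_struct pointer, and "killing" only sets a flag. It shows init being
skipped when it shares an address space with the selected victim.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for task_struct; not kernel code. */
struct model_task {
	int pid;          /* process id */
	int tgid;         /* thread group id */
	int mm_id;        /* stands in for the mm_struct pointer */
	bool kthread;     /* models (flags & PF_KTHREAD) */
	bool oom_exempt;  /* models oom_score_adj == OOM_SCORE_ADJ_MIN */
	bool killed;      /* set instead of sending SIGKILL */
};

/* Assumption for this model: the global init is the task with tgid 1. */
static bool model_is_global_init(const struct model_task *t)
{
	return t->tgid == 1;
}

/*
 * Mark every user task that shares the victim's address space but lives
 * in a different thread group, skipping kernel threads, init and
 * OOM-exempt tasks. Returns the number of additional tasks marked.
 */
static int model_kill_mm_sharers(struct model_task *tasks, size_t n,
				 const struct model_task *victim)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		struct model_task *p = &tasks[i];

		if (p->mm_id != victim->mm_id || p->tgid == victim->tgid)
			continue;
		if (p->kthread || model_is_global_init(p))
			continue;
		if (p->oom_exempt)
			continue;

		p->killed = true;
		count++;
	}
	return count;
}
```

Calling model_kill_mm_sharers() with a victim that is a vfork child of
pid 1 leaves the pid 1 entry unmarked; without the
model_is_global_init() test it would be marked along with the other
sharers.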



* Re: [PATCH] oom: avoid killing init if it assume the oom killed thread's mm
  2013-09-23  9:45 [PATCH] oom: avoid killing init if it assume the oom killed thread's mm Ming Liu
@ 2013-09-25  2:34 ` David Rientjes
  2013-09-25  5:49   ` Ming Liu
  0 siblings, 1 reply; 5+ messages in thread
From: David Rientjes @ 2013-09-25  2:34 UTC (permalink / raw)
  To: Ming Liu; +Cc: akpm, mhocko, rusty, hannes, linux-mm, linux-kernel

On Mon, 23 Sep 2013, Ming Liu wrote:

> After selecting a task to kill, the oom killer iterates all processes and
> kills all other user threads that share the same mm_struct in different
> thread groups.
> 
> But in some extreme cases, the selected task happens to be a vfork child
> of the init process and shares its mm_struct, so the loop also kills init
> and the kernel panics. This panic was observed on a system where busybox
> itself is init, with a kthread that keeps consuming memory.
> 

We shouldn't be selecting a process where mm == init_mm in the first 
place, so this wouldn't fix the issue entirely.


* Re: [PATCH] oom: avoid killing init if it assume the oom killed thread's mm
  2013-09-25  2:34 ` David Rientjes
@ 2013-09-25  5:49   ` Ming Liu
  2013-09-25 17:56     ` David Rientjes
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Liu @ 2013-09-25  5:49 UTC (permalink / raw)
  To: David Rientjes; +Cc: akpm, mhocko, rusty, hannes, linux-mm, linux-kernel

On 09/25/2013 10:34 AM, David Rientjes wrote:
> On Mon, 23 Sep 2013, Ming Liu wrote:
>
>> After selecting a task to kill, the oom killer iterates all processes and
>> kills all other user threads that share the same mm_struct in different
>> thread groups.
>>
>> But in some extreme cases, the selected task happens to be a vfork child
>> of the init process and shares its mm_struct, so the loop also kills init
>> and the kernel panics. This panic was observed on a system where busybox
>> itself is init, with a kthread that keeps consuming memory.
>>
> We shouldn't be selecting a process where mm == init_mm in the first
> place, so this wouldn't fix the issue entirely.

But if we add a check for "mm == init_mm" up front (i.e. in
oom_unkillable_task), that would prevent any process sharing an mm with
init from being selected. Is that reasonable? My fix is only meant to
protect the init process from being killed when its vfork child is
selected, and I think that is the only place where the risk exists. If
my understanding is wrong, please correct me.

Thanks,
Ming Liu

* Re: [PATCH] oom: avoid killing init if it assume the oom killed thread's mm
  2013-09-25  5:49   ` Ming Liu
@ 2013-09-25 17:56     ` David Rientjes
  2013-09-26  1:38       ` Ming Liu
  0 siblings, 1 reply; 5+ messages in thread
From: David Rientjes @ 2013-09-25 17:56 UTC (permalink / raw)
  To: Ming Liu; +Cc: akpm, mhocko, rusty, hannes, linux-mm, linux-kernel

On Wed, 25 Sep 2013, Ming Liu wrote:

> > We shouldn't be selecting a process where mm == init_mm in the first
> > place, so this wouldn't fix the issue entirely.
> 
> But if we add a check for "mm == init_mm" up front (i.e. in
> oom_unkillable_task), that would prevent any process sharing an mm with
> init from being selected. Is that reasonable? My fix is only meant to
> protect the init process from being killed when its vfork child is
> selected, and I think that is the only place where the risk exists. If
> my understanding is wrong, please correct me.
> 

We never want to select a process where task->mm == init_mm because if we 
kill it we won't free any memory, regardless of vfork().  The goal of the 
oom killer is solely to free memory, so it always tries to avoid needless 
killing.
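
The selection rule described above can be sketched in userspace as
follows. This is an illustration only: struct model_candidate,
MODEL_INIT_MM and model_select_victim() are hypothetical names, and
ranking by resident pages is a simplifying assumption rather than the
kernel's actual badness heuristic. The point is the filter: a candidate
whose mm is init_mm is never chosen, because killing it would free no
memory.

```c
#include <stddef.h>

/* Hypothetical id standing in for the address of init_mm. */
#define MODEL_INIT_MM 0

/* Hypothetical userspace stand-in for an OOM candidate; not kernel code. */
struct model_candidate {
	int pid;                  /* process id */
	int mm_id;                /* stands in for task->mm */
	unsigned long rss_pages;  /* simplified score: resident pages */
};

/*
 * Return the index of the eligible candidate with the most resident
 * pages, or -1 if no candidate is eligible. Candidates whose mm is
 * MODEL_INIT_MM are skipped because killing them frees nothing.
 */
static int model_select_victim(const struct model_candidate *c, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (c[i].mm_id == MODEL_INIT_MM)
			continue;
		if (best < 0 || c[i].rss_pages > c[best].rss_pages)
			best = (int)i;
	}
	return best;
}
```

With this filter in the selection step, the later loop over tasks
sharing the victim's mm never starts from such a candidate.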


* Re: [PATCH] oom: avoid killing init if it assume the oom killed thread's mm
  2013-09-25 17:56     ` David Rientjes
@ 2013-09-26  1:38       ` Ming Liu
  0 siblings, 0 replies; 5+ messages in thread
From: Ming Liu @ 2013-09-26  1:38 UTC (permalink / raw)
  To: David Rientjes; +Cc: akpm, mhocko, rusty, hannes, linux-mm, linux-kernel

On 09/26/2013 01:56 AM, David Rientjes wrote:
> On Wed, 25 Sep 2013, Ming Liu wrote:
>
>>> We shouldn't be selecting a process where mm == init_mm in the first
>>> place, so this wouldn't fix the issue entirely.
>> But if we add a check for "mm == init_mm" up front (i.e. in
>> oom_unkillable_task), that would prevent any process sharing an mm with
>> init from being selected. Is that reasonable? My fix is only meant to
>> protect the init process from being killed when its vfork child is
>> selected, and I think that is the only place where the risk exists. If
>> my understanding is wrong, please correct me.
>>
> We never want to select a process where task->mm == init_mm because if we
> kill it we won't free any memory, regardless of vfork().  The goal of the
> oom killer is solely to free memory, so it always tries to avoid needless
> killing.
Yes, that makes sense. I will send the V1 patch.

the best,
thank you
