linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -mm 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm
@ 2015-09-29 14:17 Oleg Nesterov
  2015-09-29 14:18 ` [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() Oleg Nesterov
                   ` (3 more replies)
  0 siblings, 4 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-29 14:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

Michal, Tetsuo, David, sorry for delay. I'll try to read and answer
your emails in "can't oom-kill zap the victim's memory" thread later.

Let me send some initial changes which imo makes sense regardless,
but if we want to zap the victim's memory we need to ensure that all
tasks which share this ->mm were actually killed (see 3/3).

Please review, this series is simple but only compile tested.

Andrew, this is on top of linux-mmotm.git + the recent mm-oom-remove-
task_lock-protecting-comm-printing.patch from David.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()
  2015-09-29 14:17 [PATCH -mm 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
@ 2015-09-29 14:18 ` Oleg Nesterov
  2015-09-29 22:36   ` David Rientjes
  2015-09-29 14:18 ` [PATCH -mm 2/3] mm/oom_kill: cleanup the "kill sharing same memory" Oleg Nesterov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-29 14:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

The fatal_signal_pending() was added to suppress unnecessary "sharing
same memory" message, but it can't 100% help anyway because it can be
false-negative; SIGKILL can be already dequeued.

And worse, it can be false-positive due to exec or coredump. exec is
mostly fine, but coredump is not. It is possible that the group leader
has the pending SIGKILL because its sub-thread originated the coredump,
in this case we must not skip this process.

We could probably add the additional ->group_exit_task check but this
pach just removes fatal_signal_pending(), the extra "Kill process" is
unlikely and doesn't really hurt.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 mm/oom_kill.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 4766e25..0d581c6 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -588,8 +588,6 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 		    !(p->flags & PF_KTHREAD)) {
 			if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
 				continue;
-			if (fatal_signal_pending(p))
-				continue;
 
 			pr_info("Kill process %d (%s) sharing same memory\n",
 				task_pid_nr(p), p->comm);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH -mm 2/3] mm/oom_kill: cleanup the "kill sharing same memory"
  2015-09-29 14:17 [PATCH -mm 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
  2015-09-29 14:18 ` [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() Oleg Nesterov
@ 2015-09-29 14:18 ` Oleg Nesterov
  2015-09-29 22:39   ` David Rientjes
  2015-09-29 14:18 ` [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in Oleg Nesterov
  2015-09-30 18:23 ` [PATCH -mm v2 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
  3 siblings, 1 reply; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-29 14:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

Purely cosmetic, but the complex "if" condition looks annoying to me.
Especially because it is not consistent with OOM_SCORE_ADJ_MIN check
which adds another if/continue.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 mm/oom_kill.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 0d581c6..8e7bed2 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -583,16 +583,20 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	 * pending fatal signal.
 	 */
 	rcu_read_lock();
-	for_each_process(p)
-		if (p->mm == mm && !same_thread_group(p, victim) &&
-		    !(p->flags & PF_KTHREAD)) {
-			if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
-				continue;
+	for_each_process(p) {
+		if (unlikely(p->flags & PF_KTHREAD))
+			continue;
+		if (same_thread_group(p, victim))
+			continue;
+		if (p->mm != mm)
+			continue;
+		if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
+			continue;
 
-			pr_info("Kill process %d (%s) sharing same memory\n",
-				task_pid_nr(p), p->comm);
-			do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
-		}
+		pr_info("Kill process %d (%s) sharing same memory\n",
+			task_pid_nr(p), p->comm);
+		do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
+	}
 	rcu_read_unlock();
 
 	mmput(mm);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in
  2015-09-29 14:17 [PATCH -mm 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
  2015-09-29 14:18 ` [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() Oleg Nesterov
  2015-09-29 14:18 ` [PATCH -mm 2/3] mm/oom_kill: cleanup the "kill sharing same memory" Oleg Nesterov
@ 2015-09-29 14:18 ` Oleg Nesterov
  2015-09-29 22:41   ` David Rientjes
  2015-09-30  2:16   ` Tetsuo Handa
  2015-09-30 18:23 ` [PATCH -mm v2 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
  3 siblings, 2 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-29 14:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
are wrong. ->mm can be if task is the exited group leader. This means
in particular that "kill sharing same memory" loop can miss a process
with a zombie leader which uses the same ->mm.

Note: the process_has_mm(child, p->mm) check is still not 100% correct,
p->mm can be NULL too. This is minor, but probably deserves a fix or a
comment anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 mm/oom_kill.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 8e7bed2..8ecac2ef 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -483,6 +483,17 @@ void oom_killer_enable(void)
 	oom_killer_disabled = false;
 }
 
+static bool process_has_mm(struct task_struct *p, struct mm_struct *mm)
+{
+	struct task_struct *t;
+
+	for_each_thread(p, t)
+		if (t->mm)
+			return t->mm == mm;
+
+	return false;
+}
+
 #define K(x) ((x) << (PAGE_SHIFT-10))
 /*
  * Must be called while holding a reference to p, which will be released upon
@@ -530,7 +541,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 		list_for_each_entry(child, &t->children, sibling) {
 			unsigned int child_points;
 
-			if (child->mm == p->mm)
+			if (process_has_mm(child, p->mm))
 				continue;
 			/*
 			 * oom_badness() returns 0 if the thread is unkillable
@@ -588,7 +599,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 			continue;
 		if (same_thread_group(p, victim))
 			continue;
-		if (p->mm != mm)
+		if (!process_has_mm(p, mm))
 			continue;
 		if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
 			continue;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()
  2015-09-29 14:18 ` [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() Oleg Nesterov
@ 2015-09-29 22:36   ` David Rientjes
  2015-09-30  1:42     ` Tetsuo Handa
  2015-09-30 13:43     ` [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() Oleg Nesterov
  0 siblings, 2 replies; 42+ messages in thread
From: David Rientjes @ 2015-09-29 22:36 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Tue, 29 Sep 2015, Oleg Nesterov wrote:

> The fatal_signal_pending() was added to suppress unnecessary "sharing
> same memory" message, but it can't 100% help anyway because it can be
> false-negative; SIGKILL can be already dequeued.
> 
> And worse, it can be false-positive due to exec or coredump. exec is
> mostly fine, but coredump is not. It is possible that the group leader
> has the pending SIGKILL because its sub-thread originated the coredump,
> in this case we must not skip this process.
> 
> We could probably add the additional ->group_exit_task check but this
> pach just removes fatal_signal_pending(), the extra "Kill process" is
> unlikely and doesn't really hurt.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Rientjes <rientjes@google.com>

In addition, I'm really debating whether we need the "sharing same memory" 
line or not.  In the past, it has been helpful because there is no other 
way to determine what the kernel has killed other than to leave an 
artifact behind in the kernel log.  I can imagine that this could easily 
spam the kernel log, though, accompanied by oom killer messages that are 
already very verbose.  I wouldn't mind if it the printk were removed 
entirely.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 2/3] mm/oom_kill: cleanup the "kill sharing same memory"
  2015-09-29 14:18 ` [PATCH -mm 2/3] mm/oom_kill: cleanup the "kill sharing same memory" Oleg Nesterov
@ 2015-09-29 22:39   ` David Rientjes
  2015-09-30 13:49     ` Oleg Nesterov
  0 siblings, 1 reply; 42+ messages in thread
From: David Rientjes @ 2015-09-29 22:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Tue, 29 Sep 2015, Oleg Nesterov wrote:

> Purely cosmetic, but the complex "if" condition looks annoying to me.
> Especially because it is not consistent with OOM_SCORE_ADJ_MIN check
> which adds another if/continue.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  mm/oom_kill.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 0d581c6..8e7bed2 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -583,16 +583,20 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
>  	 * pending fatal signal.
>  	 */
>  	rcu_read_lock();
> -	for_each_process(p)
> -		if (p->mm == mm && !same_thread_group(p, victim) &&
> -		    !(p->flags & PF_KTHREAD)) {
> -			if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
> -				continue;
> +	for_each_process(p) {
> +		if (unlikely(p->flags & PF_KTHREAD))
> +			continue;
> +		if (same_thread_group(p, victim))
> +			continue;
> +		if (p->mm != mm)
> +			continue;

This ordering is a little weird to me, I think we would eliminate the 
majority of processes by checking for p->mm != mm first.  There are 
certainly pathological cases where that can be defeated, but in practice 
it seems to happen more often than not.

Unless you object, I think the ordering should be p->mm != mm, 
same_thread_group(), unlikely(PF_KTHREAD) as it originally was (thanks for 
adding the unlikely).

I agree your cleanup looks much better than the nested conditional.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in
  2015-09-29 14:18 ` [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in Oleg Nesterov
@ 2015-09-29 22:41   ` David Rientjes
  2015-09-30 13:50     ` Oleg Nesterov
  2015-09-30  2:16   ` Tetsuo Handa
  1 sibling, 1 reply; 42+ messages in thread
From: David Rientjes @ 2015-09-29 22:41 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Tue, 29 Sep 2015, Oleg Nesterov wrote:

> Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
> are wrong. ->mm can be if task is the exited group leader. This means
> in particular that "kill sharing same memory" loop can miss a process
> with a zombie leader which uses the same ->mm.
> 
> Note: the process_has_mm(child, p->mm) check is still not 100% correct,
> p->mm can be NULL too. This is minor, but probably deserves a fix or a
> comment anyway.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Rientjes <rientjes@google.com>

I like this and I don't want to hold up a fix for a personal preference, 
but I find process_has_mm() to simply imply the process has a non-NULL mm.  
Maybe process_shares_mm()?  Something to consider.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()
  2015-09-29 22:36   ` David Rientjes
@ 2015-09-30  1:42     ` Tetsuo Handa
  2015-09-30 13:47       ` Oleg Nesterov
  2015-09-30 13:43     ` [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() Oleg Nesterov
  1 sibling, 1 reply; 42+ messages in thread
From: Tetsuo Handa @ 2015-09-30  1:42 UTC (permalink / raw)
  To: rientjes, oleg; +Cc: akpm, kwalker, mhocko, skozina, linux-kernel

David Rientjes wrote:
> On Tue, 29 Sep 2015, Oleg Nesterov wrote:
> 
> > The fatal_signal_pending() was added to suppress unnecessary "sharing
> > same memory" message, but it can't 100% help anyway because it can be
> > false-negative; SIGKILL can be already dequeued.
> > 
> > And worse, it can be false-positive due to exec or coredump. exec is
> > mostly fine, but coredump is not. It is possible that the group leader
> > has the pending SIGKILL because its sub-thread originated the coredump,
> > in this case we must not skip this process.
> > 
> > We could probably add the additional ->group_exit_task check but this
> > pach just removes fatal_signal_pending(), the extra "Kill process" is
> > unlikely and doesn't really hurt.

This fatal_signal_pending() check is about to be added by me because the OOM
killer spams the kernel log when the mm struct which the OOM victim is using
is shared by many threads. ( http://marc.info/?l=linux-mm&m=143256441501204 )

> > 
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> 
> Acked-by: David Rientjes <rientjes@google.com>
> 
> In addition, I'm really debating whether we need the "sharing same memory" 
> line or not.  In the past, it has been helpful because there is no other 
> way to determine what the kernel has killed other than to leave an 
> artifact behind in the kernel log.  I can imagine that this could easily 
> spam the kernel log, though, accompanied by oom killer messages that are 
> already very verbose.  I wouldn't mind if it the printk were removed 
> entirely.
> 

I was waiting for your comment about whether you depend on
the "sharing same memory" message with KERN_ERR level.
( http://marc.info/?l=linux-mm&m=144120389203133 )

If nobody else objects, I think we can remove the "sharing same memory"
message. ( http://marc.info/?l=linux-mm&m=144119325831959 )

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in
  2015-09-29 14:18 ` [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in Oleg Nesterov
  2015-09-29 22:41   ` David Rientjes
@ 2015-09-30  2:16   ` Tetsuo Handa
  2015-09-30 13:59     ` Oleg Nesterov
  1 sibling, 1 reply; 42+ messages in thread
From: Tetsuo Handa @ 2015-09-30  2:16 UTC (permalink / raw)
  To: oleg, akpm; +Cc: rientjes, kwalker, mhocko, skozina, linux-kernel

Oleg Nesterov wrote:
> Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
> are wrong. ->mm can be if task is the exited group leader. This means

can be [missing word here?] if task



> +static bool process_has_mm(struct task_struct *p, struct mm_struct *mm)
> +{
> +	struct task_struct *t;
> +
> +	for_each_thread(p, t)
> +		if (t->mm)

Can t->mm change between pevious line and next line?

> +			return t->mm == mm;
> +
> +	return false;
> +}
> +
>  #define K(x) ((x) << (PAGE_SHIFT-10))
>  /*
>   * Must be called while holding a reference to p, which will be released upon
> @@ -530,7 +541,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
>  		list_for_each_entry(child, &t->children, sibling) {
>  			unsigned int child_points;
>  
> -			if (child->mm == p->mm)
> +			if (process_has_mm(child, p->mm))
>  				continue;

We hold read_lock(&tasklist_lock) but not rcu_read_lock().
Is for_each_thread() safe without rcu_read_lock()?

>  			/*
>  			 * oom_badness() returns 0 if the thread is unkillable

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()
  2015-09-29 22:36   ` David Rientjes
  2015-09-30  1:42     ` Tetsuo Handa
@ 2015-09-30 13:43     ` Oleg Nesterov
  1 sibling, 0 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 13:43 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On 09/29, David Rientjes wrote:
>
> On Tue, 29 Sep 2015, Oleg Nesterov wrote:
>
> > The fatal_signal_pending() was added to suppress unnecessary "sharing
> > same memory" message, but it can't 100% help anyway because it can be
> > false-negative; SIGKILL can be already dequeued.
> >
> > And worse, it can be false-positive due to exec or coredump. exec is
> > mostly fine, but coredump is not. It is possible that the group leader
> > has the pending SIGKILL because its sub-thread originated the coredump,
> > in this case we must not skip this process.
> >
> > We could probably add the additional ->group_exit_task check but this
> > pach just removes fatal_signal_pending(), the extra "Kill process" is
> > unlikely and doesn't really hurt.
> >
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> Acked-by: David Rientjes <rientjes@google.com>

Thanks!

> In addition, I'm really debating whether we need the "sharing same memory"
> line or not.  In the past, it has been helpful because there is no other
> way to determine what the kernel has killed other than to leave an
> artifact behind in the kernel log.  I can imagine that this could easily
> spam the kernel log, though, accompanied by oom killer messages that are
> already very verbose.  I wouldn't mind if it the printk were removed
> entirely.

Yes, me too... let me reply to Tetsuo's email.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()
  2015-09-30  1:42     ` Tetsuo Handa
@ 2015-09-30 13:47       ` Oleg Nesterov
  2015-09-30 15:20         ` [PATCH -mm 1/3] mm/oom_kill: remove the wrongfatal_signal_pending() Tetsuo Handa
  0 siblings, 1 reply; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 13:47 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: rientjes, akpm, kwalker, mhocko, skozina, linux-kernel

On 09/30, Tetsuo Handa wrote:
>
> David Rientjes wrote:
> > On Tue, 29 Sep 2015, Oleg Nesterov wrote:
> >
> > > The fatal_signal_pending() was added to suppress unnecessary "sharing
> > > same memory" message, but it can't 100% help anyway because it can be
> > > false-negative; SIGKILL can be already dequeued.
> > >
> > > And worse, it can be false-positive due to exec or coredump. exec is
> > > mostly fine, but coredump is not. It is possible that the group leader
> > > has the pending SIGKILL because its sub-thread originated the coredump,
> > > in this case we must not skip this process.
> > >
> > > We could probably add the additional ->group_exit_task check but this
> > > pach just removes fatal_signal_pending(), the extra "Kill process" is
> > > unlikely and doesn't really hurt.
>
> This fatal_signal_pending() check is about to be added by me because the OOM
> killer spams the kernel log when the mm struct which the OOM victim is using
> is shared by many threads. ( http://marc.info/?l=linux-mm&m=143256441501204 )

OK, I see, but it is wrong.

But I don't really understand "shared by many threads", I mean "threads" is
confusing word. I guess you mean CLONE_VM processes, otherwise we shouldn't
see the additional spam.

And 1000 CLONE_VM processes + "and the lock dependency prevents all threads
except the OOM victim thread from terminating until they get TIF_MEMDIE flag"
look like a really pathological case...

> > In addition, I'm really debating whether we need the "sharing same memory"
> > line or not.  In the past, it has been helpful because there is no other
> > way to determine what the kernel has killed other than to leave an
> > artifact behind in the kernel log.  I can imagine that this could easily
> > spam the kernel log, though, accompanied by oom killer messages that are
> > already very verbose.  I wouldn't mind if it the printk were removed
> > entirely.
> >
>
> I was waiting for your comment about whether you depend on
> the "sharing same memory" message with KERN_ERR level.
> ( http://marc.info/?l=linux-mm&m=144120389203133 )
>
> If nobody else objects, I think we can remove the "sharing same memory"
> message. ( http://marc.info/?l=linux-mm&m=144119325831959 )

OK, will you agree with v2 which also removes pr_warn?

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 2/3] mm/oom_kill: cleanup the "kill sharing same memory"
  2015-09-29 22:39   ` David Rientjes
@ 2015-09-30 13:49     ` Oleg Nesterov
  0 siblings, 0 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 13:49 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On 09/29, David Rientjes wrote:
>
> On Tue, 29 Sep 2015, Oleg Nesterov wrote:
>
> > Purely cosmetic, but the complex "if" condition looks annoying to me.
> > Especially because it is not consistent with OOM_SCORE_ADJ_MIN check
> > which adds another if/continue.
> >
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > ---
> >  mm/oom_kill.c | 22 +++++++++++++---------
> >  1 file changed, 13 insertions(+), 9 deletions(-)
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 0d581c6..8e7bed2 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -583,16 +583,20 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> >  	 * pending fatal signal.
> >  	 */
> >  	rcu_read_lock();
> > -	for_each_process(p)
> > -		if (p->mm == mm && !same_thread_group(p, victim) &&
> > -		    !(p->flags & PF_KTHREAD)) {
> > -			if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
> > -				continue;
> > +	for_each_process(p) {
> > +		if (unlikely(p->flags & PF_KTHREAD))
> > +			continue;
> > +		if (same_thread_group(p, victim))
> > +			continue;
> > +		if (p->mm != mm)
> > +			continue;
>
> This ordering is a little weird to me, I think we would eliminate the
> majority of processes by checking for p->mm != mm first.  There are
> certainly pathological cases where that can be defeated, but in practice
> it seems to happen more often than not.
>
> Unless you object, I think the ordering should be p->mm != mm,
> same_thread_group(), unlikely(PF_KTHREAD) as it originally was (thanks for
> adding the unlikely).

OK, agreed, will send v2.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in
  2015-09-29 22:41   ` David Rientjes
@ 2015-09-30 13:50     ` Oleg Nesterov
  0 siblings, 0 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 13:50 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On 09/29, David Rientjes wrote:
>
> On Tue, 29 Sep 2015, Oleg Nesterov wrote:
>
> > Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
> > are wrong. ->mm can be if task is the exited group leader. This means
> > in particular that "kill sharing same memory" loop can miss a process
> > with a zombie leader which uses the same ->mm.
> >
> > Note: the process_has_mm(child, p->mm) check is still not 100% correct,
> > p->mm can be NULL too. This is minor, but probably deserves a fix or a
> > comment anyway.
> >
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> Acked-by: David Rientjes <rientjes@google.com>
>
> I like this and I don't want to hold up a fix for a personal preference,
> but I find process_has_mm() to simply imply the process has a non-NULL mm.

Hmm, yes ;)

> Maybe process_shares_mm()?  Something to consider.

Agreed, will rename in v2.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in
  2015-09-30  2:16   ` Tetsuo Handa
@ 2015-09-30 13:59     ` Oleg Nesterov
  0 siblings, 0 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 13:59 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: akpm, rientjes, kwalker, mhocko, skozina, linux-kernel

On 09/30, Tetsuo Handa wrote:
>
> Oleg Nesterov wrote:
> > Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
> > are wrong. ->mm can be if task is the exited group leader. This means
>
> can be [missing word here?] if task

Yes thanks. Will fix in v2.

Hmm. And I just noticed that the subjects were corrupted... need to fix
my script.

> > +static bool process_has_mm(struct task_struct *p, struct mm_struct *mm)
> > +{
> > +	struct task_struct *t;
> > +
> > +	for_each_thread(p, t)
> > +		if (t->mm)
>
> Can t->mm change between pevious line and next line?

Good point, thanks. I'll add READ_ONCE() to ensure gcc won't re-load
t->mm again.

> > @@ -530,7 +541,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> >  		list_for_each_entry(child, &t->children, sibling) {
> >  			unsigned int child_points;
> >
> > -			if (child->mm == p->mm)
> > +			if (process_has_mm(child, p->mm))
> >  				continue;
>
> We hold read_lock(&tasklist_lock) but not rcu_read_lock().
> Is for_each_thread() safe without rcu_read_lock()?

Yes, for_each_thread() is rcu and/or tasklist_lock safe.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm 1/3] mm/oom_kill: remove the wrongfatal_signal_pending()
  2015-09-30 13:47       ` Oleg Nesterov
@ 2015-09-30 15:20         ` Tetsuo Handa
  0 siblings, 0 replies; 42+ messages in thread
From: Tetsuo Handa @ 2015-09-30 15:20 UTC (permalink / raw)
  To: oleg; +Cc: rientjes, akpm, kwalker, mhocko, skozina, linux-kernel

Oleg Nesterov wrote:
> > This fatal_signal_pending() check is about to be added by me because the OOM
> > killer spams the kernel log when the mm struct which the OOM victim is using
> > is shared by many threads. ( http://marc.info/?l=linux-mm&m=143256441501204 )
> 
> OK, I see, but it is wrong.
> 
> But I don't really understand "shared by many threads", I mean "threads" is
> confusing word. I guess you mean CLONE_VM processes, otherwise we shouldn't
> see the additional spam.

Right.

> 
> And 1000 CLONE_VM processes + "and the lock dependency prevents all threads
> except the OOM victim thread from terminating until they get TIF_MEMDIE flag"
> look like a really pathological case...

Right. But I saw that
http://lkml.kernel.org/r/201509271451.DEB86404.tMFFHSVQFOLOOJ@I-love.SAKURA.ne.jp
took 3 minites to kill one mm struct because dump_header() was called for many
times.

> > I was waiting for your comment about whether you depend on
> > the "sharing same memory" message with KERN_ERR level.
> > ( http://marc.info/?l=linux-mm&m=144120389203133 )
> >
> > If nobody else objects, I think we can remove the "sharing same memory"
> > message. ( http://marc.info/?l=linux-mm&m=144119325831959 )
> 
> OK, will you agree with v2 which also removes pr_warn?

Yes.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH -mm v2 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm
  2015-09-29 14:17 [PATCH -mm 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
                   ` (2 preceding siblings ...)
  2015-09-29 14:18 ` [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in Oleg Nesterov
@ 2015-09-30 18:23 ` Oleg Nesterov
  2015-09-30 18:24   ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process() Oleg Nesterov
                     ` (2 more replies)
  3 siblings, 3 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 18:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On 09/29, Oleg Nesterov wrote:
>
> Let me send some initial changes which imo makes sense regardless,
> but if we want to zap the victim's memory we need to ensure that all
> tasks which share this ->mm were actually killed (see 3/3).
>
> Please review, this series is simple but only compile tested.
>
> Andrew, this is on top of linux-mmotm.git + the recent mm-oom-remove-
> task_lock-protecting-comm-printing.patch from David.

Please consider v2 based on the comments from Tetsuo and David (thanks
a lot!).

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-09-30 18:23 ` [PATCH -mm v2 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
@ 2015-09-30 18:24   ` Oleg Nesterov
  2015-09-30 21:14     ` David Rientjes
  2015-10-01 12:49     ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check " Michal Hocko
  2015-09-30 18:24   ` [PATCH -mm v2 2/3] mm/oom_kill: cleanup the "kill sharing same memory" loop Oleg Nesterov
  2015-09-30 18:24   ` [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process() Oleg Nesterov
  2 siblings, 2 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 18:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

The fatal_signal_pending() was added to suppress unnecessary "sharing
same memory" message, but it can't 100% help anyway because it can be
false-negative; SIGKILL can be already dequeued.

And worse, it can be false-positive due to exec or coredump. exec is
mostly fine, but coredump is not. It is possible that the group leader
has the pending SIGKILL because its sub-thread originated the coredump,
in this case we must not skip this process.

We could probably add the additional ->group_exit_task check but this
pach just removes the wrong check along with pr_info().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 mm/oom_kill.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 4766e25..b6b8c78 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -588,11 +588,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 		    !(p->flags & PF_KTHREAD)) {
 			if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
 				continue;
-			if (fatal_signal_pending(p))
-				continue;
 
-			pr_info("Kill process %d (%s) sharing same memory\n",
-				task_pid_nr(p), p->comm);
 			do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
 		}
 	rcu_read_unlock();
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH -mm v2 2/3] mm/oom_kill: cleanup the "kill sharing same memory" loop
  2015-09-30 18:23 ` [PATCH -mm v2 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
  2015-09-30 18:24   ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process() Oleg Nesterov
@ 2015-09-30 18:24   ` Oleg Nesterov
  2015-09-30 21:15     ` David Rientjes
  2015-10-01 12:50     ` Michal Hocko
  2015-09-30 18:24   ` [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process() Oleg Nesterov
  2 siblings, 2 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 18:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

Purely cosmetic, but the complex "if" condition looks annoying to me.
Especially because it is not consistent with OOM_SCORE_ADJ_MIN check
which adds another if/continue.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 mm/oom_kill.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index b6b8c78..c189ee5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -583,14 +583,18 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	 * pending fatal signal.
 	 */
 	rcu_read_lock();
-	for_each_process(p)
-		if (p->mm == mm && !same_thread_group(p, victim) &&
-		    !(p->flags & PF_KTHREAD)) {
-			if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
-				continue;
+	for_each_process(p) {
+		if (p->mm != mm)
+			continue;
+		if (same_thread_group(p, victim))
+			continue;
+		if (unlikely(p->flags & PF_KTHREAD))
+			continue;
+		if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
+			continue;
 
-			do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
-		}
+		do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
+	}
 	rcu_read_unlock();
 
 	mmput(mm);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process()
  2015-09-30 18:23 ` [PATCH -mm v2 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
  2015-09-30 18:24   ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process() Oleg Nesterov
  2015-09-30 18:24   ` [PATCH -mm v2 2/3] mm/oom_kill: cleanup the "kill sharing same memory" loop Oleg Nesterov
@ 2015-09-30 18:24   ` Oleg Nesterov
  2015-09-30 21:15     ` David Rientjes
                       ` (2 more replies)
  2 siblings, 3 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-09-30 18:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
are wrong. task->mm can be NULL if the task is the exited group leader.
This means in particular that "kill sharing same memory" loop can miss
a process with a zombie leader which uses the same ->mm.

Note: the process_has_mm(child, p->mm) check is still not 100% correct,
p->mm can be NULL too. This is minor, but probably deserves a fix or a
comment anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 mm/oom_kill.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c189ee5..034d219 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -483,6 +483,18 @@ void oom_killer_enable(void)
 	oom_killer_disabled = false;
 }
 
+static bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
+{
+	struct task_struct *t;
+
+	for_each_thread(p, t) {
+		struct mm_struct *t_mm = READ_ONCE(t->mm);
+		if (t_mm)
+			return t_mm == mm;
+	}
+	return false;
+}
+
 #define K(x) ((x) << (PAGE_SHIFT-10))
 /*
  * Must be called while holding a reference to p, which will be released upon
@@ -530,7 +542,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 		list_for_each_entry(child, &t->children, sibling) {
 			unsigned int child_points;
 
-			if (child->mm == p->mm)
+			if (process_shares_mm(child, p->mm))
 				continue;
 			/*
 			 * oom_badness() returns 0 if the thread is unkillable
@@ -584,7 +596,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	 */
 	rcu_read_lock();
 	for_each_process(p) {
-		if (p->mm != mm)
+		if (!process_shares_mm(p, mm))
 			continue;
 		if (same_thread_group(p, victim))
 			continue;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-09-30 18:24   ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process() Oleg Nesterov
@ 2015-09-30 21:14     ` David Rientjes
  2015-10-01 10:52       ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()check " Tetsuo Handa
  2015-10-01 12:49     ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check " Michal Hocko
  1 sibling, 1 reply; 42+ messages in thread
From: David Rientjes @ 2015-09-30 21:14 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Wed, 30 Sep 2015, Oleg Nesterov wrote:

> The fatal_signal_pending() was added to suppress unnecessary "sharing
> same memory" message, but it can't 100% help anyway because it can be
> false-negative; SIGKILL can be already dequeued.
> 
> And worse, it can be false-positive due to exec or coredump. exec is
> mostly fine, but coredump is not. It is possible that the group leader
> has the pending SIGKILL because its sub-thread originated the coredump,
> in this case we must not skip this process.
> 
> We could probably add the additional ->group_exit_task check but this
> pach just removes the wrong check along with pr_info().
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 2/3] mm/oom_kill: cleanup the "kill sharing same memory" loop
  2015-09-30 18:24   ` [PATCH -mm v2 2/3] mm/oom_kill: cleanup the "kill sharing same memory" loop Oleg Nesterov
@ 2015-09-30 21:15     ` David Rientjes
  2015-10-01 12:50     ` Michal Hocko
  1 sibling, 0 replies; 42+ messages in thread
From: David Rientjes @ 2015-09-30 21:15 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Wed, 30 Sep 2015, Oleg Nesterov wrote:

> Purely cosmetic, but the complex "if" condition looks annoying to me.
> Especially because it is not consistent with OOM_SCORE_ADJ_MIN check
> which adds another if/continue.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process()
  2015-09-30 18:24   ` [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process() Oleg Nesterov
@ 2015-09-30 21:15     ` David Rientjes
  2015-10-01 12:56     ` Michal Hocko
  2015-10-01 22:24     ` Andrew Morton
  2 siblings, 0 replies; 42+ messages in thread
From: David Rientjes @ 2015-09-30 21:15 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Wed, 30 Sep 2015, Oleg Nesterov wrote:

> Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
> are wrong. task->mm can be NULL if the task is the exited group leader.
> This means in particular that "kill sharing same memory" loop can miss
> a process with a zombie leader which uses the same ->mm.
> 
> Note: the process_has_mm(child, p->mm) check is still not 100% correct,
> p->mm can be NULL too. This is minor, but probably deserves a fix or a
> comment anyway.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()check in oom_kill_process()
  2015-09-30 21:14     ` David Rientjes
@ 2015-10-01 10:52       ` Tetsuo Handa
  0 siblings, 0 replies; 42+ messages in thread
From: Tetsuo Handa @ 2015-10-01 10:52 UTC (permalink / raw)
  To: oleg, akpm; +Cc: rientjes, kwalker, mhocko, skozina, linux-kernel

David Rientjes wrote:
> On Wed, 30 Sep 2015, Oleg Nesterov wrote:
> 
> > The fatal_signal_pending() was added to suppress unnecessary "sharing
> > same memory" message, but it can't 100% help anyway because it can be
> > false-negative; SIGKILL can be already dequeued.
> > 
> > And worse, it can be false-positive due to exec or coredump. exec is
> > mostly fine, but coredump is not. It is possible that the group leader
> > has the pending SIGKILL because its sub-thread originated the coredump,
> > in this case we must not skip this process.
> > 
> > We could probably add the additional ->group_exit_task check but this
> > pach just removes the wrong check along with pr_info().
> > 
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> 
> Acked-by: David Rientjes <rientjes@google.com>
> 

Please s/pach/patch/ when applying.

Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-09-30 18:24   ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process() Oleg Nesterov
  2015-09-30 21:14     ` David Rientjes
@ 2015-10-01 12:49     ` Michal Hocko
  2015-10-01 15:00       ` Oleg Nesterov
  1 sibling, 1 reply; 42+ messages in thread
From: Michal Hocko @ 2015-10-01 12:49 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Rientjes, Kyle Walker, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Wed 30-09-15 20:24:05, Oleg Nesterov wrote:
[...]
> It is possible that the group leader
> has the pending SIGKILL because its sub-thread originated the coredump,
> in this case we must not skip this process.

I do not understand this. If the group leader has SIGKILL pending it
will die anyway regardless whether we send another sigkill or not, no?

Or is the issue that another SIGKILL will wake it up?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 2/3] mm/oom_kill: cleanup the "kill sharing same memory" loop
  2015-09-30 18:24   ` [PATCH -mm v2 2/3] mm/oom_kill: cleanup the "kill sharing same memory" loop Oleg Nesterov
  2015-09-30 21:15     ` David Rientjes
@ 2015-10-01 12:50     ` Michal Hocko
  1 sibling, 0 replies; 42+ messages in thread
From: Michal Hocko @ 2015-10-01 12:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Rientjes, Kyle Walker, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Wed 30-09-15 20:24:08, Oleg Nesterov wrote:
> Purely cosmetic, but the complex "if" condition looks annoying to me.
> Especially because it is not consistent with OOM_SCORE_ADJ_MIN check
> which adds another if/continue.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/oom_kill.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index b6b8c78..c189ee5 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -583,14 +583,18 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
>  	 * pending fatal signal.
>  	 */
>  	rcu_read_lock();
> -	for_each_process(p)
> -		if (p->mm == mm && !same_thread_group(p, victim) &&
> -		    !(p->flags & PF_KTHREAD)) {
> -			if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
> -				continue;
> +	for_each_process(p) {
> +		if (p->mm != mm)
> +			continue;
> +		if (same_thread_group(p, victim))
> +			continue;
> +		if (unlikely(p->flags & PF_KTHREAD))
> +			continue;
> +		if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
> +			continue;
>  
> -			do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
> -		}
> +		do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
> +	}
>  	rcu_read_unlock();
>  
>  	mmput(mm);
> -- 
> 2.4.3

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process()
  2015-09-30 18:24   ` [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process() Oleg Nesterov
  2015-09-30 21:15     ` David Rientjes
@ 2015-10-01 12:56     ` Michal Hocko
  2015-10-01 22:24     ` Andrew Morton
  2 siblings, 0 replies; 42+ messages in thread
From: Michal Hocko @ 2015-10-01 12:56 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Rientjes, Kyle Walker, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Wed 30-09-15 20:24:11, Oleg Nesterov wrote:
> Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
> are wrong. task->mm can be NULL if the task is the exited group leader.
> This means in particular that "kill sharing same memory" loop can miss
> a process with a zombie leader which uses the same ->mm.
> 
> Note: the process_has_mm(child, p->mm) check is still not 100% correct,
> p->mm can be NULL too. This is minor, but probably deserves a fix or a
> comment anyway.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/oom_kill.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index c189ee5..034d219 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -483,6 +483,18 @@ void oom_killer_enable(void)
>  	oom_killer_disabled = false;
>  }
>  
> +static bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
> +{
> +	struct task_struct *t;
> +
> +	for_each_thread(p, t) {
> +		struct mm_struct *t_mm = READ_ONCE(t->mm);
> +		if (t_mm)
> +			return t_mm == mm;
> +	}
> +	return false;
> +}
> +
>  #define K(x) ((x) << (PAGE_SHIFT-10))
>  /*
>   * Must be called while holding a reference to p, which will be released upon
> @@ -530,7 +542,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
>  		list_for_each_entry(child, &t->children, sibling) {
>  			unsigned int child_points;
>  
> -			if (child->mm == p->mm)
> +			if (process_shares_mm(child, p->mm))
>  				continue;
>  			/*
>  			 * oom_badness() returns 0 if the thread is unkillable
> @@ -584,7 +596,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
>  	 */
>  	rcu_read_lock();
>  	for_each_process(p) {
> -		if (p->mm != mm)
> +		if (!process_shares_mm(p, mm))
>  			continue;
>  		if (same_thread_group(p, victim))
>  			continue;
> -- 
> 2.4.3

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-01 12:49     ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check " Michal Hocko
@ 2015-10-01 15:00       ` Oleg Nesterov
  2015-10-01 15:27         ` Michal Hocko
  0 siblings, 1 reply; 42+ messages in thread
From: Oleg Nesterov @ 2015-10-01 15:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, David Rientjes, Kyle Walker, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On 10/01, Michal Hocko wrote:
>
> On Wed 30-09-15 20:24:05, Oleg Nesterov wrote:
> [...]
> > It is possible that the group leader
> > has the pending SIGKILL because its sub-thread originated the coredump,
> > in this case we must not skip this process.
>
> I do not understand this. If the group leader has SIGKILL pending it
> will die anyway regardless whether we send another sigkill or not, no?

Yes it will die, but only after the coredump is finished.

Suppose we have a thread group with the group leader P and another
thread T. If T starts the coredump, it sends SIGKILL to P and waits
until it parks in exit_mm(). Then T actually dumps the core which may
need more memory, a lot of time, etc.

We need to kill this process. Yes, P is already killed and it sleeps
in TASK_UNINTERRUPTIBLE so this thread does not need SIGKILL. But
do_send_sig_info(P) will also find T and kill it too to make
dump_interrupted() == T.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-01 15:00       ` Oleg Nesterov
@ 2015-10-01 15:27         ` Michal Hocko
  2015-10-01 15:41           ` Oleg Nesterov
  0 siblings, 1 reply; 42+ messages in thread
From: Michal Hocko @ 2015-10-01 15:27 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Rientjes, Kyle Walker, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Thu 01-10-15 17:00:10, Oleg Nesterov wrote:
> On 10/01, Michal Hocko wrote:
> >
> > On Wed 30-09-15 20:24:05, Oleg Nesterov wrote:
> > [...]
> > > It is possible that the group leader
> > > has the pending SIGKILL because its sub-thread originated the coredump,
> > > in this case we must not skip this process.
> >
> > I do not understand this. If the group leader has SIGKILL pending it
> > will die anyway regardless whether we send another sigkill or not, no?
> 
> Yes it will die, but only after the coredump is finished.
> 
> Suppose we have a thread group with the group leader P and another
> thread T. If T starts the coredump, it sends SIGKILL to P and waits
> until it parks in exit_mm(). Then T actually dumps the core which may
> need more memory, a lot of time, etc.
> 
> We need to kill this process. Yes, P is already killed and it sleeps
> in TASK_UNINTERRUPTIBLE so this thread does not need SIGKILL. But
> do_send_sig_info(P) will also find T and kill it too to make
> dump_interrupted() == T.

I am still utterly confused :( Where do we kill T if it is not in the
same thread group with P?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-01 15:27         ` Michal Hocko
@ 2015-10-01 15:41           ` Oleg Nesterov
  2015-10-01 16:19             ` Michal Hocko
  0 siblings, 1 reply; 42+ messages in thread
From: Oleg Nesterov @ 2015-10-01 15:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, David Rientjes, Kyle Walker, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On 10/01, Michal Hocko wrote:
>
> On Thu 01-10-15 17:00:10, Oleg Nesterov wrote:
> > On 10/01, Michal Hocko wrote:
> > >
> > > On Wed 30-09-15 20:24:05, Oleg Nesterov wrote:
> > > [...]
> > > > It is possible that the group leader
> > > > has the pending SIGKILL because its sub-thread originated the coredump,
> > > > in this case we must not skip this process.
> > >
> > > I do not understand this. If the group leader has SIGKILL pending it
> > > will die anyway regardless whether we send another sigkill or not, no?
> >
> > Yes it will die, but only after the coredump is finished.
> >
> > Suppose we have a thread group with the group leader P and another
> > thread T. If T starts the coredump, it sends SIGKILL to P and waits
> > until it parks in exit_mm(). Then T actually dumps the core which may
> > need more memory, a lot of time, etc.
> >
> > We need to kill this process. Yes, P is already killed and it sleeps
> > in TASK_UNINTERRUPTIBLE so this thread does not need SIGKILL. But
> > do_send_sig_info(P) will also find T and kill it too to make
> > dump_interrupted() == T.
>
> I am still utterly confused :( Where do we kill T if it is not in the
> same thread group with P?

But it is in the same thread group?

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-01 15:41           ` Oleg Nesterov
@ 2015-10-01 16:19             ` Michal Hocko
  2015-10-01 17:53               ` Oleg Nesterov
  0 siblings, 1 reply; 42+ messages in thread
From: Michal Hocko @ 2015-10-01 16:19 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Rientjes, Kyle Walker, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Thu 01-10-15 17:41:15, Oleg Nesterov wrote:
> On 10/01, Michal Hocko wrote:
> >
> > On Thu 01-10-15 17:00:10, Oleg Nesterov wrote:
> > > On 10/01, Michal Hocko wrote:
> > > >
> > > > On Wed 30-09-15 20:24:05, Oleg Nesterov wrote:
> > > > [...]
> > > > > It is possible that the group leader
> > > > > has the pending SIGKILL because its sub-thread originated the coredump,
> > > > > in this case we must not skip this process.
> > > >
> > > > I do not understand this. If the group leader has SIGKILL pending it
> > > > will die anyway regardless whether we send another sigkill or not, no?
> > >
> > > Yes it will die, but only after the coredump is finished.
> > >
> > > Suppose we have a thread group with the group leader P and another
> > > thread T. If T starts the coredump, it sends SIGKILL to P and waits
> > > until it parks in exit_mm(). Then T actually dumps the core which may
> > > need more memory, a lot of time, etc.
> > >
> > > We need to kill this process. Yes, P is already killed and it sleeps
> > > in TASK_UNINTERRUPTIBLE so this thread does not need SIGKILL. But
> > > do_send_sig_info(P) will also find T and kill it too to make
> > > dump_interrupted() == T.
> >
> > I am still utterly confused :( Where do we kill T if it is not in the
> > same thread group with P?
> 
> But it is in the same thread group?

The whole loop is about sending sigkill to a process from a different
thread group though. And this is what confused me completely. But I got
the point finally. zap_process will add SIGKILL to all threads but the
current which will go on without being killed and if this is not a
thread group leader then we would miss it.

Thanks for the clarification and feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-01 16:19             ` Michal Hocko
@ 2015-10-01 17:53               ` Oleg Nesterov
  2015-10-02 11:32                 ` Tetsuo Handa
  0 siblings, 1 reply; 42+ messages in thread
From: Oleg Nesterov @ 2015-10-01 17:53 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, David Rientjes, Kyle Walker, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On 10/01, Michal Hocko wrote:
>
> zap_process will add SIGKILL to all threads but the
> current which will go on without being killed and if this is not a
> thread group leader then we would miss it.

Yes. And note that de_thread() does the same. Speaking of oom-killer
this is mostly fine, the execing thread is going to release its old
->mm and it has already passed the copy_strings() stage which can use
a lot more memory.

But in theory (in practice currently this seems impossible without
SIGKILL) exec can fail before exec_mmap(), so if we want to zap its
->mm we need to ensure it can't return to user space.

> Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process()
  2015-09-30 18:24   ` [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process() Oleg Nesterov
  2015-09-30 21:15     ` David Rientjes
  2015-10-01 12:56     ` Michal Hocko
@ 2015-10-01 22:24     ` Andrew Morton
  2 siblings, 0 replies; 42+ messages in thread
From: Andrew Morton @ 2015-10-01 22:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: David Rientjes, Kyle Walker, Michal Hocko, Stanislav Kozina,
	Tetsuo Handa, linux-kernel

On Wed, 30 Sep 2015 20:24:11 +0200 Oleg Nesterov <oleg@redhat.com> wrote:

> Both "child->mm == mm" and "p->mm != mm" checks in oom_kill_process()
> are wrong. task->mm can be NULL if the task is the exited group leader.
> This means in particular that "kill sharing same memory" loop can miss
> a process with a zombie leader which uses the same ->mm.
> 
> Note: the process_has_mm(child, p->mm) check is still not 100% correct,
> p->mm can be NULL too. This is minor, but probably deserves a fix or a
> comment anyway.
> 
> ...
>
> +static bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
> +{
> +	struct task_struct *t;
> +
> +	for_each_thread(p, t) {
> +		struct mm_struct *t_mm = READ_ONCE(t->mm);
> +		if (t_mm)
> +			return t_mm == mm;
> +	}
> +	return false;
> +}

Guys, please don't write write-only code.  This function is deeply
unobvious and I don't think a typical reader will have a hope of
understanding why things are this way.

I had a lame attempt:

--- a/mm/oom_kill.c~mm-oom_kill-fix-the-wrong-task-mm-==-mm-checks-in-oom_kill_process-fix
+++ a/mm/oom_kill.c
@@ -483,6 +483,12 @@ void oom_killer_enable(void)
 	oom_killer_disabled = false;
 }
 
+/*
+ * task->mm can be NULL if the task is the exited group leader.  So to
+ * determine whether the task is using a particular mm, we examine all the
+ * task's threads: if one of those is using this mm then this task was also
+ * using it.
+ */
 static bool process_shares_mm(struct task_struct *p, struct mm_struct *mm)
 {
 	struct task_struct *t;
_

Which makes me wonder if "process_shared_mm" or even
"process_used_to_share_mm" would be better names...


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-01 17:53               ` Oleg Nesterov
@ 2015-10-02 11:32                 ` Tetsuo Handa
  2015-10-02 12:11                   ` Michal Hocko
  2015-10-02 13:52                   ` Oleg Nesterov
  0 siblings, 2 replies; 42+ messages in thread
From: Tetsuo Handa @ 2015-10-02 11:32 UTC (permalink / raw)
  To: oleg, mhocko; +Cc: akpm, rientjes, kwalker, skozina, linux-kernel

Oleg Nesterov wrote:
> On 10/01, Michal Hocko wrote:
> >
> > zap_process will add SIGKILL to all threads but the
> > current which will go on without being killed and if this is not a
> > thread group leader then we would miss it.
> 
> Yes. And note that de_thread() does the same. Speaking of oom-killer
> this is mostly fine, the execing thread is going to release its old
> ->mm and it has already passed the copy_strings() stage which can use
> a lot more memory.

So, we have the same wrong fatal_signal_pending() check in out_of_memory()

        /*
         * If current has a pending SIGKILL or is exiting, then automatically
         * select it.  The goal is to allow it to allocate so that it may
         * quickly exit and free its memory.
         *
         * But don't select if current has already released its mm and cleared
         * TIF_MEMDIE flag at exit_mm(), otherwise an OOM livelock may occur.
         */
        if (current->mm &&
            (fatal_signal_pending(current) || task_will_free_mem(current))) {
                mark_oom_victim(current);
                return true;
        }

because it is possible that T starts the coredump, T sends SIGKILL to P,
P calls out_of_memory() on GFP_FS allocation, P misses to set SIGKILL on T?

Since T sends SIGKILL to all clone(CLONE_VM) tasks upon coredump, P needs
to do

        rcu_read_lock();
        for_each_process(p) {
                if (!process_shares_mm(p, current->mm))
                        continue;
                if (unlikely(p->flags & PF_KTHREAD))
                        continue;
                if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
                        continue;

                do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
        }
        rcu_read_unlock();

after mark_oom_victim(current) in case T is not in the same thread group?

If yes, what happens if some task failed to receive SIGKILL due to
p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN condition?
Will we hit mm->mmap_sem livelock?

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 11:32                 ` Tetsuo Handa
@ 2015-10-02 12:11                   ` Michal Hocko
  2015-10-02 12:33                     ` Tetsuo Handa
  2015-10-02 13:52                   ` Oleg Nesterov
  1 sibling, 1 reply; 42+ messages in thread
From: Michal Hocko @ 2015-10-02 12:11 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: oleg, akpm, rientjes, kwalker, skozina, linux-kernel

On Fri 02-10-15 20:32:41, Tetsuo Handa wrote:
> Oleg Nesterov wrote:
> > On 10/01, Michal Hocko wrote:
> > >
> > > zap_process will add SIGKILL to all threads but the
> > > current which will go on without being killed and if this is not a
> > > thread group leader then we would miss it.
> > 
> > Yes. And note that de_thread() does the same. Speaking of oom-killer
> > this is mostly fine, the execing thread is going to release its old
> > ->mm and it has already passed the copy_strings() stage which can use
> > a lot more memory.
> 
> So, we have the same wrong fatal_signal_pending() check in out_of_memory()
> 
>         /*
>          * If current has a pending SIGKILL or is exiting, then automatically
>          * select it.  The goal is to allow it to allocate so that it may
>          * quickly exit and free its memory.
>          *
>          * But don't select if current has already released its mm and cleared
>          * TIF_MEMDIE flag at exit_mm(), otherwise an OOM livelock may occur.
>          */
>         if (current->mm &&
>             (fatal_signal_pending(current) || task_will_free_mem(current))) {
>                 mark_oom_victim(current);
>                 return true;
>         }
> 
> because it is possible that T starts the coredump, T sends SIGKILL to P,
> P calls out_of_memory() on GFP_FS allocation, P misses to set SIGKILL on T?

So what? P will get an access to memory reserves to move on with the
allocation. This has nothing to do with other thread. If the current
thread (P) doesn't release any memory we would get to regular oom killer
path and eventually send the signal.

This is a simple heuristic to prevent from unnecessary killing. If it
doesn't help we should still be able to kill something. If you really
want to prevent from an unlikely case where the current is the only task
in the OOM path and TIF_MEMDIE didn't help then disable the heuristic if
the current already has TIF_MEMDIE set.

> Since T sends SIGKILL to all clone(CLONE_VM) tasks upon coredump, P needs
> to do

It does that only to all threads in the _same_ thread group AFAIU.

> 
>         rcu_read_lock();
>         for_each_process(p) {
>                 if (!process_shares_mm(p, current->mm))
>                         continue;
>                 if (unlikely(p->flags & PF_KTHREAD))
>                         continue;
>                 if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
>                         continue;
> 
>                 do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
>         }
>         rcu_read_unlock();
> 
> after mark_oom_victim(current) in case T is not in the same thread group?

What does this have to do with coredumping? Have you fallen into the
same confusion trap I did yesterday?

> If yes, what happens if some task failed to receive SIGKILL due to
> p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN condition?
> Will we hit mm->mmap_sem livelock?

I am not sure which livelock you mean here (exit_mm not being able to
proceed because some of the task is holding mmap_sem for write and
looping in the allocator)?

Anyway, having tasks sharing mm but having incompatible OOM_SCORE_ADJ_MIN
is basically a misconfiguration IMHO. I wouldn't lose sleep over it to
be honest. And yes if one process is pinning the address space then our
chances.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 12:11                   ` Michal Hocko
@ 2015-10-02 12:33                     ` Tetsuo Handa
  2015-10-02 13:32                       ` Michal Hocko
  0 siblings, 1 reply; 42+ messages in thread
From: Tetsuo Handa @ 2015-10-02 12:33 UTC (permalink / raw)
  To: mhocko; +Cc: oleg, akpm, rientjes, kwalker, skozina, linux-kernel

Michal Hocko wrote:
> > Since T sends SIGKILL to all clone(CLONE_VM) tasks upon coredump, P needs
> > to do
> 
> It does that only to all threads in the _same_ thread group AFAIU.

I'm confused. What the _same_ thread group?

I can observe that SIGKILL is sent to all

  clone(CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)
  clone(CLONE_SIGHAND | CLONE_VM)
  clone(CLONE_VM)

threads upon coredump.

---------- testing program start ----------
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/mman.h>

static int file_mapper(void *unused)
{
        const int fd = open("/proc/self/exe", O_RDONLY);
        void *ptr[10000]; /* Will cause SIGSEGV due to stack overflow */
        int i;
        sleep(2);
        while (1) {
                for (i = 0; i < 10000; i++)
                        ptr[i] = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd,
                                      0);
                for (i = 0; i < 10000; i++)
                        munmap(ptr[i], 4096);
        }
        return 0;
}

int main(int argc, char *argv[])
{
        int i;
        for (i = 0; i < 5; i++) {
                char *cp = malloc(4 * 1024);
                if (!cp || clone(file_mapper, cp + 4 * 1024,
                                 CLONE_THREAD | CLONE_SIGHAND | CLONE_VM, NULL) == -1)
                        break;
        }
        for (i = 0; i < 5; i++) {
                char *cp = malloc(4 * 1024);
                if (!cp || clone(file_mapper, cp + 4 * 1024,
                                 CLONE_SIGHAND | CLONE_VM, NULL) == -1)
                        break;
        }
        for (i = 0; i < 5; i++) {
                char *cp = malloc(4 * 1024);
                if (!cp || clone(file_mapper, cp + 4 * 1024,
                                 CLONE_VM, NULL) == -1)
                        break;
        }
        while (1)
                pause();
        return 0;
}
---------- testing program end ----------

---------- debug printk() patch start ----------
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -295,6 +295,8 @@ static int zap_process(struct task_struct *start, int exit_code, int flags)
        for_each_thread(start, t) {
                task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
                if (t != current && t->mm) {
+                       printk(KERN_INFO "Setting SIGKILL to %s(%u)\n",
+                              t->comm, t->pid);
                        sigaddset(&t->pending.signal, SIGKILL);
                        signal_wake_up(t, 1);
                        nr++;
---------- debug printk() patch end ----------

---------- kernel log start ----------
[ 4829.770899] a.out[11614]: segfault at 1e1f768 ip 00000000004007be sp 0000000001e1f770 error 6 in a.out[400000+1000]
[ 4829.774190] Setting SIGKILL to a.out(11613)
[ 4829.775954] Setting SIGKILL to a.out(11615)
[ 4829.777191] Setting SIGKILL to a.out(11616)
[ 4829.778381] Setting SIGKILL to a.out(11617)
[ 4829.779537] Setting SIGKILL to a.out(11618)
[ 4829.781057] Setting SIGKILL to a.out(11619)
[ 4829.782236] Setting SIGKILL to a.out(11620)
[ 4829.783401] Setting SIGKILL to a.out(11621)
[ 4829.784569] Setting SIGKILL to a.out(11622)
[ 4829.785700] Setting SIGKILL to a.out(11623)
[ 4829.786848] Setting SIGKILL to a.out(11624)
[ 4829.788001] Setting SIGKILL to a.out(11625)
[ 4829.789132] Setting SIGKILL to a.out(11626)
[ 4829.790332] Setting SIGKILL to a.out(11627)
[ 4829.791593] Setting SIGKILL to a.out(11628)
[ 4829.792941] a.out[11624]: segfault at 1e29808 ip 00000000004007be sp 0000000001e29810 error 6 in a.out[400000+1000]
[ 4829.795493] a.out[11622]: segfault at 1e277e8 ip 00000000004007be sp 0000000001e277f0 error 6
[ 4829.797171] a.out[11623]: segfault at 1e287f8 ip 00000000004007be sp 0000000001e28800 error 6
[ 4829.797545] a.out[11621]: segfault at 1e267d8 ip 00000000004007be sp 0000000001e267e0 error 6
[ 4829.797547] a.out[11618]: segfault at 1e237a8 ip 00000000004007be sp 0000000001e237b0 error 6
[ 4829.797548] a.out[11617]: segfault at 1e22798 ip 00000000004007be sp 0000000001e227a0 error 6
[ 4829.797550] a.out[11619]: segfault at 1e247b8 ip 00000000004007be sp 0000000001e247c0 error 6
[ 4829.797552] a.out[11620]: segfault at 1e257c8 ip 00000000004007be sp 0000000001e257d0 error 6
[ 4829.802631] a.out[11615]: segfault at 1e20778 ip 00000000004007be sp 0000000001e20780 error 6
[ 4829.802633]  in a.out[400000+1000]
[ 4829.802639]  in a.out[400000+1000]
[ 4829.802642]  in a.out[400000+1000]
[ 4829.802655]  in a.out[400000+1000]
[ 4829.802659]  in a.out[400000+1000]
[ 4829.802662]  in a.out[400000+1000]
[ 4829.814605]  in a.out[400000+1000]
[ 4829.819500]  in a.out[400000+1000]
---------- kernel log end ----------

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 12:33                     ` Tetsuo Handa
@ 2015-10-02 13:32                       ` Michal Hocko
  2015-10-02 13:57                         ` Oleg Nesterov
  2015-10-02 14:07                         ` Tetsuo Handa
  0 siblings, 2 replies; 42+ messages in thread
From: Michal Hocko @ 2015-10-02 13:32 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: oleg, akpm, rientjes, kwalker, skozina, linux-kernel

On Fri 02-10-15 21:33:08, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > > Since T sends SIGKILL to all clone(CLONE_VM) tasks upon coredump, P needs
> > > to do
> > 
> > It does that only to all threads in the _same_ thread group AFAIU.
> 
> I'm confused. What the _same_ thread group?
> 
> I can observe that SIGKILL is sent to all
> 
>   clone(CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)
>   clone(CLONE_SIGHAND | CLONE_VM)
>   clone(CLONE_VM)

I might be missing something crucial here but
copy_process has the following:
        if (clone_flags & CLONE_THREAD) {
                p->exit_signal = -1;
                p->group_leader = current->group_leader;
                p->tgid = current->tgid;
        } else {
                if (clone_flags & CLONE_PARENT)
                        p->exit_signal = current->group_leader->exit_signal;
                else
                        p->exit_signal = (clone_flags & CSIGNAL);
                p->group_leader = p;
                p->tgid = p->pid;
        }

So clone without CLONE_THREAD should create a new thread group leader
and so create a new thread group. Unless there is some other trickery
which I do not see right now for_each_thread from the parent task
shouldn't see those which are cloned without CLONE_THREAD.

[...]
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -295,6 +295,8 @@ static int zap_process(struct task_struct *start, int exit_code, int flags)
>         for_each_thread(start, t) {
>                 task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
>                 if (t != current && t->mm) {
> +                       printk(KERN_INFO "Setting SIGKILL to %s(%u)\n",
> +                              t->comm, t->pid);
>                         sigaddset(&t->pending.signal, SIGKILL);
>                         signal_wake_up(t, 1);
>                         nr++;
> ---------- debug printk() patch end ----------

OK, but all your tasks should trigger SEGV. You cannot find out whether
all of them happened from the same zap_process, can you.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 11:32                 ` Tetsuo Handa
  2015-10-02 12:11                   ` Michal Hocko
@ 2015-10-02 13:52                   ` Oleg Nesterov
  2015-10-02 14:36                     ` Tetsuo Handa
  1 sibling, 1 reply; 42+ messages in thread
From: Oleg Nesterov @ 2015-10-02 13:52 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: mhocko, akpm, rientjes, kwalker, skozina, linux-kernel

Tetsuo, sorry, I don't understand your question...

On 10/02, Tetsuo Handa wrote:
>
> Oleg Nesterov wrote:
> > On 10/01, Michal Hocko wrote:
> > >
> > > zap_process will add SIGKILL to all threads but the
> > > current which will go on without being killed and if this is not a
> > > thread group leader then we would miss it.
> >
> > Yes. And note that de_thread() does the same. Speaking of oom-killer
> > this is mostly fine, the execing thread is going to release its old
> > ->mm and it has already passed the copy_strings() stage which can use
> > a lot more memory.
>
> So, we have the same wrong fatal_signal_pending() check in out_of_memory()

Yes, sure, it is not right too. Again, this is even documented in
d003f371b27016354c:

    fatal_signal_pending() can be true because of SIGNAL_GROUP_COREDUMP so
    out_of_memory() and mem_cgroup_out_of_memory() shouldn't blindly trust it.

This is off-topic in a sense that this series only tries to ensure that
if we are going to kill a memory hog we can't miss a process which shares
the same mm (ignoring the OOM_SCORE_ADJ_MIN condition below).

>         /*
>          * If current has a pending SIGKILL or is exiting, then automatically
>          * select it.  The goal is to allow it to allocate so that it may
>          * quickly exit and free its memory.
>          *
>          * But don't select if current has already released its mm and cleared
>          * TIF_MEMDIE flag at exit_mm(), otherwise an OOM livelock may occur.
>          */
>         if (current->mm &&
>             (fatal_signal_pending(current) || task_will_free_mem(current))) {
>                 mark_oom_victim(current);
>                 return true;
>         }
>
> because it is possible that T starts the coredump, T sends SIGKILL to P,
> P calls out_of_memory() on GFP_FS allocation,

yes, and since fatal_signal_pending() == T we do not even check
task_will_free_mem().

> P misses to set SIGKILL on T?
>
> Since T sends SIGKILL to all clone(CLONE_VM) tasks upon coredump, P needs
> to do
>
> [...snip...]

> after mark_oom_victim(current) in case T is not in the same thread group?

I do not see how this depends on "not in the same thread group". This
fatal_signal_pending() doesn't look right in any case.


> If yes, what happens if some task failed to receive SIGKILL due to
> p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN condition?

Oh. This is another issue. I already tried to suggest to remove this
check. But this needs more discussion, hopefully we can do this later.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 13:32                       ` Michal Hocko
@ 2015-10-02 13:57                         ` Oleg Nesterov
  2015-10-02 14:24                           ` Michal Hocko
  2015-10-02 14:07                         ` Tetsuo Handa
  1 sibling, 1 reply; 42+ messages in thread
From: Oleg Nesterov @ 2015-10-02 13:57 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, akpm, rientjes, kwalker, skozina, linux-kernel

On 10/02, Michal Hocko wrote:
>
> So clone without CLONE_THREAD should create a new thread group leader
> and so create a new thread group.

Yes.

> Unless there is some other trickery
> which I do not see right now for_each_thread from the parent task
> shouldn't see those which are cloned without CLONE_THREAD.

Yes.

But I still do not understand what are you talking about, sorry ;)

So let me say just in case that coredump (namely zap_threads()) will
also kill other thread groups with the same ->mm.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 13:32                       ` Michal Hocko
  2015-10-02 13:57                         ` Oleg Nesterov
@ 2015-10-02 14:07                         ` Tetsuo Handa
  2015-10-02 14:15                           ` Oleg Nesterov
  1 sibling, 1 reply; 42+ messages in thread
From: Tetsuo Handa @ 2015-10-02 14:07 UTC (permalink / raw)
  To: mhocko; +Cc: oleg, akpm, rientjes, kwalker, skozina, linux-kernel

Michal Hocko wrote:
> > --- a/fs/coredump.c
> > +++ b/fs/coredump.c
> > @@ -295,6 +295,8 @@ static int zap_process(struct task_struct *start, int exit_code, int flags)
> >         for_each_thread(start, t) {
> >                 task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
> >                 if (t != current && t->mm) {
> > +                       printk(KERN_INFO "Setting SIGKILL to %s(%u)\n",
> > +                              t->comm, t->pid);
> >                         sigaddset(&t->pending.signal, SIGKILL);
> >                         signal_wake_up(t, 1);
> >                         nr++;
> > ---------- debug printk() patch end ----------
> 
> OK, but all your tasks should trigger SEGV. You cannot find out whether
> all of them happened from the same zap_process, can you.

Indeed. I retested with updated patch. Not all of them are killed from
the same zap_process, but all of them are killed by the same SEGV event.
I think that coredump stops all threads sharing the same memory.

---------- debug printk() patch start ----------
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -292,14 +292,18 @@ static int zap_process(struct task_struct *start, int exit_code, int flags)
        start->signal->group_exit_code = exit_code;
        start->signal->group_stop_count = 0;

+       printk(KERN_INFO "%s(%d): Started zap_process()\n", current->comm, current->pid);
        for_each_thread(start, t) {
                task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
                if (t != current && t->mm) {
+                       printk(KERN_INFO "%s(%d): Setting SIGKILL to %s(%u)\n",
+                              current->comm, current->pid, t->comm, t->pid);
                        sigaddset(&t->pending.signal, SIGKILL);
                        signal_wake_up(t, 1);
                        nr++;
                }
        }
+       printk(KERN_INFO "%s(%d): zap_process() returned %d\n", current->comm, current->pid, nr);

        return nr;
 }
---------- debug printk() patch end ----------

---------- kernel log start ----------
[   71.808316] a.out[11057]: segfault at 7ac768 ip 00000000004007be sp 00000000007ac770 error 6 in a.out[400000+1000]
[   71.811003] a.out[11058]: segfault at 7ad778 ip 00000000004007be sp 00000000007ad780 error 6
[   71.811005]  in a.out[400000+1000]
[   71.813817] a.out(11058): Started zap_process()
[   71.813818] a.out(11058): Setting SIGKILL to a.out(11056)
[   71.813841] a.out(11058): Setting SIGKILL to a.out(11057)
[   71.813854] a.out(11058): Setting SIGKILL to a.out(11059)
[   71.813855] a.out(11058): Setting SIGKILL to a.out(11060)
[   71.813855] a.out(11058): Setting SIGKILL to a.out(11061)
[   71.813857] a.out(11058): zap_process() returned 5
[   71.813880] a.out(11058): Started zap_process()
[   71.813880] a.out(11058): Setting SIGKILL to a.out(11062)
[   71.813881] a.out(11058): zap_process() returned 1
[   71.813881] a.out(11058): Started zap_process()
[   71.813882] a.out(11058): Setting SIGKILL to a.out(11063)
[   71.813882] a.out(11058): zap_process() returned 1
[   71.813882] a.out(11058): Started zap_process()
[   71.813883] a.out(11058): Setting SIGKILL to a.out(11064)
[   71.813883] a.out(11058): zap_process() returned 1
[   71.813884] a.out(11058): Started zap_process()
[   71.813884] a.out(11058): Setting SIGKILL to a.out(11065)
[   71.813899] a.out(11058): zap_process() returned 1
[   71.813900] a.out(11058): Started zap_process()
[   71.813900] a.out(11058): Setting SIGKILL to a.out(11066)
[   71.813901] a.out(11058): zap_process() returned 1
[   71.813901] a.out(11058): Started zap_process()
[   71.813902] a.out(11058): Setting SIGKILL to a.out(11067)
[   71.813902] a.out(11058): zap_process() returned 1
[   71.813903] a.out(11058): Started zap_process()
[   71.813904] a.out(11058): Setting SIGKILL to a.out(11068)
[   71.813904] a.out(11058): zap_process() returned 1
[   71.813905] a.out(11058): Started zap_process()
[   71.813905] a.out(11058): Setting SIGKILL to a.out(11069)
[   71.813906] a.out(11058): zap_process() returned 1
[   71.813906] a.out(11058): Started zap_process()
[   71.813907] a.out(11058): Setting SIGKILL to a.out(11070)
[   71.813907] a.out(11058): zap_process() returned 1
[   71.813908] a.out(11058): Started zap_process()
[   71.813908] a.out(11058): Setting SIGKILL to a.out(11071)
[   71.813908] a.out(11058): zap_process() returned 1
[   71.813925] a.out[11068]: segfault at 7b7818 ip 00000000004007be sp 00000000007b7820 error 6 in a.out[400000+1000]
[   71.813938] a.out[11063]: segfault at 7b27c8 ip 00000000004007be sp 00000000007b27d0 error 6
[   71.813940] a.out[11064]: segfault at 7b37d8 ip 00000000004007be sp 00000000007b37e0 error 6
[   71.813941] a.out[11066]: segfault at 7b57f8 ip 00000000004007be sp 00000000007b5800 error 6
[   71.813943] a.out[11060]: segfault at 7af798 ip 00000000004007be sp 00000000007af7a0 error 6
[   71.813945] a.out[11062]: segfault at 7b17b8 ip 00000000004007be sp 00000000007b17c0 error 6
[   71.813946] a.out[11067]: segfault at 7b6808 ip 00000000004007be sp 00000000007b6810 error 6
[   71.813994] a.out[11070]: segfault at 7b9838 ip 00000000004007be sp 00000000007b9840 error 6
[   71.813995]  in a.out[400000+1000]
[   71.813998]  in a.out[400000+1000]
[   71.814002]  in a.out[400000+1000]
[   71.814004]  in a.out[400000+1000]
[   71.814008]  in a.out[400000+1000]
[   71.814011]  in a.out[400000+1000]
[   71.814015]  in a.out[400000+1000]
---------- kernel log end ----------

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 14:07                         ` Tetsuo Handa
@ 2015-10-02 14:15                           ` Oleg Nesterov
  0 siblings, 0 replies; 42+ messages in thread
From: Oleg Nesterov @ 2015-10-02 14:15 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: mhocko, akpm, rientjes, kwalker, skozina, linux-kernel

On 10/02, Tetsuo Handa wrote:
>
> Indeed. I retested with updated patch. Not all of them are killed from
> the same zap_process, but all of them are killed by the same SEGV event.
> I think that coredump stops all threads sharing the same memory.

Yes sure. Please see other emails.

Oleg.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 13:57                         ` Oleg Nesterov
@ 2015-10-02 14:24                           ` Michal Hocko
  0 siblings, 0 replies; 42+ messages in thread
From: Michal Hocko @ 2015-10-02 14:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tetsuo Handa, akpm, rientjes, kwalker, skozina, linux-kernel

On Fri 02-10-15 15:57:56, Oleg Nesterov wrote:
> On 10/02, Michal Hocko wrote:
> >
> > So clone without CLONE_THREAD should create a new thread group leader
> > and so create a new thread group.
> 
> Yes.
> 
> > Unless there is some other trickery
> > which I do not see right now for_each_thread from the parent task
> > shouldn't see those which are cloned without CLONE_THREAD.
> 
> Yes.
> 
> But I still do not understand what are you talking about, sorry ;)

yes this whole thing was off-topic and I am sorry to add more confusion
to it.

> So let me say just in case that coredump (namely zap_threads()) will
> also kill other thread groups with the same ->mm.

Yes, I wanted to make sure to clarify what the thread_group is supposed
to mean here but I only made it more confusing now that I am reading the
whole thing again. Sorry about that.

Calling processes which only share the mm as threads is confusing and
only obscures the discussion.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()
  2015-10-02 13:52                   ` Oleg Nesterov
@ 2015-10-02 14:36                     ` Tetsuo Handa
  0 siblings, 0 replies; 42+ messages in thread
From: Tetsuo Handa @ 2015-10-02 14:36 UTC (permalink / raw)
  To: oleg; +Cc: mhocko, akpm, rientjes, kwalker, skozina, linux-kernel

Oleg Nesterov wrote:
> Tetsuo, sorry, I don't understand your question...
> 
> > because it is possible that T starts the coredump, T sends SIGKILL to P,
> > P calls out_of_memory() on GFP_FS allocation,
> 
> yes, and since fatal_signal_pending() == T we do not even check
> task_will_free_mem().
> 
> > P misses to set SIGKILL on T?
> >
> > Since T sends SIGKILL to all clone(CLONE_VM) tasks upon coredump, P needs
> > to do
> >
> > [...snip...]
> 
> > after mark_oom_victim(current) in case T is not in the same thread group?
> 
> I do not see how this depends on "not in the same thread group". This
> fatal_signal_pending() doesn't look right in any case.

You already answered my question. ;-)
You confirmed this is a possible silent hang up path (I mean, hang up
without OOM killer messages).

> > If yes, what happens if some task failed to receive SIGKILL due to
> > p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN condition?
> 
> Oh. This is another issue. I already tried to suggest to remove this
> check. But this needs more discussion, hopefully we can do this later.

OK.

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2015-10-02 14:37 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-29 14:17 [PATCH -mm 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
2015-09-29 14:18 ` [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() Oleg Nesterov
2015-09-29 22:36   ` David Rientjes
2015-09-30  1:42     ` Tetsuo Handa
2015-09-30 13:47       ` Oleg Nesterov
2015-09-30 15:20         ` [PATCH -mm 1/3] mm/oom_kill: remove the wrongfatal_signal_pending() Tetsuo Handa
2015-09-30 13:43     ` [PATCH -mm 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() Oleg Nesterov
2015-09-29 14:18 ` [PATCH -mm 2/3] mm/oom_kill: cleanup the "kill sharing same memory" Oleg Nesterov
2015-09-29 22:39   ` David Rientjes
2015-09-30 13:49     ` Oleg Nesterov
2015-09-29 14:18 ` [PATCH -mm 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in Oleg Nesterov
2015-09-29 22:41   ` David Rientjes
2015-09-30 13:50     ` Oleg Nesterov
2015-09-30  2:16   ` Tetsuo Handa
2015-09-30 13:59     ` Oleg Nesterov
2015-09-30 18:23 ` [PATCH -mm v2 0/3] mm/oom_kill: ensure we actually kill all tasks sharing the same mm Oleg Nesterov
2015-09-30 18:24   ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process() Oleg Nesterov
2015-09-30 21:14     ` David Rientjes
2015-10-01 10:52       ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()check " Tetsuo Handa
2015-10-01 12:49     ` [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check " Michal Hocko
2015-10-01 15:00       ` Oleg Nesterov
2015-10-01 15:27         ` Michal Hocko
2015-10-01 15:41           ` Oleg Nesterov
2015-10-01 16:19             ` Michal Hocko
2015-10-01 17:53               ` Oleg Nesterov
2015-10-02 11:32                 ` Tetsuo Handa
2015-10-02 12:11                   ` Michal Hocko
2015-10-02 12:33                     ` Tetsuo Handa
2015-10-02 13:32                       ` Michal Hocko
2015-10-02 13:57                         ` Oleg Nesterov
2015-10-02 14:24                           ` Michal Hocko
2015-10-02 14:07                         ` Tetsuo Handa
2015-10-02 14:15                           ` Oleg Nesterov
2015-10-02 13:52                   ` Oleg Nesterov
2015-10-02 14:36                     ` Tetsuo Handa
2015-09-30 18:24   ` [PATCH -mm v2 2/3] mm/oom_kill: cleanup the "kill sharing same memory" loop Oleg Nesterov
2015-09-30 21:15     ` David Rientjes
2015-10-01 12:50     ` Michal Hocko
2015-09-30 18:24   ` [PATCH -mm v2 3/3] mm/oom_kill: fix the wrong task->mm == mm checks in oom_kill_process() Oleg Nesterov
2015-09-30 21:15     ` David Rientjes
2015-10-01 12:56     ` Michal Hocko
2015-10-01 22:24     ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).