All of lore.kernel.org
 help / color / mirror / Atom feed
From: Minchan Kim <minchan.kim@gmail.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, Oleg Nesterov <oleg@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Nick Piggin <npiggin@suse.de>
Subject: Re: [PATCH 3/5] oom: introduce find_lock_task_mm() to fix !mm false positives
Date: Thu, 3 Jun 2010 01:05:13 +0900	[thread overview]
Message-ID: <20100602160513.GC5326@barrios-desktop> (raw)
In-Reply-To: <20100531183539.1849.A69D9226@jp.fujitsu.com>

On Mon, May 31, 2010 at 06:36:34PM +0900, KOSAKI Motohiro wrote:
> From: Oleg Nesterov <oleg@redhat.com>
> Subject: [PATCH 3/5] oom: introduce find_lock_task_mm() to fix !mm false positives
> 
> Almost all ->mm == NUL checks in oom_kill.c are wrong.
> 
> The current code assumes that the task without ->mm has already
> released its memory and ignores the process. However this is not
> necessarily true when this process is multithreaded, other live
> sub-threads can use this ->mm.
> 
> - Remove the "if (!p->mm)" check in select_bad_process(), it is
>   just wrong.
> 
> - Add the new helper, find_lock_task_mm(), which finds the live
>   thread which uses the memory and takes task_lock() to pin ->mm
> 
> - change oom_badness() to use this helper instead of just checking
>   ->mm != NULL.
> 
> - As David pointed out, select_bad_process() must never choose the
>   task without ->mm, but no matter what badness() returns the
>   task can be chosen if nothing else has been found yet.
> 
> Note! This patch is not enough, we need more changes.
> 
> 	- badness() was fixed, but oom_kill_task() still ignores
> 	  the task without ->mm
> 
> This will be addressed later.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Cc: David Rientjes <rientjes@google.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [rebase
> latest -mm and remove some obsoleted description]
Reviewed-by: Minchan Kim <minchan.kim@gmail.com?

Good catch but I have a nitpick. :)

find_lock_task_mm isn't good name of the function, I think. 
As you know, original goal of the function is to find sub-thread of p
which is alive(ie, doesn't release mm). 

task_lock is important for user of the function but minor.

I suggest following as 
/*
 * If we find alive thread of process, it returns task_struct of sub thread.
 * Notice. this function calls task_lock. So caller should call task_unlock.
 */
static struct task_struct *find_alive_subthread(struct task_struct *process)
{
... 
}

I don't forced my suggesion if you suggest much good name.
Regardless of accepting my suggestion, looks good to me.

> ---
>  mm/oom_kill.c |   28 +++++++++++++++++-----------
>  1 files changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index c87a6f4..162af2e 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -52,6 +52,19 @@ static int has_intersects_mems_allowed(struct task_struct *tsk)
>  	return 0;
>  }
>  
> +static struct task_struct *find_lock_task_mm(struct task_struct *p)
> +{
> +	struct task_struct *t = p;
> +	do {
> +		task_lock(t);
> +		if (likely(t->mm))
> +			return t;
> +		task_unlock(t);
> +	} while_each_thread(p, t);
> +
> +	return NULL;
> +}
> +
>  /**
>   * badness - calculate a numeric value for how bad this task has been
>   * @p: task struct of which task we should calculate
> @@ -74,7 +87,6 @@ static int has_intersects_mems_allowed(struct task_struct *tsk)
>  unsigned long badness(struct task_struct *p, unsigned long uptime)
>  {
>  	unsigned long points, cpu_time, run_time;
> -	struct mm_struct *mm;
>  	struct task_struct *child;
>  	int oom_adj = p->signal->oom_adj;
>  	struct task_cputime task_time;
> @@ -84,17 +96,14 @@ unsigned long badness(struct task_struct *p, unsigned long uptime)
>  	if (oom_adj == OOM_DISABLE)
>  		return 0;
>  
> -	task_lock(p);
> -	mm = p->mm;
> -	if (!mm) {
> -		task_unlock(p);
> +	p = find_lock_task_mm(p);
> +	if (!p)
>  		return 0;
> -	}
>  
>  	/*
>  	 * The memory size of the process is the basis for the badness.
>  	 */
> -	points = mm->total_vm;
> +	points = p->mm->total_vm;
>  
>  	/*
>  	 * After this unlock we can no longer dereference local variable `mm'
> @@ -117,7 +126,7 @@ unsigned long badness(struct task_struct *p, unsigned long uptime)
>  	 */
>  	list_for_each_entry(child, &p->children, sibling) {
>  		task_lock(child);
> -		if (child->mm != mm && child->mm)
> +		if (child->mm != p->mm && child->mm)
>  			points += child->mm->total_vm/2 + 1;
>  		task_unlock(child);
>  	}
> @@ -256,9 +265,6 @@ static struct task_struct *select_bad_process(unsigned long *ppoints,
>  	for_each_process(p) {
>  		unsigned long points;
>  
> -		/* skip the tasks which have already released their mm. */
> -		if (!p->mm)
> -			continue;
>  		/* skip the init task and kthreads */
>  		if (is_global_init(p) || (p->flags & PF_KTHREAD))
>  			continue;
> -- 
> 1.6.5.2
> 
> 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

WARNING: multiple messages have this Message-ID (diff)
From: Minchan Kim <minchan.kim@gmail.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, Oleg Nesterov <oleg@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Nick Piggin <npiggin@suse.de>
Subject: Re: [PATCH 3/5] oom: introduce find_lock_task_mm() to fix !mm false positives
Date: Thu, 3 Jun 2010 01:05:13 +0900	[thread overview]
Message-ID: <20100602160513.GC5326@barrios-desktop> (raw)
In-Reply-To: <20100531183539.1849.A69D9226@jp.fujitsu.com>

On Mon, May 31, 2010 at 06:36:34PM +0900, KOSAKI Motohiro wrote:
> From: Oleg Nesterov <oleg@redhat.com>
> Subject: [PATCH 3/5] oom: introduce find_lock_task_mm() to fix !mm false positives
> 
> Almost all ->mm == NUL checks in oom_kill.c are wrong.
> 
> The current code assumes that the task without ->mm has already
> released its memory and ignores the process. However this is not
> necessarily true when this process is multithreaded, other live
> sub-threads can use this ->mm.
> 
> - Remove the "if (!p->mm)" check in select_bad_process(), it is
>   just wrong.
> 
> - Add the new helper, find_lock_task_mm(), which finds the live
>   thread which uses the memory and takes task_lock() to pin ->mm
> 
> - change oom_badness() to use this helper instead of just checking
>   ->mm != NULL.
> 
> - As David pointed out, select_bad_process() must never choose the
>   task without ->mm, but no matter what badness() returns the
>   task can be chosen if nothing else has been found yet.
> 
> Note! This patch is not enough, we need more changes.
> 
> 	- badness() was fixed, but oom_kill_task() still ignores
> 	  the task without ->mm
> 
> This will be addressed later.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Cc: David Rientjes <rientjes@google.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [rebase
> latest -mm and remove some obsoleted description]
Reviewed-by: Minchan Kim <minchan.kim@gmail.com?

Good catch but I have a nitpick. :)

find_lock_task_mm isn't good name of the function, I think. 
As you know, original goal of the function is to find sub-thread of p
which is alive(ie, doesn't release mm). 

task_lock is important for user of the function but minor.

I suggest following as 
/*
 * If we find alive thread of process, it returns task_struct of sub thread.
 * Notice. this function calls task_lock. So caller should call task_unlock.
 */
static struct task_struct *find_alive_subthread(struct task_struct *process)
{
... 
}

I don't forced my suggesion if you suggest much good name.
Regardless of accepting my suggestion, looks good to me.

> ---
>  mm/oom_kill.c |   28 +++++++++++++++++-----------
>  1 files changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index c87a6f4..162af2e 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -52,6 +52,19 @@ static int has_intersects_mems_allowed(struct task_struct *tsk)
>  	return 0;
>  }
>  
> +static struct task_struct *find_lock_task_mm(struct task_struct *p)
> +{
> +	struct task_struct *t = p;
> +	do {
> +		task_lock(t);
> +		if (likely(t->mm))
> +			return t;
> +		task_unlock(t);
> +	} while_each_thread(p, t);
> +
> +	return NULL;
> +}
> +
>  /**
>   * badness - calculate a numeric value for how bad this task has been
>   * @p: task struct of which task we should calculate
> @@ -74,7 +87,6 @@ static int has_intersects_mems_allowed(struct task_struct *tsk)
>  unsigned long badness(struct task_struct *p, unsigned long uptime)
>  {
>  	unsigned long points, cpu_time, run_time;
> -	struct mm_struct *mm;
>  	struct task_struct *child;
>  	int oom_adj = p->signal->oom_adj;
>  	struct task_cputime task_time;
> @@ -84,17 +96,14 @@ unsigned long badness(struct task_struct *p, unsigned long uptime)
>  	if (oom_adj == OOM_DISABLE)
>  		return 0;
>  
> -	task_lock(p);
> -	mm = p->mm;
> -	if (!mm) {
> -		task_unlock(p);
> +	p = find_lock_task_mm(p);
> +	if (!p)
>  		return 0;
> -	}
>  
>  	/*
>  	 * The memory size of the process is the basis for the badness.
>  	 */
> -	points = mm->total_vm;
> +	points = p->mm->total_vm;
>  
>  	/*
>  	 * After this unlock we can no longer dereference local variable `mm'
> @@ -117,7 +126,7 @@ unsigned long badness(struct task_struct *p, unsigned long uptime)
>  	 */
>  	list_for_each_entry(child, &p->children, sibling) {
>  		task_lock(child);
> -		if (child->mm != mm && child->mm)
> +		if (child->mm != p->mm && child->mm)
>  			points += child->mm->total_vm/2 + 1;
>  		task_unlock(child);
>  	}
> @@ -256,9 +265,6 @@ static struct task_struct *select_bad_process(unsigned long *ppoints,
>  	for_each_process(p) {
>  		unsigned long points;
>  
> -		/* skip the tasks which have already released their mm. */
> -		if (!p->mm)
> -			continue;
>  		/* skip the init task and kthreads */
>  		if (is_global_init(p) || (p->flags & PF_KTHREAD))
>  			continue;
> -- 
> 1.6.5.2
> 
> 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2010-06-02 16:05 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-31  9:33 [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads KOSAKI Motohiro
2010-05-31  9:33 ` KOSAKI Motohiro
2010-05-31  9:35 ` [PATCH 2/5] oom: select_bad_process: PF_EXITING check should take ->mm into account KOSAKI Motohiro
2010-05-31  9:35   ` KOSAKI Motohiro
2010-05-31 16:43   ` Oleg Nesterov
2010-05-31 16:43     ` Oleg Nesterov
2010-06-01  1:10     ` KOSAKI Motohiro
2010-06-01  1:10       ` KOSAKI Motohiro
2010-06-01 20:18       ` Oleg Nesterov
2010-06-01 20:18         ` Oleg Nesterov
2010-06-02 13:54         ` [PATCH] oom: remove PF_EXITING check completely KOSAKI Motohiro
2010-06-02 13:54           ` KOSAKI Motohiro
2010-06-02 15:54           ` Oleg Nesterov
2010-06-02 15:54             ` Oleg Nesterov
2010-06-02 21:02             ` David Rientjes
2010-06-02 21:02               ` David Rientjes
2010-06-03  4:48               ` KOSAKI Motohiro
2010-06-03  4:48                 ` KOSAKI Motohiro
2010-06-03  6:29                 ` David Rientjes
2010-06-03  6:29                   ` David Rientjes
2010-06-02 13:54         ` [PATCH] oom: Make coredump interruptible KOSAKI Motohiro
2010-06-02 13:54           ` KOSAKI Motohiro
2010-06-02 15:42           ` Oleg Nesterov
2010-06-02 15:42             ` Oleg Nesterov
2010-06-02 17:29             ` Roland McGrath
2010-06-02 17:29               ` Roland McGrath
2010-06-02 17:53               ` Oleg Nesterov
2010-06-02 17:53                 ` Oleg Nesterov
2010-06-02 18:58                 ` Roland McGrath
2010-06-02 18:58                   ` Roland McGrath
2010-06-02 20:38                   ` Oleg Nesterov
2010-06-02 20:38                     ` Oleg Nesterov
2010-06-03 14:03                     ` Oleg Nesterov
2010-06-03 14:03                       ` Oleg Nesterov
2010-06-04 10:54                     ` KOSAKI Motohiro
2010-06-04 10:54                       ` KOSAKI Motohiro
2010-06-04 11:27                       ` Oleg Nesterov
2010-06-04 11:27                         ` Oleg Nesterov
2010-06-04 11:34                         ` Oleg Nesterov
2010-06-04 11:34                           ` Oleg Nesterov
2010-06-09 19:53                         ` Oleg Nesterov
2010-06-09 19:53                           ` Oleg Nesterov
2010-06-09 20:41                           ` David Rientjes
2010-06-09 20:41                             ` David Rientjes
2010-06-09 21:03                             ` Oleg Nesterov
2010-06-09 21:03                               ` Oleg Nesterov
2010-06-13 11:24                           ` KOSAKI Motohiro
2010-06-13 11:24                             ` KOSAKI Motohiro
2010-06-13 15:53                             ` Oleg Nesterov
2010-06-13 15:53                               ` Oleg Nesterov
2010-06-13 17:13                               ` uninterruptible CLONE_VFORK (Was: oom: Make coredump interruptible) Oleg Nesterov
2010-06-13 17:13                                 ` Oleg Nesterov
2010-06-14  0:56                                 ` Roland McGrath
2010-06-14  0:56                                   ` Roland McGrath
2010-06-14 16:33                                   ` Oleg Nesterov
2010-06-14 16:33                                     ` Oleg Nesterov
2010-06-14 19:17                                     ` Roland McGrath
2010-06-14 19:17                                       ` Roland McGrath
2010-06-28 17:33                                       ` Oleg Nesterov
2010-06-28 17:33                                         ` Oleg Nesterov
2010-06-28 18:04                                         ` Roland McGrath
2010-06-28 18:04                                           ` Roland McGrath
2010-06-14  0:36                               ` [PATCH] oom: Make coredump interruptible Roland McGrath
2010-06-14  0:36                                 ` Roland McGrath
2010-06-14  0:26                     ` Roland McGrath
2010-06-14  0:26                       ` Roland McGrath
2010-06-01 20:39   ` [PATCH 2/5] oom: select_bad_process: PF_EXITING check should take ->mm into account David Rientjes
2010-06-01 20:39     ` David Rientjes
2010-05-31  9:36 ` [PATCH 3/5] oom: introduce find_lock_task_mm() to fix !mm false positives KOSAKI Motohiro
2010-05-31  9:36   ` KOSAKI Motohiro
2010-06-01  0:57   ` KAMEZAWA Hiroyuki
2010-06-01  0:57     ` KAMEZAWA Hiroyuki
2010-06-01 20:42   ` David Rientjes
2010-06-01 20:42     ` David Rientjes
2010-06-02 16:05   ` Minchan Kim [this message]
2010-06-02 16:05     ` Minchan Kim
2010-05-31  9:37 ` [PATCH 4/5] oom: the points calculation of child processes must use find_lock_task_mm() too KOSAKI Motohiro
2010-05-31  9:37   ` KOSAKI Motohiro
2010-05-31 16:56   ` Oleg Nesterov
2010-05-31 16:56     ` Oleg Nesterov
2010-05-31 23:48     ` KOSAKI Motohiro
2010-05-31 23:48       ` KOSAKI Motohiro
2010-05-31  9:38 ` [PATCH 5/5] oom: __oom_kill_task() " KOSAKI Motohiro
2010-05-31  9:38   ` KOSAKI Motohiro
2010-06-01  1:02   ` KAMEZAWA Hiroyuki
2010-06-01  1:02     ` KAMEZAWA Hiroyuki
2010-06-01 20:44   ` David Rientjes
2010-06-01 20:44     ` David Rientjes
2010-06-01  0:54 ` [PATCH 1/5] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads KAMEZAWA Hiroyuki
2010-06-01  0:54   ` KAMEZAWA Hiroyuki
2010-06-01 20:36 ` David Rientjes
2010-06-01 20:36   ` David Rientjes
2010-06-01 21:20   ` Oleg Nesterov
2010-06-01 21:20     ` Oleg Nesterov
2010-06-01 21:26     ` David Rientjes
2010-06-01 21:26       ` David Rientjes
2010-06-02 13:54       ` KOSAKI Motohiro
2010-06-02 13:54         ` KOSAKI Motohiro
2010-06-02 21:09         ` David Rientjes
2010-06-02 21:09           ` David Rientjes
2010-06-02 21:33           ` Oleg Nesterov
2010-06-02 21:33             ` Oleg Nesterov
2010-06-02 21:46             ` David Rientjes
2010-06-02 21:46               ` David Rientjes
2010-06-03 14:27               ` Oleg Nesterov
2010-06-03 14:27                 ` Oleg Nesterov
2010-06-03 20:11                 ` David Rientjes
2010-06-03 20:11                   ` David Rientjes
2010-06-02 15:32 ` Minchan Kim
2010-06-02 15:32   ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100602160513.GC5326@barrios-desktop \
    --to=minchan.kim@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    --cc=oleg@redhat.com \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.