From: David Rientjes <rientjes@google.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Nick Piggin <npiggin@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Lubos Lunak <l.lunak@suse.cz>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch 3/7 -mm] oom: select task from tasklist for mempolicy ooms
Date: Mon, 15 Feb 2010 14:11:44 -0800 (PST)
Message-ID: <alpine.DEB.2.00.1002151407000.26927@chino.kir.corp.google.com>
In-Reply-To: <20100215120924.7281.A69D9226@jp.fujitsu.com>

On Mon, 15 Feb 2010, KOSAKI Motohiro wrote:

> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1638,6 +1638,45 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
> >  }
> >  #endif
> >  
> > +/*
> > + * mempolicy_nodemask_intersects
> > + *
> > + * If tsk's mempolicy is "default" [NULL], return 'true' to indicate default
> > + * policy.  Otherwise, check for intersection between mask and the policy
> > + * nodemask for 'bind' or 'interleave' policy, or mask to contain the single
> > + * node for 'preferred' or 'local' policy.
> > + */
> > +bool mempolicy_nodemask_intersects(struct task_struct *tsk,
> > +					const nodemask_t *mask)
> > +{
> > +	struct mempolicy *mempolicy;
> > +	bool ret = true;
> > +
> > +	mempolicy = tsk->mempolicy;
> > +	mpol_get(mempolicy);
> 
> Why is this refcount increment necessary? mempolicy is grabbed by tsk,
> IOW it never be freed in this function.
> 

We need to take a reference on the mempolicy to ensure it doesn't get freed
from under us; tsk is not necessarily current.
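
As a rough userspace illustration of that pattern (not kernel code), the
sketch below pins an object that belongs to someone else before inspecting
it.  The names struct policy, policy_new(), policy_get(), policy_put(),
policy_mode_pinned() and policies_freed are all hypothetical stand-ins, not
the mempolicy API.  It also assumes the caller already has a stable pointer;
making the pointer load and the increment safe against the owner dropping
its reference concurrently needs additional synchronization that is not
shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a refcounted policy object. */
struct policy {
	atomic_int refcnt;
	int mode;
};

/* Bookkeeping for the example only: counts objects that were freed. */
int policies_freed;

/* Allocate a policy holding one reference (the owner's). */
struct policy *policy_new(int mode)
{
	struct policy *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	atomic_init(&p->refcnt, 1);
	p->mode = mode;
	return p;
}

/* Take an extra reference; a NULL policy is tolerated. */
void policy_get(struct policy *p)
{
	if (p)
		atomic_fetch_add(&p->refcnt, 1);
}

/* Drop a reference and free the object when the last one goes away. */
void policy_put(struct policy *p)
{
	int old;

	if (!p)
		return;
	old = atomic_fetch_sub(&p->refcnt, 1);
	assert(old > 0);
	if (old == 1) {
		free(p);
		policies_freed++;
	}
}

/* Read the mode of a policy we do not own while holding our own reference. */
int policy_mode_pinned(struct policy *p)
{
	int mode;

	policy_get(p);
	mode = p ? p->mode : -1;	/* -1 stands for "no policy" here */
	policy_put(p);
	return mode;
}
```

With the extra reference held, the owner dropping its own reference in the
meantime only decrements the count; the object is freed when the reader
puts its reference back.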

> 
> > +	if (!mask || !mempolicy)
> > +		goto out;
> > +
> > +	switch (mempolicy->mode) {
> > +	case MPOL_PREFERRED:
> > +		if (mempolicy->flags & MPOL_F_LOCAL)
> > +			ret = node_isset(numa_node_id(), *mask);
> 
> Um? Is this good heuristic?
> The task can migrate various cpus, then "node_isset(numa_node_id(), *mask) == 0"
> doesn't mean the task doesn't consume *mask's memory.
> 

For MPOL_F_LOCAL, we need to check whether the task's cpu is on a node
that is allowed by the zonelist passed to the page allocator.  In the
second revision of this patchset, the check was changed to

	node_isset(cpu_to_node(task_cpu(tsk)), *mask)

It is possible that no memory has been allocated on that node and tsk just
happens to be running on it momentarily, but given the mempolicy, it's the
best indication we have of whether killing the task may lead to future
memory freeing.
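
To make the difference from numa_node_id() concrete, here is a small
userspace model of the check.  Everything in it is an assumption made for
illustration: the ex_ names, the fixed 8-cpu/2-node topology table, and the
plain bitmask standing in for nodemask_t.  The point is only that the node
is derived from the cpu the candidate task last ran on, not from the cpu
executing the oom killer.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_NR_CPUS 8

/* Hypothetical topology: cpus 0-3 sit on node 0, cpus 4-7 on node 1. */
const int ex_cpu_to_node[EX_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Minimal stand-in for a task: only the cpu it is running on. */
struct ex_task {
	int cpu;
};

/* One bit per node; a simplified stand-in for nodemask_t. */
typedef unsigned long ex_nodemask_t;

bool ex_node_isset(int node, ex_nodemask_t mask)
{
	return (mask >> node) & 1UL;
}

/*
 * For a "local" policy, report whether the node of the task's cpu is
 * within the allowed mask.
 */
bool ex_local_policy_intersects(const struct ex_task *tsk, ex_nodemask_t mask)
{
	assert(tsk->cpu >= 0 && tsk->cpu < EX_NR_CPUS);
	return ex_node_isset(ex_cpu_to_node[tsk->cpu], mask);
}
```

A task on cpu 5 is considered eligible for an allocation restricted to
node 1 and ineligible for one restricted to node 0, regardless of where the
caller itself is running.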

> > @@ -660,24 +683,18 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
> >  	 */
> >  	constraint = constrained_alloc(zonelist, gfp_mask, nodemask);
> >  	read_lock(&tasklist_lock);
> > -
> > -	switch (constraint) {
> > -	case CONSTRAINT_MEMORY_POLICY:
> > -		oom_kill_process(current, gfp_mask, order, 0, NULL,
> > -				"No available memory (MPOL_BIND)");
> > -		break;
> > -
> > -	case CONSTRAINT_NONE:
> > -		if (sysctl_panic_on_oom) {
> > +	if (unlikely(sysctl_panic_on_oom)) {
> > +		/*
> > +		 * panic_on_oom only affects CONSTRAINT_NONE, the kernel
> > +		 * should not panic for cpuset or mempolicy induced memory
> > +		 * failures.
> > +		 */
> > +		if (constraint == CONSTRAINT_NONE) {
> >  			dump_header(NULL, gfp_mask, order, NULL);
> > -			panic("out of memory. panic_on_oom is selected\n");
> > +			panic("Out of memory: panic_on_oom is enabled\n");
> 
> enabled? Its feature is enabled at boot time. triggered? or fired?
> 

The panic_on_oom sysctl is "enabled" if it is set to non-zero; that's the
word used throughout Documentation/sysctl/vm.txt to describe whether a
sysctl is in use.
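
For reference, the decision in the quoted hunk can be modelled in userspace
roughly as follows.  The ex_ names are hypothetical, and the sketch covers
only the branch visible in the hunk above (non-zero sysctl combined with an
unconstrained allocation); it does not try to model anything else that
out_of_memory() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the constraint kinds named in the hunk. */
enum ex_constraint {
	EX_CONSTRAINT_NONE,
	EX_CONSTRAINT_CPUSET,
	EX_CONSTRAINT_MEMORY_POLICY,
};

/*
 * Panic only when the sysctl is enabled (non-zero) and the failing
 * allocation was not restricted by a cpuset or a mempolicy.
 */
bool ex_should_panic(int panic_on_oom, enum ex_constraint constraint)
{
	assert(panic_on_oom >= 0);	/* the sketch assumes a non-negative setting */

	if (!panic_on_oom)
		return false;
	return constraint == EX_CONSTRAINT_NONE;
}
```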


