From: David Rientjes <rientjes@google.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Nick Piggin <npiggin@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Lubos Lunak <l.lunak@suse.cz>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch 3/7 -mm] oom: select task from tasklist for mempolicy ooms
Date: Mon, 15 Feb 2010 14:11:44 -0800 (PST)
Message-ID: <alpine.DEB.2.00.1002151407000.26927@chino.kir.corp.google.com>
In-Reply-To: <20100215120924.7281.A69D9226@jp.fujitsu.com>

On Mon, 15 Feb 2010, KOSAKI Motohiro wrote:

> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1638,6 +1638,45 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
> >  }
> >  #endif
> >
> > +/*
> > + * mempolicy_nodemask_intersects
> > + *
> > + * If tsk's mempolicy is "default" [NULL], return 'true' to indicate default
> > + * policy. Otherwise, check for intersection between mask and the policy
> > + * nodemask for 'bind' or 'interleave' policy, or mask to contain the single
> > + * node for 'preferred' or 'local' policy.
> > + */
> > +bool mempolicy_nodemask_intersects(struct task_struct *tsk,
> > +					const nodemask_t *mask)
> > +{
> > +	struct mempolicy *mempolicy;
> > +	bool ret = true;
> > +
> > +	mempolicy = tsk->mempolicy;
> > +	mpol_get(mempolicy);
>
> Why is this refcount increment necessary? mempolicy is grabbed by tsk,
> IOW it never be freed in this function.
>

We need to get a refcount on the mempolicy to ensure it doesn't get freed
from under us, tsk is not necessarily current.

>
> > +	if (!mask || !mempolicy)
> > +		goto out;
> > +
> > +	switch (mempolicy->mode) {
> > +	case MPOL_PREFERRED:
> > +		if (mempolicy->flags & MPOL_F_LOCAL)
> > +			ret = node_isset(numa_node_id(), *mask);
>
> Um? Is this good heuristic?
> The task can migrate various cpus, then "node_isset(numa_node_id(), *mask) == 0"
> doesn't mean the task doesn't consume *mask's memory.
>

For MPOL_F_LOCAL, we need to check whether the task's cpu is on a node
that is allowed by the zonelist passed to the page allocator. In the
second revision of this patchset, this was changed to

	node_isset(cpu_to_node(task_cpu(tsk)), *mask)

to check. It would be possible for no memory to have been allocated on
that node and it just happens that the tsk is running on it momentarily,
but it's the best indication we have given the mempolicy of whether
killing a task may lead to future memory freeing.

> > @@ -660,24 +683,18 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
> >  	 */
> >  	constraint = constrained_alloc(zonelist, gfp_mask, nodemask);
> >  	read_lock(&tasklist_lock);
> > -
> > -	switch (constraint) {
> > -	case CONSTRAINT_MEMORY_POLICY:
> > -		oom_kill_process(current, gfp_mask, order, 0, NULL,
> > -				"No available memory (MPOL_BIND)");
> > -		break;
> > -
> > -	case CONSTRAINT_NONE:
> > -		if (sysctl_panic_on_oom) {
> > +	if (unlikely(sysctl_panic_on_oom)) {
> > +		/*
> > +		 * panic_on_oom only affects CONSTRAINT_NONE, the kernel
> > +		 * should not panic for cpuset or mempolicy induced memory
> > +		 * failures.
> > +		 */
> > +		if (constraint == CONSTRAINT_NONE) {
> >  			dump_header(NULL, gfp_mask, order, NULL);
> > -			panic("out of memory. panic_on_oom is selected\n");
> > +			panic("Out of memory: panic_on_oom is enabled\n");
>
> enabled? Its feature is enabled at boot time. triggered? or fired?
>

The panic_on_oom sysctl is "enabled" if it is set to non-zero; that's the
word used throughout Documentation/sysctl/vm.txt to describe when a sysctl
is being used or not.
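The policy check discussed above can be sketched in userspace as follows.
This is a minimal model under stated assumptions, not the kernel code:
every type and function name here (nodemask_model_t, mempolicy_model,
task_model, policy_intersects_model) is hypothetical, a nodemask is
reduced to a 64-bit bitmap, the cpu_node field stands in for
cpu_to_node(task_cpu(tsk)), and the mpol_get() reference counting from
the patch is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: bit n set means node n is allowed. */
typedef uint64_t nodemask_model_t;

/* Hypothetical stand-ins for the MPOL_* modes named in the patch comment. */
enum mpol_mode_model {
	MODEL_PREFERRED,
	MODEL_BIND,
	MODEL_INTERLEAVE
};

struct mempolicy_model {
	enum mpol_mode_model mode;
	bool local;              /* models the MPOL_F_LOCAL flag */
	int preferred_node;      /* single node for 'preferred' when !local */
	nodemask_model_t nodes;  /* policy nodemask for 'bind'/'interleave' */
};

struct task_model {
	const struct mempolicy_model *mempolicy; /* NULL models "default" */
	int cpu_node; /* models cpu_to_node(task_cpu(tsk)) */
};

/* True if 'node' is a valid bit index and is set in 'mask'. */
static bool node_in_mask(int node, nodemask_model_t mask)
{
	return node >= 0 && node < 64 && ((mask >> node) & 1u) != 0;
}

/*
 * Model of the check described in the patch comment: a default (NULL)
 * policy or a NULL mask returns true; 'bind' and 'interleave' test for
 * intersection with the policy nodemask; 'preferred' tests the single
 * preferred node; 'local' tests the node of the cpu the task (not the
 * caller) is currently on.
 */
bool policy_intersects_model(const struct task_model *tsk,
			     const nodemask_model_t *mask)
{
	const struct mempolicy_model *pol = tsk->mempolicy;

	if (!mask || !pol)
		return true;

	switch (pol->mode) {
	case MODEL_PREFERRED:
		if (pol->local)
			return node_in_mask(tsk->cpu_node, *mask);
		return node_in_mask(pol->preferred_node, *mask);
	case MODEL_BIND:
	case MODEL_INTERLEAVE:
		return (pol->nodes & *mask) != 0;
	}
	return true;
}
```

The 'local' branch reads the node from the task being examined rather
than from the caller, which is the point of the second-revision change:
when the tasklist is scanned, tsk is not necessarily current.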