* [PATCH] oom: skip frozen tasks @ 2011-08-23 8:31 ` Konstantin Khlebnikov 0 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-23 8:31 UTC (permalink / raw) To: linux-mm, Andrew Morton, linux-kernel Cc: KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki, David Rientjes All frozen tasks are unkillable, and if one of them has TIF_MEMDIE we must kill something else to avoid deadlock. After this patch select_bad_process() will skip frozen task before checking TIF_MEMDIE. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> --- mm/oom_kill.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 626303b..931ab20 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -138,6 +138,8 @@ static bool oom_unkillable_task(struct task_struct *p, return true; if (p->flags & PF_KTHREAD) return true; + if (p->flags & PF_FROZEN) + return true; /* When mem_cgroup_out_of_memory() and p is not member of the group */ if (mem && !task_in_mem_cgroup(p, mem)) ^ permalink raw reply related [flat|nested] 88+ messages in thread
* [PATCH] oom: skip frozen tasks @ 2011-08-23 8:31 ` Konstantin Khlebnikov 0 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-23 8:31 UTC (permalink / raw) To: linux-mm, Andrew Morton, linux-kernel Cc: KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki, David Rientjes All frozen tasks are unkillable, and if one of them has TIF_MEMDIE we must kill something else to avoid deadlock. After this patch select_bad_process() will skip frozen task before checking TIF_MEMDIE. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> --- mm/oom_kill.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 626303b..931ab20 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -138,6 +138,8 @@ static bool oom_unkillable_task(struct task_struct *p, return true; if (p->flags & PF_KTHREAD) return true; + if (p->flags & PF_FROZEN) + return true; /* When mem_cgroup_out_of_memory() and p is not member of the group */ if (mem && !task_in_mem_cgroup(p, mem)) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-23 8:31 ` Konstantin Khlebnikov @ 2011-08-23 9:15 ` KAMEZAWA Hiroyuki -1 siblings, 0 replies; 88+ messages in thread From: KAMEZAWA Hiroyuki @ 2011-08-23 9:15 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, David Rientjes On Tue, 23 Aug 2011 11:31:01 +0300 Konstantin Khlebnikov <khlebnikov@openvz.org> wrote: > All frozen tasks are unkillable, and if one of them has TIF_MEMDIE > we must kill something else to avoid deadlock. After this patch > select_bad_process() will skip frozen task before checking TIF_MEMDIE. > > Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-23 9:15 ` KAMEZAWA Hiroyuki 0 siblings, 0 replies; 88+ messages in thread From: KAMEZAWA Hiroyuki @ 2011-08-23 9:15 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, David Rientjes On Tue, 23 Aug 2011 11:31:01 +0300 Konstantin Khlebnikov <khlebnikov@openvz.org> wrote: > All frozen tasks are unkillable, and if one of them has TIF_MEMDIE > we must kill something else to avoid deadlock. After this patch > select_bad_process() will skip frozen task before checking TIF_MEMDIE. > > Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-23 8:31 ` Konstantin Khlebnikov @ 2011-08-23 13:46 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-23 13:46 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki, David Rientjes On Tue 23-08-11 11:31:01, Konstantin Khlebnikov wrote: > All frozen tasks are unkillable, and if one of them has TIF_MEMDIE > we must kill something else to avoid deadlock. This is a livelock rather than a deadlock, isn't it? We are picking the same process all the time and cannot do any progress. > After this patch select_bad_process() will skip frozen task before > checking TIF_MEMDIE. > > Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Anyway the patch looks good to me. Reviewed-by: Michal Hocko <mhocko@suse.cz> > --- > mm/oom_kill.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 626303b..931ab20 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -138,6 +138,8 @@ static bool oom_unkillable_task(struct task_struct *p, > return true; > if (p->flags & PF_KTHREAD) > return true; > + if (p->flags & PF_FROZEN) > + return true; > > /* When mem_cgroup_out_of_memory() and p is not member of the group */ > if (mem && !task_in_mem_cgroup(p, mem)) > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-23 13:46 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-23 13:46 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki, David Rientjes On Tue 23-08-11 11:31:01, Konstantin Khlebnikov wrote: > All frozen tasks are unkillable, and if one of them has TIF_MEMDIE > we must kill something else to avoid deadlock. This is a livelock rather than a deadlock, isn't it? We are picking the same process all the time and cannot do any progress. > After this patch select_bad_process() will skip frozen task before > checking TIF_MEMDIE. > > Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Anyway the patch looks good to me. Reviewed-by: Michal Hocko <mhocko@suse.cz> > --- > mm/oom_kill.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 626303b..931ab20 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -138,6 +138,8 @@ static bool oom_unkillable_task(struct task_struct *p, > return true; > if (p->flags & PF_KTHREAD) > return true; > + if (p->flags & PF_FROZEN) > + return true; > > /* When mem_cgroup_out_of_memory() and p is not member of the group */ > if (mem && !task_in_mem_cgroup(p, mem)) > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-23 8:31 ` Konstantin Khlebnikov @ 2011-08-23 20:18 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-23 20:18 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki On Tue, 23 Aug 2011, Konstantin Khlebnikov wrote: > All frozen tasks are unkillable, and if one of them has TIF_MEMDIE > we must kill something else to avoid deadlock. After this patch > select_bad_process() will skip frozen task before checking TIF_MEMDIE. > The caveat is that if the task in the refrigerator is not OOM_DISABLE and there are no other eligible tasks (system wide, in the cpuset, or in the memcg) to kill, then the machine will panic as a result of this when, in the past, we would simply issue the SIGKILL and keep looping in the page allocator until it is thawed. So you may actually be trading a stall waiting for this thread to thaw for what would now be a panic, and that's not clearly better to me. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-23 20:18 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-23 20:18 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki On Tue, 23 Aug 2011, Konstantin Khlebnikov wrote: > All frozen tasks are unkillable, and if one of them has TIF_MEMDIE > we must kill something else to avoid deadlock. After this patch > select_bad_process() will skip frozen task before checking TIF_MEMDIE. > The caveat is that if the task in the refrigerator is not OOM_DISABLE and there are no other eligible tasks (system wide, in the cpuset, or in the memcg) to kill, then the machine will panic as a result of this when, in the past, we would simply issue the SIGKILL and keep looping in the page allocator until it is thawed. So you may actually be trading a stall waiting for this thread to thaw for what would now be a panic, and that's not clearly better to me. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-23 20:18 ` David Rientjes @ 2011-08-24 10:19 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-24 10:19 UTC (permalink / raw) To: David Rientjes Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki On Tue 23-08-11 13:18:14, David Rientjes wrote: > On Tue, 23 Aug 2011, Konstantin Khlebnikov wrote: > > > All frozen tasks are unkillable, and if one of them has TIF_MEMDIE > > we must kill something else to avoid deadlock. After this patch > > select_bad_process() will skip frozen task before checking TIF_MEMDIE. > > > > The caveat is that if the task in the refrigerator is not OOM_DISABLE and > there are no other eligible tasks (system wide, in the cpuset, or in the > memcg) to kill, then the machine will panic as a result of this when, in > the past, we would simply issue the SIGKILL and keep looping in the page > allocator until it is thawed. mem_cgroup_out_of_memory doesn't panic if nothing has been selected. We will loop in the charge&reclaim path until somebody frees some memory. > So you may actually be trading a stall waiting for this thread to thaw for > what would now be a panic, and that's not clearly better to me. When we are in the global OOM condition then you are right, we have a higher chance to panic. I still find the patch an improvement because encountering a frozen task and looping over it without any progress (even though there are other tasks that could be killed) is more probable than having no killable task at all. On non-NUMA machines there is even not a big chance that somebody would be able to thaw a task as the system is already on knees. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-24 10:19 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-24 10:19 UTC (permalink / raw) To: David Rientjes Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki On Tue 23-08-11 13:18:14, David Rientjes wrote: > On Tue, 23 Aug 2011, Konstantin Khlebnikov wrote: > > > All frozen tasks are unkillable, and if one of them has TIF_MEMDIE > > we must kill something else to avoid deadlock. After this patch > > select_bad_process() will skip frozen task before checking TIF_MEMDIE. > > > > The caveat is that if the task in the refrigerator is not OOM_DISABLE and > there are no other eligible tasks (system wide, in the cpuset, or in the > memcg) to kill, then the machine will panic as a result of this when, in > the past, we would simply issue the SIGKILL and keep looping in the page > allocator until it is thawed. mem_cgroup_out_of_memory doesn't panic if nothing has been selected. We will loop in the charge&reclaim path until somebody frees some memory. > So you may actually be trading a stall waiting for this thread to thaw for > what would now be a panic, and that's not clearly better to me. When we are in the global OOM condition then you are right, we have a higher chance to panic. I still find the patch an improvement because encountering a frozen task and looping over it without any progress (even though there are other tasks that could be killed) is more probable than having no killable task at all. On non-NUMA machines there is even not a big chance that somebody would be able to thaw a task as the system is already on knees. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-24 10:19 ` Michal Hocko @ 2011-08-24 19:31 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-24 19:31 UTC (permalink / raw) To: Michal Hocko Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki On Wed, 24 Aug 2011, Michal Hocko wrote: > When we are in the global OOM condition then you are right, we have a > higher chance to panic. I still find the patch an improvement because > encountering a frozen task and looping over it without any progress > (even though there are other tasks that could be killed) is more > probable than having no killable task at all. > On non-NUMA machines there is even not a big chance that somebody would > be able to thaw a task as the system is already on knees. > That's obviously false since we call oom_killer_disable() in freeze_processes() to disable the oom killer from ever being called in the first place, so this is something you need to resolve with Rafael before you cause more machines to panic. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-24 19:31 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-24 19:31 UTC (permalink / raw) To: Michal Hocko Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki On Wed, 24 Aug 2011, Michal Hocko wrote: > When we are in the global OOM condition then you are right, we have a > higher chance to panic. I still find the patch an improvement because > encountering a frozen task and looping over it without any progress > (even though there are other tasks that could be killed) is more > probable than having no killable task at all. > On non-NUMA machines there is even not a big chance that somebody would > be able to thaw a task as the system is already on knees. > That's obviously false since we call oom_killer_disable() in freeze_processes() to disable the oom killer from ever being called in the first place, so this is something you need to resolve with Rafael before you cause more machines to panic. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-24 19:31 ` David Rientjes @ 2011-08-25 9:19 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-25 9:19 UTC (permalink / raw) To: David Rientjes Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki On Wed 24-08-11 12:31:26, David Rientjes wrote: > On Wed, 24 Aug 2011, Michal Hocko wrote: > > > When we are in the global OOM condition then you are right, we have a > > higher chance to panic. I still find the patch an improvement because > > encountering a frozen task and looping over it without any progress > > (even though there are other tasks that could be killed) is more > > probable than having no killable task at all. > > On non-NUMA machines there is even not a big chance that somebody would > > be able to thaw a task as the system is already on knees. > > > > That's obviously false since we call oom_killer_disable() in > freeze_processes() to disable the oom killer from ever being called in the > first place, so this is something you need to resolve with Rafael before > you cause more machines to panic. I didn't mean suspend/resume path (that is protected by oom_killer_disabled) so the patch doesn't make any change. Other than that you may end up with all tasks frozen by freezer cgroup (assuming that others, that are killable, would be already killed by OOM). But in that case who can thaw those processes when we are already in OOM? If there is no chance to move forward then panic is more suitable than a livelock IMO. OK, we might be in OOM on a nodemask (cpuset or mempol) on NUMA and an allocation on a different nodemask might still succeed and so we can thaw those processes. This should be addressed. What if we panicked only if constraint == CONSTRAINT_NONE? Or am I missing something? -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-25 9:19 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-25 9:19 UTC (permalink / raw) To: David Rientjes Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Oleg Nesterov, KAMEZAWA Hiroyuki On Wed 24-08-11 12:31:26, David Rientjes wrote: > On Wed, 24 Aug 2011, Michal Hocko wrote: > > > When we are in the global OOM condition then you are right, we have a > > higher chance to panic. I still find the patch an improvement because > > encountering a frozen task and looping over it without any progress > > (even though there are other tasks that could be killed) is more > > probable than having no killable task at all. > > On non-NUMA machines there is even not a big chance that somebody would > > be able to thaw a task as the system is already on knees. > > > > That's obviously false since we call oom_killer_disable() in > freeze_processes() to disable the oom killer from ever being called in the > first place, so this is something you need to resolve with Rafael before > you cause more machines to panic. I didn't mean suspend/resume path (that is protected by oom_killer_disabled) so the patch doesn't make any change. Other than that you may end up with all tasks frozen by freezer cgroup (assuming that others, that are killable, would be already killed by OOM). But in that case who can thaw those processes when we are already in OOM? If there is no chance to move forward then panic is more suitable than a livelock IMO. OK, we might be in OOM on a nodemask (cpuset or mempol) on NUMA and an allocation on a different nodemask might still succeed and so we can thaw those processes. This should be addressed. What if we panicked only if constraint == CONSTRAINT_NONE? Or am I missing something? -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-25 9:19 ` Michal Hocko @ 2011-08-25 15:18 ` Oleg Nesterov -1 siblings, 0 replies; 88+ messages in thread From: Oleg Nesterov @ 2011-08-25 15:18 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki On 08/25, Michal Hocko wrote: > > On Wed 24-08-11 12:31:26, David Rientjes wrote: > > > > That's obviously false since we call oom_killer_disable() in > > freeze_processes() to disable the oom killer from ever being called in the > > first place, so this is something you need to resolve with Rafael before > > you cause more machines to panic. > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > so the patch doesn't make any change. Confused... freeze_processes() does try_to_freeze_tasks() before oom_killer_disable() ? Oleg. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-25 15:18 ` Oleg Nesterov 0 siblings, 0 replies; 88+ messages in thread From: Oleg Nesterov @ 2011-08-25 15:18 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki On 08/25, Michal Hocko wrote: > > On Wed 24-08-11 12:31:26, David Rientjes wrote: > > > > That's obviously false since we call oom_killer_disable() in > > freeze_processes() to disable the oom killer from ever being called in the > > first place, so this is something you need to resolve with Rafael before > > you cause more machines to panic. > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > so the patch doesn't make any change. Confused... freeze_processes() does try_to_freeze_tasks() before oom_killer_disable() ? Oleg. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-25 15:18 ` Oleg Nesterov @ 2011-08-25 16:47 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-25 16:47 UTC (permalink / raw) To: Oleg Nesterov Cc: David Rientjes, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki On Thu 25-08-11 17:18:18, Oleg Nesterov wrote: > On 08/25, Michal Hocko wrote: > > > > On Wed 24-08-11 12:31:26, David Rientjes wrote: > > > > > > That's obviously false since we call oom_killer_disable() in > > > freeze_processes() to disable the oom killer from ever being called in the > > > first place, so this is something you need to resolve with Rafael before > > > you cause more machines to panic. > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > so the patch doesn't make any change. > > Confused... freeze_processes() does try_to_freeze_tasks() before > oom_killer_disable() ? Yes you are right, I must have been blind. Now I see the point. We do not want to panic while we are suspending and the memory is really low just because all the userspace is already in the the fridge. Sorry for confusion. I still do not follow the oom_killer_disable note from David, though. > > Oleg. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-25 16:47 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-25 16:47 UTC (permalink / raw) To: Oleg Nesterov Cc: David Rientjes, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki On Thu 25-08-11 17:18:18, Oleg Nesterov wrote: > On 08/25, Michal Hocko wrote: > > > > On Wed 24-08-11 12:31:26, David Rientjes wrote: > > > > > > That's obviously false since we call oom_killer_disable() in > > > freeze_processes() to disable the oom killer from ever being called in the > > > first place, so this is something you need to resolve with Rafael before > > > you cause more machines to panic. > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > so the patch doesn't make any change. > > Confused... freeze_processes() does try_to_freeze_tasks() before > oom_killer_disable() ? Yes you are right, I must have been blind. Now I see the point. We do not want to panic while we are suspending and the memory is really low just because all the userspace is already in the the fridge. Sorry for confusion. I still do not follow the oom_killer_disable note from David, though. > > Oleg. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-25 16:47 ` Michal Hocko @ 2011-08-25 21:14 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-25 21:14 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Thu, 25 Aug 2011, Michal Hocko wrote: > > > > That's obviously false since we call oom_killer_disable() in > > > > freeze_processes() to disable the oom killer from ever being called in the > > > > first place, so this is something you need to resolve with Rafael before > > > > you cause more machines to panic. > > > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > > so the patch doesn't make any change. > > > > Confused... freeze_processes() does try_to_freeze_tasks() before > > oom_killer_disable() ? > > Yes you are right, I must have been blind. > > Now I see the point. We do not want to panic while we are suspending and > the memory is really low just because all the userspace is already in > the the fridge. > Sorry for confusion. > > I still do not follow the oom_killer_disable note from David, though. > oom_killer_disable() was added to that path for a reason when all threads are frozen: memory allocations still occur in the suspend path in an oom condition and adding the oom_killer_disable() will cause those allocations to fail rather than sending pointless SIGKILLs to frozen threads. Now consider if the only _eligible_ threads for oom kill (because of cpusets or mempolicies) are those that are frozen. We certainly do not want to panic because other cpusets are still getting work done. We'd either want to add a mem to the cpuset or thaw the processes because the cpuset is oom. You can't just selectively skip certain threads when their state can be temporary without risking a panic. That's why this patch is a non-starter. A much better solution would be to lower the badness score that the oom killer uses for PF_FROZEN threads so that they aren't considered a priority for kill unless there's nothing else left to kill. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-25 21:14 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-25 21:14 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Thu, 25 Aug 2011, Michal Hocko wrote: > > > > That's obviously false since we call oom_killer_disable() in > > > > freeze_processes() to disable the oom killer from ever being called in the > > > > first place, so this is something you need to resolve with Rafael before > > > > you cause more machines to panic. > > > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > > so the patch doesn't make any change. > > > > Confused... freeze_processes() does try_to_freeze_tasks() before > > oom_killer_disable() ? > > Yes you are right, I must have been blind. > > Now I see the point. We do not want to panic while we are suspending and > the memory is really low just because all the userspace is already in > the the fridge. > Sorry for confusion. > > I still do not follow the oom_killer_disable note from David, though. > oom_killer_disable() was added to that path for a reason when all threads are frozen: memory allocations still occur in the suspend path in an oom condition and adding the oom_killer_disable() will cause those allocations to fail rather than sending pointless SIGKILLs to frozen threads. Now consider if the only _eligible_ threads for oom kill (because of cpusets or mempolicies) are those that are frozen. We certainly do not want to panic because other cpusets are still getting work done. We'd either want to add a mem to the cpuset or thaw the processes because the cpuset is oom. You can't just selectively skip certain threads when their state can be temporary without risking a panic. That's why this patch is a non-starter. A much better solution would be to lower the badness score that the oom killer uses for PF_FROZEN threads so that they aren't considered a priority for kill unless there's nothing else left to kill. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-25 21:14 ` David Rientjes @ 2011-08-26 7:09 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 7:09 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Thu 25-08-11 14:14:20, David Rientjes wrote: > On Thu, 25 Aug 2011, Michal Hocko wrote: > > > > > > That's obviously false since we call oom_killer_disable() in > > > > > freeze_processes() to disable the oom killer from ever being called in the > > > > > first place, so this is something you need to resolve with Rafael before > > > > > you cause more machines to panic. > > > > > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > > > so the patch doesn't make any change. > > > > > > Confused... freeze_processes() does try_to_freeze_tasks() before > > > oom_killer_disable() ? > > > > Yes you are right, I must have been blind. > > > > Now I see the point. We do not want to panic while we are suspending and > > the memory is really low just because all the userspace is already in > > the the fridge. > > Sorry for confusion. > > > > I still do not follow the oom_killer_disable note from David, though. > > > > oom_killer_disable() was added to that path for a reason when all threads > are frozen: memory allocations still occur in the suspend path in an oom > condition and adding the oom_killer_disable() will cause those > allocations to fail rather than sending pointless SIGKILLs to frozen > threads. > > Now consider if the only _eligible_ threads for oom kill (because of > cpusets or mempolicies) are those that are frozen. We certainly do not > want to panic because other cpusets are still getting work done. We'd > either want to add a mem to the cpuset or thaw the processes because the > cpuset is oom. Sure. > > You can't just selectively skip certain threads when their state can be > temporary without risking a panic. That's why this patch is a > non-starter. > > A much better solution would be to lower the badness score that the oom > killer uses for PF_FROZEN threads so that they aren't considered a > priority for kill unless there's nothing else left to kill. Yes, sounds better. Thanks -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 7:09 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 7:09 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Thu 25-08-11 14:14:20, David Rientjes wrote: > On Thu, 25 Aug 2011, Michal Hocko wrote: > > > > > > That's obviously false since we call oom_killer_disable() in > > > > > freeze_processes() to disable the oom killer from ever being called in the > > > > > first place, so this is something you need to resolve with Rafael before > > > > > you cause more machines to panic. > > > > > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > > > so the patch doesn't make any change. > > > > > > Confused... freeze_processes() does try_to_freeze_tasks() before > > > oom_killer_disable() ? > > > > Yes you are right, I must have been blind. > > > > Now I see the point. We do not want to panic while we are suspending and > > the memory is really low just because all the userspace is already in > > the the fridge. > > Sorry for confusion. > > > > I still do not follow the oom_killer_disable note from David, though. > > > > oom_killer_disable() was added to that path for a reason when all threads > are frozen: memory allocations still occur in the suspend path in an oom > condition and adding the oom_killer_disable() will cause those > allocations to fail rather than sending pointless SIGKILLs to frozen > threads. > > Now consider if the only _eligible_ threads for oom kill (because of > cpusets or mempolicies) are those that are frozen. We certainly do not > want to panic because other cpusets are still getting work done. We'd > either want to add a mem to the cpuset or thaw the processes because the > cpuset is oom. Sure. > > You can't just selectively skip certain threads when their state can be > temporary without risking a panic. That's why this patch is a > non-starter. > > A much better solution would be to lower the badness score that the oom > killer uses for PF_FROZEN threads so that they aren't considered a > priority for kill unless there's nothing else left to kill. Yes, sounds better. Thanks -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 7:09 ` Michal Hocko @ 2011-08-26 8:56 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 8:56 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 09:09:46, Michal Hocko wrote: > On Thu 25-08-11 14:14:20, David Rientjes wrote: > > On Thu, 25 Aug 2011, Michal Hocko wrote: > > > > > > > > That's obviously false since we call oom_killer_disable() in > > > > > > freeze_processes() to disable the oom killer from ever being called in the > > > > > > first place, so this is something you need to resolve with Rafael before > > > > > > you cause more machines to panic. > > > > > > > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > > > > so the patch doesn't make any change. > > > > > > > > Confused... freeze_processes() does try_to_freeze_tasks() before > > > > oom_killer_disable() ? > > > > > > Yes you are right, I must have been blind. > > > > > > Now I see the point. We do not want to panic while we are suspending and > > > the memory is really low just because all the userspace is already in > > > the the fridge. > > > Sorry for confusion. > > > > > > I still do not follow the oom_killer_disable note from David, though. > > > > > > > oom_killer_disable() was added to that path for a reason when all threads > > are frozen: memory allocations still occur in the suspend path in an oom > > condition and adding the oom_killer_disable() will cause those > > allocations to fail rather than sending pointless SIGKILLs to frozen > > threads. > > > > Now consider if the only _eligible_ threads for oom kill (because of > > cpusets or mempolicies) are those that are frozen. We certainly do not > > want to panic because other cpusets are still getting work done. We'd > > either want to add a mem to the cpuset or thaw the processes because the > > cpuset is oom. > > Sure. > > > > > You can't just selectively skip certain threads when their state can be > > temporary without risking a panic. That's why this patch is a > > non-starter. > > > > A much better solution would be to lower the badness score that the oom > > killer uses for PF_FROZEN threads so that they aren't considered a > > priority for kill unless there's nothing else left to kill. > > Yes, sounds better. .. but still not sufficient. We also have to thaw the process as well. Just a quick hacked up patch (not tested, just for an illustration). Would something like this work? --- >From 305a8139a72b20709e6b59ff8f4d322a9e04ab19 Mon Sep 17 00:00:00 2001 From: Michal Hocko <mhocko@suse.cz> Date: Fri, 26 Aug 2011 10:39:35 +0200 Subject: [PATCH] oom: do not live lock on frozen tasks [WARNING untested] OOM can end up in a live lock if select_bad_process picks up a frozen task. On the other hand we cannot mark such processes as unkillable because we could panic the system even though there is a chance that somebody could thaw the process so we can make a forward process (e.g. a process from another cpuset or with a different nodemask). Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do not consider them unless really necessary and if we really pick up one then thaw its threads before we try to kill it. TODO - given bonus might be too big? - aren't we racing with try_to_freeze_tasks? --- mm/oom_kill.c | 13 +++++++++++++ 1 files changed, 13 insertions(+), 0 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 626303b..fd194bc 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -32,6 +32,7 @@ #include <linux/mempolicy.h> #include <linux/security.h> #include <linux/ptrace.h> +#include <linux/freezer.h> int sysctl_panic_on_oom; int sysctl_oom_kill_allocating_task; @@ -214,6 +215,14 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem, points += p->signal->oom_score_adj; /* + * Do not try to kill frozen tasks unless there is nothing else to kill. + * We do not want to give it 1 point because we still want to select a good + * candidate among all frozen tasks. Let's give it a reasonable bonus. + */ + if (frozen(p)) + points -= OOM_SCORE_ADJ_MAX/2; + + /* * Never return 0 for an eligible task that may be killed since it's * possible that no single user task uses more than 0.1% of memory and * no single admin tasks uses more than 3.0%. @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) pr_err("Kill process %d (%s) sharing same memory\n", task_pid_nr(q), q->comm); task_unlock(q); + + if (frozen(q)) + thaw_process(q); + force_sig(SIGKILL, q); } -- 1.7.5.4 -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply related [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 8:56 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 8:56 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 09:09:46, Michal Hocko wrote: > On Thu 25-08-11 14:14:20, David Rientjes wrote: > > On Thu, 25 Aug 2011, Michal Hocko wrote: > > > > > > > > That's obviously false since we call oom_killer_disable() in > > > > > > freeze_processes() to disable the oom killer from ever being called in the > > > > > > first place, so this is something you need to resolve with Rafael before > > > > > > you cause more machines to panic. > > > > > > > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > > > > so the patch doesn't make any change. > > > > > > > > Confused... freeze_processes() does try_to_freeze_tasks() before > > > > oom_killer_disable() ? > > > > > > Yes you are right, I must have been blind. > > > > > > Now I see the point. We do not want to panic while we are suspending and > > > the memory is really low just because all the userspace is already in > > > the the fridge. > > > Sorry for confusion. > > > > > > I still do not follow the oom_killer_disable note from David, though. > > > > > > > oom_killer_disable() was added to that path for a reason when all threads > > are frozen: memory allocations still occur in the suspend path in an oom > > condition and adding the oom_killer_disable() will cause those > > allocations to fail rather than sending pointless SIGKILLs to frozen > > threads. > > > > Now consider if the only _eligible_ threads for oom kill (because of > > cpusets or mempolicies) are those that are frozen. We certainly do not > > want to panic because other cpusets are still getting work done. We'd > > either want to add a mem to the cpuset or thaw the processes because the > > cpuset is oom. > > Sure. > > > > > You can't just selectively skip certain threads when their state can be > > temporary without risking a panic. That's why this patch is a > > non-starter. > > > > A much better solution would be to lower the badness score that the oom > > killer uses for PF_FROZEN threads so that they aren't considered a > > priority for kill unless there's nothing else left to kill. > > Yes, sounds better. .. but still not sufficient. We also have to thaw the process as well. Just a quick hacked up patch (not tested, just for an illustration). Would something like this work? --- ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 8:56 ` Michal Hocko @ 2011-08-26 9:21 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-26 9:21 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri, 26 Aug 2011, Michal Hocko wrote: > Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do > not consider them unless really necessary and if we really pick up one > then thaw its threads before we try to kill it. > I don't like arbitrary heuristics like this because they polluted the old oom killer before it was rewritten and made it much more unpredictable. The only heuristic it includes right now is a bonus for root tasks so that when two processes have nearly the same amount of memory usage (within 3% of available memory), the non-root task is chosen instead. This bonus is actually saying that a single frozen task can use up to 50% more of the machine's capacity in a system-wide oom condition than the task that will now be killed instead. That seems excessive. I do like the idea of automatically thawing the task though and if that's possible then I don't think we need to manipulate the badness heuristic at all. I know that wouldn't be feasible when we've frozen _all_ threads and that's why we have oom_killer_disable(), but we'll have to check with Rafael to see if something like this could work. Rafael? > TODO > - given bonus might be too big? > - aren't we racing with try_to_freeze_tasks? > --- > mm/oom_kill.c | 13 +++++++++++++ > 1 files changed, 13 insertions(+), 0 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 626303b..fd194bc 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -32,6 +32,7 @@ > #include <linux/mempolicy.h> > #include <linux/security.h> > #include <linux/ptrace.h> > +#include <linux/freezer.h> > > int sysctl_panic_on_oom; > int sysctl_oom_kill_allocating_task; > @@ -214,6 +215,14 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem, > points += p->signal->oom_score_adj; > > /* > + * Do not try to kill frozen tasks unless there is nothing else to kill. > + * We do not want to give it 1 point because we still want to select a good > + * candidate among all frozen tasks. Let's give it a reasonable bonus. > + */ > + if (frozen(p)) > + points -= OOM_SCORE_ADJ_MAX/2; > + > + /* > * Never return 0 for an eligible task that may be killed since it's > * possible that no single user task uses more than 0.1% of memory and > * no single admin tasks uses more than 3.0%. > @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > pr_err("Kill process %d (%s) sharing same memory\n", > task_pid_nr(q), q->comm); > task_unlock(q); > + > + if (frozen(q)) > + thaw_process(q); > + > force_sig(SIGKILL, q); > } > ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 9:21 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-26 9:21 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri, 26 Aug 2011, Michal Hocko wrote: > Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do > not consider them unless really necessary and if we really pick up one > then thaw its threads before we try to kill it. > I don't like arbitrary heuristics like this because they polluted the old oom killer before it was rewritten and made it much more unpredictable. The only heuristic it includes right now is a bonus for root tasks so that when two processes have nearly the same amount of memory usage (within 3% of available memory), the non-root task is chosen instead. This bonus is actually saying that a single frozen task can use up to 50% more of the machine's capacity in a system-wide oom condition than the task that will now be killed instead. That seems excessive. I do like the idea of automatically thawing the task though and if that's possible then I don't think we need to manipulate the badness heuristic at all. I know that wouldn't be feasible when we've frozen _all_ threads and that's why we have oom_killer_disable(), but we'll have to check with Rafael to see if something like this could work. Rafael? > TODO > - given bonus might be too big? > - aren't we racing with try_to_freeze_tasks? > --- > mm/oom_kill.c | 13 +++++++++++++ > 1 files changed, 13 insertions(+), 0 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 626303b..fd194bc 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -32,6 +32,7 @@ > #include <linux/mempolicy.h> > #include <linux/security.h> > #include <linux/ptrace.h> > +#include <linux/freezer.h> > > int sysctl_panic_on_oom; > int sysctl_oom_kill_allocating_task; > @@ -214,6 +215,14 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem, > points += p->signal->oom_score_adj; > > /* > + * Do not try to kill frozen tasks unless there is nothing else to kill. > + * We do not want to give it 1 point because we still want to select a good > + * candidate among all frozen tasks. Let's give it a reasonable bonus. > + */ > + if (frozen(p)) > + points -= OOM_SCORE_ADJ_MAX/2; > + > + /* > * Never return 0 for an eligible task that may be killed since it's > * possible that no single user task uses more than 0.1% of memory and > * no single admin tasks uses more than 3.0%. > @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > pr_err("Kill process %d (%s) sharing same memory\n", > task_pid_nr(q), q->comm); > task_unlock(q); > + > + if (frozen(q)) > + thaw_process(q); > + > force_sig(SIGKILL, q); > } > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 9:21 ` David Rientjes @ 2011-08-26 9:53 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 9:53 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 02:21:42, David Rientjes wrote: > On Fri, 26 Aug 2011, Michal Hocko wrote: > > > Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do > > not consider them unless really necessary and if we really pick up one > > then thaw its threads before we try to kill it. > > > > I don't like arbitrary heuristics like this because they polluted the old > oom killer before it was rewritten and made it much more unpredictable. > The only heuristic it includes right now is a bonus for root tasks so that > when two processes have nearly the same amount of memory usage (within 3% > of available memory), the non-root task is chosen instead. > > This bonus is actually saying that a single frozen task can use up to 50% > more of the machine's capacity in a system-wide oom condition than the > task that will now be killed instead. That seems excessive. Yes, the number is probably too high. I just wanted to start up with something. Maybe we can give it another root bonus. But I agree whatever we use it will be just a random value... > > I do like the idea of automatically thawing the task though and if that's > possible then I don't think we need to manipulate the badness heuristic at > all. I know that wouldn't be feasible when we've frozen _all_ threads and Why it wouldn't be feasible for all threads? If you have all tasks frozen (suspend going on, whole cgroup or all tasks in a cpuset/nodemask are frozen) then the selection is more natural because all of them are equal (with or without a bonus). The bonus tries to reduce thawing if not all of them are frozen. I am not saying the bonus is necessary, though. It depends on what the freezer is used for (e.g. freeze a process which went wild and debug what went wrong wouldn't welcome that somebody killed it or other (mis)use which relies on D state). > that's why we have oom_killer_disable(), but we'll have to check with > Rafael to see if something like this could work. Rafael? > > > TODO > > - given bonus might be too big? > > - aren't we racing with try_to_freeze_tasks? > > --- > > mm/oom_kill.c | 13 +++++++++++++ > > 1 files changed, 13 insertions(+), 0 deletions(-) > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > index 626303b..fd194bc 100644 > > --- a/mm/oom_kill.c > > +++ b/mm/oom_kill.c > > @@ -32,6 +32,7 @@ > > #include <linux/mempolicy.h> > > #include <linux/security.h> > > #include <linux/ptrace.h> > > +#include <linux/freezer.h> > > > > int sysctl_panic_on_oom; > > int sysctl_oom_kill_allocating_task; > > @@ -214,6 +215,14 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem, > > points += p->signal->oom_score_adj; > > > > /* > > + * Do not try to kill frozen tasks unless there is nothing else to kill. > > + * We do not want to give it 1 point because we still want to select a good > > + * candidate among all frozen tasks. Let's give it a reasonable bonus. > > + */ > > + if (frozen(p)) > > + points -= OOM_SCORE_ADJ_MAX/2; > > + > > + /* > > * Never return 0 for an eligible task that may be killed since it's > > * possible that no single user task uses more than 0.1% of memory and > > * no single admin tasks uses more than 3.0%. > > @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > pr_err("Kill process %d (%s) sharing same memory\n", > > task_pid_nr(q), q->comm); > > task_unlock(q); > > + > > + if (frozen(q)) > > + thaw_process(q); > > + > > force_sig(SIGKILL, q); > > } > > -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 9:53 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 9:53 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 02:21:42, David Rientjes wrote: > On Fri, 26 Aug 2011, Michal Hocko wrote: > > > Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do > > not consider them unless really necessary and if we really pick up one > > then thaw its threads before we try to kill it. > > > > I don't like arbitrary heuristics like this because they polluted the old > oom killer before it was rewritten and made it much more unpredictable. > The only heuristic it includes right now is a bonus for root tasks so that > when two processes have nearly the same amount of memory usage (within 3% > of available memory), the non-root task is chosen instead. > > This bonus is actually saying that a single frozen task can use up to 50% > more of the machine's capacity in a system-wide oom condition than the > task that will now be killed instead. That seems excessive. Yes, the number is probably too high. I just wanted to start up with something. Maybe we can give it another root bonus. But I agree whatever we use it will be just a random value... > > I do like the idea of automatically thawing the task though and if that's > possible then I don't think we need to manipulate the badness heuristic at > all. I know that wouldn't be feasible when we've frozen _all_ threads and Why it wouldn't be feasible for all threads? If you have all tasks frozen (suspend going on, whole cgroup or all tasks in a cpuset/nodemask are frozen) then the selection is more natural because all of them are equal (with or without a bonus). The bonus tries to reduce thawing if not all of them are frozen. I am not saying the bonus is necessary, though. It depends on what the freezer is used for (e.g. freeze a process which went wild and debug what went wrong wouldn't welcome that somebody killed it or other (mis)use which relies on D state). > that's why we have oom_killer_disable(), but we'll have to check with > Rafael to see if something like this could work. Rafael? > > > TODO > > - given bonus might be too big? > > - aren't we racing with try_to_freeze_tasks? > > --- > > mm/oom_kill.c | 13 +++++++++++++ > > 1 files changed, 13 insertions(+), 0 deletions(-) > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > index 626303b..fd194bc 100644 > > --- a/mm/oom_kill.c > > +++ b/mm/oom_kill.c > > @@ -32,6 +32,7 @@ > > #include <linux/mempolicy.h> > > #include <linux/security.h> > > #include <linux/ptrace.h> > > +#include <linux/freezer.h> > > > > int sysctl_panic_on_oom; > > int sysctl_oom_kill_allocating_task; > > @@ -214,6 +215,14 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem, > > points += p->signal->oom_score_adj; > > > > /* > > + * Do not try to kill frozen tasks unless there is nothing else to kill. > > + * We do not want to give it 1 point because we still want to select a good > > + * candidate among all frozen tasks. Let's give it a reasonable bonus. > > + */ > > + if (frozen(p)) > > + points -= OOM_SCORE_ADJ_MAX/2; > > + > > + /* > > * Never return 0 for an eligible task that may be killed since it's > > * possible that no single user task uses more than 0.1% of memory and > > * no single admin tasks uses more than 3.0%. > > @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > pr_err("Kill process %d (%s) sharing same memory\n", > > task_pid_nr(q), q->comm); > > task_unlock(q); > > + > > + if (frozen(q)) > > + thaw_process(q); > > + > > force_sig(SIGKILL, q); > > } > > -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 9:53 ` Michal Hocko @ 2011-08-26 11:01 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 11:01 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 11:53:56, Michal Hocko wrote: > On Fri 26-08-11 02:21:42, David Rientjes wrote: > > On Fri, 26 Aug 2011, Michal Hocko wrote: > > > > > Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do > > > not consider them unless really necessary and if we really pick up one > > > then thaw its threads before we try to kill it. > > > > > > > I don't like arbitrary heuristics like this because they polluted the old > > oom killer before it was rewritten and made it much more unpredictable. > > The only heuristic it includes right now is a bonus for root tasks so that > > when two processes have nearly the same amount of memory usage (within 3% > > of available memory), the non-root task is chosen instead. > > > > This bonus is actually saying that a single frozen task can use up to 50% > > more of the machine's capacity in a system-wide oom condition than the > > task that will now be killed instead. That seems excessive. > > Yes, the number is probably too high. I just wanted to start up with > something. Maybe we can give it another root bonus. But I agree whatever > we use it will be just a random value... > > > > > I do like the idea of automatically thawing the task though and if that's > > possible then I don't think we need to manipulate the badness heuristic at > > all. I know that wouldn't be feasible when we've frozen _all_ threads and > > Why it wouldn't be feasible for all threads? If you have all tasks > frozen (suspend going on, whole cgroup or all tasks in a cpuset/nodemask > are frozen) then the selection is more natural because all of them are > equal (with or without a bonus). The bonus tries to reduce thawing if > not all of them are frozen. > I am not saying the bonus is necessary, though. It depends on what > the freezer is used for (e.g. freeze a process which went wild and > debug what went wrong wouldn't welcome that somebody killed it or other > (mis)use which relies on D state). Anyway, I do agree, the two things (bonus and thaw during oom_kill) should be handled separately. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 11:01 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 11:01 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 11:53:56, Michal Hocko wrote: > On Fri 26-08-11 02:21:42, David Rientjes wrote: > > On Fri, 26 Aug 2011, Michal Hocko wrote: > > > > > Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do > > > not consider them unless really necessary and if we really pick up one > > > then thaw its threads before we try to kill it. > > > > > > > I don't like arbitrary heuristics like this because they polluted the old > > oom killer before it was rewritten and made it much more unpredictable. > > The only heuristic it includes right now is a bonus for root tasks so that > > when two processes have nearly the same amount of memory usage (within 3% > > of available memory), the non-root task is chosen instead. > > > > This bonus is actually saying that a single frozen task can use up to 50% > > more of the machine's capacity in a system-wide oom condition than the > > task that will now be killed instead. That seems excessive. > > Yes, the number is probably too high. I just wanted to start up with > something. Maybe we can give it another root bonus. But I agree whatever > we use it will be just a random value... > > > > > I do like the idea of automatically thawing the task though and if that's > > possible then I don't think we need to manipulate the badness heuristic at > > all. I know that wouldn't be feasible when we've frozen _all_ threads and > > Why it wouldn't be feasible for all threads? If you have all tasks > frozen (suspend going on, whole cgroup or all tasks in a cpuset/nodemask > are frozen) then the selection is more natural because all of them are > equal (with or without a bonus). The bonus tries to reduce thawing if > not all of them are frozen. > I am not saying the bonus is necessary, though. It depends on what > the freezer is used for (e.g. freeze a process which went wild and > debug what went wrong wouldn't welcome that somebody killed it or other > (mis)use which relies on D state). Anyway, I do agree, the two things (bonus and thaw during oom_kill) should be handled separately. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 9:53 ` Michal Hocko @ 2011-08-26 18:13 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-26 18:13 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri, 26 Aug 2011, Michal Hocko wrote: > > I do like the idea of automatically thawing the task though and if that's > > possible then I don't think we need to manipulate the badness heuristic at > > all. I know that wouldn't be feasible when we've frozen _all_ threads and > > Why it wouldn't be feasible for all threads? If you have all tasks > frozen (suspend going on, whole cgroup or all tasks in a cpuset/nodemask > are frozen) then the selection is more natural because all of them are > equal (with or without a bonus). The bonus tries to reduce thawing if > not all of them are frozen. Yeah, this comment wasn't in reference to your heuristic change, it was in reference to the fact that if all threads are frozen then there is little hope for us recovering from the situation without a user response. I think that's why we want oom_killer_disable() so that we avoid looping forever and can actually fail allocations in the hope that we'll bring ourselves out of suspend. > I am not saying the bonus is necessary, though. It depends on what > the freezer is used for (e.g. freeze a process which went wild and > debug what went wrong wouldn't welcome that somebody killed it or other > (mis)use which relies on D state). > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer followed by a SIGKILL if that task is selected for oom kill without an heuristic change. Not sure if that's possible, so we'll wait for Rafael to chime in. If it actually does come down to a heuristic change, then it need not happen in the oom killer: the freezing code would need to use test_set_oom_score_adj() to temporarily reduce the oom_score_adj for that task until it comes out of the refrigerator. We already use that in ksm and swapoff to actually prefer threads, but we can use it to bias against threads as well. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 18:13 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-26 18:13 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri, 26 Aug 2011, Michal Hocko wrote: > > I do like the idea of automatically thawing the task though and if that's > > possible then I don't think we need to manipulate the badness heuristic at > > all. I know that wouldn't be feasible when we've frozen _all_ threads and > > Why it wouldn't be feasible for all threads? If you have all tasks > frozen (suspend going on, whole cgroup or all tasks in a cpuset/nodemask > are frozen) then the selection is more natural because all of them are > equal (with or without a bonus). The bonus tries to reduce thawing if > not all of them are frozen. Yeah, this comment wasn't in reference to your heuristic change, it was in reference to the fact that if all threads are frozen then there is little hope for us recovering from the situation without a user response. I think that's why we want oom_killer_disable() so that we avoid looping forever and can actually fail allocations in the hope that we'll bring ourselves out of suspend. > I am not saying the bonus is necessary, though. It depends on what > the freezer is used for (e.g. freeze a process which went wild and > debug what went wrong wouldn't welcome that somebody killed it or other > (mis)use which relies on D state). > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer followed by a SIGKILL if that task is selected for oom kill without an heuristic change. Not sure if that's possible, so we'll wait for Rafael to chime in. If it actually does come down to a heuristic change, then it need not happen in the oom killer: the freezing code would need to use test_set_oom_score_adj() to temporarily reduce the oom_score_adj for that task until it comes out of the refrigerator. We already use that in ksm and swapoff to actually prefer threads, but we can use it to bias against threads as well. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* [PATCH 1/2] oom: do not live lock on frozen tasks 2011-08-26 18:13 ` David Rientjes @ 2011-09-26 8:28 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 8:28 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo, Rusty Russell [Let's add some more people to CC list] Sorry it took so long but I was quite bussy recently. On Fri 26-08-11 11:13:40, David Rientjes wrote: > On Fri, 26 Aug 2011, Michal Hocko wrote: [...] > > I am not saying the bonus is necessary, though. It depends on what > > the freezer is used for (e.g. freeze a process which went wild and > > debug what went wrong wouldn't welcome that somebody killed it or other > > (mis)use which relies on D state). > > > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > followed by a SIGKILL if that task is selected for oom kill without an > heuristic change. Not sure if that's possible, so we'll wait for Rafael > to chime in. We have discussed that with Rafael and it should be safe to do that. See the patch bellow. The only place I am not entirely sure about is run_guest (drivers/lguest/core.c). It seems that the code is able to cope with signals but it also calls lguest_arch_run_guest after try_to_freeze. --- >From aea3c3fba1e172373877df03a335848cae4b717e Mon Sep 17 00:00:00 2001 From: Michal Hocko <mhocko@suse.cz> Date: Fri, 16 Sep 2011 11:23:15 +0200 Subject: [PATCH 1/2] oom: do not live lock on frozen tasks Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) that OOM can end up in a live lock if select_bad_process picks up a frozen task. Unfortunately we cannot mark such processes as unkillable to ignore them because we could panic the system even though there is a chance that somebody could thaw the process so we can make a forward process (e.g. a process from another cpuset or with a different nodemask). Let's thaw an OOM selected frozen process right after we've sent fatal signal from oom_kill_task. Thawing is safe if the frozen task doesn't access any suspended device (e.g. by ioctl) on the way out to the userspace where we handle the signal and die. Note, we are not interested in the kernel threads because they are not oom killable. Accessing suspended devices by a userspace processes shouldn't be an issue because devices are suspended only after userspace is already frozen and oom is disabled at that time. run_guest (drivers/lguest/core.c) calls try_to_freeze with an user context but it seems it is able to cope with signals because it explicitly checks for pending signals so we should be safe. Other than that userspace accesses the fridge only from the signal handling routines so we are able to handle SIGKILL without any negative side effects. Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> --- mm/oom_kill.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 626303b..b9774f3 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -32,6 +32,7 @@ #include <linux/mempolicy.h> #include <linux/security.h> #include <linux/ptrace.h> +#include <linux/freezer.h> int sysctl_panic_on_oom; int sysctl_oom_kill_allocating_task; @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) task_pid_nr(q), q->comm); task_unlock(q); force_sig(SIGKILL, q); + + if (frozen(q)) + thaw_process(q); } set_tsk_thread_flag(p, TIF_MEMDIE); -- 1.7.5.4 -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply related [flat|nested] 88+ messages in thread
* [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 8:28 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 8:28 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo, Rusty Russell [Let's add some more people to CC list] Sorry it took so long but I was quite bussy recently. On Fri 26-08-11 11:13:40, David Rientjes wrote: > On Fri, 26 Aug 2011, Michal Hocko wrote: [...] > > I am not saying the bonus is necessary, though. It depends on what > > the freezer is used for (e.g. freeze a process which went wild and > > debug what went wrong wouldn't welcome that somebody killed it or other > > (mis)use which relies on D state). > > > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > followed by a SIGKILL if that task is selected for oom kill without an > heuristic change. Not sure if that's possible, so we'll wait for Rafael > to chime in. We have discussed that with Rafael and it should be safe to do that. See the patch bellow. The only place I am not entirely sure about is run_guest (drivers/lguest/core.c). It seems that the code is able to cope with signals but it also calls lguest_arch_run_guest after try_to_freeze. --- ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 8:28 ` Michal Hocko @ 2011-09-26 8:56 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-09-26 8:56 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo, Rusty Russell On Mon, 26 Sep 2011, Michal Hocko wrote: > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 626303b..b9774f3 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -32,6 +32,7 @@ > #include <linux/mempolicy.h> > #include <linux/security.h> > #include <linux/ptrace.h> > +#include <linux/freezer.h> > > int sysctl_panic_on_oom; > int sysctl_oom_kill_allocating_task; > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > task_pid_nr(q), q->comm); > task_unlock(q); > force_sig(SIGKILL, q); > + > + if (frozen(q)) > + thaw_process(q); > } > > set_tsk_thread_flag(p, TIF_MEMDIE); This is in the wrong place, oom_kill_task() iterates over all threads that are _not_ in the same thread group as the chosen thread and kills them without giving them access to memory reserves. The chosen task, p, could still be frozen and may not exit. Once that's fixed, feel free to add my Acked-by: David Rientjes <rientjes@google.com> once Rafael sends his acked-by or reviewed-by. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 8:56 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-09-26 8:56 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo, Rusty Russell On Mon, 26 Sep 2011, Michal Hocko wrote: > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 626303b..b9774f3 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -32,6 +32,7 @@ > #include <linux/mempolicy.h> > #include <linux/security.h> > #include <linux/ptrace.h> > +#include <linux/freezer.h> > > int sysctl_panic_on_oom; > int sysctl_oom_kill_allocating_task; > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > task_pid_nr(q), q->comm); > task_unlock(q); > force_sig(SIGKILL, q); > + > + if (frozen(q)) > + thaw_process(q); > } > > set_tsk_thread_flag(p, TIF_MEMDIE); This is in the wrong place, oom_kill_task() iterates over all threads that are _not_ in the same thread group as the chosen thread and kills them without giving them access to memory reserves. The chosen task, p, could still be frozen and may not exit. Once that's fixed, feel free to add my Acked-by: David Rientjes <rientjes@google.com> once Rafael sends his acked-by or reviewed-by. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 8:56 ` David Rientjes @ 2011-09-26 9:14 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 9:14 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo, Rusty Russell On Mon 26-09-11 01:56:57, David Rientjes wrote: > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > index 626303b..b9774f3 100644 > > --- a/mm/oom_kill.c > > +++ b/mm/oom_kill.c > > @@ -32,6 +32,7 @@ > > #include <linux/mempolicy.h> > > #include <linux/security.h> > > #include <linux/ptrace.h> > > +#include <linux/freezer.h> > > > > int sysctl_panic_on_oom; > > int sysctl_oom_kill_allocating_task; > > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > task_pid_nr(q), q->comm); > > task_unlock(q); > > force_sig(SIGKILL, q); > > + > > + if (frozen(q)) > > + thaw_process(q); > > } > > > > set_tsk_thread_flag(p, TIF_MEMDIE); > > This is in the wrong place, oom_kill_task() iterates over all threads that > are _not_ in the same thread group as the chosen thread and kills them > without giving them access to memory reserves. The chosen task, p, could > still be frozen and may not exit. Ahh, right you are. I ave missed that one. Updated patch bellow. > > Once that's fixed, feel free to add my > > Acked-by: David Rientjes <rientjes@google.com> Thanks > > once Rafael sends his acked-by or reviewed-by. --- >From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 From: Michal Hocko <mhocko@suse.cz> Date: Fri, 16 Sep 2011 11:23:15 +0200 Subject: [PATCH] oom: do not live lock on frozen tasks Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) that OOM can end up in a live lock if select_bad_process picks up a frozen task. Unfortunately we cannot mark such processes as unkillable to ignore them because we could panic the system even though there is a chance that somebody could thaw the process so we can make a forward process (e.g. a process from another cpuset or with a different nodemask). Let's thaw an OOM selected frozen process right after we've sent fatal signal from oom_kill_task. Thawing is safe if the frozen task doesn't access any suspended device (e.g. by ioctl) on the way out to the userspace where we handle the signal and die. Note, we are not interested in the kernel threads because they are not oom killable. Accessing suspended devices by a userspace processes shouldn't be an issue because devices are suspended only after userspace is already frozen and oom is disabled at that time. run_guest (drivers/lguest/core.c) calls try_to_freeze with an user context but it seems it is able to cope with signals because it explicitly checks for pending signals so we should be safe. Other than that userspace accesses the fridge only from the signal handling routines so we are able to handle SIGKILL without any negative side effects. Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> --- mm/oom_kill.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 626303b..c419a7e 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -32,6 +32,7 @@ #include <linux/mempolicy.h> #include <linux/security.h> #include <linux/ptrace.h> +#include <linux/freezer.h> int sysctl_panic_on_oom; int sysctl_oom_kill_allocating_task; @@ -451,10 +452,15 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) task_pid_nr(q), q->comm); task_unlock(q); force_sig(SIGKILL, q); + + if (frozen(q)) + thaw_process(q); } set_tsk_thread_flag(p, TIF_MEMDIE); force_sig(SIGKILL, p); + if (frozen(p)) + thaw_process(p); return 0; } -- 1.7.5.4 -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply related [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 9:14 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 9:14 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo, Rusty Russell On Mon 26-09-11 01:56:57, David Rientjes wrote: > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > index 626303b..b9774f3 100644 > > --- a/mm/oom_kill.c > > +++ b/mm/oom_kill.c > > @@ -32,6 +32,7 @@ > > #include <linux/mempolicy.h> > > #include <linux/security.h> > > #include <linux/ptrace.h> > > +#include <linux/freezer.h> > > > > int sysctl_panic_on_oom; > > int sysctl_oom_kill_allocating_task; > > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > task_pid_nr(q), q->comm); > > task_unlock(q); > > force_sig(SIGKILL, q); > > + > > + if (frozen(q)) > > + thaw_process(q); > > } > > > > set_tsk_thread_flag(p, TIF_MEMDIE); > > This is in the wrong place, oom_kill_task() iterates over all threads that > are _not_ in the same thread group as the chosen thread and kills them > without giving them access to memory reserves. The chosen task, p, could > still be frozen and may not exit. Ahh, right you are. I ave missed that one. Updated patch bellow. > > Once that's fixed, feel free to add my > > Acked-by: David Rientjes <rientjes@google.com> Thanks > > once Rafael sends his acked-by or reviewed-by. --- ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 9:14 ` Michal Hocko @ 2011-09-26 9:25 ` KAMEZAWA Hiroyuki -1 siblings, 0 replies; 88+ messages in thread From: KAMEZAWA Hiroyuki @ 2011-09-26 9:25 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Rafael J. Wysocki, Tejun Heo, Rusty Russell On Mon, 26 Sep 2011 11:14:40 +0200 Michal Hocko <mhocko@suse.cz> wrote: > On Mon 26-09-11 01:56:57, David Rientjes wrote: > > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > > index 626303b..b9774f3 100644 > > > --- a/mm/oom_kill.c > > > +++ b/mm/oom_kill.c > > > @@ -32,6 +32,7 @@ > > > #include <linux/mempolicy.h> > > > #include <linux/security.h> > > > #include <linux/ptrace.h> > > > +#include <linux/freezer.h> > > > > > > int sysctl_panic_on_oom; > > > int sysctl_oom_kill_allocating_task; > > > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > > task_pid_nr(q), q->comm); > > > task_unlock(q); > > > force_sig(SIGKILL, q); > > > + > > > + if (frozen(q)) > > > + thaw_process(q); > > > } > > > > > > set_tsk_thread_flag(p, TIF_MEMDIE); > > > > This is in the wrong place, oom_kill_task() iterates over all threads that > > are _not_ in the same thread group as the chosen thread and kills them > > without giving them access to memory reserves. The chosen task, p, could > > still be frozen and may not exit. > > Ahh, right you are. I ave missed that one. Updated patch bellow. > > > > > Once that's fixed, feel free to add my > > > > Acked-by: David Rientjes <rientjes@google.com> > > Thanks > > > > > once Rafael sends his acked-by or reviewed-by. > --- > From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@suse.cz> > Date: Fri, 16 Sep 2011 11:23:15 +0200 > Subject: [PATCH] oom: do not live lock on frozen tasks > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > that OOM can end up in a live lock if select_bad_process picks up a frozen > task. > Unfortunately we cannot mark such processes as unkillable to ignore them > because we could panic the system even though there is a chance that > somebody could thaw the process so we can make a forward process (e.g. a > process from another cpuset or with a different nodemask). > > Let's thaw an OOM selected frozen process right after we've sent fatal > signal from oom_kill_task. > Thawing is safe if the frozen task doesn't access any suspended device > (e.g. by ioctl) on the way out to the userspace where we handle the > signal and die. Note, we are not interested in the kernel threads because > they are not oom killable. > > Accessing suspended devices by a userspace processes shouldn't be an > issue because devices are suspended only after userspace is already > frozen and oom is disabled at that time. > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > context but it seems it is able to cope with signals because it > explicitly checks for pending signals so we should be safe. > > Other than that userspace accesses the fridge only from the > signal handling routines so we are able to handle SIGKILL without any > negative side effects. > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 9:25 ` KAMEZAWA Hiroyuki 0 siblings, 0 replies; 88+ messages in thread From: KAMEZAWA Hiroyuki @ 2011-09-26 9:25 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Rafael J. Wysocki, Tejun Heo, Rusty Russell On Mon, 26 Sep 2011 11:14:40 +0200 Michal Hocko <mhocko@suse.cz> wrote: > On Mon 26-09-11 01:56:57, David Rientjes wrote: > > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > > index 626303b..b9774f3 100644 > > > --- a/mm/oom_kill.c > > > +++ b/mm/oom_kill.c > > > @@ -32,6 +32,7 @@ > > > #include <linux/mempolicy.h> > > > #include <linux/security.h> > > > #include <linux/ptrace.h> > > > +#include <linux/freezer.h> > > > > > > int sysctl_panic_on_oom; > > > int sysctl_oom_kill_allocating_task; > > > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > > task_pid_nr(q), q->comm); > > > task_unlock(q); > > > force_sig(SIGKILL, q); > > > + > > > + if (frozen(q)) > > > + thaw_process(q); > > > } > > > > > > set_tsk_thread_flag(p, TIF_MEMDIE); > > > > This is in the wrong place, oom_kill_task() iterates over all threads that > > are _not_ in the same thread group as the chosen thread and kills them > > without giving them access to memory reserves. The chosen task, p, could > > still be frozen and may not exit. > > Ahh, right you are. I ave missed that one. Updated patch bellow. > > > > > Once that's fixed, feel free to add my > > > > Acked-by: David Rientjes <rientjes@google.com> > > Thanks > > > > > once Rafael sends his acked-by or reviewed-by. > --- > From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@suse.cz> > Date: Fri, 16 Sep 2011 11:23:15 +0200 > Subject: [PATCH] oom: do not live lock on frozen tasks > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > that OOM can end up in a live lock if select_bad_process picks up a frozen > task. > Unfortunately we cannot mark such processes as unkillable to ignore them > because we could panic the system even though there is a chance that > somebody could thaw the process so we can make a forward process (e.g. a > process from another cpuset or with a different nodemask). > > Let's thaw an OOM selected frozen process right after we've sent fatal > signal from oom_kill_task. > Thawing is safe if the frozen task doesn't access any suspended device > (e.g. by ioctl) on the way out to the userspace where we handle the > signal and die. Note, we are not interested in the kernel threads because > they are not oom killable. > > Accessing suspended devices by a userspace processes shouldn't be an > issue because devices are suspended only after userspace is already > frozen and oom is disabled at that time. > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > context but it seems it is able to cope with signals because it > explicitly checks for pending signals so we should be safe. > > Other than that userspace accesses the fridge only from the > signal handling routines so we are able to handle SIGKILL without any > negative side effects. > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 9:25 ` KAMEZAWA Hiroyuki @ 2011-09-26 9:32 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 9:32 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Rafael J. Wysocki, Tejun Heo, Rusty Russell On Mon 26-09-11 18:25:36, KAMEZAWA Hiroyuki wrote: > On Mon, 26 Sep 2011 11:14:40 +0200 > Michal Hocko <mhocko@suse.cz> wrote: > > > On Mon 26-09-11 01:56:57, David Rientjes wrote: > > > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > > > index 626303b..b9774f3 100644 > > > > --- a/mm/oom_kill.c > > > > +++ b/mm/oom_kill.c > > > > @@ -32,6 +32,7 @@ > > > > #include <linux/mempolicy.h> > > > > #include <linux/security.h> > > > > #include <linux/ptrace.h> > > > > +#include <linux/freezer.h> > > > > > > > > int sysctl_panic_on_oom; > > > > int sysctl_oom_kill_allocating_task; > > > > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > > > task_pid_nr(q), q->comm); > > > > task_unlock(q); > > > > force_sig(SIGKILL, q); > > > > + > > > > + if (frozen(q)) > > > > + thaw_process(q); > > > > } > > > > > > > > set_tsk_thread_flag(p, TIF_MEMDIE); > > > > > > This is in the wrong place, oom_kill_task() iterates over all threads that > > > are _not_ in the same thread group as the chosen thread and kills them > > > without giving them access to memory reserves. The chosen task, p, could > > > still be frozen and may not exit. > > > > Ahh, right you are. I ave missed that one. Updated patch bellow. > > > > > > > > Once that's fixed, feel free to add my > > > > > > Acked-by: David Rientjes <rientjes@google.com> > > > > Thanks > > > > > > > > once Rafael sends his acked-by or reviewed-by. > > --- > > From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 > > From: Michal Hocko <mhocko@suse.cz> > > Date: Fri, 16 Sep 2011 11:23:15 +0200 > > Subject: [PATCH] oom: do not live lock on frozen tasks > > > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > > that OOM can end up in a live lock if select_bad_process picks up a frozen > > task. > > Unfortunately we cannot mark such processes as unkillable to ignore them > > because we could panic the system even though there is a chance that > > somebody could thaw the process so we can make a forward process (e.g. a > > process from another cpuset or with a different nodemask). > > > > Let's thaw an OOM selected frozen process right after we've sent fatal > > signal from oom_kill_task. > > Thawing is safe if the frozen task doesn't access any suspended device > > (e.g. by ioctl) on the way out to the userspace where we handle the > > signal and die. Note, we are not interested in the kernel threads because > > they are not oom killable. > > > > Accessing suspended devices by a userspace processes shouldn't be an > > issue because devices are suspended only after userspace is already > > frozen and oom is disabled at that time. > > > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > > context but it seems it is able to cope with signals because it > > explicitly checks for pending signals so we should be safe. > > > > Other than that userspace accesses the fridge only from the > > signal handling routines so we are able to handle SIGKILL without any > > negative side effects. > > > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> > > Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Thanks -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 9:32 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 9:32 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Rafael J. Wysocki, Tejun Heo, Rusty Russell On Mon 26-09-11 18:25:36, KAMEZAWA Hiroyuki wrote: > On Mon, 26 Sep 2011 11:14:40 +0200 > Michal Hocko <mhocko@suse.cz> wrote: > > > On Mon 26-09-11 01:56:57, David Rientjes wrote: > > > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > > > index 626303b..b9774f3 100644 > > > > --- a/mm/oom_kill.c > > > > +++ b/mm/oom_kill.c > > > > @@ -32,6 +32,7 @@ > > > > #include <linux/mempolicy.h> > > > > #include <linux/security.h> > > > > #include <linux/ptrace.h> > > > > +#include <linux/freezer.h> > > > > > > > > int sysctl_panic_on_oom; > > > > int sysctl_oom_kill_allocating_task; > > > > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > > > task_pid_nr(q), q->comm); > > > > task_unlock(q); > > > > force_sig(SIGKILL, q); > > > > + > > > > + if (frozen(q)) > > > > + thaw_process(q); > > > > } > > > > > > > > set_tsk_thread_flag(p, TIF_MEMDIE); > > > > > > This is in the wrong place, oom_kill_task() iterates over all threads that > > > are _not_ in the same thread group as the chosen thread and kills them > > > without giving them access to memory reserves. The chosen task, p, could > > > still be frozen and may not exit. > > > > Ahh, right you are. I ave missed that one. Updated patch bellow. > > > > > > > > Once that's fixed, feel free to add my > > > > > > Acked-by: David Rientjes <rientjes@google.com> > > > > Thanks > > > > > > > > once Rafael sends his acked-by or reviewed-by. > > --- > > From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 > > From: Michal Hocko <mhocko@suse.cz> > > Date: Fri, 16 Sep 2011 11:23:15 +0200 > > Subject: [PATCH] oom: do not live lock on frozen tasks > > > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > > that OOM can end up in a live lock if select_bad_process picks up a frozen > > task. > > Unfortunately we cannot mark such processes as unkillable to ignore them > > because we could panic the system even though there is a chance that > > somebody could thaw the process so we can make a forward process (e.g. a > > process from another cpuset or with a different nodemask). > > > > Let's thaw an OOM selected frozen process right after we've sent fatal > > signal from oom_kill_task. > > Thawing is safe if the frozen task doesn't access any suspended device > > (e.g. by ioctl) on the way out to the userspace where we handle the > > signal and die. Note, we are not interested in the kernel threads because > > they are not oom killable. > > > > Accessing suspended devices by a userspace processes shouldn't be an > > issue because devices are suspended only after userspace is already > > frozen and oom is disabled at that time. > > > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > > context but it seems it is able to cope with signals because it > > explicitly checks for pending signals so we should be safe. > > > > Other than that userspace accesses the fridge only from the > > signal handling routines so we are able to handle SIGKILL without any > > negative side effects. > > > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> > > Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Thanks -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 9:14 ` Michal Hocko @ 2011-09-26 15:51 ` Rafael J. Wysocki -1 siblings, 0 replies; 88+ messages in thread From: Rafael J. Wysocki @ 2011-09-26 15:51 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Monday, September 26, 2011, Michal Hocko wrote: > On Mon 26-09-11 01:56:57, David Rientjes wrote: > > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > > index 626303b..b9774f3 100644 > > > --- a/mm/oom_kill.c > > > +++ b/mm/oom_kill.c > > > @@ -32,6 +32,7 @@ > > > #include <linux/mempolicy.h> > > > #include <linux/security.h> > > > #include <linux/ptrace.h> > > > +#include <linux/freezer.h> > > > > > > int sysctl_panic_on_oom; > > > int sysctl_oom_kill_allocating_task; > > > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > > task_pid_nr(q), q->comm); > > > task_unlock(q); > > > force_sig(SIGKILL, q); > > > + > > > + if (frozen(q)) > > > + thaw_process(q); > > > } > > > > > > set_tsk_thread_flag(p, TIF_MEMDIE); > > > > This is in the wrong place, oom_kill_task() iterates over all threads that > > are _not_ in the same thread group as the chosen thread and kills them > > without giving them access to memory reserves. The chosen task, p, could > > still be frozen and may not exit. > > Ahh, right you are. I ave missed that one. Updated patch bellow. > > > > > Once that's fixed, feel free to add my > > > > Acked-by: David Rientjes <rientjes@google.com> > > Thanks > > > > > once Rafael sends his acked-by or reviewed-by. > --- > From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@suse.cz> > Date: Fri, 16 Sep 2011 11:23:15 +0200 > Subject: [PATCH] oom: do not live lock on frozen tasks > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > that OOM can end up in a live lock if select_bad_process picks up a frozen > task. > Unfortunately we cannot mark such processes as unkillable to ignore them > because we could panic the system even though there is a chance that > somebody could thaw the process so we can make a forward process (e.g. a > process from another cpuset or with a different nodemask). > > Let's thaw an OOM selected frozen process right after we've sent fatal > signal from oom_kill_task. > Thawing is safe if the frozen task doesn't access any suspended device > (e.g. by ioctl) on the way out to the userspace where we handle the > signal and die. Note, we are not interested in the kernel threads because > they are not oom killable. > > Accessing suspended devices by a userspace processes shouldn't be an > issue because devices are suspended only after userspace is already > frozen and oom is disabled at that time. > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > context but it seems it is able to cope with signals because it > explicitly checks for pending signals so we should be safe. > > Other than that userspace accesses the fridge only from the > signal handling routines so we are able to handle SIGKILL without any > negative side effects. > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> > --- > mm/oom_kill.c | 6 ++++++ > 1 files changed, 6 insertions(+), 0 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 626303b..c419a7e 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -32,6 +32,7 @@ > #include <linux/mempolicy.h> > #include <linux/security.h> > #include <linux/ptrace.h> > +#include <linux/freezer.h> > > int sysctl_panic_on_oom; > int sysctl_oom_kill_allocating_task; > @@ -451,10 +452,15 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > task_pid_nr(q), q->comm); > task_unlock(q); > force_sig(SIGKILL, q); > + > + if (frozen(q)) > + thaw_process(q); > } > > set_tsk_thread_flag(p, TIF_MEMDIE); > force_sig(SIGKILL, p); > + if (frozen(p)) > + thaw_process(p); > > return 0; > } > ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 15:51 ` Rafael J. Wysocki 0 siblings, 0 replies; 88+ messages in thread From: Rafael J. Wysocki @ 2011-09-26 15:51 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Monday, September 26, 2011, Michal Hocko wrote: > On Mon 26-09-11 01:56:57, David Rientjes wrote: > > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > > index 626303b..b9774f3 100644 > > > --- a/mm/oom_kill.c > > > +++ b/mm/oom_kill.c > > > @@ -32,6 +32,7 @@ > > > #include <linux/mempolicy.h> > > > #include <linux/security.h> > > > #include <linux/ptrace.h> > > > +#include <linux/freezer.h> > > > > > > int sysctl_panic_on_oom; > > > int sysctl_oom_kill_allocating_task; > > > @@ -451,6 +452,9 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > > task_pid_nr(q), q->comm); > > > task_unlock(q); > > > force_sig(SIGKILL, q); > > > + > > > + if (frozen(q)) > > > + thaw_process(q); > > > } > > > > > > set_tsk_thread_flag(p, TIF_MEMDIE); > > > > This is in the wrong place, oom_kill_task() iterates over all threads that > > are _not_ in the same thread group as the chosen thread and kills them > > without giving them access to memory reserves. The chosen task, p, could > > still be frozen and may not exit. > > Ahh, right you are. I ave missed that one. Updated patch bellow. > > > > > Once that's fixed, feel free to add my > > > > Acked-by: David Rientjes <rientjes@google.com> > > Thanks > > > > > once Rafael sends his acked-by or reviewed-by. > --- > From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@suse.cz> > Date: Fri, 16 Sep 2011 11:23:15 +0200 > Subject: [PATCH] oom: do not live lock on frozen tasks > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > that OOM can end up in a live lock if select_bad_process picks up a frozen > task. > Unfortunately we cannot mark such processes as unkillable to ignore them > because we could panic the system even though there is a chance that > somebody could thaw the process so we can make a forward process (e.g. a > process from another cpuset or with a different nodemask). > > Let's thaw an OOM selected frozen process right after we've sent fatal > signal from oom_kill_task. > Thawing is safe if the frozen task doesn't access any suspended device > (e.g. by ioctl) on the way out to the userspace where we handle the > signal and die. Note, we are not interested in the kernel threads because > they are not oom killable. > > Accessing suspended devices by a userspace processes shouldn't be an > issue because devices are suspended only after userspace is already > frozen and oom is disabled at that time. > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > context but it seems it is able to cope with signals because it > explicitly checks for pending signals so we should be safe. > > Other than that userspace accesses the fridge only from the > signal handling routines so we are able to handle SIGKILL without any > negative side effects. > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> > --- > mm/oom_kill.c | 6 ++++++ > 1 files changed, 6 insertions(+), 0 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 626303b..c419a7e 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -32,6 +32,7 @@ > #include <linux/mempolicy.h> > #include <linux/security.h> > #include <linux/ptrace.h> > +#include <linux/freezer.h> > > int sysctl_panic_on_oom; > int sysctl_oom_kill_allocating_task; > @@ -451,10 +452,15 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > task_pid_nr(q), q->comm); > task_unlock(q); > force_sig(SIGKILL, q); > + > + if (frozen(q)) > + thaw_process(q); > } > > set_tsk_thread_flag(p, TIF_MEMDIE); > force_sig(SIGKILL, p); > + if (frozen(p)) > + thaw_process(p); > > return 0; > } > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 15:51 ` Rafael J. Wysocki @ 2011-09-26 18:28 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 18:28 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Mon 26-09-11 17:51:40, Rafael J. Wysocki wrote: > On Monday, September 26, 2011, Michal Hocko wrote: [...] > > From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 > > From: Michal Hocko <mhocko@suse.cz> > > Date: Fri, 16 Sep 2011 11:23:15 +0200 > > Subject: [PATCH] oom: do not live lock on frozen tasks > > > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > > that OOM can end up in a live lock if select_bad_process picks up a frozen > > task. > > Unfortunately we cannot mark such processes as unkillable to ignore them > > because we could panic the system even though there is a chance that > > somebody could thaw the process so we can make a forward process (e.g. a > > process from another cpuset or with a different nodemask). > > > > Let's thaw an OOM selected frozen process right after we've sent fatal > > signal from oom_kill_task. > > Thawing is safe if the frozen task doesn't access any suspended device > > (e.g. by ioctl) on the way out to the userspace where we handle the > > signal and die. Note, we are not interested in the kernel threads because > > they are not oom killable. > > > > Accessing suspended devices by a userspace processes shouldn't be an > > issue because devices are suspended only after userspace is already > > frozen and oom is disabled at that time. > > > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > > context but it seems it is able to cope with signals because it > > explicitly checks for pending signals so we should be safe. > > > > Other than that userspace accesses the fridge only from the > > signal handling routines so we are able to handle SIGKILL without any > > negative side effects. > > > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> > > Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Thanks. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 18:28 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 18:28 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Mon 26-09-11 17:51:40, Rafael J. Wysocki wrote: > On Monday, September 26, 2011, Michal Hocko wrote: [...] > > From f935ed4558c2fb033ef5c14e02b28e12a615f80e Mon Sep 17 00:00:00 2001 > > From: Michal Hocko <mhocko@suse.cz> > > Date: Fri, 16 Sep 2011 11:23:15 +0200 > > Subject: [PATCH] oom: do not live lock on frozen tasks > > > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > > that OOM can end up in a live lock if select_bad_process picks up a frozen > > task. > > Unfortunately we cannot mark such processes as unkillable to ignore them > > because we could panic the system even though there is a chance that > > somebody could thaw the process so we can make a forward process (e.g. a > > process from another cpuset or with a different nodemask). > > > > Let's thaw an OOM selected frozen process right after we've sent fatal > > signal from oom_kill_task. > > Thawing is safe if the frozen task doesn't access any suspended device > > (e.g. by ioctl) on the way out to the userspace where we handle the > > signal and die. Note, we are not interested in the kernel threads because > > they are not oom killable. > > > > Accessing suspended devices by a userspace processes shouldn't be an > > issue because devices are suspended only after userspace is already > > frozen and oom is disabled at that time. > > > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > > context but it seems it is able to cope with signals because it > > explicitly checks for pending signals so we should be safe. > > > > Other than that userspace accesses the fridge only from the > > signal handling routines so we are able to handle SIGKILL without any > > negative side effects. > > > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> > > Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Thanks. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 15:51 ` Rafael J. Wysocki @ 2011-09-27 1:03 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-09-27 1:03 UTC (permalink / raw) To: Rafael J. Wysocki, Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Mon, 26 Sep 2011, Rafael J. Wysocki wrote: > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > > that OOM can end up in a live lock if select_bad_process picks up a frozen > > task. > > Unfortunately we cannot mark such processes as unkillable to ignore them > > because we could panic the system even though there is a chance that > > somebody could thaw the process so we can make a forward process (e.g. a > > process from another cpuset or with a different nodemask). > > > > Let's thaw an OOM selected frozen process right after we've sent fatal > > signal from oom_kill_task. > > Thawing is safe if the frozen task doesn't access any suspended device > > (e.g. by ioctl) on the way out to the userspace where we handle the > > signal and die. Note, we are not interested in the kernel threads because > > they are not oom killable. > > > > Accessing suspended devices by a userspace processes shouldn't be an > > issue because devices are suspended only after userspace is already > > frozen and oom is disabled at that time. > > > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > > context but it seems it is able to cope with signals because it > > explicitly checks for pending signals so we should be safe. > > > > Other than that userspace accesses the fridge only from the > > signal handling routines so we are able to handle SIGKILL without any > > negative side effects. > > > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> > > Acked-by: Rafael J. Wysocki <rjw@sisk.pl> > Acked-by: David Rientjes <rientjes@google.com> Although this still seems to be problematic if the chosen thread gets frozen before the SIGKILL can be handled. We don't have any checks for fatal_signal_pending() when freezing threads and waiting for them to exit? Michal, could you send Andrew your revised patch with all the acked-bys? Thanks! ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-27 1:03 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-09-27 1:03 UTC (permalink / raw) To: Rafael J. Wysocki, Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Mon, 26 Sep 2011, Rafael J. Wysocki wrote: > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > > that OOM can end up in a live lock if select_bad_process picks up a frozen > > task. > > Unfortunately we cannot mark such processes as unkillable to ignore them > > because we could panic the system even though there is a chance that > > somebody could thaw the process so we can make a forward process (e.g. a > > process from another cpuset or with a different nodemask). > > > > Let's thaw an OOM selected frozen process right after we've sent fatal > > signal from oom_kill_task. > > Thawing is safe if the frozen task doesn't access any suspended device > > (e.g. by ioctl) on the way out to the userspace where we handle the > > signal and die. Note, we are not interested in the kernel threads because > > they are not oom killable. > > > > Accessing suspended devices by a userspace processes shouldn't be an > > issue because devices are suspended only after userspace is already > > frozen and oom is disabled at that time. > > > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > > context but it seems it is able to cope with signals because it > > explicitly checks for pending signals so we should be safe. > > > > Other than that userspace accesses the fridge only from the > > signal handling routines so we are able to handle SIGKILL without any > > negative side effects. > > > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> > > Acked-by: Rafael J. Wysocki <rjw@sisk.pl> > Acked-by: David Rientjes <rientjes@google.com> Although this still seems to be problematic if the chosen thread gets frozen before the SIGKILL can be handled. We don't have any checks for fatal_signal_pending() when freezing threads and waiting for them to exit? Michal, could you send Andrew your revised patch with all the acked-bys? Thanks! -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-27 1:03 ` David Rientjes @ 2011-09-27 7:52 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-27 7:52 UTC (permalink / raw) To: David Rientjes Cc: Rafael J. Wysocki, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Mon 26-09-11 18:03:26, David Rientjes wrote: > On Mon, 26 Sep 2011, Rafael J. Wysocki wrote: > > > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > > > that OOM can end up in a live lock if select_bad_process picks up a frozen > > > task. > > > Unfortunately we cannot mark such processes as unkillable to ignore them > > > because we could panic the system even though there is a chance that > > > somebody could thaw the process so we can make a forward process (e.g. a > > > process from another cpuset or with a different nodemask). > > > > > > Let's thaw an OOM selected frozen process right after we've sent fatal > > > signal from oom_kill_task. > > > Thawing is safe if the frozen task doesn't access any suspended device > > > (e.g. by ioctl) on the way out to the userspace where we handle the > > > signal and die. Note, we are not interested in the kernel threads because > > > they are not oom killable. > > > > > > Accessing suspended devices by a userspace processes shouldn't be an > > > issue because devices are suspended only after userspace is already > > > frozen and oom is disabled at that time. > > > > > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > > > context but it seems it is able to cope with signals because it > > > explicitly checks for pending signals so we should be safe. > > > > > > Other than that userspace accesses the fridge only from the > > > signal handling routines so we are able to handle SIGKILL without any > > > negative side effects. > > > > > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > > > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> > > > > Acked-by: Rafael J. Wysocki <rjw@sisk.pl> > > > > Acked-by: David Rientjes <rientjes@google.com> Thanks! > > Although this still seems to be problematic if the chosen thread gets > frozen before the SIGKILL can be handled. We don't have any checks for > fatal_signal_pending() when freezing threads and waiting for them to exit? I guess you mean a situation when select_bad_process picks up a process which is not marked as frozen yet but we send SIGKILL right before schedule is called in refrigerator. In that case either schedule should catch it by signal_pending_state check or we will pick it up next OOM round when we pick up the same process (if nothing else is eligible). Or am I missing something? > Michal, could you send Andrew your revised patch with all the acked-bys? Yes I will. I would just like to hear back from Konstantin who originally reported the issue. Maybe he has a test case. > > Thanks! > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-27 7:52 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-27 7:52 UTC (permalink / raw) To: David Rientjes Cc: Rafael J. Wysocki, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Mon 26-09-11 18:03:26, David Rientjes wrote: > On Mon, 26 Sep 2011, Rafael J. Wysocki wrote: > > > > Konstantin Khlebnikov has reported (https://lkml.org/lkml/2011/8/23/45) > > > that OOM can end up in a live lock if select_bad_process picks up a frozen > > > task. > > > Unfortunately we cannot mark such processes as unkillable to ignore them > > > because we could panic the system even though there is a chance that > > > somebody could thaw the process so we can make a forward process (e.g. a > > > process from another cpuset or with a different nodemask). > > > > > > Let's thaw an OOM selected frozen process right after we've sent fatal > > > signal from oom_kill_task. > > > Thawing is safe if the frozen task doesn't access any suspended device > > > (e.g. by ioctl) on the way out to the userspace where we handle the > > > signal and die. Note, we are not interested in the kernel threads because > > > they are not oom killable. > > > > > > Accessing suspended devices by a userspace processes shouldn't be an > > > issue because devices are suspended only after userspace is already > > > frozen and oom is disabled at that time. > > > > > > run_guest (drivers/lguest/core.c) calls try_to_freeze with an user > > > context but it seems it is able to cope with signals because it > > > explicitly checks for pending signals so we should be safe. > > > > > > Other than that userspace accesses the fridge only from the > > > signal handling routines so we are able to handle SIGKILL without any > > > negative side effects. > > > > > > Signed-off-by: Michal Hocko <mhocko@suse.cz> > > > Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org> > > > > Acked-by: Rafael J. Wysocki <rjw@sisk.pl> > > > > Acked-by: David Rientjes <rientjes@google.com> Thanks! > > Although this still seems to be problematic if the chosen thread gets > frozen before the SIGKILL can be handled. We don't have any checks for > fatal_signal_pending() when freezing threads and waiting for them to exit? I guess you mean a situation when select_bad_process picks up a process which is not marked as frozen yet but we send SIGKILL right before schedule is called in refrigerator. In that case either schedule should catch it by signal_pending_state check or we will pick it up next OOM round when we pick up the same process (if nothing else is eligible). Or am I missing something? > Michal, could you send Andrew your revised patch with all the acked-bys? Yes I will. I would just like to hear back from Konstantin who originally reported the issue. Maybe he has a test case. > > Thanks! > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-27 7:52 ` Michal Hocko @ 2011-09-27 18:30 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-09-27 18:30 UTC (permalink / raw) To: Michal Hocko Cc: Rafael J. Wysocki, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Tue, 27 Sep 2011, Michal Hocko wrote: > I guess you mean a situation when select_bad_process picks up a process > which is not marked as frozen yet but we send SIGKILL right before > schedule is called in refrigerator. > In that case either schedule should catch it by signal_pending_state > check or we will pick it up next OOM round when we pick up the same > process (if nothing else is eligible). Or am I missing something? > That doesn't close the race, the oom killer will see the presence of an eligible TIF_MEMDIE thread in select_bad_process() and simply return to the page allocator. You'd need to thaw it there as well and hope that nothing now or in the future will get into an endless thaw-freeze-thaw loop in the exit path. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-27 18:30 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-09-27 18:30 UTC (permalink / raw) To: Michal Hocko Cc: Rafael J. Wysocki, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Tejun Heo, Rusty Russell On Tue, 27 Sep 2011, Michal Hocko wrote: > I guess you mean a situation when select_bad_process picks up a process > which is not marked as frozen yet but we send SIGKILL right before > schedule is called in refrigerator. > In that case either schedule should catch it by signal_pending_state > check or we will pick it up next OOM round when we pick up the same > process (if nothing else is eligible). Or am I missing something? > That doesn't close the race, the oom killer will see the presence of an eligible TIF_MEMDIE thread in select_bad_process() and simply return to the page allocator. You'd need to thaw it there as well and hope that nothing now or in the future will get into an endless thaw-freeze-thaw loop in the exit path. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 8:28 ` Michal Hocko @ 2011-09-26 10:28 ` Rusty Russell -1 siblings, 0 replies; 88+ messages in thread From: Rusty Russell @ 2011-09-26 10:28 UTC (permalink / raw) To: Michal Hocko, David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@suse.cz> wrote: > On Fri 26-08-11 11:13:40, David Rientjes wrote: > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > > followed by a SIGKILL if that task is selected for oom kill without an > > heuristic change. Not sure if that's possible, so we'll wait for Rafael > > to chime in. > > We have discussed that with Rafael and it should be safe to do that. See > the patch bellow. > The only place I am not entirely sure about is run_guest > (drivers/lguest/core.c). It seems that the code is able to cope with > signals but it also calls lguest_arch_run_guest after try_to_freeze. Yes; if you want to kill things in the refrigerator(), then will a if (cpu->lg->dead || task_is_dead(current)) break; Work? That break means we return to the read() syscall pretty much immediately. Thanks for the CC, Rusty. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 10:28 ` Rusty Russell 0 siblings, 0 replies; 88+ messages in thread From: Rusty Russell @ 2011-09-26 10:28 UTC (permalink / raw) To: Michal Hocko, David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@suse.cz> wrote: > On Fri 26-08-11 11:13:40, David Rientjes wrote: > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > > followed by a SIGKILL if that task is selected for oom kill without an > > heuristic change. Not sure if that's possible, so we'll wait for Rafael > > to chime in. > > We have discussed that with Rafael and it should be safe to do that. See > the patch bellow. > The only place I am not entirely sure about is run_guest > (drivers/lguest/core.c). It seems that the code is able to cope with > signals but it also calls lguest_arch_run_guest after try_to_freeze. Yes; if you want to kill things in the refrigerator(), then will a if (cpu->lg->dead || task_is_dead(current)) break; Work? That break means we return to the read() syscall pretty much immediately. Thanks for the CC, Rusty. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 10:28 ` Rusty Russell @ 2011-09-26 11:05 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 11:05 UTC (permalink / raw) To: Rusty Russell Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo On Mon 26-09-11 19:58:50, Rusty Russell wrote: > On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@suse.cz> wrote: > > On Fri 26-08-11 11:13:40, David Rientjes wrote: > > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > > > followed by a SIGKILL if that task is selected for oom kill without an > > > heuristic change. Not sure if that's possible, so we'll wait for Rafael > > > to chime in. > > > > We have discussed that with Rafael and it should be safe to do that. See > > the patch bellow. > > The only place I am not entirely sure about is run_guest > > (drivers/lguest/core.c). It seems that the code is able to cope with > > signals but it also calls lguest_arch_run_guest after try_to_freeze. > > Yes; if you want to kill things in the refrigerator(), then will a > > if (cpu->lg->dead || task_is_dead(current)) > break; > > Work? The task is not dead yet. We should rather check for pending signals. Can we just move try_to_freeze up before the pending signals check? diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c index 2535933..a513509 100644 --- a/drivers/lguest/core.c +++ b/drivers/lguest/core.c @@ -232,6 +232,12 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user) } } + /* + * All long-lived kernel loops need to check with this horrible + * thing called the freezer. If the Host is trying to suspend, + * it stops us. + */ + try_to_freeze(); /* Check for signals */ if (signal_pending(current)) return -ERESTARTSYS; @@ -246,13 +252,6 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user) try_deliver_interrupt(cpu, irq, more); /* - * All long-lived kernel loops need to check with this horrible - * thing called the freezer. If the Host is trying to suspend, - * it stops us. - */ - try_to_freeze(); - - /* * Just make absolutely sure the Guest is still alive. One of * those hypercalls could have been fatal, for example. */ > That break means we return to the read() syscall pretty much > immediately. > > Thanks for the CC, > Rusty. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply related [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-26 11:05 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 11:05 UTC (permalink / raw) To: Rusty Russell Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo On Mon 26-09-11 19:58:50, Rusty Russell wrote: > On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@suse.cz> wrote: > > On Fri 26-08-11 11:13:40, David Rientjes wrote: > > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > > > followed by a SIGKILL if that task is selected for oom kill without an > > > heuristic change. Not sure if that's possible, so we'll wait for Rafael > > > to chime in. > > > > We have discussed that with Rafael and it should be safe to do that. See > > the patch bellow. > > The only place I am not entirely sure about is run_guest > > (drivers/lguest/core.c). It seems that the code is able to cope with > > signals but it also calls lguest_arch_run_guest after try_to_freeze. > > Yes; if you want to kill things in the refrigerator(), then will a > > if (cpu->lg->dead || task_is_dead(current)) > break; > > Work? The task is not dead yet. We should rather check for pending signals. Can we just move try_to_freeze up before the pending signals check? diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c index 2535933..a513509 100644 --- a/drivers/lguest/core.c +++ b/drivers/lguest/core.c @@ -232,6 +232,12 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user) } } + /* + * All long-lived kernel loops need to check with this horrible + * thing called the freezer. If the Host is trying to suspend, + * it stops us. + */ + try_to_freeze(); /* Check for signals */ if (signal_pending(current)) return -ERESTARTSYS; @@ -246,13 +252,6 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user) try_deliver_interrupt(cpu, irq, more); /* - * All long-lived kernel loops need to check with this horrible - * thing called the freezer. If the Host is trying to suspend, - * it stops us. - */ - try_to_freeze(); - - /* * Just make absolutely sure the Guest is still alive. One of * those hypercalls could have been fatal, for example. */ > That break means we return to the read() syscall pretty much > immediately. > > Thanks for the CC, > Rusty. -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks 2011-09-26 11:05 ` Michal Hocko @ 2011-09-27 2:21 ` Rusty Russell -1 siblings, 0 replies; 88+ messages in thread From: Rusty Russell @ 2011-09-27 2:21 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo On Mon, 26 Sep 2011 13:05:59 +0200, Michal Hocko <mhocko@suse.cz> wrote: > On Mon 26-09-11 19:58:50, Rusty Russell wrote: > > On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@suse.cz> wrote: > > > On Fri 26-08-11 11:13:40, David Rientjes wrote: > > > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > > > > followed by a SIGKILL if that task is selected for oom kill without an > > > > heuristic change. Not sure if that's possible, so we'll wait for Rafael > > > > to chime in. > > > > > > We have discussed that with Rafael and it should be safe to do that. See > > > the patch bellow. > > > The only place I am not entirely sure about is run_guest > > > (drivers/lguest/core.c). It seems that the code is able to cope with > > > signals but it also calls lguest_arch_run_guest after try_to_freeze. > > > > Yes; if you want to kill things in the refrigerator(), then will a > > > > if (cpu->lg->dead || task_is_dead(current)) > > break; > > > > Work? > > The task is not dead yet. We should rather check for pending signals. > Can we just move try_to_freeze up before the pending signals check? Yep, that works. Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cheers, Rusty. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 1/2] oom: do not live lock on frozen tasks @ 2011-09-27 2:21 ` Rusty Russell 0 siblings, 0 replies; 88+ messages in thread From: Rusty Russell @ 2011-09-27 2:21 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo On Mon, 26 Sep 2011 13:05:59 +0200, Michal Hocko <mhocko@suse.cz> wrote: > On Mon 26-09-11 19:58:50, Rusty Russell wrote: > > On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@suse.cz> wrote: > > > On Fri 26-08-11 11:13:40, David Rientjes wrote: > > > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > > > > followed by a SIGKILL if that task is selected for oom kill without an > > > > heuristic change. Not sure if that's possible, so we'll wait for Rafael > > > > to chime in. > > > > > > We have discussed that with Rafael and it should be safe to do that. See > > > the patch bellow. > > > The only place I am not entirely sure about is run_guest > > > (drivers/lguest/core.c). It seems that the code is able to cope with > > > signals but it also calls lguest_arch_run_guest after try_to_freeze. > > > > Yes; if you want to kill things in the refrigerator(), then will a > > > > if (cpu->lg->dead || task_is_dead(current)) > > break; > > > > Work? > > The task is not dead yet. We should rather check for pending signals. > Can we just move try_to_freeze up before the pending signals check? Yep, that works. Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cheers, Rusty. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* [PATCH] lguest: move process freezing before pending signals check 2011-09-27 2:21 ` Rusty Russell @ 2011-09-27 7:03 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-27 7:03 UTC (permalink / raw) To: Rusty Russell Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo On Tue 27-09-11 11:51:00, Rusty Russell wrote: > On Mon, 26 Sep 2011 13:05:59 +0200, Michal Hocko <mhocko@suse.cz> wrote: > > On Mon 26-09-11 19:58:50, Rusty Russell wrote: > > > On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@suse.cz> wrote: > > > > On Fri 26-08-11 11:13:40, David Rientjes wrote: > > > > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > > > > > followed by a SIGKILL if that task is selected for oom kill without an > > > > > heuristic change. Not sure if that's possible, so we'll wait for Rafael > > > > > to chime in. > > > > > > > > We have discussed that with Rafael and it should be safe to do that. See > > > > the patch bellow. > > > > The only place I am not entirely sure about is run_guest > > > > (drivers/lguest/core.c). It seems that the code is able to cope with > > > > signals but it also calls lguest_arch_run_guest after try_to_freeze. > > > > > > Yes; if you want to kill things in the refrigerator(), then will a > > > > > > if (cpu->lg->dead || task_is_dead(current)) > > > break; > > > > > > Work? > > > > The task is not dead yet. We should rather check for pending signals. > > Can we just move try_to_freeze up before the pending signals check? > > Yep, that works. > > Acked-by: Rusty Russell <rusty@rustcorp.com.au> Thanks. The full patch bellow: --- >From c8b4f49ad85948b7362c50964d6c09f0e35a6537 Mon Sep 17 00:00:00 2001 From: Michal Hocko <mhocko@suse.cz> Date: Tue, 27 Sep 2011 08:56:03 +0200 Subject: [PATCH] lguest: move process freezing before pending signals check run_guest tries to freeze the current process after it has handled pending interrupts and before it calls lguest_arch_run_guest. This doesn't work nicely if the task has been killed while being frozen and when we want to handle that signal as soon as possible. Let's move try_to_freeze before we check for pending signal so that we can get out of the loop as soon as possible. Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Rusty Russell <rusty@rustcorp.com.au> --- drivers/lguest/core.c | 14 +++++++------- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c index 2535933..e7dda91 100644 --- a/drivers/lguest/core.c +++ b/drivers/lguest/core.c @@ -232,6 +232,13 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user) } } + /* + * All long-lived kernel loops need to check with this horrible + * thing called the freezer. If the Host is trying to suspend, + * it stops us. + */ + try_to_freeze(); + /* Check for signals */ if (signal_pending(current)) return -ERESTARTSYS; @@ -246,13 +253,6 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user) try_deliver_interrupt(cpu, irq, more); /* - * All long-lived kernel loops need to check with this horrible - * thing called the freezer. If the Host is trying to suspend, - * it stops us. - */ - try_to_freeze(); - - /* * Just make absolutely sure the Guest is still alive. One of * those hypercalls could have been fatal, for example. */ -- 1.7.5.4 -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply related [flat|nested] 88+ messages in thread
* [PATCH] lguest: move process freezing before pending signals check @ 2011-09-27 7:03 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-27 7:03 UTC (permalink / raw) To: Rusty Russell Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki, Tejun Heo On Tue 27-09-11 11:51:00, Rusty Russell wrote: > On Mon, 26 Sep 2011 13:05:59 +0200, Michal Hocko <mhocko@suse.cz> wrote: > > On Mon 26-09-11 19:58:50, Rusty Russell wrote: > > > On Mon, 26 Sep 2011 10:28:37 +0200, Michal Hocko <mhocko@suse.cz> wrote: > > > > On Fri 26-08-11 11:13:40, David Rientjes wrote: > > > > > I'd love to be able to do a thaw on a PF_FROZEN task in the oom killer > > > > > followed by a SIGKILL if that task is selected for oom kill without an > > > > > heuristic change. Not sure if that's possible, so we'll wait for Rafael > > > > > to chime in. > > > > > > > > We have discussed that with Rafael and it should be safe to do that. See > > > > the patch bellow. > > > > The only place I am not entirely sure about is run_guest > > > > (drivers/lguest/core.c). It seems that the code is able to cope with > > > > signals but it also calls lguest_arch_run_guest after try_to_freeze. > > > > > > Yes; if you want to kill things in the refrigerator(), then will a > > > > > > if (cpu->lg->dead || task_is_dead(current)) > > > break; > > > > > > Work? > > > > The task is not dead yet. We should rather check for pending signals. > > Can we just move try_to_freeze up before the pending signals check? > > Yep, that works. > > Acked-by: Rusty Russell <rusty@rustcorp.com.au> Thanks. The full patch bellow: --- ^ permalink raw reply [flat|nested] 88+ messages in thread
* [PATCH 2/2] oom: give bonus to frozen processes 2011-08-26 18:13 ` David Rientjes @ 2011-09-26 8:35 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 8:35 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 11:13:40, David Rientjes wrote: > On Fri, 26 Aug 2011, Michal Hocko wrote: [...] > > I am not saying the bonus is necessary, though. It depends on what > > the freezer is used for (e.g. freeze a process which went wild and > > debug what went wrong wouldn't welcome that somebody killed it or other > > (mis)use which relies on D state). [...] > If it actually does come down to a heuristic change, then it need not > happen in the oom killer: the freezing code would need to use > test_set_oom_score_adj() to temporarily reduce the oom_score_adj for that > task until it comes out of the refrigerator. We already use that in ksm > and swapoff to actually prefer threads, but we can use it to bias against > threads as well. Let's try it with a heuristic change first. If you really do not like it, we can move to oom_scode_adj. I like the heuristic change little bit more because it is at the same place as the root bonus. --- >From 2f8fbab922f353eebaa9fa1c5fcb59a59cf2f441 Mon Sep 17 00:00:00 2001 From: Michal Hocko <mhocko@suse.cz> Date: Fri, 16 Sep 2011 15:51:25 +0200 Subject: [PATCH 2/2] oom: give bonus to frozen processes We do not want to kill frozen processes if there are some processes that could be killed instead. Frozen tasks are in uninterruptible sleep and somebody might rely on that. Let's give those tasks a bonus so that they would be selected only when really necessary. The size of the bonus is questionable but let's start with the same bonus we give to root processes. Signed-off-by: Michal Hocko <mhocko@suse.cz> --- include/linux/oom.h | 6 ++++++ mm/oom_kill.c | 10 +++++++++- 2 files changed, 15 insertions(+), 1 deletions(-) diff --git a/include/linux/oom.h b/include/linux/oom.h index 13b7b02e..960e2de 100644 --- a/include/linux/oom.h +++ b/include/linux/oom.h @@ -13,6 +13,12 @@ #define OOM_ADJUST_MAX 15 /* + * Bonus (3%) for a task that shouldn't be killed unless necessary. + * We give this bonus to root and frozen tasks currently. + */ +#define OOM_BONUS 30 + +/* * /proc/<pid>/oom_score_adj set to OOM_SCORE_ADJ_MIN disables oom killing for * pid. */ diff --git a/mm/oom_kill.c b/mm/oom_kill.c index b9774f3..4aea1b3 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -205,7 +205,15 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem, * implementation used by LSMs. */ if (has_capability_noaudit(p, CAP_SYS_ADMIN)) - points -= 30; + points -= OOM_BONUS; + + /* + * Do not try to kill frozen tasks unless there is nothing else to kill. + * We do not want to give it 1 point because we still want to select a good + * candidate among all frozen tasks. Let's give it a reasonable bonus. + */ + if (frozen(p)) + points -= OOM_BONUS; /* * /proc/pid/oom_score_adj ranges from -1000 to +1000 such that it may -- 1.7.5.4 -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply related [flat|nested] 88+ messages in thread
* [PATCH 2/2] oom: give bonus to frozen processes @ 2011-09-26 8:35 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 8:35 UTC (permalink / raw) To: David Rientjes Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 11:13:40, David Rientjes wrote: > On Fri, 26 Aug 2011, Michal Hocko wrote: [...] > > I am not saying the bonus is necessary, though. It depends on what > > the freezer is used for (e.g. freeze a process which went wild and > > debug what went wrong wouldn't welcome that somebody killed it or other > > (mis)use which relies on D state). [...] > If it actually does come down to a heuristic change, then it need not > happen in the oom killer: the freezing code would need to use > test_set_oom_score_adj() to temporarily reduce the oom_score_adj for that > task until it comes out of the refrigerator. We already use that in ksm > and swapoff to actually prefer threads, but we can use it to bias against > threads as well. Let's try it with a heuristic change first. If you really do not like it, we can move to oom_scode_adj. I like the heuristic change little bit more because it is at the same place as the root bonus. --- ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 2/2] oom: give bonus to frozen processes 2011-09-26 8:35 ` Michal Hocko @ 2011-09-26 9:02 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-09-26 9:02 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Mon, 26 Sep 2011, Michal Hocko wrote: > Let's try it with a heuristic change first. If you really do not like > it, we can move to oom_scode_adj. I like the heuristic change little bit > more because it is at the same place as the root bonus. The problem with the bonus is that, as mentioned previously, it doesn't protect against ANYTHING for the case you're trying to fix. This won't panic the machine because all killable threads are guaranteed to have a non-zero badness score, but it's a very valid configuration to have either - all eligible threads (system-wide, shared cpuset, shared mempolicy nodes) are frozen, or - all eligible frozen threads use <5% of memory whereas all other eligible killable threads use 1% of available memory. and that means the oom killer will repeatedly select those threads and the livelock still exists unless you can guarantee that they are successfully thawed, that thawing them in all situations is safe, and that once thawed they will make a timely exit. Additionally, I don't think biasing against frozen tasks makes sense from a heusritic standpoint of the oom killer. Why would we want give non-frozen tasks that are actually getting work done a preference over a task that is frozen and doing absolutely nothing? It seems like that's backwards and that we'd actually prefer killing the task doing nothing so it can free its memory. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 2/2] oom: give bonus to frozen processes @ 2011-09-26 9:02 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-09-26 9:02 UTC (permalink / raw) To: Michal Hocko Cc: Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Mon, 26 Sep 2011, Michal Hocko wrote: > Let's try it with a heuristic change first. If you really do not like > it, we can move to oom_scode_adj. I like the heuristic change little bit > more because it is at the same place as the root bonus. The problem with the bonus is that, as mentioned previously, it doesn't protect against ANYTHING for the case you're trying to fix. This won't panic the machine because all killable threads are guaranteed to have a non-zero badness score, but it's a very valid configuration to have either - all eligible threads (system-wide, shared cpuset, shared mempolicy nodes) are frozen, or - all eligible frozen threads use <5% of memory whereas all other eligible killable threads use 1% of available memory. and that means the oom killer will repeatedly select those threads and the livelock still exists unless you can guarantee that they are successfully thawed, that thawing them in all situations is safe, and that once thawed they will make a timely exit. Additionally, I don't think biasing against frozen tasks makes sense from a heusritic standpoint of the oom killer. Why would we want give non-frozen tasks that are actually getting work done a preference over a task that is frozen and doing absolutely nothing? It seems like that's backwards and that we'd actually prefer killing the task doing nothing so it can free its memory. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 2/2] oom: give bonus to frozen processes 2011-09-26 9:02 ` David Rientjes @ 2011-09-26 9:31 ` KAMEZAWA Hiroyuki -1 siblings, 0 replies; 88+ messages in thread From: KAMEZAWA Hiroyuki @ 2011-09-26 9:31 UTC (permalink / raw) To: David Rientjes Cc: Michal Hocko, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Rafael J. Wysocki On Mon, 26 Sep 2011 02:02:59 -0700 (PDT) David Rientjes <rientjes@google.com> wrote: > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > Let's try it with a heuristic change first. If you really do not like > > it, we can move to oom_scode_adj. I like the heuristic change little bit > > more because it is at the same place as the root bonus. > > The problem with the bonus is that, as mentioned previously, it doesn't > protect against ANYTHING for the case you're trying to fix. This won't > panic the machine because all killable threads are guaranteed to have a > non-zero badness score, but it's a very valid configuration to have either > > - all eligible threads (system-wide, shared cpuset, shared mempolicy > nodes) are frozen, or > > - all eligible frozen threads use <5% of memory whereas all other > eligible killable threads use 1% of available memory. > > and that means the oom killer will repeatedly select those threads and the > livelock still exists unless you can guarantee that they are successfully > thawed, that thawing them in all situations is safe, and that once thawed > they will make a timely exit. > > Additionally, I don't think biasing against frozen tasks makes sense from > a heusritic standpoint of the oom killer. Why would we want give > non-frozen tasks that are actually getting work done a preference over a > task that is frozen and doing absolutely nothing? It seems like that's > backwards and that we'd actually prefer killing the task doing nothing so > it can free its memory. > I agree with David. Why don't you set oom_score_adj as -1000 for processes which never should die ? You don't freeze processes via user-land using cgroup ? Thanks, -Kame ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 2/2] oom: give bonus to frozen processes @ 2011-09-26 9:31 ` KAMEZAWA Hiroyuki 0 siblings, 0 replies; 88+ messages in thread From: KAMEZAWA Hiroyuki @ 2011-09-26 9:31 UTC (permalink / raw) To: David Rientjes Cc: Michal Hocko, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Rafael J. Wysocki On Mon, 26 Sep 2011 02:02:59 -0700 (PDT) David Rientjes <rientjes@google.com> wrote: > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > Let's try it with a heuristic change first. If you really do not like > > it, we can move to oom_scode_adj. I like the heuristic change little bit > > more because it is at the same place as the root bonus. > > The problem with the bonus is that, as mentioned previously, it doesn't > protect against ANYTHING for the case you're trying to fix. This won't > panic the machine because all killable threads are guaranteed to have a > non-zero badness score, but it's a very valid configuration to have either > > - all eligible threads (system-wide, shared cpuset, shared mempolicy > nodes) are frozen, or > > - all eligible frozen threads use <5% of memory whereas all other > eligible killable threads use 1% of available memory. > > and that means the oom killer will repeatedly select those threads and the > livelock still exists unless you can guarantee that they are successfully > thawed, that thawing them in all situations is safe, and that once thawed > they will make a timely exit. > > Additionally, I don't think biasing against frozen tasks makes sense from > a heusritic standpoint of the oom killer. Why would we want give > non-frozen tasks that are actually getting work done a preference over a > task that is frozen and doing absolutely nothing? It seems like that's > backwards and that we'd actually prefer killing the task doing nothing so > it can free its memory. > I agree with David. Why don't you set oom_score_adj as -1000 for processes which never should die ? You don't freeze processes via user-land using cgroup ? Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 2/2] oom: give bonus to frozen processes 2011-09-26 9:31 ` KAMEZAWA Hiroyuki @ 2011-09-26 9:54 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 9:54 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Rafael J. Wysocki On Mon 26-09-11 18:31:15, KAMEZAWA Hiroyuki wrote: > On Mon, 26 Sep 2011 02:02:59 -0700 (PDT) > David Rientjes <rientjes@google.com> wrote: > > > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > > > Let's try it with a heuristic change first. If you really do not like > > > it, we can move to oom_scode_adj. I like the heuristic change little bit > > > more because it is at the same place as the root bonus. > > > > The problem with the bonus is that, as mentioned previously, it doesn't > > protect against ANYTHING for the case you're trying to fix. Yes, it just makes this less probable. > > This won't panic the machine because all killable threads are > > guaranteed to have a non-zero badness score, but it's a very valid > > configuration to have either > > > > - all eligible threads (system-wide, shared cpuset, shared mempolicy > > nodes) are frozen, or > > > > - all eligible frozen threads use <5% of memory whereas all other > > eligible killable threads use 1% of available memory. > > > > and that means the oom killer will repeatedly select those threads and the > > livelock still exists unless you can guarantee that they are successfully > > thawed, that thawing them in all situations is safe, and that once thawed > > they will make a timely exit. Yes, this is what the first patch is fixing. Thawed tasks should die almost immediately because they are on the way to userspace anyway. > > > > Additionally, I don't think biasing against frozen tasks makes sense from > > a heusritic standpoint of the oom killer. Why would we want give > > non-frozen tasks that are actually getting work done a preference over a > > task that is frozen and doing absolutely nothing? Because frozen tasks are in that state usually (not considering suspend path which has OOM disabled) based on an user request (via freezer cgroup e.g.). I wouldn't be surprised if somebody relied on the D state and that the task will not get killer. > > It seems like that's backwards and that we'd actually prefer killing > > the task doing nothing so it can free its memory. > > > > I agree with David. > Why don't you set oom_score_adj as -1000 for processes which never should die ? It is little bit unintuitive to think about OOM killer when you just want to debug your frozen application. On the other hand I agree that adding a new heuristic for an use case that is not entirely clear and which is not 100% anyway is not good. So, please scratch this patch and let's wait for somebody with a valid use case. > You don't freeze processes via user-land using cgroup ? That was exactly the use case I had in mind. Somebody using freezer cgroup to freeze a task to debug it. > > Thanks, > -Kame > > > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH 2/2] oom: give bonus to frozen processes @ 2011-09-26 9:54 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-09-26 9:54 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: David Rientjes, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, Rafael J. Wysocki On Mon 26-09-11 18:31:15, KAMEZAWA Hiroyuki wrote: > On Mon, 26 Sep 2011 02:02:59 -0700 (PDT) > David Rientjes <rientjes@google.com> wrote: > > > On Mon, 26 Sep 2011, Michal Hocko wrote: > > > > > Let's try it with a heuristic change first. If you really do not like > > > it, we can move to oom_scode_adj. I like the heuristic change little bit > > > more because it is at the same place as the root bonus. > > > > The problem with the bonus is that, as mentioned previously, it doesn't > > protect against ANYTHING for the case you're trying to fix. Yes, it just makes this less probable. > > This won't panic the machine because all killable threads are > > guaranteed to have a non-zero badness score, but it's a very valid > > configuration to have either > > > > - all eligible threads (system-wide, shared cpuset, shared mempolicy > > nodes) are frozen, or > > > > - all eligible frozen threads use <5% of memory whereas all other > > eligible killable threads use 1% of available memory. > > > > and that means the oom killer will repeatedly select those threads and the > > livelock still exists unless you can guarantee that they are successfully > > thawed, that thawing them in all situations is safe, and that once thawed > > they will make a timely exit. Yes, this is what the first patch is fixing. Thawed tasks should die almost immediately because they are on the way to userspace anyway. > > > > Additionally, I don't think biasing against frozen tasks makes sense from > > a heusritic standpoint of the oom killer. Why would we want give > > non-frozen tasks that are actually getting work done a preference over a > > task that is frozen and doing absolutely nothing? Because frozen tasks are in that state usually (not considering suspend path which has OOM disabled) based on an user request (via freezer cgroup e.g.). I wouldn't be surprised if somebody relied on the D state and that the task will not get killer. > > It seems like that's backwards and that we'd actually prefer killing > > the task doing nothing so it can free its memory. > > > > I agree with David. > Why don't you set oom_score_adj as -1000 for processes which never should die ? It is little bit unintuitive to think about OOM killer when you just want to debug your frozen application. On the other hand I agree that adding a new heuristic for an use case that is not entirely clear and which is not 100% anyway is not good. So, please scratch this patch and let's wait for somebody with a valid use case. > You don't freeze processes via user-land using cgroup ? That was exactly the use case I had in mind. Somebody using freezer cgroup to freeze a task to debug it. > > Thanks, > -Kame > > > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 9:21 ` David Rientjes @ 2011-08-26 21:03 ` Rafael J. Wysocki -1 siblings, 0 replies; 88+ messages in thread From: Rafael J. Wysocki @ 2011-08-26 21:03 UTC (permalink / raw) To: David Rientjes Cc: Michal Hocko, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki On Friday, August 26, 2011, David Rientjes wrote: > On Fri, 26 Aug 2011, Michal Hocko wrote: > > > Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do > > not consider them unless really necessary and if we really pick up one > > then thaw its threads before we try to kill it. > > > > I don't like arbitrary heuristics like this because they polluted the old > oom killer before it was rewritten and made it much more unpredictable. > The only heuristic it includes right now is a bonus for root tasks so that > when two processes have nearly the same amount of memory usage (within 3% > of available memory), the non-root task is chosen instead. > > This bonus is actually saying that a single frozen task can use up to 50% > more of the machine's capacity in a system-wide oom condition than the > task that will now be killed instead. That seems excessive. > > I do like the idea of automatically thawing the task though and if that's > possible then I don't think we need to manipulate the badness heuristic at > all. I know that wouldn't be feasible when we've frozen _all_ threads and > that's why we have oom_killer_disable(), but we'll have to check with > Rafael to see if something like this could work. Rafael? That depends a good deal on when the thawing happens and what the thawed task can do before being killed. For example, if the thawing happens while devices are suspended and the thawed task accesses a driver through ioctl(), for example, the purpose of freezing will be defeated. Thanks, Rafael ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 21:03 ` Rafael J. Wysocki 0 siblings, 0 replies; 88+ messages in thread From: Rafael J. Wysocki @ 2011-08-26 21:03 UTC (permalink / raw) To: David Rientjes Cc: Michal Hocko, Oleg Nesterov, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki On Friday, August 26, 2011, David Rientjes wrote: > On Fri, 26 Aug 2011, Michal Hocko wrote: > > > Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do > > not consider them unless really necessary and if we really pick up one > > then thaw its threads before we try to kill it. > > > > I don't like arbitrary heuristics like this because they polluted the old > oom killer before it was rewritten and made it much more unpredictable. > The only heuristic it includes right now is a bonus for root tasks so that > when two processes have nearly the same amount of memory usage (within 3% > of available memory), the non-root task is chosen instead. > > This bonus is actually saying that a single frozen task can use up to 50% > more of the machine's capacity in a system-wide oom condition than the > task that will now be killed instead. That seems excessive. > > I do like the idea of automatically thawing the task though and if that's > possible then I don't think we need to manipulate the badness heuristic at > all. I know that wouldn't be feasible when we've frozen _all_ threads and > that's why we have oom_killer_disable(), but we'll have to check with > Rafael to see if something like this could work. Rafael? That depends a good deal on when the thawing happens and what the thawed task can do before being killed. For example, if the thawing happens while devices are suspended and the thawed task accesses a driver through ioctl(), for example, the purpose of freezing will be defeated. Thanks, Rafael -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 8:56 ` Michal Hocko @ 2011-08-26 10:03 ` Konstantin Khlebnikov -1 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-26 10:03 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki Michal Hocko wrote: > @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > pr_err("Kill process %d (%s) sharing same memory\n", > task_pid_nr(q), q->comm); > task_unlock(q); > + > + if (frozen(q)) > + thaw_process(q); > + We must thaw task strictly after sending SIGKILL. But anyway I think this is a bad idea. > force_sig(SIGKILL, q); > } > ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 10:03 ` Konstantin Khlebnikov 0 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-26 10:03 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki Michal Hocko wrote: > @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > pr_err("Kill process %d (%s) sharing same memory\n", > task_pid_nr(q), q->comm); > task_unlock(q); > + > + if (frozen(q)) > + thaw_process(q); > + We must thaw task strictly after sending SIGKILL. But anyway I think this is a bad idea. > force_sig(SIGKILL, q); > } > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 10:03 ` Konstantin Khlebnikov @ 2011-08-26 10:48 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 10:48 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: David Rientjes, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 14:03:17, Konstantin Khlebnikov wrote: > Michal Hocko wrote: > > >@@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > pr_err("Kill process %d (%s) sharing same memory\n", > > task_pid_nr(q), q->comm); > > task_unlock(q); > >+ > >+ if (frozen(q)) > >+ thaw_process(q); > >+ > > We must thaw task strictly after sending SIGKILL. Sounds reasonable. > But anyway I think this is a bad idea. Why? > > > force_sig(SIGKILL, q); > > } > > -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 10:48 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 10:48 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: David Rientjes, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 14:03:17, Konstantin Khlebnikov wrote: > Michal Hocko wrote: > > >@@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > > pr_err("Kill process %d (%s) sharing same memory\n", > > task_pid_nr(q), q->comm); > > task_unlock(q); > >+ > >+ if (frozen(q)) > >+ thaw_process(q); > >+ > > We must thaw task strictly after sending SIGKILL. Sounds reasonable. > But anyway I think this is a bad idea. Why? > > > force_sig(SIGKILL, q); > > } > > -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 10:48 ` Michal Hocko @ 2011-08-26 12:44 ` Konstantin Khlebnikov -1 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-26 12:44 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki Michal Hocko wrote: > On Fri 26-08-11 14:03:17, Konstantin Khlebnikov wrote: >> Michal Hocko wrote: >> >>> @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) >>> pr_err("Kill process %d (%s) sharing same memory\n", >>> task_pid_nr(q), q->comm); >>> task_unlock(q); >>> + >>> + if (frozen(q)) >>> + thaw_process(q); >>> + >> >> We must thaw task strictly after sending SIGKILL. > > Sounds reasonable. > >> But anyway I think this is a bad idea. > > Why? Refrigerator may be used for digging in task's internal structures, so such digger may be very surprised if somebody suddenly thaws this task. > >> >>> force_sig(SIGKILL, q); >>> } >>> > ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 12:44 ` Konstantin Khlebnikov 0 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-26 12:44 UTC (permalink / raw) To: Michal Hocko Cc: David Rientjes, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki Michal Hocko wrote: > On Fri 26-08-11 14:03:17, Konstantin Khlebnikov wrote: >> Michal Hocko wrote: >> >>> @@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) >>> pr_err("Kill process %d (%s) sharing same memory\n", >>> task_pid_nr(q), q->comm); >>> task_unlock(q); >>> + >>> + if (frozen(q)) >>> + thaw_process(q); >>> + >> >> We must thaw task strictly after sending SIGKILL. > > Sounds reasonable. > >> But anyway I think this is a bad idea. > > Why? Refrigerator may be used for digging in task's internal structures, so such digger may be very surprised if somebody suddenly thaws this task. > >> >>> force_sig(SIGKILL, q); >>> } >>> > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 12:44 ` Konstantin Khlebnikov @ 2011-08-26 12:59 ` Michal Hocko -1 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 12:59 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: David Rientjes, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 16:44:49, Konstantin Khlebnikov wrote: > Michal Hocko wrote: > >On Fri 26-08-11 14:03:17, Konstantin Khlebnikov wrote: > >>Michal Hocko wrote: > >> > >>>@@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > >>> pr_err("Kill process %d (%s) sharing same memory\n", > >>> task_pid_nr(q), q->comm); > >>> task_unlock(q); > >>>+ > >>>+ if (frozen(q)) > >>>+ thaw_process(q); > >>>+ > >> > >>We must thaw task strictly after sending SIGKILL. > > > >Sounds reasonable. > > > >>But anyway I think this is a bad idea. > > > >Why? > > Refrigerator may be used for digging in task's internal structures, > so such digger may be very surprised if somebody suddenly thaws this task. That is something similar why I mentioned that we probably want to give it some oom bonus. Nevertheless we have to be carefull about that. Someone could freeze a memory hog just to hide it from OOM which would have much worse consequences than if such app disappeared while somebody is looking at it while it is frozen. This has to be balanced somehow. > > > > >> > >>> force_sig(SIGKILL, q); > >>> } > >>> > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 12:59 ` Michal Hocko 0 siblings, 0 replies; 88+ messages in thread From: Michal Hocko @ 2011-08-26 12:59 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: David Rientjes, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri 26-08-11 16:44:49, Konstantin Khlebnikov wrote: > Michal Hocko wrote: > >On Fri 26-08-11 14:03:17, Konstantin Khlebnikov wrote: > >>Michal Hocko wrote: > >> > >>>@@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem) > >>> pr_err("Kill process %d (%s) sharing same memory\n", > >>> task_pid_nr(q), q->comm); > >>> task_unlock(q); > >>>+ > >>>+ if (frozen(q)) > >>>+ thaw_process(q); > >>>+ > >> > >>We must thaw task strictly after sending SIGKILL. > > > >Sounds reasonable. > > > >>But anyway I think this is a bad idea. > > > >Why? > > Refrigerator may be used for digging in task's internal structures, > so such digger may be very surprised if somebody suddenly thaws this task. That is something similar why I mentioned that we probably want to give it some oom bonus. Nevertheless we have to be carefull about that. Someone could freeze a memory hog just to hide it from OOM which would have much worse consequences than if such app disappeared while somebody is looking at it while it is frozen. This has to be balanced somehow. > > > > >> > >>> force_sig(SIGKILL, q); > >>> } > >>> > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-25 21:14 ` David Rientjes @ 2011-08-26 7:35 ` Konstantin Khlebnikov -1 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-26 7:35 UTC (permalink / raw) To: David Rientjes Cc: Michal Hocko, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki David Rientjes wrote: > On Thu, 25 Aug 2011, Michal Hocko wrote: > >>>>> That's obviously false since we call oom_killer_disable() in >>>>> freeze_processes() to disable the oom killer from ever being called in the >>>>> first place, so this is something you need to resolve with Rafael before >>>>> you cause more machines to panic. >>>> >>>> I didn't mean suspend/resume path (that is protected by oom_killer_disabled) >>>> so the patch doesn't make any change. >>> >>> Confused... freeze_processes() does try_to_freeze_tasks() before >>> oom_killer_disable() ? >> >> Yes you are right, I must have been blind. >> >> Now I see the point. We do not want to panic while we are suspending and >> the memory is really low just because all the userspace is already in >> the the fridge. >> Sorry for confusion. >> >> I still do not follow the oom_killer_disable note from David, though. >> > > oom_killer_disable() was added to that path for a reason when all threads > are frozen: memory allocations still occur in the suspend path in an oom > condition and adding the oom_killer_disable() will cause those > allocations to fail rather than sending pointless SIGKILLs to frozen > threads. > > Now consider if the only _eligible_ threads for oom kill (because of > cpusets or mempolicies) are those that are frozen. We certainly do not > want to panic because other cpusets are still getting work done. We'd > either want to add a mem to the cpuset or thaw the processes because the > cpuset is oom. > > You can't just selectively skip certain threads when their state can be > temporary without risking a panic. That's why this patch is a > non-starter. > > A much better solution would be to lower the badness score that the oom > killer uses for PF_FROZEN threads so that they aren't considered a > priority for kill unless there's nothing else left to kill. Anyway, oom killer shouldn't loop endlessly if it see TIF_MEMDIE on frozen task, it must go on and try to kill somebody else. We cannot wait for thawing this task. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 7:35 ` Konstantin Khlebnikov 0 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-26 7:35 UTC (permalink / raw) To: David Rientjes Cc: Michal Hocko, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki David Rientjes wrote: > On Thu, 25 Aug 2011, Michal Hocko wrote: > >>>>> That's obviously false since we call oom_killer_disable() in >>>>> freeze_processes() to disable the oom killer from ever being called in the >>>>> first place, so this is something you need to resolve with Rafael before >>>>> you cause more machines to panic. >>>> >>>> I didn't mean suspend/resume path (that is protected by oom_killer_disabled) >>>> so the patch doesn't make any change. >>> >>> Confused... freeze_processes() does try_to_freeze_tasks() before >>> oom_killer_disable() ? >> >> Yes you are right, I must have been blind. >> >> Now I see the point. We do not want to panic while we are suspending and >> the memory is really low just because all the userspace is already in >> the the fridge. >> Sorry for confusion. >> >> I still do not follow the oom_killer_disable note from David, though. >> > > oom_killer_disable() was added to that path for a reason when all threads > are frozen: memory allocations still occur in the suspend path in an oom > condition and adding the oom_killer_disable() will cause those > allocations to fail rather than sending pointless SIGKILLs to frozen > threads. > > Now consider if the only _eligible_ threads for oom kill (because of > cpusets or mempolicies) are those that are frozen. We certainly do not > want to panic because other cpusets are still getting work done. We'd > either want to add a mem to the cpuset or thaw the processes because the > cpuset is oom. > > You can't just selectively skip certain threads when their state can be > temporary without risking a panic. That's why this patch is a > non-starter. > > A much better solution would be to lower the badness score that the oom > killer uses for PF_FROZEN threads so that they aren't considered a > priority for kill unless there's nothing else left to kill. Anyway, oom killer shouldn't loop endlessly if it see TIF_MEMDIE on frozen task, it must go on and try to kill somebody else. We cannot wait for thawing this task. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 7:35 ` Konstantin Khlebnikov @ 2011-08-26 9:09 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-26 9:09 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: Michal Hocko, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri, 26 Aug 2011, Konstantin Khlebnikov wrote: > > A much better solution would be to lower the badness score that the oom > > killer uses for PF_FROZEN threads so that they aren't considered a > > priority for kill unless there's nothing else left to kill. > > Anyway, oom killer shouldn't loop endlessly if it see TIF_MEMDIE on frozen > task, > it must go on and try to kill somebody else. We cannot wait for thawing this > task. > Did you read my suggestion? I quoted it above again for you. The badness heuristic would only select those tasks to kill as a last resort in the hopes they will eventually be thawed and may exit. Panicking the entire machine for what could be isolated by a cgroup is insanity. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 9:09 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-26 9:09 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: Michal Hocko, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri, 26 Aug 2011, Konstantin Khlebnikov wrote: > > A much better solution would be to lower the badness score that the oom > > killer uses for PF_FROZEN threads so that they aren't considered a > > priority for kill unless there's nothing else left to kill. > > Anyway, oom killer shouldn't loop endlessly if it see TIF_MEMDIE on frozen > task, > it must go on and try to kill somebody else. We cannot wait for thawing this > task. > Did you read my suggestion? I quoted it above again for you. The badness heuristic would only select those tasks to kill as a last resort in the hopes they will eventually be thawed and may exit. Panicking the entire machine for what could be isolated by a cgroup is insanity. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 9:09 ` David Rientjes @ 2011-08-26 9:59 ` Konstantin Khlebnikov -1 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-26 9:59 UTC (permalink / raw) To: David Rientjes Cc: Michal Hocko, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki David Rientjes wrote: > On Fri, 26 Aug 2011, Konstantin Khlebnikov wrote: > >>> A much better solution would be to lower the badness score that the oom >>> killer uses for PF_FROZEN threads so that they aren't considered a >>> priority for kill unless there's nothing else left to kill. >> >> Anyway, oom killer shouldn't loop endlessly if it see TIF_MEMDIE on frozen >> task, >> it must go on and try to kill somebody else. We cannot wait for thawing this >> task. >> > > Did you read my suggestion? I quoted it above again for you. The badness > heuristic would only select those tasks to kill as a last resort in the > hopes they will eventually be thawed and may exit. Panicking the entire > machine for what could be isolated by a cgroup is insanity. Maybe just fix this "panic" logic? OOM killer should panic only on global memory shortage. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 9:59 ` Konstantin Khlebnikov 0 siblings, 0 replies; 88+ messages in thread From: Konstantin Khlebnikov @ 2011-08-26 9:59 UTC (permalink / raw) To: David Rientjes Cc: Michal Hocko, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki David Rientjes wrote: > On Fri, 26 Aug 2011, Konstantin Khlebnikov wrote: > >>> A much better solution would be to lower the badness score that the oom >>> killer uses for PF_FROZEN threads so that they aren't considered a >>> priority for kill unless there's nothing else left to kill. >> >> Anyway, oom killer shouldn't loop endlessly if it see TIF_MEMDIE on frozen >> task, >> it must go on and try to kill somebody else. We cannot wait for thawing this >> task. >> > > Did you read my suggestion? I quoted it above again for you. The badness > heuristic would only select those tasks to kill as a last resort in the > hopes they will eventually be thawed and may exit. Panicking the entire > machine for what could be isolated by a cgroup is insanity. Maybe just fix this "panic" logic? OOM killer should panic only on global memory shortage. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-26 9:59 ` Konstantin Khlebnikov @ 2011-08-26 18:09 ` David Rientjes -1 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-26 18:09 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: Michal Hocko, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri, 26 Aug 2011, Konstantin Khlebnikov wrote: > Maybe just fix this "panic" logic? OOM killer should panic only on global > memory shortage. > NO, it shouldn't, we actually rely quite extensively on cpusets filled with OOM_DISABLE threads to panic because the job scheduler would be unresponsive in such a condition and it'd much better to panic and reboot than to brick the machine. I'm not sure where you're getting all your information from, but please don't pass it off as principles. You can set the panic logic to be whatever you want with /proc/sys/vm/panic_on_oom. See Documentation/filesystems/proc.txt for more information. ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-26 18:09 ` David Rientjes 0 siblings, 0 replies; 88+ messages in thread From: David Rientjes @ 2011-08-26 18:09 UTC (permalink / raw) To: Konstantin Khlebnikov Cc: Michal Hocko, Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki, Rafael J. Wysocki On Fri, 26 Aug 2011, Konstantin Khlebnikov wrote: > Maybe just fix this "panic" logic? OOM killer should panic only on global > memory shortage. > NO, it shouldn't, we actually rely quite extensively on cpusets filled with OOM_DISABLE threads to panic because the job scheduler would be unresponsive in such a condition and it'd much better to panic and reboot than to brick the machine. I'm not sure where you're getting all your information from, but please don't pass it off as principles. You can set the panic logic to be whatever you want with /proc/sys/vm/panic_on_oom. See Documentation/filesystems/proc.txt for more information. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks 2011-08-25 15:18 ` Oleg Nesterov @ 2011-08-25 21:03 ` Rafael J. Wysocki -1 siblings, 0 replies; 88+ messages in thread From: Rafael J. Wysocki @ 2011-08-25 21:03 UTC (permalink / raw) To: Oleg Nesterov Cc: Michal Hocko, David Rientjes, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki On Thursday, August 25, 2011, Oleg Nesterov wrote: > On 08/25, Michal Hocko wrote: > > > > On Wed 24-08-11 12:31:26, David Rientjes wrote: > > > > > > That's obviously false since we call oom_killer_disable() in > > > freeze_processes() to disable the oom killer from ever being called in the > > > first place, so this is something you need to resolve with Rafael before > > > you cause more machines to panic. > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > so the patch doesn't make any change. > > Confused... freeze_processes() does try_to_freeze_tasks() before > oom_killer_disable() ? Yes, it does. Thanks, Rafael ^ permalink raw reply [flat|nested] 88+ messages in thread
* Re: [PATCH] oom: skip frozen tasks @ 2011-08-25 21:03 ` Rafael J. Wysocki 0 siblings, 0 replies; 88+ messages in thread From: Rafael J. Wysocki @ 2011-08-25 21:03 UTC (permalink / raw) To: Oleg Nesterov Cc: Michal Hocko, David Rientjes, Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, KOSAKI Motohiro, KAMEZAWA Hiroyuki On Thursday, August 25, 2011, Oleg Nesterov wrote: > On 08/25, Michal Hocko wrote: > > > > On Wed 24-08-11 12:31:26, David Rientjes wrote: > > > > > > That's obviously false since we call oom_killer_disable() in > > > freeze_processes() to disable the oom killer from ever being called in the > > > first place, so this is something you need to resolve with Rafael before > > > you cause more machines to panic. > > > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled) > > so the patch doesn't make any change. > > Confused... freeze_processes() does try_to_freeze_tasks() before > oom_killer_disable() ? Yes, it does. Thanks, Rafael -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 88+ messages in thread
end of thread, other threads:[~2011-09-27 18:30 UTC | newest] Thread overview: 88+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-08-23 8:31 [PATCH] oom: skip frozen tasks Konstantin Khlebnikov 2011-08-23 8:31 ` Konstantin Khlebnikov 2011-08-23 9:15 ` KAMEZAWA Hiroyuki 2011-08-23 9:15 ` KAMEZAWA Hiroyuki 2011-08-23 13:46 ` Michal Hocko 2011-08-23 13:46 ` Michal Hocko 2011-08-23 20:18 ` David Rientjes 2011-08-23 20:18 ` David Rientjes 2011-08-24 10:19 ` Michal Hocko 2011-08-24 10:19 ` Michal Hocko 2011-08-24 19:31 ` David Rientjes 2011-08-24 19:31 ` David Rientjes 2011-08-25 9:19 ` Michal Hocko 2011-08-25 9:19 ` Michal Hocko 2011-08-25 15:18 ` Oleg Nesterov 2011-08-25 15:18 ` Oleg Nesterov 2011-08-25 16:47 ` Michal Hocko 2011-08-25 16:47 ` Michal Hocko 2011-08-25 21:14 ` David Rientjes 2011-08-25 21:14 ` David Rientjes 2011-08-26 7:09 ` Michal Hocko 2011-08-26 7:09 ` Michal Hocko 2011-08-26 8:56 ` Michal Hocko 2011-08-26 8:56 ` Michal Hocko 2011-08-26 9:21 ` David Rientjes 2011-08-26 9:21 ` David Rientjes 2011-08-26 9:53 ` Michal Hocko 2011-08-26 9:53 ` Michal Hocko 2011-08-26 11:01 ` Michal Hocko 2011-08-26 11:01 ` Michal Hocko 2011-08-26 18:13 ` David Rientjes 2011-08-26 18:13 ` David Rientjes 2011-09-26 8:28 ` [PATCH 1/2] oom: do not live lock on " Michal Hocko 2011-09-26 8:28 ` Michal Hocko 2011-09-26 8:56 ` David Rientjes 2011-09-26 8:56 ` David Rientjes 2011-09-26 9:14 ` Michal Hocko 2011-09-26 9:14 ` Michal Hocko 2011-09-26 9:25 ` KAMEZAWA Hiroyuki 2011-09-26 9:25 ` KAMEZAWA Hiroyuki 2011-09-26 9:32 ` Michal Hocko 2011-09-26 9:32 ` Michal Hocko 2011-09-26 15:51 ` Rafael J. Wysocki 2011-09-26 15:51 ` Rafael J. Wysocki 2011-09-26 18:28 ` Michal Hocko 2011-09-26 18:28 ` Michal Hocko 2011-09-27 1:03 ` David Rientjes 2011-09-27 1:03 ` David Rientjes 2011-09-27 7:52 ` Michal Hocko 2011-09-27 7:52 ` Michal Hocko 2011-09-27 18:30 ` David Rientjes 2011-09-27 18:30 ` David Rientjes 2011-09-26 10:28 ` Rusty Russell 2011-09-26 10:28 ` Rusty Russell 2011-09-26 11:05 ` Michal Hocko 2011-09-26 11:05 ` Michal Hocko 2011-09-27 2:21 ` Rusty Russell 2011-09-27 2:21 ` Rusty Russell 2011-09-27 7:03 ` [PATCH] lguest: move process freezing before pending signals check Michal Hocko 2011-09-27 7:03 ` Michal Hocko 2011-09-26 8:35 ` [PATCH 2/2] oom: give bonus to frozen processes Michal Hocko 2011-09-26 8:35 ` Michal Hocko 2011-09-26 9:02 ` David Rientjes 2011-09-26 9:02 ` David Rientjes 2011-09-26 9:31 ` KAMEZAWA Hiroyuki 2011-09-26 9:31 ` KAMEZAWA Hiroyuki 2011-09-26 9:54 ` Michal Hocko 2011-09-26 9:54 ` Michal Hocko 2011-08-26 21:03 ` [PATCH] oom: skip frozen tasks Rafael J. Wysocki 2011-08-26 21:03 ` Rafael J. Wysocki 2011-08-26 10:03 ` Konstantin Khlebnikov 2011-08-26 10:03 ` Konstantin Khlebnikov 2011-08-26 10:48 ` Michal Hocko 2011-08-26 10:48 ` Michal Hocko 2011-08-26 12:44 ` Konstantin Khlebnikov 2011-08-26 12:44 ` Konstantin Khlebnikov 2011-08-26 12:59 ` Michal Hocko 2011-08-26 12:59 ` Michal Hocko 2011-08-26 7:35 ` Konstantin Khlebnikov 2011-08-26 7:35 ` Konstantin Khlebnikov 2011-08-26 9:09 ` David Rientjes 2011-08-26 9:09 ` David Rientjes 2011-08-26 9:59 ` Konstantin Khlebnikov 2011-08-26 9:59 ` Konstantin Khlebnikov 2011-08-26 18:09 ` David Rientjes 2011-08-26 18:09 ` David Rientjes 2011-08-25 21:03 ` Rafael J. Wysocki 2011-08-25 21:03 ` Rafael J. Wysocki
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.