All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch] mm, oom: do not schedule if current has been killed
@ 2012-06-19  1:08 David Rientjes
  2012-06-19  1:57 ` Kamezawa Hiroyuki
  0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2012-06-19  1:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: KAMEZAWA Hiroyuki, KOSAKI Motohiro, Oleg Nesterov, linux-mm

The oom killer currently schedules away from current in an
uninterruptible sleep if it does not have access to memory reserves.
It's possible that current was killed because it shares memory with the
oom killed thread or because it was killed by the user in the interim,
however.

This patch only schedules away from current if it does not have a pending
kill, i.e. if it does not share memory with the oom killed thread.  It's
possible that it will immediately retry its memory allocation and fail,
but it will immediately be given access to memory reserves if it calls
the oom killer again.

This prevents the delay of memory freeing when threads that share memory
with the oom killed thread get unnecessarily scheduled.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -749,7 +749,7 @@ out:
 	 * Give "p" a good chance of killing itself before we
 	 * retry to allocate memory unless "p" is current
 	 */
-	if (killed && !test_thread_flag(TIF_MEMDIE))
+	if (killed && !fatal_signal_pending(current))
 		schedule_timeout_uninterruptible(1);
 }
 
@@ -765,6 +765,6 @@ void pagefault_out_of_memory(void)
 		out_of_memory(NULL, 0, 0, NULL, false);
 		clear_system_oom();
 	}
-	if (!test_thread_flag(TIF_MEMDIE))
+	if (!fatal_signal_pending(current))
 		schedule_timeout_uninterruptible(1);
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch] mm, oom: do not schedule if current has been killed
  2012-06-19  1:08 [patch] mm, oom: do not schedule if current has been killed David Rientjes
@ 2012-06-19  1:57 ` Kamezawa Hiroyuki
  2012-06-19  2:23   ` David Rientjes
  0 siblings, 1 reply; 16+ messages in thread
From: Kamezawa Hiroyuki @ 2012-06-19  1:57 UTC (permalink / raw)
  To: David Rientjes; +Cc: Andrew Morton, KOSAKI Motohiro, Oleg Nesterov, linux-mm

(2012/06/19 10:08), David Rientjes wrote:
> The oom killer currently schedules away from current in an
> uninterruptible sleep if it does not have access to memory reserves.
> It's possible that current was killed because it shares memory with the
> oom killed thread or because it was killed by the user in the interim,
> however.
>
> This patch only schedules away from current if it does not have a pending
> kill, i.e. if it does not share memory with the oom killed thread.  It's
> possible that it will immediately retry its memory allocation and fail,
> but it will immediately be given access to memory reserves if it calls
> the oom killer again.
>
> This prevents the delay of memory freeing when threads that share memory
> with the oom killed thread get unnecessarily scheduled.
>
> Signed-off-by: David Rientjes<rientjes@google.com>

fatal_signal_pending() == false if test_thread_flag(TIF_MEMDIE)==false ?

I'll check memcg code..

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



> ---
>   mm/oom_kill.c |    4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -749,7 +749,7 @@ out:
>   	 * Give "p" a good chance of killing itself before we
>   	 * retry to allocate memory unless "p" is current
>   	 */
> -	if (killed&&  !test_thread_flag(TIF_MEMDIE))
> +	if (killed&&  !fatal_signal_pending(current))
>   		schedule_timeout_uninterruptible(1);
>   }
>
> @@ -765,6 +765,6 @@ void pagefault_out_of_memory(void)
>   		out_of_memory(NULL, 0, 0, NULL, false);
>   		clear_system_oom();
>   	}
> -	if (!test_thread_flag(TIF_MEMDIE))
> +	if (!fatal_signal_pending(current))
>   		schedule_timeout_uninterruptible(1);
>   }


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch] mm, oom: do not schedule if current has been killed
  2012-06-19  1:57 ` Kamezawa Hiroyuki
@ 2012-06-19  2:23   ` David Rientjes
  2012-06-19  2:31     ` [patch v2] " David Rientjes
  0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2012-06-19  2:23 UTC (permalink / raw)
  To: Kamezawa Hiroyuki; +Cc: Andrew Morton, KOSAKI Motohiro, Oleg Nesterov, linux-mm

On Tue, 19 Jun 2012, Kamezawa Hiroyuki wrote:

> fatal_signal_pending() == false if test_thread_flag(TIF_MEMDIE)==false ?
> 

Yeah, the only thing that sets TIF_MEMDIE is the oom killer and it 
immediately SIGKILLs it afterwards.

Aside: I've been thinking of adding a check to the page allocator for 
!(gfp & __GFP_FS) && !(gfp & __GFP_NORETRY) to set TIF_MEMDIE for current 
if it has a fatal signal since such an allocation isn't eligible for 
calling into the oom killer.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19  2:23   ` David Rientjes
@ 2012-06-19  2:31     ` David Rientjes
  2012-06-19  2:51       ` Kamezawa Hiroyuki
                         ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: David Rientjes @ 2012-06-19  2:31 UTC (permalink / raw)
  To: Kamezawa Hiroyuki; +Cc: Andrew Morton, KOSAKI Motohiro, Oleg Nesterov, linux-mm

The oom killer currently schedules away from current in an
uninterruptible sleep if it does not have access to memory reserves.
It's possible that current was killed because it shares memory with the
oom killed thread or because it was killed by the user in the interim,
however.

This patch only schedules away from current if it does not have a pending
kill, i.e. if it does not share memory with the oom killed thread, or is
already exiting.  It's possible that it will immediately retry its memory
allocation and fail, but it will immediately be given access to memory
reserves if it calls the oom killer again.

This prevents the delay of memory freeing when threads that share memory
with the oom killed thread get unnecessarily scheduled.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -746,10 +746,11 @@ out:
 	read_unlock(&tasklist_lock);
 
 	/*
-	 * Give "p" a good chance of killing itself before we
+	 * Give "p" a good chance of exiting before we
 	 * retry to allocate memory unless "p" is current
 	 */
-	if (killed && !test_thread_flag(TIF_MEMDIE))
+	if (killed && !fatal_signal_pending(current) &&
+		      !(current->flags & PF_EXITING))
 		schedule_timeout_uninterruptible(1);
 }
 
@@ -765,6 +766,6 @@ void pagefault_out_of_memory(void)
 		out_of_memory(NULL, 0, 0, NULL, false);
 		clear_system_oom();
 	}
-	if (!test_thread_flag(TIF_MEMDIE))
+	if (!fatal_signal_pending(current) && !(current->flags & PF_EXITING))
 		schedule_timeout_uninterruptible(1);
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19  2:31     ` [patch v2] " David Rientjes
@ 2012-06-19  2:51       ` Kamezawa Hiroyuki
  2012-06-19  6:03       ` KOSAKI Motohiro
  2012-06-19 13:55       ` Oleg Nesterov
  2 siblings, 0 replies; 16+ messages in thread
From: Kamezawa Hiroyuki @ 2012-06-19  2:51 UTC (permalink / raw)
  To: David Rientjes; +Cc: Andrew Morton, KOSAKI Motohiro, Oleg Nesterov, linux-mm

(2012/06/19 11:31), David Rientjes wrote:
> The oom killer currently schedules away from current in an
> uninterruptible sleep if it does not have access to memory reserves.
> It's possible that current was killed because it shares memory with the
> oom killed thread or because it was killed by the user in the interim,
> however.
>
> This patch only schedules away from current if it does not have a pending
> kill, i.e. if it does not share memory with the oom killed thread, or is
> already exiting.  It's possible that it will immediately retry its memory
> allocation and fail, but it will immediately be given access to memory
> reserves if it calls the oom killer again.
>
> This prevents the delay of memory freeing when threads that share memory
> with the oom killed thread get unnecessarily scheduled.
>
> Signed-off-by: David Rientjes<rientjes@google.com>

seems good to me.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


> ---
>   mm/oom_kill.c |    7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -746,10 +746,11 @@ out:
>   	read_unlock(&tasklist_lock);
>
>   	/*
> -	 * Give "p" a good chance of killing itself before we
> +	 * Give "p" a good chance of exiting before we
>   	 * retry to allocate memory unless "p" is current
>   	 */
> -	if (killed&&  !test_thread_flag(TIF_MEMDIE))
> +	if (killed&&  !fatal_signal_pending(current)&&
> +		      !(current->flags&  PF_EXITING))
>   		schedule_timeout_uninterruptible(1);
>   }
>
> @@ -765,6 +766,6 @@ void pagefault_out_of_memory(void)
>   		out_of_memory(NULL, 0, 0, NULL, false);
>   		clear_system_oom();
>   	}
> -	if (!test_thread_flag(TIF_MEMDIE))
> +	if (!fatal_signal_pending(current)&&  !(current->flags&  PF_EXITING))
>   		schedule_timeout_uninterruptible(1);
>   }


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19  2:31     ` [patch v2] " David Rientjes
  2012-06-19  2:51       ` Kamezawa Hiroyuki
@ 2012-06-19  6:03       ` KOSAKI Motohiro
  2012-06-19  6:26         ` David Rientjes
  2012-06-19 13:55       ` Oleg Nesterov
  2 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2012-06-19  6:03 UTC (permalink / raw)
  To: David Rientjes; +Cc: Kamezawa Hiroyuki, Andrew Morton, Oleg Nesterov, linux-mm

On Mon, Jun 18, 2012 at 10:31 PM, David Rientjes <rientjes@google.com> wrote:
> The oom killer currently schedules away from current in an
> uninterruptible sleep if it does not have access to memory reserves.
> It's possible that current was killed because it shares memory with the
> oom killed thread or because it was killed by the user in the interim,
> however.
>
> This patch only schedules away from current if it does not have a pending
> kill, i.e. if it does not share memory with the oom killed thread, or is
> already exiting.  It's possible that it will immediately retry its memory
> allocation and fail, but it will immediately be given access to memory
> reserves if it calls the oom killer again.
>
> This prevents the delay of memory freeing when threads that share memory
> with the oom killed thread get unnecessarily scheduled.
>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  mm/oom_kill.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -746,10 +746,11 @@ out:
>        read_unlock(&tasklist_lock);
>
>        /*
> -        * Give "p" a good chance of killing itself before we
> +        * Give "p" a good chance of exiting before we
>         * retry to allocate memory unless "p" is current
>         */
> -       if (killed && !test_thread_flag(TIF_MEMDIE))
> +       if (killed && !fatal_signal_pending(current) &&
> +                     !(current->flags & PF_EXITING))
>                schedule_timeout_uninterruptible(1);
>  }

Why don't check gfp_flags? I think the rule is,

1) a thread of newly marked as TIF_MEMDIE
    -> now it has a capability to access reseve memory. let's immediately retry.
2) allocation for GFP_HIGHUSER_MOVABLE
    -> we can fail to allocate it safely. let's immediately fail.
        (I suspect we need to change page allocator too)
3) GFP_KERNEL and PF_EXITING
    -> don't retry immediately. It shall fail again. let's wait until
killed process
        is exited.



> @@ -765,6 +766,6 @@ void pagefault_out_of_memory(void)
>                out_of_memory(NULL, 0, 0, NULL, false);
>                clear_system_oom();
>        }
> -       if (!test_thread_flag(TIF_MEMDIE))
> +       if (!fatal_signal_pending(current) && !(current->flags & PF_EXITING))
>                schedule_timeout_uninterruptible(1);

This makes sense to me.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19  6:03       ` KOSAKI Motohiro
@ 2012-06-19  6:26         ` David Rientjes
  2012-06-19 17:32           ` KOSAKI Motohiro
  0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2012-06-19  6:26 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Kamezawa Hiroyuki, Andrew Morton, Oleg Nesterov, linux-mm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1847 bytes --]

On Tue, 19 Jun 2012, KOSAKI Motohiro wrote:

> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -746,10 +746,11 @@ out:
> >        read_unlock(&tasklist_lock);
> >
> >        /*
> > -        * Give "p" a good chance of killing itself before we
> > +        * Give "p" a good chance of exiting before we
> >         * retry to allocate memory unless "p" is current
> >         */
> > -       if (killed && !test_thread_flag(TIF_MEMDIE))
> > +       if (killed && !fatal_signal_pending(current) &&
> > +                     !(current->flags & PF_EXITING))
> >                schedule_timeout_uninterruptible(1);
> >  }
> 
> Why don't check gfp_flags? I think the rule is,
> 
> 1) a thread of newly marked as TIF_MEMDIE
>     -> now it has a capability to access reseve memory. let's immediately retry.
> 2) allocation for GFP_HIGHUSER_MOVABLE
>     -> we can fail to allocate it safely. let's immediately fail.
>         (I suspect we need to change page allocator too)
> 3) GFP_KERNEL and PF_EXITING
>     -> don't retry immediately. It shall fail again. let's wait until
> killed process
>         is exited.
> 

The killed process may exit but it does not guarantee that its memory will 
be freed if it's shared with current.  This is the case that the patch is 
addressing, where right now we unnecessarily schedule if current has been 
killed or is already along the exit path.  We want to retry as soon as 
possible so that either the allocation now succeeds or we can recall the 
oom killer as soon as possible and get TIF_MEMDIE set because we have a 
fatal signal so current may exit in a timely way as well.  The point is 
that if current has either a SIGKILL or is already exiting as it returns 
from the oom killer, it does no good to continue to stall and prevent that 
memory freeing.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19  2:31     ` [patch v2] " David Rientjes
  2012-06-19  2:51       ` Kamezawa Hiroyuki
  2012-06-19  6:03       ` KOSAKI Motohiro
@ 2012-06-19 13:55       ` Oleg Nesterov
  2012-06-19 20:24         ` David Rientjes
  2 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2012-06-19 13:55 UTC (permalink / raw)
  To: David Rientjes
  Cc: Kamezawa Hiroyuki, Andrew Morton, KOSAKI Motohiro, linux-mm

On 06/18, David Rientjes wrote:
>
> This patch only schedules away from current if it does not have a pending
> kill,

Can't really comment this patch, just one note...

> -	if (killed && !test_thread_flag(TIF_MEMDIE))
> +	if (killed && !fatal_signal_pending(current) &&
> +		      !(current->flags & PF_EXITING))
>  		schedule_timeout_uninterruptible(1);

Perhaps

	if (killed && !(current->flags & PF_EXITING))
		schedule_timeout_killable(1);

makes more sense?

If fatal_signal_pending() == T then schedule_timeout_killable()
is nop, but unline uninterruptible_ it can be SIGKILL'ed.

But if you prefer to check fatal_signal_pending() I won't argue.

Oleg.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19  6:26         ` David Rientjes
@ 2012-06-19 17:32           ` KOSAKI Motohiro
  2012-06-19 18:59             ` David Rientjes
  0 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2012-06-19 17:32 UTC (permalink / raw)
  To: rientjes; +Cc: kosaki.motohiro, kamezawa.hiroyu, akpm, oleg, linux-mm

On 6/19/2012 2:26 AM, David Rientjes wrote:
> On Tue, 19 Jun 2012, KOSAKI Motohiro wrote:
> 
>>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>>> --- a/mm/oom_kill.c
>>> +++ b/mm/oom_kill.c
>>> @@ -746,10 +746,11 @@ out:
>>>        read_unlock(&tasklist_lock);
>>>
>>>        /*
>>> -        * Give "p" a good chance of killing itself before we
>>> +        * Give "p" a good chance of exiting before we
>>>         * retry to allocate memory unless "p" is current
>>>         */
>>> -       if (killed && !test_thread_flag(TIF_MEMDIE))
>>> +       if (killed && !fatal_signal_pending(current) &&
>>> +                     !(current->flags & PF_EXITING))
>>>                schedule_timeout_uninterruptible(1);
>>>  }
>>
>> Why don't check gfp_flags? I think the rule is,
>>
>> 1) a thread of newly marked as TIF_MEMDIE
>>     -> now it has a capability to access reseve memory. let's immediately retry.
>> 2) allocation for GFP_HIGHUSER_MOVABLE
>>     -> we can fail to allocate it safely. let's immediately fail.
>>         (I suspect we need to change page allocator too)
>> 3) GFP_KERNEL and PF_EXITING
>>     -> don't retry immediately. It shall fail again. let's wait until
>> killed process
>>         is exited.
>>
> 
> The killed process may exit but it does not guarantee that its memory will 
> be freed if it's shared with current.  This is the case that the patch is 
> addressing, where right now we unnecessarily schedule if current has been 
> killed or is already along the exit path.  We want to retry as soon as 
> possible so that either the allocation now succeeds or we can recall the 
> oom killer as soon as possible and get TIF_MEMDIE set because we have a 
> fatal signal so current may exit in a timely way as well.  The point is 
> that if current has either a SIGKILL or is already exiting as it returns 
> from the oom killer, it does no good to continue to stall and prevent that 
> memory freeing.

You missed live lock risk. immediate retry makes immediate fail if no one
freed any memory. Even if the task call out_of_memory() again, select_bad_process()
may return -1 and don't makes any forward progress.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19 17:32           ` KOSAKI Motohiro
@ 2012-06-19 18:59             ` David Rientjes
  2012-06-19 19:29               ` KOSAKI Motohiro
  0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2012-06-19 18:59 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: kamezawa.hiroyu, akpm, oleg, linux-mm

On Tue, 19 Jun 2012, KOSAKI Motohiro wrote:

> > The killed process may exit but it does not guarantee that its memory will 
> > be freed if it's shared with current.  This is the case that the patch is 
> > addressing, where right now we unnecessarily schedule if current has been 
> > killed or is already along the exit path.  We want to retry as soon as 
> > possible so that either the allocation now succeeds or we can recall the 
> > oom killer as soon as possible and get TIF_MEMDIE set because we have a 
> > fatal signal so current may exit in a timely way as well.  The point is 
> > that if current has either a SIGKILL or is already exiting as it returns 
> > from the oom killer, it does no good to continue to stall and prevent that 
> > memory freeing.
> 
> You missed live lock risk. immediate retry makes immediate fail if no one
> freed any memory. Even if the task call out_of_memory() again, select_bad_process()
> may return -1 and don't makes any forward progress.
> 

I missed a livelock?  You missed the fact that the oom killer is 
short-circuited by this before anything else gets done:

	if (fatal_signal_pending(current)) {
		set_thread_flag(TIF_MEMDIE);
		return;
	}

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19 18:59             ` David Rientjes
@ 2012-06-19 19:29               ` KOSAKI Motohiro
  0 siblings, 0 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2012-06-19 19:29 UTC (permalink / raw)
  To: rientjes; +Cc: kosaki.motohiro, kamezawa.hiroyu, akpm, oleg, linux-mm

On 6/19/2012 2:59 PM, David Rientjes wrote:
> On Tue, 19 Jun 2012, KOSAKI Motohiro wrote:
> 
>>> The killed process may exit but it does not guarantee that its memory will 
>>> be freed if it's shared with current.  This is the case that the patch is 
>>> addressing, where right now we unnecessarily schedule if current has been 
>>> killed or is already along the exit path.  We want to retry as soon as 
>>> possible so that either the allocation now succeeds or we can recall the 
>>> oom killer as soon as possible and get TIF_MEMDIE set because we have a 
>>> fatal signal so current may exit in a timely way as well.  The point is 
>>> that if current has either a SIGKILL or is already exiting as it returns 
>>> from the oom killer, it does no good to continue to stall and prevent that 
>>> memory freeing.
>>
>> You missed live lock risk. immediate retry makes immediate fail if no one
>> freed any memory. Even if the task call out_of_memory() again, select_bad_process()
>> may return -1 and don't makes any forward progress.
>>
> 
> I missed a livelock?  You missed the fact that the oom killer is 
> short-circuited by this before anything else gets done:
> 
> 	if (fatal_signal_pending(current)) {
> 		set_thread_flag(TIF_MEMDIE);
> 		return;
> 	}
> 

I talked about PF_EXITING. fatal signal is no problem.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v2] mm, oom: do not schedule if current has been killed
  2012-06-19 13:55       ` Oleg Nesterov
@ 2012-06-19 20:24         ` David Rientjes
  2012-06-19 20:58           ` [patch v3] " David Rientjes
  0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2012-06-19 20:24 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Kamezawa Hiroyuki, Andrew Morton, KOSAKI Motohiro, linux-mm

On Tue, 19 Jun 2012, Oleg Nesterov wrote:

> 	if (killed && !(current->flags & PF_EXITING))
> 		schedule_timeout_killable(1);
> 
> makes more sense?
> 
> If fatal_signal_pending() == T then schedule_timeout_killable()
> is nop, but unline uninterruptible_ it can be SIGKILL'ed.
> 

Ok, that's certainly cleaner.  I'll post a v3, thanks for the suggestion.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [patch v3] mm, oom: do not schedule if current has been killed
  2012-06-19 20:24         ` David Rientjes
@ 2012-06-19 20:58           ` David Rientjes
  2012-06-19 21:39             ` KOSAKI Motohiro
  0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2012-06-19 20:58 UTC (permalink / raw)
  To: Oleg Nesterov, Andrew Morton; +Cc: Kamezawa Hiroyuki, KOSAKI Motohiro, linux-mm

The oom killer currently schedules away from current in an
uninterruptible sleep if it does not have access to memory reserves.
It's possible that current was killed because it shares memory with the
oom killed thread or because it was killed by the user in the interim,
however.

This patch only schedules away from current if it does not have a pending
kill, i.e. if it does not share memory with the oom killed thread.  It's
possible that it will immediately retry its memory allocation and fail,
but it will immediately be given access to memory reserves if it calls
the oom killer again.

This prevents the delay of memory freeing when threads that share memory
with the oom killed thread get unnecessarily scheduled.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |   11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -746,11 +746,11 @@ out:
 	read_unlock(&tasklist_lock);
 
 	/*
-	 * Give "p" a good chance of killing itself before we
-	 * retry to allocate memory unless "p" is current
+	 * Give the killed threads a good chance of exiting before trying to
+	 * allocate memory again.
 	 */
-	if (killed && !test_thread_flag(TIF_MEMDIE))
-		schedule_timeout_uninterruptible(1);
+	if (killed)
+		schedule_timeout_killable(1);
 }
 
 /*
@@ -765,6 +765,5 @@ void pagefault_out_of_memory(void)
 		out_of_memory(NULL, 0, 0, NULL, false);
 		clear_system_oom();
 	}
-	if (!test_thread_flag(TIF_MEMDIE))
-		schedule_timeout_uninterruptible(1);
+	schedule_timeout_killable(1);
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v3] mm, oom: do not schedule if current has been killed
  2012-06-19 20:58           ` [patch v3] " David Rientjes
@ 2012-06-19 21:39             ` KOSAKI Motohiro
  2012-06-20  0:38               ` Kamezawa Hiroyuki
  0 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2012-06-19 21:39 UTC (permalink / raw)
  To: David Rientjes
  Cc: Oleg Nesterov, Andrew Morton, Kamezawa Hiroyuki, KOSAKI Motohiro,
	linux-mm, kosaki.motohiro

(6/19/12 4:58 PM), David Rientjes wrote:
> The oom killer currently schedules away from current in an
> uninterruptible sleep if it does not have access to memory reserves.
> It's possible that current was killed because it shares memory with the
> oom killed thread or because it was killed by the user in the interim,
> however.
> 
> This patch only schedules away from current if it does not have a pending
> kill, i.e. if it does not share memory with the oom killed thread.  It's
> possible that it will immediately retry its memory allocation and fail,
> but it will immediately be given access to memory reserves if it calls
> the oom killer again.
> 
> This prevents the delay of memory freeing when threads that share memory
> with the oom killed thread get unnecessarily scheduled.
> 
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  mm/oom_kill.c |   11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -746,11 +746,11 @@ out:
>  	read_unlock(&tasklist_lock);
>  
>  	/*
> -	 * Give "p" a good chance of killing itself before we
> -	 * retry to allocate memory unless "p" is current
> +	 * Give the killed threads a good chance of exiting before trying to
> +	 * allocate memory again.
>  	 */
> -	if (killed && !test_thread_flag(TIF_MEMDIE))
> -		schedule_timeout_uninterruptible(1);
> +	if (killed)
> +		schedule_timeout_killable(1);
>  }

This is not match I expected. but I have no seen a big problem.

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>


>  
>  /*
> @@ -765,6 +765,5 @@ void pagefault_out_of_memory(void)
>  		out_of_memory(NULL, 0, 0, NULL, false);
>  		clear_system_oom();
>  	}
> -	if (!test_thread_flag(TIF_MEMDIE))
> -		schedule_timeout_uninterruptible(1);
> +	schedule_timeout_killable(1);
>  }
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v3] mm, oom: do not schedule if current has been killed
  2012-06-19 21:39             ` KOSAKI Motohiro
@ 2012-06-20  0:38               ` Kamezawa Hiroyuki
  2012-06-21  1:23                 ` David Rientjes
  0 siblings, 1 reply; 16+ messages in thread
From: Kamezawa Hiroyuki @ 2012-06-20  0:38 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: David Rientjes, Oleg Nesterov, Andrew Morton, KOSAKI Motohiro, linux-mm

(2012/06/20 6:39), KOSAKI Motohiro wrote:
> (6/19/12 4:58 PM), David Rientjes wrote:
>> The oom killer currently schedules away from current in an
>> uninterruptible sleep if it does not have access to memory reserves.
>> It's possible that current was killed because it shares memory with the
>> oom killed thread or because it was killed by the user in the interim,
>> however.
>>
>> This patch only schedules away from current if it does not have a pending
>> kill, i.e. if it does not share memory with the oom killed thread.  It's
>> possible that it will immediately retry its memory allocation and fail,
>> but it will immediately be given access to memory reserves if it calls
>> the oom killer again.
>>
>> This prevents the delay of memory freeing when threads that share memory
>> with the oom killed thread get unnecessarily scheduled.
>>
>> Signed-off-by: David Rientjes<rientjes@google.com>
>> ---
>>   mm/oom_kill.c |   11 +++++------
>>   1 file changed, 5 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -746,11 +746,11 @@ out:
>>   	read_unlock(&tasklist_lock);
>>
>>   	/*
>> -	 * Give "p" a good chance of killing itself before we
>> -	 * retry to allocate memory unless "p" is current
>> +	 * Give the killed threads a good chance of exiting before trying to
>> +	 * allocate memory again.
>>   	 */
>> -	if (killed&&  !test_thread_flag(TIF_MEMDIE))
>> -		schedule_timeout_uninterruptible(1);
>> +	if (killed)
>> +		schedule_timeout_killable(1);
>>   }
>
> This is not match I expected. but I have no seen a big problem.
>
> Acked-by: KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com>
>

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

I'll check memcg part to make it consistent to this when this goes to -mm.




--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [patch v3] mm, oom: do not schedule if current has been killed
  2012-06-20  0:38               ` Kamezawa Hiroyuki
@ 2012-06-21  1:23                 ` David Rientjes
  0 siblings, 0 replies; 16+ messages in thread
From: David Rientjes @ 2012-06-21  1:23 UTC (permalink / raw)
  To: Kamezawa Hiroyuki
  Cc: KOSAKI Motohiro, Oleg Nesterov, Andrew Morton, KOSAKI Motohiro, linux-mm

On Wed, 20 Jun 2012, Kamezawa Hiroyuki wrote:

> I'll check memcg part to make it consistent to this when this goes to -mm.
> 

It's merged.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-06-21  1:23 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-19  1:08 [patch] mm, oom: do not schedule if current has been killed David Rientjes
2012-06-19  1:57 ` Kamezawa Hiroyuki
2012-06-19  2:23   ` David Rientjes
2012-06-19  2:31     ` [patch v2] " David Rientjes
2012-06-19  2:51       ` Kamezawa Hiroyuki
2012-06-19  6:03       ` KOSAKI Motohiro
2012-06-19  6:26         ` David Rientjes
2012-06-19 17:32           ` KOSAKI Motohiro
2012-06-19 18:59             ` David Rientjes
2012-06-19 19:29               ` KOSAKI Motohiro
2012-06-19 13:55       ` Oleg Nesterov
2012-06-19 20:24         ` David Rientjes
2012-06-19 20:58           ` [patch v3] " David Rientjes
2012-06-19 21:39             ` KOSAKI Motohiro
2012-06-20  0:38               ` Kamezawa Hiroyuki
2012-06-21  1:23                 ` David Rientjes

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.