[PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
@ 2007-06-05 17:48 Will Schmidt
  2007-06-05 17:48 ` [PATCH 2/3] [PATCH powerpc] " Will Schmidt
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Will Schmidt @ 2007-06-05 17:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: anton, liuxppc-dev, will_schmidt


When we get into a state where VM has ran out of memory, and it's time to
thwack a process, we should take out the entire process group, rather than 
just one thread.

Tested on i386

Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
---

 arch/i386/mm/fault.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
index b8c4e25..82aec0e 100644
--- a/arch/i386/mm/fault.c
+++ b/arch/i386/mm/fault.c
@@ -567,8 +567,10 @@ out_of_memory:
 		goto survive;
 	}
 	printk("VM: killing process %s\n", tsk->comm);
-	if (error_code & 4)
+	if (error_code & 4) {
+		zap_other_threads(tsk);
 		do_exit(SIGKILL);
+	}
 	goto no_context;
 
 do_sigbus:



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] [PATCH powerpc] during VM oom condition, kill all threads in process group
  2007-06-05 17:48 [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group Will Schmidt
@ 2007-06-05 17:48 ` Will Schmidt
  2007-06-05 18:17   ` Will Schmidt
  2007-06-05 17:48 ` [PATCH 3/3] [PATCH x86_64] " Will Schmidt
  2007-06-07 22:34 ` [PATCH 1/3] [PATCH i386] " Andrew Morton
  2 siblings, 1 reply; 12+ messages in thread
From: Will Schmidt @ 2007-06-05 17:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: anton, liuxppc-dev, will_schmidt


When we get into a state where VM has ran out of memory, and it's time to
thwack a process, we should take out the entire process group, rather than
just one thread.

Tested on POWER5.

Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
---

 arch/powerpc/mm/fault.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 03aeb3a..9afe871 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -392,8 +392,10 @@ out_of_memory:
 		goto survive;
 	}
 	printk("VM: killing process %s\n", current->comm);
-	if (user_mode(regs))
+	if (user_mode(regs)) {
+		zap_other_threads(current);
 		do_exit(SIGKILL);
+	}
 	return SIGKILL;
 
 do_sigbus:



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] [PATCH x86_64] during VM oom condition, kill all threads in process group
  2007-06-05 17:48 [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group Will Schmidt
  2007-06-05 17:48 ` [PATCH 2/3] [PATCH powerpc] " Will Schmidt
@ 2007-06-05 17:48 ` Will Schmidt
  2007-06-07 22:34 ` [PATCH 1/3] [PATCH i386] " Andrew Morton
  2 siblings, 0 replies; 12+ messages in thread
From: Will Schmidt @ 2007-06-05 17:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: anton, liuxppc-dev, will_schmidt


When we get into a state where VM has ran out of memory, and it's time to
thwack a process, we should take out the entire process group, rather than 
just one thread.

Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
---

 arch/x86_64/mm/fault.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86_64/mm/fault.c b/arch/x86_64/mm/fault.c
index 6ada723..2a3060e 100644
--- a/arch/x86_64/mm/fault.c
+++ b/arch/x86_64/mm/fault.c
@@ -562,8 +562,10 @@ out_of_memory:
 		goto again;
 	}
 	printk("VM: killing process %s\n", tsk->comm);
-	if (error_code & 4)
+	if (error_code & 4) {
+		zap_other_threads(tsk);
 		do_exit(SIGKILL);
+	}
 	goto no_context;
 
 do_sigbus:



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] [PATCH powerpc] during VM oom condition, kill all threads in process group
  2007-06-05 17:48 ` [PATCH 2/3] [PATCH powerpc] " Will Schmidt
@ 2007-06-05 18:17   ` Will Schmidt
  0 siblings, 0 replies; 12+ messages in thread
From: Will Schmidt @ 2007-06-05 18:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: anton, linuxppc-dev


Whoops..  sorry about any reply bounces, I flubbed the cc to
linuxppc-dev@ozlabs.org . 

-Will

On Tue, 2007-05-06 at 12:48 -0500, Will Schmidt wrote:
> When we get into a state where VM has ran out of memory, and it's time to
> thwack a process, we should take out the entire process group, rather than
> just one thread.
> 
> Tested on POWER5.
> 
> Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
> ---
> 
>  arch/powerpc/mm/fault.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 03aeb3a..9afe871 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -392,8 +392,10 @@ out_of_memory:
>  		goto survive;
>  	}
>  	printk("VM: killing process %s\n", current->comm);
> -	if (user_mode(regs))
> +	if (user_mode(regs)) {
> +		zap_other_threads(current);
>  		do_exit(SIGKILL);
> +	}
>  	return SIGKILL;
>  
>  do_sigbus:
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
  2007-06-05 17:48 [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group Will Schmidt
  2007-06-05 17:48 ` [PATCH 2/3] [PATCH powerpc] " Will Schmidt
  2007-06-05 17:48 ` [PATCH 3/3] [PATCH x86_64] " Will Schmidt
@ 2007-06-07 22:34 ` Andrew Morton
  2007-06-07 23:16   ` Anton Blanchard
  2 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2007-06-07 22:34 UTC (permalink / raw)
  To: Will Schmidt; +Cc: linux-kernel, anton, liuxppc-dev

On Tue, 05 Jun 2007 12:48:32 -0500
Will Schmidt <will_schmidt@vnet.ibm.com> wrote:

> 
> When we get into a state where VM has ran out of memory, and it's time to
> thwack a process, we should take out the entire process group, rather than 
> just one thread.
> 
> Tested on i386
> 
> Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
> ---
> 
>  arch/i386/mm/fault.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/i386/mm/fault.c b/arch/i386/mm/fault.c
> index b8c4e25..82aec0e 100644
> --- a/arch/i386/mm/fault.c
> +++ b/arch/i386/mm/fault.c
> @@ -567,8 +567,10 @@ out_of_memory:
>  		goto survive;
>  	}
>  	printk("VM: killing process %s\n", tsk->comm);
> -	if (error_code & 4)
> +	if (error_code & 4) {
> +		zap_other_threads(tsk);
>  		do_exit(SIGKILL);
> +	}
>  	goto no_context;
>  

zap_other_threads() requires tasklist_lock.

If we're going to do this then we should probably create some new function
(with a better name) which takes tasklsit_lock and then calls
zap_other_threads().

Does this patch fix any observed-in-the-real-world problem?  If so, please
describe it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
  2007-06-07 22:34 ` [PATCH 1/3] [PATCH i386] " Andrew Morton
@ 2007-06-07 23:16   ` Anton Blanchard
  2007-06-08  0:10     ` Andrew Morton
  0 siblings, 1 reply; 12+ messages in thread
From: Anton Blanchard @ 2007-06-07 23:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Will Schmidt, linux-kernel, liuxppc-dev

 
Hi,

> zap_other_threads() requires tasklist_lock.
> 
> If we're going to do this then we should probably create some new function
> (with a better name) which takes tasklsit_lock and then calls
> zap_other_threads().
> 
> Does this patch fix any observed-in-the-real-world problem?  If so, please
> describe it.

Yeah we have had complaints where threaded apps have only one thread
shot down instead of the entire process. This leaves the application in
a bad state, whereas if it had been killed cleanly the application could
have restarted.

My understanding is that fatal signals should kill all threads in the
group.

Anton

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
  2007-06-07 23:16   ` Anton Blanchard
@ 2007-06-08  0:10     ` Andrew Morton
  2007-06-08 19:19       ` Will Schmidt
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2007-06-08  0:10 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: Will Schmidt, linux-kernel, liuxppc-dev

On Thu, 7 Jun 2007 18:16:21 -0500
Anton Blanchard <anton@samba.org> wrote:

>  
> Hi,
> 
> > zap_other_threads() requires tasklist_lock.
> > 
> > If we're going to do this then we should probably create some new function
> > (with a better name) which takes tasklsit_lock and then calls
> > zap_other_threads().
> > 
> > Does this patch fix any observed-in-the-real-world problem?  If so, please
> > describe it.
> 
> Yeah we have had complaints where threaded apps have only one thread
> shot down instead of the entire process. This leaves the application in
> a bad state, whereas if it had been killed cleanly the application could
> have restarted.
> 
> My understanding is that fatal signals should kill all threads in the
> group.
> 

OK, well could we please get all that info appropriatelt captured in #2's
changelog?

Other architectures will probably need to implement this.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
  2007-06-08  0:10     ` Andrew Morton
@ 2007-06-08 19:19       ` Will Schmidt
  2007-06-08 19:32         ` Andrew Morton
  0 siblings, 1 reply; 12+ messages in thread
From: Will Schmidt @ 2007-06-08 19:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Anton Blanchard, linux-kernel, linuxppc-dev

On Thu, 2007-06-07 at 17:10 -0700, Andrew Morton wrote:
> On Thu, 7 Jun 2007 18:16:21 -0500
> Anton Blanchard <anton@samba.org> wrote:
> 
> >  
> > Hi,
> > 
> > > zap_other_threads() requires tasklist_lock.

Yup, I missed that.   Thanks for pointing it out.

> > > 
> > > If we're going to do this then we should probably create some new function
> > > (with a better name) which takes tasklsit_lock and then calls
> > > zap_other_threads().

I expect this will be a write_lock_irq() since zap_other_threads will be
doing a bit more than just reading the task info.

This will be down in a do-page-fault failure path (see
arch/*/mm/fault.c).  I wonder if calling write_lock is going to be safe,
or if its possible to get into a deadlock?  i.e. should I branch back up
to the survive: label if I can't take the lock?  Would that even be
sufficient? or is it not an issue here? 

> > > 
> > > Does this patch fix any observed-in-the-real-world problem?  If so, please
> > > describe it.
> > 
> > Yeah we have had complaints where threaded apps have only one thread
> > shot down instead of the entire process. This leaves the application in
> > a bad state, whereas if it had been killed cleanly the application could
> > have restarted.
> > 
> > My understanding is that fatal signals should kill all threads in the
> > group.
> > 
> 
> OK, well could we please get all that info appropriatelt captured in #2's
> changelog?
Yup, next spin I'll add more to the changelog. 

> 
> Other architectures will probably need to implement this.

-Will


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
  2007-06-08 19:19       ` Will Schmidt
@ 2007-06-08 19:32         ` Andrew Morton
  2007-06-08 21:12           ` Will Schmidt
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2007-06-08 19:32 UTC (permalink / raw)
  To: will_schmidt
  Cc: Anton Blanchard, linux-kernel, linuxppc-dev, Oleg Nesterov,
	Eric W. Biederman

On Fri, 08 Jun 2007 14:19:18 -0500
Will Schmidt <will_schmidt@vnet.ibm.com> wrote:

> > > > zap_other_threads() requires tasklist_lock.
> 
> Yup, I missed that.   Thanks for pointing it out.
> 
> > > > 
> > > > If we're going to do this then we should probably create some new function
> > > > (with a better name) which takes tasklsit_lock and then calls
> > > > zap_other_threads().
> 
> I expect this will be a write_lock_irq() since zap_other_threads will be
> doing a bit more than just reading the task info.

No, I think read_lock() will be sufficient.

In fact, it's probably the case that rcu_read_lock() is now sufficient
locking coverage for zap_other_threads() (cc's people).

It had better be, because do_group_exit() forgot to take tasklist_lock.  It
is perhaps relying upon spin_lock()'s hidden rcu_read_lock() properties
without so much as a code comment, which would be somewhat nasty of it.

You could perhaps just call do_group_exit() from within the fault handler,
btw.

> This will be down in a do-page-fault failure path (see
> arch/*/mm/fault.c).  I wonder if calling write_lock is going to be safe,
> or if its possible to get into a deadlock?  i.e. should I branch back up
> to the survive: label if I can't take the lock?  Would that even be
> sufficient? or is it not an issue here? 

You can take the lock in the fault handler.  Nobody should be getting
pagefaults while holding tasklist_lock.  (Well, a vmalloc fault might, but
that's a special-case which doesn't allocate memory or anything like that).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
  2007-06-08 19:32         ` Andrew Morton
@ 2007-06-08 21:12           ` Will Schmidt
  2007-06-08 22:48             ` Eric W. Biederman
  0 siblings, 1 reply; 12+ messages in thread
From: Will Schmidt @ 2007-06-08 21:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linuxppc-dev, Eric W. Biederman, Oleg Nesterov, Anton Blanchard,
	linux-kernel

On Fri, 2007-06-08 at 12:32 -0700, Andrew Morton wrote:
> On Fri, 08 Jun 2007 14:19:18 -0500
> Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
> 
> > > > > zap_other_threads() requires tasklist_lock.
> > 

> In fact, it's probably the case that rcu_read_lock() is now sufficient
> locking coverage for zap_other_threads() (cc's people).
> 
> It had better be, because do_group_exit() forgot to take tasklist_lock.  It
> is perhaps relying upon spin_lock()'s hidden rcu_read_lock() properties
> without so much as a code comment, which would be somewhat nasty of it.

> You could perhaps just call do_group_exit() from within the fault
> handler,
> btw.

Yup, so looks like I can actually replace the existing do_exit() call
with do_group_exit().   I'll sit on this for a bit to give other folks a
chance to comment on which lock call is sufficient, read_lock() or
rcu_read_lock(), etc;  and do_group_exit()'s issue with taking
tasklist_lock. 

Thanks,

-Will


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
  2007-06-08 21:12           ` Will Schmidt
@ 2007-06-08 22:48             ` Eric W. Biederman
  2007-06-13 15:51               ` Oleg Nesterov
  0 siblings, 1 reply; 12+ messages in thread
From: Eric W. Biederman @ 2007-06-08 22:48 UTC (permalink / raw)
  To: will_schmidt
  Cc: Andrew Morton, linuxppc-dev, Oleg Nesterov, Anton Blanchard,
	linux-kernel

Will Schmidt <will_schmidt@vnet.ibm.com> writes:

> On Fri, 2007-06-08 at 12:32 -0700, Andrew Morton wrote:
>> On Fri, 08 Jun 2007 14:19:18 -0500
>> Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
>> 
>> > > > > zap_other_threads() requires tasklist_lock.
>> > 
>
>> In fact, it's probably the case that rcu_read_lock() is now sufficient
>> locking coverage for zap_other_threads() (cc's people).
>> 
>> It had better be, because do_group_exit() forgot to take tasklist_lock.  It
>> is perhaps relying upon spin_lock()'s hidden rcu_read_lock() properties
>> without so much as a code comment, which would be somewhat nasty of it.
>
>> You could perhaps just call do_group_exit() from within the fault
>> handler,
>> btw.
>
> Yup, so looks like I can actually replace the existing do_exit() call
> with do_group_exit().   I'll sit on this for a bit to give other folks a
> chance to comment on which lock call is sufficient, read_lock() or
> rcu_read_lock(), etc;  and do_group_exit()'s issue with taking
> tasklist_lock. 

No.  The rcu_read_lock is not sufficient.
Yes.  sighand->siglock is enough, and we explicitly take it in
do_group_exit before calling zap_other_threads.

Unless I have completely miss-understood this thread.

Eric

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group
  2007-06-08 22:48             ` Eric W. Biederman
@ 2007-06-13 15:51               ` Oleg Nesterov
  0 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2007-06-13 15:51 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: will_schmidt, Andrew Morton, linuxppc-dev, Anton Blanchard, linux-kernel

On 06/08, Eric W. Biederman wrote:
>
> Will Schmidt <will_schmidt@vnet.ibm.com> writes:
> 
> > On Fri, 2007-06-08 at 12:32 -0700, Andrew Morton wrote:
> >> On Fri, 08 Jun 2007 14:19:18 -0500
> >> Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
> >> 
> >> > > > > zap_other_threads() requires tasklist_lock.
> >> > 
> >
> >> In fact, it's probably the case that rcu_read_lock() is now sufficient
> >> locking coverage for zap_other_threads() (cc's people).
> >> 
> >> It had better be, because do_group_exit() forgot to take tasklist_lock.  It
> >> is perhaps relying upon spin_lock()'s hidden rcu_read_lock() properties
> >> without so much as a code comment, which would be somewhat nasty of it.
> >
> >> You could perhaps just call do_group_exit() from within the fault
> >> handler,
> >> btw.
> >
> > Yup, so looks like I can actually replace the existing do_exit() call
> > with do_group_exit().   I'll sit on this for a bit to give other folks a
> > chance to comment on which lock call is sufficient, read_lock() or
> > rcu_read_lock(), etc;  and do_group_exit()'s issue with taking
> > tasklist_lock. 
> 
> No.  The rcu_read_lock is not sufficient.
> Yes.  sighand->siglock is enough, and we explicitly take it in
> do_group_exit before calling zap_other_threads.

Yes, we don't need tasklist_lock (or rcu_read_lock).

de_thread() calls zap_other_threads() under tasklist_lock, but this
is because we can change child_reaper.

Oleg.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2007-06-13 15:51 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-06-05 17:48 [PATCH 1/3] [PATCH i386] during VM oom condition, kill all threads in process group Will Schmidt
2007-06-05 17:48 ` [PATCH 2/3] [PATCH powerpc] " Will Schmidt
2007-06-05 18:17   ` Will Schmidt
2007-06-05 17:48 ` [PATCH 3/3] [PATCH x86_64] " Will Schmidt
2007-06-07 22:34 ` [PATCH 1/3] [PATCH i386] " Andrew Morton
2007-06-07 23:16   ` Anton Blanchard
2007-06-08  0:10     ` Andrew Morton
2007-06-08 19:19       ` Will Schmidt
2007-06-08 19:32         ` Andrew Morton
2007-06-08 21:12           ` Will Schmidt
2007-06-08 22:48             ` Eric W. Biederman
2007-06-13 15:51               ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).