[PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK
@ 2013-02-14 23:38 Mandeep Singh Baines
  2013-02-14 23:38 ` [PATCH 2/5] lockdep: check that no locks held at freeze time Mandeep Singh Baines
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Mandeep Singh Baines @ 2013-02-14 23:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mandeep Singh Baines, Oleg Nesterov, Tejun Heo, Andrew Morton,
	Rafael J. Wysocki, Ingo Molnar

We don't need to call freezer_do_not_count() for in-kernel users
of CLONE_VFORK since exec will get called in bounded time.

We don't want to call freezer_count() for in-kernel users because
they may be holding locks. freezer_count() calls try_to_freeze().
We don't want to freeze an in-kernel user because it may be
holding locks.

In a follow-up patch, I call debug_check_no_locks_held() from
try_to_freeze(). After applying this patch, I get no lockdep
warnings with that patch.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Tejun Heo <tj@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rafael J. Wysocki <rjw@sisk.pl>
CC: Ingo Molnar <mingo@redhat.com>
---
 kernel/fork.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index c535f33..a7cd973 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -722,9 +722,11 @@ static int wait_for_vfork_done(struct task_struct *child,
 {
 	int killed;
 
-	freezer_do_not_count();
+	if (current->mm)
+		freezer_do_not_count();
 	killed = wait_for_completion_killable(vfork);
-	freezer_count();
+	if (current->mm)
+		freezer_count();
 
 	if (killed) {
 		task_lock(child);
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/5] lockdep: check that no locks held at freeze time
  2013-02-14 23:38 [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Mandeep Singh Baines
@ 2013-02-14 23:38 ` Mandeep Singh Baines
  2013-02-15 11:16   ` Ingo Molnar
  2013-02-15 15:44   ` Oleg Nesterov
  2013-02-14 23:38 ` [PATCH 3/5] coredump: cleanup the waiting for coredump_finish code Mandeep Singh Baines
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Mandeep Singh Baines @ 2013-02-14 23:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mandeep Singh Baines, Oleg Nesterov, Tejun Heo, Andrew Morton,
	Rafael J. Wysocki, Ingo Molnar

We shouldn't try_to_freeze if locks are held. Verified that
I get no lockdep warnings after applying this patch and
"vfork: don't freezer_count() for in-kernel users of CLONE_VFORK".

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Tejun Heo <tj@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rafael J. Wysocki <rjw@sisk.pl>
CC: Ingo Molnar <mingo@redhat.com>
---
 include/linux/freezer.h | 2 ++
 kernel/lockdep.c        | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index e4238ce..1538cfc 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -3,6 +3,7 @@
 #ifndef FREEZER_H_INCLUDED
 #define FREEZER_H_INCLUDED
 
+#include <linux/debug_locks.h>
 #include <linux/sched.h>
 #include <linux/wait.h>
 #include <linux/atomic.h>
@@ -43,6 +44,7 @@ extern void thaw_kernel_threads(void);
 
 static inline bool try_to_freeze(void)
 {
+	debug_check_no_locks_held(current);
 	might_sleep();
 	if (likely(!freezing(current)))
 		return false;
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 7981e5b..e3ee8af 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -4091,10 +4091,10 @@ static void print_held_locks_bug(struct task_struct *curr)
 		return;
 
 	printk("\n");
-	printk("=====================================\n");
-	printk("[ BUG: lock held at task exit time! ]\n");
+	printk("=======================================\n");
+	printk("[ BUG: lock held at exit/freeze time! ]\n");
 	print_kernel_ident();
-	printk("-------------------------------------\n");
+	printk("---------------------------------------\n");
 	printk("%s/%d is exiting with locks still held!\n",
 		curr->comm, task_pid_nr(curr));
 	lockdep_print_held_locks(curr);
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] lockdep: check that no locks held at freeze time
  2013-02-14 23:38 ` [PATCH 2/5] lockdep: check that no locks held at freeze time Mandeep Singh Baines
@ 2013-02-15 11:16   ` Ingo Molnar
  2013-02-15 15:44   ` Oleg Nesterov
  1 sibling, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2013-02-15 11:16 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Oleg Nesterov, Tejun Heo, Andrew Morton,
	Rafael J. Wysocki, Ingo Molnar


* Mandeep Singh Baines <msb@chromium.org> wrote:

> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -4091,10 +4091,10 @@ static void print_held_locks_bug(struct task_struct *curr)
>  		return;
>  
>  	printk("\n");
> -	printk("=====================================\n");
> -	printk("[ BUG: lock held at task exit time! ]\n");
> +	printk("=======================================\n");
> +	printk("[ BUG: lock held at exit/freeze time! ]\n");

Looks good, but to make it more apparent which case triggers, 
please pass in a 'msg' string that gets printed - and the msg is 
either "exit" or "freeze", depending on from where it's called.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] lockdep: check that no locks held at freeze time
  2013-02-14 23:38 ` [PATCH 2/5] lockdep: check that no locks held at freeze time Mandeep Singh Baines
  2013-02-15 11:16   ` Ingo Molnar
@ 2013-02-15 15:44   ` Oleg Nesterov
  1 sibling, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2013-02-15 15:44 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Tejun Heo, Andrew Morton, Rafael J. Wysocki, Ingo Molnar

On 02/14, Mandeep Singh Baines wrote:
>
> We shouldn't try_to_freeze if locks are held. Verified that
> I get no lockdep warnings after applying this patch and
> "vfork: don't freezer_count() for in-kernel users of CLONE_VFORK".

Ah. Now I understand why you did 1/5. Beacuse you call
debug_check_no_locks_held() right at the start of try_to_freeze(),
even if the caller can't be frozen.

>  static inline bool try_to_freeze(void)
>  {
> +	debug_check_no_locks_held(current);
>  	might_sleep();
>  	if (likely(!freezing(current)))
>  		return false;

Up to maintainers, but perhaps you should check !PF_NOFREEZE at least?

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 3/5] coredump: cleanup the waiting for coredump_finish code
  2013-02-14 23:38 [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Mandeep Singh Baines
  2013-02-14 23:38 ` [PATCH 2/5] lockdep: check that no locks held at freeze time Mandeep Singh Baines
@ 2013-02-14 23:38 ` Mandeep Singh Baines
  2013-02-15 14:53   ` Oleg Nesterov
  2013-02-15 23:30   ` Andrew Morton
  2013-02-14 23:38 ` [PATCH 4/5] coredump: use a freezable_schedule for the coredump_finish wait Mandeep Singh Baines
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Mandeep Singh Baines @ 2013-02-14 23:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mandeep Singh Baines, Oleg Nesterov, Tejun Heo, Andrew Morton,
	Rafael J. Wysocki, Ingo Molnar

Replace the for loop with a simple if.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Tejun Heo <tj@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rafael J. Wysocki <rjw@sisk.pl>
CC: Ingo Molnar <mingo@redhat.com>
---
 kernel/exit.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index b4df219..f215198 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -479,12 +479,9 @@ static void exit_mm(struct task_struct * tsk)
 		if (atomic_dec_and_test(&core_state->nr_threads))
 			complete(&core_state->startup);
 
-		for (;;) {
-			set_task_state(tsk, TASK_UNINTERRUPTIBLE);
-			if (!self.task) /* see coredump_finish() */
-				break;
+		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
+		if (self.task) /* see coredump_finish() */
 			schedule();
-		}
 		__set_task_state(tsk, TASK_RUNNING);
 		down_read(&mm->mmap_sem);
 	}
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/5] coredump: cleanup the waiting for coredump_finish code
  2013-02-14 23:38 ` [PATCH 3/5] coredump: cleanup the waiting for coredump_finish code Mandeep Singh Baines
@ 2013-02-15 14:53   ` Oleg Nesterov
  2013-02-15 23:30   ` Andrew Morton
  1 sibling, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2013-02-15 14:53 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Tejun Heo, Andrew Morton, Rafael J. Wysocki, Ingo Molnar

On 02/14, Mandeep Singh Baines wrote:
>
> Replace the for loop with a simple if.

Why?

> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -479,12 +479,9 @@ static void exit_mm(struct task_struct * tsk)
>  		if (atomic_dec_and_test(&core_state->nr_threads))
>  			complete(&core_state->startup);
>
> -		for (;;) {
> -			set_task_state(tsk, TASK_UNINTERRUPTIBLE);
> -			if (!self.task) /* see coredump_finish() */
> -				break;
> +		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
> +		if (self.task) /* see coredump_finish() */
>  			schedule();
> -		}

If you think we should not worry about spurious wakeups you can simplify
this code even more, you do not need mb() to set TASK_UNINTERRUPTIBLE
(just move it before dec_and_test), you do not need "if (self.task)",
you do not need __set_task_state(tsk, TASK_RUNNING).

But I think we should always worry, so why?

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/5] coredump: cleanup the waiting for coredump_finish code
  2013-02-14 23:38 ` [PATCH 3/5] coredump: cleanup the waiting for coredump_finish code Mandeep Singh Baines
  2013-02-15 14:53   ` Oleg Nesterov
@ 2013-02-15 23:30   ` Andrew Morton
  1 sibling, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2013-02-15 23:30 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Oleg Nesterov, Tejun Heo, Rafael J. Wysocki, Ingo Molnar

On Thu, 14 Feb 2013 15:38:14 -0800
Mandeep Singh Baines <msb@chromium.org> wrote:

> Replace the for loop with a simple if.

Well OK, but why?  Presumably the loop was added for a reason and
presumably you believe that reason to be (no longer?) correct.  Please
describe all these things.

> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -479,12 +479,9 @@ static void exit_mm(struct task_struct * tsk)
>  		if (atomic_dec_and_test(&core_state->nr_threads))
>  			complete(&core_state->startup);
>  
> -		for (;;) {
> -			set_task_state(tsk, TASK_UNINTERRUPTIBLE);
> -			if (!self.task) /* see coredump_finish() */
> -				break;
> +		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
> +		if (self.task) /* see coredump_finish() */
>  			schedule();
> -		}

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 4/5] coredump: use a freezable_schedule for the coredump_finish wait
  2013-02-14 23:38 [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Mandeep Singh Baines
  2013-02-14 23:38 ` [PATCH 2/5] lockdep: check that no locks held at freeze time Mandeep Singh Baines
  2013-02-14 23:38 ` [PATCH 3/5] coredump: cleanup the waiting for coredump_finish code Mandeep Singh Baines
@ 2013-02-14 23:38 ` Mandeep Singh Baines
  2013-02-14 23:38 ` [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal Mandeep Singh Baines
  2013-02-15 15:28 ` [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Oleg Nesterov
  4 siblings, 0 replies; 16+ messages in thread
From: Mandeep Singh Baines @ 2013-02-14 23:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mandeep Singh Baines, Oleg Nesterov, Tejun Heo, Andrew Morton,
	Rafael J. Wysocki, Ingo Molnar

Prevents hung_task detector from panicing the machine. This is also
needed to prevent this wait from blocking suspend.

(It doesnt' currently block suspend but it would once the next
patch in this series is applied.)

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Tejun Heo <tj@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rafael J. Wysocki <rjw@sisk.pl>
CC: Ingo Molnar <mingo@redhat.com>
---
 kernel/exit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index f215198..04d9db0 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -20,6 +20,7 @@
 #include <linux/tsacct_kern.h>
 #include <linux/file.h>
 #include <linux/fdtable.h>
+#include <linux/freezer.h>
 #include <linux/binfmts.h>
 #include <linux/nsproxy.h>
 #include <linux/pid_namespace.h>
@@ -481,7 +482,7 @@ static void exit_mm(struct task_struct * tsk)
 
 		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
 		if (self.task) /* see coredump_finish() */
-			schedule();
+			freezable_schedule();
 		__set_task_state(tsk, TASK_RUNNING);
 		down_read(&mm->mmap_sem);
 	}
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal
  2013-02-14 23:38 [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Mandeep Singh Baines
                   ` (2 preceding siblings ...)
  2013-02-14 23:38 ` [PATCH 4/5] coredump: use a freezable_schedule for the coredump_finish wait Mandeep Singh Baines
@ 2013-02-14 23:38 ` Mandeep Singh Baines
  2013-02-15 15:01   ` Oleg Nesterov
                     ` (2 more replies)
  2013-02-15 15:28 ` [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Oleg Nesterov
  4 siblings, 3 replies; 16+ messages in thread
From: Mandeep Singh Baines @ 2013-02-14 23:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ben Chan, Mandeep Singh Baines, Oleg Nesterov, Tejun Heo,
	Andrew Morton, Rafael J. Wysocki, Ingo Molnar

From: Ben Chan <benchan@chromium.org>

This patch makes wait_for_dump_helpers() not to abort piping the core
dump data when the crashing process has received any but a fatal signal
(SIGKILL). The rationale is that a crashing process may still receive
uninteresting signals such as SIGCHLD when its core dump data is being
redirected to a helper application. While it's necessary to allow
terminating the core dump piping via SIGKILL, it's practically more
useful for the purpose of debugging and crash reporting if the core dump
piping is not aborted due to other non-fatal signals.

Signed-off-by: Ben Chan <benchan@chromium.org>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Tejun Heo <tj@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rafael J. Wysocki <rjw@sisk.pl>
CC: Ingo Molnar <mingo@redhat.com>
---
 fs/coredump.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index 1774932..54e31a3 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -32,6 +32,7 @@
 #include <linux/pipe_fs_i.h>
 #include <linux/oom.h>
 #include <linux/compat.h>
+#include <linux/freezer.h>
 
 #include <asm/uaccess.h>
 #include <asm/mmu_context.h>
@@ -417,10 +418,13 @@ static void wait_for_dump_helpers(struct file *file)
 	pipe->readers++;
 	pipe->writers--;
 
-	while ((pipe->readers > 1) && (!signal_pending(current))) {
+	while ((pipe->readers > 1) && (!fatal_signal_pending(current))) {
 		wake_up_interruptible_sync(&pipe->wait);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 		pipe_wait(pipe);
+		pipe_unlock(pipe);
+		try_to_freeze();
+		pipe_lock(pipe);
 	}
 
 	pipe->readers--;
-- 
1.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal
  2013-02-14 23:38 ` [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal Mandeep Singh Baines
@ 2013-02-15 15:01   ` Oleg Nesterov
  2013-02-16  1:20     ` Mandeep Singh Baines
  2013-02-15 23:25   ` Andrew Morton
  2013-02-16  0:09   ` [PATCH v3] coredump: ignore non-fatal signals when core dumping to a pipe Mandeep Singh Baines
  2 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2013-02-15 15:01 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Ben Chan, Tejun Heo, Andrew Morton,
	Rafael J. Wysocki, Ingo Molnar

On 02/14, Mandeep Singh Baines wrote:
>
> This patch makes wait_for_dump_helpers() not to abort piping the core
> dump data when the crashing process has received any but a fatal signal
> (SIGKILL). The rationale is that a crashing process may still receive
> uninteresting signals such as SIGCHLD when its core dump data is being
> redirected to a helper application.

You already sent this change in the past ;) and I already reviewed it.

It is not enough and imho not good. Damn, I'll try very much to make the
patches on weekend...

> -	while ((pipe->readers > 1) && (!signal_pending(current))) {
> +	while ((pipe->readers > 1) && (!fatal_signal_pending(current))) {

This turns pipe_wait() belowe into the busy-wait loop if signal_pending().
Not good. And not enough, there are other reasons why coredump can fail
if the signal is pending.

>  		wake_up_interruptible_sync(&pipe->wait);
>  		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
>  		pipe_wait(pipe);
> +		pipe_unlock(pipe);
> +		try_to_freeze();

Oh, yes. One of the problems with coredump/signals is freezer. Not sure
what should we do...

But if we add try_to_freeze() here, we need to add more try_to_freeze's,
think about dumping the huge core on the slow media.

Oleg.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal
  2013-02-15 15:01   ` Oleg Nesterov
@ 2013-02-16  1:20     ` Mandeep Singh Baines
  2013-02-16 17:05       ` Oleg Nesterov
  0 siblings, 1 reply; 16+ messages in thread
From: Mandeep Singh Baines @ 2013-02-16  1:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Ben Chan, Tejun Heo, Andrew Morton,
	Rafael J. Wysocki, Ingo Molnar

On Fri, Feb 15, 2013 at 7:01 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 02/14, Mandeep Singh Baines wrote:
>>
>> This patch makes wait_for_dump_helpers() not to abort piping the core
>> dump data when the crashing process has received any but a fatal signal
>> (SIGKILL). The rationale is that a crashing process may still receive
>> uninteresting signals such as SIGCHLD when its core dump data is being
>> redirected to a helper application.
>
> You already sent this change in the past ;) and I already reviewed it.
>
> It is not enough and imho not good. Damn, I'll try very much to make the
> patches on weekend...
>
>> -     while ((pipe->readers > 1) && (!signal_pending(current))) {
>> +     while ((pipe->readers > 1) && (!fatal_signal_pending(current))) {
>
> This turns pipe_wait() belowe into the busy-wait loop if signal_pending().

D'oh. Thanks for catching that.

Fixed in v3 by blocking non-fatal signals.

> Not good. And not enough, there are other reasons why coredump can fail
> if the signal is pending.

What other reasons did you have in mind?

Since applying an earlier version of this patch, truncated/missing
coredumps are no longer any issue for us.

Maybe it could fail in binfmt->core_dump().

Could the other reasons be addressed in another patch?

>
>>               wake_up_interruptible_sync(&pipe->wait);
>>               kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
>>               pipe_wait(pipe);
>> +             pipe_unlock(pipe);
>> +             try_to_freeze();
>
> Oh, yes. One of the problems with coredump/signals is freezer. Not sure
> what should we do...
>
> But if we add try_to_freeze() here, we need to add more try_to_freeze's,
> think about dumping the huge core on the slow media.
>

We could add more try_to_freeze()s in the dump_write paths to work
even better with freezer. Do you see any issues with just adding it
here for a start. It fixes the non-slow media case.

Thanks much for reviewing this patch.

Regards,
Mandeep

> Oleg.
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal
  2013-02-16  1:20     ` Mandeep Singh Baines
@ 2013-02-16 17:05       ` Oleg Nesterov
  0 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2013-02-16 17:05 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Ben Chan, Tejun Heo, Andrew Morton,
	Rafael J. Wysocki, Ingo Molnar

On 02/15, Mandeep Singh Baines wrote:
>
> On Fri, Feb 15, 2013 at 7:01 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > It is not enough and imho not good. Damn, I'll try very much to make the
> > patches on weekend...
> >
> >> -     while ((pipe->readers > 1) && (!signal_pending(current))) {
> >> +     while ((pipe->readers > 1) && (!fatal_signal_pending(current))) {
> >
> > This turns pipe_wait() belowe into the busy-wait loop if signal_pending().
>
> D'oh. Thanks for catching that.
>
> Fixed in v3 by blocking non-fatal signals.

Doesn't look correct...

> > Not good. And not enough, there are other reasons why coredump can fail
> > if the signal is pending.
>
> What other reasons did you have in mind?

Say, pipe_write() can fail if signal_pending() == T.

> Since applying an earlier version of this patch, truncated/missing
> coredumps are no longer any issue for us.

Sure, this "almost works". But this is doesn't really work.

And more importantly, we should fix another problem, SIGKILL should
really stop the coredumping, and I do not see a simple solution, the
main problem is the races with the exiting threads...

> Could the other reasons be addressed in another patch?

Well. Personally I believe we should fix the problems with signals
first, then add the freezer changes...

> >>               wake_up_interruptible_sync(&pipe->wait);
> >>               kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
> >>               pipe_wait(pipe);
> >> +             pipe_unlock(pipe);
> >> +             try_to_freeze();
> >
> > Oh, yes. One of the problems with coredump/signals is freezer. Not sure
> > what should we do...
> >
> > But if we add try_to_freeze() here, we need to add more try_to_freeze's,
> > think about dumping the huge core on the slow media.
> >
>
> We could add more try_to_freeze()s in the dump_write paths to work
> even better with freezer. Do you see any issues with just adding it
> here for a start. It fixes the non-slow media case.

The only issue is that, again, this change pretends to work but it doesn't ;)
IOW, imho you fix the symptom only.

Lets forget about the slow media, consider the piped coredump (the case
you are trying to fix). Suppose that try_to_freeze_tasks() is in progress,
the user-space coredump handler is already frozen, and the dumping thread
does pipe_write()->pipe_wait().

If only we could change pipe_wait() to do freezable_schedule()...

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal
  2013-02-14 23:38 ` [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal Mandeep Singh Baines
  2013-02-15 15:01   ` Oleg Nesterov
@ 2013-02-15 23:25   ` Andrew Morton
  2013-02-16  0:09   ` [PATCH v3] coredump: ignore non-fatal signals when core dumping to a pipe Mandeep Singh Baines
  2 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2013-02-15 23:25 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Ben Chan, Oleg Nesterov, Tejun Heo,
	Rafael J. Wysocki, Ingo Molnar

On Thu, 14 Feb 2013 15:38:16 -0800
Mandeep Singh Baines <msb@chromium.org> wrote:

> From: Ben Chan <benchan@chromium.org>
> 
> This patch makes wait_for_dump_helpers() not to abort piping the core
> dump data when the crashing process has received any but a fatal signal
> (SIGKILL). The rationale is that a crashing process may still receive
> uninteresting signals such as SIGCHLD when its core dump data is being
> redirected to a helper application. While it's necessary to allow
> terminating the core dump piping via SIGKILL, it's practically more
> useful for the purpose of debugging and crash reporting if the core dump
> piping is not aborted due to other non-fatal signals.

Sounds sensible.  The changelog is rather hard to understand.  Is the
below accurate?

: Make wait_for_dump_helpers() not abort piping the core dump data when the
: crashing process has received a non-fatal signal.  The abort still occurs
: in the case of SIGKILL.
: 
: The rationale is that a crashing process may still receive uninteresting
: signals such as SIGCHLD when its core dump data is being redirected to a
: helper application.  While it's necessary to allow terminating the core
: dump piping via SIGKILL, it's practically more useful for the purpose of
: debugging and crash reporting if the core dump piping is not aborted due
: to other non-fatal signals.

> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -32,6 +32,7 @@
>  #include <linux/pipe_fs_i.h>
>  #include <linux/oom.h>
>  #include <linux/compat.h>
> +#include <linux/freezer.h>
>  
>  #include <asm/uaccess.h>
>  #include <asm/mmu_context.h>
> @@ -417,10 +418,13 @@ static void wait_for_dump_helpers(struct file *file)
>  	pipe->readers++;
>  	pipe->writers--;
>  
> -	while ((pipe->readers > 1) && (!signal_pending(current))) {
> +	while ((pipe->readers > 1) && (!fatal_signal_pending(current))) {
>  		wake_up_interruptible_sync(&pipe->wait);
>  		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
>  		pipe_wait(pipe);
> +		pipe_unlock(pipe);
> +		try_to_freeze();
> +		pipe_lock(pipe);
>  	}

The new call to try_to_freeze() is unchangelogged and uncommented.  Can
we please fix both these things?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3] coredump: ignore non-fatal signals when core dumping to a pipe
  2013-02-14 23:38 ` [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal Mandeep Singh Baines
  2013-02-15 15:01   ` Oleg Nesterov
  2013-02-15 23:25   ` Andrew Morton
@ 2013-02-16  0:09   ` Mandeep Singh Baines
  2013-02-16  0:57     ` [PATCH v4] " Mandeep Singh Baines
  2 siblings, 1 reply; 16+ messages in thread
From: Mandeep Singh Baines @ 2013-02-16  0:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ben Chan, Mandeep Singh Baines, Oleg Nesterov, Tejun Heo,
	Andrew Morton, Rafael J. Wysocki, Ingo Molnar

From: Ben Chan <benchan@chromium.org>

Make wait_for_dump_helpers() not abort piping the core dump data when the
crashing process has received a non-fatal signal.  The abort still occurs
in the case of SIGKILL.

Addresses http://crosbug.com/21559

Changes since v1:
* Mandeep Singh Baines
  * To prevent blocking suspend, add try_to_freeze().
Changes since v2:
* LKML: <20130215150117.GB30829@redhat.com> Oleg Nestorov
  * Block non-fatal signals to avoid poll_wait busy waiting.
* LKML: <20130215152538.9a61a44e.akpm@linux-foundation.org> Andrew Morton
  * Added comment re: try_to_freeze and clarified commit message.

Signed-off-by: Ben Chan <benchan@chromium.org>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Tejun Heo <tj@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rafael J. Wysocki <rjw@sisk.pl>
CC: Ingo Molnar <mingo@redhat.com>
---
 fs/coredump.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/fs/coredump.c b/fs/coredump.c
index 1774932..976f6cc 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -32,6 +32,7 @@
 #include <linux/pipe_fs_i.h>
 #include <linux/oom.h>
 #include <linux/compat.h>
+#include <linux/freezer.h>
 
 #include <asm/uaccess.h>
 #include <asm/mmu_context.h>
@@ -410,6 +411,11 @@ static void coredump_finish(struct mm_struct *mm)
 static void wait_for_dump_helpers(struct file *file)
 {
 	struct pipe_inode_info *pipe;
+	sigset_t blocked, previous;
+
+	/* Block all but fatal signals. */
+	siginitsetinv(&blocked, sigmask(SIGKILL));
+	sigprocmask(SIG_BLOCK, &blocked, &previous);
 
 	pipe = file->f_path.dentry->d_inode->i_pipe;
 
@@ -421,12 +427,22 @@ static void wait_for_dump_helpers(struct file *file)
 		wake_up_interruptible_sync(&pipe->wait);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 		pipe_wait(pipe);
+
+		/*
+		 * Non-fatal signals are blocked. So we need to try
+		 * to freeze in order to not block suspend.
+		 */
+		pipe_unlock(pipe);
+		try_to_freeze();
+		pipe_lock(pipe);
 	}
 
 	pipe->readers--;
 	pipe->writers++;
 	pipe_unlock(pipe);
 
+	/* Restore signals. */
+	sigprocmask(SIG_SETMASK, &previous, NULL);
 }
 
 /*
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4] coredump: ignore non-fatal signals when core dumping to a pipe
  2013-02-16  0:09   ` [PATCH v3] coredump: ignore non-fatal signals when core dumping to a pipe Mandeep Singh Baines
@ 2013-02-16  0:57     ` Mandeep Singh Baines
  0 siblings, 0 replies; 16+ messages in thread
From: Mandeep Singh Baines @ 2013-02-16  0:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ben Chan, Mandeep Singh Baines, Oleg Nesterov, Tejun Heo,
	Andrew Morton, Rafael J. Wysocki, Ingo Molnar

From: Ben Chan <benchan@chromium.org>

Make wait_for_dump_helpers() not abort piping the core dump data when the
crashing process has received a non-fatal signal.  The abort still occurs
in the case of SIGKILL.

Testing:

localhost ~ # echo "|/usr/bin/sleep 1d" > /proc/sys/kernel/core_pattern
localhost ~ # sleep 1d &
[1] 2514
localhost ~ # kill -ABRT $! # Cause coredump
localhost ~ # kill -USR1 $! # Send non-fatal signal
localhost ~ # top -p $! -n1 -b # Verify that we aren't dead or busy waiting
top - 16:45:34 up 2 min,  0 users,  load average: 0.71, 0.42, 0.17
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s): 26.0%us,  8.5%sy,  0.0%ni, 65.1%id,  0.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:    991516k total,   418556k used,   572960k free,     5948k buffers
Swap:        0k total,        0k used,        0k free,   289928k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2514 root      20   0  1868  392  336 S    0  0.0   0:00.00 sleep

localhost ~ # echo mem > /sys/power/state # Suspend
localhost ~ # top -p $! -n1 -b # Verify that we aren't dead or busy waiting
top - 16:46:46 up 3 min,  0 users,  load average: 1.68, 0.69, 0.28
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s): 24.1%us,  7.7%sy,  0.0%ni, 67.9%id,  0.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:    991516k total,   419956k used,   571560k free,     5996k buffers
Swap:        0k total,        0k used,        0k free,   290208k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2514 root      20   0  1868  392  336 S    0  0.0   0:00.00 sleep

Addresses http://crosbug.com/21559

Changes since v1:
* Mandeep Singh Baines
  * To prevent blocking suspend, add try_to_freeze().
Changes since v2:
* LKML: <20130215150117.GB30829@redhat.com> Oleg Nestorov
  * Block non-fatal signals to avoid poll_wait busy waiting.
* LKML: <20130215152538.9a61a44e.akpm@linux-foundation.org> Andrew Morton
  * Added comment re: try_to_freeze and clarified commit message.
Changes since v3:
* Mandeep Singh Baines
  * Clear signal pending caused by fake signal from freeze_task().
  * Document how the patch was tested.

Signed-off-by: Ben Chan <benchan@chromium.org>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Tejun Heo <tj@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Rafael J. Wysocki <rjw@sisk.pl>
CC: Ingo Molnar <mingo@redhat.com>
---
 fs/coredump.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index 1774932..3c5a08f 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -32,6 +32,7 @@
 #include <linux/pipe_fs_i.h>
 #include <linux/oom.h>
 #include <linux/compat.h>
+#include <linux/freezer.h>
 
 #include <asm/uaccess.h>
 #include <asm/mmu_context.h>
@@ -410,6 +411,11 @@ static void coredump_finish(struct mm_struct *mm)
 static void wait_for_dump_helpers(struct file *file)
 {
 	struct pipe_inode_info *pipe;
+	sigset_t blocked, previous;
+
+	/* Block all but fatal signals. */
+	siginitsetinv(&blocked, sigmask(SIGKILL));
+	sigprocmask(SIG_BLOCK, &blocked, &previous);
 
 	pipe = file->f_path.dentry->d_inode->i_pipe;
 
@@ -417,16 +423,29 @@ static void wait_for_dump_helpers(struct file *file)
 	pipe->readers++;
 	pipe->writers--;
 
-	while ((pipe->readers > 1) && (!signal_pending(current))) {
+	while ((pipe->readers > 1) && (!fatal_signal_pending(current))) {
 		wake_up_interruptible_sync(&pipe->wait);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 		pipe_wait(pipe);
+
+		/*
+		 * Non-fatal signals are blocked. So we need to try
+		 * to freeze in order to not block suspend.
+		 */
+		pipe_unlock(pipe);
+		try_to_freeze();
+		pipe_lock(pipe);
+
+		/* Clear fake signal from freeze_task. */
+		recalc_sigpending();
 	}
 
 	pipe->readers--;
 	pipe->writers++;
 	pipe_unlock(pipe);
 
+	/* Restore signals. */
+	sigprocmask(SIG_SETMASK, &previous, NULL);
 }
 
 /*
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK
  2013-02-14 23:38 [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Mandeep Singh Baines
                   ` (3 preceding siblings ...)
  2013-02-14 23:38 ` [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal Mandeep Singh Baines
@ 2013-02-15 15:28 ` Oleg Nesterov
  4 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2013-02-15 15:28 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: linux-kernel, Tejun Heo, Andrew Morton, Rafael J. Wysocki, Ingo Molnar

On 02/14, Mandeep Singh Baines wrote:
>
> We don't need to call freezer_do_not_count() for in-kernel users
> of CLONE_VFORK since exec will get called in bounded time.

OK,

> We don't want to call freezer_count() for in-kernel users because
> they may be holding locks.

Who? We should not do this anyway. And __call_usermodehelper() doesn't
afaics.

OK, its caller (process_one_work) does lock_map_acquire() for debugging
purposes, this can "confuse" print_held_locks_bug(). But this thread is
PF_NOFREEZE ?

> @@ -722,9 +722,11 @@ static int wait_for_vfork_done(struct task_struct *child,
>  {
>  	int killed;
>
> -	freezer_do_not_count();
> +	if (current->mm)
> +		freezer_do_not_count();

And if you want to exclude in-kernel users, then perhaps PF_KTHREAD check
will look a bit better.

Oleg.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-02-16 17:06 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-14 23:38 [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Mandeep Singh Baines
2013-02-14 23:38 ` [PATCH 2/5] lockdep: check that no locks held at freeze time Mandeep Singh Baines
2013-02-15 11:16   ` Ingo Molnar
2013-02-15 15:44   ` Oleg Nesterov
2013-02-14 23:38 ` [PATCH 3/5] coredump: cleanup the waiting for coredump_finish code Mandeep Singh Baines
2013-02-15 14:53   ` Oleg Nesterov
2013-02-15 23:30   ` Andrew Morton
2013-02-14 23:38 ` [PATCH 4/5] coredump: use a freezable_schedule for the coredump_finish wait Mandeep Singh Baines
2013-02-14 23:38 ` [PATCH 5/5] coredump: abort core dump piping only due to a fatal signal Mandeep Singh Baines
2013-02-15 15:01   ` Oleg Nesterov
2013-02-16  1:20     ` Mandeep Singh Baines
2013-02-16 17:05       ` Oleg Nesterov
2013-02-15 23:25   ` Andrew Morton
2013-02-16  0:09   ` [PATCH v3] coredump: ignore non-fatal signals when core dumping to a pipe Mandeep Singh Baines
2013-02-16  0:57     ` [PATCH v4] " Mandeep Singh Baines
2013-02-15 15:28 ` [PATCH 1/5] vfork: don't freezer_count() for in-kernel users of CLONE_VFORK Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).