* [PATCH] signal: ignore other signals when doing coredump
From: chunlei.wang @ 2020-07-23  6:52 UTC
  To: Matthias Brugger, Andrew Morton, Peter Zijlstra
  Cc: Aneesh Kumar K.V, weiwei.zhang, linux-mediatek, Will Deacon,
	wsd_upstream

do_coredump flow is interrupted by SIGKILL,
causing the coredump to be truncated.

Signed-off-by: Chunlei Wang <chunlei.wang@mediatek.com>
---
 arch/Kconfig    | 12 ++++++++++++
 kernel/signal.c |  8 ++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 8cc35dc556c7..559eac47093e 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -834,6 +834,18 @@ config OLD_SIGACTION
 config COMPAT_OLD_SIGACTION
 	bool
 
+config IGNORE_ANY_SIGNALS
+	bool "Ignore signals while a coredump is in progress"
+	default n
+	help
+	  SIGKILL is special: a process that receives it responds
+	  immediately. When a process crashes and is collecting a
+	  coredump, a SIGKILL interrupts the do_coredump flow and
+	  truncates the dump. A truncated coredump is incomplete and
+	  cannot be loaded by gdb.
+	  If this option is enabled, every signal sent to a process is
+	  ignored while that process is collecting a coredump.
+
 config COMPAT_32BIT_TIME
 	bool "Provide system calls for 32-bit time_t"
 	default !64BIT || COMPAT
diff --git a/kernel/signal.c b/kernel/signal.c
index 5ca48cc5da76..ccae3c84eb6d 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -903,6 +903,14 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
 	sigset_t flush;
 
 	if (signal->flags & (SIGNAL_GROUP_EXIT | SIGNAL_GROUP_COREDUMP)) {
+
+#ifdef CONFIG_IGNORE_ANY_SIGNALS
+		if (signal->flags & SIGNAL_GROUP_COREDUMP) {
+			pr_debug("[%d:%s] skip sig %d: coredump in progress\n",
+					p->pid, p->comm, sig);
+			return false;
+		}
+#endif
 		if (!(signal->flags & SIGNAL_GROUP_EXIT))
 			return sig == SIGKILL;
 		/*
-- 
2.18.0

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek


* Re: [PATCH] signal: ignore other signals when doing coredump
From: Andrew Morton @ 2020-07-24  0:51 UTC
  To: chunlei.wang
  Cc: weiwei.zhang, wsd_upstream, Peter Zijlstra, Aneesh Kumar K.V,
	Oleg Nesterov, linux-mediatek, Matthias Brugger, Will Deacon

On Thu, 23 Jul 2020 14:52:23 +0800 "chunlei.wang" <Chunlei.wang@mediatek.com> wrote:

> do_coredump flow is interrupted by SIGKILL,
> causing the coredump to be truncated.
> 

Please tell us much more about why you think Linux would benefit from
this change.  Precisely what operational problems are you seeing with
the current code?




* Re: [PATCH] signal: ignore other signals when doing coredump
From: chunlei.wang @ 2020-07-31  8:54 UTC
  To: Andrew Morton
  Cc: weiwei.zhang, wsd_upstream, Peter Zijlstra, Aneesh Kumar K.V,
	Oleg Nesterov, linux-mediatek, Matthias Brugger, Will Deacon

> Please tell us much more about why you think Linux would benefit from
> this change.  Precisely what operational problems are you seeing with
> the current code?

Sorry for the late reply.

If a coredump is incomplete, R&D cannot find the root cause from it. If
the issue occurs only rarely, this change will speed up the process of
solving the problem.

When a thread crashes and the default action of the signal is to dump
core, the process enters the do_coredump flow and SIGNAL_GROUP_COREDUMP
is set in signal->flags. If a SIGKILL arrives at that point,
prepare_signal() checks signal->flags: when SIGNAL_GROUP_COREDUMP is set
and SIGNAL_GROUP_EXIT is clear, the SIGKILL is acted on immediately and
the process starts its exit flow even though it is still dumping core.
This is abnormal: do_coredump is interrupted, causing the coredump to be
truncated.
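
The check described above can be sketched as a small userspace model.
This is only an illustration under stated assumptions, not kernel code:
the MODEL_* flag values and the model_* function names are made up for
this sketch, and everything prepare_signal() does after the early check
is left out and treated as "signal accepted".

```c
#include <assert.h>   /* assert, used by the self-check */
#include <signal.h>   /* SIGKILL, SIGTERM */
#include <stdbool.h>

/* Made-up bit values for this model; not the kernel's definitions. */
#define MODEL_GROUP_EXIT     0x1u
#define MODEL_GROUP_COREDUMP 0x2u

/*
 * Userspace model of the early check in prepare_signal().
 * Returns true if the signal is let through, false if it is dropped.
 * `patched` selects the behavior proposed in this patch.
 */
static bool model_prepare_signal(unsigned int flags, int sig, bool patched)
{
	if (flags & (MODEL_GROUP_EXIT | MODEL_GROUP_COREDUMP)) {
		/* Proposed: drop every signal while a dump is in progress. */
		if (patched && (flags & MODEL_GROUP_COREDUMP))
			return false;
		/* Current: during a dump, only SIGKILL gets through. */
		if (!(flags & MODEL_GROUP_EXIT))
			return sig == SIGKILL;
	}
	/*
	 * Assumption: everything the real function does past this point
	 * is left out, and the signal is simply treated as accepted.
	 */
	return true;
}

/* Walks through the cases discussed in this thread. */
static void model_self_check(void)
{
	/* Unpatched: SIGKILL interrupts the dump, SIGTERM does not. */
	assert(model_prepare_signal(MODEL_GROUP_COREDUMP, SIGKILL, false));
	assert(!model_prepare_signal(MODEL_GROUP_COREDUMP, SIGTERM, false));
	/* Patched: even SIGKILL is dropped while dumping. */
	assert(!model_prepare_signal(MODEL_GROUP_COREDUMP, SIGKILL, true));
}
```

Calling model_self_check() from a test program exercises the three
cases. The last one is the behavior change under discussion: with the
option enabled, a process that is dumping core can no longer be killed
until the dump finishes.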



* Re: [PATCH] signal: ignore other signals when doing coredump
From: Oleg Nesterov @ 2020-08-03 19:17 UTC
  To: chunlei.wang
  Cc: weiwei.zhang, wsd_upstream, Peter Zijlstra, Aneesh Kumar K.V,
	linux-kernel, linux-mediatek, Matthias Brugger, Andrew Morton,
	Will Deacon

On 07/31, chunlei.wang wrote:
>
> > Please tell us much more about why you think Linux would benefit from
> > this change.  Precisely what operational problems are you seeing with
> > the current code?
>
> Sorry for the late reply.
>
> If a coredump is incomplete, R&D cannot find the root cause from it.
> If the issue occurs only rarely, this change will speed up the process
> of solving the problem.

To be honest, I do not even know what I can say, except that I disagree
with this change. The very idea looks wrong to me.

Granted, SIGKILL can kill the process which does something useful. Say,
dumps a core. So what?

Where does this SIGKILL come from? How often does this happen?

And why do you think core dumping is special? Say, you try to debug
the buggy application, but a sudden SIGKILL kills the debuggee and you
lose the debugging session. Does this mean that the kernel needs another
patch to protect the process running under gdb from SIGKILL?

I don't think so. Please feel free to resend this patch, but it needs
a very convincing changelog. And please send it to lkml.

Oleg.


