From: Enke Chen <enkechen@cisco.com>
To: Oleg Nesterov <oleg@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	Kate Stewart <kstewart@linuxfoundation.org>,
	Helge Deller <deller@gmx.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christian Brauner <christian@brauner.io>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Dave Martin <Dave.Martin@arm.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Michal Hocko <mhocko@kernel.org>, Rik van Riel <riel@surriel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Roman Gushchin <guro@fb.com>,
	Marcos Paulo de Souza <marcos.souza.org@gmail.com>,
	Dominik Brodowski <linux@dominikbrodowski.net>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	Jann Horn <jannh@google.com>, Kees Cook <keescook@chromium.org>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	"Victor Kamensky (kamensky)" <kamensky@cisco.com>,
	xe-linux-external@cisco.com, Stefan Strogin <sstrogin@cisco.com>,
	Enke Chen <enkechen@cisco.com>
Subject: [PATCH v5 1/2] kernel/signal: Signal-based pre-coredump notification
Date: Tue, 27 Nov 2018 14:54:41 -0800
Message-ID: <80e96710-f424-9b39-72ee-9cc7cbe7a5f7@cisco.com>
In-Reply-To: <0c197608-3b7e-ffd1-8943-801a60beb917@cisco.com>

[Repost as a series, as suggested by Andrew Morton]

For simplicity and consistency, this patch provides an implementation
for signal-based fault notification prior to the coredump of a child
process. A new prctl command, PR_SET_PREDUMP_SIG, is defined that can
be used by an application to express its interest and to specify the
signal for such a notification.

Changes to prctl(2):

   PR_SET_PREDUMP_SIG (since Linux 4.20.x)
          Set the child pre-coredump signal of the calling process to
          arg2 (either a signal value in the range 1..maxsig, or 0 to
          clear). This is the signal that the calling process will get
          prior to the coredump of a child process. This value is
          cleared across execve(2) and in the child of a fork(2).

   PR_GET_PREDUMP_SIG (since Linux 4.20.x)
          Return the current value of the child pre-coredump signal,
          in the location pointed to by (int *) arg2.

Background:

As the coredump of a process may take time, in certain time-sensitive
applications it is necessary for a parent process (e.g., a process
manager) to be notified of a child's imminent death before the coredump
so that the parent process can act sooner, such as re-spawning an
application process, or initiating a control-plane fail-over.

One application is BFD (Bidirectional Forwarding Detection). The early fault notification is a critical
component for maintaining BFD sessions (with a timeout value of
50 msec or 100 msec) across a control-plane failure.

Currently there are two ways for a parent process to be notified of a
child process's state change. One is to use POSIX signals, and the
other is to use the kernel connector module. The specific events and
actions are summarized as follows:

Process Event    POSIX Signal                Connector-based
----------------------------------------------------------------------
ptrace_attach()  do_notify_parent_cldstop()  proc_ptrace_connector()
                 SIGCHLD / CLD_STOPPED

ptrace_detach()  do_notify_parent_cldstop()  proc_ptrace_connector()
                 SIGCHLD / CLD_CONTINUED

pre_coredump/    N/A                         proc_coredump_connector()
get_signal()

post_coredump/   do_notify_parent()          proc_exit_connector()
do_exit()        SIGCHLD / exit_signal
----------------------------------------------------------------------

As shown in the table, the signal-based pre-coredump notification is not
currently available. In some cases using a connector-based notification
can be quite complicated (e.g., when a process manager is written in shell
scripts and thus is subject to certain inherent limitations), and a
signal-based notification would be simpler and better suited.
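
For comparison, the existing signal-based path in the last row of the
table (do_notify_parent() after do_exit()) can be observed from
userspace with a minimal sketch like the one below. The function name
is hypothetical, and the child sets RLIMIT_CORE to 0 only so that this
demo leaves no core file behind.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that dies from SIGABRT and return the signal that
 * killed it, 0 if it exited normally, or -1 on error.
 */
int reap_crashing_child(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: suppress the core file for this demo, then crash. */
		struct rlimit no_core = { 0, 0 };

		setrlimit(RLIMIT_CORE, &no_core);
		abort();
	}
	/* Parent: waitpid() returns only once the child has exited. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With a real coredump enabled, the parent learns of the failure here
only after the dump has been written, which is the delay the
pre-coredump signal is meant to avoid.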

Signed-off-by: Enke Chen <enkechen@cisco.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
---
v4 -> v5:
Addressed review comments from Oleg Nesterov:
o use rcu_read_lock instead.
o revert to notifying the real_parent.

 fs/coredump.c                | 23 +++++++++++++++++++++++
 fs/exec.c                    |  3 +++
 include/linux/sched/signal.h |  3 +++
 include/uapi/linux/prctl.h   |  4 ++++
 kernel/sys.c                 | 13 +++++++++++++
 5 files changed, 46 insertions(+)

diff --git a/fs/coredump.c b/fs/coredump.c
index e42e17e..740b1bb 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -536,6 +536,24 @@ static int umh_pipe_setup(struct subprocess_info *info, struct cred *new)
 	return err;
 }
 
+/*
+ * While do_notify_parent() notifies the parent of a child's death post
+ * its coredump, this function lets the parent (if so desired) know about
+ * the imminent death of a child just prior to its coredump.
+ */
+static void do_notify_parent_predump(void)
+{
+	struct task_struct *parent;
+	int sig;
+
+	rcu_read_lock();
+	parent = rcu_dereference(current->real_parent);
+	sig = parent->signal->predump_signal;
+	if (sig != 0)
+		do_send_sig_info(sig, SEND_SIG_NOINFO, parent, PIDTYPE_TGID);
+	rcu_read_unlock();
+}
+
 void do_coredump(const kernel_siginfo_t *siginfo)
 {
 	struct core_state core_state;
@@ -590,6 +608,11 @@ void do_coredump(const kernel_siginfo_t *siginfo)
 	if (retval < 0)
 		goto fail_creds;
 
+	/*
+	 * Send the pre-coredump signal to the parent if requested.
+	 */
+	do_notify_parent_predump();
+
 	old_cred = override_creds(cred);
 
 	ispipe = format_corename(&cn, &cprm);
diff --git a/fs/exec.c b/fs/exec.c
index fc281b7..7714da7 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1181,6 +1181,9 @@ static int de_thread(struct task_struct *tsk)
 	/* we have changed execution domain */
 	tsk->exit_signal = SIGCHLD;
 
+	/* Clear the pre-coredump signal before loading a new binary */
+	sig->predump_signal = 0;
+
 #ifdef CONFIG_POSIX_TIMERS
 	exit_itimers(sig);
 	flush_itimer_signals();
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 13789d1..728ef68 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -112,6 +112,9 @@ struct signal_struct {
 	int			group_stop_count;
 	unsigned int		flags; /* see SIGNAL_* flags below */
 
+	/* The signal sent prior to a child's coredump */
+	int			predump_signal;
+
 	/*
 	 * PR_SET_CHILD_SUBREAPER marks a process, like a service
 	 * manager, to re-parent orphan (double-forking) child processes
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index c0d7ea0..79f0a8a 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -219,4 +219,8 @@ struct prctl_mm_map {
 # define PR_SPEC_DISABLE		(1UL << 2)
 # define PR_SPEC_FORCE_DISABLE		(1UL << 3)
 
+/* Whether to receive signal prior to child's coredump */
+#define PR_SET_PREDUMP_SIG	54
+#define PR_GET_PREDUMP_SIG	55
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 123bd73..39aa3b8 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2476,6 +2476,19 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
 			return -EINVAL;
 		error = arch_prctl_spec_ctrl_set(me, arg2, arg3);
 		break;
+	case PR_SET_PREDUMP_SIG:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		if (!valid_signal((int)arg2))
+			return -EINVAL;
+		me->signal->predump_signal = (int)arg2;
+		break;
+	case PR_GET_PREDUMP_SIG:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = put_user(me->signal->predump_signal,
+				 (int __user *)arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;
-- 
1.8.3.1
