From: Enke Chen <enkechen@cisco.com>
To: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>,
x86@kernel.org, Peter Zijlstra <peterz@infradead.org>,
Arnd Bergmann <arnd@arndb.de>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Khalid Aziz <khalid.aziz@oracle.com>,
Kate Stewart <kstewart@linuxfoundation.org>,
Helge Deller <deller@gmx.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Christian Brauner <christian@brauner.io>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
Dave Martin <Dave.Martin@arm.com>,
Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
Michal Hocko <mhocko@kernel.org>, Rik van Riel <riel@surriel.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Roman Gushchin <guro@fb.com>,
Marcos Paulo de Souza <marcos.souza.org@gmail.com>,
Oleg Nesterov <oleg@redhat.com>,
Dominik Brodowski <linux@dominikbrodowski.net>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Yang Shi <yang.shi@linux.alibaba.com>,
Jann Horn <jannh@google.com>, Kees Cook <keescook@chromium.org>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
Enke Chen <enkechen@cisco.com>,
"Victor Kamensky (kamensky)" <kamensky@cisco.com>,
xe-linux-external@cisco.com, Stefan Strogin <sstrogin@cisco.com>
Subject: [PATCH] kernel/signal: Signal-based pre-coredump notification
Date: Fri, 12 Oct 2018 17:33:35 -0700 [thread overview]
Message-ID: <e7ed306f-8992-9d00-bcab-5131159e8d89@cisco.com> (raw)
For simplicity and consistency, this patch provides an implementation
for signal-based fault notification prior to the coredump of a child
process. A new prctl command, PR_SET_PREDUMP_SIG, is defined that can
be used by an application to express its interest and to specify the
signal (SIGCHLD or SIGUSR1 or SIGUSR2) for such a notification. A new
signal code (si_code), CLD_PREDUMP, is also defined for SIGCHLD.
Background:
As the coredump of a process may take time, in certain time-sensitive
applications it is necessary for a parent process (e.g., a process
manager) to be notified of a child's imminent death before the coredump
so that the parent process can act sooner, such as re-spawning an
application process, or initiating a control-plane fail-over.
Currently there are two ways for a parent process to be notified of a
child process's state change. One is to use the POSIX signal, and
another is to use the kernel connector module. The specific events and
actions are summarized as follows:
Process Event POSIX Signal Connector-based
----------------------------------------------------------------------
ptrace_attach() do_notify_parent_cldstop() proc_ptrace_connector()
SIGCHLD / CLD_STOPPED
ptrace_detach() do_notify_parent_cldstop() proc_ptrace_connector()
SIGCHLD / CLD_CONTINUED
pre_coredump/ N/A proc_coredump_connector()
get_signal()
post_coredump/ do_notify_parent() proc_exit_connector()
do_exit() SIGCHLD / exit_signal
----------------------------------------------------------------------
As shown in the table, the signal-based pre-coredump notification is not
currently available. In some cases using a connector-based notification
can be quite complicated (e.g., when a process manager is written in shell
scripts and thus is subject to certain inherent limitations), and a
signal-based notification would be simpler and better suited.
Signed-off-by: Enke Chen <enkechen@cisco.com>
---
arch/x86/kernel/signal_compat.c | 2 +-
include/linux/sched.h | 4 ++
include/linux/signal.h | 5 +++
include/uapi/asm-generic/siginfo.h | 3 +-
include/uapi/linux/prctl.h | 4 ++
kernel/fork.c | 1 +
kernel/signal.c | 51 +++++++++++++++++++++++++
kernel/sys.c | 77 ++++++++++++++++++++++++++++++++++++++
8 files changed, 145 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c
index 9ccbf05..a3deba8 100644
--- a/arch/x86/kernel/signal_compat.c
+++ b/arch/x86/kernel/signal_compat.c
@@ -30,7 +30,7 @@ static inline void signal_compat_build_tests(void)
BUILD_BUG_ON(NSIGSEGV != 7);
BUILD_BUG_ON(NSIGBUS != 5);
BUILD_BUG_ON(NSIGTRAP != 5);
- BUILD_BUG_ON(NSIGCHLD != 6);
+ BUILD_BUG_ON(NSIGCHLD != 7);
BUILD_BUG_ON(NSIGSYS != 1);
/* This is part of the ABI and can never change in size: */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 09026ea..cfb9645 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -696,6 +696,10 @@ struct task_struct {
int exit_signal;
/* The signal sent when the parent dies: */
int pdeath_signal;
+
+ /* The signal sent prior to a child's coredump: */
+ int predump_signal;
+
/* JOBCTL_*, siglock protected: */
unsigned long jobctl;
diff --git a/include/linux/signal.h b/include/linux/signal.h
index 706a499..7cb976d 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -256,6 +256,11 @@ static inline int valid_signal(unsigned long sig)
return sig <= _NSIG ? 1 : 0;
}
+static inline int valid_predump_signal(int sig)
+{
+ return (sig == SIGCHLD) || (sig == SIGUSR1) || (sig == SIGUSR2);
+}
+
struct timespec;
struct pt_regs;
enum pid_type;
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index cb3d6c2..1a47cef 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -267,7 +267,8 @@ struct { \
#define CLD_TRAPPED 4 /* traced child has trapped */
#define CLD_STOPPED 5 /* child has stopped */
#define CLD_CONTINUED 6 /* stopped child has continued */
-#define NSIGCHLD 6
+#define CLD_PREDUMP 7 /* child is about to dump core */
+#define NSIGCHLD 7
/*
* SIGPOLL (or any other signal without signal specific si_codes) si_codes
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index c0d7ea0..79f0a8a 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -219,4 +219,8 @@ struct prctl_mm_map {
# define PR_SPEC_DISABLE (1UL << 2)
# define PR_SPEC_FORCE_DISABLE (1UL << 3)
+/* Whether to receive signal prior to child's coredump */
+#define PR_SET_PREDUMP_SIG 54
+#define PR_GET_PREDUMP_SIG 55
+
#endif /* _LINUX_PRCTL_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 07cddff..c296c11 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1985,6 +1985,7 @@ static __latent_entropy struct task_struct *copy_process(
p->dirty_paused_when = 0;
p->pdeath_signal = 0;
+ p->predump_signal = 0;
INIT_LIST_HEAD(&p->thread_group);
p->task_works = NULL;
diff --git a/kernel/signal.c b/kernel/signal.c
index 312b43e..eb4a483 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2337,6 +2337,44 @@ static int ptrace_signal(int signr, kernel_siginfo_t *info)
return signr;
}
+/*
+ * Let the parent, if so desired, know about the imminent death of a child
+ * prior to its coredump.
+ *
+ * Locking logic is similar to do_notify_parent_cldstop().
+ */
+static void do_notify_parent_predump(struct task_struct *tsk)
+{
+ struct sighand_struct *sighand;
+ struct task_struct *parent;
+ struct kernel_siginfo info;
+ unsigned long flags;
+ int sig;
+
+ parent = tsk->real_parent;
+ sig = parent->predump_signal;
+
+ /* Check again with "tasklist_lock" locked by the caller */
+ if (!valid_predump_signal(sig))
+ return;
+
+ clear_siginfo(&info);
+ info.si_signo = sig;
+ if (sig == SIGCHLD)
+ info.si_code = CLD_PREDUMP;
+
+ rcu_read_lock();
+ info.si_pid = task_pid_nr_ns(tsk, task_active_pid_ns(parent));
+ info.si_uid = from_kuid_munged(task_cred_xxx(parent, user_ns),
+ task_uid(tsk));
+ rcu_read_unlock();
+
+ sighand = parent->sighand;
+ spin_lock_irqsave(&sighand->siglock, flags);
+ __group_send_sig_info(sig, &info, parent);
+ spin_unlock_irqrestore(&sighand->siglock, flags);
+}
+
bool get_signal(struct ksignal *ksig)
{
struct sighand_struct *sighand = current->sighand;
@@ -2497,6 +2535,19 @@ bool get_signal(struct ksignal *ksig)
current->flags |= PF_SIGNALED;
if (sig_kernel_coredump(signr)) {
+ /*
+ * Notify the parent prior to the coredump if the
+ * parent is interested in such a notificaiton.
+ */
+ int p_sig = current->real_parent->predump_signal;
+
+ if (valid_predump_signal(p_sig)) {
+ read_lock(&tasklist_lock);
+ do_notify_parent_predump(current);
+ read_unlock(&tasklist_lock);
+ cond_resched();
+ }
+
if (print_fatal_signals)
print_fatal_signal(ksig->info.si_signo);
proc_coredump_connector(current);
diff --git a/kernel/sys.c b/kernel/sys.c
index 123bd73..43eb250d 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2258,6 +2258,76 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
return -EINVAL;
}
+static int prctl_get_predump_signal(struct task_struct *tsk, pid_t pid,
+ int __user *addr)
+{
+ struct task_struct *p;
+ int error;
+
+ /* For the current task, the common case */
+ if (pid == 0) {
+ put_user(tsk->predump_signal, addr);
+ return 0;
+ }
+
+ error = -ESRCH;
+ rcu_read_lock();
+ p = find_task_by_vpid(pid);
+ if (p) {
+ error = 0;
+ put_user(p->predump_signal, addr);
+ }
+ rcu_read_unlock();
+ return error;
+}
+
+/*
+ * Returns true if current's euid is same as p's uid or euid,
+ * or has CAP_SYS_ADMIN.
+ *
+ * Called with rcu_read_lock, creds are safe.
+ *
+ * Adapted from set_one_prio_perm().
+ */
+static bool set_predump_signal_perm(struct task_struct *p)
+{
+ const struct cred *cred = current_cred(), *pcred = __task_cred(p);
+
+ return uid_eq(pcred->uid, cred->euid) ||
+ uid_eq(pcred->euid, cred->euid) ||
+ capable(CAP_SYS_ADMIN);
+}
+
+static int prctl_set_predump_signal(struct task_struct *tsk, pid_t pid, int sig)
+{
+ struct task_struct *p;
+ int error;
+
+ /* 0 is valid for disabling the feature */
+ if (sig && !valid_predump_signal(sig))
+ return -EINVAL;
+
+ /* For the current task, the common case */
+ if (pid == 0) {
+ tsk->predump_signal = sig;
+ return 0;
+ }
+
+ error = -ESRCH;
+ rcu_read_lock();
+ p = find_task_by_vpid(pid);
+ if (p) {
+ if (!set_predump_signal_perm(p))
+ error = -EPERM;
+ else {
+ error = 0;
+ p->predump_signal = sig;
+ }
+ }
+ rcu_read_unlock();
+ return error;
+}
+
SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
unsigned long, arg4, unsigned long, arg5)
{
@@ -2476,6 +2546,13 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
return -EINVAL;
error = arch_prctl_spec_ctrl_set(me, arg2, arg3);
break;
+ case PR_SET_PREDUMP_SIG:
+ error = prctl_set_predump_signal(me, (pid_t)arg2, (int)arg3);
+ break;
+ case PR_GET_PREDUMP_SIG:
+ error = prctl_get_predump_signal(me, (pid_t)arg2,
+ (int __user *)arg3);
+ break;
default:
error = -EINVAL;
break;
--
1.8.3.1
next reply other threads:[~2018-10-13 0:43 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-13 0:33 Enke Chen [this message]
2018-10-13 6:40 ` [PATCH] kernel/signal: Signal-based pre-coredump notification Greg Kroah-Hartman
2018-10-15 18:16 ` Enke Chen
2018-10-15 18:43 ` Greg Kroah-Hartman
2018-10-15 18:49 ` Enke Chen
2018-10-15 18:58 ` Greg Kroah-Hartman
2018-10-13 10:44 ` Christian Brauner
2018-10-15 18:39 ` Enke Chen
2018-10-13 18:27 ` Jann Horn
2018-10-15 18:36 ` Enke Chen
2018-10-15 18:54 ` Jann Horn
2018-10-15 19:23 ` Enke Chen
2018-10-19 23:01 ` Enke Chen
2018-10-22 15:40 ` Jann Horn
2018-10-22 20:48 ` Enke Chen
2018-10-15 12:05 ` Oleg Nesterov
2018-10-15 18:54 ` Enke Chen
2018-10-15 19:17 ` Enke Chen
2018-10-15 19:26 ` Enke Chen
2018-10-16 14:14 ` Oleg Nesterov
2018-10-16 15:09 ` Eric W. Biederman
2018-10-17 0:39 ` Enke Chen
2018-10-15 21:21 ` Alan Cox
2018-10-15 21:31 ` Enke Chen
2018-10-15 23:28 ` Eric W. Biederman
2018-10-16 0:33 ` valdis.kletnieks
2018-10-16 0:54 ` Enke Chen
2018-10-16 15:26 ` Eric W. Biederman
2018-10-22 21:09 ` [PATCH v2] " Enke Chen
2018-10-23 9:23 ` Oleg Nesterov
2018-10-23 19:43 ` Enke Chen
2018-10-23 21:40 ` Enke Chen
2018-10-24 13:52 ` Oleg Nesterov
2018-10-24 21:56 ` Enke Chen
2018-10-24 5:39 ` [PATCH v3] " Enke Chen
2018-10-24 14:02 ` Oleg Nesterov
2018-10-24 22:02 ` Enke Chen
2018-10-25 22:56 ` [PATCH v4] " Enke Chen
2018-10-26 8:28 ` Oleg Nesterov
2018-10-26 22:23 ` Enke Chen
2018-10-29 11:18 ` Oleg Nesterov
2018-10-29 21:08 ` Enke Chen
2018-10-29 22:31 ` [PATCH v5] " Enke Chen
2018-10-30 16:46 ` Oleg Nesterov
2018-10-31 0:25 ` Enke Chen
2018-11-22 0:37 ` Andrew Morton
2018-11-22 1:09 ` Enke Chen
2018-11-22 1:18 ` Enke Chen
2018-11-22 1:33 ` Andrew Morton
2018-11-22 4:57 ` Enke Chen
2018-11-12 23:22 ` Enke Chen
2018-11-27 22:54 ` [PATCH v5 1/2] " Enke Chen
2018-11-28 15:19 ` Dave Martin
2018-11-29 0:15 ` Enke Chen
2018-11-29 11:55 ` Dave Martin
2018-11-30 0:27 ` Enke Chen
2018-11-30 12:03 ` Oleg Nesterov
2018-12-05 6:47 ` Jann Horn
2018-12-04 22:37 ` Andrew Morton
2018-12-06 17:29 ` Oleg Nesterov
2018-10-25 22:56 ` [PATCH] selftests/prctl: selftest for pre-coredump signal notification Enke Chen
2018-11-27 22:54 ` [PATCH v5 2/2] " Enke Chen
2018-10-24 13:29 ` [PATCH v2] kernel/signal: Signal-based pre-coredump notification Eric W. Biederman
2018-10-24 23:50 ` Enke Chen
2018-10-25 12:23 ` Eric W. Biederman
2018-10-25 20:45 ` Enke Chen
2018-10-25 21:24 ` Enke Chen
2018-10-25 21:56 ` Enke Chen
2018-10-25 13:45 ` Jann Horn
2018-10-25 20:21 ` Eric W. Biederman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e7ed306f-8992-9d00-bcab-5131159e8d89@cisco.com \
--to=enkechen@cisco.com \
--cc=Dave.Martin@arm.com \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=bp@alien8.de \
--cc=catalin.marinas@arm.com \
--cc=christian@brauner.io \
--cc=deller@gmx.de \
--cc=ebiederm@xmission.com \
--cc=gorcunov@openvz.org \
--cc=gregkh@linuxfoundation.org \
--cc=guro@fb.com \
--cc=hpa@zytor.com \
--cc=jannh@google.com \
--cc=kamensky@cisco.com \
--cc=keescook@chromium.org \
--cc=khalid.aziz@oracle.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kstewart@linuxfoundation.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@dominikbrodowski.net \
--cc=marcos.souza.org@gmail.com \
--cc=mchehab+samsung@kernel.org \
--cc=mhocko@kernel.org \
--cc=mingo@redhat.com \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=riel@surriel.com \
--cc=sstrogin@cisco.com \
--cc=tglx@linutronix.de \
--cc=viro@zeniv.linux.org.uk \
--cc=will.deacon@arm.com \
--cc=x86@kernel.org \
--cc=xe-linux-external@cisco.com \
--cc=yang.shi@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).