From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753019AbdILRFy (ORCPT ); Tue, 12 Sep 2017 13:05:54 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58076 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753003AbdILRFr (ORCPT ); Tue, 12 Sep 2017 13:05:47 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 0B91FC04B92E Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=oleg@redhat.com Date: Tue, 12 Sep 2017 19:05:44 +0200 From: Oleg Nesterov To: =?iso-8859-1?Q?J=FCrg?= Billeter Cc: Andrew Morton , Eric Biederman , linux-kernel@vger.kernel.org, Linus Torvalds , Michael Kerrisk Subject: Re: [PATCH] prctl: add PR_[GS]ET_PDEATHSIG_PROC Message-ID: <20170912170544.GA32121@redhat.com> References: <20170909094008.49983-1-j@bitron.ch> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20170909094008.49983-1-j@bitron.ch> User-Agent: Mutt/1.5.24 (2015-08-30) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Tue, 12 Sep 2017 17:05:47 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 09/09, Jürg Billeter wrote: > > PR_SET_PDEATHSIG_PROC sets a process-based death signal. I think the patch is technically correct, > Unlike > PR_SET_PDEATHSIG, this is inherited across fork to allow killing a whole > subtree without race conditions. but I am still not sure this is right... at least I can't understand the "without race conditions" above. IOW, the child can do prctl(PR_SET_PDEATHSIG_PROC, SIGKILL) right after fork(), why this is not enough to kill a whole subtree without race conditions? OTOH. If you want to kill a whole sub-tree then perhaps the exiting process should simply send the ->pdeath_signal_proc to the whole sub-tree? Not that I really think this makes more sense, but if we add the new API we should discuss everything we can. Say, CLONE_PARENT. Should it succeed if ->pdeath_signal_proc != 0 ? Anyway, I think this patch needs more reviewers. Let me add Linus and Michael. Again, I am not worried about correctness, the patch is simple, but the new API always needs a thorough discussion. Oleg. > This can be used for sandboxing when combined with a seccomp filter. > > There have been previous attempts to support this by changing the > behavior of PR_SET_PDEATHSIG. However, that would break existing > applications. See https://marc.info/?l=linux-kernel&m=117621804801689 > and https://bugzilla.kernel.org/show_bug.cgi?id=43300 > > Signed-off-by: Jürg Billeter > --- > fs/exec.c | 1 + > include/linux/sched/signal.h | 3 +++ > include/uapi/linux/prctl.h | 4 ++++ > kernel/cred.c | 1 + > kernel/exit.c | 4 ++++ > kernel/fork.c | 2 ++ > kernel/sys.c | 11 +++++++++++ > security/apparmor/lsm.c | 1 + > security/selinux/hooks.c | 1 + > 9 files changed, 28 insertions(+) > > diff --git a/fs/exec.c b/fs/exec.c > index 01a9fb9d8ac3..bb389c3c596d 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1353,6 +1353,7 @@ void setup_new_exec(struct linux_binprm * bprm) > if (bprm->secureexec) { > /* Make sure parent cannot signal privileged process. */ > current->pdeath_signal = 0; > + current->signal->pdeath_signal_proc = 0; > > /* > * For secureexec, reset the stack limit to sane default to > diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h > index 2a0dd40b15db..c5c137e5ef39 100644 > --- a/include/linux/sched/signal.h > +++ b/include/linux/sched/signal.h > @@ -103,6 +103,9 @@ struct signal_struct { > int group_stop_count; > unsigned int flags; /* see SIGNAL_* flags below */ > > + /* The signal sent when the parent dies: */ > + int pdeath_signal_proc; > + > /* > * PR_SET_CHILD_SUBREAPER marks a process, like a service > * manager, to re-parent orphan (double-forking) child processes > diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h > index a8d0759a9e40..04508e81d4f2 100644 > --- a/include/uapi/linux/prctl.h > +++ b/include/uapi/linux/prctl.h > @@ -197,4 +197,8 @@ struct prctl_mm_map { > # define PR_CAP_AMBIENT_LOWER 3 > # define PR_CAP_AMBIENT_CLEAR_ALL 4 > > +/* Process-based variant of PDEATHSIG */ > +#define PR_SET_PDEATHSIG_PROC 48 > +#define PR_GET_PDEATHSIG_PROC 49 > + > #endif /* _LINUX_PRCTL_H */ > diff --git a/kernel/cred.c b/kernel/cred.c > index ecf03657e71c..0192a94670e1 100644 > --- a/kernel/cred.c > +++ b/kernel/cred.c > @@ -448,6 +448,7 @@ int commit_creds(struct cred *new) > if (task->mm) > set_dumpable(task->mm, suid_dumpable); > task->pdeath_signal = 0; > + task->signal->pdeath_signal_proc = 0; > smp_wmb(); > } > > diff --git a/kernel/exit.c b/kernel/exit.c > index a35d8a17e01f..1be0616239e0 100644 > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -635,6 +635,10 @@ static void reparent_leader(struct task_struct *father, struct task_struct *p, > if (unlikely(p->exit_state == EXIT_DEAD)) > return; > > + if (p->signal->pdeath_signal_proc) > + group_send_sig_info(p->signal->pdeath_signal_proc, > + SEND_SIG_NOINFO, p); > + > /* We don't want people slaying init. */ > p->exit_signal = SIGCHLD; > > diff --git a/kernel/fork.c b/kernel/fork.c > index 24a4c0be80d5..f6482392ece9 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -1412,6 +1412,8 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk) > > mutex_init(&sig->cred_guard_mutex); > > + sig->pdeath_signal_proc = current->signal->pdeath_signal_proc; > + > return 0; > } > > diff --git a/kernel/sys.c b/kernel/sys.c > index 2855ee73acd0..c47e92fa5370 100644 > --- a/kernel/sys.c > +++ b/kernel/sys.c > @@ -2210,6 +2210,17 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, > case PR_GET_PDEATHSIG: > error = put_user(me->pdeath_signal, (int __user *)arg2); > break; > + case PR_SET_PDEATHSIG_PROC: > + if (!valid_signal(arg2)) { > + error = -EINVAL; > + break; > + } > + me->signal->pdeath_signal_proc = arg2; > + break; > + case PR_GET_PDEATHSIG_PROC: > + error = put_user(me->signal->pdeath_signal_proc, > + (int __user *)arg2); > + break; > case PR_GET_DUMPABLE: > error = get_dumpable(me->mm); > break; > diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c > index 7a82c0f61452..c8bd6b1331c1 100644 > --- a/security/apparmor/lsm.c > +++ b/security/apparmor/lsm.c > @@ -628,6 +628,7 @@ static void apparmor_bprm_committing_creds(struct linux_binprm *bprm) > aa_inherit_files(bprm->cred, current->files); > > current->pdeath_signal = 0; > + current->signal->pdeath_signal_proc = 0; > > /* reset soft limits and set hard limits for the new label */ > __aa_transition_rlimits(label, new_ctx->label); > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c > index ad3b0f53ede0..574d6238f8de 100644 > --- a/security/selinux/hooks.c > +++ b/security/selinux/hooks.c > @@ -2527,6 +2527,7 @@ static void selinux_bprm_committing_creds(struct linux_binprm *bprm) > > /* Always clear parent death signal on SID transitions. */ > current->pdeath_signal = 0; > + current->signal->pdeath_signal_proc = 0; > > /* Check whether the new SID can inherit resource limits from the old > * SID. If not, reset all soft limits to the lower of the current > -- > 2.14.1 >