linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] fork: check exit_signal passed in clone3() call
@ 2019-09-10 17:58 Eugene Syromiatnikov
  2019-09-11 13:31 ` Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Eugene Syromiatnikov @ 2019-09-10 17:58 UTC (permalink / raw)
  To: linux-kernel, Christian Brauner, Oleg Nesterov
  Cc: Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

Previously, higher 32 bits of exit_signal fields were lost when
copied to the kernel args structure (that uses int as a type for the
respective field).  Moreover, as Oleg has noted[1], exit_signal is used
unchecked, so it has to be checked for sanity before use; for the legacy
syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
however, there's no such thing is done in clone3() code path, and that can
break at least thread_group_leader.

Checking user-passed exit_signal against ~CSIGNAL mask solves both
of these problems.

[1] https://lkml.org/lkml/2019/9/10/467

* kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
args.exit_signal has bits set outside CSIGNAL mask.
(_do_fork): Note that exit_signal is expected to be checked for the
sanity by the caller.

Fixes: 7f192e3cd316 ("fork: add clone3")
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
---
 kernel/fork.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index 2852d0e..9dee2ab 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2338,6 +2338,8 @@ struct mm_struct *copy_init_mm(void)
  *
  * It copies the process, and if successful kick-starts
  * it and waits for it to finish using the VM if required.
+ *
+ * args->exit_signal is expected to be checked for sanity by the caller.
  */
 long _do_fork(struct kernel_clone_args *args)
 {
@@ -2562,6 +2564,16 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
 	if (copy_from_user(&args, uargs, size))
 		return -EFAULT;
 
+	/*
+	 * exit_signal is confined to CSIGNAL mask in legacy syscalls,
+	 * so it is used unchecked deeper in syscall handling routines;
+	 * moreover, copying to struct kernel_clone_args.exit_signals
+	 * trims higher 32 bits, so it is has to be checked that they
+	 * are zero.
+	 */
+	if (unlikely(args.exit_signal & ~((u64)CSIGNAL)))
+		return -EINVAL;
+
 	*kargs = (struct kernel_clone_args){
 		.flags		= args.flags,
 		.pidfd		= u64_to_user_ptr(args.pidfd),
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-10 17:58 [PATCH v2] fork: check exit_signal passed in clone3() call Eugene Syromiatnikov
@ 2019-09-11 13:31 ` Oleg Nesterov
  2019-09-11 13:47   ` Christian Brauner
  2019-09-11 13:48 ` Andrew Morton
  2019-09-11 17:32 ` Eric W. Biederman
  2 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2019-09-11 13:31 UTC (permalink / raw)
  To: Eugene Syromiatnikov
  Cc: linux-kernel, Christian Brauner, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On 09/10, Eugene Syromiatnikov wrote:
>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2338,6 +2338,8 @@ struct mm_struct *copy_init_mm(void)
>   *
>   * It copies the process, and if successful kick-starts
>   * it and waits for it to finish using the VM if required.
> + *
> + * args->exit_signal is expected to be checked for sanity by the caller.

not sure this comment is really useful but it doesn't hurt

>  long _do_fork(struct kernel_clone_args *args)
>  {
> @@ -2562,6 +2564,16 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
>  	if (copy_from_user(&args, uargs, size))
>  		return -EFAULT;
>  
> +	/*
> +	 * exit_signal is confined to CSIGNAL mask in legacy syscalls,
> +	 * so it is used unchecked deeper in syscall handling routines;
> +	 * moreover, copying to struct kernel_clone_args.exit_signals
> +	 * trims higher 32 bits, so it is has to be checked that they
> +	 * are zero.
> +	 */
> +	if (unlikely(args.exit_signal & ~((u64)CSIGNAL)))
> +		return -EINVAL;

OK, agreed. As you pointed out, this doesn't guarantee valid_signal(exit_signal).
But we do no really care as long as it is non-negative, it acts as exit_signal==0.

I have no idea if we want to deny exit_signal >= _NSIG in clone3(), this was always
allowed...

I think this needs the "CC: stable" tag.

Acked-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 13:31 ` Oleg Nesterov
@ 2019-09-11 13:47   ` Christian Brauner
  0 siblings, 0 replies; 14+ messages in thread
From: Christian Brauner @ 2019-09-11 13:47 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Eugene Syromiatnikov, linux-kernel, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Wed, Sep 11, 2019 at 03:31:20PM +0200, Oleg Nesterov wrote:
> On 09/10, Eugene Syromiatnikov wrote:
> >
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -2338,6 +2338,8 @@ struct mm_struct *copy_init_mm(void)
> >   *
> >   * It copies the process, and if successful kick-starts
> >   * it and waits for it to finish using the VM if required.
> > + *
> > + * args->exit_signal is expected to be checked for sanity by the caller.
> 
> not sure this comment is really useful but it doesn't hurt
> 
> >  long _do_fork(struct kernel_clone_args *args)
> >  {
> > @@ -2562,6 +2564,16 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
> >  	if (copy_from_user(&args, uargs, size))
> >  		return -EFAULT;
> >  
> > +	/*
> > +	 * exit_signal is confined to CSIGNAL mask in legacy syscalls,
> > +	 * so it is used unchecked deeper in syscall handling routines;
> > +	 * moreover, copying to struct kernel_clone_args.exit_signals
> > +	 * trims higher 32 bits, so it is has to be checked that they
> > +	 * are zero.
> > +	 */
> > +	if (unlikely(args.exit_signal & ~((u64)CSIGNAL)))
> > +		return -EINVAL;
> 
> OK, agreed. As you pointed out, this doesn't guarantee valid_signal(exit_signal).
> But we do no really care as long as it is non-negative, it acts as exit_signal==0.
> 
> I have no idea if we want to deny exit_signal >= _NSIG in clone3(), this was always
> allowed...
> 
> I think this needs the "CC: stable" tag.

No, I don't think so. clone3() is not in any released kernel. It'll be
released with 5.3. So we should just try and have this picked up this
week before the release.  I'm going to send a pr for this today
hopefully.
(Sorry for the delay, conferencing makes it harder to reply to mail.)

Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>

> 
> Acked-by: Oleg Nesterov <oleg@redhat.com>
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-10 17:58 [PATCH v2] fork: check exit_signal passed in clone3() call Eugene Syromiatnikov
  2019-09-11 13:31 ` Oleg Nesterov
@ 2019-09-11 13:48 ` Andrew Morton
  2019-09-11 13:52   ` Christian Brauner
  2019-09-11 17:32 ` Eric W. Biederman
  2 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2019-09-11 13:48 UTC (permalink / raw)
  To: Eugene Syromiatnikov
  Cc: linux-kernel, Christian Brauner, Oleg Nesterov,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Tue, 10 Sep 2019 18:58:52 +0100 Eugene Syromiatnikov <esyr@redhat.com> wrote:

> Previously, higher 32 bits of exit_signal fields were lost when
> copied to the kernel args structure (that uses int as a type for the
> respective field).  Moreover, as Oleg has noted[1], exit_signal is used
> unchecked, so it has to be checked for sanity before use; for the legacy
> syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> however, there's no such thing is done in clone3() code path, and that can
> break at least thread_group_leader.
> 
> Checking user-passed exit_signal against ~CSIGNAL mask solves both
> of these problems.
> 
> [1] https://lkml.org/lkml/2019/9/10/467
> 
> * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> args.exit_signal has bits set outside CSIGNAL mask.
> (_do_fork): Note that exit_signal is expected to be checked for the
> sanity by the caller.
> 
> Fixes: 7f192e3cd316 ("fork: add clone3")

What are the user-visible runtime effects of this bug?

Relatedly, should this fix be backported into -stable kernels?  If so, why?


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 13:48 ` Andrew Morton
@ 2019-09-11 13:52   ` Christian Brauner
  2019-09-11 14:16     ` Christian Brauner
  2019-09-13  9:07     ` Christian Brauner
  0 siblings, 2 replies; 14+ messages in thread
From: Christian Brauner @ 2019-09-11 13:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eugene Syromiatnikov, linux-kernel, Oleg Nesterov,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> On Tue, 10 Sep 2019 18:58:52 +0100 Eugene Syromiatnikov <esyr@redhat.com> wrote:
> 
> > Previously, higher 32 bits of exit_signal fields were lost when
> > copied to the kernel args structure (that uses int as a type for the
> > respective field).  Moreover, as Oleg has noted[1], exit_signal is used
> > unchecked, so it has to be checked for sanity before use; for the legacy
> > syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> > however, there's no such thing is done in clone3() code path, and that can
> > break at least thread_group_leader.
> > 
> > Checking user-passed exit_signal against ~CSIGNAL mask solves both
> > of these problems.
> > 
> > [1] https://lkml.org/lkml/2019/9/10/467
> > 
> > * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> > args.exit_signal has bits set outside CSIGNAL mask.
> > (_do_fork): Note that exit_signal is expected to be checked for the
> > sanity by the caller.
> > 
> > Fixes: 7f192e3cd316 ("fork: add clone3")
> 
> What are the user-visible runtime effects of this bug?
> 
> Relatedly, should this fix be backported into -stable kernels?  If so, why?

No, as I said in my other mail clone3() is not in any released kernel
yet. clone3() is going to be released in v5.3.

Christian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 13:52   ` Christian Brauner
@ 2019-09-11 14:16     ` Christian Brauner
  2019-09-11 14:32       ` Eugene Syromiatnikov
  2019-09-13  9:07     ` Christian Brauner
  1 sibling, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2019-09-11 14:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eugene Syromiatnikov, linux-kernel, Oleg Nesterov,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > On Tue, 10 Sep 2019 18:58:52 +0100 Eugene Syromiatnikov <esyr@redhat.com> wrote:
> > 
> > > Previously, higher 32 bits of exit_signal fields were lost when
> > > copied to the kernel args structure (that uses int as a type for the
> > > respective field).  Moreover, as Oleg has noted[1], exit_signal is used
> > > unchecked, so it has to be checked for sanity before use; for the legacy
> > > syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> > > however, there's no such thing is done in clone3() code path, and that can
> > > break at least thread_group_leader.
> > > 
> > > Checking user-passed exit_signal against ~CSIGNAL mask solves both
> > > of these problems.
> > > 
> > > [1] https://lkml.org/lkml/2019/9/10/467
> > > 
> > > * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> > > args.exit_signal has bits set outside CSIGNAL mask.
> > > (_do_fork): Note that exit_signal is expected to be checked for the
> > > sanity by the caller.
> > > 
> > > Fixes: 7f192e3cd316 ("fork: add clone3")
> > 
> > What are the user-visible runtime effects of this bug?
> > 
> > Relatedly, should this fix be backported into -stable kernels?  If so, why?
> 
> No, as I said in my other mail clone3() is not in any released kernel
> yet. clone3() is going to be released in v5.3.

Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
chance that this might be visible in legacy clone if anyone passes in an
invalid signal greater than NSIG right now somehow, they'd now get
EINVAL if I'm seeing this right.

So an alternative might be to only fix this in clone3() only right now
and get this patch into 5.3 to not release clone3() with this bug from
legacy clone duplicated.
And we defer the actual legacy clone fix until after next merge window
having it stew in linux-next for a couple of rcs. Distros often pull in
rcs so if anyone notices a regression for legacy clone we'll know about
it... valid_signal() checks at process exit time when the parent is
supposed to be notifed will catch faulty signals anyway so it's not that
big of a deal.

Christian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 14:16     ` Christian Brauner
@ 2019-09-11 14:32       ` Eugene Syromiatnikov
  2019-09-11 14:54         ` Christian Brauner
  0 siblings, 1 reply; 14+ messages in thread
From: Eugene Syromiatnikov @ 2019-09-11 14:32 UTC (permalink / raw)
  To: Christian Brauner, Andrew Morton
  Cc: linux-kernel, Oleg Nesterov, Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > What are the user-visible runtime effects of this bug?

The userspace can set -1 as an exit_signal, and that will break process
signalling and reaping.

> > > Relatedly, should this fix be backported into -stable kernels?  If so, why?
> > 
> > No, as I said in my other mail clone3() is not in any released kernel
> > yet. clone3() is going to be released in v5.3.
> 
> Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> chance that this might be visible in legacy clone if anyone passes in an
> invalid signal greater than NSIG right now somehow, they'd now get
> EINVAL if I'm seeing this right.
> 
> So an alternative might be to only fix this in clone3() only right now
> and get this patch into 5.3 to not release clone3() with this bug from
> legacy clone duplicated.
> And we defer the actual legacy clone fix until after next merge window
> having it stew in linux-next for a couple of rcs. Distros often pull in
> rcs so if anyone notices a regression for legacy clone we'll know about
> it... valid_signal() checks at process exit time when the parent is
> supposed to be notifed will catch faulty signals anyway so it's not that
> big of a deal.

As the patch is written, only copy_clone_args_from_user is touched (which
is used only by clone3 and not legacy clone), and the check added
replicates legacy clone behaviour: userspace can set 0..CSIGNAL
as an exit_signal.   Having ability to set exit_signal in NSIG..CSIGNAL
renge seems to be a bug, but at least it seems to be harmless one
and indeed may be addressed separately in the future.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 14:32       ` Eugene Syromiatnikov
@ 2019-09-11 14:54         ` Christian Brauner
  2019-09-11 15:08           ` Dmitry V. Levin
  2019-09-11 15:20           ` Eugene Syromiatnikov
  0 siblings, 2 replies; 14+ messages in thread
From: Christian Brauner @ 2019-09-11 14:54 UTC (permalink / raw)
  To: Eugene Syromiatnikov
  Cc: Andrew Morton, linux-kernel, Oleg Nesterov,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > What are the user-visible runtime effects of this bug?
> 
> The userspace can set -1 as an exit_signal, and that will break process
> signalling and reaping.
> 
> > > > Relatedly, should this fix be backported into -stable kernels?  If so, why?
> > > 
> > > No, as I said in my other mail clone3() is not in any released kernel
> > > yet. clone3() is going to be released in v5.3.
> > 
> > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > chance that this might be visible in legacy clone if anyone passes in an
> > invalid signal greater than NSIG right now somehow, they'd now get
> > EINVAL if I'm seeing this right.
> > 
> > So an alternative might be to only fix this in clone3() only right now
> > and get this patch into 5.3 to not release clone3() with this bug from
> > legacy clone duplicated.
> > And we defer the actual legacy clone fix until after next merge window
> > having it stew in linux-next for a couple of rcs. Distros often pull in
> > rcs so if anyone notices a regression for legacy clone we'll know about
> > it... valid_signal() checks at process exit time when the parent is
> > supposed to be notifed will catch faulty signals anyway so it's not that
> > big of a deal.
> 
> As the patch is written, only copy_clone_args_from_user is touched (which
> is used only by clone3 and not legacy clone), and the check added

Great!

> replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> as an exit_signal.   Having ability to set exit_signal in NSIG..CSIGNAL

Hm. The way I see it for clone3() it would make sense to only have <
NSIG right away. valid_signal() won't let through anything else
anyway... Since clone3() isn't out yet it doesn't make sense to
replicate the (buggy) behavior of legacy clone, right?

Christian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 14:54         ` Christian Brauner
@ 2019-09-11 15:08           ` Dmitry V. Levin
  2019-09-11 15:20           ` Eugene Syromiatnikov
  1 sibling, 0 replies; 14+ messages in thread
From: Dmitry V. Levin @ 2019-09-11 15:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Eugene Syromiatnikov, Andrew Morton, linux-kernel, Oleg Nesterov,
	Peter Zijlstra (Intel),
	Ingo Molnar, Eric Biederman

On Wed, Sep 11, 2019 at 04:54:47PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> > On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > > What are the user-visible runtime effects of this bug?
> > 
> > The userspace can set -1 as an exit_signal, and that will break process
> > signalling and reaping.
> > 
> > > > > Relatedly, should this fix be backported into -stable kernels?  If so, why?
> > > > 
> > > > No, as I said in my other mail clone3() is not in any released kernel
> > > > yet. clone3() is going to be released in v5.3.
> > > 
> > > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > > chance that this might be visible in legacy clone if anyone passes in an
> > > invalid signal greater than NSIG right now somehow, they'd now get
> > > EINVAL if I'm seeing this right.
> > > 
> > > So an alternative might be to only fix this in clone3() only right now
> > > and get this patch into 5.3 to not release clone3() with this bug from
> > > legacy clone duplicated.
> > > And we defer the actual legacy clone fix until after next merge window
> > > having it stew in linux-next for a couple of rcs. Distros often pull in
> > > rcs so if anyone notices a regression for legacy clone we'll know about
> > > it... valid_signal() checks at process exit time when the parent is
> > > supposed to be notifed will catch faulty signals anyway so it's not that
> > > big of a deal.
> > 
> > As the patch is written, only copy_clone_args_from_user is touched (which
> > is used only by clone3 and not legacy clone), and the check added
> 
> Great!
> 
> > replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> > as an exit_signal.   Having ability to set exit_signal in NSIG..CSIGNAL
> 
> Hm. The way I see it for clone3() it would make sense to only have <
> NSIG right away. valid_signal() won't let through anything else
> anyway... Since clone3() isn't out yet it doesn't make sense to
> replicate the (buggy) behavior of legacy clone, right?

I agree, let's have a proper exit_signal check in the new syscall
from the beginning.

It should be as simple as
	if (unlikely((args.exit_signal & ~((u64)CSIGNAL)) ||
		     !valid_signal(args.exit_signal)))
		return -EINVAL;
shouldn't it?


-- 
ldv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 14:54         ` Christian Brauner
  2019-09-11 15:08           ` Dmitry V. Levin
@ 2019-09-11 15:20           ` Eugene Syromiatnikov
  2019-09-11 15:31             ` Christian Brauner
  1 sibling, 1 reply; 14+ messages in thread
From: Eugene Syromiatnikov @ 2019-09-11 15:20 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andrew Morton, linux-kernel, Oleg Nesterov,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Wed, Sep 11, 2019 at 04:54:47PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> > On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > > What are the user-visible runtime effects of this bug?
> > 
> > The userspace can set -1 as an exit_signal, and that will break process
> > signalling and reaping.
> > 
> > > > > Relatedly, should this fix be backported into -stable kernels?  If so, why?
> > > > 
> > > > No, as I said in my other mail clone3() is not in any released kernel
> > > > yet. clone3() is going to be released in v5.3.
> > > 
> > > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > > chance that this might be visible in legacy clone if anyone passes in an
> > > invalid signal greater than NSIG right now somehow, they'd now get
> > > EINVAL if I'm seeing this right.
> > > 
> > > So an alternative might be to only fix this in clone3() only right now
> > > and get this patch into 5.3 to not release clone3() with this bug from
> > > legacy clone duplicated.
> > > And we defer the actual legacy clone fix until after next merge window
> > > having it stew in linux-next for a couple of rcs. Distros often pull in
> > > rcs so if anyone notices a regression for legacy clone we'll know about
> > > it... valid_signal() checks at process exit time when the parent is
> > > supposed to be notifed will catch faulty signals anyway so it's not that
> > > big of a deal.
> > 
> > As the patch is written, only copy_clone_args_from_user is touched (which
> > is used only by clone3 and not legacy clone), and the check added
> 
> Great!
> 
> > replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> > as an exit_signal.   Having ability to set exit_signal in NSIG..CSIGNAL
> 
> Hm. The way I see it for clone3() it would make sense to only have <
> NSIG right away. valid_signal() won't let through anything else
> anyway... Since clone3() isn't out yet it doesn't make sense to
> replicate the (buggy) behavior of legacy clone, right?

I was thinking about that and in the end decided to go with CSIGNAL;
the only issue I see here is that it prevents for libc to easily
switch clone() library call implementation to clone3(), in case there
are some applications that rely on this kind of behaviour; if there's
no such userspace users, then switch to valid_signal() check for both
legacy and clone3 should be fine (along with possible switching to u64
for kernel_clone_args's exit_signal, so the check can done in one place),
but I'm agree with your hesitance to do it right now.

> Christian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 15:20           ` Eugene Syromiatnikov
@ 2019-09-11 15:31             ` Christian Brauner
  0 siblings, 0 replies; 14+ messages in thread
From: Christian Brauner @ 2019-09-11 15:31 UTC (permalink / raw)
  To: Eugene Syromiatnikov
  Cc: Andrew Morton, linux-kernel, Oleg Nesterov,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Wed, Sep 11, 2019 at 04:20:48PM +0100, Eugene Syromiatnikov wrote:
> On Wed, Sep 11, 2019 at 04:54:47PM +0200, Christian Brauner wrote:
> > On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> > > On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > > > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > > > What are the user-visible runtime effects of this bug?
> > > 
> > > The userspace can set -1 as an exit_signal, and that will break process
> > > signalling and reaping.
> > > 
> > > > > > Relatedly, should this fix be backported into -stable kernels?  If so, why?
> > > > > 
> > > > > No, as I said in my other mail clone3() is not in any released kernel
> > > > > yet. clone3() is going to be released in v5.3.
> > > > 
> > > > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > > > chance that this might be visible in legacy clone if anyone passes in an
> > > > invalid signal greater than NSIG right now somehow, they'd now get
> > > > EINVAL if I'm seeing this right.
> > > > 
> > > > So an alternative might be to only fix this in clone3() only right now
> > > > and get this patch into 5.3 to not release clone3() with this bug from
> > > > legacy clone duplicated.
> > > > And we defer the actual legacy clone fix until after next merge window
> > > > having it stew in linux-next for a couple of rcs. Distros often pull in
> > > > rcs so if anyone notices a regression for legacy clone we'll know about
> > > > it... valid_signal() checks at process exit time when the parent is
> > > > supposed to be notifed will catch faulty signals anyway so it's not that
> > > > big of a deal.
> > > 
> > > As the patch is written, only copy_clone_args_from_user is touched (which
> > > is used only by clone3 and not legacy clone), and the check added
> > 
> > Great!
> > 
> > > replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> > > as an exit_signal.   Having ability to set exit_signal in NSIG..CSIGNAL
> > 
> > Hm. The way I see it for clone3() it would make sense to only have <
> > NSIG right away. valid_signal() won't let through anything else
> > anyway... Since clone3() isn't out yet it doesn't make sense to
> > replicate the (buggy) behavior of legacy clone, right?
> 
> I was thinking about that and in the end decided to go with CSIGNAL;
> the only issue I see here is that it prevents for libc to easily
> switch clone() library call implementation to clone3(), in case there
> are some applications that rely on this kind of behaviour; if there's
> no such userspace users, then switch to valid_signal() check for both
> legacy and clone3 should be fine (along with possible switching to u64
> for kernel_clone_args's exit_signal, so the check can done in one place),
> but I'm agree with your hesitance to do it right now.

We already have other explicit differences between clone() and clone3().
And these differences are intentional. Two examples are that CSIGNAL is
not a valid flag anymore (due to .exit-signal) and CLONE_DETACHED will
not be simply ignored anymore but will get you EINVAL. So that shouldn't
be a concern.

I think we really want the minimal thing and only for clone3 right now:
verify whether any high-bits are set and whether it's a valid signal.
clone3() needs to have correct semantics when released!

Legacy clone on the other hand already has this buggy behavior for
n-years. It can have it for a little while longer so we can see whether
userspace relies on it.

Christian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-10 17:58 [PATCH v2] fork: check exit_signal passed in clone3() call Eugene Syromiatnikov
  2019-09-11 13:31 ` Oleg Nesterov
  2019-09-11 13:48 ` Andrew Morton
@ 2019-09-11 17:32 ` Eric W. Biederman
  2 siblings, 0 replies; 14+ messages in thread
From: Eric W. Biederman @ 2019-09-11 17:32 UTC (permalink / raw)
  To: Eugene Syromiatnikov
  Cc: linux-kernel, Christian Brauner, Oleg Nesterov, Andrew Morton,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin

Eugene Syromiatnikov <esyr@redhat.com> writes:

> Previously, higher 32 bits of exit_signal fields were lost when
> copied to the kernel args structure (that uses int as a type for the
> respective field).  Moreover, as Oleg has noted[1], exit_signal is used
> unchecked, so it has to be checked for sanity before use; for the legacy
> syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> however, there's no such thing is done in clone3() code path, and that can
> break at least thread_group_leader.
>
> Checking user-passed exit_signal against ~CSIGNAL mask solves both
> of these problems.
>
> [1] https://lkml.org/lkml/2019/9/10/467
>
> * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> args.exit_signal has bits set outside CSIGNAL mask.
> (_do_fork): Note that exit_signal is expected to be checked for the
> sanity by the caller.
>
> Fixes: 7f192e3cd316 ("fork: add clone3")
> Reported-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

> ---
>  kernel/fork.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 2852d0e..9dee2ab 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2338,6 +2338,8 @@ struct mm_struct *copy_init_mm(void)
>   *
>   * It copies the process, and if successful kick-starts
>   * it and waits for it to finish using the VM if required.
> + *
> + * args->exit_signal is expected to be checked for sanity by the caller.
>   */
>  long _do_fork(struct kernel_clone_args *args)
>  {
> @@ -2562,6 +2564,16 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
>  	if (copy_from_user(&args, uargs, size))
>  		return -EFAULT;
>  
> +	/*
> +	 * exit_signal is confined to CSIGNAL mask in legacy syscalls,
> +	 * so it is used unchecked deeper in syscall handling routines;
> +	 * moreover, copying to struct kernel_clone_args.exit_signals
> +	 * trims higher 32 bits, so it is has to be checked that they
> +	 * are zero.
> +	 */
> +	if (unlikely(args.exit_signal & ~((u64)CSIGNAL)))
> +		return -EINVAL;
> +
>  	*kargs = (struct kernel_clone_args){
>  		.flags		= args.flags,
>  		.pidfd		= u64_to_user_ptr(args.pidfd),

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] fork: check exit_signal passed in clone3() call
  2019-09-11 13:52   ` Christian Brauner
  2019-09-11 14:16     ` Christian Brauner
@ 2019-09-13  9:07     ` Christian Brauner
  1 sibling, 0 replies; 14+ messages in thread
From: Christian Brauner @ 2019-09-13  9:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eugene Syromiatnikov, linux-kernel, Oleg Nesterov,
	Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > On Tue, 10 Sep 2019 18:58:52 +0100 Eugene Syromiatnikov <esyr@redhat.com> wrote:
> > 
> > > Previously, higher 32 bits of exit_signal fields were lost when
> > > copied to the kernel args structure (that uses int as a type for the
> > > respective field).  Moreover, as Oleg has noted[1], exit_signal is used
> > > unchecked, so it has to be checked for sanity before use; for the legacy
> > > syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> > > however, there's no such thing is done in clone3() code path, and that can
> > > break at least thread_group_leader.
> > > 
> > > Checking user-passed exit_signal against ~CSIGNAL mask solves both
> > > of these problems.
> > > 
> > > [1] https://lkml.org/lkml/2019/9/10/467
> > > 
> > > * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> > > args.exit_signal has bits set outside CSIGNAL mask.
> > > (_do_fork): Note that exit_signal is expected to be checked for the
> > > sanity by the caller.
> > > 
> > > Fixes: 7f192e3cd316 ("fork: add clone3")
> > 
> > What are the user-visible runtime effects of this bug?
> > 
> > Relatedly, should this fix be backported into -stable kernels?  If so, why?
> 
> No, as I said in my other mail clone3() is not in any released kernel
> yet. clone3() is going to be released in v5.3.

Applied yesteday. This is now fixed and included in mainline.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2] fork: check exit_signal passed in clone3() call
@ 2019-09-10 17:58 Eugene Syromiatnikov
  0 siblings, 0 replies; 14+ messages in thread
From: Eugene Syromiatnikov @ 2019-09-10 17:58 UTC (permalink / raw)
  To: linux-kernel, Christian Brauner, Oleg Nesterov
  Cc: Andrew Morton, Peter Zijlstra (Intel),
	Ingo Molnar, Dmitry V. Levin, Eric Biederman

Hello.

After some consideration, I've decided to utilise Oleg's proposal[1]
"(args.exit_signal & ~((u64)CSIGNAL))" as a check. I still don't like
it, as it mixes argument copy check (I'm not sure if it's ever needed,
however, as I'm not sure if there's a reason for exit_signal field
of struct kernel_clone_args to have int type) with argument sanity
check; moreover, it covers only clone3 case, and the code in
copy_process is still error-prone in the long run.  Ideally, the check
should be somewhere in the one place, but as of now this one place
is likely _do_fork, but it's kinda weir to have argument check there
as of now.

Changes since v1[2]:
 - Check changed to comparison against negated CSIGNAL to address
   the bug reported by Oleg[3].
 - Added a comment to _do_fork that exit_signal has to be checked
   by the caller.

[1] https://lkml.org/lkml/2019/9/10/581
[2] https://lkml.org/lkml/2019/9/10/411
[3] https://lkml.org/lkml/2019/9/10/467

Eugene Syromiatnikov (1):
  fork: check exit_signal passed in clone3() call

 kernel/fork.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

-- 
2.1.4


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-09-13  9:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-10 17:58 [PATCH v2] fork: check exit_signal passed in clone3() call Eugene Syromiatnikov
2019-09-11 13:31 ` Oleg Nesterov
2019-09-11 13:47   ` Christian Brauner
2019-09-11 13:48 ` Andrew Morton
2019-09-11 13:52   ` Christian Brauner
2019-09-11 14:16     ` Christian Brauner
2019-09-11 14:32       ` Eugene Syromiatnikov
2019-09-11 14:54         ` Christian Brauner
2019-09-11 15:08           ` Dmitry V. Levin
2019-09-11 15:20           ` Eugene Syromiatnikov
2019-09-11 15:31             ` Christian Brauner
2019-09-13  9:07     ` Christian Brauner
2019-09-11 17:32 ` Eric W. Biederman
  -- strict thread matches above, loose matches on Subject: below --
2019-09-10 17:58 Eugene Syromiatnikov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).