linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] user_ns: use correct check for single-threadedness
@ 2015-07-28 17:15 Kees Cook
  2015-07-28 18:02 ` Rik van Riel
                   ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Kees Cook @ 2015-07-28 17:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Oleg Nesterov, David Howells, linux-kernel, Andrew Morton,
	Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov, Rik van Riel,
	Vladimir Davydov, Ricky Zhou, Julien Tinnes

From: Ricky Zhou <rickyz@chromium.org>

Checking mm_users > 1 does not mean a process is multithreaded. For
example, reading /proc/PID/maps temporarily increments mm_users, allowing
other processes to (accidentally) interfere with unshare() calls.

This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
returning EINVAL if another processes happened to be simultaneously
reading the maps file.

Signed-off-by: Ricky Zhou <rickyz@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
---
 kernel/fork.c           | 4 ++--
 kernel/user_namespace.c | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index dbd9b8d7b7cc..7d138f152dcd 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -75,6 +75,7 @@
 #include <linux/aio.h>
 #include <linux/compiler.h>
 #include <linux/sysctl.h>
+#include <linux/sched.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -1876,8 +1877,7 @@ static int check_unshare_flags(unsigned long unshare_flags)
 	 * needs to unshare vm.
 	 */
 	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
-		/* FIXME: get_task_mm() increments ->mm_users */
-		if (atomic_read(&current->mm->mm_users) > 1)
+		if (!current_is_single_threaded())
 			return -EINVAL;
 	}
 
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 4109f8320684..7e0e021e4304 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -22,6 +22,7 @@
 #include <linux/ctype.h>
 #include <linux/projid.h>
 #include <linux/fs_struct.h>
+#include <linux/sched.h>
 
 static struct kmem_cache *user_ns_cachep __read_mostly;
 static DEFINE_MUTEX(userns_state_mutex);
@@ -977,7 +978,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
 		return -EINVAL;
 
 	/* Threaded processes may not enter a different user namespace */
-	if (atomic_read(&current->mm->mm_users) > 1)
+	if (!current_is_single_threaded())
 		return -EINVAL;
 
 	if (current->fs->users != 1)
-- 
1.9.1


-- 
Kees Cook
Chrome OS Security

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 17:15 [PATCH] user_ns: use correct check for single-threadedness Kees Cook
@ 2015-07-28 18:02 ` Rik van Riel
  2015-07-28 18:17 ` Eric W. Biederman
  2015-07-28 21:35 ` Andrew Morton
  2 siblings, 0 replies; 41+ messages in thread
From: Rik van Riel @ 2015-07-28 18:02 UTC (permalink / raw)
  To: Kees Cook, Eric W. Biederman
  Cc: Oleg Nesterov, David Howells, linux-kernel, Andrew Morton,
	Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 07/28/2015 01:15 PM, Kees Cook wrote:
> From: Ricky Zhou <rickyz@chromium.org>
> 
> Checking mm_users > 1 does not mean a process is multithreaded. For
> example, reading /proc/PID/maps temporarily increments mm_users, allowing
> other processes to (accidentally) interfere with unshare() calls.
> 
> This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
> returning EINVAL if another processes happened to be simultaneously
> reading the maps file.
> 
> Signed-off-by: Ricky Zhou <rickyz@chromium.org>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Cc: stable@vger.kernel.org

Reviewed-by: Rik van Riel <riel@redhat.com>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 17:15 [PATCH] user_ns: use correct check for single-threadedness Kees Cook
  2015-07-28 18:02 ` Rik van Riel
@ 2015-07-28 18:17 ` Eric W. Biederman
  2015-07-28 20:55   ` Ricky Zhou
  2015-07-28 21:35 ` Andrew Morton
  2 siblings, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-07-28 18:17 UTC (permalink / raw)
  To: Kees Cook
  Cc: Oleg Nesterov, David Howells, linux-kernel, Andrew Morton,
	Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov, Rik van Riel,
	Vladimir Davydov, Ricky Zhou, Julien Tinnes

Kees Cook <keescook@chromium.org> writes:

> From: Ricky Zhou <rickyz@chromium.org>
>
> Checking mm_users > 1 does not mean a process is multithreaded. For
> example, reading /proc/PID/maps temporarily increments mm_users, allowing
> other processes to (accidentally) interfere with unshare() calls.
>
> This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
> returning EINVAL if another processes happened to be simultaneously
> reading the maps file.
>
> Signed-off-by: Ricky Zhou <rickyz@chromium.org>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Cc: stable@vger.kernel.org

This looks like a good fix.  Any chance you can drudge up the commit where
this hack came in so that Greg & Company know how far to back port this?

> ---
>  kernel/fork.c           | 4 ++--
>  kernel/user_namespace.c | 3 ++-
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index dbd9b8d7b7cc..7d138f152dcd 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -75,6 +75,7 @@
>  #include <linux/aio.h>
>  #include <linux/compiler.h>
>  #include <linux/sysctl.h>
> +#include <linux/sched.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/pgalloc.h>
> @@ -1876,8 +1877,7 @@ static int check_unshare_flags(unsigned long unshare_flags)
>  	 * needs to unshare vm.
>  	 */
>  	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
> -		/* FIXME: get_task_mm() increments ->mm_users */
> -		if (atomic_read(&current->mm->mm_users) > 1)
> +		if (!current_is_single_threaded())
>  			return -EINVAL;
>  	}
>  
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 4109f8320684..7e0e021e4304 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -22,6 +22,7 @@
>  #include <linux/ctype.h>
>  #include <linux/projid.h>
>  #include <linux/fs_struct.h>
> +#include <linux/sched.h>
>  
>  static struct kmem_cache *user_ns_cachep __read_mostly;
>  static DEFINE_MUTEX(userns_state_mutex);
> @@ -977,7 +978,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>  		return -EINVAL;
>  
>  	/* Threaded processes may not enter a different user namespace */
> -	if (atomic_read(&current->mm->mm_users) > 1)
> +	if (!current_is_single_threaded())
>  		return -EINVAL;
>  
>  	if (current->fs->users != 1)
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 18:17 ` Eric W. Biederman
@ 2015-07-28 20:55   ` Ricky Zhou
  2015-07-28 21:01     ` Kees Cook
  0 siblings, 1 reply; 41+ messages in thread
From: Ricky Zhou @ 2015-07-28 20:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kees Cook, Oleg Nesterov, David Howells, linux-kernel,
	Andrew Morton, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Julien Tinnes

(apologies for the dup, forgot to reply all)

userns_install in user_namespace.c (affects setns of a user
namespace): cde1975bc242f3e1072bde623ef378e547b73f91.

The check in check_unshare_flags is a little more complex. The
incorrect check was added in
cf2e340f4249b781b3d2beb41e891d08581f0e10 but I don't think it would
have triggered under any supported combination of flags at that point.

>From 50804fe3737ca6a5942fdc2057a18a8141d00141 until
6e556ce209b09528dbf1931cbfd5d323e1345926, the bug affected
unshare(CLONE_NEWPID).

>From b2e0d98705e60e45bbb3c0032c48824ad7ae0704 onward,
unshare(CLONE_NEWUSER) is affected.

Thanks,
Ricky

On Tue, Jul 28, 2015 at 11:17 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Kees Cook <keescook@chromium.org> writes:
>
>> From: Ricky Zhou <rickyz@chromium.org>
>>
>> Checking mm_users > 1 does not mean a process is multithreaded. For
>> example, reading /proc/PID/maps temporarily increments mm_users, allowing
>> other processes to (accidentally) interfere with unshare() calls.
>>
>> This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
>> returning EINVAL if another processes happened to be simultaneously
>> reading the maps file.
>>
>> Signed-off-by: Ricky Zhou <rickyz@chromium.org>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> Cc: stable@vger.kernel.org
>
> This looks like a good fix.  Any chance you can drudge up the commit where
> this hack came in so that Greg & Company know how far to back port this?
>
>> ---
>>  kernel/fork.c           | 4 ++--
>>  kernel/user_namespace.c | 3 ++-
>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index dbd9b8d7b7cc..7d138f152dcd 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -75,6 +75,7 @@
>>  #include <linux/aio.h>
>>  #include <linux/compiler.h>
>>  #include <linux/sysctl.h>
>> +#include <linux/sched.h>
>>
>>  #include <asm/pgtable.h>
>>  #include <asm/pgalloc.h>
>> @@ -1876,8 +1877,7 @@ static int check_unshare_flags(unsigned long unshare_flags)
>>        * needs to unshare vm.
>>        */
>>       if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
>> -             /* FIXME: get_task_mm() increments ->mm_users */
>> -             if (atomic_read(&current->mm->mm_users) > 1)
>> +             if (!current_is_single_threaded())
>>                       return -EINVAL;
>>       }
>>
>> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
>> index 4109f8320684..7e0e021e4304 100644
>> --- a/kernel/user_namespace.c
>> +++ b/kernel/user_namespace.c
>> @@ -22,6 +22,7 @@
>>  #include <linux/ctype.h>
>>  #include <linux/projid.h>
>>  #include <linux/fs_struct.h>
>> +#include <linux/sched.h>
>>
>>  static struct kmem_cache *user_ns_cachep __read_mostly;
>>  static DEFINE_MUTEX(userns_state_mutex);
>> @@ -977,7 +978,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>>               return -EINVAL;
>>
>>       /* Threaded processes may not enter a different user namespace */
>> -     if (atomic_read(&current->mm->mm_users) > 1)
>> +     if (!current_is_single_threaded())
>>               return -EINVAL;
>>
>>       if (current->fs->users != 1)
>> --
>> 1.9.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 20:55   ` Ricky Zhou
@ 2015-07-28 21:01     ` Kees Cook
  2015-08-05 18:13       ` Eric W. Biederman
  0 siblings, 1 reply; 41+ messages in thread
From: Kees Cook @ 2015-07-28 21:01 UTC (permalink / raw)
  To: Ricky Zhou
  Cc: Eric W. Biederman, Oleg Nesterov, David Howells, LKML,
	Andrew Morton, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Julien Tinnes

On Tue, Jul 28, 2015 at 1:55 PM, Ricky Zhou <rickyz@google.com> wrote:
> On Tue, Jul 28, 2015 at 11:17 AM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> Kees Cook <keescook@chromium.org> writes:
>>
>>> From: Ricky Zhou <rickyz@chromium.org>
>>>
>>> Checking mm_users > 1 does not mean a process is multithreaded. For
>>> example, reading /proc/PID/maps temporarily increments mm_users, allowing
>>> other processes to (accidentally) interfere with unshare() calls.
>>>
>>> This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
>>> returning EINVAL if another processes happened to be simultaneously
>>> reading the maps file.
>>>
>>> Signed-off-by: Ricky Zhou <rickyz@chromium.org>
>>> Signed-off-by: Kees Cook <keescook@chromium.org>
>>> Cc: stable@vger.kernel.org
>>
>> This looks like a good fix.  Any chance you can drudge up the commit where
>> this hack came in so that Greg & Company know how far to back port this?
>
> userns_install in user_namespace.c (affects setns of a user
> namespace): cde1975bc242f3e1072bde623ef378e547b73f91.
>
> The check in check_unshare_flags is a little more complex. The
> incorrect check was added in
> cf2e340f4249b781b3d2beb41e891d08581f0e10 but I don't think it would
> have triggered under any supported combination of flags at that point.
>
> From 50804fe3737ca6a5942fdc2057a18a8141d00141 until
> 6e556ce209b09528dbf1931cbfd5d323e1345926, the bug affected
> unshare(CLONE_NEWPID).

That's back to v3.8, so this goes quite a way, it seems.

-Kees

>
> From b2e0d98705e60e45bbb3c0032c48824ad7ae0704 onward,
> unshare(CLONE_NEWUSER) is affected.
>
> Thanks,
> Ricky
>>>
>>> ---
>>>  kernel/fork.c           | 4 ++--
>>>  kernel/user_namespace.c | 3 ++-
>>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/fork.c b/kernel/fork.c
>>> index dbd9b8d7b7cc..7d138f152dcd 100644
>>> --- a/kernel/fork.c
>>> +++ b/kernel/fork.c
>>> @@ -75,6 +75,7 @@
>>>  #include <linux/aio.h>
>>>  #include <linux/compiler.h>
>>>  #include <linux/sysctl.h>
>>> +#include <linux/sched.h>
>>>
>>>  #include <asm/pgtable.h>
>>>  #include <asm/pgalloc.h>
>>> @@ -1876,8 +1877,7 @@ static int check_unshare_flags(unsigned long unshare_flags)
>>>        * needs to unshare vm.
>>>        */
>>>       if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
>>> -             /* FIXME: get_task_mm() increments ->mm_users */
>>> -             if (atomic_read(&current->mm->mm_users) > 1)
>>> +             if (!current_is_single_threaded())
>>>                       return -EINVAL;
>>>       }
>>>
>>> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
>>> index 4109f8320684..7e0e021e4304 100644
>>> --- a/kernel/user_namespace.c
>>> +++ b/kernel/user_namespace.c
>>> @@ -22,6 +22,7 @@
>>>  #include <linux/ctype.h>
>>>  #include <linux/projid.h>
>>>  #include <linux/fs_struct.h>
>>> +#include <linux/sched.h>
>>>
>>>  static struct kmem_cache *user_ns_cachep __read_mostly;
>>>  static DEFINE_MUTEX(userns_state_mutex);
>>> @@ -977,7 +978,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>>>               return -EINVAL;
>>>
>>>       /* Threaded processes may not enter a different user namespace */
>>> -     if (atomic_read(&current->mm->mm_users) > 1)
>>> +     if (!current_is_single_threaded())
>>>               return -EINVAL;
>>>
>>>       if (current->fs->users != 1)
>>> --
>>> 1.9.1



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 17:15 [PATCH] user_ns: use correct check for single-threadedness Kees Cook
  2015-07-28 18:02 ` Rik van Riel
  2015-07-28 18:17 ` Eric W. Biederman
@ 2015-07-28 21:35 ` Andrew Morton
  2015-07-28 21:50   ` Kees Cook
  2015-07-28 22:11   ` Kirill A. Shutemov
  2 siblings, 2 replies; 41+ messages in thread
From: Andrew Morton @ 2015-07-28 21:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Eric W. Biederman, Oleg Nesterov, David Howells, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov, Rik van Riel,
	Vladimir Davydov, Ricky Zhou, Julien Tinnes

On Tue, 28 Jul 2015 10:15:00 -0700 Kees Cook <keescook@chromium.org> wrote:

> From: Ricky Zhou <rickyz@chromium.org>
> 
> Checking mm_users > 1 does not mean a process is multithreaded. For
> example, reading /proc/PID/maps temporarily increments mm_users, allowing
> other processes to (accidentally) interfere with unshare() calls.
> 
> This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
> returning EINVAL if another processes happened to be simultaneously
> reading the maps file.

Yikes.  current_is_single_threaded() is expensive.  Are we sure this
isn't going to kill someone's workload?


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 21:35 ` Andrew Morton
@ 2015-07-28 21:50   ` Kees Cook
  2015-07-28 22:11   ` Kirill A. Shutemov
  1 sibling, 0 replies; 41+ messages in thread
From: Kees Cook @ 2015-07-28 21:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Oleg Nesterov, David Howells, LKML,
	Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov, Rik van Riel,
	Vladimir Davydov, Ricky Zhou, Julien Tinnes

On Tue, Jul 28, 2015 at 2:35 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 28 Jul 2015 10:15:00 -0700 Kees Cook <keescook@chromium.org> wrote:
>
>> From: Ricky Zhou <rickyz@chromium.org>
>>
>> Checking mm_users > 1 does not mean a process is multithreaded. For
>> example, reading /proc/PID/maps temporarily increments mm_users, allowing
>> other processes to (accidentally) interfere with unshare() calls.
>>
>> This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
>> returning EINVAL if another processes happened to be simultaneously
>> reading the maps file.
>
> Yikes.  current_is_single_threaded() is expensive.  Are we sure this
> isn't going to kill someone's workload?

It _can_ be expensive, but if mm->mm_users == 1 it immediately returns
true, so it's only the cases where there is a race (like what's solved
here), or when it's a legit failure. This doesn't feel to me like it
should hit a real user very hard, since "real" callers of unshare will
normally have mm_users == 1.

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 21:35 ` Andrew Morton
  2015-07-28 21:50   ` Kees Cook
@ 2015-07-28 22:11   ` Kirill A. Shutemov
  2015-08-05 11:38     ` Ingo Molnar
  2015-08-05 17:23     ` Oleg Nesterov
  1 sibling, 2 replies; 41+ messages in thread
From: Kirill A. Shutemov @ 2015-07-28 22:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kees Cook, Eric W. Biederman, Oleg Nesterov, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On Tue, Jul 28, 2015 at 02:35:04PM -0700, Andrew Morton wrote:
> On Tue, 28 Jul 2015 10:15:00 -0700 Kees Cook <keescook@chromium.org> wrote:
> 
> > From: Ricky Zhou <rickyz@chromium.org>
> > 
> > Checking mm_users > 1 does not mean a process is multithreaded. For
> > example, reading /proc/PID/maps temporarily increments mm_users, allowing
> > other processes to (accidentally) interfere with unshare() calls.
> > 
> > This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
> > returning EINVAL if another processes happened to be simultaneously
> > reading the maps file.
> 
> Yikes.  current_is_single_threaded() is expensive.  Are we sure this
> isn't going to kill someone's workload?

It's expensive only if mm_users > 1. We will go to for_each_process() only
if somebody outside of the process grabs mm_users references (like reading
/proc/PID/maps). Or if it called it from multithreaded application.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 22:11   ` Kirill A. Shutemov
@ 2015-08-05 11:38     ` Ingo Molnar
  2015-08-05 11:53       ` Kirill A. Shutemov
  2015-08-05 17:23     ` Oleg Nesterov
  1 sibling, 1 reply; 41+ messages in thread
From: Ingo Molnar @ 2015-08-05 11:38 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrew Morton, Kees Cook, Eric W. Biederman, Oleg Nesterov,
	David Howells, linux-kernel, Peter Zijlstra, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes


* Kirill A. Shutemov <kirill@shutemov.name> wrote:

> On Tue, Jul 28, 2015 at 02:35:04PM -0700, Andrew Morton wrote:
> > On Tue, 28 Jul 2015 10:15:00 -0700 Kees Cook <keescook@chromium.org> wrote:
> > 
> > > From: Ricky Zhou <rickyz@chromium.org>
> > > 
> > > Checking mm_users > 1 does not mean a process is multithreaded. For
> > > example, reading /proc/PID/maps temporarily increments mm_users, allowing
> > > other processes to (accidentally) interfere with unshare() calls.
> > > 
> > > This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
> > > returning EINVAL if another processes happened to be simultaneously
> > > reading the maps file.
> > 
> > Yikes.  current_is_single_threaded() is expensive.  Are we sure this
> > isn't going to kill someone's workload?
> 
> It's expensive only if mm_users > 1. We will go to for_each_process() only
> if somebody outside of the process grabs mm_users references (like reading
> /proc/PID/maps). Or if it called it from multithreaded application.

It's considerably expensive:

        rcu_read_lock();
        for_each_process(p) {
                do {
	...
                } while_each_thread(p, t);
        }


'only' if it's multi-threaded, i.e. when some workload cares so much about 
performance that it uses multiple threads?

Can you see the contradiction there?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-05 11:38     ` Ingo Molnar
@ 2015-08-05 11:53       ` Kirill A. Shutemov
  2015-08-05 13:13         ` Ricky Zhou
  0 siblings, 1 reply; 41+ messages in thread
From: Kirill A. Shutemov @ 2015-08-05 11:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Kees Cook, Eric W. Biederman, Oleg Nesterov,
	David Howells, linux-kernel, Peter Zijlstra, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On Wed, Aug 05, 2015 at 01:38:54PM +0200, Ingo Molnar wrote:
> 
> * Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> > On Tue, Jul 28, 2015 at 02:35:04PM -0700, Andrew Morton wrote:
> > > On Tue, 28 Jul 2015 10:15:00 -0700 Kees Cook <keescook@chromium.org> wrote:
> > > 
> > > > From: Ricky Zhou <rickyz@chromium.org>
> > > > 
> > > > Checking mm_users > 1 does not mean a process is multithreaded. For
> > > > example, reading /proc/PID/maps temporarily increments mm_users, allowing
> > > > other processes to (accidentally) interfere with unshare() calls.
> > > > 
> > > > This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
> > > > returning EINVAL if another processes happened to be simultaneously
> > > > reading the maps file.
> > > 
> > > Yikes.  current_is_single_threaded() is expensive.  Are we sure this
> > > isn't going to kill someone's workload?
> > 
> > It's expensive only if mm_users > 1. We will go to for_each_process() only
> > if somebody outside of the process grabs mm_users references (like reading
> > /proc/PID/maps). Or if it called it from multithreaded application.
> 
> It's considerably expensive:
> 
>         rcu_read_lock();
>         for_each_process(p) {
>                 do {
> 	...
>                 } while_each_thread(p, t);
>         }
> 
> 
> 'only' if it's multi-threaded, i.e. when some workload cares so much about 
> performance that it uses multiple threads?
> 
> Can you see the contradiction there?

I can. man 2 unshare:

	CLONE_NEWUSER requires that the calling process  is  not  threaded;

The workload cares so much about performance that it ignores API
requirements. Some slow down looks like a fair price to me.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-05 11:53       ` Kirill A. Shutemov
@ 2015-08-05 13:13         ` Ricky Zhou
  0 siblings, 0 replies; 41+ messages in thread
From: Ricky Zhou @ 2015-08-05 13:13 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, Andrew Morton, Kees Cook, Eric W. Biederman,
	Oleg Nesterov, David Howells, linux-kernel, Peter Zijlstra,
	Kirill A. Shutemov, Rik van Riel, Vladimir Davydov, Ricky Zhou,
	Julien Tinnes

On Wed, Aug 5, 2015 at 4:53 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>> 'only' if it's multi-threaded, i.e. when some workload cares so much about
>> performance that it uses multiple threads?
>>
>> Can you see the contradiction there?
>
> I can. man 2 unshare:
>
>         CLONE_NEWUSER requires that the calling process  is  not  threaded;
>
> The workload cares so much about performance that it ignores API
> requirements. Some slow down looks like a fair price to me.
To be fair, the entire reason for this patch is that the slow path
(mm_users > 1) can happen even when the process is single-threaded. I
was concerned about how expensive current_is_single_threaded looked as
well, but didn't see any lightweight alternatives short of adding a
field to mm_struct. Do folks think it's worth going down that route
instead?

Thanks,
Ricky

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 22:11   ` Kirill A. Shutemov
  2015-08-05 11:38     ` Ingo Molnar
@ 2015-08-05 17:23     ` Oleg Nesterov
  2015-08-05 18:00       ` Eric W. Biederman
  1 sibling, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-05 17:23 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrew Morton, Kees Cook, Eric W. Biederman, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 07/29, Kirill A. Shutemov wrote:
>
> On Tue, Jul 28, 2015 at 02:35:04PM -0700, Andrew Morton wrote:
> > On Tue, 28 Jul 2015 10:15:00 -0700 Kees Cook <keescook@chromium.org> wrote:
> >
> > > From: Ricky Zhou <rickyz@chromium.org>
> > >
> > > Checking mm_users > 1 does not mean a process is multithreaded. For
> > > example, reading /proc/PID/maps temporarily increments mm_users, allowing
> > > other processes to (accidentally) interfere with unshare() calls.
> > >
> > > This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
> > > returning EINVAL if another processes happened to be simultaneously
> > > reading the maps file.
> >
> > Yikes.  current_is_single_threaded() is expensive.  Are we sure this
> > isn't going to kill someone's workload?
>
> It's expensive only if mm_users > 1. We will go to for_each_process() only
> if somebody outside of the process grabs mm_users references (like reading
> /proc/PID/maps).

Yes,

> Or if it called it from multithreaded application.

Not really, please note the "negative fast-path" check:

	if (atomic_read(&task->signal->live) != 1)	
		return false;

> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Same here, I think the patch is fine.

I don't think that sys_setns() is performance critical, and Eric seems
to agree with this change too.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-05 17:23     ` Oleg Nesterov
@ 2015-08-05 18:00       ` Eric W. Biederman
  2015-08-05 18:52         ` Eric W. Biederman
  0 siblings, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-05 18:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> On 07/29, Kirill A. Shutemov wrote:
>>
>> On Tue, Jul 28, 2015 at 02:35:04PM -0700, Andrew Morton wrote:
>> > On Tue, 28 Jul 2015 10:15:00 -0700 Kees Cook <keescook@chromium.org> wrote:
>> >
>> > > From: Ricky Zhou <rickyz@chromium.org>
>> > >
>> > > Checking mm_users > 1 does not mean a process is multithreaded. For
>> > > example, reading /proc/PID/maps temporarily increments mm_users, allowing
>> > > other processes to (accidentally) interfere with unshare() calls.
>> > >
>> > > This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
>> > > returning EINVAL if another processes happened to be simultaneously
>> > > reading the maps file.
>> >
>> > Yikes.  current_is_single_threaded() is expensive.  Are we sure this
>> > isn't going to kill someone's workload?
>>
>> It's expensive only if mm_users > 1. We will go to for_each_process() only
>> if somebody outside of the process grabs mm_users references (like reading
>> /proc/PID/maps).
>
> Yes,
>
>> Or if it called it from multithreaded application.
>
> Not really, please note the "negative fast-path" check:
>
> 	if (atomic_read(&task->signal->live) != 1)	
> 		return false;
>
>> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> Same here, I think the patch is fine.
>
> I don't think that sys_setns() is performance critical, and Eric seems
> to agree with this change too.

I do and I am about to merge it into my userns tree.  It is clearly an
issue that is affecting users in the real world, which fundamentally
makes the current implementation buggy.

That said the performance of sys_setns does matter.  Removing the
possibility of calling synchronize_rcu from switch_task_namespaces
happened because the performance of sys_setns was a problem.

Looking at the implementation of current_is_single_threaded() it appears
that the data structures dealing with threads could stand some
improvement so current_is_single_threaded() could become a constant time
function.

The class of application that is likely to call setns and be strongly
affected by it's performance are container management applications or
applications that are deliberately using more than one namespace at a
time.  There might be advantages to using current_is_single_threaded()
as currently written to slow down an application.  That slow down could
expand a race condition enough to make another bug exploitable.

So if we could figure out how to make current_is_single_threaded()
faster or at least have the slow path not triggerable by users playing
with proc I would very much be in favor of it.

Eric

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-07-28 21:01     ` Kees Cook
@ 2015-08-05 18:13       ` Eric W. Biederman
  2015-08-05 19:40         ` Kees Cook
  0 siblings, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-05 18:13 UTC (permalink / raw)
  To: Kees Cook
  Cc: Ricky Zhou, Oleg Nesterov, David Howells, LKML, Andrew Morton,
	Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov, Rik van Riel,
	Vladimir Davydov, Julien Tinnes

Kees Cook <keescook@chromium.org> writes:

> On Tue, Jul 28, 2015 at 1:55 PM, Ricky Zhou <rickyz@google.com> wrote:
>> On Tue, Jul 28, 2015 at 11:17 AM, Eric W. Biederman
>> <ebiederm@xmission.com> wrote:
>>> Kees Cook <keescook@chromium.org> writes:
>>>
>>>> From: Ricky Zhou <rickyz@chromium.org>
>>>>
>>>> Checking mm_users > 1 does not mean a process is multithreaded. For
>>>> example, reading /proc/PID/maps temporarily increments mm_users, allowing
>>>> other processes to (accidentally) interfere with unshare() calls.
>>>>
>>>> This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
>>>> returning EINVAL if another processes happened to be simultaneously
>>>> reading the maps file.
>>>>
>>>> Signed-off-by: Ricky Zhou <rickyz@chromium.org>
>>>> Signed-off-by: Kees Cook <keescook@chromium.org>
>>>> Cc: stable@vger.kernel.org
>>>
>>> This looks like a good fix.  Any chance you can drudge up the commit where
>>> this hack came in so that Greg & Company know how far to back port this?
>>
>> userns_install in user_namespace.c (affects setns of a user
>> namespace): cde1975bc242f3e1072bde623ef378e547b73f91.
>>
>> The check in check_unshare_flags is a little more complex. The
>> incorrect check was added in
>> cf2e340f4249b781b3d2beb41e891d08581f0e10 but I don't think it would
>> have triggered under any supported combination of flags at that point.
>>
>> From 50804fe3737ca6a5942fdc2057a18a8141d00141 until
>> 6e556ce209b09528dbf1931cbfd5d323e1345926, the bug affected
>> unshare(CLONE_NEWPID).
>
> That's back to v3.8, so this goes quite a way, it seems.

This patch was marked as CC' stable.  The question I am asking is this
problem bad enough that backporting this change to stable makes sense?

And even more if this patch can wait for the merge window to be merged
or if there this needs to be expidited, and get in before 4.2 -final.

It sounds like this is one of those rare bugs someone hit once or twice,
just often enough for this to be tracked down, and the important thing
is that this fix get into the kernel's code base.

So unless I am otherwise informed I will assume that this is a change
that can stand to wait until the merge window to be merged (so it has a
full development cycle of review and discussion).  But that is worth
backporting after that because it actually causes problems that people
actually hit.

Eric

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-05 18:00       ` Eric W. Biederman
@ 2015-08-05 18:52         ` Eric W. Biederman
  2015-08-06 13:06           ` Oleg Nesterov
  2015-08-06 14:35           ` [PATCH] user_ns: use correct check for single-threadedness Oleg Nesterov
  0 siblings, 2 replies; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-05 18:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes


Hmm.

On closer inspection this patch touches on a greater inconsistency then
the test to see if the task is the only task using the mm_struct.

We currently allow tasks created with clone to have a different user
namespace and to share a mm_struct, and I don't think that is wrong.

What we actually care about are the uid and gid values that show up in
signals that are reported to a process, and for that what we care about
is the question do the tasks share signal handling state, which is
controlled by the flags CLONE_SIGHAND and CLONE_THREAD.

As such current_is_single_threaded() is wrong because it tests to see if
there is someone else sharing an mm_struct.

So I have to ask.  Is it possible to rework these checks such that we
look at the sighand struct and signal sharing handling sharing instead
of the count on the mm_struct?

I suspect we could do that more cheaply, as well as making the code more
correct.

Eric


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-05 18:13       ` Eric W. Biederman
@ 2015-08-05 19:40         ` Kees Cook
  0 siblings, 0 replies; 41+ messages in thread
From: Kees Cook @ 2015-08-05 19:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Ricky Zhou, Oleg Nesterov, David Howells, LKML, Andrew Morton,
	Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov, Rik van Riel,
	Vladimir Davydov, Julien Tinnes

On Wed, Aug 5, 2015 at 11:13 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Kees Cook <keescook@chromium.org> writes:
>
>> On Tue, Jul 28, 2015 at 1:55 PM, Ricky Zhou <rickyz@google.com> wrote:
>>> On Tue, Jul 28, 2015 at 11:17 AM, Eric W. Biederman
>>> <ebiederm@xmission.com> wrote:
>>>> Kees Cook <keescook@chromium.org> writes:
>>>>
>>>>> From: Ricky Zhou <rickyz@chromium.org>
>>>>>
>>>>> Checking mm_users > 1 does not mean a process is multithreaded. For
>>>>> example, reading /proc/PID/maps temporarily increments mm_users, allowing
>>>>> other processes to (accidentally) interfere with unshare() calls.
>>>>>
>>>>> This fixes observed failures of unshare(CLONE_NEWUSER) incorrectly
>>>>> returning EINVAL if another processes happened to be simultaneously
>>>>> reading the maps file.
>>>>>
>>>>> Signed-off-by: Ricky Zhou <rickyz@chromium.org>
>>>>> Signed-off-by: Kees Cook <keescook@chromium.org>
>>>>> Cc: stable@vger.kernel.org
>>>>
>>>> This looks like a good fix.  Any chance you can drudge up the commit where
>>>> this hack came in so that Greg & Company know how far to back port this?
>>>
>>> userns_install in user_namespace.c (affects setns of a user
>>> namespace): cde1975bc242f3e1072bde623ef378e547b73f91.
>>>
>>> The check in check_unshare_flags is a little more complex. The
>>> incorrect check was added in
>>> cf2e340f4249b781b3d2beb41e891d08581f0e10 but I don't think it would
>>> have triggered under any supported combination of flags at that point.
>>>
>>> From 50804fe3737ca6a5942fdc2057a18a8141d00141 until
>>> 6e556ce209b09528dbf1931cbfd5d323e1345926, the bug affected
>>> unshare(CLONE_NEWPID).
>>
>> That's back to v3.8, so this goes quite a way, it seems.
>
> This patch was marked as CC' stable.  The question I am asking is this
> problem bad enough that backporting this change to stable makes sense?

I have no problem dropping the CC. At the time it seemed like a clear
bug fix appropriate for stable. If you feel differently, please remove
the CC. :)

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-05 18:52         ` Eric W. Biederman
@ 2015-08-06 13:06           ` Oleg Nesterov
  2015-08-06 13:44             ` Oleg Nesterov
  2015-08-06 14:35           ` [PATCH] user_ns: use correct check for single-threadedness Oleg Nesterov
  1 sibling, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-06 13:06 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/05, Eric W. Biederman wrote:
>
> So I have to ask.

I hope you are asking someone else, not me ;) I never understood what
exactly we try to restrict and why.

> Is it possible to rework these checks such that we
> look at the sighand struct and signal sharing handling sharing instead
> of the count on the mm_struct?

Then why we can't simply check thread_group_empty() == T ? Why should we
worry about CLONE_SIGHAND at all?

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-06 13:06           ` Oleg Nesterov
@ 2015-08-06 13:44             ` Oleg Nesterov
  2015-08-12  1:17               ` Eric W. Biederman
  2015-08-12  1:22               ` [PATCH 0/2] userns: Creation logic fixes Eric W. Biederman
  0 siblings, 2 replies; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-06 13:44 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/06, Oleg Nesterov wrote:
>
> On 08/05, Eric W. Biederman wrote:
> >
> > So I have to ask.
>
> I hope you are asking someone else, not me ;) I never understood what
> exactly we try to restrict and why.
>
> > Is it possible to rework these checks such that we
> > look at the sighand struct and signal sharing handling sharing instead
> > of the count on the mm_struct?
>
> Then why we can't simply check thread_group_empty() == T ? Why should we
> worry about CLONE_SIGHAND at all?

The same for clone() actually... I forgot why we decided to check
CLONE_SIGHAND, iirc I suggested CLONE_THREAD initially then we switched
to CLONE_SIGHAND "just in case", to make it as strict as possible.

How about the patch below?

(note that the "or parent" part of the comment is wrong in any case).

Oleg.


--- x/kernel/fork.c
+++ x/kernel/fork.c
@@ -1279,10 +1279,9 @@ static struct task_struct *copy_process(
 
 	/*
 	 * If the new process will be in a different pid or user namespace
-	 * do not allow it to share a thread group or signal handlers or
-	 * parent with the forking task.
+	 * do not allow it to share a thread group with the forking task.
 	 */
-	if (clone_flags & CLONE_SIGHAND) {
+	if (clone_flags & CLONE_THREAD) {
 		if ((clone_flags & (CLONE_NEWUSER | CLONE_NEWPID)) ||
 		    (task_active_pid_ns(current) !=
 				current->nsproxy->pid_ns_for_children))


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-05 18:52         ` Eric W. Biederman
  2015-08-06 13:06           ` Oleg Nesterov
@ 2015-08-06 14:35           ` Oleg Nesterov
  2015-08-06 21:16             ` Eric W. Biederman
  1 sibling, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-06 14:35 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/05, Eric W. Biederman wrote:
>
> So I have to ask.

I hope you are asking someone else, not me ;) I never understood what
exactly we try to restrict and why.

> Is it possible to rework these checks such that we
> look at the sighand struct and signal sharing handling sharing instead
> of the count on the mm_struct?

Then why we can't simply check thread_group_empty() == T ? Why should we
worry about CLONE_SIGHAND at all?

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-06 14:35           ` [PATCH] user_ns: use correct check for single-threadedness Oleg Nesterov
@ 2015-08-06 21:16             ` Eric W. Biederman
  0 siblings, 0 replies; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-06 21:16 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> On 08/05, Eric W. Biederman wrote:
>>
>> So I have to ask.
>
> I hope you are asking someone else, not me ;) I never understood what
> exactly we try to restrict and why.

I think I was just asking rhetorically, and asking myself.

>> Is it possible to rework these checks such that we
>> look at the sighand struct and signal sharing handling sharing instead
>> of the count on the mm_struct?
>
> Then why we can't simply check thread_group_empty() == T ? Why should we
> worry about CLONE_SIGHAND at all?

<rant> 
thread_group_empty() for a thread group with a single member is a
confusing name.
 </rant>

CLONE_SIGHAND is just a hair excessive, you have to at least look at
your per thread register to see which namespace to interpret the values
of the signals in.  I suppose that is confusing but not totally fatal.

Without changing the semantics, just correcting the implementation
of the code we have now this is the code I get:

diff --git a/kernel/fork.c b/kernel/fork.c
index 1bfefc6f96a4..c95757c15fcb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1866,13 +1866,13 @@ static int check_unshare_flags(unsigned long unshare_flags)
 				CLONE_NEWUSER|CLONE_NEWPID))
 		return -EINVAL;
 	/*
-	 * Not implemented, but pretend it works if there is nothing to
-	 * unshare. Note that unsharing CLONE_THREAD or CLONE_SIGHAND
-	 * needs to unshare vm.
+	 * Not implemented, but pretend it works if there is nothing
+	 * to unshare.  Note that unsharing the address space or the
+	 * signal handlers also need to unshare the signal queues (aka
+	 * CLONE_THREAD).
 	 */
-	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
-		/* FIXME: get_task_mm() increments ->mm_users */
-		if (atomic_read(&current->mm->mm_users) > 1)
+	if (unshare_flags & CLONE_THREAD) {
+		if (!thread_group_empty(current))
 			return -EINVAL;
 	}
 
@@ -1936,21 +1936,23 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
 	int err;
 
 	/*
-	 * If unsharing a user namespace must also unshare the thread.
+	 * If unsharing a user namespace must also unshare the signal
+	 * handlers and unshare the filesystem root and working
+	 * directories.
 	 */
 	if (unshare_flags & CLONE_NEWUSER)
-		unshare_flags |= CLONE_THREAD | CLONE_FS;
-	/*
-	 * If unsharing a thread from a thread group, must also unshare vm.
-	 */
-	if (unshare_flags & CLONE_THREAD)
-		unshare_flags |= CLONE_VM;
+		unshare_flags |= CLONE_SIGHAND | CLONE_FS;
 	/*
 	 * If unsharing vm, must also unshare signal handlers.
 	 */
 	if (unshare_flags & CLONE_VM)
 		unshare_flags |= CLONE_SIGHAND;
 	/*
+	 * If unsharing a signal handlers, must also unshare the signal queues.
+	 */
+	if (unshare_flags & CLONE_SIGHAND)
+		unshare_flags |= CLONE_THREAD;
+	/*
 	 * If unsharing namespace, must also unshare filesystem information.
 	 */
 	if (unshare_flags & CLONE_NEWNS)
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 4109f8320684..45a5cbf97715 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -976,8 +976,8 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
 	if (user_ns == current_user_ns())
 		return -EINVAL;
 
-	/* Threaded processes may not enter a different user namespace */
-	if (atomic_read(&current->mm->mm_users) > 1)
+	/* Shared signal handlers must live in the same user namespace */
+	if (atomic_read(&current->sighand->count) > 1)
 		return -EINVAL;
 
 	if (current->fs->users != 1)

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-06 13:44             ` Oleg Nesterov
@ 2015-08-12  1:17               ` Eric W. Biederman
  2015-08-12 14:40                 ` Oleg Nesterov
  2015-08-12  1:22               ` [PATCH 0/2] userns: Creation logic fixes Eric W. Biederman
  1 sibling, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-12  1:17 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> On 08/06, Oleg Nesterov wrote:
>>
>> On 08/05, Eric W. Biederman wrote:
>> >
>> > So I have to ask.
>>
>> I hope you are asking someone else, not me ;) I never understood what
>> exactly we try to restrict and why.
>>
>> > Is it possible to rework these checks such that we
>> > look at the sighand struct and signal sharing handling sharing instead
>> > of the count on the mm_struct?
>>
>> Then why we can't simply check thread_group_empty() == T ? Why should we
>> worry about CLONE_SIGHAND at all?
>
> The same for clone() actually... I forgot why we decided to check
> CLONE_SIGHAND, iirc I suggested CLONE_THREAD initially then we switched
> to CLONE_SIGHAND "just in case", to make it as strict as possible.

I do agree that making the test be for CLONE_THREAD is safe, makes
sense, and is less confusing than what we have now.x

> How about the patch below?
>
> (note that the "or parent" part of the comment is wrong in any case).

It was correct.  You failed to removed it when you removed CLONE_PARENT
from that test.

Eric


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 0/2] userns: Creation logic fixes
  2015-08-06 13:44             ` Oleg Nesterov
  2015-08-12  1:17               ` Eric W. Biederman
@ 2015-08-12  1:22               ` Eric W. Biederman
  2015-08-12  1:24                 ` [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm Eric W. Biederman
                                   ` (2 more replies)
  1 sibling, 3 replies; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-12  1:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes


So I have take a good hard stare at the problem, as well as sitting down
and writing some test code to verify the code works the way I think it
does.

The following two patches are how I think this bit of chaos needs to be
solved.  If folks could take a once over these patches and possibly test
them to confirm they fix your issues I would appreciate it.

Eric W. Biederman (2):
      unshare: Unsharing a thread does not require unsharing a vm
      userns,pidns: Force thread group sharing, not signal handler sharing.

 kernel/fork.c           | 32 ++++++++++++++++++--------------
 kernel/user_namespace.c |  4 ++--
 2 files changed, 20 insertions(+), 16 deletions(-)

Eric

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-12  1:22               ` [PATCH 0/2] userns: Creation logic fixes Eric W. Biederman
@ 2015-08-12  1:24                 ` Eric W. Biederman
  2015-08-12 17:48                   ` Oleg Nesterov
  2015-08-12  1:25                 ` [PATCH 2/2] userns,pidns: Force thread group sharing, not signal handler sharing Eric W. Biederman
  2015-08-12  6:29                 ` [PATCH 0/2] userns: Creation logic fixes Kees Cook
  2 siblings, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-12  1:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes


In the logic in the initial commit of unshare made creating a new
thread group for a process, contingent upon creating a new memory
address space for that process.  That is wrong.  Two separate
processes in different thread groups can share a memory address space
and clone allows creation of such proceses.

This is significant because it was observed that mm_users > 1 does not
mean that a process is multi-threaded, as reading /proc/PID/maps
temporarily increments mm_users, which allows other processes to
(accidentally) interfere with unshare() calls.

Correct the check in check_unshare_flags() to test for
!thread_group_empty() for CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM
and also for CLONE_VM check for !current_is_single_threaded instead
of mm_users > 1.

By using the correct checks in unshare this removes the possibility of
an accidental denial of service attack.

Additionally using the correct checks in unshare ensures that only an
explicit unshare(CLONE_VM) can possibly trigger the slow path of
current_is_single_threaded().  As an explict unshare(CLONE_VM) is
pointless it is not expected there are many applications that make
that call.

Cc: stable@vger.kernel.org
Fixes: b2e0d98705e60e45bbb3c0032c48824ad7ae0704 userns: Implement unshare of the user namespace
Reported-by: Ricky Zhou <rickyz@chromium.org>
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/fork.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 1bfefc6f96a4..0edc437d5bb0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1866,13 +1866,17 @@ static int check_unshare_flags(unsigned long unshare_flags)
 				CLONE_NEWUSER|CLONE_NEWPID))
 		return -EINVAL;
 	/*
-	 * Not implemented, but pretend it works if there is nothing to
-	 * unshare. Note that unsharing CLONE_THREAD or CLONE_SIGHAND
-	 * needs to unshare vm.
+	 * Not implemented, but pretend it works if there is nothing
+	 * to unshare.  Note that unsharing the address space or the
+	 * signal handlers also need to unshare the signal queues (aka
+	 * CLONE_THREAD).
 	 */
 	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
-		/* FIXME: get_task_mm() increments ->mm_users */
-		if (atomic_read(&current->mm->mm_users) > 1)
+		if (!thread_group_empty(current))
+			return -EINVAL;
+	}
+	if (unshare_flags & CLONE_VM) {
+		if (!current_is_single_threaded())
 			return -EINVAL;
 	}
 
@@ -1941,16 +1945,16 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
 	if (unshare_flags & CLONE_NEWUSER)
 		unshare_flags |= CLONE_THREAD | CLONE_FS;
 	/*
-	 * If unsharing a thread from a thread group, must also unshare vm.
-	 */
-	if (unshare_flags & CLONE_THREAD)
-		unshare_flags |= CLONE_VM;
-	/*
 	 * If unsharing vm, must also unshare signal handlers.
 	 */
 	if (unshare_flags & CLONE_VM)
 		unshare_flags |= CLONE_SIGHAND;
 	/*
+	 * If unsharing a signal handlers, must also unshare the signal queues.
+	 */
+	if (unshare_flags & CLONE_SIGHAND)
+		unshare_flags |= CLONE_THREAD;
+	/*
 	 * If unsharing namespace, must also unshare filesystem information.
 	 */
 	if (unshare_flags & CLONE_NEWNS)
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 2/2] userns,pidns: Force thread group sharing, not signal handler sharing
  2015-08-12  1:22               ` [PATCH 0/2] userns: Creation logic fixes Eric W. Biederman
  2015-08-12  1:24                 ` [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm Eric W. Biederman
@ 2015-08-12  1:25                 ` Eric W. Biederman
  2015-08-12 17:24                   ` Oleg Nesterov
  2015-08-12  6:29                 ` [PATCH 0/2] userns: Creation logic fixes Kees Cook
  2 siblings, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-12  1:25 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes


The code that places signals in signal queues computes the uids, gids,
and pids at the time the signals are enqueued.  Which means that tasks
that share signal queues must be in the same pid and user namespaces.

Sharing signal handlers is fine, but bizarre.

So make the code in fork and userns_install clearer by only testing
for what is functionally necessary.

Also update the comment in unshare about unsharing a user namespace to
be a little more explicit and make a little more sense.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/fork.c           | 8 ++++----
 kernel/user_namespace.c | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 0edc437d5bb0..c9a4373a4d96 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1273,10 +1273,9 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 
 	/*
 	 * If the new process will be in a different pid or user namespace
-	 * do not allow it to share a thread group or signal handlers or
-	 * parent with the forking task.
+	 * do not allow it to share a thread group with the forking task.
 	 */
-	if (clone_flags & CLONE_SIGHAND) {
+	if (clone_flags & CLONE_THREAD) {
 		if ((clone_flags & (CLONE_NEWUSER | CLONE_NEWPID)) ||
 		    (task_active_pid_ns(current) !=
 				current->nsproxy->pid_ns_for_children))
@@ -1940,7 +1939,8 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
 	int err;
 
 	/*
-	 * If unsharing a user namespace must also unshare the thread.
+	 * If unsharing a user namespace must also unshare the thread group
+	 * and unshare the filesystem root and working directories.
 	 */
 	if (unshare_flags & CLONE_NEWUSER)
 		unshare_flags |= CLONE_THREAD | CLONE_FS;
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 4109f8320684..f65a0a06a8c0 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -976,8 +976,8 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
 	if (user_ns == current_user_ns())
 		return -EINVAL;
 
-	/* Threaded processes may not enter a different user namespace */
-	if (atomic_read(&current->mm->mm_users) > 1)
+	/* Tasks that share a thread group must share a user namespace */
+	if (!thread_group_empty(current))
 		return -EINVAL;
 
 	if (current->fs->users != 1)
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] userns: Creation logic fixes
  2015-08-12  1:22               ` [PATCH 0/2] userns: Creation logic fixes Eric W. Biederman
  2015-08-12  1:24                 ` [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm Eric W. Biederman
  2015-08-12  1:25                 ` [PATCH 2/2] userns,pidns: Force thread group sharing, not signal handler sharing Eric W. Biederman
@ 2015-08-12  6:29                 ` Kees Cook
  2 siblings, 0 replies; 41+ messages in thread
From: Kees Cook @ 2015-08-12  6:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Oleg Nesterov, Kirill A. Shutemov, Andrew Morton, David Howells,
	LKML, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On Tue, Aug 11, 2015 at 6:22 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> So I have take a good hard stare at the problem, as well as sitting down
> and writing some test code to verify the code works the way I think it
> does.
>
> The following two patches are how I think this bit of chaos needs to be
> solved.  If folks could take a once over these patches and possibly test
> them to confirm they fix your issues I would appreciate it.
>
> Eric W. Biederman (2):
>       unshare: Unsharing a thread does not require unsharing a vm
>       userns,pidns: Force thread group sharing, not signal handler sharing.
>
>  kernel/fork.c           | 32 ++++++++++++++++++--------------
>  kernel/user_namespace.c |  4 ++--
>  2 files changed, 20 insertions(+), 16 deletions(-)

Thanks for digging into this!

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees


-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-12  1:17               ` Eric W. Biederman
@ 2015-08-12 14:40                 ` Oleg Nesterov
  2015-08-12 15:11                   ` Eric W. Biederman
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-12 14:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/11, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> >> Then why we can't simply check thread_group_empty() == T ? Why should we
> >> worry about CLONE_SIGHAND at all?
> >
> > The same for clone() actually... I forgot why we decided to check
> > CLONE_SIGHAND, iirc I suggested CLONE_THREAD initially then we switched
> > to CLONE_SIGHAND "just in case", to make it as strict as possible.
>
> I do agree that making the test be for CLONE_THREAD is safe, makes
> sense, and is less confusing than what we have now.x

Good,

> > How about the patch below?
> >
> > (note that the "or parent" part of the comment is wrong in any case).
>
> It was correct.

Yes, I know,

> You failed to removed it when you removed CLONE_PARENT
> from that test.

Cough... it was you ;) 1f7f4dde5c945f41a7abc2285be43d918029ecc5
"fork:  Allow CLONE_PARENT after setns(CLONE_NEWPID)".

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] user_ns: use correct check for single-threadedness
  2015-08-12 14:40                 ` Oleg Nesterov
@ 2015-08-12 15:11                   ` Eric W. Biederman
  0 siblings, 0 replies; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-12 15:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> On 08/11, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@redhat.com> writes:
>>
>> >> Then why we can't simply check thread_group_empty() == T ? Why should we
>> >> worry about CLONE_SIGHAND at all?
>> >
>> > The same for clone() actually... I forgot why we decided to check
>> > CLONE_SIGHAND, iirc I suggested CLONE_THREAD initially then we switched
>> > to CLONE_SIGHAND "just in case", to make it as strict as possible.
>>
>> I do agree that making the test be for CLONE_THREAD is safe, makes
>> sense, and is less confusing than what we have now.x
>
> Good,
>
>> > How about the patch below?
>> >
>> > (note that the "or parent" part of the comment is wrong in any case).
>>
>> It was correct.
>
> Yes, I know,
>
>> You failed to removed it when you removed CLONE_PARENT
>> from that test.
>
> Cough... it was you ;) 1f7f4dde5c945f41a7abc2285be43d918029ecc5
> "fork:  Allow CLONE_PARENT after setns(CLONE_NEWPID)".

So it was.  I must have tired when I read the git log last night.

Eric

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] userns,pidns: Force thread group sharing, not signal handler sharing
  2015-08-12  1:25                 ` [PATCH 2/2] userns,pidns: Force thread group sharing, not signal handler sharing Eric W. Biederman
@ 2015-08-12 17:24                   ` Oleg Nesterov
  0 siblings, 0 replies; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-12 17:24 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/11, Eric W. Biederman wrote:
>
> The code that places signals in signal queues computes the uids, gids,
> and pids at the time the signals are enqueued.  Which means that tasks
> that share signal queues must be in the same pid and user namespaces.
>
> Sharing signal handlers is fine, but bizarre.
>
> So make the code in fork and userns_install clearer by only testing
> for what is functionally necessary.
>
> Also update the comment in unshare about unsharing a user namespace to
> be a little more explicit and make a little more sense.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Oleg Nesterov <oleg@redhat.com>



But: Eric, Andrew, this means that

	user_ns-use-correct-check-for-single-threadedness.patch

in -mm tree should be dropped.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-12  1:24                 ` [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm Eric W. Biederman
@ 2015-08-12 17:48                   ` Oleg Nesterov
  2015-08-12 18:39                     ` Eric W. Biederman
  2015-08-12 19:59                     ` [PATCH v2] " Eric W. Biederman
  0 siblings, 2 replies; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-12 17:48 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/11, Eric W. Biederman wrote:
>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1866,13 +1866,17 @@ static int check_unshare_flags(unsigned long unshare_flags)
>  				CLONE_NEWUSER|CLONE_NEWPID))
>  		return -EINVAL;
>  	/*
> -	 * Not implemented, but pretend it works if there is nothing to
> -	 * unshare. Note that unsharing CLONE_THREAD or CLONE_SIGHAND
> -	 * needs to unshare vm.
> +	 * Not implemented, but pretend it works if there is nothing
> +	 * to unshare.  Note that unsharing the address space or the
> +	 * signal handlers also need to unshare the signal queues (aka
> +	 * CLONE_THREAD).
>  	 */
>  	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
> -		/* FIXME: get_task_mm() increments ->mm_users */
> -		if (atomic_read(&current->mm->mm_users) > 1)
> +		if (!thread_group_empty(current))
> +			return -EINVAL;
> +	}
> +	if (unshare_flags & CLONE_VM) {
> +		if (!current_is_single_threaded())
>  			return -EINVAL;
>  	}

OK, but then you can remove "| CLONE_VM" from the previous check...


> @@ -1941,16 +1945,16 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
>  	if (unshare_flags & CLONE_NEWUSER)
>  		unshare_flags |= CLONE_THREAD | CLONE_FS;
>  	/*
> -	 * If unsharing a thread from a thread group, must also unshare vm.
> -	 */
> -	if (unshare_flags & CLONE_THREAD)
> -		unshare_flags |= CLONE_VM;

OK,

>  	/*
> +	 * If unsharing a signal handlers, must also unshare the signal queues.
> +	 */
> +	if (unshare_flags & CLONE_SIGHAND)
> +		unshare_flags |= CLONE_THREAD;

This looks unnecessary, check_unshare_flags() checks "THREAD | SIGHAND".
And to me the comment looks misleading although I won't argue.

And in fact this doesn't look exactly right, or I am totally confused.
Shouldn't we do

	if (unshare_flags & CLONE_SIGHAND)
		unshare_flags |= CLONE_VM;

? Or change check_unshare_flags()...

Otherwise suppose that a single threaded process does clone(VM | SIGHAND)
and (say) child does sys_unshare(SIGHAND). This will wrongly succeed afaics.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-12 17:48                   ` Oleg Nesterov
@ 2015-08-12 18:39                     ` Eric W. Biederman
  2015-08-13 12:55                       ` Oleg Nesterov
  2015-08-12 19:59                     ` [PATCH v2] " Eric W. Biederman
  1 sibling, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-12 18:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> On 08/11, Eric W. Biederman wrote:
>>
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -1866,13 +1866,17 @@ static int check_unshare_flags(unsigned long unshare_flags)
>>  				CLONE_NEWUSER|CLONE_NEWPID))
>>  		return -EINVAL;
>>  	/*
>> -	 * Not implemented, but pretend it works if there is nothing to
>> -	 * unshare. Note that unsharing CLONE_THREAD or CLONE_SIGHAND
>> -	 * needs to unshare vm.
>> +	 * Not implemented, but pretend it works if there is nothing
>> +	 * to unshare.  Note that unsharing the address space or the
>> +	 * signal handlers also need to unshare the signal queues (aka
>> +	 * CLONE_THREAD).
>>  	 */
>>  	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
>> -		/* FIXME: get_task_mm() increments ->mm_users */
>> -		if (atomic_read(&current->mm->mm_users) > 1)
>> +		if (!thread_group_empty(current))
>> +			return -EINVAL;
>> +	}
>> +	if (unshare_flags & CLONE_VM) {
>> +		if (!current_is_single_threaded())
>>  			return -EINVAL;
>>  	}
>
> OK, but then you can remove "| CLONE_VM" from the previous check...

As an optimization, but I don't think anything cares enough for the
optimization to be worth the confusion.

>> @@ -1941,16 +1945,16 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
>>  	if (unshare_flags & CLONE_NEWUSER)
>>  		unshare_flags |= CLONE_THREAD | CLONE_FS;
>>  	/*
>> -	 * If unsharing a thread from a thread group, must also unshare vm.
>> -	 */
>> -	if (unshare_flags & CLONE_THREAD)
>> -		unshare_flags |= CLONE_VM;
>
> OK,
>
>>  	/*
>> +	 * If unsharing a signal handlers, must also unshare the signal queues.
>> +	 */
>> +	if (unshare_flags & CLONE_SIGHAND)
>> +		unshare_flags |= CLONE_THREAD;
>
> This looks unnecessary, check_unshare_flags() checks "THREAD | SIGHAND".
> And to me the comment looks misleading although I won't argue.

I absolutely can not understand this code if we jump 5 steps ahead
and optimize out the individual dependencies, and try for a flattened
dependency tree instead.  I can validate the individual dependencies
from first principles.

If we jump several steps ahead I can not validate the individual
dependencies.  

It really is important to say if you want your own private struct
sighand_struct, you also need to have your own private struct
signal_struct.

> And in fact this doesn't look exactly right, or I am totally confused.
> Shouldn't we do
>
> 	if (unshare_flags & CLONE_SIGHAND)
> 		unshare_flags |= CLONE_VM;

Nope.  The backward definitions of the flags in unshare has gotten you.
CLONE_SIGHAND means that you want a struct sighand_struct with a count
of 1.  Nothing about a sighand_struct with a count of 1 implies or
requires mm_users == 1.  clone can quite happily create those.

> ? Or change check_unshare_flags()...
>
> Otherwise suppose that a single threaded process does clone(VM | SIGHAND)
> and (say) child does sys_unshare(SIGHAND). This will wrongly succeed
> afaics.

Why would it be wrong to succeed in that case?  struct sighand_struct
has a count of 1.  unshare(CLONE_SIGHAND) requests a sighand_struct with
a count of 1.

I expect part of the confusion is the code in unshare has been wrongly
requiring an unshared vm to support a sighand_struct with a count of 1
since the day the code was merged.

Ugh. This patch has a bug where we don't check for sighand->count == 1.

clone(VM)  ---> mm_users = 2 sighand->count == 1 signal->live == 1

clone(VM|SIGHAND) --> mm_users = 2 sighand->count == 2 signal->live == 1

unshare(SIGHAND) needs to guarantee that when it returns sighand->count == 1.
So unshare(SIGHAND) needs to test for sighand->count == 1.

Ugh.  Untangling this ancient mess is a pain.  One more pass at this
patch it seems.

Eric

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-12 17:48                   ` Oleg Nesterov
  2015-08-12 18:39                     ` Eric W. Biederman
@ 2015-08-12 19:59                     ` Eric W. Biederman
  2015-08-13 12:57                       ` Oleg Nesterov
  1 sibling, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-12 19:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes


In the logic in the initial commit of unshare made creating a new
thread group for a process, contingent upon creating a new memory
address space for that process.  That is wrong.  Two separate
processes in different thread groups can share a memory address space
and clone allows creation of such proceses.

This is significant because it was observed that mm_users > 1 does not
mean that a process is multi-threaded, as reading /proc/PID/maps
temporarily increments mm_users, which allows other processes to
(accidentally) interfere with unshare() calls.

Correct the check in check_unshare_flags() to test for
!thread_group_empty() for CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM.
For sighand->count > 1 for CLONE_SIGHAND and CLONE_VM.
For !current_is_single_threaded instead of mm_users > 1 for CLONE_VM.

By using the correct checks in unshare this removes the possibility of
an accidental denial of service attack.

Additionally using the correct checks in unshare ensures that only an
explicit unshare(CLONE_VM) can possibly trigger the slow path of
current_is_single_threaded().  As an explict unshare(CLONE_VM) is
pointless it is not expected there are many applications that make
that call.

Cc: stable@vger.kernel.org
Fixes: b2e0d98705e60e45bbb3c0032c48824ad7ae0704 userns: Implement unshare of the user namespace
Reported-by: Ricky Zhou <rickyz@chromium.org>
Reported-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

Reposting this change with the addition of the needed test for
sighand->count > 1.

 kernel/fork.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 1bfefc6f96a4..d544ae97f999 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1866,13 +1866,21 @@ static int check_unshare_flags(unsigned long unshare_flags)
 				CLONE_NEWUSER|CLONE_NEWPID))
 		return -EINVAL;
 	/*
-	 * Not implemented, but pretend it works if there is nothing to
-	 * unshare. Note that unsharing CLONE_THREAD or CLONE_SIGHAND
-	 * needs to unshare vm.
+	 * Not implemented, but pretend it works if there is nothing
+	 * to unshare.  Note that unsharing the address space or the
+	 * signal handlers also need to unshare the signal queues (aka
+	 * CLONE_THREAD).
 	 */
 	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
-		/* FIXME: get_task_mm() increments ->mm_users */
-		if (atomic_read(&current->mm->mm_users) > 1)
+		if (!thread_group_empty(current))
+			return -EINVAL;
+	}
+	if (unshare_flags & (CLONE_SIGHAND | CLONE_VM)) {
+		if (atomic_read(&current->sighand->count) > 1)
+			return -EINVAL;
+	}
+	if (unshare_flags & CLONE_VM) {
+		if (!current_is_single_threaded())
 			return -EINVAL;
 	}
 
@@ -1941,16 +1949,16 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
 	if (unshare_flags & CLONE_NEWUSER)
 		unshare_flags |= CLONE_THREAD | CLONE_FS;
 	/*
-	 * If unsharing a thread from a thread group, must also unshare vm.
-	 */
-	if (unshare_flags & CLONE_THREAD)
-		unshare_flags |= CLONE_VM;
-	/*
 	 * If unsharing vm, must also unshare signal handlers.
 	 */
 	if (unshare_flags & CLONE_VM)
 		unshare_flags |= CLONE_SIGHAND;
 	/*
+	 * If unsharing a signal handlers, must also unshare the signal queues.
+	 */
+	if (unshare_flags & CLONE_SIGHAND)
+		unshare_flags |= CLONE_THREAD;
+	/*
 	 * If unsharing namespace, must also unshare filesystem information.
 	 */
 	if (unshare_flags & CLONE_NEWNS)
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-12 18:39                     ` Eric W. Biederman
@ 2015-08-13 12:55                       ` Oleg Nesterov
  2015-08-13 15:38                         ` Eric W. Biederman
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-13 12:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Let me first say that CLONE_SIGHAND must die, I think ;) and perhaps
even sighand_struct... I am wondering if we can add something like

	if ((clone_flags & (CLONE_THREAD | CLONE_SIGHAND)) == CLONE_SIGHAND)
		pr_info("You are crazy, please report this to lkml\n");

into copy_process().

On 08/12, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > On 08/11, Eric W. Biederman wrote:
> >>  	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
> >> -		/* FIXME: get_task_mm() increments ->mm_users */
> >> -		if (atomic_read(&current->mm->mm_users) > 1)
> >> +		if (!thread_group_empty(current))
> >> +			return -EINVAL;
> >> +	}
> >> +	if (unshare_flags & CLONE_VM) {
> >> +		if (!current_is_single_threaded())
> >>  			return -EINVAL;
> >>  	}
> >
> > OK, but then you can remove "| CLONE_VM" from the previous check...
>
> As an optimization, but I don't think anything cares enough for the
> optimization to be worth the confusion.

current_is_single_threaded() checks task->signal->live at the start,
so there is no optimization. But I won't argue, this doesn't hurt.

> >>  	/*
> >> +	 * If unsharing a signal handlers, must also unshare the signal queues.
> >> +	 */
> >> +	if (unshare_flags & CLONE_SIGHAND)
> >> +		unshare_flags |= CLONE_THREAD;
> >
> > This looks unnecessary, check_unshare_flags() checks "THREAD | SIGHAND".
> > And to me the comment looks misleading although I won't argue.
>
> I absolutely can not understand this code if we jump 5 steps ahead
> and optimize out the individual dependencies, and try for a flattened
> dependency tree instead.  I can validate the individual dependencies
> from first principles.
>
> If we jump several steps ahead I can not validate the individual
> dependencies.

OK,

> > And in fact this doesn't look exactly right, or I am totally confused.
> > Shouldn't we do
> >
> > 	if (unshare_flags & CLONE_SIGHAND)
> > 		unshare_flags |= CLONE_VM;
>
> Nope.  The backward definitions of the flags in unshare has gotten you.

See below,

> CLONE_SIGHAND means that you want a struct sighand_struct with a count
> of 1.

This is (almost) true,

> Nothing about a sighand_struct with a count of 1 implies or
> requires mm_users == 1.  clone can quite happily create those.

See

	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))

in copy_process(). So if you have a shared sighand_struct, your ->mm
is also shared, current_is_single_threaded() will notice this.

> > Otherwise suppose that a single threaded process does clone(VM | SIGHAND)
> > and (say) child does sys_unshare(SIGHAND). This will wrongly succeed
> > afaics.
>
> Why would it be wrong to succeed in that case?  struct sighand_struct
> has a count of 1.

How that? clone(VM | SIGHAND) will share ->sighand and increment its
count.

> unshare(CLONE_SIGHAND) requests a sighand_struct with
> a count of 1.

Exactly, that is why it is wrong to succeed.

> unshare(SIGHAND) needs to guarantee that when it returns sighand->count == 1.
> So unshare(SIGHAND) needs to test for sighand->count == 1.

Oh, I do not think we should check sighand->count. This can lead to
the same problem we have with the current current->mm->mm_users check.

Most probably today nobody increments sighand->count (I didn't even
try to verify). But this is possible, and I saw the code which did
this to pin ->sighand...

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-12 19:59                     ` [PATCH v2] " Eric W. Biederman
@ 2015-08-13 12:57                       ` Oleg Nesterov
  2015-08-13 16:01                         ` Eric W. Biederman
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-13 12:57 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/12, Eric W. Biederman wrote:
>
> +	if (unshare_flags & (CLONE_SIGHAND | CLONE_VM)) {
> +		if (atomic_read(&current->sighand->count) > 1)
> +			return -EINVAL;
> +	}

I am still not sure we want this... please the the previous email.

But perhaps I missed something.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-13 12:55                       ` Oleg Nesterov
@ 2015-08-13 15:38                         ` Eric W. Biederman
  2015-08-13 16:17                           ` Oleg Nesterov
  0 siblings, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-13 15:38 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> Let me first say that CLONE_SIGHAND must die, I think ;) and perhaps
> even sighand_struct... I am wondering if we can add something like
>
> 	if ((clone_flags & (CLONE_THREAD | CLONE_SIGHAND)) == CLONE_SIGHAND)
> 		pr_info("You are crazy, please report this to lkml\n");
>
> into copy_process().

The only way killing CLONE_SIGHAND would be viable would be with a
config option.  There are entire generations of linux where libpthreads
used this before CLONE_THREAD was implemented.  Now perhaps no one cares
anymore, but there are a lot of historic binairies that used it, even to
the point where I know of at least one user outside of glibc's pthread
implementation.

> On 08/12, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@redhat.com> writes:
>>
>> > On 08/11, Eric W. Biederman wrote:
>> >>  	if (unshare_flags & (CLONE_THREAD | CLONE_SIGHAND | CLONE_VM)) {
>> >> -		/* FIXME: get_task_mm() increments ->mm_users */
>> >> -		if (atomic_read(&current->mm->mm_users) > 1)
>> >> +		if (!thread_group_empty(current))
>> >> +			return -EINVAL;
>> >> +	}
>> >> +	if (unshare_flags & CLONE_VM) {
>> >> +		if (!current_is_single_threaded())
>> >>  			return -EINVAL;
>> >>  	}
>> >
>> > OK, but then you can remove "| CLONE_VM" from the previous check...
>>
>> As an optimization, but I don't think anything cares enough for the
>> optimization to be worth the confusion.
>
> current_is_single_threaded() checks task->signal->live at the start,
> so there is no optimization. But I won't argue, this doesn't hurt.

>> >>  	/*
>> >> +	 * If unsharing a signal handlers, must also unshare the signal queues.
>> >> +	 */
>> >> +	if (unshare_flags & CLONE_SIGHAND)
>> >> +		unshare_flags |= CLONE_THREAD;
>> >
>> > This looks unnecessary, check_unshare_flags() checks "THREAD | SIGHAND".
>> > And to me the comment looks misleading although I won't argue.
>>
>> I absolutely can not understand this code if we jump 5 steps ahead
>> and optimize out the individual dependencies, and try for a flattened
>> dependency tree instead.  I can validate the individual dependencies
>> from first principles.
>>
>> If we jump several steps ahead I can not validate the individual
>> dependencies.
>
> OK,
>
>> > And in fact this doesn't look exactly right, or I am totally confused.
>> > Shouldn't we do
>> >
>> > 	if (unshare_flags & CLONE_SIGHAND)
>> > 		unshare_flags |= CLONE_VM;
>>
>> Nope.  The backward definitions of the flags in unshare has gotten you.
>
> See below,
>
>> CLONE_SIGHAND means that you want a struct sighand_struct with a count
>> of 1.
>
> This is (almost) true,
>
>> Nothing about a sighand_struct with a count of 1 implies or
>> requires mm_users == 1.  clone can quite happily create those.
>
> See
>
> 	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
>
> in copy_process(). So if you have a shared sighand_struct, your ->mm
> is also shared, current_is_single_threaded() will notice this.

Yes.  A shared sighand_struct will have a shared ->mm.  But a private
sighand_struct with count == 1 may also have a shared ->mm.

>> > Otherwise suppose that a single threaded process does clone(VM | SIGHAND)
>> > and (say) child does sys_unshare(SIGHAND). This will wrongly succeed
>> > afaics.
>>
>> Why would it be wrong to succeed in that case?  struct sighand_struct
>> has a count of 1.
>
> How that? clone(VM | SIGHAND) will share ->sighand and increment its
> count.
>
>> unshare(CLONE_SIGHAND) requests a sighand_struct with
>> a count of 1.
>
> Exactly, that is why it is wrong to succeed.

Now that I am clear about what you are talking about I agree with you.

My apologies I clearly misread what you were saying yesterday.

>> unshare(SIGHAND) needs to guarantee that when it returns sighand->count == 1.
>> So unshare(SIGHAND) needs to test for sighand->count == 1.
>
> Oh, I do not think we should check sighand->count. This can lead to
> the same problem we have with the current current->mm->mm_users check.
>
> Most probably today nobody increments sighand->count (I didn't even
> try to verify). But this is possible, and I saw the code which did
> this to pin ->sighand...

I have verified that copy_sighand is the only place in the kernel where
we increment sighand->count today.  de_thread in fs/exec.c even seems to
rely on that.

So while I agree with you that the sighand->count could suffer a similar
fate as mm_users it does not.

Eric

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-13 12:57                       ` Oleg Nesterov
@ 2015-08-13 16:01                         ` Eric W. Biederman
  2015-08-13 16:30                           ` Oleg Nesterov
  0 siblings, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-13 16:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> On 08/12, Eric W. Biederman wrote:
>>
>> +	if (unshare_flags & (CLONE_SIGHAND | CLONE_VM)) {
>> +		if (atomic_read(&current->sighand->count) > 1)
>> +			return -EINVAL;
>> +	}
>
> I am still not sure we want this... please the the previous email.

Reading your other email I did not see why you thought this check was
unnecessary.

> But perhaps I missed something.

In short:
clone(VM) --> mm_users > 1 && sighand_struct->count == 1
followed by:
unshare(SIGHAND)
the unshare should succeed.

Meanwhile:
clone(VM|SIGHAND) --> mm_users > 1 && sighand_struct->count > 1
followed by:
unshare(SIGHAND)
the unshare should fail.

I actually tested both of these cases and my patch works properly.

Not that I expect that there is anyone actually calling unshare(SIGHAND)
but unless we figure out how to remove the code, the code should
function correctly.  If for no other reason than to not confuse people
reading and maintaining the code.

Further I have audited the callers and we don't have anyone playing
games with sighand->count.  There is an implementation of unsharing the
sighand_struct in dethread in fs/exec.c that relies on this.

Other possible tests such as current_is_single_threaded() and
thread_group_empty() fail at the wrong times to be used.

So I think it is clear that testing for a private sighand_struct is
necessaring and testing sighand->count is a perfectly fine test.

Eric


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-13 15:38                         ` Eric W. Biederman
@ 2015-08-13 16:17                           ` Oleg Nesterov
  2015-08-13 16:27                             ` Eric W. Biederman
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-13 16:17 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/13, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > Let me first say that CLONE_SIGHAND must die, I think ;) and perhaps
> > even sighand_struct... I am wondering if we can add something like
> >
> > 	if ((clone_flags & (CLONE_THREAD | CLONE_SIGHAND)) == CLONE_SIGHAND)
> > 		pr_info("You are crazy, please report this to lkml\n");
> >
> > into copy_process().
>
> The only way killing CLONE_SIGHAND would be viable would be with a
> config option.  There are entire generations of linux where libpthreads
> used this before CLONE_THREAD was implemented.  Now perhaps no one cares
> anymore, but there are a lot of historic binairies that used it, even to
> the point where I know of at least one user outside of glibc's pthread
> implementation.

Heh. so we still need to keep it. Thanks.

> Yes.  A shared sighand_struct will have a shared ->mm.  But a private
> sighand_struct with count == 1 may also have a shared ->mm.

Yes sure. This just means that we can check current_is_single_threaded()
if CLONE_SIGHAND | CLONE_VM, signal->count check can be avoided.

> > Oh, I do not think we should check sighand->count. This can lead to
> > the same problem we have with the current current->mm->mm_users check.
> >
> > Most probably today nobody increments sighand->count (I didn't even
> > try to verify). But this is possible, and I saw the code which did
> > this to pin ->sighand...
>
> I have verified that copy_sighand is the only place in the kernel where
> we increment sighand->count today.

OK,

> de_thread in fs/exec.c even seems to
> rely on that.

Not really. This is just optimization, de_thread() could change ->sighand
unconditionally.

> So while I agree with you that the sighand->count could suffer a similar
> fate as mm_users it does not.

Ignoring the out-of-tree code ;)

Nevermind, I won't really argue, this all is mostly cosmetic. And perhaps
this sighand->count check in check_unshare_flags() makes this code look
a bit better / more understandable.

Still. How about the trivial *-fix.patch for -mm which simply does

	-	if (unshare_flags & (CLONE_SIGHAND | CLONE_VM)) {
	+	if (unshare_flags & CLONE_SIGHAND) {
			if (atomic_read(&current->sighand->count) > 1)
				return -EINVAL;
		}

again, this doesn't really matter. To this "| CLONE_VM" looks
very confusing to me.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-13 16:17                           ` Oleg Nesterov
@ 2015-08-13 16:27                             ` Eric W. Biederman
  2015-08-13 16:50                               ` Oleg Nesterov
  0 siblings, 1 reply; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-13 16:27 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> On 08/13, Eric W. Biederman wrote:
>> The only way killing CLONE_SIGHAND would be viable would be with a
>> config option.  There are entire generations of linux where libpthreads
>> used this before CLONE_THREAD was implemented.  Now perhaps no one cares
>> anymore, but there are a lot of historic binairies that used it, even to
>> the point where I know of at least one user outside of glibc's pthread
>> implementation.
>
> Heh. so we still need to keep it. Thanks.

Pretty much. It is possible to make this stuff go away when people stop
caring but it is a long process.  I think I have almost killed
sys_sysctl.  It seems to be disabled in most distributions.

>> Yes.  A shared sighand_struct will have a shared ->mm.  But a private
>> sighand_struct with count == 1 may also have a shared ->mm.
>
> Yes sure. This just means that we can check current_is_single_threaded()
> if CLONE_SIGHAND | CLONE_VM, signal->count check can be avoided.

As I pointed out in my follow we really can't because there is a case
where mm_users > 1 and sighand_count == 1.  In which case using
current_is_single_threaded can cause unshare(SIGHAND) to fail.

>> So while I agree with you that the sighand->count could suffer a similar
>> fate as mm_users it does not.
>
> Ignoring the out-of-tree code ;)
>
> Nevermind, I won't really argue, this all is mostly cosmetic. And perhaps
> this sighand->count check in check_unshare_flags() makes this code look
> a bit better / more understandable.
>
> Still. How about the trivial *-fix.patch for -mm which simply does
>
> 	-	if (unshare_flags & (CLONE_SIGHAND | CLONE_VM)) {
> 	+	if (unshare_flags & CLONE_SIGHAND) {
> 			if (atomic_read(&current->sighand->count) > 1)
> 				return -EINVAL;
> 		}
>
> again, this doesn't really matter. To this "| CLONE_VM" looks
> very confusing to me.

Definitely cosmetic.  This was my preserving of your flattened test
argument in around mm_users > 1  in check_unshare_flags().

It is unncessary given that we add CLONE_SIGHAND when CLONE_VM.
But to have a private mm_struct you definitely need a sighand_struct.

In the sense of document when these tests apply I think it makes a
teensy bit of sense to have the CLONE_VM there.  But if you want to send
me a cosmetic patch that removes that I will add it to my tree, with the
other two patches.

Eric


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-13 16:01                         ` Eric W. Biederman
@ 2015-08-13 16:30                           ` Oleg Nesterov
  2015-08-13 16:39                             ` Eric W. Biederman
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-13 16:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/13, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > On 08/12, Eric W. Biederman wrote:
> >>
> >> +	if (unshare_flags & (CLONE_SIGHAND | CLONE_VM)) {
> >> +		if (atomic_read(&current->sighand->count) > 1)
> >> +			return -EINVAL;
> >> +	}
> >
> > I am still not sure we want this... please the the previous email.
>
> Reading your other email I did not see why you thought this check was
> unnecessary.
>
> > But perhaps I missed something.
>
> In short:
> clone(VM) --> mm_users > 1 && sighand_struct->count == 1
> followed by:
> unshare(SIGHAND)
> the unshare should succeed.
>
> Meanwhile:
> clone(VM|SIGHAND) --> mm_users > 1 && sighand_struct->count > 1
> followed by:
> unshare(SIGHAND)
> the unshare should fail.

Yes, yes, yes.

But once again, I meant we can remove this sighand->count check
if unshare(SIGHAND) checks current_is_single_threaded(). That is
why I suggested to do

	if (unshare_flags & CLONE_SIGHAND)
		unshare_flags |= CLONE_VM;

in sys_unshare(), or change check_unshare_flags() to check
"unshare_flags & (CLONE_VM | CLONE_SIGHAND)" before
current_is_single_threaded().



Damn. And this discussion makes me think that another cleanup makes
sense too. Can't we move all these unshare_flags manipulations into
check_unshare_flags? So that sys_unshare() will only do

	err = check_unshare_flags(&unshare_flags);

and the reader of this code won't need to read 2 functions to understand
whats going on.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-13 16:30                           ` Oleg Nesterov
@ 2015-08-13 16:39                             ` Eric W. Biederman
  0 siblings, 0 replies; 41+ messages in thread
From: Eric W. Biederman @ 2015-08-13 16:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

Oleg Nesterov <oleg@redhat.com> writes:

> On 08/13, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@redhat.com> writes:
>>
>> > On 08/12, Eric W. Biederman wrote:
>> >>
>> >> +	if (unshare_flags & (CLONE_SIGHAND | CLONE_VM)) {
>> >> +		if (atomic_read(&current->sighand->count) > 1)
>> >> +			return -EINVAL;
>> >> +	}
>> >
>> > I am still not sure we want this... please the the previous email.
>>
>> Reading your other email I did not see why you thought this check was
>> unnecessary.
>>
>> > But perhaps I missed something.
>>
>> In short:
>> clone(VM) --> mm_users > 1 && sighand_struct->count == 1
>> followed by:
>> unshare(SIGHAND)
>> the unshare should succeed.
>>
>> Meanwhile:
>> clone(VM|SIGHAND) --> mm_users > 1 && sighand_struct->count > 1
>> followed by:
>> unshare(SIGHAND)
>> the unshare should fail.
>
> Yes, yes, yes.
>
> But once again, I meant we can remove this sighand->count check
> if unshare(SIGHAND) checks current_is_single_threaded(). That is
> why I suggested to do
>
> 	if (unshare_flags & CLONE_SIGHAND)
> 		unshare_flags |= CLONE_VM;
>
> in sys_unshare(), or change check_unshare_flags() to check
> "unshare_flags & (CLONE_VM | CLONE_SIGHAND)" before
> current_is_single_threaded().

See the two cases above that change to unshare_flags will make
unshare(SIGHAND) fail when sighand_struct->count == 1.

Which is fundamentally wrong.

> Damn. And this discussion makes me think that another cleanup makes
> sense too. Can't we move all these unshare_flags manipulations into
> check_unshare_flags? So that sys_unshare() will only do
>
> 	err = check_unshare_flags(&unshare_flags);
>
> and the reader of this code won't need to read 2 functions to understand
> whats going on.

Eric


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-13 16:27                             ` Eric W. Biederman
@ 2015-08-13 16:50                               ` Oleg Nesterov
  2015-08-14 17:59                                 ` Oleg Nesterov
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-13 16:50 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/13, Eric W. Biederman wrote:
>
> In the sense of document when these tests apply I think it makes a
> teensy bit of sense to have the CLONE_VM there.  But if you want to send
> me a cosmetic patch that removes that I will add it to my tree, with the
> other two patches.

Will do ;)

Eric, I need to run away, I'll try to answer other parts of our confusing
discussion tomorrow.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm
  2015-08-13 16:50                               ` Oleg Nesterov
@ 2015-08-14 17:59                                 ` Oleg Nesterov
  0 siblings, 0 replies; 41+ messages in thread
From: Oleg Nesterov @ 2015-08-14 17:59 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill A. Shutemov, Andrew Morton, Kees Cook, David Howells,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Kirill A. Shutemov,
	Rik van Riel, Vladimir Davydov, Ricky Zhou, Julien Tinnes

On 08/13, Oleg Nesterov wrote:
>
> On 08/13, Eric W. Biederman wrote:
> >
> > In the sense of document when these tests apply I think it makes a
> > teensy bit of sense to have the CLONE_VM there.  But if you want to send
> > me a cosmetic patch that removes that I will add it to my tree, with the
> > other two patches.
>
> Will do ;)

Yes, I still think it is pointless to check sighand->count if CLONE_VM.

> Eric, I need to run away, I'll try to answer other parts of our confusing
> discussion tomorrow.

No, lest stop it.

Yes, I was wrong, we can't avoid sighand->count check. Somehow I absolutely
forgot that we also need to ensure that unshare(SIGHAND) can't wrongly _fail_.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2015-08-14 18:02 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-28 17:15 [PATCH] user_ns: use correct check for single-threadedness Kees Cook
2015-07-28 18:02 ` Rik van Riel
2015-07-28 18:17 ` Eric W. Biederman
2015-07-28 20:55   ` Ricky Zhou
2015-07-28 21:01     ` Kees Cook
2015-08-05 18:13       ` Eric W. Biederman
2015-08-05 19:40         ` Kees Cook
2015-07-28 21:35 ` Andrew Morton
2015-07-28 21:50   ` Kees Cook
2015-07-28 22:11   ` Kirill A. Shutemov
2015-08-05 11:38     ` Ingo Molnar
2015-08-05 11:53       ` Kirill A. Shutemov
2015-08-05 13:13         ` Ricky Zhou
2015-08-05 17:23     ` Oleg Nesterov
2015-08-05 18:00       ` Eric W. Biederman
2015-08-05 18:52         ` Eric W. Biederman
2015-08-06 13:06           ` Oleg Nesterov
2015-08-06 13:44             ` Oleg Nesterov
2015-08-12  1:17               ` Eric W. Biederman
2015-08-12 14:40                 ` Oleg Nesterov
2015-08-12 15:11                   ` Eric W. Biederman
2015-08-12  1:22               ` [PATCH 0/2] userns: Creation logic fixes Eric W. Biederman
2015-08-12  1:24                 ` [PATCH 1/2] unshare: Unsharing a thread does not require unsharing a vm Eric W. Biederman
2015-08-12 17:48                   ` Oleg Nesterov
2015-08-12 18:39                     ` Eric W. Biederman
2015-08-13 12:55                       ` Oleg Nesterov
2015-08-13 15:38                         ` Eric W. Biederman
2015-08-13 16:17                           ` Oleg Nesterov
2015-08-13 16:27                             ` Eric W. Biederman
2015-08-13 16:50                               ` Oleg Nesterov
2015-08-14 17:59                                 ` Oleg Nesterov
2015-08-12 19:59                     ` [PATCH v2] " Eric W. Biederman
2015-08-13 12:57                       ` Oleg Nesterov
2015-08-13 16:01                         ` Eric W. Biederman
2015-08-13 16:30                           ` Oleg Nesterov
2015-08-13 16:39                             ` Eric W. Biederman
2015-08-12  1:25                 ` [PATCH 2/2] userns,pidns: Force thread group sharing, not signal handler sharing Eric W. Biederman
2015-08-12 17:24                   ` Oleg Nesterov
2015-08-12  6:29                 ` [PATCH 0/2] userns: Creation logic fixes Kees Cook
2015-08-06 14:35           ` [PATCH] user_ns: use correct check for single-threadedness Oleg Nesterov
2015-08-06 21:16             ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).