From: "Serge E. Hallyn" <serge@hallyn.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Linux Containers <containers@lists.linux-foundation.org>,
"Serge E. Hallyn" <serge@hallyn.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Vasily Kulikov <segoon@openwall.com>
Subject: Re: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
Date: Sat, 26 Jan 2013 20:58:05 +0000 [thread overview]
Message-ID: <20130126205805.GA11274@mail.hallyn.com> (raw)
In-Reply-To: <877gn0it3t.fsf@xmission.com>
Quoting Eric W. Biederman (ebiederm@xmission.com):
>
> When freeing a deeply nested user namespace free_user_ns calls
> put_user_ns on it's parent which may in turn call free_user_ns again.
> When -fno-optimize-sibling-calls is passed to gcc one stack frame per
> user namespace is left on the stack, potentially overflowing the
> kernel stack. CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
> so we can't count on gcc to optimize this code.
>
> Remove struct kref and use a plain atomic_t. Making the code more
> flexible and easier to comprehend. Make the loop in free_user_ns
> explict to guarantee that the stack does not overflow with
> CONFIG_FRAME_POINTER enabled.
>
> I have tested this fix with a simple program that uses unshare to
> create a deeply nested user namespace structure and then calls exit.
> With 1000 nesteuser namespaces before this change running my test
> program causes the kernel to die a horrible death. With 10,000,000
> nested user namespaces after this change my test program runs to
> completion and causes no harm.
>
> Pointed-out-by: Vasily Kulikov <segoon@openwall.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> include/linux/user_namespace.h | 10 +++++-----
> kernel/user.c | 4 +---
> kernel/user_namespace.c | 17 +++++++++--------
> 3 files changed, 15 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index b9bd2e6..4ce0093 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -21,7 +21,7 @@ struct user_namespace {
> struct uid_gid_map uid_map;
> struct uid_gid_map gid_map;
> struct uid_gid_map projid_map;
> - struct kref kref;
> + atomic_t count;
> struct user_namespace *parent;
> kuid_t owner;
> kgid_t group;
> @@ -35,18 +35,18 @@ extern struct user_namespace init_user_ns;
> static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> {
> if (ns)
> - kref_get(&ns->kref);
> + atomic_inc(&ns->count);
> return ns;
> }
>
> extern int create_user_ns(struct cred *new);
> extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
> -extern void free_user_ns(struct kref *kref);
> +extern void free_user_ns(struct user_namespace *ns);
>
> static inline void put_user_ns(struct user_namespace *ns)
> {
> - if (ns)
> - kref_put(&ns->kref, free_user_ns);
> + if (ns && atomic_dec_and_test(&ns->count))
> + free_user_ns(ns);
> }
>
> struct seq_operations;
> diff --git a/kernel/user.c b/kernel/user.c
> index 33acb5e..57ebfd4 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -47,9 +47,7 @@ struct user_namespace init_user_ns = {
> .count = 4294967295U,
> },
> },
> - .kref = {
> - .refcount = ATOMIC_INIT(3),
> - },
> + .count = ATOMIC_INIT(3),
> .owner = GLOBAL_ROOT_UID,
> .group = GLOBAL_ROOT_GID,
> .proc_inum = PROC_USER_INIT_INO,
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 2b042c4..24f8ec3 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -78,7 +78,7 @@ int create_user_ns(struct cred *new)
> return ret;
> }
>
> - kref_init(&ns->kref);
> + atomic_set(&ns->count, 1);
> /* Leave the new->user_ns reference with the new user namespace. */
> ns->parent = parent_ns;
> ns->owner = owner;
> @@ -104,15 +104,16 @@ int unshare_userns(unsigned long unshare_flags, struct cred **new_cred)
> return create_user_ns(cred);
> }
>
> -void free_user_ns(struct kref *kref)
> +void free_user_ns(struct user_namespace *ns)
> {
> - struct user_namespace *parent, *ns =
> - container_of(kref, struct user_namespace, kref);
> + struct user_namespace *parent;
>
> - parent = ns->parent;
> - proc_free_inum(ns->proc_inum);
> - kmem_cache_free(user_ns_cachep, ns);
> - put_user_ns(parent);
> + do {
> + parent = ns->parent;
> + proc_free_inum(ns->proc_inum);
> + kmem_cache_free(user_ns_cachep, ns);
> + ns = parent;
> + } while (atomic_dec_and_test(&parent->count));
> }
> EXPORT_SYMBOL(free_user_ns);
>
> --
> 1.7.5.4
next prev parent reply other threads:[~2013-01-26 20:56 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-26 2:15 [PATCH review 0/6] miscelaneous user namespace patches Eric W. Biederman
2013-01-26 2:19 ` [PATCH review 1/6] userns: Avoid recursion in put_user_ns Eric W. Biederman
2013-01-26 20:58 ` Serge E. Hallyn [this message]
2013-01-28 14:51 ` Vasily Kulikov
2013-01-28 16:34 ` Eric W. Biederman
2013-01-26 2:21 ` [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap Eric W. Biederman
2013-01-26 21:08 ` Serge E. Hallyn
2013-01-28 14:28 ` Aristeu Rozanski
2013-01-28 14:41 ` Lord Glauber Costa of Sealand
2013-01-28 15:12 ` Aristeu Rozanski
2013-01-26 2:22 ` [PATCH review 3/6] userns: Recommend use of memory control groups Eric W. Biederman
2013-01-26 21:13 ` Serge E. Hallyn
2013-01-27 6:19 ` Eric W. Biederman
2013-01-28 7:37 ` Lord Glauber Costa of Sealand
2013-01-28 7:50 ` Lord Glauber Costa of Sealand
2013-01-28 8:14 ` Eric W. Biederman
2013-01-28 8:22 ` Lord Glauber Costa of Sealand
2013-01-28 16:19 ` Eric W. Biederman
2013-01-28 16:37 ` Lord Glauber Costa of Sealand
2013-01-28 17:18 ` Eric W. Biederman
2013-01-28 8:05 ` Eric W. Biederman
2013-01-26 2:23 ` [PATCH review 4/6] userns: Allow the userns root to mount of devpts Eric W. Biederman
2013-01-26 21:22 ` Serge E. Hallyn
2013-01-26 2:26 ` [PATCH review 5/6] userns: Allow the userns root to mount ramfs Eric W. Biederman
2013-01-26 21:29 ` Serge E. Hallyn
2013-01-27 6:09 ` Eric W. Biederman
2013-01-27 18:23 ` Serge E. Hallyn
2013-01-27 18:23 ` Serge E. Hallyn
2013-01-26 2:26 ` [PATCH review 6/6] userns: Allow the userns root to mount tmpfs Eric W. Biederman
2013-01-27 18:23 ` Serge E. Hallyn
2013-01-28 1:28 ` Gao feng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130126205805.GA11274@mail.hallyn.com \
--to=serge@hallyn.com \
--cc=containers@lists.linux-foundation.org \
--cc=ebiederm@xmission.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=segoon@openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).