linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2 v2] userns: show current values of user namespace counters
@ 2016-08-15 20:10 Andrei Vagin
  2016-08-15 20:10 ` [PATCH 1/2] kernel: " Andrei Vagin
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Andrei Vagin @ 2016-08-15 20:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: containers, linux-kernel, Andrei Vagin, Serge Hallyn, Kees Cook

Recently Eric added user namespace counters.  User namespace counters is
a feature that allows to limit the number of various kernel objects a
user can create. These limits are set via /proc/sys/user/ sysctls on a
per user namespace basis and are applicable to all users in that
namespace.

User namespace counters are not in the upstream tree yet,
you can find them in Eric's tree:
https://git.kernel.org/cgit/linux/kernel/git/ebiederm/user-namespace.git/log/?h=for-testing

This patch adds /proc/<pid>/userns_counts files to provide current usage
of user namespace counters.

  > cat /proc/813/userns_counts
  user_namespaces          101000               1
  pid_namespaces           101000               1
  ipc_namespaces           101000               4
  net_namespaces           101000               2
  mnt_namespaces           101000               5
  mnt_namespaces           100000               1

The meanings of the columns are as follows, from left to right:

  Name         Object name
  UID          User ID
  Usage        Current usage

The full documentation is in the second patch.

v2: - describe this file in Documentation/filesystems/proc.txt
    - move and rename into /proc/<pid>/userns_counts

Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>

Andrei Vagin (1):
  kernel: show current values of user namespace counters

Kirill Kolyshkin (1):
  Documentation: describe /proc/<pid>/userns_counts

 Documentation/filesystems/proc.txt |  30 +++++++++++
 fs/proc/array.c                    |  55 ++++++++++++++++++++
 fs/proc/base.c                     |   1 +
 fs/proc/internal.h                 |   1 +
 include/linux/user_namespace.h     |   8 +++
 kernel/ucount.c                    | 102 +++++++++++++++++++++++++++++++++++++
 6 files changed, 197 insertions(+)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] kernel: show current values of user namespace counters
  2016-08-15 20:10 [PATCH 0/2 v2] userns: show current values of user namespace counters Andrei Vagin
@ 2016-08-15 20:10 ` Andrei Vagin
  2016-08-16 20:00   ` Kees Cook
  2016-08-15 20:10 ` [PATCH 2/2] Documentation: describe /proc/<pid>/userns_counts Andrei Vagin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Andrei Vagin @ 2016-08-15 20:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: containers, linux-kernel, Andrei Vagin, Serge Hallyn, Kees Cook

Recently Eric added user namespace counters.  User namespace counters is
a feature that allows to limit the number of various kernel objects a
user can create. These limits are set via /proc/sys/user/ sysctls on a
per user namespace basis and are applicable to all users in that
namespace.

This patch adds /proc/PID/userns_counts files to provide current usage
of user namespace counters.

  > cat /proc/813/userns_counts
  user_namespaces          101000               1
  pid_namespaces           101000               1
  ipc_namespaces           101000               4
  net_namespaces           101000               2
  mnt_namespaces           101000               5
  mnt_namespaces           100000               1

The meanings of the columns are as follows, from left to right:

  Name         Object name
  UID          User ID
  Usage        Current usage

Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
---
 fs/proc/array.c                |  57 +++++++++++++++++++++++
 fs/proc/base.c                 |   3 ++
 fs/proc/internal.h             |   1 +
 include/linux/user_namespace.h |   8 ++++
 kernel/ucount.c                | 102 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 171 insertions(+)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 88c7de1..f186625 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -734,3 +734,60 @@ const struct file_operations proc_tid_children_operations = {
 	.release = children_seq_release,
 };
 #endif /* CONFIG_PROC_CHILDREN */
+
+#ifdef CONFIG_USER_NS
+static int ucounts_open(struct inode *inode, struct file *filp)
+{
+	struct ucounts_iterator *iter;
+	struct seq_file *seq;
+	int ret;
+
+	struct task_struct *task;
+	struct user_namespace *ns;
+
+	task = get_proc_task(inode);
+	if (!task)
+		return -ESRCH;
+
+	rcu_read_lock();
+	ns = get_user_ns(__task_cred(task)->user_ns);
+	rcu_read_unlock();
+
+	put_task_struct(task);
+
+	if (ns == NULL)
+		return -ESRCH;
+
+	ret = seq_open_private(filp, &ucounts_seq_operations,
+					sizeof(struct ucounts_iterator));
+
+	if (ret) {
+		put_user_ns(ns);
+		return ret;
+	}
+
+	seq = filp->private_data;
+	iter = seq->private;
+	iter->ns = ns;
+
+	return 0;
+}
+
+int ucounts_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	struct ucounts_iterator *iter = seq->private;
+
+	put_user_ns(iter->ns);
+
+	return seq_release_private(inode, file);
+}
+
+
+const struct file_operations proc_ucounts_operations = {
+	.open		= ucounts_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= ucounts_release,
+};
+#endif /* CONFIG_USER_NS */
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 54e2702..4252f7a 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2910,6 +2910,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("timers",	  S_IRUGO, proc_timers_operations),
 #endif
 	REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations),
+#ifdef CONFIG_USER_NS
+	REG("userns_counts",  S_IRUGO, proc_ucounts_operations),
+#endif
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7931c55..845cadb 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -298,6 +298,7 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_ucounts_operations;
 
 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 30ffe10..5f824dd 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -106,6 +106,14 @@ extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t,
 extern int proc_setgroups_show(struct seq_file *m, void *v);
 extern bool userns_may_setgroups(const struct user_namespace *ns);
 extern bool current_in_userns(const struct user_namespace *target_ns);
+
+struct ucounts_iterator {
+	struct user_namespace *ns;
+	int	hash;
+};
+
+extern const struct seq_operations ucounts_seq_operations;
+
 #else
 
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
diff --git a/kernel/ucount.c b/kernel/ucount.c
index 6ebbe9b..cef09e3 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -9,6 +9,8 @@
 #include <linux/sysctl.h>
 #include <linux/slab.h>
 #include <linux/hash.h>
+#include <linux/seq_file.h>
+#include <linux/proc_fs.h>
 #include <linux/user_namespace.h>
 
 #define UCOUNTS_HASHTABLE_BITS 10
@@ -232,4 +234,104 @@ static __init int user_namespace_sysctl_init(void)
 }
 subsys_initcall(user_namespace_sysctl_init);
 
+#ifdef CONFIG_PROC_FS
+static void *ucounts_start(struct seq_file *f, loff_t *pos)
+{
+	struct ucounts_iterator *iter = f->private;
+	int h, i = 0;
+
+	spin_lock(&ucounts_lock);
+	for (h = 0; h < (1 << UCOUNTS_HASHTABLE_BITS); h++) {
+		struct ucounts *ucounts;
+
+		hlist_for_each_entry(ucounts, &ucounts_hashtable[h], node) {
+			if (ucounts->ns != iter->ns)
+				continue;
+			if (i++ < *pos)
+				continue;
+
+			iter->hash = h;
+
+			return ucounts;
+		}
+	}
+
+	return NULL;
+}
+
+static void ucounts_stop(struct seq_file *f, void *v)
+{
+	spin_unlock(&ucounts_lock);
+}
+
+static void *ucounts_next(struct seq_file *f, void *v, loff_t *pos)
+{
+	struct ucounts_iterator *iter = f->private;
+	struct ucounts *ucounts = v;
+	int h;
+
+	++*pos;
+
+	for (h = iter->hash; h < (1 << UCOUNTS_HASHTABLE_BITS); h++) {
+		struct hlist_node *node;
+
+		if (ucounts == NULL) {
+			node = ucounts_hashtable[h].first;
+			iter->hash = h;
+		} else
+			node = ucounts->node.next;
+
+		ucounts = hlist_entry(node, struct ucounts, node);
+
+		hlist_for_each_entry_from(ucounts, node) {
+			if (ucounts->ns != iter->ns)
+				continue;
+
+			return ucounts;
+		}
+
+		ucounts = NULL;
+	}
+
+	return NULL;
+}
+
+static int ucounts_show(struct seq_file *f, void *v)
+{
+	static const char * const ns_strs[] = {
+				"user_namespaces",
+				"pid_namespaces",
+				"uts_namespaces",
+				"ipc_namespaces",
+				"net_namespaces",
+				"mnt_namespaces",
+				"cgroup_namespaces",
+				NULL
+	};
+	struct user_namespace *ns = current_user_ns();
+	struct ucounts *ucounts = v;
+	uid_t uid;
+	int i;
+
+	uid = from_kuid_munged(ns, ucounts->uid);
+
+	for (i = 0; ns_strs[i]; i++) {
+		int val = atomic_read(&ucounts->ucount[i]);
+
+		if (val == 0)
+			continue;
+
+		seq_printf(f, "%-20s %10u\t%10d\n", ns_strs[i], uid, val);
+	}
+
+	return 0;
+}
+
+const struct seq_operations ucounts_seq_operations = {
+	.start	= ucounts_start,
+	.next	= ucounts_next,
+	.stop	= ucounts_stop,
+	.show	= ucounts_show,
+};
 
+#endif /* CONFIG_PROC_FS */
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] Documentation: describe /proc/<pid>/userns_counts
  2016-08-15 20:10 [PATCH 0/2 v2] userns: show current values of user namespace counters Andrei Vagin
  2016-08-15 20:10 ` [PATCH 1/2] kernel: " Andrei Vagin
@ 2016-08-15 20:10 ` Andrei Vagin
  2016-08-16 22:53 ` [PATCH 0/2 v2] userns: show current values of user namespace counters Serge E. Hallyn
  2016-10-06 17:51 ` Andrei Vagin
  3 siblings, 0 replies; 11+ messages in thread
From: Andrei Vagin @ 2016-08-15 20:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: containers, linux-kernel, Andrei Vagin, Serge Hallyn, Kees Cook,
	Kirill Kolyshkin

From: Kirill Kolyshkin <kir@openvz.org>

This file provides current usage of user namespace counters.

Signed-off-by: Kirill Kolyshkin <kir@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
---
 Documentation/filesystems/proc.txt | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 68080ad..7300d9c 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -44,6 +44,7 @@ Table of Contents
   3.8   /proc/<pid>/fdinfo/<fd> - Information about opened file
   3.9   /proc/<pid>/map_files - Information about memory mapped files
   3.10  /proc/<pid>/timerslack_ns - Task timerslack value
+  3.11  /proc/<pid>/userns_counts - User namespace counters
 
   4	Configuring procfs
   4.1	Mount options
@@ -1889,6 +1890,35 @@ Valid values are from 0 - ULLONG_MAX
 An application setting the value must have PTRACE_MODE_ATTACH_FSCREDS level
 permissions on the task specified to change its timerslack_ns value.
 
+3.11 /proc/<pid>/userns_counts - User namespace counters
+---------------------------------------------------------
+
+This file provides current usage of user namespace counters.
+
+User namespace counters is a feature that allows to limit the number of various
+kernel objects a user can create. These limits are set via /proc/sys/user/
+sysctls on a per user namespace basis and are applicable to all users in that
+namespace. Therefore, the limits are the same for every user in a user
+namespace.
+
+Each user has their own set of user namespace counters. Once a user creates a
+new user namespace, every new object created inside that namespace is also
+charged to the user. That means that a user is limited by their user namespace
+limits, as well as the limits in their parent user namespaces.
+
+  > cat /proc/813/userns_counts
+  user_namespaces          101000	         1
+  pid_namespaces           101000	         1
+  ipc_namespaces           101000	         4
+  net_namespaces           101000	         2
+  mnt_namespaces           101000	         5
+  mnt_namespaces           100000	         1
+
+The meanings of the columns are as follows, from left to right:
+
+  Name		Object name
+  UID		User ID
+  Usage		Current usage
 
 ------------------------------------------------------------------------------
 Configuring procfs
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] kernel: show current values of user namespace counters
  2016-08-15 20:10 ` [PATCH 1/2] kernel: " Andrei Vagin
@ 2016-08-16 20:00   ` Kees Cook
  2016-08-16 20:05     ` Serge E. Hallyn
  0 siblings, 1 reply; 11+ messages in thread
From: Kees Cook @ 2016-08-16 20:00 UTC (permalink / raw)
  To: Andrei Vagin; +Cc: Eric W. Biederman, Linux Containers, LKML, Serge Hallyn

On Mon, Aug 15, 2016 at 1:10 PM, Andrei Vagin <avagin@openvz.org> wrote:
> Recently Eric added user namespace counters.  User namespace counters is
> a feature that allows to limit the number of various kernel objects a
> user can create. These limits are set via /proc/sys/user/ sysctls on a
> per user namespace basis and are applicable to all users in that
> namespace.
>
> This patch adds /proc/PID/userns_counts files to provide current usage
> of user namespace counters.
>
>   > cat /proc/813/userns_counts
>   user_namespaces          101000               1
>   pid_namespaces           101000               1
>   ipc_namespaces           101000               4
>   net_namespaces           101000               2
>   mnt_namespaces           101000               5
>   mnt_namespaces           100000               1
>
> The meanings of the columns are as follows, from left to right:
>
>   Name         Object name
>   UID          User ID
>   Usage        Current usage
>
> Cc: Serge Hallyn <serge.hallyn@canonical.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Signed-off-by: Andrei Vagin <avagin@openvz.org>
> ---
>  fs/proc/array.c                |  57 +++++++++++++++++++++++
>  fs/proc/base.c                 |   3 ++
>  fs/proc/internal.h             |   1 +
>  include/linux/user_namespace.h |   8 ++++
>  kernel/ucount.c                | 102 +++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 171 insertions(+)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 88c7de1..f186625 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -734,3 +734,60 @@ const struct file_operations proc_tid_children_operations = {
>         .release = children_seq_release,
>  };
>  #endif /* CONFIG_PROC_CHILDREN */
> +
> +#ifdef CONFIG_USER_NS
> +static int ucounts_open(struct inode *inode, struct file *filp)
> +{
> +       struct ucounts_iterator *iter;
> +       struct seq_file *seq;
> +       int ret;
> +
> +       struct task_struct *task;
> +       struct user_namespace *ns;
> +
> +       task = get_proc_task(inode);
> +       if (!task)
> +               return -ESRCH;
> +
> +       rcu_read_lock();
> +       ns = get_user_ns(__task_cred(task)->user_ns);
> +       rcu_read_unlock();
> +
> +       put_task_struct(task);
> +
> +       if (ns == NULL)
> +               return -ESRCH;
> +
> +       ret = seq_open_private(filp, &ucounts_seq_operations,
> +                                       sizeof(struct ucounts_iterator));
> +
> +       if (ret) {
> +               put_user_ns(ns);
> +               return ret;
> +       }
> +
> +       seq = filp->private_data;
> +       iter = seq->private;
> +       iter->ns = ns;
> +
> +       return 0;
> +}
> +
> +int ucounts_release(struct inode *inode, struct file *file)
> +{
> +       struct seq_file *seq = file->private_data;
> +       struct ucounts_iterator *iter = seq->private;
> +
> +       put_user_ns(iter->ns);
> +
> +       return seq_release_private(inode, file);
> +}
> +
> +
> +const struct file_operations proc_ucounts_operations = {
> +       .open           = ucounts_open,
> +       .read           = seq_read,
> +       .llseek         = seq_lseek,
> +       .release        = ucounts_release,
> +};
> +#endif /* CONFIG_USER_NS */
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 54e2702..4252f7a 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2910,6 +2910,9 @@ static const struct pid_entry tgid_base_stuff[] = {
>         REG("timers",     S_IRUGO, proc_timers_operations),
>  #endif
>         REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations),
> +#ifdef CONFIG_USER_NS
> +       REG("userns_counts",  S_IRUGO, proc_ucounts_operations),
> +#endif
>  };
>
>  static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index 7931c55..845cadb 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -298,6 +298,7 @@ extern const struct file_operations proc_pid_smaps_operations;
>  extern const struct file_operations proc_tid_smaps_operations;
>  extern const struct file_operations proc_clear_refs_operations;
>  extern const struct file_operations proc_pagemap_operations;
> +extern const struct file_operations proc_ucounts_operations;
>
>  extern unsigned long task_vsize(struct mm_struct *);
>  extern unsigned long task_statm(struct mm_struct *,
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index 30ffe10..5f824dd 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -106,6 +106,14 @@ extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t,
>  extern int proc_setgroups_show(struct seq_file *m, void *v);
>  extern bool userns_may_setgroups(const struct user_namespace *ns);
>  extern bool current_in_userns(const struct user_namespace *target_ns);
> +
> +struct ucounts_iterator {
> +       struct user_namespace *ns;
> +       int     hash;
> +};
> +
> +extern const struct seq_operations ucounts_seq_operations;
> +
>  #else
>
>  static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index 6ebbe9b..cef09e3 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -9,6 +9,8 @@
>  #include <linux/sysctl.h>
>  #include <linux/slab.h>
>  #include <linux/hash.h>
> +#include <linux/seq_file.h>
> +#include <linux/proc_fs.h>
>  #include <linux/user_namespace.h>
>
>  #define UCOUNTS_HASHTABLE_BITS 10
> @@ -232,4 +234,104 @@ static __init int user_namespace_sysctl_init(void)
>  }
>  subsys_initcall(user_namespace_sysctl_init);
>
> +#ifdef CONFIG_PROC_FS
> +static void *ucounts_start(struct seq_file *f, loff_t *pos)
> +{
> +       struct ucounts_iterator *iter = f->private;
> +       int h, i = 0;
> +
> +       spin_lock(&ucounts_lock);

This series is much improved, thanks! However, I still don't think
it's a good idea to hold this spinlock across the start/stop lifetime.
It creates too many opportunities for abuse. :( Perhaps Eric will have
some better ideas about how to deal with this...

-Kees

> +       for (h = 0; h < (1 << UCOUNTS_HASHTABLE_BITS); h++) {
> +               struct ucounts *ucounts;
> +
> +               hlist_for_each_entry(ucounts, &ucounts_hashtable[h], node) {
> +                       if (ucounts->ns != iter->ns)
> +                               continue;
> +                       if (i++ < *pos)
> +                               continue;
> +
> +                       iter->hash = h;
> +
> +                       return ucounts;
> +               }
> +       }
> +
> +       return NULL;
> +}
> +
> +static void ucounts_stop(struct seq_file *f, void *v)
> +{
> +       spin_unlock(&ucounts_lock);
> +}
> +
> +static void *ucounts_next(struct seq_file *f, void *v, loff_t *pos)
> +{
> +       struct ucounts_iterator *iter = f->private;
> +       struct ucounts *ucounts = v;
> +       int h;
> +
> +       ++*pos;
> +
> +       for (h = iter->hash; h < (1 << UCOUNTS_HASHTABLE_BITS); h++) {
> +               struct hlist_node *node;
> +
> +               if (ucounts == NULL) {
> +                       node = ucounts_hashtable[h].first;
> +                       iter->hash = h;
> +               } else
> +                       node = ucounts->node.next;
> +
> +               ucounts = hlist_entry(node, struct ucounts, node);
> +
> +               hlist_for_each_entry_from(ucounts, node) {
> +                       if (ucounts->ns != iter->ns)
> +                               continue;
> +
> +                       return ucounts;
> +               }
> +
> +               ucounts = NULL;
> +       }
> +
> +       return NULL;
> +}
> +
> +static int ucounts_show(struct seq_file *f, void *v)
> +{
> +       static const char * const ns_strs[] = {
> +                               "user_namespaces",
> +                               "pid_namespaces",
> +                               "uts_namespaces",
> +                               "ipc_namespaces",
> +                               "net_namespaces",
> +                               "mnt_namespaces",
> +                               "cgroup_namespaces",
> +                               NULL
> +       };
> +       struct user_namespace *ns = current_user_ns();
> +       struct ucounts *ucounts = v;
> +       uid_t uid;
> +       int i;
> +
> +       uid = from_kuid_munged(ns, ucounts->uid);
> +
> +       for (i = 0; ns_strs[i]; i++) {
> +               int val = atomic_read(&ucounts->ucount[i]);
> +
> +               if (val == 0)
> +                       continue;
> +
> +               seq_printf(f, "%-20s %10u\t%10d\n", ns_strs[i], uid, val);
> +       }
> +
> +       return 0;
> +}
> +
> +const struct seq_operations ucounts_seq_operations = {
> +       .start  = ucounts_start,
> +       .next   = ucounts_next,
> +       .stop   = ucounts_stop,
> +       .show   = ucounts_show,
> +};
>
> +#endif /* CONFIG_PROC_FS */
> --
> 2.5.5
>



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] kernel: show current values of user namespace counters
  2016-08-16 20:00   ` Kees Cook
@ 2016-08-16 20:05     ` Serge E. Hallyn
  2016-08-16 22:44       ` Andrei Vagin
  0 siblings, 1 reply; 11+ messages in thread
From: Serge E. Hallyn @ 2016-08-16 20:05 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrei Vagin, Serge Hallyn, Linux Containers, Eric W. Biederman, LKML

Quoting Kees Cook (keescook@chromium.org):
> On Mon, Aug 15, 2016 at 1:10 PM, Andrei Vagin <avagin@openvz.org> wrote:
> > Recently Eric added user namespace counters.  User namespace counters is
> > a feature that allows to limit the number of various kernel objects a
> > user can create. These limits are set via /proc/sys/user/ sysctls on a
> > per user namespace basis and are applicable to all users in that
> > namespace.
> >
> > This patch adds /proc/PID/userns_counts files to provide current usage
> > of user namespace counters.
> >
> >   > cat /proc/813/userns_counts
> >   user_namespaces          101000               1
> >   pid_namespaces           101000               1
> >   ipc_namespaces           101000               4
> >   net_namespaces           101000               2
> >   mnt_namespaces           101000               5
> >   mnt_namespaces           100000               1
> >
> > The meanings of the columns are as follows, from left to right:
> >
> >   Name         Object name
> >   UID          User ID
> >   Usage        Current usage
> >
> > Cc: Serge Hallyn <serge.hallyn@canonical.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> > Signed-off-by: Andrei Vagin <avagin@openvz.org>
> > ---
> >  fs/proc/array.c                |  57 +++++++++++++++++++++++
> >  fs/proc/base.c                 |   3 ++
> >  fs/proc/internal.h             |   1 +
> >  include/linux/user_namespace.h |   8 ++++
> >  kernel/ucount.c                | 102 +++++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 171 insertions(+)
> >
> > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > index 88c7de1..f186625 100644
> > --- a/fs/proc/array.c
> > +++ b/fs/proc/array.c
> > @@ -734,3 +734,60 @@ const struct file_operations proc_tid_children_operations = {
> >         .release = children_seq_release,
> >  };
> >  #endif /* CONFIG_PROC_CHILDREN */
> > +
> > +#ifdef CONFIG_USER_NS
> > +static int ucounts_open(struct inode *inode, struct file *filp)
> > +{
> > +       struct ucounts_iterator *iter;
> > +       struct seq_file *seq;
> > +       int ret;
> > +
> > +       struct task_struct *task;
> > +       struct user_namespace *ns;
> > +
> > +       task = get_proc_task(inode);
> > +       if (!task)
> > +               return -ESRCH;
> > +
> > +       rcu_read_lock();
> > +       ns = get_user_ns(__task_cred(task)->user_ns);
> > +       rcu_read_unlock();
> > +
> > +       put_task_struct(task);
> > +
> > +       if (ns == NULL)
> > +               return -ESRCH;
> > +
> > +       ret = seq_open_private(filp, &ucounts_seq_operations,
> > +                                       sizeof(struct ucounts_iterator));
> > +
> > +       if (ret) {
> > +               put_user_ns(ns);
> > +               return ret;
> > +       }
> > +
> > +       seq = filp->private_data;
> > +       iter = seq->private;
> > +       iter->ns = ns;
> > +
> > +       return 0;
> > +}
> > +
> > +int ucounts_release(struct inode *inode, struct file *file)
> > +{
> > +       struct seq_file *seq = file->private_data;
> > +       struct ucounts_iterator *iter = seq->private;
> > +
> > +       put_user_ns(iter->ns);
> > +
> > +       return seq_release_private(inode, file);
> > +}
> > +
> > +
> > +const struct file_operations proc_ucounts_operations = {
> > +       .open           = ucounts_open,
> > +       .read           = seq_read,
> > +       .llseek         = seq_lseek,
> > +       .release        = ucounts_release,
> > +};
> > +#endif /* CONFIG_USER_NS */
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 54e2702..4252f7a 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -2910,6 +2910,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> >         REG("timers",     S_IRUGO, proc_timers_operations),
> >  #endif
> >         REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations),
> > +#ifdef CONFIG_USER_NS
> > +       REG("userns_counts",  S_IRUGO, proc_ucounts_operations),
> > +#endif
> >  };
> >
> >  static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
> > diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> > index 7931c55..845cadb 100644
> > --- a/fs/proc/internal.h
> > +++ b/fs/proc/internal.h
> > @@ -298,6 +298,7 @@ extern const struct file_operations proc_pid_smaps_operations;
> >  extern const struct file_operations proc_tid_smaps_operations;
> >  extern const struct file_operations proc_clear_refs_operations;
> >  extern const struct file_operations proc_pagemap_operations;
> > +extern const struct file_operations proc_ucounts_operations;
> >
> >  extern unsigned long task_vsize(struct mm_struct *);
> >  extern unsigned long task_statm(struct mm_struct *,
> > diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> > index 30ffe10..5f824dd 100644
> > --- a/include/linux/user_namespace.h
> > +++ b/include/linux/user_namespace.h
> > @@ -106,6 +106,14 @@ extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t,
> >  extern int proc_setgroups_show(struct seq_file *m, void *v);
> >  extern bool userns_may_setgroups(const struct user_namespace *ns);
> >  extern bool current_in_userns(const struct user_namespace *target_ns);
> > +
> > +struct ucounts_iterator {
> > +       struct user_namespace *ns;
> > +       int     hash;
> > +};
> > +
> > +extern const struct seq_operations ucounts_seq_operations;
> > +
> >  #else
> >
> >  static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> > diff --git a/kernel/ucount.c b/kernel/ucount.c
> > index 6ebbe9b..cef09e3 100644
> > --- a/kernel/ucount.c
> > +++ b/kernel/ucount.c
> > @@ -9,6 +9,8 @@
> >  #include <linux/sysctl.h>
> >  #include <linux/slab.h>
> >  #include <linux/hash.h>
> > +#include <linux/seq_file.h>
> > +#include <linux/proc_fs.h>
> >  #include <linux/user_namespace.h>
> >
> >  #define UCOUNTS_HASHTABLE_BITS 10
> > @@ -232,4 +234,104 @@ static __init int user_namespace_sysctl_init(void)
> >  }
> >  subsys_initcall(user_namespace_sysctl_init);
> >
> > +#ifdef CONFIG_PROC_FS
> > +static void *ucounts_start(struct seq_file *f, loff_t *pos)
> > +{
> > +       struct ucounts_iterator *iter = f->private;
> > +       int h, i = 0;
> > +
> > +       spin_lock(&ucounts_lock);
> 
> This series is much improved, thanks! However, I still don't think
> it's a good idea to hold this spinlock across the start/stop lifetime.
> It creates too many opportunities for abuse. :( Perhaps Eric will have
> some better ideas about how to deal with this...

Would it be too limiting to pre-allocate a page and simply fill it out
right at ucounts_start, so that we can drop the spinlock and simply
return data from that page?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] kernel: show current values of user namespace counters
  2016-08-16 20:05     ` Serge E. Hallyn
@ 2016-08-16 22:44       ` Andrei Vagin
  0 siblings, 0 replies; 11+ messages in thread
From: Andrei Vagin @ 2016-08-16 22:44 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Kees Cook, Andrei Vagin, Serge Hallyn, Linux Containers,
	Eric W. Biederman, LKML

On Tue, Aug 16, 2016 at 03:05:29PM -0500, Serge E. Hallyn wrote:
> Quoting Kees Cook (keescook@chromium.org):
> > On Mon, Aug 15, 2016 at 1:10 PM, Andrei Vagin <avagin@openvz.org> wrote:
> > > Recently Eric added user namespace counters.  User namespace counters is
> > > a feature that allows to limit the number of various kernel objects a
> > > user can create. These limits are set via /proc/sys/user/ sysctls on a
> > > per user namespace basis and are applicable to all users in that
> > > namespace.
> > >
> > > This patch adds /proc/PID/userns_counts files to provide current usage
> > > of user namespace counters.
> > >
> > >   > cat /proc/813/userns_counts
> > >   user_namespaces          101000               1
> > >   pid_namespaces           101000               1
> > >   ipc_namespaces           101000               4
> > >   net_namespaces           101000               2
> > >   mnt_namespaces           101000               5
> > >   mnt_namespaces           100000               1
> > >
> > > The meanings of the columns are as follows, from left to right:
> > >
> > >   Name         Object name
> > >   UID          User ID
> > >   Usage        Current usage
> > >
> > > Cc: Serge Hallyn <serge.hallyn@canonical.com>
> > > Cc: Kees Cook <keescook@chromium.org>
> > > Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> > > Signed-off-by: Andrei Vagin <avagin@openvz.org>
> > > ---
> > >  fs/proc/array.c                |  57 +++++++++++++++++++++++
> > >  fs/proc/base.c                 |   3 ++
> > >  fs/proc/internal.h             |   1 +
> > >  include/linux/user_namespace.h |   8 ++++
> > >  kernel/ucount.c                | 102 +++++++++++++++++++++++++++++++++++++++++
> > >  5 files changed, 171 insertions(+)
> > >
> > > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > > index 88c7de1..f186625 100644
> > > --- a/fs/proc/array.c
> > > +++ b/fs/proc/array.c
> > > @@ -734,3 +734,60 @@ const struct file_operations proc_tid_children_operations = {
> > >         .release = children_seq_release,
> > >  };
> > >  #endif /* CONFIG_PROC_CHILDREN */
> > > +
> > > +#ifdef CONFIG_USER_NS
> > > +static int ucounts_open(struct inode *inode, struct file *filp)
> > > +{
> > > +       struct ucounts_iterator *iter;
> > > +       struct seq_file *seq;
> > > +       int ret;
> > > +
> > > +       struct task_struct *task;
> > > +       struct user_namespace *ns;
> > > +
> > > +       task = get_proc_task(inode);
> > > +       if (!task)
> > > +               return -ESRCH;
> > > +
> > > +       rcu_read_lock();
> > > +       ns = get_user_ns(__task_cred(task)->user_ns);
> > > +       rcu_read_unlock();
> > > +
> > > +       put_task_struct(task);
> > > +
> > > +       if (ns == NULL)
> > > +               return -ESRCH;
> > > +
> > > +       ret = seq_open_private(filp, &ucounts_seq_operations,
> > > +                                       sizeof(struct ucounts_iterator));
> > > +
> > > +       if (ret) {
> > > +               put_user_ns(ns);
> > > +               return ret;
> > > +       }
> > > +
> > > +       seq = filp->private_data;
> > > +       iter = seq->private;
> > > +       iter->ns = ns;
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +int ucounts_release(struct inode *inode, struct file *file)
> > > +{
> > > +       struct seq_file *seq = file->private_data;
> > > +       struct ucounts_iterator *iter = seq->private;
> > > +
> > > +       put_user_ns(iter->ns);
> > > +
> > > +       return seq_release_private(inode, file);
> > > +}
> > > +
> > > +
> > > +const struct file_operations proc_ucounts_operations = {
> > > +       .open           = ucounts_open,
> > > +       .read           = seq_read,
> > > +       .llseek         = seq_lseek,
> > > +       .release        = ucounts_release,
> > > +};
> > > +#endif /* CONFIG_USER_NS */
> > > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > > index 54e2702..4252f7a 100644
> > > --- a/fs/proc/base.c
> > > +++ b/fs/proc/base.c
> > > @@ -2910,6 +2910,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> > >         REG("timers",     S_IRUGO, proc_timers_operations),
> > >  #endif
> > >         REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations),
> > > +#ifdef CONFIG_USER_NS
> > > +       REG("userns_counts",  S_IRUGO, proc_ucounts_operations),
> > > +#endif
> > >  };
> > >
> > >  static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
> > > diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> > > index 7931c55..845cadb 100644
> > > --- a/fs/proc/internal.h
> > > +++ b/fs/proc/internal.h
> > > @@ -298,6 +298,7 @@ extern const struct file_operations proc_pid_smaps_operations;
> > >  extern const struct file_operations proc_tid_smaps_operations;
> > >  extern const struct file_operations proc_clear_refs_operations;
> > >  extern const struct file_operations proc_pagemap_operations;
> > > +extern const struct file_operations proc_ucounts_operations;
> > >
> > >  extern unsigned long task_vsize(struct mm_struct *);
> > >  extern unsigned long task_statm(struct mm_struct *,
> > > diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> > > index 30ffe10..5f824dd 100644
> > > --- a/include/linux/user_namespace.h
> > > +++ b/include/linux/user_namespace.h
> > > @@ -106,6 +106,14 @@ extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t,
> > >  extern int proc_setgroups_show(struct seq_file *m, void *v);
> > >  extern bool userns_may_setgroups(const struct user_namespace *ns);
> > >  extern bool current_in_userns(const struct user_namespace *target_ns);
> > > +
> > > +struct ucounts_iterator {
> > > +       struct user_namespace *ns;
> > > +       int     hash;
> > > +};
> > > +
> > > +extern const struct seq_operations ucounts_seq_operations;
> > > +
> > >  #else
> > >
> > >  static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> > > diff --git a/kernel/ucount.c b/kernel/ucount.c
> > > index 6ebbe9b..cef09e3 100644
> > > --- a/kernel/ucount.c
> > > +++ b/kernel/ucount.c
> > > @@ -9,6 +9,8 @@
> > >  #include <linux/sysctl.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/hash.h>
> > > +#include <linux/seq_file.h>
> > > +#include <linux/proc_fs.h>
> > >  #include <linux/user_namespace.h>
> > >
> > >  #define UCOUNTS_HASHTABLE_BITS 10
> > > @@ -232,4 +234,104 @@ static __init int user_namespace_sysctl_init(void)
> > >  }
> > >  subsys_initcall(user_namespace_sysctl_init);
> > >
> > > +#ifdef CONFIG_PROC_FS
> > > +static void *ucounts_start(struct seq_file *f, loff_t *pos)
> > > +{
> > > +       struct ucounts_iterator *iter = f->private;
> > > +       int h, i = 0;
> > > +
> > > +       spin_lock(&ucounts_lock);
> > 
> > This series is much improved, thanks! However, I still don't think
> > it's a good idea to hold this spinlock across the start/stop lifetime.
> > It creates too many opportunities for abuse. :( Perhaps Eric will have
> > some better ideas about how to deal with this...
> 
> Would it be too limiting to pre-allocate a page and simply fill it out
> right at ucounts_start, so that we can drop the spinlock and simply
> return data from that page?

seq_file works like you described here. It allocates a buffer, then
calls ->start(), fills the buffer and calls ->stop().

seq_file calls only ->next() and ->show() between ->start and stop().

ssize_t seq_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
{
        if (!m->buf) {
                m->buf = seq_buf_alloc(m->size = PAGE_SIZE);
...
        p = m->op->start(m, &pos);
...
	while (m->count < size) {
		...
                p = m->op->next(m, p, &next);
                if (!p || IS_ERR(p)) {
                        err = PTR_ERR(p);
                        break;
                }

		err = m->op->show(m, p);
                if (seq_has_overflowed(m) || err) {
			...
			break;
		}
		...
	}
        m->op->stop(m, p);
}

In our case, a buffer size will not be bigger then one page, because a
bigger buffer is allocated only if it isn't enough to place one record.

Thanks,
Andrei

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2 v2] userns: show current values of user namespace counters
  2016-08-15 20:10 [PATCH 0/2 v2] userns: show current values of user namespace counters Andrei Vagin
  2016-08-15 20:10 ` [PATCH 1/2] kernel: " Andrei Vagin
  2016-08-15 20:10 ` [PATCH 2/2] Documentation: describe /proc/<pid>/userns_counts Andrei Vagin
@ 2016-08-16 22:53 ` Serge E. Hallyn
  2016-10-06 17:51 ` Andrei Vagin
  3 siblings, 0 replies; 11+ messages in thread
From: Serge E. Hallyn @ 2016-08-16 22:53 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: Eric W. Biederman, Serge Hallyn, containers, linux-kernel, Kees Cook

Quoting Andrei Vagin (avagin@openvz.org):
> Recently Eric added user namespace counters.  User namespace counters is
> a feature that allows to limit the number of various kernel objects a
> user can create. These limits are set via /proc/sys/user/ sysctls on a
> per user namespace basis and are applicable to all users in that
> namespace.
> 
> User namespace counters are not in the upstream tree yet,
> you can find them in Eric's tree:
> https://git.kernel.org/cgit/linux/kernel/git/ebiederm/user-namespace.git/log/?h=for-testing
> 
> This patch adds /proc/<pid>/userns_counts files to provide current usage
> of user namespace counters.
> 
>   > cat /proc/813/userns_counts
>   user_namespaces          101000               1
>   pid_namespaces           101000               1
>   ipc_namespaces           101000               4
>   net_namespaces           101000               2
>   mnt_namespaces           101000               5
>   mnt_namespaces           100000               1
> 
> The meanings of the columns are as follows, from left to right:
> 
>   Name         Object name
>   UID          User ID
>   Usage        Current usage
> 
> The full documentation is in the second patch.
> 
> v2: - describe this file in Documentation/filesystems/proc.txt
>     - move and rename into /proc/<pid>/userns_counts
> 
> Cc: Serge Hallyn <serge.hallyn@canonical.com>

Acked-by: Serge Hallyn <serge@hallyn.com>

> Cc: Kees Cook <keescook@chromium.org>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Signed-off-by: Andrei Vagin <avagin@openvz.org>
> 
> Andrei Vagin (1):
>   kernel: show current values of user namespace counters
> 
> Kirill Kolyshkin (1):
>   Documentation: describe /proc/<pid>/userns_counts
> 
>  Documentation/filesystems/proc.txt |  30 +++++++++++
>  fs/proc/array.c                    |  55 ++++++++++++++++++++
>  fs/proc/base.c                     |   1 +
>  fs/proc/internal.h                 |   1 +
>  include/linux/user_namespace.h     |   8 +++
>  kernel/ucount.c                    | 102 +++++++++++++++++++++++++++++++++++++
>  6 files changed, 197 insertions(+)
> 
> -- 
> 2.5.5
> _______________________________________________
> Containers mailing list
> Containers@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2 v2] userns: show current values of user namespace counters
  2016-08-15 20:10 [PATCH 0/2 v2] userns: show current values of user namespace counters Andrei Vagin
                   ` (2 preceding siblings ...)
  2016-08-16 22:53 ` [PATCH 0/2 v2] userns: show current values of user namespace counters Serge E. Hallyn
@ 2016-10-06 17:51 ` Andrei Vagin
  2016-10-06 19:33   ` Eric W. Biederman
  3 siblings, 1 reply; 11+ messages in thread
From: Andrei Vagin @ 2016-10-06 17:51 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: avagin, containers, linux-kernel, Serge Hallyn, Kees Cook

Hello Eric,

What do you think about this series? It should be useful to know current
usage for user counters.

Thanks,
Andrei

On Mon, Aug 15, 2016 at 01:10:20PM -0700, Andrei Vagin wrote:
> Recently Eric added user namespace counters.  User namespace counters is
> a feature that allows to limit the number of various kernel objects a
> user can create. These limits are set via /proc/sys/user/ sysctls on a
> per user namespace basis and are applicable to all users in that
> namespace.
> 
> User namespace counters are not in the upstream tree yet,
> you can find them in Eric's tree:
> https://git.kernel.org/cgit/linux/kernel/git/ebiederm/user-namespace.git/log/?h=for-testing
> 
> This patch adds /proc/<pid>/userns_counts files to provide current usage
> of user namespace counters.
> 
>   > cat /proc/813/userns_counts
>   user_namespaces          101000               1
>   pid_namespaces           101000               1
>   ipc_namespaces           101000               4
>   net_namespaces           101000               2
>   mnt_namespaces           101000               5
>   mnt_namespaces           100000               1
> 
> The meanings of the columns are as follows, from left to right:
> 
>   Name         Object name
>   UID          User ID
>   Usage        Current usage
> 
> The full documentation is in the second patch.
> 
> v2: - describe this file in Documentation/filesystems/proc.txt
>     - move and rename into /proc/<pid>/userns_counts
> 
> Cc: Serge Hallyn <serge.hallyn@canonical.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Signed-off-by: Andrei Vagin <avagin@openvz.org>
> 
> Andrei Vagin (1):
>   kernel: show current values of user namespace counters
> 
> Kirill Kolyshkin (1):
>   Documentation: describe /proc/<pid>/userns_counts
> 
>  Documentation/filesystems/proc.txt |  30 +++++++++++
>  fs/proc/array.c                    |  55 ++++++++++++++++++++
>  fs/proc/base.c                     |   1 +
>  fs/proc/internal.h                 |   1 +
>  include/linux/user_namespace.h     |   8 +++
>  kernel/ucount.c                    | 102 +++++++++++++++++++++++++++++++++++++
>  6 files changed, 197 insertions(+)
> 
> -- 
> 2.5.5

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2 v2] userns: show current values of user namespace counters
  2016-10-06 17:51 ` Andrei Vagin
@ 2016-10-06 19:33   ` Eric W. Biederman
  2016-10-10 16:22     ` Andrei Vagin
  0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2016-10-06 19:33 UTC (permalink / raw)
  To: Andrei Vagin; +Cc: avagin, containers, linux-kernel, Serge Hallyn, Kees Cook

Andrei Vagin <avagin@virtuozzo.com> writes:

> Hello Eric,
>
> What do you think about this series? It should be useful to know current
> usage for user counters.

I am in favor of knowing the values.  Unless there is a good reason not
to we should export the values with a read-only sysctl.  I believe that
is what other similar limits do.

As for having per process knowledge I think that is probably something
we want to solve for these sysctls as well.

I don't think I saw anyone looking at this code from the perspective of
information leaks.  I think we need to ask that question, as similar
interfaces have been problematic from an information leak point of view.

In short I what you are trying to do here is valuable, I just want
to make certain we have a maintainable pattern when we export these.

Eric

>
> Thanks,
> Andrei
>
> On Mon, Aug 15, 2016 at 01:10:20PM -0700, Andrei Vagin wrote:
>> Recently Eric added user namespace counters.  User namespace counters is
>> a feature that allows to limit the number of various kernel objects a
>> user can create. These limits are set via /proc/sys/user/ sysctls on a
>> per user namespace basis and are applicable to all users in that
>> namespace.
>> 
>> User namespace counters are not in the upstream tree yet,
>> you can find them in Eric's tree:
>> https://git.kernel.org/cgit/linux/kernel/git/ebiederm/user-namespace.git/log/?h=for-testing
>> 
>> This patch adds /proc/<pid>/userns_counts files to provide current usage
>> of user namespace counters.
>> 
>>   > cat /proc/813/userns_counts
>>   user_namespaces          101000               1
>>   pid_namespaces           101000               1
>>   ipc_namespaces           101000               4
>>   net_namespaces           101000               2
>>   mnt_namespaces           101000               5
>>   mnt_namespaces           100000               1
>> 
>> The meanings of the columns are as follows, from left to right:
>> 
>>   Name         Object name
>>   UID          User ID
>>   Usage        Current usage
>> 
>> The full documentation is in the second patch.
>> 
>> v2: - describe this file in Documentation/filesystems/proc.txt
>>     - move and rename into /proc/<pid>/userns_counts
>> 
>> Cc: Serge Hallyn <serge.hallyn@canonical.com>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
>> Signed-off-by: Andrei Vagin <avagin@openvz.org>
>> 
>> Andrei Vagin (1):
>>   kernel: show current values of user namespace counters
>> 
>> Kirill Kolyshkin (1):
>>   Documentation: describe /proc/<pid>/userns_counts
>> 
>>  Documentation/filesystems/proc.txt |  30 +++++++++++
>>  fs/proc/array.c                    |  55 ++++++++++++++++++++
>>  fs/proc/base.c                     |   1 +
>>  fs/proc/internal.h                 |   1 +
>>  include/linux/user_namespace.h     |   8 +++
>>  kernel/ucount.c                    | 102 +++++++++++++++++++++++++++++++++++++
>>  6 files changed, 197 insertions(+)
>> 
>> -- 
>> 2.5.5

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2 v2] userns: show current values of user namespace counters
  2016-10-06 19:33   ` Eric W. Biederman
@ 2016-10-10 16:22     ` Andrei Vagin
  2016-10-10 20:44       ` Eric W. Biederman
  0 siblings, 1 reply; 11+ messages in thread
From: Andrei Vagin @ 2016-10-10 16:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: avagin, containers, linux-kernel, Serge Hallyn, Kees Cook

On Thu, Oct 06, 2016 at 02:33:53PM -0500, Eric W. Biederman wrote:
> Andrei Vagin <avagin@virtuozzo.com> writes:
> 
> > Hello Eric,
> >
> > What do you think about this series? It should be useful to know current
> > usage for user counters.
> 
> I am in favor of knowing the values.  Unless there is a good reason not
> to we should export the values with a read-only sysctl.  I believe that
> is what other similar limits do.

I want to have a place where I will be able to get limits for all
users. I can't imagine how to do this with a sysctl. It will looks like
multiline sysct-s, what doesn't look good. I will think. If you will
have any ideas let me know. Thanks.

> 
> As for having per process knowledge I think that is probably something
> we want to solve for these sysctls as well.
> 
> I don't think I saw anyone looking at this code from the perspective of
> information leaks.  I think we need to ask that question, as similar
> interfaces have been problematic from an information leak point of view.

It's a good question.

> 
> In short I what you are trying to do here is valuable, I just want
> to make certain we have a maintainable pattern when we export these.
> 
> Eric
> 
> >
> > Thanks,
> > Andrei
> >
> > On Mon, Aug 15, 2016 at 01:10:20PM -0700, Andrei Vagin wrote:
> >> Recently Eric added user namespace counters.  User namespace counters is
> >> a feature that allows to limit the number of various kernel objects a
> >> user can create. These limits are set via /proc/sys/user/ sysctls on a
> >> per user namespace basis and are applicable to all users in that
> >> namespace.
> >> 
> >> User namespace counters are not in the upstream tree yet,
> >> you can find them in Eric's tree:
> >> https://git.kernel.org/cgit/linux/kernel/git/ebiederm/user-namespace.git/log/?h=for-testing
> >> 
> >> This patch adds /proc/<pid>/userns_counts files to provide current usage
> >> of user namespace counters.
> >> 
> >>   > cat /proc/813/userns_counts
> >>   user_namespaces          101000               1
> >>   pid_namespaces           101000               1
> >>   ipc_namespaces           101000               4
> >>   net_namespaces           101000               2
> >>   mnt_namespaces           101000               5
> >>   mnt_namespaces           100000               1
> >> 
> >> The meanings of the columns are as follows, from left to right:
> >> 
> >>   Name         Object name
> >>   UID          User ID
> >>   Usage        Current usage
> >> 
> >> The full documentation is in the second patch.
> >> 
> >> v2: - describe this file in Documentation/filesystems/proc.txt
> >>     - move and rename into /proc/<pid>/userns_counts
> >> 
> >> Cc: Serge Hallyn <serge.hallyn@canonical.com>
> >> Cc: Kees Cook <keescook@chromium.org>
> >> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> >> Signed-off-by: Andrei Vagin <avagin@openvz.org>
> >> 
> >> Andrei Vagin (1):
> >>   kernel: show current values of user namespace counters
> >> 
> >> Kirill Kolyshkin (1):
> >>   Documentation: describe /proc/<pid>/userns_counts
> >> 
> >>  Documentation/filesystems/proc.txt |  30 +++++++++++
> >>  fs/proc/array.c                    |  55 ++++++++++++++++++++
> >>  fs/proc/base.c                     |   1 +
> >>  fs/proc/internal.h                 |   1 +
> >>  include/linux/user_namespace.h     |   8 +++
> >>  kernel/ucount.c                    | 102 +++++++++++++++++++++++++++++++++++++
> >>  6 files changed, 197 insertions(+)
> >> 
> >> -- 
> >> 2.5.5

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2 v2] userns: show current values of user namespace counters
  2016-10-10 16:22     ` Andrei Vagin
@ 2016-10-10 20:44       ` Eric W. Biederman
  0 siblings, 0 replies; 11+ messages in thread
From: Eric W. Biederman @ 2016-10-10 20:44 UTC (permalink / raw)
  To: Andrei Vagin; +Cc: avagin, containers, linux-kernel, Serge Hallyn, Kees Cook

Andrei Vagin <avagin@virtuozzo.com> writes:

> On Thu, Oct 06, 2016 at 02:33:53PM -0500, Eric W. Biederman wrote:
>> Andrei Vagin <avagin@virtuozzo.com> writes:
>> 
>> > Hello Eric,
>> >
>> > What do you think about this series? It should be useful to know current
>> > usage for user counters.
>> 
>> I am in favor of knowing the values.  Unless there is a good reason not
>> to we should export the values with a read-only sysctl.  I believe that
>> is what other similar limits do.
>
> I want to have a place where I will be able to get limits for all
> users. I can't imagine how to do this with a sysctl. It will looks like
> multiline sysct-s, what doesn't look good. I will think. If you will
> have any ideas let me know. Thanks.

Something that has been on my wish list for a while has been to modify
/proc/sys/... to also show up under /proc/<pid>/sys/... for the
non-global values.  Now it might make sense to show these things in an
alternate filesystem.

At the same time I am a little leary of the desire.  Changing these
limits and watching them in a per-process / per-user sense is fine.
However their fundamental purpose is to be set and forget limits and
that only rarely should anyone need to mess with.  Which makes the
primary purpose of looking at them debugging and verifying that the
limits are set to reasonable values.

Active management if someone wants to go there is possible but it will
never be the primary purpose of these limits.

>> As for having per process knowledge I think that is probably something
>> we want to solve for these sysctls as well.
>> 
>> I don't think I saw anyone looking at this code from the perspective of
>> information leaks.  I think we need to ask that question, as similar
>> interfaces have been problematic from an information leak point of view.
>
> It's a good question.

I expect that we don't actually care.  The kernel tends to leak a lot of
this kind of information.  But I figure we should at least be able to
say we thought about it and we don't care.

Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-10-10 20:46 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-15 20:10 [PATCH 0/2 v2] userns: show current values of user namespace counters Andrei Vagin
2016-08-15 20:10 ` [PATCH 1/2] kernel: " Andrei Vagin
2016-08-16 20:00   ` Kees Cook
2016-08-16 20:05     ` Serge E. Hallyn
2016-08-16 22:44       ` Andrei Vagin
2016-08-15 20:10 ` [PATCH 2/2] Documentation: describe /proc/<pid>/userns_counts Andrei Vagin
2016-08-16 22:53 ` [PATCH 0/2 v2] userns: show current values of user namespace counters Serge E. Hallyn
2016-10-06 17:51 ` Andrei Vagin
2016-10-06 19:33   ` Eric W. Biederman
2016-10-10 16:22     ` Andrei Vagin
2016-10-10 20:44       ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).