linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] procfs: use an enum for possible hidepid values
@ 2016-11-03 15:30 Lafcadio Wluiki
  2016-11-03 15:30 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
  2016-11-03 15:49 ` [PATCH 1/2] procfs: use an enum for possible hidepid values Kees Cook
  0 siblings, 2 replies; 13+ messages in thread
From: Lafcadio Wluiki @ 2016-11-03 15:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, Kees Cook

(Third, rebased submission, since first two submissions yielded no replies.)

Previously, the hidepid parameter was checked by comparing literal
integers 0, 1, 2. Let's add a proper enum for this, to make the checking
more expressive:

	0 → HIDEPID_OFF
	1 → HIDEPID_NO_ACCESS
	2 → HIDEPID_INVISIBLE

This changes the internal labelling only, the userspace-facing interface
remains unmodified, and still works with literal integers 0, 1, 2.

No functional changes.

Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>
---
 fs/proc/base.c                | 8 ++++----
 fs/proc/inode.c               | 2 +-
 fs/proc/root.c                | 3 ++-
 include/linux/pid_namespace.h | 6 ++++++
 4 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index ca651ac..ae5e13c 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -726,11 +726,11 @@ static int proc_pid_permission(struct inode *inode, int mask)
 	task = get_proc_task(inode);
 	if (!task)
 		return -ESRCH;
-	has_perms = has_pid_permissions(pid, task, 1);
+	has_perms = has_pid_permissions(pid, task, HIDEPID_NO_ACCESS);
 	put_task_struct(task);
 
 	if (!has_perms) {
-		if (pid->hide_pid == 2) {
+		if (pid->hide_pid == HIDEPID_INVISIBLE) {
 			/*
 			 * Let's make getdents(), stat(), and open()
 			 * consistent with each other.  If a process
@@ -1720,7 +1720,7 @@ int pid_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
 	stat->gid = GLOBAL_ROOT_GID;
 	task = pid_task(proc_pid(inode), PIDTYPE_PID);
 	if (task) {
-		if (!has_pid_permissions(pid, task, 2)) {
+		if (!has_pid_permissions(pid, task, HIDEPID_INVISIBLE)) {
 			rcu_read_unlock();
 			/*
 			 * This doesn't prevent learning whether PID exists,
@@ -3181,7 +3181,7 @@ int proc_pid_readdir(struct file *file, struct dir_context *ctx)
 	     iter.tgid += 1, iter = next_tgid(ns, iter)) {
 		char name[PROC_NUMBUF];
 		int len;
-		if (!has_pid_permissions(ns, iter.task, 2))
+		if (!has_pid_permissions(ns, iter.task, HIDEPID_INVISIBLE))
 			continue;
 
 		len = snprintf(name, sizeof(name), "%d", iter.tgid);
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index e69ebe6..872325e 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -106,7 +106,7 @@ static int proc_show_options(struct seq_file *seq, struct dentry *root)
 
 	if (!gid_eq(pid->pid_gid, GLOBAL_ROOT_GID))
 		seq_printf(seq, ",gid=%u", from_kgid_munged(&init_user_ns, pid->pid_gid));
-	if (pid->hide_pid != 0)
+	if (pid->hide_pid != HIDEPID_OFF)
 		seq_printf(seq, ",hidepid=%u", pid->hide_pid);
 
 	return 0;
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 8d3e484..2989731 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -58,7 +58,8 @@ int proc_parse_options(char *options, struct pid_namespace *pid)
 		case Opt_hidepid:
 			if (match_int(&args[0], &option))
 				return 0;
-			if (option < 0 || option > 2) {
+			if (option < HIDEPID_OFF ||
+			    option > HIDEPID_INVISIBLE) {
 				pr_err("proc: hidepid value must be between 0 and 2.\n");
 				return 0;
 			}
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 34cce96..c2a989d 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -21,6 +21,12 @@ struct pidmap {
 
 struct fs_pin;
 
+enum { /* definitions for pid_namespace's hide_pid field */
+	HIDEPID_OFF	  = 0,
+	HIDEPID_NO_ACCESS = 1,
+	HIDEPID_INVISIBLE = 2,
+};
+
 struct pid_namespace {
 	struct kref kref;
 	struct pidmap pidmap[PIDMAP_ENTRIES];
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-11-03 15:30 [PATCH 1/2] procfs: use an enum for possible hidepid values Lafcadio Wluiki
@ 2016-11-03 15:30 ` Lafcadio Wluiki
  2016-11-03 16:12   ` Kees Cook
  2016-11-03 18:24   ` [2/2] " Jann Horn
  2016-11-03 15:49 ` [PATCH 1/2] procfs: use an enum for possible hidepid values Kees Cook
  1 sibling, 2 replies; 13+ messages in thread
From: Lafcadio Wluiki @ 2016-11-03 15:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, Kees Cook

(Third, rebased submission, since first two submissions yielded no replies.)

This adds a new per-task hidepid= flag that is honored by procfs when
presenting /proc to the user, in addition to the existing hidepid= mount
option. So far, hidepid= was exclusively a per-pidns setting. Locking
down a set of processes so that they cannot see other users' processes
without affecting the rest of the system thus currently requires
creation of a private PID namespace, with all the complexity it brings,
including maintaining a stub init process as PID 1 and losing the
ability to see processes of the same user on the rest of the system.

With this patch all access and visibility checks in procfs now
honour two fields:

	a) the existing hide_pid field in the PID namespace
	b) the new hide_pid in struct task_struct

Access/visibility is only granted if both fields permit it; the more
restrictive one wins. The new task_struct hide_pid value defaults to 0,
which means behaviour is unchanged from the status quo.

Setting the per-process hide_pid value is done via a new PR_SET_HIDEPID
prctl() option which takes the same three supported values as the
hidepid= mount option. The per-process hide_pid may only be increased,
never decreased, thus ensuring that once applied, processes can never
escape such a hide_pid jail.  When a process forks it inherits its
parent's hide_pid value.
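
The rules in this paragraph can be sketched as a small userspace model.
This is an illustration only, not the kernel code: struct task_model,
model_set_hidepid() and model_fork() are hypothetical names that stand in
for task_struct, the PR_SET_HIDEPID case in prctl() and the copy_process()
hunk respectively.

```c
#include <assert.h>
#include <errno.h>

/* Levels as defined by patch 1/2. */
enum {
	HIDEPID_OFF	  = 0,
	HIDEPID_NO_ACCESS = 1,
	HIDEPID_INVISIBLE = 2,
};

/* The patch stores the level in a 2-bit field, so every level must fit. */
static_assert(HIDEPID_INVISIBLE <= 3, "level must fit a 2-bit field");

/* Hypothetical stand-in for the one task_struct field the patch adds. */
struct task_model {
	unsigned int hide_pid;
};

/*
 * Models the PR_SET_HIDEPID case: reject out-of-range values, then
 * refuse any attempt to lower the current level. arg2 is unsigned,
 * so only the upper bound needs an explicit check here.
 */
static inline int model_set_hidepid(struct task_model *me, unsigned long arg2)
{
	if (arg2 > HIDEPID_INVISIBLE)
		return -EINVAL;
	if (arg2 < me->hide_pid)
		return -EPERM;
	me->hide_pid = (unsigned int)arg2;
	return 0;
}

/* Models the copy_process() hunk: the child starts at the parent's level. */
static inline struct task_model model_fork(const struct task_model *parent)
{
	struct task_model child = { .hide_pid = parent->hide_pid };
	return child;
}
```

In this model, re-applying the current level succeeds and only a strictly
lower value yields -EPERM, which matches the "may only be increased, never
decreased" wording above.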

Suggested usecase: let's say nginx runs as user "www-data". After
dropping privileges it may now call:

	…
	prctl(PR_SET_HIDEPID, 2);
	…

And from that point on neither nginx itself, nor any of its child
processes may see processes in /proc anymore that belong to a different
user than "www-data". Other services running on the same system remain
unaffected.

This should permit Linux distributions to more comprehensively lock down
their services, as it allows an isolated opt-in for hidepid= for
specific services. Previously hidepid= could only be set system-wide,
and then specific services had to be excluded by group membership,
essentially a more complex concept of opt-out.

A test-tool that validates this functionality is available here:

	https://paste.fedoraproject.org/412975/71967605/

Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>
---
 fs/proc/array.c            |  3 +++
 fs/proc/base.c             |  6 ++++--
 include/linux/init_task.h  |  1 +
 include/linux/sched.h      |  1 +
 include/uapi/linux/prctl.h |  4 ++++
 kernel/fork.c              |  1 +
 kernel/sys.c               | 10 ++++++++++
 7 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 81818ad..ea801e5 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -163,6 +163,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	const struct cred *cred;
 	pid_t ppid, tpid = 0, tgid, ngid;
 	unsigned int max_fds = 0;
+	int hide_pid;
 
 	rcu_read_lock();
 	ppid = pid_alive(p) ?
@@ -183,6 +184,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	task_lock(p);
 	if (p->files)
 		max_fds = files_fdtable(p->files)->max_fds;
+	hide_pid = p->hide_pid;
 	task_unlock(p);
 	rcu_read_unlock();
 
@@ -201,6 +203,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->egid));
 	seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->sgid));
 	seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->fsgid));
+	seq_put_decimal_ull(m, "\nHidePID:\t", hide_pid);
 	seq_put_decimal_ull(m, "\nFDSize:\t", max_fds);
 
 	seq_puts(m, "\nGroups:\t");
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ae5e13c..6c9a42b 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -709,7 +709,8 @@ static bool has_pid_permissions(struct pid_namespace *pid,
 				 struct task_struct *task,
 				 int hide_pid_min)
 {
-	if (pid->hide_pid < hide_pid_min)
+	if (pid->hide_pid < hide_pid_min &&
+	    current->hide_pid < hide_pid_min)
 		return true;
 	if (in_group_p(pid->pid_gid))
 		return true;
@@ -730,7 +731,8 @@ static int proc_pid_permission(struct inode *inode, int mask)
 	put_task_struct(task);
 
 	if (!has_perms) {
-		if (pid->hide_pid == HIDEPID_INVISIBLE) {
+		if (pid->hide_pid == HIDEPID_INVISIBLE ||
+		    current->hide_pid == HIDEPID_INVISIBLE) {
 			/*
 			 * Let's make getdents(), stat(), and open()
 			 * consistent with each other.  If a process
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 325f649..c87de0e 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -250,6 +250,7 @@ extern struct task_group root_task_group;
 	.cpu_timers	= INIT_CPU_TIMERS(tsk.cpu_timers),		\
 	.pi_lock	= __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),	\
 	.timer_slack_ns = 50000, /* 50 usec default slack */		\
+	.hide_pid	= 0,						\
 	.pids = {							\
 		[PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),		\
 		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 348f51b..3e8ca16 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1572,6 +1572,7 @@ struct task_struct {
 	/* unserialized, strictly 'current' */
 	unsigned in_execve:1; /* bit to tell LSMs we're in execve */
 	unsigned in_iowait:1;
+	unsigned hide_pid:2; /* per-process procfs hidepid= */
 #if !defined(TIF_RESTORE_SIGMASK)
 	unsigned restore_sigmask:1;
 #endif
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index a8d0759..ada62b6 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -197,4 +197,8 @@ struct prctl_mm_map {
 # define PR_CAP_AMBIENT_LOWER		3
 # define PR_CAP_AMBIENT_CLEAR_ALL	4
 
+/* Per process, non-revokable procfs hidepid= option */
+#define PR_SET_HIDEPID 48
+#define PR_GET_HIDEPID 49
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 623259f..d4fe951 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1562,6 +1562,7 @@ static __latent_entropy struct task_struct *copy_process(
 #endif
 
 	p->default_timer_slack_ns = current->timer_slack_ns;
+	p->hide_pid = current->hide_pid;
 
 	task_io_accounting_init(&p->ioac);
 	acct_clear_integrals(p);
diff --git a/kernel/sys.c b/kernel/sys.c
index 89d5be4..c0a1d3e 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_GET_FP_MODE:
 		error = GET_FP_MODE(me);
 		break;
+	case PR_SET_HIDEPID:
+		if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
+			return -EINVAL;
+		if (arg2 < me->hide_pid)
+			return -EPERM;
+		me->hide_pid = arg2;
+		break;
+	case PR_GET_HIDEPID:
+		error = put_user((int) me->hide_pid, (int __user *)arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;
-- 
2.7.4


* Re: [PATCH 1/2] procfs: use an enum for possible hidepid values
  2016-11-03 15:30 [PATCH 1/2] procfs: use an enum for possible hidepid values Lafcadio Wluiki
  2016-11-03 15:30 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
@ 2016-11-03 15:49 ` Kees Cook
  2016-11-15 23:27   ` Kees Cook
  1 sibling, 1 reply; 13+ messages in thread
From: Kees Cook @ 2016-11-03 15:49 UTC (permalink / raw)
  To: Lafcadio Wluiki, Andrew Morton; +Cc: LKML

On Thu, Nov 3, 2016 at 9:30 AM, Lafcadio Wluiki <wluikil@gmail.com> wrote:
> (Third, rebased submission, since first two submissions yielded no replies.)

Hm, I didn't see this series before, for some reason.

> Previously, the hidepid parameter was checked by comparing literal
> integers 0, 1, 2. Let's add a proper enum for this, to make the checking
> more expressive:
>
>         0 → HIDEPID_OFF
>         1 → HIDEPID_NO_ACCESS
>         2 → HIDEPID_INVISIBLE
>
> This changes the internal labelling only, the userspace-facing interface
> remains unmodified, and still works with literal integers 0, 1, 2.
>
> No functional changes.
>
> Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>

Yup, this is good. Dropping literals is always preferred. :)

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
>  fs/proc/base.c                | 8 ++++----
>  fs/proc/inode.c               | 2 +-
>  fs/proc/root.c                | 3 ++-
>  include/linux/pid_namespace.h | 6 ++++++
>  4 files changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index ca651ac..ae5e13c 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -726,11 +726,11 @@ static int proc_pid_permission(struct inode *inode, int mask)
>         task = get_proc_task(inode);
>         if (!task)
>                 return -ESRCH;
> -       has_perms = has_pid_permissions(pid, task, 1);
> +       has_perms = has_pid_permissions(pid, task, HIDEPID_NO_ACCESS);
>         put_task_struct(task);
>
>         if (!has_perms) {
> -               if (pid->hide_pid == 2) {
> +               if (pid->hide_pid == HIDEPID_INVISIBLE) {
>                         /*
>                          * Let's make getdents(), stat(), and open()
>                          * consistent with each other.  If a process
> @@ -1720,7 +1720,7 @@ int pid_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
>         stat->gid = GLOBAL_ROOT_GID;
>         task = pid_task(proc_pid(inode), PIDTYPE_PID);
>         if (task) {
> -               if (!has_pid_permissions(pid, task, 2)) {
> +               if (!has_pid_permissions(pid, task, HIDEPID_INVISIBLE)) {
>                         rcu_read_unlock();
>                         /*
>                          * This doesn't prevent learning whether PID exists,
> @@ -3181,7 +3181,7 @@ int proc_pid_readdir(struct file *file, struct dir_context *ctx)
>              iter.tgid += 1, iter = next_tgid(ns, iter)) {
>                 char name[PROC_NUMBUF];
>                 int len;
> -               if (!has_pid_permissions(ns, iter.task, 2))
> +               if (!has_pid_permissions(ns, iter.task, HIDEPID_INVISIBLE))
>                         continue;
>
>                 len = snprintf(name, sizeof(name), "%d", iter.tgid);
> diff --git a/fs/proc/inode.c b/fs/proc/inode.c
> index e69ebe6..872325e 100644
> --- a/fs/proc/inode.c
> +++ b/fs/proc/inode.c
> @@ -106,7 +106,7 @@ static int proc_show_options(struct seq_file *seq, struct dentry *root)
>
>         if (!gid_eq(pid->pid_gid, GLOBAL_ROOT_GID))
>                 seq_printf(seq, ",gid=%u", from_kgid_munged(&init_user_ns, pid->pid_gid));
> -       if (pid->hide_pid != 0)
> +       if (pid->hide_pid != HIDEPID_OFF)
>                 seq_printf(seq, ",hidepid=%u", pid->hide_pid);
>
>         return 0;
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index 8d3e484..2989731 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -58,7 +58,8 @@ int proc_parse_options(char *options, struct pid_namespace *pid)
>                 case Opt_hidepid:
>                         if (match_int(&args[0], &option))
>                                 return 0;
> -                       if (option < 0 || option > 2) {
> +                       if (option < HIDEPID_OFF ||
> +                           option > HIDEPID_INVISIBLE) {
>                                 pr_err("proc: hidepid value must be between 0 and 2.\n");
>                                 return 0;
>                         }
> diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
> index 34cce96..c2a989d 100644
> --- a/include/linux/pid_namespace.h
> +++ b/include/linux/pid_namespace.h
> @@ -21,6 +21,12 @@ struct pidmap {
>
>  struct fs_pin;
>
> +enum { /* definitions for pid_namespace's hide_pid field */
> +       HIDEPID_OFF       = 0,
> +       HIDEPID_NO_ACCESS = 1,
> +       HIDEPID_INVISIBLE = 2,
> +};
> +
>  struct pid_namespace {
>         struct kref kref;
>         struct pidmap pidmap[PIDMAP_ENTRIES];
> --
> 2.7.4
>



-- 
Kees Cook
Nexus Security


* Re: [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-11-03 15:30 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
@ 2016-11-03 16:12   ` Kees Cook
  2016-11-03 17:55     ` Jann Horn
  2016-11-03 18:24   ` [2/2] " Jann Horn
  1 sibling, 1 reply; 13+ messages in thread
From: Kees Cook @ 2016-11-03 16:12 UTC (permalink / raw)
  To: Lafcadio Wluiki
  Cc: LKML, Andrew Morton, linux-arch, Jann Horn, kernel-hardening

On Thu, Nov 3, 2016 at 9:30 AM, Lafcadio Wluiki <wluikil@gmail.com> wrote:
> (Third, rebased submission, since first two submissions yielded no replies.)
>
> This adds a new per-task hidepid= flag that is honored by procfs when
> presenting /proc to the user, in addition to the existing hidepid= mount
> option. So far, hidepid= was exclusively a per-pidns setting. Locking
> down a set of processes so that they cannot see other users' processes
> without affecting the rest of the system thus currently requires
> creation of a private PID namespace, with all the complexity it brings,
> including maintaining a stub init process as PID 1 and losing the
> ability to see processes of the same user on the rest of the system.
>
> With this patch all access and visibility checks in procfs now
> honour two fields:
>
>         a) the existing hide_pid field in the PID namespace
>         b) the new hide_pid in struct task_struct
>
> Access/visibility is only granted if both fields permit it; the more
> restrictive one wins. The new task_struct hide_pid value defaults to 0,
> which means behaviour is unchanged from the status quo.
>
> Setting the per-process hide_pid value is done via a new PR_SET_HIDEPID
> prctl() option which takes the same three supported values as the
> hidepid= mount option. The per-process hide_pid may only be increased,
> never decreased, thus ensuring that once applied, processes can never
> escape such a hide_pid jail.  When a process forks it inherits its
> parent's hide_pid value.
>
> Suggested usecase: let's say nginx runs as user "www-data". After
> dropping privileges it may now call:
>
>         …
>         prctl(PR_SET_HIDEPID, 2);
>         …
>
> And from that point on neither nginx itself, nor any of its child
> processes may see processes in /proc anymore that belong to a different
> user than "www-data". Other services running on the same system remain
> unaffected.
>
> This should permit Linux distributions to more comprehensively lock down
> their services, as it allows an isolated opt-in for hidepid= for
> specific services. Previously hidepid= could only be set system-wide,
> and then specific services had to be excluded by group membership,
> essentially a more complex concept of opt-out.
>
> A test-tool that validates this functionality is available here:
>
>         https://paste.fedoraproject.org/412975/71967605/
>
> Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>

I like this idea: it meaningfully reduces attack surface, even though
it doesn't ensure the same confinement as a pid namespace (e.g. a
process with this prctl set can still direct syscalls to pids that are
hidden). However, the attack surface in /proc is relatively large
compared to the syscalls that use pids.

For task launchers, this may overlap nicely with no_new_privs too.

Some suggestions/nits below...

> ---
>  fs/proc/array.c            |  3 +++
>  fs/proc/base.c             |  6 ++++--
>  include/linux/init_task.h  |  1 +
>  include/linux/sched.h      |  1 +
>  include/uapi/linux/prctl.h |  4 ++++
>  kernel/fork.c              |  1 +
>  kernel/sys.c               | 10 ++++++++++
>  7 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 81818ad..ea801e5 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -163,6 +163,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         const struct cred *cred;
>         pid_t ppid, tpid = 0, tgid, ngid;
>         unsigned int max_fds = 0;
> +       int hide_pid;
>
>         rcu_read_lock();
>         ppid = pid_alive(p) ?
> @@ -183,6 +184,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         task_lock(p);
>         if (p->files)
>                 max_fds = files_fdtable(p->files)->max_fds;
> +       hide_pid = p->hide_pid;
>         task_unlock(p);
>         rcu_read_unlock();
>
> @@ -201,6 +203,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->egid));
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->sgid));
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->fsgid));
> +       seq_put_decimal_ull(m, "\nHidePID:\t", hide_pid);
>         seq_put_decimal_ull(m, "\nFDSize:\t", max_fds);
>
>         seq_puts(m, "\nGroups:\t");

This should get an addition to table 1-2 of
Documentation/filesystems/proc.txt that covers the contents of
/proc/$pid/status


> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index ae5e13c..6c9a42b 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -709,7 +709,8 @@ static bool has_pid_permissions(struct pid_namespace *pid,
>                                  struct task_struct *task,
>                                  int hide_pid_min)
>  {
> -       if (pid->hide_pid < hide_pid_min)
> +       if (pid->hide_pid < hide_pid_min &&
> +           current->hide_pid < hide_pid_min)
>                 return true;
>         if (in_group_p(pid->pid_gid))
>                 return true;
> @@ -730,7 +731,8 @@ static int proc_pid_permission(struct inode *inode, int mask)
>         put_task_struct(task);
>
>         if (!has_perms) {
> -               if (pid->hide_pid == HIDEPID_INVISIBLE) {
> +               if (pid->hide_pid == HIDEPID_INVISIBLE ||
> +                   current->hide_pid == HIDEPID_INVISIBLE) {
>                         /*
>                          * Let's make getdents(), stat(), and open()
>                          * consistent with each other.  If a process

Instead of open-coding both of these "||" tests, I think it might be
cleaner to just choose the highest protection value, and use that in
both comparisons. e.g.

    int hide_pid = max(pid->hide_pid, current->hide_pid);

    if (hide_pid < hide_pid_min) ...

    if (hide_pid == HIDEPID_INVISIBLE) ...
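
To check that folding the two fields with max() gives the same answers as
the two open-coded tests in the patch, here is a minimal userspace sketch.
The helper names (effective_hide_pid, hidepid_allows, hidepid_hides_entry)
are hypothetical, and the model covers only the hide_pid comparisons, not
the in_group_p() or ptrace checks that follow in has_pid_permissions().

```c
#include <assert.h>
#include <stdbool.h>

/* Levels as defined by patch 1/2. */
enum {
	HIDEPID_OFF	  = 0,
	HIDEPID_NO_ACCESS = 1,
	HIDEPID_INVISIBLE = 2,
};

/* Folding with max() only works if the levels are ordered by strictness. */
static_assert(HIDEPID_OFF < HIDEPID_NO_ACCESS &&
	      HIDEPID_NO_ACCESS < HIDEPID_INVISIBLE,
	      "hidepid levels must be ordered by strictness");

/* Hypothetical helper: the stricter of the pidns and per-task settings. */
static inline int effective_hide_pid(int ns_hide_pid, int task_hide_pid)
{
	return ns_hide_pid > task_hide_pid ? ns_hide_pid : task_hide_pid;
}

/*
 * Equivalent to "ns < min && task < min" from the patch: the early
 * "return true" applies only when neither setting reaches the threshold.
 */
static inline bool hidepid_allows(int ns_hide_pid, int task_hide_pid,
				  int hide_pid_min)
{
	return effective_hide_pid(ns_hide_pid, task_hide_pid) < hide_pid_min;
}

/* Equivalent to "ns == INVISIBLE || task == INVISIBLE" from the patch. */
static inline bool hidepid_hides_entry(int ns_hide_pid, int task_hide_pid)
{
	return effective_hide_pid(ns_hide_pid, task_hide_pid) ==
	       HIDEPID_INVISIBLE;
}
```

The second equivalence holds only because HIDEPID_INVISIBLE is the largest
level; adding a higher level later would require revisiting that comparison.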


> diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> index 325f649..c87de0e 100644
> --- a/include/linux/init_task.h
> +++ b/include/linux/init_task.h
> @@ -250,6 +250,7 @@ extern struct task_group root_task_group;
>         .cpu_timers     = INIT_CPU_TIMERS(tsk.cpu_timers),              \
>         .pi_lock        = __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),        \
>         .timer_slack_ns = 50000, /* 50 usec default slack */            \
> +       .hide_pid       = 0,                                            \
>         .pids = {                                                       \
>                 [PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),            \
>                 [PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),           \
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 348f51b..3e8ca16 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1572,6 +1572,7 @@ struct task_struct {
>         /* unserialized, strictly 'current' */
>         unsigned in_execve:1; /* bit to tell LSMs we're in execve */
>         unsigned in_iowait:1;
> +       unsigned hide_pid:2; /* per-process procfs hidepid= */
>  #if !defined(TIF_RESTORE_SIGMASK)
>         unsigned restore_sigmask:1;
>  #endif
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index a8d0759..ada62b6 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -197,4 +197,8 @@ struct prctl_mm_map {
>  # define PR_CAP_AMBIENT_LOWER          3
>  # define PR_CAP_AMBIENT_CLEAR_ALL      4
>
> +/* Per process, non-revokable procfs hidepid= option */
> +#define PR_SET_HIDEPID 48
> +#define PR_GET_HIDEPID 49
> +
>  #endif /* _LINUX_PRCTL_H */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 623259f..d4fe951 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1562,6 +1562,7 @@ static __latent_entropy struct task_struct *copy_process(
>  #endif
>
>         p->default_timer_slack_ns = current->timer_slack_ns;
> +       p->hide_pid = current->hide_pid;
>
>         task_io_accounting_init(&p->ioac);
>         acct_clear_integrals(p);
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 89d5be4..c0a1d3e 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>         case PR_GET_FP_MODE:
>                 error = GET_FP_MODE(me);
>                 break;
> +       case PR_SET_HIDEPID:
> +               if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
> +                       return -EINVAL;
> +               if (arg2 < me->hide_pid)
> +                       return -EPERM;
> +               me->hide_pid = arg2;
> +               break;
> +       case PR_GET_HIDEPID:
> +               error = put_user((int) me->hide_pid, (int __user *)arg2);
> +               break;
>         default:
>                 error = -EINVAL;
>                 break;
> --
> 2.7.4
>

Since this adds a new prctl interface, it's best to Cc linux-arch
(which I added now).

Thanks for proposing this!

-Kees

-- 
Kees Cook
Nexus Security


* Re: [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-11-03 16:12   ` Kees Cook
@ 2016-11-03 17:55     ` Jann Horn
  2016-11-03 18:05       ` Kees Cook
  0 siblings, 1 reply; 13+ messages in thread
From: Jann Horn @ 2016-11-03 17:55 UTC (permalink / raw)
  To: Kees Cook
  Cc: Lafcadio Wluiki, LKML, Andrew Morton, linux-arch, kernel-hardening

On Thu, Nov 03, 2016 at 10:12:55AM -0600, Kees Cook wrote:
> On Thu, Nov 3, 2016 at 9:30 AM, Lafcadio Wluiki <wluikil@gmail.com> wrote:
> > (Third, rebased submission, since first two submissions yielded no replies.)
> >
> > This adds a new per-task hidepid= flag that is honored by procfs when
> > presenting /proc to the user, in addition to the existing hidepid= mount
> > option. So far, hidepid= was exclusively a per-pidns setting. Locking
> > down a set of processes so that they cannot see other users' processes
> > without affecting the rest of the system thus currently requires
> > creation of a private PID namespace, with all the complexity it brings,
> > including maintaining a stub init process as PID 1 and losing the
> > ability to see processes of the same user on the rest of the system.
[...]
> Since this adds a new prctl interface, it's best to Cc linux-arch
> (which I added now).

Please also CC linux-api for the next iteration, since this is a new
userspace-facing API.


* Re: [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-11-03 17:55     ` Jann Horn
@ 2016-11-03 18:05       ` Kees Cook
  0 siblings, 0 replies; 13+ messages in thread
From: Kees Cook @ 2016-11-03 18:05 UTC (permalink / raw)
  To: Jann Horn
  Cc: Lafcadio Wluiki, LKML, Andrew Morton, kernel-hardening, linux-arch

On Thu, Nov 3, 2016 at 11:55 AM, Jann Horn <jann@thejh.net> wrote:
> On Thu, Nov 03, 2016 at 10:12:55AM -0600, Kees Cook wrote:
>> On Thu, Nov 3, 2016 at 9:30 AM, Lafcadio Wluiki <wluikil@gmail.com> wrote:
>> > (Third, rebased submission, since first two submissions yielded no replies.)
>> >
>> > This adds a new per-task hidepid= flag that is honored by procfs when
>> > presenting /proc to the user, in addition to the existing hidepid= mount
>> > option. So far, hidepid= was exclusively a per-pidns setting. Locking
>> > down a set of processes so that they cannot see other users' processes
>> > without affecting the rest of the system thus currently requires
>> > creation of a private PID namespace, with all the complexity it brings,
>> > including maintaining a stub init process as PID 1 and losing the
>> > ability to see processes of the same user on the rest of the system.
> [...]
>> Since this adds a new prctl interface, it's best to Cc linux-arch
>> (which I added now).
>
> Please also CC linux-api for the next iteration, since this is a new
> userspace-facing API.

Oops, thank you. I meant linux-api, not linux-arch. :P

-Kees

-- 
Kees Cook
Nexus Security


* Re: [2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-11-03 15:30 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
  2016-11-03 16:12   ` Kees Cook
@ 2016-11-03 18:24   ` Jann Horn
  2016-11-03 20:21     ` Lafcadio Wluiki
  2016-11-03 20:34     ` Kees Cook
  1 sibling, 2 replies; 13+ messages in thread
From: Jann Horn @ 2016-11-03 18:24 UTC (permalink / raw)
  To: Lafcadio Wluiki; +Cc: linux-kernel, Andrew Morton, Kees Cook, kernel-hardening

On Thu, Nov 03, 2016 at 09:30:38AM -0600, Lafcadio Wluiki wrote:
> This adds a new per-task hidepid= flag that is honored by procfs when
> presenting /proc to the user, in addition to the existing hidepid= mount
> option. So far, hidepid= was exclusively a per-pidns setting. Locking
> down a set of processes so that they cannot see other users' processes
> without affecting the rest of the system thus currently requires
> creation of a private PID namespace, with all the complexity it brings,
> including maintaining a stub init process as PID 1 and losing the
> ability to see processes of the same user on the rest of the system.
[...]
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 89d5be4..c0a1d3e 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>  	case PR_GET_FP_MODE:
>  		error = GET_FP_MODE(me);
>  		break;
> +	case PR_SET_HIDEPID:
> +		if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
> +			return -EINVAL;
> +		if (arg2 < me->hide_pid)
> +			return -EPERM;
> +		me->hide_pid = arg2;
> +		break;

Should we test for ns_capable(CAP_SYS_ADMIN)||no_new_privs here?
I think it wouldn't hurt, and I'd like to avoid adding new ways in which
the execution of setuid programs can be influenced. OTOH, people already
use hidepid now, and it's not an issue... I'm not sure. Opinions?
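
For illustration, the gate could look roughly like the userspace model
below. This is a sketch under assumptions, not a proposed hunk: struct
gated_task and model_set_hidepid_gated() are hypothetical names, the two
booleans stand in for the no_new_privs flag and the
ns_capable(CAP_SYS_ADMIN) result, and the -EPERM returned when the gate
fails is an assumption, since no error code has been discussed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Levels as defined by patch 1/2. */
enum {
	HIDEPID_OFF	  = 0,
	HIDEPID_NO_ACCESS = 1,
	HIDEPID_INVISIBLE = 2,
};

/* A zero-initialised task must start out unrestricted. */
static_assert(HIDEPID_OFF == 0, "default level must be zero");

/* Hypothetical stand-in for the task state the check would consult. */
struct gated_task {
	unsigned int hide_pid;
	bool no_new_privs;	/* as set via PR_SET_NO_NEW_PRIVS */
	bool cap_sys_admin;	/* stands in for ns_capable(CAP_SYS_ADMIN) */
};

/*
 * Models PR_SET_HIDEPID with the extra gate: the caller needs either
 * CAP_SYS_ADMIN or no_new_privs before the level may be changed.
 */
static inline int model_set_hidepid_gated(struct gated_task *me,
					   unsigned long arg2)
{
	if (arg2 > HIDEPID_INVISIBLE)
		return -EINVAL;
	if (!me->cap_sys_admin && !me->no_new_privs)
		return -EPERM;	/* assumed error code */
	if (arg2 < me->hide_pid)
		return -EPERM;	/* the level may only be raised */
	me->hide_pid = (unsigned int)arg2;
	return 0;
}
```

With a gate like this, an unprivileged task would have to set no_new_privs
first, so it could no longer execute a setuid binary with elevated
privileges under a hidepid level of its own choosing.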

@Lafcadio: Do you think that requiring no_new_privs to be set would
break your usecase? Would nginx need to still be able to execute setuid
binaries?

Aside from this, and the comments Kees already made, this looks good
to me.


* Re: [2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-11-03 18:24   ` [2/2] " Jann Horn
@ 2016-11-03 20:21     ` Lafcadio Wluiki
  2016-11-03 20:34     ` Kees Cook
  1 sibling, 0 replies; 13+ messages in thread
From: Lafcadio Wluiki @ 2016-11-03 20:21 UTC (permalink / raw)
  To: Jann Horn; +Cc: linux-kernel, Andrew Morton, Kees Cook, kernel-hardening

On Thu, Nov 3, 2016 at 12:24 PM, Jann Horn <jann@thejh.net> wrote:

>> +     case PR_SET_HIDEPID:
>> +             if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
>> +                     return -EINVAL;
>> +             if (arg2 < me->hide_pid)
>> +                     return -EPERM;
>> +             me->hide_pid = arg2;
>> +             break;
>
> Should we test for ns_capable(CAP_SYS_ADMIN)||no_new_privs here?
> I think it wouldn't hurt, and I'd like to avoid adding new ways in which
> the execution of setuid programs can be influenced. OTOH, people already
> use hidepid now, and it's not an issue... I'm not sure. Opinions?

Hmm, the existing hidepid= thing is a mount option, and that you of
course can only change with root privs, hence the NNP thing doesn't
really apply to hidepid so far.

> @Lafcadio: Do you think that requiring no_new_privs to be set would
> break your usecase? Would nginx need to still be able to execute setuid
> binaries?

I think adding the NNP check would be OK for my use. I'll add this to
the next iteration!

> Aside from this, and the comments Kees already made, this looks good
> to me.

Thanks for the review,

L.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-11-03 18:24   ` [2/2] " Jann Horn
  2016-11-03 20:21     ` Lafcadio Wluiki
@ 2016-11-03 20:34     ` Kees Cook
  2016-11-03 20:42       ` [kernel-hardening] " Jann Horn
  1 sibling, 1 reply; 13+ messages in thread
From: Kees Cook @ 2016-11-03 20:34 UTC (permalink / raw)
  To: Jann Horn; +Cc: Lafcadio Wluiki, LKML, Andrew Morton, kernel-hardening

On Thu, Nov 3, 2016 at 12:24 PM, Jann Horn <jann@thejh.net> wrote:
> On Thu, Nov 03, 2016 at 09:30:38AM -0600, Lafcadio Wluiki wrote:
>> This adds a new per-task hidepid= flag that is honored by procfs when
>> presenting /proc to the user, in addition to the existing hidepid= mount
>> option. So far, hidepid= was exclusively a per-pidns setting. Locking
>> down a set of processes so that they cannot see other users' processes
>> without affecting the rest of the system thus currently requires
>> creation of a private PID namespace, with all the complexity it brings,
>> including maintaining a stub init process as PID 1 and losing the
>> ability to see processes of the same user on the rest of the system.
> [...]
>> diff --git a/kernel/sys.c b/kernel/sys.c
>> index 89d5be4..c0a1d3e 100644
>> --- a/kernel/sys.c
>> +++ b/kernel/sys.c
>> @@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>>       case PR_GET_FP_MODE:
>>               error = GET_FP_MODE(me);
>>               break;
>> +     case PR_SET_HIDEPID:
>> +             if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
>> +                     return -EINVAL;
>> +             if (arg2 < me->hide_pid)
>> +                     return -EPERM;
>> +             me->hide_pid = arg2;
>> +             break;
>
> Should we test for ns_capable(CAP_SYS_ADMIN)||no_new_privs here?
> I think it wouldn't hurt, and I'd like to avoid adding new ways in which
> the execution of setuid programs can be influenced. OTOH, people already
> use hidepid now, and it's not an issue... I'm not sure. Opinions?

Hrrm, I'm really on the fence. I don't feel like having things in
/proc go invisible for a setuid would be bad, but I wouldn't be
surprised to eat my words. :) On the other hand, I can't think of a
place where this requirement would create a problem.

e.g. init launching a web server as root could set nnp and this, and
it would still be able to switch down to www-data, etc. If someone has
www-data in their /etc/sudoers file, I already fear for their sanity.
;)

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [kernel-hardening] Re: [2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-11-03 20:34     ` Kees Cook
@ 2016-11-03 20:42       ` Jann Horn
  0 siblings, 0 replies; 13+ messages in thread
From: Jann Horn @ 2016-11-03 20:42 UTC (permalink / raw)
  To: Kees Cook; +Cc: Lafcadio Wluiki, LKML, Andrew Morton, kernel-hardening

On Thu, Nov 03, 2016 at 02:34:16PM -0600, Kees Cook wrote:
> On Thu, Nov 3, 2016 at 12:24 PM, Jann Horn <jann@thejh.net> wrote:
> > On Thu, Nov 03, 2016 at 09:30:38AM -0600, Lafcadio Wluiki wrote:
> >> This adds a new per-task hidepid= flag that is honored by procfs when
> >> presenting /proc to the user, in addition to the existing hidepid= mount
> >> option. So far, hidepid= was exclusively a per-pidns setting. Locking
> >> down a set of processes so that they cannot see other users' processes
> >> without affecting the rest of the system thus currently requires
> >> creation of a private PID namespace, with all the complexity it brings,
> >> including maintaining a stub init process as PID 1 and losing the
> >> ability to see processes of the same user on the rest of the system.
> > [...]
> >> diff --git a/kernel/sys.c b/kernel/sys.c
> >> index 89d5be4..c0a1d3e 100644
> >> --- a/kernel/sys.c
> >> +++ b/kernel/sys.c
> >> @@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> >>       case PR_GET_FP_MODE:
> >>               error = GET_FP_MODE(me);
> >>               break;
> >> +     case PR_SET_HIDEPID:
> >> +             if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
> >> +                     return -EINVAL;
> >> +             if (arg2 < me->hide_pid)
> >> +                     return -EPERM;
> >> +             me->hide_pid = arg2;
> >> +             break;
> >
> > Should we test for ns_capable(CAP_SYS_ADMIN)||no_new_privs here?
> > I think it wouldn't hurt, and I'd like to avoid adding new ways in which
> > the execution of setuid programs can be influenced. OTOH, people already
> > use hidepid now, and it's not an issue... I'm not sure. Opinions?
> 
> Hrrm, I'm really on the fence. I don't feel like having things in
> /proc go invisible for a setuid would be bad, but I wouldn't be
> surprised to eat my words. :) On the other hand, I can't think of a
> place where this requirement would create a problem.
> 
> e.g. init launching a web server as root could set nnp and this, and
> it would still be able to switch down to www-data, etc. If someone has
> www-data in their /etc/sudoers file, I already fear for their sanity.
> ;)

(and init launching a web server as root could also set hidepid without
setting nnp if it really wants to)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] procfs: use an enum for possible hidepid values
  2016-11-03 15:49 ` [PATCH 1/2] procfs: use an enum for possible hidepid values Kees Cook
@ 2016-11-15 23:27   ` Kees Cook
  0 siblings, 0 replies; 13+ messages in thread
From: Kees Cook @ 2016-11-15 23:27 UTC (permalink / raw)
  To: Lafcadio Wluiki, Andrew Morton; +Cc: LKML

On Thu, Nov 3, 2016 at 8:49 AM, Kees Cook <keescook@chromium.org> wrote:
> On Thu, Nov 3, 2016 at 9:30 AM, Lafcadio Wluiki <wluikil@gmail.com> wrote:
>> (Third, rebased submission, since first two submissions yielded no replies.)
>
> Hm, I didn't see this series before, for some reason.
>
>> Previously, the hidepid parameter was checked by comparing literal
>> integers 0, 1, 2. Let's add a proper enum for this, to make the checking
>> more expressive:
>>
>>         0 → HIDEPID_OFF
>>         1 → HIDEPID_NO_ACCESS
>>         2 → HIDEPID_INVISIBLE
>>
>> This changes the internal labelling only, the userspace-facing interface
>> remains unmodified, and still works with literal integers 0, 1, 2.
>>
>> No functional changes.
>>
>> Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>
>
> Yup, this is good. Dropping literals is always preferred. :)
>
> Acked-by: Kees Cook <keescook@chromium.org>

Hi,

Friendly ping to Andrew for picking up this clean-up for -mm. (Though
not yet the 2/2 patch, as it still has some unanswered questions...)

-Kees

>
> -Kees
>
>> ---
>>  fs/proc/base.c                | 8 ++++----
>>  fs/proc/inode.c               | 2 +-
>>  fs/proc/root.c                | 3 ++-
>>  include/linux/pid_namespace.h | 6 ++++++
>>  4 files changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index ca651ac..ae5e13c 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -726,11 +726,11 @@ static int proc_pid_permission(struct inode *inode, int mask)
>>         task = get_proc_task(inode);
>>         if (!task)
>>                 return -ESRCH;
>> -       has_perms = has_pid_permissions(pid, task, 1);
>> +       has_perms = has_pid_permissions(pid, task, HIDEPID_NO_ACCESS);
>>         put_task_struct(task);
>>
>>         if (!has_perms) {
>> -               if (pid->hide_pid == 2) {
>> +               if (pid->hide_pid == HIDEPID_INVISIBLE) {
>>                         /*
>>                          * Let's make getdents(), stat(), and open()
>>                          * consistent with each other.  If a process
>> @@ -1720,7 +1720,7 @@ int pid_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
>>         stat->gid = GLOBAL_ROOT_GID;
>>         task = pid_task(proc_pid(inode), PIDTYPE_PID);
>>         if (task) {
>> -               if (!has_pid_permissions(pid, task, 2)) {
>> +               if (!has_pid_permissions(pid, task, HIDEPID_INVISIBLE)) {
>>                         rcu_read_unlock();
>>                         /*
>>                          * This doesn't prevent learning whether PID exists,
>> @@ -3181,7 +3181,7 @@ int proc_pid_readdir(struct file *file, struct dir_context *ctx)
>>              iter.tgid += 1, iter = next_tgid(ns, iter)) {
>>                 char name[PROC_NUMBUF];
>>                 int len;
>> -               if (!has_pid_permissions(ns, iter.task, 2))
>> +               if (!has_pid_permissions(ns, iter.task, HIDEPID_INVISIBLE))
>>                         continue;
>>
>>                 len = snprintf(name, sizeof(name), "%d", iter.tgid);
>> diff --git a/fs/proc/inode.c b/fs/proc/inode.c
>> index e69ebe6..872325e 100644
>> --- a/fs/proc/inode.c
>> +++ b/fs/proc/inode.c
>> @@ -106,7 +106,7 @@ static int proc_show_options(struct seq_file *seq, struct dentry *root)
>>
>>         if (!gid_eq(pid->pid_gid, GLOBAL_ROOT_GID))
>>                 seq_printf(seq, ",gid=%u", from_kgid_munged(&init_user_ns, pid->pid_gid));
>> -       if (pid->hide_pid != 0)
>> +       if (pid->hide_pid != HIDEPID_OFF)
>>                 seq_printf(seq, ",hidepid=%u", pid->hide_pid);
>>
>>         return 0;
>> diff --git a/fs/proc/root.c b/fs/proc/root.c
>> index 8d3e484..2989731 100644
>> --- a/fs/proc/root.c
>> +++ b/fs/proc/root.c
>> @@ -58,7 +58,8 @@ int proc_parse_options(char *options, struct pid_namespace *pid)
>>                 case Opt_hidepid:
>>                         if (match_int(&args[0], &option))
>>                                 return 0;
>> -                       if (option < 0 || option > 2) {
>> +                       if (option < HIDEPID_OFF ||
>> +                           option > HIDEPID_INVISIBLE) {
>>                                 pr_err("proc: hidepid value must be between 0 and 2.\n");
>>                                 return 0;
>>                         }
>> diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
>> index 34cce96..c2a989d 100644
>> --- a/include/linux/pid_namespace.h
>> +++ b/include/linux/pid_namespace.h
>> @@ -21,6 +21,12 @@ struct pidmap {
>>
>>  struct fs_pin;
>>
>> +enum { /* definitions for pid_namespace's hide_pid field */
>> +       HIDEPID_OFF       = 0,
>> +       HIDEPID_NO_ACCESS = 1,
>> +       HIDEPID_INVISIBLE = 2,
>> +};
>> +
>>  struct pid_namespace {
>>         struct kref kref;
>>         struct pidmap pidmap[PIDMAP_ENTRIES];
>> --
>> 2.7.4
>>
>
>
>
> --
> Kees Cook
> Nexus Security



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-10-09 14:36 Lafcadio Wluiki
@ 2016-10-09 14:36 ` Lafcadio Wluiki
  0 siblings, 0 replies; 13+ messages in thread
From: Lafcadio Wluiki @ 2016-10-09 14:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Kees Cook

(Second, rebased submission, since first submission yielded no replies.)

This adds a new per-task hidepid= flag that is honored by procfs when
presenting /proc to the user, in addition to the existing hidepid= mount
option. So far, hidepid= was exclusively a per-pidns setting. Locking
down a set of processes so that they cannot see other users' processes
without affecting the rest of the system thus currently requires
creation of a private PID namespace, with all the complexity it brings,
including maintaining a stub init process as PID 1 and losing the
ability to see processes of the same user on the rest of the system.

With this patch all access and visibility checks in procfs now
honour two fields:

	a) the existing hide_pid field in the PID namespace
	b) the new hide_pid in struct task_struct

Access/visibility is only granted if both fields permit it; the more
restrictive one wins. The new task_struct hide_pid value defaults to 0,
which means behaviour is not changed from the status quo.
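
As a rough userspace illustration of "the more restrictive one wins"
(a sketch only; the helper names are hypothetical and the enum values
are copied from patch 1/2), the combined check is equivalent to taking
the larger of the two settings:

```c
#include <stdbool.h>

/* Same values as the enum added by patch 1/2. */
enum {
	HIDEPID_OFF       = 0,
	HIDEPID_NO_ACCESS = 1,
	HIDEPID_INVISIBLE = 2,
};

/* The stricter (numerically larger) of the two settings applies. */
static int effective_hide_pid(int ns_hide_pid, int task_hide_pid)
{
	return ns_hide_pid > task_hide_pid ? ns_hide_pid : task_hide_pid;
}

/*
 * Models only the first test in has_pid_permissions(): hiding at level
 * hide_pid_min is inactive when both settings are below that level.
 * Later checks, such as the gid= exemption, are left out.
 */
static bool hiding_inactive(int ns_hide_pid, int task_hide_pid,
			    int hide_pid_min)
{
	return effective_hide_pid(ns_hide_pid, task_hide_pid) < hide_pid_min;
}
```

For example, with hidepid=0 on the mount and a per-task value of 2,
hiding_inactive(0, 2, HIDEPID_INVISIBLE) is false, so the remaining
permission checks decide whether the entry is shown.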

Setting the per-process hide_pid value is done via a new PR_SET_HIDEPID
prctl() option which takes the same three supported values as the
hidepid= mount option. The per-process hide_pid may only be increased,
never decreased, thus ensuring that once applied, processes can never
escape such a hide_pid jail.  When a process forks it inherits its
parent's hide_pid value.
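
The "increase only" rule and the inheritance on fork can be modelled in
userspace roughly as follows. This is a sketch under stated assumptions:
struct task_model and both functions are hypothetical, and only the
checks mirror the PR_SET_HIDEPID hunk in kernel/sys.c below.

```c
#include <errno.h>

/* Same values as the enum added by patch 1/2. */
enum {
	HIDEPID_OFF       = 0,
	HIDEPID_NO_ACCESS = 1,
	HIDEPID_INVISIBLE = 2,
};

/* Hypothetical stand-in for the new field in struct task_struct. */
struct task_model {
	unsigned int hide_pid;
};

/*
 * Mirrors the PR_SET_HIDEPID checks. The patch also tests
 * arg2 < HIDEPID_OFF; with an unsigned argument that comparison can
 * never be true, so it is omitted here.
 */
static int model_set_hidepid(struct task_model *t, unsigned long arg2)
{
	if (arg2 > HIDEPID_INVISIBLE)
		return -EINVAL;		/* not one of the three values */
	if (arg2 < t->hide_pid)
		return -EPERM;		/* lowering is refused */
	t->hide_pid = (unsigned int)arg2;
	return 0;
}

/* Models copy_process(): the child starts with the parent's value. */
static struct task_model model_fork(const struct task_model *parent)
{
	struct task_model child = { .hide_pid = parent->hide_pid };
	return child;
}
```

Setting the same value again succeeds, which makes the call idempotent;
only a strictly lower value is rejected.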

Suggested usecase: let's say nginx runs as user "www-data". After
dropping privileges it may now call:

	…
	prctl(PR_SET_HIDEPID, 2);
	…

And from that point on neither nginx itself, nor any of its child
processes may see processes in /proc anymore that belong to a different
user than "www-data". Other services running on the same system remain
unaffected.
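
A slightly fuller version of the call above might look like the sketch
below. The option numbers are taken from this patch and are not defined
in mainline headers, so they are defined locally; the wrapper name and
the local HIDEPID_INVISIBLE copy are hypothetical. On a kernel without
the patch the call is expected to fail, typically with EINVAL.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Option numbers as proposed in this patch; not in mainline headers. */
#ifndef PR_SET_HIDEPID
#define PR_SET_HIDEPID 48
#endif
#ifndef PR_GET_HIDEPID
#define PR_GET_HIDEPID 49
#endif

/* Hypothetical userspace copy of the kernel-internal enum value. */
#define HIDEPID_INVISIBLE 2UL

/*
 * Asks the kernel to hide other users' processes from this task and
 * its future children. Returns 0 on success or a negative errno.
 */
static int hide_foreign_processes(void)
{
	if (prctl(PR_SET_HIDEPID, HIDEPID_INVISIBLE, 0UL, 0UL, 0UL) < 0)
		return -errno;
	return 0;
}
```

A service would call this once after dropping privileges and treat a
failure as "feature not available" rather than as a fatal error.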

This should permit Linux distributions to more comprehensively lock down
their services, as it allows an isolated opt-in for hidepid= for
specific services. Previously hidepid= could only be set system-wide,
and then specific services had to be excluded by group membership,
essentially a more complex concept of opt-out.

A test-tool that validates this functionality is available here:

	https://paste.fedoraproject.org/412975/71967605/

Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>
---
 fs/proc/array.c            |  3 +++
 fs/proc/base.c             |  6 ++++--
 include/linux/init_task.h  |  1 +
 include/linux/sched.h      |  1 +
 include/uapi/linux/prctl.h |  4 ++++
 kernel/fork.c              |  1 +
 kernel/sys.c               | 10 ++++++++++
 7 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 89600fd..2135616 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -163,6 +163,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	const struct cred *cred;
 	pid_t ppid, tpid = 0, tgid, ngid;
 	unsigned int max_fds = 0;
+	int hide_pid;
 
 	rcu_read_lock();
 	ppid = pid_alive(p) ?
@@ -183,6 +184,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	task_lock(p);
 	if (p->files)
 		max_fds = files_fdtable(p->files)->max_fds;
+	hide_pid = p->hide_pid;
 	task_unlock(p);
 	rcu_read_unlock();
 
@@ -201,6 +203,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->egid));
 	seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->sgid));
 	seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->fsgid));
+	seq_put_decimal_ull(m, "\nHidePID:\t", hide_pid);
 	seq_put_decimal_ull(m, "\nFDSize:\t", max_fds);
 
 	seq_puts(m, "\nGroups:\t");
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 2680794..84524d6 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -726,7 +726,8 @@ static bool has_pid_permissions(struct pid_namespace *pid,
 				 struct task_struct *task,
 				 int hide_pid_min)
 {
-	if (pid->hide_pid < hide_pid_min)
+	if (pid->hide_pid < hide_pid_min &&
+	    current->hide_pid < hide_pid_min)
 		return true;
 	if (in_group_p(pid->pid_gid))
 		return true;
@@ -747,7 +748,8 @@ static int proc_pid_permission(struct inode *inode, int mask)
 	put_task_struct(task);
 
 	if (!has_perms) {
-		if (pid->hide_pid == HIDEPID_INVISIBLE) {
+		if (pid->hide_pid == HIDEPID_INVISIBLE ||
+		    current->hide_pid == HIDEPID_INVISIBLE) {
 			/*
 			 * Let's make getdents(), stat(), and open()
 			 * consistent with each other.  If a process
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 325f649..c87de0e 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -250,6 +250,7 @@ extern struct task_group root_task_group;
 	.cpu_timers	= INIT_CPU_TIMERS(tsk.cpu_timers),		\
 	.pi_lock	= __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),	\
 	.timer_slack_ns = 50000, /* 50 usec default slack */		\
+	.hide_pid	= 0,						\
 	.pids = {							\
 		[PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),		\
 		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 348f51b..3e8ca16 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1572,6 +1572,7 @@ struct task_struct {
 	/* unserialized, strictly 'current' */
 	unsigned in_execve:1; /* bit to tell LSMs we're in execve */
 	unsigned in_iowait:1;
+	unsigned hide_pid:2; /* per-process procfs hidepid= */
 #if !defined(TIF_RESTORE_SIGMASK)
 	unsigned restore_sigmask:1;
 #endif
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index a8d0759..ada62b6 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -197,4 +197,8 @@ struct prctl_mm_map {
 # define PR_CAP_AMBIENT_LOWER		3
 # define PR_CAP_AMBIENT_CLEAR_ALL	4
 
+/* Per process, non-revokable procfs hidepid= option */
+#define PR_SET_HIDEPID 48
+#define PR_GET_HIDEPID 49
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 6d42242..a781d35 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1560,6 +1560,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 #endif
 
 	p->default_timer_slack_ns = current->timer_slack_ns;
+	p->hide_pid = current->hide_pid;
 
 	task_io_accounting_init(&p->ioac);
 	acct_clear_integrals(p);
diff --git a/kernel/sys.c b/kernel/sys.c
index 89d5be4..c0a1d3e 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_GET_FP_MODE:
 		error = GET_FP_MODE(me);
 		break;
+	case PR_SET_HIDEPID:
+		if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
+			return -EINVAL;
+		if (arg2 < me->hide_pid)
+			return -EPERM;
+		me->hide_pid = arg2;
+		break;
+	case PR_GET_HIDEPID:
+		error = put_user((int) me->hide_pid, (int __user *)arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
  2016-08-24 18:41 [PATCH 1/2] procfs: use an enum for possible hidepid values Lafcadio Wluiki
@ 2016-08-24 18:41 ` Lafcadio Wluiki
  0 siblings, 0 replies; 13+ messages in thread
From: Lafcadio Wluiki @ 2016-08-24 18:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andy Lutomirski

This adds a new per-task hidepid= flag that is honored by procfs when
presenting /proc to the user, in addition to the existing hidepid= mount
option. So far, hidepid= was exclusively a per-pidns setting. Locking
down a set of processes so that they cannot see other users' processes
without affecting the rest of the system thus currently requires
creation of a private PID namespace, with all the complexity it brings,
including maintaining a stub init process as PID 1 and losing the
ability to see processes of the same user on the rest of the system.

With this patch all access and visibility checks in procfs now
honour two fields:

	a) the existing hide_pid field in the PID namespace
	b) the new hide_pid in struct task_struct

Access/visibility is only granted if both fields permit it; the more
restrictive one wins. The new task_struct hide_pid value defaults to 0,
which means behaviour is not changed from the status quo.

Setting the per-process hide_pid value is done via a new PR_SET_HIDEPID
prctl() option which takes the same three supported values as the
hidepid= mount option. The per-process hide_pid may only be increased,
never decreased, thus ensuring that once applied, processes can never
escape such a hide_pid jail.  When a process forks it inherits its
parent's hide_pid value.

Suggested usecase: let's say nginx runs as user "www-data". After
dropping privileges it may now call:

	…
	prctl(PR_SET_HIDEPID, 2);
	…

And from that point on neither nginx itself, nor any of its child
processes may see processes in /proc anymore that belong to a different
user than "www-data". Other services running on the same system remain
unaffected.

This should permit Linux distributions to more comprehensively lock down
their services, as it allows an isolated opt-in for hidepid= for
specific services. Previously hidepid= could only be set system-wide,
and then specific services had to be excluded by group membership,
essentially a more complex concept of opt-out.

Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>
---
 fs/proc/array.c            |  4 ++++
 fs/proc/base.c             |  6 ++++--
 include/linux/init_task.h  |  1 +
 include/linux/sched.h      |  1 +
 include/uapi/linux/prctl.h |  4 ++++
 kernel/fork.c              |  1 +
 kernel/sys.c               | 10 ++++++++++
 7 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 88c7de1..a0c1151 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -163,6 +163,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	const struct cred *cred;
 	pid_t ppid, tpid = 0, tgid, ngid;
 	unsigned int max_fds = 0;
+	int hide_pid;
 
 	rcu_read_lock();
 	ppid = pid_alive(p) ?
@@ -183,6 +184,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	task_lock(p);
 	if (p->files)
 		max_fds = files_fdtable(p->files)->max_fds;
+	hide_pid = p->hide_pid;
 	task_unlock(p);
 	rcu_read_unlock();
 
@@ -195,6 +197,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 		"TracerPid:\t%d\n"
 		"Uid:\t%d\t%d\t%d\t%d\n"
 		"Gid:\t%d\t%d\t%d\t%d\n"
+		"HidePID:\t%i\n"
 		"FDSize:\t%d\nGroups:\t",
 		get_task_state(p),
 		tgid, ngid, pid_nr_ns(pid, ns), ppid, tpid,
@@ -206,6 +209,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 		from_kgid_munged(user_ns, cred->egid),
 		from_kgid_munged(user_ns, cred->sgid),
 		from_kgid_munged(user_ns, cred->fsgid),
+		hide_pid,
 		max_fds);
 
 	group_info = cred->group_info;
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 308e9a5..b24675f 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -726,7 +726,8 @@ static bool has_pid_permissions(struct pid_namespace *pid,
 				 struct task_struct *task,
 				 int hide_pid_min)
 {
-	if (pid->hide_pid < hide_pid_min)
+	if (pid->hide_pid < hide_pid_min &&
+	    current->hide_pid < hide_pid_min)
 		return true;
 	if (in_group_p(pid->pid_gid))
 		return true;
@@ -747,7 +748,8 @@ static int proc_pid_permission(struct inode *inode, int mask)
 	put_task_struct(task);
 
 	if (!has_perms) {
-		if (pid->hide_pid == HIDEPID_INVISIBLE) {
+		if (pid->hide_pid == HIDEPID_INVISIBLE ||
+		    current->hide_pid == HIDEPID_INVISIBLE) {
 			/*
 			 * Let's make getdents(), stat(), and open()
 			 * consistent with each other.  If a process
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index f8834f8..abd7a52 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -239,6 +239,7 @@ extern struct task_group root_task_group;
 	.cpu_timers	= INIT_CPU_TIMERS(tsk.cpu_timers),		\
 	.pi_lock	= __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),	\
 	.timer_slack_ns = 50000, /* 50 usec default slack */		\
+	.hide_pid	= 0,						\
 	.pids = {							\
 		[PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),		\
 		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 62c68e5..d63af9f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1547,6 +1547,7 @@ struct task_struct {
 	/* unserialized, strictly 'current' */
 	unsigned in_execve:1; /* bit to tell LSMs we're in execve */
 	unsigned in_iowait:1;
+	unsigned hide_pid:2; /* per-process procfs hidepid= */
 #if !defined(TIF_RESTORE_SIGMASK)
 	unsigned restore_sigmask:1;
 #endif
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index a8d0759..ada62b6 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -197,4 +197,8 @@ struct prctl_mm_map {
 # define PR_CAP_AMBIENT_LOWER		3
 # define PR_CAP_AMBIENT_CLEAR_ALL	4
 
+/* Per process, non-revokable procfs hidepid= option */
+#define PR_SET_HIDEPID 48
+#define PR_GET_HIDEPID 49
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 52e725d..44879c4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1394,6 +1394,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 #endif
 
 	p->default_timer_slack_ns = current->timer_slack_ns;
+	p->hide_pid = current->hide_pid;
 
 	task_io_accounting_init(&p->ioac);
 	acct_clear_integrals(p);
diff --git a/kernel/sys.c b/kernel/sys.c
index 89d5be4..c0a1d3e 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_GET_FP_MODE:
 		error = GET_FP_MODE(me);
 		break;
+	case PR_SET_HIDEPID:
+		if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
+			return -EINVAL;
+		if (arg2 < me->hide_pid)
+			return -EPERM;
+		me->hide_pid = arg2;
+		break;
+	case PR_GET_HIDEPID:
+		error = put_user((int) me->hide_pid, (int __user *)arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-11-15 23:27 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-03 15:30 [PATCH 1/2] procfs: use an enum for possible hidepid values Lafcadio Wluiki
2016-11-03 15:30 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
2016-11-03 16:12   ` Kees Cook
2016-11-03 17:55     ` Jann Horn
2016-11-03 18:05       ` Kees Cook
2016-11-03 18:24   ` [2/2] " Jann Horn
2016-11-03 20:21     ` Lafcadio Wluiki
2016-11-03 20:34     ` Kees Cook
2016-11-03 20:42       ` [kernel-hardening] " Jann Horn
2016-11-03 15:49 ` [PATCH 1/2] procfs: use an enum for possible hidepid values Kees Cook
2016-11-15 23:27   ` Kees Cook
  -- strict thread matches above, loose matches on Subject: below --
2016-10-09 14:36 Lafcadio Wluiki
2016-10-09 14:36 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
2016-08-24 18:41 [PATCH 1/2] procfs: use an enum for possible hidepid values Lafcadio Wluiki
2016-08-24 18:41 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
