All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: Lafcadio Wluiki <wluikil@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	Jann Horn <jann@thejh.net>,
	"kernel-hardening@lists.openwall.com" 
	<kernel-hardening@lists.openwall.com>
Subject: Re: [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
Date: Thu, 3 Nov 2016 10:12:55 -0600	[thread overview]
Message-ID: <CAGXu5jLZnkwK3NRV7qy=fi51_W4rP4-jGzeYSwceuvSz7_Ht8g@mail.gmail.com> (raw)
In-Reply-To: <1478187038-19954-2-git-send-email-wluikil@gmail.com>

On Thu, Nov 3, 2016 at 9:30 AM, Lafcadio Wluiki <wluikil@gmail.com> wrote:
> (Third, rebased submission, since first two submissions yielded no replies.)
>
> This adds a new per-task hidepid= flag that is honored by procfs when
> presenting /proc to the user, in addition to the existing hidepid= mount
> option. So far, hidepid= was exclusively a per-pidns setting. Locking
> down a set of processes so that they cannot see other user's processes
> without affecting the rest of the system thus currently requires
> creation of a private PID namespace, with all the complexity it brings,
> including maintaining a stub init process as PID 1 and losing the
> ability to see processes of the same user on the rest of the system.
>
> With this patch all acesss and visibility checks in procfs now
> honour two fields:
>
>         a) the existing hide_pid field in the PID namespace
>         b) the new hide_pid in struct task_struct
>
> Access/visibility is only granted if both fields permit it; the more
> restrictive one wins. By default the new task_struct hide_pid value
> defaults to 0, which means behaviour is not changed from the status quo.
>
> Setting the per-process hide_pid value is done via a new PR_SET_HIDEPID
> prctl() option which takes the same three supported values as the
> hidepid= mount option. The per-process hide_pid may only be increased,
> never decreased, thus ensuring that once applied, processes can never
> escape such a hide_pid jail.  When a process forks it inherits its
> parent's hide_pid value.
>
> Suggested usecase: let's say nginx runs as user "www-data". After
> dropping privileges it may now call:
>
>         …
>         prctl(PR_SET_HIDEPID, 2);
>         …
>
> And from that point on neither nginx itself, nor any of its child
> processes may see processes in /proc anymore that belong to a different
> user than "www-data". Other services running on the same system remain
> unaffected.
>
> This should permit Linux distributions to more comprehensively lock down
> their services, as it allows an isolated opt-in for hidepid= for
> specific services. Previously hidepid= could only be set system-wide,
> and then specific services had to be excluded by group membership,
> essentially a more complex concept of opt-out.
>
> A test-tool that validates this functionality is available here:
>
>         https://paste.fedoraproject.org/412975/71967605/
>
> Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>

I like this idea: it meaningfully reduces attack surface, even though
it doesn't ensure the same confinement as a pid namespace (e.g. a
process with this prctl set can still direct syscalls to pids that are
hidden). However, the attack surface in /proc is relatively large
compared to the syscalls that use pids.

For task launchers, this may overlap nicely with no_new_privs too.

Some suggestions/nits below...

> ---
>  fs/proc/array.c            |  3 +++
>  fs/proc/base.c             |  6 ++++--
>  include/linux/init_task.h  |  1 +
>  include/linux/sched.h      |  1 +
>  include/uapi/linux/prctl.h |  4 ++++
>  kernel/fork.c              |  1 +
>  kernel/sys.c               | 10 ++++++++++
>  7 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 81818ad..ea801e5 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -163,6 +163,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         const struct cred *cred;
>         pid_t ppid, tpid = 0, tgid, ngid;
>         unsigned int max_fds = 0;
> +       int hide_pid;
>
>         rcu_read_lock();
>         ppid = pid_alive(p) ?
> @@ -183,6 +184,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         task_lock(p);
>         if (p->files)
>                 max_fds = files_fdtable(p->files)->max_fds;
> +       hide_pid = p->hide_pid;
>         task_unlock(p);
>         rcu_read_unlock();
>
> @@ -201,6 +203,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->egid));
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->sgid));
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->fsgid));
> +       seq_put_decimal_ull(m, "\nHidePID:\t", hide_pid);
>         seq_put_decimal_ull(m, "\nFDSize:\t", max_fds);
>
>         seq_puts(m, "\nGroups:\t");

This should get an addition to table 1-2 of
Documentation/filesystems/proc.txt that covers the contents of
/proc/$pid/status


> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index ae5e13c..6c9a42b 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -709,7 +709,8 @@ static bool has_pid_permissions(struct pid_namespace *pid,
>                                  struct task_struct *task,
>                                  int hide_pid_min)
>  {
> -       if (pid->hide_pid < hide_pid_min)
> +       if (pid->hide_pid < hide_pid_min &&
> +           current->hide_pid < hide_pid_min)
>                 return true;
>         if (in_group_p(pid->pid_gid))
>                 return true;
> @@ -730,7 +731,8 @@ static int proc_pid_permission(struct inode *inode, int mask)
>         put_task_struct(task);
>
>         if (!has_perms) {
> -               if (pid->hide_pid == HIDEPID_INVISIBLE) {
> +               if (pid->hide_pid == HIDEPID_INVISIBLE ||
> +                   current->hide_pid == HIDEPID_INVISIBLE) {
>                         /*
>                          * Let's make getdents(), stat(), and open()
>                          * consistent with each other.  If a process

Instead of open-coding both of these "||" tests, I think it might be
cleaner to just choose the highest protection value, and use that in
both comparisons. e.g.

    int hide_pid = max(pid->hide_pid, current->hide_pid);

    if (hide_pid < hide_pid_min) ...

    9if (hide_pid == HIDEPID_INVISIBLE) ...


> diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> index 325f649..c87de0e 100644
> --- a/include/linux/init_task.h
> +++ b/include/linux/init_task.h
> @@ -250,6 +250,7 @@ extern struct task_group root_task_group;
>         .cpu_timers     = INIT_CPU_TIMERS(tsk.cpu_timers),              \
>         .pi_lock        = __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),        \
>         .timer_slack_ns = 50000, /* 50 usec default slack */            \
> +       .hide_pid       = 0,                                            \
>         .pids = {                                                       \
>                 [PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),            \
>                 [PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),           \
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 348f51b..3e8ca16 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1572,6 +1572,7 @@ struct task_struct {
>         /* unserialized, strictly 'current' */
>         unsigned in_execve:1; /* bit to tell LSMs we're in execve */
>         unsigned in_iowait:1;
> +       unsigned hide_pid:2; /* per-process procfs hidepid= */
>  #if !defined(TIF_RESTORE_SIGMASK)
>         unsigned restore_sigmask:1;
>  #endif
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index a8d0759..ada62b6 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -197,4 +197,8 @@ struct prctl_mm_map {
>  # define PR_CAP_AMBIENT_LOWER          3
>  # define PR_CAP_AMBIENT_CLEAR_ALL      4
>
> +/* Per process, non-revokable procfs hidepid= option */
> +#define PR_SET_HIDEPID 48
> +#define PR_GET_HIDEPID 49
> +
>  #endif /* _LINUX_PRCTL_H */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 623259f..d4fe951 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1562,6 +1562,7 @@ static __latent_entropy struct task_struct *copy_process(
>  #endif
>
>         p->default_timer_slack_ns = current->timer_slack_ns;
> +       p->hide_pid = current->hide_pid;
>
>         task_io_accounting_init(&p->ioac);
>         acct_clear_integrals(p);
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 89d5be4..c0a1d3e 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>         case PR_GET_FP_MODE:
>                 error = GET_FP_MODE(me);
>                 break;
> +       case PR_SET_HIDEPID:
> +               if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
> +                       return -EINVAL;
> +               if (arg2 < me->hide_pid)
> +                       return -EPERM;
> +               me->hide_pid = arg2;
> +               break;
> +       case PR_GET_HIDEPID:
> +               error = put_user((int) me->hide_pid, (int __user *)arg2);
> +               break;
>         default:
>                 error = -EINVAL;
>                 break;
> --
> 2.7.4
>

Since this adds a new prctl interface, it's best to Cc linux-arch
(which I added now).

Thanks for proposing this!

-Kees

-- 
Kees Cook
Nexus Security

WARNING: multiple messages have this Message-ID (diff)
From: Kees Cook <keescook@chromium.org>
To: Lafcadio Wluiki <wluikil@gmail.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	Jann Horn <jann@thejh.net>,
	"kernel-hardening@lists.openwall.com"
	<kernel-hardening@lists.openwall.com>
Subject: [kernel-hardening] Re: [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field
Date: Thu, 3 Nov 2016 10:12:55 -0600	[thread overview]
Message-ID: <CAGXu5jLZnkwK3NRV7qy=fi51_W4rP4-jGzeYSwceuvSz7_Ht8g@mail.gmail.com> (raw)
In-Reply-To: <1478187038-19954-2-git-send-email-wluikil@gmail.com>

On Thu, Nov 3, 2016 at 9:30 AM, Lafcadio Wluiki <wluikil@gmail.com> wrote:
> (Third, rebased submission, since first two submissions yielded no replies.)
>
> This adds a new per-task hidepid= flag that is honored by procfs when
> presenting /proc to the user, in addition to the existing hidepid= mount
> option. So far, hidepid= was exclusively a per-pidns setting. Locking
> down a set of processes so that they cannot see other user's processes
> without affecting the rest of the system thus currently requires
> creation of a private PID namespace, with all the complexity it brings,
> including maintaining a stub init process as PID 1 and losing the
> ability to see processes of the same user on the rest of the system.
>
> With this patch all acesss and visibility checks in procfs now
> honour two fields:
>
>         a) the existing hide_pid field in the PID namespace
>         b) the new hide_pid in struct task_struct
>
> Access/visibility is only granted if both fields permit it; the more
> restrictive one wins. By default the new task_struct hide_pid value
> defaults to 0, which means behaviour is not changed from the status quo.
>
> Setting the per-process hide_pid value is done via a new PR_SET_HIDEPID
> prctl() option which takes the same three supported values as the
> hidepid= mount option. The per-process hide_pid may only be increased,
> never decreased, thus ensuring that once applied, processes can never
> escape such a hide_pid jail.  When a process forks it inherits its
> parent's hide_pid value.
>
> Suggested usecase: let's say nginx runs as user "www-data". After
> dropping privileges it may now call:
>
>         …
>         prctl(PR_SET_HIDEPID, 2);
>         …
>
> And from that point on neither nginx itself, nor any of its child
> processes may see processes in /proc anymore that belong to a different
> user than "www-data". Other services running on the same system remain
> unaffected.
>
> This should permit Linux distributions to more comprehensively lock down
> their services, as it allows an isolated opt-in for hidepid= for
> specific services. Previously hidepid= could only be set system-wide,
> and then specific services had to be excluded by group membership,
> essentially a more complex concept of opt-out.
>
> A test-tool that validates this functionality is available here:
>
>         https://paste.fedoraproject.org/412975/71967605/
>
> Signed-off-by: Lafcadio Wluiki <wluikil@gmail.com>

I like this idea: it meaningfully reduces attack surface, even though
it doesn't ensure the same confinement as a pid namespace (e.g. a
process with this prctl set can still direct syscalls to pids that are
hidden). However, the attack surface in /proc is relatively large
compared to the syscalls that use pids.

For task launchers, this may overlap nicely with no_new_privs too.

Some suggestions/nits below...

> ---
>  fs/proc/array.c            |  3 +++
>  fs/proc/base.c             |  6 ++++--
>  include/linux/init_task.h  |  1 +
>  include/linux/sched.h      |  1 +
>  include/uapi/linux/prctl.h |  4 ++++
>  kernel/fork.c              |  1 +
>  kernel/sys.c               | 10 ++++++++++
>  7 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 81818ad..ea801e5 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -163,6 +163,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         const struct cred *cred;
>         pid_t ppid, tpid = 0, tgid, ngid;
>         unsigned int max_fds = 0;
> +       int hide_pid;
>
>         rcu_read_lock();
>         ppid = pid_alive(p) ?
> @@ -183,6 +184,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         task_lock(p);
>         if (p->files)
>                 max_fds = files_fdtable(p->files)->max_fds;
> +       hide_pid = p->hide_pid;
>         task_unlock(p);
>         rcu_read_unlock();
>
> @@ -201,6 +203,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->egid));
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->sgid));
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->fsgid));
> +       seq_put_decimal_ull(m, "\nHidePID:\t", hide_pid);
>         seq_put_decimal_ull(m, "\nFDSize:\t", max_fds);
>
>         seq_puts(m, "\nGroups:\t");

This should get an addition to table 1-2 of
Documentation/filesystems/proc.txt that covers the contents of
/proc/$pid/status


> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index ae5e13c..6c9a42b 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -709,7 +709,8 @@ static bool has_pid_permissions(struct pid_namespace *pid,
>                                  struct task_struct *task,
>                                  int hide_pid_min)
>  {
> -       if (pid->hide_pid < hide_pid_min)
> +       if (pid->hide_pid < hide_pid_min &&
> +           current->hide_pid < hide_pid_min)
>                 return true;
>         if (in_group_p(pid->pid_gid))
>                 return true;
> @@ -730,7 +731,8 @@ static int proc_pid_permission(struct inode *inode, int mask)
>         put_task_struct(task);
>
>         if (!has_perms) {
> -               if (pid->hide_pid == HIDEPID_INVISIBLE) {
> +               if (pid->hide_pid == HIDEPID_INVISIBLE ||
> +                   current->hide_pid == HIDEPID_INVISIBLE) {
>                         /*
>                          * Let's make getdents(), stat(), and open()
>                          * consistent with each other.  If a process

Instead of open-coding both of these "||" tests, I think it might be
cleaner to just choose the highest protection value, and use that in
both comparisons. e.g.

    int hide_pid = max(pid->hide_pid, current->hide_pid);

    if (hide_pid < hide_pid_min) ...

    9if (hide_pid == HIDEPID_INVISIBLE) ...


> diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> index 325f649..c87de0e 100644
> --- a/include/linux/init_task.h
> +++ b/include/linux/init_task.h
> @@ -250,6 +250,7 @@ extern struct task_group root_task_group;
>         .cpu_timers     = INIT_CPU_TIMERS(tsk.cpu_timers),              \
>         .pi_lock        = __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),        \
>         .timer_slack_ns = 50000, /* 50 usec default slack */            \
> +       .hide_pid       = 0,                                            \
>         .pids = {                                                       \
>                 [PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),            \
>                 [PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),           \
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 348f51b..3e8ca16 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1572,6 +1572,7 @@ struct task_struct {
>         /* unserialized, strictly 'current' */
>         unsigned in_execve:1; /* bit to tell LSMs we're in execve */
>         unsigned in_iowait:1;
> +       unsigned hide_pid:2; /* per-process procfs hidepid= */
>  #if !defined(TIF_RESTORE_SIGMASK)
>         unsigned restore_sigmask:1;
>  #endif
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index a8d0759..ada62b6 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -197,4 +197,8 @@ struct prctl_mm_map {
>  # define PR_CAP_AMBIENT_LOWER          3
>  # define PR_CAP_AMBIENT_CLEAR_ALL      4
>
> +/* Per process, non-revokable procfs hidepid= option */
> +#define PR_SET_HIDEPID 48
> +#define PR_GET_HIDEPID 49
> +
>  #endif /* _LINUX_PRCTL_H */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 623259f..d4fe951 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1562,6 +1562,7 @@ static __latent_entropy struct task_struct *copy_process(
>  #endif
>
>         p->default_timer_slack_ns = current->timer_slack_ns;
> +       p->hide_pid = current->hide_pid;
>
>         task_io_accounting_init(&p->ioac);
>         acct_clear_integrals(p);
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 89d5be4..c0a1d3e 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>         case PR_GET_FP_MODE:
>                 error = GET_FP_MODE(me);
>                 break;
> +       case PR_SET_HIDEPID:
> +               if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
> +                       return -EINVAL;
> +               if (arg2 < me->hide_pid)
> +                       return -EPERM;
> +               me->hide_pid = arg2;
> +               break;
> +       case PR_GET_HIDEPID:
> +               error = put_user((int) me->hide_pid, (int __user *)arg2);
> +               break;
>         default:
>                 error = -EINVAL;
>                 break;
> --
> 2.7.4
>

Since this adds a new prctl interface, it's best to Cc linux-arch
(which I added now).

Thanks for proposing this!

-Kees

-- 
Kees Cook
Nexus Security

  reply	other threads:[~2016-11-03 16:13 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-03 15:30 [PATCH 1/2] procfs: use an enum for possible hidepid values Lafcadio Wluiki
2016-11-03 15:30 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
2016-11-03 16:12   ` Kees Cook [this message]
2016-11-03 16:12     ` [kernel-hardening] " Kees Cook
2016-11-03 16:12     ` Kees Cook
2016-11-03 17:55     ` Jann Horn
2016-11-03 17:55       ` [kernel-hardening] " Jann Horn
2016-11-03 17:55       ` Jann Horn
2016-11-03 18:05       ` Kees Cook
2016-11-03 18:05         ` [kernel-hardening] " Kees Cook
2016-11-03 18:05         ` Kees Cook
2016-11-03 18:24   ` [2/2] " Jann Horn
2016-11-03 18:24     ` [kernel-hardening] " Jann Horn
2016-11-03 20:21     ` Lafcadio Wluiki
2016-11-03 20:21       ` [kernel-hardening] " Lafcadio Wluiki
2016-11-03 20:34     ` Kees Cook
2016-11-03 20:34       ` [kernel-hardening] " Kees Cook
2016-11-03 20:42       ` Jann Horn
2016-11-03 20:42         ` Jann Horn
2016-11-03 15:49 ` [PATCH 1/2] procfs: use an enum for possible hidepid values Kees Cook
2016-11-15 23:27   ` Kees Cook
  -- strict thread matches above, loose matches on Subject: below --
2016-10-09 14:36 Lafcadio Wluiki
2016-10-09 14:36 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki
2016-08-24 18:41 [PATCH 1/2] procfs: use an enum for possible hidepid values Lafcadio Wluiki
2016-08-24 18:41 ` [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field Lafcadio Wluiki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGXu5jLZnkwK3NRV7qy=fi51_W4rP4-jGzeYSwceuvSz7_Ht8g@mail.gmail.com' \
    --to=keescook@chromium.org \
    --cc=akpm@linux-foundation.org \
    --cc=jann@thejh.net \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=wluikil@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.