From: Aleksa Sarai <cyphar@cyphar.com>
To: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Jeff Layton <jlayton@kernel.org>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Arnd Bergmann <arnd@arndb.de>,
	David Howells <dhowells@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Christian Brauner <christian@brauner.io>,
	Tycho Andersen <tycho@tycho.ws>,
	David Drysdale <drysdale@google.com>,
	Chanho Min <chanho.min@lge.com>, Oleg Nesterov <oleg@redhat.com>,
	Aleksa Sarai <asarai@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	containers@lists.linux-foundation.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	linux-arch <linux-arch@vger.kernel.org>
Subject: Re: [PATCH v6 5/6] binfmt_*: scope path resolution of interpreters
Date: Tue, 7 May 2019 05:17:35 +1000
Message-ID: <20190506191735.nmzf7kwfh7b6e2tf@yavin>
In-Reply-To: <CAG48ez0-CiODf6UBHWTaog97prx=VAd3HgHvEjdGNz344m1xKw@mail.gmail.com>

On 2019-05-06, Jann Horn <jannh@google.com> wrote:
> On Mon, May 6, 2019 at 6:56 PM Aleksa Sarai <cyphar@cyphar.com> wrote:
> > The need to be able to scope path resolution of interpreters became
> > clear with one of the possible vectors used in CVE-2019-5736 (which
> > most major container runtimes were vulnerable to).
> >
> > Naively, it might seem that openat(2) -- which supports path scoping --
> > can be combined with execveat(AT_EMPTY_PATH) to trivially scope the
> > binary being executed. Unfortunately, a "bad binary" (usually a symlink)
> > could be written as a #!-style script with the symlink target as the
> > interpreter -- which would be completely missed by just scoping the
> > openat(2). An example of this being exploitable is CVE-2019-5736.
> >
> > In order to get around this, we need to pass down to each binfmt_*
> > implementation the scoping flags requested in execveat(2). In order to
> > maintain backwards-compatibility we only pass the scoping AT_* flags.
> >
> > To avoid breaking userspace (in the exceptionally rare cases where you
> > have #!-scripts with a relative path being execveat(2)-ed with dfd !=
> > AT_FDCWD), we only pass dfd down to binfmt_* if any of our new flags are
> > set in execveat(2).
> 
> This seems extremely dangerous. I like the overall series, but not this patch.
> 
> > @@ -1762,6 +1774,12 @@ static int __do_execve_file(int fd, struct filename *filename,
> >
> >         sched_exec();
> >
> > +       bprm->flags = flags & (AT_XDEV | AT_NO_MAGICLINKS | AT_NO_SYMLINKS |
> > +                              AT_THIS_ROOT);
> [...]
> > +#define AT_THIS_ROOT           0x100000 /* - Scope ".." resolution to dirfd (like chroot(2)). */
> 
> So now what happens if there is a setuid root ELF binary with program
> interpreter "/lib64/ld-linux-x86-64.so.2" (like /bin/su), and an
> unprivileged user runs it with execveat(..., AT_THIS_ROOT)? Is that
> going to let the unprivileged user decide which interpreter the
> setuid-root process should use? From a high-level perspective, opening
> the interpreter should be controlled by the program that is being
> loaded, not by the program that invoked it.

I went a bit nuts with openat_exec(), and I did end up adding it to the
ELF interpreter lookup (and you're completely right that this is a bad
idea -- I will drop it from this patch if it's included in the next
series).

The proposed solutions you give below are much nicer than this patch so
I can drop it and work on fixing those issues separately.
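
For readers following along, the openat(2) + execveat(AT_EMPTY_PATH)
pattern from the quoted commit message can be sketched in userspace as
below. This is only an illustration, not code from the series: the
helper name exec_by_fd() is hypothetical, and it calls the raw
execveat syscall on the assumption that not every libc ships a wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: look up `path` relative to `dirfd`, then execute
 * the resulting file by descriptor. Returns -1 with errno set on failure;
 * on success it does not return.
 */
int exec_by_fd(int dirfd, const char *path,
               char *const argv[], char *const envp[])
{
    int fd, saved_errno;

    assert(path != NULL);

    /* Step 1: the only lookup the caller controls directly. */
    fd = openat(dirfd, path, O_PATH);
    if (fd < 0)
        return -1;

    /*
     * Step 2: execute whatever the descriptor refers to. If that file is
     * a #!-style script, the kernel resolves the interpreter path on its
     * own, outside of anything done in step 1.
     */
    syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);

    /* Only reached if execveat failed. */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return -1;
}
```

Whatever restrictions are applied when opening the file cover only that
first lookup, which is why the interpreter of a #!-script is missed.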

> In my opinion, CVE-2019-5736 points out two different problems:
>
> The big problem: The __ptrace_may_access() logic has a special-case
> short-circuit for "introspection" that you can't opt out of; this
> makes it possible to open things in procfs that are related to the
> current process even if the credentials of the process wouldn't permit
> accessing another process like it. I think the proper fix to deal with
> this would be to add a prctl() flag for "set whether introspection is
> allowed for this process", and if userspace has manually un-set that
> flag, any introspection special-case logic would be skipped.

We could do PR_SET_DUMPABLE=3 for this, I guess?
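
For context, a minimal sketch of how the existing dumpable flag is set
and read back from userspace. As far as I know prctl(2) only accepts 0
and 1 for PR_SET_DUMPABLE today; the value 3 above is a suggestion in
this thread, not an existing interface. The helper names are
hypothetical.

```c
#include <assert.h>
#include <sys/prctl.h>

/* Returns the current dumpable value of the calling process, or -1. */
int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}

/*
 * Sets the dumpable flag of the calling process.
 * Returns 0 on success, -1 with errno set otherwise.
 */
int set_dumpable(unsigned long value)
{
    int ret = prctl(PR_SET_DUMPABLE, value, 0UL, 0UL, 0UL);

    /* On success the kernel should report the value that was just set. */
    assert(ret != 0 || get_dumpable() == (int)value);
    return ret;
}
```

Note that clearing the flag with set_dumpable(0) does not by itself
disable the introspection short-circuit described in the quoted text;
that is the gap a new value or flag would have to close.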

> An additional problem: /proc/*/exe can be used to open a file for
> writing; I think it may have been Andy Lutomirski who pointed out some
> time ago that it would be nice if you couldn't use /proc/*/fd/* to
> re-open files with more privileges, which is sort of the same thing.

This is something I'm currently working on a series for, which would
boil down to some restrictions on how re-opening of file descriptors
works through procfs.
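
To make the re-opening problem concrete, here is a small sketch of how
a read-only descriptor can be turned into a writable one through
/proc/self/fd today, provided the file's permissions allow it. The
helper names are hypothetical and this is not taken from the series I
mentioned; it assumes procfs is mounted at /proc.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Re-open an existing descriptor through its procfs magic-link with a
 * new set of open flags. Returns the new fd, or -1 with errno set.
 */
int reopen_via_procfs(int fd, int flags)
{
    char path[64];
    int n = snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);

    assert(n > 0 && (size_t)n < sizeof(path));
    return open(path, flags);
}

/*
 * Returns 1 if a descriptor opened read-only can be "upgraded" to a
 * writable descriptor via procfs, 0 otherwise.
 */
int can_upgrade_to_write(int ro_fd)
{
    int wfd = reopen_via_procfs(ro_fd, O_WRONLY);

    if (wfd < 0)
        return 0;
    close(wfd);
    return 1;
}
```

The access mode of the original descriptor plays no part in the second
open(2); only the usual permission checks on the underlying file apply.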

However, execveat() of a procfs magiclink is a bit hard to block --
there is no way for userspace to represent a file being "open for
execute" so they are all "open for execute" by default and blocking it
outright seems a bit extreme (though I actually hope to eventually add
the ability to mark an O_PATH as "open for X" to resolveat(2) -- hence
why I've reserved some bits).

(Thinking more about it, there is an argument that I should include the
above patch into this series so that we can block re-opening of fds
opened through resolveat(2) without explicit flags from the outset.)

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

