From: Aleksa Sarai <cyphar@cyphar.com>
To: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>, Al Viro <viro@zeniv.linux.org.uk>,
	Jeff Layton <jlayton@kernel.org>, "J. Bruce Fields" <bfields@fieldses.org>,
	Arnd Bergmann <arnd@arndb.de>, David Howells <dhowells@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexei Starovoitov <ast@kernel.org>, Kees Cook <keescook@chromium.org>,
	Christian Brauner <christian@brauner.io>, Tycho Andersen <tycho@tycho.ws>,
	David Drysdale <drysdale@google.com>, Chanho Min <chanho.min@lge.com>,
	Oleg Nesterov <oleg@redhat.com>, Aleksa Sarai <asarai@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	containers@lists.linux-foundation.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	linux-arch <linux-arch@vger.kernel.org>
Subject: Re: [PATCH v6 5/6] binfmt_*: scope path resolution of interpreters
Date: Tue, 7 May 2019 05:17:35 +1000
Message-ID: <20190506191735.nmzf7kwfh7b6e2tf@yavin>
In-Reply-To: <CAG48ez0-CiODf6UBHWTaog97prx=VAd3HgHvEjdGNz344m1xKw@mail.gmail.com>

On 2019-05-06, Jann Horn <jannh@google.com> wrote:
> On Mon, May 6, 2019 at 6:56 PM Aleksa Sarai <cyphar@cyphar.com> wrote:
> > The need to be able to scope path resolution of interpreters became
> > clear with one of the possible vectors used in CVE-2019-5736 (which
> > most major container runtimes were vulnerable to).
> >
> > Naively, it might seem that openat(2) -- which supports path scoping --
> > can be combined with execveat(AT_EMPTY_PATH) to trivially scope the
> > binary being executed. Unfortunately, a "bad binary" (usually a symlink)
> > could be written as a #!-style script with the symlink target as the
> > interpreter -- which would be completely missed by just scoping the
> > openat(2).
> > An example of this being exploitable is CVE-2019-5736.
> >
> > In order to get around this, we need to pass down to each binfmt_*
> > implementation the scoping flags requested in execveat(2). In order to
> > maintain backwards-compatibility we only pass the scoping AT_* flags.
> >
> > To avoid breaking userspace (in the exceptionally rare cases where you
> > have #!-scripts with a relative path being execveat(2)-ed with dfd !=
> > AT_FDCWD), we only pass dfd down to binfmt_* if any of our new flags are
> > set in execveat(2).
>
> This seems extremely dangerous. I like the overall series, but not this patch.
>
> > @@ -1762,6 +1774,12 @@ static int __do_execve_file(int fd, struct filename *filename,
> >
> >  	sched_exec();
> >
> > +	bprm->flags = flags & (AT_XDEV | AT_NO_MAGICLINKS | AT_NO_SYMLINKS |
> > +			       AT_THIS_ROOT);
> [...]
> > +#define AT_THIS_ROOT		0x100000 /* - Scope ".." resolution to dirfd (like chroot(2)). */
>
> So now what happens if there is a setuid root ELF binary with program
> interpreter "/lib64/ld-linux-x86-64.so.2" (like /bin/su), and an
> unprivileged user runs it with execveat(..., AT_THIS_ROOT)? Is that
> going to let the unprivileged user decide which interpreter the
> setuid-root process should use? From a high-level perspective, opening
> the interpreter should be controlled by the program that is being
> loaded, not by the program that invoked it.

I went a bit nuts with openat_exec(), and I did end up adding it to the
ELF interpreter lookup (and you're completely right that this is a bad
idea -- I will drop it from this patch if it's included in the next
series).

The proposed solutions you give below are much nicer than this patch so
I can drop it and work on fixing those issues separately.
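For readers following along, the openat(2) plus execveat(AT_EMPTY_PATH)
pattern described in the quoted patch text looks roughly like the sketch
below. The helper name exec_by_fd() is hypothetical, and none of the
proposed scoping flags (AT_THIS_ROOT and friends) are used, since they are
only proposals in this series. The sketch assumes a Linux system whose
headers define SYS_execveat. The point to notice is that only the openat(2)
step is under the caller's control: if the opened file is a #!-script or an
ELF binary with a program interpreter, the kernel resolves that interpreter
path on its own.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `name` relative to `dirfd`, then execute the
 * resulting fd in a child process with execveat(AT_EMPTY_PATH).
 * Returns the child's exit status, or -1 on failure.
 */
static int exec_by_fd(int dirfd, const char *name, char *const argv[])
{
	int status;
	pid_t pid;
	/* O_CLOEXEC is fine for binaries; a #!-script would need the fd
	 * to stay open so the interpreter can read it via /dev/fd/N. */
	int fd = openat(dirfd, name, O_PATH | O_CLOEXEC);

	if (fd < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fd);
		return -1;
	}
	if (pid == 0) {
		char *const envp[] = { NULL };

		/* Empty path + AT_EMPTY_PATH: execute the file fd refers to. */
		syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
		_exit(127);	/* only reached if execveat failed */
	}

	close(fd);
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling exec_by_fd(AT_FDCWD, "/bin/sh", argv) with argv set to
{ "sh", "-c", "exit 7", NULL } should return the shell's exit status.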
> In my opinion, CVE-2019-5736 points out two different problems:
>
> The big problem: The __ptrace_may_access() logic has a special-case
> short-circuit for "introspection" that you can't opt out of; this
> makes it possible to open things in procfs that are related to the
> current process even if the credentials of the process wouldn't permit
> accessing another process like it. I think the proper fix to deal with
> this would be to add a prctl() flag for "set whether introspection is
> allowed for this process", and if userspace has manually un-set that
> flag, any introspection special-case logic would be skipped.

We could do PR_SET_DUMPABLE=3 for this, I guess?

> An additional problem: /proc/*/exe can be used to open a file for
> writing; I think it may have been Andy Lutomirski who pointed out some
> time ago that it would be nice if you couldn't use /proc/*/fd/* to
> re-open files with more privileges, which is sort of the same thing.

This is something I'm currently working on a series for, which would
boil down to some restrictions on how re-opening of file descriptors
works through procfs.

However, execveat() of a procfs magiclink is a bit hard to block --
there is no way for userspace to represent a file being "open for
execute" so they are all "open for execute" by default and blocking it
outright seems a bit extreme (though I actually hope to eventually add
the ability to mark an O_PATH as "open for X" to resolveat(2) -- hence
why I've reserved some bits).

(Thinking more about it, there is an argument that I should include the
above patch into this series so that we can block re-opening of fds
opened through resolveat(2) without explicit flags from the outset.)

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
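The "re-open with more privileges" behaviour discussed above can be
observed with a small sketch like the following. The helper name
can_upgrade_via_procfs() is hypothetical, and the sketch assumes procfs is
mounted at /proc. It opens a file read-only and then opens the matching
/proc/self/fd/<n> magic link for writing. The second open is checked
against the file's own permissions, not against the mode of the original
descriptor, so it succeeds whenever the caller could have opened the file
for writing directly.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only, then try to re-open the same
 * file for writing through its /proc/self/fd/<n> magic link.
 * Returns 1 if the write re-open succeeded, 0 if it was refused, and
 * -1 if the initial read-only open failed.
 */
static int can_upgrade_via_procfs(const char *path)
{
	char link[64];
	int rofd, wfd;

	rofd = open(path, O_RDONLY | O_CLOEXEC);
	if (rofd < 0)
		return -1;

	/* The magic link resolves to the file behind rofd. */
	snprintf(link, sizeof(link), "/proc/self/fd/%d", rofd);
	wfd = open(link, O_WRONLY | O_CLOEXEC);
	close(rofd);

	if (wfd < 0)
		return 0;
	close(wfd);
	return 1;
}
```

On a file the caller owns with mode 0600 this is expected to return 1, even
though the first descriptor was opened O_RDONLY.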