From: Andy Lutomirski <email@example.com>
To: Jann Horn <firstname.lastname@example.org>
Cc: Andy Lutomirski <email@example.com>,
    Aleksa Sarai <firstname.lastname@example.org>,
    Al Viro <email@example.com>,
    Jeff Layton <firstname.lastname@example.org>,
    "J. Bruce Fields" <email@example.com>,
    Arnd Bergmann <firstname.lastname@example.org>,
    David Howells <email@example.com>,
    Eric Biederman <firstname.lastname@example.org>,
    Andrew Morton <email@example.com>,
    Alexei Starovoitov <firstname.lastname@example.org>,
    Kees Cook <email@example.com>,
    Christian Brauner <firstname.lastname@example.org>,
    Tycho Andersen <email@example.com>,
    David Drysdale <firstname.lastname@example.org>,
    Chanho Min <email@example.com>,
    Oleg Nesterov <firstname.lastname@example.org>,
    Aleksa Sarai <email@example.com>,
    Linus Torvalds <firstname.lastname@example.org>,
    Linux Containers <email@example.com>,
    linux-fsdevel <firstname.lastname@example.org>,
    Linux API <email@example.com>,
    kernel list <firstname.lastname@example.org>,
    linux-arch <email@example.com>
Subject: Re: [PATCH v6 5/6] binfmt_*: scope path resolution of interpreters
Date: Sat, 11 May 2019 10:00:47 -0700
Message-ID: <C60DC580-854D-478D-AF23-5F29FB7C3E50@amacapital.net>
In-Reply-To: <20190510225527.GA59914@google.com>

> On May 10, 2019, at 3:55 PM, Jann Horn <firstname.lastname@example.org> wrote:
>
>> On Fri, May 10, 2019 at 02:20:23PM -0700, Andy Lutomirski wrote:
>>> On Fri, May 10, 2019 at 1:41 PM Jann Horn <email@example.com> wrote:
>>>
>>>> On Tue, May 07, 2019 at 05:17:35AM +1000, Aleksa Sarai wrote:
>>>>> On 2019-05-06, Jann Horn <firstname.lastname@example.org> wrote:
>>>>> In my opinion, CVE-2019-5736 points out two different problems:
>>>>>
>>>>> The big problem: The __ptrace_may_access() logic has a special-case
>>>>> short-circuit for "introspection" that you can't opt out of; this
>>>>> makes it possible to open things in procfs that are related to the
>>>>> current process even if the credentials of the process wouldn't permit
>>>>> accessing another process like it. I think the proper fix to deal with
>>>>> this would be to add a prctl() flag for "set whether introspection is
>>>>> allowed for this process", and if userspace has manually un-set that
>>>>> flag, any introspection special-case logic would be skipped.
>>>>
>>>> We could do PR_SET_DUMPABLE=3 for this, I guess?
>>>
>>> Hmm... I'd make it a new prctl() command, since introspection is
>>> somewhat orthogonal to dumpability. Also, dumpability is per-mm, and I
>>> think the introspection flag should be per-thread.
>>
>> I've lost track of the context here, but it seems to me that
>> mitigating attacks involving accidental following of /proc links
>> shouldn't depend on dumpability. What's the actual problem this is
>> trying to solve again?
>
> The one actual security problem that I've seen related to this is
> CVE-2019-5736. There is a write-up of it at
> <https://blog.dragonsector.pl/2019/02/cve-2019-5736-escape-from-docker-and.html>
> under "Successful approach", but it goes more or less as follows:
>
> A container is running that doesn't use user namespaces (because for
> some reason I don't understand, apparently some people still do that).
> An evil process is running inside the container with UID 0 (as in,
> GLOBAL_ROOT_UID); so if the evil process inside the container was able
> to reach root-owned files on the host filesystem, it could write into
> them.
>
> The container engine wants to spawn a new process inside the container.
> It forks off a child that joins the container's namespaces (including
> PID and mount namespaces), and then the child calls execve() on some
> path in the container.

I think that, at this point, the task should be considered owned by the
container. Maybe we should have a better API than execve() to execute a
program in a safer way, but fiddling with dumpability seems like a
band-aid. In fact, the process is arguably pwned even *before* execve.
A better “spawn” API should fix this. In the meantime, I think it should
be assumed that, if you join a container’s namespaces, you are at its
mercy.

> The attacker replaces the executable in the container with a symlink
> to /proc/self/exe and replaces a library inside the container with a
> malicious one.

Cute.

> When the container engine calls execve(), intending to run an executable
> inside the container, it instead goes through ptrace_may_access() using
> the introspection short-circuit and re-executes its own executable
> through the jumped symlink /proc/self/exe (which is normally unreachable
> for the container). After the execve(), the process loads an evil
> library from inside the container and is under the control of the
> container.
> Now the container controls a process whose /proc/self/exe is a jumped
> symlink to a host executable, and the container can write into it.
>
> Some container engines are now using an extremely ugly hack to work
> around this - whenever they want to enter a container, they copy the
> host binary into a new memfd and execute that to avoid exposing the
> original host binary to containers:
> <https://github.com/opencontainers/runc/commit/0a8e4117e7f715d5fbeef398405813ce8e88558b>
>
> In my opinion, the problems here are:
>
> - Apparently some people run untrusted containers without user
>   namespaces. It would be really nice if people could not do that.
>   (Probably the biggest problem here.)
> - ptrace_may_access() has a short-circuit that permits a process to
>   unintentionally look at itself even if it has dropped privileges -
>   here, it permits the execve("/proc/self/exe", ...) that would
>   normally be blocked by the check for CAP_SYS_PTRACE if the process
>   is nondumpable.

I don’t see this as a problem. Dumpable is about protecting a task from
others, not about protecting a task against itself.

> - You can use /proc/*/exe to get a writable fd.

This is IMO the real bug.