From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81F1EC432C3 for ; Sat, 16 Nov 2019 00:31:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5B3D9206DB for ; Sat, 16 Nov 2019 00:31:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727368AbfKPAby (ORCPT ); Fri, 15 Nov 2019 19:31:54 -0500 Received: from mout-p-201.mailbox.org ([80.241.56.171]:60584 "EHLO mout-p-201.mailbox.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727134AbfKPAbx (ORCPT ); Fri, 15 Nov 2019 19:31:53 -0500 Received: from smtp2.mailbox.org (smtp2.mailbox.org [80.241.60.241]) (using TLSv1.2 with cipher ECDHE-RSA-CHACHA20-POLY1305 (256/256 bits)) (No client certificate requested) by mout-p-201.mailbox.org (Postfix) with ESMTPS id 47FGPK0kj4zQlB0; Sat, 16 Nov 2019 01:31:49 +0100 (CET) X-Virus-Scanned: amavisd-new at heinlein-support.de Received: from smtp2.mailbox.org ([80.241.60.241]) by hefe.heinlein-support.de (hefe.heinlein-support.de [91.198.250.172]) (amavisd-new, port 10030) with ESMTP id RTci28XHZ7j8; Sat, 16 Nov 2019 01:31:44 +0100 (CET) From: Aleksa Sarai To: Al Viro , Jeff Layton , "J. Bruce Fields" , Arnd Bergmann , David Howells , Shuah Khan , Shuah Khan , Ingo Molnar , Peter Zijlstra , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko Cc: Aleksa Sarai , Christian Brauner , Eric Biederman , Andy Lutomirski , Andrew Morton , Kees Cook , Jann Horn , Tycho Andersen , David Drysdale , Chanho Min , Oleg Nesterov , Rasmus Villemoes , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Christian Brauner , Aleksa Sarai , Linus Torvalds , dev@opencontainers.org, containers@lists.linux-foundation.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-alpha@vger.kernel.org, linux-api@vger.kernel.org, libc-alpha@sourceware.org, linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, linux-xtensa@linux-xtensa.org, sparclinux@vger.kernel.org Subject: [PATCH v16 08/12] namei: LOOKUP_IN_ROOT: chroot-like scoped resolution Date: Sat, 16 Nov 2019 11:27:58 +1100 Message-Id: <20191116002802.6663-9-cyphar@cyphar.com> In-Reply-To: <20191116002802.6663-1-cyphar@cyphar.com> References: <20191116002802.6663-1-cyphar@cyphar.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-mips-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org /* Background. */ Container runtimes or other administrative management processes will often interact with root filesystems while in the host mount namespace, because the cost of doing a chroot(2) on every operation is too prohibitive (especially in Go, which cannot safely use vfork). However, a malicious program can trick the management process into doing operations on files outside of the root filesystem through careful crafting of symlinks. Most programs that need this feature have attempted to make this process safe, by doing all of the path resolution in userspace (with symlinks being scoped to the root of the malicious root filesystem). Unfortunately, this method is prone to foot-guns and usually such implementations have subtle security bugs. Thus, what userspace needs is a way to resolve a path as though it were in a chroot(2) -- with all absolute symlinks being resolved relative to the dirfd root (and ".." components being stuck under the dirfd root). It is much simpler and more straight-forward to provide this functionality in-kernel (because it can be done far more cheaply and correctly). More classical applications that also have this problem (which have their own potentially buggy userspace path sanitisation code) include web servers, archive extraction tools, network file servers, and so on. /* Userspace API. */ LOOKUP_IN_ROOT will be exposed to userspace through openat2(2). /* Semantics. */ Unlike most other LOOKUP flags (most notably LOOKUP_FOLLOW), LOOKUP_IN_ROOT applies to all components of the path. With LOOKUP_IN_ROOT, any path component which attempts to cross the starting point of the pathname lookup (the dirfd passed to openat) will remain at the starting point. Thus, all absolute paths and symlinks will be scoped within the starting point. There is a slight change in behaviour regarding pathnames -- if the pathname is absolute then the dirfd is still used as the root of resolution of LOOKUP_IN_ROOT is specified (this is to avoid obvious foot-guns, at the cost of a minor API inconsistency). As with LOOKUP_BENEATH, Jann's security concern about ".."[1] applies to LOOKUP_IN_ROOT -- therefore ".." resolution is blocked. This restriction will be lifted in a future patch, but requires more work to ensure that permitting ".." is done safely. Magic-link jumps are also blocked, because they can beam the path lookup across the starting point. It would be possible to detect and block only the "bad" crossings with path_is_under() checks, but it's unclear whether it makes sense to permit magic-links at all. However, userspace is recommended to pass LOOKUP_NO_MAGICLINKS if they want to ensure that magic-link crossing is entirely disabled. /* Testing. */ LOOKUP_IN_ROOT is tested as part of the openat2(2) selftests. [1]: https://lore.kernel.org/lkml/CAG48ez1jzNvxB+bfOBnERFGp=oMM0vHWuLD6EULmne3R6xa53w@mail.gmail.com/ Cc: Christian Brauner Signed-off-by: Aleksa Sarai --- fs/namei.c | 10 +++++++--- include/linux/namei.h | 3 ++- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index bad5ef56b4c8..cb0b9f411efb 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2271,13 +2271,16 @@ static const char *path_init(struct nameidata *nd, unsigned flags) nd->m_seq = read_seqbegin(&mount_lock); - /* Figure out the starting path and root (if needed). */ - if (*s == '/') { + /* Absolute pathname -- fetch the root (LOOKUP_IN_ROOT uses nd->dfd). */ + if (*s == '/' && !(flags & LOOKUP_IN_ROOT)) { error = nd_jump_root(nd); if (unlikely(error)) return ERR_PTR(error); return s; - } else if (nd->dfd == AT_FDCWD) { + } + + /* Relative pathname -- get the starting-point it is relative to. */ + if (nd->dfd == AT_FDCWD) { if (flags & LOOKUP_RCU) { struct fs_struct *fs = current->fs; unsigned seq; @@ -2317,6 +2320,7 @@ static const char *path_init(struct nameidata *nd, unsigned flags) } fdput(f); } + /* For scoped-lookups we need to set the root to the dirfd as well. */ if (flags & LOOKUP_IS_SCOPED) { nd->root = nd->path; diff --git a/include/linux/namei.h b/include/linux/namei.h index 93dad378f1e8..93151e47ec47 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -45,8 +45,9 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; #define LOOKUP_NO_MAGICLINKS 0x020000 /* No nd_jump_link() crossing. */ #define LOOKUP_NO_XDEV 0x040000 /* No mountpoint crossing. */ #define LOOKUP_BENEATH 0x080000 /* No escaping from starting point. */ +#define LOOKUP_IN_ROOT 0x100000 /* Treat dirfd as fs root. */ /* LOOKUP_* flags which do scope-related checks based on the dirfd. */ -#define LOOKUP_IS_SCOPED LOOKUP_BENEATH +#define LOOKUP_IS_SCOPED (LOOKUP_BENEATH | LOOKUP_IN_ROOT) extern int path_pts(struct path *path); -- 2.24.0 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EB00C432C3 for ; Sat, 16 Nov 2019 00:31:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 42C0B206DB for ; Sat, 16 Nov 2019 00:31:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727032AbfKPAbv (ORCPT ); Fri, 15 Nov 2019 19:31:51 -0500 Received: from mout-p-201.mailbox.org ([80.241.56.171]:60562 "EHLO mout-p-201.mailbox.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727132AbfKPAbu (ORCPT ); Fri, 15 Nov 2019 19:31:50 -0500 Received: from smtp2.mailbox.org (smtp2.mailbox.org [80.241.60.241]) (using TLSv1.2 with cipher ECDHE-RSA-CHACHA20-POLY1305 (256/256 bits)) (No client certificate requested) by mout-p-201.mailbox.org (Postfix) with ESMTPS id 47FGPK0kj4zQlB0; Sat, 16 Nov 2019 01:31:49 +0100 (CET) X-Virus-Scanned: amavisd-new at heinlein-support.de Received: from smtp2.mailbox.org ([80.241.60.241]) by hefe.heinlein-support.de (hefe.heinlein-support.de [91.198.250.172]) (amavisd-new, port 10030) with ESMTP id RTci28XHZ7j8; Sat, 16 Nov 2019 01:31:44 +0100 (CET) From: Aleksa Sarai To: Al Viro , Jeff Layton , "J. Bruce Fields" , Arnd Bergmann , David Howells , Shuah Khan , Shuah Khan , Ingo Molnar , Peter Zijlstra , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko Cc: Aleksa Sarai , Christian Brauner , Eric Biederman , Andy Lutomirski , Andrew Morton , Kees Cook , Jann Horn , Tycho Andersen , David Drysdale , Chanho Min , Oleg Nesterov , Rasmus Villemoes , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Christian Brauner , Aleksa Sarai , Linus Torvalds , dev@opencontainers.org, containers@lists.linux-foundation.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-alpha@vger.kernel.org, linux-api@vger.kernel.org, libc-alpha@sourceware.org, linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, linux-xtensa@linux-xtensa.org, sparclinux@vger.kernel.org Subject: [PATCH v16 08/12] namei: LOOKUP_IN_ROOT: chroot-like scoped resolution Date: Sat, 16 Nov 2019 11:27:58 +1100 Message-Id: <20191116002802.6663-9-cyphar@cyphar.com> In-Reply-To: <20191116002802.6663-1-cyphar@cyphar.com> References: <20191116002802.6663-1-cyphar@cyphar.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-m68k-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-m68k@vger.kernel.org Message-ID: <20191116002758.uoKzGXt6uhDIuUffrqQJQzj9gR2qJmPqNo7G5Q1E1BM@z> /* Background. */ Container runtimes or other administrative management processes will often interact with root filesystems while in the host mount namespace, because the cost of doing a chroot(2) on every operation is too prohibitive (especially in Go, which cannot safely use vfork). However, a malicious program can trick the management process into doing operations on files outside of the root filesystem through careful crafting of symlinks. Most programs that need this feature have attempted to make this process safe, by doing all of the path resolution in userspace (with symlinks being scoped to the root of the malicious root filesystem). Unfortunately, this method is prone to foot-guns and usually such implementations have subtle security bugs. Thus, what userspace needs is a way to resolve a path as though it were in a chroot(2) -- with all absolute symlinks being resolved relative to the dirfd root (and ".." components being stuck under the dirfd root). It is much simpler and more straight-forward to provide this functionality in-kernel (because it can be done far more cheaply and correctly). More classical applications that also have this problem (which have their own potentially buggy userspace path sanitisation code) include web servers, archive extraction tools, network file servers, and so on. /* Userspace API. */ LOOKUP_IN_ROOT will be exposed to userspace through openat2(2). /* Semantics. */ Unlike most other LOOKUP flags (most notably LOOKUP_FOLLOW), LOOKUP_IN_ROOT applies to all components of the path. With LOOKUP_IN_ROOT, any path component which attempts to cross the starting point of the pathname lookup (the dirfd passed to openat) will remain at the starting point. Thus, all absolute paths and symlinks will be scoped within the starting point. There is a slight change in behaviour regarding pathnames -- if the pathname is absolute then the dirfd is still used as the root of resolution of LOOKUP_IN_ROOT is specified (this is to avoid obvious foot-guns, at the cost of a minor API inconsistency). As with LOOKUP_BENEATH, Jann's security concern about ".."[1] applies to LOOKUP_IN_ROOT -- therefore ".." resolution is blocked. This restriction will be lifted in a future patch, but requires more work to ensure that permitting ".." is done safely. Magic-link jumps are also blocked, because they can beam the path lookup across the starting point. It would be possible to detect and block only the "bad" crossings with path_is_under() checks, but it's unclear whether it makes sense to permit magic-links at all. However, userspace is recommended to pass LOOKUP_NO_MAGICLINKS if they want to ensure that magic-link crossing is entirely disabled. /* Testing. */ LOOKUP_IN_ROOT is tested as part of the openat2(2) selftests. [1]: https://lore.kernel.org/lkml/CAG48ez1jzNvxB+bfOBnERFGp=oMM0vHWuLD6EULmne3R6xa53w@mail.gmail.com/ Cc: Christian Brauner Signed-off-by: Aleksa Sarai --- fs/namei.c | 10 +++++++--- include/linux/namei.h | 3 ++- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index bad5ef56b4c8..cb0b9f411efb 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2271,13 +2271,16 @@ static const char *path_init(struct nameidata *nd, unsigned flags) nd->m_seq = read_seqbegin(&mount_lock); - /* Figure out the starting path and root (if needed). */ - if (*s == '/') { + /* Absolute pathname -- fetch the root (LOOKUP_IN_ROOT uses nd->dfd). */ + if (*s == '/' && !(flags & LOOKUP_IN_ROOT)) { error = nd_jump_root(nd); if (unlikely(error)) return ERR_PTR(error); return s; - } else if (nd->dfd == AT_FDCWD) { + } + + /* Relative pathname -- get the starting-point it is relative to. */ + if (nd->dfd == AT_FDCWD) { if (flags & LOOKUP_RCU) { struct fs_struct *fs = current->fs; unsigned seq; @@ -2317,6 +2320,7 @@ static const char *path_init(struct nameidata *nd, unsigned flags) } fdput(f); } + /* For scoped-lookups we need to set the root to the dirfd as well. */ if (flags & LOOKUP_IS_SCOPED) { nd->root = nd->path; diff --git a/include/linux/namei.h b/include/linux/namei.h index 93dad378f1e8..93151e47ec47 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -45,8 +45,9 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; #define LOOKUP_NO_MAGICLINKS 0x020000 /* No nd_jump_link() crossing. */ #define LOOKUP_NO_XDEV 0x040000 /* No mountpoint crossing. */ #define LOOKUP_BENEATH 0x080000 /* No escaping from starting point. */ +#define LOOKUP_IN_ROOT 0x100000 /* Treat dirfd as fs root. */ /* LOOKUP_* flags which do scope-related checks based on the dirfd. */ -#define LOOKUP_IS_SCOPED LOOKUP_BENEATH +#define LOOKUP_IS_SCOPED (LOOKUP_BENEATH | LOOKUP_IN_ROOT) extern int path_pts(struct path *path); -- 2.24.0