From: Aleksa Sarai <cyphar@cyphar.com>
To: Al Viro <viro@zeniv.linux.org.uk>,
Jeff Layton <jlayton@kernel.org>,
"J. Bruce Fields" <bfields@fieldses.org>,
Arnd Bergmann <arnd@arndb.de>,
David Howells <dhowells@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>,
Christian Brauner <christian@brauner.io>,
Eric Biederman <ebiederm@xmission.com>,
Andy Lutomirski <luto@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Alexei Starovoitov <ast@kernel.org>,
Kees Cook <keescook@chromium.org>, Jann Horn <jannh@google.com>,
Tycho Andersen <tycho@tycho.ws>,
David Drysdale <drysdale@google.com>,
Chanho Min <chanho.min@lge.com>, Oleg Nesterov <oleg@redhat.com>,
Aleksa Sarai <asarai@suse.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
containers@lists.linux-foundation.org,
linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v6 6/6] namei: resolveat(2) syscall
Date: Tue, 7 May 2019 02:54:39 +1000 [thread overview]
Message-ID: <20190506165439.9155-7-cyphar@cyphar.com> (raw)
In-Reply-To: <20190506165439.9155-1-cyphar@cyphar.com>
The most obvious syscall to add support for the new LOOKUP_* scoping
flags would be openat(2) (along with the required execveat(2) change
included in this series). However, there are a few reasons to not do
this:
* The new LOOKUP_* flags are intended to be security features, and
openat(2) will silently ignore all unknown flags. This means that
users would need to avoid foot-gunning themselves constantly when
using this interface if it were part of openat(2).
* Resolution scoping feels like a different operation to the existing
O_* flags. And since openat(2) has limited flag space, it seems to be
quite wasteful to clutter it with 5 flags that are all
resolution-related. Arguably O_NOFOLLOw is also a resolution flag but
its entire purpose is to error out if you encounter a trailing
symlink not to scope resolution.
* Other systems would be able to reimplement this syscall allowing for
cross-OS standardisation rather than being hidden amongst O_* flags
which may result in it not being used by all the parties that might
want to use it (file servers, web servers, container runtimes, etc).
* It gives us the opportunity to iterate on the O_PATH interface in the
future. There are some potential security improvements that can be
made to O_PATH (handling /proc/self/fd re-opening of file descriptors
much more sanely) which could be made even better with some other
bits (such as ACC_MODE bits which work for O_PATH).
To this end, we introduce the resolveat(2) syscall. At the moment it's
effectively another way of getting a bog-standard O_PATH descriptor but
with the ability to use the new LOOKUP_* flags.
Because resolveat(2) only provides the ability to get O_PATH
descriptors, users will need to get creative with /proc/self/fd in order
to get a usable file descriptor for other uses. However, in future we
can add O_EMPTYPATH support to openat(2) which would allow for
re-opening without procfs (though as mentioned above there are some
security improvements that should be made to the interfaces).
NOTE: This patch adds the syscall to all architectures using the new
unified syscall numbering, but several architectures are missing
newer (nr > 423) syscalls -- hence the uneven gaps in the syscall
tables.
Cc: Christian Brauner <christian@brauner.io>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/ia64/kernel/syscalls/syscall.tbl | 1 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
fs/namei.c | 46 +++++++++++++++++++++
include/uapi/linux/fcntl.h | 12 ++++++
18 files changed, 74 insertions(+)
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 63ed39cbd3bd..72f431b1dc9c 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -461,5 +461,6 @@
530 common getegid sys_getegid
531 common geteuid sys_geteuid
532 common getppid sys_getppid
+533 common resolveat sys_resolveat
# all other architectures have common numbers for new syscall, alpha
# is the exception.
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 9016f4081bb9..1bc0282a67f7 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -437,3 +437,4 @@
421 common rt_sigtimedwait_time64 sys_rt_sigtimedwait
422 common futex_time64 sys_futex
423 common sched_rr_get_interval_time64 sys_sched_rr_get_interval
+428 common resolveat sys_resolveat
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index ab9cda5f6136..d3ae73ffaf48 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -344,3 +344,4 @@
332 common pkey_free sys_pkey_free
333 common rseq sys_rseq
# 334 through 423 are reserved to sync up with other architectures
+428 common resolveat sys_resolveat
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 125c14178979..81b7389e9e58 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -423,3 +423,4 @@
421 common rt_sigtimedwait_time64 sys_rt_sigtimedwait
422 common futex_time64 sys_futex
423 common sched_rr_get_interval_time64 sys_sched_rr_get_interval
+428 common resolveat sys_resolveat
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 8ee3a8c18498..626aed10e2b5 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -429,3 +429,4 @@
421 common rt_sigtimedwait_time64 sys_rt_sigtimedwait
422 common futex_time64 sys_futex
423 common sched_rr_get_interval_time64 sys_sched_rr_get_interval
+428 common resolveat sys_resolveat
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 15f4117900ee..8cbd6032d6bf 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -362,3 +362,4 @@
421 n32 rt_sigtimedwait_time64 compat_sys_rt_sigtimedwait_time64
422 n32 futex_time64 sys_futex
423 n32 sched_rr_get_interval_time64 sys_sched_rr_get_interval
+428 n32 resolveat sys_resolveat
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index c85502e67b44..234923a1fc88 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -338,3 +338,4 @@
327 n64 rseq sys_rseq
328 n64 io_pgetevents sys_io_pgetevents
# 329 through 423 are reserved to sync up with other architectures
+428 n64 resolveat sys_resolveat
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 2e063d0f837e..7b4586acf35d 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -411,3 +411,4 @@
421 o32 rt_sigtimedwait_time64 sys_rt_sigtimedwait compat_sys_rt_sigtimedwait_time64
422 o32 futex_time64 sys_futex sys_futex
423 o32 sched_rr_get_interval_time64 sys_sched_rr_get_interval sys_sched_rr_get_interval
+428 o32 resolveat sys_resolveat sys_resolveat
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index b26766c6647d..19a9a92dc5f8 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -420,3 +420,4 @@
421 32 rt_sigtimedwait_time64 sys_rt_sigtimedwait compat_sys_rt_sigtimedwait_time64
422 32 futex_time64 sys_futex sys_futex
423 32 sched_rr_get_interval_time64 sys_sched_rr_get_interval sys_sched_rr_get_interval
+428 common resolveat sys_resolveat sys_resolveat
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index b18abb0c3dae..bfcd75b928de 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -505,3 +505,4 @@
421 32 rt_sigtimedwait_time64 sys_rt_sigtimedwait compat_sys_rt_sigtimedwait_time64
422 32 futex_time64 sys_futex sys_futex
423 32 sched_rr_get_interval_time64 sys_sched_rr_get_interval sys_sched_rr_get_interval
+428 common resolveat sys_resolveat sys_resolveat
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 02579f95f391..084e51f02e65 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -426,3 +426,4 @@
421 32 rt_sigtimedwait_time64 - compat_sys_rt_sigtimedwait_time64
422 32 futex_time64 - sys_futex
423 32 sched_rr_get_interval_time64 - sys_sched_rr_get_interval
+428 common resolveat sys_resolveat -
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index bfda678576e4..e9115c5cec72 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -426,3 +426,4 @@
421 common rt_sigtimedwait_time64 sys_rt_sigtimedwait
422 common futex_time64 sys_futex
423 common sched_rr_get_interval_time64 sys_sched_rr_get_interval
+428 common resolveat sys_resolveat
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index b9a5a04b2d2c..2d3fdd913d89 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -469,3 +469,4 @@
421 32 rt_sigtimedwait_time64 sys_rt_sigtimedwait compat_sys_rt_sigtimedwait_time64
422 32 futex_time64 sys_futex sys_futex
423 32 sched_rr_get_interval_time64 sys_sched_rr_get_interval sys_sched_rr_get_interval
+428 common resolveat sys_resolveat sys_resolveat
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 4cd5f982b1e5..101feef22473 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -438,3 +438,4 @@
425 i386 io_uring_setup sys_io_uring_setup __ia32_sys_io_uring_setup
426 i386 io_uring_enter sys_io_uring_enter __ia32_sys_io_uring_enter
427 i386 io_uring_register sys_io_uring_register __ia32_sys_io_uring_register
+428 i386 resolveat sys_resolveat __ia32_sys_resolveat
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 64ca0d06259a..c7c197bf07bc 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -355,6 +355,7 @@
425 common io_uring_setup __x64_sys_io_uring_setup
426 common io_uring_enter __x64_sys_io_uring_enter
427 common io_uring_register __x64_sys_io_uring_register
+428 common resolveat __x64_sys_resolveat
#
# x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 6af49929de85..86c44f15dfa4 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -394,3 +394,4 @@
421 common rt_sigtimedwait_time64 sys_rt_sigtimedwait
422 common futex_time64 sys_futex
423 common sched_rr_get_interval_time64 sys_sched_rr_get_interval
+428 common resolveat sys_resolveat
diff --git a/fs/namei.c b/fs/namei.c
index 2b6a1bf4e745..30db5833aac3 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3656,6 +3656,52 @@ struct file *do_filp_open(int dfd, struct filename *pathname,
return filp;
}
+SYSCALL_DEFINE3(resolveat, int, dfd, const char __user *, path,
+ unsigned long, flags)
+{
+ int fd;
+ struct filename *tmp;
+ struct open_flags op = {
+ .open_flag = O_PATH,
+ };
+
+ if (flags & ~VALID_RESOLVE_FLAGS)
+ return -EINVAL;
+
+ if (flags & RESOLVE_CLOEXEC)
+ op.open_flag |= O_CLOEXEC;
+ if (!(flags & RESOLVE_NOFOLLOW))
+ op.lookup_flags |= LOOKUP_FOLLOW;
+ if (flags & RESOLVE_BENEATH)
+ op.lookup_flags |= LOOKUP_BENEATH;
+ if (flags & RESOLVE_XDEV)
+ op.lookup_flags |= LOOKUP_XDEV;
+ if (flags & RESOLVE_NO_MAGICLINKS)
+ op.lookup_flags |= LOOKUP_NO_MAGICLINKS;
+ if (flags & RESOLVE_NO_SYMLINKS)
+ op.lookup_flags |= LOOKUP_NO_SYMLINKS;
+ if (flags & RESOLVE_THIS_ROOT)
+ op.lookup_flags |= LOOKUP_IN_ROOT;
+
+ tmp = getname(path);
+ if (IS_ERR(tmp))
+ return PTR_ERR(tmp);
+
+ fd = get_unused_fd_flags(op.open_flag);
+ if (fd >= 0) {
+ struct file *f = do_filp_open(dfd, tmp, &op);
+ if (IS_ERR(f)) {
+ put_unused_fd(fd);
+ fd = PTR_ERR(f);
+ } else {
+ fsnotify_open(f);
+ fd_install(fd, f);
+ }
+ }
+ putname(tmp);
+ return fd;
+}
+
struct file *do_file_open_root(struct dentry *dentry, struct vfsmount *mnt,
const char *name, const struct open_flags *op)
{
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index bfca7f87cd2a..d5a2a97929c6 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -100,4 +100,16 @@
#define AT_NO_SYMLINKS 0x080000 /* - Block all symlinks (implies AT_NO_MAGICLINKS). */
#define AT_THIS_ROOT 0x100000 /* - Scope ".." resolution to dirfd (like chroot(2)). */
+/* First two bits of RESOLVE_* are reserved for future ACC_MODE extensions. */
+#define RESOLVE_CLOEXEC 0x004 /* Set O_CLOEXEC on the returned fd. */
+#define RESOLVE_NOFOLLOW 0x008 /* Don't follow trailing symlinks. */
+#define RESOLVE_RESOLUTION_TYPE 0x1F0 /* Type of path-resolution scoping we are applying. */
+#define RESOLVE_BENEATH 0x010 /* - Block "lexical" trickery like "..", symlinks, absolute paths, etc. */
+#define RESOLVE_XDEV 0x020 /* - Block mount-point crossings (includes bind-mounts). */
+#define RESOLVE_NO_MAGICLINKS 0x040 /* - Block procfs-style "magic" symlinks. */
+#define RESOLVE_NO_SYMLINKS 0x080 /* - Block all symlinks (implies AT_NO_MAGICLINKS). */
+#define RESOLVE_THIS_ROOT 0x100 /* - Scope ".." resolution to dirfd (like chroot(2)). */
+
+#define VALID_RESOLVE_FLAGS (RESOLVE_CLOEXEC | RESOLVE_NOFOLLOW | RESOLVE_RESOLUTION_TYPE)
+
#endif /* _UAPI_LINUX_FCNTL_H */
--
2.21.0
prev parent reply other threads:[~2019-05-06 16:56 UTC|newest]
Thread overview: 59+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-06 16:54 [PATCH v6 0/6] namei: resolveat(2) path resolution restriction API Aleksa Sarai
2019-05-06 16:54 ` [PATCH v6 1/6] namei: split out nd->dfd handling to dirfd_path_init Aleksa Sarai
2019-05-06 16:54 ` [PATCH v6 2/6] namei: O_BENEATH-style path resolution flags Aleksa Sarai
2019-05-06 16:54 ` [PATCH v6 3/6] namei: LOOKUP_IN_ROOT: chroot-like path resolution Aleksa Sarai
2019-05-06 16:54 ` [PATCH v6 4/6] namei: aggressively check for nd->root escape on ".." resolution Aleksa Sarai
2019-05-06 16:54 ` [PATCH v6 5/6] binfmt_*: scope path resolution of interpreters Aleksa Sarai
2019-05-06 18:37 ` Jann Horn
2019-05-06 18:37 ` Jann Horn
2019-05-06 19:17 ` Aleksa Sarai
2019-05-06 19:17 ` Aleksa Sarai
2019-05-06 23:41 ` Andy Lutomirski
2019-05-06 23:41 ` Andy Lutomirski
2019-05-08 0:54 ` Aleksa Sarai
2019-05-08 0:54 ` Aleksa Sarai
2019-05-10 20:41 ` Jann Horn
2019-05-10 20:41 ` Jann Horn
2019-05-10 21:20 ` Andy Lutomirski
2019-05-10 21:20 ` Andy Lutomirski
2019-05-10 22:55 ` Jann Horn
2019-05-10 22:55 ` Jann Horn
2019-05-10 23:36 ` Christian Brauner
2019-05-10 23:36 ` Christian Brauner
2019-05-11 15:49 ` Aleksa Sarai
2019-05-11 15:49 ` Aleksa Sarai
2019-05-11 17:00 ` Andy Lutomirski
2019-05-11 17:00 ` Andy Lutomirski
2019-05-11 17:21 ` Linus Torvalds
2019-05-11 17:21 ` Linus Torvalds
2019-05-11 17:26 ` Linus Torvalds
2019-05-11 17:26 ` Linus Torvalds
2019-05-11 17:31 ` Aleksa Sarai
2019-05-11 17:31 ` Aleksa Sarai
2019-05-11 17:43 ` Linus Torvalds
2019-05-11 17:43 ` Linus Torvalds
2019-05-11 17:48 ` Christian Brauner
2019-05-11 17:48 ` Christian Brauner
2019-05-11 18:00 ` Aleksa Sarai
2019-05-11 18:00 ` Aleksa Sarai
2019-05-11 22:39 ` Andy Lutomirski
2019-05-11 22:39 ` Andy Lutomirski
[not found] ` <CAHk-=wg3+3GfHsHdB4o78jNiPh_5ShrzxBuTN-Y8EZfiFMhCvw@mail.gmail.com>
2019-05-12 10:19 ` Christian Brauner
2019-05-12 10:19 ` Christian Brauner
[not found] ` <9CD2B97D-A6BD-43BE-9040-B410D996A195@amacapital.net>
2019-05-12 10:44 ` Linus Torvalds
2019-05-12 10:44 ` Linus Torvalds
2019-05-12 13:35 ` Aleksa Sarai
2019-05-12 13:35 ` Aleksa Sarai
2019-05-12 13:38 ` Aleksa Sarai
2019-05-12 13:38 ` Aleksa Sarai
2019-05-12 14:34 ` Andy Lutomirski
2019-05-12 14:34 ` Andy Lutomirski
2019-05-11 17:26 ` Aleksa Sarai
2019-05-11 17:26 ` Aleksa Sarai
2019-05-08 0:38 ` Eric W. Biederman
2019-05-08 0:38 ` Eric W. Biederman
2019-05-10 20:10 ` Jann Horn
2019-05-10 20:10 ` Jann Horn
2019-05-10 20:10 ` Jann Horn
2019-05-10 20:10 ` Jann Horn
2019-05-06 16:54 ` Aleksa Sarai [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190506165439.9155-7-cyphar@cyphar.com \
--to=cyphar@cyphar.com \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=asarai@suse.de \
--cc=ast@kernel.org \
--cc=bfields@fieldses.org \
--cc=chanho.min@lge.com \
--cc=christian@brauner.io \
--cc=containers@lists.linux-foundation.org \
--cc=dhowells@redhat.com \
--cc=drysdale@google.com \
--cc=ebiederm@xmission.com \
--cc=jannh@google.com \
--cc=jlayton@kernel.org \
--cc=keescook@chromium.org \
--cc=linux-api@vger.kernel.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=oleg@redhat.com \
--cc=torvalds@linux-foundation.org \
--cc=tycho@tycho.ws \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.