linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Relocate execve() sanity checks
@ 2020-05-18  5:54 Kees Cook
  2020-05-18  5:54 ` [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES Kees Cook
                   ` (4 more replies)
  0 siblings, 5 replies; 28+ messages in thread
From: Kees Cook @ 2020-05-18  5:54 UTC (permalink / raw)
  To: Al Viro
  Cc: Kees Cook, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

Hi,

While looking at the code paths for the proposed O_MAYEXEC flag, I saw
some things that looked like they should be fixed up.

  exec: Change uselib(2) IS_SREG() failure to EACCES
	This just regularizes the return code on uselib(2).

  exec: Relocate S_ISREG() check
	This moves the S_ISREG() check even earlier than it was already.

  exec: Relocate path_noexec() check
	This adds the path_noexec() check to the same place as the
	S_ISREG() check.

  fs: Include FMODE_EXEC when converting flags to f_mode
	This seemed like an oversight, but I suspect there is some
	reason I couldn't find for why FMODE_EXEC doesn't get set in
	f_mode and just stays in f_flags.

Thanks!

-Kees


Kees Cook (4):
  exec: Change uselib(2) IS_SREG() failure to EACCES
  exec: Relocate S_ISREG() check
  exec: Relocate path_noexec() check
  fs: Include FMODE_EXEC when converting flags to f_mode

 fs/exec.c                | 13 +++++++++----
 fs/namei.c               |  5 +++++
 fs/open.c                |  6 ------
 include/linux/fs.h       |  3 ++-
 include/linux/fsnotify.h |  4 ++--
 5 files changed, 18 insertions(+), 13 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-18  5:54 [PATCH 0/4] Relocate execve() sanity checks Kees Cook
@ 2020-05-18  5:54 ` Kees Cook
  2020-05-18 13:02   ` Christian Brauner
  2020-05-18  5:54 ` [PATCH 2/4] exec: Relocate S_ISREG() check Kees Cook
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2020-05-18  5:54 UTC (permalink / raw)
  To: Al Viro
  Cc: Kees Cook, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

Change uselib(2)' S_ISREG() error return to EACCES instead of EINVAL so
the behavior matches execve(2), and the seemingly documented value.
The "not a regular file" failure mode of execve(2) is explicitly
documented[1], but it is not mentioned in uselib(2)[2] which does,
however, say that open(2) and mmap(2) errors may apply. The documentation
for open(2) does not include a "not a regular file" error[3], but mmap(2)
does[4], and it is EACCES.

[1] http://man7.org/linux/man-pages/man2/execve.2.html#ERRORS
[2] http://man7.org/linux/man-pages/man2/uselib.2.html#ERRORS
[3] http://man7.org/linux/man-pages/man2/open.2.html#ERRORS
[4] http://man7.org/linux/man-pages/man2/mmap.2.html#ERRORS

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/exec.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 06b4c550af5d..30735ce1dc0e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -139,11 +139,10 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
 	if (IS_ERR(file))
 		goto out;
 
-	error = -EINVAL;
+	error = -EACCES;
 	if (!S_ISREG(file_inode(file)->i_mode))
 		goto exit;
 
-	error = -EACCES;
 	if (path_noexec(&file->f_path))
 		goto exit;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/4] exec: Relocate S_ISREG() check
  2020-05-18  5:54 [PATCH 0/4] Relocate execve() sanity checks Kees Cook
  2020-05-18  5:54 ` [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES Kees Cook
@ 2020-05-18  5:54 ` Kees Cook
       [not found]   ` <20200525091420.GI12456@shao2-debian>
  2020-05-18  5:54 ` [PATCH 3/4] exec: Relocate path_noexec() check Kees Cook
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2020-05-18  5:54 UTC (permalink / raw)
  To: Al Viro
  Cc: Kees Cook, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

The execve(2)/uselib(2) syscalls have always rejected non-regular
files. Recently, it was noticed that a deadlock was introduced when trying
to execute pipes, as the S_ISREG() test was happening too late. This was
fixed in commit 73601ea5b7b1 ("fs/open.c: allow opening only regular files
during execve()"), but it was added after inode_permission() had already
run, which meant LSMs could see bogus attempts to execute non-regular
files. Move the test earlier.

Also include a comment with the redundant S_ISREG() checks at the end of
execve(2)/uselib(2) to note that they are present to avoid any mistakes.

Finally, instead of dereferencing the inode, use dcache for S_ISREG()
test.

My notes on the call path, and related arguments, checks, etc:

do_open_execat()
    struct open_flags open_exec_flags = {
        .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC, ...
    do_filp_open(dfd, filename, open_flags)
        path_openat(nameidata, open_flags, flags)
            /* f_mode populated from open_flags in alloc_empty_file() */
            file = alloc_empty_file(open_flags, current_cred());
            do_open(nameidata, file, open_flags)
		/* new location of FMODE_EXEC vs S_ISREG() test */
                may_open(path, acc_mode, open_flag)
                    inode_permission(inode, MAY_OPEN | acc_mode)
                        security_inode_permission(inode, acc_mode)
                vfs_open(path, file)
                    do_dentry_open(file, path->dentry->d_inode, open)
                        /* old location of FMODE_EXEC vs S_ISREG() test */
                        security_file_open(f)
                        open()

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/exec.c  | 8 ++++++++
 fs/namei.c | 4 ++++
 fs/open.c  | 6 ------
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 30735ce1dc0e..f0c80a8b9ccd 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -139,6 +139,10 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
 	if (IS_ERR(file))
 		goto out;
 
+	/*
+	 * do_open() has already checked for this, but we can be extra
+	 * cautious and check again at the very end too.
+	 */
 	error = -EACCES;
 	if (!S_ISREG(file_inode(file)->i_mode))
 		goto exit;
@@ -860,6 +864,10 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
 	if (IS_ERR(file))
 		goto out;
 
+	/*
+	 * do_open() has already checked for this, but we can be extra
+	 * cautious and check again at the very end too.
+	 */
 	err = -EACCES;
 	if (!S_ISREG(file_inode(file)->i_mode))
 		goto exit;
diff --git a/fs/namei.c b/fs/namei.c
index a320371899cf..b9408aacaaa4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3212,6 +3212,10 @@ static int do_open(struct nameidata *nd,
 	if ((nd->flags & LOOKUP_DIRECTORY) && !d_can_lookup(nd->path.dentry))
 		return -ENOTDIR;
 
+	/* Any file opened for execution has to be a regular file. */
+	if ((file->f_flags & FMODE_EXEC) && !d_is_reg(nd->path.dentry))
+		return -EACCES;
+
 	do_truncate = false;
 	acc_mode = op->acc_mode;
 	if (file->f_mode & FMODE_CREATED) {
diff --git a/fs/open.c b/fs/open.c
index 719b320ede52..bb16e4e3cd57 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -753,12 +753,6 @@ static int do_dentry_open(struct file *f,
 		return 0;
 	}
 
-	/* Any file opened for execve()/uselib() has to be a regular file. */
-	if (unlikely(f->f_flags & FMODE_EXEC && !S_ISREG(inode->i_mode))) {
-		error = -EACCES;
-		goto cleanup_file;
-	}
-
 	if (f->f_mode & FMODE_WRITE && !special_file(inode->i_mode)) {
 		error = get_write_access(inode);
 		if (unlikely(error))
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/4] exec: Relocate path_noexec() check
  2020-05-18  5:54 [PATCH 0/4] Relocate execve() sanity checks Kees Cook
  2020-05-18  5:54 ` [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES Kees Cook
  2020-05-18  5:54 ` [PATCH 2/4] exec: Relocate S_ISREG() check Kees Cook
@ 2020-05-18  5:54 ` Kees Cook
  2020-05-18  5:54 ` [PATCH 4/4] fs: Include FMODE_EXEC when converting flags to f_mode Kees Cook
  2020-05-19 15:06 ` [PATCH 0/4] Relocate execve() sanity checks Eric W. Biederman
  4 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2020-05-18  5:54 UTC (permalink / raw)
  To: Al Viro
  Cc: Kees Cook, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

The path_noexec() check, like the regular file check, was happening too
late, letting LSMs see impossible execve()s. Check it earlier as well
and collect the redundant fs/exec.c path_noexec() test under the same
robustness comment as the S_ISREG() check.

My notes on the call path, and related arguments, checks, etc:

do_open_execat()
    struct open_flags open_exec_flags = {
        .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC, ...
    do_filp_open(dfd, filename, open_flags)
        path_openat(nameidata, open_flags, flags)
            /* f_mode populated from open_flags in alloc_empty_file() */
            file = alloc_empty_file(open_flags, current_cred());
            do_open(nameidata, file, open_flags)
                /* new location of FMODE_EXEC vs path_noexec() test */
                may_open(path, acc_mode, open_flag)
                    inode_permission(inode, MAY_OPEN | acc_mode)
                        security_inode_permission(inode, acc_mode)
                vfs_open(path, file)
                    do_dentry_open(file, path->dentry->d_inode, open)
                        security_file_open(f)
                        open()
    /* old location of path_noexec() test */

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/exec.c  | 6 ++----
 fs/namei.c | 5 +++--
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index f0c80a8b9ccd..a34093323aa1 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -140,13 +140,12 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
 		goto out;
 
 	/*
-	 * do_open() has already checked for this, but we can be extra
+	 * do_open() has already checked for these, but we can be extra
 	 * cautious and check again at the very end too.
 	 */
 	error = -EACCES;
 	if (!S_ISREG(file_inode(file)->i_mode))
 		goto exit;
-
 	if (path_noexec(&file->f_path))
 		goto exit;
 
@@ -865,13 +864,12 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
 		goto out;
 
 	/*
-	 * do_open() has already checked for this, but we can be extra
+	 * do_open() has already checked for these, but we can be extra
 	 * cautious and check again at the very end too.
 	 */
 	err = -EACCES;
 	if (!S_ISREG(file_inode(file)->i_mode))
 		goto exit;
-
 	if (path_noexec(&file->f_path))
 		goto exit;
 
diff --git a/fs/namei.c b/fs/namei.c
index b9408aacaaa4..6bb1b6624bad 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3212,8 +3212,9 @@ static int do_open(struct nameidata *nd,
 	if ((nd->flags & LOOKUP_DIRECTORY) && !d_can_lookup(nd->path.dentry))
 		return -ENOTDIR;
 
-	/* Any file opened for execution has to be a regular file. */
-	if ((file->f_flags & FMODE_EXEC) && !d_is_reg(nd->path.dentry))
+	/* Opening for execution requires a regular file on an exec mnt. */
+	if ((file->f_flags & FMODE_EXEC) && (!d_is_reg(nd->path.dentry) ||
+					     path_noexec(&nd->path)))
 		return -EACCES;
 
 	do_truncate = false;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/4] fs: Include FMODE_EXEC when converting flags to f_mode
  2020-05-18  5:54 [PATCH 0/4] Relocate execve() sanity checks Kees Cook
                   ` (2 preceding siblings ...)
  2020-05-18  5:54 ` [PATCH 3/4] exec: Relocate path_noexec() check Kees Cook
@ 2020-05-18  5:54 ` Kees Cook
  2020-05-19 15:06 ` [PATCH 0/4] Relocate execve() sanity checks Eric W. Biederman
  4 siblings, 0 replies; 28+ messages in thread
From: Kees Cook @ 2020-05-18  5:54 UTC (permalink / raw)
  To: Al Viro
  Cc: Kees Cook, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

Include FMODE_EXEC when building the f_mode field, so that code can
actually test the correct field and values. Only three places actually
examine f_flags for FMODE_EXEC:

fs/open.c:      if (unlikely((f->f_mode & FMODE_EXEC) && !S_ISREG(inode->i_mode))) {
include/linux/fsnotify.h:               if (file->f_mode & FMODE_EXEC) {
include/linux/fsnotify.h:       if (file->f_mode & FMODE_EXEC)

Signed-off-by: Kees Cook <keescook@chromium.org>
---
I assume there must be some reason for FMODE_EXEC not being pulled into
f_mode, but I couldn't find it. I guess this is my attempt to either fix
an oversight or to learn by flames. :)
---
 fs/namei.c               | 4 ++--
 include/linux/fs.h       | 3 ++-
 include/linux/fsnotify.h | 4 ++--
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 6bb1b6624bad..362b1cc75f5c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3213,8 +3213,8 @@ static int do_open(struct nameidata *nd,
 		return -ENOTDIR;
 
 	/* Opening for execution requires a regular file on an exec mnt. */
-	if ((file->f_flags & FMODE_EXEC) && (!d_is_reg(nd->path.dentry) ||
-					     path_noexec(&nd->path)))
+	if ((file->f_mode & FMODE_EXEC) && (!d_is_reg(nd->path.dentry) ||
+					    path_noexec(&nd->path)))
 		return -EACCES;
 
 	do_truncate = false;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4f6f59b4f22a..8a2cabdcf531 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3545,10 +3545,11 @@ int __init get_filesystem_list(char *buf);
 
 #define __FMODE_EXEC		((__force int) FMODE_EXEC)
 #define __FMODE_NONOTIFY	((__force int) FMODE_NONOTIFY)
+#define __SMUGGLED_FMODE_FLAGS	(__FMODE_EXEC | __FMODE_NONOTIFY)
 
 #define ACC_MODE(x) ("\004\002\006\006"[(x)&O_ACCMODE])
 #define OPEN_FMODE(flag) ((__force fmode_t)(((flag + 1) & O_ACCMODE) | \
-					    (flag & __FMODE_NONOTIFY)))
+					    (flag & __SMUGGLED_FMODE_FLAGS)))
 
 static inline bool is_sxid(umode_t mode)
 {
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 5ab28f6c7d26..86761ed4b434 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -90,7 +90,7 @@ static inline int fsnotify_perm(struct file *file, int mask)
 	if (mask & MAY_OPEN) {
 		fsnotify_mask = FS_OPEN_PERM;
 
-		if (file->f_flags & __FMODE_EXEC) {
+		if (file->f_mode & FMODE_EXEC) {
 			ret = fsnotify_file(file, FS_OPEN_EXEC_PERM);
 
 			if (ret)
@@ -264,7 +264,7 @@ static inline void fsnotify_open(struct file *file)
 {
 	__u32 mask = FS_OPEN;
 
-	if (file->f_flags & __FMODE_EXEC)
+	if (file->f_mode & FMODE_EXEC)
 		mask |= FS_OPEN_EXEC;
 
 	fsnotify_file(file, mask);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-18  5:54 ` [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES Kees Cook
@ 2020-05-18 13:02   ` Christian Brauner
  2020-05-18 14:43     ` Jann Horn
  0 siblings, 1 reply; 28+ messages in thread
From: Christian Brauner @ 2020-05-18 13:02 UTC (permalink / raw)
  To: Kees Cook
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

On Sun, May 17, 2020 at 10:54:54PM -0700, Kees Cook wrote:
> Change uselib(2)' S_ISREG() error return to EACCES instead of EINVAL so
> the behavior matches execve(2), and the seemingly documented value.
> The "not a regular file" failure mode of execve(2) is explicitly
> documented[1], but it is not mentioned in uselib(2)[2] which does,
> however, say that open(2) and mmap(2) errors may apply. The documentation
> for open(2) does not include a "not a regular file" error[3], but mmap(2)
> does[4], and it is EACCES.
> 
> [1] http://man7.org/linux/man-pages/man2/execve.2.html#ERRORS
> [2] http://man7.org/linux/man-pages/man2/uselib.2.html#ERRORS
> [3] http://man7.org/linux/man-pages/man2/open.2.html#ERRORS
> [4] http://man7.org/linux/man-pages/man2/mmap.2.html#ERRORS
> 
> Signed-off-by: Kees Cook <keescook@chromium.org>

This is all extremely weird.
uselib has been deprected since forever basically which makes me doubt
this matters much but:
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

Also - gulp (puts on flame proof suit) - may I suggest we check if there
are any distros out there that still set CONFIG_USELIB=y and if not do
what we did with the sysctl syscall and remove it? If someone yells we
can always backpaddle...

Christian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-18 13:02   ` Christian Brauner
@ 2020-05-18 14:43     ` Jann Horn
  2020-05-18 14:46       ` Christian Brauner
  0 siblings, 1 reply; 28+ messages in thread
From: Jann Horn @ 2020-05-18 14:43 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Kees Cook, Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, Linux API,
	kernel list

On Mon, May 18, 2020 at 3:03 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
> Also - gulp (puts on flame proof suit) - may I suggest we check if there
> are any distros out there that still set CONFIG_USELIB=y

Debian seems to have it enabled on x86...

https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/config/kernelarch-x86/config#L1896

A random Ubuntu 19.10 VM I have here has it enabled, too.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-18 14:43     ` Jann Horn
@ 2020-05-18 14:46       ` Christian Brauner
  2020-05-18 23:57         ` Eric W. Biederman
  0 siblings, 1 reply; 28+ messages in thread
From: Christian Brauner @ 2020-05-18 14:46 UTC (permalink / raw)
  To: Jann Horn
  Cc: Kees Cook, Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, Linux API,
	kernel list

On Mon, May 18, 2020 at 04:43:20PM +0200, Jann Horn wrote:
> On Mon, May 18, 2020 at 3:03 PM Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> > Also - gulp (puts on flame proof suit) - may I suggest we check if there
> > are any distros out there that still set CONFIG_USELIB=y
> 
> Debian seems to have it enabled on x86...
> 
> https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/config/kernelarch-x86/config#L1896
> 
> A random Ubuntu 19.10 VM I have here has it enabled, too.

I wonder if there's any program - apart from _ancient_ glibc out there
that actually use it...
I looked at uselib in codsearch but the results were quite unspecific
but I didn't look too close.

Christian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-18 14:46       ` Christian Brauner
@ 2020-05-18 23:57         ` Eric W. Biederman
  2020-05-19  8:11           ` Christian Brauner
  2020-05-19  8:37           ` Andreas Schwab
  0 siblings, 2 replies; 28+ messages in thread
From: Eric W. Biederman @ 2020-05-18 23:57 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jann Horn, Kees Cook, Al Viro, Andrew Morton, Tetsuo Handa,
	Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

Christian Brauner <christian.brauner@ubuntu.com> writes:

> On Mon, May 18, 2020 at 04:43:20PM +0200, Jann Horn wrote:
>> On Mon, May 18, 2020 at 3:03 PM Christian Brauner
>> <christian.brauner@ubuntu.com> wrote:
>> > Also - gulp (puts on flame proof suit) - may I suggest we check if there
>> > are any distros out there that still set CONFIG_USELIB=y
>> 
>> Debian seems to have it enabled on x86...
>> 
>> https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/config/kernelarch-x86/config#L1896
>> 
>> A random Ubuntu 19.10 VM I have here has it enabled, too.
>
> I wonder if there's any program - apart from _ancient_ glibc out there
> that actually use it...
> I looked at uselib in codsearch but the results were quite unspecific
> but I didn't look too close.

So the thing to do is to have a polite word with people who build Ubuntu
and Debian kernels and get them to disable the kernel .config.

A quick look suggets it is already disabled in RHEL8.  It cannot be
disabled in RHEL7.

Then in a few years we can come back and discuss removing the uselib
system call, base on no distributions having it enabled.

If it was only libc4 and libc5 that used the uselib system call then it
can probably be removed after enough time.

We can probably reorganize the code before the point it is clearly safe
to drop support for USELIB to keep it off to the side so USELIB does not
have any ongoing mainteance costs.

For this patchset I think we need to assume uselib will need to be
maintained for a bit longer.

Eric


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-18 23:57         ` Eric W. Biederman
@ 2020-05-19  8:11           ` Christian Brauner
  2020-05-19  8:37           ` Andreas Schwab
  1 sibling, 0 replies; 28+ messages in thread
From: Christian Brauner @ 2020-05-19  8:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Jann Horn, Kees Cook, Al Viro, Andrew Morton, Tetsuo Handa,
	Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

On Mon, May 18, 2020 at 06:57:15PM -0500, Eric W. Biederman wrote:
> Christian Brauner <christian.brauner@ubuntu.com> writes:
> 
> > On Mon, May 18, 2020 at 04:43:20PM +0200, Jann Horn wrote:
> >> On Mon, May 18, 2020 at 3:03 PM Christian Brauner
> >> <christian.brauner@ubuntu.com> wrote:
> >> > Also - gulp (puts on flame proof suit) - may I suggest we check if there
> >> > are any distros out there that still set CONFIG_USELIB=y
> >> 
> >> Debian seems to have it enabled on x86...
> >> 
> >> https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/config/kernelarch-x86/config#L1896
> >> 
> >> A random Ubuntu 19.10 VM I have here has it enabled, too.
> >
> > I wonder if there's any program - apart from _ancient_ glibc out there
> > that actually use it...
> > I looked at uselib in codsearch but the results were quite unspecific
> > but I didn't look too close.
> 
> So the thing to do is to have a polite word with people who build Ubuntu
> and Debian kernels and get them to disable the kernel .config.

Yeah, I think that's a sane thing to do.
I filed a bug for Ubuntu to start a discussion. I can't see an obvious
reason why not.

> 
> A quick look suggets it is already disabled in RHEL8.  It cannot be
> disabled in RHEL7.
> 
> Then in a few years we can come back and discuss removing the uselib
> system call, base on no distributions having it enabled.
> 
> If it was only libc4 and libc5 that used the uselib system call then it
> can probably be removed after enough time.
> 
> We can probably reorganize the code before the point it is clearly safe
> to drop support for USELIB to keep it off to the side so USELIB does not
> have any ongoing mainteance costs.
> 
> For this patchset I think we need to assume uselib will need to be
> maintained for a bit longer.

Yeah, agreed. It doesn't matter as long as we have a plan for the future
to remove it. I don't think keeping this cruft around forever should be
the only outlook.

Christian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-18 23:57         ` Eric W. Biederman
  2020-05-19  8:11           ` Christian Brauner
@ 2020-05-19  8:37           ` Andreas Schwab
  2020-05-19 11:56             ` Eric W. Biederman
  1 sibling, 1 reply; 28+ messages in thread
From: Andreas Schwab @ 2020-05-19  8:37 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Christian Brauner, Jann Horn, Kees Cook, Al Viro, Andrew Morton,
	Tetsuo Handa, Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

On Mai 18 2020, Eric W. Biederman wrote:

> If it was only libc4 and libc5 that used the uselib system call then it
> can probably be removed after enough time.

Only libc4 used it, libc5 was already ELF.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-19  8:37           ` Andreas Schwab
@ 2020-05-19 11:56             ` Eric W. Biederman
  2020-05-19 12:12               ` Andreas Schwab
  2020-05-19 13:13               ` Christian Brauner
  0 siblings, 2 replies; 28+ messages in thread
From: Eric W. Biederman @ 2020-05-19 11:56 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Christian Brauner, Jann Horn, Kees Cook, Al Viro, Andrew Morton,
	Tetsuo Handa, Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

Andreas Schwab <schwab@linux-m68k.org> writes:

> On Mai 18 2020, Eric W. Biederman wrote:
>
>> If it was only libc4 and libc5 that used the uselib system call then it
>> can probably be removed after enough time.
>
> Only libc4 used it, libc5 was already ELF.

binfmt_elf.c supports uselib.  In a very a.out ish way.  Do you know if
that support was ever used?

If we are truly talking a.out only we should be able to make uselib
conditional on a.out support in the kernel which is strongly mostly
disabled at this point.

I am wondering if there are source trees for libc4 or libc5 around
anywhere that we can look at to see how usage of uselib evolved.

Eric

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-19 11:56             ` Eric W. Biederman
@ 2020-05-19 12:12               ` Andreas Schwab
  2020-05-19 12:28                 ` Eric W. Biederman
  2020-05-19 13:13               ` Christian Brauner
  1 sibling, 1 reply; 28+ messages in thread
From: Andreas Schwab @ 2020-05-19 12:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Christian Brauner, Jann Horn, Kees Cook, Al Viro, Andrew Morton,
	Tetsuo Handa, Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

On Mai 19 2020, Eric W. Biederman wrote:

> I am wondering if there are source trees for libc4 or libc5 around
> anywhere that we can look at to see how usage of uselib evolved.

libc5 is available from archive.debian.org.

http://archive.debian.org/debian-archive/debian/pool/main/libc/libc/libc_5.4.46.orig.tar.gz

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-19 12:12               ` Andreas Schwab
@ 2020-05-19 12:28                 ` Eric W. Biederman
  2020-05-19 13:29                   ` Christian Brauner
  0 siblings, 1 reply; 28+ messages in thread
From: Eric W. Biederman @ 2020-05-19 12:28 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Christian Brauner, Jann Horn, Kees Cook, Al Viro, Andrew Morton,
	Tetsuo Handa, Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

Andreas Schwab <schwab@linux-m68k.org> writes:

> On Mai 19 2020, Eric W. Biederman wrote:
>
>> I am wondering if there are source trees for libc4 or libc5 around
>> anywhere that we can look at to see how usage of uselib evolved.
>
> libc5 is available from archive.debian.org.
>
> http://archive.debian.org/debian-archive/debian/pool/main/libc/libc/libc_5.4.46.orig.tar.gz

Interesting.

It appears that the old a.out code to make use of uselib remained in
the libc5 sources but it was all conditional on the being compiled not
to use ELF.

libc5 did provide a wrapper for the uselib system call.

It appears glibc also provides a wrapper for the uselib system call
named: uselib@GLIBC_2.2.5.

I don't see a glibc header file that provides a declaration for uselib
though.

So the question becomes did anyone use those glibc wrappers.

Eric



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-19 11:56             ` Eric W. Biederman
  2020-05-19 12:12               ` Andreas Schwab
@ 2020-05-19 13:13               ` Christian Brauner
  2020-05-19 14:32                 ` Geert Uytterhoeven
  1 sibling, 1 reply; 28+ messages in thread
From: Christian Brauner @ 2020-05-19 13:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andreas Schwab, Jann Horn, Kees Cook, Al Viro, Andrew Morton,
	Tetsuo Handa, Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

On Tue, May 19, 2020 at 06:56:36AM -0500, Eric W. Biederman wrote:
> Andreas Schwab <schwab@linux-m68k.org> writes:
> 
> > On Mai 18 2020, Eric W. Biederman wrote:
> >
> >> If it was only libc4 and libc5 that used the uselib system call then it
> >> can probably be removed after enough time.
> >
> > Only libc4 used it, libc5 was already ELF.
> 
> binfmt_elf.c supports uselib.  In a very a.out ish way.  Do you know if
> that support was ever used?
> 
> If we are truly talking a.out only we should be able to make uselib
> conditional on a.out support in the kernel which is strongly mostly
> disabled at this point.

The only ones that even allow setting AOUT:

arch/alpha/Kconfig:     select HAVE_AOUT
arch/m68k/Kconfig:      select HAVE_AOUT if MMU

and x86 deprecated it March 2019:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eac616557050737a8d6ef6fe0322d0980ff0ffde

Christian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-19 12:28                 ` Eric W. Biederman
@ 2020-05-19 13:29                   ` Christian Brauner
  2020-05-19 14:49                     ` Eric W. Biederman
  0 siblings, 1 reply; 28+ messages in thread
From: Christian Brauner @ 2020-05-19 13:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andreas Schwab, Jann Horn, Kees Cook, Al Viro, Andrew Morton,
	Tetsuo Handa, Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

On Tue, May 19, 2020 at 07:28:46AM -0500, Eric W. Biederman wrote:
> Andreas Schwab <schwab@linux-m68k.org> writes:
> 
> > On Mai 19 2020, Eric W. Biederman wrote:
> >
> >> I am wondering if there are source trees for libc4 or libc5 around
> >> anywhere that we can look at to see how usage of uselib evolved.
> >
> > libc5 is available from archive.debian.org.
> >
> > http://archive.debian.org/debian-archive/debian/pool/main/libc/libc/libc_5.4.46.orig.tar.gz
> 
> Interesting.
> 
> It appears that the old a.out code to make use of uselib remained in
> the libc5 sources but it was all conditional on the being compiled not
> to use ELF.
> 
> libc5 did provide a wrapper for the uselib system call.
> 
> It appears glibc also provides a wrapper for the uselib system call
> named: uselib@GLIBC_2.2.5.
> 
> I don't see a glibc header file that provides a declaration for uselib
> though.
> 
> So the question becomes did anyone use those glibc wrappers.

The only software I could find was ski, the ia64 instruction set
emulator, which apparently used to make use of this and when glibc
removed they did:

#define uselib(libname) syscall(__NR_uselib, libname)

but they only define it for the sake of the internal syscall list they
maintain so not actively using it. I just checked, ski is available on
Fedora 31 and Fedora has USELIB disabled.
Codesearch on Debian yields no users that actively use the syscall for
anything.

Christian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-19 13:13               ` Christian Brauner
@ 2020-05-19 14:32                 ` Geert Uytterhoeven
  2020-05-19 14:47                   ` Christian Brauner
  0 siblings, 1 reply; 28+ messages in thread
From: Geert Uytterhoeven @ 2020-05-19 14:32 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Eric W. Biederman, Andreas Schwab, Jann Horn, Kees Cook, Al Viro,
	Andrew Morton, Tetsuo Handa, Eric Biggers, Dmitry Vyukov,
	linux-fsdevel, linux-security-module, Linux API, kernel list

Hi Christian,

On Tue, May 19, 2020 at 3:15 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
> On Tue, May 19, 2020 at 06:56:36AM -0500, Eric W. Biederman wrote:
> > Andreas Schwab <schwab@linux-m68k.org> writes:
> > > On Mai 18 2020, Eric W. Biederman wrote:
> > >> If it was only libc4 and libc5 that used the uselib system call then it
> > >> can probably be removed after enough time.
> > >
> > > Only libc4 used it, libc5 was already ELF.
> >
> > binfmt_elf.c supports uselib.  In a very a.out ish way.  Do you know if
> > that support was ever used?
> >
> > If we are truly talking a.out only we should be able to make uselib
> > conditional on a.out support in the kernel which is strongly mostly
> > disabled at this point.
>
> The only ones that even allow setting AOUT:
>
> arch/alpha/Kconfig:     select HAVE_AOUT
> arch/m68k/Kconfig:      select HAVE_AOUT if MMU
>
> and x86 deprecated it March 2019:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eac616557050737a8d6ef6fe0322d0980ff0ffde

Quoting myself (for the second time this month):
   "I think it's safe to assume no one still runs a.out binaries on m68k."
    http://lore.kernel.org/r/CAMuHMdW+m0Q+j3rsQdMXnrEPm+XB5Y2AQrxW5sD1mZAKgmEqoA@mail.gmail.com

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-19 14:32                 ` Geert Uytterhoeven
@ 2020-05-19 14:47                   ` Christian Brauner
  0 siblings, 0 replies; 28+ messages in thread
From: Christian Brauner @ 2020-05-19 14:47 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Eric W. Biederman, Andreas Schwab, Jann Horn, Kees Cook, Al Viro,
	Andrew Morton, Tetsuo Handa, Eric Biggers, Dmitry Vyukov,
	linux-fsdevel, linux-security-module, Linux API, kernel list

On Tue, May 19, 2020 at 04:32:38PM +0200, Geert Uytterhoeven wrote:
> Hi Christian,
> 
> On Tue, May 19, 2020 at 3:15 PM Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> > On Tue, May 19, 2020 at 06:56:36AM -0500, Eric W. Biederman wrote:
> > > Andreas Schwab <schwab@linux-m68k.org> writes:
> > > > On Mai 18 2020, Eric W. Biederman wrote:
> > > >> If it was only libc4 and libc5 that used the uselib system call then it
> > > >> can probably be removed after enough time.
> > > >
> > > > Only libc4 used it, libc5 was already ELF.
> > >
> > > binfmt_elf.c supports uselib.  In a very a.out ish way.  Do you know if
> > > that support was ever used?
> > >
> > > If we are truly talking a.out only we should be able to make uselib
> > > conditional on a.out support in the kernel which is strongly mostly
> > > disabled at this point.
> >
> > The only ones that even allow setting AOUT:
> >
> > arch/alpha/Kconfig:     select HAVE_AOUT
> > arch/m68k/Kconfig:      select HAVE_AOUT if MMU
> >
> > and x86 deprecated it March 2019:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eac616557050737a8d6ef6fe0322d0980ff0ffde
> 
> Quoting myself (for the second time this month):
>    "I think it's safe to assume no one still runs a.out binaries on m68k."
>     http://lore.kernel.org/r/CAMuHMdW+m0Q+j3rsQdMXnrEPm+XB5Y2AQrxW5sD1mZAKgmEqoA@mail.gmail.com

Well, it's just a great thing to hear. :)

Fyi, I once tried to give a demo with a Slackware version in a container
which pre-dated ELF on a x86 kernel with AOUT. It didn't go well...

Christian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES
  2020-05-19 13:29                   ` Christian Brauner
@ 2020-05-19 14:49                     ` Eric W. Biederman
  0 siblings, 0 replies; 28+ messages in thread
From: Eric W. Biederman @ 2020-05-19 14:49 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andreas Schwab, Jann Horn, Kees Cook, Al Viro, Andrew Morton,
	Tetsuo Handa, Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, Linux API, kernel list

Christian Brauner <christian.brauner@ubuntu.com> writes:

> On Tue, May 19, 2020 at 07:28:46AM -0500, Eric W. Biederman wrote:
>> Andreas Schwab <schwab@linux-m68k.org> writes:
>> 
>> > On Mai 19 2020, Eric W. Biederman wrote:
>> >
>> >> I am wondering if there are source trees for libc4 or libc5 around
>> >> anywhere that we can look at to see how usage of uselib evolved.
>> >
>> > libc5 is available from archive.debian.org.
>> >
>> > http://archive.debian.org/debian-archive/debian/pool/main/libc/libc/libc_5.4.46.orig.tar.gz
>> 
>> Interesting.
>> 
>> It appears that the old a.out code to make use of uselib remained in
>> the libc5 sources but it was all conditional on the being compiled not
>> to use ELF.
>> 
>> libc5 did provide a wrapper for the uselib system call.
>> 
>> It appears glibc also provides a wrapper for the uselib system call
>> named: uselib@GLIBC_2.2.5.
>> 
>> I don't see a glibc header file that provides a declaration for uselib
>> though.
>> 
>> So the question becomes did anyone use those glibc wrappers.
>
> The only software I could find was ski, the ia64 instruction set
> emulator, which apparently used to make use of this and when glibc
> removed they did:
>
> #define uselib(libname) syscall(__NR_uselib, libname)
>
> but they only define it for the sake of the internal syscall list they
> maintain so not actively using it. I just checked, ski is available on
> Fedora 31 and Fedora has USELIB disabled.
> Codesearch on Debian yields no users that actively use the syscall for
> anything.

I think there is a very good argument that no one builds libraries
usable with uselib anymore.  The ELF version requires a ET_EXEC binary
with one PT_LOAD segment that is loaded at a fixed virtual address.
This is a format that does not allow for relocation processing, the
loading executable has to ``know'' where the symbols are in the loaded
binary, and they have to be build to run at distinct virtual addresses.

I think I could conjure up some linker scripts to do that with no more
linker support than we use to build the kernel, but it is not easy to
maintain binaries and libraries like that as code changes.  Which is
why we switched to ELF in the first place.

I think the tooling challenges plus not being able to find anything
using uselib anymore make a solid argument for going to a distribution
and asking them to stop enabling CONFIG_USELIB in their kernels.

Eric



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/4] Relocate execve() sanity checks
  2020-05-18  5:54 [PATCH 0/4] Relocate execve() sanity checks Kees Cook
                   ` (3 preceding siblings ...)
  2020-05-18  5:54 ` [PATCH 4/4] fs: Include FMODE_EXEC when converting flags to f_mode Kees Cook
@ 2020-05-19 15:06 ` Eric W. Biederman
  2020-05-19 16:26   ` Kees Cook
  4 siblings, 1 reply; 28+ messages in thread
From: Eric W. Biederman @ 2020-05-19 15:06 UTC (permalink / raw)
  To: Kees Cook
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

Kees Cook <keescook@chromium.org> writes:

> Hi,
>
> While looking at the code paths for the proposed O_MAYEXEC flag, I saw
> some things that looked like they should be fixed up.
>
>   exec: Change uselib(2) IS_SREG() failure to EACCES
> 	This just regularizes the return code on uselib(2).
>
>   exec: Relocate S_ISREG() check
> 	This moves the S_ISREG() check even earlier than it was already.
>
>   exec: Relocate path_noexec() check
> 	This adds the path_noexec() check to the same place as the
> 	S_ISREG() check.
>
>   fs: Include FMODE_EXEC when converting flags to f_mode
> 	This seemed like an oversight, but I suspect there is some
> 	reason I couldn't find for why FMODE_EXEC doesn't get set in
> 	f_mode and just stays in f_flags.

So I took a look at this series.

I think the belt and suspenders approach of adding code in open and then
keeping it in exec and uselib is probably wrong.  My sense of the
situation is a belt and suspenders approach is more likely to be
confusing and result in people making mistakes when maintaining the code
than to actually be helpful.

Eric

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/4] Relocate execve() sanity checks
  2020-05-19 15:06 ` [PATCH 0/4] Relocate execve() sanity checks Eric W. Biederman
@ 2020-05-19 16:26   ` Kees Cook
  2020-05-19 17:41     ` Eric W. Biederman
  0 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2020-05-19 16:26 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

On Tue, May 19, 2020 at 10:06:32AM -0500, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > Hi,
> >
> > While looking at the code paths for the proposed O_MAYEXEC flag, I saw
> > some things that looked like they should be fixed up.
> >
> >   exec: Change uselib(2) IS_SREG() failure to EACCES
> > 	This just regularizes the return code on uselib(2).
> >
> >   exec: Relocate S_ISREG() check
> > 	This moves the S_ISREG() check even earlier than it was already.
> >
> >   exec: Relocate path_noexec() check
> > 	This adds the path_noexec() check to the same place as the
> > 	S_ISREG() check.
> >
> >   fs: Include FMODE_EXEC when converting flags to f_mode
> > 	This seemed like an oversight, but I suspect there is some
> > 	reason I couldn't find for why FMODE_EXEC doesn't get set in
> > 	f_mode and just stays in f_flags.
> 
> So I took a look at this series.
> 
> I think the belt and suspenders approach of adding code in open and then
> keeping it in exec and uselib is probably wrong.  My sense of the
> situation is a belt and suspenders approach is more likely to be
> confusing and result in people making mistakes when maintaining the code
> than to actually be helpful.

This is why I added the comments in fs/exec.c's redundant checks. When I
was originally testing this series, I had entirely removed the checks in
fs/exec.c, but then had nightmares about some kind of future VFS paths
that would somehow bypass do_open() and result in execve() working on
noexec mounts, there by allowing for the introduction of a really nasty
security bug.

The S_ISREG test is demonstrably too late (as referenced in the series),
and given the LSM hooks, I think the noexec check is too late as well.
(This is especially true for the coming O_MAYEXEC series, which will
absolutely need those tests earlier as well[1] -- the permission checking
is then in the correct place: during open, not exec.) I think the only
question is about leaving the redundant checks in fs/exec.c, which I
think are a cheap way to retain a sense of robustness.

-Kees

[1] https://lore.kernel.org/lkml/202005142343.D580850@keescook/

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/4] Relocate execve() sanity checks
  2020-05-19 16:26   ` Kees Cook
@ 2020-05-19 17:41     ` Eric W. Biederman
  2020-05-19 17:56       ` Kees Cook
  0 siblings, 1 reply; 28+ messages in thread
From: Eric W. Biederman @ 2020-05-19 17:41 UTC (permalink / raw)
  To: Kees Cook
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

Kees Cook <keescook@chromium.org> writes:

> On Tue, May 19, 2020 at 10:06:32AM -0500, Eric W. Biederman wrote:
>> Kees Cook <keescook@chromium.org> writes:
>> 
>> > Hi,
>> >
>> > While looking at the code paths for the proposed O_MAYEXEC flag, I saw
>> > some things that looked like they should be fixed up.
>> >
>> >   exec: Change uselib(2) IS_SREG() failure to EACCES
>> > 	This just regularizes the return code on uselib(2).
>> >
>> >   exec: Relocate S_ISREG() check
>> > 	This moves the S_ISREG() check even earlier than it was already.
>> >
>> >   exec: Relocate path_noexec() check
>> > 	This adds the path_noexec() check to the same place as the
>> > 	S_ISREG() check.
>> >
>> >   fs: Include FMODE_EXEC when converting flags to f_mode
>> > 	This seemed like an oversight, but I suspect there is some
>> > 	reason I couldn't find for why FMODE_EXEC doesn't get set in
>> > 	f_mode and just stays in f_flags.
>> 
>> So I took a look at this series.
>> 
>> I think the belt and suspenders approach of adding code in open and then
>> keeping it in exec and uselib is probably wrong.  My sense of the
>> situation is a belt and suspenders approach is more likely to be
>> confusing and result in people making mistakes when maintaining the code
>> than to actually be helpful.
>
> This is why I added the comments in fs/exec.c's redundant checks. When I
> was originally testing this series, I had entirely removed the checks in
> fs/exec.c, but then had nightmares about some kind of future VFS paths
> that would somehow bypass do_open() and result in execve() working on
> noexec mounts, there by allowing for the introduction of a really nasty
> security bug.
>
> The S_ISREG test is demonstrably too late (as referenced in the series),

Yes.  The open of a pipe very much happens when it should not.

The deadlock looks like part of the cred_guard_mutex mess.  I think I
introduced an alternate solution for the specific code paths in the
backtrace when I introduced exec_update_mutex.

The fact that cred_guard_mutex is held over open, while at the same time
cred_guard_mutex is grabbed on open files is very questionable.  Until
my most recent patchset feeding exec /proc/self/maps would also deadlock
this way.

> and given the LSM hooks, I think the noexec check is too late as well.
> (This is especially true for the coming O_MAYEXEC series, which will
> absolutely need those tests earlier as well[1] -- the permission checking
> is then in the correct place: during open, not exec.) I think the only
> question is about leaving the redundant checks in fs/exec.c, which I
> think are a cheap way to retain a sense of robustness.

The trouble is when someone passes through changes one of the permission
checks for whatever reason (misses that they are duplicated in another
location) and things then fail in some very unexpected way.

Eric

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/4] Relocate execve() sanity checks
  2020-05-19 17:41     ` Eric W. Biederman
@ 2020-05-19 17:56       ` Kees Cook
  2020-05-19 18:42         ` Eric W. Biederman
  0 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2020-05-19 17:56 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

On Tue, May 19, 2020 at 12:41:27PM -0500, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> > and given the LSM hooks, I think the noexec check is too late as well.
> > (This is especially true for the coming O_MAYEXEC series, which will
> > absolutely need those tests earlier as well[1] -- the permission checking
> > is then in the correct place: during open, not exec.) I think the only
> > question is about leaving the redundant checks in fs/exec.c, which I
> > think are a cheap way to retain a sense of robustness.
> 
> The trouble is when someone passes through changes one of the permission
> checks for whatever reason (misses that they are duplicated in another
> location) and things then fail in some very unexpected way.

Do you think this series should drop the "late" checks in fs/exec.c?
Honestly, the largest motivation for me to move the checks earlier as
I've done is so that other things besides execve() can use FMODE_EXEC
during open() and receive the same sanity-checking as execve() (i.e the
O_MAYEXEC series -- the details are still under discussion but this
cleanup will be needed regardless).

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/4] Relocate execve() sanity checks
  2020-05-19 17:56       ` Kees Cook
@ 2020-05-19 18:42         ` Eric W. Biederman
  2020-05-19 21:17           ` Kees Cook
  0 siblings, 1 reply; 28+ messages in thread
From: Eric W. Biederman @ 2020-05-19 18:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

Kees Cook <keescook@chromium.org> writes:

> On Tue, May 19, 2020 at 12:41:27PM -0500, Eric W. Biederman wrote:
>> Kees Cook <keescook@chromium.org> writes:
>> > and given the LSM hooks, I think the noexec check is too late as well.
>> > (This is especially true for the coming O_MAYEXEC series, which will
>> > absolutely need those tests earlier as well[1] -- the permission checking
>> > is then in the correct place: during open, not exec.) I think the only
>> > question is about leaving the redundant checks in fs/exec.c, which I
>> > think are a cheap way to retain a sense of robustness.
>> 
>> The trouble is when someone passes through changes one of the permission
>> checks for whatever reason (misses that they are duplicated in another
>> location) and things then fail in some very unexpected way.
>
> Do you think this series should drop the "late" checks in fs/exec.c?
> Honestly, the largest motivation for me to move the checks earlier as
> I've done is so that other things besides execve() can use FMODE_EXEC
> during open() and receive the same sanity-checking as execve() (i.e the
> O_MAYEXEC series -- the details are still under discussion but this
> cleanup will be needed regardless).

I think this series should drop the "late" checks in fs/exec.c  It feels
less error prone, and it feels like that would transform this into
something Linus would be eager to merge because series becomes a cleanup
that reduces line count.

I haven't been inside of open recently enough to remember if the
location you are putting the check fundamentally makes sense.  But the
O_MAYEXEC bits make a pretty strong case that something of the sort
needs to happen.

I took a quick look but I can not see clearly where path_noexec
and the regular file tests should go.

I do see that you have code duplication with faccessat which suggests
that you haven't put the checks in the right place.

I am wondering if we need something distinct to request the type of the
file being opened versus execute permissions.

All I know is being careful and putting the tests in a good logical
place makes the code more maintainable, whereas not being careful
results in all kinds of sharp corners that might be exploitable.
So I think it is worth digging in and figuring out where those checks
should live.  Especially so that code like faccessat does not need
to duplicate them.

Eric


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/4] Relocate execve() sanity checks
  2020-05-19 18:42         ` Eric W. Biederman
@ 2020-05-19 21:17           ` Kees Cook
  2020-05-19 22:58             ` John Johansen
  0 siblings, 1 reply; 28+ messages in thread
From: Kees Cook @ 2020-05-19 21:17 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel, John Johansen

On Tue, May 19, 2020 at 01:42:28PM -0500, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > On Tue, May 19, 2020 at 12:41:27PM -0500, Eric W. Biederman wrote:
> >> Kees Cook <keescook@chromium.org> writes:
> >> > and given the LSM hooks, I think the noexec check is too late as well.
> >> > (This is especially true for the coming O_MAYEXEC series, which will
> >> > absolutely need those tests earlier as well[1] -- the permission checking
> >> > is then in the correct place: during open, not exec.) I think the only
> >> > question is about leaving the redundant checks in fs/exec.c, which I
> >> > think are a cheap way to retain a sense of robustness.
> >> 
> >> The trouble is when someone passes through changes one of the permission
> >> checks for whatever reason (misses that they are duplicated in another
> >> location) and things then fail in some very unexpected way.
> >
> > Do you think this series should drop the "late" checks in fs/exec.c?
> > Honestly, the largest motivation for me to move the checks earlier as
> > I've done is so that other things besides execve() can use FMODE_EXEC
> > during open() and receive the same sanity-checking as execve() (i.e the
> > O_MAYEXEC series -- the details are still under discussion but this
> > cleanup will be needed regardless).
> 
> I think this series should drop the "late" checks in fs/exec.c  It feels
> less error prone, and it feels like that would transform this into
> something Linus would be eager to merge because series becomes a cleanup
> that reduces line count.

Yeah, that was my initial sense too. I just started to get nervous about
removing the long-standing exec sanity checks. ;)

> I haven't been inside of open recently enough to remember if the
> location you are putting the check fundamentally makes sense.  But the
> O_MAYEXEC bits make a pretty strong case that something of the sort
> needs to happen.

Right. I *think* it's correct place for now, based on my understanding
of the call graph (which is why I included it in the commit logs).

> I took a quick look but I can not see clearly where path_noexec
> and the regular file tests should go.
> 
> I do see that you have code duplication with faccessat which suggests
> that you haven't put the checks in the right place.

Yeah, I have notes on the similar call sites (which I concluded, perhaps
wrongly) to ignore:

do_faccessat()
    user_path_at(dfd, filename, lookup_flags, &path);
    if (acc_mode & MAY_EXEC .... path_noexec()
    inode_permission(inode, mode | MAY_ACCESS);

This appears to be strictly advisory, and the path_noexec() test is
there to, perhaps, avoid surprises when doing access() then fexecve()?
I would note, however, that that path-based LSMs appear to have no hook
in this call graph at all. I was expecting a call like:

	security_file_permission(..., mode | MAY_ACCESS)

but I couldn't find one (or anything like it), so only
inode_permission() is being tested (which means also the existing
execve() late tests are missed, and the newly added S_ISREG() test from
do_dentry_open() is missed).


prctl_set_mm_exe_file()
    err = -EACCESS;
    if (!S_ISREG(inode->i_mode) || path_noexec(&exe.file->f_path))
        goto exit;
    err = inode_permission(inode, MAY_EXEC);

This is similar (no path-based LSM hooks present, only inode_permission()
used for permission checking), but it is at least gated by CAP_SYS_ADMIN.


And this bring me to a related question from my review: does
dentry_open() intentionally bypass security_inode_permission()? I.e. it
calls vfs_open() not do_open():

openat2(dfd, char * filename, open_how)
    build_open_flags(open_how, open_flags)
    do_filp_open(dfd, filename, open_flags)
        path_openat(nameidata, open_flags, flags)
            file = alloc_empty_file(open_flags, current_cred());
            do_open(nameidata, file, open_flags)
                may_open(path, acc_mode, open_flag)
                    inode_permission(inode, MAY_OPEN | acc_mode)
                        security_inode_permission(inode, acc_mode)
                vfs_open(path, file)
                    do_dentry_open(file, path->dentry->d_inode, open)
                        if (unlikely(f->f_flags & FMODE_EXEC && !S_ISREG(inode->i_mode))) ...
                        security_file_open(f)
                                /* path-based LSMs check for open here
				 * and use FMODE_* flags to determine how a file
                                 * is being opened. */
                        open()

vs

dentry_open(path, flags, cred)
        f = alloc_empty_file(flags, cred);
        vfs_open(path, f);

I would expect dentry_open() to mostly duplicate a bunch of
path_openat(), but it lacks the may_open() call, etc.

I really got the feeling that there was some new conceptual split needed
inside do_open() where the nameidata details have been finished, after
we've gained the "file" information, but before we've lost the "path"
information. For example, may_open(path, ...) has no sense of "file",
though it does do the inode_permission() call.

Note also that may_open() is used in do_tmpfile() too, and has a comment
implying it needs to be checking only a subset of the path details. So
I'm not sure how to split things up.

So, that's why I put the new checks just before the may_open() call in
do_open(): it's the most central, positions itself correctly for dealing
with O_MAYEXEC, and doesn't appear to make any existing paths worse.

> I am wondering if we need something distinct to request the type of the
> file being opened versus execute permissions.

Well, this is why I wanted to centralize it -- the knowledge of how a
file is going to be used needs to be tested both by the core VFS
(S_ISREG, path_noexec) and the LSMs. Things were inconsistent before.

> All I know is being careful and putting the tests in a good logical
> place makes the code more maintainable, whereas not being careful
> results in all kinds of sharp corners that might be exploitable.
> So I think it is worth digging in and figuring out where those checks
> should live.  Especially so that code like faccessat does not need
> to duplicate them.

I think this is the right place with respect to execve(), though I think
there are other cases that could use to be improved (or at least made
more consistent).

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/4] Relocate execve() sanity checks
  2020-05-19 21:17           ` Kees Cook
@ 2020-05-19 22:58             ` John Johansen
  0 siblings, 0 replies; 28+ messages in thread
From: John Johansen @ 2020-05-19 22:58 UTC (permalink / raw)
  To: Kees Cook, Eric W. Biederman
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel

On 5/19/20 2:17 PM, Kees Cook wrote:
> On Tue, May 19, 2020 at 01:42:28PM -0500, Eric W. Biederman wrote:
>> Kees Cook <keescook@chromium.org> writes:
>>
>>> On Tue, May 19, 2020 at 12:41:27PM -0500, Eric W. Biederman wrote:
>>>> Kees Cook <keescook@chromium.org> writes:
>>>>> and given the LSM hooks, I think the noexec check is too late as well.
>>>>> (This is especially true for the coming O_MAYEXEC series, which will
>>>>> absolutely need those tests earlier as well[1] -- the permission checking
>>>>> is then in the correct place: during open, not exec.) I think the only
>>>>> question is about leaving the redundant checks in fs/exec.c, which I
>>>>> think are a cheap way to retain a sense of robustness.
>>>>
>>>> The trouble is when someone passes through changes one of the permission
>>>> checks for whatever reason (misses that they are duplicated in another
>>>> location) and things then fail in some very unexpected way.
>>>
>>> Do you think this series should drop the "late" checks in fs/exec.c?
>>> Honestly, the largest motivation for me to move the checks earlier as
>>> I've done is so that other things besides execve() can use FMODE_EXEC
>>> during open() and receive the same sanity-checking as execve() (i.e the
>>> O_MAYEXEC series -- the details are still under discussion but this
>>> cleanup will be needed regardless).
>>
>> I think this series should drop the "late" checks in fs/exec.c  It feels
>> less error prone, and it feels like that would transform this into
>> something Linus would be eager to merge because series becomes a cleanup
>> that reduces line count.
> 
> Yeah, that was my initial sense too. I just started to get nervous about
> removing the long-standing exec sanity checks. ;)
> 
>> I haven't been inside of open recently enough to remember if the
>> location you are putting the check fundamentally makes sense.  But the
>> O_MAYEXEC bits make a pretty strong case that something of the sort
>> needs to happen.
> 
> Right. I *think* it's correct place for now, based on my understanding
> of the call graph (which is why I included it in the commit logs).
> 
>> I took a quick look but I can not see clearly where path_noexec
>> and the regular file tests should go.
>>
>> I do see that you have code duplication with faccessat which suggests
>> that you haven't put the checks in the right place.
> 
> Yeah, I have notes on the similar call sites (which I concluded, perhaps
> wrongly) to ignore:
> 
> do_faccessat()
>     user_path_at(dfd, filename, lookup_flags, &path);
>     if (acc_mode & MAY_EXEC .... path_noexec()
>     inode_permission(inode, mode | MAY_ACCESS);
> 
> This appears to be strictly advisory, and the path_noexec() test is
> there to, perhaps, avoid surprises when doing access() then fexecve()?
> I would note, however, that that path-based LSMs appear to have no hook
> in this call graph at all. I was expecting a call like:
> 
> 	security_file_permission(..., mode | MAY_ACCESS)
> 
> but I couldn't find one (or anything like it), so only
> inode_permission() is being tested (which means also the existing
> execve() late tests are missed, and the newly added S_ISREG() test from
> do_dentry_open() is missed).
> 
> 

sadly correct, its something that we intend to fix but haven't gotten
to it

> prctl_set_mm_exe_file()
>     err = -EACCESS;
>     if (!S_ISREG(inode->i_mode) || path_noexec(&exe.file->f_path))
>         goto exit;
>     err = inode_permission(inode, MAY_EXEC);
> 
> This is similar (no path-based LSM hooks present, only inode_permission()
> used for permission checking), but it is at least gated by CAP_SYS_ADMIN.
> 

dito here

> 
> And this bring me to a related question from my review: does
> dentry_open() intentionally bypass security_inode_permission()? I.e. it
> calls vfs_open() not do_open():
> 
> openat2(dfd, char * filename, open_how)
>     build_open_flags(open_how, open_flags)
>     do_filp_open(dfd, filename, open_flags)
>         path_openat(nameidata, open_flags, flags)
>             file = alloc_empty_file(open_flags, current_cred());
>             do_open(nameidata, file, open_flags)
>                 may_open(path, acc_mode, open_flag)
>                     inode_permission(inode, MAY_OPEN | acc_mode)
>                         security_inode_permission(inode, acc_mode)
>                 vfs_open(path, file)
>                     do_dentry_open(file, path->dentry->d_inode, open)
>                         if (unlikely(f->f_flags & FMODE_EXEC && !S_ISREG(inode->i_mode))) ...
>                         security_file_open(f)
>                                 /* path-based LSMs check for open here
> 				 * and use FMODE_* flags to determine how a file
>                                  * is being opened. */
>                         open()
> 
> vs
> 
> dentry_open(path, flags, cred)
>         f = alloc_empty_file(flags, cred);
>         vfs_open(path, f);
> 
> I would expect dentry_open() to mostly duplicate a bunch of
> path_openat(), but it lacks the may_open() call, etc.
> 
> I really got the feeling that there was some new conceptual split needed
> inside do_open() where the nameidata details have been finished, after
> we've gained the "file" information, but before we've lost the "path"
> information. For example, may_open(path, ...) has no sense of "file",
> though it does do the inode_permission() call.
> 
yes that would be nice, sadly the path hooks are really a bolted on after thought

> Note also that may_open() is used in do_tmpfile() too, and has a comment
> implying it needs to be checking only a subset of the path details. So
> I'm not sure how to split things up.
> 
/me neither anymore

> So, that's why I put the new checks just before the may_open() call in
> do_open(): it's the most central, positions itself correctly for dealing
> with O_MAYEXEC, and doesn't appear to make any existing paths worse.
> 
>> I am wondering if we need something distinct to request the type of the
>> file being opened versus execute permissions.
> 
> Well, this is why I wanted to centralize it -- the knowledge of how a
> file is going to be used needs to be tested both by the core VFS
> (S_ISREG, path_noexec) and the LSMs. Things were inconsistent before.
> 

yep

>> All I know is being careful and putting the tests in a good logical
>> place makes the code more maintainable, whereas not being careful
>> results in all kinds of sharp corners that might be exploitable.
>> So I think it is worth digging in and figuring out where those checks
>> should live.  Especially so that code like faccessat does not need
>> to duplicate them.
> 
> I think this is the right place with respect to execve(), though I think
> there are other cases that could use to be improved (or at least made
> more consistent).
> 
> -Kees
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [exec] 166d03c9ec: ltp.execveat02.fail
       [not found]   ` <20200525091420.GI12456@shao2-debian>
@ 2020-06-04 22:45     ` Kees Cook
  2020-06-05  2:57     ` Kees Cook
  1 sibling, 0 replies; 28+ messages in thread
From: Kees Cook @ 2020-06-04 22:45 UTC (permalink / raw)
  To: kernel test robot
  Cc: Al Viro, Andrew Morton, Tetsuo Handa, Eric Biggers,
	Dmitry Vyukov, linux-fsdevel, linux-security-module, linux-api,
	linux-kernel, lkp, ltp

On Mon, May 25, 2020 at 05:14:20PM +0800, kernel test robot wrote:
> Greeting,

(Whoops, I missed this in my inbox.)

> <<<test_start>>>
> tag=execveat02 stime=1590373229
> cmdline="execveat02"
> contacts=""
> analysis=exit
> <<<test_output>>>
> tst_test.c:1246: INFO: Timeout per run is 0h 05m 00s
> execveat02.c:64: PASS: execveat() fails as expected: EBADF (9)
> execveat02.c:64: PASS: execveat() fails as expected: EINVAL (22)
> execveat02.c:61: FAIL: execveat() fails unexpectedly, expected: ELOOP: EACCES (13)
> execveat02.c:64: PASS: execveat() fails as expected: ENOTDIR (20)

I will go check on this. Looking at the expected result (ELOOP) I think
this just means the test needs adjustment because it's trying to
double-check for a pathological case, but it seems their test setup
trips the (now earlier) IS_SREG() test. But I'll double-check and report
back!

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [exec] 166d03c9ec: ltp.execveat02.fail
       [not found]   ` <20200525091420.GI12456@shao2-debian>
  2020-06-04 22:45     ` [exec] 166d03c9ec: ltp.execveat02.fail Kees Cook
@ 2020-06-05  2:57     ` Kees Cook
  1 sibling, 0 replies; 28+ messages in thread
From: Kees Cook @ 2020-06-05  2:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Aleksa Sarai, Tetsuo Handa, kernel test robot,
	Eric Biggers, Dmitry Vyukov, linux-fsdevel,
	linux-security-module, linux-api, linux-kernel, lkp, ltp

On Mon, May 25, 2020 at 05:14:20PM +0800, kernel test robot wrote:
> execveat02.c:64: PASS: execveat() fails as expected: EBADF (9)
> execveat02.c:64: PASS: execveat() fails as expected: EINVAL (22)
> execveat02.c:61: FAIL: execveat() fails unexpectedly, expected: ELOOP: EACCES (13)
> execveat02.c:64: PASS: execveat() fails as expected: ENOTDIR (20)

tl;dr: I think this test is correct, and I think I see a way to improve
the offending patch series to do the right thing.


Okay, the LTP is checking for ELOOP on trying to exec a symlink:

...
 *    3) execveat() fails and returns ELOOP if the file identified by dirfd and
 *       pathname is a symbolic link and flag includes AT_SYMLINK_NOFOLLOW.
...
#define TESTDIR "testdir"
#define TEST_APP "execveat_errno"
...
#define TEST_SYMLINK "execveat_symlink"
...
#define TEST_ERL_SYMLINK TESTDIR"/"TEST_SYMLINK
...
        sprintf(app_sym_path, "%s/%s", cur_dir_path, TEST_ERL_SYMLINK);
...
        SAFE_SYMLINK(TEST_REL_APP, TEST_ERL_SYMLINK);

        fd = SAFE_OPEN(TEST_REL_APP, O_PATH);
...
static struct tcase {
        int *fd;
        char *pathname;
        int flag;
        int exp_err;
} tcases[] = {
...
        {&fd, app_sym_path, AT_SYMLINK_NOFOLLOW, ELOOP},
...
};
...
                TEST(execveat(*tc->fd, tc->pathname, argv, environ, tc->flag));

This is testing the exec _of_ a symlink under AT_SYMLINK_NOFOLLOW.

The execve(2) manpage says:

       ELOOP  Too many symbolic links were encountered in resolving
              pathname or  the  name  of  a script or ELF interpreter.

       ELOOP  The maximum recursion limit was reached during recursive
	      script interpretation (see "Interpreter scripts", above).
	      Before Linux 3.8, the error produced for this case was ENOEXEC.

Which actually doesn't mention this case. open(2) says:

       ELOOP  Too many symbolic links were encountered in resolving pathname.

       ELOOP  pathname was a symbolic link, and flags specified O_NOFOLLOW
	      but not O_PATH.

(but O_NOFOLLOW is limited to file creation. linkat(2) lists the AT_*
flags, and applied to openat, this seems to track: attempting to
execat where the final element is a symlink should fail with ELOOP,
though the manpage does warn that this makes it indistinguishable from
symlink loops -- the first item listed in the execve manpage for
ELOOP...)

Regardless, this does seem to be the "correct" result, as opening for
exec or opening just normally should really get the same error code.

The call path for execve looks like this:

    do_open_execat()
        struct open_flags open_exec_flags = {
            .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
	    .acc_mode = MAY_READ | MAY_EXEC,
	    ...
        do_filp_open(dfd, filename, open_flags)
            path_openat(nameidata, open_flags, flags)
                file = alloc_empty_file(open_flags, current_cred());
		open_last_lookups(nd, file, open_flags)
		    step_into(nd, ...)
		        /* stop on symlink without LOOKUP_FOLLOW */
                do_open(nameidata, file, open_flags)
                    /* new location of FMODE_EXEC vs S_ISREG() test */
                    may_open(path, acc_mode, open_flag)
			/* test for S_IFLNK */
                        inode_permission(inode, MAY_OPEN | acc_mode)
                            security_inode_permission(inode, acc_mode)
                    vfs_open(path, file)
                        do_dentry_open(file, path->dentry->d_inode, open)
                            /* old location of FMODE_EXEC vs S_ISREG() test */
                            security_file_open(f)
                            open()

The step_into() is what kicks back out without LOOKUP_FOLLOW, so we're
left holding a symlink (S_IFMT inode). In do_open(), there is a set of
checks via may_open() which checks for S_IFMT and rejects it:

        switch (inode->i_mode & S_IFMT) {
        case S_IFLNK:
                return -ELOOP;

So that's the case LTP was testing for.

The patch in -next ("exec: relocate S_ISREG() check")[1], moves the regular
file requirement up before may_open(), for all the reasons mentioned in
the commit log (and the next patch[2]).

When I was originally trying to determine the best place for where the
checks should live, may_open() really did seem like the right place, but I
recognized that it was examining path characteristics (which was good) but
it didn't have the file, and that seemed to be an intentional separation.

What is needed in may_open() would be the "how was this file opened?"
piece of information: file->f_mode & FMODE_EXEC. However, in looking at
this again now, I wonder if it might be possible to use the MAY_EXEC
from the acc_mode? It seems the old check (in do_dentry_open() had no
access to the acc_mode, so it was forced to use the FMODE_EXEC signal
instead.

(I actually think this remains a bit of a design problem: path-based LSMs,
which see the opened file in security_file_open(), also don't have access
to acc_mode and must depend on the FMODE_* flags...)

I will respin the series to use acc_mode and move the tests into
may_open().

-Kees

[1] https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=vfs/fmode_exec/v1&id=36aaecef88b472ad5e0a408ffdb5b2e46f1478f6
[2] https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=vfs/fmode_exec/v1&id=a2f303b9f24ae2ecdc57f3db93e49b2b869893aa

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-06-05  2:57 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-18  5:54 [PATCH 0/4] Relocate execve() sanity checks Kees Cook
2020-05-18  5:54 ` [PATCH 1/4] exec: Change uselib(2) IS_SREG() failure to EACCES Kees Cook
2020-05-18 13:02   ` Christian Brauner
2020-05-18 14:43     ` Jann Horn
2020-05-18 14:46       ` Christian Brauner
2020-05-18 23:57         ` Eric W. Biederman
2020-05-19  8:11           ` Christian Brauner
2020-05-19  8:37           ` Andreas Schwab
2020-05-19 11:56             ` Eric W. Biederman
2020-05-19 12:12               ` Andreas Schwab
2020-05-19 12:28                 ` Eric W. Biederman
2020-05-19 13:29                   ` Christian Brauner
2020-05-19 14:49                     ` Eric W. Biederman
2020-05-19 13:13               ` Christian Brauner
2020-05-19 14:32                 ` Geert Uytterhoeven
2020-05-19 14:47                   ` Christian Brauner
2020-05-18  5:54 ` [PATCH 2/4] exec: Relocate S_ISREG() check Kees Cook
     [not found]   ` <20200525091420.GI12456@shao2-debian>
2020-06-04 22:45     ` [exec] 166d03c9ec: ltp.execveat02.fail Kees Cook
2020-06-05  2:57     ` Kees Cook
2020-05-18  5:54 ` [PATCH 3/4] exec: Relocate path_noexec() check Kees Cook
2020-05-18  5:54 ` [PATCH 4/4] fs: Include FMODE_EXEC when converting flags to f_mode Kees Cook
2020-05-19 15:06 ` [PATCH 0/4] Relocate execve() sanity checks Eric W. Biederman
2020-05-19 16:26   ` Kees Cook
2020-05-19 17:41     ` Eric W. Biederman
2020-05-19 17:56       ` Kees Cook
2020-05-19 18:42         ` Eric W. Biederman
2020-05-19 21:17           ` Kees Cook
2020-05-19 22:58             ` John Johansen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).