[PATCH v3 0/2] Restrict dangerous open in sticky directories

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v3 0/2] Restrict dangerous open in sticky directories
@ 2017-11-22  8:01 Salvatore Mesoraca
  2017-11-22  8:01 ` [PATCH v3 1/2] Protected FIFOs and regular files Salvatore Mesoraca
  2017-11-22  8:01 ` [PATCH v3 2/2] Protected O_CREAT open in sticky directories Salvatore Mesoraca
  0 siblings, 2 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-22  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kernel Hardening, linux-fsdevel, Salvatore Mesoraca,
	Alexander Viro, Jann Horn, Kees Cook, Solar Designer,
	Eric W. Biederman

This patch-set introduces two separate features aimed at restricting
dangerous open in world or group writable sticky directories.
The purpose is to prevent exploitable bugs in user-space programs
that don't access sticky directories in the proper way.
The first patch prevents the O_CREAT open of FIFOs and regular files
in world or group writable sticky directories, if they already exists
and are owned by someone else.
The second patch prevents O_CREAT open in world or group writable
sticky when the O_EXCL flag is not set, even if the file doesn't
exist yet.
More details can be found in the respective commit messages.

Changes in v3:
	- Fixed format string for uid_t that is unsigned
	  (suggested by Jann Horn).
	- Stop checking if file's and parent dir's owners match in
	  may_create_no_excl. This will allow to discover potential
	  vulnerabilities more easily.

Salvatore Mesoraca (2):
  Protected FIFOs and regular files
  Protected O_CREAT open in sticky directories

 Documentation/sysctl/fs.txt |  66 +++++++++++++++++++++++++
 fs/namei.c                  | 117 ++++++++++++++++++++++++++++++++++++++++++--
 include/linux/fs.h          |   3 ++
 kernel/sysctl.c             |  27 ++++++++++
 4 files changed, 210 insertions(+), 3 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v3 1/2] Protected FIFOs and regular files
  2017-11-22  8:01 [PATCH v3 0/2] Restrict dangerous open in sticky directories Salvatore Mesoraca
@ 2017-11-22  8:01 ` Salvatore Mesoraca
  2017-11-23 22:43   ` [kernel-hardening] " Tobin C. Harding
  2017-11-22  8:01 ` [PATCH v3 2/2] Protected O_CREAT open in sticky directories Salvatore Mesoraca
  1 sibling, 1 reply; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-22  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kernel Hardening, linux-fsdevel, Salvatore Mesoraca,
	Alexander Viro, Jann Horn, Kees Cook, Solar Designer,
	Eric W. Biederman

Disallows open of FIFOs or regular files not owned by the user in world
writable sticky directories, unless the owner is the same as that of
the directory or the file is opened without the O_CREAT flag.
The purpose is to make data spoofing attacks harder.
This protection can be turned on and off separately for FIFOs and regular
files via sysctl, just like the symlinks/hardlinks protection.
This patch is based on Openwall's "HARDEN_FIFO" feature by Solar
Designer.

This is a brief list of old vulnerabilities that could have been prevented
by this feature, some of them even allow for privilege escalation:
CVE-2000-1134
CVE-2007-3852
CVE-2008-0525
CVE-2009-0416
CVE-2011-4834
CVE-2015-1838
CVE-2015-7442
CVE-2016-7489

This list is not meant to be complete. It's difficult to track down
all vulnerabilities of this kind because they were often reported
without any mention of this particular attack vector.
In fact, before symlinks restrictions, fifos/regular files were not the
favorite vehicle to exploit them.

Suggested-by: Solar Designer <solar@openwall.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
---
 Documentation/sysctl/fs.txt | 36 ++++++++++++++++++++++++++
 fs/namei.c                  | 61 ++++++++++++++++++++++++++++++++++++++++++---
 include/linux/fs.h          |  2 ++
 kernel/sysctl.c             | 18 +++++++++++++
 4 files changed, 114 insertions(+), 3 deletions(-)

diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
index 6c00c1e..f3cf2cd 100644
--- a/Documentation/sysctl/fs.txt
+++ b/Documentation/sysctl/fs.txt
@@ -34,7 +34,9 @@ Currently, these files are in /proc/sys/fs:
 - overflowgid
 - pipe-user-pages-hard
 - pipe-user-pages-soft
+- protected_fifos
 - protected_hardlinks
+- protected_regular
 - protected_symlinks
 - suid_dumpable
 - super-max
@@ -182,6 +184,24 @@ applied.
 
 ==============================================================
 
+protected_fifos:
+
+The intent of this protection is to avoid unintentional writes to
+an attacker-controlled FIFO, where a program expected to create a regular
+file.
+
+When set to "0", FIFOs writing is unrestricted.
+
+When set to "1" don't allow O_CREAT open on FIFOs that we don't own
+in world writable sticky directories, unless they are owned by the
+owner of the directory.
+
+When set to "2" it also applies to group writable sticky directories.
+
+This protection is based on the restrictions in Openwall.
+
+==============================================================
+
 protected_hardlinks:
 
 A long-standing class of security issues is the hardlink-based
@@ -202,6 +222,22 @@ This protection is based on the restrictions in Openwall and grsecurity.
 
 ==============================================================
 
+protected_regular:
+
+This protection is similar to protected_fifos, but it
+avoids writes to an attacker-controlled regular file, where a program
+expected to create one.
+
+When set to "0", regular files writing is unrestricted.
+
+When set to "1" don't allow O_CREAT open on regular files that we
+don't own in world writable sticky directories, unless they are
+owned by the owner of the directory.
+
+When set to "2" it also applies to group writable sticky directories.
+
+==============================================================
+
 protected_symlinks:
 
 A long-standing class of security issues is the symlink-based
diff --git a/fs/namei.c b/fs/namei.c
index f0c7a7b..92992ad 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -902,6 +902,8 @@ static inline void put_link(struct nameidata *nd)
 
 int sysctl_protected_symlinks __read_mostly = 0;
 int sysctl_protected_hardlinks __read_mostly = 0;
+int sysctl_protected_fifos __read_mostly;
+int sysctl_protected_regular __read_mostly;
 
 /**
  * may_follow_link - Check symlink following for unsafe situations
@@ -1015,6 +1017,54 @@ static int may_linkat(struct path *link)
 	return -EPERM;
 }
 
+/**
+ * may_create_in_sticky - Check whether an O_CREAT open in a sticky directory
+ *			  should be allowed or not, when the file already
+ *			  existed.
+ * @dir: the stick parent directory
+ * @name: the file name
+ * @inode: the inode of the file to open
+ *
+ * Block an O_CREAT open of a FIFO (or a regular file) when:
+ *   - sysctl_protected_fifos (or sysctl_protected_regular) is enabled
+ *   - the file already exists
+ *   - we are in a sticky directory
+ *   - we don't own the file
+ *   - the owner of the directory doesn't own the file
+ *   - the directory is world writable
+ * If the sysctl_protected_fifos (or sysctl_protected_regular) is set to 2
+ * the directory doesn't have to be world writable: being group writable will
+ * be enough.
+ *
+ * Returns 0 if the open is allowed, -ve on error.
+ */
+static int may_create_in_sticky(struct dentry * const dir,
+				const unsigned char * const name,
+				struct inode * const inode)
+{
+	if ((!sysctl_protected_fifos && S_ISFIFO(inode->i_mode)) ||
+	    (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
+	    likely(!(dir->d_inode->i_mode & S_ISVTX)) ||
+	    uid_eq(inode->i_uid, dir->d_inode->i_uid) ||
+	    uid_eq(current_fsuid(), inode->i_uid))
+		return 0;
+
+	if (likely(dir->d_inode->i_mode & 0002) ||
+	    (dir->d_inode->i_mode & 0020 &&
+	     ((sysctl_protected_fifos >= 2 && S_ISFIFO(inode->i_mode)) ||
+	      (sysctl_protected_regular >= 2 && S_ISREG(inode->i_mode))))) {
+		pr_notice_ratelimited("denied writing in '%s' of %u.%u in a sticky directory by UID %u, EUID %u, process %s:%d.\n",
+				      name,
+				      from_kuid(&init_user_ns, inode->i_uid),
+				      from_kgid(&init_user_ns, inode->i_gid),
+				      from_kuid(&init_user_ns, current_uid()),
+				      from_kuid(&init_user_ns, current_euid()),
+				      current->comm, current->pid);
+		return -EACCES;
+	}
+	return 0;
+}
+
 static __always_inline
 const char *get_link(struct nameidata *nd)
 {
@@ -3365,9 +3415,14 @@ static int do_last(struct nameidata *nd,
 	if (error)
 		return error;
 	audit_inode(nd->name, nd->path.dentry, 0);
-	error = -EISDIR;
-	if ((open_flag & O_CREAT) && d_is_dir(nd->path.dentry))
-		goto out;
+	if (open_flag & O_CREAT) {
+		error = -EISDIR;
+		if (d_is_dir(nd->path.dentry))
+			goto out;
+		error = may_create_in_sticky(dir, nd->last.name, inode);
+		if (unlikely(error))
+			goto out;
+	}
 	error = -ENOTDIR;
 	if ((nd->flags & LOOKUP_DIRECTORY) && !d_can_lookup(nd->path.dentry))
 		goto out;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2995a27..6fb45a52 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -72,6 +72,8 @@
 extern int leases_enable, lease_break_time;
 extern int sysctl_protected_symlinks;
 extern int sysctl_protected_hardlinks;
+extern int sysctl_protected_fifos;
+extern int sysctl_protected_regular;
 
 typedef __kernel_rwf_t rwf_t;
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 557d467..590fbc9 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1799,6 +1799,24 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
 		.extra2		= &one,
 	},
 	{
+		.procname	= "protected_fifos",
+		.data		= &sysctl_protected_fifos,
+		.maxlen		= sizeof(int),
+		.mode		= 0600,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &two,
+	},
+	{
+		.procname	= "protected_regular",
+		.data		= &sysctl_protected_regular,
+		.maxlen		= sizeof(int),
+		.mode		= 0600,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &two,
+	},
+	{
 		.procname	= "suid_dumpable",
 		.data		= &suid_dumpable,
 		.maxlen		= sizeof(int),
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-22  8:01 [PATCH v3 0/2] Restrict dangerous open in sticky directories Salvatore Mesoraca
  2017-11-22  8:01 ` [PATCH v3 1/2] Protected FIFOs and regular files Salvatore Mesoraca
@ 2017-11-22  8:01 ` Salvatore Mesoraca
  2017-11-22 13:22   ` Matthew Wilcox
                     ` (3 more replies)
  1 sibling, 4 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-22  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Kernel Hardening, linux-fsdevel, Salvatore Mesoraca,
	Alexander Viro, Jann Horn, Kees Cook, Solar Designer,
	Eric W. Biederman

Disallows O_CREAT open missing the O_EXCL flag, in world or
group writable directories, even if the file doesn't exist yet.
With few exceptions (e.g. shared lock files based on flock())
if a program tries to open a file, in a sticky directory,
with the O_CREAT flag and without the O_EXCL, it probably has a bug.
This feature allows to detect and potentially block programs that
act this way, it can be used to find vulnerabilities (like those
prevented by patch #1) and to do policy enforcement.

Suggested-by: Solar Designer <solar@openwall.com>
Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
---
 Documentation/sysctl/fs.txt | 30 ++++++++++++++++++++++++
 fs/namei.c                  | 56 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h          |  1 +
 kernel/sysctl.c             |  9 ++++++++
 4 files changed, 96 insertions(+)

diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
index f3cf2cd..7f24b4f 100644
--- a/Documentation/sysctl/fs.txt
+++ b/Documentation/sysctl/fs.txt
@@ -37,6 +37,7 @@ Currently, these files are in /proc/sys/fs:
 - protected_fifos
 - protected_hardlinks
 - protected_regular
+- protected_sticky_child_create
 - protected_symlinks
 - suid_dumpable
 - super-max
@@ -238,6 +239,35 @@ When set to "2" it also applies to group writable sticky directories.
 
 ==============================================================
 
+protected_sticky_child_create:
+
+An O_CREAT open missing the O_EXCL flag in a sticky directory is,
+often, a bug or a synthom of the fact that the program is not
+using appropriate procedures to access sticky directories.
+This protection allow to detect and possibly block these unsafe
+open invocations, even if the files don't exist yet.
+Though should be noted that, sometimes, it's OK to open a file
+with O_CREAT and without O_EXCL (e.g. shared lock files based
+on flock()), for this reason values above 2 should be set
+with care.
+
+When set to "0" the protection is disabled.
+
+When set to "1", notify about O_CREAT open missing the O_EXCL flag
+in world writable sticky directories.
+
+When set to "2", notify about O_CREAT open missing the O_EXCL flag
+in world or group writable sticky directories.
+
+When set to "3", block O_CREAT open missing the O_EXCL flag
+in world writable sticky directories and notify (but don't block)
+in group writable sticky directories.
+
+When set to "4", block O_CREAT open missing the O_EXCL flag
+in world writable and group writable sticky directories.
+
+==============================================================
+
 protected_symlinks:
 
 A long-standing class of security issues is the symlink-based
diff --git a/fs/namei.c b/fs/namei.c
index 92992ad..fcee423 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -904,6 +904,7 @@ static inline void put_link(struct nameidata *nd)
 int sysctl_protected_hardlinks __read_mostly = 0;
 int sysctl_protected_fifos __read_mostly;
 int sysctl_protected_regular __read_mostly;
+int sysctl_protected_sticky_child_create __read_mostly;
 
 /**
  * may_follow_link - Check symlink following for unsafe situations
@@ -1065,6 +1066,53 @@ static int may_create_in_sticky(struct dentry * const dir,
 	return 0;
 }
 
+/**
+ * may_create_no_excl - Detect and possibly block unsafe O_CREAT open
+ *			without O_EXCL.
+ * @dir: the stick parent directory
+ * @name: the file name
+ * @inode: the inode of the file to open (can be NULL to skip uid checks)
+ *
+ * When sysctl_protected_sticky_child_create is set to "0" the
+ * protection is disabled.
+ * When it's set to "1", notify about O_CREAT open missing the O_EXCL flag
+ * in world writable sticky directories.
+ * When it's set to "2", notify about O_CREAT open missing the O_EXCL flag
+ * in group writable sticky directories.
+ * When it's set to "3", block O_CREAT open missing the O_EXCL flag
+ * in world writable sticky directories and notify (but don't block)
+ * in group writable sticky directories.
+ * When it's set to "4", block O_CREAT open missing the O_EXCL flag
+ * in world writable and group writable sticky directories.
+ *
+ * Returns 0 if the open is allowed, -ve on error.
+ */
+static int may_create_no_excl(struct dentry * const dir,
+			      const unsigned char * const name,
+			      struct inode * const inode)
+{
+	umode_t mode = dir->d_inode->i_mode;
+
+	if (likely(!(mode & S_ISVTX)))
+		return 0;
+	if (inode && uid_eq(current_fsuid(), inode->i_uid))
+		return 0;
+
+	if ((sysctl_protected_sticky_child_create && likely(mode & 0002)) ||
+	    (sysctl_protected_sticky_child_create >= 2 && mode & 0020)) {
+		pr_notice_ratelimited("unsafe O_CREAT open (missing O_EXCL) of '%s' in a sticky directory by UID %u, EUID %u, process %s:%d.\n",
+				      name,
+				      from_kuid(&init_user_ns, current_uid()),
+				      from_kuid(&init_user_ns, current_euid()),
+				      current->comm, current->pid);
+		if (sysctl_protected_sticky_child_create >= 4 ||
+		    (sysctl_protected_sticky_child_create == 3 &&
+		     likely(mode & 0002)))
+			return -EACCES;
+	}
+	return 0;
+}
+
 static __always_inline
 const char *get_link(struct nameidata *nd)
 {
@@ -3256,6 +3304,11 @@ static int lookup_open(struct nameidata *nd, struct path *path,
 			error = -EACCES;
 			goto out_dput;
 		}
+		if (!(open_flag & O_EXCL)) {
+			error = may_create_no_excl(dir, nd->last.name, NULL);
+			if (unlikely(error))
+				goto out_dput;
+		}
 		error = dir_inode->i_op->create(dir_inode, dentry, mode,
 						open_flag & O_EXCL);
 		if (error)
@@ -3422,6 +3475,9 @@ static int do_last(struct nameidata *nd,
 		error = may_create_in_sticky(dir, nd->last.name, inode);
 		if (unlikely(error))
 			goto out;
+		error = may_create_no_excl(dir, nd->last.name, inode);
+		if (unlikely(error))
+			goto out;
 	}
 	error = -ENOTDIR;
 	if ((nd->flags & LOOKUP_DIRECTORY) && !d_can_lookup(nd->path.dentry))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6fb45a52..3ab37e0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -74,6 +74,7 @@
 extern int sysctl_protected_hardlinks;
 extern int sysctl_protected_fifos;
 extern int sysctl_protected_regular;
+extern int sysctl_protected_sticky_child_create;
 
 typedef __kernel_rwf_t rwf_t;
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 590fbc9..012c739 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1817,6 +1817,15 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
 		.extra2		= &two,
 	},
 	{
+		.procname	= "protected_sticky_child_create",
+		.data		= &sysctl_protected_sticky_child_create,
+		.maxlen		= sizeof(int),
+		.mode		= 0600,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &four,
+	},
+	{
 		.procname	= "suid_dumpable",
 		.data		= &suid_dumpable,
 		.maxlen		= sizeof(int),
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-22  8:01 ` [PATCH v3 2/2] Protected O_CREAT open in sticky directories Salvatore Mesoraca
@ 2017-11-22 13:22   ` Matthew Wilcox
  2017-11-24  8:29     ` Salvatore Mesoraca
  2017-11-22 16:51   ` Alan Cox
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 26+ messages in thread
From: Matthew Wilcox @ 2017-11-22 13:22 UTC (permalink / raw)
  To: Salvatore Mesoraca
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

On Wed, Nov 22, 2017 at 09:01:46AM +0100, Salvatore Mesoraca wrote:
> +An O_CREAT open missing the O_EXCL flag in a sticky directory is,
> +often, a bug or a synthom of the fact that the program is not
> +using appropriate procedures to access sticky directories.
> +This protection allow to detect and possibly block these unsafe
> +open invocations, even if the files don't exist yet.
> +Though should be noted that, sometimes, it's OK to open a file
> +with O_CREAT and without O_EXCL (e.g. shared lock files based
> +on flock()), for this reason values above 2 should be set
> +with care.
> +
> +When set to "0" the protection is disabled.
> +
> +When set to "1", notify about O_CREAT open missing the O_EXCL flag
> +in world writable sticky directories.
> +
> +When set to "2", notify about O_CREAT open missing the O_EXCL flag
> +in world or group writable sticky directories.
> +
> +When set to "3", block O_CREAT open missing the O_EXCL flag
> +in world writable sticky directories and notify (but don't block)
> +in group writable sticky directories.
> +
> +When set to "4", block O_CREAT open missing the O_EXCL flag
> +in world writable and group writable sticky directories.

This seems insufficiently flexible.  For example, there is no way for me
to specify that I want to block O_CREAT without O_EXCL in world-writable,
but not be notified about O_CREAT without O_EXCL in group-writable.

And maybe I want to be notified that blocking has happened?

Why not make it bits?  So:

0 => notify in world
1 => block in world
2 => notify in group
3 => block in group

So you'd have the following meaningful values:

 0 - permit all (your option 0)
 1 - notify world; permit group (your option 1)
 2 - block world; permit group
 3 - block,notify world; permit group
 4 - permit world; notify group (?)
 5 - notify world; notify group (your option 2)
 6 - block world; notify group (your option 3)
 7 - block,notify world; notify group
 8 - permit world; block group (?)
 9 - notify world; block group (?)
10 - block world; block group (your option 4)
11 - block,notify world; block group
12 - permit world; block, notify group (?)
13 - notify world; block, notify group (?)
14 - block world; block, notify group
15 - block, notify world; block, notify group

Some of these don't make a lot of sense (marked with ?), but I don't see
the harm in permitting a sysadmin to do something that seems nonsensical
to me.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-22  8:01 ` [PATCH v3 2/2] Protected O_CREAT open in sticky directories Salvatore Mesoraca
  2017-11-22 13:22   ` Matthew Wilcox
@ 2017-11-22 16:51   ` Alan Cox
  2017-11-24  8:31     ` Salvatore Mesoraca
  2017-11-24 10:53     ` David Laight
  2017-11-23 22:57   ` Tobin C. Harding
  2017-11-30 16:53   ` David Laight
  3 siblings, 2 replies; 26+ messages in thread
From: Alan Cox @ 2017-11-22 16:51 UTC (permalink / raw)
  To: Salvatore Mesoraca
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

On Wed, 22 Nov 2017 09:01:46 +0100
Salvatore Mesoraca <s.mesoraca16@gmail.com> wrote:

> Disallows O_CREAT open missing the O_EXCL flag, in world or
> group writable directories, even if the file doesn't exist yet.
> With few exceptions (e.g. shared lock files based on flock())

Enough exceptions to make it a bad idea.

Firstly if you care this much *stop* having shared writable directories.
We have namespaces, you don't need them. You can give every user their
own /tmp etc.

The rest of this only make sense on a per application and directory basis
because there are valid use cases, and that means it wants to be part of
an existing LSM security module where you've got the context required and
you can attach it to a specific directory and/or process.

Alan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] [PATCH v3 1/2] Protected FIFOs and regular files
  2017-11-22  8:01 ` [PATCH v3 1/2] Protected FIFOs and regular files Salvatore Mesoraca
@ 2017-11-23 22:43   ` Tobin C. Harding
  2017-11-24  8:24     ` Salvatore Mesoraca
  0 siblings, 1 reply; 26+ messages in thread
From: Tobin C. Harding @ 2017-11-23 22:43 UTC (permalink / raw)
  To: Salvatore Mesoraca
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

On Wed, Nov 22, 2017 at 09:01:45AM +0100, Salvatore Mesoraca wrote:

Please take these comments in all humility, my English is a long way
from perfect. These are English grammar comments only. If this is viewed
as trivial please stop reading now and ignore.

> Disallows open of FIFOs or regular files not owned by the user in world
> writable sticky directories, unless the owner is the same as that of
> the directory or the file is opened without the O_CREAT flag.
> The purpose is to make data spoofing attacks harder.
> This protection can be turned on and off separately for FIFOs and regular
> files via sysctl, just like the symlinks/hardlinks protection.
> This patch is based on Openwall's "HARDEN_FIFO" feature by Solar
> Designer.
> 
> This is a brief list of old vulnerabilities that could have been prevented
> by this feature, some of them even allow for privilege escalation:
> CVE-2000-1134
> CVE-2007-3852
> CVE-2008-0525
> CVE-2009-0416
> CVE-2011-4834
> CVE-2015-1838
> CVE-2015-7442
> CVE-2016-7489
> 
> This list is not meant to be complete. It's difficult to track down
> all vulnerabilities of this kind because they were often reported
> without any mention of this particular attack vector.
> In fact, before symlinks restrictions, fifos/regular files were not the
> favorite vehicle to exploit them.
> 
> Suggested-by: Solar Designer <solar@openwall.com>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
> ---
>  Documentation/sysctl/fs.txt | 36 ++++++++++++++++++++++++++
>  fs/namei.c                  | 61 ++++++++++++++++++++++++++++++++++++++++++---
>  include/linux/fs.h          |  2 ++
>  kernel/sysctl.c             | 18 +++++++++++++
>  4 files changed, 114 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
> index 6c00c1e..f3cf2cd 100644
> --- a/Documentation/sysctl/fs.txt
> +++ b/Documentation/sysctl/fs.txt
> @@ -34,7 +34,9 @@ Currently, these files are in /proc/sys/fs:
>  - overflowgid
>  - pipe-user-pages-hard
>  - pipe-user-pages-soft
> +- protected_fifos
>  - protected_hardlinks
> +- protected_regular
>  - protected_symlinks
>  - suid_dumpable
>  - super-max
> @@ -182,6 +184,24 @@ applied.
>  
>  ==============================================================
>  
> +protected_fifos:
> +
> +The intent of this protection is to avoid unintentional writes to
> +an attacker-controlled FIFO, where a program expected to create a regular
> +file.
> +
> +When set to "0", FIFOs writing is unrestricted.

 When set to "0", writing to FIFOs is unrestricted.

> +When set to "1" don't allow O_CREAT open on FIFOs that we don't own
> +in world writable sticky directories, unless they are owned by the
> +owner of the directory.
> +
> +When set to "2" it also applies to group writable sticky directories.
> +
> +This protection is based on the restrictions in Openwall.
> +
> +==============================================================
> +
>  protected_hardlinks:
>  
>  A long-standing class of security issues is the hardlink-based
> @@ -202,6 +222,22 @@ This protection is based on the restrictions in Openwall and grsecurity.
>  
>  ==============================================================
>  
> +protected_regular:
> +
> +This protection is similar to protected_fifos, but it
> +avoids writes to an attacker-controlled regular file, where a program
> +expected to create one.
> +
> +When set to "0", regular files writing is unrestricted.

 When set to "0", writing to regular files is unrestricted.

> +When set to "1" don't allow O_CREAT open on regular files that we
> +don't own in world writable sticky directories, unless they are
> +owned by the owner of the directory.
> +
> +When set to "2" it also applies to group writable sticky directories.
> +
> +==============================================================
> +
>  protected_symlinks:
>  
>  A long-standing class of security issues is the symlink-based
> diff --git a/fs/namei.c b/fs/namei.c
> index f0c7a7b..92992ad 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -902,6 +902,8 @@ static inline void put_link(struct nameidata *nd)
>  
>  int sysctl_protected_symlinks __read_mostly = 0;
>  int sysctl_protected_hardlinks __read_mostly = 0;
> +int sysctl_protected_fifos __read_mostly;
> +int sysctl_protected_regular __read_mostly;
>  
>  /**
>   * may_follow_link - Check symlink following for unsafe situations
> @@ -1015,6 +1017,54 @@ static int may_linkat(struct path *link)
>  	return -EPERM;
>  }
>  
> +/**
> + * may_create_in_sticky - Check whether an O_CREAT open in a sticky directory
> + *			  should be allowed or not, when the file already
> + *			  existed.

Perhaps

 + * may_create_in_sticky - Check whether an O_CREAT open, in a sticky directory,
 			 should be allowed, or not, on files that already exist.



Hope this helps,
Tobin.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-22  8:01 ` [PATCH v3 2/2] Protected O_CREAT open in sticky directories Salvatore Mesoraca
  2017-11-22 13:22   ` Matthew Wilcox
  2017-11-22 16:51   ` Alan Cox
@ 2017-11-23 22:57   ` Tobin C. Harding
  2017-11-24  8:34     ` Salvatore Mesoraca
  2017-11-30 16:53   ` David Laight
  3 siblings, 1 reply; 26+ messages in thread
From: Tobin C. Harding @ 2017-11-23 22:57 UTC (permalink / raw)
  To: Salvatore Mesoraca
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

On Wed, Nov 22, 2017 at 09:01:46AM +0100, Salvatore Mesoraca wrote:

Same caveat about this being English language comments only as for patch
1/2. Please ignore if this is too trivial. My grammar is a long way from
perfect, especially please feel free to ignore my placement of commas,
they are often wrong. 

> Disallows O_CREAT open missing the O_EXCL flag, in world or
> group writable directories, even if the file doesn't exist yet.
> With few exceptions (e.g. shared lock files based on flock())
> if a program tries to open a file, in a sticky directory,
> with the O_CREAT flag and without the O_EXCL, it probably has a bug.
> This feature allows to detect and potentially block programs that
> act this way, it can be used to find vulnerabilities (like those
> prevented by patch #1) and to do policy enforcement.
> 
> Suggested-by: Solar Designer <solar@openwall.com>
> Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
> ---
>  Documentation/sysctl/fs.txt | 30 ++++++++++++++++++++++++
>  fs/namei.c                  | 56 +++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h          |  1 +
>  kernel/sysctl.c             |  9 ++++++++
>  4 files changed, 96 insertions(+)
> 
> diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
> index f3cf2cd..7f24b4f 100644
> --- a/Documentation/sysctl/fs.txt
> +++ b/Documentation/sysctl/fs.txt
> @@ -37,6 +37,7 @@ Currently, these files are in /proc/sys/fs:
>  - protected_fifos
>  - protected_hardlinks
>  - protected_regular
> +- protected_sticky_child_create
>  - protected_symlinks
>  - suid_dumpable
>  - super-max
> @@ -238,6 +239,35 @@ When set to "2" it also applies to group writable sticky directories.
>  
>  ==============================================================
>  
> +protected_sticky_child_create:
> +
> +An O_CREAT open missing the O_EXCL flag in a sticky directory is,
> +often, a bug or a synthom of the fact that the program is not

s/synthom/symptom

> +using appropriate procedures to access sticky directories.
> +This protection allow to detect and possibly block these unsafe

Perhaps

 This protection allows us to detect, and possibly block, these unsafe

> +open invocations, even if the files don't exist yet.
> +Though should be noted that, sometimes, it's OK to open a file

Perhaps

 +Although it should be noted, sometimes it's OK to open a file

(I looked up 'although' vs 'though' and am not quite sure on this one,
it seems to read better with 'although'. Again, apologies if this is
overly trivial.)


Hope this helps,
Tobin.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] [PATCH v3 1/2] Protected FIFOs and regular files
  2017-11-23 22:43   ` [kernel-hardening] " Tobin C. Harding
@ 2017-11-24  8:24     ` Salvatore Mesoraca
  0 siblings, 0 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-24  8:24 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

2017-11-23 23:43 GMT+01:00 Tobin C. Harding <me@tobin.cc>:
> On Wed, Nov 22, 2017 at 09:01:45AM +0100, Salvatore Mesoraca wrote:
>
> Please take these comments in all humility, my English is a long way
> from perfect. These are English grammar comments only. If this is viewed
> as trivial please stop reading now and ignore.

Any help is always greatly appreciated!
And I like your proposed changes, they sound better to me too.
Thank you for your time,

Salvatore

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-22 13:22   ` Matthew Wilcox
@ 2017-11-24  8:29     ` Salvatore Mesoraca
  0 siblings, 0 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-24  8:29 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

2017-11-22 14:22 GMT+01:00 Matthew Wilcox <willy@infradead.org>:
> On Wed, Nov 22, 2017 at 09:01:46AM +0100, Salvatore Mesoraca wrote:
>> +An O_CREAT open missing the O_EXCL flag in a sticky directory is,
>> +often, a bug or a synthom of the fact that the program is not
>> +using appropriate procedures to access sticky directories.
>> +This protection allow to detect and possibly block these unsafe
>> +open invocations, even if the files don't exist yet.
>> +Though should be noted that, sometimes, it's OK to open a file
>> +with O_CREAT and without O_EXCL (e.g. shared lock files based
>> +on flock()), for this reason values above 2 should be set
>> +with care.
>> +
>> +When set to "0" the protection is disabled.
>> +
>> +When set to "1", notify about O_CREAT open missing the O_EXCL flag
>> +in world writable sticky directories.
>> +
>> +When set to "2", notify about O_CREAT open missing the O_EXCL flag
>> +in world or group writable sticky directories.
>> +
>> +When set to "3", block O_CREAT open missing the O_EXCL flag
>> +in world writable sticky directories and notify (but don't block)
>> +in group writable sticky directories.
>> +
>> +When set to "4", block O_CREAT open missing the O_EXCL flag
>> +in world writable and group writable sticky directories.
>
> This seems insufficiently flexible.  For example, there is no way for me
> to specify that I want to block O_CREAT without O_EXCL in world-writable,
> but not be notified about O_CREAT without O_EXCL in group-writable.

I understand your concern, I did it like this because I wanted to keep the
interface as simple as possible. But, maybe, more flexibility it's better.

> And maybe I want to be notified that blocking has happened?

I didn't write it explicitly in the doc, but you will always be notified
that blocking has happened. On the other hand you don't have any way to
suppress those notification.

> Why not make it bits?  So:
>
> 0 => notify in world
> 1 => block in world
> 2 => notify in group
> 3 => block in group
>
> So you'd have the following meaningful values:
>
>  0 - permit all (your option 0)
>  1 - notify world; permit group (your option 1)
>  2 - block world; permit group
>  3 - block,notify world; permit group
>  4 - permit world; notify group (?)
>  5 - notify world; notify group (your option 2)
>  6 - block world; notify group (your option 3)
>  7 - block,notify world; notify group
>  8 - permit world; block group (?)
>  9 - notify world; block group (?)
> 10 - block world; block group (your option 4)
> 11 - block,notify world; block group
> 12 - permit world; block, notify group (?)
> 13 - notify world; block, notify group (?)
> 14 - block world; block, notify group
> 15 - block, notify world; block, notify group
>
> Some of these don't make a lot of sense (marked with ?), but I don't see
> the harm in permitting a sysadmin to do something that seems nonsensical
> to me.

I like your idea of using "bits" this way, even if it will allow sysadmins
to set values that don't make too much sense.
Thank you very much for your suggestions,

Salvatore

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-22 16:51   ` Alan Cox
@ 2017-11-24  8:31     ` Salvatore Mesoraca
  2017-11-24 10:53     ` David Laight
  1 sibling, 0 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-24  8:31 UTC (permalink / raw)
  To: Alan Cox
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

2017-11-22 17:51 GMT+01:00 Alan Cox <gnomes@lxorguk.ukuu.org.uk>:
> On Wed, 22 Nov 2017 09:01:46 +0100
> Salvatore Mesoraca <s.mesoraca16@gmail.com> wrote:
>
>> Disallows O_CREAT open missing the O_EXCL flag, in world or
>> group writable directories, even if the file doesn't exist yet.
>> With few exceptions (e.g. shared lock files based on flock())
>
> Enough exceptions to make it a bad idea.
>
> Firstly if you care this much *stop* having shared writable directories.
> We have namespaces, you don't need them. You can give every user their
> own /tmp etc.
>
> The rest of this only make sense on a per application and directory basis
> because there are valid use cases, and that means it wants to be part of
> an existing LSM security module where you've got the context required and
> you can attach it to a specific directory and/or process.

I think that this feature should be intended more as a "debugging" feature
than as a "security" one.
When the feature implemented in the first patch is enabled, this restriction
doesn't improve security at all and it's not supposed to do it.
The first patch blocks attacks that exploit some unsafe usage of sticky
directories.
This patch, instead, doesn't block actual attacks: it detects (and maybe
blocks) the bad code that can be exploited for the attacks blocked by #1,
even if no one is attacking you in that moment.
This looks like a useful feature to me, even if you already use more
sophisticated security apparatus like LSMs or namespaces, because it makes
easy to find real vulnerabilities in software: the commit message of
patch #1 has a short list of some CVEs that this feature can detect.
Also, being just a sysctl away, it's within anyone's reach.
Probably the "debugging" goal wasn't clear from my previous message,
I'm sorry for the misunderstanding.
Thank you very much for your time,

Salvatore

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-23 22:57   ` Tobin C. Harding
@ 2017-11-24  8:34     ` Salvatore Mesoraca
  0 siblings, 0 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-24  8:34 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

2017-11-23 23:57 GMT+01:00 Tobin C. Harding <me@tobin.cc>:
> On Wed, Nov 22, 2017 at 09:01:46AM +0100, Salvatore Mesoraca wrote:
>
> Same caveat about this being English language comments only as for patch
> 1/2. Please ignore if this is too trivial. My grammar is a long way from
> perfect, especially please feel free to ignore my placement of commas,
> they are often wrong.

As I wrote before: any help is always welcome.
Thank you,

Salvatore

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-22 16:51   ` Alan Cox
  2017-11-24  8:31     ` Salvatore Mesoraca
@ 2017-11-24 10:53     ` David Laight
  2017-11-24 11:43       ` Salvatore Mesoraca
  1 sibling, 1 reply; 26+ messages in thread
From: David Laight @ 2017-11-24 10:53 UTC (permalink / raw)
  To: 'Alan Cox', Salvatore Mesoraca
  Cc: linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Solar Designer, Eric W. Biederman

From: Alan Cox
> Sent: 22 November 2017 16:52
> 
> On Wed, 22 Nov 2017 09:01:46 +0100
> Salvatore Mesoraca <s.mesoraca16@gmail.com> wrote:
> 
> > Disallows O_CREAT open missing the O_EXCL flag, in world or
> > group writable directories, even if the file doesn't exist yet.
> > With few exceptions (e.g. shared lock files based on flock())
> 
> Enough exceptions to make it a bad idea.
> 
> Firstly if you care this much *stop* having shared writable directories.
> We have namespaces, you don't need them. You can give every user their
> own /tmp etc.

Looks like a very bad idea to me as well.

Doesn't this stop all shell redirects into a shared /tmp ?
I'm pretty sure most programs use O_CREAT | O_TRUNC for output
files - they'll all stop working.

If there are some directories where you need to force O_EXCL you
need to make it a property of the directory, not the kernel.

	David

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-24 10:53     ` David Laight
@ 2017-11-24 11:43       ` Salvatore Mesoraca
  2017-11-24 11:53         ` David Laight
  2017-11-27  0:26         ` Solar Designer
  0 siblings, 2 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-24 11:43 UTC (permalink / raw)
  To: David Laight
  Cc: Alan Cox, linux-kernel, Kernel Hardening, linux-fsdevel,
	Alexander Viro, Jann Horn, Kees Cook, Solar Designer,
	Eric W. Biederman

2017-11-24 11:53 GMT+01:00 David Laight <David.Laight@aculab.com>:
> From: Alan Cox
>> Sent: 22 November 2017 16:52
>>
>> On Wed, 22 Nov 2017 09:01:46 +0100
>> Salvatore Mesoraca <s.mesoraca16@gmail.com> wrote:
>>
>> > Disallows O_CREAT open missing the O_EXCL flag, in world or
>> > group writable directories, even if the file doesn't exist yet.
>> > With few exceptions (e.g. shared lock files based on flock())
>>
>> Enough exceptions to make it a bad idea.
>>
>> Firstly if you care this much *stop* having shared writable directories.
>> We have namespaces, you don't need them. You can give every user their
>> own /tmp etc.
>
> Looks like a very bad idea to me as well.
>
> Doesn't this stop all shell redirects into a shared /tmp ?
> I'm pretty sure most programs use O_CREAT | O_TRUNC for output
> files - they'll all stop working.

If some program does such a thing, that's a potential vulnerability.
With "protected_hardlinks" you are, in most cases, safe.
But, still, that program has a bug and having this feature enabled will
help you notice it soon.
For that matter, I'm using this patch on my system and I don't have any
program behaving like this.

> If there are some directories where you need to force O_EXCL you
> need to make it a property of the directory, not the kernel.
>
>         David
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-24 11:43       ` Salvatore Mesoraca
@ 2017-11-24 11:53         ` David Laight
  2017-11-26 11:29           ` Salvatore Mesoraca
  2017-11-27  0:26         ` Solar Designer
  1 sibling, 1 reply; 26+ messages in thread
From: David Laight @ 2017-11-24 11:53 UTC (permalink / raw)
  To: 'Salvatore Mesoraca'
  Cc: Alan Cox, linux-kernel, Kernel Hardening, linux-fsdevel,
	Alexander Viro, Jann Horn, Kees Cook, Solar Designer,
	Eric W. Biederman

From: Salvatore Mesoraca [mailto:s.mesoraca16@gmail.com]
> Sent: 24 November 2017 11:44
> 
> 2017-11-24 11:53 GMT+01:00 David Laight <David.Laight@aculab.com>:
> > From: Alan Cox
> >> Sent: 22 November 2017 16:52
> >>
> >> On Wed, 22 Nov 2017 09:01:46 +0100
> >> Salvatore Mesoraca <s.mesoraca16@gmail.com> wrote:
> >>
> >> > Disallows O_CREAT open missing the O_EXCL flag, in world or
> >> > group writable directories, even if the file doesn't exist yet.
> >> > With few exceptions (e.g. shared lock files based on flock())
> >>
> >> Enough exceptions to make it a bad idea.
> >>
> >> Firstly if you care this much *stop* having shared writable directories.
> >> We have namespaces, you don't need them. You can give every user their
> >> own /tmp etc.
> >
> > Looks like a very bad idea to me as well.
> >
> > Doesn't this stop all shell redirects into a shared /tmp ?
> > I'm pretty sure most programs use O_CREAT | O_TRUNC for output
> > files - they'll all stop working.
> 
> If some program does such a thing, that's a potential vulnerability.
> With "protected_hardlinks" you are, in most cases, safe.
> But, still, that program has a bug and having this feature enabled will
> help you notice it soon.
> For that matter, I'm using this patch on my system and I don't have any
> program behaving like this.

Hmmm.... a quick strace shows cp and vi doing stat("/tmp/foo") and then
open(O_WRONLY|O_TRUNC) if it exists and O_CREATE|O_EXCL if it doesn't.
I can't help feeling that is just hiding a race.

	David

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-24 11:53         ` David Laight
@ 2017-11-26 11:29           ` Salvatore Mesoraca
  0 siblings, 0 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-26 11:29 UTC (permalink / raw)
  To: David Laight
  Cc: Alan Cox, linux-kernel, Kernel Hardening, linux-fsdevel,
	Alexander Viro, Jann Horn, Kees Cook, Solar Designer,
	Eric W. Biederman

2017-11-24 12:53 GMT+01:00 David Laight <David.Laight@aculab.com>:
> From: Salvatore Mesoraca [mailto:s.mesoraca16@gmail.com]
>> Sent: 24 November 2017 11:44
>>
>> 2017-11-24 11:53 GMT+01:00 David Laight <David.Laight@aculab.com>:
>> > From: Alan Cox
>> >> Sent: 22 November 2017 16:52
>> >>
>> >> On Wed, 22 Nov 2017 09:01:46 +0100
>> >> Salvatore Mesoraca <s.mesoraca16@gmail.com> wrote:
>> >>
>> >> > Disallows O_CREAT open missing the O_EXCL flag, in world or
>> >> > group writable directories, even if the file doesn't exist yet.
>> >> > With few exceptions (e.g. shared lock files based on flock())
>> >>
>> >> Enough exceptions to make it a bad idea.
>> >>
>> >> Firstly if you care this much *stop* having shared writable directories.
>> >> We have namespaces, you don't need them. You can give every user their
>> >> own /tmp etc.
>> >
>> > Looks like a very bad idea to me as well.
>> >
>> > Doesn't this stop all shell redirects into a shared /tmp ?
>> > I'm pretty sure most programs use O_CREAT | O_TRUNC for output
>> > files - they'll all stop working.
>>
>> If some program does such a thing, that's a potential vulnerability.
>> With "protected_hardlinks" you are, in most cases, safe.
>> But, still, that program has a bug and having this feature enabled will
>> help you notice it soon.
>> For that matter, I'm using this patch on my system and I don't have any
>> program behaving like this.
>
> Hmmm.... a quick strace shows cp and vi doing stat("/tmp/foo") and then
> open(O_WRONLY|O_TRUNC) if it exists and O_CREATE|O_EXCL if it doesn't.
> I can't help feeling that is just hiding a race.

Yes, unfortunately, doing something like "cp somefile /tmp/" is a bad
practice that
in most cases will go unnoticed by this feature. Nevertheless there
are many other
real world vulnerability that it would have been able to detect.
Thank you very much for taking the time to do some experiments.

Salvatore

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-24 11:43       ` Salvatore Mesoraca
  2017-11-24 11:53         ` David Laight
@ 2017-11-27  0:26         ` Solar Designer
  2017-11-30 14:39           ` Salvatore Mesoraca
  1 sibling, 1 reply; 26+ messages in thread
From: Solar Designer @ 2017-11-27  0:26 UTC (permalink / raw)
  To: Salvatore Mesoraca
  Cc: David Laight, Alan Cox, linux-kernel, Kernel Hardening,
	linux-fsdevel, Alexander Viro, Jann Horn, Kees Cook,
	Eric W. Biederman

On Fri, Nov 24, 2017 at 12:43:47PM +0100, Salvatore Mesoraca wrote:
> 2017-11-24 11:53 GMT+01:00 David Laight <David.Laight@aculab.com>:
> > From: Alan Cox
> >> Sent: 22 November 2017 16:52
> >>
> >> On Wed, 22 Nov 2017 09:01:46 +0100 Salvatore Mesoraca <s.mesoraca16@gmail.com> wrote:
> >>
> >> > Disallows O_CREAT open missing the O_EXCL flag, in world or
> >> > group writable directories, even if the file doesn't exist yet.
> >> > With few exceptions (e.g. shared lock files based on flock())

Why would "shared lock files based on flock()" need O_CREAT without
O_EXCL?  Where specifically are they currently used that way?  My guess
would be a group-writable mail spool (that would be fcntl() locks on
Linux now, but it's the same thing for the purpose of this discussion),
but even then I don't see a need for such open flags.  If a program does
that, we could want to find out and revise it (if O_CREAT|O_EXCL fails,
retry without these to open the existing file, then flock() either way).

> >> Enough exceptions to make it a bad idea.

Maybe, but as Salvatore points out we're considering this for rather
limited use - easy opt-in policy enforcement during software development
and auditing (years ago, I was using an LKM for this purpose, which
caught some issues in Postfix that were promptly patched), and maybe on
specific production systems where the sysadmins know what they're doing.

> >> Firstly if you care this much *stop* having shared writable directories.
> >> We have namespaces, you don't need them. You can give every user their
> >> own /tmp etc.

This is good advice (and I've been doing similar with pam_mktemp, prior
to namespaces), but it's not exactly for the same use case.

Also, there are shared writable directories that have to remain shared
unless more things are reworked.  For example, mail spools and crontab
directories, which could be avoided by having the corresponding files in
user homes instead (and it's been done), but that's also a related risk
(a privileged process accessing a directory writable by a user, even if
accessing only for reading, lowering the risk impact).

> > Looks like a very bad idea to me as well.
> >
> > Doesn't this stop all shell redirects into a shared /tmp ?

The above sounds like it'd be something bad - but it's actually just
right for an opt-in policy like this.  A command like "program >
/tmp/file" is generally unsafe (unless the file was pre-created safely,
see below) and should be avoided.  The fact that a lot of documentation
gives commands of this kind only emphasizes the need for easy policy
enforcement against such misuses of /tmp.

Ideally, we'd stop most "shell redirects into a shared /tmp" - all but
those into files correctly pre-created with mktemp(1) and the like.

Technically, we could achieve similar by allowing O_CREAT without O_EXCL
if the file exists _and_ has the same owner as our current fsuid or the
owner is root.  Most scripts that use mktemp(1) then use that file
without switching privileges, so they'll continue working.  If there's
an exception to that on a system where I'd configure a policy like
what's discussed here, I'd rather learn of it and be interested in
looking at that unusual script.  That script would be taking a risk by
making one (pseudo-)user (or root) write into a file in /tmp created
(even if initially safely) by another pseudo-user (not root), thus
letting the latter pseudo-user attack the former (or root).

Ditto for programs using mkstemp(3), but then somehow going through the
pathname rather than the fd.  That's not ideal, but it's sometimes safe
and sometimes makes sense (such as when they need to exec other programs
only accepting a pathname), so the same exception and approach applies.

> > I'm pretty sure most programs use O_CREAT | O_TRUNC for output
> > files - they'll all stop working.

That would be great.  Unfortunately (in this context), you've identified
exceptions, where programs try to be smart but are not necessarily safe.

It's beneficial to be able to easily stop at least some misuses of /tmp
and the like, even if we can't stop all.

> If some program does such a thing, that's a potential vulnerability.

Exactly.

> With "protected_hardlinks" you are, in most cases, safe.

And "protected_symlinks", and not in terms of data spoofing attacks via
FIFOs and regular files (the initial focus of this patchset).

> But, still, that program has a bug and having this feature enabled will
> help you notice it soon.
> For that matter, I'm using this patch on my system and I don't have any
> program behaving like this.
> 
> > If there are some directories where you need to force O_EXCL you
> > need to make it a property of the directory, not the kernel.
> >
> >         David

Good idea, but this would need to be checked and enforced by the kernel
anyhow, so it's a matter of coarse vs. fine policy.  Coarse is much
easier to implement and to start using, albeit perhaps mostly not by
general-purpose distros, but by a software developer/auditor, an
adventurous sysadmin, or a sysadmin team or a developer of specific
systems where a simple policy like this is known to be valid (e.g.,
where there's no expectation of arbitrary software being added, so a
simple rule like this can be introduced system-wide).

Alexander

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-27  0:26         ` Solar Designer
@ 2017-11-30 14:39           ` Salvatore Mesoraca
  2017-11-30 14:57             ` [kernel-hardening] " Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-11-30 14:39 UTC (permalink / raw)
  To: Solar Designer
  Cc: David Laight, Alan Cox, linux-kernel, Kernel Hardening,
	linux-fsdevel, Alexander Viro, Jann Horn, Kees Cook,
	Eric W. Biederman

2017-11-27 1:26 GMT+01:00 Solar Designer <solar@openwall.com>:
> On Fri, Nov 24, 2017 at 12:43:47PM +0100, Salvatore Mesoraca wrote:
> > 2017-11-24 11:53 GMT+01:00 David Laight <David.Laight@aculab.com>:
> > > From: Alan Cox
> > >> Sent: 22 November 2017 16:52
> > >>
> > >> On Wed, 22 Nov 2017 09:01:46 +0100 Salvatore Mesoraca <s.mesoraca16@gmail.com> wrote:
> > >>
> > >> > Disallows O_CREAT open missing the O_EXCL flag, in world or
> > >> > group writable directories, even if the file doesn't exist yet.
> > >> > With few exceptions (e.g. shared lock files based on flock())
>
> Why would "shared lock files based on flock()" need O_CREAT without
> O_EXCL?  Where specifically are they currently used that way?

I don't think that they *need* to act like this, but this is how
util-linux's flock(1) currently works.
And it doesn't seem an unreasonable behavior from their perspective,
so maybe other programs do that too.
I was citing that case just to make it clear that, if someone gets
a warning because of flock(1), they shouldn't be worried about it.
That behavior can be certainly avoided, but of course it isn't a
security problem per se.

> If a program does
> that, we could want to find out and revise it (if O_CREAT|O_EXCL fails,
> retry without these to open the existing file, then flock() either way).

Yes, this would probably be the best thing to do, good idea.
Thanks again for your time,

Salvatore

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-30 14:39           ` Salvatore Mesoraca
@ 2017-11-30 14:57             ` Ian Campbell
  2017-11-30 16:30               ` [kernel-hardening] " Solar Designer
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2017-11-30 14:57 UTC (permalink / raw)
  To: Salvatore Mesoraca, Solar Designer
  Cc: David Laight, Alan Cox, linux-kernel, Kernel Hardening,
	linux-fsdevel, Alexander Viro, Jann Horn, Kees Cook,
	Eric W. Biederman

On Thu, 2017-11-30 at 15:39 +0100, Salvatore Mesoraca wrote:
> 2017-11-27 1:26 GMT+01:00 Solar Designer <solar@openwall.com>:
> > On Fri, Nov 24, 2017 at 12:43:47PM +0100, Salvatore Mesoraca wrote:
> > > 2017-11-24 11:53 GMT+01:00 David Laight <David.Laight@aculab.com>
> > > :
> > > > From: Alan Cox
> > > > > Sent: 22 November 2017 16:52
> > > > > 
> > > > > On Wed, 22 Nov 2017 09:01:46 +0100 Salvatore Mesoraca <s.meso
> > > > > raca16@gmail.com> wrote:
> > > > > 
> > > > > > Disallows O_CREAT open missing the O_EXCL flag, in world or
> > > > > > group writable directories, even if the file doesn't exist
> > > > > > yet.
> > > > > > With few exceptions (e.g. shared lock files based on
> > > > > > flock())
> > 
> > Why would "shared lock files based on flock()" need O_CREAT without
> > O_EXCL?  Where specifically are they currently used that way?
> 
> I don't think that they *need* to act like this, but this is how
> util-linux's flock(1) currently works.
> And it doesn't seem an unreasonable behavior from their perspective,

I thought that too, specifically I reasoned that using O_EXCL would
defeat the purpose of the _shared_ flock(), wouldn't it?

Ian.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-30 14:57             ` [kernel-hardening] " Ian Campbell
@ 2017-11-30 16:30               ` Solar Designer
  2017-12-05 10:21                 ` Salvatore Mesoraca
  0 siblings, 1 reply; 26+ messages in thread
From: Solar Designer @ 2017-11-30 16:30 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Salvatore Mesoraca, David Laight, Alan Cox, linux-kernel,
	Kernel Hardening, linux-fsdevel, Alexander Viro, Jann Horn,
	Kees Cook, Eric W. Biederman, H. Peter Anvin, Karel Zak

Replying to Salvatore and Ian at once, and CC'ing H. Peter Anvin and
Karel Zak for util-linux flock(1).

On Thu, Nov 30, 2017 at 02:57:06PM +0000, Ian Campbell wrote:
> On Thu, 2017-11-30 at 15:39 +0100, Salvatore Mesoraca wrote:
> > 2017-11-27 1:26 GMT+01:00 Solar Designer <solar@openwall.com>:
> > > Why would "shared lock files based on flock()" need O_CREAT without
> > > O_EXCL?  Where specifically are they currently used that way?
> > 
> > I don't think that they *need* to act like this, but this is how
> > util-linux's flock(1) currently works.

Oh, but you never referred to flock(1) in there, so I read flock() as
implying flock(2).  I think util-linux's flock(1) is relatively obscure,
and as far as I can tell it isn't documented anywhere whether it may be
used on untrusted lock file directories or not.  A revision of the
flock(1) man page seen on RHEL7 gives really weird examples using /tmp.
The current util-linux-2.31/sys-utils/flock.1 is similar.  strace'ing
those examples, I see flock(1) literally uses the /tmp directory itself
(not a file inside) as the lock, which surprisingly appears to work.
Checking the util-linux tree, it looks like this was added as a feature.
I suppose the good news is this doesn't appear to allow for worse than a
DoS against the script using flock(1) (since a different user can also
take that lock), unless any of the upper level directories are also
untrusted.  The man page as seen at https://linux.die.net/man/1/flock
currently only lists a more complex example with /var/lock/mylockfile,
where flock(1) itself is asked to access it via a pre-opened fd.
Permissions on /var/lock vary between distros (e.g., a RHEL6'ish system
I just checked has it as 775 root:lock, whereas a RHEL7'ish has it as
symlink to ../run/lock, which is 755 root:root).  Anyway, these are just
examples.  The reality is people will use flock(1) in various directories,
some of them trusted and some not.  If our proposed policy can detect and
optionally break some uses in untrusted directories, that's good since
such uses are currently unsafe (see below).

> > And it doesn't seem an unreasonable behavior from their perspective,
> 
> I thought that too, specifically I reasoned that using O_EXCL would
> defeat the purpose of the _shared_ flock(), wouldn't it?

No.  O_EXCL means the attempt to O_CREAT will fail, at which point the
program can retry without both of these flags.  (The actual lock is
obtained later via flock(2) or fcntl(2) anyway.)  In fact, flock(1)
already does something similar, as tested using the very first example
from the man page on a RHEL7'ish system:

$ strace flock /tmp -c cat
[...]
open("/tmp", O_RDONLY|O_CREAT|O_NOCTTY, 0666) = -1 EISDIR (Is a directory)
open("/tmp", O_RDONLY|O_NOCTTY)         = 3
flock(3, LOCK_EX)                       = 0

As you can see, when O_CREAT failed, the program retried without that
flag.  It could just as easily have tried O_CREAT|O_EXCL, then dropped
both at once.  Testing with a lock file (rather than lock directory):

$ strace flock /tmp/lockfile -c cat
[...]
open("/tmp/lockfile", O_RDONLY|O_CREAT|O_NOCTTY, 0666) = 3
flock(3, LOCK_EX)                       = 0

This use of flock(1) would be a worse vulnerability, not limited to DoS
against the script itself but also allowing for privilege escalation
unless (sym)link restrictions are enabled in the kernel.  Adding O_EXCL
would help (reduce the impact back to DoS against itself), and would
require that the retry logic (like what is seen in the lock directory
example above) be present.

On Thu, Nov 30, 2017 at 03:39:48PM +0100, Salvatore Mesoraca wrote:
> so maybe other programs do that too.

Maybe, and they would be similarly vulnerable if they do that in
untrusted directories, and they would also need to be fixed.

A subtle case is when something like this is technically done in a
directory writable by someone else, but is not necessarily crossing
what the program's author or the sysadmin recognize as a privilege
boundary.  Maybe they have accepted that this one pseudo-user can
attack that other one anyway, such as because under a certain daemon's
privsep model the would-be attacking pseudo-user is necessarily more
privileged anyway.  This is why we shouldn't by default extend this
policy to all directories writable by someone else, but instead we
consider introducing a sysctl with varying policy levels - initially
focusing on sticky directories writable by someone else and only later
optionally extending to non-sticky directories writable by someone else.
The latter mode would have even greater focus on development and
debugging rather than on production use, although I imagine that
specific systems could eventually afford it in production as well.

> I was citing that case just to make it clear that, if someone gets
> a warning because of flock(1), they shouldn't be worried about it.

Checking the util-linux-2.31/sys-utils/flock.c, I see that it in fact
doesn't retry without O_CREAT in the non-directory case.  That would
need to be corrected, along with introduction of O_EXCL.

> That behavior can be certainly avoided, but of course it isn't a
> security problem per se.

I think it is a security problem per se, except in the "subtle case"
above, and it's great that our proposed policy would catch it.

Perhaps flock(1) should be extended as follows:

> > If a program does
> > that, we could want to find out and revise it (if O_CREAT|O_EXCL fails,
> > retry without these to open the existing file, then flock() either way).
> 
> Yes, this would probably be the best thing to do, good idea.

util-linux-2.31/sys-utils/flock.c currently has:

static int open_file(const char *filename, int *flags)
{

	int fd;
	int fl = *flags == 0 ? O_RDONLY : *flags;

	errno = 0;
	fl |= O_NOCTTY | O_CREAT;
	fd = open(filename, fl, 0666);

	/* Linux doesn't like O_CREAT on a directory, even though it
	 * should be a no-op; POSIX doesn't allow O_RDWR or O_WRONLY
	 */
	if (fd < 0 && errno == EISDIR) {
		fl = O_RDONLY | O_NOCTTY;
		fd = open(filename, fl);
	}

I think this should be revised to:

	fl |= O_NOCTTY | O_CREAT | O_EXCL;
	fd = open(filename, fl, 0666);

	/* Linux doesn't like O_CREAT on a directory, even though it
	 * should be a no-op; POSIX doesn't allow O_RDWR or O_WRONLY
	 */
	if (fd < 0) {
		if (errno == EISDIR)
			fl = O_RDONLY | O_NOCTTY;
		else
			fl &= ~(O_CREAT | O_EXCL);
		fd = open(filename, fl);
	}

This adds O_EXCL and retry without O_CREAT|O_EXCL for non-directories.

[ The pre-umask permissions of 0666 are also a potential concern since
many users have a relaxed umask as distro default, but changing these to
0600 would break some otherwise valid uses where a lock file may be
shared between different users on purpose.  Maybe a future major version
of flock(1) could default to 0600 and/or accept a "-m" option to set the
lock file's mode in case a new file gets created.  mktemp(1) may be
viewed as precedent: it uses pre-umask permissions of 0600.  Lock files
are arguably somewhat similar to temporary files.  This is beyond scope
of this thread and LKML, though, and should be discussed elsewhere. ]

Alexander

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-22  8:01 ` [PATCH v3 2/2] Protected O_CREAT open in sticky directories Salvatore Mesoraca
                     ` (2 preceding siblings ...)
  2017-11-23 22:57   ` Tobin C. Harding
@ 2017-11-30 16:53   ` David Laight
  2017-11-30 17:51     ` Solar Designer
  3 siblings, 1 reply; 26+ messages in thread
From: David Laight @ 2017-11-30 16:53 UTC (permalink / raw)
  To: 'Salvatore Mesoraca', linux-kernel
  Cc: Kernel Hardening, linux-fsdevel, Alexander Viro, Jann Horn,
	Kees Cook, Solar Designer, Eric W. Biederman

From: Salvatore Mesoraca
> Sent: 22 November 2017 08:02
> 
> Disallows O_CREAT open missing the O_EXCL flag, in world or
> group writable directories, even if the file doesn't exist yet.
> With few exceptions (e.g. shared lock files based on flock())
> if a program tries to open a file, in a sticky directory,
> with the O_CREAT flag and without the O_EXCL, it probably has a bug.
> This feature allows to detect and potentially block programs that
> act this way, it can be used to find vulnerabilities (like those
> prevented by patch #1) and to do policy enforcement.

(Going back to the original post)

I presume the 'vulnerabilities' are related to symlinks being created
just before the open?

Trouble is this change breaks a lot of general use of /tmp.
I always assumed that code that cared would use O_EXCL and
everything else wasn't worth subverting.

I found code in vi (and elsewhere) that subverted these checks
by opening with O_WRONLY if stat() showed the file existed and
O_CREAT|O_EXCL if it didn't.

I'm pretty sure that traditionally a lot of these opens were done
with O_CREAT|O_TRUNC.
Implementing that as unlink() followed by a create would stop
'random' (ok all) symlinks being followed.

Overall I'm pretty sure this change will break things badly somewhere.

	David

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-30 16:53   ` David Laight
@ 2017-11-30 17:51     ` Solar Designer
  2017-12-01  9:46       ` David Laight
  0 siblings, 1 reply; 26+ messages in thread
From: Solar Designer @ 2017-11-30 17:51 UTC (permalink / raw)
  To: David Laight
  Cc: 'Salvatore Mesoraca',
	linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Eric W. Biederman

On Thu, Nov 30, 2017 at 04:53:06PM +0000, David Laight wrote:
> From: Salvatore Mesoraca
> > if a program tries to open a file, in a sticky directory,
> > with the O_CREAT flag and without the O_EXCL, it probably has a bug.
> > This feature allows to detect and potentially block programs that
> > act this way, it can be used to find vulnerabilities (like those
> > prevented by patch #1) and to do policy enforcement.

> I presume the 'vulnerabilities' are related to symlinks being created
> just before the open?

That's one way to exploit them.  We already have a sysctl to try and
block that specific attack, and we're considering adding more, for other
attacks.  But we'd also like an easy way to learn of the vulnerabilities
without anyone trying to exploit them yet, hence this patch.

> Trouble is this change breaks a lot of general use of /tmp.

That's general misuse of /tmp.  Things like "command > /tmp/file"
without having pre-created the file with O_EXCL e.g. by mktemp(1).

> I always assumed that code that cared would use O_EXCL and
> everything else wasn't worth subverting.

Opinions will vary on whether it's worth subverting or not, and someone
may reasonably want to configure this differently on different systems.

> I found code in vi (and elsewhere) that subverted these checks
> by opening with O_WRONLY if stat() showed the file existed and
> O_CREAT|O_EXCL if it didn't.

Yes, such misuses of /tmp and the like by use of those programs - where
the potential consequences would often be less severe (if your example
above is correct, it sounds like it'd be data spoofing but not
overwriting another file over a link) - could remain unnoticed.

> I'm pretty sure that traditionally a lot of these opens were done
> with O_CREAT|O_TRUNC.
> Implementing that as unlink() followed by a create would stop
> 'random' (ok all) symlinks being followed.

That's racy.  We have O_EXCL, and it must be used along with O_CREAT
when the directory is untrusted.  (If it were only about symlinks, we
also already have O_NOFOLLOW.)

> Overall I'm pretty sure this change will break things badly somewhere.

Sure.  We want it to visibly break what was subtly broken, so that the
breakage can be seen and corrected without an attempted attack.

Alexander

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-30 17:51     ` Solar Designer
@ 2017-12-01  9:46       ` David Laight
  2017-12-01 15:52         ` Alan Cox
  0 siblings, 1 reply; 26+ messages in thread
From: David Laight @ 2017-12-01  9:46 UTC (permalink / raw)
  To: 'Solar Designer'
  Cc: 'Salvatore Mesoraca',
	linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Eric W. Biederman

From: Solar Designer
> Sent: 30 November 2017 17:52
> 
> On Thu, Nov 30, 2017 at 04:53:06PM +0000, David Laight wrote:
> > From: Salvatore Mesoraca
> > > if a program tries to open a file, in a sticky directory,
> > > with the O_CREAT flag and without the O_EXCL, it probably has a bug.
> > > This feature allows to detect and potentially block programs that
> > > act this way, it can be used to find vulnerabilities (like those
> > > prevented by patch #1) and to do policy enforcement.
> 
> > I presume the 'vulnerabilities' are related to symlinks being created
> > just before the open?
> 
> That's one way to exploit them.  We already have a sysctl to try and
> block that specific attack, and we're considering adding more, for other
> attacks.  But we'd also like an easy way to learn of the vulnerabilities
> without anyone trying to exploit them yet, hence this patch.
> 
> > Trouble is this change breaks a lot of general use of /tmp.
> 
> That's general misuse of /tmp.  Things like "command > /tmp/file"
> without having pre-created the file with O_EXCL e.g. by mktemp(1).

I'm sorry, I've been using Unix for over 30 years.
/tmp is a place that temporary files were created - nothing special.
Traditionally it was emptied on every boot.
There was never anything that required files be created in any
specific way.

> > I always assumed that code that cared would use O_EXCL and
> > everything else wasn't worth subverting.
> 
> Opinions will vary on whether it's worth subverting or not, and someone
> may reasonably want to configure this differently on different systems.
> 
> > I found code in vi (and elsewhere) that subverted these checks
> > by opening with O_WRONLY if stat() showed the file existed and
> > O_CREAT|O_EXCL if it didn't.
> 
> Yes, such misuses of /tmp and the like by use of those programs - where
> the potential consequences would often be less severe (if your example
> above is correct, it sounds like it'd be data spoofing but not
> overwriting another file over a link) - could remain unnoticed.

It seemed to me that code had been added to avoid issues caused by
this policing of opens.
But the code added is itself racy - so makes the whole thing pointless.

> > I'm pretty sure that traditionally a lot of these opens were done
> > with O_CREAT|O_TRUNC.
> > Implementing that as unlink() followed by a create would stop
> > 'random' (ok all) symlinks being followed.
> 
> That's racy.  We have O_EXCL, and it must be used along with O_CREAT
> when the directory is untrusted.  (If it were only about symlinks, we
> also already have O_NOFOLLOW.)

Not racy if done in the kernel itself.

> > Overall I'm pretty sure this change will break things badly somewhere.
> 
> Sure.  We want it to visibly break what was subtly broken, so that the
> breakage can be seen and corrected without an attempted attack.

Maybe, but it will break some user system when they do a kernel update.
It won't necessarily break a developer's system first.

And, AFAICT, there is already code that subverts any tests by doing
a check for the file existing and then selecting between two different
types on open.

	David

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-12-01  9:46       ` David Laight
@ 2017-12-01 15:52         ` Alan Cox
  0 siblings, 0 replies; 26+ messages in thread
From: Alan Cox @ 2017-12-01 15:52 UTC (permalink / raw)
  To: David Laight
  Cc: 'Solar Designer', 'Salvatore Mesoraca',
	linux-kernel, Kernel Hardening, linux-fsdevel, Alexander Viro,
	Jann Horn, Kees Cook, Eric W. Biederman

> > That's general misuse of /tmp.  Things like "command > /tmp/file"
> > without having pre-created the file with O_EXCL e.g. by mktemp(1).  
> 
> I'm sorry, I've been using Unix for over 30 years.
> /tmp is a place that temporary files were created - nothing special.
> Traditionally it was emptied on every boot.
> There was never anything that required files be created in any
> specific way.

And in 1978 you had to boot single user and use nckeck and icheck to fix
the filesystem up by hand, you had no networking, no systemd, no
sysvinit, no ANSI C. no X11 ... (shall I go on...)

There are reasons it all changed. The origin of /tmp is a compromise of
security and disk performance made in the 1970s about an OS that was
quite different, running on a machine with typically 256K of RAM, no RAM
disks, a single very expensive fixed head drive and a larger moving head
one.

The existence of /tmp in that form today is a bizarre historic quirk.
Fortunately if you want a perfectly safe /tmp/ use namespaces and every
user can have their own private /tmp.

Alan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-11-30 16:30               ` [kernel-hardening] " Solar Designer
@ 2017-12-05 10:21                 ` Salvatore Mesoraca
  2017-12-07 21:47                   ` Solar Designer
  0 siblings, 1 reply; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-12-05 10:21 UTC (permalink / raw)
  To: Solar Designer
  Cc: Ian Campbell, David Laight, Alan Cox, linux-kernel,
	Kernel Hardening, linux-fsdevel, Alexander Viro, Jann Horn,
	Kees Cook, Eric W. Biederman, H. Peter Anvin, Karel Zak

2017-11-30 17:30 GMT+01:00 Solar Designer <solar@openwall.com>:
> Replying to Salvatore and Ian at once, and CC'ing H. Peter Anvin and
> Karel Zak for util-linux flock(1).
>
> On Thu, Nov 30, 2017 at 02:57:06PM +0000, Ian Campbell wrote:
> > On Thu, 2017-11-30 at 15:39 +0100, Salvatore Mesoraca wrote:
> > > 2017-11-27 1:26 GMT+01:00 Solar Designer <solar@openwall.com>:
> > > > Why would "shared lock files based on flock()" need O_CREAT without
> > > > O_EXCL?  Where specifically are they currently used that way?
> > >
> > > I don't think that they *need* to act like this, but this is how
> > > util-linux's flock(1) currently works.
>
> Oh, but you never referred to flock(1) in there, so I read flock() as
> implying flock(2).

That's true, I observed that behavior in flock(1), but I didn't specify
it was flock(1) because I thought that there may be some other users of
flock(2) that works that way.

> I think util-linux's flock(1) is relatively obscure,
> and as far as I can tell it isn't documented anywhere whether it may be
> used on untrusted lock file directories or not.  A revision of the
> flock(1) man page seen on RHEL7 gives really weird examples using /tmp.
> The current util-linux-2.31/sys-utils/flock.1 is similar.  strace'ing
> those examples, I see flock(1) literally uses the /tmp directory itself
> (not a file inside) as the lock, which surprisingly appears to work.
> Checking the util-linux tree, it looks like this was added as a feature.
> I suppose the good news is this doesn't appear to allow for worse than a
> DoS against the script using flock(1) (since a different user can also
> take that lock), unless any of the upper level directories are also
> untrusted.  The man page as seen at https://linux.die.net/man/1/flock
> currently only lists a more complex example with /var/lock/mylockfile,
> where flock(1) itself is asked to access it via a pre-opened fd.
> Permissions on /var/lock vary between distros (e.g., a RHEL6'ish system
> I just checked has it as 775 root:lock, whereas a RHEL7'ish has it as
> symlink to ../run/lock, which is 755 root:root).  Anyway, these are just
> examples.  The reality is people will use flock(1) in various directories,
> some of them trusted and some not.  If our proposed policy can detect and
> optionally break some uses in untrusted directories, that's good since
> such uses are currently unsafe (see below).
>
> > > And it doesn't seem an unreasonable behavior from their perspective,
> >
> > I thought that too, specifically I reasoned that using O_EXCL would
> > defeat the purpose of the _shared_ flock(), wouldn't it?
>
> No.  O_EXCL means the attempt to O_CREAT will fail, at which point the
> program can retry without both of these flags.  (The actual lock is
> obtained later via flock(2) or fcntl(2) anyway.)  In fact, flock(1)
> already does something similar, as tested using the very first example
> from the man page on a RHEL7'ish system:
>
> $ strace flock /tmp -c cat
> [...]
> open("/tmp", O_RDONLY|O_CREAT|O_NOCTTY, 0666) = -1 EISDIR (Is a directory)
> open("/tmp", O_RDONLY|O_NOCTTY)         = 3
> flock(3, LOCK_EX)                       = 0
>
> As you can see, when O_CREAT failed, the program retried without that
> flag.  It could just as easily have tried O_CREAT|O_EXCL, then dropped
> both at once.  Testing with a lock file (rather than lock directory):
>
> $ strace flock /tmp/lockfile -c cat
> [...]
> open("/tmp/lockfile", O_RDONLY|O_CREAT|O_NOCTTY, 0666) = 3
> flock(3, LOCK_EX)                       = 0
>
> This use of flock(1) would be a worse vulnerability, not limited to DoS
> against the script itself but also allowing for privilege escalation
> unless (sym)link restrictions are enabled in the kernel.  Adding O_EXCL
> would help (reduce the impact back to DoS against itself), and would
> require that the retry logic (like what is seen in the lock directory
> example above) be present.
>
> On Thu, Nov 30, 2017 at 03:39:48PM +0100, Salvatore Mesoraca wrote:
> > so maybe other programs do that too.
>
> Maybe, and they would be similarly vulnerable if they do that in
> untrusted directories, and they would also need to be fixed.
>
> A subtle case is when something like this is technically done in a
> directory writable by someone else, but is not necessarily crossing
> what the program's author or the sysadmin recognize as a privilege
> boundary.  Maybe they have accepted that this one pseudo-user can
> attack that other one anyway, such as because under a certain daemon's
> privsep model the would-be attacking pseudo-user is necessarily more
> privileged anyway.  This is why we shouldn't by default extend this
> policy to all directories writable by someone else, but instead we
> consider introducing a sysctl with varying policy levels - initially
> focusing on sticky directories writable by someone else and only later
> optionally extending to non-sticky directories writable by someone else.
> The latter mode would have even greater focus on development and
> debugging rather than on production use, although I imagine that
> specific systems could eventually afford it in production as well.
>
> > I was citing that case just to make it clear that, if someone gets
> > a warning because of flock(1), they shouldn't be worried about it.
>
> Checking the util-linux-2.31/sys-utils/flock.c, I see that it in fact
> doesn't retry without O_CREAT in the non-directory case.  That would
> need to be corrected, along with introduction of O_EXCL.
>
> > That behavior can be certainly avoided, but of course it isn't a
> > security problem per se.
>
> I think it is a security problem per se, except in the "subtle case"
> above, and it's great that our proposed policy would catch it.

I agree on the DoS, though, at first, I didn't consider it a "bug" because
there isn't any open mode that can prevent the DoS in this case.
If you want to avoid it, you must avoid other-users-writable directories
at all. So, It think that, if you are using a sticky directory, it's
intended behavior to let someone else "lock" you.
But maybe many flock(1) users are not aware of the issue and so, sometimes,
it can be unintended.
I didn't consider privilege escalation as an issue because I always
looked at flock(1) under the assumption that the lockfile is never actually
read or modified in any way and so it shouldn't make too much difference if
it's an already existing regular file or a symlink etc.
Am I missing something?

Salvatore

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-12-05 10:21                 ` Salvatore Mesoraca
@ 2017-12-07 21:47                   ` Solar Designer
  2017-12-11 12:08                     ` Salvatore Mesoraca
  0 siblings, 1 reply; 26+ messages in thread
From: Solar Designer @ 2017-12-07 21:47 UTC (permalink / raw)
  To: Salvatore Mesoraca
  Cc: Ian Campbell, David Laight, Alan Cox, linux-kernel,
	Kernel Hardening, linux-fsdevel, Alexander Viro, Jann Horn,
	Kees Cook, Eric W. Biederman, H. Peter Anvin, Karel Zak

On Tue, Dec 05, 2017 at 11:21:00AM +0100, Salvatore Mesoraca wrote:
> 2017-11-30 17:30 GMT+01:00 Solar Designer <solar@openwall.com>:
> > $ strace flock /tmp/lockfile -c cat
> > [...]
> > open("/tmp/lockfile", O_RDONLY|O_CREAT|O_NOCTTY, 0666) = 3
> > flock(3, LOCK_EX)                       = 0
> >
> > This use of flock(1) would be a worse vulnerability, not limited to DoS
> > against the script itself but also allowing for privilege escalation
> > unless (sym)link restrictions are enabled in the kernel.  Adding O_EXCL
> > would help (reduce the impact back to DoS against itself), and would
> > require that the retry logic (like what is seen in the lock directory
> > example above) be present.

> > > That behavior can be certainly avoided, but of course it isn't a
> > > security problem per se.
> >
> > I think it is a security problem per se, except in the "subtle case"
> > above, and it's great that our proposed policy would catch it.
> 
> I agree on the DoS, though, at first, I didn't consider it a "bug" because
> there isn't any open mode that can prevent the DoS in this case.
> If you want to avoid it, you must avoid other-users-writable directories
> at all. So, It think that, if you are using a sticky directory, it's
> intended behavior to let someone else "lock" you.

Right.  There's a worse DoS I had overlooked, though: flock(1) can also
be made to create and/or lock another file (maybe in another directory).
Perhaps adding O_NOFOLLOW (alongside flock(1)'s existing use of
O_NOCTTY) would be a good idea, even though uses in a directory writable
by someone else are inherently risky anyway.

> But maybe many flock(1) users are not aware of the issue and so, sometimes,
> it can be unintended.
> I didn't consider privilege escalation as an issue because I always
> looked at flock(1) under the assumption that the lockfile is never actually
> read or modified in any way and so it shouldn't make too much difference if
> it's an already existing regular file or a symlink etc.
> Am I missing something?

You made a good point, but yes: O_CREAT will follow a dangling symlink
and there are cases where creating an empty file of an attacker-chosen
pathname may allow for privilege escalation.  For example, crontab(1)
man page on RHEL7 says regarding /etc/cron.allow and /etc/cron.deny:

"If neither of these files exists, only the super user is allowed to
use cron."

In that configuration, simply creating empty /etc/cron.deny grants
access to crontab(1) to all users.  As user:

$ crontab -l
You (solar) are not allowed to use this program (crontab)
See crontab(1) for more information
$ ln -s /etc/cron.deny /tmp/lockfile

As root:

# sysctl -w fs.protected_symlinks=0
fs.protected_symlinks = 0
# flock /tmp/lockfile -c echo

As user:

$ crontab -l
no crontab for solar

There may also be side-effects on open of device files (the best known
example is rewinding a tape), and we won't avoid those by retrying
without O_CREAT|O_EXCL.  O_NOFOLLOW will help against symlinks to device
files, but not against hard links (if on same device).  The kernel's
symlink and hardlink protections help, but in this sub-thread we were
discussing detecting userspace software issues without waiting for an
attack.  Things like this fit David Laight's point well: programs trying
to make risky (mis)uses less risky sometimes also avoid being detected
by our proposed policy.  That's life.

Alexander

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [kernel-hardening] [PATCH v3 2/2] Protected O_CREAT open in sticky directories
  2017-12-07 21:47                   ` Solar Designer
@ 2017-12-11 12:08                     ` Salvatore Mesoraca
  0 siblings, 0 replies; 26+ messages in thread
From: Salvatore Mesoraca @ 2017-12-11 12:08 UTC (permalink / raw)
  To: Solar Designer
  Cc: Ian Campbell, David Laight, Alan Cox, linux-kernel,
	Kernel Hardening, linux-fsdevel, Alexander Viro, Jann Horn,
	Kees Cook, Eric W. Biederman, H. Peter Anvin, Karel Zak

2017-12-07 22:47 GMT+01:00 Solar Designer <solar@openwall.com>:
> On Tue, Dec 05, 2017 at 11:21:00AM +0100, Salvatore Mesoraca wrote:
> > 2017-11-30 17:30 GMT+01:00 Solar Designer <solar@openwall.com>:
> > > $ strace flock /tmp/lockfile -c cat
> > > [...]
> > > open("/tmp/lockfile", O_RDONLY|O_CREAT|O_NOCTTY, 0666) = 3
> > > flock(3, LOCK_EX)                       = 0
> > >
> > > This use of flock(1) would be a worse vulnerability, not limited to DoS
> > > against the script itself but also allowing for privilege escalation
> > > unless (sym)link restrictions are enabled in the kernel.  Adding O_EXCL
> > > would help (reduce the impact back to DoS against itself), and would
> > > require that the retry logic (like what is seen in the lock directory
> > > example above) be present.
>
> > > > That behavior can be certainly avoided, but of course it isn't a
> > > > security problem per se.
> > >
> > > I think it is a security problem per se, except in the "subtle case"
> > > above, and it's great that our proposed policy would catch it.
> >
> > I agree on the DoS, though, at first, I didn't consider it a "bug" because
> > there isn't any open mode that can prevent the DoS in this case.
> > If you want to avoid it, you must avoid other-users-writable directories
> > at all. So, It think that, if you are using a sticky directory, it's
> > intended behavior to let someone else "lock" you.
>
> Right.  There's a worse DoS I had overlooked, though: flock(1) can also
> be made to create and/or lock another file (maybe in another directory).
> Perhaps adding O_NOFOLLOW (alongside flock(1)'s existing use of
> O_NOCTTY) would be a good idea, even though uses in a directory writable
> by someone else are inherently risky anyway.
>
> > But maybe many flock(1) users are not aware of the issue and so, sometimes,
> > it can be unintended.
> > I didn't consider privilege escalation as an issue because I always
> > looked at flock(1) under the assumption that the lockfile is never actually
> > read or modified in any way and so it shouldn't make too much difference if
> > it's an already existing regular file or a symlink etc.
> > Am I missing something?
>
> You made a good point, but yes: O_CREAT will follow a dangling symlink
> and there are cases where creating an empty file of an attacker-chosen
> pathname may allow for privilege escalation.  For example, crontab(1)
> man page on RHEL7 says regarding /etc/cron.allow and /etc/cron.deny:
>
> "If neither of these files exists, only the super user is allowed to
> use cron."
>
> In that configuration, simply creating empty /etc/cron.deny grants
> access to crontab(1) to all users.  As user:
>
> $ crontab -l
> You (solar) are not allowed to use this program (crontab)
> See crontab(1) for more information
> $ ln -s /etc/cron.deny /tmp/lockfile
>
> As root:
>
> # sysctl -w fs.protected_symlinks=0
> fs.protected_symlinks = 0
> # flock /tmp/lockfile -c echo
>
> As user:
>
> $ crontab -l
> no crontab for solar

Ah, right. I didn't think of it.

> There may also be side-effects on open of device files (the best known
> example is rewinding a tape), and we won't avoid those by retrying
> without O_CREAT|O_EXCL.  O_NOFOLLOW will help against symlinks to device
> files, but not against hard links (if on same device).  The kernel's
> symlink and hardlink protections help, but in this sub-thread we were
> discussing detecting userspace software issues without waiting for an
> attack.  Things like this fit David Laight's point well: programs trying
> to make risky (mis)uses less risky sometimes also avoid being detected
> by our proposed policy.  That's life.

Yes, unfortunately some bad behaviors will go unnoticed. But many others won't,
so I thinks this is still a useful feature to have.

Salvatore

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2017-12-11 12:09 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-22  8:01 [PATCH v3 0/2] Restrict dangerous open in sticky directories Salvatore Mesoraca
2017-11-22  8:01 ` [PATCH v3 1/2] Protected FIFOs and regular files Salvatore Mesoraca
2017-11-23 22:43   ` [kernel-hardening] " Tobin C. Harding
2017-11-24  8:24     ` Salvatore Mesoraca
2017-11-22  8:01 ` [PATCH v3 2/2] Protected O_CREAT open in sticky directories Salvatore Mesoraca
2017-11-22 13:22   ` Matthew Wilcox
2017-11-24  8:29     ` Salvatore Mesoraca
2017-11-22 16:51   ` Alan Cox
2017-11-24  8:31     ` Salvatore Mesoraca
2017-11-24 10:53     ` David Laight
2017-11-24 11:43       ` Salvatore Mesoraca
2017-11-24 11:53         ` David Laight
2017-11-26 11:29           ` Salvatore Mesoraca
2017-11-27  0:26         ` Solar Designer
2017-11-30 14:39           ` Salvatore Mesoraca
2017-11-30 14:57             ` [kernel-hardening] " Ian Campbell
2017-11-30 16:30               ` [kernel-hardening] " Solar Designer
2017-12-05 10:21                 ` Salvatore Mesoraca
2017-12-07 21:47                   ` Solar Designer
2017-12-11 12:08                     ` Salvatore Mesoraca
2017-11-23 22:57   ` Tobin C. Harding
2017-11-24  8:34     ` Salvatore Mesoraca
2017-11-30 16:53   ` David Laight
2017-11-30 17:51     ` Solar Designer
2017-12-01  9:46       ` David Laight
2017-12-01 15:52         ` Alan Cox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).