linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	socketpair@gmail.com,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Willy Tarreau <w@1wt.eu>, Al Viro <viro@zeniv.linux.org.uk>,
	Luis Henriques <luis.henriques@canonical.com>,
	Chas Williams <3chas3@gmail.com>
Subject: [PATCH 3.14 19/29] pipe: limit the per-user amount of pages allocated in pipes
Date: Wed, 22 Jun 2016 15:37:27 -0700	[thread overview]
Message-ID: <20160622223531.550826220@linuxfoundation.org> (raw)
In-Reply-To: <20160622223530.496939726@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------


From: Willy Tarreau <w@1wt.eu>

commit 759c01142a5d0f364a462346168a56de28a80f52 upstream.

On no-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that are never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limit are controlled by two new sysctls : pipe-user-pages-soft, and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (eg: for splicing).

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Chas Williams <3chas3@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/sysctl/fs.txt |   23 +++++++++++++++++++++
 fs/pipe.c                   |   47 ++++++++++++++++++++++++++++++++++++++++++--
 include/linux/pipe_fs_i.h   |    4 +++
 include/linux/sched.h       |    1 
 kernel/sysctl.c             |   14 +++++++++++++
 5 files changed, 87 insertions(+), 2 deletions(-)

--- a/Documentation/sysctl/fs.txt
+++ b/Documentation/sysctl/fs.txt
@@ -32,6 +32,8 @@ Currently, these files are in /proc/sys/
 - nr_open
 - overflowuid
 - overflowgid
+- pipe-user-pages-hard
+- pipe-user-pages-soft
 - protected_hardlinks
 - protected_symlinks
 - suid_dumpable
@@ -159,6 +161,27 @@ The default is 65534.
 
 ==============================================================
 
+pipe-user-pages-hard:
+
+Maximum total number of pages a non-privileged user may allocate for pipes.
+Once this limit is reached, no new pipes may be allocated until usage goes
+below the limit again. When set to 0, no limit is applied, which is the default
+setting.
+
+==============================================================
+
+pipe-user-pages-soft:
+
+Maximum total number of pages a non-privileged user may allocate for pipes
+before the pipe size gets limited to a single page. Once this limit is reached,
+new pipes will be limited to a single page in size for this user in order to
+limit total memory usage, and trying to increase them using fcntl() will be
+denied until usage goes below the limit again. The default value allows to
+allocate up to 1024 pipes at their default size. When set to 0, no limit is
+applied.
+
+==============================================================
+
 protected_hardlinks:
 
 A long-standing class of security issues is the hardlink-based
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -39,6 +39,12 @@ unsigned int pipe_max_size = 1048576;
  */
 unsigned int pipe_min_size = PAGE_SIZE;
 
+/* Maximum allocatable pages per user. Hard limit is unset by default, soft
+ * matches default values.
+ */
+unsigned long pipe_user_pages_hard;
+unsigned long pipe_user_pages_soft = PIPE_DEF_BUFFERS * INR_OPEN_CUR;
+
 /*
  * We use a start+len construction, which provides full use of the 
  * allocated memory.
@@ -795,20 +801,49 @@ pipe_fasync(int fd, struct file *filp, i
 	return retval;
 }
 
+static void account_pipe_buffers(struct pipe_inode_info *pipe,
+                                 unsigned long old, unsigned long new)
+{
+	atomic_long_add(new - old, &pipe->user->pipe_bufs);
+}
+
+static bool too_many_pipe_buffers_soft(struct user_struct *user)
+{
+	return pipe_user_pages_soft &&
+	       atomic_long_read(&user->pipe_bufs) >= pipe_user_pages_soft;
+}
+
+static bool too_many_pipe_buffers_hard(struct user_struct *user)
+{
+	return pipe_user_pages_hard &&
+	       atomic_long_read(&user->pipe_bufs) >= pipe_user_pages_hard;
+}
+
 struct pipe_inode_info *alloc_pipe_info(void)
 {
 	struct pipe_inode_info *pipe;
 
 	pipe = kzalloc(sizeof(struct pipe_inode_info), GFP_KERNEL);
 	if (pipe) {
-		pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * PIPE_DEF_BUFFERS, GFP_KERNEL);
+		unsigned long pipe_bufs = PIPE_DEF_BUFFERS;
+		struct user_struct *user = get_current_user();
+
+		if (!too_many_pipe_buffers_hard(user)) {
+			if (too_many_pipe_buffers_soft(user))
+				pipe_bufs = 1;
+			pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * pipe_bufs, GFP_KERNEL);
+		}
+
 		if (pipe->bufs) {
 			init_waitqueue_head(&pipe->wait);
 			pipe->r_counter = pipe->w_counter = 1;
-			pipe->buffers = PIPE_DEF_BUFFERS;
+			pipe->buffers = pipe_bufs;
+			pipe->user = user;
+			account_pipe_buffers(pipe, 0, pipe_bufs);
 			mutex_init(&pipe->mutex);
 			return pipe;
 		}
+		free_uid(user);
 		kfree(pipe);
 	}
 
@@ -819,6 +854,8 @@ void free_pipe_info(struct pipe_inode_in
 {
 	int i;
 
+	account_pipe_buffers(pipe, pipe->buffers, 0);
+	free_uid(pipe->user);
 	for (i = 0; i < pipe->buffers; i++) {
 		struct pipe_buffer *buf = pipe->bufs + i;
 		if (buf->ops)
@@ -1209,6 +1246,7 @@ static long pipe_set_size(struct pipe_in
 			memcpy(bufs + head, pipe->bufs, tail * sizeof(struct pipe_buffer));
 	}
 
+	account_pipe_buffers(pipe, pipe->buffers, nr_pages);
 	pipe->curbuf = 0;
 	kfree(pipe->bufs);
 	pipe->bufs = bufs;
@@ -1280,6 +1318,11 @@ long pipe_fcntl(struct file *file, unsig
 		if (!capable(CAP_SYS_RESOURCE) && size > pipe_max_size) {
 			ret = -EPERM;
 			goto out;
+		} else if ((too_many_pipe_buffers_hard(pipe->user) ||
+			    too_many_pipe_buffers_soft(pipe->user)) &&
+		           !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) {
+			ret = -EPERM;
+			goto out;
 		}
 		ret = pipe_set_size(pipe, nr_pages);
 		break;
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -42,6 +42,7 @@ struct pipe_buffer {
  *	@fasync_readers: reader side fasync
  *	@fasync_writers: writer side fasync
  *	@bufs: the circular array of pipe buffers
+ *	@user: the user who created this pipe
  **/
 struct pipe_inode_info {
 	struct mutex mutex;
@@ -57,6 +58,7 @@ struct pipe_inode_info {
 	struct fasync_struct *fasync_readers;
 	struct fasync_struct *fasync_writers;
 	struct pipe_buffer *bufs;
+	struct user_struct *user;
 };
 
 /*
@@ -140,6 +142,8 @@ void pipe_unlock(struct pipe_inode_info
 void pipe_double_lock(struct pipe_inode_info *, struct pipe_inode_info *);
 
 extern unsigned int pipe_max_size, pipe_min_size;
+extern unsigned long pipe_user_pages_hard;
+extern unsigned long pipe_user_pages_soft;
 int pipe_proc_fn(struct ctl_table *, int, void __user *, size_t *, loff_t *);
 
 
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -756,6 +756,7 @@ struct user_struct {
 #endif
 	unsigned long locked_shm; /* How many pages of mlocked shm ? */
 	unsigned long unix_inflight;	/* How many files in flight in unix sockets */
+	atomic_long_t pipe_bufs;  /* how many pages are allocated in pipe buffers */
 
 #ifdef CONFIG_KEYS
 	struct key *uid_keyring;	/* UID specific keyring */
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1668,6 +1668,20 @@ static struct ctl_table fs_table[] = {
 		.proc_handler	= &pipe_proc_fn,
 		.extra1		= &pipe_min_size,
 	},
+	{
+		.procname	= "pipe-user-pages-hard",
+		.data		= &pipe_user_pages_hard,
+		.maxlen		= sizeof(pipe_user_pages_hard),
+		.mode		= 0644,
+		.proc_handler	= proc_doulongvec_minmax,
+	},
+	{
+		.procname	= "pipe-user-pages-soft",
+		.data		= &pipe_user_pages_soft,
+		.maxlen		= sizeof(pipe_user_pages_soft),
+		.mode		= 0644,
+		.proc_handler	= proc_doulongvec_minmax,
+	},
 	{ }
 };
 

  parent reply	other threads:[~2016-06-22 23:27 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-22 22:37 [PATCH 3.14 00/29] 3.14.73-stable review Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 01/29] netlink: Fix dump skb leak/double free Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 02/29] sfc: on MC reset, clear PIO buffer linkage in TXQs Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 03/29] tcp: record TLP and ER timer stats in v6 stats Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 05/29] ARM: fix PTRACE_SETVFPREGS on SMP systems Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 06/29] crypto: ccp - Fix AES XTS error for request sizes above 4096 Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 07/29] powerpc: Fix definition of SIAR and SDAR registers Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 08/29] powerpc: Use privileged SPR number for MMCR2 Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 09/29] parisc: Fix pagefault crash in unaligned __get_user() call Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 10/29] ecryptfs: forbid opening files without mmap handler Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 11/29] wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 12/29] fix d_walk()/non-delayed __d_free() race Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 13/29] MIPS: Fix 64k page support for 32 bit kernels Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 14/29] powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 15/29] netfilter: x_tables: validate e->target_offset early Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 16/29] netfilter: x_tables: make sure e->next_offset covers remaining blob size Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 17/29] netfilter: x_tables: fix unconditional helper Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 18/29] xfs: fix up backport error in fs/xfs/xfs_inode.c Greg Kroah-Hartman
2016-06-22 22:37 ` Greg Kroah-Hartman [this message]
2016-06-22 22:37 ` [PATCH 3.14 20/29] netfilter: x_tables: dont move to non-existent next rule Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 21/29] netfilter: x_tables: validate targets of jumps Greg Kroah-Hartman
2016-06-23  8:54   ` Florian Westphal
2016-06-23  9:13     ` Florian Westphal
2016-06-24  2:46       ` Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 22/29] netfilter: x_tables: add and use xt_check_entry_offsets Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 23/29] netfilter: x_tables: kill check_entry helper Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 24/29] netfilter: x_tables: assert minimum target size Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 25/29] netfilter: x_tables: add compat version of xt_check_entry_offsets Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 26/29] netfilter: x_tables: check standard target size too Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 27/29] netfilter: x_tables: check for bogus target offset Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 28/29] netfilter: x_tables: validate all offsets and sizes in a rule Greg Kroah-Hartman
2016-06-22 22:37 ` [PATCH 3.14 29/29] netfilter: x_tables: dont reject valid target size on some architectures Greg Kroah-Hartman
2016-06-23  4:54 ` [PATCH 3.14 00/29] 3.14.73-stable review -rc2 Greg Kroah-Hartman
2016-06-23 19:41   ` Guenter Roeck
2016-06-23 21:54   ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160622223531.550826220@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=3chas3@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luis.henriques@canonical.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=socketpair@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=w@1wt.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).