linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] sysctl: handle overflow for file-max
@ 2018-10-16 22:33 Christian Brauner
  2018-10-16 22:33 ` [PATCH v3 1/2] sysctl: handle overflow in proc_get_long Christian Brauner
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Christian Brauner @ 2018-10-16 22:33 UTC (permalink / raw)
  To: keescook, linux-kernel
  Cc: ebiederm, mcgrof, akpm, joe.lawrence, longman, linux, viro,
	adobriyan, linux-api, Christian Brauner

Hey,

Here is v3 of this patchset. Changelogs are in the individual commits.

Currently, when writing

echo 18446744073709551616 > /proc/sys/fs/file-max

/proc/sys/fs/file-max will overflow and be set to 0. That quickly
crashes the system.

The first version of this patch intended to detect the overflow and cap
at ULONG_MAX. However, we should not do this and rather return EINVAL on
overflow. The reasons are:
- this aligns with other sysctl handlers that simply reject overflows
  (cf. [1], [2], and a bunch of others)
- we already do a partial fail on overflow right now
  Namely, when the TMPBUFLEN is exceeded. So we already reject values
  such as 184467440737095516160 (21 chars) but accept values such as
  18446744073709551616 (20 chars) but both are overflows. So we should
  just always reject 64bit overflows and not special-case this based on
  the number of chars.

(This patchset is in reference to https://lkml.org/lkml/2018/10/11/585.)

Thanks!
Christian

[1]: fb910c42cceb ("sysctl: check for UINT_MAX before unsigned int min/max")
[2]: 196851bed522 ("s390/topology: correct topology mode proc handler")

Christian Brauner (2):
  sysctl: handle overflow in proc_get_long
  sysctl: handle overflow for file-max

 kernel/sysctl.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3 1/2] sysctl: handle overflow in proc_get_long
  2018-10-16 22:33 [PATCH v3 0/2] sysctl: handle overflow for file-max Christian Brauner
@ 2018-10-16 22:33 ` Christian Brauner
  2018-10-16 23:45   ` Eric W. Biederman
  2018-10-16 22:33 ` [PATCH v3 2/2] sysctl: handle overflow for file-max Christian Brauner
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2018-10-16 22:33 UTC (permalink / raw)
  To: keescook, linux-kernel
  Cc: ebiederm, mcgrof, akpm, joe.lawrence, longman, linux, viro,
	adobriyan, linux-api, Christian Brauner

proc_get_long() is a funny function. It uses simple_strtoul() and for a
good reason. proc_get_long() wants to always succeed the parse and return
the maybe incorrect value and the trailing characters to check against a
pre-defined list of acceptable trailing values.
However, simple_strtoul() explicitly ignores overflows which can cause
funny things like the following to happen:

echo 18446744073709551616 > /proc/sys/fs/file-max
cat /proc/sys/fs/file-max
0

(Which will cause your system to silently die behind your back.)

On the other hand kstrtoul() does do overflow detection but does not return
the trailing characters, and also fails the parse when anything other than
'\n' is a trailing character whereas proc_get_long() wants to be more
lenient.

Now, before adding another kstrtoul() function let's simply add a static
parse strtoul_lenient() which:
- fails on overflow with -ERANGE
- returns the trailing characters to the caller

The reason why we should fail on ERANGE is that we already do a partial
fail on overflow right now. Namely, when the TMPBUFLEN is exceeded. So we
already reject values such as 184467440737095516160 (21 chars) but accept
values such as 18446744073709551616 (20 chars) but both are overflows. So
we should just always reject 64bit overflows and not special-case this
based on the number of chars.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
v2->v3:
- (Kees) s/#include <../lib/kstrtox.h>/#include "../lib/kstrtox.h"/g
- (Kees) document strtoul_lenient()

v1->v2:
- s/sysctl_cap_erange/sysctl_lenient/g
- consistenly fail on overflow

v0->v1:
- s/sysctl_strtoul_lenient/strtoul_cap_erange/g
- (Al) remove bool overflow return argument from strtoul_cap_erange
- (Al) return ULONG_MAX on ERANGE from strtoul_cap_erange
- (Dominik) fix spelling in commit message
---
 kernel/sysctl.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index cc02050fd0c4..102aa7a65687 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -68,6 +68,8 @@
 #include <linux/mount.h>
 #include <linux/pipe_fs_i.h>
 
+#include "../lib/kstrtox.h"
+
 #include <linux/uaccess.h>
 #include <asm/processor.h>
 
@@ -2065,6 +2067,41 @@ static void proc_skip_char(char **buf, size_t *size, const char v)
 	}
 }
 
+/**
+ * strtoul_lenient - parse an ASCII formatted integer from a buffer and only
+ *                   fail on overflow
+ *
+ * @cp: kernel buffer containing the string to parse
+ * @endp: pointer to store the trailing characters
+ * @base: the base to use
+ * @res: where the parsed integer will be stored
+ *
+ * In case of success 0 is returned and @res will contain the parsed integer,
+ * @endp will hold any trailing characters.
+ * This function will fail the parse on overflow. If there wasn't an overflow
+ * the function will defer the decision what characters count as invalid to the
+ * caller.
+ */
+static int strtoul_lenient(const char *cp, char **endp, unsigned int base,
+			   unsigned long *res)
+{
+	unsigned long long result;
+	unsigned int rv;
+
+	cp = _parse_integer_fixup_radix(cp, &base);
+	rv = _parse_integer(cp, base, &result);
+	if ((rv & KSTRTOX_OVERFLOW) || (result != (unsigned long)result))
+		return -ERANGE;
+
+	cp += rv;
+
+	if (endp)
+		*endp = (char *)cp;
+
+	*res = (unsigned long)result;
+	return 0;
+}
+
 #define TMPBUFLEN 22
 /**
  * proc_get_long - reads an ASCII formatted integer from a user buffer
@@ -2108,7 +2145,8 @@ static int proc_get_long(char **buf, size_t *size,
 	if (!isdigit(*p))
 		return -EINVAL;
 
-	*val = simple_strtoul(p, &p, 0);
+	if (strtoul_lenient(p, &p, 0, val))
+		return -EINVAL;
 
 	len = p - tmp;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 2/2] sysctl: handle overflow for file-max
  2018-10-16 22:33 [PATCH v3 0/2] sysctl: handle overflow for file-max Christian Brauner
  2018-10-16 22:33 ` [PATCH v3 1/2] sysctl: handle overflow in proc_get_long Christian Brauner
@ 2018-10-16 22:33 ` Christian Brauner
  2018-10-17  0:35   ` Al Viro
  2018-10-16 22:36 ` [PATCH v3 0/2] " Kees Cook
  2018-10-29 14:58 ` Christian Brauner
  3 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2018-10-16 22:33 UTC (permalink / raw)
  To: keescook, linux-kernel
  Cc: ebiederm, mcgrof, akpm, joe.lawrence, longman, linux, viro,
	adobriyan, linux-api, Christian Brauner

Currently, when writing

echo 18446744073709551616 > /proc/sys/fs/file-max

/proc/sys/fs/file-max will overflow and be set to 0. That quickly
crashes the system.
This commit sets the max and min value for file-max and returns -EINVAL
when a long int is exceeded. Any higher value cannot currently be used as
the percpu counters are long ints and not unsigned integers. This behavior
also aligns with other tuneables that return -EINVAL when their range is
exceeded. See e.g. [1], [2] and others.

[1]: fb910c42cceb ("sysctl: check for UINT_MAX before unsigned int min/max")
[2]: 196851bed522 ("s390/topology: correct topology mode proc handler")

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
v2->v3:
- unchanged

v2->v1:
- consistenly fail on overflow

v0->v1:
- if max value is < than ULONG_MAX use max as upper bound
- (Dominik) remove double "the" from commit message
---
 kernel/sysctl.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 102aa7a65687..93456e3a90cd 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -128,6 +128,7 @@ static int __maybe_unused one = 1;
 static int __maybe_unused two = 2;
 static int __maybe_unused four = 4;
 static unsigned long one_ul = 1;
+static unsigned long long_max = LONG_MAX;
 static int one_hundred = 100;
 static int one_thousand = 1000;
 #ifdef CONFIG_PRINTK
@@ -1697,6 +1698,8 @@ static struct ctl_table fs_table[] = {
 		.maxlen		= sizeof(files_stat.max_files),
 		.mode		= 0644,
 		.proc_handler	= proc_doulongvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &long_max,
 	},
 	{
 		.procname	= "nr_open",
@@ -2813,6 +2816,10 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
 				break;
 			if (neg)
 				continue;
+			if ((max && val > *max) || (min && val < *min)) {
+				err = -EINVAL;
+				break;
+			}
 			val = convmul * val / convdiv;
 			if ((min && val < *min) || (max && val > *max))
 				continue;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 0/2] sysctl: handle overflow for file-max
  2018-10-16 22:33 [PATCH v3 0/2] sysctl: handle overflow for file-max Christian Brauner
  2018-10-16 22:33 ` [PATCH v3 1/2] sysctl: handle overflow in proc_get_long Christian Brauner
  2018-10-16 22:33 ` [PATCH v3 2/2] sysctl: handle overflow for file-max Christian Brauner
@ 2018-10-16 22:36 ` Kees Cook
  2018-10-29 14:58 ` Christian Brauner
  3 siblings, 0 replies; 14+ messages in thread
From: Kees Cook @ 2018-10-16 22:36 UTC (permalink / raw)
  To: Christian Brauner, Andrew Morton
  Cc: LKML, Eric W. Biederman, Luis R. Rodriguez, Joe Lawrence,
	Waiman Long, Dominik Brodowski, Al Viro, Alexey Dobriyan,
	Linux API

On Tue, Oct 16, 2018 at 3:33 PM, Christian Brauner <christian@brauner.io> wrote:
> Hey,
>
> Here is v3 of this patchset. Changelogs are in the individual commits.

Thanks! These look good. Andrew, can you take these?

-Kees

>
> Currently, when writing
>
> echo 18446744073709551616 > /proc/sys/fs/file-max
>
> /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> crashes the system.
>
> The first version of this patch intended to detect the overflow and cap
> at ULONG_MAX. However, we should not do this and rather return EINVAL on
> overflow. The reasons are:
> - this aligns with other sysctl handlers that simply reject overflows
>   (cf. [1], [2], and a bunch of others)
> - we already do a partial fail on overflow right now
>   Namely, when the TMPBUFLEN is exceeded. So we already reject values
>   such as 184467440737095516160 (21 chars) but accept values such as
>   18446744073709551616 (20 chars) but both are overflows. So we should
>   just always reject 64bit overflows and not special-case this based on
>   the number of chars.
>
> (This patchset is in reference to https://lkml.org/lkml/2018/10/11/585.)
>
> Thanks!
> Christian
>
> [1]: fb910c42cceb ("sysctl: check for UINT_MAX before unsigned int min/max")
> [2]: 196851bed522 ("s390/topology: correct topology mode proc handler")
>
> Christian Brauner (2):
>   sysctl: handle overflow in proc_get_long
>   sysctl: handle overflow for file-max
>
>  kernel/sysctl.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 46 insertions(+), 1 deletion(-)
>
> --
> 2.17.1
>



-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 1/2] sysctl: handle overflow in proc_get_long
  2018-10-16 22:33 ` [PATCH v3 1/2] sysctl: handle overflow in proc_get_long Christian Brauner
@ 2018-10-16 23:45   ` Eric W. Biederman
  2018-10-17  0:24     ` Christian Brauner
  0 siblings, 1 reply; 14+ messages in thread
From: Eric W. Biederman @ 2018-10-16 23:45 UTC (permalink / raw)
  To: Christian Brauner
  Cc: keescook, linux-kernel, mcgrof, akpm, joe.lawrence, longman,
	linux, viro, adobriyan, linux-api

Christian Brauner <christian@brauner.io> writes:

> proc_get_long() is a funny function. It uses simple_strtoul() and for a
> good reason. proc_get_long() wants to always succeed the parse and return
> the maybe incorrect value and the trailing characters to check against a
> pre-defined list of acceptable trailing values.
> However, simple_strtoul() explicitly ignores overflows which can cause
> funny things like the following to happen:
>
> echo 18446744073709551616 > /proc/sys/fs/file-max
> cat /proc/sys/fs/file-max
> 0
>
> (Which will cause your system to silently die behind your back.)
>
> On the other hand kstrtoul() does do overflow detection but does not return
> the trailing characters, and also fails the parse when anything other than
> '\n' is a trailing character whereas proc_get_long() wants to be more
> lenient.
>
> Now, before adding another kstrtoul() function let's simply add a static
> parse strtoul_lenient() which:
> - fails on overflow with -ERANGE
> - returns the trailing characters to the caller
>
> The reason why we should fail on ERANGE is that we already do a partial
> fail on overflow right now. Namely, when the TMPBUFLEN is exceeded. So we
> already reject values such as 184467440737095516160 (21 chars) but accept
> values such as 18446744073709551616 (20 chars) but both are overflows. So
> we should just always reject 64bit overflows and not special-case this
> based on the number of chars.
>
> Acked-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Christian Brauner <christian@brauner.io>
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> ---
> v2->v3:
> - (Kees) s/#include <../lib/kstrtox.h>/#include "../lib/kstrtox.h"/g
> - (Kees) document strtoul_lenient()
>
> v1->v2:
> - s/sysctl_cap_erange/sysctl_lenient/g
> - consistenly fail on overflow
>
> v0->v1:
> - s/sysctl_strtoul_lenient/strtoul_cap_erange/g
> - (Al) remove bool overflow return argument from strtoul_cap_erange
> - (Al) return ULONG_MAX on ERANGE from strtoul_cap_erange
> - (Dominik) fix spelling in commit message
> ---
>  kernel/sysctl.c | 40 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index cc02050fd0c4..102aa7a65687 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -68,6 +68,8 @@
>  #include <linux/mount.h>
>  #include <linux/pipe_fs_i.h>
>  
> +#include "../lib/kstrtox.h"
> +
>  #include <linux/uaccess.h>
>  #include <asm/processor.h>
>  
> @@ -2065,6 +2067,41 @@ static void proc_skip_char(char **buf, size_t *size, const char v)
>  	}
>  }
>  
> +/**
> + * strtoul_lenient - parse an ASCII formatted integer from a buffer and only
> + *                   fail on overflow
> + *
> + * @cp: kernel buffer containing the string to parse
> + * @endp: pointer to store the trailing characters
> + * @base: the base to use
> + * @res: where the parsed integer will be stored
> + *
> + * In case of success 0 is returned and @res will contain the parsed integer,
> + * @endp will hold any trailing characters.
> + * This function will fail the parse on overflow. If there wasn't an overflow
> + * the function will defer the decision what characters count as invalid to the
> + * caller.
> + */
> +static int strtoul_lenient(const char *cp, char **endp, unsigned int base,
> +			   unsigned long *res)
> +{
> +	unsigned long long result;
> +	unsigned int rv;
> +
> +	cp = _parse_integer_fixup_radix(cp, &base);
> +	rv = _parse_integer(cp, base, &result);
> +	if ((rv & KSTRTOX_OVERFLOW) || (result != (unsigned long)result))
> +		return -ERANGE;
> +
> +	cp += rv;
> +
> +	if (endp)
> +		*endp = (char *)cp;
> +
> +	*res = (unsigned long)result;
> +	return 0;
> +}
> +
>  #define TMPBUFLEN 22
>  /**
>   * proc_get_long - reads an ASCII formatted integer from a user buffer
> @@ -2108,7 +2145,8 @@ static int proc_get_long(char **buf, size_t *size,
>  	if (!isdigit(*p))
>  		return -EINVAL;
>  
> -	*val = simple_strtoul(p, &p, 0);
> +	if (strtoul_lenient(p, &p, 0, val))
> +		return -EINVAL;

Is it deliberate that on an error stroul_lenient returns -ERANGE but
then proc_get_long returns -EINVAL?  That feels wrong.  The write system
call does not permit -ERANGE or -EINVAL for the contents of the data
so both options appear equally bad from a standards point of view.

I am just wondering what the thinking is here.

>  	len = p - tmp;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 1/2] sysctl: handle overflow in proc_get_long
  2018-10-16 23:45   ` Eric W. Biederman
@ 2018-10-17  0:24     ` Christian Brauner
  2018-10-17  2:19       ` Kees Cook
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2018-10-17  0:24 UTC (permalink / raw)
  To: Eric W . Biederman
  Cc: Kees Cook, LKML, Luis R. Rodriguez, Andrew Morton, Joe Lawrence,
	Waiman Long, Dominik Brodowski, Al Viro, Alexey Dobriyan,
	Linux API

On Wed, Oct 17, 2018 at 1:46 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Christian Brauner <christian@brauner.io> writes:
>
> > proc_get_long() is a funny function. It uses simple_strtoul() and for a
> > good reason. proc_get_long() wants to always succeed the parse and return
> > the maybe incorrect value and the trailing characters to check against a
> > pre-defined list of acceptable trailing values.
> > However, simple_strtoul() explicitly ignores overflows which can cause
> > funny things like the following to happen:
> >
> > echo 18446744073709551616 > /proc/sys/fs/file-max
> > cat /proc/sys/fs/file-max
> > 0
> >
> > (Which will cause your system to silently die behind your back.)
> >
> > On the other hand kstrtoul() does do overflow detection but does not return
> > the trailing characters, and also fails the parse when anything other than
> > '\n' is a trailing character whereas proc_get_long() wants to be more
> > lenient.
> >
> > Now, before adding another kstrtoul() function let's simply add a static
> > parse strtoul_lenient() which:
> > - fails on overflow with -ERANGE
> > - returns the trailing characters to the caller
> >
> > The reason why we should fail on ERANGE is that we already do a partial
> > fail on overflow right now. Namely, when the TMPBUFLEN is exceeded. So we
> > already reject values such as 184467440737095516160 (21 chars) but accept
> > values such as 18446744073709551616 (20 chars) but both are overflows. So
> > we should just always reject 64bit overflows and not special-case this
> > based on the number of chars.
> >
> > Acked-by: Kees Cook <keescook@chromium.org>
> > Signed-off-by: Christian Brauner <christian@brauner.io>
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > ---
> > v2->v3:
> > - (Kees) s/#include <../lib/kstrtox.h>/#include "../lib/kstrtox.h"/g
> > - (Kees) document strtoul_lenient()
> >
> > v1->v2:
> > - s/sysctl_cap_erange/sysctl_lenient/g
> > - consistenly fail on overflow
> >
> > v0->v1:
> > - s/sysctl_strtoul_lenient/strtoul_cap_erange/g
> > - (Al) remove bool overflow return argument from strtoul_cap_erange
> > - (Al) return ULONG_MAX on ERANGE from strtoul_cap_erange
> > - (Dominik) fix spelling in commit message
> > ---
> >  kernel/sysctl.c | 40 +++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 39 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index cc02050fd0c4..102aa7a65687 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -68,6 +68,8 @@
> >  #include <linux/mount.h>
> >  #include <linux/pipe_fs_i.h>
> >
> > +#include "../lib/kstrtox.h"
> > +
> >  #include <linux/uaccess.h>
> >  #include <asm/processor.h>
> >
> > @@ -2065,6 +2067,41 @@ static void proc_skip_char(char **buf, size_t *size, const char v)
> >       }
> >  }
> >
> > +/**
> > + * strtoul_lenient - parse an ASCII formatted integer from a buffer and only
> > + *                   fail on overflow
> > + *
> > + * @cp: kernel buffer containing the string to parse
> > + * @endp: pointer to store the trailing characters
> > + * @base: the base to use
> > + * @res: where the parsed integer will be stored
> > + *
> > + * In case of success 0 is returned and @res will contain the parsed integer,
> > + * @endp will hold any trailing characters.
> > + * This function will fail the parse on overflow. If there wasn't an overflow
> > + * the function will defer the decision what characters count as invalid to the
> > + * caller.
> > + */
> > +static int strtoul_lenient(const char *cp, char **endp, unsigned int base,
> > +                        unsigned long *res)
> > +{
> > +     unsigned long long result;
> > +     unsigned int rv;
> > +
> > +     cp = _parse_integer_fixup_radix(cp, &base);
> > +     rv = _parse_integer(cp, base, &result);
> > +     if ((rv & KSTRTOX_OVERFLOW) || (result != (unsigned long)result))
> > +             return -ERANGE;
> > +
> > +     cp += rv;
> > +
> > +     if (endp)
> > +             *endp = (char *)cp;
> > +
> > +     *res = (unsigned long)result;
> > +     return 0;
> > +}
> > +
> >  #define TMPBUFLEN 22
> >  /**
> >   * proc_get_long - reads an ASCII formatted integer from a user buffer
> > @@ -2108,7 +2145,8 @@ static int proc_get_long(char **buf, size_t *size,
> >       if (!isdigit(*p))
> >               return -EINVAL;
> >
> > -     *val = simple_strtoul(p, &p, 0);
> > +     if (strtoul_lenient(p, &p, 0, val))
> > +             return -EINVAL;
>
> Is it deliberate that on an error stroul_lenient returns -ERANGE but
> then proc_get_long returns -EINVAL?  That feels wrong.  The write system

Yes, because all other integer parsing function return ERANGE as well.
Right now there are no other users but if someone wants to use that function
I wanted it to behave like the others.

> call does not permit -ERANGE or -EINVAL for the contents of the data
> so both options appear equally bad from a standards point of view.

Right, but if you write a value that exceeds the buffer of 22 chars that is used
to parse you already get EINVAL back on current kernels.
So it didn't feel like returning EOVERFLOW or ERANGE might make sense.
I saw a change in 4.10 or 4.11 as well that used EINVAL on UINT_MAX overflow
or something. EINVAL might be enough information for userspace here ?/.

>
> I am just wondering what the thinking is here.
>
> >       len = p - tmp;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/2] sysctl: handle overflow for file-max
  2018-10-16 22:33 ` [PATCH v3 2/2] sysctl: handle overflow for file-max Christian Brauner
@ 2018-10-17  0:35   ` Al Viro
  2018-10-17  9:57     ` Christian Brauner
  2018-10-18 21:58     ` Andrea Arcangeli
  0 siblings, 2 replies; 14+ messages in thread
From: Al Viro @ 2018-10-17  0:35 UTC (permalink / raw)
  To: Christian Brauner
  Cc: keescook, linux-kernel, ebiederm, mcgrof, akpm, joe.lawrence,
	longman, linux, adobriyan, linux-api, Andrea Arcangeli,
	Miklos Szeredi, Eric Dumazet

On Wed, Oct 17, 2018 at 12:33:22AM +0200, Christian Brauner wrote:
> Currently, when writing
> 
> echo 18446744073709551616 > /proc/sys/fs/file-max
> 
> /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> crashes the system.
> This commit sets the max and min value for file-max and returns -EINVAL
> when a long int is exceeded. Any higher value cannot currently be used as
> the percpu counters are long ints and not unsigned integers. This behavior
> also aligns with other tuneables that return -EINVAL when their range is
> exceeded. See e.g. [1], [2] and others.

Mostly sane, but...  get_max_files() users are bloody odd.  The one in
file-max limit reporting looks like a half-arsed attempt in "[PATCH] fix
file counting".  The one in af_unix.c, though...  I don't remember how
that check had come to be - IIRC that was a strange fallout of a thread
with me, Andrea and ANK involved, circa 1999, but I don't remember details;
Andrea, any memories?  It might be worth reconsidering...  The change in
question is in 2.2.4pre6; what do we use unix_nr_socks for?  We try to
limit the number of PF_UNIX socks by 2 * max_files, but max_files can be
huge *and* non-constant (i.e. it can decrease).  What's more, unix_tot_inflight
is unsigned int and max_files might exceed 2^31 just fine since "fs: allow
for more than 2^31 files" back in 2010...  Something's fishy there...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 1/2] sysctl: handle overflow in proc_get_long
  2018-10-17  0:24     ` Christian Brauner
@ 2018-10-17  2:19       ` Kees Cook
  0 siblings, 0 replies; 14+ messages in thread
From: Kees Cook @ 2018-10-17  2:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Eric W . Biederman, LKML, Luis R. Rodriguez, Andrew Morton,
	Joe Lawrence, Waiman Long, Dominik Brodowski, Al Viro,
	Alexey Dobriyan, Linux API

On Tue, Oct 16, 2018 at 5:24 PM, Christian Brauner <christian@brauner.io> wrote:
> Right, but if you write a value that exceeds the buffer of 22 chars that is used
> to parse you already get EINVAL back on current kernels.
> So it didn't feel like returning EOVERFLOW or ERANGE might make sense.
> I saw a change in 4.10 or 4.11 as well that used EINVAL on UINT_MAX overflow
> or something. EINVAL might be enough information for userspace here ?/.

I'd agree: I think there is more precedent for EINVAL.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/2] sysctl: handle overflow for file-max
  2018-10-17  0:35   ` Al Viro
@ 2018-10-17  9:57     ` Christian Brauner
  2018-10-18 21:58     ` Andrea Arcangeli
  1 sibling, 0 replies; 14+ messages in thread
From: Christian Brauner @ 2018-10-17  9:57 UTC (permalink / raw)
  To: Al Viro
  Cc: keescook, linux-kernel, ebiederm, mcgrof, akpm, joe.lawrence,
	longman, linux, adobriyan, linux-api, Andrea Arcangeli,
	Miklos Szeredi, Eric Dumazet

On Wed, Oct 17, 2018 at 01:35:48AM +0100, Al Viro wrote:
> On Wed, Oct 17, 2018 at 12:33:22AM +0200, Christian Brauner wrote:
> > Currently, when writing
> > 
> > echo 18446744073709551616 > /proc/sys/fs/file-max
> > 
> > /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> > crashes the system.
> > This commit sets the max and min value for file-max and returns -EINVAL
> > when a long int is exceeded. Any higher value cannot currently be used as
> > the percpu counters are long ints and not unsigned integers. This behavior
> > also aligns with other tuneables that return -EINVAL when their range is
> > exceeded. See e.g. [1], [2] and others.
> 
> Mostly sane, but...  get_max_files() users are bloody odd.  The one in
> file-max limit reporting looks like a half-arsed attempt in "[PATCH] fix
> file counting".  The one in af_unix.c, though...  I don't remember how
> that check had come to be - IIRC that was a strange fallout of a thread
> with me, Andrea and ANK involved, circa 1999, but I don't remember details;
> Andrea, any memories?  It might be worth reconsidering...  The change in
> question is in 2.2.4pre6; what do we use unix_nr_socks for?  We try to
> limit the number of PF_UNIX socks by 2 * max_files, but max_files can be

So that's something I mentioned to Kees before. It seems we should
either simply replace this check with:

        if ((atomic_long_read(&unix_nr_socks) >> 1) > get_max_files())
                goto out;

to protect against overflows or simply do

        if (atomic_long_read(&unix_nr_socks) > get_max_files())
                goto out;

> huge *and* non-constant (i.e. it can decrease).  What's more, unix_tot_inflight
> is unsigned int and max_files might exceed 2^31 just fine since "fs: allow
> for more than 2^31 files" back in 2010...  Something's fishy there...

What's more is that fs/file_table.c:files_maxfiles_init()
currently has:
void __init files_maxfiles_init(void)
{
        unsigned long n;
        unsigned long memreserve = (totalram_pages - nr_free_pages()) * 3/2;

        memreserve = min(memreserve, totalram_pages - 1);
        n = ((totalram_pages - memreserve) * (PAGE_SIZE / 1024)) / 10;

        files_stat.max_files = max_t(unsigned long, n, NR_FILE);
}

given that we currently can't handle more than LONG_MAX files should we
maybe cap here? Like:

diff --git a/fs/file_table.c b/fs/file_table.c
index e49af4caf15d..dd108b4c6d72 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -376,6 +376,8 @@ void __init files_init(void)
 /*
  * One file with associated inode and dcache is very roughly 1K. Per default
  * do not use more than 10% of our memory for files.
+ * The percpu counters only handle long ints so cap maximum number of
+ * files at LONG_MAX.
  */
 void __init files_maxfiles_init(void)
 {
@@ -386,4 +388,7 @@ void __init files_maxfiles_init(void)
        n = ((totalram_pages - memreserve) * (PAGE_SIZE / 1024)) / 10;

        files_stat.max_files = max_t(unsigned long, n, NR_FILE);
+
+       if (files_stat.max_files > LONG_MAX)
+               files_stat.max_files = LONG_MAX;
 }

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/2] sysctl: handle overflow for file-max
  2018-10-17  0:35   ` Al Viro
  2018-10-17  9:57     ` Christian Brauner
@ 2018-10-18 21:58     ` Andrea Arcangeli
  1 sibling, 0 replies; 14+ messages in thread
From: Andrea Arcangeli @ 2018-10-18 21:58 UTC (permalink / raw)
  To: Al Viro
  Cc: Christian Brauner, keescook, linux-kernel, ebiederm, mcgrof,
	akpm, joe.lawrence, longman, linux, adobriyan, linux-api,
	Miklos Szeredi, Eric Dumazet

Hi Al,

On Wed, Oct 17, 2018 at 01:35:48AM +0100, Al Viro wrote:
> On Wed, Oct 17, 2018 at 12:33:22AM +0200, Christian Brauner wrote:
> > Currently, when writing
> > 
> > echo 18446744073709551616 > /proc/sys/fs/file-max
> > 
> > /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> > crashes the system.
> > This commit sets the max and min value for file-max and returns -EINVAL
> > when a long int is exceeded. Any higher value cannot currently be used as
> > the percpu counters are long ints and not unsigned integers. This behavior
> > also aligns with other tuneables that return -EINVAL when their range is
> > exceeded. See e.g. [1], [2] and others.
> 
> Mostly sane, but...  get_max_files() users are bloody odd.  The one in
> file-max limit reporting looks like a half-arsed attempt in "[PATCH] fix
> file counting".  The one in af_unix.c, though...  I don't remember how
> that check had come to be - IIRC that was a strange fallout of a thread
> with me, Andrea and ANK involved, circa 1999, but I don't remember details;
> Andrea, any memories?  It might be worth reconsidering...  The change in
> question is in 2.2.4pre6; what do we use unix_nr_socks for?  We try to
> limit the number of PF_UNIX socks by 2 * max_files, but max_files can be
> huge *and* non-constant (i.e. it can decrease).  What's more, unix_tot_inflight
> is unsigned int and max_files might exceed 2^31 just fine since "fs: allow
> for more than 2^31 files" back in 2010...  Something's fishy there...

Feels like a lifetime ago :), but looking into I remembered some of
it. That thread was about some instability in unix sockets for an
unrelated bug in the garbage collector. While reviewing your fix, I
probably incidentally found a resource exhaustion problem in doing a
connect();close() loop on a listening stream af_unix. I found an
exploit somewhere in my home dated in 99 in ls -l. Then ANK found
another resource exhaustion by sending datagram sockets, which I also
found an exploit for in my home.

ANK pointed out that a connect syscall allocates two sockets, one to
be accepted by the listening process, the other is the connect
itself. That must be the explanation of the "*2".

The "max_files*2" is probably a patch was from you (which was not
overflowing back then), in attempt to fix the garbage collector issue
which initially looked like resource exhaustion.

I may have suggested to check sk_max_ack_backlog and fail connect() in
such case to solve the resource exhaustion, but my proposal was
obviously broken because connect() would return an error when the
backlog was full and I suppose I didn't implement anything like
unix_wait_for_peer. So I guess (not 100% sure) the get_max_files()*2
check stayed, not because of the bug in the garbage collector that was
fixed independently, but as a stop gap measure for the
connect();close() loop resource exhaustion.

I tried the exploit that does a connect();close() in a loop and it
gracefully hangs in unix_wait_for_peer() after sk_max_ack_backlog
connects.

Out of curiosity I tried also the dgram exploit and it hangs in
sock_alloc_send_pskb with sk_wmem_alloc_get(sk) < sk->sk_sndbuf
check. The file*2 limit couldn't have helped that one anyway.

If I set /proc/sys/net/core/somaxconn to 1000000 the exploit works
fine again and the connect;close loop again allocates infinite amount
of kernel RAM into a tiny RSS process and it triggered OOM (there was
no OOM killer in v2.2 I suppose). By default it's 128. There's also
sysctl_max_dgram_qlen for dgram that on Android is set to 600 (by
default 10).

I tend to think these resources are now capped by other means (notably
somaxconn, sysctl_max_dgram_qlen, sk_wmem_alloc_get) and unix_nr_socks
can be dropped. Or if that atomic counter is still needed it's not for
a trivial exploit anymore than just does listen(); SIGSTOP() from one
process and a connect();close() loop in another process. It'd require
more than a listening socket and heavily forking or a large increase
on the max number of file descriptors (a privileged op) to do a ton of
listens, but forking has its own memory footprint in userland too. At
the very least it should be a per-cpu counter synced to the atomic
global after a threshold.

The other reason for dropping is that it wasn't ideal that the trivial
exploit could still allocated max_files*2 SYN skbs with a loop of
connect;close, max_files*2 is too much already so I suppose it was
only a stop-gap measure to begin with.

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 0/2] sysctl: handle overflow for file-max
  2018-10-16 22:33 [PATCH v3 0/2] sysctl: handle overflow for file-max Christian Brauner
                   ` (2 preceding siblings ...)
  2018-10-16 22:36 ` [PATCH v3 0/2] " Kees Cook
@ 2018-10-29 14:58 ` Christian Brauner
  2018-10-29 21:44   ` Kees Cook
  3 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2018-10-29 14:58 UTC (permalink / raw)
  To: keescook, linux-kernel
  Cc: ebiederm, mcgrof, akpm, joe.lawrence, longman, linux, viro,
	adobriyan, linux-api

On Wed, Oct 17, 2018 at 12:33:20AM +0200, Christian Brauner wrote:
> Hey,
> 
> Here is v3 of this patchset. Changelogs are in the individual commits.
> 
> Currently, when writing
> 
> echo 18446744073709551616 > /proc/sys/fs/file-max
> 
> /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> crashes the system.
> 
> The first version of this patch intended to detect the overflow and cap
> at ULONG_MAX. However, we should not do this and rather return EINVAL on
> overflow. The reasons are:
> - this aligns with other sysctl handlers that simply reject overflows
>   (cf. [1], [2], and a bunch of others)
> - we already do a partial fail on overflow right now
>   Namely, when the TMPBUFLEN is exceeded. So we already reject values
>   such as 184467440737095516160 (21 chars) but accept values such as
>   18446744073709551616 (20 chars) but both are overflows. So we should
>   just always reject 64bit overflows and not special-case this based on
>   the number of chars.
> 
> (This patchset is in reference to https://lkml.org/lkml/2018/10/11/585.)

Just so that we don't forget, can we make sure that this gets picked
into linux-next? :)

Christian

> 
> Thanks!
> Christian
> 
> [1]: fb910c42cceb ("sysctl: check for UINT_MAX before unsigned int min/max")
> [2]: 196851bed522 ("s390/topology: correct topology mode proc handler")
> 
> Christian Brauner (2):
>   sysctl: handle overflow in proc_get_long
>   sysctl: handle overflow for file-max
> 
>  kernel/sysctl.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 46 insertions(+), 1 deletion(-)
> 
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 0/2] sysctl: handle overflow for file-max
  2018-10-29 14:58 ` Christian Brauner
@ 2018-10-29 21:44   ` Kees Cook
  2018-12-09 16:40     ` Christian Brauner
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2018-10-29 21:44 UTC (permalink / raw)
  To: Christian Brauner, Andrew Morton
  Cc: LKML, Eric W. Biederman, Luis R. Rodriguez, Joe Lawrence,
	Waiman Long, Dominik Brodowski, Al Viro, Alexey Dobriyan,
	Linux API

On Mon, Oct 29, 2018 at 7:58 AM, Christian Brauner <christian@brauner.io> wrote:
> On Wed, Oct 17, 2018 at 12:33:20AM +0200, Christian Brauner wrote:
>> Hey,
>>
>> Here is v3 of this patchset. Changelogs are in the individual commits.
>>
>> Currently, when writing
>>
>> echo 18446744073709551616 > /proc/sys/fs/file-max
>>
>> /proc/sys/fs/file-max will overflow and be set to 0. That quickly
>> crashes the system.
>>
>> The first version of this patch intended to detect the overflow and cap
>> at ULONG_MAX. However, we should not do this and rather return EINVAL on
>> overflow. The reasons are:
>> - this aligns with other sysctl handlers that simply reject overflows
>>   (cf. [1], [2], and a bunch of others)
>> - we already do a partial fail on overflow right now
>>   Namely, when the TMPBUFLEN is exceeded. So we already reject values
>>   such as 184467440737095516160 (21 chars) but accept values such as
>>   18446744073709551616 (20 chars) but both are overflows. So we should
>>   just always reject 64bit overflows and not special-case this based on
>>   the number of chars.
>>
>> (This patchset is in reference to https://lkml.org/lkml/2018/10/11/585.)
>
> Just so that we don't forget, can we make sure that this gets picked
> into linux-next? :)

I was hoping akpm would take this? Andrew, does the v3 look okay to you?

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 0/2] sysctl: handle overflow for file-max
  2018-10-29 21:44   ` Kees Cook
@ 2018-12-09 16:40     ` Christian Brauner
  2018-12-10 17:51       ` Kees Cook
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2018-12-09 16:40 UTC (permalink / raw)
  To: Kees Cook
  Cc: Christian Brauner, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, mcgrof, joe.lawrence, Waiman Long,
	Dominik Brodowski, Al Viro, Alexey Dobriyan, Linux API

On Mon, Oct 29, 2018 at 10:44 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, Oct 29, 2018 at 7:58 AM, Christian Brauner <christian@brauner.io> wrote:
> > On Wed, Oct 17, 2018 at 12:33:20AM +0200, Christian Brauner wrote:
> >> Hey,
> >>
> >> Here is v3 of this patchset. Changelogs are in the individual commits.
> >>
> >> Currently, when writing
> >>
> >> echo 18446744073709551616 > /proc/sys/fs/file-max
> >>
> >> /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> >> crashes the system.
> >>
> >> The first version of this patch intended to detect the overflow and cap
> >> at ULONG_MAX. However, we should not do this and rather return EINVAL on
> >> overflow. The reasons are:
> >> - this aligns with other sysctl handlers that simply reject overflows
> >>   (cf. [1], [2], and a bunch of others)
> >> - we already do a partial fail on overflow right now
> >>   Namely, when the TMPBUFLEN is exceeded. So we already reject values
> >>   such as 184467440737095516160 (21 chars) but accept values such as
> >>   18446744073709551616 (20 chars) but both are overflows. So we should
> >>   just always reject 64bit overflows and not special-case this based on
> >>   the number of chars.
> >>
> >> (This patchset is in reference to https://lkml.org/lkml/2018/10/11/585.)
> >
> > Just so that we don't forget, can we make sure that this gets picked
> > into linux-next? :)
>
> I was hoping akpm would take this? Andrew, does the v3 look okay to you?

gentle ping again :)

Christian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 0/2] sysctl: handle overflow for file-max
  2018-12-09 16:40     ` Christian Brauner
@ 2018-12-10 17:51       ` Kees Cook
  0 siblings, 0 replies; 14+ messages in thread
From: Kees Cook @ 2018-12-10 17:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christian Brauner, Christian Brauner, LKML, Eric W. Biederman,
	Luis R. Rodriguez, Joe Lawrence, Waiman Long, Dominik Brodowski,
	Al Viro, Alexey Dobriyan, Linux API

Hi Andrew,

Can you take this patch for -mm?

-Kees

On Sun, Dec 9, 2018 at 8:41 AM Christian Brauner
<christian.brauner@canonical.com> wrote:
>
> On Mon, Oct 29, 2018 at 10:44 PM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Mon, Oct 29, 2018 at 7:58 AM, Christian Brauner <christian@brauner.io> wrote:
> > > On Wed, Oct 17, 2018 at 12:33:20AM +0200, Christian Brauner wrote:
> > >> Hey,
> > >>
> > >> Here is v3 of this patchset. Changelogs are in the individual commits.
> > >>
> > >> Currently, when writing
> > >>
> > >> echo 18446744073709551616 > /proc/sys/fs/file-max
> > >>
> > >> /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> > >> crashes the system.
> > >>
> > >> The first version of this patch intended to detect the overflow and cap
> > >> at ULONG_MAX. However, we should not do this and rather return EINVAL on
> > >> overflow. The reasons are:
> > >> - this aligns with other sysctl handlers that simply reject overflows
> > >>   (cf. [1], [2], and a bunch of others)
> > >> - we already do a partial fail on overflow right now
> > >>   Namely, when the TMPBUFLEN is exceeded. So we already reject values
> > >>   such as 184467440737095516160 (21 chars) but accept values such as
> > >>   18446744073709551616 (20 chars) but both are overflows. So we should
> > >>   just always reject 64bit overflows and not special-case this based on
> > >>   the number of chars.
> > >>
> > >> (This patchset is in reference to https://lkml.org/lkml/2018/10/11/585.)
> > >
> > > Just so that we don't forget, can we make sure that this gets picked
> > > into linux-next? :)
> >
> > I was hoping akpm would take this? Andrew, does the v3 look okay to you?
>
> gentle ping again :)
>
> Christian



-- 
Kees Cook

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-12-10 17:51 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-16 22:33 [PATCH v3 0/2] sysctl: handle overflow for file-max Christian Brauner
2018-10-16 22:33 ` [PATCH v3 1/2] sysctl: handle overflow in proc_get_long Christian Brauner
2018-10-16 23:45   ` Eric W. Biederman
2018-10-17  0:24     ` Christian Brauner
2018-10-17  2:19       ` Kees Cook
2018-10-16 22:33 ` [PATCH v3 2/2] sysctl: handle overflow for file-max Christian Brauner
2018-10-17  0:35   ` Al Viro
2018-10-17  9:57     ` Christian Brauner
2018-10-18 21:58     ` Andrea Arcangeli
2018-10-16 22:36 ` [PATCH v3 0/2] " Kees Cook
2018-10-29 14:58 ` Christian Brauner
2018-10-29 21:44   ` Kees Cook
2018-12-09 16:40     ` Christian Brauner
2018-12-10 17:51       ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).