[RESEND,v3,2/2] sysctl: handle overflow for file-max

Message ID 20190107222700.15954-3-christian@brauner.io
State In Next
Commit 852b69d69afb76a9b2b4b0c01aea18515a15808b
Series
  • sysctl: handle overflow for file-max

Commit Message

Christian Brauner Jan. 7, 2019, 10:27 p.m. UTC
Currently, when writing

echo 18446744073709551616 > /proc/sys/fs/file-max

/proc/sys/fs/file-max overflows and is set to 0, which quickly crashes
the system.

This commit sets the min and max value for file-max and returns -EINVAL
when a written value exceeds the range of a long int. Higher values
cannot currently be used because the percpu counters are long ints and
not unsigned integers. This behavior also aligns with other tunables that
return -EINVAL when their range is exceeded. See e.g. [1], [2] and others.

[1]: fb910c42cceb ("sysctl: check for UINT_MAX before unsigned int min/max")
[2]: 196851bed522 ("s390/topology: correct topology mode proc handler")
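
For illustration only, here is a minimal userspace sketch of the failure
described above. It is not the kernel's parser: parse_wrapping(),
store_checked() and MODEL_LONG_MAX are hypothetical names, and 64-bit
unsigned arithmetic is assumed. It shows that accumulating the digits of
18446744073709551616 without overflow detection wraps to 0, and that a
range check against LONG_MAX rejects values above it with -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LONG_MAX on a 64-bit system. */
#define MODEL_LONG_MAX ((uint64_t)INT64_MAX)

/*
 * Accumulate decimal digits with no overflow detection.
 * Unsigned arithmetic wraps modulo 2^64, so 2^64 itself parses as 0.
 */
static uint64_t parse_wrapping(const char *s)
{
	uint64_t val = 0;

	assert(s != NULL);
	while (*s >= '0' && *s <= '9')
		val = val * 10 + (uint64_t)(*s++ - '0');
	return val;
}

/*
 * Store val only if it lies within [min, max].
 * On failure *dst is left untouched and -EINVAL is returned.
 */
static int store_checked(uint64_t val, uint64_t min, uint64_t max,
			 uint64_t *dst)
{
	assert(dst != NULL);
	if (val < min || val > max)
		return -EINVAL;
	*dst = val;
	return 0;
}
```

In this model the wrapped value 0 lies inside [0, MODEL_LONG_MAX], so the
range check alone would not reject it. The wrap itself has to be detected
while parsing.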

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian@brauner.io>
---
v3:
- unchanged

v2:
- consistently fail on overflow

v1:
- if the max value is less than ULONG_MAX, use max as the upper bound
- (Dominik) remove double "the" from commit message
---
 kernel/sysctl.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Dominik Brodowski Jan. 8, 2019, 7:01 a.m. UTC | #1
On Mon, Jan 07, 2019 at 11:27:00PM +0100, Christian Brauner wrote:
> @@ -2833,6 +2836,10 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
>  				break;
>  			if (neg)
>  				continue;
> +			if ((max && val > *max) || (min && val < *min)) {
> +				err = -EINVAL;
> +				break;
> +			}
>  			val = convmul * val / convdiv;
>  			if ((min && val < *min) || (max && val > *max))
>  				continue;

This is a generic change which affects all users of
do_proc_doulongvec_minmax() that have extra1 or extra2 set. In sysctl.c, I
do not see another user of proc_doulongvec_minmax() that has extra1 or
extra2 set. However, have you verified whether your patch changes the
behaviour for other files that make use of proc_doulongvec_minmax() or
proc_doulongvec_ms_jiffies_minmax(), and not only of the file-max sysctl?

Thanks,
	Dominik
Christian Brauner Jan. 10, 2019, 2:50 p.m. UTC | #2
On Tue, Jan 08, 2019 at 08:01:10AM +0100, Dominik Brodowski wrote:
> On Mon, Jan 07, 2019 at 11:27:00PM +0100, Christian Brauner wrote:
> > @@ -2833,6 +2836,10 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
> >  				break;
> >  			if (neg)
> >  				continue;
> > +			if ((max && val > *max) || (min && val < *min)) {
> > +				err = -EINVAL;
> > +				break;
> > +			}
> >  			val = convmul * val / convdiv;
> >  			if ((min && val < *min) || (max && val > *max))
> >  				continue;
> 
> This is a generic change which affects all users of
> do_proc_doulongvec_minmax() that have extra1 or extra2 set. In sysctl.c, I
> do not see another user of proc_doulongvec_minmax() that has extra1 or
> extra2 set. However, have you verified whether your patch changes the
> behaviour for other files that make use of proc_doulongvec_minmax() or
> proc_doulongvec_ms_jiffies_minmax(), and not only of the file-max sysctl?

Sorry for the delayed reply. I did look at the callers. The functions
that are of interest afaict are:

proc_doulongvec_ms_jiffies_minmax
proc_doulongvec_minmax

So this could be visible when users write values that would overflow the
type used in the kernel.
I guess your point is whether we are venturing into userspace break
territory. Hm... We should probably make sure that we're not regressing
anyone else! What do you think if instead of the above patch we did:

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ba4d9e85feb8..37727b4c7a97 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1721,7 +1721,7 @@ static struct ctl_table fs_table[] = {
 		.data		= &files_stat.max_files,
 		.maxlen		= sizeof(files_stat.max_files),
 		.mode		= 0644,
-		.proc_handler	= proc_doulongvec_minmax,
+		.proc_handler	= proc_file_max,
 	},
 	{
 		.procname	= "nr_open",
@@ -2758,7 +2758,7 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
 				     void __user *buffer,
 				     size_t *lenp, loff_t *ppos,
 				     unsigned long convmul,
-				     unsigned long convdiv)
+				     unsigned long convdiv, bool strict)
 {
 	unsigned long *i, *min, *max;
 	int vleft, first = 1, err = 0;
@@ -2806,7 +2806,12 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
 				continue;
 			val = convmul * val / convdiv;
 			if ((min && val < *min) || (max && val > *max))
-				continue;
+				if (strict) {
+					err = -EINVAL;
+					break;
+				} else {
+					continue;
+				}
 			*i = val;
 		} else {
 			val = convdiv * (*i) / convmul;
@@ -2843,7 +2848,15 @@ static int do_proc_doulongvec_minmax(struct ctl_table *table, int write,
 				     unsigned long convdiv)
 {
 	return __do_proc_doulongvec_minmax(table->data, table, write,
-			buffer, lenp, ppos, convmul, convdiv);
+			buffer, lenp, ppos, convmul, convdiv, false);
+}
+
+static int proc_file_max(struct ctl_table *table, int write,
+			 void __user *buffer, size_t *lenp, loff_t *ppos,
+			 unsigned long convmul, unsigned long convdiv)
+{
+	return __do_proc_doulongvec_minmax(table->data, table, write, buffer,
+					   lenp, ppos, convmul, convdiv, true);
 }
 
 /**
@@ -2865,7 +2878,8 @@ static int do_proc_doulongvec_minmax(struct ctl_table *table, int write,
 int proc_doulongvec_minmax(struct ctl_table *table, int write,
 			   void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-    return do_proc_doulongvec_minmax(table, write, buffer, lenp, ppos, 1l, 1l);
+	return do_proc_doulongvec_minmax(table, write, buffer, lenp, ppos, 1l,
+					 1l, false);
 }
 
 /**
@@ -2890,7 +2904,7 @@ int proc_doulongvec_ms_jiffies_minmax(struct ctl_table *table, int write,
 				      size_t *lenp, loff_t *ppos)
 {
     return do_proc_doulongvec_minmax(table, write, buffer,
-				     lenp, ppos, HZ, 1000l);
+				     lenp, ppos, HZ, 1000l, false);
 }
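
To make the difference between the two modes concrete, here is a small
userspace model of the write loop. It is a sketch under assumptions, not
kernel code: write_vec() is a hypothetical name, and conversion and
parsing are left out. With strict == false an out-of-range element is
skipped silently and its slot keeps the old value. With strict == true
the write stops with -EINVAL, and elements stored before the failure
remain stored.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of a vector write with an optional strict mode.
 * Returns 0 on success, or -EINVAL in strict mode when an element
 * falls outside [min, max].
 */
static int write_vec(unsigned long *dst, const unsigned long *in, size_t n,
		     unsigned long min, unsigned long max, bool strict)
{
	assert(dst != NULL && in != NULL);

	for (size_t k = 0; k < n; k++) {
		if (in[k] < min || in[k] > max) {
			if (strict)
				return -EINVAL;	/* abort the whole write */
			continue;		/* skip, keep the old value */
		}
		dst[k] = in[k];
	}
	return 0;
}
```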
Dominik Brodowski Jan. 10, 2019, 2:55 p.m. UTC | #3
On Thu, Jan 10, 2019 at 03:50:05PM +0100, Christian Brauner wrote:
> On Tue, Jan 08, 2019 at 08:01:10AM +0100, Dominik Brodowski wrote:
> > On Mon, Jan 07, 2019 at 11:27:00PM +0100, Christian Brauner wrote:
> > > @@ -2833,6 +2836,10 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
> > >  				break;
> > >  			if (neg)
> > >  				continue;
> > > +			if ((max && val > *max) || (min && val < *min)) {
> > > +				err = -EINVAL;
> > > +				break;
> > > +			}
> > >  			val = convmul * val / convdiv;
> > >  			if ((min && val < *min) || (max && val > *max))
> > >  				continue;
> > 
> > This is a generic change which affects all users of
> > do_proc_doulongvec_minmax() that have extra1 or extra2 set. In sysctl.c, I
> > do not see another user of proc_doulongvec_minmax() that has extra1 or
> > extra2 set. However, have you verified whether your patch changes the
> > behaviour for other files that make use of proc_doulongvec_minmax() or
> > proc_doulongvec_ms_jiffies_minmax(), and not only of the file-max sysctl?
> 
> Sorry for the delayed reply. I did look at the callers. The functions
> that are of interest afaict are:
> 
> proc_doulongvec_ms_jiffies_minmax
> proc_doulongvec_minmax
> 
> So this could be visible when users write values that would overflow the
> type used in the kernel.
>
> I guess your point is whether we are venturing into userspace break
> territory. Hm... We should probably make sure that we're not regressing
> anyone else! What do you think if instead of the above patch we did:

Hm, I prefer the original patch -- as the same (valid) reasons which apply
for the file-max sysctl might also apply to other users of this function
where extra1 and/or extra2 are set.

If there are no other users of this function where extra1 or extra2 are set,
just add a comment in the commit message:

	While this changes the behaviour of __do_proc_doulongvec_minmax(),
	no other existing users in the kernel are affected by this change.

If there are other users of this function where extra1 or extra2 are set,
you would need to generalize the commit message overall.

Thanks,
	Dominik
Christian Brauner Jan. 10, 2019, 3 p.m. UTC | #4
On Thu, Jan 10, 2019 at 03:55:59PM +0100, Dominik Brodowski wrote:
> On Thu, Jan 10, 2019 at 03:50:05PM +0100, Christian Brauner wrote:
> > On Tue, Jan 08, 2019 at 08:01:10AM +0100, Dominik Brodowski wrote:
> > > On Mon, Jan 07, 2019 at 11:27:00PM +0100, Christian Brauner wrote:
> > > > @@ -2833,6 +2836,10 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
> > > >  				break;
> > > >  			if (neg)
> > > >  				continue;
> > > > +			if ((max && val > *max) || (min && val < *min)) {
> > > > +				err = -EINVAL;
> > > > +				break;
> > > > +			}
> > > >  			val = convmul * val / convdiv;
> > > >  			if ((min && val < *min) || (max && val > *max))
> > > >  				continue;
> > > 
> > > This is a generic change which affects all users of
> > > do_proc_doulongvec_minmax() that have extra1 or extra2 set. In sysctl.c, I
> > > do not see another user of proc_doulongvec_minmax() that has extra1 or
> > > extra2 set. However, have you verified whether your patch changes the
> > > behaviour for other files that make use of proc_doulongvec_minmax() or
> > > proc_doulongvec_ms_jiffies_minmax(), and not only of the file-max sysctl?
> > 
> > Sorry for the delayed reply. I did look at the callers. The functions
> > that are of interest afaict are:
> > 
> > proc_doulongvec_ms_jiffies_minmax
> > proc_doulongvec_minmax
> > 
> > So this could be visible when users write values that would overflow the
> > type used in the kernel.
> >
> > I guess your point is whether we are venturing into userspace break
> > territory. Hm... We should probably make sure that we're not regressing
> > anyone else! What do you think if instead of the above patch we did:
> 
> Hm, I prefer the original patch -- as the same (valid) reasons which apply
> for the file-max sysctl might also apply to other users of this function
> where extra1 and/or extra2 are set.

In that case we should error out on:

val = convmul * val / convdiv;
if ((min && val < *min) || (max && val > *max)) {
        err = -EINVAL;
        break;
}

I fear that erroring out before the conversion might break *_jiffies,
since it is the only caller that requests a convmul/convdiv value that
is not 1l.
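
A rough userspace sketch of that concern follows. Assumptions: MODEL_HZ is
a hypothetical tick rate of 250, the bounds are expressed in jiffies, and
all function names are made up for the example. The raw input is in
milliseconds, so a check before the conversion compares milliseconds
against jiffies and can reject a value whose converted form is in range.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical tick rate; the real HZ depends on the kernel config. */
#define MODEL_HZ 250UL

/* Same arithmetic as the conversion line quoted above. */
static unsigned long convert(unsigned long val, unsigned long convmul,
			     unsigned long convdiv)
{
	assert(convdiv != 0);
	return convmul * val / convdiv;
}

/* Range check on the raw (unconverted) input. */
static int check_before(unsigned long raw, unsigned long min,
			unsigned long max)
{
	return (raw < min || raw > max) ? -EINVAL : 0;
}

/* Range check on the converted value, in the unit the bounds use. */
static int check_after(unsigned long raw, unsigned long min,
		       unsigned long max, unsigned long convmul,
		       unsigned long convdiv)
{
	unsigned long val = convert(raw, convmul, convdiv);

	return (val < min || val > max) ? -EINVAL : 0;
}
```

With max = 2000 jiffies, writing 4000 ms converts to 1000 jiffies under
these assumptions: check_after() accepts it, while check_before() rejects
it. With convmul == convdiv == 1 the two checks agree.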

Christian
Christian Brauner Jan. 11, 2019, 2:51 p.m. UTC | #5
On Thu, Jan 10, 2019 at 03:55:59PM +0100, Dominik Brodowski wrote:
> On Thu, Jan 10, 2019 at 03:50:05PM +0100, Christian Brauner wrote:
> > On Tue, Jan 08, 2019 at 08:01:10AM +0100, Dominik Brodowski wrote:
> > > On Mon, Jan 07, 2019 at 11:27:00PM +0100, Christian Brauner wrote:
> > > > @@ -2833,6 +2836,10 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
> > > >  				break;
> > > >  			if (neg)
> > > >  				continue;
> > > > +			if ((max && val > *max) || (min && val < *min)) {
> > > > +				err = -EINVAL;
> > > > +				break;
> > > > +			}
> > > >  			val = convmul * val / convdiv;
> > > >  			if ((min && val < *min) || (max && val > *max))
> > > >  				continue;
> > > 
> > > This is a generic change which affects all users of
> > > do_proc_doulongvec_minmax() that have extra1 or extra2 set. In sysctl.c, I
> > > do not see another user of proc_doulongvec_minmax() that has extra1 or
> > > extra2 set. However, have you verified whether your patch changes the
> > > behaviour for other files that make use of proc_doulongvec_minmax() or
> > > proc_doulongvec_ms_jiffies_minmax(), and not only of the file-max sysctl?
> > 
> > Sorry for the delayed reply. I did look at the callers. The functions
> > that are of interest afaict are:
> > 
> > proc_doulongvec_ms_jiffies_minmax
> > proc_doulongvec_minmax
> > 
> > So this could be visible when users write values that would overflow the
> > type used in the kernel.
> >
> > I guess your point is whether we are venturing into userspace break
> > territory. Hm... We should probably make sure that we're not regressing
> > anyone else! What do you think if instead of the above patch we did:
> 
> Hm, I prefer the original patch -- as the same (valid) reasons which apply
> for the file-max sysctl might also apply to other users of this function
> where extra1 and/or extra2 are set.
> 
> If there are no other users of this function where extra1 or extra2 are set,
> just add a comment in the commit message:
> 
> 	While this changes the behaviour of __do_proc_doulongvec_minmax(),
> 	no other existing users in the kernel are affected by this change.
> 
> If there are other users of this function where extra1 or extra2 are set,
> you would need to generalize the commit message overall.

Andrew, can you please drop this patch

[RESEND PATCH v3 2/2] sysctl: handle overflow for file-max

from your tree? It should be located at [1], from what I can gather.
I'll resend it based on Dominik's observation and will generalize the
commit message and also error out *after* the conversion has been done
and not before.
The first patch 1/2 is correct and can be kept.

Thanks!
Christian

[1]: https://www.ozlabs.org/~akpm/mmots/broken-out/sysctl-handle-overflow-for-file-max.patch

Patch

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 06df9ef138e3..11378a59af4b 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -129,6 +129,7 @@  static int __maybe_unused one = 1;
 static int __maybe_unused two = 2;
 static int __maybe_unused four = 4;
 static unsigned long one_ul = 1;
+static unsigned long long_max = LONG_MAX;
 static int one_hundred = 100;
 static int one_thousand = 1000;
 #ifdef CONFIG_PRINTK
@@ -1717,6 +1718,8 @@  static struct ctl_table fs_table[] = {
 		.maxlen		= sizeof(files_stat.max_files),
 		.mode		= 0644,
 		.proc_handler	= proc_doulongvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &long_max,
 	},
 	{
 		.procname	= "nr_open",
@@ -2833,6 +2836,10 @@  static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
 				break;
 			if (neg)
 				continue;
+			if ((max && val > *max) || (min && val < *min)) {
+				err = -EINVAL;
+				break;
+			}
 			val = convmul * val / convdiv;
 			if ((min && val < *min) || (max && val > *max))
 				continue;