linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] net: race condition in ipv6 forwarding and disable_ipv6 parameters
@ 2012-01-16 20:40 Francesco Ruggeri
  2012-01-17 17:44 ` David Miller
  0 siblings, 1 reply; 3+ messages in thread
From: Francesco Ruggeri @ 2012-01-16 20:40 UTC (permalink / raw)
  To: netdev, David Miller; +Cc: linux-kernel, Francesco Ruggeri

From: Francesco Ruggeri <fruggeri@aristanetworks.com>

There is a race condition in addrconf_sysctl_forward() and
addrconf_sysctl_disable().
These functions change idev->cnf.forwarding (resp. idev->cnf.disable_ipv6)
and then try to grab the rtnl lock before performing any actions.
If that fails they restore the original value and restart the syscall.
This creates race conditions if ipv6 code tries to access
these parameters, or if multiple instances try to do the same operation.
As an example of the former, if __ipv6_ifa_notify() finds a 0 in
idev->cnf.forwarding when invoked by addrconf_ifdown() it may not free
anycast addresses, ultimately resulting in the net_device not being freed.
This patch reads the user parameters into a temporary location and only
writes the actual parameters when the rtnl lock is acquired.
Tested in 2.6.38.8.
Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
---
diff -up a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
--- a/net/ipv6/addrconf.c	2011-03-14 18:20:32.000000000 -0700
+++ b/net/ipv6/addrconf.c	2012-01-07 16:48:54.146386450 -0800
@@ -502,29 +502,31 @@ static void addrconf_forward_change(stru
 	rcu_read_unlock();
 }

-static int addrconf_fixup_forwarding(struct ctl_table *table, int *p, int old)
+static int addrconf_fixup_forwarding(struct ctl_table *table, int *p, int newf)
 {
 	struct net *net;
+	int old;
+
+	if (!rtnl_trylock())
+		return restart_syscall();

 	net = (struct net *)table->extra2;
-	if (p == &net->ipv6.devconf_dflt->forwarding)
-		return 0;
+	old = *p;
+	*p = newf;

-	if (!rtnl_trylock()) {
-		/* Restore the original values before restarting */
-		*p = old;
-		return restart_syscall();
+	if (p == &net->ipv6.devconf_dflt->forwarding) {
+		rtnl_unlock();
+		return 0;
 	}

 	if (p == &net->ipv6.devconf_all->forwarding) {
-		__s32 newf = net->ipv6.devconf_all->forwarding;
 		net->ipv6.devconf_dflt->forwarding = newf;
 		addrconf_forward_change(net, newf);
-	} else if ((!*p) ^ (!old))
+	} else if ((!newf) ^ (!old))
 		dev_forward_change((struct inet6_dev *)table->extra1);
 	rtnl_unlock();

-	if (*p)
+	if (newf)
 		rt6_purge_dflt_routers(net);
 	return 1;
 }
@@ -4257,9 +4259,17 @@ int addrconf_sysctl_forward(ctl_table *c
 	int *valp = ctl->data;
 	int val = *valp;
 	loff_t pos = *ppos;
+	ctl_table lctl;
 	int ret;

-	ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
+	/*
+	 * ctl->data points to idev->cnf.forwarding, we should
+	 * not modify it until we get the rtnl lock.
+	 */
+	lctl = *ctl;
+	lctl.data = &val;
+
+	ret = proc_dointvec(&lctl, write, buffer, lenp, ppos);

 	if (write)
 		ret = addrconf_fixup_forwarding(ctl, valp, val);
@@ -4297,26 +4307,27 @@ static void addrconf_disable_change(stru
 	rcu_read_unlock();
 }

-static int addrconf_disable_ipv6(struct ctl_table *table, int *p, int old)
+static int addrconf_disable_ipv6(struct ctl_table *table, int *p, int newf)
 {
 	struct net *net;
+	int old;
+
+	if (!rtnl_trylock())
+		return restart_syscall();

 	net = (struct net *)table->extra2;
+	old = *p;
+	*p = newf;

-	if (p == &net->ipv6.devconf_dflt->disable_ipv6)
+	if (p == &net->ipv6.devconf_dflt->disable_ipv6) {
+		rtnl_unlock();
 		return 0;
-
-	if (!rtnl_trylock()) {
-		/* Restore the original values before restarting */
-		*p = old;
-		return restart_syscall();
 	}

 	if (p == &net->ipv6.devconf_all->disable_ipv6) {
-		__s32 newf = net->ipv6.devconf_all->disable_ipv6;
 		net->ipv6.devconf_dflt->disable_ipv6 = newf;
 		addrconf_disable_change(net, newf);
-	} else if ((!*p) ^ (!old))
+	} else if ((!newf) ^ (!old))
 		dev_disable_change((struct inet6_dev *)table->extra1);

 	rtnl_unlock();
@@ -4330,9 +4341,17 @@ int addrconf_sysctl_disable(ctl_table *c
 	int *valp = ctl->data;
 	int val = *valp;
 	loff_t pos = *ppos;
+	ctl_table lctl;
 	int ret;

-	ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
+	/*
+	 * ctl->data points to idev->cnf.disable_ipv6, we should
+	 * not modify it until we get the rtnl lock.
+	 */
+	lctl = *ctl;
+	lctl.data = &val;
+
+	ret = proc_dointvec(&lctl, write, buffer, lenp, ppos);

 	if (write)
 		ret = addrconf_disable_ipv6(ctl, valp, val);

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/1] net: race condition in ipv6 forwarding and disable_ipv6 parameters
  2012-01-16 20:40 [PATCH 1/1] net: race condition in ipv6 forwarding and disable_ipv6 parameters Francesco Ruggeri
@ 2012-01-17 17:44 ` David Miller
  2012-01-17 22:58   ` Francesco Ruggeri
  0 siblings, 1 reply; 3+ messages in thread
From: David Miller @ 2012-01-17 17:44 UTC (permalink / raw)
  To: fruggeri; +Cc: netdev, linux-kernel

From: Francesco Ruggeri <fruggeri@aristanetworks.com>
Date: Mon, 16 Jan 2012 12:40:10 -0800

> -static int addrconf_fixup_forwarding(struct ctl_table *table, int *p, int old)
> +static int addrconf_fixup_forwarding(struct ctl_table *table, int *p, int newf)
 ...
> @@ -4257,9 +4259,17 @@ int addrconf_sysctl_forward(ctl_table *c
>  	int *valp = ctl->data;
>  	int val = *valp;
>  	loff_t pos = *ppos;
> +	ctl_table lctl;
>  	int ret;
> 
> -	ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
> +	/*
> +	 * ctl->data points to idev->cnf.forwarding, we should
> +	 * not modify it until we get the rtnl lock.
> +	 */
> +	lctl = *ctl;
> +	lctl.data = &val;
> +
> +	ret = proc_dointvec(&lctl, write, buffer, lenp, ppos);
> 
>  	if (write)
>  		ret = addrconf_fixup_forwarding(ctl, valp, val);

I don't understand this at all.

"val" is still the old value, before proc_dointvec() runs, after your
changes.  So renaming the argument to addrconf_fixup_forwarding() as
"newf" and treating it as the new value doesn't make any sense at all.

Did you really test this patch?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/1] net: race condition in ipv6 forwarding and disable_ipv6 parameters
  2012-01-17 17:44 ` David Miller
@ 2012-01-17 22:58   ` Francesco Ruggeri
  0 siblings, 0 replies; 3+ messages in thread
From: Francesco Ruggeri @ 2012-01-17 22:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

On Tue, Jan 17, 2012 at 9:44 AM, David Miller <davem@davemloft.net> wrote:
> From: Francesco Ruggeri <fruggeri@aristanetworks.com>
> Date: Mon, 16 Jan 2012 12:40:10 -0800
>
>> -static int addrconf_fixup_forwarding(struct ctl_table *table, int *p, int old)
>> +static int addrconf_fixup_forwarding(struct ctl_table *table, int *p, int newf)
>  ...
>> @@ -4257,9 +4259,17 @@ int addrconf_sysctl_forward(ctl_table *c
>>       int *valp = ctl->data;
>>       int val = *valp;
>>       loff_t pos = *ppos;
>> +     ctl_table lctl;
>>       int ret;
>>
>> -     ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
>> +     /*
>> +      * ctl->data points to idev->cnf.forwarding, we should
>> +      * not modify it until we get the rtnl lock.
>> +      */
>> +     lctl = *ctl;
>> +     lctl.data = &val;
>> +
>> +     ret = proc_dointvec(&lctl, write, buffer, lenp, ppos);
>>
>>       if (write)
>>               ret = addrconf_fixup_forwarding(ctl, valp, val);
>
> I don't understand this at all.
>
> "val" is still the old value, before proc_dointvec() runs, after your
> changes.  So renaming the argument to addrconf_fixup_forwarding() as
> "newf" and treating it as the new value doesn't make any sense at all.
>
> Did you really test this patch?

Hi David,
thanks for reviewing the patch.
Yes, I did test it. We have been running this patch on 60 servers for
a week without problems, whereas we would see problems within the
first 24 hours without the patch.

val does contain the old value before proc_dointvec is invoked (that
is needed for reads), but it contains the new value after
proc_dointvec is invoked if "write" is non
zero.

proc_dointvec is defined as:
proc_dointvec(struct ctl_table *table, ...
In case of a write, proc_dointvec modifies the location pointed to by
table->data. In case of a read it just reads it.

The change here is that instead of using the original ctl structure I
make a copy of it (lctl) and then change lctl.data to point to "val"
on the stack.
In this way proc_dointvec does not use the original structure, and in
case of a write it only modifies "val" on the stack. In case of a read
it still reads the original value (which as you point out was saved in
"val = *valp").

In both the new and the old code, valp points to the location that has
to be changed (in the inet6_dev structure).
With the old code, the old value would be saved in "val",
proc_dointvec would change *valp, and val would be passed to
addrconf_fixup_forwarding as the old value (so that it could be
restored).
With the new code, proc_dointvec modifies "val", which is used by
addrconf_fixup_forwarding to modify *valp only after it has acquired
the lock.

Let me know if this clarifies things.
Thanks,
Francesco

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-01-17 22:58 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-16 20:40 [PATCH 1/1] net: race condition in ipv6 forwarding and disable_ipv6 parameters Francesco Ruggeri
2012-01-17 17:44 ` David Miller
2012-01-17 22:58   ` Francesco Ruggeri

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).