* [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
@ 2009-06-26 16:24 ` Serge E. Hallyn
  0 siblings, 0 replies; 10+ messages in thread
From: Serge E. Hallyn @ 2009-06-26 16:24 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Linux Containers, Sachin Sant, netdev, David Miller, matthltc, lkml

Based on the commit msg on ae0e8e82205c903978a79ebf5e31c670b61fa5b4, it looks
like oopses were caused when people were reading the veth dev stats while the
module was being unloaded, causing a deref of freed memory in veth_get_stats()?
If so, I believe the following patch (still against mainline, so not on top of
my previous patch or on top of a git-revert of
ae0e8e82205c903978a79ebf5e31c670b61fa5b4) should prevent that.  All the stats
are gathered within one RCU read-side critical section, while the device free
hook first sets the device stats pointer to NULL, then waits for an RCU grace
period before freeing it.

I haven't been able to reproduce the original oops though (been trying
to cat the stats sysfs files while rmmoding veth, to no avail, and haven't
found an original bug report or testcase), so can't verify whether this patch
prevents the original oops.

Does this look sufficient?

thanks,
-serge

From a8eb0950b47ff6c5dfe2debafbd203dcced75bd3 Mon Sep 17 00:00:00 2001
From: root <root@elm3b203.beaverton.ibm.com>
Date: Wed, 24 Jun 2009 20:26:17 -0700
Subject: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)

Since commit ae0e8e82205c903978a79ebf5e31c670b61fa5b4, priv->stats
has been freed at veth_close().  But that causes a NULL deref in
veth_xmit().  This patch moves the priv->stats free back to the device
destructor.  It also tries to prevent the original possible
sysfs-induced oops.  All the stats are now gathered within one RCU
read-side critical section, while the device free hook first sets the
stats pointer to NULL, then waits for an RCU grace period before
freeing it.

Changelog:
	June 26: try to fix the original oops.

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 drivers/net/veth.c |   22 ++++++++++++++++++----
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 87197dd..112add0 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -208,7 +208,7 @@ rx_drop:
 
 static struct net_device_stats *veth_get_stats(struct net_device *dev)
 {
-	struct veth_priv *priv = netdev_priv(dev);
+	struct veth_priv *priv;
 	struct net_device_stats *dev_stats = &dev->stats;
 	unsigned int cpu;
 	struct veth_net_stats *stats;
@@ -220,6 +220,8 @@ static struct net_device_stats *veth_get_stats(struct net_device *dev)
 	dev_stats->tx_dropped = 0;
 	dev_stats->rx_dropped = 0;
 
+	rcu_read_lock();
+	priv = netdev_priv(dev);
 	if (priv->stats)
 		for_each_online_cpu(cpu) {
 			stats = per_cpu_ptr(priv->stats, cpu);
@@ -231,6 +233,7 @@ static struct net_device_stats *veth_get_stats(struct net_device *dev)
 			dev_stats->tx_dropped += stats->tx_dropped;
 			dev_stats->rx_dropped += stats->rx_dropped;
 		}
+	rcu_read_unlock();
 
 	return dev_stats;
 }
@@ -257,8 +260,6 @@ static int veth_close(struct net_device *dev)
 	netif_carrier_off(dev);
 	netif_carrier_off(priv->peer);
 
-	free_percpu(priv->stats);
-	priv->stats = NULL;
 	return 0;
 }
 
@@ -299,6 +300,19 @@ static const struct net_device_ops veth_netdev_ops = {
 	.ndo_set_mac_address = eth_mac_addr,
 };
 
+static void veth_dev_free(struct net_device *dev)
+{
+	struct veth_priv *priv;
+	struct veth_net_stats *stats;
+
+	priv = netdev_priv(dev);
+	stats = priv->stats;
+	priv->stats = NULL;
+	synchronize_rcu();
+	free_percpu(stats);
+	free_netdev(dev);
+}
+
 static void veth_setup(struct net_device *dev)
 {
 	ether_setup(dev);
@@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
 	dev->netdev_ops = &veth_netdev_ops;
 	dev->ethtool_ops = &veth_ethtool_ops;
 	dev->features |= NETIF_F_LLTX;
-	dev->destructor = free_netdev;
+	dev->destructor = veth_dev_free;
 }
 
 /*
-- 
1.6.2.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread
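
A minimal userspace sketch of the ordering the commit message describes
(unpublish the pointer, wait for readers, then free), assuming C11 atomics.
The posted veth_get_stats() tests priv->stats and then loads it again inside
per_cpu_ptr(), so this sketch takes a single snapshot of the pointer instead;
in kernel code that would typically be rcu_dereference(). Every toy_* name and
NR_SLOTS is hypothetical, and the reader counter is only a stand-in for a real
RCU grace period, not how the kernel implements one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NR_SLOTS 4	/* hypothetical: stands in for the number of CPUs */

struct toy_stats {
	unsigned long tx_bytes;
	unsigned long rx_bytes;
};

struct toy_priv {
	_Atomic(struct toy_stats *) stats;	/* shared pointer, may become NULL */
	atomic_int readers;			/* readers inside a read section */
};

static void toy_init(struct toy_priv *p)
{
	atomic_init(&p->stats, (struct toy_stats *)NULL);
	atomic_init(&p->readers, 0);
}

/* Allocate a zeroed stats array and publish it. Returns 0 or -1. */
static int toy_alloc_stats(struct toy_priv *p)
{
	struct toy_stats *s;

	assert(atomic_load(&p->stats) == NULL);	/* never publish twice */
	s = calloc(NR_SLOTS, sizeof(*s));
	if (!s)
		return -1;
	atomic_store(&p->stats, s);
	return 0;
}

static void toy_read_lock(struct toy_priv *p)
{
	atomic_fetch_add(&p->readers, 1);
}

static void toy_read_unlock(struct toy_priv *p)
{
	atomic_fetch_sub(&p->readers, 1);
}

/* Sum tx_bytes over all slots; yields 0 once the stats are unpublished. */
static unsigned long toy_get_tx_bytes(struct toy_priv *p)
{
	unsigned long total = 0;
	struct toy_stats *stats;
	int i;

	toy_read_lock(p);
	stats = atomic_load(&p->stats);	/* load the pointer exactly once */
	if (stats)
		for (i = 0; i < NR_SLOTS; i++)
			total += stats[i].tx_bytes;
	toy_read_unlock(p);
	return total;
}

/* Unpublish first, wait for readers to drain, and only then free. */
static void toy_free_stats(struct toy_priv *p)
{
	struct toy_stats *old;

	old = atomic_exchange(&p->stats, (struct toy_stats *)NULL);
	while (atomic_load(&p->readers) != 0)
		;	/* toy grace period: spin until no reader is active */
	free(old);
}
```

A reader that loaded the old pointer is still counted in readers, so the
writer cannot reach free() until that reader has left its read section; a
reader that arrives later sees NULL and skips the loop.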

* Re: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
  2009-06-26 16:24 ` Serge E. Hallyn
  (?)
@ 2009-07-15 15:50 ` David Miller
  2009-07-20 21:25   ` Stephen Hemminger
  -1 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2009-07-15 15:50 UTC (permalink / raw)
  To: serue; +Cc: shemminger, containers, sachinp, netdev, matthltc, linux-kernel

From: "Serge E. Hallyn" <serue@us.ibm.com>
Date: Fri, 26 Jun 2009 11:24:18 -0500

> I haven't been able to reproduce the original oops though (been
> trying to cat the stats sysfs files while rmmoding veth, to no
> avail, and haven't found an original bug report or testcase), so
> can't verify whether this patch prevents the original oops.

If you 'cat' it you're unlikely to trigger the oops.

You have to hold the sysfs files open, and then elsewhere do the
rmmod, wait, and then continue with some access to those open sysfs
file descriptors (f.e. do some reads).

I'd also need this patch to be against current sources as they'll
never apply since I did the revert quite some time ago.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread
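
The sequence described above (open the sysfs file, keep the descriptor open
across the rmmod, then read) can be sketched as a small C helper. This is an
illustrative sketch only: read_after_delay() is a hypothetical name, and the
caller chooses the path and the length of the delay.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open `path` now, sleep `delay_s` seconds with the descriptor held open,
 * then read up to `len - 1` bytes into `buf` and NUL-terminate the result.
 * Returns the byte count from read(), or -1 on error with errno set.
 */
static ssize_t read_after_delay(const char *path, unsigned int delay_s,
				char *buf, size_t len)
{
	ssize_t n;
	int saved_errno;
	int fd;

	assert(buf != NULL && len > 0);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	sleep(delay_s);		/* window in which to run rmmod elsewhere */

	n = read(fd, buf, len - 1);
	if (n >= 0)
		buf[n] = '\0';

	saved_errno = errno;	/* keep read()'s errno across close() */
	close(fd);
	errno = saved_errno;
	return n;
}
```

For example, calling read_after_delay() on a statistics file such as
/sys/class/net/veth0/statistics/tx_bytes (veth0 being an assumed interface
name) with a 30 second delay leaves time to run rmmod veth from another shell
before the read is issued.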

* Re: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
  2009-07-15 15:50 ` David Miller
@ 2009-07-20 21:25   ` Stephen Hemminger
  2009-07-22 15:55     ` Serge E. Hallyn
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2009-07-20 21:25 UTC (permalink / raw)
  To: serue; +Cc: David Miller, containers, sachinp, netdev, matthltc, linux-kernel

On Wed, 15 Jul 2009 08:50:12 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: "Serge E. Hallyn" <serue@us.ibm.com>
> Date: Fri, 26 Jun 2009 11:24:18 -0500
> 
> > I haven't been able to reproduce the original oops though (been
> > trying to cat the stats sysfs files while rmmoding veth, to no
> > avail, and haven't found an original bug report or testcase), so
> > can't verify whether this patch prevents the original oops.
> 
> If you 'cat' it you're unlikely to trigger the oops.
> 
> You have to hold the sysfs files open, and then elsewhere do the
> rmmod, wait, and then continue with some access to those open sysfs
> file descriptors (f.e. do some reads).
> 
> I'd also need this patch to be against current sources as they'll
> never apply since I did the revert quite some time ago.
> 
> Thanks.


My usual way of doing this is:

#  (sleep 30; cat /sys/class/net/ethX/statistics/tx_bytes) &
# rmmod the_buggy_driver

wait...


-- 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
  2009-07-20 21:25   ` Stephen Hemminger
@ 2009-07-22 15:55     ` Serge E. Hallyn
  0 siblings, 0 replies; 10+ messages in thread
From: Serge E. Hallyn @ 2009-07-22 15:55 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, containers, sachinp, netdev, matthltc, linux-kernel

Quoting Stephen Hemminger (shemminger@vyatta.com):
> On Wed, 15 Jul 2009 08:50:12 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 
> > From: "Serge E. Hallyn" <serue@us.ibm.com>
> > Date: Fri, 26 Jun 2009 11:24:18 -0500
> > 
> > > I haven't been able to reproduce the original oops though (been
> > > trying to cat the stats sysfs files while rmmoding veth, to no
> > > avail, and haven't found an original bug report or testcase), so
> > > can't verify whether this patch prevents the original oops.
> > 
> > If you 'cat' it you're unlikely to trigger the oops.
> > 
> > You have to hold the sysfs files open, and then elsewhere do the
> > rmmod, wait, and then continue with some access to those open sysfs
> > file descriptors (f.e. do some reads).

Yup, I was doing that too, but couldn't reproduce as yet.

> > I'd also need this patch to be against current sources as they'll
> > never apply since I did the revert quite some time ago.
> > 
> > Thanks.

Ok, thanks - I'll generate a new patch against a fresh pull when
I can confirm that it actually solves the problem.

> My usual way of doing this is:
> 
> #  (sleep 30; cat /sys/class/net/ethX/statistics/tx_bytes) &
> # rmmod the_buggy_driver
> 
> wait...

Can you oops the kernel this way on latest netns?

thanks,
-serge

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
  2009-06-26 16:24 ` Serge E. Hallyn
  (?)
  (?)
@ 2009-07-24 19:46 ` Stephen Hemminger
  2009-08-05  6:40   ` Eric W. Biederman
  -1 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2009-07-24 19:46 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Linux Containers, Sachin Sant, netdev, David Miller, matthltc, lkml

On Fri, 26 Jun 2009 11:24:18 -0500
"Serge E. Hallyn" <serue@us.ibm.com> wrote:

> Based on the commit msg on ae0e8e82205c903978a79ebf5e31c670b61fa5b4, it looks

>  	ether_setup(dev);
> @@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
>  	dev->netdev_ops = &veth_netdev_ops;
>  	dev->ethtool_ops = &veth_ethtool_ops;
>  	dev->features |= NETIF_F_LLTX;
> -	dev->destructor = free_netdev;
> +	dev->destructor = veth_dev_free;
>

This is still going to oops if the sysfs statistics are referenced
after module unload, because the module is unloaded (code is gone)
and veth_dev_free no longer exists.

I'll respin the original patch (using free_netdev) and fix
the statistics complaint.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
  2009-07-24 19:46 ` Stephen Hemminger
@ 2009-08-05  6:40   ` Eric W. Biederman
  2009-08-05 17:10       ` Stephen Hemminger
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2009-08-05  6:40 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Serge E. Hallyn, Linux Containers, Sachin Sant, netdev,
	David Miller, matthltc, lkml

Stephen Hemminger <shemminger@vyatta.com> writes:

> On Fri, 26 Jun 2009 11:24:18 -0500
> "Serge E. Hallyn" <serue@us.ibm.com> wrote:
>
>> Based on the commit msg on ae0e8e82205c903978a79ebf5e31c670b61fa5b4, it looks
>
>>  	ether_setup(dev);
>> @@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
>>  	dev->netdev_ops = &veth_netdev_ops;
>>  	dev->ethtool_ops = &veth_ethtool_ops;
>>  	dev->features |= NETIF_F_LLTX;
>> -	dev->destructor = free_netdev;
>> +	dev->destructor = veth_dev_free;
>>
>
> This is still going to oops if the sysfs statistics are referenced
> after module unload, because the module is unloaded (code is gone)
> and veth_dev_free no longer exists.

Has anyone actually seen that cause an oops?

The reason I am asking is that as I read the code we cannot have
this problem.  At worst the destructor callback is delayed until:

veth_exit
  rtnl_link_unregister
    rtnl_unlock
      netdev_run_todo
        dev->destructor


Similarly even if the sysfs filehandle is open we have called:

netdev_unregister_kobject
  ...
    sysfs_addrm_finish
      sysfs_deactivate

Which guarantees that sysfs_get_active_two will fail and all
subsequent actions on that file will fail.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread
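
The guarantee described above (once a node has been deactivated, every later
attempt to take an active reference fails, even through a file that was
already open) can be modelled with a biased reference count. The following is
a toy userspace sketch assuming C11 atomics; the toy_* names and the bias
value are hypothetical and this is not the kernel's sysfs code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_DEACTIVATED_BIAS INT_MIN	/* makes the count negative */

struct toy_node {
	atomic_int active;	/* >= 0: live; < 0: deactivated */
	long value;		/* the attribute a "show" would report */
};

static void toy_node_init(struct toy_node *n, long value)
{
	atomic_init(&n->active, 0);
	n->value = value;
}

/* Take an active reference; fails once the node is deactivated. */
static bool toy_get_active(struct toy_node *n)
{
	int v = atomic_load(&n->active);

	while (v >= 0) {
		/* on failure v is reloaded and the sign is checked again */
		if (atomic_compare_exchange_weak(&n->active, &v, v + 1))
			return true;
	}
	return false;
}

static void toy_put_active(struct toy_node *n)
{
	int old = atomic_fetch_sub(&n->active, 1);

	/* an unbalanced put would start from exactly 0 or the bare bias */
	assert(old != 0 && old != TOY_DEACTIVATED_BIAS);
	(void)old;
}

/* Block new references, then wait for the existing ones to drain. */
static void toy_deactivate(struct toy_node *n)
{
	atomic_fetch_add(&n->active, TOY_DEACTIVATED_BIAS);
	while (atomic_load(&n->active) != TOY_DEACTIVATED_BIAS)
		;	/* spin until every active reference is put */
}

/* Read the attribute through an already-open handle. */
static int toy_show(struct toy_node *n, long *out)
{
	if (!toy_get_active(n))
		return -ENODEV;	/* node is gone; backing code is never called */
	*out = n->value;
	toy_put_active(n);
	return 0;
}
```

In this model a reader holding an open handle never reaches the backing data
after toy_deactivate() returns, which is the property the call chains above
rely on.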

* Re: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
  2009-08-05  6:40   ` Eric W. Biederman
@ 2009-08-05 17:10       ` Stephen Hemminger
  0 siblings, 0 replies; 10+ messages in thread
From: Stephen Hemminger @ 2009-08-05 17:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, Linux Containers, Sachin Sant, netdev,
	David Miller, matthltc, lkml

On Tue, 04 Aug 2009 23:40:47 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Stephen Hemminger <shemminger@vyatta.com> writes:
> 
> > On Fri, 26 Jun 2009 11:24:18 -0500
> > "Serge E. Hallyn" <serue@us.ibm.com> wrote:
> >
> >> Based on the commit msg on ae0e8e82205c903978a79ebf5e31c670b61fa5b4, it looks
> >
> >>  	ether_setup(dev);
> >> @@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
> >>  	dev->netdev_ops = &veth_netdev_ops;
> >>  	dev->ethtool_ops = &veth_ethtool_ops;
> >>  	dev->features |= NETIF_F_LLTX;
> >> -	dev->destructor = free_netdev;
> >> +	dev->destructor = veth_dev_free;
> >>
> >
> > This is still going to oops if the sysfs statistics are referenced
> > after module unload, because the module is unloaded (code is gone)
> > and veth_dev_free no longer exists.
> 
> Has anyone actually seen that cause an oops?
> 
> The reason I am asking is that as I read the code we cannot have
> this problem.  At worst the destructor callback is delayed until:
> 
> veth_exit
>   rtnl_link_unregister
>     rtnl_unlock
>       netdev_run_todo
>         dev->destructor
> 
> 
> Similarly even if the sysfs filehandle is open we have called:
> 
> netdev_unregister_kobject
>   ...
>     sysfs_addrm_finish
>       sysfs_deactivate
> 
> Which guarantees that sysfs_get_active_two will fail and all
> subsequent actions on that file will fail.
> 
> Eric

Sysfs must be safer than it used to be. 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
  2009-08-05 17:10       ` Stephen Hemminger
  (?)
@ 2009-08-05 22:43       ` Eric W. Biederman
  -1 siblings, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2009-08-05 22:43 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Serge E. Hallyn, Linux Containers, Sachin Sant, netdev,
	David Miller, matthltc, lkml

Stephen Hemminger <shemminger@vyatta.com> writes:
>
> Sysfs must be safer than it used to be. 

Definitely.  A lot of this dates to Tejun's cleanups, which were merged
2-3 years ago now.  Just before I started working on sysfs.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread
