netdev.vger.kernel.org archive mirror
* [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close]
@ 2013-01-30 20:44 Neil Horman
  2013-01-30 21:07 ` Ben Hutchings
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Neil Horman @ 2013-01-30 20:44 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Ivan Vecera, David S. Miller

Ivan Vecera was recently backporting commit
9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initialization, it does nothing for the driver
during shutdown.  I.e. it is entirely possible for the ndo_poll_controller
method (or subsequently the ndo_poll routine) to be called in the netpoll path
on CPU A while, in parallel on CPU B, the ndo_close or ndo_open routine is
called.  Given that the latter two routines tend to initialize and free many
data structures that the former two rely on, the result can easily be data
corruption or various other crashes.  Furthermore, this is potentially a
problem with all net drivers that support netpoll, so it should ideally be
fixed in a common path.

Fix it by creating a spinlock in the netpoll_info structure and holding it in
netpoll_poll_dev, dev_close, and dev_open.  That prevents the driver from
being torn down while we're using it in the netpoll path.

I've done some testing on this, flooding a netconsole-enabled system with
messages while ifup/ifdown-ing the interface.  No problems observed.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
---
 include/linux/netpoll.h |  1 +
 net/core/dev.c          | 16 ++++++++++++++++
 net/core/netpoll.c      |  3 +++
 3 files changed, 20 insertions(+)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index f54c3bb..bb1d364 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -40,6 +40,7 @@ struct netpoll_info {
 
 	int rx_flags;
 	spinlock_t rx_lock;
+	spinlock_t napi_lock;
 	struct list_head rx_np; /* netpolls that registered an rx_hook */
 
 	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
diff --git a/net/core/dev.c b/net/core/dev.c
index a87bc74..18f85e1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1307,11 +1307,19 @@ static int __dev_open(struct net_device *dev)
 int dev_open(struct net_device *dev)
 {
 	int ret;
+	struct netpoll_info *ni;
 
 	if (dev->flags & IFF_UP)
 		return 0;
 
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni)
+		spin_lock(&ni->napi_lock);
 	ret = __dev_open(dev);
+	if (ni)
+		spin_unlock(&ni->napi_lock);
+	rcu_read_unlock();
 	if (ret < 0)
 		return ret;
 
@@ -1325,6 +1333,7 @@ EXPORT_SYMBOL(dev_open);
 static int __dev_close_many(struct list_head *head)
 {
 	struct net_device *dev;
+	struct netpoll_info *ni;
 
 	ASSERT_RTNL();
 	might_sleep();
@@ -1355,8 +1364,15 @@ static int __dev_close_many(struct list_head *head)
 		 *	We allow it to be called even after a DETACH hot-plug
 		 *	event.
 		 */
+		rcu_read_lock();
+		ni = rcu_dereference(dev->npinfo);
+		if (ni)
+			spin_lock(&ni->napi_lock);
 		if (ops->ndo_stop)
 			ops->ndo_stop(dev);
+		if (ni)
+			spin_unlock(&ni->napi_lock);
+		rcu_read_unlock();
 
 		dev->flags &= ~IFF_UP;
 		net_dmaengine_put();
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 331ccb9..f72eaa9 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -207,9 +207,11 @@ static void netpoll_poll_dev(struct net_device *dev)
 		return;
 
 	/* Process pending work on NIC */
+	spin_lock(&ni->napi_lock);
 	ops->ndo_poll_controller(dev);
 
 	poll_napi(dev);
+	spin_unlock(&ni->napi_lock);
 
 	if (dev->flags & IFF_SLAVE) {
 		if (ni) {
@@ -1004,6 +1006,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 		INIT_LIST_HEAD(&npinfo->rx_np);
 
 		spin_lock_init(&npinfo->rx_lock);
+		spin_lock_init(&npinfo->napi_lock);
 		skb_queue_head_init(&npinfo->neigh_tx);
 		skb_queue_head_init(&npinfo->txq);
 		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-01-30 20:44 [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close] Neil Horman
@ 2013-01-30 21:07 ` Ben Hutchings
  2013-01-30 21:25   ` Neil Horman
  2013-02-01 17:02 ` [PATCH v2] " Neil Horman
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Ben Hutchings @ 2013-01-30 21:07 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Ivan Vecera, David S. Miller

On Wed, 2013-01-30 at 15:44 -0500, Neil Horman wrote:
> Ivan Vecera was recently backporting commit
> 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
> while this patch protects the tg3 driver from having its ndo_poll_controller
> routine called during device initialization, it does nothing for the driver
> during shutdown.  I.e. it is entirely possible for the ndo_poll_controller
> method (or subsequently the ndo_poll routine) to be called in the netpoll path
> on CPU A while, in parallel on CPU B, the ndo_close or ndo_open routine is
> called.  Given that the latter two routines tend to initialize and free many
> data structures that the former two rely on, the result can easily be data
> corruption or various other crashes.  Furthermore, this is potentially a
> problem with all net drivers that support netpoll, so it should ideally be
> fixed in a common path.
>
> Fix it by creating a spinlock in the netpoll_info structure and holding it in
> netpoll_poll_dev, dev_close, and dev_open.  That prevents the driver from
> being torn down while we're using it in the netpoll path.
>
> I've done some testing on this, flooding a netconsole-enabled system with
> messages while ifup/ifdown-ing the interface.  No problems observed.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Ivan Vecera <ivecera@redhat.com>
> CC: "David S. Miller" <davem@davemloft.net>
> ---
>  include/linux/netpoll.h |  1 +
>  net/core/dev.c          | 16 ++++++++++++++++
>  net/core/netpoll.c      |  3 +++
>  3 files changed, 20 insertions(+)
> 
> diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
> index f54c3bb..bb1d364 100644
> --- a/include/linux/netpoll.h
> +++ b/include/linux/netpoll.h
> @@ -40,6 +40,7 @@ struct netpoll_info {
>  
>  	int rx_flags;
>  	spinlock_t rx_lock;
> +	spinlock_t napi_lock;
>  	struct list_head rx_np; /* netpolls that registered an rx_hook */
>  
>  	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index a87bc74..18f85e1 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1307,11 +1307,19 @@ static int __dev_open(struct net_device *dev)
>  int dev_open(struct net_device *dev)
>  {
>  	int ret;
> +	struct netpoll_info *ni;
>  
>  	if (dev->flags & IFF_UP)
>  		return 0;
>  
> +	rcu_read_lock();
> +	ni = rcu_dereference(dev->npinfo);
> +	if (ni)
> +		spin_lock(&ni->napi_lock);
>  	ret = __dev_open(dev);
> +	if (ni)
> +		spin_unlock(&ni->napi_lock);
[...]

No, you can't call ndo_open and ndo_stop in atomic context.
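
Pretty much every ndo_open sleeps somewhere.  A made-up driver, just to show
the class of calls involved (foo_priv, foo_interrupt and RX_RING_SIZE are
hypothetical, not from any real driver):

	static int foo_open(struct net_device *dev)
	{
		struct foo_priv *fp = netdev_priv(dev);
		int err;

		/* GFP_KERNEL allocations can sleep */
		fp->rx_ring = kcalloc(RX_RING_SIZE, sizeof(*fp->rx_ring),
				      GFP_KERNEL);
		if (!fp->rx_ring)
			return -ENOMEM;

		/* request_irq() can sleep as well */
		err = request_irq(fp->irq, foo_interrupt, 0, dev->name, dev);
		if (err) {
			kfree(fp->rx_ring);
			return err;
		}

		netif_start_queue(dev);
		return 0;
	}

Holding a spinlock across that disables preemption, so at best you get
might_sleep() warnings and "scheduling while atomic" splats.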

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-01-30 21:07 ` Ben Hutchings
@ 2013-01-30 21:25   ` Neil Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Neil Horman @ 2013-01-30 21:25 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Ivan Vecera, David S. Miller

On Wed, Jan 30, 2013 at 09:07:22PM +0000, Ben Hutchings wrote:
> On Wed, 2013-01-30 at 15:44 -0500, Neil Horman wrote:
> > Ivan Vecera was recently backporting commit
> > 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
> > while this patch protects the tg3 driver from having its ndo_poll_controller
> > routine called during device initialization, it does nothing for the driver
> > during shutdown.  I.e. it is entirely possible for the ndo_poll_controller
> > method (or subsequently the ndo_poll routine) to be called in the netpoll path
> > on CPU A while, in parallel on CPU B, the ndo_close or ndo_open routine is
> > called.  Given that the latter two routines tend to initialize and free many
> > data structures that the former two rely on, the result can easily be data
> > corruption or various other crashes.  Furthermore, this is potentially a
> > problem with all net drivers that support netpoll, so it should ideally be
> > fixed in a common path.
> >
> > Fix it by creating a spinlock in the netpoll_info structure and holding it in
> > netpoll_poll_dev, dev_close, and dev_open.  That prevents the driver from
> > being torn down while we're using it in the netpoll path.
> >
> > I've done some testing on this, flooding a netconsole-enabled system with
> > messages while ifup/ifdown-ing the interface.  No problems observed.
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > CC: Ivan Vecera <ivecera@redhat.com>
> > CC: "David S. Miller" <davem@davemloft.net>
> > ---
> >  include/linux/netpoll.h |  1 +
> >  net/core/dev.c          | 16 ++++++++++++++++
> >  net/core/netpoll.c      |  3 +++
> >  3 files changed, 20 insertions(+)
> > 
> > diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
> > index f54c3bb..bb1d364 100644
> > --- a/include/linux/netpoll.h
> > +++ b/include/linux/netpoll.h
> > @@ -40,6 +40,7 @@ struct netpoll_info {
> >  
> >  	int rx_flags;
> >  	spinlock_t rx_lock;
> > +	spinlock_t napi_lock;
> >  	struct list_head rx_np; /* netpolls that registered an rx_hook */
> >  
> >  	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index a87bc74..18f85e1 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -1307,11 +1307,19 @@ static int __dev_open(struct net_device *dev)
> >  int dev_open(struct net_device *dev)
> >  {
> >  	int ret;
> > +	struct netpoll_info *ni;
> >  
> >  	if (dev->flags & IFF_UP)
> >  		return 0;
> >  
> > +	rcu_read_lock();
> > +	ni = rcu_dereference(dev->npinfo);
> > +	if (ni)
> > +		spin_lock(&ni->napi_lock);
> >  	ret = __dev_open(dev);
> > +	if (ni)
> > +		spin_unlock(&ni->napi_lock);
> [...]
> 
> No, you can't call ndo_open and ndo_stop in atomic context.
> 
Crap, you're right.  I might deadlock too if we print something while going down
via netconsole.  I'll change this to a flag to skip the poll_controller routine
if we're going down.  New patch in the AM.

Thanks
Neil

> Ben.
> 
> -- 
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-01-30 20:44 [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close] Neil Horman
  2013-01-30 21:07 ` Ben Hutchings
@ 2013-02-01 17:02 ` Neil Horman
  2013-02-01 17:14   ` Eric Dumazet
  2013-02-01 17:49 ` [PATCH v3] " Neil Horman
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2013-02-01 17:02 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Ivan Vecera, David S. Miller, Ben Hutchings

Ivan Vecera was recently backporting commit
9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initialization, it does nothing for the driver
during shutdown.  I.e. it is entirely possible for the ndo_poll_controller
method (or subsequently the ndo_poll routine) to be called in the netpoll path
on CPU A while, in parallel on CPU B, the ndo_close or ndo_open routine is
called.  Given that the latter two routines tend to initialize and free many
data structures that the former two rely on, the result can easily be data
corruption or various other crashes.  Furthermore, this is potentially a
problem with all net drivers that support netpoll, so it should ideally be
fixed in a common path.

As Ben H. pointed out to me, we can't perform dev_open/dev_close in atomic
context, so I've come up with this solution.  I've converted npinfo->rx_flags
to use bitops, so that we can atomically set/clear individual bits.  We use
one of these bits to provide mutual exclusion between the netpoll polling path
and the dev_open/close paths.  I can't say I'm a huge fan of returning -EBUSY
in open/close, but I like it better than busy waiting while the poll path
completes.

I've tested this here by flooding netconsole with messages on a system whose
NIC driver I modified to periodically return NETDEV_TX_BUSY, so that the
netpoll tx workqueue would be forced to send frames and poll the device.
While this was going on I rapidly ifdown/ifup'ed the interface and watched for
any problems.  I've not found any.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
---
 include/linux/netpoll.h | 14 +++++++++----
 net/core/dev.c          | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
 net/core/netpoll.c      | 23 ++++++++++++++-------
 3 files changed, 79 insertions(+), 11 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index f54c3bb..09ef2e5 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -35,10 +35,14 @@ struct netpoll {
 	struct rcu_head rcu;
 };
 
+#define NETPOLL_RX_ENABLED  1
+#define NETPOLL_RX_DROP     2
+#define NETPOLL_RX_ACTIVE   3
+
 struct netpoll_info {
 	atomic_t refcnt;
 
-	int rx_flags;
+	unsigned long flags;
 	spinlock_t rx_lock;
 	struct list_head rx_np; /* netpolls that registered an rx_hook */
 
@@ -79,7 +83,8 @@ static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
 
-	return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags);
+	return npinfo && (!list_empty(&npinfo->rx_np) ||
+			   test_bit(NETPOLL_RX_ENABLED, &npinfo->flags));
 }
 
 static inline bool netpoll_rx(struct sk_buff *skb)
@@ -95,8 +100,9 @@ static inline bool netpoll_rx(struct sk_buff *skb)
 
 	npinfo = rcu_dereference_bh(skb->dev->npinfo);
 	spin_lock(&npinfo->rx_lock);
-	/* check rx_flags again with the lock held */
-	if (npinfo->rx_flags && __netpoll_rx(skb, npinfo))
+	/* check flags again with the lock held */
+	if (test_bit(NETPOLL_RX_ENABLED, &npinfo->flags) &&
+	    __netpoll_rx(skb, npinfo))
 		ret = true;
 	spin_unlock(&npinfo->rx_lock);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index a87bc74..90b267a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1259,6 +1259,8 @@ EXPORT_SYMBOL(dev_load);
 static int __dev_open(struct net_device *dev)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
+	struct netpoll_info *ni;
+
 	int ret;
 
 	ASSERT_RTNL();
@@ -1266,6 +1268,19 @@ static int __dev_open(struct net_device *dev)
 	if (!netif_device_present(dev))
 		return -ENODEV;
 
+	/*
+	 * Block netpoll from trying to do any rx path servicing
+	 * If we don't do this there is a chance ndo_poll_controller
+	 * or ndo_poll may be running while we open the device
+	 */
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni && test_and_set_bit(NETPOLL_RX_ACTIVE, &ni->flags)) {
+		rcu_read_unlock();
+		return -EBUSY;
+	}
+	rcu_read_unlock();
+
 	ret = call_netdevice_notifiers(NETDEV_PRE_UP, dev);
 	ret = notifier_to_errno(ret);
 	if (ret)
@@ -1279,6 +1294,12 @@ static int __dev_open(struct net_device *dev)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni)
+		clear_bit(NETPOLL_RX_ACTIVE, &ni->flags);
+	rcu_read_unlock();
+
 	if (ret)
 		clear_bit(__LINK_STATE_START, &dev->state);
 	else {
@@ -1369,10 +1390,26 @@ static int __dev_close(struct net_device *dev)
 {
 	int retval;
 	LIST_HEAD(single);
+	struct netpoll_info *ni;
+
+	/* Temporarily disable netpoll until the interface is down */
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni && test_and_set_bit(NETPOLL_RX_ACTIVE, &ni->flags)) {
+		rcu_read_unlock();
+		return -EBUSY;
+	}
+	rcu_read_unlock();
 
 	list_add(&dev->unreg_list, &single);
 	retval = __dev_close_many(&single);
 	list_del(&single);
+
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni)
+		clear_bit(NETPOLL_RX_ACTIVE, &ni->flags);
+	rcu_read_unlock();
 	return retval;
 }
 
@@ -1410,10 +1447,26 @@ int dev_close(struct net_device *dev)
 {
 	if (dev->flags & IFF_UP) {
 		LIST_HEAD(single);
+		struct netpoll_info *ni;
+
+		/* Block netpoll rx while the interface is going down*/
+		rcu_read_lock();
+		ni = rcu_dereference(dev->npinfo);
+		if (ni && test_and_set_bit(NETPOLL_RX_ACTIVE, &ni->flags)) {
+			rcu_read_unlock();
+			return -EBUSY;
+		}
+		rcu_read_unlock();
 
 		list_add(&dev->unreg_list, &single);
 		dev_close_many(&single);
 		list_del(&single);
+
+		rcu_read_lock();
+		ni = rcu_dereference(dev->npinfo);
+		if (ni)
+			clear_bit(NETPOLL_RX_ACTIVE, &ni->flags);
+		rcu_read_unlock();
 	}
 	return 0;
 }
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 331ccb9..96c5bc5 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -48,8 +48,6 @@ static struct sk_buff_head skb_pool;
 static atomic_t trapped;
 
 #define USEC_PER_POLL	50
-#define NETPOLL_RX_ENABLED  1
-#define NETPOLL_RX_DROP     2
 
 #define MAX_SKB_SIZE							\
 	(sizeof(struct ethhdr) +					\
@@ -152,7 +150,7 @@ static int poll_one_napi(struct netpoll_info *npinfo,
 	if (!test_bit(NAPI_STATE_SCHED, &napi->state))
 		return budget;
 
-	npinfo->rx_flags |= NETPOLL_RX_DROP;
+	set_bit(NETPOLL_RX_DROP, &npinfo->flags);
 	atomic_inc(&trapped);
 	set_bit(NAPI_STATE_NPSVC, &napi->state);
 
@@ -161,7 +159,7 @@ static int poll_one_napi(struct netpoll_info *npinfo,
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
 	atomic_dec(&trapped);
-	npinfo->rx_flags &= ~NETPOLL_RX_DROP;
+	clear_bit(NETPOLL_RX_DROP, &npinfo->flags);
 
 	return budget - work;
 }
@@ -199,6 +197,14 @@ static void netpoll_poll_dev(struct net_device *dev)
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
 
+	/*
+	 * Don't do any rx activity unless NETPOLL_RX_ACTIVE is clear
+	 * the dev_open/close paths use this to block netpoll activity
+	 * while changing device state
+	 */
+	if (test_and_set_bit(NETPOLL_RX_ACTIVE, &dev->npinfo->flags))
+		return;
+
 	if (!dev || !netif_running(dev))
 		return;
 
@@ -211,6 +217,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev);
 
+	clear_bit(NETPOLL_RX_ACTIVE, &dev->npinfo->flags);
+
 	if (dev->flags & IFF_SLAVE) {
 		if (ni) {
 			struct net_device *bond_dev;
@@ -229,6 +237,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	service_neigh_queue(ni);
 
 	zap_completion_queue();
+
 }
 
 static void refill_skbs(void)
@@ -1000,7 +1009,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		npinfo->rx_flags = 0;
+		npinfo->flags = 0;
 		INIT_LIST_HEAD(&npinfo->rx_np);
 
 		spin_lock_init(&npinfo->rx_lock);
@@ -1025,7 +1034,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	if (np->rx_hook) {
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		npinfo->rx_flags |= NETPOLL_RX_ENABLED;
+		set_bit(NETPOLL_RX_ENABLED, &npinfo->flags);
 		list_add_tail(&np->rx, &npinfo->rx_np);
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
@@ -1204,7 +1213,7 @@ void __netpoll_cleanup(struct netpoll *np)
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
 		list_del(&np->rx);
 		if (list_empty(&npinfo->rx_np))
-			npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
+			clear_bit(NETPOLL_RX_ENABLED, &npinfo->flags);
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-02-01 17:02 ` [PATCH v2] " Neil Horman
@ 2013-02-01 17:14   ` Eric Dumazet
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2013-02-01 17:14 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Ivan Vecera, David S. Miller, Ben Hutchings

On Fri, 2013-02-01 at 12:02 -0500, Neil Horman wrote:
> Ivan Vecera was recently backporting commit
> 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
> while this patch protects the tg3 driver from having its ndo_poll_controller
> routine called during device initialization, it does nothing for the driver
> during shutdown.  I.e. it is entirely possible for the ndo_poll_controller
> method (or subsequently the ndo_poll routine) to be called in the netpoll path
> on CPU A while, in parallel on CPU B, the ndo_close or ndo_open routine is
> called.  Given that the latter two routines tend to initialize and free many
> data structures that the former two rely on, the result can easily be data
> corruption or various other crashes.  Furthermore, this is potentially a
> problem with all net drivers that support netpoll, so it should ideally be
> fixed in a common path.
>
> As Ben H. pointed out to me, we can't perform dev_open/dev_close in atomic
> context, so I've come up with this solution.  I've converted npinfo->rx_flags
> to use bitops, so that we can atomically set/clear individual bits.  We use
> one of these bits to provide mutual exclusion between the netpoll polling path
> and the dev_open/close paths.  I can't say I'm a huge fan of returning -EBUSY
> in open/close, but I like it better than busy waiting while the poll path
> completes.
>
> I've tested this here by flooding netconsole with messages on a system whose
> NIC driver I modified to periodically return NETDEV_TX_BUSY, so that the
> netpoll tx workqueue would be forced to send frames and poll the device.
> While this was going on I rapidly ifdown/ifup'ed the interface and watched for
> any problems.  I've not found any.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Ivan Vecera <ivecera@redhat.com>
> CC: "David S. Miller" <davem@davemloft.net>
> CC: Ben Hutchings <bhutchings@solarflare.com>
> ---
>  include/linux/netpoll.h | 14 +++++++++----
>  net/core/dev.c          | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
>  net/core/netpoll.c      | 23 ++++++++++++++-------
>  3 files changed, 79 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
> index f54c3bb..09ef2e5 100644
> --- a/include/linux/netpoll.h
> +++ b/include/linux/netpoll.h
> @@ -35,10 +35,14 @@ struct netpoll {
>  	struct rcu_head rcu;
>  };
>  
> +#define NETPOLL_RX_ENABLED  1
> +#define NETPOLL_RX_DROP     2
> +#define NETPOLL_RX_ACTIVE   3
> +
>  struct netpoll_info {
>  	atomic_t refcnt;
>  
> -	int rx_flags;
> +	unsigned long flags;
>  	spinlock_t rx_lock;
>  	struct list_head rx_np; /* netpolls that registered an rx_hook */

You could avoid holes on 64-bit arches using the following order:

	atomic_t refcnt;
	spinlock_t rx_lock;
	unsigned long flags;
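
With that order the two 32-bit fields share a single 8-byte slot, e.g. on
x86-64 without spinlock debugging (sizes here are only illustrative):

	struct netpoll_info {
		atomic_t      refcnt;   /* 4 bytes */
		spinlock_t    rx_lock;  /* 4 bytes, packs next to refcnt */
		unsigned long flags;    /* 8 bytes, naturally aligned */
		/* rest of the struct unchanged */
	};

whereas refcnt / flags / rx_lock forces 4 bytes of padding after refcnt and
typically another 4 after rx_lock before the following list_head.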

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-01-30 20:44 [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close] Neil Horman
  2013-01-30 21:07 ` Ben Hutchings
  2013-02-01 17:02 ` [PATCH v2] " Neil Horman
@ 2013-02-01 17:49 ` Neil Horman
  2013-02-01 21:21   ` Francois Romieu
  2013-02-02 18:51 ` [PATCH v4] " Neil Horman
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2013-02-01 17:49 UTC (permalink / raw)
  To: netdev
  Cc: Neil Horman, Ivan Vecera, David S. Miller, Ben Hutchings, Eric Dumazet

Ivan Vecera was recently backporting commit
9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initialization, it does nothing for the driver
during shutdown.  I.e. it is entirely possible for the ndo_poll_controller
method (or subsequently the ndo_poll routine) to be called in the netpoll path
on CPU A while, in parallel on CPU B, the ndo_close or ndo_open routine is
called.  Given that the latter two routines tend to initialize and free many
data structures that the former two rely on, the result can easily be data
corruption or various other crashes.  Furthermore, this is potentially a
problem with all net drivers that support netpoll, so it should ideally be
fixed in a common path.

As Ben H. pointed out to me, we can't perform dev_open/dev_close in atomic
context, so I've come up with this solution.  I've converted npinfo->rx_flags
to use bitops, so that we can atomically set/clear individual bits.  We use
one of these bits to provide mutual exclusion between the netpoll polling path
and the dev_open/close paths.  I can't say I'm a huge fan of returning -EBUSY
in open/close, but I like it better than busy waiting while the poll path
completes.

I've tested this here by flooding netconsole with messages on a system whose
NIC driver I modified to periodically return NETDEV_TX_BUSY, so that the
netpoll tx workqueue would be forced to send frames and poll the device.
While this was going on I rapidly ifdown/ifup'ed the interface and watched for
any problems.  I've not found any.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>

---
Change notes

v3)
Updated netpoll_info to remove hole in struct on 64 bit arches (Eric D)
---
 include/linux/netpoll.h | 14 +++++++++----
 net/core/dev.c          | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
 net/core/netpoll.c      | 23 ++++++++++++++-------
 3 files changed, 79 insertions(+), 11 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index f54c3bb..2ef970c 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -35,11 +35,15 @@ struct netpoll {
 	struct rcu_head rcu;
 };
 
+#define NETPOLL_RX_ENABLED  1
+#define NETPOLL_RX_DROP     2
+#define NETPOLL_RX_ACTIVE   3
+
 struct netpoll_info {
 	atomic_t refcnt;
 
-	int rx_flags;
 	spinlock_t rx_lock;
+	unsigned long flags;
 	struct list_head rx_np; /* netpolls that registered an rx_hook */
 
 	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
@@ -79,7 +83,8 @@ static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
 
-	return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags);
+	return npinfo && (!list_empty(&npinfo->rx_np) ||
+			   test_bit(NETPOLL_RX_ENABLED, &npinfo->flags));
 }
 
 static inline bool netpoll_rx(struct sk_buff *skb)
@@ -95,8 +100,9 @@ static inline bool netpoll_rx(struct sk_buff *skb)
 
 	npinfo = rcu_dereference_bh(skb->dev->npinfo);
 	spin_lock(&npinfo->rx_lock);
-	/* check rx_flags again with the lock held */
-	if (npinfo->rx_flags && __netpoll_rx(skb, npinfo))
+	/* check flags again with the lock held */
+	if (test_bit(NETPOLL_RX_ENABLED, &npinfo->flags) &&
+	    __netpoll_rx(skb, npinfo))
 		ret = true;
 	spin_unlock(&npinfo->rx_lock);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index a87bc74..90b267a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1259,6 +1259,8 @@ EXPORT_SYMBOL(dev_load);
 static int __dev_open(struct net_device *dev)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
+	struct netpoll_info *ni;
+
 	int ret;
 
 	ASSERT_RTNL();
@@ -1266,6 +1268,19 @@ static int __dev_open(struct net_device *dev)
 	if (!netif_device_present(dev))
 		return -ENODEV;
 
+	/*
+	 * Block netpoll from trying to do any rx path servicing
+	 * If we don't do this there is a chance ndo_poll_controller
+	 * or ndo_poll may be running while we open the device
+	 */
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni && test_and_set_bit(NETPOLL_RX_ACTIVE, &ni->flags)) {
+		rcu_read_unlock();
+		return -EBUSY;
+	}
+	rcu_read_unlock();
+
 	ret = call_netdevice_notifiers(NETDEV_PRE_UP, dev);
 	ret = notifier_to_errno(ret);
 	if (ret)
@@ -1279,6 +1294,12 @@ static int __dev_open(struct net_device *dev)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni)
+		clear_bit(NETPOLL_RX_ACTIVE, &ni->flags);
+	rcu_read_unlock();
+
 	if (ret)
 		clear_bit(__LINK_STATE_START, &dev->state);
 	else {
@@ -1369,10 +1390,26 @@ static int __dev_close(struct net_device *dev)
 {
 	int retval;
 	LIST_HEAD(single);
+	struct netpoll_info *ni;
+
+	/* Temporarily disable netpoll until the interface is down */
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni && test_and_set_bit(NETPOLL_RX_ACTIVE, &ni->flags)) {
+		rcu_read_unlock();
+		return -EBUSY;
+	}
+	rcu_read_unlock();
 
 	list_add(&dev->unreg_list, &single);
 	retval = __dev_close_many(&single);
 	list_del(&single);
+
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni)
+		clear_bit(NETPOLL_RX_ACTIVE, &ni->flags);
+	rcu_read_unlock();
 	return retval;
 }
 
@@ -1410,10 +1447,26 @@ int dev_close(struct net_device *dev)
 {
 	if (dev->flags & IFF_UP) {
 		LIST_HEAD(single);
+		struct netpoll_info *ni;
+
+		/* Block netpoll rx while the interface is going down*/
+		rcu_read_lock();
+		ni = rcu_dereference(dev->npinfo);
+		if (ni && test_and_set_bit(NETPOLL_RX_ACTIVE, &ni->flags)) {
+			rcu_read_unlock();
+			return -EBUSY;
+		}
+		rcu_read_unlock();
 
 		list_add(&dev->unreg_list, &single);
 		dev_close_many(&single);
 		list_del(&single);
+
+		rcu_read_lock();
+		ni = rcu_dereference(dev->npinfo);
+		if (ni)
+			clear_bit(NETPOLL_RX_ACTIVE, &ni->flags);
+		rcu_read_unlock();
 	}
 	return 0;
 }
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 331ccb9..96c5bc5 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -48,8 +48,6 @@ static struct sk_buff_head skb_pool;
 static atomic_t trapped;
 
 #define USEC_PER_POLL	50
-#define NETPOLL_RX_ENABLED  1
-#define NETPOLL_RX_DROP     2
 
 #define MAX_SKB_SIZE							\
 	(sizeof(struct ethhdr) +					\
@@ -152,7 +150,7 @@ static int poll_one_napi(struct netpoll_info *npinfo,
 	if (!test_bit(NAPI_STATE_SCHED, &napi->state))
 		return budget;
 
-	npinfo->rx_flags |= NETPOLL_RX_DROP;
+	set_bit(NETPOLL_RX_DROP, &npinfo->flags);
 	atomic_inc(&trapped);
 	set_bit(NAPI_STATE_NPSVC, &napi->state);
 
@@ -161,7 +159,7 @@ static int poll_one_napi(struct netpoll_info *npinfo,
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
 	atomic_dec(&trapped);
-	npinfo->rx_flags &= ~NETPOLL_RX_DROP;
+	clear_bit(NETPOLL_RX_DROP, &npinfo->flags);
 
 	return budget - work;
 }
@@ -199,6 +197,14 @@ static void netpoll_poll_dev(struct net_device *dev)
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
 
+	/*
+	 * Don't do any rx activity unless NETPOLL_RX_ACTIVE is clear
+	 * the dev_open/close paths use this to block netpoll activity
+	 * while changing device state
+	 */
+	if (test_and_set_bit(NETPOLL_RX_ACTIVE, &dev->npinfo->flags))
+		return;
+
 	if (!dev || !netif_running(dev))
 		return;
 
@@ -211,6 +217,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev);
 
+	clear_bit(NETPOLL_RX_ACTIVE, &dev->npinfo->flags);
+
 	if (dev->flags & IFF_SLAVE) {
 		if (ni) {
 			struct net_device *bond_dev;
@@ -229,6 +237,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	service_neigh_queue(ni);
 
 	zap_completion_queue();
+
 }
 
 static void refill_skbs(void)
@@ -1000,7 +1009,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		npinfo->rx_flags = 0;
+		npinfo->flags = 0;
 		INIT_LIST_HEAD(&npinfo->rx_np);
 
 		spin_lock_init(&npinfo->rx_lock);
@@ -1025,7 +1034,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	if (np->rx_hook) {
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		npinfo->rx_flags |= NETPOLL_RX_ENABLED;
+		set_bit(NETPOLL_RX_ENABLED, &npinfo->flags);
 		list_add_tail(&np->rx, &npinfo->rx_np);
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
@@ -1204,7 +1213,7 @@ void __netpoll_cleanup(struct netpoll *np)
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
 		list_del(&np->rx);
 		if (list_empty(&npinfo->rx_np))
-			npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
+			clear_bit(NETPOLL_RX_ENABLED, &npinfo->flags);
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-02-01 17:49 ` [PATCH v3] " Neil Horman
@ 2013-02-01 21:21   ` Francois Romieu
  0 siblings, 0 replies; 15+ messages in thread
From: Francois Romieu @ 2013-02-01 21:21 UTC (permalink / raw)
  To: Neil Horman
  Cc: netdev, Ivan Vecera, David S. Miller, Ben Hutchings, Eric Dumazet

Neil Horman <nhorman@tuxdriver.com> :
[...]
> diff --git a/net/core/dev.c b/net/core/dev.c
> index a87bc74..90b267a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
[...]
> @@ -1266,6 +1268,19 @@ static int __dev_open(struct net_device *dev)
>  	if (!netif_device_present(dev))
>  		return -ENODEV;
>  
> +	/*
> +	 * Block netpoll from trying to do any rx path servicing
> +	 * If we don't do this there is a chance ndo_poll_controller
> +	 * or ndo_poll may be running while we open the device
> +	 */
> +	rcu_read_lock();
> +	ni = rcu_dereference(dev->npinfo);
> +	if (ni && test_and_set_bit(NETPOLL_RX_ACTIVE, &ni->flags)) {
> +		rcu_read_unlock();
> +		return -EBUSY;
> +	}
> +	rcu_read_unlock();
> +

struct net_device contains no npinfo member when CONFIG_NETPOLL is not set.
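
The usual way out is to hide the helpers behind CONFIG_NETPOLL in netpoll.h,
something along these lines:

	#ifdef CONFIG_NETPOLL
	extern int netpoll_rx_disable(struct net_device *dev);
	extern void netpoll_rx_enable(struct net_device *dev);
	#else
	static inline int netpoll_rx_disable(struct net_device *dev) { return 0; }
	static inline void netpoll_rx_enable(struct net_device *dev) { return; }
	#endif

so net/core/dev.c never has to look at dev->npinfo itself.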

-- 
Ueimor

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-01-30 20:44 [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close] Neil Horman
                   ` (2 preceding siblings ...)
  2013-02-01 17:49 ` [PATCH v3] " Neil Horman
@ 2013-02-02 18:51 ` Neil Horman
  2013-02-03 21:05   ` David Miller
  2013-02-04 17:40 ` [PATCH v5] " Neil Horman
  2013-02-05 18:05 ` [PATCH v6] " Neil Horman
  5 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2013-02-02 18:51 UTC (permalink / raw)
  To: netdev
  Cc: Neil Horman, Ivan Vecera, David S. Miller, Ben Hutchings,
	Francois Romieu, Eric Dumazet

Ivan Vecera was recently backporting commit
9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initialization, it does nothing for the driver
during shutdown.  I.e. it is entirely possible for the ndo_poll_controller
method (or subsequently the ndo_poll routine) to be called in the netpoll path
on CPU A while, in parallel on CPU B, the ndo_close or ndo_open routine is
called.  Given that the latter two routines tend to initialize and free many
data structures that the former two rely on, the result can easily be data
corruption or various other crashes.  Furthermore, this is potentially a
problem with all net drivers that support netpoll, so it should ideally be
fixed in a common path.

As Ben H. pointed out to me, we can't perform dev_open/dev_close in atomic
context, so I've come up with this solution.  I've converted npinfo->rx_flags
to use bitops, so that we can atomically set/clear individual bits.  We use
one of these bits to provide mutual exclusion between the netpoll polling path
and the dev_open/close paths.  I can't say I'm a huge fan of returning -EBUSY
in open/close, but I like it better than busy waiting while the poll path
completes.

I've tested this here by flooding netconsole with messages on a system whose
NIC driver I modified to periodically return NETDEV_TX_BUSY, so that the
netpoll tx workqueue would be forced to send frames and poll the device.
While this was going on I rapidly ifdown/ifup'ed the interface and watched for
any problems.  I've not found any.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Francois Romieu <romieu@fr.zoreil.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>

---
Change notes

v3)
Updated netpoll_info to remove hole in struct on 64 bit arches (Eric D)

v4)
Fixed build break that occurs when CONFIG_NETPOLL is not enabled (Francois R)
---
 include/linux/netpoll.h | 41 +++++++++++++++++++++++++++++++++++++----
 net/core/dev.c          | 29 ++++++++++++++++++++++++++++-
 net/core/netpoll.c      | 23 ++++++++++++++++-------
 3 files changed, 81 insertions(+), 12 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index f54c3bb..305f754 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -35,10 +35,14 @@ struct netpoll {
 	struct rcu_head rcu;
 };
 
+#define NETPOLL_RX_ENABLED  1
+#define NETPOLL_RX_DROP     2
+#define NETPOLL_RX_ACTIVE   3
+
 struct netpoll_info {
 	atomic_t refcnt;
 
-	int rx_flags;
+	unsigned long flags;
 	spinlock_t rx_lock;
 	struct list_head rx_np; /* netpolls that registered an rx_hook */
 
@@ -51,6 +55,33 @@ struct netpoll_info {
 	struct rcu_head rcu;
 };
 
+static inline int netpoll_rx_disable(struct net_device *dev)
+{
+	int rc = 0;
+#ifdef CONFIG_NETPOLL
+	struct netpoll_info *ni;
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni && test_and_set_bit(NETPOLL_RX_ACTIVE, &ni->flags))
+		rc = -EBUSY;
+	rcu_read_unlock();
+
+#endif
+	return rc;
+}
+
+static inline void netpoll_rx_enable(struct net_device *dev)
+{
+#ifdef CONFIG_NETPOLL
+	struct netpoll_info *ni;
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni)
+		clear_bit(NETPOLL_RX_ACTIVE, &ni->flags);
+	rcu_read_unlock();
+#endif
+}
+
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
 void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
@@ -79,7 +110,8 @@ static inline bool netpoll_rx_on(struct sk_buff *skb)
 {
 	struct netpoll_info *npinfo = rcu_dereference_bh(skb->dev->npinfo);
 
-	return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags);
+	return npinfo && (!list_empty(&npinfo->rx_np) ||
+			   test_bit(NETPOLL_RX_ENABLED, &npinfo->flags));
 }
 
 static inline bool netpoll_rx(struct sk_buff *skb)
@@ -95,8 +127,9 @@ static inline bool netpoll_rx(struct sk_buff *skb)
 
 	npinfo = rcu_dereference_bh(skb->dev->npinfo);
 	spin_lock(&npinfo->rx_lock);
-	/* check rx_flags again with the lock held */
-	if (npinfo->rx_flags && __netpoll_rx(skb, npinfo))
+	/* check flags again with the lock held */
+	if (test_bit(NETPOLL_RX_ENABLED, &npinfo->flags) &&
+	    __netpoll_rx(skb, npinfo))
 		ret = true;
 	spin_unlock(&npinfo->rx_lock);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index a87bc74..05b7116 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1259,6 +1259,7 @@ EXPORT_SYMBOL(dev_load);
 static int __dev_open(struct net_device *dev)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
+
 	int ret;
 
 	ASSERT_RTNL();
@@ -1266,6 +1267,15 @@ static int __dev_open(struct net_device *dev)
 	if (!netif_device_present(dev))
 		return -ENODEV;
 
+	/*
+	 * Block netpoll from trying to do any rx path servicing
+	 * If we don't do this there is a chance ndo_poll_controller
+	 * or ndo_poll may be running while we open the device
+	 */
+	ret = netpoll_rx_disable(dev);
+	if (ret)
+		return ret;
+
 	ret = call_netdevice_notifiers(NETDEV_PRE_UP, dev);
 	ret = notifier_to_errno(ret);
 	if (ret)
@@ -1279,6 +1289,8 @@ static int __dev_open(struct net_device *dev)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
+	netpoll_rx_enable(dev);
+
 	if (ret)
 		clear_bit(__LINK_STATE_START, &dev->state);
 	else {
@@ -1370,9 +1382,16 @@ static int __dev_close(struct net_device *dev)
 	int retval;
 	LIST_HEAD(single);
 
+	/* Temporarily disable netpoll until the interface is down */
+	retval = netpoll_rx_disable(dev);
+	if (retval)
+		return retval;
+
 	list_add(&dev->unreg_list, &single);
 	retval = __dev_close_many(&single);
 	list_del(&single);
+
+	netpoll_rx_enable(dev);
 	return retval;
 }
 
@@ -1408,14 +1427,22 @@ static int dev_close_many(struct list_head *head)
  */
 int dev_close(struct net_device *dev)
 {
+	int ret = 0;
 	if (dev->flags & IFF_UP) {
 		LIST_HEAD(single);
 
+		/* Block netpoll rx while the interface is going down*/
+		ret = netpoll_rx_disable(dev);
+		if (ret)
+			return ret;
+
 		list_add(&dev->unreg_list, &single);
 		dev_close_many(&single);
 		list_del(&single);
+
+		netpoll_rx_enable(dev);
 	}
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(dev_close);
 
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 331ccb9..96c5bc5 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -48,8 +48,6 @@ static struct sk_buff_head skb_pool;
 static atomic_t trapped;
 
 #define USEC_PER_POLL	50
-#define NETPOLL_RX_ENABLED  1
-#define NETPOLL_RX_DROP     2
 
 #define MAX_SKB_SIZE							\
 	(sizeof(struct ethhdr) +					\
@@ -152,7 +150,7 @@ static int poll_one_napi(struct netpoll_info *npinfo,
 	if (!test_bit(NAPI_STATE_SCHED, &napi->state))
 		return budget;
 
-	npinfo->rx_flags |= NETPOLL_RX_DROP;
+	set_bit(NETPOLL_RX_DROP, &npinfo->flags);
 	atomic_inc(&trapped);
 	set_bit(NAPI_STATE_NPSVC, &napi->state);
 
@@ -161,7 +159,7 @@ static int poll_one_napi(struct netpoll_info *npinfo,
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
 	atomic_dec(&trapped);
-	npinfo->rx_flags &= ~NETPOLL_RX_DROP;
+	clear_bit(NETPOLL_RX_DROP, &npinfo->flags);
 
 	return budget - work;
 }
@@ -199,6 +197,14 @@ static void netpoll_poll_dev(struct net_device *dev)
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
 
+	/*
+	 * Don't do any rx activity unless NETPOLL_RX_ACTIVE is clear
+	 * the dev_open/close paths use this to block netpoll activity
+	 * while changing device state
+	 */
+	if (test_and_set_bit(NETPOLL_RX_ACTIVE, &dev->npinfo->flags))
+		return;
+
 	if (!dev || !netif_running(dev))
 		return;
 
@@ -211,6 +217,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev);
 
+	clear_bit(NETPOLL_RX_ACTIVE, &dev->npinfo->flags);
+
 	if (dev->flags & IFF_SLAVE) {
 		if (ni) {
 			struct net_device *bond_dev;
@@ -229,6 +237,7 @@ static void netpoll_poll_dev(struct net_device *dev)
 	service_neigh_queue(ni);
 
 	zap_completion_queue();
+
 }
 
 static void refill_skbs(void)
@@ -1000,7 +1009,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 			goto out;
 		}
 
-		npinfo->rx_flags = 0;
+		npinfo->flags = 0;
 		INIT_LIST_HEAD(&npinfo->rx_np);
 
 		spin_lock_init(&npinfo->rx_lock);
@@ -1025,7 +1034,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 
 	if (np->rx_hook) {
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
-		npinfo->rx_flags |= NETPOLL_RX_ENABLED;
+		set_bit(NETPOLL_RX_ENABLED, &npinfo->flags);
 		list_add_tail(&np->rx, &npinfo->rx_np);
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
@@ -1204,7 +1213,7 @@ void __netpoll_cleanup(struct netpoll *np)
 		spin_lock_irqsave(&npinfo->rx_lock, flags);
 		list_del(&np->rx);
 		if (list_empty(&npinfo->rx_np))
-			npinfo->rx_flags &= ~NETPOLL_RX_ENABLED;
+			clear_bit(NETPOLL_RX_ENABLED, &npinfo->flags);
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-02-02 18:51 ` [PATCH v4] " Neil Horman
@ 2013-02-03 21:05   ` David Miller
  2013-02-04  0:32     ` Neil Horman
  0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2013-02-03 21:05 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, ivecera, bhutchings, romieu, eric.dumazet

From: Neil Horman <nhorman@tuxdriver.com>
Date: Sat,  2 Feb 2013 13:51:16 -0500

> I can't say I'm a huge fan of returning -EBUSY in open/close, but I
> like it better than busy waiting while the poll path completes.

Sorry Neil, this is a show stopper for me.

There is no reason why, just because of bad timing, a device open or
close that would normally succeed, should fail.  And this situation is
only being created because the mutual exclusion and context handling
is "hard", well that's too bad :-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-02-03 21:05   ` David Miller
@ 2013-02-04  0:32     ` Neil Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Neil Horman @ 2013-02-04  0:32 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, ivecera, bhutchings, romieu, eric.dumazet

On Sun, Feb 03, 2013 at 04:05:58PM -0500, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Sat,  2 Feb 2013 13:51:16 -0500
> 
> > I can't say I'm a huge fan of returning -EBUSY in open/close, but I
> > like it better than busy waiting while the poll path completes.
> 
> Sorry Neil, this is a show stopper for me.
> 
> There is no reason why, just because of bad timing, a device open or
> close that would normally succeed, should fail.  And this situation is
> only being created because the mutual exclusion and context handling
> is "hard", well that's too bad :-)
> 

Alright, fair enough Dave.  I'll either convert this to a busy wait, or use a
mutex to do this.
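
If I go the mutex route, the poll path would just try the lock and bail,
roughly (sketch only, real patch to follow):

	if (!mutex_trylock(&ni->dev_lock))
		return;	/* open/close in progress, skip this poll */

	ops->ndo_poll_controller(dev);
	poll_napi(dev);

	mutex_unlock(&ni->dev_lock);

while dev_open/dev_close take the mutex with a plain mutex_lock(), so they
remain free to sleep.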
Neil

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v5] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-01-30 20:44 [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close] Neil Horman
                   ` (3 preceding siblings ...)
  2013-02-02 18:51 ` [PATCH v4] " Neil Horman
@ 2013-02-04 17:40 ` Neil Horman
  2013-02-05  2:06   ` David Miller
  2013-02-05 18:05 ` [PATCH v6] " Neil Horman
  5 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2013-02-04 17:40 UTC (permalink / raw)
  To: netdev
  Cc: Neil Horman, Ivan Vecera, David S. Miller, Ben Hutchings,
	Francois Romieu, Eric Dumazet

Ivan Vecera was recently backporting commit
9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initialization, it does nothing for the driver
during shutdown.  I.e. it is entirely possible for the ndo_poll_controller
method (or subsequently the ndo_poll routine) to be called in the netpoll path
on CPU A while, in parallel on CPU B, the ndo_close or ndo_open routine is
called.  Given that the latter two routines tend to initialize and free many
data structures that the former two rely on, the result can easily be data
corruption or various other crashes.  Furthermore, this is potentially a
problem with all net drivers that support netpoll, so it should ideally be
fixed in a common path.

As Ben H. pointed out to me, we can't perform dev_open/dev_close in atomic
context, so I've come up with this solution.  We can use a mutex to sleep in
the open/close paths and just do a mutex_trylock in the napi poll path,
abandoning the poll attempt if the lock is held, as we'll just retry the poll
on the next send anyway.

I've tested this here by flooding netconsole with messages on a system whose
NIC driver I modified to periodically return NETDEV_TX_BUSY, so that the
netpoll tx workqueue would be forced to send frames and poll the device.
While this was going on I rapidly ifdown/ifup'ed the interface and watched for
any problems.  I've not found any.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Francois Romieu <romieu@fr.zoreil.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>

---
Change notes

v3)
Updated netpoll_info to remove hole in struct on 64 bit arches (Eric D)

v4)
Fixed build break that occurs when CONFIG_NETPOLL is not enabled (Francois R)

v5) Davem pointed out that while the bit-flag approach works, it's really ugly
to have to return -EBUSY because of a transient event that should quickly
clear.  Since we are in process context in the dev_open/close paths, I've
switched to using a mutex, allowing the open/close paths to sleep if netpoll
is active.  Since the netpoll path itself may execute in any context, we just
do a mutex_trylock there and abandon the poll if the mutex is already held.
Either the interface is going down, in which case the poll doesn't matter, or
the poll will just succeed on the next cycle anyway.  Note, I left Eric's
change in, as we can still save some space by rearranging the netpoll_info
structure.
---
 include/linux/netpoll.h | 11 ++++++++++-
 net/core/dev.c          | 28 +++++++++++++++++++++++++++-
 net/core/netpoll.c      | 41 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index f54c3bb..ab856d5 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -38,8 +38,9 @@ struct netpoll {
 struct netpoll_info {
 	atomic_t refcnt;
 
-	int rx_flags;
+	unsigned long rx_flags;
 	spinlock_t rx_lock;
+	struct mutex dev_lock;
 	struct list_head rx_np; /* netpolls that registered an rx_hook */
 
 	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
@@ -51,6 +52,14 @@ struct netpoll_info {
 	struct rcu_head rcu;
 };
 
+#ifdef CONFIG_NETPOLL
+extern int netpoll_rx_disable(struct net_device *dev);
+extern void netpoll_rx_enable(struct net_device *dev);
+#else
+static inline int netpoll_rx_disable(struct net_device *dev) { return 0; }
+static inline void netpoll_rx_enable(struct net_device *dev) { return; }
+#endif
+
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
 void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
diff --git a/net/core/dev.c b/net/core/dev.c
index a87bc74..c9357b5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1266,6 +1266,15 @@ static int __dev_open(struct net_device *dev)
 	if (!netif_device_present(dev))
 		return -ENODEV;
 
+	/*
+	 * Block netpoll from trying to do any rx path servicing.
+	 * If we don't do this there is a chance ndo_poll_controller
+	 * or ndo_poll may be running while we open the device
+	 */
+	ret = netpoll_rx_disable(dev);
+	if (ret)
+		return ret;
+
 	ret = call_netdevice_notifiers(NETDEV_PRE_UP, dev);
 	ret = notifier_to_errno(ret);
 	if (ret)
@@ -1279,6 +1288,8 @@ static int __dev_open(struct net_device *dev)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
+	netpoll_rx_enable(dev);
+
 	if (ret)
 		clear_bit(__LINK_STATE_START, &dev->state);
 	else {
@@ -1370,9 +1381,16 @@ static int __dev_close(struct net_device *dev)
 	int retval;
 	LIST_HEAD(single);
 
+	/* Temporarily disable netpoll until the interface is down */
+	retval = netpoll_rx_disable(dev);
+	if (retval)
+		return retval;
+
 	list_add(&dev->unreg_list, &single);
 	retval = __dev_close_many(&single);
 	list_del(&single);
+
+	netpoll_rx_enable(dev);
 	return retval;
 }
 
@@ -1408,14 +1426,22 @@ static int dev_close_many(struct list_head *head)
  */
 int dev_close(struct net_device *dev)
 {
+	int ret = 0;
 	if (dev->flags & IFF_UP) {
 		LIST_HEAD(single);
 
+		/* Block netpoll rx while the interface is going down*/
+		ret = netpoll_rx_disable(dev);
+		if (ret)
+			return ret;
+
 		list_add(&dev->unreg_list, &single);
 		dev_close_many(&single);
 		list_del(&single);
+
+		netpoll_rx_enable(dev);
 	}
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(dev_close);
 
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 331ccb9..c4dfe0d 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -47,6 +47,8 @@ static struct sk_buff_head skb_pool;
 
 static atomic_t trapped;
 
+static struct srcu_struct netpoll_srcu;
+
 #define USEC_PER_POLL	50
 #define NETPOLL_RX_ENABLED  1
 #define NETPOLL_RX_DROP     2
@@ -199,6 +201,14 @@ static void netpoll_poll_dev(struct net_device *dev)
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
 
+	/*
+	 * Don't do any rx activity if the dev_lock mutex is held 
+	 * the dev_open/close paths use this to block netpoll activity
+	 * while changing device state
+	 */
+	if (!mutex_trylock(&dev->npinfo->dev_lock))
+		return;
+
 	if (!dev || !netif_running(dev))
 		return;
 
@@ -211,6 +221,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev);
 
+	mutex_unlock(&dev->npinfo->dev_lock);
+
 	if (dev->flags & IFF_SLAVE) {
 		if (ni) {
 			struct net_device *bond_dev;
@@ -231,6 +243,31 @@ static void netpoll_poll_dev(struct net_device *dev)
 	zap_completion_queue();
 }
 
+int netpoll_rx_disable(struct net_device *dev)
+{
+        struct netpoll_info *ni;
+	int idx;
+        might_sleep();
+        idx = srcu_read_lock(&netpoll_srcu);
+        ni = srcu_dereference(dev->npinfo, &netpoll_srcu);
+        if (ni)
+                mutex_lock(&ni->dev_lock);
+        srcu_read_unlock(&netpoll_srcu, idx);
+        return 0;
+}
+EXPORT_SYMBOL(netpoll_rx_disable);
+
+void netpoll_rx_enable(struct net_device *dev)
+{
+        struct netpoll_info *ni;
+        rcu_read_lock();
+        ni = rcu_dereference(dev->npinfo);
+        if (ni)
+                mutex_unlock(&ni->dev_lock);
+        rcu_read_unlock();
+}
+EXPORT_SYMBOL(netpoll_rx_enable);
+
 static void refill_skbs(void)
 {
 	struct sk_buff *skb;
@@ -1004,6 +1041,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 		INIT_LIST_HEAD(&npinfo->rx_np);
 
 		spin_lock_init(&npinfo->rx_lock);
+		mutex_init(&npinfo->dev_lock);
 		skb_queue_head_init(&npinfo->neigh_tx);
 		skb_queue_head_init(&npinfo->txq);
 		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
@@ -1169,6 +1207,7 @@ EXPORT_SYMBOL(netpoll_setup);
 static int __init netpoll_init(void)
 {
 	skb_queue_head_init(&skb_pool);
+	init_srcu_struct(&netpoll_srcu);
 	return 0;
 }
 core_initcall(netpoll_init);
@@ -1208,6 +1247,8 @@ void __netpoll_cleanup(struct netpoll *np)
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
 
+	synchronize_srcu(&netpoll_srcu);
+
 	if (atomic_dec_and_test(&npinfo->refcnt)) {
 		const struct net_device_ops *ops;
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v5] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-02-04 17:40 ` [PATCH v5] " Neil Horman
@ 2013-02-05  2:06   ` David Miller
  2013-02-05 15:07     ` Neil Horman
  0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2013-02-05  2:06 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, ivecera, bhutchings, romieu, eric.dumazet

From: Neil Horman <nhorman@tuxdriver.com>
Date: Mon,  4 Feb 2013 12:40:44 -0500

> +	/*
> +	 * Block netpoll from trying to do any rx path servicing.
> +	 * If we don't do this there is a chance ndo_poll_controller
> +	 * or ndo_poll may be running while we open the device
> +	 */

Please format comments:

	/* Like
	 * this.
	 */

> +	/*
> +	 * Don't do any rx activity if the dev_lock mutex is held 
> +	 * the dev_open/close paths use this to block netpoll activity
> +	 * while changing device state
> +	 */

Same here.

> +int netpoll_rx_disable(struct net_device *dev)
> +{
> +        struct netpoll_info *ni;
> +	int idx;
> +        might_sleep();
> +        idx = srcu_read_lock(&netpoll_srcu);
> +        ni = srcu_dereference(dev->npinfo, &netpoll_srcu);

A lot of interesting indentation going on here.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-02-05  2:06   ` David Miller
@ 2013-02-05 15:07     ` Neil Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Neil Horman @ 2013-02-05 15:07 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, ivecera, bhutchings, romieu, eric.dumazet

On Mon, Feb 04, 2013 at 09:06:42PM -0500, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Mon,  4 Feb 2013 12:40:44 -0500
> 
> > +	/*
> > +	 * Block netpoll from trying to do any rx path servicing.
> > +	 * If we don't do this there is a chance ndo_poll_controller
> > +	 * or ndo_poll may be running while we open the device
> > +	 */
> 
> Please format comments:
> 
> 	/* Like
> 	 * this.
> 	 */
> 
> > +	/*
> > +	 * Don't do any rx activity if the dev_lock mutex is held 
> > +	 * the dev_open/close paths use this to block netpoll activity
> > +	 * while changing device state
> > +	 */
> 
> Same here.
> 
> > +int netpoll_rx_disable(struct net_device *dev)
> > +{
> > +        struct netpoll_info *ni;
> > +	int idx;
> > +        might_sleep();
> > +        idx = srcu_read_lock(&netpoll_srcu);
> > +        ni = srcu_dereference(dev->npinfo, &netpoll_srcu);
> 
> A lot of interesting indentation going on here.
Apologies, Dave, I rushed and forgot to run checkpatch.  I'll resend shortly.
Neil



* [PATCH v6] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-01-30 20:44 [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close] Neil Horman
                   ` (4 preceding siblings ...)
  2013-02-04 17:40 ` [PATCH v5] " Neil Horman
@ 2013-02-05 18:05 ` Neil Horman
  2013-02-06 20:45   ` David Miller
  5 siblings, 1 reply; 15+ messages in thread
From: Neil Horman @ 2013-02-05 18:05 UTC (permalink / raw)
  To: netdev
  Cc: Neil Horman, Ivan Vecera, David S. Miller, Ben Hutchings,
	Francois Romieu, Eric Dumazet

Ivan Vecera was recently backporting commit
9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initialization, it does nothing for the driver
during shutdown. I.e. it would be entirely possible to have the
ndo_poll_controller method (or subsequently the ndo_poll) routine called for a
driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or
ndo_open routine could be called.  Given that the two latter routines tend to
initialize and free many data structures that the former two rely on, the result
can easily be data corruption or various other crashes.  Furthermore, it seems
that this is potentially a problem with all net drivers that support netpoll,
and so this should ideally be fixed in a common path.

As Ben H pointed out to me, we can't perform dev_open/dev_close in atomic
context, so I've come up with this solution.  We can use a mutex to sleep in the
open/close paths and just do a mutex_trylock in the napi poll path, abandoning
the poll attempt if we're locked, as we'll just retry the poll on the next send
anyway.
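
To make that split concrete, here is a small self-contained userspace sketch
of the same pattern, using pthread mutexes and made-up names (demo_dev,
demo_set_state, demo_poll); it only illustrates the locking idea and is not
the kernel code in the diff below:

/* Userspace analogue of the scheme described above; build with: cc demo.c -pthread */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct demo_dev {
	pthread_mutex_t dev_lock;	/* stands in for netpoll_info->dev_lock */
	bool running;			/* stands in for netif_running() */
};

static struct demo_dev dev = { .dev_lock = PTHREAD_MUTEX_INITIALIZER };

/* "open/close" path: process context, so it is fine to sleep on the lock */
static void demo_set_state(bool up)
{
	pthread_mutex_lock(&dev.dev_lock);
	dev.running = up;		/* driver init/teardown would happen here */
	pthread_mutex_unlock(&dev.dev_lock);
}

/* "poll" path: may run in any context, so only try the lock and bail if busy */
static void demo_poll(void)
{
	if (pthread_mutex_trylock(&dev.dev_lock) != 0)
		return;			/* state change in progress; skip this poll */
	if (dev.running)
		printf("polling device\n");
	pthread_mutex_unlock(&dev.dev_lock);
}

int main(void)
{
	demo_set_state(true);		/* ifup */
	demo_poll();			/* poll runs */
	demo_set_state(false);		/* ifdown */
	demo_poll();			/* lock free, but device no longer running */
	return 0;
}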

I've tested this here by flooding netconsole with messages on a system whose
NIC driver I modified to periodically return NETDEV_TX_BUSY, so that the netpoll
tx workqueue would be forced to send frames and poll the device.  While this was
going on I rapidly ifdown/up'ed the interface and watched for any problems.
I've not found any.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Francois Romieu <romieu@fr.zoreil.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>

---
Change notes

v3)
Updated netpoll_info to remove a hole in the struct on 64-bit arches (Eric D)

v4)
Fixed build break that occurs when CONFIG_NETPOLL is not enabled (Francois R)

v5) Davem pointed out that while the bit flag approach works, it's really ugly to
have to return -EBUSY because of a transient event that should quickly clear.
Since we are in process context in the dev_open/close paths, I've switched to
using a mutex, allowing the open/close paths to sleep if netpoll is active.
Since the netpoll path itself may execute in any context, we just do a
mutex_trylock there and abandon the poll if it's already held.  Either the
interface is going down, and the poll doesn't matter, or it'll just succeed on
the next poll cycle anyway.  Note, I left Eric's change in, as we can still save
some space by rearranging the netpoll_info structure (see the sketch after
these notes).

v6) Fix comment and indentation silliness from my previous rushing
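
The hunks below also introduce an SRCU domain (netpoll_srcu):
netpoll_rx_disable() dereferences dev->npinfo and sleeps on dev_lock inside an
SRCU read-side section, and __netpoll_cleanup() does a synchronize_srcu()
before dropping its reference, so the info structure can't go away under a
sleeping disabler.  A cut-down, kernel-style sketch of that pairing is below;
the demo_* names are invented for illustration (the real code is in the hunks
that follow), and demo_srcu is assumed to be set up with init_srcu_struct() at
init time:

/* Illustrative sketch only; not part of the patch. */
#include <linux/kernel.h>
#include <linux/mutex.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/srcu.h>

struct demo_info {
	struct mutex dev_lock;
};

static struct srcu_struct demo_srcu;		/* init_srcu_struct() at init time */
static struct demo_info __rcu *demo_ptr;

/* Process-context disable path: SRCU read sections may sleep, so we can
 * block on dev_lock while demo_ptr is guaranteed not to be freed.  A later
 * demo_rx_enable() call releases the mutex again.
 */
static void demo_rx_disable(void)
{
	struct demo_info *ni;
	int idx;

	might_sleep();
	idx = srcu_read_lock(&demo_srcu);
	ni = srcu_dereference(demo_ptr, &demo_srcu);
	if (ni)
		mutex_lock(&ni->dev_lock);
	srcu_read_unlock(&demo_srcu, idx);
}

static void demo_rx_enable(void)
{
	struct demo_info *ni;

	rcu_read_lock();
	ni = rcu_dereference(demo_ptr);
	if (ni)
		mutex_unlock(&ni->dev_lock);
	rcu_read_unlock();
}

/* Teardown: unpublish the pointer, then wait for every SRCU reader to
 * finish before the structure may be freed.  Callers must not tear down
 * while a disable/enable pair is still outstanding.
 */
static void demo_cleanup(struct demo_info *ni)
{
	RCU_INIT_POINTER(demo_ptr, NULL);
	synchronize_srcu(&demo_srcu);
	kfree(ni);
}
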
---
 include/linux/netpoll.h | 11 ++++++++++-
 net/core/dev.c          | 27 ++++++++++++++++++++++++++-
 net/core/netpoll.c      | 40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index f54c3bb..ab856d5 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -38,8 +38,9 @@ struct netpoll {
 struct netpoll_info {
 	atomic_t refcnt;
 
-	int rx_flags;
+	unsigned long rx_flags;
 	spinlock_t rx_lock;
+	struct mutex dev_lock;
 	struct list_head rx_np; /* netpolls that registered an rx_hook */
 
 	struct sk_buff_head neigh_tx; /* list of neigh requests to reply to */
@@ -51,6 +52,14 @@ struct netpoll_info {
 	struct rcu_head rcu;
 };
 
+#ifdef CONFIG_NETPOLL
+extern int netpoll_rx_disable(struct net_device *dev);
+extern void netpoll_rx_enable(struct net_device *dev);
+#else
+static inline int netpoll_rx_disable(struct net_device *dev) { return 0; }
+static inline void netpoll_rx_enable(struct net_device *dev) { return; }
+#endif
+
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
 void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
diff --git a/net/core/dev.c b/net/core/dev.c
index a87bc74..ac08c1c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1266,6 +1266,14 @@ static int __dev_open(struct net_device *dev)
 	if (!netif_device_present(dev))
 		return -ENODEV;
 
+	/* Block netpoll from trying to do any rx path servicing.
+	 * If we don't do this there is a chance ndo_poll_controller
+	 * or ndo_poll may be running while we open the device
+	 */
+	ret = netpoll_rx_disable(dev);
+	if (ret)
+		return ret;
+
 	ret = call_netdevice_notifiers(NETDEV_PRE_UP, dev);
 	ret = notifier_to_errno(ret);
 	if (ret)
@@ -1279,6 +1287,8 @@ static int __dev_open(struct net_device *dev)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
+	netpoll_rx_enable(dev);
+
 	if (ret)
 		clear_bit(__LINK_STATE_START, &dev->state);
 	else {
@@ -1370,9 +1380,16 @@ static int __dev_close(struct net_device *dev)
 	int retval;
 	LIST_HEAD(single);
 
+	/* Temporarily disable netpoll until the interface is down */
+	retval = netpoll_rx_disable(dev);
+	if (retval)
+		return retval;
+
 	list_add(&dev->unreg_list, &single);
 	retval = __dev_close_many(&single);
 	list_del(&single);
+
+	netpoll_rx_enable(dev);
 	return retval;
 }
 
@@ -1408,14 +1425,22 @@ static int dev_close_many(struct list_head *head)
  */
 int dev_close(struct net_device *dev)
 {
+	int ret = 0;
 	if (dev->flags & IFF_UP) {
 		LIST_HEAD(single);
 
+		/* Block netpoll rx while the interface is going down */
+		ret = netpoll_rx_disable(dev);
+		if (ret)
+			return ret;
+
 		list_add(&dev->unreg_list, &single);
 		dev_close_many(&single);
 		list_del(&single);
+
+		netpoll_rx_enable(dev);
 	}
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(dev_close);
 
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 331ccb9..edcd9ad 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -47,6 +47,8 @@ static struct sk_buff_head skb_pool;
 
 static atomic_t trapped;
 
+static struct srcu_struct netpoll_srcu;
+
 #define USEC_PER_POLL	50
 #define NETPOLL_RX_ENABLED  1
 #define NETPOLL_RX_DROP     2
@@ -199,6 +201,13 @@ static void netpoll_poll_dev(struct net_device *dev)
 	const struct net_device_ops *ops;
 	struct netpoll_info *ni = rcu_dereference_bh(dev->npinfo);
 
+	/* Don't do any rx activity if the dev_lock mutex is held
+	 * the dev_open/close paths use this to block netpoll activity
+	 * while changing device state
+	 */
+	if (!mutex_trylock(&dev->npinfo->dev_lock))
+		return;
+
 	if (!dev || !netif_running(dev))
 		return;
 
@@ -211,6 +220,8 @@ static void netpoll_poll_dev(struct net_device *dev)
 
 	poll_napi(dev);
 
+	mutex_unlock(&dev->npinfo->dev_lock);
+
 	if (dev->flags & IFF_SLAVE) {
 		if (ni) {
 			struct net_device *bond_dev;
@@ -231,6 +242,31 @@ static void netpoll_poll_dev(struct net_device *dev)
 	zap_completion_queue();
 }
 
+int netpoll_rx_disable(struct net_device *dev)
+{
+	struct netpoll_info *ni;
+	int idx;
+	might_sleep();
+	idx = srcu_read_lock(&netpoll_srcu);
+	ni = srcu_dereference(dev->npinfo, &netpoll_srcu);
+	if (ni)
+		mutex_lock(&ni->dev_lock);
+	srcu_read_unlock(&netpoll_srcu, idx);
+	return 0;
+}
+EXPORT_SYMBOL(netpoll_rx_disable);
+
+void netpoll_rx_enable(struct net_device *dev)
+{
+	struct netpoll_info *ni;
+	rcu_read_lock();
+	ni = rcu_dereference(dev->npinfo);
+	if (ni)
+		mutex_unlock(&ni->dev_lock);
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL(netpoll_rx_enable);
+
 static void refill_skbs(void)
 {
 	struct sk_buff *skb;
@@ -1004,6 +1040,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev, gfp_t gfp)
 		INIT_LIST_HEAD(&npinfo->rx_np);
 
 		spin_lock_init(&npinfo->rx_lock);
+		mutex_init(&npinfo->dev_lock);
 		skb_queue_head_init(&npinfo->neigh_tx);
 		skb_queue_head_init(&npinfo->txq);
 		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
@@ -1169,6 +1206,7 @@ EXPORT_SYMBOL(netpoll_setup);
 static int __init netpoll_init(void)
 {
 	skb_queue_head_init(&skb_pool);
+	init_srcu_struct(&netpoll_srcu);
 	return 0;
 }
 core_initcall(netpoll_init);
@@ -1208,6 +1246,8 @@ void __netpoll_cleanup(struct netpoll *np)
 		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
 	}
 
+	synchronize_srcu(&netpoll_srcu);
+
 	if (atomic_dec_and_test(&npinfo->refcnt)) {
 		const struct net_device_ops *ops;
 
-- 
1.7.11.7


* Re: [PATCH v6] netpoll: protect napi_poll and poll_controller during dev_[open|close]
  2013-02-05 18:05 ` [PATCH v6] " Neil Horman
@ 2013-02-06 20:45   ` David Miller
  0 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2013-02-06 20:45 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, ivecera, bhutchings, romieu, eric.dumazet

From: Neil Horman <nhorman@tuxdriver.com>
Date: Tue,  5 Feb 2013 13:05:43 -0500

> Ivan Vecera was recently backporting commit
> 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
> while this patch protects the tg3 driver from having its ndo_poll_controller
> routine called during device initialization, it does nothing for the driver
> during shutdown. I.e. it would be entirely possible to have the
> ndo_poll_controller method (or subsequently the ndo_poll) routine called for a
> driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or
> ndo_open routine could be called.  Given that the two latter routines tend to
> initialize and free many data structures that the former two rely on, the result
> can easily be data corruption or various other crashes.  Furthermore, it seems
> that this is potentially a problem with all net drivers that support netpoll,
> and so this should ideally be fixed in a common path.
> 
> As Ben H pointed out to me, we can't perform dev_open/dev_close in atomic
> context, so I've come up with this solution.  We can use a mutex to sleep in the
> open/close paths and just do a mutex_trylock in the napi poll path, abandoning
> the poll attempt if we're locked, as we'll just retry the poll on the next send
> anyway.
> 
> I've tested this here by flooding netconsole with messages on a system whose
> NIC driver I modified to periodically return NETDEV_TX_BUSY, so that the netpoll
> tx workqueue would be forced to send frames and poll the device.  While this was
> going on I rapidly ifdown/up'ed the interface and watched for any problems.
> I've not found any.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

Applied to net-next, thanks Neil.


end of thread, other threads:[~2013-02-06 20:45 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-30 20:44 [PATCH] netpoll: protect napi_poll and poll_controller during dev_[open|close] Neil Horman
2013-01-30 21:07 ` Ben Hutchings
2013-01-30 21:25   ` Neil Horman
2013-02-01 17:02 ` [PATCH v2] " Neil Horman
2013-02-01 17:14   ` Eric Dumazet
2013-02-01 17:49 ` [PATCH v3] " Neil Horman
2013-02-01 21:21   ` Francois Romieu
2013-02-02 18:51 ` [PATCH v4] " Neil Horman
2013-02-03 21:05   ` David Miller
2013-02-04  0:32     ` Neil Horman
2013-02-04 17:40 ` [PATCH v5] " Neil Horman
2013-02-05  2:06   ` David Miller
2013-02-05 15:07     ` Neil Horman
2013-02-05 18:05 ` [PATCH v6] " Neil Horman
2013-02-06 20:45   ` David Miller
