From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wendy Cheng Subject: Re: [PATCH 4/8] IPoIB: fix mcast_dev_flush/mcast_restart_task race Date: Fri, 29 Aug 2014 14:53:32 -0700 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Return-path: In-Reply-To: Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Doug Ledford Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" , "roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org" List-Id: linux-rdma@vger.kernel.org On Tue, Aug 12, 2014 at 4:38 PM, Doug Ledford wrote: > Our mcast_dev_flush routine and our mcast_restart_task can race against > each other. In particular, they both hold the priv->lock while > manipulating the rbtree and while removing mcast entries from the > multicast_list and while adding entries to the remove_list, but they > also both drop their locks prior to doing the actual removes. The > mcast_dev_flush routine is run entirely under the rtnl lock and so has > at least some locking. The actual race condition is like this: > > Thread 1 Thread 2 > ifconfig ib0 up > start multicast join for broadcast > multicast join completes for broadcast > start to add more multicast joins > call mcast_restart_task to add new entries > ifconfig ib0 down > mcast_dev_flush > mcast_leave(mcast A) > mcast_leave(mcast A) > > As mcast_leave calls ib_sa_multicast_leave, and as member in > core/multicast.c is ref counted, we run into an unbalanced refcount > issue. To avoid stomping on each others removes, take the rtnl lock > specifically when we are deleting the entries from the remove list. Isn't "test_and_clear_bit()" atomic so it is unlikely that ib_sa_free_multicast() can run multiple times ? 638 static int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast) 639 { 640 struct ipoib_dev_priv *priv = netdev_priv(dev); 641 int ret = 0; 642 643 if (test_and_clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) 644 ib_sa_free_multicast(mcast->mc); 645 646 if (test_and_clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags) ) { -- Wendy > > Signed-off-by: Doug Ledford > --- > drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 37 ++++++++++++++++++++++---- > 1 file changed, 32 insertions(+), 5 deletions(-) > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c > index f5e8da530d9..19e3fe75ebf 100644 > --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c > +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c > @@ -810,7 +810,10 @@ void ipoib_mcast_dev_flush(struct net_device *dev) > > spin_unlock_irqrestore(&priv->lock, flags); > > - /* seperate between the wait to the leave*/ > + /* > + * make sure the in-flight joins have finished before we attempt > + * to leave > + */ > list_for_each_entry_safe(mcast, tmcast, &remove_list, list) > if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) > wait_for_completion(&mcast->done); > @@ -931,14 +934,38 @@ void ipoib_mcast_restart_task(struct work_struct *work) > netif_addr_unlock(dev); > local_irq_restore(flags); > > - /* We have to cancel outside of the spinlock */ > + /* > + * make sure the in-flight joins have finished before we attempt > + * to leave > + */ > + list_for_each_entry_safe(mcast, tmcast, &remove_list, list) > + if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) > + wait_for_completion(&mcast->done); > + > + /* > + * We have to cancel outside of the spinlock, but we have to > + * take the rtnl lock or else we race with the removal of > + * entries from the remove list in mcast_dev_flush as part > + * of ipoib_stop() which will call mcast_stop_thread with > + * flush == 1 while holding the rtnl lock, and the > + * flush_workqueue won't complete until this restart_mcast_task > + * completes. So do like the carrier on task and attempt to > + * take the rtnl lock, but if we can't before the ADMIN_UP flag > + * goes away, then just return and know that the remove list will > + * get flushed later by mcast_dev_flush. > + */ > + while (!rtnl_trylock()) { > + if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) > + return; > + else > + msleep(20); > + } > list_for_each_entry_safe(mcast, tmcast, &remove_list, list) { > ipoib_mcast_leave(mcast->dev, mcast); > ipoib_mcast_free(mcast); > } > - > - if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) > - ipoib_mcast_start_thread(dev); > + ipoib_mcast_start_thread(dev); > + rtnl_unlock(); > } > > #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG > -- > 1.9.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html