From mboxrd@z Thu Jan 1 00:00:00 1970 From: Doug Ledford Subject: Re: [PATCH 2/8] IPoIB: Make the carrier_on_task race aware Date: Tue, 19 Aug 2014 16:32:01 -0400 (EDT) Message-ID: <2CC1794B-10CD-49A2-8F5D-0C66A6684DBC@redhat.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8BIT Return-path: In-Reply-To: Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Wendy Cheng Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" , "roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org" List-Id: linux-rdma@vger.kernel.org Whether or not the core network code is OK with us dropping the rtnl lock while manipulating the interface is the issue here. However, I did consider changing the mcast_mutex to a per interface lock instead. The are various optimizations that can be made now that the locking is correct and race free and that we don't flush from the workqueue we are flushing. I didn't go into those changes in this set. As for 20ms, I did that because checkpatch complains about shorter waits and I don't like to busy loop, especially since it's not like this is a performance sensitive section of code. Sent from my iPad > On Aug 18, 2014, at 4:26 PM, Wendy Cheng wrote: > >> On Tue, Aug 12, 2014 at 4:38 PM, Doug Ledford wrote: >> We blindly assume that we can just take the rtnl lock and that will >> prevent races with downing this interface. Unfortunately, that's not >> the case. In ipoib_mcast_stop_thread() we will call flush_workqueue() >> in an attempt to clear out all remaining instances of ipoib_join_task. >> But, since this task is put on the same workqueue as the join task, the >> flush_workqueue waits on this thread too. But this thread is deadlocked >> on the rtnl lock. The better thing here is to use trylock and loop on >> that until we either get the lock or we see that FLAG_ADMIN_UP has >> been cleared, in which case we don't need to do anything anyway and we >> just return. >> >> Signed-off-by: Doug Ledford >> --- >> drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 21 +++++++++++++++------ >> 1 file changed, 15 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c >> index a0a42859f12..7e9cd39b5ef 100644 >> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c >> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c >> @@ -353,18 +353,27 @@ void ipoib_mcast_carrier_on_task(struct work_struct *work) >> carrier_on_task); >> struct ib_port_attr attr; >> >> - /* >> - * Take rtnl_lock to avoid racing with ipoib_stop() and >> - * turning the carrier back on while a device is being >> - * removed. >> - */ >> if (ib_query_port(priv->ca, priv->port, &attr) || >> attr.state != IB_PORT_ACTIVE) { >> ipoib_dbg(priv, "Keeping carrier off until IB port is active\n"); >> return; >> } >> >> - rtnl_lock(); >> + /* >> + * Take rtnl_lock to avoid racing with ipoib_stop() and >> + * turning the carrier back on while a device is being >> + * removed. However, ipoib_stop() will attempt to flush >> + * the workqueue while holding the rtnl lock, so loop >> + * on trylock until either we get the lock or we see >> + * FLAG_ADMIN_UP go away as that signals that we are bailing >> + * and can safely ignore the carrier on work >> + */ >> + while (!rtnl_trylock()) { >> + if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) >> + return; >> + else >> + msleep(20); >> + } > > > I always think rtnl lock is too big for this purpose... and that 20 ms > is not ideal either. Could we have a new IPOIB private mutex used by > ipoib_stop() and this section of code ? So something like: > > ipoib_stop() > { ..... > mutex_lock(&something_new); > clear_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags); > ... > mutex_unlock(&something_new); > return 0; > } > > Then the loop would become: > > // this while-loop will be very short - since we either get the mutex > quickly or "return" quickly. > while (!mutex_trylock(&something_new)) { > if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) > return; > } > > >> if (!ipoib_cm_admin_enabled(priv->dev)) >> dev_set_mtu(priv->dev, min(priv->mcast_mtu, priv->admin_mtu)); >> netif_carrier_on(priv->dev); > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html