All of lore.kernel.org
 help / color / mirror / Atom feed
From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Wendy Cheng <s.wendy.cheng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org"
	<roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Subject: Re: [PATCH 3/8] IPoIB: fix MCAST_FLAG_BUSY usage
Date: Tue, 19 Aug 2014 16:28:44 -0400 (EDT)	[thread overview]
Message-ID: <902D5BF2-159A-4B31-A87F-7491F3C8057F@redhat.com> (raw)
In-Reply-To: <CABgxfbH-Dt3CpxJKwCAZeHTUyupaA9y_WXVXuxgiPMet26PTQw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Excuse my top posting, I'm at a meeting and not using my normal mail client.

To answer your question, this is a confusing issue that I had to work through myself.

When we call ib_sa_multicast_join, it will create a record for us and return that.  That record is not valid for us to call ib_sa_multicast_leave until after the join has completed and our callback has acknowledged that join completion with a 0 return status from our callback.

At the time the record is created, ib_sa_multicast_join will also queue up the actual join.  When the join completes, they call our callback (and without my locking fixes, we can get the callback before ib_sa_multicast_join returns our record entry to us).

If our callback is called with the status already set, then the join has failed and will be handled that way when our callback returns.  In this case, the callback is informative for us, that's all.  When we trap -ENETRESET and put our return to 0, we are acknowledging that the join failed prior to our callback and that the join will not be completed and that mcast->mc as we have recorded is not valid.

If the status is 0 to start, and we set it to a non-0 value, then that also indicates that the join has failed, but in this case the core code must unroll the join that was completed up until the callback was done (and it does).  Any time this is the case, the member record returned from ib_sa_multicast_join is invalid and we can not call ib_sa_multicast_leave on the record or we will hang forever waiting for a completion that won't be coming.

Only if we get our callback entered with a 0 status and return with a 0 status does the member record that ib_sa_multicast_join returned to us become fully valid.  At this point, we must call ib_sa_multicast_leave on the record if we want it to go away.

So that's why in this patch we

1) take a mutex to force ib_sa_join_multicast to return and us to set mcast->mc to the proper return value before we process the join completion callback
2) always clear mcast->mc if there is any error since we can't call ib_sa_multicast_leave
3) always complete the mcast in case we are waiting on it
4) only if our status is ENETRESET set our return to 0 so the ib core code knows we acknowledged the event

Sent from my iPad

> On Aug 19, 2014, at 11:08 AM, Wendy Cheng <s.wendy.cheng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 
>> On Tue, Aug 12, 2014 at 4:38 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> Commit a9c8ba5884 (IPoIB: Fix usage of uninitialized multicast objects)
>> added a new flag MCAST_JOIN_STARTED, but was not very strict in how it
>> was used.  We didn't always initialize the completion struct before we
>> set the flag, and we didn't always call complete on the completion
>> struct from all paths that complete it.  This made it less than totally
>> effective, and certainly made its use confusing.  And in the flush
>> function we would use the presence of this flag to signal that we should
>> wait on the completion struct, but we never cleared this flag, ever.
>> This is further muddied by the fact that we overload the MCAST_FLAG_BUSY
>> flag to mean two different things: we have a join in flight, and we have
>> succeeded in getting an ib_sa_join_multicast.
>> 
> 
> [snip]
> 
>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>> index 7e9cd39b5ef..f5e8da530d9 100644
>> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
>> @@ -271,16 +271,27 @@ ipoib_mcast_sendonly_join_complete(int status,
>>        struct ipoib_mcast *mcast = multicast->context;
>>        struct net_device *dev = mcast->dev;
>> 
>> +       /*
>> +        * We have to take the mutex to force mcast_sendonly_join to
>> +        * return from ib_sa_multicast_join and set mcast->mc to a
>> +        * valid value.  Otherwise we were racing with ourselves in
>> +        * that we might fail here, but get a valid return from
>> +        * ib_sa_multicast_join after we had cleared mcast->mc here,
>> +        * resulting in mis-matched joins and leaves and a deadlock
>> +        */
>> +       mutex_lock(&mcast_mutex);
>> +
>>        /* We trap for port events ourselves. */
>>        if (status == -ENETRESET)
>> -               return 0;
>> +               goto out;
>> 
> 
> [snip]
> 
> 
>> +out:
>> +       clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
>> +       if (status)
>> +               mcast->mc = NULL;
>> +       complete(&mcast->done);
>> +       if (status == -ENETRESET)
>> +               status = 0;
>> +       mutex_unlock(&mcast_mutex);
>>        return status;
>> }
>> 
> 
> I have not gone thru the whole patch yet. However, here is something quick ...
> 
> Would this "return 0 if (status == -ENETRESET)" be wrong in the
> original code ?  Say ..... when it returns to process_group_error(),
> without proper error return code, it'll skip ib_sa_free_multicast().
> Should we worry about it ?
> 
> -- Wendy
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2014-08-19 20:28 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-12 23:38 [PATCH 0/8] IPoIB: Fix multiple race conditions Doug Ledford
     [not found] ` <cover.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-12 23:38   ` [PATCH 1/8] IPoIB: Consolidate rtnl_lock tasks in workqueue Doug Ledford
     [not found]     ` <2394730ce5ae1d46522dca04066293dd842edf16.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-15 22:11       ` Wendy Cheng
2014-09-04 14:28       ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 2/8] IPoIB: Make the carrier_on_task race aware Doug Ledford
     [not found]     ` <d05cdce70f1312f35f8be2d14bafd2a06809b137.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-18 23:26       ` Wendy Cheng
     [not found]         ` <CABgxfbE6edfZZ58=mTvhGqWSkCxsik0XuQPR0L-yayze=803cg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-19 20:32           ` Doug Ledford
     [not found]             ` <2CC1794B-10CD-49A2-8F5D-0C66A6684DBC-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-19 22:05               ` Wendy Cheng
2014-09-04 12:13       ` Erez Shitrit
     [not found]         ` <5408576C.7040609-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-09-09  7:17           ` Doug Ledford
2014-08-12 23:38   ` [PATCH 3/8] IPoIB: fix MCAST_FLAG_BUSY usage Doug Ledford
     [not found]     ` <a410e80dc5ca7cfa64229bbbf50c1337317e3bd8.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-19 18:08       ` Wendy Cheng
     [not found]         ` <CABgxfbH-Dt3CpxJKwCAZeHTUyupaA9y_WXVXuxgiPMet26PTQw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-19 20:28           ` Doug Ledford [this message]
     [not found]             ` <902D5BF2-159A-4B31-A87F-7491F3C8057F-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-25 19:51               ` Wendy Cheng
     [not found]                 ` <CABgxfbHOD75vLdZ0TtWZbk8ne3kHd_eWObxPHmoJ-D8DjE0bkw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-25 20:03                   ` Doug Ledford
     [not found]                     ` <E3EFCBAC-2D6E-49D3-A556-DBD40701CC5F-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-26 18:04                       ` Wendy Cheng
2014-08-12 23:38   ` [PATCH 4/8] IPoIB: fix mcast_dev_flush/mcast_restart_task race Doug Ledford
     [not found]     ` <e1dbcfc25d8930b281aad12699ebf8fa82485b0e.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-29 21:53       ` Wendy Cheng
     [not found]         ` <CABgxfbHDuUrdHuLJT2oD07Cy3Ys4_rj-bJ6eR=9+uv0CuPH7_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-08-30 15:39           ` Wendy Cheng
     [not found]             ` <CABgxfbGJOdmAn1sokEtisDdnA=r_4mfP=PfqZVsP0cd_oL50dA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-09-03 18:06               ` Doug Ledford
     [not found]                 ` <1409767614.26762.7.camel-v+aXH1h/sVwpzh8Nc7Vzg+562jBIR2Zt@public.gmane.org>
2014-09-03 19:45                   ` Wendy Cheng
2014-09-03 17:49           ` Doug Ledford
2014-08-12 23:38   ` [PATCH 5/8] IPoIB: change init sequence ordering Doug Ledford
     [not found]     ` <ead9800512c1cb412b86cb1de3868c40f07c72be.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-09-04 12:36       ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 6/8] IPoIB: Use dedicated workqueues per interface Doug Ledford
     [not found]     ` <f7af9c251d722675a549e4a673f46c0f31dfa266.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:01       ` Estrin, Alex
2014-09-04  6:49       ` Erez Shitrit
     [not found]         ` <54080B6B.8050707-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-09-09  7:09           ` Doug Ledford
     [not found]             ` <540EA7B2.2050805-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-09-10 17:27               ` Erez Shitrit
2014-08-12 23:38   ` [PATCH 7/8] IPoIB: Make ipoib_mcast_stop_thread flush the workqueue Doug Ledford
     [not found]     ` <ae3912431eeacd81d920a405a6bdeb3853791b1a.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:03       ` Estrin, Alex
2014-08-12 23:38   ` [PATCH 8/8] IPoIB: No longer use flush as a parameter Doug Ledford
     [not found]     ` <ad7bb2b8da52f187cf2978e6a1c77ead32b60de3.1407885724.git.dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-08-20 15:04       ` Estrin, Alex
2014-08-15 22:08   ` [PATCH 0/8] IPoIB: Fix multiple race conditions Wendy Cheng
     [not found]     ` <CABgxfbEGfiNGUKT4NJi1GoDRouFznxpgogDt5yr47TLfwDB7hA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-09-03 16:26       ` Wendy Cheng
2014-09-03 13:52   ` Or Gerlitz
     [not found]     ` <54071D14.9040404-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2014-09-03 18:12       ` Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=902D5BF2-159A-4B31-A87F-7491F3C8057F@redhat.com \
    --to=dledford-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=s.wendy.cheng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.