From: Chuck Lever <chuck.lever@oracle.com>
To: Dan Aloni <dan@kernelim.com>
Cc: linux-rdma@vger.kernel.org,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] xprtrdma: Wake up re_connect_wait on disconnect
Date: Sun, 21 Jun 2020 10:49:53 -0400
Message-ID: <E3C3C032-CAB9-4AA7-B574-0A037A4F37FC@oracle.com>
In-Reply-To: <AC3CC4DE-C508-4F95-9F0D-B2977CD7301F@oracle.com>
Hi Dan-
> On Jun 20, 2020, at 2:46 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
> Hi Dan-
>
>> On Jun 20, 2020, at 1:18 PM, Dan Aloni <dan@kernelim.com> wrote:
>>
>> Given that rpcrdma_xprt_connect() happens from workqueue context, in cases where
>> connections don't succeed, something needs to wake it up. In my case, this has
>> been observed when the CM callback received `RDMA_CM_EVENT_REJECTED`, and
>> `rpcrdma_xprt_connect()` slept forever.
>
> Interesting. My development and testing generates plenty of REJECTED connection
> requests, but I never saw this particular failure mode.
Correction: My testing _used_ _to_ generate REJECTED events regularly. It does
not seem to any more, even after client crashes. So that explains why I haven't
seen this before.
I haven't reproduced the problem here, but the fix still looks proper to me,
and doesn't appear to introduce any regressions. I do have some issues with your
proposed patch, though.
The first paragraph of the patch description is incorrect. RDMA_CM_EVENT_DISCONNECTED
can occur only once a connection has been established. That guarantees there are no
waiters on re_connect_wait in that case. It's connect errors that need to wake up
the connect worker.
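
To illustrate the pattern, here is a rough userspace analogy. It is not the
xprtrdma code: pthreads stand in for the kernel wait queue, and every name in
it (demo_ep, demo_cm_event, demo_connect) is made up for the example. The
connecting side sleeps until a status is recorded, so any event path that
records an error without also waking the waiter leaves it asleep forever.

```c
/* Userspace analogy only; all names here are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct demo_ep {
	pthread_mutex_t lock;
	pthread_cond_t connect_wait;	/* stands in for re_connect_wait */
	int connect_status;		/* 0 = pending, >0 = connected, <0 = error */
};

struct demo_event {
	struct demo_ep *ep;
	int status;			/* must be non-zero, or the waiter never wakes */
};

/* Stands in for the CM event handler: record the outcome, then wake waiters. */
static void *demo_cm_event(void *arg)
{
	struct demo_event *ev = arg;

	pthread_mutex_lock(&ev->ep->lock);
	ev->ep->connect_status = ev->status;
	/* Without this broadcast, an error outcome would strand the waiter. */
	pthread_cond_broadcast(&ev->ep->connect_wait);
	pthread_mutex_unlock(&ev->ep->lock);
	return NULL;
}

/* Stands in for the connect worker: sleep until a status has been recorded. */
int demo_connect(int event_status)
{
	struct demo_ep ep = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.connect_wait = PTHREAD_COND_INITIALIZER,
		.connect_status = 0,
	};
	struct demo_event ev = { .ep = &ep, .status = event_status };
	pthread_t handler;
	int rc;

	rc = pthread_create(&handler, NULL, demo_cm_event, &ev);
	assert(rc == 0);

	pthread_mutex_lock(&ep.lock);
	while (ep.connect_status == 0)
		pthread_cond_wait(&ep.connect_wait, &ep.lock);
	rc = ep.connect_status;
	pthread_mutex_unlock(&ep.lock);

	pthread_join(handler, NULL);
	return rc;
}
```

Calling demo_connect(-ECONNREFUSED) models a failed connect attempt: the
waiter returns only because the handler woke it after recording the error.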
>> This continues the fix in commit 58bd6656f808 ('xprtrdma: Restore wake-up-all to
>> rpcrdma_cm_event_handler()').
IMO this paragraph needs to be replaced by:
Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
>> Signed-off-by: Dan Aloni <dan@kernelim.com>
>> CC: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>
>> Notes:
>> Hi Chuck,
>>
>> Maybe I missed something, as it is not clear to me how otherwise (without this
>> patch), re_connect_wait can be woken up in this situation. Please explain?
>>
>> net/sunrpc/xprtrdma/verbs.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
>> index 2ae348377806..8bd76a47a91f 100644
>> --- a/net/sunrpc/xprtrdma/verbs.c
>> +++ b/net/sunrpc/xprtrdma/verbs.c
>> @@ -289,6 +289,7 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
>> ep->re_connect_status = -ECONNABORTED;
>> disconnected:
>> xprt_force_disconnect(xprt);
>> + wake_up_all(&ep->re_connect_wait);
>> return rpcrdma_ep_destroy(ep);
>> default:
>> break;
This hunk does not apply on top of fixes I've already sent to Anna for 5.8-rc1.
So, if you don't object, I'll adjust your patch (this hunk and the description)
before sending it along to Anna.
--
Chuck Lever