All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
@ 2016-10-28 11:14 Hans Westgaard Ry
  2016-10-30 21:06 ` Sagi Grimberg
       [not found] ` <1477653269-27359-1-git-send-email-hans.westgaard.ry-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 2 replies; 9+ messages in thread
From: Hans Westgaard Ry @ 2016-10-28 11:14 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock, Matan Barak,
	Erez Shitrit, Bart Van Assche, Ira Weiny, Or Gerlitz,
	Hakon Bugge, Yuval Shaia, linux-rdma, linux-kernel

from "InfiBand Architecture Specifications Volume 1":

  A QP is said to have a stale connection when only one side has
  connection information. A stale connection may result if the remote CM
  had dropped the connection and sent a DREQ but the DREQ was never
  received by the local CM. Alternatively the remote CM may have lost
  all record of past connections because its node crashed and rebooted,
  while the local CM did not become aware of the remote node's reboot
  and therefore did not clean up stale connections.

and:

   A local CM may receive a REQ/REP for a stale connection. It shall
   abort the connection issuing REJ to the REQ/REP. It shall then issue
   DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.

This patch solves a problem with reuse of QPN. Current codebase, that
is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
in CM. A problem with this is the timeconstants governing this
mechanism; they are up to 768 seconds and the interface may look
inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
does the necessary cleanup and the interface comes up.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
---
 drivers/infiniband/core/cm.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index c995255..c97e4d5 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1519,6 +1519,7 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
 	struct cm_id_private *listen_cm_id_priv, *cur_cm_id_priv;
 	struct cm_timewait_info *timewait_info;
 	struct cm_req_msg *req_msg;
+	struct ib_cm_id *cm_id;
 
 	req_msg = (struct cm_req_msg *)work->mad_recv_wc->recv_buf.mad;
 
@@ -1540,10 +1541,18 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
 	timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
 	if (timewait_info) {
 		cm_cleanup_timewait(cm_id_priv->timewait_info);
+		cur_cm_id_priv = cm_get_id(timewait_info->work.local_id,
+					   timewait_info->work.remote_id);
+
 		spin_unlock_irq(&cm.lock);
 		cm_issue_rej(work->port, work->mad_recv_wc,
 			     IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
 			     NULL, 0);
+		if (cur_cm_id_priv) {
+			cm_id = &cur_cm_id_priv->id;
+			ib_send_cm_dreq(cm_id, NULL, 0);
+			cm_deref_id(cur_cm_id_priv);
+		}
 		return NULL;
 	}
 
@@ -1919,6 +1928,9 @@ static int cm_rep_handler(struct cm_work *work)
 	struct cm_id_private *cm_id_priv;
 	struct cm_rep_msg *rep_msg;
 	int ret;
+	struct cm_id_private *cur_cm_id_priv;
+	struct ib_cm_id *cm_id;
+	struct cm_timewait_info *timewait_info;
 
 	rep_msg = (struct cm_rep_msg *)work->mad_recv_wc->recv_buf.mad;
 	cm_id_priv = cm_acquire_id(rep_msg->remote_comm_id, 0);
@@ -1953,16 +1965,26 @@ static int cm_rep_handler(struct cm_work *work)
 		goto error;
 	}
 	/* Check for a stale connection. */
-	if (cm_insert_remote_qpn(cm_id_priv->timewait_info)) {
+	timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
+	if (timewait_info) {
 		rb_erase(&cm_id_priv->timewait_info->remote_id_node,
 			 &cm.remote_id_table);
 		cm_id_priv->timewait_info->inserted_remote_id = 0;
+		cur_cm_id_priv = cm_get_id(timewait_info->work.local_id,
+					   timewait_info->work.remote_id);
+
 		spin_unlock(&cm.lock);
 		spin_unlock_irq(&cm_id_priv->lock);
 		cm_issue_rej(work->port, work->mad_recv_wc,
 			     IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REP,
 			     NULL, 0);
 		ret = -EINVAL;
+		if (cur_cm_id_priv) {
+			cm_id = &cur_cm_id_priv->id;
+			ib_send_cm_dreq(cm_id, NULL, 0);
+			cm_deref_id(cur_cm_id_priv);
+		}
+
 		goto error;
 	}
 	spin_unlock(&cm.lock);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
  2016-10-28 11:14 [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP Hans Westgaard Ry
@ 2016-10-30 21:06 ` Sagi Grimberg
       [not found]   ` <8df2c90e-8581-4ce8-fd18-92ac57eb6160-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
       [not found] ` <1477653269-27359-1-git-send-email-hans.westgaard.ry-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 9+ messages in thread
From: Sagi Grimberg @ 2016-10-30 21:06 UTC (permalink / raw)
  To: Hans Westgaard Ry, Doug Ledford, Sean Hefty, Hal Rosenstock,
	Matan Barak, Erez Shitrit, Bart Van Assche, Ira Weiny,
	Or Gerlitz, Hakon Bugge, Yuval Shaia, linux-rdma, linux-kernel

> from "InfiBand Architecture Specifications Volume 1":
>
>   A QP is said to have a stale connection when only one side has
>   connection information. A stale connection may result if the remote CM
>   had dropped the connection and sent a DREQ but the DREQ was never
>   received by the local CM. Alternatively the remote CM may have lost
>   all record of past connections because its node crashed and rebooted,
>   while the local CM did not become aware of the remote node's reboot
>   and therefore did not clean up stale connections.
>
> and:
>
>    A local CM may receive a REQ/REP for a stale connection. It shall
>    abort the connection issuing REJ to the REQ/REP. It shall then issue
>    DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.
>
> This patch solves a problem with reuse of QPN. Current codebase, that
> is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
> in CM. A problem with this is the timeconstants governing this
> mechanism; they are up to 768 seconds and the interface may look
> inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
> does the necessary cleanup and the interface comes up.

I like this fix, so,

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

But I think the CM layer still is buggy in this area.

In vol 1 the state transition table specifically states that DREP
timeouts should move the cm_id to timewait state but the CM doesn't
seem to maintain response timeouts on disconnect requests. If the
DREQ happened to fail (send error completion) things are fine, but
if the DREQ makes it to the peer but it doesn't reply then no one
will take care of it (i.e. we will never see a TIMEWAIT event from
this cm_id)...

I recall some debugging session with Hal on this area a ~year ago
with a new iser target (which didn't reply to DREQs on reboot
sequences). iser initiator waits for a DISCONNECTED/TIMEWAIT events
before destroying the cm_id (which never happened because of the
above). I think I ended up working around that in iser to just go
ahead and destroy the cm_id after issuing a DREQ (but now I realize
it was never included so I'll probably dig it up again soon).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
  2016-10-30 21:06 ` Sagi Grimberg
@ 2016-10-31  4:52       ` santosh.shilimkar
  0 siblings, 0 replies; 9+ messages in thread
From: santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA @ 2016-10-31  4:52 UTC (permalink / raw)
  To: Sagi Grimberg, Hans Westgaard Ry, Doug Ledford, Sean Hefty,
	Hal Rosenstock, Matan Barak, Erez Shitrit, Bart Van Assche,
	Ira Weiny, Or Gerlitz, Hakon Bugge, Yuval Shaia,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 10/30/16 2:06 PM, Sagi Grimberg wrote:
>> from "InfiBand Architecture Specifications Volume 1":
>>
>>   A QP is said to have a stale connection when only one side has
>>   connection information. A stale connection may result if the remote CM
>>   had dropped the connection and sent a DREQ but the DREQ was never
>>   received by the local CM. Alternatively the remote CM may have lost
>>   all record of past connections because its node crashed and rebooted,
>>   while the local CM did not become aware of the remote node's reboot
>>   and therefore did not clean up stale connections.
>>
>> and:
>>
>>    A local CM may receive a REQ/REP for a stale connection. It shall
>>    abort the connection issuing REJ to the REQ/REP. It shall then issue
>>    DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.
>>
>> This patch solves a problem with reuse of QPN. Current codebase, that
>> is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
>> in CM. A problem with this is the timeconstants governing this
>> mechanism; they are up to 768 seconds and the interface may look
>> inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
>> does the necessary cleanup and the interface comes up.
>
> I like this fix, so,
>
Me too and hence suggested Hans to post it on rdma list when
saw this patch in internal review.

> Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
>
> But I think the CM layer still is buggy in this area.
>
> In vol 1 the state transition table specifically states that DREP
> timeouts should move the cm_id to timewait state but the CM doesn't
> seem to maintain response timeouts on disconnect requests. If the
> DREQ happened to fail (send error completion) things are fine, but
> if the DREQ makes it to the peer but it doesn't reply then no one
> will take care of it (i.e. we will never see a TIMEWAIT event from
> this cm_id)...
>
> I recall some debugging session with Hal on this area a ~year ago
> with a new iser target (which didn't reply to DREQs on reboot
> sequences). iser initiator waits for a DISCONNECTED/TIMEWAIT events
> before destroying the cm_id (which never happened because of the
> above). I think I ended up working around that in iser to just go
> ahead and destroy the cm_id after issuing a DREQ (but now I realize
> it was never included so I'll probably dig it up again soon).

There is another fundamental issue with core CM code wrt DREQ
getting dropped. The the mad agent used to send the DREQ is
associated with a port and if this port is down, the IB link
layer will drop that DREQ as per SPEC. Similarly if the destination
port is down where the DREQ is suppose to reach, then the DREQ
gets dropped there too. These dropped CM ids are retried by MAD
agent on same port till the port comes back alive.

Am not sure in your case the ports were going down or not
but it was the case then IIUC, what you are doing for ISER is
still needed (up front destroying the cm id).

Regards,
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
@ 2016-10-31  4:52       ` santosh.shilimkar
  0 siblings, 0 replies; 9+ messages in thread
From: santosh.shilimkar @ 2016-10-31  4:52 UTC (permalink / raw)
  To: Sagi Grimberg, Hans Westgaard Ry, Doug Ledford, Sean Hefty,
	Hal Rosenstock, Matan Barak, Erez Shitrit, Bart Van Assche,
	Ira Weiny, Or Gerlitz, Hakon Bugge, Yuval Shaia, linux-rdma,
	linux-kernel

On 10/30/16 2:06 PM, Sagi Grimberg wrote:
>> from "InfiBand Architecture Specifications Volume 1":
>>
>>   A QP is said to have a stale connection when only one side has
>>   connection information. A stale connection may result if the remote CM
>>   had dropped the connection and sent a DREQ but the DREQ was never
>>   received by the local CM. Alternatively the remote CM may have lost
>>   all record of past connections because its node crashed and rebooted,
>>   while the local CM did not become aware of the remote node's reboot
>>   and therefore did not clean up stale connections.
>>
>> and:
>>
>>    A local CM may receive a REQ/REP for a stale connection. It shall
>>    abort the connection issuing REJ to the REQ/REP. It shall then issue
>>    DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.
>>
>> This patch solves a problem with reuse of QPN. Current codebase, that
>> is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
>> in CM. A problem with this is the timeconstants governing this
>> mechanism; they are up to 768 seconds and the interface may look
>> inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
>> does the necessary cleanup and the interface comes up.
>
> I like this fix, so,
>
Me too and hence suggested Hans to post it on rdma list when
saw this patch in internal review.

> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>
> But I think the CM layer still is buggy in this area.
>
> In vol 1 the state transition table specifically states that DREP
> timeouts should move the cm_id to timewait state but the CM doesn't
> seem to maintain response timeouts on disconnect requests. If the
> DREQ happened to fail (send error completion) things are fine, but
> if the DREQ makes it to the peer but it doesn't reply then no one
> will take care of it (i.e. we will never see a TIMEWAIT event from
> this cm_id)...
>
> I recall some debugging session with Hal on this area a ~year ago
> with a new iser target (which didn't reply to DREQs on reboot
> sequences). iser initiator waits for a DISCONNECTED/TIMEWAIT events
> before destroying the cm_id (which never happened because of the
> above). I think I ended up working around that in iser to just go
> ahead and destroy the cm_id after issuing a DREQ (but now I realize
> it was never included so I'll probably dig it up again soon).

There is another fundamental issue with core CM code wrt DREQ
getting dropped. The the mad agent used to send the DREQ is
associated with a port and if this port is down, the IB link
layer will drop that DREQ as per SPEC. Similarly if the destination
port is down where the DREQ is suppose to reach, then the DREQ
gets dropped there too. These dropped CM ids are retried by MAD
agent on same port till the port comes back alive.

Am not sure in your case the ports were going down or not
but it was the case then IIUC, what you are doing for ISER is
still needed (up front destroying the cm id).

Regards,
Santosh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
  2016-10-28 11:14 [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP Hans Westgaard Ry
@ 2016-12-14 17:55     ` Doug Ledford
       [not found] ` <1477653269-27359-1-git-send-email-hans.westgaard.ry-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  1 sibling, 0 replies; 9+ messages in thread
From: Doug Ledford @ 2016-12-14 17:55 UTC (permalink / raw)
  To: Hans Westgaard Ry, Sean Hefty, Hal Rosenstock, Matan Barak,
	Erez Shitrit, Bart Van Assche, Ira Weiny, Or Gerlitz,
	Hakon Bugge, Yuval Shaia, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 1558 bytes --]

On 10/28/2016 7:14 AM, Hans Westgaard Ry wrote:
> from "InfiBand Architecture Specifications Volume 1":
> 
>   A QP is said to have a stale connection when only one side has
>   connection information. A stale connection may result if the remote CM
>   had dropped the connection and sent a DREQ but the DREQ was never
>   received by the local CM. Alternatively the remote CM may have lost
>   all record of past connections because its node crashed and rebooted,
>   while the local CM did not become aware of the remote node's reboot
>   and therefore did not clean up stale connections.
> 
> and:
> 
>    A local CM may receive a REQ/REP for a stale connection. It shall
>    abort the connection issuing REJ to the REQ/REP. It shall then issue
>    DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.
> 
> This patch solves a problem with reuse of QPN. Current codebase, that
> is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
> in CM. A problem with this is the timeconstants governing this
> mechanism; they are up to 768 seconds and the interface may look
> inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
> does the necessary cleanup and the interface comes up.
> 
> Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> Reviewed-by: Håkon Bugge <haakon.bugge-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

Thanks, applied.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG Key ID: 0E572FDD


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
@ 2016-12-14 17:55     ` Doug Ledford
  0 siblings, 0 replies; 9+ messages in thread
From: Doug Ledford @ 2016-12-14 17:55 UTC (permalink / raw)
  To: Hans Westgaard Ry, Sean Hefty, Hal Rosenstock, Matan Barak,
	Erez Shitrit, Bart Van Assche, Ira Weiny, Or Gerlitz,
	Hakon Bugge, Yuval Shaia, linux-rdma, linux-kernel


[-- Attachment #1.1: Type: text/plain, Size: 1471 bytes --]

On 10/28/2016 7:14 AM, Hans Westgaard Ry wrote:
> from "InfiBand Architecture Specifications Volume 1":
> 
>   A QP is said to have a stale connection when only one side has
>   connection information. A stale connection may result if the remote CM
>   had dropped the connection and sent a DREQ but the DREQ was never
>   received by the local CM. Alternatively the remote CM may have lost
>   all record of past connections because its node crashed and rebooted,
>   while the local CM did not become aware of the remote node's reboot
>   and therefore did not clean up stale connections.
> 
> and:
> 
>    A local CM may receive a REQ/REP for a stale connection. It shall
>    abort the connection issuing REJ to the REQ/REP. It shall then issue
>    DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.
> 
> This patch solves a problem with reuse of QPN. Current codebase, that
> is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
> in CM. A problem with this is the timeconstants governing this
> mechanism; they are up to 768 seconds and the interface may look
> inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
> does the necessary cleanup and the interface comes up.
> 
> Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>

Thanks, applied.


-- 
Doug Ledford <dledford@redhat.com>
    GPG Key ID: 0E572FDD


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PING][PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
  2016-10-28 11:14 [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP Hans Westgaard Ry
@ 2017-02-01 11:17     ` Hans Westgaard Ry
       [not found] ` <1477653269-27359-1-git-send-email-hans.westgaard.ry-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  1 sibling, 0 replies; 9+ messages in thread
From: Hans Westgaard Ry @ 2017-02-01 11:17 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock, Matan Barak,
	Erez Shitrit, Bart Van Assche, Ira Weiny, Or Gerlitz,
	Hakon Bugge, Yuval Shaia, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA



On 10/28/2016 01:14 PM, Hans Westgaard Ry wrote:
> from "InfiBand Architecture Specifications Volume 1":
>
>    A QP is said to have a stale connection when only one side has
>    connection information. A stale connection may result if the remote CM
>    had dropped the connection and sent a DREQ but the DREQ was never
>    received by the local CM. Alternatively the remote CM may have lost
>    all record of past connections because its node crashed and rebooted,
>    while the local CM did not become aware of the remote node's reboot
>    and therefore did not clean up stale connections.
>
> and:
>
>     A local CM may receive a REQ/REP for a stale connection. It shall
>     abort the connection issuing REJ to the REQ/REP. It shall then issue
>     DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.
>
> This patch solves a problem with reuse of QPN. Current codebase, that
> is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
> in CM. A problem with this is the timeconstants governing this
> mechanism; they are up to 768 seconds and the interface may look
> inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
> does the necessary cleanup and the interface comes up.
>
> Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> Reviewed-by: Håkon Bugge <haakon.bugge-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> ---
>   drivers/infiniband/core/cm.c | 24 +++++++++++++++++++++++-
>   1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index c995255..c97e4d5 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -1519,6 +1519,7 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
>   	struct cm_id_private *listen_cm_id_priv, *cur_cm_id_priv;
>   	struct cm_timewait_info *timewait_info;
>   	struct cm_req_msg *req_msg;
> +	struct ib_cm_id *cm_id;
>   
>   	req_msg = (struct cm_req_msg *)work->mad_recv_wc->recv_buf.mad;
>   
> @@ -1540,10 +1541,18 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
>   	timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
>   	if (timewait_info) {
>   		cm_cleanup_timewait(cm_id_priv->timewait_info);
> +		cur_cm_id_priv = cm_get_id(timewait_info->work.local_id,
> +					   timewait_info->work.remote_id);
> +
>   		spin_unlock_irq(&cm.lock);
>   		cm_issue_rej(work->port, work->mad_recv_wc,
>   			     IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
>   			     NULL, 0);
> +		if (cur_cm_id_priv) {
> +			cm_id = &cur_cm_id_priv->id;
> +			ib_send_cm_dreq(cm_id, NULL, 0);
> +			cm_deref_id(cur_cm_id_priv);
> +		}
>   		return NULL;
>   	}
>   
> @@ -1919,6 +1928,9 @@ static int cm_rep_handler(struct cm_work *work)
>   	struct cm_id_private *cm_id_priv;
>   	struct cm_rep_msg *rep_msg;
>   	int ret;
> +	struct cm_id_private *cur_cm_id_priv;
> +	struct ib_cm_id *cm_id;
> +	struct cm_timewait_info *timewait_info;
>   
>   	rep_msg = (struct cm_rep_msg *)work->mad_recv_wc->recv_buf.mad;
>   	cm_id_priv = cm_acquire_id(rep_msg->remote_comm_id, 0);
> @@ -1953,16 +1965,26 @@ static int cm_rep_handler(struct cm_work *work)
>   		goto error;
>   	}
>   	/* Check for a stale connection. */
> -	if (cm_insert_remote_qpn(cm_id_priv->timewait_info)) {
> +	timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
> +	if (timewait_info) {
>   		rb_erase(&cm_id_priv->timewait_info->remote_id_node,
>   			 &cm.remote_id_table);
>   		cm_id_priv->timewait_info->inserted_remote_id = 0;
> +		cur_cm_id_priv = cm_get_id(timewait_info->work.local_id,
> +					   timewait_info->work.remote_id);
> +
>   		spin_unlock(&cm.lock);
>   		spin_unlock_irq(&cm_id_priv->lock);
>   		cm_issue_rej(work->port, work->mad_recv_wc,
>   			     IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REP,
>   			     NULL, 0);
>   		ret = -EINVAL;
> +		if (cur_cm_id_priv) {
> +			cm_id = &cur_cm_id_priv->id;
> +			ib_send_cm_dreq(cm_id, NULL, 0);
> +			cm_deref_id(cur_cm_id_priv);
> +		}
> +
>   		goto error;
>   	}
>   	spin_unlock(&cm.lock);

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PING][PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
@ 2017-02-01 11:17     ` Hans Westgaard Ry
  0 siblings, 0 replies; 9+ messages in thread
From: Hans Westgaard Ry @ 2017-02-01 11:17 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty, Hal Rosenstock, Matan Barak,
	Erez Shitrit, Bart Van Assche, Ira Weiny, Or Gerlitz,
	Hakon Bugge, Yuval Shaia, linux-rdma, linux-kernel



On 10/28/2016 01:14 PM, Hans Westgaard Ry wrote:
> from "InfiBand Architecture Specifications Volume 1":
>
>    A QP is said to have a stale connection when only one side has
>    connection information. A stale connection may result if the remote CM
>    had dropped the connection and sent a DREQ but the DREQ was never
>    received by the local CM. Alternatively the remote CM may have lost
>    all record of past connections because its node crashed and rebooted,
>    while the local CM did not become aware of the remote node's reboot
>    and therefore did not clean up stale connections.
>
> and:
>
>     A local CM may receive a REQ/REP for a stale connection. It shall
>     abort the connection issuing REJ to the REQ/REP. It shall then issue
>     DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.
>
> This patch solves a problem with reuse of QPN. Current codebase, that
> is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
> in CM. A problem with this is the timeconstants governing this
> mechanism; they are up to 768 seconds and the interface may look
> inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
> does the necessary cleanup and the interface comes up.
>
> Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
> ---
>   drivers/infiniband/core/cm.c | 24 +++++++++++++++++++++++-
>   1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index c995255..c97e4d5 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -1519,6 +1519,7 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
>   	struct cm_id_private *listen_cm_id_priv, *cur_cm_id_priv;
>   	struct cm_timewait_info *timewait_info;
>   	struct cm_req_msg *req_msg;
> +	struct ib_cm_id *cm_id;
>   
>   	req_msg = (struct cm_req_msg *)work->mad_recv_wc->recv_buf.mad;
>   
> @@ -1540,10 +1541,18 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
>   	timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
>   	if (timewait_info) {
>   		cm_cleanup_timewait(cm_id_priv->timewait_info);
> +		cur_cm_id_priv = cm_get_id(timewait_info->work.local_id,
> +					   timewait_info->work.remote_id);
> +
>   		spin_unlock_irq(&cm.lock);
>   		cm_issue_rej(work->port, work->mad_recv_wc,
>   			     IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
>   			     NULL, 0);
> +		if (cur_cm_id_priv) {
> +			cm_id = &cur_cm_id_priv->id;
> +			ib_send_cm_dreq(cm_id, NULL, 0);
> +			cm_deref_id(cur_cm_id_priv);
> +		}
>   		return NULL;
>   	}
>   
> @@ -1919,6 +1928,9 @@ static int cm_rep_handler(struct cm_work *work)
>   	struct cm_id_private *cm_id_priv;
>   	struct cm_rep_msg *rep_msg;
>   	int ret;
> +	struct cm_id_private *cur_cm_id_priv;
> +	struct ib_cm_id *cm_id;
> +	struct cm_timewait_info *timewait_info;
>   
>   	rep_msg = (struct cm_rep_msg *)work->mad_recv_wc->recv_buf.mad;
>   	cm_id_priv = cm_acquire_id(rep_msg->remote_comm_id, 0);
> @@ -1953,16 +1965,26 @@ static int cm_rep_handler(struct cm_work *work)
>   		goto error;
>   	}
>   	/* Check for a stale connection. */
> -	if (cm_insert_remote_qpn(cm_id_priv->timewait_info)) {
> +	timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
> +	if (timewait_info) {
>   		rb_erase(&cm_id_priv->timewait_info->remote_id_node,
>   			 &cm.remote_id_table);
>   		cm_id_priv->timewait_info->inserted_remote_id = 0;
> +		cur_cm_id_priv = cm_get_id(timewait_info->work.local_id,
> +					   timewait_info->work.remote_id);
> +
>   		spin_unlock(&cm.lock);
>   		spin_unlock_irq(&cm_id_priv->lock);
>   		cm_issue_rej(work->port, work->mad_recv_wc,
>   			     IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REP,
>   			     NULL, 0);
>   		ret = -EINVAL;
> +		if (cur_cm_id_priv) {
> +			cm_id = &cur_cm_id_priv->id;
> +			ib_send_cm_dreq(cm_id, NULL, 0);
> +			cm_deref_id(cur_cm_id_priv);
> +		}
> +
>   		goto error;
>   	}
>   	spin_unlock(&cm.lock);

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PING][PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
  2017-02-01 11:17     ` Hans Westgaard Ry
  (?)
@ 2017-02-01 11:44     ` Leon Romanovsky
  -1 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2017-02-01 11:44 UTC (permalink / raw)
  To: Hans Westgaard Ry
  Cc: Doug Ledford, Sean Hefty, Hal Rosenstock, Matan Barak,
	Erez Shitrit, Bart Van Assche, Ira Weiny, Or Gerlitz,
	Hakon Bugge, Yuval Shaia, linux-rdma, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 76 bytes --]

It was applied, see commit 9315bc9a133011fdb084f2626b86db3ebb64661f

Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-02-01 11:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-28 11:14 [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP Hans Westgaard Ry
2016-10-30 21:06 ` Sagi Grimberg
     [not found]   ` <8df2c90e-8581-4ce8-fd18-92ac57eb6160-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-10-31  4:52     ` santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
2016-10-31  4:52       ` santosh.shilimkar
     [not found] ` <1477653269-27359-1-git-send-email-hans.westgaard.ry-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2016-12-14 17:55   ` Doug Ledford
2016-12-14 17:55     ` Doug Ledford
2017-02-01 11:17   ` [PING][PATCH] " Hans Westgaard Ry
2017-02-01 11:17     ` Hans Westgaard Ry
2017-02-01 11:44     ` Leon Romanovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.