* QPN re-use?  was RE: rstream application
From: Hefty, Sean @ 2017-11-21 16:26 UTC (permalink / raw)
  To: Kalderon, Michal, Jason Gunthorpe, linux-rdma
  Cc: Elior, Ariel, Amrani, Ram, Radzi, Amit

> Rstream client connects to server, runs latency test, calls rs_close
> and then connects again before running the bandwidth test. The issue
> occurs on the second connect (after latency and before bandwidth).
> Server does the same. It seems the close state in some cases doesn't
> complete before the second connect request arrives.
> Yes, the hardware re-uses QPNs once they are destroyed.

If the issue is that the QP is in time wait, I'm not sure this is completely an application issue.  The QP being re-used could have been allocated by another application, with timewait being set to something large.  This wouldn't be an rstream issue, but something all applications would be required to handle.

It sounds like there needs to be some sort of coordination of QPN re-use between the drivers and CM, especially if the driver wants to re-use QPNs as LIFO.
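
To make the LIFO concern concrete, here is a minimal sketch, assuming a toy driver-side free list and a toy CM timewait table. `LifoQpnAllocator`, `TimewaitTable`, the QPN range and the 60-second timewait are all hypothetical, chosen only for illustration; none of them come from a real driver or from the CM.

```python
class LifoQpnAllocator:
    """Hypothetical driver-side allocator that re-uses freed QPNs in LIFO order."""

    def __init__(self, first_qpn, count):
        # Stored in descending order so that pop() hands out the lowest QPN first.
        self._free = list(range(first_qpn + count - 1, first_qpn - 1, -1))

    def alloc(self):
        return self._free.pop()

    def free(self, qpn):
        # The most recently freed QPN is the next one handed out.
        self._free.append(qpn)


class TimewaitTable:
    """Hypothetical CM-side record of QPNs whose old connection is in timewait."""

    def __init__(self):
        self._expiry = {}

    def enter(self, qpn, now, duration):
        self._expiry[qpn] = now + duration

    def in_timewait(self, qpn, now):
        return now < self._expiry.get(qpn, float("-inf"))


allocator = LifoQpnAllocator(first_qpn=0x100, count=4)
timewait = TimewaitTable()

# Application A closes its connection; the CM keeps the QPN in timewait,
# but the driver has already put the QPN back on its free list.
qpn_a = allocator.alloc()
timewait.enter(qpn_a, now=0.0, duration=60.0)
allocator.free(qpn_a)

# Application B (possibly a different process) asks for a QP one second later.
qpn_b = allocator.alloc()
collision = timewait.in_timewait(qpn_b, now=1.0)
```

In this toy model the second allocation gets the QPN that was just freed, while the CM still considers it to be in timewait. The driver and the CM each behave reasonably on their own; the collision comes from the lack of coordination between them.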

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* RE: QPN re-use?  was RE: rstream application
From: Kalderon, Michal @ 2017-11-22 13:44 UTC (permalink / raw)
  To: Hefty, Sean, Jason Gunthorpe, linux-rdma
  Cc: Elior, Ariel, Amrani, Ram, Radzi, Amit

> From: Hefty, Sean
> Sent: Tuesday, November 21, 2017 6:27 PM
> To: Kalderon, Michal; Jason Gunthorpe; linux-rdma
> Cc: Elior, Ariel; Amrani, Ram; Radzi, Amit
> Subject: QPN re-use? was RE: rstream application
>
> > Rstream client connects to server, runs latency test, calls rs_close
> > and then connects again before running the bandwidth test. The issue
> > occurs on the second connect (after latency and before bandwidth).
> > Server does the same. It seems the close state in some cases doesn't
> > complete before the second connect request arrives.
> > Yes, the hardware re-uses QPNs once they are destroyed.
> 
> If the issue is that the QP is in time wait, I'm not sure this is completely an
> application issue.  The QP being re-used could have been allocated by
> another application, with timewait being set to something large.  This
> wouldn't be an rstream issue, but something all applications would be
> required to handle.
> 
> It sounds like there needs to be some sort of coordination of QPN re-use
> between the drivers and CM, especially if the driver wants to re-use QPNs as
> LIFO.

Thanks, Sean. This makes sense. So are you suggesting an additional API between the CM
and the driver to notify when a QPN can be re-used?
Just to emphasize, though: the problem could still occur if the QPN is released on the client side
after timewait, but timewait has not yet passed on the server side. (This is similar to the case here:
the QPN re-use was on the client side, where the connection was no longer in timewait, but
it still existed in the server-side remote-QP tables.)
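
The two-sided case can be sketched as follows. This is only a toy model under stated assumptions: `ServerCm`, its remote-QP table keyed by (client, QPN), and the timewait values (2 time units on the client, 10 on the server) are hypothetical and are not taken from the real CM code.

```python
class ServerCm:
    """Hypothetical server-side CM that remembers remote QPNs until its own timewait ends."""

    def __init__(self, timewait):
        self.timewait = timewait
        # (client, remote_qpn) -> expiry time, or None while the connection is live
        self._remote_qps = {}

    def _expire(self, now):
        self._remote_qps = {
            key: expiry
            for key, expiry in self._remote_qps.items()
            if expiry is None or expiry > now
        }

    def connect_request(self, client, qpn, now):
        self._expire(now)
        key = (client, qpn)
        if key in self._remote_qps:
            # The server still tracks an older connection from this remote QPN.
            return "reject"
        self._remote_qps[key] = None
        return "accept"

    def disconnect(self, client, qpn, now):
        # The entry lingers for the server's timewait, not the client's.
        self._remote_qps[(client, qpn)] = now + self.timewait


CLIENT_TIMEWAIT = 2
server = ServerCm(timewait=10)

first = server.connect_request("client", 0x100, now=0)
server.disconnect("client", 0x100, now=1)

# The client's timewait ends at t = 1 + CLIENT_TIMEWAIT = 3, so at t = 4 its
# driver may legitimately hand out QPN 0x100 again.
second = server.connect_request("client", 0x100, now=4)

# Once the server's timewait has also elapsed (t > 11), the same QPN is accepted.
third = server.connect_request("client", 0x100, now=12)
```

Here the client did wait out its own timewait before re-using the QPN, yet the second connect is still rejected, because the decision depends on state that only the server holds.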

Thanks,
Michal


* RE: QPN re-use?  was RE: rstream application
From: Hefty, Sean @ 2017-11-22 17:34 UTC (permalink / raw)
  To: Kalderon, Michal, Jason Gunthorpe, linux-rdma
  Cc: Elior, Ariel, Amrani, Ram, Radzi, Amit

> Thanks, Sean. This makes sense. So are you suggesting an additional
> API between the CM and the driver to notify when a QPN can be re-used?

This was what I was suggesting, but the point you make below may mean that it won't help that much.  :/

> Just to emphasize, though: the problem could still occur if the QPN is
> released on the client side after timewait, but timewait has not yet
> passed on the server side. (This is similar to the case here: the QPN
> re-use was on the client side, where the connection was no longer in
> timewait, but it still existed in the server-side remote-QP tables.)

IMO, the best solution is for drivers to cycle through the QPN space before attempting to re-use any QPN.  Otherwise, this problem is more likely to occur.

I can't easily think of a great alternative.
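
As a rough illustration of cycling through the QPN space, here is a minimal sketch, assuming a small fixed pool and a simple linear scan. `CyclicQpnAllocator` and the pool bounds are hypothetical; a real driver would use whatever ID allocator it already has.

```python
class CyclicQpnAllocator:
    """Hypothetical allocator that walks the whole QPN space before re-using a QPN."""

    def __init__(self, first_qpn, count):
        self._first = first_qpn
        self._count = count
        self._in_use = set()
        self._next = 0  # offset of the next candidate QPN

    def alloc(self):
        # Scan at most one full lap, starting where the last allocation stopped.
        for _ in range(self._count):
            qpn = self._first + self._next
            self._next = (self._next + 1) % self._count
            if qpn not in self._in_use:
                self._in_use.add(qpn)
                return qpn
        raise RuntimeError("no free QPN")

    def free(self, qpn):
        self._in_use.discard(qpn)


allocator = CyclicQpnAllocator(first_qpn=0x100, count=4)

# Repeatedly create and destroy a single QP, as rstream does between tests.
history = []
for _ in range(5):
    qpn = allocator.alloc()
    history.append(qpn)
    allocator.free(qpn)
```

With a pool of four QPNs, a freed QPN is handed out again only after the other three have been tried, so the time before re-use grows with the size of the QPN space rather than being close to zero.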

- Sean

* RE: QPN re-use?  was RE: rstream application
From: Radzi, Amit @ 2017-11-23 12:18 UTC (permalink / raw)
  To: Hefty, Sean, Kalderon, Michal, Jason Gunthorpe, linux-rdma
  Cc: Elior, Ariel, Amrani, Ram

> > Thanks, Sean. This makes sense. So are you suggesting an additional
> > API between the CM and the driver to notify when a QPN can be re-used?
> 
> This was what I was suggesting, but the point you make below may mean that it
> won't help that much.  :/

I think it will still help. In the general case (and in this specific rstream case), the client destroys the QP locally, destroys the CM ID (which sends a disconnect request), and creates a new one, which sends the new connect request (with the same local QPN).
If there were an interface through which the CM told the driver that a QPN may be re-used only once the connection moves to idle (in the successful case, after the disconnect response arrives), we would not have a problem, since the client would get a different QPN.
There might still be a case where, for example, the disconnect request is dropped (and the client moves to timewait and then idle without retrying the disconnect). We might then still get a reject from the server, since it never saw the disconnect, but that is rare and would have to be addressed in the application layer.
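
A minimal sketch of such an interface, assuming the driver parks a destroyed QP's number until the CM reports that the connection went idle. `DeferredQpnAllocator`, `destroy_qp` and `cm_idle` are hypothetical names for illustration only; no such driver/CM callback exists in the discussion above beyond the idea itself.

```python
class DeferredQpnAllocator:
    """Hypothetical allocator that releases a QPN only after the CM reports idle."""

    def __init__(self, first_qpn, count):
        # Descending order so that pop() hands out the lowest QPN first (LIFO re-use).
        self._free = list(range(first_qpn + count - 1, first_qpn - 1, -1))
        self._held = set()  # destroyed QPs whose connection is not yet idle

    def alloc(self):
        if not self._free:
            raise RuntimeError("no free QPN")
        return self._free.pop()

    def destroy_qp(self, qpn):
        # The QP is gone, but its number is parked instead of being freed.
        self._held.add(qpn)

    def cm_idle(self, qpn):
        # Assumed to be called by the CM when the connection reaches idle.
        if qpn in self._held:
            self._held.remove(qpn)
            self._free.append(qpn)


allocator = DeferredQpnAllocator(first_qpn=0x100, count=4)

first = allocator.alloc()
allocator.destroy_qp(first)   # disconnect request sent, no response yet
second = allocator.alloc()    # reconnect immediately: a different QPN

allocator.cm_idle(first)      # disconnect response arrived, connection is idle
allocator.destroy_qp(second)
third = allocator.alloc()     # the first QPN is available again
```

The immediate reconnect gets a different QPN even though the allocator is still LIFO. The sketch does not cover the dropped-disconnect case described above, where the client reaches idle without the server ever seeing the disconnect.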

> 
> > Just to emphasize, though: the problem could still occur if the QPN is
> > released on the client side after timewait, but timewait has not yet
> > passed on the server side. (This is similar to the case here: the QPN
> > re-use was on the client side, where the connection was no longer in
> > timewait, but it still existed in the server-side remote-QP tables.)
> 
> IMO, the best solution is for drivers to cycle through the QPN space before
> attempting to re-use any QPN.  Otherwise, this problem is more likely to occur.
> 
> I can't easily think of a great alternative.

We can do that, but it will not always help either (e.g. when there is only one QPN available).
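
The single-free-QPN case can be shown with a small sketch, assuming a cyclic scan over a pool in which all but one QPN are held by long-lived connections. `next_free_qpn` and the pool layout are hypothetical.

```python
def next_free_qpn(in_use, first_qpn, count, start):
    """Scan cyclically from offset `start`; return (qpn, next_start)."""
    for step in range(count):
        offset = (start + step) % count
        qpn = first_qpn + offset
        if qpn not in in_use:
            return qpn, (offset + 1) % count
    raise RuntimeError("no free QPN")


# Three of the four QPNs belong to long-lived connections.
in_use = {0x101, 0x102, 0x103}

first, start = next_free_qpn(in_use, first_qpn=0x100, count=4, start=0)
# The QP is destroyed right away, so `first` is never added to `in_use`.
second, start = next_free_qpn(in_use, first_qpn=0x100, count=4, start=start)
```

Even though the scan moves on past the freed QPN, it wraps around and lands on the same number, so cycling gives no protection once the pool is nearly exhausted.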

