* [PATCH for-next v2] RDMA/rxe: Fix "Replace mr by rkey in responder resources"
@ 2022-04-18 17:41 Bob Pearson
  2022-04-19 16:18 ` Jason Gunthorpe
  2022-04-20 15:53 ` Jason Gunthorpe
  0 siblings, 2 replies; 5+ messages in thread
From: Bob Pearson @ 2022-04-18 17:41 UTC
  To: jgg, zyjzyj2000, linux-rdma; +Cc: Bob Pearson

The rping benchmark fails on long runs. The root cause has been traced
to read_reply() being left with a NULL mr in rare situations.

Fix this by computing mr correctly in the replay flow of read_reply()
in rxe_resp.c.

Fixes: 8a1a0be894da ("RDMA/rxe: Replace mr by rkey in responder resources")
Link: https://lore.kernel.org/linux-rdma/1a9a9190-368d-3442-0a62-443b1a6c1209@linux.dev/
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
v2
  Renamed commit
  Changed fixes line to correctly ID the bug
  Added a link to the reported mr == NULL issue

 drivers/infiniband/sw/rxe/rxe_resp.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index e2653a8721fe..2e627685e804 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -734,8 +734,14 @@ static enum resp_states read_reply(struct rxe_qp *qp,
 	}
 
 	if (res->state == rdatm_res_state_new) {
-		mr = qp->resp.mr;
-		qp->resp.mr = NULL;
+		if (!res->replay) {
+			mr = qp->resp.mr;
+			qp->resp.mr = NULL;
+		} else {
+			mr = rxe_recheck_mr(qp, res->read.rkey);
+			if (!mr)
+				return RESPST_ERR_RKEY_VIOLATION;
+		}
 
 		if (res->read.resid <= mtu)
 			opcode = IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY;

base-commit: 98c8026331ceabe1df579940b81eec75eb49cdd9
-- 
2.32.0
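
For readers following the thread, the branch added by the patch can be modeled outside the kernel. The sketch below is a user-space illustration, not the rxe code. The model_* types, the table lookup standing in for rxe_recheck_mr(), and the status enum are all assumptions made for illustration. Only the shape of the if/else comes from the diff above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rxe structures: only the fields needed
 * to show how mr is chosen are modeled here. */
struct model_mr {
	uint32_t rkey;
};

struct model_qp {
	struct model_mr *resp_mr;	/* plays the role of qp->resp.mr */
	struct model_mr *table;		/* registered MRs (assumed layout) */
	size_t table_len;
};

struct model_res {
	int replay;			/* plays the role of res->replay */
	uint32_t rkey;			/* plays the role of res->read.rkey */
};

enum model_status { MODEL_OK, MODEL_ERR_RKEY_VIOLATION };

/* Assumed stand-in for rxe_recheck_mr(): find an MR by rkey, or NULL. */
struct model_mr *model_recheck_mr(struct model_qp *qp, uint32_t rkey)
{
	for (size_t i = 0; i < qp->table_len; i++)
		if (qp->table[i].rkey == rkey)
			return &qp->table[i];
	return NULL;
}

/* Mirrors the patched branch: a first-time read takes the MR cached on
 * the QP; a replayed read looks it up again from the saved rkey. */
enum model_status model_pick_mr(struct model_qp *qp,
				const struct model_res *res,
				struct model_mr **out)
{
	if (!res->replay) {
		*out = qp->resp_mr;
		qp->resp_mr = NULL;
		return MODEL_OK;
	}

	*out = model_recheck_mr(qp, res->rkey);
	return *out ? MODEL_OK : MODEL_ERR_RKEY_VIOLATION;
}
```

In this model, a replayed request that took the first branch would receive whatever is left in resp_mr, which is NULL once an earlier read has consumed it. That appears to be the mr == NULL case the commit message refers to.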



* Re: [PATCH for-next v2] RDMA/rxe: Fix "Replace mr by rkey in responder resources"
  2022-04-18 17:41 [PATCH for-next v2] RDMA/rxe: Fix "Replace mr by rkey in responder resources" Bob Pearson
@ 2022-04-19 16:18 ` Jason Gunthorpe
  2022-04-19 22:02   ` Pearson, Robert B
  2022-04-20 15:53 ` Jason Gunthorpe
  1 sibling, 1 reply; 5+ messages in thread
From: Jason Gunthorpe @ 2022-04-19 16:18 UTC
  To: Bob Pearson; +Cc: zyjzyj2000, linux-rdma

On Mon, Apr 18, 2022 at 12:41:04PM -0500, Bob Pearson wrote:
> The rping benchmark fails on long runs. The root cause has been traced
> to read_reply() being left with a NULL mr in rare situations.
> 
> Fix this by computing mr correctly in the replay flow of read_reply()
> in rxe_resp.c.
> 
> Fixes: 8a1a0be894da ("RDMA/rxe: Replace mr by rkey in responder resources")
> Link: https://lore.kernel.org/linux-rdma/1a9a9190-368d-3442-0a62-443b1a6c1209@linux.dev/
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
> v2
>   Renamed commit
>   Changed fixes line to correctly ID the bug
>   Added a link to the reported mr == NULL issue
> 
>  drivers/infiniband/sw/rxe/rxe_resp.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)

I'm confused, does this one replace this patch:

https://lore.kernel.org/all/20220411030647.20011-1-rpearsonhpe@gmail.com/

?

It has the same title but is completely different

Jason


* RE: [PATCH for-next v2] RDMA/rxe: Fix "Replace mr by rkey in responder resources"
  2022-04-19 16:18 ` Jason Gunthorpe
@ 2022-04-19 22:02   ` Pearson, Robert B
  0 siblings, 0 replies; 5+ messages in thread
From: Pearson, Robert B @ 2022-04-19 22:02 UTC
  To: Jason Gunthorpe, Bob Pearson; +Cc: zyjzyj2000, linux-rdma

> I'm confused, does this one replace this patch:
> https://lore.kernel.org/all/20220411030647.20011-1-rpearsonhpe@gmail.com/
> ?
> It has the same title but is completely different
> Jason

This is a new bug. It needs a better title. Is the old one still hanging around or was it accepted upstream?
We could call this "Fix read_reply in rxe_resp.c" or anything else that works for you.


* Re: [PATCH for-next v2] RDMA/rxe: Fix "Replace mr by rkey in responder resources"
  2022-04-18 17:41 [PATCH for-next v2] RDMA/rxe: Fix "Replace mr by rkey in responder resources" Bob Pearson
  2022-04-19 16:18 ` Jason Gunthorpe
@ 2022-04-20 15:53 ` Jason Gunthorpe
  2022-04-20 16:06   ` Pearson, Robert B
  1 sibling, 1 reply; 5+ messages in thread
From: Jason Gunthorpe @ 2022-04-20 15:53 UTC
  To: Bob Pearson; +Cc: zyjzyj2000, linux-rdma

On Mon, Apr 18, 2022 at 12:41:04PM -0500, Bob Pearson wrote:
> The rping benchmark fails on long runs. The root cause has been traced
> to read_reply() being left with a NULL mr in rare situations.
> 
> Fix this by computing mr correctly in the replay flow of read_reply()
> in rxe_resp.c.
> 
> Fixes: 8a1a0be894da ("RDMA/rxe: Replace mr by rkey in responder resources")
> Link: https://lore.kernel.org/linux-rdma/1a9a9190-368d-3442-0a62-443b1a6c1209@linux.dev/
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
> v2
>   Renamed commit
>   Changed fixes line to correctly ID the bug
>   Added a link to the reported mr == NULL issue
> 
>  drivers/infiniband/sw/rxe/rxe_resp.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)

Applied to for-rc, thanks

Jason


* RE: [PATCH for-next v2] RDMA/rxe: Fix "Replace mr by rkey in responder resources"
  2022-04-20 15:53 ` Jason Gunthorpe
@ 2022-04-20 16:06   ` Pearson, Robert B
  0 siblings, 0 replies; 5+ messages in thread
From: Pearson, Robert B @ 2022-04-20 16:06 UTC
  To: Jason Gunthorpe, Bob Pearson; +Cc: zyjzyj2000, linux-rdma

thanks

