* Fwd: Possible bug in test_mr.py
       [not found] <c143355e-954a-5953-c67c-c7a9bf451b7b@gmail.com>
@ 2021-03-29  4:18 ` Bob Pearson
  2021-03-29 16:34   ` Edward Srouji
  0 siblings, 1 reply; 3+ messages in thread
From: Bob Pearson @ 2021-03-29  4:18 UTC (permalink / raw)
  To: linux-rdma




-------- Forwarded Message --------
Subject: Possible bug in test_mr.py
Date: Sun, 28 Mar 2021 22:52:01 -0500
From: Bob Pearson <rpearsonhpe@gmail.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Zhu Yanjun <zyjzyj2000@gmail.com>, linux-rdma@vger.linux.org

While testing ibv_rereg_mr(), I noticed that the test uses the rkey originally assigned to the MR by ibv_reg_mr() and not
the rkey assigned by the subsequent call to ibv_rereg_mr(). This matters when the original MR did not have remote
memory access and its rkey was set to zero. If the rereg changes access to allow remote memory access, then the rkey is set
when the verb returns. But the test code never reads the rkey again after setting up the original MRs.

In rxe, always setting rkey = lkey gets the first rereg test case to pass.
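
A minimal sketch of the stale-rkey problem, using a toy model rather than pyverbs. ToyMR, its rereg() method and the
key counter are hypothetical names made up for illustration; the only behavior assumed is what is described above
(rkey is zero without remote access and is assigned when the rereg returns).

```python
import itertools

# Hypothetical key allocator for the toy model; real drivers pick keys differently.
_key_counter = itertools.count(0x100)


class ToyMR:
    """Toy stand-in for a memory region. This is NOT the pyverbs MR class."""

    def __init__(self, remote_access=False):
        self.lkey = next(_key_counter)
        # Assumption taken from the description above: no remote access means rkey == 0.
        self.rkey = next(_key_counter) if remote_access else 0

    def rereg(self, remote_access):
        # The rkey is (re)assigned by the time the call returns.
        self.rkey = next(_key_counter) if remote_access else 0


mr = ToyMR(remote_access=False)
cached_rkey = mr.rkey           # what the test records once, at setup time
mr.rereg(remote_access=True)    # access changes, so the rkey changes too
stale = cached_rkey != mr.rkey  # the cached value no longer matches the MR
```

Any remote operation issued with cached_rkey would target key zero instead of the key the MR now has.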

bob


* Re: Fwd: Possible bug in test_mr.py
  2021-03-29  4:18 ` Fwd: Possible bug in test_mr.py Bob Pearson
@ 2021-03-29 16:34   ` Edward Srouji
  2021-03-29 20:07     ` Possible bug in test_mr_rereg_pd Bob Pearson
  0 siblings, 1 reply; 3+ messages in thread
From: Edward Srouji @ 2021-03-29 16:34 UTC (permalink / raw)
  To: Bob Pearson, linux-rdma

Hi Bob,

You're referring to the "test_mr_rereg_access" test in test_mr.py, aren't you?
We need to re-sync the rkeys between the two sides, for example by
calling sync_remote_attr() after reregistering the MRs.
If that's the case, and the MR keys can change, it is probably better
to re-sync the keys between the two sides after every rereg call.
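
As a rough sketch of that idea, with a toy model instead of the real test classes: ToySide, sync_keys() and
rereg_and_sync() below are hypothetical names, and sync_keys() only stands in for whatever sync_remote_attr()
does in the test suite; its real signature is not assumed here.

```python
from dataclasses import dataclass


@dataclass
class ToySide:
    """One end of the connection (hypothetical, not a pyverbs or test class)."""
    local_rkey: int = 0    # rkey of this side's MR
    remote_rkey: int = 0   # cached copy of the peer's rkey


def sync_keys(client, server):
    """Exchange the current rkeys; a stand-in for the real re-sync step."""
    client.remote_rkey = server.local_rkey
    server.remote_rkey = client.local_rkey


def rereg_and_sync(side, new_rkey, client, server):
    """Model a rereg that changes the key, then re-sync right away."""
    side.local_rkey = new_rkey
    sync_keys(client, server)


client, server = ToySide(), ToySide()
sync_keys(client, server)                        # initial exchange at setup
rereg_and_sync(server, 0x1234, client, server)   # key changes, peer is updated
```

Tying the sync to the rereg itself means no test case can forget it.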

I'll push a fix. If I misunderstood you, or that's not the point
you were trying to make, please let me know.

Thanks,
Edward.



* Possible bug in test_mr_rereg_pd
  2021-03-29 16:34   ` Edward Srouji
@ 2021-03-29 20:07     ` Bob Pearson
  0 siblings, 0 replies; 3+ messages in thread
From: Bob Pearson @ 2021-03-29 20:07 UTC (permalink / raw)
  To: Edward Srouji, linux-rdma



Edward,

In a later test (test_mr_rereg_pd), which is also failing, I get the following:

======================================================================
ERROR: test_mr_rereg_pd (tests.test_mr.MRTest)
Test that cover rereg MR's PD with this flow:
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/rpearson/src/rdma-core/tests/test_mr.py", line 160, in test_mr_rereg_pd
    u.traffic(**self.traffic_args)
  File "/home/rpearson/src/rdma-core/tests/utils.py", line 653, in traffic
    poll(client.cq)
  File "/home/rpearson/src/rdma-core/tests/utils.py", line 524, in poll_cq
    raise PyverbsRDMAError('Completion status is {s}'.
pyverbs.pyverbs_error.PyverbsRDMAError: Completion status is WR flush error. Errno: 5, Input/output error

But after adding tracing to the kernel driver, I see that this part of the test actually succeeded
after resetting the two QPs and reregistering the MR back to the original PD. However, some WQEs
were flushed when the QPs were reset, and I don't see anything in the test that drains
the RCQs after the reset. So it is not surprising that the test sees the flush errors when it polls
the RCQ.

Since CQs are independent of QPs, I thought it was correct to report these completions as flushed even
if the QP is reset.
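
To illustrate what I mean by draining, here is a toy model, not pyverbs code. ToyCQ, drain() and the status
strings are hypothetical; the sketch only assumes that completions queued before the reset stay in the CQ until
somebody polls them.

```python
from collections import deque

# Hypothetical status labels for the toy model.
FLUSH_ERR = "WR flush error"
SUCCESS = "success"


class ToyCQ:
    """Toy completion queue that outlives any QP reset. NOT the pyverbs CQ class."""

    def __init__(self):
        self._entries = deque()

    def push(self, status):
        self._entries.append(status)

    def poll(self):
        """Return the oldest completion, or None if the queue is empty."""
        return self._entries.popleft() if self._entries else None


def drain(cq):
    """Discard everything currently queued and return how many were dropped."""
    dropped = 0
    while cq.poll() is not None:
        dropped += 1
    return dropped


cq = ToyCQ()
cq.push(FLUSH_ERR)      # left behind when the QP was reset
cq.push(FLUSH_ERR)
dropped = drain(cq)     # the step that seems to be missing in the test
cq.push(SUCCESS)        # completion from the traffic after the rereg
first = cq.poll()       # without the drain this would be a flush error
```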

Bob
