All of lore.kernel.org
 help / color / mirror / Atom feed
* NFS RDMA test failure as of 5.14-rc1
@ 2021-07-27 13:43 Marciniszyn, Mike
  2021-07-27 17:14 ` Marciniszyn, Mike
  2021-07-27 20:17 ` Marciniszyn, Mike
  0 siblings, 2 replies; 17+ messages in thread
From: Marciniszyn, Mike @ 2021-07-27 13:43 UTC (permalink / raw)
  To: Haakon Bugge
  Cc: Chuck Lever III, linux-rdma, Dalessandro, Dennis,
	Jason Gunthorpe, Leon Romanovsky, Doug Ledford

Our CI testing has been failing for NFS RDMA since 5.14-rc1.

Based on kprobes, the NFS RDMA client creates its QP using rdma_create_qp(), and does post receives right away.

This patch below looks like it deleted the transition from RESET to INIT, breaking the client side NFS RDMA since post receives are not valid in RESET.

I suspect the patch needs to be reverted or NFS RDMA needs to handle the transition to INIT?

commit dc70f7c3ed34b081c02a611591c5079c53b771b8
Author: H�kon Bugge <haakon.bugge@oracle.com>
Date:   Tue Jun 22 15:39:56 2021 +0200

    RDMA/cma: Remove unnecessary INIT->INIT transition

    In rdma_create_qp(), a connected QP will be transitioned to the INIT
    state.

    Afterwards, the QP will be transitioned to the RTR state by the
    cma_modify_qp_rtr() function. But this function starts by performing an
    ib_modify_qp() to the INIT state again, before another ib_modify_qp() is
    performed to transition the QP to the RTR state.

    Hence, there is no need to transition the QP to the INIT state in
    rdma_create_qp().

    Link: https://lore.kernel.org/r/1624369197-24578-2-git-send-email-haakon.bugge@oracle.com
    Signed-off-by: H�kon Bugge <haakon.bugge@oracle.com>
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 13:43 NFS RDMA test failure as of 5.14-rc1 Marciniszyn, Mike
@ 2021-07-27 17:14 ` Marciniszyn, Mike
  2021-07-27 17:26   ` Chuck Lever III
  2021-07-27 20:17 ` Marciniszyn, Mike
  1 sibling, 1 reply; 17+ messages in thread
From: Marciniszyn, Mike @ 2021-07-27 17:14 UTC (permalink / raw)
  To: Haakon Bugge
  Cc: Chuck Lever III, linux-rdma, Dalessandro, Dennis,
	Jason Gunthorpe, Leon Romanovsky, Doug Ledford

.
> 
> I suspect the patch needs to be reverted or NFS RDMA needs to handle the
> transition to INIT?
> 

Reverting the patch below works.

> commit dc70f7c3ed34b081c02a611591c5079c53b771b8
> Author: H�kon Bugge <haakon.bugge@oracle.com>
> Date:   Tue Jun 22 15:39:56 2021 +0200
> 
>     RDMA/cma: Remove unnecessary INIT->INIT transition

A quick audit of .post_recv calls:

1 main.c           <global>                   698 .post_recv = bnxt_re_post_recv, <-- allows
2 provider.c       <global>                   489 .post_recv = c4iw_post_receive, <-- allows
3 hns_roce_hw_v1.c <global>                  4411 .post_recv = hns_roce_v1_post_recv, <-- allows
4 hns_roce_hw_v2.c <global>                  6190 .post_recv = hns_roce_v2_post_recv, <-- -EINVAL
5 verbs.c          <global>                  4396 .post_recv = irdma_post_recv, <-- allows
6 main.c           <global>                  2561 .post_recv = mlx4_ib_post_recv, <-- allows
7 main.c           <global>                  3767 .post_recv = mlx5_ib_post_recv_nodrain, <-- allows
8 mthca_provider.c <global>                  1148 .post_recv = mthca_arbel_post_receive, <-- allows
9 mthca_provider.c <global>                  1154 .post_recv = mthca_tavor_post_receive, <-- allows
a ocrdma_main.c    <global>                   173 .post_recv = ocrdma_post_recv, <-- -EINVAL
b main.c           <global>                   221 .post_recv = qedr_post_recv, <- -EINVAL
c pvrdma_main.c    <global>                   175 .post_recv = pvrdma_post_recv, <- -EINVAL
d vt.c             <global>                   392 .post_recv = rvt_post_recv,<- -EINVAL
e rxe_verbs.c	   <global>                  1132 .post_recv = rxe_post_recv, <- -EINVAL
f siw_main.c	   <global>                   287 .post_recv = siw_post_receive, <-- allows?

Looks like it is a non-portable assumption that an out-of-state post recv works.   I guess it is possible that the post happens and gets flushed back?

From the IBTA spec:

Reset to Init - Enable posting to the Receive Queue

... post receive notes "Invalid QP state" as a possible immediate failure, but nothing more binding than that.

Mike


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 17:14 ` Marciniszyn, Mike
@ 2021-07-27 17:26   ` Chuck Lever III
  2021-07-27 17:35     ` Marciniszyn, Mike
  0 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever III @ 2021-07-27 17:26 UTC (permalink / raw)
  To: Marciniszyn, Mike, Haakon Bugge
  Cc: linux-rdma, Dalessandro, Dennis, Jason Gunthorpe,
	Leon Romanovsky, Doug Ledford


> On Jul 27, 2021, at 1:14 PM, Marciniszyn, Mike <mike.marciniszyn@cornelisnetworks.com> wrote:
> 
>> I suspect the patch needs to be reverted or NFS RDMA needs to handle the
>> transition to INIT?

If I'm reading nvmet_rdma_create_queue_ib() correctly,
it invokes rdma_create_qp() then posts Receives. No
ib_modify_qp() there.

So some ULPs assume that rdma_create_qp() returns a
new QP in the IB_QPS_INIT state.

I can't say whether that is spec compliant or even
internally documented.


--
Chuck Lever




^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 17:26   ` Chuck Lever III
@ 2021-07-27 17:35     ` Marciniszyn, Mike
  2021-07-27 17:38       ` Jason Gunthorpe
  0 siblings, 1 reply; 17+ messages in thread
From: Marciniszyn, Mike @ 2021-07-27 17:35 UTC (permalink / raw)
  To: Chuck Lever III, Haakon Bugge
  Cc: linux-rdma, Dalessandro, Dennis, Jason Gunthorpe,
	Leon Romanovsky, Doug Ledford

> > On Jul 27, 2021, at 1:14 PM, Marciniszyn, Mike
> <mike.marciniszyn@cornelisnetworks.com> wrote:
> >
> >> I suspect the patch needs to be reverted or NFS RDMA needs to handle
> >> the transition to INIT?
> 
> If I'm reading nvmet_rdma_create_queue_ib() correctly, it invokes
> rdma_create_qp() then posts Receives. No
> ib_modify_qp() there.
> 
> So some ULPs assume that rdma_create_qp() returns a new QP in the
> IB_QPS_INIT state.
> 
> I can't say whether that is spec compliant or even internally documented.
> 

This from the spec:

C10-20: A newly created QP/EE shall be placed in the Reset state.

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 17:35     ` Marciniszyn, Mike
@ 2021-07-27 17:38       ` Jason Gunthorpe
  2021-07-27 17:45         ` Chuck Lever III
  2021-07-27 18:12         ` Leon Romanovsky
  0 siblings, 2 replies; 17+ messages in thread
From: Jason Gunthorpe @ 2021-07-27 17:38 UTC (permalink / raw)
  To: Marciniszyn, Mike
  Cc: Chuck Lever III, Haakon Bugge, linux-rdma, Dalessandro, Dennis,
	Leon Romanovsky, Doug Ledford

On Tue, Jul 27, 2021 at 05:35:46PM +0000, Marciniszyn, Mike wrote:
> > > On Jul 27, 2021, at 1:14 PM, Marciniszyn, Mike
> > <mike.marciniszyn@cornelisnetworks.com> wrote:
> > >
> > >> I suspect the patch needs to be reverted or NFS RDMA needs to handle
> > >> the transition to INIT?
> > 
> > If I'm reading nvmet_rdma_create_queue_ib() correctly, it invokes
> > rdma_create_qp() then posts Receives. No
> > ib_modify_qp() there.
> > 
> > So some ULPs assume that rdma_create_qp() returns a new QP in the
> > IB_QPS_INIT state.
> > 
> > I can't say whether that is spec compliant or even internally documented.
> > 
> 
> This from the spec:
> 
> C10-20: A newly created QP/EE shall be placed in the Reset state.

rdma_create_qp() is a CM function, it is not covered by the spec.

The question is if there is any reasonable basis for ULPs to require
that the QP be in the INIT state after the CM creates it, or if the
ULPs should wait to post their recvs until later on in the process.

Haakon's original analysis said that this was an INIT->INIT
transition, so I'm a bit confused why we lost a RESET->INIT transition
in the end?

Jason

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 17:38       ` Jason Gunthorpe
@ 2021-07-27 17:45         ` Chuck Lever III
  2021-07-27 21:20           ` Marciniszyn, Mike
  2021-07-27 18:12         ` Leon Romanovsky
  1 sibling, 1 reply; 17+ messages in thread
From: Chuck Lever III @ 2021-07-27 17:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Marciniszyn, Mike, Haakon Bugge, linux-rdma, Dalessandro, Dennis,
	Leon Romanovsky, Doug Ledford



> On Jul 27, 2021, at 1:38 PM, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> On Tue, Jul 27, 2021 at 05:35:46PM +0000, Marciniszyn, Mike wrote:
>>>> On Jul 27, 2021, at 1:14 PM, Marciniszyn, Mike
>>> <mike.marciniszyn@cornelisnetworks.com> wrote:
>>>> 
>>>>> I suspect the patch needs to be reverted or NFS RDMA needs to handle
>>>>> the transition to INIT?
>>> 
>>> If I'm reading nvmet_rdma_create_queue_ib() correctly, it invokes
>>> rdma_create_qp() then posts Receives. No
>>> ib_modify_qp() there.
>>> 
>>> So some ULPs assume that rdma_create_qp() returns a new QP in the
>>> IB_QPS_INIT state.
>>> 
>>> I can't say whether that is spec compliant or even internally documented.
>>> 
>> 
>> This from the spec:
>> 
>> C10-20: A newly created QP/EE shall be placed in the Reset state.
> 
> rdma_create_qp() is a CM function, it is not covered by the spec.
> 
> The question is if there is any reasonable basis for ULPs to require
> that the QP be in the INIT state after the CM creates it, or if the
> ULPs should wait to post their recvs until later on in the process.

Existing ULPs that have posted Receives well before the CM_ESTABLISHED
event suggest that there is a longstanding expectation that a QP
returned from rdma_create_qp() is in a state that is ready for posted
work.


> Haakon's original analysis said that this was an INIT->INIT
> transition, so I'm a bit confused why we lost a RESET->INIT transition
> in the end?

Perhaps the patch should have removed the ib_modify_qp() from
cma_modify_qp_rtr() instead.

--
Chuck Lever




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 17:38       ` Jason Gunthorpe
  2021-07-27 17:45         ` Chuck Lever III
@ 2021-07-27 18:12         ` Leon Romanovsky
  1 sibling, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2021-07-27 18:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Marciniszyn, Mike, Chuck Lever III, Haakon Bugge, linux-rdma,
	Dalessandro, Dennis, Doug Ledford

On Tue, Jul 27, 2021 at 02:38:57PM -0300, Jason Gunthorpe wrote:
> On Tue, Jul 27, 2021 at 05:35:46PM +0000, Marciniszyn, Mike wrote:
> > > > On Jul 27, 2021, at 1:14 PM, Marciniszyn, Mike
> > > <mike.marciniszyn@cornelisnetworks.com> wrote:
> > > >
> > > >> I suspect the patch needs to be reverted or NFS RDMA needs to handle
> > > >> the transition to INIT?
> > > 
> > > If I'm reading nvmet_rdma_create_queue_ib() correctly, it invokes
> > > rdma_create_qp() then posts Receives. No
> > > ib_modify_qp() there.
> > > 
> > > So some ULPs assume that rdma_create_qp() returns a new QP in the
> > > IB_QPS_INIT state.
> > > 
> > > I can't say whether that is spec compliant or even internally documented.
> > > 
> > 
> > This from the spec:
> > 
> > C10-20: A newly created QP/EE shall be placed in the Reset state.
> 
> rdma_create_qp() is a CM function, it is not covered by the spec.
> 
> The question is if there is any reasonable basis for ULPs to require
> that the QP be in the INIT state after the CM creates it, or if the
> ULPs should wait to post their recvs until later on in the process.
> 
> Haakon's original analysis said that this was an INIT->INIT
> transition, so I'm a bit confused why we lost a RESET->INIT transition
> in the end?

When I reviewed Haakon's patch, I saw that all accept/listen/e.t.c.
events modify QP from RESET to INIT. This is how we lost extra
transition.

Thanks

> 
> Jason

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 13:43 NFS RDMA test failure as of 5.14-rc1 Marciniszyn, Mike
  2021-07-27 17:14 ` Marciniszyn, Mike
@ 2021-07-27 20:17 ` Marciniszyn, Mike
  2021-07-28 13:50   ` Marciniszyn, Mike
  1 sibling, 1 reply; 17+ messages in thread
From: Marciniszyn, Mike @ 2021-07-27 20:17 UTC (permalink / raw)
  To: Jason Gunthorpe, Doug Ledford
  Cc: Chuck Lever III, linux-rdma, Dalessandro, Dennis,
	Leon Romanovsky, Haakon Bugge

> Subject: NFS RDMA test failure as of 5.14-rc1
> 
> Our CI testing has been failing for NFS RDMA since 5.14-rc1.
> 
> Based on kprobes, the NFS RDMA client creates its QP using
> rdma_create_qp(), and does post receives right away.
> 
> This patch below looks like it deleted the transition from RESET to INIT,
> breaking the client side NFS RDMA since post receives are not valid in RESET.
> 
> I suspect the patch needs to be reverted or NFS RDMA needs to handle the
> transition to INIT?
> 
> commit dc70f7c3ed34b081c02a611591c5079c53b771b8
> Author: H�kon Bugge <haakon.bugge@oracle.com>
> Date:   Tue Jun 22 15:39:56 2021 +0200
> 
>     RDMA/cma: Remove unnecessary INIT->INIT transition
> 
>     In rdma_create_qp(), a connected QP will be transitioned to the INIT
>     state.
> 
>     Afterwards, the QP will be transitioned to the RTR state by the
>     cma_modify_qp_rtr() function. But this function starts by performing an
>     ib_modify_qp() to the INIT state again, before another ib_modify_qp() is
>     performed to transition the QP to the RTR state.
> 
>     Hence, there is no need to transition the QP to the INIT state in
>     rdma_create_qp().
> 
>     Link: https://lore.kernel.org/r/1624369197-24578-2-git-send-email-
> haakon.bugge@oracle.com
>     Signed-off-by: H�kon Bugge <haakon.bugge@oracle.com>
>     Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
>     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

A brief unit test with the patch reverted in 5.14-rc3 shows that this patch may be responsible for iSer CI regressions there as well.

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 17:45         ` Chuck Lever III
@ 2021-07-27 21:20           ` Marciniszyn, Mike
  2021-07-27 23:10             ` Chuck Lever III
  0 siblings, 1 reply; 17+ messages in thread
From: Marciniszyn, Mike @ 2021-07-27 21:20 UTC (permalink / raw)
  To: Chuck Lever III, Jason Gunthorpe
  Cc: Haakon Bugge, linux-rdma, Dalessandro, Dennis, Leon Romanovsky,
	Doug Ledford

> > Haakon's original analysis said that this was an INIT->INIT
> > transition, so I'm a bit confused why we lost a RESET->INIT transition
> > in the end?
> 
> Perhaps the patch should have removed the ib_modify_qp() from
> cma_modify_qp_rtr() instead.
> 

I think that will work.

Do you want to post a patch and I can test?

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 21:20           ` Marciniszyn, Mike
@ 2021-07-27 23:10             ` Chuck Lever III
  2021-07-28  4:31               ` Leon Romanovsky
  0 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever III @ 2021-07-27 23:10 UTC (permalink / raw)
  To: Marciniszyn, Mike
  Cc: Jason Gunthorpe, Haakon Bugge, linux-rdma, Dalessandro, Dennis,
	Leon Romanovsky, Doug Ledford


> On Jul 27, 2021, at 5:20 PM, Marciniszyn, Mike <mike.marciniszyn@cornelisnetworks.com> wrote:
> 
>>> Haakon's original analysis said that this was an INIT->INIT
>>> transition, so I'm a bit confused why we lost a RESET->INIT transition
>>> in the end?
>> 
>> Perhaps the patch should have removed the ib_modify_qp() from
>> cma_modify_qp_rtr() instead.
> 
> I think that will work.

Implemented and tested. It doesn't work. :-)

The conclusion I draw is that there are still spots that depend
on one or the other of those state transitions remaining where
they are.

--
Chuck Lever




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 23:10             ` Chuck Lever III
@ 2021-07-28  4:31               ` Leon Romanovsky
  0 siblings, 0 replies; 17+ messages in thread
From: Leon Romanovsky @ 2021-07-28  4:31 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Marciniszyn, Mike, Jason Gunthorpe, Haakon Bugge, linux-rdma,
	Dalessandro, Dennis, Doug Ledford

On Tue, Jul 27, 2021 at 11:10:25PM +0000, Chuck Lever III wrote:
> 
> > On Jul 27, 2021, at 5:20 PM, Marciniszyn, Mike <mike.marciniszyn@cornelisnetworks.com> wrote:
> > 
> >>> Haakon's original analysis said that this was an INIT->INIT
> >>> transition, so I'm a bit confused why we lost a RESET->INIT transition
> >>> in the end?
> >> 
> >> Perhaps the patch should have removed the ib_modify_qp() from
> >> cma_modify_qp_rtr() instead.
> > 
> > I think that will work.
> 
> Implemented and tested. It doesn't work. :-)
> 
> The conclusion I draw is that there are still spots that depend
> on one or the other of those state transitions remaining where
> they are.

So let's revert this patch.

Thanks

> 
> --
> Chuck Lever
> 
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: NFS RDMA test failure as of 5.14-rc1
  2021-07-27 20:17 ` Marciniszyn, Mike
@ 2021-07-28 13:50   ` Marciniszyn, Mike
  2021-07-28 17:05     ` Jason Gunthorpe
  0 siblings, 1 reply; 17+ messages in thread
From: Marciniszyn, Mike @ 2021-07-28 13:50 UTC (permalink / raw)
  To: Jason Gunthorpe, Doug Ledford
  Cc: Chuck Lever III, linux-rdma, Dalessandro, Dennis,
	Leon Romanovsky, Haakon Bugge

> >
> > commit dc70f7c3ed34b081c02a611591c5079c53b771b8
> > Author: H kon Bugge <haakon.bugge@oracle.com>
> > Date:   Tue Jun 22 15:39:56 2021 +0200
> >
> >     RDMA/cma: Remove unnecessary INIT->INIT transition
> >
> >     In rdma_create_qp(), a connected QP will be transitioned to the INIT
> >     state.
> >
> >     Afterwards, the QP will be transitioned to the RTR state by the
> >     cma_modify_qp_rtr() function. But this function starts by performing an
> >     ib_modify_qp() to the INIT state again, before another ib_modify_qp() is
> >     performed to transition the QP to the RTR state.
> >
> >     Hence, there is no need to transition the QP to the INIT state in
> >     rdma_create_qp().
> >
> >     Link: https://lore.kernel.org/r/1624369197-24578-2-git-send-email-
> > haakon.bugge@oracle.com
> >     Signed-off-by: H kon Bugge <haakon.bugge@oracle.com>
> >     Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> >     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> A brief unit test with the patch reverted in 5.14-rc3 shows that this patch may
> be responsible for iSer CI regressions there as well.

A test of 5.15-rc3 + a revert tested clean.

Jason, do you need a patch to revert or should I send one.

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-07-28 13:50   ` Marciniszyn, Mike
@ 2021-07-28 17:05     ` Jason Gunthorpe
  2021-07-29 18:28       ` Marciniszyn, Mike
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2021-07-28 17:05 UTC (permalink / raw)
  To: Marciniszyn, Mike
  Cc: Doug Ledford, Chuck Lever III, linux-rdma, Dalessandro, Dennis,
	Leon Romanovsky, Haakon Bugge

On Wed, Jul 28, 2021 at 01:50:48PM +0000, Marciniszyn, Mike wrote:
> > >
> > > commit dc70f7c3ed34b081c02a611591c5079c53b771b8
> > > Author: H kon Bugge <haakon.bugge@oracle.com>
> > > Date:   Tue Jun 22 15:39:56 2021 +0200
> > >
> > >     RDMA/cma: Remove unnecessary INIT->INIT transition
> > >
> > >     In rdma_create_qp(), a connected QP will be transitioned to the INIT
> > >     state.
> > >
> > >     Afterwards, the QP will be transitioned to the RTR state by the
> > >     cma_modify_qp_rtr() function. But this function starts by performing an
> > >     ib_modify_qp() to the INIT state again, before another ib_modify_qp() is
> > >     performed to transition the QP to the RTR state.
> > >
> > >     Hence, there is no need to transition the QP to the INIT state in
> > >     rdma_create_qp().
> > >
> > >     Link: https://lore.kernel.org/r/1624369197-24578-2-git-send-email-
> > > haakon.bugge@oracle.com
> > >     Signed-off-by: H kon Bugge <haakon.bugge@oracle.com>
> > >     Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> > >     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > 
> > A brief unit test with the patch reverted in 5.14-rc3 shows that this patch may
> > be responsible for iSer CI regressions there as well.
> 
> A test of 5.15-rc3 + a revert tested clean.
> 
> Jason, do you need a patch to revert or should I send one.

Please, I would like to hear from Haakon as well

Jason

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: NFS RDMA test failure as of 5.14-rc1
  2021-07-28 17:05     ` Jason Gunthorpe
@ 2021-07-29 18:28       ` Marciniszyn, Mike
  2021-08-09 13:38         ` Haakon Bugge
  0 siblings, 1 reply; 17+ messages in thread
From: Marciniszyn, Mike @ 2021-07-29 18:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, Chuck Lever III, linux-rdma, Dalessandro, Dennis,
	Leon Romanovsky, Haakon Bugge

> > A test of 5.15-rc3 + a revert tested clean.
> >
> > Jason, do you need a patch to revert or should I send one.
> 
> Please, I would like to hear from Haakon as well
> 
The revert is https://lore.kernel.org/linux-rdma/1627583182-81330-1-git-send-email-mike.marciniszyn@cornelisnetworks.com/T/#u.

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-07-29 18:28       ` Marciniszyn, Mike
@ 2021-08-09 13:38         ` Haakon Bugge
  2021-08-09 14:00           ` Chuck Lever III
  0 siblings, 1 reply; 17+ messages in thread
From: Haakon Bugge @ 2021-08-09 13:38 UTC (permalink / raw)
  To: Marciniszyn, Mike
  Cc: Jason Gunthorpe, Doug Ledford, Chuck Lever III,
	OFED mailing list, Dalessandro, Dennis, Leon Romanovsky



> On 29 Jul 2021, at 20:28, Marciniszyn, Mike <mike.marciniszyn@cornelisnetworks.com> wrote:
> 
>>> A test of 5.15-rc3 + a revert tested clean.
>>> 
>>> Jason, do you need a patch to revert or should I send one.
>> 
>> Please, I would like to hear from Haakon as well
>> 
> The revert is https://lore.kernel.org/linux-rdma/1627583182-81330-1-git-send-email-mike.marciniszyn@cornelisnetworks.com/T/#u.

Hi Mike/Chuck/Jason/Leon,


Thanks so much for filling in for me while I was on vacation.

Short term I agree on the revert. But I actually think commit dc70f7c3ed34 ("RDMA/cma: Remove unnecessary INIT->INIT transition") is correcting an erroneous behaviour in the CMA. But it needs more.

1. An API that has a call that has to be made twice (modify_qp --> INIT) is either incorrectly designed or incorrectly used IMHO.

2. IBTA is quite clear, that the transition to Initialized shall happen on the Active side when a REQ is sent (section 12.9.7.1 ACTIVE STATES) and the CM state transitions to the "REQ Sent" state. Similar on the Passive side, the transition to Initialized shall happen when you are in the CMA LISTEN state and you receive a REQ and the CM state is transitioned to "REQ Rcvd" state (section 12.9.7.2 PASSIVE STATES).

3. Performance-wise, the WR posting _before_ sending a REQ is sent on the Active side (rdma_connect()) or before calling rdma_listen() on the passive side, prolongs the time before said REQ is sent or the server is ready to accept. Doing the WR posting as depicted by IBTA above, the time spent filling the recv queues are hidden because we're waiting for a communication response anyway. Not saying this is pronounced, but worth to mention.

The problem here seems to be, CMA does incorrectly return a QP in the INIT state after rdma_create_qp() and some ULPs take advantage of it, it does not transition the QP to INIT state when REQ is sent or received as per IBTA, but has the (second) transition to INIT just before the transition to RTR.

Should this be changed such that the QP is transitioned to the INIT state during rdma_connect() and rdma_accept()? After the respective calls, the ULP is allowed to post recvs. This also aligns nicely with the use of rdma_set_ack_timeout() and rdma_set_min_rnr_timer().


Thxs, Håkon


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: NFS RDMA test failure as of 5.14-rc1
  2021-08-09 13:38         ` Haakon Bugge
@ 2021-08-09 14:00           ` Chuck Lever III
  2021-08-10 15:09             ` Marciniszyn, Mike
  0 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever III @ 2021-08-09 14:00 UTC (permalink / raw)
  To: Haakon Bugge
  Cc: Marciniszyn, Mike, Jason Gunthorpe, Doug Ledford,
	OFED mailing list, Dalessandro, Dennis, Leon Romanovsky



> On Aug 9, 2021, at 9:38 AM, Haakon Bugge <haakon.bugge@oracle.com> wrote:
> 
> 
> 
>> On 29 Jul 2021, at 20:28, Marciniszyn, Mike <mike.marciniszyn@cornelisnetworks.com> wrote:
>> 
>>>> A test of 5.15-rc3 + a revert tested clean.
>>>> 
>>>> Jason, do you need a patch to revert or should I send one.
>>> 
>>> Please, I would like to hear from Haakon as well
>>> 
>> The revert is https://lore.kernel.org/linux-rdma/1627583182-81330-1-git-send-email-mike.marciniszyn@cornelisnetworks.com/T/#u.
> 
> Hi Mike/Chuck/Jason/Leon,
> 
> 
> Thanks so much for filling in for me while I was on vacation.
> 
> Short term I agree on the revert. But I actually think commit dc70f7c3ed34 ("RDMA/cma: Remove unnecessary INIT->INIT transition") is correcting an erroneous behaviour in the CMA. But it needs more.
> 
> 1. An API that has a call that has to be made twice (modify_qp --> INIT) is either incorrectly designed or incorrectly used IMHO.
> 
> 2. IBTA is quite clear, that the transition to Initialized shall happen on the Active side when a REQ is sent (section 12.9.7.1 ACTIVE STATES) and the CM state transitions to the "REQ Sent" state. Similar on the Passive side, the transition to Initialized shall happen when you are in the CMA LISTEN state and you receive a REQ and the CM state is transitioned to "REQ Rcvd" state (section 12.9.7.2 PASSIVE STATES).
> 
> 3. Performance-wise, the WR posting _before_ sending a REQ is sent on the Active side (rdma_connect()) or before calling rdma_listen() on the passive side, prolongs the time before said REQ is sent or the server is ready to accept. Doing the WR posting as depicted by IBTA above, the time spent filling the recv queues are hidden because we're waiting for a communication response anyway. Not saying this is pronounced, but worth to mention.
> 
> The problem here seems to be, CMA does incorrectly return a QP in the INIT state after rdma_create_qp() and some ULPs take advantage of it, it does not transition the QP to INIT state when REQ is sent or received as per IBTA, but has the (second) transition to INIT just before the transition to RTR.
> 
> Should this be changed such that the QP is transitioned to the INIT state during rdma_connect() and rdma_accept()? After the respective calls, the ULP is allowed to post recvs. This also aligns nicely with the use of rdma_set_ack_timeout() and rdma_set_min_rnr_timer().

I don't have a philosophical position on exactly how rdma_create_qp()
should work going forward, but I agree that ULPs have depended on the
current behavior and will need to be updated if the QPs returned by
rdma_create_qp() are not in INIT state. I stand ready to fix things
up in the RPC/RDMA consumers should that be needed.

In fact it looks like some consumers might already assume the corrected
CMA behavior. Maybe the RPC/RDMA consumers can safely be modified now?
Let me know how to proceed if this is the case.


--
Chuck Lever




^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: NFS RDMA test failure as of 5.14-rc1
  2021-08-09 14:00           ` Chuck Lever III
@ 2021-08-10 15:09             ` Marciniszyn, Mike
  0 siblings, 0 replies; 17+ messages in thread
From: Marciniszyn, Mike @ 2021-08-10 15:09 UTC (permalink / raw)
  To: Chuck Lever III, Haakon Bugge
  Cc: Jason Gunthorpe, Doug Ledford, OFED mailing list, Dalessandro,
	Dennis, Leon Romanovsky

> I don't have a philosophical position on exactly how rdma_create_qp()
> should work going forward, but I agree that ULPs have depended on the
> current behavior and will need to be updated if the QPs returned by
> rdma_create_qp() are not in INIT state. I stand ready to fix things up in the
> RPC/RDMA consumers should that be needed.
> 
> In fact it looks like some consumers might already assume the corrected CMA
> behavior. Maybe the RPC/RDMA consumers can safely be modified now?
> Let me know how to proceed if this is the case.
> 

I haven't done the same ULP audit as Chuck, but I do know that the cma.c patch caused failures on both iSer and NFS/RDMA.

The revert fixed ALL the issues.

Let me know how to help test any successor patch.

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-08-10 15:09 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-27 13:43 NFS RDMA test failure as of 5.14-rc1 Marciniszyn, Mike
2021-07-27 17:14 ` Marciniszyn, Mike
2021-07-27 17:26   ` Chuck Lever III
2021-07-27 17:35     ` Marciniszyn, Mike
2021-07-27 17:38       ` Jason Gunthorpe
2021-07-27 17:45         ` Chuck Lever III
2021-07-27 21:20           ` Marciniszyn, Mike
2021-07-27 23:10             ` Chuck Lever III
2021-07-28  4:31               ` Leon Romanovsky
2021-07-27 18:12         ` Leon Romanovsky
2021-07-27 20:17 ` Marciniszyn, Mike
2021-07-28 13:50   ` Marciniszyn, Mike
2021-07-28 17:05     ` Jason Gunthorpe
2021-07-29 18:28       ` Marciniszyn, Mike
2021-08-09 13:38         ` Haakon Bugge
2021-08-09 14:00           ` Chuck Lever III
2021-08-10 15:09             ` Marciniszyn, Mike

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.