From: Bernard Metzler <BMT@zurich.ibm.com>
To: Cheng Xu <chengyou@linux.alibaba.com>,
	"jgg@ziepe.ca" <jgg@ziepe.ca>,
	"leon@kernel.org" <leon@kernel.org>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Subject: RE: [PATCH for-next] RDMA/siw: Fix duplicated reported IW_CM_EVENT_CONNECT_REPLY event
Date: Thu, 14 Jul 2022 13:58:03 +0000
Message-ID: <BYAPR15MB2631A9EEA0F43AFCE86B790F99889@BYAPR15MB2631.namprd15.prod.outlook.com>
In-Reply-To: <23ee6969-c32d-911a-2430-d9e3f6c52a61@linux.alibaba.com>

> -----Original Message-----
> From: Cheng Xu <chengyou@linux.alibaba.com>
> Sent: Thursday, 14 July 2022 15:20
> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] Re: [PATCH for-next] RDMA/siw: Fix duplicated reported
> IW_CM_EVENT_CONNECT_REPLY event
> 
> 
> 
> On 7/14/22 8:59 PM, Bernard Metzler wrote:
> >> -----Original Message-----
> >> From: Cheng Xu <chengyou@linux.alibaba.com>
> >> Sent: Thursday, 14 July 2022 03:31
> >> To: jgg@ziepe.ca; leon@kernel.org; Bernard Metzler <BMT@zurich.ibm.com>
> >> Cc: linux-rdma@vger.kernel.org; chengyou@linux.alibaba.com
> >> Subject: [EXTERNAL] [PATCH for-next] RDMA/siw: Fix duplicated reported
> >> IW_CM_EVENT_CONNECT_REPLY event
> >>
> >> If siw_recv_mpa_rr returns -EAGAIN, the MPA reply has not been
> >> received completely, and IW_CM_EVENT_CONNECT_REPLY should not be
> >> reported in this case. Doing so may trigger a call trace in iw_cm.
> >> A simple way to trigger this:
> >
> > Great, thanks! I obviously never hit an incomplete
> > MPA header. Please make another change to fix it correctly,
> > as suggested below.
> >
> >>  server: ib_send_lat
> >>  client: ib_send_lat -R <server_ip>
> >>
> >> The call trace looks like this:
> >>
> >>  kernel BUG at drivers/infiniband/core/iwcm.c:894!
> >>  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> >>  <...>
> >>  Workqueue: iw_cm_wq cm_work_handler [iw_cm]
> >>  Call Trace:
> >>   <TASK>
> >>   cm_work_handler+0x1dd/0x370 [iw_cm]
> >>   process_one_work+0x1e2/0x3b0
> >>   worker_thread+0x49/0x2e0
> >>   ? rescuer_thread+0x370/0x370
> >>   kthread+0xe5/0x110
> >>   ? kthread_complete_and_exit+0x20/0x20
> >>   ret_from_fork+0x1f/0x30
> >>   </TASK>
> >>
> >> Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
> >> ---
> >>  drivers/infiniband/sw/siw/siw_cm.c | 7 ++++---
> >>  1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/sw/siw/siw_cm.c
> >> b/drivers/infiniband/sw/siw/siw_cm.c
> >> index 17f34d584cd9..f88d2971c2c6 100644
> >> --- a/drivers/infiniband/sw/siw/siw_cm.c
> >> +++ b/drivers/infiniband/sw/siw/siw_cm.c
> >> @@ -725,11 +725,11 @@ static int siw_proc_mpareply(struct siw_cep *cep)
> >>  	enum mpa_v2_ctrl mpa_p2p_mode = MPA_V2_RDMA_NO_RTR;
> >>
> >>  	rv = siw_recv_mpa_rr(cep);
> >> -	if (rv != -EAGAIN)
> >> -		siw_cancel_mpatimer(cep);
> >>  	if (rv)
> >>  		goto out_err;
> >>
> >> +	siw_cancel_mpatimer(cep);
> >> +
> >
> > Cancel the MPA timer only if we have a
> > real error. -EAGAIN translates to just
> > further waiting. So best to add the timer
> > cancellation to the error bailout section.
> >
> >>  	rep = &cep->mpa.hdr;
> >>
> >>  	if (__mpa_rr_revision(rep->params.bits) > MPA_REVISION_2) {
> >> @@ -895,7 +895,8 @@ static int siw_proc_mpareply(struct siw_cep *cep)
> >>  	}
> >>
> >>  out_err:
> >> -	siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -EINVAL);
> >> +	if (rv != -EAGAIN)
> > {
> > cancel MPA timer here.
> 
> Actually, we do not need it here: when siw_proc_mpareply returns an error
> other than -EAGAIN, release_cep is set in the caller (siw_cm_work_handler),
> and siw_cancel_mpatimer is called in the error handling flow.
> 
> I think this is better, because the error handling is more unified.

Yes, sorry, your original patch is correct.
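
To spell out the flow described above, here is a minimal user-space
sketch. It is not the driver code: struct cep_sketch and every *_stub and
*_sketch name are hypothetical stand-ins, header validation is omitted, and
the success path is reduced to a single upcall. It only models three cases:
-EAGAIN keeps the timer armed and reports nothing, a real error reports
once and leaves the timer cancellation to the caller, and success cancels
the timer in place.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct siw_cep; tracks only what the sketch needs. */
struct cep_sketch {
	int recv_rv;       /* value the stubbed receive step returns */
	bool timer_armed;  /* is the MPA timer still pending? */
	int upcalls;       /* number of CONNECT_REPLY events reported */
};

/* Stubs standing in for siw_recv_mpa_rr, siw_cancel_mpatimer, siw_cm_upcall. */
static int recv_mpa_rr_stub(struct cep_sketch *cep)
{
	return cep->recv_rv;
}

static void cancel_mpatimer_stub(struct cep_sketch *cep)
{
	cep->timer_armed = false;
}

static void cm_upcall_stub(struct cep_sketch *cep)
{
	cep->upcalls++;
}

/* Models the patched siw_proc_mpareply control flow (assumption: simplified). */
static int proc_mpareply_sketch(struct cep_sketch *cep)
{
	int rv = recv_mpa_rr_stub(cep);

	if (rv)
		goto out_err;

	/* Reply fully received: stop waiting for it. */
	cancel_mpatimer_stub(cep);

	/* Header validation omitted; report the reply once. */
	cm_upcall_stub(cep);
	return 0;

out_err:
	/* -EAGAIN means "keep waiting", so nothing is reported yet. */
	if (rv != -EAGAIN)
		cm_upcall_stub(cep);

	return rv;
}

/* Models the caller: a real error sets release_cep, which cancels the timer. */
static void cm_work_handler_sketch(struct cep_sketch *cep)
{
	int rv = proc_mpareply_sketch(cep);
	bool release_cep = rv && rv != -EAGAIN;

	if (release_cep)
		cancel_mpatimer_stub(cep);
}
```

With recv_rv set to -EAGAIN the sketch leaves timer_armed true and upcalls
at 0; with -EINVAL it reports one event and the caller cancels the timer.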

> 
> What do you think?
> 
> Thanks,
> Cheng Xu
> 
> 
> > 		siw_cancel_mpatimer(cep);
> >> +		siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -EINVAL);
> > }
> >>
> >>  	return rv;
> >>  }
> >> --
> >> 2.37.0
> >

Thank you!


Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
