From: Jason Gunthorpe <jgg@nvidia.com>
To: Haakon Bugge <haakon.bugge@oracle.com>
Cc: Doug Ledford <dledford@redhat.com>,
	OFED mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH for-rc] IB/cma: Fix false P_Key mismatch messages
Date: Fri, 9 Jul 2021 13:56:24 -0300
Message-ID: <20210709165624.GB1541340@nvidia.com>
In-Reply-To: <0F551BD8-4179-4090-B739-F30202A0BEE6@oracle.com>

On Fri, Jul 09, 2021 at 04:45:21PM +0000, Haakon Bugge wrote:
> 
> 
> > On 8 Jul 2021, at 20:52, Jason Gunthorpe <jgg@nvidia.com> wrote:
> > 
> > On Thu, Jul 08, 2021 at 03:59:25PM +0000, Haakon Bugge wrote:
> >> 
> >> 
> >>> On 5 Jul 2021, at 18:59, Haakon Bugge <haakon.bugge@oracle.com> wrote:
> >>> 
> >>> 
> >>> 
> >>>> On 5 Jul 2021, at 18:26, Jason Gunthorpe <jgg@nvidia.com> wrote:
> >>>> 
> >>>> On Tue, Jun 29, 2021 at 01:45:35PM +0000, Haakon Bugge wrote:
> >>>> 
> >>>>>>>> IMHO it is a bug on the sender side to send GMPs to use a pkey that
> >>>>>>>> doesn't exactly match the data path pkey.
> >>>>>>> 
> >>>>>>> The active connector calls ib_addr_get_pkey(). This function
> >>>>>>> extracts the pkey from byte 8/9 in the device's bcast
> >>>>>>> address. However, RFC 4391 explicitly states:
> >>>>>> 
> >>>>>> pkeys in CM come only from path records that the SM returns, the above
> >>>>>> should only be used to feed into a path record query which could then
> >>>>>> return back a limited pkey.
> >>>>>> 
> >>>>>> Everything thereafter should use the SM's version of the pkey.
> >>>>> 
> >>>>> Revisiting this. I think I mis-interpreted the scenario that led to
> >>>>> the P_Key mismatch messages.
> >>>>> 
> >>>>> The CM retrieves the pkey_index that matched the P_Key in the BTH
> >>>>> (cm_get_bth_pkey()) and thereafter calls ib_get_cached_pkey() to get
> >>>>> the P_Key value of the particular pkey_index.
> >>>>> 
> >>>>> Assume a full-member sends a REQ. In that case, both P_Keys (BTH and
> >>>>> primary path_rec) are full. Further, assume the recipient is only a
> >>>>> limited member. Since full and limited members of the same partition
> >>>>> are eligible to communicate, the P_Key retrieved by
> >>>>> cm_get_bth_pkey() will be the limited one.
> >>>> 
> >>>> It is incorrect for the issuer of the REQ to put a full pkey in the
> >>>> REQ message when the target is a limited member.
> >>> 
> >>> Sorry, I misinterpreted the spec. I thought the P_Key in the path record
> >>> should be that of the initiator, not the target's. OK. Will come up
> >>> with a fix.
> >> 
> >> On the systems I have access to (running Oracle flavour OpenSM in
> >> our NM2 switches), the behaviour is exactly the opposite of what you
> >> say.
> > 
> > Check with saquery what is happening, if you request a reversible path
> > from the CM target (limited pkey) to the CM client (full) you should
> > get the limited pkey or the SM is broken.
> > 
> > If the SM is working then probably something in the stack is using a
> > reversed src/dest when doing the PR query.
> > 
> > It is not intuitive but the PR query should have SGID as the CM Target
> > even though it is running on the CM Client.
> 
> That is not how it is today. And because of that, all accesses to
> the PR assume the d{gid,lid} is the remote peer. To fix this, I have
> to swap dgid/sgid and ib.dlid/ib.slid all over to get this
> working. That is pervasive. E.g., even includes ipoib. Let me know
> if that is what you want.

It is only things that use the paths to generate CM REQ messages, and
yes it is the right thing to do.

Jason
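
For readers following the thread, here is a minimal sketch of the two
points under discussion. It is not code from the kernel or from the patch.
All names (pkey_is_full, pkeys_can_communicate, pkeys_identical,
demo_gid, demo_path_query, demo_build_query_for_req) are hypothetical and
exist only for illustration. The sketch assumes the usual InfiniBand
P_Key layout: bit 15 is the membership bit (1 = full, 0 = limited), the
low 15 bits identify the partition, and two limited members of the same
partition may not talk to each other.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKEY_FULL_MEMBER 0x8000u /* bit 15: 1 = full member, 0 = limited */
#define PKEY_BASE_MASK   0x7fffu /* low 15 bits: partition identifier */

static inline bool pkey_is_full(uint16_t pkey)
{
	return (pkey & PKEY_FULL_MEMBER) != 0;
}

/*
 * Partition check: same partition, and at least one side is a full
 * member. This is why a full-member REQ can be received by a limited
 * member at all.
 */
static inline bool pkeys_can_communicate(uint16_t a, uint16_t b)
{
	if ((a & PKEY_BASE_MASK) != (b & PKEY_BASE_MASK))
		return false;
	return pkey_is_full(a) || pkey_is_full(b);
}

/*
 * Exact comparison, membership bit included. A full P_Key carried in a
 * REQ compared against the limited P_Key found in the recipient's table
 * fails this test even though the partition check above passes.
 */
static inline bool pkeys_identical(uint16_t a, uint16_t b)
{
	return a == b;
}

/* Hypothetical stand-ins; these are not the kernel's path record types. */
struct demo_gid {
	uint8_t raw[16];
};

struct demo_path_query {
	struct demo_gid sgid;
	struct demo_gid dgid;
};

/*
 * Orientation described in the thread for paths used to generate CM REQ
 * messages: the query runs on the CM client, but SGID is the CM target,
 * so that the SM can return the target's (possibly limited) P_Key.
 */
static inline struct demo_path_query
demo_build_query_for_req(const struct demo_gid *cm_client,
			 const struct demo_gid *cm_target)
{
	struct demo_path_query q;

	q.sgid = *cm_target;
	q.dgid = *cm_client;
	return q;
}
```

With a full P_Key of 0xffff in the REQ and a limited 0x7fff on the
recipient, pkeys_can_communicate() returns true while pkeys_identical()
returns false, which is the situation that produces the mismatch message
discussed above.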
