linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Doug Ledford <dledford@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>,
	Steve Wise <larrystevenwise@gmail.com>,
	linux-rdma@vger.kernel.org, 3100102071@zju.edu.cn
Subject: Re: [PATCH 2/2] rxe: correctly calculate iCRC for unaligned payloads
Date: Tue, 10 Dec 2019 08:54:10 +0200	[thread overview]
Message-ID: <20191210065410.GK67461@unreal> (raw)
In-Reply-To: <1aee0f71873a4c9da7f965c12419d81333f3a0b4.camel@redhat.com>

On Mon, Dec 09, 2019 at 02:07:06PM -0500, Doug Ledford wrote:
> On Tue, 2019-12-03 at 19:46 -0500, Doug Ledford wrote:
> > On Tue, 2019-12-03 at 08:25 -0800, Bart Van Assche wrote:
> > > On 12/2/19 6:03 PM, Steve Wise wrote:
> > > > If RoCE PDUs being sent or received contain pad bytes, then the
> > > > iCRC
> > > > is miscalculated resulting PDUs being emitted by RXE with an
> > > > incorrect
> > > > iCRC, as well as ingress PDUs being dropped due to erroneously
> > > > detecting
> > > > a bad iCRC in the PDU.  The fix is to include the pad bytes, if
> > > > any,
> > > > in iCRC computations.
> > >
> > > Should this description mention that this patch breaks
> > > compatibility
> > > with SoftRoCE drivers that do not include this fix? Do we need a
> > > kernel
> > > module parameter that allows to select either the old or the new
> > > behavior?
> >
> > No.  The original soft-RoCE driver was supposed to be compatible with
> > hardware devices.  Because of this bug, it obviously wasn't.  This is
> > a
> > bug fix, and we do not need to do anything to be compatible with the
> > broken behavior.  Instead, it just needs noting that the soft-RoCE
> > implementation in prior kernels has a known wire format bug that
> > impacts
> > communications with both fixed versions of the driver and real
> > hardware
> > devices.
>
> I've taken these two patches into for-rc (with fixups to the commit
> message on the second, as well as adding a Fixes: tag on the second).
>
> I stand by what I said about not needing a compatibility flag or module
> option for the user to set.  However, that isn't to say that we can't
> autodetect old soft-RoCE peers.  If we get a packet that fails CRC and
> has pad bytes, then re-run the CRC without the pad bytes and see if it
> matches.  If it does, we could A) mark the current QP as being to an old
> soft-RoCE device (causing us to send without including the pad bytes in
> the CRC) and B) allocate a struct old_soft_roce_peer and save the guid
> into that struct and then put that struct on a list that we then search
> any time we are creating a new queue pair and if the new queue pair goes
> to a guid in the list, then we immediately flag that qp as being to an
> old soft roce device and get the right behavior.  It would slow down qp
> creation somewhat due to the list search, but probably not enough to
> worry about.  No one will be doing a 1,000 node cluster MPI job over
> soft-RoCE, so we should never notice the list length causing search
> problems.  A patch to do something like that would be welcome.

Do you find this implementation needed? I see RXE as a development
platform and in my view it is unlikely that someone will run RXE in
production with mixture of different kernel versions, which requires
such compatibility fallback.

Thanks

>
> --
> Doug Ledford <dledford@redhat.com>
>     GPG KeyID: B826A3330E572FDD
>     Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



  reply	other threads:[~2019-12-10  6:54 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-03  2:03 [PATCH 1/2] Update mailmap info for Steve Wise Steve Wise
2019-12-03  2:03 ` [PATCH 2/2] rxe: correctly calculate iCRC for unaligned payloads Steve Wise
2019-12-03 16:25   ` Bart Van Assche
2019-12-03 18:32     ` Steve Wise
2019-12-04  0:46     ` Doug Ledford
2019-12-09 19:07       ` Doug Ledford
2019-12-10  6:54         ` Leon Romanovsky [this message]
2019-12-11  4:24           ` Doug Ledford
2019-12-11  5:59             ` Leon Romanovsky
2019-12-11 14:42             ` Tom Talpey
2019-12-12 22:06               ` Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191210065410.GK67461@unreal \
    --to=leon@kernel.org \
    --cc=3100102071@zju.edu.cn \
    --cc=bvanassche@acm.org \
    --cc=dledford@redhat.com \
    --cc=larrystevenwise@gmail.com \
    --cc=linux-rdma@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).