From: Olga Kornievskaia <aglo@umich.edu>
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: Zhu Yanjun <zyjzyj2000@gmail.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	linux-rdma <linux-rdma@vger.kernel.org>
Subject: Re: RDMA/rxe is broken (impacting running NFSoRDMA over softRoCE)
Date: Tue, 20 Jul 2021 17:48:03 -0400
Message-ID: <CAN-5tyFQd3wzRXtcQoO0wC-bU1Ggk05K7ikokY_ZGZidG=CP5A@mail.gmail.com>
In-Reply-To: <63d7f374-1252-82c8-769d-2d1a540466fd@gmail.com>

On Tue, Jul 20, 2021 at 2:27 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> On 7/19/21 10:46 PM, Olga Kornievskaia wrote:
> > Hello,
> >
> > I would like to report that the rxe driver got broken some time
> > between 5.13 and 5.14-rc1 (so basically the last git pull). It's not
> > just NFSoRDMA but simple rping doesn't work. I believe I found the
> > problematic commit: 5bcf5a59c41e19141783c7305d420a5e36c937b2
> > "RDMA/rxe: Protext kernel index from user space"
> >
> > Server side logs: "rdma_rxe: bad ICRC from <>".
> >
> Thanks. That is helpful. Will try to find it.

Thank you, I appreciate you looking into it. Actually, I'm not 100%
confident that's the commit behind the particular problem I was seeing
in 5.14-rc (rping hangs but doesn't crash; an NFS mount also hangs
without crashing). While git bisect was running, it kept hitting
kernels that crashed, so I can't say what it really "found". I think
that commit is the one that causes the kernel oops, and something else
leads to the bad ICRC.

I have a general question. I see that you've been posting a lot of
work on RDMA/rxe lately. Does this mean that somebody (you/your
company) is now actively supporting the rxe driver? It looked like
Mellanox had previously abandoned support for it. We ran into several
issues trying to use rxe for NFSoRDMA over the years, but they
were not being addressed.

There were a number of commits that lead to crashes. commit
ec9bf373f2458f4b5f1ece8b93a07e6204081667 "RDMA/core: Use refcount_t
instead of atomic_t on refcount of ib_uverbs_device" leads to the
following kernel oops. commit 205be5dc9984b67a3b388cbdaa27a2f2644a4bd6
"RDMA/irdma: Fix spelling mistake "Allocal" -> "Allocate"" also leads
to the kernel oops.

Thread overview: 7+ messages
2021-07-20  3:46 RDMA/rxe is broken (impacting running NFSoRDMA over softRoCE) Olga Kornievskaia
2021-07-20  6:27 ` Bob Pearson
2021-07-20 21:48   ` Olga Kornievskaia [this message]
2021-07-21  5:47     ` Leon Romanovsky
2021-07-21 21:15       ` Olga Kornievskaia
2021-07-21 21:48         ` Bob Pearson
2021-07-21  6:16     ` Zhu Yanjun