From: Zhu Yanjun <zyjzyj2000@gmail.com>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: Bob Pearson <rpearsonhpe@gmail.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	linux-rdma <linux-rdma@vger.kernel.org>
Subject: Re: RDMA/rxe is broken (impacting running NFSoRDMA over softRoCE)
Date: Wed, 21 Jul 2021 14:16:39 +0800
Message-ID: <CAD=hENfEeHDMO8h09xtUYrn+2zv=xuSxCy6s2LrnWka11NneTg@mail.gmail.com>
In-Reply-To: <CAN-5tyFQd3wzRXtcQoO0wC-bU1Ggk05K7ikokY_ZGZidG=CP5A@mail.gmail.com>

On Wed, Jul 21, 2021 at 5:48 AM Olga Kornievskaia <aglo@umich.edu> wrote:
>
> On Tue, Jul 20, 2021 at 2:27 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
> >
> > On 7/19/21 10:46 PM, Olga Kornievskaia wrote:
> > > Hello,
> > >
> > > I would like to report that the rxe driver got broken some time
> > > between 5.13 and 5.14-rc1 (so basically the last git pull). It's not
> > > just NFSoRDMA but simple rping doesn't work. I believe I found the
> > > problematic commit: 5bcf5a59c41e19141783c7305d420a5e36c937b2
> > > "RDMA/rxe: Protect kernel index from user space"
> > >
> > > Server side logs: "rdma_rxe: bad ICRC from <>".
> > >
> > Thanks. That is helpful. Will try to find it.
>
> Thank you, I appreciate you looking into it. Actually I'm not 100%
> confident that's the commit for this particular problem "I" was seeing
> in 5.14-rc (which was rping hanging but not crashing; an NFS mount
> also hangs without crashing). But git bisect kept encountering
> crashes along the way, so I can't say what it "found". So I think
> that's the one that causes the kernel oops. I think something else
> leads to the bad ICRC.

Thanks a lot. I will delve into this problem.

Zhu Yanjun

>
> I have a general question. I see that you've been posting a lot of
> work on RDMA/rxe lately. Can this be viewed as somebody (you/your
> company) is now actively supporting the rxe driver? It looked like
> previously Mellanox had abandoned support for it. We ran into several
> issues trying to use rxe for NFSoRDMA throughout the years but they
> were not being addressed.
>
> There were a number of commits that led to crashes. Commit
> ec9bf373f2458f4b5f1ece8b93a07e6204081667 "RDMA/core: Use refcount_t
> instead of atomic_t on refcount of ib_uverbs_device" leads to the
> following kernel oops. commit 205be5dc9984b67a3b388cbdaa27a2f2644a4bd6
> "RDMA/irdma: Fix spelling mistake "Allocal" -> "Allocate"" also leads
> to the kernel oops.

