From: Jason Gunthorpe <jgg@nvidia.com>
To: "Pearson, Robert B" <rpearsonhpe@gmail.com>
Cc: "Pearson, Robert B" <robert.pearson2@hpe.com>,
	Zhu Yanjun <zyjzyj2000@gmail.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows
Date: Tue, 25 May 2021 15:41:56 -0300
Message-ID: <20210525184156.GF1002214@nvidia.com>
In-Reply-To: <77192b9c-9d8a-061f-5ffc-1052504104bc@gmail.com>

On Tue, May 25, 2021 at 01:09:01PM -0500, Pearson, Robert B wrote:
> 
> On 5/25/2021 10:23 AM, Pearson, Robert B wrote:
> > On further reflection I realize I did not correctly understand the
> > user/kernel API issue. I was assuming that the user application should
> > continue to run but that we could require recompiling rdma-core. If we
> > require that old rdma-core binaries run on newer kernels, then the 40
> > bytes is an issue. I always recompiled rdma-core and didn't test running
> > with old binaries. Fortunately there is an easy fix. The flags field in
> > the earlier rxe mw version had one bit in it, but the new version dropped
> > that bit and I never went back and removed the field. Dropping the flags
> > field doesn't break anything and lets the mw struct fit in the wr union
> > without extending it.
> > 
> > I will fix, retest and resubmit.
> > 
> > Bob
> > 
> > From: Zhu Yanjun <zyjzyj2000@gmail.com>
> > Sent: Tuesday, May 25, 2021 10:00 AM
> > To: Pearson, Robert B <robert.pearson2@hpe.com>
> > Cc: Pearson, Robert B <rpearsonhpe@gmail.com>; Jason Gunthorpe <jgg@nvidia.com>; RDMA mailing list <linux-rdma@vger.kernel.org>
> > Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows
> > 
> > On Tue, May 25, 2021 at 1:27 PM Pearson, Robert B <robert.pearson2@hpe.com> wrote:
> > > There's nothing to change. There is no problem. Just get the headers sync'ed.
> > > If that doesn't fix your issues, your tree has gotten corrupted
> > > somehow. But I don't think that is the issue. I saw the same type of
> > > errors you reported when rdma-core is built with the old header file.
> > > That definitely will cause problems. The size of the send queue WQEs
> > > changed because new fields were added. User space and the kernel then
> > > immediately get out of sync with each other.
> > > 
> > > Good luck,
> > About rdma-core, the root cause is clear. I am fine with this patch series.
> > Thanks, Bob.
> > 
> > Zhu Yanjun
> > 
> Well. Interesting. Having pulled the latest rdma-core again and fixed the
> wr.mw size issue, I now see a bunch of CQ and QP errors which have nothing
> to do with the memory windows patches. It looks more like a memory ordering
> problem around the queues. Is this possibly related to the recent relaxed
> ordering changes?

They haven't been merged and wouldn't affect a SW driver like rxe.

> The one py test failure I have chased down is in the resize cq
> test. The first time it runs after building a new module I can print
> out the new cqe and the current queue count and see the expected 1
> which is less than 6 but the code takes the wrong branch and does
> not report an error. Rerunning the test I get the expected behavior
> and the test passes. This will take a bit of effort.

Bisect the kernel?

Jason

Thread overview: 22+ messages
2021-05-21 20:18 [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 01/10] RDMA/rxe: Add bind MW fields to rxe_send_wr Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 02/10] RDMA/rxe: Return errors for add index and key Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 03/10] RDMA/rxe: Enable MW object pool Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 04/10] RDMA/rxe: Add ib_alloc_mw and ib_dealloc_mw verbs Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 05/10] RDMA/rxe: Replace WR_REG_MASK by WR_LOCAL_OP_MASK Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 06/10] RDMA/rxe: Move local ops to subroutine Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 07/10] RDMA/rxe: Add support for bind MW work requests Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 08/10] RDMA/rxe: Implement invalidate MW operations Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 09/10] RDMA/rxe: Implement memory access through MWs Bob Pearson
2021-05-21 20:18 ` [PATCH for-next v7 10/10] RDMA/rxe: Disallow MR dereg and invalidate when bound Bob Pearson
2021-05-24  3:14 ` [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows Zhu Yanjun
2021-05-24 16:04   ` Pearson, Robert B
2021-05-25  2:08     ` Zhu Yanjun
2021-05-25  4:57       ` Pearson, Robert B
2021-05-25  5:18         ` Zhu Yanjun
2021-05-25  5:27           ` Pearson, Robert B
2021-05-25  5:45             ` Zhu Yanjun
2021-05-25 15:00             ` Zhu Yanjun
2021-05-25 15:23               ` Pearson, Robert B
2021-05-25 18:09                 ` Pearson, Robert B
2021-05-25 18:41                   ` Jason Gunthorpe [this message]