From: "Pearson, Robert B" <robert.pearson2@hpe.com>
To: Zhu Yanjun <zyjzyj2000@gmail.com>
Cc: "Pearson, Robert B" <rpearsonhpe@gmail.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>
Subject: RE: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows
Date: Tue, 25 May 2021 15:23:50 +0000
Message-ID: <CS1PR8401MB109691557DEE165AC0B9C47ABC259@CS1PR8401MB1096.NAMPRD84.PROD.OUTLOOK.COM>
In-Reply-To: <CAD=hENfATTprVG+wYa+1qjdTcuetLyzTt8gHjfcWp5PsLVL4Pw@mail.gmail.com>

On further reflection I realize I did not correctly understand the user/kernel API issue. I was assuming that the user application should continue to run but that we could require recompiling rdma-core. If we require that old rdma-core binaries run on newer kernels, then the 40 bytes is an issue. I always recompiled rdma-core and didn't test running with old binaries. Fortunately there is an easy fix. The flags field in the earlier rxe mw version had one bit in it, but the new version dropped that bit and I never went back and removed the field. Dropping the flags field doesn't break anything and lets the mw struct fit in the wr union without extending it.
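
To illustrate the size argument, here is a minimal sketch using Python's ctypes to model the layouts. The field names and the 32-byte "atomic" member are illustrative assumptions, not copied from rdma_user_rxe.h; the only point is that one extra 32-bit field pushes the mw member past the largest existing member and so grows the union, while dropping it leaves the union size unchanged.

```python
import ctypes

u32, u64 = ctypes.c_uint32, ctypes.c_uint64


class Atomic(ctypes.Structure):
    # Hypothetical largest pre-existing union member: 3 * 8 + 2 * 4 = 32 bytes.
    _fields_ = [("remote_addr", u64), ("compare_add", u64), ("swap", u64),
                ("rkey", u32), ("reserved", u32)]


class MwWithFlags(ctypes.Structure):
    # Hypothetical mw member that still carries the unused flags field:
    # 36 bytes of fields, padded to 40 where 64-bit fields are 8-byte aligned.
    _fields_ = [("addr", u64), ("length", u64), ("mr_lkey", u32),
                ("mw_rkey", u32), ("rkey", u32), ("access", u32),
                ("flags", u32)]


class MwNoFlags(ctypes.Structure):
    # Same member with flags dropped: 2 * 8 + 4 * 4 = 32 bytes.
    _fields_ = [("addr", u64), ("length", u64), ("mr_lkey", u32),
                ("mw_rkey", u32), ("rkey", u32), ("access", u32)]


class WrOld(ctypes.Union):
    # The union as an old user-space binary would have been compiled against it.
    _fields_ = [("atomic", Atomic)]


class WrWithFlags(ctypes.Union):
    _fields_ = [("atomic", Atomic), ("mw", MwWithFlags)]


class WrNoFlags(ctypes.Union):
    _fields_ = [("atomic", Atomic), ("mw", MwNoFlags)]


if __name__ == "__main__":
    for cls in (WrOld, WrWithFlags, WrNoFlags):
        print(cls.__name__, ctypes.sizeof(cls))
```

If the union grows, every send queue WQE grows with it, so an old binary and a new kernel no longer agree on where each WQE starts.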

I will fix, retest and resubmit.

Bob

-----Original Message-----
From: Zhu Yanjun <zyjzyj2000@gmail.com> 
Sent: Tuesday, May 25, 2021 10:00 AM
To: Pearson, Robert B <robert.pearson2@hpe.com>
Cc: Pearson, Robert B <rpearsonhpe@gmail.com>; Jason Gunthorpe <jgg@nvidia.com>; RDMA mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows

On Tue, May 25, 2021 at 1:27 PM Pearson, Robert B <robert.pearson2@hpe.com> wrote:
>
> There's nothing to change. There is no problem. Just get the headers synced.
> If that doesn't fix your issues, your tree has gotten corrupted somehow, but I don't think that is the issue. I saw the same type of errors you reported when rdma-core is built with the old header file. That definitely will cause problems. The size of the send queue WQEs changed because new fields were added, so user space and the kernel immediately get out of step with each other.
>
> Good luck,
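
The header check described in this thread (compare the kernel's rdma_user_rxe.h with the copy under rdma-core's kernel-headers directory) could be sketched as follows. The two relative paths are the ones named in the thread; the function name and the tree arguments are hypothetical, and this is only a byte-for-byte comparison, not a replacement for the rdma-core header update script.

```python
import filecmp
from pathlib import Path

# Relative locations of the rxe uapi header in each tree, as given in the thread.
REL_KERNEL = Path("include/uapi/rdma/rdma_user_rxe.h")
REL_RDMA_CORE = Path("kernel-headers/rdma/rdma_user_rxe.h")


def rxe_header_in_sync(linux_tree, rdma_core_tree):
    """Return True if both trees carry byte-identical copies of the header.

    linux_tree and rdma_core_tree are the top-level checkout directories
    (hypothetical arguments). Raises FileNotFoundError if either copy is missing.
    """
    kernel_hdr = Path(linux_tree) / REL_KERNEL
    user_hdr = Path(rdma_core_tree) / REL_RDMA_CORE
    # shallow=False forces a content comparison instead of trusting stat() data.
    return filecmp.cmp(kernel_hdr, user_hdr, shallow=False)
```

A False result means rdma-core would be built against a different WQE layout than the running kernel expects.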

About rdma-core, the root cause is clear. I am fine with this patch series.
Thanks, Bob.

Zhu Yanjun

>
> Bob
>
> -----Original Message-----
> From: Zhu Yanjun <zyjzyj2000@gmail.com>
> Sent: Tuesday, May 25, 2021 12:18 AM
> To: Pearson, Robert B <robert.pearson2@hpe.com>
> Cc: Pearson, Robert B <rpearsonhpe@gmail.com>; Jason Gunthorpe 
> <jgg@nvidia.com>; RDMA mailing list <linux-rdma@vger.kernel.org>
> Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory 
> windows
>
> On Tue, May 25, 2021 at 12:57 PM Pearson, Robert B <robert.pearson2@hpe.com> wrote:
> >
> > Zhu,
> >
> > I'm not sure about the script. Starting from where you were I copied 
> > <LINUX>/include/uapi/rdma/rdma_user_rxe.h to 
> > <RDMA_CORE>/kernel-headers/rdma/rdma_user_rxe.h. After running the 
> > script you should be able to just diff these two files to make sure 
> > they are the same. If they aren't copy the header file over. After 
> > > the shift to 5.13 rc1+ I re-pulled both trees and applied the kernel
> > > patches and then built everything. The python test cases look like
> >
> > > .............sssssssss.............sssssssssssssssssssssssssssssssssss
> > > ssssssssssssssssssssssssssssssssssss.ssssssssssssssssssssssssss....sss
> > > s.............s.....s.......ssssssssss..ss
> > > ----------------------------------------------------------------------
> > Ran 182 tests in 0.380s
>
> Thanks. Please submit a new patch for this problem.
>
> >
> > OK (skipped=124)
> >
> > There are a lot of skips but no errors. The skips are from features that rxe does not support.
> >
> > Adding the MW rdma_core patch picks up a small number of additional test cases involving memory windows.
>
> Thanks a lot. Look forward to these additional test cases involving memory windows.
>
> Zhu Yanjun
>
> >
> > Regards,
> >
> > Bob
> >
> > -----Original Message-----
> > From: Zhu Yanjun <zyjzyj2000@gmail.com>
> > Sent: Monday, May 24, 2021 9:09 PM
> > To: Pearson, Robert B <rpearsonhpe@gmail.com>
> > Cc: Jason Gunthorpe <jgg@nvidia.com>; RDMA mailing list 
> > <linux-rdma@vger.kernel.org>
> > Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory 
> > windows
> >
> > On Tue, May 25, 2021 at 12:04 AM Pearson, Robert B <rpearsonhpe@gmail.com> wrote:
> > >
> > > On 5/23/2021 10:14 PM, Zhu Yanjun wrote:
> > > > On Sat, May 22, 2021 at 4:19 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
> > > > >> This series of patches implements memory windows for the
> > > > >> rdma_rxe driver. This is a shorter reimplementation of an earlier patch set.
> > > >> They apply to and depend on the current for-next linux rdma tree.
> > > >>
> > > >> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> > > >> ---
> > > >> v7:
> > > >>    Fixed a duplicate INIT_RDMA_OBJ_SIZE(ib_mw, ...) in rxe_verbs.c.
> > > > With this patch series, there are about 17 errors and 1 failure in rdma-core.
> > >
> > > Zhu,
> > >
> > > You have to sync the kernel-header file with the kernel.
> >
> > From the link
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/kbuild/headers_install.rst?h=v5.13-rc3
> > you mean "make headers_install"?
> >
> > In fact, after "make headers_install", these patches still cause errors and failures in rdma-core.
> >
> > I will delve into these errors of rdma-core. Too many errors.
> >
> > Zhu Yanjun
> >
> > >
> > > Bob
> > >
> > > > "
> > > > ----------------------------------------------------------------
> > > > --
> > > > --
> > > > --
> > > > Ran 183 tests in 2.130s
> > > >
> > > > FAILED (failures=1, errors=17, skipped=124) "
> > > >
> > > > After these patches, not sure if rxe can communicate with the 
> > > > physical NICs correctly because of the above errors and failure.
> > > >
> > > > Zhu Yanjun
> > > >
> > > >> v6:
> > > >>    Added rxe_ prefix to subroutine names in lines that changed
> > > >>    from Zhu's review of v5.
> > > >> v5:
> > > >>    Fixed a typo in 10th patch.
> > > >> v4:
> > > >>    Added a 10th patch to check when MRs have bound MWs
> > > >>    and disallow dereg and invalidate operations.
> > > >> v3:
> > > >>    cleaned up void return and lower case enums from
> > > >>    Zhu's review.
> > > >> v2:
> > > >>    cleaned up an issue in rdma_user_rxe.h
> > > >>    cleaned up a collision in rxe_resp.c
> > > >>
> > > >> Bob Pearson (9):
> > > >>    RDMA/rxe: Add bind MW fields to rxe_send_wr
> > > >>    RDMA/rxe: Return errors for add index and key
> > > >>    RDMA/rxe: Enable MW object pool
> > > >>    RDMA/rxe: Add ib_alloc_mw and ib_dealloc_mw verbs
> > > >>    RDMA/rxe: Replace WR_REG_MASK by WR_LOCAL_OP_MASK
> > > >>    RDMA/rxe: Move local ops to subroutine
> > > >>    RDMA/rxe: Add support for bind MW work requests
> > > >>    RDMA/rxe: Implement invalidate MW operations
> > > >>    RDMA/rxe: Implement memory access through MWs
> > > >>
> > > >>   drivers/infiniband/sw/rxe/Makefile     |   1 +
> > > >>   drivers/infiniband/sw/rxe/rxe.c        |   1 +
> > > >>   drivers/infiniband/sw/rxe/rxe_comp.c   |   1 +
> > > >>   drivers/infiniband/sw/rxe/rxe_loc.h    |  29 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_mr.c     |  79 ++++--
> > > >>   drivers/infiniband/sw/rxe/rxe_mw.c     | 356 +++++++++++++++++++++++++
> > > >>   drivers/infiniband/sw/rxe/rxe_opcode.c |  11 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_opcode.h |   3 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_param.h  |  19 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_pool.c   |  45 ++--
> > > >>   drivers/infiniband/sw/rxe/rxe_pool.h   |   8 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_req.c    | 102 ++++---
> > > >>   drivers/infiniband/sw/rxe/rxe_resp.c   | 110 +++++---
> > > >>   drivers/infiniband/sw/rxe/rxe_verbs.c  |   5 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_verbs.h  |  38 ++-
> > > >>   include/uapi/rdma/rdma_user_rxe.h      |  34 ++-
> > > >>   16 files changed, 691 insertions(+), 151 deletions(-)
> > > >>   create mode 100644 drivers/infiniband/sw/rxe/rxe_mw.c
> > > >> --
> > > >> 2.27.0
> > > >>
