All of lore.kernel.org
 help / color / mirror / Atom feed
From: Haris Iqbal <haris.iqbal@ionos.com>
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: jgg@nvidia.com, zyjzyj2000@gmail.com, jhack@hpe.com,
	frank.zago@hpe.com, linux-rdma@vger.kernel.org,
	Aleksei Marov <aleksei.marov@ionos.com>,
	Jinpu Wang <jinpu.wang@ionos.com>
Subject: Re: [PATCH for-next] RDMA/rxe: Fix incorrect fencing
Date: Mon, 23 May 2022 10:05:32 +0200	[thread overview]
Message-ID: <CAJpMwyhsf_C6dosUH81_5aD4fd5XHNPD94B3NE=TT+fSBAKW1g@mail.gmail.com> (raw)
In-Reply-To: <e81610d6-7896-03d6-91f9-15d68c7b8192@gmail.com>

On Mon, May 23, 2022 at 5:51 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> On 5/22/22 18:59, Haris Iqbal wrote:
> > On Mon, May 23, 2022 at 12:36 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
> >>
> >> Currently the rxe driver checks if any previous operation
> >> is not complete to determine if a fence wait is required.
> >> This is not correct. For a regular fence only previous
> >> read or atomic operations must be complete while for a local
> >> invalidate fence all previous operations must be complete.
> >> This patch corrects this behavior.
> >>
> >> Fixes: 8700e3e7c4857 ("Soft RoCE (RXE) - The software RoCE driver")
> >> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> >> ---
> >>  drivers/infiniband/sw/rxe/rxe_req.c | 42 ++++++++++++++++++++++++-----
> >>  1 file changed, 36 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
> >> index ae5fbc79dd5c..f36263855a45 100644
> >> --- a/drivers/infiniband/sw/rxe/rxe_req.c
> >> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
> >> @@ -163,16 +163,41 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
> >>                      (wqe->state != wqe_state_processing)))
> >>                 return NULL;
> >>
> >> -       if (unlikely((wqe->wr.send_flags & IB_SEND_FENCE) &&
> >> -                                                    (index != cons))) {
> >> -               qp->req.wait_fence = 1;
> >> -               return NULL;
> >> -       }
> >> -
> >>         wqe->mask = wr_opcode_mask(wqe->wr.opcode, qp);
> >>         return wqe;
> >>  }
> >>
> >> +/**
> >> + * rxe_wqe_is_fenced - check if next wqe is fenced
> >> + * @qp: the queue pair
> >> + * @wqe: the next wqe
> >> + *
> >> + * Returns: 1 if wqe is fenced (needs to wait)
> >> + *         0 if wqe is good to go
> >> + */
> >> +static int rxe_wqe_is_fenced(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
> >> +{
> >> +       unsigned int cons;
> >> +
> >> +       if (!(wqe->wr.send_flags & IB_SEND_FENCE))
> >> +               return 0;
> >> +
> >> +       cons = queue_get_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
> >> +
> >> +       /* Local invalidate fence (LIF) see IBA 10.6.5.1
> >> +        * Requires ALL previous operations on the send queue
> >> +        * are complete.
> >> +        */
> >> +       if (wqe->wr.opcode == IB_WR_LOCAL_INV)
> >> +               return qp->req.wqe_index != cons;
> >
> >
> > Do I understand correctly that according to this code a wr with opcode
> > IB_WR_LOCAL_INV needs to have the IB_SEND_FENCE also set for this to
> > work?
> >
> > If that is the desired behaviour, can you point out where in spec this
> > is mentioned.
>
> According to IBA "Local invalidate fence" (LIF) and regular Fence behave
> differently. (See the referenced sections in the IBA.) For a local invalidate
> operation the fence bit fences all previous operations. That was the old behavior
> which made no distinction between local invalidate and other operations.
> The change here are the other operations with a regular fence which should only
> requires read and atomic operations to be fenced.
>
> Not sure what you mean by 'also'. Per the IBA if the LIF is set then you have
> strict invalidate ordering if not then you have relaxed ordering. The kernel verbs
> API only has one fence bit and does not have a separate LIF bit so I am
> interpreting them to share the one bit.

I see. Now I understand. Thanks for the explanation.

I do have a follow-up question. For a IB_WR_LOCAL_INV wr, without the
fence bit means relaxed ordering. This would mean that the completion
for that wr must take place "before any subsequent WQE has begun
execution". From what I understand, multiple rxe_requester instances
can run in parallel and pick up wqes and execute them. How is the
relaxed ordering criteria fulfilled?

>
> Bob
> >
> > Thanks.
> >
> >
> >> +
> >> +       /* Fence see IBA 10.8.3.3
> >> +        * Requires that all previous read and atomic operations
> >> +        * are complete.
> >> +        */
> >> +       return atomic_read(&qp->req.rd_atomic) != qp->attr.max_rd_atomic;
> >> +}
> >> +
> >>  static int next_opcode_rc(struct rxe_qp *qp, u32 opcode, int fits)
> >>  {
> >>         switch (opcode) {
> >> @@ -636,6 +661,11 @@ int rxe_requester(void *arg)
> >>         if (unlikely(!wqe))
> >>                 goto exit;
> >>
> >> +       if (rxe_wqe_is_fenced(qp, wqe)) {
> >> +               qp->req.wait_fence = 1;
> >> +               goto exit;
> >> +       }
> >> +
> >>         if (wqe->mask & WR_LOCAL_OP_MASK) {
> >>                 ret = rxe_do_local_ops(qp, wqe);
> >>                 if (unlikely(ret))
> >>
> >> base-commit: c5eb0a61238dd6faf37f58c9ce61c9980aaffd7a
> >> --
> >> 2.34.1
> >>
>

  reply	other threads:[~2022-05-23  8:06 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-22 22:33 [PATCH for-next] RDMA/rxe: Fix incorrect fencing Bob Pearson
2022-05-22 23:59 ` Haris Iqbal
2022-05-23  3:51   ` Bob Pearson
2022-05-23  8:05     ` Haris Iqbal [this message]
2022-05-23 18:22       ` Bob Pearson
2022-05-24 10:28         ` Haris Iqbal
2022-05-24 18:20           ` Bob Pearson
2022-05-27 10:18             ` Haris Iqbal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJpMwyhsf_C6dosUH81_5aD4fd5XHNPD94B3NE=TT+fSBAKW1g@mail.gmail.com' \
    --to=haris.iqbal@ionos.com \
    --cc=aleksei.marov@ionos.com \
    --cc=frank.zago@hpe.com \
    --cc=jgg@nvidia.com \
    --cc=jhack@hpe.com \
    --cc=jinpu.wang@ionos.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=rpearsonhpe@gmail.com \
    --cc=zyjzyj2000@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.