From: Chuck Lever <chuck.lever@oracle.com>
To: Tom Talpey <tom@talpey.com>
Cc: linux-rdma <linux-rdma@vger.kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v1 2/8] xprtrdma: Do not post Receives after disconnect
Date: Wed, 31 Mar 2021 16:31:36 -0400
Message-ID: <CAFMMQGvortADqgmAzskZKcnyHDzsTEW0FtR501wpP+deUM57FA@mail.gmail.com>
In-Reply-To: <4004f56f-3603-f56c-aea9-651230b3181e@talpey.com>

On Wed, Mar 31, 2021 at 4:01 PM Tom Talpey <tom@talpey.com> wrote:
>
> On 3/31/2021 3:36 PM, Chuck Lever wrote:
> > Currently the Receive completion handler refreshes the Receive Queue
> > whenever a successful Receive completion occurs.
> >
> > On disconnect, xprtrdma drains the Receive Queue. The first few
> > Receive completions after a disconnect are typically successful,
> > until the first flushed Receive.
> >
> > This means the Receive completion handler continues to post more
> > Receive WRs after the drain sentinel has been posted. The late-
> > posted Receives flush after the drain sentinel has completed,
> > leading to a crash later in rpcrdma_xprt_disconnect().
> >
> > To prevent this crash, xprtrdma has to ensure that the Receive
> > handler stops posting Receives before ib_drain_rq() posts its
> > drain sentinel.
> >
> > This patch is probably not sufficient to fully close that window,
>
> "Probably" is not a word I'd like to use in a stable:cc...

Well, I could easily be convinced to remove the Cc: stable
for this one, since it's not a full fix. But this is a pretty pervasive
problem with disconnect handling.


> > but does significantly reduce the opportunity for a crash to
> > occur without incurring undue performance overhead.
> >
> > Cc: stable@vger.kernel.org # v5.7
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> >   net/sunrpc/xprtrdma/verbs.c |    7 +++++++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> > index ec912cf9c618..1d88685badbe 100644
> > --- a/net/sunrpc/xprtrdma/verbs.c
> > +++ b/net/sunrpc/xprtrdma/verbs.c
> > @@ -1371,8 +1371,10 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, bool temp)
> >   {
> >       struct rpcrdma_buffer *buf = &r_xprt->rx_buf;
> >       struct rpcrdma_ep *ep = r_xprt->rx_ep;
> > +     struct ib_qp_init_attr init_attr;
> >       struct ib_recv_wr *wr, *bad_wr;
> >       struct rpcrdma_rep *rep;
> > +     struct ib_qp_attr attr;
> >       int needed, count, rc;
> >
> >       rc = 0;
> > @@ -1385,6 +1387,11 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, bool temp)
> >       if (!temp)
> >               needed += RPCRDMA_MAX_RECV_BATCH;
> >
> > +     if (ib_query_qp(ep->re_id->qp, &attr, IB_QP_STATE, &init_attr))
> > +             goto out;
>
> This call isn't completely cheap.

True, but it's done only once every 7 Receive completions.

The other option is to use re_connect_status, and add some memory
barriers to ensure we get the latest value. That doesn't help us get the
race window closed any further, though.
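
For illustration only, here is a minimal userspace sketch of that
alternative, using C11 atomics as a stand-in for kernel primitives such
as smp_store_release() and smp_load_acquire(). The struct ep_model type,
its fields, the helper names, and the status values (1 for connected,
-1 for disconnected) are all hypothetical and are not taken from
xprtrdma.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of an endpoint; only the status flag matters here. */
struct ep_model {
	atomic_int connect_status;	/* assumed: 1 = connected, -1 = disconnected */
	int posted;			/* Receives posted so far */
};

/* Disconnect path: publish the new status before starting the drain. */
static void ep_mark_disconnected(struct ep_model *ep)
{
	atomic_store_explicit(&ep->connect_status, -1, memory_order_release);
}

/* Refill path: returns the number of Receives actually posted. */
static int ep_post_recvs(struct ep_model *ep, int needed)
{
	assert(needed >= 0);

	/* The acquire load pairs with the release store above. */
	if (atomic_load_explicit(&ep->connect_status,
				 memory_order_acquire) != 1)
		return 0;

	/*
	 * The window is still open here: a disconnect that lands after
	 * the check above is not seen by this call.
	 */
	ep->posted += needed;
	return needed;
}
```

The barriers only guarantee that the refill path sees a status that was
already published. They do not order the check against a disconnect that
happens immediately afterwards, so the window narrows but does not close.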

> > +     if (attr.qp_state == IB_QPS_ERR)
> > +             goto out;
>
> But the QP is free to disconnect or go to error right now. This approach
> just reduces the timing hole.

Agreed 100%. I just couldn't think of a better approach. I'm definitely open
to better ideas.

> Is it not possible to mark the WRs as
> being part of a batch, and allowing them to flush? You could borrow a
> bit in the completion cookie, and check it when the CQE pops out. Maybe.

It's not an issue with batching; it's an issue with posting Receives from the
Receive completion handler. I'd think that any of the ULPs that post Receives
in their completion handler would have the same issue.

The purpose of the QP drain in rpcrdma_xprt_disconnect() is to ensure there
are no more WRs in flight so that the hardware resources can be safely
destroyed. If the Receive completion handler continues to post Receive WRs
after the drain sentinel has been posted, leaks and crashes become possible.
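
The ordering problem described above can be sketched with a toy
userspace model of an in-order Receive Queue. Everything here is
hypothetical (struct rq_model, rq_post(), rq_drain(), and the
late_posts parameter are illustrative names, not verbs API): the drain
posts a sentinel, completes WRs in posting order, and returns as soon
as the sentinel completes. Each completed Receive may trigger one more
post from the handler, up to late_posts times.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RQ_DEPTH 16

enum wr_kind { WR_RECV, WR_SENTINEL };

/* Toy in-order queue: head = next WR to complete, tail = next free slot. */
struct rq_model {
	enum wr_kind slots[RQ_DEPTH];
	size_t head, tail;
};

/* Post one WR; returns false if the queue is full. */
static bool rq_post(struct rq_model *rq, enum wr_kind kind)
{
	if (rq->tail - rq->head == RQ_DEPTH)
		return false;
	rq->slots[rq->tail++ % RQ_DEPTH] = kind;
	return true;
}

/*
 * Post a sentinel, then complete WRs in posting order until the
 * sentinel completes. Returns the number of WRs still outstanding
 * when the drain believes the queue is empty.
 */
static size_t rq_drain(struct rq_model *rq, size_t late_posts)
{
	bool posted = rq_post(rq, WR_SENTINEL);

	assert(posted);
	(void)posted;

	while (rq->head != rq->tail) {
		enum wr_kind kind = rq->slots[rq->head++ % RQ_DEPTH];

		if (kind == WR_SENTINEL)
			break;

		/* A Receive completed; the handler refills behind the sentinel. */
		if (late_posts && rq_post(rq, WR_RECV))
			late_posts--;
	}
	return rq->tail - rq->head;
}
```

With three Receives queued and late_posts set to 0, rq_drain() returns
0. With late_posts set to 2, it returns 2: those WRs sit behind the
sentinel and are still in flight when the drain returns.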


> > +
> >       /* fast path: all needed reps can be found on the free list */
> >       wr = NULL;
> >       while (needed) {
> >
> >
> >



-- 
When the world is being engulfed by a comet, go ahead and excrete
where you want to.
