All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: Trond Myklebust <trondmy@primarydata.com>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v3] SUNRPC: Fix a race in the receive code path
Date: Thu, 14 Dec 2017 15:59:04 -0500	[thread overview]
Message-ID: <4D12D608-0108-4CBA-AEF4-CD8CEC86B55F@oracle.com> (raw)
In-Reply-To: <1513283844.11150.11.camel@primarydata.com>


> On Dec 14, 2017, at 3:37 PM, Trond Myklebust <trondmy@primarydata.com> =
wrote:
>=20
> On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
>>> On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.c
>>> om> wrote:
>>>=20
>>> Does the RDMA code update the connect cookie when the connection
>>> breaks? It looks to me as if it only does that when the connection
>>> is
>>> re-established. We really want both.
>>>=20
>>>> I imagine this issue could also impact write buffer exhaustion
>>>> on TCP.
>>>=20
>>> See net/sunrpc/xprtsock.c:xs_tcp_state_change()
>>=20
>> xprtrdma manipulates the connect_cookie in its connect worker,
>> see rpcrdma_connect_worker. This was added by:
>>=20
>> commit 575448bd36208f99fe0dd554a43518d798966740
>> Author:     Tom Talpey <talpey@netapp.com>
>> AuthorDate: Thu Oct 9 15:00:40 2008 -0400
>> Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
>> CommitDate: Fri Oct 10 15:10:36 2008 -0400
>>=20
>>    RPC/RDMA: suppress retransmit on RPC/RDMA clients.
>>=20
>> Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
>> which is the equivalent to xs_tcp_state_change? (if so, why, so
>> I can compose a reasonable patch description)
>>=20
>> It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
>> RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
>> RDMA provides a distinction between "server disconnected"
>> and "client disconnected" although that probably does not
>> matter for this purpose.
>>=20
>> But, why would the additional cookie update help? The transport
>> is not disconnecting before the deadlock.
>>=20
>=20
> The connection cookie's purpose is twofold:
>=20
> 1) It tracks whether or not a request has been transmitted on the
> current connection or not.

That's broken by setting the cookie unconditionally outside
the transport_lock, isn't it?


> 2) It ensures that when several requests with the same connection
> cookie all call xprt_conditional_disconnect(), then that results in a
> single disconnection event. To do so, it assumes that xprt_autoclose()
> will change the cookie if the disconnection attempt is successful.
>=20
> In TCP we do so in the xs_tcp_state_change(). If the RDMA transport =
can
> guarantee that the call to xprt->ops->close(xprt) is always =
successful,
> then you could do so there.

I don't mind moving the cookie bump to rpcrdma_conn_upcall,
but I'm not sure I understand the locking requirements.

Currently, xprt_transmit sets the connect_cookie while holding
the transport_lock.

xprt_conditional_disconnect compares the cookie while holding
the transport_lock.

For TCP, the transport_lock is held when bumping the cookie
in the ESTABLISHED case, but _not_ in the two CLOSE cases?

xprtrdma holds the transport_lock when bumping the cookie,
which it does in its connect worker. It has to hold the lock
because it skips the value 0. xprtrdma needs to guarantee
that an RPC is never transmitted on the same connection
twice (and maybe it could use rq_connect_cookie instead of
its own cookie).

xprt_reserve_init is holding the reserve_lock but not the
transport_lock when it grabs the cookie. Maybe it should
not be initializing the rqst's cookie there?

Seems to me that xprt_transmit needs to update the rqst's
cookie while holding the transport_lock, especially if
xprtrdma needs to skip a cookie value? I'm sure I'm missing
something.


--
Chuck Lever




  reply	other threads:[~2017-12-14 21:00 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-04  0:17 [PATCH v3] SUNRPC: Fix a race in the receive code path Trond Myklebust
2017-12-13 16:14 ` Chuck Lever
2017-12-13 23:42   ` Chuck Lever
2017-12-14  1:13     ` Chuck Lever
2017-12-14 12:16       ` Trond Myklebust
2017-12-14 15:49         ` Chuck Lever
2017-12-14 19:03           ` Trond Myklebust
2017-12-14 19:22             ` Chuck Lever
2017-12-14 20:37               ` Trond Myklebust
2017-12-14 20:59                 ` Chuck Lever [this message]
2017-12-14 21:33                   ` Trond Myklebust
2017-12-15  2:16                     ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D12D608-0108-4CBA-AEF4-CD8CEC86B55F@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=Anna.Schumaker@Netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trondmy@primarydata.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.