From: Chuck Lever <chuck.lever@oracle.com>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Anna Schumaker <Anna.Schumaker@netapp.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Tom Talpey <tom@talpey.com>
Subject: Re: [PATCH v1 13/16] NFS: Add sidecar RPC client support
Date: Mon, 20 Oct 2014 21:06:19 -0400
Message-ID: <5BF0312C-06EC-4D83-81E9-F929724A0EAD@oracle.com>
In-Reply-To: <CAHQdGtRahOX68j6J6Vq9ykBE-2qUm1sd_HPw=HGimKAGs5V=9Q@mail.gmail.com>


On Oct 20, 2014, at 6:31 PM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:

> On Mon, Oct 20, 2014 at 11:11 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>> Hi Trond-
>> 
>> On Oct 20, 2014, at 3:40 PM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
>> 
>>> Why aren't we doing the callbacks via RDMA as per the recommendation
>>> in RFC5667 section 5.1?
>> 
>> There’s no benefit to it. With a side car, the server requires
>> few or no changes. There are no CB operations that benefit
>> from using RDMA. It’s very quick to implement, re-using most of
>> the client backchannel implementation that already exists.
>> 
>> I’ve discussed this with an author of RFC 5667 [cc’d], and also
>> with the implementors of an existing NFSv4.1 server that supports
>> RDMA. They both agree that a side car is an acceptable, or even a
>> preferable, way to approach backchannel support.
>> 
>> Also, when I discussed this with you months ago, you also felt
>> that a side car was better than adding backchannel support to the
>> xprtrdma transport. I took this approach only because you OK’d it.
>> 
>> But I don’t see an explicit recommendation in section 5.1. Which
>> text are you referring to?
> 
> The very first paragraph argues that because callback messages don't
> carry bulk data, there is no problem with using RPC/RDMA and, in
> particular, with using RDMA_MSG provided that the buffer sizes are
> negotiated correctly.

The opening paragraph is advice that applies to all forms
of NFSv4 callback, including NFSv4.0, which uses a separate
transport initiated from the NFS server. Specific advice about
NFSv4.1 bi-directional RPC is left to the next two paragraphs,
and those suggest there be dragons. I read them as a warning
not to “go there.”

> So the questions are:
> 
> 1) Where is the discussion of the merits for and against adding
> bi-directional support to the xprtrdma layer in Linux? What is the
> showstopper preventing implementation of a design based around
> RFC5667?

There is no show-stopper (see Section 5.1, after all). It’s
simply a matter of development effort: a side-car is much
less work than implementing full RDMA backchannel support for
both a client and server, especially since TCP backchannel
already works and can be used immediately.

I also have no problem with eventually implementing an RDMA
backchannel, provided the complexity, and any performance overhead
it introduces in the forward channel, can be justified. The client
can use the CREATE_SESSION flags to detect what a server supports.
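
To illustrate the flag check mentioned above, here is a minimal
sketch in C. It assumes the client chooses between a backchannel on
the same connection and a TCP side-car purely from the flags word
the server returns in its CREATE_SESSION reply (csr_flags). The flag
values are the ones defined in RFC 5661; the names backchannel_plan
and choose_backchannel are hypothetical and are not taken from the
patch series.

```c
#include <assert.h>
#include <stdint.h>

/* CREATE_SESSION flag bits as defined in RFC 5661. */
#define CREATE_SESSION4_FLAG_PERSIST        0x00000001u
#define CREATE_SESSION4_FLAG_CONN_BACK_CHAN 0x00000002u
#define CREATE_SESSION4_FLAG_CONN_RDMA      0x00000004u

/* The two flags tested below must be independent bits. */
static_assert((CREATE_SESSION4_FLAG_CONN_BACK_CHAN &
	       CREATE_SESSION4_FLAG_CONN_RDMA) == 0,
	      "flag bits must not overlap");

/* Hypothetical policy values, for illustration only. */
enum backchannel_plan {
	BC_PLAN_SAME_CONNECTION,	/* server bound a backchannel here */
	BC_PLAN_TCP_SIDECAR,		/* fall back to a separate TCP connection */
};

/*
 * Pick a backchannel strategy from the server's reply flags.
 * Assumption: the client requested CONN_BACK_CHAN in csa_flags, so
 * a missing bit in csr_flags means the server declined to use this
 * connection for callbacks.
 */
static enum backchannel_plan choose_backchannel(uint32_t csr_flags)
{
	if (csr_flags & CREATE_SESSION4_FLAG_CONN_BACK_CHAN)
		return BC_PLAN_SAME_CONNECTION;
	return BC_PLAN_TCP_SIDECAR;
}
```

With this shape, a server that later gains RDMA backchannel support
would be picked up without any client configuration change, since the
decision depends only on what the reply advertises.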

> 2) Why do we instead have to solve the whole backchannel problem in
> the NFSv4.1 layer, and where is the discussion of the merits for and
> against that particular solution? As far as I can tell, it imposes at
> least 2 extra requirements:
> a) NFSv4.1 client+server must have support either for session
> trunking or for clientid trunking

Only very minimal trunking support is needed. The only operation
allowed on the TCP side-car's forward channel is BIND_CONN_TO_SESSION.

Bruce told me that associating multiple transports to a
clientid/session should not be an issue for his server (his
words were “if that doesn’t work, it’s a bug”).

Would this restrictive form of trunking present a problem?
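
The restriction described above can be sketched as a simple filter.
This is an illustration only, assuming the side-car asks for the
backchannel direction alone when it binds. The operation numbers and
the CDFC4_* values come from RFC 5661; conn_role, sidecar_op_allowed
and sidecar_bind_direction are hypothetical names, not functions from
the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* NFSv4.1 operation numbers from RFC 5661. */
enum {
	OP_BIND_CONN_TO_SESSION = 41,
	OP_CREATE_SESSION = 43,
	OP_SEQUENCE = 53,
};

/* channel_dir_from_client4 values from RFC 5661. */
enum {
	CDFC4_FORE = 0x1,
	CDFC4_BACK = 0x2,
	CDFC4_FORE_OR_BOTH = 0x3,
	CDFC4_BACK_OR_BOTH = 0x7,
};

/* Hypothetical connection roles, for illustration only. */
enum conn_role {
	CONN_MAIN_RDMA,		/* carries all normal forward-channel traffic */
	CONN_TCP_SIDECAR,	/* exists only to host the backchannel */
};

/* May this operation be sent on the forward channel of this connection? */
static bool sidecar_op_allowed(enum conn_role role, uint32_t opnum)
{
	if (role != CONN_TCP_SIDECAR)
		return true;
	return opnum == OP_BIND_CONN_TO_SESSION;
}

/* Assumed direction the side-car requests when binding: backchannel only. */
static uint32_t sidecar_bind_direction(void)
{
	return CDFC4_BACK;
}
```

Because the side-car never carries SEQUENCE or any other forward
operation in this sketch, the server only has to accept a second
connection bound to an existing session, not spread forward requests
across transports.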

> b) NFSv4.1 client must be able to set up a TCP connection to the
> server (that can be session/clientid trunked with the existing RDMA
> channel)

This also requires only minimal changes, and they are already
done: see v1 of this patch series.

> All I've found so far on googling these questions is a 5 1/2 year old
> email exchange between Tom Tucker and Ricardo where the conclusion
> appears to be that we can, in time, implement both designs.

You and I spoke about this on Feb 13, 2014 during pub night.
At the time you stated that a side-car was the only
spec-compliant way to approach this. I said I would go forward
with the idea in Linux, and you did not object.

> However
> there is no explanation of why we would want to do so.
> http://comments.gmane.org/gmane.linux.nfs/22927

I’ve implemented exactly what Ricardo proposed in that
thread, including dealing with connection loss:

> > The thinking is that NFSRDMA could initially use a TCP callback channel.
> > We'll implement BIND_CONN_TO_SESSION so that the backchannel does not
> > need to be tied to the forechannel connection.  This should address the
> > case where you have NFSRDMA for the forechannel and TCP for the
> > backchannel.  BIND_CONN_TO_SESSION is also required to reestablish
> > dropped connections effectively (to avoid losing the reply cache).


And here’s what you had to say in support of the idea:

> Given what they're hoping to achieve, I'm fine with
> doing a simple implementation of sessions first, then progressively
> refining it.

What’s the next step?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



