From: Chuck Lever <chuck.lever@oracle.com>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Anna Schumaker <Anna.Schumaker@netapp.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Tom Talpey <tom@talpey.com>
Subject: Re: [PATCH v1 13/16] NFS: Add sidecar RPC client support
Date: Tue, 21 Oct 2014 13:11:26 -0400
Message-ID: <EB7360E0-CDE8-40A6-8F91-6A119EE68DEC@oracle.com>
In-Reply-To: <CAHQdGtRumr9Cr6oJPUEw5nLiiWqNW5JRq7tzpsKUVtCG=F1QFg@mail.gmail.com>


On Oct 21, 2014, at 3:45 AM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:

> On Tue, Oct 21, 2014 at 4:06 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> There is no show-stopper (see Section 5.1, after all). It’s
>> simply a matter of development effort: a side-car is much
>> less work than implementing full RDMA backchannel support for
>> both a client and server, especially since TCP backchannel
>> already works and can be used immediately.
>> 
>> Also, no problem with eventually implementing RDMA backchannel
>> if the complexity, and any performance overhead it introduces in
>> the forward channel, can be justified. The client can use the
>> CREATE_SESSION flags to detect what a server supports.
> 
> What complexity and performance overhead does it introduce in the
> forward channel?

The benefit of RDMA is that there are opportunities to
reduce host CPU interaction with incoming data.
Bi-directional operation requires that the transport look
at the RPC header to determine the direction of each
message. That could have an impact on the forward channel,
but it’s never been measured, to my knowledge.

The reason this is more of an issue for RPC/RDMA is that
a copy of the XID appears in the RPC/RDMA header to avoid
the need to look at the RPC header. That’s typically what
implementations use to steer RPC reply processing.

Often the RPC/RDMA header and RPC header land in
disparate buffers. The RPC/RDMA reply handler looks
strictly at the RPC/RDMA header, and runs in a tasklet,
usually on a different CPU. Adding bi-directional support
would mean the transport would have to peek into the upper
layer headers, possibly resulting in cache line bouncing.
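
To make the steering point concrete, here is a minimal user-space
sketch. It is not the xprtrdma code; the function names are
hypothetical and exist only for illustration. It assumes the XID is
the first XDR word of the RPC/RDMA header (RFC 5666) and that the RPC
header carries the XID followed by msg_type, with CALL = 0 and
REPLY = 1 (RFC 5531). Forward-only steering reads one word of the
RPC/RDMA header; telling a call from a reply means reading a second
buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* msg_type values from RFC 5531. */
enum rpc_direction { RPC_CALL = 0, RPC_REPLY = 1 };

/* Decode one big-endian XDR word. */
static uint32_t xdr_get_u32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Forward-only case (hypothetical helper): the XID copy is the first
 * word of the RPC/RDMA header, so the RPC header is never touched.
 */
static uint32_t steer_by_rdma_header(const uint8_t *rdma_hdr)
{
	return xdr_get_u32(rdma_hdr);
}

/*
 * Bi-directional case (hypothetical helper): the direction lives in
 * the second word of the RPC header, which may sit in a different
 * buffer. Returns RPC_CALL, RPC_REPLY, or -1 if the header is too
 * short or the msg_type is not recognized.
 */
static int rpc_msg_direction(const uint8_t *rpc_hdr, size_t len)
{
	uint32_t type;

	if (len < 8)
		return -1;
	type = xdr_get_u32(rpc_hdr + 4);
	if (type != RPC_CALL && type != RPC_REPLY)
		return -1;
	return (int)type;
}
```

The extra read in rpc_msg_direction() is the "peek" into the upper
layer header; whether it costs anything measurable on the forward
channel is, as noted, untested.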

The complexity would be the addition of over a hundred
new lines of code on the client, and possibly a similar
amount of new code on the server. Small, perhaps, but
not insignificant.

>>> 2) Why do we instead have to solve the whole backchannel problem in
>>> the NFSv4.1 layer, and where is the discussion of the merits for and
>>> against that particular solution? As far as I can tell, it imposes at
>>> least 2 extra requirements:
>>> a) NFSv4.1 client+server must have support either for session
>>> trunking or for clientid trunking
>> 
>> Very minimal trunking support. The only operation allowed on
>> the TCP side-car's forward channel is BIND_CONN_TO_SESSION.
>> 
>> Bruce told me that associating multiple transports to a
>> clientid/session should not be an issue for his server (his
>> words were “if that doesn’t work, it’s a bug”).
>> 
>> Would this restrictive form of trunking present a problem?
>> 
>>> b) NFSv4.1 client must be able to set up a TCP connection to the
>>> server (that can be session/clientid trunked with the existing RDMA
>>> channel)
>> 
>> Also very minimal changes. The changes are already done,
>> posted in v1 of this patch series.
> 
> I'm not asking for details on the size of the changesets, but for a
> justification of the design itself.

The size of the changeset _is_ the justification. It’s
a much less invasive change to add a TCP side-car than
it is to implement RDMA backchannel on both server and
client.

Most servers would require almost no change. Linux needs
only a bug fix or two. Effectively zero-impact for
servers that already support NFSv4.0 on RDMA to get
NFSv4.1 and pNFS on RDMA, with working callbacks.

That’s really all there is to it. It’s almost entirely a
practical consideration: we have the infrastructure and
can make it work in just a few lines of code.

> If it is possible to confine all
> the changes to the RPC/RDMA layer, then why consider patches that
> change the NFSv4.1 layer at all?

Faster bring-up of new transports is probably the
biggest win. A TCP side-car makes bringing up any new
transport implementation simpler.

And, RPC/RDMA offers zero performance benefit for
backchannel traffic, especially since CB traffic would
never move via RDMA READ/WRITE (as per RFC 5667 section
5.1).

The primary benefit to doing an RPC/RDMA-only solution
is that there is no upper layer impact. Is that a design
requirement?

There’s also been no discussion of issues with adding a
very restricted amount of transport trunking. Can you
elaborate on the problems this could introduce?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



