On Thu, May 30 2019, Chuck Lever wrote:

> Hi Neil-
>
> Thanks for chasing this a little further.
>
>
>> On May 29, 2019, at 8:41 PM, NeilBrown wrote:
>>
>> This patch set is based on the patches in the multipath_tcp branch of
>>  git://git.linux-nfs.org/projects/trondmy/nfs-2.6.git
>>
>> I'd like to add my voice to those supporting this work and wanting to
>> see it land.
>> We have had customers/partners wanting this sort of functionality for
>> years.  In SLES releases prior to SLE15, we've provided a
>> "nosharetransport" mount option, so that several filesystems could be
>> mounted from the same server and each would get its own TCP
>> connection.
>
> Is it well understood why splitting up the TCP connections results
> in better performance?
>
>
>> In SLE15 we are using this 'nconnect' feature, which is much nicer.
>>
>> Partners have assured us that it improves total throughput,
>> particularly with bonded networks, but we haven't had any concrete
>> data until Olga Kornievskaia provided some concrete test data - thanks
>> Olga!
>>
>> My understanding, as I explain in one of the patches, is that parallel
>> hardware is normally utilized by distributing flows, rather than
>> packets.  This avoids out-of-order delivery of packets in a flow.
>> So multiple flows are needed to utilize parallel hardware.
>
> Indeed.
>
> However I think one of the problems is what happens in simpler scenarios.
> We had reports that using nconnect > 1 on virtual clients made things
> go slower. It's not always wise to establish multiple connections
> between the same two IP addresses. It depends on the hardware on each
> end, and the network conditions.

This is a good argument for leaving the default at '1'.  When
documentation is added to nfs(5), we can make it clear that the optimal
number is dependent on hardware.

>
> What about situations where the network capabilities between server and
> client change? Problem is that neither endpoint can detect that; TCP
> usually just deals with it.

Being able to manually change (-o remount) the number of connections
might be useful...

>
> Related Work:
>
> We now have protocol (more like conventions) for clients to discover
> when a server has additional endpoints so that it can establish
> connections to each of them.
>
> https://datatracker.ietf.org/doc/rfc8587/
>
> and
>
> https://datatracker.ietf.org/doc/draft-ietf-nfsv4-rfc5661-msns-update/
>
> Boiled down, the client uses fs_locations and trunking detection to
> figure out when two IP addresses are the same server instance.
>
> This facility can also be used to establish a connection over a
> different path if network connectivity is lost.
>
> There has also been some exploration of MP-TCP. The magic happens
> under the transport socket in the network layer, and the RPC client
> is not involved.

I would think that SCTP would be the best protocol for NFS to use as it
supports multi-streaming - several independent streams.  That would
require that hardware understands it of course.

Though I have examined MP-TCP closely, it looks like it is still fully
sequenced, so it would be tricky for two RPC messages to be assembled
into TCP frames completely independently - at least you would need
synchronization on the sequence number.

Thanks for your thoughts,
NeilBrown

>
>
>> Comments most welcome.  I'd love to see this, or something similar,
>> merged.
>>
>> Thanks,
>> NeilBrown
>>
>> ---
>>
>> NeilBrown (4):
>>      NFS: send state management on a single connection.
>>      SUNRPC: enhance rpc_clnt_show_stats() to report on all xprts.
>>      SUNRPC: add links for all client xprts to debugfs
>>
>> Trond Myklebust (5):
>>      SUNRPC: Add basic load balancing to the transport switch
>>      SUNRPC: Allow creation of RPC clients with multiple connections
>>      NFS: Add a mount option to specify number of TCP connections to use
>>      NFSv4: Allow multiple connections to NFSv4.x servers
>>      pNFS: Allow multiple connections to the DS
>>      NFS: Allow multiple connections to a NFSv2 or NFSv3 server
>>
>>
>>  fs/nfs/client.c                      |    3 +
>>  fs/nfs/internal.h                    |    2 +
>>  fs/nfs/nfs3client.c                  |    1
>>  fs/nfs/nfs4client.c                  |   13 ++++-
>>  fs/nfs/nfs4proc.c                    |   22 +++++---
>>  fs/nfs/super.c                       |   12 ++++
>>  include/linux/nfs_fs_sb.h            |    1
>>  include/linux/sunrpc/clnt.h          |    1
>>  include/linux/sunrpc/sched.h         |    1
>>  include/linux/sunrpc/xprt.h          |    1
>>  include/linux/sunrpc/xprtmultipath.h |    2 +
>>  net/sunrpc/clnt.c                    |   98 ++++++++++++++++++++++++++++++++--
>>  net/sunrpc/debugfs.c                 |   46 ++++++++++------
>>  net/sunrpc/sched.c                   |    3 +
>>  net/sunrpc/stats.c                   |   15 +++--
>>  net/sunrpc/sunrpc.h                  |    3 +
>>  net/sunrpc/xprtmultipath.c           |   23 +++++++-
>>  17 files changed, 204 insertions(+), 43 deletions(-)
>>
>> --
>> Signature
>>
>
> --
> Chuck Lever
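
The point made earlier in the thread, that parallel hardware is normally
utilized by distributing flows rather than packets, can be illustrated
with a toy sketch. Everything in it is an assumption made for
illustration: the hash is a deliberately simple stand-in and not any real
bonding or NIC policy, the addresses come from the documentation range,
and the source ports and the names pick_link() and links_used() are
hypothetical. It only shows why a single TCP connection stays on one
link while several connections can spread out.

```python
# Toy model: a link (or NIC queue) is chosen per flow, not per packet.
# The hash is a simple stand-in, NOT a real bonding or NIC hashing policy.
from ipaddress import ip_address


def pick_link(src_ip, src_port, dst_ip, dst_port, n_links):
    """Map a TCP flow (its 4-tuple) to one of n_links parallel links."""
    key = int(ip_address(src_ip)) + int(ip_address(dst_ip)) + src_port + dst_port
    return key % n_links


def links_used(src_ports, n_links, client="192.0.2.10", server="192.0.2.20"):
    """Return the set of links reached by connections to NFS port 2049."""
    return {pick_link(client, port, server, 2049, n_links) for port in src_ports}


# One connection: every packet of the flow hashes to the same link.
one = links_used([50000], n_links=4)

# Four connections (hypothetical source ports), as with nconnect=4.
four = links_used([50000, 50001, 50002, 50003], n_links=4)

print(len(one), len(four))
```

With this toy hash, consecutive source ports happen to land on distinct
links. A real hash gives no such guarantee, which fits the observation
above that the optimal number of connections depends on the hardware.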