On Thu, May 30 2019, Tom Talpey wrote:

> On 5/30/2019 6:38 PM, NeilBrown wrote:
>> On Thu, May 30 2019, Tom Talpey wrote:
>>
>>> On 5/30/2019 1:20 PM, Olga Kornievskaia wrote:
>>>> On Thu, May 30, 2019 at 1:05 PM Tom Talpey wrote:
>>>>>
>>>>> On 5/29/2019 8:41 PM, NeilBrown wrote:
>>>>>> I've also re-arrange the patches a bit, merged two, and remove the
>>>>>> restriction to TCP and NFSV4.x,x>=1. Discussions seemed to suggest
>>>>>> these restrictions were not needed, I can see no need.
>>>>>
>>>>> I believe the need is for the correctness of retries. Because NFSv2,
>>>>> NFSv3 and NFSv4.0 have no exactly-once semantics of their own, server
>>>>> duplicate request caches are important (although often imperfect).
>>>>> These caches use client XID's, source ports and addresses, sometimes
>>>>> in addition to other methods, to detect retry. Existing clients are
>>>>> careful to reconnect with the same source port, to ensure this. And
>>>>> existing servers won't change.
>>>>
>>>> Retries are already bound to the same connection so there shouldn't be
>>>> an issue of a retransmission coming from a different source port.
>>>
>>> So, there's no path redundancy? If any connection is lost and can't
>>> be reestablished, the requests on that connection will time out?
>>
>> Path redundancy happens lower down in the stack. Presumably a bonding
>> driver will divert flows to a working path when one path fails.
>> NFS doesn't see paths at all. It just sees TCP connections - each with
>> the same source and destination address. How these are associated, from
>> time to time, with different hardware is completely transparent to NFS.
>
> But, you don't propose to constrain this to bonded connections. So
> NFS will create connections on whatever collection of NICs which are
> locally, and if these aren't bonded, well, the issues become visible.

If a client had multiple network interfaces with different addresses,
and several of them had routes to the selected server IP, then this
might result in the multiple connections to the server having different
local addresses (as well as different local ports) - I don't know the
network layer well enough to be sure if this is possible, but it seems
credible.

If one of these interfaces then went down, and there was no automatic
routing reconfiguration in place to restore connectivity through a
different interface, then the TCP connection would time out and break.
The xprt would then try to reconnect using the same source port and
destination address - it doesn't provide an explicit source address, but
lets the network layer provide one.
This would presumably result in a connection with a different source
address.  So requests would continue to flow on the xprt, but they might
miss the DRC as the source address would be different.

If you have a configuration like this (multi-homed client with
multiple interfaces that can reach the server with equal weight), then
you already have a possible problem of missing the DRC if one interface
goes down and a new connection is established from another one.
nconnect doesn't change that.

So I still don't see any problem.  If I've misunderstood you, please
provide a detailed description of the sort of configuration where you
think a problem might arise.

>
> BTW, RDMA NICs are never bonded.

I've come across the concept of "Multi-Rail", but I cannot say that I
fully understand it yet.  I suspect you would need more than nconnect to
make proper use of multi-rail RDMA.

Thanks,
NeilBrown
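
For illustration only, the DRC miss discussed above can be sketched as
follows.  This is a toy model, not code from any real NFS server: it
assumes a duplicate request cache keyed on exactly (XID, source address,
source port), and every name in it (DrcKey, DuplicateRequestCache,
handle, demo) is hypothetical.  Real servers may use additional methods
to detect a retry, as noted in the quoted text.

```python
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple


class DrcKey(NamedTuple):
    """Assumed cache key: XID plus the client's source address and port."""
    xid: int
    src_addr: str
    src_port: int


class DuplicateRequestCache:
    """Toy DRC: remembers the reply sent for each key it has seen."""

    def __init__(self) -> None:
        self._replies: Dict[DrcKey, bytes] = {}

    def lookup(self, key: DrcKey) -> Optional[bytes]:
        return self._replies.get(key)

    def store(self, key: DrcKey, reply: bytes) -> None:
        self._replies[key] = reply


def handle(drc: DuplicateRequestCache, key: DrcKey,
           execute: Callable[[], bytes]) -> Tuple[bytes, bool]:
    """Return (reply, replayed). Execute the request only on a cache miss."""
    cached = drc.lookup(key)
    if cached is not None:
        return cached, True
    reply = execute()
    drc.store(key, reply)
    return reply, False


def demo() -> Tuple[List[bool], int]:
    """Send one request, then retry it from the same and a different address."""
    drc = DuplicateRequestCache()
    executions = 0

    def execute() -> bytes:
        # Stands in for a non-idempotent operation such as a rename.
        nonlocal executions
        executions += 1
        return f"reply-{executions}".encode()

    original = DrcKey(xid=0x1234, src_addr="192.0.2.10", src_port=871)
    # Same XID and port, but the network layer picked another source address.
    moved = original._replace(src_addr="192.0.2.20")

    replayed = [
        handle(drc, original, execute)[1],  # first transmission
        handle(drc, original, execute)[1],  # retry on the same connection
        handle(drc, moved, execute)[1],     # retry after reconnecting elsewhere
    ]
    return replayed, executions
```

In this model the retry from the original address is answered from the
cache, while the retry from the second address misses and the operation
runs again.  The same miss occurs whether the client has one connection
or several.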