* [PATCH 0/2] SUNRPC: Clean up port reuse behavior on reconnects @ 2018-05-10 6:12 Vallish Vaidyeshwara 2018-05-10 6:12 ` [PATCH 1/2] SUNRPC: Need to reuse non-reserved port for reconnect Vallish Vaidyeshwara 2018-05-10 6:12 ` [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination Vallish Vaidyeshwara 0 siblings, 2 replies; 10+ messages in thread From: Vallish Vaidyeshwara @ 2018-05-10 6:12 UTC (permalink / raw) To: trond.myklebust, anna.schumaker, jlayton, bfields, linux-nfs; +Cc: jsstraus We have seen examples of NFSv4 client reconnections failing due to client source port reuse combined with stateful TCP state tracking in middleboxes. Commit 0f7a622ca616 ("rpc: xs_bind - do not bind when requesting a random ephemeral port"), commit 4dda9c8a5e34 ("SUNRPC: Set SO_REUSEPORT socket option for TCP connections"), and commit 1f4c17a03ba7 ("SUNRPC: Handle EADDRNOTAVAIL on connection failures") had unintended side effects of making port reuse dependent on whether that client uses reserved or non-reserved source ports. This series fixes the observed NFSv4 client reconnect failures by choosing a new source port regardless of which port range is in use for RST or FIN terminating the prior connection. This series does not restore the original always-reuse behavior that existed prior to commits mentioned above. Patch 1: -------- SUNRPC: Need to reuse non-reserved port for reconnect This patch restores the original behavior of port reuse in all scenarios. Patch 2: -------- SUNRPC: Reconnect with new port on server initiated connection termination Client uses a new port only in cases where server has terminated the connection explicitly using RST or FIN. Proposed outcome with this patch series: ---------------------------------------- ------------------------------------------------------------------ | Port type | Connection termination | Current | With this | | | type | behavior | patch series | ------------------------------------------------------------------ | Reserved | Network partition-DROP | Reuse port | Reuse port | ------------------------------------------------------------------ | Reserved | FIN from server | Reuse port | New port | ------------------------------------------------------------------ | Reserved | RST from server | Reuse port | New port | ------------------------------------------------------------------ | Non-Resv | Network partition-DROP | New port | Reuse port | ------------------------------------------------------------------ | Non-Resv | FIN from server | New port | New port | ------------------------------------------------------------------ | Non-Resv | RST from server | New port | New port | ------------------------------------------------------------------ net/sunrpc/xprtsock.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) -- 2.7.3.AMZN ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] SUNRPC: Need to reuse non-reserved port for reconnect 2018-05-10 6:12 [PATCH 0/2] SUNRPC: Clean up port reuse behavior on reconnects Vallish Vaidyeshwara @ 2018-05-10 6:12 ` Vallish Vaidyeshwara 2018-05-10 21:18 ` Vallish Vaidyeshwara 2018-05-10 6:12 ` [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination Vallish Vaidyeshwara 1 sibling, 1 reply; 10+ messages in thread From: Vallish Vaidyeshwara @ 2018-05-10 6:12 UTC (permalink / raw) To: trond.myklebust, anna.schumaker, jlayton, bfields, linux-nfs; +Cc: jsstraus Seemingly innocent optimization related to xs_bind() broke TCP port reuse by making non-reserved ephermal socket port to not be saved in "struct sock_xprt (srcport)". In case of non-reserved port, allocation happens as part of kernel_connect() inside of xs_tcp_finish_connecting(). kernel_connect() returns EINPROGRESS and the code skips stashing srcport in sock_xprt for reconnects. This affects servers DRC in case of network partition where client's RPC recovery would try reconnecting with a different port. Reported-by: Alexey Kuznetsov <alexeyk@amazon.com> Reviewed-by: Jacob Strauss <jsstraus@amazon.com> Reviewed-by: Alakesh Haloi <alakeshh@amazon.com> Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com> Fixes: 0f7a622c ("rpc: xs_bind - do not bind when requesting a random ephemeral port") --- net/sunrpc/xprtsock.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index c8902f1..5bf75b3 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -2393,9 +2393,11 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock) ret = kernel_connect(sock, xs_addr(xprt), xprt->addrlen, O_NONBLOCK); switch (ret) { case 0: - xs_set_srcport(transport, sock); /* fall through */ case -EINPROGRESS: + /* Allocated port saved for reconnect */ + xs_set_srcport(transport, sock); + /* SYN_SENT! */ if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO) xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO; -- 2.7.3.AMZN ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] SUNRPC: Need to reuse non-reserved port for reconnect 2018-05-10 6:12 ` [PATCH 1/2] SUNRPC: Need to reuse non-reserved port for reconnect Vallish Vaidyeshwara @ 2018-05-10 21:18 ` Vallish Vaidyeshwara 0 siblings, 0 replies; 10+ messages in thread From: Vallish Vaidyeshwara @ 2018-05-10 21:18 UTC (permalink / raw) To: trond.myklebust, anna.schumaker, jlayton, bfields, linux-nfs; +Cc: jsstraus On Thu, May 10, 2018 at 06:12:53AM +0000, Vallish Vaidyeshwara wrote: > Seemingly innocent optimization related to xs_bind() broke TCP port > reuse by making non-reserved ephermal socket port to not be saved > in "struct sock_xprt (srcport)". In case of non-reserved port, > allocation happens as part of kernel_connect() inside of > xs_tcp_finish_connecting(). kernel_connect() returns EINPROGRESS > and the code skips stashing srcport in sock_xprt for reconnects. > This affects servers DRC in case of network partition where client's > RPC recovery would try reconnecting with a different port. > > Reported-by: Alexey Kuznetsov <alexeyk@amazon.com> > Reviewed-by: Jacob Strauss <jsstraus@amazon.com> > Reviewed-by: Alakesh Haloi <alakeshh@amazon.com> > Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com> > Fixes: 0f7a622c ("rpc: xs_bind - do not bind when requesting a random ephemeral port") > --- > net/sunrpc/xprtsock.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c > index c8902f1..5bf75b3 100644 > --- a/net/sunrpc/xprtsock.c > +++ b/net/sunrpc/xprtsock.c > @@ -2393,9 +2393,11 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock) > ret = kernel_connect(sock, xs_addr(xprt), xprt->addrlen, O_NONBLOCK); > switch (ret) { > case 0: > - xs_set_srcport(transport, sock); > /* fall through */ > case -EINPROGRESS: > + /* Allocated port saved for reconnect */ > + xs_set_srcport(transport, sock); > + > /* SYN_SENT! */ > if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO) > xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO; > -- > 2.7.3.AMZN > Hello Trond and Bruce, This patch is actually restoring existing broken DRC behavior. Can you folks let me know your feedback on this patch as well. Thanks. -Vallish ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination 2018-05-10 6:12 [PATCH 0/2] SUNRPC: Clean up port reuse behavior on reconnects Vallish Vaidyeshwara 2018-05-10 6:12 ` [PATCH 1/2] SUNRPC: Need to reuse non-reserved port for reconnect Vallish Vaidyeshwara @ 2018-05-10 6:12 ` Vallish Vaidyeshwara 2018-05-10 15:25 ` Trond Myklebust 1 sibling, 1 reply; 10+ messages in thread From: Vallish Vaidyeshwara @ 2018-05-10 6:12 UTC (permalink / raw) To: trond.myklebust, anna.schumaker, jlayton, bfields, linux-nfs; +Cc: jsstraus Server initiated socket close can corrupt connection state tracking table in conjunction with other network middle boxes. In situations like these, client connection hangs till connection state tracking table entries age out and get purged. Client reconnection with a new port in such a situation will avoid connection hang. Reviewed-by: Jacob Strauss <jsstraus@amazon.com> Reviewed-by: Alakesh Haloi <alakeshh@amazon.com> Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com> --- net/sunrpc/xprtsock.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 5bf75b3..d293c8d 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1629,6 +1629,8 @@ static void xs_tcp_state_change(struct sock *sk) /* The server initiated a shutdown of the socket */ xprt->connect_cookie++; clear_bit(XPRT_CONNECTED, &xprt->state); + /* Server sent FIN, reconnect with a new port */ + transport->srcport = 0; xs_tcp_force_close(xprt); /* fall through */ case TCP_CLOSING: @@ -1650,6 +1652,9 @@ static void xs_tcp_state_change(struct sock *sk) &transport->sock_state)) xprt_clear_connecting(xprt); clear_bit(XPRT_CLOSING, &xprt->state); + /* Server sent RST, reconnect with a new port */ + if (sk->sk_err == ECONNRESET) + transport->srcport = 0; if (sk->sk_err) xprt_wake_pending_tasks(xprt, -sk->sk_err); /* Trigger the socket release */ -- 2.7.3.AMZN ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination 2018-05-10 6:12 ` [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination Vallish Vaidyeshwara @ 2018-05-10 15:25 ` Trond Myklebust 2018-05-10 16:22 ` Vallish Vaidyeshwara 0 siblings, 1 reply; 10+ messages in thread From: Trond Myklebust @ 2018-05-10 15:25 UTC (permalink / raw) To: bfields, vallish, anna.schumaker, linux-nfs, jlayton; +Cc: jsstraus T24gVGh1LCAyMDE4LTA1LTEwIGF0IDA2OjEyICswMDAwLCBWYWxsaXNoIFZhaWR5ZXNod2FyYSB3 cm90ZToNCj4gU2VydmVyIGluaXRpYXRlZCBzb2NrZXQgY2xvc2UgY2FuIGNvcnJ1cHQgY29ubmVj dGlvbiBzdGF0ZSB0cmFja2luZw0KPiB0YWJsZSBpbiBjb25qdW5jdGlvbiB3aXRoIG90aGVyIG5l dHdvcmsgbWlkZGxlIGJveGVzLiBJbiBzaXR1YXRpb25zDQo+IGxpa2UgdGhlc2UsIGNsaWVudCBj b25uZWN0aW9uIGhhbmdzIHRpbGwgY29ubmVjdGlvbiBzdGF0ZSB0cmFja2luZw0KPiB0YWJsZSBl bnRyaWVzIGFnZSBvdXQgYW5kIGdldCBwdXJnZWQuIENsaWVudCByZWNvbm5lY3Rpb24gd2l0aCBh IG5ldw0KPiBwb3J0IGluIHN1Y2ggYSBzaXR1YXRpb24gd2lsbCBhdm9pZCBjb25uZWN0aW9uIGhh bmcuDQo+IA0KPiBSZXZpZXdlZC1ieTogSmFjb2IgU3RyYXVzcyA8anNzdHJhdXNAYW1hem9uLmNv bT4NCj4gUmV2aWV3ZWQtYnk6IEFsYWtlc2ggSGFsb2kgPGFsYWtlc2hoQGFtYXpvbi5jb20+DQo+ IFNpZ25lZC1vZmYtYnk6IFZhbGxpc2ggVmFpZHllc2h3YXJhIDx2YWxsaXNoQGFtYXpvbi5jb20+ DQo+IC0tLQ0KPiAgbmV0L3N1bnJwYy94cHJ0c29jay5jIHwgNSArKysrKw0KPiAgMSBmaWxlIGNo YW5nZWQsIDUgaW5zZXJ0aW9ucygrKQ0KPiANCj4gZGlmZiAtLWdpdCBhL25ldC9zdW5ycGMveHBy dHNvY2suYyBiL25ldC9zdW5ycGMveHBydHNvY2suYw0KPiBpbmRleCA1YmY3NWIzLi5kMjkzYzhk IDEwMDY0NA0KPiAtLS0gYS9uZXQvc3VucnBjL3hwcnRzb2NrLmMNCj4gKysrIGIvbmV0L3N1bnJw Yy94cHJ0c29jay5jDQo+IEBAIC0xNjI5LDYgKzE2MjksOCBAQCBzdGF0aWMgdm9pZCB4c190Y3Bf c3RhdGVfY2hhbmdlKHN0cnVjdCBzb2NrDQo+ICpzaykNCj4gIAkJLyogVGhlIHNlcnZlciBpbml0 aWF0ZWQgYSBzaHV0ZG93biBvZiB0aGUgc29ja2V0ICovDQo+ICAJCXhwcnQtPmNvbm5lY3RfY29v a2llKys7DQo+ICAJCWNsZWFyX2JpdChYUFJUX0NPTk5FQ1RFRCwgJnhwcnQtPnN0YXRlKTsNCj4g KwkJLyogU2VydmVyIHNlbnQgRklOLCByZWNvbm5lY3Qgd2l0aCBhIG5ldyBwb3J0ICovDQo+ICsJ CXRyYW5zcG9ydC0+c3JjcG9ydCA9IDA7DQo+ICAJCXhzX3RjcF9mb3JjZV9jbG9zZSh4cHJ0KTsN Cj4gIAkJLyogZmFsbCB0aHJvdWdoICovDQo+ICAJY2FzZSBUQ1BfQ0xPU0lORzoNCj4gQEAgLTE2 NTAsNiArMTY1Miw5IEBAIHN0YXRpYyB2b2lkIHhzX3RjcF9zdGF0ZV9jaGFuZ2Uoc3RydWN0IHNv Y2sNCj4gKnNrKQ0KPiAgCQkJCQkmdHJhbnNwb3J0LT5zb2NrX3N0YXRlKSkNCj4gIAkJCXhwcnRf Y2xlYXJfY29ubmVjdGluZyh4cHJ0KTsNCj4gIAkJY2xlYXJfYml0KFhQUlRfQ0xPU0lORywgJnhw cnQtPnN0YXRlKTsNCj4gKwkJLyogU2VydmVyIHNlbnQgUlNULCByZWNvbm5lY3Qgd2l0aCBhIG5l dyBwb3J0ICovDQo+ICsJCWlmIChzay0+c2tfZXJyID09IEVDT05OUkVTRVQpDQo+ICsJCQl0cmFu c3BvcnQtPnNyY3BvcnQgPSAwOw0KPiAgCQlpZiAoc2stPnNrX2VycikNCj4gIAkJCXhwcnRfd2Fr ZV9wZW5kaW5nX3Rhc2tzKHhwcnQsIC1zay0+c2tfZXJyKTsNCj4gIAkJLyogVHJpZ2dlciB0aGUg c29ja2V0IHJlbGVhc2UgKi8NCg0KTkFDSy4gVGhpcyB3aWxsIHV0dGVybHkgYnJlYWsgTkZTdjIs IE5GU3YzIGFuZCBORlN2NC4wIGR1cGxpY2F0ZSByZXBsYXkNCmNhY2hlIHNlbWFudGljcy4gDQoN CkNoZWVycw0KICBUcm9uZA0KLS0gDQpUcm9uZCBNeWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQg bWFpbnRhaW5lciwgSGFtbWVyc3BhY2UNCnRyb25kLm15a2xlYnVzdEBoYW1tZXJzcGFjZS5jb20N Cg== ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination 2018-05-10 15:25 ` Trond Myklebust @ 2018-05-10 16:22 ` Vallish Vaidyeshwara 2018-05-10 17:26 ` Trond Myklebust 2018-05-10 17:37 ` bfields 0 siblings, 2 replies; 10+ messages in thread From: Vallish Vaidyeshwara @ 2018-05-10 16:22 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, anna.schumaker, linux-nfs, jlayton, jsstraus On Thu, May 10, 2018 at 03:25:14PM +0000, Trond Myklebust wrote: > On Thu, 2018-05-10 at 06:12 +0000, Vallish Vaidyeshwara wrote: > > Server initiated socket close can corrupt connection state tracking > > table in conjunction with other network middle boxes. In situations > > like these, client connection hangs till connection state tracking > > table entries age out and get purged. Client reconnection with a new > > port in such a situation will avoid connection hang. > > > > Reviewed-by: Jacob Strauss <jsstraus@amazon.com> > > Reviewed-by: Alakesh Haloi <alakeshh@amazon.com> > > Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com> > > --- > > net/sunrpc/xprtsock.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c > > index 5bf75b3..d293c8d 100644 > > --- a/net/sunrpc/xprtsock.c > > +++ b/net/sunrpc/xprtsock.c > > @@ -1629,6 +1629,8 @@ static void xs_tcp_state_change(struct sock > > *sk) > > /* The server initiated a shutdown of the socket */ > > xprt->connect_cookie++; > > clear_bit(XPRT_CONNECTED, &xprt->state); > > + /* Server sent FIN, reconnect with a new port */ > > + transport->srcport = 0; > > xs_tcp_force_close(xprt); > > /* fall through */ > > case TCP_CLOSING: > > @@ -1650,6 +1652,9 @@ static void xs_tcp_state_change(struct sock > > *sk) > > &transport->sock_state)) > > xprt_clear_connecting(xprt); > > clear_bit(XPRT_CLOSING, &xprt->state); > > + /* Server sent RST, reconnect with a new port */ > > + if (sk->sk_err == ECONNRESET) > > + transport->srcport = 0; > > if (sk->sk_err) > > xprt_wake_pending_tasks(xprt, -sk->sk_err); > > /* Trigger the socket release */ > > NACK. This will utterly break NFSv2, NFSv3 and NFSv4.0 duplicate replay > cache semantics. > > Cheers > Trond > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com Hello Trond, The first patch in this series is actually helping restore DRC behavior in cases like network partition where packets are dropped: [PATCH 1/2] SUNRPC: Need to reuse non-reserved port for reconnect Patch 1 starts reusing port in all cases. The second patch is still not breaking DRC semantics, to quote from source code: <code snip> /** * xs_close - close a socket * @xprt: transport * * This is used when all requests are complete; ie, no DRC state remains * on the server we want to save. * * The caller _must_ be holding XPRT_LOCKED in order to avoid issues with * xs_reset_transport() zeroing the socket from underneath a writer. */ static void xs_close(struct rpc_xprt *xprt) <code snip> If the server has closed a connection, then no DRC state remains on the server we want to use. The second patch is exploiting this semantics and using a new port in following 2 cases: a) RST from server implies the connection was torn down & no useful DRC exists on server b) FIN from server implies that server is shutting down the connection as part of close and no DRC state remains Please let me know if I have missed something obvious, I definitely do not want to break DRC as that is not the intention of this patch series. Is there a situation where server can close a connection and still keep DRC? Thanks. -Vallish ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination 2018-05-10 16:22 ` Vallish Vaidyeshwara @ 2018-05-10 17:26 ` Trond Myklebust 2018-05-10 21:12 ` Vallish Vaidyeshwara 2018-05-10 17:37 ` bfields 1 sibling, 1 reply; 10+ messages in thread From: Trond Myklebust @ 2018-05-10 17:26 UTC (permalink / raw) To: vallish; +Cc: bfields, anna.schumaker, jsstraus, linux-nfs, jlayton T24gVGh1LCAyMDE4LTA1LTEwIGF0IDE2OjIyICswMDAwLCBWYWxsaXNoIFZhaWR5ZXNod2FyYSB3 cm90ZToNCj4gT24gVGh1LCBNYXkgMTAsIDIwMTggYXQgMDM6MjU6MTRQTSArMDAwMCwgVHJvbmQg TXlrbGVidXN0IHdyb3RlOg0KPiA+IE9uIFRodSwgMjAxOC0wNS0xMCBhdCAwNjoxMiArMDAwMCwg VmFsbGlzaCBWYWlkeWVzaHdhcmEgd3JvdGU6DQo+ID4gPiBTZXJ2ZXIgaW5pdGlhdGVkIHNvY2tl dCBjbG9zZSBjYW4gY29ycnVwdCBjb25uZWN0aW9uIHN0YXRlDQo+ID4gPiB0cmFja2luZw0KPiA+ ID4gdGFibGUgaW4gY29uanVuY3Rpb24gd2l0aCBvdGhlciBuZXR3b3JrIG1pZGRsZSBib3hlcy4g SW4NCj4gPiA+IHNpdHVhdGlvbnMNCj4gPiA+IGxpa2UgdGhlc2UsIGNsaWVudCBjb25uZWN0aW9u IGhhbmdzIHRpbGwgY29ubmVjdGlvbiBzdGF0ZQ0KPiA+ID4gdHJhY2tpbmcNCj4gPiA+IHRhYmxl IGVudHJpZXMgYWdlIG91dCBhbmQgZ2V0IHB1cmdlZC4gQ2xpZW50IHJlY29ubmVjdGlvbiB3aXRo IGENCj4gPiA+IG5ldw0KPiA+ID4gcG9ydCBpbiBzdWNoIGEgc2l0dWF0aW9uIHdpbGwgYXZvaWQg Y29ubmVjdGlvbiBoYW5nLg0KPiA+ID4gDQo+ID4gPiBSZXZpZXdlZC1ieTogSmFjb2IgU3RyYXVz cyA8anNzdHJhdXNAYW1hem9uLmNvbT4NCj4gPiA+IFJldmlld2VkLWJ5OiBBbGFrZXNoIEhhbG9p IDxhbGFrZXNoaEBhbWF6b24uY29tPg0KPiA+ID4gU2lnbmVkLW9mZi1ieTogVmFsbGlzaCBWYWlk eWVzaHdhcmEgPHZhbGxpc2hAYW1hem9uLmNvbT4NCj4gPiA+IC0tLQ0KPiA+ID4gIG5ldC9zdW5y cGMveHBydHNvY2suYyB8IDUgKysrKysNCj4gPiA+ICAxIGZpbGUgY2hhbmdlZCwgNSBpbnNlcnRp b25zKCspDQo+ID4gPiANCj4gPiA+IGRpZmYgLS1naXQgYS9uZXQvc3VucnBjL3hwcnRzb2NrLmMg Yi9uZXQvc3VucnBjL3hwcnRzb2NrLmMNCj4gPiA+IGluZGV4IDViZjc1YjMuLmQyOTNjOGQgMTAw NjQ0DQo+ID4gPiAtLS0gYS9uZXQvc3VucnBjL3hwcnRzb2NrLmMNCj4gPiA+ICsrKyBiL25ldC9z dW5ycGMveHBydHNvY2suYw0KPiA+ID4gQEAgLTE2MjksNiArMTYyOSw4IEBAIHN0YXRpYyB2b2lk IHhzX3RjcF9zdGF0ZV9jaGFuZ2Uoc3RydWN0IHNvY2sNCj4gPiA+ICpzaykNCj4gPiA+ICAJCS8q IFRoZSBzZXJ2ZXIgaW5pdGlhdGVkIGEgc2h1dGRvd24gb2YgdGhlIHNvY2tldA0KPiA+ID4gKi8N Cj4gPiA+ICAJCXhwcnQtPmNvbm5lY3RfY29va2llKys7DQo+ID4gPiAgCQljbGVhcl9iaXQoWFBS VF9DT05ORUNURUQsICZ4cHJ0LT5zdGF0ZSk7DQo+ID4gPiArCQkvKiBTZXJ2ZXIgc2VudCBGSU4s IHJlY29ubmVjdCB3aXRoIGEgbmV3IHBvcnQgKi8NCj4gPiA+ICsJCXRyYW5zcG9ydC0+c3JjcG9y dCA9IDA7DQo+ID4gPiAgCQl4c190Y3BfZm9yY2VfY2xvc2UoeHBydCk7DQo+ID4gPiAgCQkvKiBm YWxsIHRocm91Z2ggKi8NCj4gPiA+ICAJY2FzZSBUQ1BfQ0xPU0lORzoNCj4gPiA+IEBAIC0xNjUw LDYgKzE2NTIsOSBAQCBzdGF0aWMgdm9pZCB4c190Y3Bfc3RhdGVfY2hhbmdlKHN0cnVjdCBzb2Nr DQo+ID4gPiAqc2spDQo+ID4gPiAgCQkJCQkmdHJhbnNwb3J0LT5zb2NrX3N0YXRlKSkNCj4gPiA+ ICAJCQl4cHJ0X2NsZWFyX2Nvbm5lY3RpbmcoeHBydCk7DQo+ID4gPiAgCQljbGVhcl9iaXQoWFBS VF9DTE9TSU5HLCAmeHBydC0+c3RhdGUpOw0KPiA+ID4gKwkJLyogU2VydmVyIHNlbnQgUlNULCBy ZWNvbm5lY3Qgd2l0aCBhIG5ldyBwb3J0ICovDQo+ID4gPiArCQlpZiAoc2stPnNrX2VyciA9PSBF Q09OTlJFU0VUKQ0KPiA+ID4gKwkJCXRyYW5zcG9ydC0+c3JjcG9ydCA9IDA7DQo+ID4gPiAgCQlp ZiAoc2stPnNrX2VycikNCj4gPiA+ICAJCQl4cHJ0X3dha2VfcGVuZGluZ190YXNrcyh4cHJ0LCAt c2stDQo+ID4gPiA+c2tfZXJyKTsNCj4gPiA+ICAJCS8qIFRyaWdnZXIgdGhlIHNvY2tldCByZWxl YXNlICovDQo+ID4gDQo+ID4gTkFDSy4gVGhpcyB3aWxsIHV0dGVybHkgYnJlYWsgTkZTdjIsIE5G U3YzIGFuZCBORlN2NC4wIGR1cGxpY2F0ZQ0KPiA+IHJlcGxheQ0KPiA+IGNhY2hlIHNlbWFudGlj cy4gDQo+ID4gDQo+ID4gQ2hlZXJzDQo+ID4gICBUcm9uZA0KPiA+IC0tIA0KPiA+IFRyb25kIE15 a2xlYnVzdA0KPiA+IExpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lciwgSGFtbWVyc3BhY2UNCj4g PiB0cm9uZC5teWtsZWJ1c3RAaGFtbWVyc3BhY2UuY29tDQo+IA0KPiBIZWxsbyBUcm9uZCwNCj4g DQo+IFRoZSBmaXJzdCBwYXRjaCBpbiB0aGlzIHNlcmllcyBpcyBhY3R1YWxseSBoZWxwaW5nIHJl c3RvcmUgRFJDDQo+IGJlaGF2aW9yIGluDQo+IGNhc2VzIGxpa2UgbmV0d29yayBwYXJ0aXRpb24g d2hlcmUgcGFja2V0cyBhcmUgZHJvcHBlZDoNCj4gW1BBVENIIDEvMl0gU1VOUlBDOiBOZWVkIHRv IHJldXNlIG5vbi1yZXNlcnZlZCBwb3J0IGZvciByZWNvbm5lY3QNCj4gUGF0Y2ggMSBzdGFydHMg cmV1c2luZyBwb3J0IGluIGFsbCBjYXNlcy4NCj4gDQo+IFRoZSBzZWNvbmQgcGF0Y2ggaXMgc3Rp bGwgbm90IGJyZWFraW5nIERSQyBzZW1hbnRpY3MsIHRvIHF1b3RlIGZyb20NCj4gc291cmNlDQo+ IGNvZGU6DQo+IA0KPiA8Y29kZSBzbmlwPg0KPiAgIC8qKg0KPiAgICAqIHhzX2Nsb3NlIC0gY2xv c2UgYSBzb2NrZXQNCj4gICAgKiBAeHBydDogdHJhbnNwb3J0DQo+ICAgICoNCj4gICAgKiBUaGlz IGlzIHVzZWQgd2hlbiBhbGwgcmVxdWVzdHMgYXJlIGNvbXBsZXRlOyBpZSwgbm8gRFJDIHN0YXRl DQo+IHJlbWFpbnMNCj4gICAgKiBvbiB0aGUgc2VydmVyIHdlIHdhbnQgdG8gc2F2ZS4NCj4gICAg Kg0KPiAgICAqIFRoZSBjYWxsZXIgX211c3RfIGJlIGhvbGRpbmcgWFBSVF9MT0NLRUQgaW4gb3Jk ZXIgdG8gYXZvaWQNCj4gaXNzdWVzIHdpdGgNCj4gICAgKiB4c19yZXNldF90cmFuc3BvcnQoKSB6 ZXJvaW5nIHRoZSBzb2NrZXQgZnJvbSB1bmRlcm5lYXRoIGENCj4gd3JpdGVyLg0KPiAgICAqLw0K PiAgIHN0YXRpYyB2b2lkIHhzX2Nsb3NlKHN0cnVjdCBycGNfeHBydCAqeHBydCkNCj4gPGNvZGUg c25pcD4NCj4gDQo+IElmIHRoZSBzZXJ2ZXIgaGFzIGNsb3NlZCBhIGNvbm5lY3Rpb24sIHRoZW4g bm8gRFJDIHN0YXRlIHJlbWFpbnMgb24NCj4gdGhlIHNlcnZlcg0KPiB3ZSB3YW50IHRvIHVzZS4N Cj4gDQo+IFRoZSBzZWNvbmQgcGF0Y2ggaXMgZXhwbG9pdGluZyB0aGlzIHNlbWFudGljcyBhbmQg dXNpbmcgYSBuZXcgcG9ydCBpbg0KPiBmb2xsb3dpbmcNCj4gMiBjYXNlczoNCj4gYSkgUlNUIGZy b20gc2VydmVyIGltcGxpZXMgdGhlIGNvbm5lY3Rpb24gd2FzIHRvcm4gZG93biAmIG5vIHVzZWZ1 bA0KPiBEUkMgZXhpc3RzDQo+IG9uIHNlcnZlcg0KPiBiKSBGSU4gZnJvbSBzZXJ2ZXIgaW1wbGll cyB0aGF0IHNlcnZlciBpcyBzaHV0dGluZyBkb3duIHRoZQ0KPiBjb25uZWN0aW9uIGFzIHBhcnQN Cj4gb2YgY2xvc2UgYW5kIG5vIERSQyBzdGF0ZSByZW1haW5zDQo+IA0KPiBQbGVhc2UgbGV0IG1l IGtub3cgaWYgSSBoYXZlIG1pc3NlZCBzb21ldGhpbmcgb2J2aW91cywgSSBkZWZpbml0ZWx5DQo+ IGRvIG5vdCB3YW50DQo+IHRvIGJyZWFrIERSQyBhcyB0aGF0IGlzIG5vdCB0aGUgaW50ZW50aW9u IG9mIHRoaXMgcGF0Y2ggc2VyaWVzLiBJcw0KPiB0aGVyZSBhDQo+IHNpdHVhdGlvbiB3aGVyZSBz ZXJ2ZXIgY2FuIGNsb3NlIGEgY29ubmVjdGlvbiBhbmQgc3RpbGwga2VlcCBEUkM/DQo+IA0KDQpU aGUgRFJDIGRvZXMgbm90IGV4aXN0IGZvciB0aGUgYmVuZWZpdCBvZiB0aGUgc2VydmVyLCBidXQg Zm9yIHRoZQ0KYmVuZWZpdCBvZiB0aGUgY2xpZW50LiBJdCBpcyB0aGVyZSB0byBlbnN1cmUgdGhh dCBpZiB0aGUgY2xpZW50IHJlcGxheXMNCnRoZSByZXF1ZXN0LCB0aGVuIGl0IGdldHMgdGhlIGV4 YWN0IHNhbWUgcmVwbHkgYXMgaXQgc2hvdWxkIGhhdmUNCnJlY2VpdmVkIHdoZW4gdGhlIGZpcnN0 IHJlcXVlc3Qgd2FzIHNlbnQuIEJlYXJpbmcgdGhhdCBpbiBtaW5kOg0KDQogICAxLiBXaGF0IGd1 YXJhbnRlZXMgdGhhdCBhbGwgc2VydmVycyBiZWhhdmUgY29ycmVjdGx5IHcuci50LiBlbnN1cmlu Zw0KICAgICAgdGhhdCB0aGV5IGhhdmUgc2VudCBhbGwgcmVwbGllcyB0byBhbnkgb3V0c3RhbmRp bmcgUlBDIGNhbGwgYmVmb3JlDQogICAgICBzaHV0dGluZyBkb3duIHRoZSBjb25uZWN0aW9uLiBJ J20gbm90IHN1cmUgdGhhdCBldmVuIHRoZSBMaW51eA0KICAgICAgc2VydmVyIGRvZXMgdGhhdC4N CiAgIDIuIEhvdyB3b3VsZCBhIHNlcnZlciBldmVuIGtub3cgd2hldGhlciBvciBub3QgdGhlIGNs aWVudCBtYXkgbmVlZCB0bw0KICAgICAgcmVwbGF5IGEgcmVxdWVzdD8NCg0KLS0gDQpUcm9uZCBN eWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lciwgSGFtbWVyc3BhY2UNCnRyb25k Lm15a2xlYnVzdEBoYW1tZXJzcGFjZS5jb20= ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination 2018-05-10 17:26 ` Trond Myklebust @ 2018-05-10 21:12 ` Vallish Vaidyeshwara 0 siblings, 0 replies; 10+ messages in thread From: Vallish Vaidyeshwara @ 2018-05-10 21:12 UTC (permalink / raw) To: Trond Myklebust; +Cc: bfields, anna.schumaker, jsstraus, linux-nfs, jlayton On Thu, May 10, 2018 at 05:26:12PM +0000, Trond Myklebust wrote: > On Thu, 2018-05-10 at 16:22 +0000, Vallish Vaidyeshwara wrote: > > On Thu, May 10, 2018 at 03:25:14PM +0000, Trond Myklebust wrote: > > > On Thu, 2018-05-10 at 06:12 +0000, Vallish Vaidyeshwara wrote: > > > > Server initiated socket close can corrupt connection state > > > > tracking > > > > table in conjunction with other network middle boxes. In > > > > situations > > > > like these, client connection hangs till connection state > > > > tracking > > > > table entries age out and get purged. Client reconnection with a > > > > new > > > > port in such a situation will avoid connection hang. > > > > > > > > Reviewed-by: Jacob Strauss <jsstraus@amazon.com> > > > > Reviewed-by: Alakesh Haloi <alakeshh@amazon.com> > > > > Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com> > > > > --- > > > > net/sunrpc/xprtsock.c | 5 +++++ > > > > 1 file changed, 5 insertions(+) > > > > > > > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c > > > > index 5bf75b3..d293c8d 100644 > > > > --- a/net/sunrpc/xprtsock.c > > > > +++ b/net/sunrpc/xprtsock.c > > > > @@ -1629,6 +1629,8 @@ static void xs_tcp_state_change(struct sock > > > > *sk) > > > > /* The server initiated a shutdown of the socket > > > > */ > > > > xprt->connect_cookie++; > > > > clear_bit(XPRT_CONNECTED, &xprt->state); > > > > + /* Server sent FIN, reconnect with a new port */ > > > > + transport->srcport = 0; > > > > xs_tcp_force_close(xprt); > > > > /* fall through */ > > > > case TCP_CLOSING: > > > > @@ -1650,6 +1652,9 @@ static void xs_tcp_state_change(struct sock > > > > *sk) > > > > &transport->sock_state)) > > > > xprt_clear_connecting(xprt); > > > > clear_bit(XPRT_CLOSING, &xprt->state); > > > > + /* Server sent RST, reconnect with a new port */ > > > > + if (sk->sk_err == ECONNRESET) > > > > + transport->srcport = 0; > > > > if (sk->sk_err) > > > > xprt_wake_pending_tasks(xprt, -sk- > > > > >sk_err); > > > > /* Trigger the socket release */ > > > > > > NACK. This will utterly break NFSv2, NFSv3 and NFSv4.0 duplicate > > > replay > > > cache semantics. > > > > > > Cheers > > > Trond > > > -- > > > Trond Myklebust > > > Linux NFS client maintainer, Hammerspace > > > trond.myklebust@hammerspace.com > > > > Hello Trond, > > > > The first patch in this series is actually helping restore DRC > > behavior in > > cases like network partition where packets are dropped: > > [PATCH 1/2] SUNRPC: Need to reuse non-reserved port for reconnect > > Patch 1 starts reusing port in all cases. > > > > The second patch is still not breaking DRC semantics, to quote from > > source > > code: > > > > <code snip> > > /** > > * xs_close - close a socket > > * @xprt: transport > > * > > * This is used when all requests are complete; ie, no DRC state > > remains > > * on the server we want to save. > > * > > * The caller _must_ be holding XPRT_LOCKED in order to avoid > > issues with > > * xs_reset_transport() zeroing the socket from underneath a > > writer. > > */ > > static void xs_close(struct rpc_xprt *xprt) > > <code snip> > > > > If the server has closed a connection, then no DRC state remains on > > the server > > we want to use. > > > > The second patch is exploiting this semantics and using a new port in > > following > > 2 cases: > > a) RST from server implies the connection was torn down & no useful > > DRC exists > > on server > > b) FIN from server implies that server is shutting down the > > connection as part > > of close and no DRC state remains > > > > Please let me know if I have missed something obvious, I definitely > > do not want > > to break DRC as that is not the intention of this patch series. Is > > there a > > situation where server can close a connection and still keep DRC? > > > > The DRC does not exist for the benefit of the server, but for the > benefit of the client. It is there to ensure that if the client replays > the request, then it gets the exact same reply as it should have > received when the first request was sent. Bearing that in mind: > > 1. What guarantees that all servers behave correctly w.r.t. ensuring > that they have sent all replies to any outstanding RPC call before > shutting down the connection. I'm not sure that even the Linux > server does that. > 2. How would a server even know whether or not the client may need to > replay a request? > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com Hello Trond, Thanks for the explanation, I was kind of misled by code comments in xs_close.I will probably submit a different patch to clean up these code comments which can mislead others as well. Regards, -Vallish ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination 2018-05-10 16:22 ` Vallish Vaidyeshwara 2018-05-10 17:26 ` Trond Myklebust @ 2018-05-10 17:37 ` bfields 2018-05-10 21:15 ` Vallish Vaidyeshwara 1 sibling, 1 reply; 10+ messages in thread From: bfields @ 2018-05-10 17:37 UTC (permalink / raw) To: Vallish Vaidyeshwara Cc: Trond Myklebust, anna.schumaker, linux-nfs, jlayton, jsstraus On Thu, May 10, 2018 at 04:22:02PM +0000, Vallish Vaidyeshwara wrote: > If the server has closed a connection, then no DRC state remains on the server > we want to use. That's not a rule, as far as I know. Certainly the Linux server doesn't destroy DRC state on closing a connection. (Should it? I'm not sure that's a good idea.) --b. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination 2018-05-10 17:37 ` bfields @ 2018-05-10 21:15 ` Vallish Vaidyeshwara 0 siblings, 0 replies; 10+ messages in thread From: Vallish Vaidyeshwara @ 2018-05-10 21:15 UTC (permalink / raw) To: bfields; +Cc: Trond Myklebust, anna.schumaker, linux-nfs, jlayton, jsstraus On Thu, May 10, 2018 at 01:37:28PM -0400, bfields@fieldses.org wrote: > On Thu, May 10, 2018 at 04:22:02PM +0000, Vallish Vaidyeshwara wrote: > > If the server has closed a connection, then no DRC state remains on the server > > we want to use. > > That's not a rule, as far as I know. Certainly the Linux server doesn't > destroy DRC state on closing a connection. (Should it? I'm not sure > that's a good idea.) > > --b. > Thanks for the clarification and I agree patch 2 should not be applied. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2018-05-10 21:18 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-05-10 6:12 [PATCH 0/2] SUNRPC: Clean up port reuse behavior on reconnects Vallish Vaidyeshwara 2018-05-10 6:12 ` [PATCH 1/2] SUNRPC: Need to reuse non-reserved port for reconnect Vallish Vaidyeshwara 2018-05-10 21:18 ` Vallish Vaidyeshwara 2018-05-10 6:12 ` [PATCH 2/2] SUNRPC: Reconnect with new port on server initiated connection termination Vallish Vaidyeshwara 2018-05-10 15:25 ` Trond Myklebust 2018-05-10 16:22 ` Vallish Vaidyeshwara 2018-05-10 17:26 ` Trond Myklebust 2018-05-10 21:12 ` Vallish Vaidyeshwara 2018-05-10 17:37 ` bfields 2018-05-10 21:15 ` Vallish Vaidyeshwara
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.