* NFS/TCP timeouts
From: Olga Kornievskaia @ 2018-10-03 18:31 UTC
To: linux-nfs

Hi folks,

Is it true that the NFS mount option "timeo" has nothing to do with the
socket's setting of the user-specified timeout, TCP_USER_TIMEOUT?
Instead, when creating a TCP socket, NFS uses either a default/hard-coded
value of 60s for v3 or, for v4.x, a lease-based value. Is there no value
in having an adjustable TCP timeout value?

Thank you.

* Re: NFS/TCP timeouts
From: Trond Myklebust @ 2018-10-03 18:45 UTC
To: linux-nfs, aglo

On Wed, 2018-10-03 at 14:31 -0400, Olga Kornievskaia wrote:
> Hi folks,
>
> Is it true that NFS mount option "timeo" has nothing to do with the
> socket's setting of the user-specified timeout TCP_USER_TIMEOUT.
> Instead, when creating a TCP socket NFS uses either default/hard coded
> value of 60s for v3 or for v4.x it's lease based. Is there no value is
> having an adjustable TCP timeout value?
>

It is adjusted. Please see the calculation in
xs_tcp_set_socket_timeouts().

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

* Re: NFS/TCP timeouts
From: Olga Kornievskaia @ 2018-10-03 19:06 UTC
To: Trond Myklebust; +Cc: linux-nfs

On Wed, Oct 3, 2018 at 2:45 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> It is adjusted. Please see the calculation in
> xs_tcp_set_socket_timeouts().

But it's not user configurable, is it? I don't see a way to modify
v3's default 60s TCP timeout. Also, in v4 the timeouts are set from
xs_tcp_set_connect_timeout() for the lease period, but again they are
not user configurable, as far as I can tell.

* Re: NFS/TCP timeouts
From: Olga Kornievskaia @ 2019-12-11 20:36 UTC
To: Trond Myklebust; +Cc: linux-nfs

Hi Trond,

I'd like to raise this once again. Is it true that the timeout limit
(TCP_USER_TIMEOUT) is not user configurable? (I'm pretty sure it is
not.) My question is: why shouldn't it be tied to the "timeo" mount
option? Right now, only the session/lease manager thread sets it, via
rpc_set_connect_timeout(), to be lease period related.

Is it that we don't want to allow the user to control TCP settings via
the mount options? Somehow folks expect to be able to set a low "timeo"
value and have a (dead) connection be considered dead earlier than
after the rather long timeout period that applies now.

Thanks.

* Re: NFS/TCP timeouts
From: Trond Myklebust @ 2019-12-12 16:47 UTC
To: aglo; +Cc: linux-nfs

Hi Olga,

On Wed, 2019-12-11 at 15:36 -0500, Olga Kornievskaia wrote:
> Hi Trond,
>
> I'd like to raise this once again. Is it true that the timeout limit
> (TCP_USER_TIMEOUT) is not user configurable? (I'm pretty sure it is
> not.) My question is: why shouldn't it be tied to the "timeo" mount
> option? Right now, only the session/lease manager thread sets it, via
> rpc_set_connect_timeout(), to be lease period related.
>
> Is it that we don't want to allow the user to control TCP settings via
> the mount options? Somehow folks expect to be able to set a low "timeo"
> value and have a (dead) connection be considered dead earlier than
> after the rather long timeout period that applies now.

In my mind, the two are correlated, but are not equivalent.

The 'timeo' value is basically a timeout for how long it takes for the
whole process of "send RPC call", "have it processed by the server" and
"receive reply".
IOW: 'timeo' is about how long it takes for an RPC call to execute
end-to-end.

The TCP_USER_TIMEOUT is essentially a timeout for how long it takes
the server to ACK receipt of the RPC call once we've placed it in the
TCP socket.
IOW: it is a timeout for the networking part of an RPC call
transmission.

So, as I said, the two are correlated: if the server is down, then your
timeout is dominated by the fact that the network transmission never
completes. However, if the server is up and congested, then the
"processing by the server" is likely to dominate.

The other thing to note is that if the TCP connection is unresponsive,
we may want to fail that much faster in order to give ourselves a
chance to close the connection, open a new one and retransmit the
requests from the old connection before the 'timeo' is triggered (since
in the case of a soft timeout, that could be a fatal error).

Does that make sense?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

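The distinction drawn above between the two timers can be illustrated
from user space. The following is a minimal Python sketch, not the
kernel RPC client: it assumes a Linux host (where the TCP_USER_TIMEOUT
socket option exists), and the helper name make_rpc_style_socket and
the example values are hypothetical. The per-call deadline loosely plays
the role of 'timeo' (waiting end-to-end for a reply), while
TCP_USER_TIMEOUT bounds how long transmitted data may stay
unacknowledged before the TCP layer drops the connection.

```python
import socket

# Linux value of TCP_USER_TIMEOUT; used as a fallback if this Python
# build does not export the constant (assumption: the host is Linux).
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def make_rpc_style_socket(call_deadline_s, user_timeout_ms):
    """Return an unconnected TCP socket with both timers configured.

    call_deadline_s: how long a blocking socket call may wait, loosely
        analogous to the end-to-end 'timeo' of an RPC call.
    user_timeout_ms: how long sent data may remain unACKed before the
        TCP layer itself aborts the connection.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Application-level timer: applies to each blocking socket call.
    sock.settimeout(call_deadline_s)
    # Transport-level timer: enforced by the TCP stack, in milliseconds.
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, user_timeout_ms)
    return sock


if __name__ == "__main__":
    s = make_rpc_style_socket(call_deadline_s=60.0, user_timeout_ms=180_000)
    print(s.gettimeout())
    print(s.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT))
    s.close()
```

The two values are set independently here, which mirrors the point that
the timers are correlated but not equivalent: a congested but reachable
server would trip only the call deadline, while a dead peer would
eventually trip the transport-level timer.
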
* Re: NFS/TCP timeouts
From: Olga Kornievskaia @ 2019-12-12 18:13 UTC
To: Trond Myklebust; +Cc: linux-nfs

On Thu, Dec 12, 2019 at 11:47 AM Trond Myklebust <trondmy@hammerspace.com> wrote:
> In my mind, the two are correlated, but are not equivalent.
>
> The 'timeo' value is basically a timeout for how long it takes for the
> whole process of "send RPC call", "have it processed by the server" and
> "receive reply".
> IOW: 'timeo' is about how long it takes for an RPC call to execute
> end-to-end.

OK, but what happens is that no actions (connection-wise) are taken
when this timeout goes off, and that's a problem for detecting bad
connections.

> The TCP_USER_TIMEOUT is essentially a timeout for how long it takes
> the server to ACK receipt of the RPC call once we've placed it in the
> TCP socket.
> IOW: it is a timeout for the networking part of an RPC call
> transmission.

But why is the TCP timeout (1) not user configurable and/or (2) not
tied to "timeo"?

> So, as I said, the two are correlated: if the server is down, then your
> timeout is dominated by the fact that the network transmission never
> completes. However, if the server is up and congested, then the
> "processing by the server" is likely to dominate.
>
> The other thing to note is that if the TCP connection is unresponsive,
> we may want to fail that much faster in order to give ourselves a
> chance to close the connection, open a new one and retransmit the
> requests from the old connection before the 'timeo' is triggered (since
> in the case of a soft timeout, that could be a fatal error).

"We may want to fail" doesn't happen, and that's exactly what I would
like to happen. Also, the TCP timeout is set to the lease time (take a
Linux server, which uses a 90s lease time), and that's larger than the
default "timeo", which is 60s. That goes against your intention to
recover in time.

> Does that make sense?

It's the last case I'm interested in. The issue I'm having is that
after a "timeout" (which should be a lease period), the client doesn't
send a SYN to try to establish a new connection.

Here's the current problem. In a cloud environment, a server node goes
down. It's spun up again in a different VM (but with the same IP), and
the server is ready to receive requests and continue with the IO. The
problem is that the client doesn't try to send a new SYN until the old
connection times out. This timeout is 3 minutes for v3 and can't be
shortened, because TCP_USER_TIMEOUT isn't user configurable or tied
into timeo. But the user expects connections to time out after 60s (the
default timeo), or after whatever timeo value is specified during
mount. The current Linux client doesn't do that.

Even in v4, in my testing, the client doesn't send the new SYN after
the lease period (but I believe that's a bug). The only time it does so
is if I change rpc_set_connect_timeout() to something low so that
default of 18000 is set.

(1) I could be wrong, but I think there is a bug where the client
doesn't re-establish the connection (unless some low value is set).
(2) I think there should be the ability (at least for v3) to set the
timeout to lower than 3 minutes. Perhaps we can add a new mount option:
either a totally separate TCP timeout value, or something like
"sync_nfstcp_timeouts" that uses timeo to govern both the NFS and TCP
timeouts.

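The 60s and 3-minute figures mentioned in this thread are consistent
with a user timeout derived as the initial RPC timeout multiplied by the
number of transmission attempts. The sketch below reproduces that
arithmetic in Python. It is an assumption about how
xs_tcp_set_socket_timeouts() arrives at its values, not a copy of the
kernel code; the names SocketTimeouts and derive_socket_timeouts, and
the retry count of 2, are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketTimeouts:
    keepidle_s: int       # idle time before the first keepalive probe
    keepcnt: int          # number of unanswered probes tolerated
    user_timeout_ms: int  # candidate TCP_USER_TIMEOUT value


def derive_socket_timeouts(initval_s, retries):
    """Derive TCP timers from an RPC timeout (assumed formula).

    Assumption: the connection is given as long as the RPC layer would
    spend on (retries + 1) attempts of initval_s seconds each.
    """
    attempts = retries + 1
    return SocketTimeouts(
        keepidle_s=initval_s,
        keepcnt=attempts,
        user_timeout_ms=initval_s * 1000 * attempts,
    )


if __name__ == "__main__":
    # Assumed v3-style default: 60s initial timeout, 2 retries.
    print(derive_socket_timeouts(60, 2))
    # Hypothetical: what a 10s timeout would give if it drove the same formula.
    print(derive_socket_timeouts(10, 2))
```

With 60s and two retries this yields 180000 ms, matching the 3 minutes
reported above for v3. The second call only illustrates what tying the
TCP timer to a shorter, user-chosen value would look like under the same
assumed formula.
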
* Re: NFS/TCP timeouts
From: Trond Myklebust @ 2019-12-12 19:31 UTC
To: aglo; +Cc: linux-nfs

On Thu, 2019-12-12 at 13:13 -0500, Olga Kornievskaia wrote:
> On Thu, Dec 12, 2019 at 11:47 AM Trond Myklebust
> <trondmy@hammerspace.com> wrote:
> > IOW: 'timeo' is about how long it takes for an RPC call to execute
> > end-to-end.
>
> OK, but what happens is that no actions (connection-wise) are taken
> when this timeout goes off, and that's a problem for detecting bad
> connections.

I'm not sure I understand what you mean. The point of TCP_USER_TIMEOUT
is that the TCP layer is told when to time out and break the
connection. Furthermore, the other side (i.e. the server) is told about
the existence of this timeout, and hence knows what to expect.

IOW: there are no actions at the RPC layer because this is a TCP layer
thing.

> > Does that make sense?
>
> It's the last case I'm interested in. The issue I'm having is that
> after a "timeout" (which should be a lease period), the client doesn't
> send a SYN to try to establish a new connection.

TCP_USER_TIMEOUT should not affect the handshake part of the TCP
connection (see 'man 7 tcp'). It can't solve a problem with the SYN
states.

> Here's the current problem. In a cloud environment, a server node goes
> down. It's spun up again in a different VM (but with the same IP), and
> the server is ready to receive requests and continue with the IO. The
> problem is that the client doesn't try to send a new SYN until the old
> connection times out. This timeout is 3 minutes for v3 and can't be
> shortened, because TCP_USER_TIMEOUT isn't user configurable or tied
> into timeo. But the user expects connections to time out after 60s (the
> default timeo), or after whatever timeo value is specified during
> mount. The current Linux client doesn't do that.
>
> Even in v4, in my testing, the client doesn't send the new SYN after
> the lease period (but I believe that's a bug). The only time it does so
> is if I change rpc_set_connect_timeout() to something low so that
> default of 18000 is set.
>
> (1) I could be wrong, but I think there is a bug where the client
> doesn't re-establish the connection (unless some low value is set).
> (2) I think there should be the ability (at least for v3) to set the
> timeout to lower than 3 minutes. Perhaps we can add a new mount option:
> either a totally separate TCP timeout value, or something like
> "sync_nfstcp_timeouts" that uses timeo to govern both the NFS and TCP
> timeouts.

This needs to be resolved using something different. I'm not sure what
to use for timing the handshake out more quickly.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

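On the question of how long an unanswered handshake takes to fail: for
ordinary Linux sockets this is governed by SYN retransmission rather
than by TCP_USER_TIMEOUT. The sketch below estimates the give-up time,
assuming a 1-second initial retransmission timeout that doubles after
each attempt, which is how the kernel documentation for the
tcp_syn_retries sysctl describes it. The function name
syn_giveup_seconds is hypothetical, and whether the kernel RPC client
should tune anything along these lines is left open in the thread.

```python
def syn_giveup_seconds(syn_retries, initial_rto_s=1):
    """Estimate seconds until connect() fails when no SYN-ACK arrives.

    Assumes exponential backoff: the initial SYN waits initial_rto_s,
    and each of the syn_retries retransmissions waits twice as long as
    the previous attempt. The kernel's cap on the maximum retransmission
    timeout is ignored, so large retry counts are overestimated.
    """
    if syn_retries < 0:
        raise ValueError("syn_retries must be non-negative")
    # Waits are rto, 2*rto, 4*rto, ... for (syn_retries + 1) attempts,
    # which sums to rto * (2**(syn_retries + 1) - 1).
    return initial_rto_s * (2 ** (syn_retries + 1) - 1)


if __name__ == "__main__":
    for retries in (1, 3, 6):
        print(retries, syn_giveup_seconds(retries))
```

Under these assumptions, six retries (the documented Linux default)
give 127 seconds, so an unanswered handshake alone can take longer than
the 60s that users expect from the default timeo.
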