* question: re-try of operations in PNFS
@ 2018-05-17 20:43 Olga Kornievskaia
  2018-05-22 20:34 ` Mkrtchyan, Tigran
  0 siblings, 1 reply; 9+ messages in thread
From: Olga Kornievskaia @ 2018-05-17 20:43 UTC (permalink / raw)
  To: linux-nfs

Hi Trond,

Is there a reason why an RPC connection to the DS is set to time out
requests instead of waiting for the reply from the server? Requests
to the DS time out in 10 seconds and are resent to the MDS.

Thank you.

* Re: question: re-try of operations in PNFS
  2018-05-17 20:43 question: re-try of operations in PNFS Olga Kornievskaia
@ 2018-05-22 20:34 ` Mkrtchyan, Tigran
  2018-05-22 20:45   ` Olga Kornievskaia
  0 siblings, 1 reply; 9+ messages in thread
From: Mkrtchyan, Tigran @ 2018-05-22 20:34 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

Hi Olga,

We saw similar issues with early versions of the RHEL6 kernel, but this was fixed in later versions,
and it is now possible to set the timeout with

dataserver_timeo and dataserver_retrans

bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1175413

With which kernel do you observe it?

Regards,
   Tigran.


* Re: question: re-try of operations in PNFS
  2018-05-22 20:34 ` Mkrtchyan, Tigran
@ 2018-05-22 20:45   ` Olga Kornievskaia
  2018-05-22 21:01     ` Mkrtchyan, Tigran
                       ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Olga Kornievskaia @ 2018-05-22 20:45 UTC (permalink / raw)
  To: Mkrtchyan, Tigran; +Cc: linux-nfs

On Tue, May 22, 2018 at 4:34 PM, Mkrtchyan, Tigran
<tigran.mkrtchyan@desy.de> wrote:
> Hi Olga,
>
> we saw similar issues with early version of RHEL6 kernels, but this was fixed in the later version.
> and it's possible now to set timeout with
>
> dataserver_timeo and dataserver_retrans
>
> bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1175413
>
> Which which kernel do you observe it?

Upstream kernel. But I'm arguing that there shouldn't be a need to
specify dataserver_timeo at all, because DS requests shouldn't time
out, just like MDS operations.

Also, curiously, "man nfs" doesn't list a "dataserver_timeo" option, and
when I try to use it on a RHEL 7.4 machine it is rejected as an incorrect
option. Grepping through the upstream kernel code for "dataserver_timeo"
comes up empty too.


* Re: question: re-try of operations in PNFS
  2018-05-22 20:45   ` Olga Kornievskaia
@ 2018-05-22 21:01     ` Mkrtchyan, Tigran
  2018-05-23  0:26     ` Rick Macklem
  2018-05-30 10:47     ` Suresh Jayaraman
  2 siblings, 0 replies; 9+ messages in thread
From: Mkrtchyan, Tigran @ 2018-05-22 21:01 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

Agreed, there shouldn't be an extra option for that, and newer kernels have probably dropped
it. If I recall correctly, the option was there to protect the client from pNFS bugs, e.g. to
fall back to the MDS if the client thinks that something went wrong, so that the client stays happy.

I haven't seen that problem for quite some time. However, on modern kernels (and RHEL 7) we prefer
to use the flexfiles pNFS layout. One of the reasons is the option to avoid I/O through the MDS.

I will try to reproduce it.

Regards,
   Tigran.


* Re: question: re-try of operations in PNFS
  2018-05-22 20:45   ` Olga Kornievskaia
  2018-05-22 21:01     ` Mkrtchyan, Tigran
@ 2018-05-23  0:26     ` Rick Macklem
  2018-05-23 13:25       ` Olga Kornievskaia
  2018-05-30 10:47     ` Suresh Jayaraman
  2 siblings, 1 reply; 9+ messages in thread
From: Rick Macklem @ 2018-05-23  0:26 UTC (permalink / raw)
  To: Olga Kornievskaia, Mkrtchyan, Tigran; +Cc: linux-nfs

Olga Kornievskaia wrote:
[good stuff snipped]
>Upstream kernel. But I'm arguing that there shouldn't be a need to
>specify a dataserver_timeo because it shouldn't timeout at all just
>like MDS operations.
If/when the server is providing mirrored DSs, I've found this timeout useful
in the FreeBSD client since it allows the client to detect a DS failure.
It can then report the failure to the MDS via LayoutReturn (or another one
on NFSv4.2 which I can't remember the name of since I haven't done 4.2;-).

For non-mirrored DSs, the only thing I can think of (I've never seen this) would
be some sort of network partitioning such that the client can't reach the DS but
can reach the MDS.

I have no idea if this is relevant to Linux, but thought I'd mention it, just in case.
[more stuff snipped]
rick
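
The failure-detection idea above can be sketched roughly as follows. This is only an illustrative Python model, not the FreeBSD or Linux client code: `write_to_mirrors` and the per-mirror `send` callables are hypothetical names, and a DS that does not answer in time is modelled as a callable raising `TimeoutError`.

```python
def write_to_mirrors(mirrors, data, timeout):
    """Send data to every mirrored DS and note which ones timed out.

    mirrors: dict mapping a DS name to a hypothetical callable
             send(data, timeout) that raises TimeoutError on no reply.
    Returns (ok, failed): lists of DS names.
    """
    ok, failed = [], []
    for name, send in mirrors.items():
        try:
            send(data, timeout)
            ok.append(name)
        except TimeoutError:
            # Without a timeout the client would block here forever and
            # never learn that this mirror is down.
            failed.append(name)
    return ok, failed
```

The `failed` list is what a client would then report back to the MDS (via LayoutReturn in the description above) so that the MDS can stop handing out layouts for the broken mirror.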

* Re: question: re-try of operations in PNFS
  2018-05-23  0:26     ` Rick Macklem
@ 2018-05-23 13:25       ` Olga Kornievskaia
  2018-05-23 13:42         ` Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Olga Kornievskaia @ 2018-05-23 13:25 UTC (permalink / raw)
  To: Rick Macklem; +Cc: Mkrtchyan, Tigran, linux-nfs

On Tue, May 22, 2018 at 8:26 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> Olga Kornievskaia wrote:
> [good stuff snipped]
>>Upstream kernel. But I'm arguing that there shouldn't be a need to
>>specify a dataserver_timeo because it shouldn't timeout at all just
>>like MDS operations.
> If/when the server is providing mirrored DSs, I've found this timeout useful
> in the FreeBSD client since it allows the client to detect a DS failure.
> It can then report the failure to the MDS via LayoutReturn (or another one
> on NFSv4.2 which I can't remember the name of since I haven't done 4.2;-).
>
> For non-mirrored DSs, the only thing I can think of (I've never seen this) would
> be some sort of network partitioning such that the client can't reach the DS but
> can reach the MDS.
>
> I have no idea if this is relevant to Linux, but thought I'd mention it, just in case.
> [more stuff snipped]

Doesn't retrying make the implementation not spec compliant?

* Re: question: re-try of operations in PNFS
  2018-05-23 13:25       ` Olga Kornievskaia
@ 2018-05-23 13:42         ` Trond Myklebust
  2018-05-23 15:16           ` Olga Kornievskaia
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2018-05-23 13:42 UTC (permalink / raw)
  To: aglo, rmacklem; +Cc: tigran.mkrtchyan, linux-nfs

On Wed, 2018-05-23 at 09:25 -0400, Olga Kornievskaia wrote:
> On Tue, May 22, 2018 at 8:26 PM, Rick Macklem <rmacklem@uoguelph.ca>
> wrote:
> > Olga Kornievskaia wrote:
> > [good stuff snipped]
> > > Upstream kernel. But I'm arguing that there shouldn't be a need
> > > to
> > > specify a dataserver_timeo because it shouldn't timeout at all
> > > just
> > > like MDS operations.
> >
> > If/when the server is providing mirrored DSs, I've found this
> > timeout useful
> > in the FreeBSD client since it allows the client to detect a DS
> > failure.
> > It can then report the failure to the MDS via LayoutReturn (or
> > another one
> > on NFSv4.2 which I can't remember the name of since I haven't done
> > 4.2;-).
> >
> > For non-mirrored DSs, the only thing I can think of (I've never
> > seen this) would
> > be some sort of network partitioning such that the client can't
> > reach the DS but
> > can reach the MDS.
> >
> > I have no idea if this is relevant to Linux, but thought I'd
> > mention it, just in case.
> > [more stuff snipped]
>
> Isn't retrying makes the implementation not spec compliant?

Replaying a request would not be spec compliant. Playing new requests
is perfectly fine (e.g. after picking up a new layout or redirecting
the I/O to the MDS).

Historically, I seem to remember that at one point we introduced a 15s
timeout on I/O requests to the DS in order to allow fast failover of
the pNFS client when the DS was down or unresponsive. I'm not sure
whether or not that mechanism still exists and whether it is what you
are seeing here.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

* Re: question: re-try of operations in PNFS
  2018-05-23 13:42         ` Trond Myklebust
@ 2018-05-23 15:16           ` Olga Kornievskaia
  0 siblings, 0 replies; 9+ messages in thread
From: Olga Kornievskaia @ 2018-05-23 15:16 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: rmacklem, tigran.mkrtchyan, linux-nfs

On Wed, May 23, 2018 at 9:42 AM, Trond Myklebust
<trondmy@hammerspace.com> wrote:
> On Wed, 2018-05-23 at 09:25 -0400, Olga Kornievskaia wrote:
>> On Tue, May 22, 2018 at 8:26 PM, Rick Macklem <rmacklem@uoguelph.ca>
>> wrote:
>> > Olga Kornievskaia wrote:
>> > [good stuff snipped]
>> > > Upstream kernel. But I'm arguing that there shouldn't be a need
>> > > to
>> > > specify a dataserver_timeo because it shouldn't timeout at all
>> > > just
>> > > like MDS operations.
>> >
>> > If/when the server is providing mirrored DSs, I've found this
>> > timeout useful
>> > in the FreeBSD client since it allows the client to detect a DS
>> > failure.
>> > It can then report the failure to the MDS via LayoutReturn (or
>> > another one
>> > on NFSv4.2 which I can't remember the name of since I haven't done
>> > 4.2;-).
>> >
>> > For non-mirrored DSs, the only thing I can think of (I've never
>> > seen this) would
>> > be some sort of network partitioning such that the client can't
>> > reach the DS but
>> > can reach the MDS.
>> >
>> > I have no idea if this is relevant to Linux, but thought I'd
>> > mention it, just in case.
>> > [more stuff snipped]
>>
>> Isn't retrying makes the implementation not spec compliant?
>
> Replaying a request would not be spec compliant. Playing new requests
> is perfectly fine (e.g. after picking up a new layout or redirecting
> the I/O to the MDS).

I see, you are right. The request to the MDS is a "new request" since it
uses a different filehandle.

> Historically, I seem to remember that at one point we introduced a 15s
> timeout on I/O requests to the DS in order to allow fast failover of
> the pNFS client when the DS was down or unresponsive. I'm not sure
> whether or not that mechanism still exists and whether it is what you
> are seeing here.

Then I'd guess that is probably it, and the timeout is now 10s.
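
The behaviour discussed above (time out the DS request, then redirect the I/O to the MDS as a new request) can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the Linux client implementation: `send_to_ds`, `send_to_mds`, and `DS_TIMEOUT` are hypothetical names, a DS that does not answer in time is modelled as raising `TimeoutError`, and the 10 s value is just the figure observed in this thread.

```python
DS_TIMEOUT = 10.0  # seconds; the value observed in this thread (assumption)


def read_with_fallback(send_to_ds, send_to_mds, offset, length,
                       timeout=DS_TIMEOUT):
    """Try the data server first; on timeout, send a new request to the MDS.

    send_to_ds(offset, length, timeout) and send_to_mds(offset, length)
    are hypothetical callables standing in for the two I/O paths.
    Returns (source, data), where source is "ds" or "mds".
    """
    try:
        return "ds", send_to_ds(offset, length, timeout)
    except TimeoutError:
        # The DS request is abandoned, not replayed. The MDS request is
        # built from scratch (in the real protocol it uses a different
        # filehandle), which is why it counts as a new request.
        return "mds", send_to_mds(offset, length)
```

The point of the sketch is the `except` branch: nothing is resent to the same server, so the fallback is not a replay of the original request.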

* Re: question: re-try of operations in PNFS
  2018-05-22 20:45   ` Olga Kornievskaia
  2018-05-22 21:01     ` Mkrtchyan, Tigran
  2018-05-23  0:26     ` Rick Macklem
@ 2018-05-30 10:47     ` Suresh Jayaraman
  2 siblings, 0 replies; 9+ messages in thread
From: Suresh Jayaraman @ 2018-05-30 10:47 UTC (permalink / raw)
  To: Olga Kornievskaia, Mkrtchyan, Tigran; +Cc: linux-nfs

On 05/23/2018 02:15 AM, Olga Kornievskaia wrote:
>> we saw similar issues with early version of RHEL6 kernels, but this was fixed in the later version.
>> and it's possible now to set timeout with
>>
>> dataserver_timeo and dataserver_retrans
>>
>> bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1175413
>>
>> Which which kernel do you observe it?
> 
> Upstream kernel. But I'm arguing that there shouldn't be a need to
> specify a dataserver_timeo because it shouldn't timeout at all just
> like MDS operations.
> 
> Also curiously, "man nfs" doesn't list "datasever_timeo" option and
> when I try to use it on a RHEL7.4 machine it says incorrect option.
> Also grep thru the upstream kernel code for "dataserver_timeo" is
> empty too.
> 

I still see these options (as module parameters of the
nfs_layout_nfsv41_files module) in the mainline kernel (4.17-rc7).
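
For anyone who wants to check a running system, a small Python sketch like the one below could read the current values. It assumes the module is loaded and that the parameters are exported read-only or read-write under the usual `/sys/module/<module>/parameters/` location; `read_ds_params` is a hypothetical helper name, and the raw values are printed without interpreting their units.

```python
from pathlib import Path

MODULE = "nfs_layout_nfsv41_files"
PARAMS = ("dataserver_timeo", "dataserver_retrans")


def read_ds_params(sysfs_root="/sys/module"):
    """Return {param: raw string value, or None if it cannot be read}."""
    param_dir = Path(sysfs_root) / MODULE / "parameters"
    values = {}
    for name in PARAMS:
        try:
            values[name] = (param_dir / name).read_text().strip()
        except OSError:
            # Module not loaded, parameter not exported, or no permission.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_ds_params().items():
        print(f"{name} = {value if value is not None else 'not available'}")
```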

We are facing a problem with I/O being routed through the MDS when a DS is
momentarily unavailable (e.g. during a DS restart or DS failover). Has
anyone found this timeout helpful when the network connection goes down
as part of a DS failover, for instance? In the past, we observed I/O
being routed through the MDS immediately after a DS restart, with the
MDS not in a position to complete the I/O.


Regards,
Suresh
