* [PATCH 1/2] nfsv4: handle ENOSPC during create session
@ 2018-06-21 16:35 Manjunath Patil
  2018-06-21 16:35 ` [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot Manjunath Patil
  2018-06-21 17:04 ` [PATCH 1/2] nfsv4: handle ENOSPC during create session Trond Myklebust
  0 siblings, 2 replies; 18+ messages in thread
From: Manjunath Patil @ 2018-06-21 16:35 UTC (permalink / raw)
  To: linux-nfs; +Cc: manjunath.b.patil

Presently the client mount hangs on an NFS4ERR_NOSPC response from the
server during the create session operation. Handle this error on the
client side and pass it back to user-space, which may choose to mount
with a lower NFS version.

Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
---
 fs/nfs/nfs4state.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 2bf2eaa..2134cf5 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -381,6 +381,8 @@ int nfs41_discover_server_trunking(struct nfs_client *clp,
 	}
 	nfs4_schedule_state_manager(clp);
 	status = nfs_wait_client_init_complete(clp);
+	if (!status)	/* -ERESTARTSYS */
+		status = nfs_client_init_status(clp);
 	if (status < 0)
 		nfs_put_client(clp);
 	return status;
@@ -1919,6 +1921,9 @@ static int nfs4_handle_reclaim_lease_error(struct nfs_client *clp, int status)
 		dprintk("%s: exit with error %d for server %s\n",
 				__func__, -EPROTONOSUPPORT, clp->cl_hostname);
 		return -EPROTONOSUPPORT;
+	case -NFS4ERR_NOSPC:
+		nfs_mark_client_ready(clp, status);
+		/* fall through */
 	case -NFS4ERR_NOT_SAME: /* FixMe: implement recovery
 				 * in nfs4_exchange_id */
 	default:
@@ -2186,6 +2191,7 @@ int nfs4_discover_server_trunking(struct nfs_client *clp,
 	case 0:
 	case -EINTR:
 	case -ERESTARTSYS:
+	case -NFS4ERR_NOSPC:
 		break;
 	case -ETIMEDOUT:
 		if (clnt->cl_softrtry)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread
* [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
  2018-06-21 16:35 [PATCH 1/2] nfsv4: handle ENOSPC during create session Manjunath Patil
@ 2018-06-21 16:35 ` Manjunath Patil
  2018-06-22 17:54   ` J. Bruce Fields
  2018-06-21 17:04 ` [PATCH 1/2] nfsv4: handle ENOSPC during create session Trond Myklebust
  1 sibling, 1 reply; 18+ messages in thread
From: Manjunath Patil @ 2018-06-21 16:35 UTC (permalink / raw)
  To: linux-nfs; +Cc: manjunath.b.patil

Presently nfsd returns nfserr_jukebox for a create_session request if
the server is unable to allocate a session slot. This is treated as
NFS4ERR_DELAY by clients, which may continue to retry create_session in
a loop, leaving NFSv4.1+ mounts in a hung state. nfsd should return
nfserr_nospc in this case, as per RFC 5661 (section 18.36.4, subpoint
4, "Session creation").

Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
---
 fs/nfsd/nfs4state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8571414..3734e08 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2716,7 +2716,7 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
 	 */
 	ca->maxreqs = nfsd4_get_drc_mem(ca);
 	if (!ca->maxreqs)
-		return nfserr_jukebox;
+		return nfserr_nospc;
 
 	return nfs_ok;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
  2018-06-21 16:35 ` [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot Manjunath Patil
@ 2018-06-22 17:54   ` J. Bruce Fields
  2018-06-22 21:49     ` Chuck Lever
  ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: J. Bruce Fields @ 2018-06-22 17:54 UTC (permalink / raw)
  To: Manjunath Patil; +Cc: linux-nfs

On Thu, Jun 21, 2018 at 04:35:33PM +0000, Manjunath Patil wrote:
> Presently nfserr_jukebox is being returned by nfsd for create_session
> request if server is unable to allocate a session slot. This may be
> treated as NFS4ERR_DELAY by the clients and which may continue to re-try
> create_session in loop leading NFSv4.1+ mounts in hung state. nfsd
> should return nfserr_nospc in this case as per rfc5661(section-18.36.4
> subpoint 4. Session creation).

I don't think the spec actually gives us an error that we can use to say
a CREATE_SESSION failed permanently for lack of resources.

Better would be to avoid the need to fail at all. Possibilities:

 - revive Trond's patches some time back to do dynamic slot size
   renegotiation
 - make sure the systems you're testing on already have de766e570413
   and 44d8660d3bb0 applied.
 - further liberalise the limits here: do we need them at all, or
   should we just wait till a kmalloc fails?

Or maybe take a hybrid approach: e.g. allow an arbitrary number of
clients and only limit slots & slotsizes.

--b.
> 
> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
> ---
>  fs/nfsd/nfs4state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 8571414..3734e08 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2716,7 +2716,7 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
>  	 */
>  	ca->maxreqs = nfsd4_get_drc_mem(ca);
>  	if (!ca->maxreqs)
> -		return nfserr_jukebox;
> +		return nfserr_nospc;
> 
>  	return nfs_ok;
>  }
> -- 
> 1.8.3.1
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
  2018-06-22 17:54 ` J. Bruce Fields
@ 2018-06-22 21:49   ` Chuck Lever
  2018-06-22 22:31     ` Trond Myklebust
  2018-06-24 20:26     ` J. Bruce Fields
  2018-07-09 14:25   ` J. Bruce Fields
  2 siblings, 1 reply; 18+ messages in thread
From: Chuck Lever @ 2018-06-22 21:49 UTC (permalink / raw)
  To: Bruce Fields; +Cc: Manjunath Patil, Linux NFS Mailing List

Hi Bruce-

> On Jun 22, 2018, at 1:54 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Thu, Jun 21, 2018 at 04:35:33PM +0000, Manjunath Patil wrote:
>> Presently nfserr_jukebox is being returned by nfsd for create_session
>> request if server is unable to allocate a session slot. This may be
>> treated as NFS4ERR_DELAY by the clients and which may continue to re-try
>> create_session in loop leading NFSv4.1+ mounts in hung state. nfsd
>> should return nfserr_nospc in this case as per rfc5661(section-18.36.4
>> subpoint 4. Session creation).
> 
> I don't think the spec actually gives us an error that we can use to say
> a CREATE_SESSION failed permanently for lack of resources.

The current situation is that the server replies NFS4ERR_DELAY,
and the client retries indefinitely. The goal is to let the
client choose whether it wants to try the CREATE_SESSION again,
try a different NFS version, or fail the mount request.

Bill and I both looked at this section of RFC 5661. It seems to
us that the use of NFS4ERR_NOSPC is appropriate and unambiguous
in this situation, and it is an allowed status for the
CREATE_SESSION operation. NFS4ERR_DELAY OTOH is not helpful.

> Better would be to avoid the need to fail at all.

That would be good too.

> Possibilities:
> 
> - revive Trond's patches some time back to do dynamic slot size
>   renegotiation
> - make sure the systems you're testing on already have
>   de766e570413 and 44d8660d3bb0 applied.
> - further liberalise the limits here: do we need them at all, or
>   should we just wait till a kmalloc fails? Or maybe take a
>   hybrid approach?: e.g. allow an arbitrary number of clients
>   and only limit slots & slotsizes.
> --b.
> 
>> 
>> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
>> ---
>> fs/nfsd/nfs4state.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index 8571414..3734e08 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -2716,7 +2716,7 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
>> */
>> ca->maxreqs = nfsd4_get_drc_mem(ca);
>> if (!ca->maxreqs)
>> - return nfserr_jukebox;
>> + return nfserr_nospc;
>> 
>> return nfs_ok;
>> }
>> -- 
>> 1.8.3.1

-- 
Chuck Lever
chucklever@gmail.com

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
  2018-06-22 21:49   ` Chuck Lever
@ 2018-06-22 22:31     ` Trond Myklebust
  2018-06-22 23:10       ` Trond Myklebust
  2018-06-23 19:00       ` Chuck Lever
  0 siblings, 2 replies; 18+ messages in thread
From: Trond Myklebust @ 2018-06-22 22:31 UTC (permalink / raw)
  To: bfields, chucklever; +Cc: linux-nfs, manjunath.b.patil

On Fri, 2018-06-22 at 17:49 -0400, Chuck Lever wrote:
> Hi Bruce-
> 
> > On Jun 22, 2018, at 1:54 PM, J. Bruce Fields <bfields@fieldses.org>
> > wrote:
> > 
> > On Thu, Jun 21, 2018 at 04:35:33PM +0000, Manjunath Patil wrote:
> > > Presently nfserr_jukebox is being returned by nfsd for
> > > create_session request if server is unable to allocate a session
> > > slot. [...]
> > 
> > I don't think the spec actually gives us an error that we can use
> > to say a CREATE_SESSION failed permanently for lack of resources.
> 
> The current situation is that the server replies NFS4ERR_DELAY,
> and the client retries indefinitely. The goal is to let the
> client choose whether it wants to try the CREATE_SESSION again,
> try a different NFS version, or fail the mount request.
> 
> Bill and I both looked at this section of RFC 5661. It seems to
> us that the use of NFS4ERR_NOSPC is appropriate and unambiguous
> in this situation, and it is an allowed status for the
> CREATE_SESSION operation. NFS4ERR_DELAY OTOH is not helpful.

There are a range of errors which we may need to handle by destroying
the session, and then creating a new one (mainly the ones where the
client and server slot handling get out of sync). That's why returning
NFS4ERR_NOSPC in response to CREATE_SESSION is unhelpful, and is why
the only sane response by the client will be to treat it as a temporary
error.

IOW: these patches will not be acceptable, even with a rewrite, as they
are based on a flawed assumption.
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
  2018-06-22 22:31     ` Trond Myklebust
@ 2018-06-22 23:10       ` Trond Myklebust
  2018-06-23 19:00       ` Chuck Lever
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2018-06-22 23:10 UTC (permalink / raw)
  To: bfields, chucklever; +Cc: linux-nfs, manjunath.b.patil

On Fri, 2018-06-22 at 18:31 -0400, Trond Myklebust wrote:
> On Fri, 2018-06-22 at 17:49 -0400, Chuck Lever wrote:
> > [...]
> > Bill and I both looked at this section of RFC 5661. It seems to
> > us that the use of NFS4ERR_NOSPC is appropriate and unambiguous
> > in this situation, and it is an allowed status for the
> > CREATE_SESSION operation. NFS4ERR_DELAY OTOH is not helpful.
> 
> There are a range of errors which we may need to handle by destroying
> the session, and then creating a new one (mainly the ones where the
> client and server slot handling get out of sync). That's why
> returning NFS4ERR_NOSPC in response to CREATE_SESSION is unhelpful,
> and is why the only sane response by the client will be to treat it
> as a temporary error.
> 
> IOW: these patches will not be acceptable, even with a rewrite, as
> they are based on a flawed assumption.

The one use case for NFS4ERR_NOSPC for which it would appear to make
sense to treat it as a more or less permanent error, is if the client
is engaging in client ID trunking, and it tries to set up too many
sessions. If that happens, then the server needs a way to tell the
client to stop creating new sessions. However I'm not aware of any
clients out there that do client id trunking...

-- 
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
  2018-06-22 22:31     ` Trond Myklebust
  2018-06-22 23:10       ` Trond Myklebust
@ 2018-06-23 19:00       ` Chuck Lever
  2018-06-24 13:56         ` Trond Myklebust
  1 sibling, 1 reply; 18+ messages in thread
From: Chuck Lever @ 2018-06-23 19:00 UTC (permalink / raw)
  To: Trond Myklebust, Bruce Fields; +Cc: Linux NFS Mailing List, manjunath.b.patil

> On Jun 22, 2018, at 6:31 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Fri, 2018-06-22 at 17:49 -0400, Chuck Lever wrote:
>> [...]
>> Bill and I both looked at this section of RFC 5661. It seems to
>> us that the use of NFS4ERR_NOSPC is appropriate and unambiguous
>> in this situation, and it is an allowed status for the
>> CREATE_SESSION operation. NFS4ERR_DELAY OTOH is not helpful.
> 
> There are a range of errors which we may need to handle by destroying
> the session, and then creating a new one (mainly the ones where the
> client and server slot handling get out of sync). That's why returning
> NFS4ERR_NOSPC in response to CREATE_SESSION is unhelpful, and is why
> the only sane response by the client will be to treat it as a temporary
> error.
> IOW: these patches will not be acceptable, even with a rewrite, as they
> are based on a flawed assumption.

Fair enough. We're not attached to any particular solution/fix.

So let's take "recovery of an active mount" out of the picture
for a moment.

The narrow problem is behavioral: during initial contact with an
unfamiliar server, the server can hold off a client indefinitely
by sending NFS4ERR_DELAY for example until another client unmounts.
We want to find a way to allow clients to make progress when a
server is short of resources.

It appears that the mount(2) system call does not return as long
as the server is still returning NFS4ERR_DELAY. Possibly user
space is never given an opportunity to stop retrying, and thus
mount.nfs gets stuck.

It appears that DELAY is OK for EXCHANGE_ID too. So if a server
decides to return DELAY to EXCHANGE_ID, I wonder if our client's
trunking detection would be hamstrung by one bad server...

-- 
Chuck Lever
chucklever@gmail.com

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
  2018-06-23 19:00       ` Chuck Lever
@ 2018-06-24 13:56         ` Trond Myklebust
  2018-06-25 15:39           ` Chuck Lever
  0 siblings, 1 reply; 18+ messages in thread
From: Trond Myklebust @ 2018-06-24 13:56 UTC (permalink / raw)
  To: bfields, chucklever; +Cc: linux-nfs, manjunath.b.patil

On Sat, 2018-06-23 at 15:00 -0400, Chuck Lever wrote:
> [...]
> The narrow problem is behavioral: during initial contact with an
> unfamiliar server, the server can hold off a client indefinitely
> by sending NFS4ERR_DELAY for example until another client unmounts.
> We want to find a way to allow clients to make progress when a
> server is short of resources.
> 
> It appears that the mount(2) system call does not return as long
> as the server is still returning NFS4ERR_DELAY. Possibly user
> space is never given an opportunity to stop retrying, and thus
> mount.nfs gets stuck.
> 
> It appears that DELAY is OK for EXCHANGE_ID too. So if a server
> decides to return DELAY to EXCHANGE_ID, I wonder if our client's
> trunking detection would be hamstrung by one bad server...

The 'mount' program has the 'retry' option in order to set a timeout
for the mount operation itself. Is that option not working correctly?
If so, we should definitely fix that.
We might also want to look into making it take values < 1 minute. That
could be accomplished either by extending the syntax of the 'retry'
option (e.g.: 'retry=<minutes>:<seconds>') or by adding a new option
(e.g. 'sretry=<seconds>').

It would then be up to the caller of mount to decide the policy of what
to do after a timeout. Renegotiation downward to NFSv3 might be an
option, but it's not something that most people want to do in the case
where there are lots of clients competing for resources since that's
precisely the regime where the NFSv3 DRC scheme breaks down (lots of
disconnections, combined with a high turnover of DRC slots).

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot
  2018-06-24 13:56         ` Trond Myklebust
@ 2018-06-25 15:39           ` Chuck Lever
  2018-06-25 16:45             ` Trond Myklebust
  2018-06-25 17:03             ` Manjunath Patil
  0 siblings, 2 replies; 18+ messages in thread
From: Chuck Lever @ 2018-06-25 15:39 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Bruce Fields, Linux NFS Mailing List, manjunath.b.patil

> On Jun 24, 2018, at 9:56 AM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Sat, 2018-06-23 at 15:00 -0400, Chuck Lever wrote:
>> [...]
>> It appears that the mount(2) system call does not return as long
>> as the server is still returning NFS4ERR_DELAY. Possibly user
>> space is never given an opportunity to stop retrying, and thus
>> mount.nfs gets stuck.
>> 
>> It appears that DELAY is OK for EXCHANGE_ID too. So if a server
>> decides to return DELAY to EXCHANGE_ID, I wonder if our client's
>> trunking detection would be hamstrung by one bad server...
> 
> The 'mount' program has the 'retry' option in order to set a timeout
> for the mount operation itself. Is that option not working correctly?

Manjunath will need to confirm that, but my understanding is that
mount.nfs is not regaining control when the server returns DELAY
to CREATE_SESSION. My conclusion was that mount(2) is not returning.

> If so, we should definitely fix that.

My recollection is that mount.nfs polls, it does not set a timer
signal. So it will call mount(2) repeatedly until either "retry"
minutes has passed, or mount(2) succeeds. I don't think it will
deal with mount(2) not returning, but I could be wrong about that.

My preference would be to make the kernel more reliable (ie mount(2)
fails immediately in this case). That gives mount.nfs some time to
try other things (like, try the original mount again after a few
moments, or fall back to NFSv4.0, or fail).

We don't want mount.nfs to wait for the full retry= while doing
nothing else. That would make this particular failure mode behave
differently than all the other modes we have had, historically, IIUC.

Also, I agree with Bruce that the server should make CREATE_SESSION
less likely to fail. That would also benefit state recovery.

> We might also want to look into making it take values < 1 minute. That
> could be accomplished either by extending the syntax of the 'retry'
> option (e.g.: 'retry=<minutes>:<seconds>') or by adding a new option
> (e.g. 'sretry=<seconds>').
> 
> It would then be up to the caller of mount to decide the policy of what
> to do after a timeout.

I agree that the caller of mount(2) should be allowed to provide
the policy.

> Renegotiation downward to NFSv3 might be an
> option, but it's not something that most people want to do in the case
> where there are lots of clients competing for resources since that's
> precisely the regime where the NFSv3 DRC scheme breaks down (lots of
> disconnections, combined with a high turnover of DRC slots).

-- 
Chuck Lever
chucklever@gmail.com

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot 2018-06-25 15:39 ` Chuck Lever @ 2018-06-25 16:45 ` Trond Myklebust 2018-06-25 17:03 ` Manjunath Patil 1 sibling, 0 replies; 18+ messages in thread From: Trond Myklebust @ 2018-06-25 16:45 UTC (permalink / raw) To: chucklever; +Cc: bfields, linux-nfs, manjunath.b.patil

On Mon, 2018-06-25 at 11:39 -0400, Chuck Lever wrote:
> > On Jun 24, 2018, at 9:56 AM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> > 
> > On Sat, 2018-06-23 at 15:00 -0400, Chuck Lever wrote:
> > > > On Jun 22, 2018, at 6:31 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> > > > 
> > > > On Fri, 2018-06-22 at 17:49 -0400, Chuck Lever wrote:
> > > > > Hi Bruce-
> > > > > 
> > > > > > On Jun 22, 2018, at 1:54 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > > > > > 
> > > > > > On Thu, Jun 21, 2018 at 04:35:33PM +0000, Manjunath Patil wrote:
> > > > > > > Presently nfserr_jukebox is being returned by nfsd for create_session
> > > > > > > request if server is unable to allocate a session slot. This may be
> > > > > > > treated as NFS4ERR_DELAY by the clients and which may continue to re-try
> > > > > > > create_session in loop leading NFSv4.1+ mounts in hung state. nfsd
> > > > > > > should return nfserr_nospc in this case as per rfc5661(section-18.36.4
> > > > > > > subpoint 4. Session creation).
> > > > > > 
> > > > > > I don't think the spec actually gives us an error that we can use to say
> > > > > > a CREATE_SESSION failed permanently for lack of resources.
> > > > > 
> > > > > The current situation is that the server replies NFS4ERR_DELAY,
> > > > > and the client retries indefinitely. The goal is to let the
> > > > > client choose whether it wants to try the CREATE_SESSION again,
> > > > > try a different NFS version, or fail the mount request.
> > > > > 
> > > > > Bill and I both looked at this section of RFC 5661. It seems to
> > > > > us that the use of NFS4ERR_NOSPC is appropriate and unambiguous
> > > > > in this situation, and it is an allowed status for the
> > > > > CREATE_SESSION operation. NFS4ERR_DELAY OTOH is not helpful.
> > > > 
> > > > There are a range of errors which we may need to handle by destroying
> > > > the session, and then creating a new one (mainly the ones where the
> > > > client and server slot handling get out of sync). That's why returning
> > > > NFS4ERR_NOSPC in response to CREATE_SESSION is unhelpful, and is why
> > > > the only sane response by the client will be to treat it as a temporary
> > > > error.
> > > > IOW: these patches will not be acceptable, even with a rewrite, as they
> > > > are based on a flawed assumption.
> > > 
> > > Fair enough. We're not attached to any particular solution/fix.
> > > 
> > > So let's take "recovery of an active mount" out of the picture
> > > for a moment.
> > > 
> > > The narrow problem is behavioral: during initial contact with an
> > > unfamiliar server, the server can hold off a client indefinitely
> > > by sending NFS4ERR_DELAY for example until another client unmounts.
> > > We want to find a way to allow clients to make progress when a
> > > server is short of resources.
> > > 
> > > It appears that the mount(2) system call does not return as long
> > > as the server is still returning NFS4ERR_DELAY. Possibly user
> > > space is never given an opportunity to stop retrying, and thus
> > > mount.nfs gets stuck.
> > > 
> > > It appears that DELAY is OK for EXCHANGE_ID too. So if a server
> > > decides to return DELAY to EXCHANGE_ID, I wonder if our client's
> > > trunking detection would be hamstrung by one bad server...
> > 
> > The 'mount' program has the 'retry' option in order to set a timeout
> > for the mount operation itself. Is that option not working correctly?
> 
> Manjunath will need to confirm that, but my understanding is that
> mount.nfs is not regaining control when the server returns DELAY
> to CREATE_SESSION. My conclusion was that mount(2) is not returning.
> 
> > If so, we should definitely fix that.
> 
> My recollection is that mount.nfs polls, it does not set a timer
> signal. So it will call mount(2) repeatedly until either "retry"
> minutes has passed, or mount(2) succeeds. I don't think it will
> deal with mount(2) not returning, but I could be wrong about that.
> 
> My preference would be to make the kernel more reliable (ie mount(2)
> fails immediately in this case). That gives mount.nfs some time to
> try other things (like, try the original mount again after a few
> moments, or fall back to NFSv4.0, or fail).

Falling back to NFSv4.0 is also wrong in this case. 4.0 relies on the
DRC for replay protection against all non-stateful nonidempotent
operations (i.e. mkdir(), unlink(), rename(), ...).

If you want to make ENOSPC a fatal error, then that means you need to
educate users about the remedies, and I can't see that we're agreeing
on what constitutes the right remedy here. So I disagree that it is OK
to expose this particular error to userland for now.

I'm OK with fixing 'retry=', but that's because it is a well defined
control mechanism.

> We don't want mount.nfs to wait for the full retry= while doing
> nothing else. That would make this particular failure mode behave
> differently than all the other modes we have had, historically, IIUC.
> 
> Also, I agree with Bruce that the server should make CREATE_SESSION
> less likely to fail. That would also benefit state recovery.
> 
> > We might also want to look into making it take values < 1 minute. That
> > could be accomplished either by extending the syntax of the 'retry'
> > option (e.g.: 'retry=<minutes>:<seconds>') or by adding a new option
> > (e.g. 'sretry=<seconds>').
> > 
> > It would then be up to the caller of mount to decide the policy of what
> > to do after a timeout.
> 
> I agree that the caller of mount(2) should be allowed to provide the
> policy.
> 
> > Renegotiation downward to NFSv3 might be an
> > option, but it's not something that most people want to do in the case
> > where there are lots of clients competing for resources since that's
> > precisely the regime where the NFSv3 DRC scheme breaks down (lots of
> > disconnections, combined with a high turnover of DRC slots).
-- 
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot 2018-06-25 15:39 ` Chuck Lever 2018-06-25 16:45 ` Trond Myklebust @ 2018-06-25 17:03 ` Manjunath Patil 1 sibling, 0 replies; 18+ messages in thread From: Manjunath Patil @ 2018-06-25 17:03 UTC (permalink / raw) To: Chuck Lever, Trond Myklebust; +Cc: Bruce Fields, Linux NFS Mailing List On 6/25/2018 8:39 AM, Chuck Lever wrote: > >> On Jun 24, 2018, at 9:56 AM, Trond Myklebust <trondmy@hammerspace.com> wrote: >> >> On Sat, 2018-06-23 at 15:00 -0400, Chuck Lever wrote: >>>> On Jun 22, 2018, at 6:31 PM, Trond Myklebust <trondmy@hammerspace.c >>>> om> wrote: >>>> >>>> On Fri, 2018-06-22 at 17:49 -0400, Chuck Lever wrote: >>>>> Hi Bruce- >>>>> >>>>> >>>>>> On Jun 22, 2018, at 1:54 PM, J. Bruce Fields <bfields@fieldses. >>>>>> org> >>>>>> wrote: >>>>>> >>>>>> On Thu, Jun 21, 2018 at 04:35:33PM +0000, Manjunath Patil >>>>>> wrote: >>>>>>> Presently nfserr_jukebox is being returned by nfsd for >>>>>>> create_session >>>>>>> request if server is unable to allocate a session slot. This >>>>>>> may >>>>>>> be >>>>>>> treated as NFS4ERR_DELAY by the clients and which may >>>>>>> continue to >>>>>>> re-try >>>>>>> create_session in loop leading NFSv4.1+ mounts in hung state. >>>>>>> nfsd >>>>>>> should return nfserr_nospc in this case as per >>>>>>> rfc5661(section- >>>>>>> 18.36.4 >>>>>>> subpoint 4. Session creation). >>>>>> I don't think the spec actually gives us an error that we can >>>>>> use >>>>>> to say >>>>>> a CREATE_SESSION failed permanently for lack of resources. >>>>> The current situation is that the server replies NFS4ERR_DELAY, >>>>> and the client retries indefinitely. The goal is to let the >>>>> client choose whether it wants to try the CREATE_SESSION again, >>>>> try a different NFS version, or fail the mount request. >>>>> >>>>> Bill and I both looked at this section of RFC 5661. 
It seems to >>>>> us that the use of NFS4ERR_NOSPC is appropriate and unambiguous >>>>> in this situation, and it is an allowed status for the >>>>> CREATE_SESSION operation. NFS4ERR_DELAY OTOH is not helpful. >>>> There are a range of errors which we may need to handle by >>>> destroying >>>> the session, and then creating a new one (mainly the ones where the >>>> client and server slot handling get out of sync). That's why >>>> returning >>>> NFS4ERR_NOSPC in response to CREATE_SESSION is unhelpful, and is >>>> why >>>> the only sane response by the client will be to treat it as a >>>> temporary >>>> error. >>>> IOW: these patches will not be acceptable, even with a rewrite, as >>>> they >>>> are based on a flawed assumption. >>> Fair enough. We're not attached to any particular solution/fix. >>> >>> So let's take "recovery of an active mount" out of the picture >>> for a moment. >>> >>> The narrow problem is behavioral: during initial contact with an >>> unfamiliar server, the server can hold off a client indefinitely >>> by sending NFS4ERR_DELAY for example until another client unmounts. >>> We want to find a way to allow clients to make progress when a >>> server is short of resources. >>> >>> It appears that the mount(2) system call does not return as long >>> as the server is still returning NFS4ERR_DELAY. Possibly user >>> space is never given an opportunity to stop retrying, and thus >>> mount.nfs gets stuck. >>> >>> It appears that DELAY is OK for EXCHANGE_ID too. So if a server >>> decides to return DELAY to EXCHANGE_ID, I wonder if our client's >>> trunking detection would be hamstrung by one bad server... >> The 'mount' program has the 'retry' option in order to set a timeout >> for the mount operation itself. Is that option not working correctly? > Manjunath will need to confirm that, but my understanding is that > mount.nfs is not regaining control when the server returns DELAY > to CREATE_SESSION. My conclusion was that mount(2) is not returning. 
>
Yes, this is true. Even with retry set, the mount call blocks on the
client side indefinitely. On the wire I can see CREATE_SESSION and
NFS4ERR_DELAY exchanges happening continuously. I am not sure about
the effects, but an NFSv4.0 mount to the same server at this moment
succeeds.

More information:
...
2144 09:54:32.473054 write(1, "mount.nfs: trying text-based opt"..., 113) = 113 <0.000337>
2144 09:54:32.473468 mount("10.211.47.123:/exports", "/NFSMNT", "nfs", 0, "retry=1,vers=4,minorversion=1,ad"... <unfinished ...>
2143 09:56:42.253947 <... wait4 resumed> 0x7fffb2e13ec8, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) <129.800036>
2143 09:56:42.254142 --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
...

The client mount call hangs here -

[<ffffffffa05204d2>] nfs_wait_client_init_complete+0x52/0xc0 [nfs]
[<ffffffffa05872ed>] nfs41_discover_server_trunking+0x6d/0xb0 [nfsv4]
[<ffffffffa0587802>] nfs4_discover_server_trunking+0x82/0x2e0 [nfsv4]
[<ffffffffa058f8d6>] nfs4_init_client+0x136/0x300 [nfsv4]
[<ffffffffa05210bf>] nfs_get_client+0x24f/0x2f0 [nfs]
[<ffffffffa058eeef>] nfs4_set_client+0x9f/0xf0 [nfsv4]
[<ffffffffa059039e>] nfs4_create_server+0x13e/0x3b0 [nfsv4]
[<ffffffffa05881b2>] nfs4_remote_mount+0x32/0x60 [nfsv4]
[<ffffffff8121df3e>] mount_fs+0x3e/0x180
[<ffffffff8123a6db>] vfs_kern_mount+0x6b/0x110
[<ffffffffa05880d6>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
[<ffffffffa05884c4>] nfs4_try_mount+0x44/0xc0 [nfsv4]
[<ffffffffa052ed6b>] nfs_fs_mount+0x4cb/0xda0 [nfs]
[<ffffffff8121df3e>] mount_fs+0x3e/0x180
[<ffffffff8123a6db>] vfs_kern_mount+0x6b/0x110
[<ffffffff8123d5c1>] do_mount+0x251/0xcf0
[<ffffffff8123e3a2>] SyS_mount+0xa2/0x110
[<ffffffff81751f4b>] tracesys_phase2+0x6d/0x72
[<ffffffffffffffff>] 0xffffffffffffffff

I have a setup to reproduce this. If you need any more info, please
let me know.

-Thanks,
Manjunath

>> If so, we should definitely fix that.
> My recollection is that mount.nfs polls, it does not set a timer > signal.
So it will call mount(2) repeatedly until either "retry" > minutes has passed, or mount(2) succeeds. I don't think it will > deal with mount(2) not returning, but I could be wrong about that. > > My preference would be to make the kernel more reliable (ie mount(2) > fails immediately in this case). That gives mount.nfs some time to > try other things (like, try the original mount again after a few > moments, or fall back to NFSv4.0, or fail). > > We don't want mount.nfs to wait for the full retry= while doing > nothing else. That would make this particular failure mode behave > differently than all the other modes we have had, historically, IIUC. > > Also, I agree with Bruce that the server should make CREATE_SESSION > less likely to fail. That would also benefit state recovery. > > >> We might also want to look into making it take values < 1 minute. That >> could be accomplished either by extending the syntax of the 'retry' >> option (e.g.: 'retry=<minutes>:<seconds>') or by adding a new option >> (e.g. 'sretry=<seconds>'). >> >> It would then be up to the caller of mount to decide the policy of what >> to do after a timeout. > I agree that the caller of mount(2) should be allowed to provide the > policy. > > >> Renegotiation downward to NFSv3 might be an >> option, but it's not something that most people want to do in the case >> where there are lots of clients competing for resources since that's >> precisely the regime where the NFSv3 DRC scheme breaks down (lots of >> disconnections, combined with a high turnover of DRC slots). > -- > Chuck Lever > chucklever@gmail.com > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot 2018-06-22 17:54 ` J. Bruce Fields 2018-06-22 21:49 ` Chuck Lever @ 2018-06-24 20:26 ` J. Bruce Fields [not found] ` <bde64edc-5684-82d7-4488-e2ebdd7018fc@oracle.com> 2018-07-09 14:25 ` J. Bruce Fields 2 siblings, 1 reply; 18+ messages in thread From: J. Bruce Fields @ 2018-06-24 20:26 UTC (permalink / raw) To: Manjunath Patil; +Cc: linux-nfs By the way, could you share some more details with us about the situation when you (or your customers) are actually hitting this case? How many clients, what kind of clients, etc. And what version of the server were you seeing the problem on? (I'm mainly curious whether de766e570413 and 44d8660d3bb0 were already applied.) I'm glad we're thinking about how to handle this case, but my feeling is that the server is probably just being *much* too conservative about these allocations, and the most important thing may be to fix that and make it a lot rarer that we hit this case in the first place. --b. ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <bde64edc-5684-82d7-4488-e2ebdd7018fc@oracle.com>]
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot [not found] ` <bde64edc-5684-82d7-4488-e2ebdd7018fc@oracle.com> @ 2018-06-25 22:04 ` J. Bruce Fields 2018-06-26 17:20 ` Manjunath Patil 0 siblings, 1 reply; 18+ messages in thread From: J. Bruce Fields @ 2018-06-25 22:04 UTC (permalink / raw) To: Manjunath Patil; +Cc: linux-nfs On Mon, Jun 25, 2018 at 10:17:21AM -0700, Manjunath Patil wrote: > Hi Bruce, > > I could reproduce this issue by lowering the amount of RAM. On my > virtual box VM with 176M MB of RAM I can reproduce this with 3 > clients. I know how to reproduce it, I was just wondering what motivated it--were customers hitting it (how), was it just artificial testing? Oh well, it probably needs to be fixed regardless. --b. > My kernel didn't have the following fixes - > > de766e5 nfsd: give out fewer session slots as limit approaches > 44d8660 nfsd: increase DRC cache limit > > Once I apply these patches, the issue recurs with 10+ clients. > Once the mount starts to hang due to this issue, a NFSv4.0 still succeeds. > > I took the latest mainline kernel [4.18.0-rc1] and made the server > return NFS4ERR_DELAY[nfserr_jukebox] if its unable to allocate 50 > slots[just to accelerate the issue] > > - if (!ca->maxreqs) > + if (ca->maxreqs < 50) { > ... > return nfserr_jukebox; > > Then used the same client[4.18.0-rc1] and observed that mount calls > still hangs[indefinitely]. 
> Typically the client hangs here - [stack are from oracle kernel] - > > [root@OL7U5-work ~]# ps -ef | grep mount > root 2032 1732 0 09:49 pts/0 00:00:00 strace -tttvf -o > /tmp/a.out mount 10.211.47.123:/exports /NFSMNT -vvv -o retry=1 > root 2034 2032 0 09:49 pts/0 00:00:00 mount > 10.211.47.123:/exports /NFSMNT -vvv -o retry=1 > root 2035 2034 0 09:49 pts/0 00:00:00 /sbin/mount.nfs > 10.211.47.123:/exports /NFSMNT -v -o rw,retry=1 > root 2039 1905 0 09:49 pts/1 00:00:00 grep --color=auto mount > [root@OL7U5-work ~]# cat /proc/2035/stack > [<ffffffffa05204d2>] nfs_wait_client_init_complete+0x52/0xc0 [nfs] > [<ffffffffa05872ed>] nfs41_discover_server_trunking+0x6d/0xb0 [nfsv4] > [<ffffffffa0587802>] nfs4_discover_server_trunking+0x82/0x2e0 [nfsv4] > [<ffffffffa058f8d6>] nfs4_init_client+0x136/0x300 [nfsv4] > [<ffffffffa05210bf>] nfs_get_client+0x24f/0x2f0 [nfs] > [<ffffffffa058eeef>] nfs4_set_client+0x9f/0xf0 [nfsv4] > [<ffffffffa059039e>] nfs4_create_server+0x13e/0x3b0 [nfsv4] > [<ffffffffa05881b2>] nfs4_remote_mount+0x32/0x60 [nfsv4] > [<ffffffff8121df3e>] mount_fs+0x3e/0x180 > [<ffffffff8123a6db>] vfs_kern_mount+0x6b/0x110 > [<ffffffffa05880d6>] nfs_do_root_mount+0x86/0xc0 [nfsv4] > [<ffffffffa05884c4>] nfs4_try_mount+0x44/0xc0 [nfsv4] > [<ffffffffa052ed6b>] nfs_fs_mount+0x4cb/0xda0 [nfs] > [<ffffffff8121df3e>] mount_fs+0x3e/0x180 > [<ffffffff8123a6db>] vfs_kern_mount+0x6b/0x110 > [<ffffffff8123d5c1>] do_mount+0x251/0xcf0 > [<ffffffff8123e3a2>] SyS_mount+0xa2/0x110 > [<ffffffff81751f4b>] tracesys_phase2+0x6d/0x72 > [<ffffffffffffffff>] 0xffffffffffffffff > > [root@OL7U5-work ~]# cat /proc/2034/stack > [<ffffffff8108c147>] do_wait+0x217/0x2a0 > [<ffffffff8108d360>] do_wait4+0x80/0x110 > [<ffffffff8108d40d>] SyS_wait4+0x1d/0x20 > [<ffffffff81751f4b>] tracesys_phase2+0x6d/0x72 > [<ffffffffffffffff>] 0xffffffffffffffff > > [root@OL7U5-work ~]# cat /proc/2032/stack > [<ffffffff8108c147>] do_wait+0x217/0x2a0 > [<ffffffff8108d360>] do_wait4+0x80/0x110 > 
[<ffffffff8108d40d>] SyS_wait4+0x1d/0x20 > [<ffffffff81751ddc>] system_call_fastpath+0x18/0xd6 > [<ffffffffffffffff>] 0xffffffffffffffff > > -Thanks, > Manjunath > On 6/24/2018 1:26 PM, J. Bruce Fields wrote: > >By the way, could you share some more details with us about the > >situation when you (or your customers) are actually hitting this case? > > > >How many clients, what kind of clients, etc. And what version of the > >server were you seeing the problem on? (I'm mainly curious whether > >de766e570413 and 44d8660d3bb0 were already applied.) > > > >I'm glad we're thinking about how to handle this case, but my feeling is > >that the server is probably just being *much* too conservative about > >these allocations, and the most important thing may be to fix that and > >make it a lot rarer that we hit this case in the first place. > > > >--b. > >-- > >To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > >the body of a message to majordomo@vger.kernel.org > >More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot 2018-06-25 22:04 ` J. Bruce Fields @ 2018-06-26 17:20 ` Manjunath Patil 0 siblings, 0 replies; 18+ messages in thread From: Manjunath Patil @ 2018-06-26 17:20 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs

Hi Bruce,

The customer also had a test setup with a Linux NFS server that had
512M of RAM, and they were cloning VMs for clients. These clients had
an fstab entry for the NFS mount, so each start-up of a client would
mount from the NAS. They clone and start the VMs, and they observed
that the 10th VM clone hung during startup. This setup was just to see
how many clients could be used.

Note that the customer's NAS didn't have de766e570413 and 44d8660d3bb0.
Having these might have pushed the number of clients even further.

I will get back to you on their actual use-case.

-Thanks,
Manjunath

On 6/25/2018 3:04 PM, J. Bruce Fields wrote: > On Mon, Jun 25, 2018 at 10:17:21AM -0700, Manjunath Patil wrote: >> Hi Bruce, >> >> I could reproduce this issue by lowering the amount of RAM. On my >> virtual box VM with 176M MB of RAM I can reproduce this with 3 >> clients. > I know how to reproduce it, I was just wondering what motivated it--were > customers hitting it (how), was it just artificial testing? > > Oh well, it probably needs to be fixed regardless. > > --b. > >> My kernel didn't have the following fixes - >> >> de766e5 nfsd: give out fewer session slots as limit approaches >> 44d8660 nfsd: increase DRC cache limit >> >> Once I apply these patches, the issue recurs with 10+ clients. >> Once the mount starts to hang due to this issue, a NFSv4.0 still succeeds. >> >> I took the latest mainline kernel [4.18.0-rc1] and made the server >> return NFS4ERR_DELAY[nfserr_jukebox] if its unable to allocate 50 >> slots[just to accelerate the issue] >> >> - if (!ca->maxreqs) >> + if (ca->maxreqs < 50) { >> ... >> return nfserr_jukebox; >> >> Then used the same client[4.18.0-rc1] and observed that mount calls >> still hangs[indefinitely]. 
>> Typically the client hangs here - [stack are from oracle kernel] - >> >> [root@OL7U5-work ~]# ps -ef | grep mount >> root 2032 1732 0 09:49 pts/0 00:00:00 strace -tttvf -o >> /tmp/a.out mount 10.211.47.123:/exports /NFSMNT -vvv -o retry=1 >> root 2034 2032 0 09:49 pts/0 00:00:00 mount >> 10.211.47.123:/exports /NFSMNT -vvv -o retry=1 >> root 2035 2034 0 09:49 pts/0 00:00:00 /sbin/mount.nfs >> 10.211.47.123:/exports /NFSMNT -v -o rw,retry=1 >> root 2039 1905 0 09:49 pts/1 00:00:00 grep --color=auto mount >> [root@OL7U5-work ~]# cat /proc/2035/stack >> [<ffffffffa05204d2>] nfs_wait_client_init_complete+0x52/0xc0 [nfs] >> [<ffffffffa05872ed>] nfs41_discover_server_trunking+0x6d/0xb0 [nfsv4] >> [<ffffffffa0587802>] nfs4_discover_server_trunking+0x82/0x2e0 [nfsv4] >> [<ffffffffa058f8d6>] nfs4_init_client+0x136/0x300 [nfsv4] >> [<ffffffffa05210bf>] nfs_get_client+0x24f/0x2f0 [nfs] >> [<ffffffffa058eeef>] nfs4_set_client+0x9f/0xf0 [nfsv4] >> [<ffffffffa059039e>] nfs4_create_server+0x13e/0x3b0 [nfsv4] >> [<ffffffffa05881b2>] nfs4_remote_mount+0x32/0x60 [nfsv4] >> [<ffffffff8121df3e>] mount_fs+0x3e/0x180 >> [<ffffffff8123a6db>] vfs_kern_mount+0x6b/0x110 >> [<ffffffffa05880d6>] nfs_do_root_mount+0x86/0xc0 [nfsv4] >> [<ffffffffa05884c4>] nfs4_try_mount+0x44/0xc0 [nfsv4] >> [<ffffffffa052ed6b>] nfs_fs_mount+0x4cb/0xda0 [nfs] >> [<ffffffff8121df3e>] mount_fs+0x3e/0x180 >> [<ffffffff8123a6db>] vfs_kern_mount+0x6b/0x110 >> [<ffffffff8123d5c1>] do_mount+0x251/0xcf0 >> [<ffffffff8123e3a2>] SyS_mount+0xa2/0x110 >> [<ffffffff81751f4b>] tracesys_phase2+0x6d/0x72 >> [<ffffffffffffffff>] 0xffffffffffffffff >> >> [root@OL7U5-work ~]# cat /proc/2034/stack >> [<ffffffff8108c147>] do_wait+0x217/0x2a0 >> [<ffffffff8108d360>] do_wait4+0x80/0x110 >> [<ffffffff8108d40d>] SyS_wait4+0x1d/0x20 >> [<ffffffff81751f4b>] tracesys_phase2+0x6d/0x72 >> [<ffffffffffffffff>] 0xffffffffffffffff >> >> [root@OL7U5-work ~]# cat /proc/2032/stack >> [<ffffffff8108c147>] do_wait+0x217/0x2a0 >> 
[<ffffffff8108d360>] do_wait4+0x80/0x110 >> [<ffffffff8108d40d>] SyS_wait4+0x1d/0x20 >> [<ffffffff81751ddc>] system_call_fastpath+0x18/0xd6 >> [<ffffffffffffffff>] 0xffffffffffffffff >> >> -Thanks, >> Manjunath >> On 6/24/2018 1:26 PM, J. Bruce Fields wrote: >>> By the way, could you share some more details with us about the >>> situation when you (or your customers) are actually hitting this case? >>> >>> How many clients, what kind of clients, etc. And what version of the >>> server were you seeing the problem on? (I'm mainly curious whether >>> de766e570413 and 44d8660d3bb0 were already applied.) >>> >>> I'm glad we're thinking about how to handle this case, but my feeling is >>> that the server is probably just being *much* too conservative about >>> these allocations, and the most important thing may be to fix that and >>> make it a lot rarer that we hit this case in the first place. >>> >>> --b. >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot 2018-06-22 17:54 ` J. Bruce Fields 2018-06-22 21:49 ` Chuck Lever 2018-06-24 20:26 ` J. Bruce Fields @ 2018-07-09 14:25 ` J. Bruce Fields 2018-07-09 21:57 ` Trond Myklebust 2 siblings, 1 reply; 18+ messages in thread From: J. Bruce Fields @ 2018-07-09 14:25 UTC (permalink / raw) To: Manjunath Patil; +Cc: linux-nfs On Fri, Jun 22, 2018 at 01:54:16PM -0400, bfields wrote: > On Thu, Jun 21, 2018 at 04:35:33PM +0000, Manjunath Patil wrote: > > Presently nfserr_jukebox is being returned by nfsd for create_session > > request if server is unable to allocate a session slot. This may be > > treated as NFS4ERR_DELAY by the clients and which may continue to re-try > > create_session in loop leading NFSv4.1+ mounts in hung state. nfsd > > should return nfserr_nospc in this case as per rfc5661(section-18.36.4 > > subpoint 4. Session creation). > > I don't think the spec actually gives us an error that we can use to say > a CREATE_SESSION failed permanently for lack of resources. > > Better would be to avoid the need to fail at all. Possibilities: > > - revive Trond's patches some time back to do dynamic slot size By the way, I finally got around to reviewing those patches (5 years late!). One issue is that they seem to take the slot count requested by the client at CREATE_SESSION as a lower bound. And the current client requests a lot of slots (1024, I think?--this is just from looking at the code, I should watch a mount). Anyway, I assume that's not a hard requirement and that we can fix it. Also the slot number is driven entirely by the server's guess at what the client needs--we might also want to take into account whether we're running out of server resources. So that still leaves the question of how to cap the total slot memory. I'm beginning to wonder whether that's a good idea at all. Perhaps it'd be better for now just to keep going till kmalloc fails. 
There's no shortage of other ways that a malicious client could DOS the server anyway. I'll probably forward-port and repost Trond's patches some time in the next month. --b. > renegotiation > - make sure the systems you're testing on already have > de766e570413 and 44d8660d3bb0 applied. > - further liberalise the limits here: do we need them at all, or > should we just wait till a kmalloc fails? Or maybe take a > hybrid approach?: e.g. allow an arbitrary number of clients > and only limit slots & slotsizes. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot 2018-07-09 14:25 ` J. Bruce Fields @ 2018-07-09 21:57 ` Trond Myklebust 0 siblings, 0 replies; 18+ messages in thread From: Trond Myklebust @ 2018-07-09 21:57 UTC (permalink / raw) To: Dr James Bruce Fields; +Cc: manjunath.b.patil, linux-nfs On Mon, 9 Jul 2018 at 10:27, J. Bruce Fields <bfields@fieldses.org> wrote: > > On Fri, Jun 22, 2018 at 01:54:16PM -0400, bfields wrote: > > On Thu, Jun 21, 2018 at 04:35:33PM +0000, Manjunath Patil wrote: > > > Presently nfserr_jukebox is being returned by nfsd for create_session > > > request if server is unable to allocate a session slot. This may be > > > treated as NFS4ERR_DELAY by the clients and which may continue to re-try > > > create_session in loop leading NFSv4.1+ mounts in hung state. nfsd > > > should return nfserr_nospc in this case as per rfc5661(section-18.36.4 > > > subpoint 4. Session creation). > > > > I don't think the spec actually gives us an error that we can use to say > > a CREATE_SESSION failed permanently for lack of resources. > > > > Better would be to avoid the need to fail at all. Possibilities: > > > > - revive Trond's patches some time back to do dynamic slot size > > By the way, I finally got around to reviewing those patches (5 years > late!). One issue is that they seem to take the slot count requested by > the client at CREATE_SESSION as a lower bound. And the current client > requests a lot of slots (1024, I think?--this is just from looking at > the code, I should watch a mount). Anyway, I assume that's not a hard > requirement and that we can fix it. > > Also the slot number is driven entirely by the server's guess at what > the client needs--we might also want to take into account whether we're > running out of server resources. > > So that still leaves the question of how to cap the total slot memory. > > I'm beginning to wonder whether that's a good idea at all. 
Perhaps it'd > be better for now just to keep going till kmalloc fails. There's no > shortage of other ways that a malicious client could DOS the server > anyway. > > I'll probably forward-port and repost Trond's patches some time in the > next month. To be fair, if I were redoing the patches today, I'd probably change them to ensure that we only grow the session slot table size using the sequence target_slot, but shrink it using the CB_RECALL_SLOT mechanism. That makes for a much cleaner implementation with less heuristics needed on the part of both the client and server. Cheers Trond ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] nfsv4: handle ENOSPC during create session 2018-06-21 16:35 [PATCH 1/2] nfsv4: handle ENOSPC during create session Manjunath Patil 2018-06-21 16:35 ` [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot Manjunath Patil @ 2018-06-21 17:04 ` Trond Myklebust 2018-06-22 14:28 ` Manjunath Patil 1 sibling, 1 reply; 18+ messages in thread From: Trond Myklebust @ 2018-06-21 17:04 UTC (permalink / raw) To: linux-nfs, manjunath.b.patil

On Thu, 2018-06-21 at 16:35 +0000, Manjunath Patil wrote:
> Presently the client mount hangs for NFS4ERR_NOSPC repsonse from
> server
> during create session operation. Handle this error at the client side
> and pass it back to user-space, which may chose to mount with lower
> nfs
> versions.
> 
> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
> ---
>  fs/nfs/nfs4state.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 2bf2eaa..2134cf5 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -381,6 +381,8 @@ int nfs41_discover_server_trunking(struct
> nfs_client *clp,
> 	}
> 	nfs4_schedule_state_manager(clp);
> 	status = nfs_wait_client_init_complete(clp);
> +	if (!status) /* -ERESTARTSYS */
> +		status = nfs_client_init_status(clp);

Nack... The trunking code is _not_ the place to do session error
detection.

> 	if (status < 0)
> 		nfs_put_client(clp);
> 	return status;
> @@ -1919,6 +1921,9 @@ static int
> nfs4_handle_reclaim_lease_error(struct nfs_client *clp, int status)
> 		dprintk("%s: exit with error %d for server %s\n",
> 				__func__, -EPROTONOSUPPORT, clp-
> >cl_hostname);
> 		return -EPROTONOSUPPORT;
> +	case -NFS4ERR_NOSPC:
> +		nfs_mark_client_ready(clp, status);
> +		/*fall through*/

Nack... This would cause existing mounts to suddenly permanently fail.

> 	case -NFS4ERR_NOT_SAME: /* FixMe: implement recovery
> 				 * in nfs4_exchange_id */
> 	default:
> @@ -2186,6 +2191,7 @@ int nfs4_discover_server_trunking(struct
> nfs_client *clp,
> 	case 0:
> 	case -EINTR:
> 	case -ERESTARTSYS:
> +	case -NFS4ERR_NOSPC:
> 		break;
> 	case -ETIMEDOUT:
> 		if (clnt->cl_softrtry)
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] nfsv4: handle ENOSPC during create session
  2018-06-21 17:04 ` [PATCH 1/2] nfsv4: handle ENOSPC during create session Trond Myklebust
@ 2018-06-22 14:28   ` Manjunath Patil
  0 siblings, 0 replies; 18+ messages in thread
From: Manjunath Patil @ 2018-06-22 14:28 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

Thank you for your comments Trond. Let me come back with a better version.

-Thanks,
Manjunath

On 6/21/2018 10:04 AM, Trond Myklebust wrote:
> On Thu, 2018-06-21 at 16:35 +0000, Manjunath Patil wrote:
>> Presently the client mount hangs for NFS4ERR_NOSPC repsonse from
>> server
>> during create session operation. Handle this error at the client side
>> and pass it back to user-space, which may chose to mount with lower
>> nfs
>> versions.
>>
>> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
>> ---
>>   fs/nfs/nfs4state.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>> index 2bf2eaa..2134cf5 100644
>> --- a/fs/nfs/nfs4state.c
>> +++ b/fs/nfs/nfs4state.c
>> @@ -381,6 +381,8 @@ int nfs41_discover_server_trunking(struct
>> nfs_client *clp,
>>   	}
>>   	nfs4_schedule_state_manager(clp);
>>   	status = nfs_wait_client_init_complete(clp);
>> +	if (!status) /* -ERESTARTSYS */
>> +		status = nfs_client_init_status(clp);
> Nack... The trunking code is _not_ the place to do session error
> detection.
>
>>   	if (status < 0)
>>   		nfs_put_client(clp);
>>   	return status;
>> @@ -1919,6 +1921,9 @@ static int
>> nfs4_handle_reclaim_lease_error(struct nfs_client *clp, int status)
>>   	dprintk("%s: exit with error %d for server %s\n",
>>   			__func__, -EPROTONOSUPPORT, clp-
>>> cl_hostname);
>>   	return -EPROTONOSUPPORT;
>> +	case -NFS4ERR_NOSPC:
>> +		nfs_mark_client_ready(clp, status);
>> +		/*fall through*/
> Nack... This would cause existing mounts to suddenly permanently fail.
>
>>   	case -NFS4ERR_NOT_SAME: /* FixMe: implement recovery
>>   				 * in nfs4_exchange_id */
>>   	default:
>> @@ -2186,6 +2191,7 @@ int nfs4_discover_server_trunking(struct
>> nfs_client *clp,
>>   	case 0:
>>   	case -EINTR:
>>   	case -ERESTARTSYS:
>> +	case -NFS4ERR_NOSPC:
>>   		break;
>>   	case -ETIMEDOUT:
>>   		if (clnt->cl_softrtry)

^ permalink raw reply	[flat|nested] 18+ messages in thread
end of thread, other threads:[~2018-07-09 21:57 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-21 16:35 [PATCH 1/2] nfsv4: handle ENOSPC during create session Manjunath Patil
2018-06-21 16:35 ` [PATCH 2/2] nfsd: return ENOSPC if unable to allocate a session slot Manjunath Patil
2018-06-22 17:54 ` J. Bruce Fields
2018-06-22 21:49 ` Chuck Lever
2018-06-22 22:31 ` Trond Myklebust
2018-06-22 23:10 ` Trond Myklebust
2018-06-23 19:00 ` Chuck Lever
2018-06-24 13:56 ` Trond Myklebust
2018-06-25 15:39 ` Chuck Lever
2018-06-25 16:45 ` Trond Myklebust
2018-06-25 17:03 ` Manjunath Patil
2018-06-24 20:26 ` J. Bruce Fields
[not found] ` <bde64edc-5684-82d7-4488-e2ebdd7018fc@oracle.com>
2018-06-25 22:04 ` J. Bruce Fields
2018-06-26 17:20 ` Manjunath Patil
2018-07-09 14:25 ` J. Bruce Fields
2018-07-09 21:57 ` Trond Myklebust
2018-06-21 17:04 ` [PATCH 1/2] nfsv4: handle ENOSPC during create session Trond Myklebust
2018-06-22 14:28 ` Manjunath Patil
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).