linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] NFS41: Drop lseg ref before fallthru to MDS
@ 2011-07-04  1:30 Peng Tao
  2011-07-09 14:10 ` Benny Halevy
  0 siblings, 1 reply; 15+ messages in thread
From: Peng Tao @ 2011-07-04  1:30 UTC (permalink / raw)
  To: Trond.Myklebust; +Cc: linux-nfs, Peng Tao

There is no need to keep lseg reference when read/write through MDS.
This fixes a null pointer crash at nfs_post_op_update_inode_force_wcc
because nfs4_proc_write_setup will unset wdata->res.fattr if wdata->lseg
is not NULL.

Signed-off-by: Peng Tao <peng_tao@emc.com>
---
 fs/nfs/pnfs.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 30a0394..55fdf02 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1193,6 +1193,9 @@ pnfs_ld_write_done(struct nfs_write_data *data)
 
 	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
 		data->pnfs_error);
+
+	put_lseg(data->lseg);
+	data->lseg = NULL;
 	status = nfs_initiate_write(data, NFS_CLIENT(data->inode),
 				    data->mds_ops, NFS_FILE_SYNC);
 	return status ? : -EAGAIN;
@@ -1240,6 +1243,9 @@ pnfs_ld_read_done(struct nfs_read_data *data)
 
 	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
 		data->pnfs_error);
+
+	put_lseg(data->lseg);
+	data->lseg = NULL;
 	status = nfs_initiate_read(data, NFS_CLIENT(data->inode),
 				   data->mds_ops);
 	return status ? : -EAGAIN;
-- 
1.7.4.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-04  1:30 [PATCH] NFS41: Drop lseg ref before fallthru to MDS Peng Tao
@ 2011-07-09 14:10 ` Benny Halevy
  2011-07-20  5:52   ` tao.peng
  0 siblings, 1 reply; 15+ messages in thread
From: Benny Halevy @ 2011-07-09 14:10 UTC (permalink / raw)
  To: Peng Tao; +Cc: Trond.Myklebust, linux-nfs, Peng Tao

On 2011-07-04 04:30, Peng Tao wrote:
> There is no need to keep lseg reference when read/write through MDS.
> This fixes a null pointer crash at nfs_post_op_update_inode_force_wcc
> because nfs4_proc_write_setup will unset wdata->res.fattr if wdata->lseg
> is not NULL.
> 
> Signed-off-by: Peng Tao <peng_tao@emc.com>

Looks good to me.

Benny

> ---
>  fs/nfs/pnfs.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index 30a0394..55fdf02 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -1193,6 +1193,9 @@ pnfs_ld_write_done(struct nfs_write_data *data)
>  
>  	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
>  		data->pnfs_error);
> +
> +	put_lseg(data->lseg);
> +	data->lseg = NULL;
>  	status = nfs_initiate_write(data, NFS_CLIENT(data->inode),
>  				    data->mds_ops, NFS_FILE_SYNC);
>  	return status ? : -EAGAIN;
> @@ -1240,6 +1243,9 @@ pnfs_ld_read_done(struct nfs_read_data *data)
>  
>  	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
>  		data->pnfs_error);
> +
> +	put_lseg(data->lseg);
> +	data->lseg = NULL;
>  	status = nfs_initiate_read(data, NFS_CLIENT(data->inode),
>  				   data->mds_ops);
>  	return status ? : -EAGAIN;

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-09 14:10 ` Benny Halevy
@ 2011-07-20  5:52   ` tao.peng
  2011-07-25 19:13     ` Trond Myklebust
  0 siblings, 1 reply; 15+ messages in thread
From: tao.peng @ 2011-07-20  5:52 UTC (permalink / raw)
  To: Trond.Myklebust; +Cc: linux-nfs, bhalevy, bergwolf

Hi, Trond,

Any comments on this patch? I still get kernel crash when pnfs write is attempted but fails and calls pnfs_ld_write_done(). It seems object layout uses the same code path as well. But I don't find the patch in either your tree or Benny's tree. Are there any concerns?

Thanks,
Tao

> -----Original Message-----
> From: Benny Halevy [mailto:bhalevy@tonian.com]
> Sent: Saturday, July 09, 2011 10:10 PM
> To: Peng Tao
> Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org; Peng, Tao
> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
> 
> On 2011-07-04 04:30, Peng Tao wrote:
> > There is no need to keep lseg reference when read/write through MDS.
> > This fixes a null pointer crash at nfs_post_op_update_inode_force_wcc
> > because nfs4_proc_write_setup will unset wdata->res.fattr if wdata->lseg
> > is not NULL.
> >
> > Signed-off-by: Peng Tao <peng_tao@emc.com>
> 
> Looks good to me.
> 
> Benny
> 
> > ---
> >  fs/nfs/pnfs.c |    6 ++++++
> >  1 files changed, 6 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> > index 30a0394..55fdf02 100644
> > --- a/fs/nfs/pnfs.c
> > +++ b/fs/nfs/pnfs.c
> > @@ -1193,6 +1193,9 @@ pnfs_ld_write_done(struct nfs_write_data *data)
> >
> >  	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
> >  		data->pnfs_error);
> > +
> > +	put_lseg(data->lseg);
> > +	data->lseg = NULL;
> >  	status = nfs_initiate_write(data, NFS_CLIENT(data->inode),
> >  				    data->mds_ops, NFS_FILE_SYNC);
> >  	return status ? : -EAGAIN;
> > @@ -1240,6 +1243,9 @@ pnfs_ld_read_done(struct nfs_read_data *data)
> >
> >  	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
> >  		data->pnfs_error);
> > +
> > +	put_lseg(data->lseg);
> > +	data->lseg = NULL;
> >  	status = nfs_initiate_read(data, NFS_CLIENT(data->inode),
> >  				   data->mds_ops);
> >  	return status ? : -EAGAIN;


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-20  5:52   ` tao.peng
@ 2011-07-25 19:13     ` Trond Myklebust
  2011-07-25 21:35       ` Jim Rees
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Trond Myklebust @ 2011-07-25 19:13 UTC (permalink / raw)
  To: tao.peng; +Cc: linux-nfs, bhalevy, bergwolf

On Wed, 2011-07-20 at 01:52 -0400, tao.peng@emc.com wrote: 
> Hi, Trond,
> 
> Any comments on this patch? I still get kernel crash when pnfs write is attempted but fails and calls pnfs_ld_write_done(). It seems object layout uses the same code path as well. But I don't find the patch in either your tree or Benny's tree. Are there any concerns?
> 
> Thanks,
> Tao

The whole pnfs_ld_write_done thing is bogus and needs to be replaced
with something sane. It is trying to initiate a WRITE RPC call with the
wrong block size, and is calling the MDS rpc_call_done() and
rpc_release() with an uninitialised rpc task pointer.

Ditto for pnfs_ld_read_done.

Cheers
  Trond

> > -----Original Message-----
> > From: Benny Halevy [mailto:bhalevy@tonian.com]
> > Sent: Saturday, July 09, 2011 10:10 PM
> > To: Peng Tao
> > Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org; Peng, Tao
> > Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
> > 
> > On 2011-07-04 04:30, Peng Tao wrote:
> > > There is no need to keep lseg reference when read/write through MDS.
> > > This fixes a null pointer crash at nfs_post_op_update_inode_force_wcc
> > > because nfs4_proc_write_setup will unset wdata->res.fattr if wdata->lseg
> > > is not NULL.
> > >
> > > Signed-off-by: Peng Tao <peng_tao@emc.com>
> > 
> > Looks good to me.
> > 
> > Benny
> > 
> > > ---
> > >  fs/nfs/pnfs.c |    6 ++++++
> > >  1 files changed, 6 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> > > index 30a0394..55fdf02 100644
> > > --- a/fs/nfs/pnfs.c
> > > +++ b/fs/nfs/pnfs.c
> > > @@ -1193,6 +1193,9 @@ pnfs_ld_write_done(struct nfs_write_data *data)
> > >
> > >  	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
> > >  		data->pnfs_error);
> > > +
> > > +	put_lseg(data->lseg);
> > > +	data->lseg = NULL;
> > >  	status = nfs_initiate_write(data, NFS_CLIENT(data->inode),
> > >  				    data->mds_ops, NFS_FILE_SYNC);
> > >  	return status ? : -EAGAIN;
> > > @@ -1240,6 +1243,9 @@ pnfs_ld_read_done(struct nfs_read_data *data)
> > >
> > >  	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
> > >  		data->pnfs_error);
> > > +
> > > +	put_lseg(data->lseg);
> > > +	data->lseg = NULL;
> > >  	status = nfs_initiate_read(data, NFS_CLIENT(data->inode),
> > >  				   data->mds_ops);
> > >  	return status ? : -EAGAIN;
> 

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-25 19:13     ` Trond Myklebust
@ 2011-07-25 21:35       ` Jim Rees
  2011-07-26 15:37       ` Peng Tao
  2011-07-26 16:34       ` Benny Halevy
  2 siblings, 0 replies; 15+ messages in thread
From: Jim Rees @ 2011-07-25 21:35 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: tao.peng, linux-nfs, bhalevy, bergwolf

Trond Myklebust wrote:

  On Wed, 2011-07-20 at 01:52 -0400, tao.peng@emc.com wrote: 
  > Hi, Trond,
  > 
  > Any comments on this patch? I still get kernel crash when pnfs write is attempted but fails and calls pnfs_ld_write_done(). It seems object layout uses the same code path as well. But I don't find the patch in either your tree or Benny's tree. Are there any concerns?
  > 
  > Thanks,
  > Tao
  
  The whole pnfs_ld_write_done thing is bogus and needs to be replaced
  with something sane. It is trying to initiate a WRITE RPC call with the
  wrong block size, and is calling the MDS rpc_call_done() and
  rpc_release() with an uninitialised rpc task pointer.
  
  Ditto for pnfs_ld_read_done.

Benny / Boaz, have you run into this problem with object layout?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-25 19:13     ` Trond Myklebust
  2011-07-25 21:35       ` Jim Rees
@ 2011-07-26 15:37       ` Peng Tao
  2011-07-26 15:50         ` Myklebust, Trond
  2011-07-26 16:34       ` Benny Halevy
  2 siblings, 1 reply; 15+ messages in thread
From: Peng Tao @ 2011-07-26 15:37 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: tao.peng, linux-nfs, bhalevy

Hi, Trond,

On Tue, Jul 26, 2011 at 3:13 AM, Trond Myklebust
<Trond.Myklebust@netapp.com> wrote:
> On Wed, 2011-07-20 at 01:52 -0400, tao.peng@emc.com wrote:
>> Hi, Trond,
>>
>> Any comments on this patch? I still get kernel crash when pnfs write is attempted but fails and calls pnfs_ld_write_done(). It seems object layout uses the same code path as well. But I don't find the patch in either your tree or Benny's tree. Are there any concerns?
>>
>> Thanks,
>> Tao
>
> The whole pnfs_ld_write_done thing is bogus and needs to be replaced
> with something sane. It is trying to initiate a WRITE RPC call with the
> wrong block size, and is calling the MDS rpc_call_done() and
> rpc_release() with an uninitialised rpc task pointer.
>
> Ditto for pnfs_ld_read_done.
Thanks for your explanation. Is there any plan on how to fix
pnfs_ld_read/write_done? Basically, we would need an interface that
can redirect the IO to MDS if pnfs_error is set or do all necessary
cleanup work to end read/write if pnfs_error is 0. IMHO, the
recoalesce logic need to access nfs_pageio_descriptor but we do not
have that information at pnfs_ld_read/write_done.

Best,
Tao

>
> Cheers
>  Trond
>
>> > -----Original Message-----
>> > From: Benny Halevy [mailto:bhalevy@tonian.com]
>> > Sent: Saturday, July 09, 2011 10:10 PM
>> > To: Peng Tao
>> > Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org; Peng, Tao
>> > Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
>> >
>> > On 2011-07-04 04:30, Peng Tao wrote:
>> > > There is no need to keep lseg reference when read/write through MDS.
>> > > This fixes a null pointer crash at nfs_post_op_update_inode_force_wcc
>> > > because nfs4_proc_write_setup will unset wdata->res.fattr if wdata->lseg
>> > > is not NULL.
>> > >
>> > > Signed-off-by: Peng Tao <peng_tao@emc.com>
>> >
>> > Looks good to me.
>> >
>> > Benny
>> >
>> > > ---
>> > >  fs/nfs/pnfs.c |    6 ++++++
>> > >  1 files changed, 6 insertions(+), 0 deletions(-)
>> > >
>> > > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>> > > index 30a0394..55fdf02 100644
>> > > --- a/fs/nfs/pnfs.c
>> > > +++ b/fs/nfs/pnfs.c
>> > > @@ -1193,6 +1193,9 @@ pnfs_ld_write_done(struct nfs_write_data *data)
>> > >
>> > >   dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
>> > >           data->pnfs_error);
>> > > +
>> > > + put_lseg(data->lseg);
>> > > + data->lseg = NULL;
>> > >   status = nfs_initiate_write(data, NFS_CLIENT(data->inode),
>> > >                               data->mds_ops, NFS_FILE_SYNC);
>> > >   return status ? : -EAGAIN;
>> > > @@ -1240,6 +1243,9 @@ pnfs_ld_read_done(struct nfs_read_data *data)
>> > >
>> > >   dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
>> > >           data->pnfs_error);
>> > > +
>> > > + put_lseg(data->lseg);
>> > > + data->lseg = NULL;
>> > >   status = nfs_initiate_read(data, NFS_CLIENT(data->inode),
>> > >                              data->mds_ops);
>> > >   return status ? : -EAGAIN;
>>
>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> Trond.Myklebust@netapp.com
> www.netapp.com
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-26 15:37       ` Peng Tao
@ 2011-07-26 15:50         ` Myklebust, Trond
  2011-07-26 16:08           ` Jim Rees
                             ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Myklebust, Trond @ 2011-07-26 15:50 UTC (permalink / raw)
  To: Peng Tao; +Cc: tao.peng, linux-nfs, bhalevy

PiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiBGcm9tOiBQZW5nIFRhbyBbbWFpbHRvOmJl
cmd3b2xmQGdtYWlsLmNvbV0NCj4gU2VudDogVHVlc2RheSwgSnVseSAyNiwgMjAxMSAxMTozNyBB
TQ0KPiBUbzogTXlrbGVidXN0LCBUcm9uZA0KPiBDYzogdGFvLnBlbmdAZW1jLmNvbTsgbGludXgt
bmZzQHZnZXIua2VybmVsLm9yZzsgYmhhbGV2eUB0b25pYW4uY29tDQo+IFN1YmplY3Q6IFJlOiBb
UEFUQ0hdIE5GUzQxOiBEcm9wIGxzZWcgcmVmIGJlZm9yZSBmYWxsdGhydSB0byBNRFMNCj4gDQo+
IEhpLCBUcm9uZCwNCj4gDQo+IE9uIFR1ZSwgSnVsIDI2LCAyMDExIGF0IDM6MTMgQU0sIFRyb25k
IE15a2xlYnVzdA0KPiA8VHJvbmQuTXlrbGVidXN0QG5ldGFwcC5jb20+IHdyb3RlOg0KPiA+IE9u
IFdlZCwgMjAxMS0wNy0yMCBhdCAwMTo1MiAtMDQwMCwgdGFvLnBlbmdAZW1jLmNvbSB3cm90ZToN
Cj4gPj4gSGksIFRyb25kLA0KPiA+Pg0KPiA+PiBBbnkgY29tbWVudHMgb24gdGhpcyBwYXRjaD8g
SSBzdGlsbCBnZXQga2VybmVsIGNyYXNoIHdoZW4gcG5mcyB3cml0ZQ0KPiBpcyBhdHRlbXB0ZWQg
YnV0IGZhaWxzIGFuZCBjYWxscyBwbmZzX2xkX3dyaXRlX2RvbmUoKS4gSXQgc2VlbXMgb2JqZWN0
DQo+IGxheW91dCB1c2VzIHRoZSBzYW1lIGNvZGUgcGF0aCBhcyB3ZWxsLiBCdXQgSSBkb24ndCBm
aW5kIHRoZSBwYXRjaCBpbg0KPiBlaXRoZXIgeW91ciB0cmVlIG9yIEJlbm55J3MgdHJlZS4gQXJl
IHRoZXJlIGFueSBjb25jZXJucz8NCj4gPj4NCj4gPj4gVGhhbmtzLA0KPiA+PiBUYW8NCj4gPg0K
PiA+IFRoZSB3aG9sZSBwbmZzX2xkX3dyaXRlX2RvbmUgdGhpbmcgaXMgYm9ndXMgYW5kIG5lZWRz
IHRvIGJlIHJlcGxhY2VkDQo+ID4gd2l0aCBzb21ldGhpbmcgc2FuZS4gSXQgaXMgdHJ5aW5nIHRv
IGluaXRpYXRlIGEgV1JJVEUgUlBDIGNhbGwgd2l0aA0KPiB0aGUNCj4gPiB3cm9uZyBibG9jayBz
aXplLCBhbmQgaXMgY2FsbGluZyB0aGUgTURTIHJwY19jYWxsX2RvbmUoKSBhbmQNCj4gPiBycGNf
cmVsZWFzZSgpIHdpdGggYW4gdW5pbml0aWFsaXNlZCBycGMgdGFzayBwb2ludGVyLg0KPiA+DQo+
ID4gRGl0dG8gZm9yIHBuZnNfbGRfcmVhZF9kb25lLg0KPiBUaGFua3MgZm9yIHlvdXIgZXhwbGFu
YXRpb24uIElzIHRoZXJlIGFueSBwbGFuIG9uIGhvdyB0byBmaXgNCj4gcG5mc19sZF9yZWFkL3dy
aXRlX2RvbmU/IEJhc2ljYWxseSwgd2Ugd291bGQgbmVlZCBhbiBpbnRlcmZhY2UgdGhhdA0KPiBj
YW4gcmVkaXJlY3QgdGhlIElPIHRvIE1EUyBpZiBwbmZzX2Vycm9yIGlzIHNldCBvciBkbyBhbGwg
bmVjZXNzYXJ5DQo+IGNsZWFudXAgd29yayB0byBlbmQgcmVhZC93cml0ZSBpZiBwbmZzX2Vycm9y
IGlzIDAuIElNSE8sIHRoZQ0KPiByZWNvYWxlc2NlIGxvZ2ljIG5lZWQgdG8gYWNjZXNzIG5mc19w
YWdlaW9fZGVzY3JpcHRvciBidXQgd2UgZG8gbm90DQo+IGhhdmUgdGhhdCBpbmZvcm1hdGlvbiBh
dCBwbmZzX2xkX3JlYWQvd3JpdGVfZG9uZS4NCg0KQXMgZmFyIGFzIEkgY2FuIHNlZSwgdGhlIHJp
Z2h0IHRoaW5nIHRvIGRvIGlzIHRvIG1hcmsgdGhlIGxheW91dCBhcyBpbnZhbGlkIGFuZCB0aGVu
IHJlZGlydHkgdGhlIHBhZ2UuIEl0IHNob3VsZCBiZSBlYXN5IHRvIGhhdmUgZnN5bmMoKSByZS1z
ZW5kIHRoZSBwYWdlcyBpbiB0aGlzIGNhc2UuIFRoZXNlIHNob3VsZCBiZSBleHRyZW1lbHkgcmFy
ZSBldmVudHMsIHNpbmNlIHdlIGV4cGVjdCB0byBjYXRjaCBtb3N0IG9mIHRoZSBwTkZTIGZhaWx1
cmVzIHdoZW4gd2UgZG8gdGhlIGFjdHVhbCBMQVlPVVRHRVQgaW4gdGhlIC0+cGdfaW5pdCgpLg0K
DQpNeSBtYWluIHdvcnJ5IGlzIGZvciBhaW8vZGlvIHdoZXJlIHRoZXJlIGlzIG5vIGdvb2QgbWVj
aGFuaXNtIGZvciByZXRyeWluZy4gSSdtIHN0aWxsIHdvcmtpbmcgb24gdGhhdC4uLg0KDQpDaGVl
cnMNCiAgVHJvbmQNCgTvv717Lm7vv70r77+977+977+977+977+977+977+9KyXvv73vv71sendt
77+977+9Yu+/veunsu+/ve+/vXLvv73vv716WO+/ve+/vRnfsinvv73vv73vv713Kh9qZ++/ve+/
ve+/vR7vv73vv73vv73vv73vv73domov77+977+977+9eu+/vd6W77+977+9Mu+/vd6Z77+977+9
77+9Ju+/vSnfoe+/vWHvv73vv71/77+977+9Hu+/vUfvv73vv73vv71o77+9D++/vWo6K3bvv73v
v73vv71377+92aU=

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-26 15:50         ` Myklebust, Trond
@ 2011-07-26 16:08           ` Jim Rees
  2011-07-26 16:14             ` Myklebust, Trond
  2011-07-26 17:32           ` Peng Tao
  2011-07-27 10:17           ` tao.peng
  2 siblings, 1 reply; 15+ messages in thread
From: Jim Rees @ 2011-07-26 16:08 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: Peng Tao, tao.peng, linux-nfs, bhalevy

Myklebust, Trond wrote:

  > Thanks for your explanation. Is there any plan on how to fix
  > pnfs_ld_read/write_done? Basically, we would need an interface that
  > can redirect the IO to MDS if pnfs_error is set or do all necessary
  > cleanup work to end read/write if pnfs_error is 0. IMHO, the
  > recoalesce logic need to access nfs_pageio_descriptor but we do not
  > have that information at pnfs_ld_read/write_done.
  
  As far as I can see, the right thing to do is to mark the layout as
  invalid and then redirty the page. It should be easy to have fsync()
  re-send the pages in this case. These should be extremely rare events,
  since we expect to catch most of the pNFS failures when we do the actual
  LAYOUTGET in the ->pg_init().
  
  My main worry is for aio/dio where there is no good mechanism for
  retrying. I'm still working on that...

What do you suggest we do for the current set of patches that add block
layout to pnfs?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-26 16:08           ` Jim Rees
@ 2011-07-26 16:14             ` Myklebust, Trond
  2011-07-26 16:37               ` Benny Halevy
  0 siblings, 1 reply; 15+ messages in thread
From: Myklebust, Trond @ 2011-07-26 16:14 UTC (permalink / raw)
  To: Jim Rees; +Cc: Peng Tao, tao.peng, linux-nfs, bhalevy

> -----Original Message-----
> From: Jim Rees [mailto:rees@umich.edu]
> Sent: Tuesday, July 26, 2011 12:08 PM
> To: Myklebust, Trond
> Cc: Peng Tao; tao.peng@emc.com; linux-nfs@vger.kernel.org;
> bhalevy@tonian.com
> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
> 
> Myklebust, Trond wrote:
> 
>   > Thanks for your explanation. Is there any plan on how to fix
>   > pnfs_ld_read/write_done? Basically, we would need an interface
that
>   > can redirect the IO to MDS if pnfs_error is set or do all
necessary
>   > cleanup work to end read/write if pnfs_error is 0. IMHO, the
>   > recoalesce logic need to access nfs_pageio_descriptor but we do
not
>   > have that information at pnfs_ld_read/write_done.
> 
>   As far as I can see, the right thing to do is to mark the layout as
>   invalid and then redirty the page. It should be easy to have fsync()
>   re-send the pages in this case. These should be extremely rare
> events,
>   since we expect to catch most of the pNFS failures when we do the
> actual
>   LAYOUTGET in the ->pg_init().
> 
>   My main worry is for aio/dio where there is no good mechanism for
>   retrying. I'm still working on that...
> 
> What do you suggest we do for the current set of patches that add
block
> layout to pnfs?

If you are calling pnfs_ld_read/write_done, then don't change anything:
it is easier to fix this in one spot rather than several.
However someone needs to start working on fixing the code in
pnfs_ld_read/write_done to do the right thing. If nobody else has the
cycles, then I can do that but I'd prefer to have someone who can easily
test the resulting code do it.

Cheers
  Trond

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-25 19:13     ` Trond Myklebust
  2011-07-25 21:35       ` Jim Rees
  2011-07-26 15:37       ` Peng Tao
@ 2011-07-26 16:34       ` Benny Halevy
  2 siblings, 0 replies; 15+ messages in thread
From: Benny Halevy @ 2011-07-26 16:34 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: tao.peng, linux-nfs, bergwolf

On 2011-07-25 15:13, Trond Myklebust wrote:
> On Wed, 2011-07-20 at 01:52 -0400, tao.peng@emc.com wrote: 
>> Hi, Trond,
>>
>> Any comments on this patch? I still get kernel crash when pnfs write is attempted but fails and calls pnfs_ld_write_done(). It seems object layout uses the same code path as well. But I don't find the patch in either your tree or Benny's tree. Are there any concerns?
>>
>> Thanks,
>> Tao
> 
> The whole pnfs_ld_write_done thing is bogus and needs to be replaced
> with something sane. It is trying to initiate a WRITE RPC call with the
> wrong block size,

I was under the impression that your re-coalesce work will take
care of that. Is there anything else that needs to be done?

> and is calling the MDS rpc_call_done() and
> rpc_release() with an uninitialised rpc task pointer.

So on this path there is indeed no active rpc task so we're using the
task structure in the struct nfs_write_data.  I agree that having
a helper function at the rpc layer to initialize it to a meaningful
value indicating there is no active rpc task would be a useful thing.

But the fix Peng sent is for the fallback path where we initiate
I/O to the MDS and we do build a rpc task properly.  On this path
lseg indeed needs to be put and set to NULL.

Benny

> 
> Ditto for pnfs_ld_read_done.
> 
> Cheers
>   Trond
> 
>>> -----Original Message-----
>>> From: Benny Halevy [mailto:bhalevy@tonian.com]
>>> Sent: Saturday, July 09, 2011 10:10 PM
>>> To: Peng Tao
>>> Cc: Trond.Myklebust@netapp.com; linux-nfs@vger.kernel.org; Peng, Tao
>>> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
>>>
>>> On 2011-07-04 04:30, Peng Tao wrote:
>>>> There is no need to keep lseg reference when read/write through MDS.
>>>> This fixes a null pointer crash at nfs_post_op_update_inode_force_wcc
>>>> because nfs4_proc_write_setup will unset wdata->res.fattr if wdata->lseg
>>>> is not NULL.
>>>>
>>>> Signed-off-by: Peng Tao <peng_tao@emc.com>
>>>
>>> Looks good to me.
>>>
>>> Benny
>>>
>>>> ---
>>>>  fs/nfs/pnfs.c |    6 ++++++
>>>>  1 files changed, 6 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>>>> index 30a0394..55fdf02 100644
>>>> --- a/fs/nfs/pnfs.c
>>>> +++ b/fs/nfs/pnfs.c
>>>> @@ -1193,6 +1193,9 @@ pnfs_ld_write_done(struct nfs_write_data *data)
>>>>
>>>>  	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
>>>>  		data->pnfs_error);
>>>> +
>>>> +	put_lseg(data->lseg);
>>>> +	data->lseg = NULL;
>>>>  	status = nfs_initiate_write(data, NFS_CLIENT(data->inode),
>>>>  				    data->mds_ops, NFS_FILE_SYNC);
>>>>  	return status ? : -EAGAIN;
>>>> @@ -1240,6 +1243,9 @@ pnfs_ld_read_done(struct nfs_read_data *data)
>>>>
>>>>  	dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
>>>>  		data->pnfs_error);
>>>> +
>>>> +	put_lseg(data->lseg);
>>>> +	data->lseg = NULL;
>>>>  	status = nfs_initiate_read(data, NFS_CLIENT(data->inode),
>>>>  				   data->mds_ops);
>>>>  	return status ? : -EAGAIN;
>>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-26 16:14             ` Myklebust, Trond
@ 2011-07-26 16:37               ` Benny Halevy
  0 siblings, 0 replies; 15+ messages in thread
From: Benny Halevy @ 2011-07-26 16:37 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: Jim Rees, Peng Tao, tao.peng, linux-nfs

On 2011-07-26 12:14, Myklebust, Trond wrote:
>> -----Original Message-----
>> From: Jim Rees [mailto:rees@umich.edu]
>> Sent: Tuesday, July 26, 2011 12:08 PM
>> To: Myklebust, Trond
>> Cc: Peng Tao; tao.peng@emc.com; linux-nfs@vger.kernel.org;
>> bhalevy@tonian.com
>> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
>>
>> Myklebust, Trond wrote:
>>
>>   > Thanks for your explanation. Is there any plan on how to fix
>>   > pnfs_ld_read/write_done? Basically, we would need an interface
> that
>>   > can redirect the IO to MDS if pnfs_error is set or do all
> necessary
>>   > cleanup work to end read/write if pnfs_error is 0. IMHO, the
>>   > recoalesce logic need to access nfs_pageio_descriptor but we do
> not
>>   > have that information at pnfs_ld_read/write_done.
>>
>>   As far as I can see, the right thing to do is to mark the layout as
>>   invalid and then redirty the page. It should be easy to have fsync()
>>   re-send the pages in this case. These should be extremely rare
>> events,
>>   since we expect to catch most of the pNFS failures when we do the
>> actual
>>   LAYOUTGET in the ->pg_init().
>>
>>   My main worry is for aio/dio where there is no good mechanism for
>>   retrying. I'm still working on that...
>>
>> What do you suggest we do for the current set of patches that add
> block
>> layout to pnfs?
> 
> If you are calling pnfs_ld_read/write_done, then don't change anything:
> it is easier to fix this in one spot rather than several.

Even if we plan on fixing this for the next merge window I think
there's value with the current fix even if it's going to be replaced
with a better fix along the road.

> However someone needs to start working on fixing the code in
> pnfs_ld_read/write_done to do the right thing. If nobody else has the
> cycles, then I can do that but I'd prefer to have someone who can easily
> test the resulting code do it.

I don't have cycles to code this either but I'll be happy to help
with looking at the design and reviewing the implementation.

Benny

> 
> Cheers
>   Trond
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-26 15:50         ` Myklebust, Trond
  2011-07-26 16:08           ` Jim Rees
@ 2011-07-26 17:32           ` Peng Tao
  2011-07-26 17:37             ` Myklebust, Trond
  2011-07-27 10:17           ` tao.peng
  2 siblings, 1 reply; 15+ messages in thread
From: Peng Tao @ 2011-07-26 17:32 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: tao.peng, linux-nfs, bhalevy

On Tue, Jul 26, 2011 at 11:50 PM, Myklebust, Trond
<Trond.Myklebust@netapp.com> wrote:
>> -----Original Message-----
>> From: Peng Tao [mailto:bergwolf@gmail.com]
>> Sent: Tuesday, July 26, 2011 11:37 AM
>> To: Myklebust, Trond
>> Cc: tao.peng@emc.com; linux-nfs@vger.kernel.org; bhalevy@tonian.com
>> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
>>
>> Hi, Trond,
>>
>> On Tue, Jul 26, 2011 at 3:13 AM, Trond Myklebust
>> <Trond.Myklebust@netapp.com> wrote:
>> > On Wed, 2011-07-20 at 01:52 -0400, tao.peng@emc.com wrote:
>> >> Hi, Trond,
>> >>
>> >> Any comments on this patch? I still get kernel crash when pnfs write
>> is attempted but fails and calls pnfs_ld_write_done(). It seems object
>> layout uses the same code path as well. But I don't find the patch in
>> either your tree or Benny's tree. Are there any concerns?
>> >>
>> >> Thanks,
>> >> Tao
>> >
>> > The whole pnfs_ld_write_done thing is bogus and needs to be replaced
>> > with something sane. It is trying to initiate a WRITE RPC call with
>> the
>> > wrong block size, and is calling the MDS rpc_call_done() and
>> > rpc_release() with an uninitialised rpc task pointer.
>> >
>> > Ditto for pnfs_ld_read_done.
>> Thanks for your explanation. Is there any plan on how to fix
>> pnfs_ld_read/write_done? Basically, we would need an interface that
>> can redirect the IO to MDS if pnfs_error is set or do all necessary
>> cleanup work to end read/write if pnfs_error is 0. IMHO, the
>> recoalesce logic need to access nfs_pageio_descriptor but we do not
>> have that information at pnfs_ld_read/write_done.
>
> As far as I can see, the right thing to do is to mark the layout as invalid and then redirty the page. It should be easy to have fsync() re-send the pages in this case. These should be extremely rare events, since we expect to catch most of the pNFS failures when we do the actual LAYOUTGET in the ->pg_init().
Agreed. This should be easier than re-coalescing and sending to MDS at
read/write_done.

>
> My main worry is for aio/dio where there is no good mechanism for retrying. I'm still working on that...
For dio, we may have to send the failed pages to MDS instead of
relying on next fsync() to retry.


Thanks,
Tao

>
> Cheers
>  Trond
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-26 17:32           ` Peng Tao
@ 2011-07-26 17:37             ` Myklebust, Trond
  2011-07-26 17:57               ` Peng Tao
  0 siblings, 1 reply; 15+ messages in thread
From: Myklebust, Trond @ 2011-07-26 17:37 UTC (permalink / raw)
  To: Peng Tao; +Cc: tao.peng, linux-nfs, bhalevy

PiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiBGcm9tOiBQZW5nIFRhbyBbbWFpbHRvOmJl
cmd3b2xmQGdtYWlsLmNvbV0NCj4gU2VudDogVHVlc2RheSwgSnVseSAyNiwgMjAxMSAxOjMzIFBN
DQo+IFRvOiBNeWtsZWJ1c3QsIFRyb25kDQo+IENjOiB0YW8ucGVuZ0BlbWMuY29tOyBsaW51eC1u
ZnNAdmdlci5rZXJuZWwub3JnOyBiaGFsZXZ5QHRvbmlhbi5jb20NCj4gU3ViamVjdDogUmU6IFtQ
QVRDSF0gTkZTNDE6IERyb3AgbHNlZyByZWYgYmVmb3JlIGZhbGx0aHJ1IHRvIE1EUw0KPiANCj4g
T24gVHVlLCBKdWwgMjYsIDIwMTEgYXQgMTE6NTAgUE0sIE15a2xlYnVzdCwgVHJvbmQNCj4gPFRy
b25kLk15a2xlYnVzdEBuZXRhcHAuY29tPiB3cm90ZToNCj4gPj4gLS0tLS1PcmlnaW5hbCBNZXNz
YWdlLS0tLS0NCj4gPj4gRnJvbTogUGVuZyBUYW8gW21haWx0bzpiZXJnd29sZkBnbWFpbC5jb21d
DQo+ID4+IFNlbnQ6IFR1ZXNkYXksIEp1bHkgMjYsIDIwMTEgMTE6MzcgQU0NCj4gPj4gVG86IE15
a2xlYnVzdCwgVHJvbmQNCj4gPj4gQ2M6IHRhby5wZW5nQGVtYy5jb207IGxpbnV4LW5mc0B2Z2Vy
Lmtlcm5lbC5vcmc7IGJoYWxldnlAdG9uaWFuLmNvbQ0KPiA+PiBTdWJqZWN0OiBSZTogW1BBVENI
XSBORlM0MTogRHJvcCBsc2VnIHJlZiBiZWZvcmUgZmFsbHRocnUgdG8gTURTDQo+ID4+DQo+ID4+
IEhpLCBUcm9uZCwNCj4gPj4NCj4gPj4gT24gVHVlLCBKdWwgMjYsIDIwMTEgYXQgMzoxMyBBTSwg
VHJvbmQgTXlrbGVidXN0DQo+ID4+IDxUcm9uZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbT4gd3JvdGU6
DQo+ID4+ID4gT24gV2VkLCAyMDExLTA3LTIwIGF0IDAxOjUyIC0wNDAwLCB0YW8ucGVuZ0BlbWMu
Y29tIHdyb3RlOg0KPiA+PiA+PiBIaSwgVHJvbmQsDQo+ID4+ID4+DQo+ID4+ID4+IEFueSBjb21t
ZW50cyBvbiB0aGlzIHBhdGNoPyBJIHN0aWxsIGdldCBrZXJuZWwgY3Jhc2ggd2hlbiBwbmZzDQo+
IHdyaXRlDQo+ID4+IGlzIGF0dGVtcHRlZCBidXQgZmFpbHMgYW5kIGNhbGxzIHBuZnNfbGRfd3Jp
dGVfZG9uZSgpLiBJdCBzZWVtcw0KPiBvYmplY3QNCj4gPj4gbGF5b3V0IHVzZXMgdGhlIHNhbWUg
Y29kZSBwYXRoIGFzIHdlbGwuIEJ1dCBJIGRvbid0IGZpbmQgdGhlIHBhdGNoDQo+IGluDQo+ID4+
IGVpdGhlciB5b3VyIHRyZWUgb3IgQmVubnkncyB0cmVlLiBBcmUgdGhlcmUgYW55IGNvbmNlcm5z
Pw0KPiA+PiA+Pg0KPiA+PiA+PiBUaGFua3MsDQo+ID4+ID4+IFRhbw0KPiA+PiA+DQo+ID4+ID4g
VGhlIHdob2xlIHBuZnNfbGRfd3JpdGVfZG9uZSB0aGluZyBpcyBib2d1cyBhbmQgbmVlZHMgdG8g
YmUNCj4gcmVwbGFjZWQNCj4gPj4gPiB3aXRoIHNvbWV0aGluZyBzYW5lLiBJdCBpcyB0cnlpbmcg
dG8gaW5pdGlhdGUgYSBXUklURSBSUEMgY2FsbA0KPiB3aXRoDQo+ID4+IHRoZQ0KPiA+PiA+IHdy
b25nIGJsb2NrIHNpemUsIGFuZCBpcyBjYWxsaW5nIHRoZSBNRFMgcnBjX2NhbGxfZG9uZSgpIGFu
ZA0KPiA+PiA+IHJwY19yZWxlYXNlKCkgd2l0aCBhbiB1bmluaXRpYWxpc2VkIHJwYyB0YXNrIHBv
aW50ZXIuDQo+ID4+ID4NCj4gPj4gPiBEaXR0byBmb3IgcG5mc19sZF9yZWFkX2RvbmUuDQo+ID4+
IFRoYW5rcyBmb3IgeW91ciBleHBsYW5hdGlvbi4gSXMgdGhlcmUgYW55IHBsYW4gb24gaG93IHRv
IGZpeA0KPiA+PiBwbmZzX2xkX3JlYWQvd3JpdGVfZG9uZT8gQmFzaWNhbGx5LCB3ZSB3b3VsZCBu
ZWVkIGFuIGludGVyZmFjZSB0aGF0DQo+ID4+IGNhbiByZWRpcmVjdCB0aGUgSU8gdG8gTURTIGlm
IHBuZnNfZXJyb3IgaXMgc2V0IG9yIGRvIGFsbCBuZWNlc3NhcnkNCj4gPj4gY2xlYW51cCB3b3Jr
IHRvIGVuZCByZWFkL3dyaXRlIGlmIHBuZnNfZXJyb3IgaXMgMC4gSU1ITywgdGhlDQo+ID4+IHJl
Y29hbGVzY2UgbG9naWMgbmVlZCB0byBhY2Nlc3MgbmZzX3BhZ2Vpb19kZXNjcmlwdG9yIGJ1dCB3
ZSBkbyBub3QNCj4gPj4gaGF2ZSB0aGF0IGluZm9ybWF0aW9uIGF0IHBuZnNfbGRfcmVhZC93cml0
ZV9kb25lLg0KPiA+DQo+ID4gQXMgZmFyIGFzIEkgY2FuIHNlZSwgdGhlIHJpZ2h0IHRoaW5nIHRv
IGRvIGlzIHRvIG1hcmsgdGhlIGxheW91dCBhcw0KPiBpbnZhbGlkIGFuZCB0aGVuIHJlZGlydHkg
dGhlIHBhZ2UuIEl0IHNob3VsZCBiZSBlYXN5IHRvIGhhdmUgZnN5bmMoKQ0KPiByZS1zZW5kIHRo
ZSBwYWdlcyBpbiB0aGlzIGNhc2UuIFRoZXNlIHNob3VsZCBiZSBleHRyZW1lbHkgcmFyZSBldmVu
dHMsDQo+IHNpbmNlIHdlIGV4cGVjdCB0byBjYXRjaCBtb3N0IG9mIHRoZSBwTkZTIGZhaWx1cmVz
IHdoZW4gd2UgZG8gdGhlDQo+IGFjdHVhbCBMQVlPVVRHRVQgaW4gdGhlIC0+cGdfaW5pdCgpLg0K
PiBBZ3JlZWQuIFRoaXMgc2hvdWxkIGJlIGVhc2llciB0aGFuIHJlLWNvYWxlc2NpbmcgYW5kIHNl
bmRpbmcgdG8gTURTIGF0DQo+IHJlYWQvd3JpdGVfZG9uZS4NCj4gDQo+ID4NCj4gPiBNeSBtYWlu
IHdvcnJ5IGlzIGZvciBhaW8vZGlvIHdoZXJlIHRoZXJlIGlzIG5vIGdvb2QgbWVjaGFuaXNtIGZv
cg0KPiByZXRyeWluZy4gSSdtIHN0aWxsIHdvcmtpbmcgb24gdGhhdC4uLg0KPiBGb3IgZGlvLCB3
ZSBtYXkgaGF2ZSB0byBzZW5kIHRoZSBmYWlsZWQgcGFnZXMgdG8gTURTIGluc3RlYWQgb2YNCj4g
cmVseWluZyBvbiBuZXh0IGZzeW5jKCkgdG8gcmV0cnkuDQoNClRoZSBwcm9ibGVtIGlzbid0IHdo
YXQgdG8gZG8sIGl0IGlzIG1vcmUgb25lIG9mIF93aG9fIGRvZXMgaXQuIFRoZSBycGNpb2QvbmZz
aW9kIHF1ZXVlcyBhcmVuJ3QgdGhlIGlkZWFsIHBsYWNlIHRvIHNldCB1cCBhIHJlc2VuZCBzaW5j
ZSBpdCBpbnZvbHZlcyBhbGxvY2F0aW5nIG1lbW9yeS4NCg0KQ2hlZXJzDQogIFRyb25kDQoT77+9
77+97Lm7HO+/vSbvv71+77+9Ju+/vRjvv73vv70rLe+/ve+/vd22F++/ve+/vXfvv73vv73Lm++/
ve+/ve+/vW3vv71i77+977+9Z37Ip++/vRfvv73vv73cqH3vv73vv73vv73GoHrvv70majordu+/
ve+/ve+/vQfvv73vv73vv73vv716Wivvv73vv70rembvv73vv73vv71o77+977+977+9fu+/ve+/
ve+/ve+/vWnvv73vv73vv71677+9Hu+/vXfvv73vv73vv70/77+977+977+977+9Ju+/vSnfohtm

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-26 17:37             ` Myklebust, Trond
@ 2011-07-26 17:57               ` Peng Tao
  0 siblings, 0 replies; 15+ messages in thread
From: Peng Tao @ 2011-07-26 17:57 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: tao.peng, linux-nfs, bhalevy

On Wed, Jul 27, 2011 at 1:37 AM, Myklebust, Trond
<Trond.Myklebust@netapp.com> wrote:
>> -----Original Message-----
>> From: Peng Tao [mailto:bergwolf@gmail.com]
>> Sent: Tuesday, July 26, 2011 1:33 PM
>> To: Myklebust, Trond
>> Cc: tao.peng@emc.com; linux-nfs@vger.kernel.org; bhalevy@tonian.com
>> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
>>
>> On Tue, Jul 26, 2011 at 11:50 PM, Myklebust, Trond
>> <Trond.Myklebust@netapp.com> wrote:
>> >> -----Original Message-----
>> >> From: Peng Tao [mailto:bergwolf@gmail.com]
>> >> Sent: Tuesday, July 26, 2011 11:37 AM
>> >> To: Myklebust, Trond
>> >> Cc: tao.peng@emc.com; linux-nfs@vger.kernel.org; bhalevy@tonian.com
>> >> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
>> >>
>> >> Hi, Trond,
>> >>
>> >> On Tue, Jul 26, 2011 at 3:13 AM, Trond Myklebust
>> >> <Trond.Myklebust@netapp.com> wrote:
>> >> > On Wed, 2011-07-20 at 01:52 -0400, tao.peng@emc.com wrote:
>> >> >> Hi, Trond,
>> >> >>
>> >> >> Any comments on this patch? I still get kernel crash when pnfs
>> write
>> >> is attempted but fails and calls pnfs_ld_write_done(). It seems
>> object
>> >> layout uses the same code path as well. But I don't find the patch
>> in
>> >> either your tree or Benny's tree. Are there any concerns?
>> >> >>
>> >> >> Thanks,
>> >> >> Tao
>> >> >
>> >> > The whole pnfs_ld_write_done thing is bogus and needs to be
>> replaced
>> >> > with something sane. It is trying to initiate a WRITE RPC call
>> with
>> >> the
>> >> > wrong block size, and is calling the MDS rpc_call_done() and
>> >> > rpc_release() with an uninitialised rpc task pointer.
>> >> >
>> >> > Ditto for pnfs_ld_read_done.
>> >> Thanks for your explanation. Is there any plan on how to fix
>> >> pnfs_ld_read/write_done? Basically, we would need an interface that
>> >> can redirect the IO to MDS if pnfs_error is set or do all necessary
>> >> cleanup work to end read/write if pnfs_error is 0. IMHO, the
>> >> recoalesce logic need to access nfs_pageio_descriptor but we do not
>> >> have that information at pnfs_ld_read/write_done.
>> >
>> > As far as I can see, the right thing to do is to mark the layout as
>> invalid and then redirty the page. It should be easy to have fsync()
>> re-send the pages in this case. These should be extremely rare events,
>> since we expect to catch most of the pNFS failures when we do the
>> actual LAYOUTGET in the ->pg_init().
>> Agreed. This should be easier than re-coalescing and sending to MDS at
>> read/write_done.
>>
>> >
>> > My main worry is for aio/dio where there is no good mechanism for
>> retrying. I'm still working on that...
>> For dio, we may have to send the failed pages to MDS instead of
>> relying on next fsync() to retry.
>
> The problem isn't what to do, it is more one of _who_ does it. The rpciod/nfsiod queues aren't the ideal place to set up a resend since it involves allocating memory.
How about having a pnfs private workqueue to take care of the resend?
There are some other places default workqueue is used in io path in
both block and object layout code. It can be problematic if the
default workqueue is blocked.  e.g. if someone on the default
workqueue allocates memory and reclaim code comes into pnfs path.
Using default workqueue here can cause application hang forever. If we
have a private workqueue, these problems can be solve IMO.

Best,
Tao


>
> Cheers
>  Trond
>



-- 
Thanks,
-Bergwolf

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
  2011-07-26 15:50         ` Myklebust, Trond
  2011-07-26 16:08           ` Jim Rees
  2011-07-26 17:32           ` Peng Tao
@ 2011-07-27 10:17           ` tao.peng
  2 siblings, 0 replies; 15+ messages in thread
From: tao.peng @ 2011-07-27 10:17 UTC (permalink / raw)
  To: Trond.Myklebust; +Cc: linux-nfs, bhalevy

SGksIFRyb25kLA0KDQo+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+IEZyb206IGxpbnV4
LW5mcy1vd25lckB2Z2VyLmtlcm5lbC5vcmcgW21haWx0bzpsaW51eC1uZnMtb3duZXJAdmdlci5r
ZXJuZWwub3JnXQ0KPiBPbiBCZWhhbGYgT2YgTXlrbGVidXN0LCBUcm9uZA0KPiBTZW50OiBUdWVz
ZGF5LCBKdWx5IDI2LCAyMDExIDExOjUwIFBNDQo+IFRvOiBQZW5nIFRhbw0KPiBDYzogUGVuZywg
VGFvOyBsaW51eC1uZnNAdmdlci5rZXJuZWwub3JnOyBiaGFsZXZ5QHRvbmlhbi5jb20NCj4gU3Vi
amVjdDogUkU6IFtQQVRDSF0gTkZTNDE6IERyb3AgbHNlZyByZWYgYmVmb3JlIGZhbGx0aHJ1IHRv
IE1EUw0KPiANCj4gPiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiA+IEZyb206IFBlbmcg
VGFvIFttYWlsdG86YmVyZ3dvbGZAZ21haWwuY29tXQ0KPiA+IFNlbnQ6IFR1ZXNkYXksIEp1bHkg
MjYsIDIwMTEgMTE6MzcgQU0NCj4gPiBUbzogTXlrbGVidXN0LCBUcm9uZA0KPiA+IENjOiB0YW8u
cGVuZ0BlbWMuY29tOyBsaW51eC1uZnNAdmdlci5rZXJuZWwub3JnOyBiaGFsZXZ5QHRvbmlhbi5j
b20NCj4gPiBTdWJqZWN0OiBSZTogW1BBVENIXSBORlM0MTogRHJvcCBsc2VnIHJlZiBiZWZvcmUg
ZmFsbHRocnUgdG8gTURTDQo+ID4NCj4gPiBIaSwgVHJvbmQsDQo+ID4NCj4gPiBPbiBUdWUsIEp1
bCAyNiwgMjAxMSBhdCAzOjEzIEFNLCBUcm9uZCBNeWtsZWJ1c3QNCj4gPiA8VHJvbmQuTXlrbGVi
dXN0QG5ldGFwcC5jb20+IHdyb3RlOg0KPiA+ID4gT24gV2VkLCAyMDExLTA3LTIwIGF0IDAxOjUy
IC0wNDAwLCB0YW8ucGVuZ0BlbWMuY29tIHdyb3RlOg0KPiA+ID4+IEhpLCBUcm9uZCwNCj4gPiA+
Pg0KPiA+ID4+IEFueSBjb21tZW50cyBvbiB0aGlzIHBhdGNoPyBJIHN0aWxsIGdldCBrZXJuZWwg
Y3Jhc2ggd2hlbiBwbmZzIHdyaXRlDQo+ID4gaXMgYXR0ZW1wdGVkIGJ1dCBmYWlscyBhbmQgY2Fs
bHMgcG5mc19sZF93cml0ZV9kb25lKCkuIEl0IHNlZW1zIG9iamVjdA0KPiA+IGxheW91dCB1c2Vz
IHRoZSBzYW1lIGNvZGUgcGF0aCBhcyB3ZWxsLiBCdXQgSSBkb24ndCBmaW5kIHRoZSBwYXRjaCBp
bg0KPiA+IGVpdGhlciB5b3VyIHRyZWUgb3IgQmVubnkncyB0cmVlLiBBcmUgdGhlcmUgYW55IGNv
bmNlcm5zPw0KPiA+ID4+DQo+ID4gPj4gVGhhbmtzLA0KPiA+ID4+IFRhbw0KPiA+ID4NCj4gPiA+
IFRoZSB3aG9sZSBwbmZzX2xkX3dyaXRlX2RvbmUgdGhpbmcgaXMgYm9ndXMgYW5kIG5lZWRzIHRv
IGJlIHJlcGxhY2VkDQo+ID4gPiB3aXRoIHNvbWV0aGluZyBzYW5lLiBJdCBpcyB0cnlpbmcgdG8g
aW5pdGlhdGUgYSBXUklURSBSUEMgY2FsbCB3aXRoDQo+ID4gdGhlDQo+ID4gPiB3cm9uZyBibG9j
ayBzaXplLCBhbmQgaXMgY2FsbGluZyB0aGUgTURTIHJwY19jYWxsX2RvbmUoKSBhbmQNCj4gPiA+
IHJwY19yZWxlYXNlKCkgd2l0aCBhbiB1bmluaXRpYWxpc2VkIHJwYyB0YXNrIHBvaW50ZXIuDQo+
ID4gPg0KPiA+ID4gRGl0dG8gZm9yIHBuZnNfbGRfcmVhZF9kb25lLg0KPiA+IFRoYW5rcyBmb3Ig
eW91ciBleHBsYW5hdGlvbi4gSXMgdGhlcmUgYW55IHBsYW4gb24gaG93IHRvIGZpeA0KPiA+IHBu
ZnNfbGRfcmVhZC93cml0ZV9kb25lPyBCYXNpY2FsbHksIHdlIHdvdWxkIG5lZWQgYW4gaW50ZXJm
YWNlIHRoYXQNCj4gPiBjYW4gcmVkaXJlY3QgdGhlIElPIHRvIE1EUyBpZiBwbmZzX2Vycm9yIGlz
IHNldCBvciBkbyBhbGwgbmVjZXNzYXJ5DQo+ID4gY2xlYW51cCB3b3JrIHRvIGVuZCByZWFkL3dy
aXRlIGlmIHBuZnNfZXJyb3IgaXMgMC4gSU1ITywgdGhlDQo+ID4gcmVjb2FsZXNjZSBsb2dpYyBu
ZWVkIHRvIGFjY2VzcyBuZnNfcGFnZWlvX2Rlc2NyaXB0b3IgYnV0IHdlIGRvIG5vdA0KPiA+IGhh
dmUgdGhhdCBpbmZvcm1hdGlvbiBhdCBwbmZzX2xkX3JlYWQvd3JpdGVfZG9uZS4NCj4gDQo+IEFz
IGZhciBhcyBJIGNhbiBzZWUsIHRoZSByaWdodCB0aGluZyB0byBkbyBpcyB0byBtYXJrIHRoZSBs
YXlvdXQgYXMgaW52YWxpZCBhbmQgdGhlbg0KPiByZWRpcnR5IHRoZSBwYWdlLiBJdCBzaG91bGQg
YmUgZWFzeSB0byBoYXZlIGZzeW5jKCkgcmUtc2VuZCB0aGUgcGFnZXMgaW4gdGhpcyBjYXNlLg0K
PiBUaGVzZSBzaG91bGQgYmUgZXh0cmVtZWx5IHJhcmUgZXZlbnRzLCBzaW5jZSB3ZSBleHBlY3Qg
dG8gY2F0Y2ggbW9zdCBvZiB0aGUgcE5GUw0KPiBmYWlsdXJlcyB3aGVuIHdlIGRvIHRoZSBhY3R1
YWwgTEFZT1VUR0VUIGluIHRoZSAtPnBnX2luaXQoKS4NCkFub3RoZXIgcHJvYmxlbSB0aGF0IGp1
c3QgY29tZXMgaW50byBteSBoZWFkIGlzIHJlYWRwYWdlIGZhaWx1cmVzLiBIb3cgZG8gd2UgcmVz
Y2hkdWxlIHJlYWQgZmFpbHVyZXMgaWYgbm90IHJlLXNlbmRpbmcgaXQgdG8gTURTPw0KDQpUaGFu
a3MsDQpUYW8NCj4gDQo+IE15IG1haW4gd29ycnkgaXMgZm9yIGFpby9kaW8gd2hlcmUgdGhlcmUg
aXMgbm8gZ29vZCBtZWNoYW5pc20gZm9yIHJldHJ5aW5nLiBJJ20gc3RpbGwNCj4gd29ya2luZyBv
biB0aGF0Li4uDQo+IA0KPiBDaGVlcnMNCj4gICBUcm9uZA0KPiAE77+9ey5u77+9K++/ve+/ve+/
ve+/ve+/ve+/ve+/vSsl77+977+9bHp3be+/ve+/vWLvv73rp7Lvv73vv71y77+977+9eljvv73v
v70Z37Ip77+977+977+9dyoNCj4gamfvv73vv73vv70e77+977+977+977+977+93aJqL++/ve+/
ve+/vXrvv73elu+/ve+/vTLvv73eme+/ve+/ve+/vSbvv70p36Hvv71h77+977+9f++/ve+/vR7v
v71H77+977+977+9aO+/vQ/vv71qOit277+977+977+9d++/vdmlDQoT77+977+97Lm7HO+/vSbv
v71+77+9Ju+/vRjvv73vv70rLe+/ve+/vd22F++/ve+/vXfvv73vv73Lm++/ve+/ve+/vW3vv71i
77+977+9Z37Ip++/vRfvv73vv73cqH3vv73vv73vv73GoHrvv70majordu+/ve+/ve+/vQfvv73v
v73vv73vv716Wivvv73vv70rembvv73vv73vv71o77+977+977+9fu+/ve+/ve+/ve+/vWnvv73v
v73vv71677+9Hu+/vXfvv73vv73vv70/77+977+977+977+9Ju+/vSnfohtm

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2011-07-27 10:17 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-04  1:30 [PATCH] NFS41: Drop lseg ref before fallthru to MDS Peng Tao
2011-07-09 14:10 ` Benny Halevy
2011-07-20  5:52   ` tao.peng
2011-07-25 19:13     ` Trond Myklebust
2011-07-25 21:35       ` Jim Rees
2011-07-26 15:37       ` Peng Tao
2011-07-26 15:50         ` Myklebust, Trond
2011-07-26 16:08           ` Jim Rees
2011-07-26 16:14             ` Myklebust, Trond
2011-07-26 16:37               ` Benny Halevy
2011-07-26 17:32           ` Peng Tao
2011-07-26 17:37             ` Myklebust, Trond
2011-07-26 17:57               ` Peng Tao
2011-07-27 10:17           ` tao.peng
2011-07-26 16:34       ` Benny Halevy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).