linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget
@ 2018-09-05 18:07 Trond Myklebust
  2018-09-05 18:07 ` [PATCH 2/4] NFSv4: Fix a tracepoint Oops in initiate_file_draining() Trond Myklebust
  2019-03-12 20:04 ` [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget Schumaker, Anna
  0 siblings, 2 replies; 7+ messages in thread
From: Trond Myklebust @ 2018-09-05 18:07 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

If someone interrupts a wait on one or more outstanding layoutgets in
pnfs_update_layout() then return the ERESTARTSYS/EINTR error.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/pnfs.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index e8f232de484f..7d9a51e6b847 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1740,16 +1740,16 @@ static bool pnfs_within_mdsthreshold(struct nfs_open_context *ctx,
 	return ret;
 }
 
-static bool pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr *lo)
+static int pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr *lo)
 {
 	/*
 	 * send layoutcommit as it can hold up layoutreturn due to lseg
 	 * reference
 	 */
 	pnfs_layoutcommit_inode(lo->plh_inode, false);
-	return !wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
+	return wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
 				   nfs_wait_bit_killable,
-				   TASK_UNINTERRUPTIBLE);
+				   TASK_KILLABLE);
 }
 
 static void nfs_layoutget_begin(struct pnfs_layout_hdr *lo)
@@ -1830,7 +1830,9 @@ pnfs_update_layout(struct inode *ino,
 	}
 
 lookup_again:
-	nfs4_client_recover_expired_lease(clp);
+	lseg = ERR_PTR(nfs4_client_recover_expired_lease(clp));
+	if (IS_ERR(lseg))
+		goto out;
 	first = false;
 	spin_lock(&ino->i_lock);
 	lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
@@ -1863,9 +1865,9 @@ pnfs_update_layout(struct inode *ino,
 	if (list_empty(&lo->plh_segs) &&
 	    atomic_read(&lo->plh_outstanding) != 0) {
 		spin_unlock(&ino->i_lock);
-		if (wait_var_event_killable(&lo->plh_outstanding,
-					atomic_read(&lo->plh_outstanding) == 0
-					|| !list_empty(&lo->plh_segs)))
+		lseg = ERR_PTR(wait_var_event_killable(&lo->plh_outstanding,
+					atomic_read(&lo->plh_outstanding)));
+		if (IS_ERR(lseg) || !list_empty(&lo->plh_segs))
 			goto out_put_layout_hdr;
 		pnfs_put_layout_hdr(lo);
 		goto lookup_again;
@@ -1898,8 +1900,11 @@ pnfs_update_layout(struct inode *ino,
 		if (test_and_set_bit(NFS_LAYOUT_FIRST_LAYOUTGET,
 				     &lo->plh_flags)) {
 			spin_unlock(&ino->i_lock);
-			wait_on_bit(&lo->plh_flags, NFS_LAYOUT_FIRST_LAYOUTGET,
-				    TASK_UNINTERRUPTIBLE);
+			lseg = ERR_PTR(wait_on_bit(&lo->plh_flags,
+						NFS_LAYOUT_FIRST_LAYOUTGET,
+						TASK_KILLABLE));
+			if (IS_ERR(lseg))
+				goto out_put_layout_hdr;
 			pnfs_put_layout_hdr(lo);
 			dprintk("%s retrying\n", __func__);
 			goto lookup_again;
@@ -1925,7 +1930,8 @@ pnfs_update_layout(struct inode *ino,
 	if (test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags)) {
 		spin_unlock(&ino->i_lock);
 		dprintk("%s wait for layoutreturn\n", __func__);
-		if (pnfs_prepare_to_retry_layoutget(lo)) {
+		lseg = ERR_PTR(pnfs_prepare_to_retry_layoutget(lo));
+		if (!IS_ERR(lseg)) {
 			if (first)
 				pnfs_clear_first_layoutget(lo);
 			pnfs_put_layout_hdr(lo);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/4] NFSv4: Fix a tracepoint Oops in initiate_file_draining()
  2018-09-05 18:07 [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget Trond Myklebust
@ 2018-09-05 18:07 ` Trond Myklebust
  2018-09-05 18:07   ` [PATCH 3/4] NFSv4.1 fix infinite loop on I/O Trond Myklebust
  2019-03-12 20:04 ` [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget Schumaker, Anna
  1 sibling, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2018-09-05 18:07 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to
change the test in the tracepoint.

Fixes: ce5624f7e6675 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index a275fba93170..708342f4692f 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -1194,7 +1194,7 @@ DECLARE_EVENT_CLASS(nfs4_inode_stateid_callback_event,
 		TP_fast_assign(
 			__entry->error = error;
 			__entry->fhandle = nfs_fhandle_hash(fhandle);
-			if (inode != NULL) {
+			if (!IS_ERR_OR_NULL(inode)) {
 				__entry->fileid = NFS_FILEID(inode);
 				__entry->dev = inode->i_sb->s_dev;
 			} else {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/4] NFSv4.1 fix infinite loop on I/O.
  2018-09-05 18:07 ` [PATCH 2/4] NFSv4: Fix a tracepoint Oops in initiate_file_draining() Trond Myklebust
@ 2018-09-05 18:07   ` Trond Myklebust
  2018-09-05 18:07     ` [PATCH 4/4] NFS: Don't open code clearing of delegation state Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2018-09-05 18:07 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

The previous fix broke recovery of delegated stateids because it assumes
that if we did not mark the delegation as suspect, then the delegation has
effectively been revoked, and so it removes that delegation irrespectively
of whether or not it is valid and still in use. While this is "mostly
harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning
in an infinite loop while complaining that we're using an invalid stateid
(in this case the all-zero stateid).

What we rather want to do here is ensure that the delegation is always
correctly marked as needing testing when that is the case. So we want
to close the loophole offered by nfs4_schedule_stateid_recovery(),
which marks the state as needing to be reclaimed, but not the
delegation that may be backing it.

Fixes: 0e3d3e5df07dc ("NFSv4.1 fix infinite loop on IO BAD_STATEID error")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4proc.c  | 10 +++++++---
 fs/nfs/nfs4state.c |  2 ++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 34830f6457ea..2e4456ac9556 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2676,14 +2676,18 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
 	}
 
 	nfs4_stateid_copy(&stateid, &delegation->stateid);
-	if (test_bit(NFS_DELEGATION_REVOKED, &delegation->flags) ||
-		!test_and_clear_bit(NFS_DELEGATION_TEST_EXPIRED,
-			&delegation->flags)) {
+	if (test_bit(NFS_DELEGATION_REVOKED, &delegation->flags)) {
 		rcu_read_unlock();
 		nfs_finish_clear_delegation_stateid(state, &stateid);
 		return;
 	}
 
+	if (!test_and_clear_bit(NFS_DELEGATION_TEST_EXPIRED,
+				&delegation->flags)) {
+		rcu_read_unlock();
+		return;
+	}
+
 	cred = get_rpccred(delegation->cred);
 	rcu_read_unlock();
 	status = nfs41_test_and_free_expired_stateid(server, &stateid, cred);
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 3df0eb52da1c..40a08cd483f0 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1390,6 +1390,8 @@ int nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4_
 
 	if (!nfs4_state_mark_reclaim_nograce(clp, state))
 		return -EBADF;
+	nfs_inode_find_delegation_state_and_recover(state->inode,
+			&state->stateid);
 	dprintk("%s: scheduling stateid recovery for server %s\n", __func__,
 			clp->cl_hostname);
 	nfs4_schedule_state_manager(clp);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/4] NFS: Don't open code clearing of delegation state
  2018-09-05 18:07   ` [PATCH 3/4] NFSv4.1 fix infinite loop on I/O Trond Myklebust
@ 2018-09-05 18:07     ` Trond Myklebust
  0 siblings, 0 replies; 7+ messages in thread
From: Trond Myklebust @ 2018-09-05 18:07 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

Add a helper for the case when the nfs4 open state has been set to use
a delegation stateid, and we want to revert to using the open stateid.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs4proc.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 2e4456ac9556..8220a168282e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1637,6 +1637,14 @@ static void nfs_state_set_delegation(struct nfs4_state *state,
 	write_sequnlock(&state->seqlock);
 }
 
+static void nfs_state_clear_delegation(struct nfs4_state *state)
+{
+	write_seqlock(&state->seqlock);
+	nfs4_stateid_copy(&state->stateid, &state->open_stateid);
+	clear_bit(NFS_DELEGATED_STATE, &state->flags);
+	write_sequnlock(&state->seqlock);
+}
+
 static int update_open_stateid(struct nfs4_state *state,
 		const nfs4_stateid *open_stateid,
 		const nfs4_stateid *delegation,
@@ -2145,10 +2153,7 @@ int nfs4_open_delegation_recall(struct nfs_open_context *ctx,
 	if (IS_ERR(opendata))
 		return PTR_ERR(opendata);
 	nfs4_stateid_copy(&opendata->o_arg.u.delegation, stateid);
-	write_seqlock(&state->seqlock);
-	nfs4_stateid_copy(&state->stateid, &state->open_stateid);
-	write_sequnlock(&state->seqlock);
-	clear_bit(NFS_DELEGATED_STATE, &state->flags);
+	nfs_state_clear_delegation(state);
 	switch (type & (FMODE_READ|FMODE_WRITE)) {
 	case FMODE_READ|FMODE_WRITE:
 	case FMODE_WRITE:
@@ -2601,10 +2606,7 @@ static void nfs_finish_clear_delegation_stateid(struct nfs4_state *state,
 		const nfs4_stateid *stateid)
 {
 	nfs_remove_bad_delegation(state->inode, stateid);
-	write_seqlock(&state->seqlock);
-	nfs4_stateid_copy(&state->stateid, &state->open_stateid);
-	write_sequnlock(&state->seqlock);
-	clear_bit(NFS_DELEGATED_STATE, &state->flags);
+	nfs_state_clear_delegation(state);
 }
 
 static void nfs40_clear_delegation_stateid(struct nfs4_state *state)
@@ -2672,13 +2674,14 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
 	delegation = rcu_dereference(NFS_I(state->inode)->delegation);
 	if (delegation == NULL) {
 		rcu_read_unlock();
+		nfs_state_clear_delegation(state);
 		return;
 	}
 
 	nfs4_stateid_copy(&stateid, &delegation->stateid);
 	if (test_bit(NFS_DELEGATION_REVOKED, &delegation->flags)) {
 		rcu_read_unlock();
-		nfs_finish_clear_delegation_stateid(state, &stateid);
+		nfs_state_clear_delegation(state);
 		return;
 	}
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget
  2018-09-05 18:07 [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget Trond Myklebust
  2018-09-05 18:07 ` [PATCH 2/4] NFSv4: Fix a tracepoint Oops in initiate_file_draining() Trond Myklebust
@ 2019-03-12 20:04 ` Schumaker, Anna
  2019-03-12 20:12   ` Trond Myklebust
  1 sibling, 1 reply; 7+ messages in thread
From: Schumaker, Anna @ 2019-03-12 20:04 UTC (permalink / raw)
  To: trondmy; +Cc: linux-nfs

Hi Trond,

I'm seeing a hang when testing xfstests generic/013 on v4.1 with pNFS after this
patch:

On Wed, 2018-09-05 at 14:07 -0400, Trond Myklebust wrote:
> If someone interrupts a wait on one or more outstanding layoutgets in
> pnfs_update_layout() then return the ERESTARTSYS/EINTR error.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
>  fs/nfs/pnfs.c | 26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index e8f232de484f..7d9a51e6b847 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -1740,16 +1740,16 @@ static bool pnfs_within_mdsthreshold(struct
> nfs_open_context *ctx,
>  	return ret;
>  }
>  
> -static bool pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr *lo)
> +static int pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr *lo)
>  {
>  	/*
>  	 * send layoutcommit as it can hold up layoutreturn due to lseg
>  	 * reference
>  	 */
>  	pnfs_layoutcommit_inode(lo->plh_inode, false);
> -	return !wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
> +	return wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
>  				   nfs_wait_bit_killable,
> -				   TASK_UNINTERRUPTIBLE);
> +				   TASK_KILLABLE);
>  }
>  
>  static void nfs_layoutget_begin(struct pnfs_layout_hdr *lo)
> @@ -1830,7 +1830,9 @@ pnfs_update_layout(struct inode *ino,
>  	}
>  
>  lookup_again:
> -	nfs4_client_recover_expired_lease(clp);
> +	lseg = ERR_PTR(nfs4_client_recover_expired_lease(clp));
> +	if (IS_ERR(lseg))
> +		goto out;
>  	first = false;
>  	spin_lock(&ino->i_lock);
>  	lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
> @@ -1863,9 +1865,9 @@ pnfs_update_layout(struct inode *ino,
>  	if (list_empty(&lo->plh_segs) &&
>  	    atomic_read(&lo->plh_outstanding) != 0) {
>  		spin_unlock(&ino->i_lock);
> -		if (wait_var_event_killable(&lo->plh_outstanding,
> -					atomic_read(&lo->plh_outstanding) == 0
> -					|| !list_empty(&lo->plh_segs)))
> +		lseg = ERR_PTR(wait_var_event_killable(&lo->plh_outstanding,
> +					atomic_read(&lo->plh_outstanding)));
> +		if (IS_ERR(lseg) || !list_empty(&lo->plh_segs))

Was dropping the "== 0" condition attached to the atomic_read() here a mistake?
I think what's happening is that my client is waiting for plh_outstanding to be
anything other than 0 when there isn't any work left to do.

Thanks,
Anna

>  			goto out_put_layout_hdr;
>  		pnfs_put_layout_hdr(lo);
>  		goto lookup_again;
> @@ -1898,8 +1900,11 @@ pnfs_update_layout(struct inode *ino,
>  		if (test_and_set_bit(NFS_LAYOUT_FIRST_LAYOUTGET,
>  				     &lo->plh_flags)) {
>  			spin_unlock(&ino->i_lock);
> -			wait_on_bit(&lo->plh_flags, NFS_LAYOUT_FIRST_LAYOUTGET,
> -				    TASK_UNINTERRUPTIBLE);
> +			lseg = ERR_PTR(wait_on_bit(&lo->plh_flags,
> +						NFS_LAYOUT_FIRST_LAYOUTGET,
> +						TASK_KILLABLE));
> +			if (IS_ERR(lseg))
> +				goto out_put_layout_hdr;
>  			pnfs_put_layout_hdr(lo);
>  			dprintk("%s retrying\n", __func__);
>  			goto lookup_again;
> @@ -1925,7 +1930,8 @@ pnfs_update_layout(struct inode *ino,
>  	if (test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags)) {
>  		spin_unlock(&ino->i_lock);
>  		dprintk("%s wait for layoutreturn\n", __func__);
> -		if (pnfs_prepare_to_retry_layoutget(lo)) {
> +		lseg = ERR_PTR(pnfs_prepare_to_retry_layoutget(lo));
> +		if (!IS_ERR(lseg)) {
>  			if (first)
>  				pnfs_clear_first_layoutget(lo);
>  			pnfs_put_layout_hdr(lo);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget
  2019-03-12 20:04 ` [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget Schumaker, Anna
@ 2019-03-12 20:12   ` Trond Myklebust
  2019-03-12 21:22     ` Anna Schumaker
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2019-03-12 20:12 UTC (permalink / raw)
  To: Anna.Schumaker; +Cc: linux-nfs

On Tue, 2019-03-12 at 20:04 +0000, Schumaker, Anna wrote:
> Hi Trond,
> 
> I'm seeing a hang when testing xfstests generic/013 on v4.1 with pNFS
> after this
> patch:
> 
> On Wed, 2018-09-05 at 14:07 -0400, Trond Myklebust wrote:
> > If someone interrupts a wait on one or more outstanding layoutgets
> > in
> > pnfs_update_layout() then return the ERESTARTSYS/EINTR error.
> > 
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > ---
> >  fs/nfs/pnfs.c | 26 ++++++++++++++++----------
> >  1 file changed, 16 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> > index e8f232de484f..7d9a51e6b847 100644
> > --- a/fs/nfs/pnfs.c
> > +++ b/fs/nfs/pnfs.c
> > @@ -1740,16 +1740,16 @@ static bool pnfs_within_mdsthreshold(struct
> > nfs_open_context *ctx,
> >  	return ret;
> >  }
> >  
> > -static bool pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr
> > *lo)
> > +static int pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr
> > *lo)
> >  {
> >  	/*
> >  	 * send layoutcommit as it can hold up layoutreturn due to lseg
> >  	 * reference
> >  	 */
> >  	pnfs_layoutcommit_inode(lo->plh_inode, false);
> > -	return !wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
> > +	return wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
> >  				   nfs_wait_bit_killable,
> > -				   TASK_UNINTERRUPTIBLE);
> > +				   TASK_KILLABLE);
> >  }
> >  
> >  static void nfs_layoutget_begin(struct pnfs_layout_hdr *lo)
> > @@ -1830,7 +1830,9 @@ pnfs_update_layout(struct inode *ino,
> >  	}
> >  
> >  lookup_again:
> > -	nfs4_client_recover_expired_lease(clp);
> > +	lseg = ERR_PTR(nfs4_client_recover_expired_lease(clp));
> > +	if (IS_ERR(lseg))
> > +		goto out;
> >  	first = false;
> >  	spin_lock(&ino->i_lock);
> >  	lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
> > @@ -1863,9 +1865,9 @@ pnfs_update_layout(struct inode *ino,
> >  	if (list_empty(&lo->plh_segs) &&
> >  	    atomic_read(&lo->plh_outstanding) != 0) {
> >  		spin_unlock(&ino->i_lock);
> > -		if (wait_var_event_killable(&lo->plh_outstanding,
> > -					atomic_read(&lo-
> > >plh_outstanding) == 0
> > -					|| !list_empty(&lo->plh_segs)))
> > +		lseg = ERR_PTR(wait_var_event_killable(&lo-
> > >plh_outstanding,
> > +					atomic_read(&lo-
> > >plh_outstanding)));
> > +		if (IS_ERR(lseg) || !list_empty(&lo->plh_segs))
> 
> Was dropping the "== 0" condition attached to the atomic_read() here
> a mistake?
> I think what's happening is that my client is waiting for
> plh_outstanding to be
> anything other than 0 when there isn't any work left to do.

Yes. That's a bug. How about the following patch?

8<---------------------------------------------------
From 400417b05f3ec0531544ca5f94e64d838d8b8849 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Tue, 12 Mar 2019 16:04:51 -0400
Subject: [PATCH] pNFS: Fix a typo in pnfs_update_layout

We're supposed to wait for the outstanding layout count to go to zero,
but that got lost somehow.

Fixes: d03360aaf5cca ("pNFS: Ensure we return the error if someone...")
Reported-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/pnfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 8247bd1634cb..7066cd7c7aff 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1889,7 +1889,7 @@ pnfs_update_layout(struct inode *ino,
 	    atomic_read(&lo->plh_outstanding) != 0) {
 		spin_unlock(&ino->i_lock);
 		lseg = ERR_PTR(wait_var_event_killable(&lo->plh_outstanding,
-					atomic_read(&lo->plh_outstanding)));
+					!atomic_read(&lo->plh_outstanding)));
 		if (IS_ERR(lseg) || !list_empty(&lo->plh_segs))
 			goto out_put_layout_hdr;
 		pnfs_put_layout_hdr(lo);
-- 
2.20.1

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget
  2019-03-12 20:12   ` Trond Myklebust
@ 2019-03-12 21:22     ` Anna Schumaker
  0 siblings, 0 replies; 7+ messages in thread
From: Anna Schumaker @ 2019-03-12 21:22 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Tue, 2019-03-12 at 20:12 +0000, Trond Myklebust wrote:
> On Tue, 2019-03-12 at 20:04 +0000, Schumaker, Anna wrote:
> > Hi Trond,
> > 
> > I'm seeing a hang when testing xfstests generic/013 on v4.1 with pNFS
> > after this
> > patch:
> > 
> > On Wed, 2018-09-05 at 14:07 -0400, Trond Myklebust wrote:
> > > If someone interrupts a wait on one or more outstanding layoutgets
> > > in
> > > pnfs_update_layout() then return the ERESTARTSYS/EINTR error.
> > > 
> > > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > > ---
> > >  fs/nfs/pnfs.c | 26 ++++++++++++++++----------
> > >  1 file changed, 16 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> > > index e8f232de484f..7d9a51e6b847 100644
> > > --- a/fs/nfs/pnfs.c
> > > +++ b/fs/nfs/pnfs.c
> > > @@ -1740,16 +1740,16 @@ static bool pnfs_within_mdsthreshold(struct
> > > nfs_open_context *ctx,
> > >  	return ret;
> > >  }
> > >  
> > > -static bool pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr
> > > *lo)
> > > +static int pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr
> > > *lo)
> > >  {
> > >  	/*
> > >  	 * send layoutcommit as it can hold up layoutreturn due to lseg
> > >  	 * reference
> > >  	 */
> > >  	pnfs_layoutcommit_inode(lo->plh_inode, false);
> > > -	return !wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
> > > +	return wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
> > >  				   nfs_wait_bit_killable,
> > > -				   TASK_UNINTERRUPTIBLE);
> > > +				   TASK_KILLABLE);
> > >  }
> > >  
> > >  static void nfs_layoutget_begin(struct pnfs_layout_hdr *lo)
> > > @@ -1830,7 +1830,9 @@ pnfs_update_layout(struct inode *ino,
> > >  	}
> > >  
> > >  lookup_again:
> > > -	nfs4_client_recover_expired_lease(clp);
> > > +	lseg = ERR_PTR(nfs4_client_recover_expired_lease(clp));
> > > +	if (IS_ERR(lseg))
> > > +		goto out;
> > >  	first = false;
> > >  	spin_lock(&ino->i_lock);
> > >  	lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
> > > @@ -1863,9 +1865,9 @@ pnfs_update_layout(struct inode *ino,
> > >  	if (list_empty(&lo->plh_segs) &&
> > >  	    atomic_read(&lo->plh_outstanding) != 0) {
> > >  		spin_unlock(&ino->i_lock);
> > > -		if (wait_var_event_killable(&lo->plh_outstanding,
> > > -					atomic_read(&lo-
> > > > plh_outstanding) == 0
> > > -					|| !list_empty(&lo->plh_segs)))
> > > +		lseg = ERR_PTR(wait_var_event_killable(&lo-
> > > > plh_outstanding,
> > > +					atomic_read(&lo-
> > > > plh_outstanding)));
> > > +		if (IS_ERR(lseg) || !list_empty(&lo->plh_segs))
> > 
> > Was dropping the "== 0" condition attached to the atomic_read() here
> > a mistake?
> > I think what's happening is that my client is waiting for
> > plh_outstanding to be
> > anything other than 0 when there isn't any work left to do.
> 
> Yes. That's a bug. How about the following patch?

This patch works for me, but for some reason doing "!atomic_read()" takes 8
minutes longer to complete compared to doing "atomic_read() == 0".  I have not
run this multiple times to confirm that it's always the case.

Anna

> 
> 8<---------------------------------------------------
> From 400417b05f3ec0531544ca5f94e64d838d8b8849 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
> Date: Tue, 12 Mar 2019 16:04:51 -0400
> Subject: [PATCH] pNFS: Fix a typo in pnfs_update_layout
> 
> We're supposed to wait for the outstanding layout count to go to zero,
> but that got lost somehow.
> 
> Fixes: d03360aaf5cca ("pNFS: Ensure we return the error if someone...")
> Reported-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
>  fs/nfs/pnfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index 8247bd1634cb..7066cd7c7aff 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -1889,7 +1889,7 @@ pnfs_update_layout(struct inode *ino,
>  	    atomic_read(&lo->plh_outstanding) != 0) {
>  		spin_unlock(&ino->i_lock);
>  		lseg = ERR_PTR(wait_var_event_killable(&lo->plh_outstanding,
> -					atomic_read(&lo->plh_outstanding)));
> +					!atomic_read(&lo->plh_outstanding)));
>  		if (IS_ERR(lseg) || !list_empty(&lo->plh_segs))
>  			goto out_put_layout_hdr;
>  		pnfs_put_layout_hdr(lo);
> -- 
> 2.20.1
> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
> 
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-03-12 21:22 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-05 18:07 [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget Trond Myklebust
2018-09-05 18:07 ` [PATCH 2/4] NFSv4: Fix a tracepoint Oops in initiate_file_draining() Trond Myklebust
2018-09-05 18:07   ` [PATCH 3/4] NFSv4.1 fix infinite loop on I/O Trond Myklebust
2018-09-05 18:07     ` [PATCH 4/4] NFS: Don't open code clearing of delegation state Trond Myklebust
2019-03-12 20:04 ` [PATCH 1/4] pNFS: Ensure we return the error if someone kills a waiting layoutget Schumaker, Anna
2019-03-12 20:12   ` Trond Myklebust
2019-03-12 21:22     ` Anna Schumaker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).