linux-xfs.vger.kernel.org archive mirror
* [PATCH 0/2] xfs: a couple AIL pushing trylock fixes
@ 2020-03-26 13:17 Brian Foster
  2020-03-26 13:17 ` [PATCH 1/2] xfs: trylock underlying buffer on dquot flush Brian Foster
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Brian Foster @ 2020-03-26 13:17 UTC (permalink / raw)
  To: linux-xfs

Hi all,

Here's a couple more small fixes that fell out of the auto relog work.
The dquot issue is actually a deadlock vector if we randomly relog dquot
buffers (which is only done for test purposes), but I figure we should
handle dquot buffers similar to how inode buffers are handled. Thoughts,
reviews, flames appreciated.

Brian

Brian Foster (2):
  xfs: trylock underlying buffer on dquot flush
  xfs: return locked status of inode buffer on xfsaild push

 fs/xfs/xfs_dquot.c      |  6 +++---
 fs/xfs/xfs_dquot_item.c |  3 ++-
 fs/xfs/xfs_inode_item.c |  3 ++-
 fs/xfs/xfs_qm.c         | 14 +++++++++-----
 4 files changed, 16 insertions(+), 10 deletions(-)

-- 
2.21.1



* [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-26 13:17 [PATCH 0/2] xfs: a couple AIL pushing trylock fixes Brian Foster
@ 2020-03-26 13:17 ` Brian Foster
  2020-03-27 12:59   ` Christoph Hellwig
                     ` (2 more replies)
  2020-03-26 13:17 ` [PATCH 2/2] xfs: return locked status of inode buffer on xfsaild push Brian Foster
  2020-03-27 15:32 ` [PATCH 0/2] xfs: a couple AIL pushing trylock fixes Darrick J. Wong
  2 siblings, 3 replies; 19+ messages in thread
From: Brian Foster @ 2020-03-26 13:17 UTC (permalink / raw)
  To: linux-xfs

A dquot flush currently blocks on the buffer lock for the underlying
dquot buffer. In turn, this causes xfsaild to block rather than
continue processing other items in the meantime. Update
xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
are handled, and return -EAGAIN if the lock fails. Fix up any
callers that don't currently handle the error properly.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_dquot.c      |  6 +++---
 fs/xfs/xfs_dquot_item.c |  3 ++-
 fs/xfs/xfs_qm.c         | 14 +++++++++-----
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 711376ca269f..af2c8e5ceea0 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
 	 * Get the buffer containing the on-disk dquot
 	 */
 	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
-				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
-				   &xfs_dquot_buf_ops);
+				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
+				   &bp, &xfs_dquot_buf_ops);
 	if (error)
 		goto out_unlock;
 
@@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
 
 out_unlock:
 	xfs_dqfunlock(dqp);
-	return -EIO;
+	return error;
 }
 
 /*
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index cf65e2e43c6e..baad1748d0d1 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
 		if (!xfs_buf_delwri_queue(bp, buffer_list))
 			rval = XFS_ITEM_FLUSHING;
 		xfs_buf_relse(bp);
-	}
+	} else if (error == -EAGAIN)
+		rval = XFS_ITEM_LOCKED;
 
 	spin_lock(&lip->li_ailp->ail_lock);
 out_unlock:
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index de1d2c606c14..68c778d25c48 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -121,12 +121,11 @@ xfs_qm_dqpurge(
 {
 	struct xfs_mount	*mp = dqp->q_mount;
 	struct xfs_quotainfo	*qi = mp->m_quotainfo;
+	int			error = -EAGAIN;
 
 	xfs_dqlock(dqp);
-	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0) {
-		xfs_dqunlock(dqp);
-		return -EAGAIN;
-	}
+	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0)
+		goto out_unlock;
 
 	dqp->dq_flags |= XFS_DQ_FREEING;
 
@@ -139,7 +138,6 @@ xfs_qm_dqpurge(
 	 */
 	if (XFS_DQ_IS_DIRTY(dqp)) {
 		struct xfs_buf	*bp = NULL;
-		int		error;
 
 		/*
 		 * We don't care about getting disk errors here. We need
@@ -149,6 +147,8 @@ xfs_qm_dqpurge(
 		if (!error) {
 			error = xfs_bwrite(bp);
 			xfs_buf_relse(bp);
+		} else if (error == -EAGAIN) {
+			goto out_unlock;
 		}
 		xfs_dqflock(dqp);
 	}
@@ -174,6 +174,10 @@ xfs_qm_dqpurge(
 
 	xfs_qm_dqdestroy(dqp);
 	return 0;
+
+out_unlock:
+	xfs_dqunlock(dqp);
+	return error;
 }
 
 /*
-- 
2.21.1
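
For readers following along, the trylock pattern the commit message
describes can be sketched in user space as below. This is a minimal model,
not the kernel code: struct model_buf, model_buf_trylock() and
model_flush() are hypothetical names, and a C11 atomic_flag stands in for
the XFS buffer lock. The point it illustrates is that the flush returns
-EAGAIN instead of sleeping when the lock is held, and passes the
underlying error code up to the caller rather than collapsing it to -EIO.

```c
/* User-space model of "trylock and return -EAGAIN"; not kernel code. */
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lockable metadata buffer. */
struct model_buf {
	atomic_flag	locked;	/* set while some context holds the buffer */
	bool		dirty;	/* true until a flush has written it back */
};

void model_buf_init(struct model_buf *bp, bool dirty)
{
	atomic_flag_clear(&bp->locked);
	bp->dirty = dirty;
}

/* Never sleeps: returns true only if the lock was free and is now ours. */
bool model_buf_trylock(struct model_buf *bp)
{
	return !atomic_flag_test_and_set(&bp->locked);
}

void model_buf_unlock(struct model_buf *bp)
{
	atomic_flag_clear(&bp->locked);
}

/*
 * Flush the buffer if it can be locked without waiting. On contention
 * the caller gets -EAGAIN and decides what to do (skip, back off, retry)
 * instead of being blocked here.
 */
int model_flush(struct model_buf *bp)
{
	if (!model_buf_trylock(bp))
		return -EAGAIN;

	bp->dirty = false;	/* pretend the contents were written back */
	model_buf_unlock(bp);
	return 0;
}
```

A caller that must not block, such as a background pusher, would treat
-EAGAIN as "busy, come back later" and move on to the next item.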



* [PATCH 2/2] xfs: return locked status of inode buffer on xfsaild push
  2020-03-26 13:17 [PATCH 0/2] xfs: a couple AIL pushing trylock fixes Brian Foster
  2020-03-26 13:17 ` [PATCH 1/2] xfs: trylock underlying buffer on dquot flush Brian Foster
@ 2020-03-26 13:17 ` Brian Foster
  2020-03-27 13:00   ` Christoph Hellwig
  2020-03-27 15:39   ` Darrick J. Wong
  2020-03-27 15:32 ` [PATCH 0/2] xfs: a couple AIL pushing trylock fixes Darrick J. Wong
  2 siblings, 2 replies; 19+ messages in thread
From: Brian Foster @ 2020-03-26 13:17 UTC (permalink / raw)
  To: linux-xfs

If the inode buffer backing a particular inode is locked,
xfs_iflush() returns -EAGAIN and xfs_inode_item_push() skips the
inode. It still returns success to xfsaild, however, which bypasses
the xfsaild backoff heuristic. Update xfs_inode_item_push() to
return locked status if the inode buffer couldn't be locked.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_inode_item.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 4a3d13d4a022..9a903babbcf7 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -552,7 +552,8 @@ xfs_inode_item_push(
 		if (!xfs_buf_delwri_queue(bp, buffer_list))
 			rval = XFS_ITEM_FLUSHING;
 		xfs_buf_relse(bp);
-	}
+	} else if (error == -EAGAIN)
+		rval = XFS_ITEM_LOCKED;
 
 	spin_lock(&lip->li_ailp->ail_lock);
 out_unlock:
-- 
2.21.1
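
The effect on xfsaild can be modelled as follows. This is a user-space
sketch under stated assumptions: the enum, the function names and in
particular the 90% threshold and the 10/20/50 ms timeouts are illustrative
placeholders, not values taken from this patch. It shows only the
mechanism the commit message relies on: a push that reports success for a
busy buffer is invisible to the backoff decision, whereas one that reports
a locked status is counted.

```c
/* User-space model of a pusher's backoff decision; not kernel code. */
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical push results, loosely mirroring XFS_ITEM_* statuses. */
enum model_push_status {
	MODEL_ITEM_SUCCESS,
	MODEL_ITEM_LOCKED,
};

/*
 * Translate a flush error into a push status. With map_eagain == false
 * this models the old behaviour, where a busy buffer still looked like
 * success to the pusher.
 */
enum model_push_status model_push_status(int flush_error, bool map_eagain)
{
	if (map_eagain && flush_error == -EAGAIN)
		return MODEL_ITEM_LOCKED;
	return MODEL_ITEM_SUCCESS;
}

/*
 * Choose the next sleep (in ms) after one pass over n items. The numbers
 * are assumptions made for this example only.
 */
unsigned int model_next_timeout_ms(const int *flush_errors, size_t n,
				   bool map_eagain)
{
	size_t stuck = 0;

	for (size_t i = 0; i < n; i++)
		if (model_push_status(flush_errors[i], map_eagain) ==
		    MODEL_ITEM_LOCKED)
			stuck++;

	if (n == 0)
		return 50;	/* nothing to push: sleep longest */
	if (stuck * 100 / n > 90)
		return 20;	/* mostly stuck: back off */
	return 10;		/* making progress: come back soon */
}
```

With every item busy, the model backs off only when -EAGAIN is mapped to
the locked status; otherwise it keeps retrying at the shortest interval.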



* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-26 13:17 ` [PATCH 1/2] xfs: trylock underlying buffer on dquot flush Brian Foster
@ 2020-03-27 12:59   ` Christoph Hellwig
  2020-03-27 15:45   ` Darrick J. Wong
  2020-03-29 22:46   ` Dave Chinner
  2 siblings, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2020-03-27 12:59 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> A dquot flush currently blocks on the buffer lock for the underlying
> dquot buffer. In turn, this causes xfsaild to block rather than
> continue processing other items in the meantime. Update
> xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> are handled, and return -EAGAIN if the lock fails. Fix up any
> callers that don't currently handle the error properly.

Looks good, seems like the two remaining xfs_qm_dqflush callers have
sensible -EAGAIN handling.

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH 2/2] xfs: return locked status of inode buffer on xfsaild push
  2020-03-26 13:17 ` [PATCH 2/2] xfs: return locked status of inode buffer on xfsaild push Brian Foster
@ 2020-03-27 13:00   ` Christoph Hellwig
  2020-03-27 15:39   ` Darrick J. Wong
  1 sibling, 0 replies; 19+ messages in thread
From: Christoph Hellwig @ 2020-03-27 13:00 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Mar 26, 2020 at 09:17:03AM -0400, Brian Foster wrote:
> If the inode buffer backing a particular inode is locked,
> xfs_iflush() returns -EAGAIN and xfs_inode_item_push() skips the
> inode. It still returns success to xfsaild, however, which bypasses
> the xfsaild backoff heuristic. Update xfs_inode_item_push() to
> return locked status if the inode buffer couldn't be locked.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 0/2] xfs: a couple AIL pushing trylock fixes
  2020-03-26 13:17 [PATCH 0/2] xfs: a couple AIL pushing trylock fixes Brian Foster
  2020-03-26 13:17 ` [PATCH 1/2] xfs: trylock underlying buffer on dquot flush Brian Foster
  2020-03-26 13:17 ` [PATCH 2/2] xfs: return locked status of inode buffer on xfsaild push Brian Foster
@ 2020-03-27 15:32 ` Darrick J. Wong
  2020-03-27 16:44   ` Brian Foster
  2 siblings, 1 reply; 19+ messages in thread
From: Darrick J. Wong @ 2020-03-27 15:32 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Mar 26, 2020 at 09:17:01AM -0400, Brian Foster wrote:
> Hi all,
> 
> Here's a couple more small fixes that fell out of the auto relog work.
> The dquot issue is actually a deadlock vector if we randomly relog dquot
> buffers (which is only done for test purposes), but I figure we should
> handle dquot buffers similar to how inode buffers are handled. Thoughts,
> reviews, flames appreciated.

Oops, I missed this one, will review now...

Do you think there needs to be an explicit testcase for this?  Or are
the current generic/{388,475} good enough?  I'm pretty sure I've seen
this exact deadlock on them every now and again, so we're probably
covered.

--D


> Brian
> 
> Brian Foster (2):
>   xfs: trylock underlying buffer on dquot flush
>   xfs: return locked status of inode buffer on xfsaild push
> 
>  fs/xfs/xfs_dquot.c      |  6 +++---
>  fs/xfs/xfs_dquot_item.c |  3 ++-
>  fs/xfs/xfs_inode_item.c |  3 ++-
>  fs/xfs/xfs_qm.c         | 14 +++++++++-----
>  4 files changed, 16 insertions(+), 10 deletions(-)
> 
> -- 
> 2.21.1
> 


* Re: [PATCH 2/2] xfs: return locked status of inode buffer on xfsaild push
  2020-03-26 13:17 ` [PATCH 2/2] xfs: return locked status of inode buffer on xfsaild push Brian Foster
  2020-03-27 13:00   ` Christoph Hellwig
@ 2020-03-27 15:39   ` Darrick J. Wong
  1 sibling, 0 replies; 19+ messages in thread
From: Darrick J. Wong @ 2020-03-27 15:39 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Mar 26, 2020 at 09:17:03AM -0400, Brian Foster wrote:
> If the inode buffer backing a particular inode is locked,
> xfs_iflush() returns -EAGAIN and xfs_inode_item_push() skips the
> inode. It still returns success to xfsaild, however, which bypasses
> the xfsaild backoff heuristic. Update xfs_inode_item_push() to
> return locked status if the inode buffer couldn't be locked.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Seems pretty straightforward,

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_inode_item.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index 4a3d13d4a022..9a903babbcf7 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -552,7 +552,8 @@ xfs_inode_item_push(
>  		if (!xfs_buf_delwri_queue(bp, buffer_list))
>  			rval = XFS_ITEM_FLUSHING;
>  		xfs_buf_relse(bp);
> -	}
> +	} else if (error == -EAGAIN)
> +		rval = XFS_ITEM_LOCKED;
>  
>  	spin_lock(&lip->li_ailp->ail_lock);
>  out_unlock:
> -- 
> 2.21.1
> 


* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-26 13:17 ` [PATCH 1/2] xfs: trylock underlying buffer on dquot flush Brian Foster
  2020-03-27 12:59   ` Christoph Hellwig
@ 2020-03-27 15:45   ` Darrick J. Wong
  2020-03-27 16:44     ` Brian Foster
  2020-03-29 22:46   ` Dave Chinner
  2 siblings, 1 reply; 19+ messages in thread
From: Darrick J. Wong @ 2020-03-27 15:45 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> A dquot flush currently blocks on the buffer lock for the underlying
> dquot buffer. In turn, this causes xfsaild to block rather than
> continue processing other items in the meantime. Update
> xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> are handled, and return -EAGAIN if the lock fails. Fix up any
> callers that don't currently handle the error properly.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Is xfs_qm_dquot_isolate returning LRU_RETRY an acceptable resolution (as
opposed to, say, LRU_SKIP) for xfs_qm_dqflush returning -EAGAIN?

--D

> ---
>  fs/xfs/xfs_dquot.c      |  6 +++---
>  fs/xfs/xfs_dquot_item.c |  3 ++-
>  fs/xfs/xfs_qm.c         | 14 +++++++++-----
>  3 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index 711376ca269f..af2c8e5ceea0 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
> @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
>  	 * Get the buffer containing the on-disk dquot
>  	 */
>  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> -				   &xfs_dquot_buf_ops);
> +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> +				   &bp, &xfs_dquot_buf_ops);
>  	if (error)
>  		goto out_unlock;
>  
> @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
>  
>  out_unlock:
>  	xfs_dqfunlock(dqp);
> -	return -EIO;
> +	return error;
>  }
>  
>  /*
> diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> index cf65e2e43c6e..baad1748d0d1 100644
> --- a/fs/xfs/xfs_dquot_item.c
> +++ b/fs/xfs/xfs_dquot_item.c
> @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
>  		if (!xfs_buf_delwri_queue(bp, buffer_list))
>  			rval = XFS_ITEM_FLUSHING;
>  		xfs_buf_relse(bp);
> -	}
> +	} else if (error == -EAGAIN)
> +		rval = XFS_ITEM_LOCKED;
>  
>  	spin_lock(&lip->li_ailp->ail_lock);
>  out_unlock:
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index de1d2c606c14..68c778d25c48 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -121,12 +121,11 @@ xfs_qm_dqpurge(
>  {
>  	struct xfs_mount	*mp = dqp->q_mount;
>  	struct xfs_quotainfo	*qi = mp->m_quotainfo;
> +	int			error = -EAGAIN;
>  
>  	xfs_dqlock(dqp);
> -	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0) {
> -		xfs_dqunlock(dqp);
> -		return -EAGAIN;
> -	}
> +	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0)
> +		goto out_unlock;
>  
>  	dqp->dq_flags |= XFS_DQ_FREEING;
>  
> @@ -139,7 +138,6 @@ xfs_qm_dqpurge(
>  	 */
>  	if (XFS_DQ_IS_DIRTY(dqp)) {
>  		struct xfs_buf	*bp = NULL;
> -		int		error;
>  
>  		/*
>  		 * We don't care about getting disk errors here. We need
> @@ -149,6 +147,8 @@ xfs_qm_dqpurge(
>  		if (!error) {
>  			error = xfs_bwrite(bp);
>  			xfs_buf_relse(bp);
> +		} else if (error == -EAGAIN) {
> +			goto out_unlock;
>  		}
>  		xfs_dqflock(dqp);
>  	}
> @@ -174,6 +174,10 @@ xfs_qm_dqpurge(
>  
>  	xfs_qm_dqdestroy(dqp);
>  	return 0;
> +
> +out_unlock:
> +	xfs_dqunlock(dqp);
> +	return error;
>  }
>  
>  /*
> -- 
> 2.21.1
> 


* Re: [PATCH 0/2] xfs: a couple AIL pushing trylock fixes
  2020-03-27 15:32 ` [PATCH 0/2] xfs: a couple AIL pushing trylock fixes Darrick J. Wong
@ 2020-03-27 16:44   ` Brian Foster
  2020-03-29 16:43     ` Darrick J. Wong
  0 siblings, 1 reply; 19+ messages in thread
From: Brian Foster @ 2020-03-27 16:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Mar 27, 2020 at 08:32:05AM -0700, Darrick J. Wong wrote:
> On Thu, Mar 26, 2020 at 09:17:01AM -0400, Brian Foster wrote:
> > Hi all,
> > 
> > Here's a couple more small fixes that fell out of the auto relog work.
> > The dquot issue is actually a deadlock vector if we randomly relog dquot
> > buffers (which is only done for test purposes), but I figure we should
> > handle dquot buffers similar to how inode buffers are handled. Thoughts,
> > reviews, flames appreciated.
> 
> Oops, I missed this one, will review now...
> 
> Do you think there needs to be an explicit testcase for this?  Or are
> the current generic/{388,475} good enough?  I'm pretty sure I've seen
> this exact deadlock on them every now and again, so we're probably
> covered.
> 

I'm actually not aware of a related upstream deadlock. That doesn't mean
there isn't one of course, but the problem I hit was related to the
random buffer relogging stuff in the auto relog series. I split these
out because xfsaild is intended to be mostly async, so they seemed like
generic fixups.

Brian

> --D
> 
> 
> > Brian
> > 
> > Brian Foster (2):
> >   xfs: trylock underlying buffer on dquot flush
> >   xfs: return locked status of inode buffer on xfsaild push
> > 
> >  fs/xfs/xfs_dquot.c      |  6 +++---
> >  fs/xfs/xfs_dquot_item.c |  3 ++-
> >  fs/xfs/xfs_inode_item.c |  3 ++-
> >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> >  4 files changed, 16 insertions(+), 10 deletions(-)
> > 
> > -- 
> > 2.21.1
> > 
> 



* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-27 15:45   ` Darrick J. Wong
@ 2020-03-27 16:44     ` Brian Foster
  2020-03-27 16:46       ` Brian Foster
  2020-03-27 16:50       ` Darrick J. Wong
  0 siblings, 2 replies; 19+ messages in thread
From: Brian Foster @ 2020-03-27 16:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Mar 27, 2020 at 08:45:28AM -0700, Darrick J. Wong wrote:
> On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> > A dquot flush currently blocks on the buffer lock for the underlying
> > dquot buffer. In turn, this causes xfsaild to block rather than
> > continue processing other items in the meantime. Update
> > xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> > are handled, and return -EAGAIN if the lock fails. Fix up any
> > callers that don't currently handle the error properly.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> 
> Is xfs_qm_dquot_isolate returning LRU_RETRY an acceptable resolution (as
> opposed to, say, LRU_SKIP) for xfs_qm_dqflush returning -EAGAIN?
> 

Hmm.. this is reclaim, so I suppose LRU_SKIP would be more appropriate
than retry (and more consistent with the other trylock failures
in that function). Ok with something like the following?

@@ -461,7 +461,11 @@ xfs_qm_dquot_isolate(
 		spin_unlock(lru_lock);
 
 		error = xfs_qm_dqflush(dqp, &bp);
-		if (error)
+		if (error == -EAGAIN) {
+			xfs_dqunlock(dqp);
+			spin_lock(lru_lock);
+			goto out_miss_busy;
+		} else if (error)
 			goto out_unlock_dirty;
 
 		xfs_buf_delwri_queue(bp, &isol->buffers);

Brian

> --D
> 
> > ---
> >  fs/xfs/xfs_dquot.c      |  6 +++---
> >  fs/xfs/xfs_dquot_item.c |  3 ++-
> >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> >  3 files changed, 14 insertions(+), 9 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > index 711376ca269f..af2c8e5ceea0 100644
> > --- a/fs/xfs/xfs_dquot.c
> > +++ b/fs/xfs/xfs_dquot.c
> > @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
> >  	 * Get the buffer containing the on-disk dquot
> >  	 */
> >  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> > -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> > -				   &xfs_dquot_buf_ops);
> > +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> > +				   &bp, &xfs_dquot_buf_ops);
> >  	if (error)
> >  		goto out_unlock;
> >  
> > @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
> >  
> >  out_unlock:
> >  	xfs_dqfunlock(dqp);
> > -	return -EIO;
> > +	return error;
> >  }
> >  
> >  /*
> > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> > index cf65e2e43c6e..baad1748d0d1 100644
> > --- a/fs/xfs/xfs_dquot_item.c
> > +++ b/fs/xfs/xfs_dquot_item.c
> > @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
> >  		if (!xfs_buf_delwri_queue(bp, buffer_list))
> >  			rval = XFS_ITEM_FLUSHING;
> >  		xfs_buf_relse(bp);
> > -	}
> > +	} else if (error == -EAGAIN)
> > +		rval = XFS_ITEM_LOCKED;
> >  
> >  	spin_lock(&lip->li_ailp->ail_lock);
> >  out_unlock:
> > diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> > index de1d2c606c14..68c778d25c48 100644
> > --- a/fs/xfs/xfs_qm.c
> > +++ b/fs/xfs/xfs_qm.c
> > @@ -121,12 +121,11 @@ xfs_qm_dqpurge(
> >  {
> >  	struct xfs_mount	*mp = dqp->q_mount;
> >  	struct xfs_quotainfo	*qi = mp->m_quotainfo;
> > +	int			error = -EAGAIN;
> >  
> >  	xfs_dqlock(dqp);
> > -	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0) {
> > -		xfs_dqunlock(dqp);
> > -		return -EAGAIN;
> > -	}
> > +	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0)
> > +		goto out_unlock;
> >  
> >  	dqp->dq_flags |= XFS_DQ_FREEING;
> >  
> > @@ -139,7 +138,6 @@ xfs_qm_dqpurge(
> >  	 */
> >  	if (XFS_DQ_IS_DIRTY(dqp)) {
> >  		struct xfs_buf	*bp = NULL;
> > -		int		error;
> >  
> >  		/*
> >  		 * We don't care about getting disk errors here. We need
> > @@ -149,6 +147,8 @@ xfs_qm_dqpurge(
> >  		if (!error) {
> >  			error = xfs_bwrite(bp);
> >  			xfs_buf_relse(bp);
> > +		} else if (error == -EAGAIN) {
> > +			goto out_unlock;
> >  		}
> >  		xfs_dqflock(dqp);
> >  	}
> > @@ -174,6 +174,10 @@ xfs_qm_dqpurge(
> >  
> >  	xfs_qm_dqdestroy(dqp);
> >  	return 0;
> > +
> > +out_unlock:
> > +	xfs_dqunlock(dqp);
> > +	return error;
> >  }
> >  
> >  /*
> > -- 
> > 2.21.1
> > 
> 



* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-27 16:44     ` Brian Foster
@ 2020-03-27 16:46       ` Brian Foster
  2020-03-27 17:04         ` Darrick J. Wong
  2020-03-27 16:50       ` Darrick J. Wong
  1 sibling, 1 reply; 19+ messages in thread
From: Brian Foster @ 2020-03-27 16:46 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Mar 27, 2020 at 12:44:40PM -0400, Brian Foster wrote:
> On Fri, Mar 27, 2020 at 08:45:28AM -0700, Darrick J. Wong wrote:
> > On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> > > A dquot flush currently blocks on the buffer lock for the underlying
> > > dquot buffer. In turn, this causes xfsaild to block rather than
> > > continue processing other items in the meantime. Update
> > > xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> > > are handled, and return -EAGAIN if the lock fails. Fix up any
> > > callers that don't currently handle the error properly.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > 
> > Is xfs_qm_dquot_isolate returning LRU_RETRY an acceptable resolution (as
> > opposed to, say, LRU_SKIP) for xfs_qm_dqflush returning -EAGAIN?
> > 
> 
> Hmm.. this is reclaim, so I suppose LRU_SKIP would be more appropriate
> than retry (and more consistent with the other trylock failures
> in that function). Ok with something like the following?
> 
> @@ -461,7 +461,11 @@ xfs_qm_dquot_isolate(
>  		spin_unlock(lru_lock);
>  
>  		error = xfs_qm_dqflush(dqp, &bp);
> -		if (error)
> +		if (error == -EAGAIN) {
> +			xfs_dqunlock(dqp);
> +			spin_lock(lru_lock);
> +			goto out_miss_busy;
> +		} else if (error)
>  			goto out_unlock_dirty;

Then again, is it safe to skip from here once we've cycled the lru_lock?

Brian

>  
>  		xfs_buf_delwri_queue(bp, &isol->buffers);
> 
> Brian
> 
> > --D
> > 
> > > ---
> > >  fs/xfs/xfs_dquot.c      |  6 +++---
> > >  fs/xfs/xfs_dquot_item.c |  3 ++-
> > >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> > >  3 files changed, 14 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > > index 711376ca269f..af2c8e5ceea0 100644
> > > --- a/fs/xfs/xfs_dquot.c
> > > +++ b/fs/xfs/xfs_dquot.c
> > > @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
> > >  	 * Get the buffer containing the on-disk dquot
> > >  	 */
> > >  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> > > -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> > > -				   &xfs_dquot_buf_ops);
> > > +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> > > +				   &bp, &xfs_dquot_buf_ops);
> > >  	if (error)
> > >  		goto out_unlock;
> > >  
> > > @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
> > >  
> > >  out_unlock:
> > >  	xfs_dqfunlock(dqp);
> > > -	return -EIO;
> > > +	return error;
> > >  }
> > >  
> > >  /*
> > > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> > > index cf65e2e43c6e..baad1748d0d1 100644
> > > --- a/fs/xfs/xfs_dquot_item.c
> > > +++ b/fs/xfs/xfs_dquot_item.c
> > > @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
> > >  		if (!xfs_buf_delwri_queue(bp, buffer_list))
> > >  			rval = XFS_ITEM_FLUSHING;
> > >  		xfs_buf_relse(bp);
> > > -	}
> > > +	} else if (error == -EAGAIN)
> > > +		rval = XFS_ITEM_LOCKED;
> > >  
> > >  	spin_lock(&lip->li_ailp->ail_lock);
> > >  out_unlock:
> > > diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> > > index de1d2c606c14..68c778d25c48 100644
> > > --- a/fs/xfs/xfs_qm.c
> > > +++ b/fs/xfs/xfs_qm.c
> > > @@ -121,12 +121,11 @@ xfs_qm_dqpurge(
> > >  {
> > >  	struct xfs_mount	*mp = dqp->q_mount;
> > >  	struct xfs_quotainfo	*qi = mp->m_quotainfo;
> > > +	int			error = -EAGAIN;
> > >  
> > >  	xfs_dqlock(dqp);
> > > -	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0) {
> > > -		xfs_dqunlock(dqp);
> > > -		return -EAGAIN;
> > > -	}
> > > +	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0)
> > > +		goto out_unlock;
> > >  
> > >  	dqp->dq_flags |= XFS_DQ_FREEING;
> > >  
> > > @@ -139,7 +138,6 @@ xfs_qm_dqpurge(
> > >  	 */
> > >  	if (XFS_DQ_IS_DIRTY(dqp)) {
> > >  		struct xfs_buf	*bp = NULL;
> > > -		int		error;
> > >  
> > >  		/*
> > >  		 * We don't care about getting disk errors here. We need
> > > @@ -149,6 +147,8 @@ xfs_qm_dqpurge(
> > >  		if (!error) {
> > >  			error = xfs_bwrite(bp);
> > >  			xfs_buf_relse(bp);
> > > +		} else if (error == -EAGAIN) {
> > > +			goto out_unlock;
> > >  		}
> > >  		xfs_dqflock(dqp);
> > >  	}
> > > @@ -174,6 +174,10 @@ xfs_qm_dqpurge(
> > >  
> > >  	xfs_qm_dqdestroy(dqp);
> > >  	return 0;
> > > +
> > > +out_unlock:
> > > +	xfs_dqunlock(dqp);
> > > +	return error;
> > >  }
> > >  
> > >  /*
> > > -- 
> > > 2.21.1
> > > 
> > 
> 



* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-27 16:44     ` Brian Foster
  2020-03-27 16:46       ` Brian Foster
@ 2020-03-27 16:50       ` Darrick J. Wong
  1 sibling, 0 replies; 19+ messages in thread
From: Darrick J. Wong @ 2020-03-27 16:50 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, Mar 27, 2020 at 12:44:40PM -0400, Brian Foster wrote:
> On Fri, Mar 27, 2020 at 08:45:28AM -0700, Darrick J. Wong wrote:
> > On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> > > A dquot flush currently blocks on the buffer lock for the underlying
> > > dquot buffer. In turn, this causes xfsaild to block rather than
> > > continue processing other items in the meantime. Update
> > > xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> > > are handled, and return -EAGAIN if the lock fails. Fix up any
> > > callers that don't currently handle the error properly.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > 
> > Is xfs_qm_dquot_isolate returning LRU_RETRY an acceptable resolution (as
> > opposed to, say, LRU_SKIP) for xfs_qm_dqflush returning -EAGAIN?
> > 
> 
> Hmm.. this is reclaim, so I suppose LRU_SKIP would be more appropriate
> than retry (and more consistent with the other trylock failures
> in that function). Ok with something like the following?
> 
> @@ -461,7 +461,11 @@ xfs_qm_dquot_isolate(
>  		spin_unlock(lru_lock);
>  
>  		error = xfs_qm_dqflush(dqp, &bp);
> -		if (error)
> +		if (error == -EAGAIN) {
> +			xfs_dqunlock(dqp);
> +			spin_lock(lru_lock);
> +			goto out_miss_busy;
> +		} else if (error)
>  			goto out_unlock_dirty;
>  
>  		xfs_buf_delwri_queue(bp, &isol->buffers);

Yeah, looks good to me.

--D

> 
> Brian
> 
> > --D
> > 
> > > ---
> > >  fs/xfs/xfs_dquot.c      |  6 +++---
> > >  fs/xfs/xfs_dquot_item.c |  3 ++-
> > >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> > >  3 files changed, 14 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > > index 711376ca269f..af2c8e5ceea0 100644
> > > --- a/fs/xfs/xfs_dquot.c
> > > +++ b/fs/xfs/xfs_dquot.c
> > > @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
> > >  	 * Get the buffer containing the on-disk dquot
> > >  	 */
> > >  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> > > -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> > > -				   &xfs_dquot_buf_ops);
> > > +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> > > +				   &bp, &xfs_dquot_buf_ops);
> > >  	if (error)
> > >  		goto out_unlock;
> > >  
> > > @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
> > >  
> > >  out_unlock:
> > >  	xfs_dqfunlock(dqp);
> > > -	return -EIO;
> > > +	return error;
> > >  }
> > >  
> > >  /*
> > > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> > > index cf65e2e43c6e..baad1748d0d1 100644
> > > --- a/fs/xfs/xfs_dquot_item.c
> > > +++ b/fs/xfs/xfs_dquot_item.c
> > > @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
> > >  		if (!xfs_buf_delwri_queue(bp, buffer_list))
> > >  			rval = XFS_ITEM_FLUSHING;
> > >  		xfs_buf_relse(bp);
> > > -	}
> > > +	} else if (error == -EAGAIN)
> > > +		rval = XFS_ITEM_LOCKED;
> > >  
> > >  	spin_lock(&lip->li_ailp->ail_lock);
> > >  out_unlock:
> > > diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> > > index de1d2c606c14..68c778d25c48 100644
> > > --- a/fs/xfs/xfs_qm.c
> > > +++ b/fs/xfs/xfs_qm.c
> > > @@ -121,12 +121,11 @@ xfs_qm_dqpurge(
> > >  {
> > >  	struct xfs_mount	*mp = dqp->q_mount;
> > >  	struct xfs_quotainfo	*qi = mp->m_quotainfo;
> > > +	int			error = -EAGAIN;
> > >  
> > >  	xfs_dqlock(dqp);
> > > -	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0) {
> > > -		xfs_dqunlock(dqp);
> > > -		return -EAGAIN;
> > > -	}
> > > +	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0)
> > > +		goto out_unlock;
> > >  
> > >  	dqp->dq_flags |= XFS_DQ_FREEING;
> > >  
> > > @@ -139,7 +138,6 @@ xfs_qm_dqpurge(
> > >  	 */
> > >  	if (XFS_DQ_IS_DIRTY(dqp)) {
> > >  		struct xfs_buf	*bp = NULL;
> > > -		int		error;
> > >  
> > >  		/*
> > >  		 * We don't care about getting disk errors here. We need
> > > @@ -149,6 +147,8 @@ xfs_qm_dqpurge(
> > >  		if (!error) {
> > >  			error = xfs_bwrite(bp);
> > >  			xfs_buf_relse(bp);
> > > +		} else if (error == -EAGAIN) {
> > > +			goto out_unlock;
> > >  		}
> > >  		xfs_dqflock(dqp);
> > >  	}
> > > @@ -174,6 +174,10 @@ xfs_qm_dqpurge(
> > >  
> > >  	xfs_qm_dqdestroy(dqp);
> > >  	return 0;
> > > +
> > > +out_unlock:
> > > +	xfs_dqunlock(dqp);
> > > +	return error;
> > >  }
> > >  
> > >  /*
> > > -- 
> > > 2.21.1
> > > 
> > 
> 


* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-27 16:46       ` Brian Foster
@ 2020-03-27 17:04         ` Darrick J. Wong
  0 siblings, 0 replies; 19+ messages in thread
From: Darrick J. Wong @ 2020-03-27 17:04 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, Mar 27, 2020 at 12:46:44PM -0400, Brian Foster wrote:
> On Fri, Mar 27, 2020 at 12:44:40PM -0400, Brian Foster wrote:
> > On Fri, Mar 27, 2020 at 08:45:28AM -0700, Darrick J. Wong wrote:
> > > On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> > > > A dquot flush currently blocks on the buffer lock for the underlying
> > > > dquot buffer. In turn, this causes xfsaild to block rather than
> > > > continue processing other items in the meantime. Update
> > > > xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> > > > are handled, and return -EAGAIN if the lock fails. Fix up any
> > > > callers that don't currently handle the error properly.
> > > > 
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > 
> > > Is xfs_qm_dquot_isolate returning LRU_RETRY an acceptable resolution (as
> > > opposed to, say, LRU_SKIP) for xfs_qm_dqflush returning -EAGAIN?
> > > 
> > 
> > Hmm.. this is reclaim, so I suppose LRU_SKIP would be more appropriate
> > than retry (and more consistent with the other trylock failures
> > in that function). Ok with something like the following?
> > 
> > @@ -461,7 +461,11 @@ xfs_qm_dquot_isolate(
> >  		spin_unlock(lru_lock);
> >  
> >  		error = xfs_qm_dqflush(dqp, &bp);
> > -		if (error)
> > +		if (error == -EAGAIN) {
> > +			xfs_dqunlock(dqp);
> > +			spin_lock(lru_lock);
> > +			goto out_miss_busy;
> > +		} else if (error)
> >  			goto out_unlock_dirty;
> 
> Then again, is it safe to skip from here once we've cycled the lru_lock?

DOH.  Yeah, I missed that we cycled the lru lock and therefore have to
LRU_RETRY.  So I guess the original patch was fine:

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
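
The lru lock reasoning above can be sketched as a toy model. This is only
an illustration under stated assumptions: the toy_* names are hypothetical
stand-ins and not the kernel's list_lru API. The one behaviour taken from
the thread is that a callback which has cycled the lru lock cannot report
a plain skip and must ask the walker to retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the two walk results discussed above (hypothetical). */
enum toy_lru_status {
	TOY_LRU_SKIP,	/* lock never dropped: walker may simply move on */
	TOY_LRU_RETRY,	/* lock was dropped and retaken: walker must restart */
};

/*
 * What an isolate callback may return when it gives up on an item.
 * lru_lock_cycled records whether the callback released and reacquired
 * the lru lock before bailing out.
 */
static enum toy_lru_status
toy_isolate_bailout(bool lru_lock_cycled)
{
	return lru_lock_cycled ? TOY_LRU_RETRY : TOY_LRU_SKIP;
}

/*
 * Where a toy walker resumes. After a retry the list may have changed
 * while the lock was dropped, so the walk restarts from the head;
 * otherwise it steps to the next position.
 */
static int
toy_walk_next(enum toy_lru_status status, int pos)
{
	assert(pos >= 0);
	return status == TOY_LRU_RETRY ? 0 : pos + 1;
}
```

In this model a bailout after the flush attempt (lock already cycled)
sends the walker back to position 0, whereas an early trylock failure
only advances it by one.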

--D

> Brian
> 
> >  
> >  		xfs_buf_delwri_queue(bp, &isol->buffers);
> > 
> > Brian
> > 
> > > --D
> > > 
> > > > ---
> > > >  fs/xfs/xfs_dquot.c      |  6 +++---
> > > >  fs/xfs/xfs_dquot_item.c |  3 ++-
> > > >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> > > >  3 files changed, 14 insertions(+), 9 deletions(-)
> > > > 
> > > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > > > index 711376ca269f..af2c8e5ceea0 100644
> > > > --- a/fs/xfs/xfs_dquot.c
> > > > +++ b/fs/xfs/xfs_dquot.c
> > > > @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
> > > >  	 * Get the buffer containing the on-disk dquot
> > > >  	 */
> > > >  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> > > > -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> > > > -				   &xfs_dquot_buf_ops);
> > > > +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> > > > +				   &bp, &xfs_dquot_buf_ops);
> > > >  	if (error)
> > > >  		goto out_unlock;
> > > >  
> > > > @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
> > > >  
> > > >  out_unlock:
> > > >  	xfs_dqfunlock(dqp);
> > > > -	return -EIO;
> > > > +	return error;
> > > >  }
> > > >  
> > > >  /*
> > > > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> > > > index cf65e2e43c6e..baad1748d0d1 100644
> > > > --- a/fs/xfs/xfs_dquot_item.c
> > > > +++ b/fs/xfs/xfs_dquot_item.c
> > > > @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
> > > >  		if (!xfs_buf_delwri_queue(bp, buffer_list))
> > > >  			rval = XFS_ITEM_FLUSHING;
> > > >  		xfs_buf_relse(bp);
> > > > -	}
> > > > +	} else if (error == -EAGAIN)
> > > > +		rval = XFS_ITEM_LOCKED;
> > > >  
> > > >  	spin_lock(&lip->li_ailp->ail_lock);
> > > >  out_unlock:
> > > > diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> > > > index de1d2c606c14..68c778d25c48 100644
> > > > --- a/fs/xfs/xfs_qm.c
> > > > +++ b/fs/xfs/xfs_qm.c
> > > > @@ -121,12 +121,11 @@ xfs_qm_dqpurge(
> > > >  {
> > > >  	struct xfs_mount	*mp = dqp->q_mount;
> > > >  	struct xfs_quotainfo	*qi = mp->m_quotainfo;
> > > > +	int			error = -EAGAIN;
> > > >  
> > > >  	xfs_dqlock(dqp);
> > > > -	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0) {
> > > > -		xfs_dqunlock(dqp);
> > > > -		return -EAGAIN;
> > > > -	}
> > > > +	if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0)
> > > > +		goto out_unlock;
> > > >  
> > > >  	dqp->dq_flags |= XFS_DQ_FREEING;
> > > >  
> > > > @@ -139,7 +138,6 @@ xfs_qm_dqpurge(
> > > >  	 */
> > > >  	if (XFS_DQ_IS_DIRTY(dqp)) {
> > > >  		struct xfs_buf	*bp = NULL;
> > > > -		int		error;
> > > >  
> > > >  		/*
> > > >  		 * We don't care about getting disk errors here. We need
> > > > @@ -149,6 +147,8 @@ xfs_qm_dqpurge(
> > > >  		if (!error) {
> > > >  			error = xfs_bwrite(bp);
> > > >  			xfs_buf_relse(bp);
> > > > +		} else if (error == -EAGAIN) {
> > > > +			goto out_unlock;
> > > >  		}
> > > >  		xfs_dqflock(dqp);
> > > >  	}
> > > > @@ -174,6 +174,10 @@ xfs_qm_dqpurge(
> > > >  
> > > >  	xfs_qm_dqdestroy(dqp);
> > > >  	return 0;
> > > > +
> > > > +out_unlock:
> > > > +	xfs_dqunlock(dqp);
> > > > +	return error;
> > > >  }
> > > >  
> > > >  /*
> > > > -- 
> > > > 2.21.1
> > > > 
> > > 
> > 
> 

* Re: [PATCH 0/2] xfs: a couple AIL pushing trylock fixes
  2020-03-27 16:44   ` Brian Foster
@ 2020-03-29 16:43     ` Darrick J. Wong
  0 siblings, 0 replies; 19+ messages in thread
From: Darrick J. Wong @ 2020-03-29 16:43 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, Mar 27, 2020 at 12:44:12PM -0400, Brian Foster wrote:
> On Fri, Mar 27, 2020 at 08:32:05AM -0700, Darrick J. Wong wrote:
> > On Thu, Mar 26, 2020 at 09:17:01AM -0400, Brian Foster wrote:
> > > Hi all,
> > > 
> > > Here's a couple more small fixes that fell out of the auto relog work.
> > > The dquot issue is actually a deadlock vector if we randomly relog dquot
> > > buffers (which is only done for test purposes), but I figure we should
> > > handle dquot buffers similar to how inode buffers are handled. Thoughts,
> > > reviews, flames appreciated.
> > 
> > Oops, I missed this one, will review now...
> > 
> > Do you think there needs to be an explicit testcase for this?  Or are
> > the current generic/{388,475} good enough?  I'm pretty sure I've seen
> > this exact deadlock on them every now and again, so we're probably
> > covered.
> > 
> 
> I'm actually not aware of a related upstream deadlock. That doesn't mean
> there isn't one of course, but the problem I hit was related to the
> random buffer relogging stuff in the auto relog series. I split these
> out because xfsaild is intended to be mostly async, so they seemed like
> generic fixups..

<nod> FWIW I'd traced a generic/475 shutdown hang as far as "the AIL
seems to be stuck on a locked dquot buffer" but haven't really had a
chance to look into what was going on at the time.

Whereas before it would usually hang if I let it run more than about 15
minutes, now I've been able to get it to run all night to completion.

--D

> Brian
> 
> > --D
> > 
> > 
> > > Brian
> > > 
> > > Brian Foster (2):
> > >   xfs: trylock underlying buffer on dquot flush
> > >   xfs: return locked status of inode buffer on xfsaild push
> > > 
> > >  fs/xfs/xfs_dquot.c      |  6 +++---
> > >  fs/xfs/xfs_dquot_item.c |  3 ++-
> > >  fs/xfs/xfs_inode_item.c |  3 ++-
> > >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> > >  4 files changed, 16 insertions(+), 10 deletions(-)
> > > 
> > > -- 
> > > 2.21.1
> > > 
> > 
> 

* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-26 13:17 ` [PATCH 1/2] xfs: trylock underlying buffer on dquot flush Brian Foster
  2020-03-27 12:59   ` Christoph Hellwig
  2020-03-27 15:45   ` Darrick J. Wong
@ 2020-03-29 22:46   ` Dave Chinner
  2020-03-29 23:01     ` Dave Chinner
  2020-03-30 12:15     ` Brian Foster
  2 siblings, 2 replies; 19+ messages in thread
From: Dave Chinner @ 2020-03-29 22:46 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> A dquot flush currently blocks on the buffer lock for the underlying
> dquot buffer. In turn, this causes xfsaild to block rather than
> continue processing other items in the meantime. Update
> xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> are handled, and return -EAGAIN if the lock fails. Fix up any
> callers that don't currently handle the error properly.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_dquot.c      |  6 +++---
>  fs/xfs/xfs_dquot_item.c |  3 ++-
>  fs/xfs/xfs_qm.c         | 14 +++++++++-----
>  3 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index 711376ca269f..af2c8e5ceea0 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
> @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
>  	 * Get the buffer containing the on-disk dquot
>  	 */
>  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> -				   &xfs_dquot_buf_ops);
> +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> +				   &bp, &xfs_dquot_buf_ops);
>  	if (error)
>  		goto out_unlock;
>  
> @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
>  
>  out_unlock:
>  	xfs_dqfunlock(dqp);
> -	return -EIO;
> +	return error;
>  }
>  
>  /*
> diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> index cf65e2e43c6e..baad1748d0d1 100644
> --- a/fs/xfs/xfs_dquot_item.c
> +++ b/fs/xfs/xfs_dquot_item.c
> @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
>  		if (!xfs_buf_delwri_queue(bp, buffer_list))
>  			rval = XFS_ITEM_FLUSHING;
>  		xfs_buf_relse(bp);
> -	}
> +	} else if (error == -EAGAIN)
> +		rval = XFS_ITEM_LOCKED;

Doesn't xfs_inode_item_push() also have this problem in that it
doesn't handle -EAGAIN properly?

Also, we can get -EIO, -EFSCORRUPTED, etc here. They probably
shouldn't return XFS_ITEM_SUCCESS, either....

Otherwise seems OK.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-29 22:46   ` Dave Chinner
@ 2020-03-29 23:01     ` Dave Chinner
  2020-03-30 12:15     ` Brian Foster
  1 sibling, 0 replies; 19+ messages in thread
From: Dave Chinner @ 2020-03-29 23:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Mon, Mar 30, 2020 at 09:46:02AM +1100, Dave Chinner wrote:
> On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> > A dquot flush currently blocks on the buffer lock for the underlying
> > dquot buffer. In turn, this causes xfsaild to block rather than
> > continue processing other items in the meantime. Update
> > xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> > are handled, and return -EAGAIN if the lock fails. Fix up any
> > callers that don't currently handle the error properly.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_dquot.c      |  6 +++---
> >  fs/xfs/xfs_dquot_item.c |  3 ++-
> >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> >  3 files changed, 14 insertions(+), 9 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > index 711376ca269f..af2c8e5ceea0 100644
> > --- a/fs/xfs/xfs_dquot.c
> > +++ b/fs/xfs/xfs_dquot.c
> > @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
> >  	 * Get the buffer containing the on-disk dquot
> >  	 */
> >  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> > -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> > -				   &xfs_dquot_buf_ops);
> > +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> > +				   &bp, &xfs_dquot_buf_ops);
> >  	if (error)
> >  		goto out_unlock;
> >  
> > @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
> >  
> >  out_unlock:
> >  	xfs_dqfunlock(dqp);
> > -	return -EIO;
> > +	return error;
> >  }
> >  
> >  /*
> > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> > index cf65e2e43c6e..baad1748d0d1 100644
> > --- a/fs/xfs/xfs_dquot_item.c
> > +++ b/fs/xfs/xfs_dquot_item.c
> > @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
> >  		if (!xfs_buf_delwri_queue(bp, buffer_list))
> >  			rval = XFS_ITEM_FLUSHING;
> >  		xfs_buf_relse(bp);
> > -	}
> > +	} else if (error == -EAGAIN)
> > +		rval = XFS_ITEM_LOCKED;
> 
> Doesn't xfs_inode_item_push() also have this problem in that it
> doesn't handle -EAGAIN properly?

... and now I see this is the second patch in the series...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-29 22:46   ` Dave Chinner
  2020-03-29 23:01     ` Dave Chinner
@ 2020-03-30 12:15     ` Brian Foster
  2020-03-31  0:04       ` Dave Chinner
  1 sibling, 1 reply; 19+ messages in thread
From: Brian Foster @ 2020-03-30 12:15 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Mon, Mar 30, 2020 at 09:46:02AM +1100, Dave Chinner wrote:
> On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> > A dquot flush currently blocks on the buffer lock for the underlying
> > dquot buffer. In turn, this causes xfsaild to block rather than
> > continue processing other items in the meantime. Update
> > xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> > are handled, and return -EAGAIN if the lock fails. Fix up any
> > callers that don't currently handle the error properly.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_dquot.c      |  6 +++---
> >  fs/xfs/xfs_dquot_item.c |  3 ++-
> >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> >  3 files changed, 14 insertions(+), 9 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > index 711376ca269f..af2c8e5ceea0 100644
> > --- a/fs/xfs/xfs_dquot.c
> > +++ b/fs/xfs/xfs_dquot.c
> > @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
> >  	 * Get the buffer containing the on-disk dquot
> >  	 */
> >  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> > -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> > -				   &xfs_dquot_buf_ops);
> > +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> > +				   &bp, &xfs_dquot_buf_ops);
> >  	if (error)
> >  		goto out_unlock;
> >  
> > @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
> >  
> >  out_unlock:
> >  	xfs_dqfunlock(dqp);
> > -	return -EIO;
> > +	return error;
> >  }
> >  
> >  /*
> > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> > index cf65e2e43c6e..baad1748d0d1 100644
> > --- a/fs/xfs/xfs_dquot_item.c
> > +++ b/fs/xfs/xfs_dquot_item.c
> > @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
> >  		if (!xfs_buf_delwri_queue(bp, buffer_list))
> >  			rval = XFS_ITEM_FLUSHING;
> >  		xfs_buf_relse(bp);
> > -	}
> > +	} else if (error == -EAGAIN)
> > +		rval = XFS_ITEM_LOCKED;
> 
> Doesn't xfs_inode_item_push() also have this problem in that it
> doesn't handle -EAGAIN properly?
> 
> Also, we can get -EIO, -EFSCORRUPTED, etc here. They probably
> shouldn't return XFS_ITEM_SUCCESS, either....
> 

Good point. I'm actually not sure what we should return in that case
given the item return codes all seem to assume a valid state. We could
define an XFS_ITEM_ERROR return, but I'm not sure it's worth it for what
is currently stat/tracepoint logic in the caller. Perhaps a broader
rework of error handling in this context is in order that would lift
generic (fatal) error handling into xfsaild. E.g., I see that
xfs_qm_dqflush() is inconsistent by itself in that the item is removed
from the AIL if we're already shut down, but not if that function
invokes the shutdown; we shut down if the direct xfs_dqblk_verify() call
fails but not if the read verifier (which also looks like it calls
xfs_dqblk_verify() on every on-disk dquot) returns -EFSCORRUPTED, etc.
It might make some sense to let iop_push() return negative error codes
if that facilitates consistent error handling...
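
A minimal sketch of what "let iop_push() return negative error codes"
could look like, assuming a hypothetical helper that folds the flush
outcome into a single return value. The toy_* names are made up for
illustration; only the SUCCESS/FLUSHING/LOCKED distinction and the
-EAGAIN handling come from the patch under discussion.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical push results, modelled on the XFS_ITEM_* codes above. */
enum toy_push_result {
	TOY_ITEM_SUCCESS,
	TOY_ITEM_FLUSHING,
	TOY_ITEM_LOCKED,
};

/*
 * Translate a flush outcome into a push return value.
 *
 * flush_error: 0 or a negative errno from the flush attempt.
 * queued:      nonzero if the buffer was newly added to the delwri list
 *              (only meaningful when flush_error == 0).
 *
 * Returns a toy_push_result (>= 0), or a negative errno that the caller
 * (the AIL push loop in this sketch) is expected to handle itself.
 */
static int
toy_push_status(int flush_error, int queued)
{
	assert(flush_error <= 0);

	if (flush_error == 0)
		return queued ? TOY_ITEM_SUCCESS : TOY_ITEM_FLUSHING;
	if (flush_error == -EAGAIN)
		return TOY_ITEM_LOCKED;	/* trylock failed, try again later */
	return flush_error;		/* e.g. -EIO: not a valid item state */
}
```

With this shape, toy_push_status(-EIO, 0) hands -EIO back to the caller
instead of being reported as a successful push.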

Brian

> Otherwise seems OK.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 


* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-30 12:15     ` Brian Foster
@ 2020-03-31  0:04       ` Dave Chinner
  2020-03-31 11:46         ` Brian Foster
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2020-03-31  0:04 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Mon, Mar 30, 2020 at 08:15:44AM -0400, Brian Foster wrote:
> On Mon, Mar 30, 2020 at 09:46:02AM +1100, Dave Chinner wrote:
> > On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> > > A dquot flush currently blocks on the buffer lock for the underlying
> > > dquot buffer. In turn, this causes xfsaild to block rather than
> > > continue processing other items in the meantime. Update
> > > xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> > > are handled, and return -EAGAIN if the lock fails. Fix up any
> > > callers that don't currently handle the error properly.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > >  fs/xfs/xfs_dquot.c      |  6 +++---
> > >  fs/xfs/xfs_dquot_item.c |  3 ++-
> > >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> > >  3 files changed, 14 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > > index 711376ca269f..af2c8e5ceea0 100644
> > > --- a/fs/xfs/xfs_dquot.c
> > > +++ b/fs/xfs/xfs_dquot.c
> > > @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
> > >  	 * Get the buffer containing the on-disk dquot
> > >  	 */
> > >  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> > > -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> > > -				   &xfs_dquot_buf_ops);
> > > +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> > > +				   &bp, &xfs_dquot_buf_ops);
> > >  	if (error)
> > >  		goto out_unlock;
> > >  
> > > @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
> > >  
> > >  out_unlock:
> > >  	xfs_dqfunlock(dqp);
> > > -	return -EIO;
> > > +	return error;
> > >  }
> > >  
> > >  /*
> > > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> > > index cf65e2e43c6e..baad1748d0d1 100644
> > > --- a/fs/xfs/xfs_dquot_item.c
> > > +++ b/fs/xfs/xfs_dquot_item.c
> > > @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
> > >  		if (!xfs_buf_delwri_queue(bp, buffer_list))
> > >  			rval = XFS_ITEM_FLUSHING;
> > >  		xfs_buf_relse(bp);
> > > -	}
> > > +	} else if (error == -EAGAIN)
> > > +		rval = XFS_ITEM_LOCKED;
> > 
> > Doesn't xfs_inode_item_push() also have this problem in that it
> > doesn't handle -EAGAIN properly?
> > 
> > Also, we can get -EIO, -EFSCORRUPTED, etc here. They probably
> > shouldn't return XFS_ITEM_SUCCESS, either....
> > 
> 
> Good point. I'm actually not sure what we should return in that case
> given the item return codes all seem to assume a valid state. We could
> define an XFS_ITEM_ERROR return, but I'm not sure it's worth it for what
> is currently stat/tracepoint logic in the caller.  Perhaps a broader
> rework of error handling in this context is in order that would lift
> generic (fatal) error handling into xfsaild.

Yeah, that's where my thoughts were heading as well.

> E.g., I see that
> xfs_qm_dqflush() is inconsistent by itself in that the item is removed
> from the AIL if we're already shut down, but not if that function
> invokes the shutdown; we shut down if the direct xfs_dqblk_verify() call
> fails but not if the read verifier (which also looks like it calls
> xfs_dqblk_verify() on every on-disk dquot) returns -EFSCORRUPTED, etc.
> It might make some sense to let iop_push() return negative error codes
> if that facilitates consistent error handling...

Yes, it's a bit of a mess. I suspect that what we should be doing
here is pulling the failed buffer write retry code up into the main
push loop. That is, we can set LI_FAILED on log items that fail to
flush, either directly at submit time, or at IO completion for write
errors.

Then we can have the main AIL loop set LI_FAILED on push failures,
and also the main loop detect LI_FAILED directly and call a new
->iop_resubmit() function rather than having to handle the
resubmit cases as special cases in every ->iop_push() path.

That seems like a much cleaner way of handling submission failure
and retries for all log item types that need it compared to the way
we currently handle it for buffers...
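
A rough sketch of the loop structure described above, as a standalone toy
model rather than real AIL code. Everything prefixed toy_ is hypothetical,
including the shape of the ops table and the sample callbacks; the only
ideas taken from the text are a failed flag set by the main loop on push
failure and a separate resubmit callback invoked for items carrying it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_LI_FAILED	(1u << 0)	/* stand-in for LI_FAILED */

struct toy_log_item;

struct toy_item_ops {
	int (*iop_push)(struct toy_log_item *lip);	/* 0 or -errno */
	int (*iop_resubmit)(struct toy_log_item *lip);	/* 0 or -errno */
};

struct toy_log_item {
	unsigned int			li_flags;
	const struct toy_item_ops	*li_ops;
};

/*
 * One pass of a toy push loop. Items already marked failed go through
 * ->iop_resubmit(); everything else goes through ->iop_push(). The loop
 * itself sets or clears the failed flag, so individual push paths do not
 * need their own retry special cases. Returns the number of items left
 * marked failed after the pass.
 */
static size_t
toy_ail_push_pass(struct toy_log_item **items, size_t nr)
{
	size_t failed = 0;

	for (size_t i = 0; i < nr; i++) {
		struct toy_log_item *lip = items[i];
		int error;

		assert(lip && lip->li_ops);
		if (lip->li_flags & TOY_LI_FAILED)
			error = lip->li_ops->iop_resubmit(lip);
		else
			error = lip->li_ops->iop_push(lip);

		if (error) {
			lip->li_flags |= TOY_LI_FAILED;
			failed++;
		} else {
			lip->li_flags &= ~TOY_LI_FAILED;
		}
	}
	return failed;
}

/* Sample callbacks: the first push fails, the resubmit succeeds. */
static int toy_push_fails(struct toy_log_item *lip) { (void)lip; return -EIO; }
static int toy_resubmit_ok(struct toy_log_item *lip) { (void)lip; return 0; }

static const struct toy_item_ops toy_flaky_ops = {
	.iop_push	= toy_push_fails,
	.iop_resubmit	= toy_resubmit_ok,
};
```

Running two passes over an item that uses toy_flaky_ops marks it failed
on the first pass and clears the flag on the second, via the resubmit
callback.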

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [PATCH 1/2] xfs: trylock underlying buffer on dquot flush
  2020-03-31  0:04       ` Dave Chinner
@ 2020-03-31 11:46         ` Brian Foster
  0 siblings, 0 replies; 19+ messages in thread
From: Brian Foster @ 2020-03-31 11:46 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, Mar 31, 2020 at 11:04:09AM +1100, Dave Chinner wrote:
> On Mon, Mar 30, 2020 at 08:15:44AM -0400, Brian Foster wrote:
> > On Mon, Mar 30, 2020 at 09:46:02AM +1100, Dave Chinner wrote:
> > > On Thu, Mar 26, 2020 at 09:17:02AM -0400, Brian Foster wrote:
> > > > A dquot flush currently blocks on the buffer lock for the underlying
> > > > dquot buffer. In turn, this causes xfsaild to block rather than
> > > > continue processing other items in the meantime. Update
> > > > xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
> > > > are handled, and return -EAGAIN if the lock fails. Fix up any
> > > > callers that don't currently handle the error properly.
> > > > 
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > ---
> > > >  fs/xfs/xfs_dquot.c      |  6 +++---
> > > >  fs/xfs/xfs_dquot_item.c |  3 ++-
> > > >  fs/xfs/xfs_qm.c         | 14 +++++++++-----
> > > >  3 files changed, 14 insertions(+), 9 deletions(-)
> > > > 
> > > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > > > index 711376ca269f..af2c8e5ceea0 100644
> > > > --- a/fs/xfs/xfs_dquot.c
> > > > +++ b/fs/xfs/xfs_dquot.c
> > > > @@ -1105,8 +1105,8 @@ xfs_qm_dqflush(
> > > >  	 * Get the buffer containing the on-disk dquot
> > > >  	 */
> > > >  	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
> > > > -				   mp->m_quotainfo->qi_dqchunklen, 0, &bp,
> > > > -				   &xfs_dquot_buf_ops);
> > > > +				   mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
> > > > +				   &bp, &xfs_dquot_buf_ops);
> > > >  	if (error)
> > > >  		goto out_unlock;
> > > >  
> > > > @@ -1177,7 +1177,7 @@ xfs_qm_dqflush(
> > > >  
> > > >  out_unlock:
> > > >  	xfs_dqfunlock(dqp);
> > > > -	return -EIO;
> > > > +	return error;
> > > >  }
> > > >  
> > > >  /*
> > > > diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> > > > index cf65e2e43c6e..baad1748d0d1 100644
> > > > --- a/fs/xfs/xfs_dquot_item.c
> > > > +++ b/fs/xfs/xfs_dquot_item.c
> > > > @@ -189,7 +189,8 @@ xfs_qm_dquot_logitem_push(
> > > >  		if (!xfs_buf_delwri_queue(bp, buffer_list))
> > > >  			rval = XFS_ITEM_FLUSHING;
> > > >  		xfs_buf_relse(bp);
> > > > -	}
> > > > +	} else if (error == -EAGAIN)
> > > > +		rval = XFS_ITEM_LOCKED;
> > > 
> > > Doesn't xfs_inode_item_push() also have this problem in that it
> > > doesn't handle -EAGAIN properly?
> > > 
> > > Also, we can get -EIO, -EFSCORRUPTED, etc here. They probably
> > > shouldn't return XFS_ITEM_SUCCESS, either....
> > > 
> > 
> > Good point. I'm actually not sure what we should return in that case
> > given the item return codes all seem to assume a valid state. We could
> > define an XFS_ITEM_ERROR return, but I'm not sure it's worth it for what
> > is currently stat/tracepoint logic in the caller.  Perhaps a broader
> > rework of error handling in this context is in order that would lift
> > generic (fatal) error handling into xfsaild.
> 
> Yeah, that's where my thoughts were heading as well.
> 
> > E.g., I see that
> > xfs_qm_dqflush() is inconsistent by itself in that the item is removed
> > from the AIL if we're already shut down, but not if that function
> > invokes the shutdown; we shut down if the direct xfs_dqblk_verify() call
> > fails but not if the read verifier (which also looks like it calls
> > xfs_dqblk_verify() on every on-disk dquot) returns -EFSCORRUPTED, etc.
> > It might make some sense to let iop_push() return negative error codes
> > if that facilitates consistent error handling...
> 
> Yes, it's a bit of a mess. I suspect that what we should be doing
> here is pulling the failed buffer write retry code up into the main
> push loop. That is, we can set LI_FAILED on log items that fail to
> flush, either directly at submit time, or at IO completion for write
> errors.
> 
> Then we can have the main AIL loop set LI_FAILED on push failures,
> and also the main loop detect LI_FAILED directly and call a new
> ->iop_resubmit() function rather than having to handle the
> resubmit cases as special cases in every ->iop_push() path.
> 

I'm not sure we want to use LI_FAILED in failure to flush (i.e. push
failure) situations because it's currently used specifically to indicate
that a particular item requires resubmit when it already has been
successfully flushed. This avoids the need for a post I/O error push to
retry an already locked flush lock (and flush attempt) and subsequently
cause the item to remain stuck on the AIL. It still might make sense to
refactor the existing LI_FAILED implementation into ->iop_resubmit()
callbacks for those items that use it, though.

That also doesn't preclude refactoring some sort of generic push failure
error handling into xfsaild for the sake of consistency. It's just not
immediately clear to me what it should look like. Perhaps I'll poke at
it a bit once I get the next rfc of the relog work settled and posted
(soon)..

Brian

> That seems like a much cleaner way of handling submission failure
> and retries for all log item types that need it compared to the way
> we currently handle it for buffers...
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 


Thread overview: 19+ messages
2020-03-26 13:17 [PATCH 0/2] xfs: a couple AIL pushing trylock fixes Brian Foster
2020-03-26 13:17 ` [PATCH 1/2] xfs: trylock underlying buffer on dquot flush Brian Foster
2020-03-27 12:59   ` Christoph Hellwig
2020-03-27 15:45   ` Darrick J. Wong
2020-03-27 16:44     ` Brian Foster
2020-03-27 16:46       ` Brian Foster
2020-03-27 17:04         ` Darrick J. Wong
2020-03-27 16:50       ` Darrick J. Wong
2020-03-29 22:46   ` Dave Chinner
2020-03-29 23:01     ` Dave Chinner
2020-03-30 12:15     ` Brian Foster
2020-03-31  0:04       ` Dave Chinner
2020-03-31 11:46         ` Brian Foster
2020-03-26 13:17 ` [PATCH 2/2] xfs: return locked status of inode buffer on xfsaild push Brian Foster
2020-03-27 13:00   ` Christoph Hellwig
2020-03-27 15:39   ` Darrick J. Wong
2020-03-27 15:32 ` [PATCH 0/2] xfs: a couple AIL pushing trylock fixes Darrick J. Wong
2020-03-27 16:44   ` Brian Foster
2020-03-29 16:43     ` Darrick J. Wong
