* [PATCH v2 0/2] xfs: fix buffer delwri queue state race
@ 2018-06-13 11:05 Brian Foster
  2018-06-13 11:05 ` [PATCH v2 1/2] xfs: refactor buffer submission into a common helper Brian Foster
  2018-06-13 11:05 ` [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission Brian Foster
  0 siblings, 2 replies; 17+ messages in thread
From: Brian Foster @ 2018-06-13 11:05 UTC (permalink / raw)
  To: linux-xfs

Hi all,

Here's a v2 of the patches to fix the buffer delwri queue state race
problem (documented in the patch 2 commit log). This takes a completely
different approach from v1, as suggested by Dave during review of v1.

Patch 1 refactors the buffer submission paths and patch 2 closes the
race by using synchronous buffer I/O for synchronous delwri queues. This
survives a full xfstests run as well as lowmem stress testing of
xfs/305 [1].

Thoughts, reviews, flames appreciated.

Brian

[1] If run long enough, I do eventually hit some presumably unrelated
issues; those are not currently reproducible on their own simply because
the delwri queue issue is more prevalent.

v2:
- Implement sync buffer I/O for sync delwri queues instead of buffer
  wait list stealing.
v1: https://marc.info/?l=linux-xfs&m=152837528705511&w=2

Brian Foster (2):
  xfs: refactor buffer submission into a common helper
  xfs: use sync buffer I/O for sync delwri queue submission

 fs/xfs/xfs_buf.c | 164 +++++++++++++++++++++++------------------------
 1 file changed, 79 insertions(+), 85 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/2] xfs: refactor buffer submission into a common helper
  2018-06-13 11:05 [PATCH v2 0/2] xfs: fix buffer delwri queue state race Brian Foster
@ 2018-06-13 11:05 ` Brian Foster
  2018-06-14 13:43   ` [PATCH v3 " Brian Foster
  2018-06-13 11:05 ` [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission Brian Foster
  1 sibling, 1 reply; 17+ messages in thread
From: Brian Foster @ 2018-06-13 11:05 UTC (permalink / raw)
  To: linux-xfs

Sync and async buffer submission both do generally similar things
with a couple odd exceptions. Refactor the core buffer submission
code into a common helper to isolate buffer submission from
completion handling of synchronous buffer I/O.

This patch does not change behavior. It is a step towards support
for using synchronous buffer I/O via synchronous delwri queue
submission.
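
As a rough userspace sketch of the refactor pattern (not the kernel
code; all names, types and the EIO stand-in below are simplified
hypotheticals), the shared submission logic moves into a common helper
that returns an error, while thin sync/async wrappers decide whether
that error is completed in place or propagated to the caller:

```c
/*
 * Hypothetical model of the split: sketch_buf_submit_common() stands in
 * for __xfs_buf_submit(), the wrappers for xfs_buf_submit() and
 * xfs_buf_submit_wait(). Locking, references and completions are elided.
 */
#include <stdbool.h>

#define EIO 5

struct buf {
	bool shutdown;	/* stand-in for XFS_FORCED_SHUTDOWN() */
	int  error;	/* stand-in for bp->b_error */
	int  submits;	/* counts simulated _xfs_buf_ioapply() calls */
};

/* common path: fail early on shutdown, otherwise issue the I/O */
static int sketch_buf_submit_common(struct buf *bp)
{
	if (bp->shutdown)
		return -EIO;
	bp->submits++;		/* models _xfs_buf_ioapply(bp) */
	return 0;
}

/* async wrapper: completes the buffer itself on submission failure */
static void sketch_buf_submit_async(struct buf *bp)
{
	if (sketch_buf_submit_common(bp))
		bp->error = -EIO;	/* models xfs_buf_ioend(bp) */
}

/* sync wrapper: waits (elided here) and returns the error directly */
static int sketch_buf_submit_wait(struct buf *bp)
{
	int error = sketch_buf_submit_common(bp);

	if (error)
		return error;
	return bp->error;	/* gathered after wait_for_completion() */
}
```

The point of the shape, as in the patch, is that the sync path can now
reuse the exact submission code the async path uses and only differ in
how completion is handled.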

Designed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_buf.c | 86 +++++++++++++++++++++---------------------------
 1 file changed, 38 insertions(+), 48 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index e9c058e3761c..112999ddb75e 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1458,22 +1458,18 @@ _xfs_buf_ioapply(
  * a call to this function unless the caller holds an additional reference
  * itself.
  */
-void
-xfs_buf_submit(
+static int
+__xfs_buf_submit(
 	struct xfs_buf	*bp)
 {
-	trace_xfs_buf_submit(bp, _RET_IP_);
-
 	ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));
-	ASSERT(bp->b_flags & XBF_ASYNC);
 
 	/* on shutdown we stale and complete the buffer immediately */
 	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
 		xfs_buf_ioerror(bp, -EIO);
 		bp->b_flags &= ~XBF_DONE;
 		xfs_buf_stale(bp);
-		xfs_buf_ioend(bp);
-		return;
+		return -EIO;
 	}
 
 	if (bp->b_flags & XBF_WRITE)
@@ -1482,23 +1478,14 @@ xfs_buf_submit(
 	/* clear the internal error state to avoid spurious errors */
 	bp->b_io_error = 0;
 
-	/*
-	 * The caller's reference is released during I/O completion.
-	 * This occurs some time after the last b_io_remaining reference is
-	 * released, so after we drop our Io reference we have to have some
-	 * other reference to ensure the buffer doesn't go away from underneath
-	 * us. Take a direct reference to ensure we have safe access to the
-	 * buffer until we are finished with it.
-	 */
-	xfs_buf_hold(bp);
-
 	/*
 	 * Set the count to 1 initially, this will stop an I/O completion
 	 * callout which happens before we have started all the I/O from calling
 	 * xfs_buf_ioend too early.
 	 */
 	atomic_set(&bp->b_io_remaining, 1);
-	xfs_buf_ioacct_inc(bp);
+	if (bp->b_flags & XBF_ASYNC)
+		xfs_buf_ioacct_inc(bp);
 	_xfs_buf_ioapply(bp);
 
 	/*
@@ -1507,14 +1494,40 @@ xfs_buf_submit(
 	 * that we don't return to the caller with completion still pending.
 	 */
 	if (atomic_dec_and_test(&bp->b_io_remaining) == 1) {
-		if (bp->b_error)
+		if (bp->b_error || !(bp->b_flags & XBF_ASYNC))
 			xfs_buf_ioend(bp);
 		else
 			xfs_buf_ioend_async(bp);
 	}
 
-	xfs_buf_rele(bp);
+	return 0;
+}
+
+void
+xfs_buf_submit(
+	struct xfs_buf	*bp)
+{
+	int		error;
+
+	ASSERT(bp->b_flags & XBF_ASYNC);
+	trace_xfs_buf_submit(bp, _RET_IP_);
+
+	/*
+	 * The caller's reference is released during I/O completion.
+	 * This occurs some time after the last b_io_remaining reference is
+	 * released, so after we drop our Io reference we have to have some
+	 * other reference to ensure the buffer doesn't go away from underneath
+	 * us. Take a direct reference to ensure we have safe access to the
+	 * buffer until we are finished with it.
+	 */
+	xfs_buf_hold(bp);
+
+	error = __xfs_buf_submit(bp);
+	if (error)
+		xfs_buf_ioend(bp);
+
 	/* Note: it is not safe to reference bp now we've dropped our ref */
+	xfs_buf_rele(bp);
 }
 
 /*
@@ -1528,20 +1541,7 @@ xfs_buf_submit_wait(
 
 	trace_xfs_buf_submit_wait(bp, _RET_IP_);
 
-	ASSERT(!(bp->b_flags & (_XBF_DELWRI_Q | XBF_ASYNC)));
-
-	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
-		xfs_buf_ioerror(bp, -EIO);
-		xfs_buf_stale(bp);
-		bp->b_flags &= ~XBF_DONE;
-		return -EIO;
-	}
-
-	if (bp->b_flags & XBF_WRITE)
-		xfs_buf_wait_unpin(bp);
-
-	/* clear the internal error state to avoid spurious errors */
-	bp->b_io_error = 0;
+	ASSERT(!(bp->b_flags & XBF_ASYNC));
 
 	/*
 	 * For synchronous IO, the IO does not inherit the submitters reference
@@ -1551,20 +1551,9 @@ xfs_buf_submit_wait(
 	 */
 	xfs_buf_hold(bp);
 
-	/*
-	 * Set the count to 1 initially, this will stop an I/O completion
-	 * callout which happens before we have started all the I/O from calling
-	 * xfs_buf_ioend too early.
-	 */
-	atomic_set(&bp->b_io_remaining, 1);
-	_xfs_buf_ioapply(bp);
-
-	/*
-	 * make sure we run completion synchronously if it raced with us and is
-	 * already complete.
-	 */
-	if (atomic_dec_and_test(&bp->b_io_remaining) == 1)
-		xfs_buf_ioend(bp);
+	error = __xfs_buf_submit(bp);
+	if (error)
+		goto out;
 
 	/* wait for completion before gathering the error from the buffer */
 	trace_xfs_buf_iowait(bp, _RET_IP_);
@@ -1572,6 +1561,7 @@ xfs_buf_submit_wait(
 	trace_xfs_buf_iowait_done(bp, _RET_IP_);
 	error = bp->b_error;
 
+out:
 	/*
 	 * all done now, we can release the hold that keeps the buffer
 	 * referenced for the entire IO.
-- 
2.17.1



* [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-13 11:05 [PATCH v2 0/2] xfs: fix buffer delwri queue state race Brian Foster
  2018-06-13 11:05 ` [PATCH v2 1/2] xfs: refactor buffer submission into a common helper Brian Foster
@ 2018-06-13 11:05 ` Brian Foster
  2018-06-13 22:08   ` Dave Chinner
  2018-06-15 11:28   ` Christoph Hellwig
  1 sibling, 2 replies; 17+ messages in thread
From: Brian Foster @ 2018-06-13 11:05 UTC (permalink / raw)
  To: linux-xfs

If a buffer is delwri queued while it still sits on a delwri queue
wait list, the queue sets _XBF_DELWRI_Q without changing the state
of ->b_list. This occurs, for example, if another thread beats the
current delwri waiter thread to the buffer lock after I/O
completion. Once the waiter acquires the lock, it removes the buffer
from the wait list and leaves a buffer with _XBF_DELWRI_Q set but
not populated on a list. This results in a lost buffer submission
and in turn can result in assert failures due to _XBF_DELWRI_Q being
set on buffer reclaim or filesystem lockups if the buffer happens to
cover an item in the AIL.

This problem has been reproduced by repeated iterations of xfs/305
on high CPU count (28xcpu) systems with limited memory (~1GB). Dirty
dquot reclaim races with an xfsaild push of a separate dquot backed
by the same buffer such that the buffer sits on the reclaim wait
list at the time xfsaild attempts to queue it. Since the latter
dquot has been flush locked but the underlying buffer not submitted
for I/O, the dquot pins the AIL and causes the filesystem to
livelock.

This race is essentially made possible by the buffer lock cycle
involved with waiting on a synchronous delwri queue submission.
Close the race by using synchronous buffer I/O for respective delwri
queue submission. This means the buffer remains locked across the
I/O and so is inaccessible from other contexts while in the
intermediate wait list state. The sync buffer I/O wait mechanism is
factored into a helper such that sync delwri buffer submission and
serialization are batched operations.
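
The lost-submission interleaving above can be sketched as a deliberately
simplified, single-threaded userspace model (hypothetical names; the
real logic lives in xfs_buf_delwri_queue() and the delwri wait loop):

```c
/*
 * Model of the race: one "list membership" bit stands in for ->b_list
 * and a flag bit for _XBF_DELWRI_Q. The queue attempt only links the
 * buffer onto the new delwri queue when it is not already on some list.
 */
#include <stdbool.h>

#define DELWRI_Q 0x1

struct buf {
	int  flags;	/* models bp->b_flags */
	bool on_list;	/* models !list_empty(&bp->b_list) */
};

/* models the racing delwri queue: set the flag, link only if unlisted */
static bool delwri_queue(struct buf *bp)
{
	bp->flags |= DELWRI_Q;
	if (bp->on_list)
		return false;	/* still on a wait list: not linked */
	bp->on_list = true;	/* list_add_tail() onto the delwri queue */
	return true;
}

/* models the waiter taking the buffer lock and draining its wait list */
static void waiter_removes(struct buf *bp)
{
	bp->on_list = false;	/* list_del_init(&bp->b_list) */
}
```

Running the two steps in the racy order leaves the buffer with
_XBF_DELWRI_Q set but on no list at all, which is exactly the state the
sync buffer I/O approach makes unreachable: the buffer stays locked
across the I/O, so no other context can queue it while it sits on the
wait list.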

Designed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_buf.c | 78 +++++++++++++++++++++++++-----------------------
 1 file changed, 41 insertions(+), 37 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 112999ddb75e..113ab6426a40 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1530,6 +1530,21 @@ xfs_buf_submit(
 	xfs_buf_rele(bp);
 }
 
+/*
+ * Wait on a sync buffer.
+ */
+static int
+xfs_buf_iowait(
+	struct xfs_buf	*bp)
+{
+	/* wait for completion before gathering the error from the buffer */
+	trace_xfs_buf_iowait(bp, _RET_IP_);
+	wait_for_completion(&bp->b_iowait);
+	trace_xfs_buf_iowait_done(bp, _RET_IP_);
+
+	return bp->b_error;
+}
+
 /*
  * Synchronous buffer IO submission path, read or write.
  */
@@ -1554,12 +1569,7 @@ xfs_buf_submit_wait(
 	error = __xfs_buf_submit(bp);
 	if (error)
 		goto out;
-
-	/* wait for completion before gathering the error from the buffer */
-	trace_xfs_buf_iowait(bp, _RET_IP_);
-	wait_for_completion(&bp->b_iowait);
-	trace_xfs_buf_iowait_done(bp, _RET_IP_);
-	error = bp->b_error;
+	error = xfs_buf_iowait(bp);
 
 out:
 	/*
@@ -1962,16 +1972,10 @@ xfs_buf_cmp(
 }
 
 /*
- * submit buffers for write.
- *
- * When we have a large buffer list, we do not want to hold all the buffers
- * locked while we block on the request queue waiting for IO dispatch. To avoid
- * this problem, we lock and submit buffers in groups of 50, thereby minimising
- * the lock hold times for lists which may contain thousands of objects.
- *
- * To do this, we sort the buffer list before we walk the list to lock and
- * submit buffers, and we plug and unplug around each group of buffers we
- * submit.
+ * Submit buffers for write. If wait_list is specified, the buffers are
+ * submitted using sync I/O and placed on the wait list such that the caller can
+ * iowait each buffer. Otherwise async I/O is used and the buffers are released
+ * at I/O completion time.
  */
 static int
 xfs_buf_delwri_submit_buffers(
@@ -2013,21 +2017,22 @@ xfs_buf_delwri_submit_buffers(
 		trace_xfs_buf_delwri_split(bp, _RET_IP_);
 
 		/*
-		 * We do all IO submission async. This means if we need
-		 * to wait for IO completion we need to take an extra
-		 * reference so the buffer is still valid on the other
-		 * side. We need to move the buffer onto the io_list
-		 * at this point so the caller can still access it.
+		 * If we have a wait list, each buffer (and associated delwri
+		 * queue reference) transfers to it and is submitted
+		 * synchronously. Otherwise, drop the buffer from the delwri
+		 * queue and submit async.
 		 */
 		bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL);
-		bp->b_flags |= XBF_WRITE | XBF_ASYNC;
+		bp->b_flags |= XBF_WRITE;
 		if (wait_list) {
-			xfs_buf_hold(bp);
+			bp->b_flags &= ~XBF_ASYNC;
 			list_move_tail(&bp->b_list, wait_list);
-		} else
+			__xfs_buf_submit(bp);
+		} else {
+			bp->b_flags |= XBF_ASYNC;
 			list_del_init(&bp->b_list);
-
-		xfs_buf_submit(bp);
+			xfs_buf_submit(bp);
+		}
 	}
 	blk_finish_plug(&plug);
 
@@ -2074,8 +2079,11 @@ xfs_buf_delwri_submit(
 
 		list_del_init(&bp->b_list);
 
-		/* locking the buffer will wait for async IO completion. */
-		xfs_buf_lock(bp);
+		/*
+		 * Wait on the locked buffer, check for errors and unlock and
+		 * release the delwri queue reference.
+		 */
+		xfs_buf_iowait(bp);
 		error2 = bp->b_error;
 		xfs_buf_relse(bp);
 		if (!error)
@@ -2122,22 +2130,18 @@ xfs_buf_delwri_pushbuf(
 
 	/*
 	 * Delwri submission clears the DELWRI_Q buffer flag and returns with
-	 * the buffer on the wait list with an associated reference. Rather than
+	 * the buffer on the wait list with the original reference. Rather than
 	 * bounce the buffer from a local wait list back to the original list
 	 * after I/O completion, reuse the original list as the wait list.
 	 */
 	xfs_buf_delwri_submit_buffers(&submit_list, buffer_list);
 
 	/*
-	 * The buffer is now under I/O and wait listed as during typical delwri
-	 * submission. Lock the buffer to wait for I/O completion. Rather than
-	 * remove the buffer from the wait list and release the reference, we
-	 * want to return with the buffer queued to the original list. The
-	 * buffer already sits on the original list with a wait list reference,
-	 * however. If we let the queue inherit that wait list reference, all we
-	 * need to do is reset the DELWRI_Q flag.
+	 * The buffer is now locked, under I/O and wait listed on the original
+	 * delwri queue. Wait for I/O completion, restore the DELWRI_Q flag and
+	 * return with the buffer unlocked and on the original queue.
 	 */
-	xfs_buf_lock(bp);
+	xfs_buf_iowait(bp);
 	error = bp->b_error;
 	bp->b_flags |= _XBF_DELWRI_Q;
 	xfs_buf_unlock(bp);
-- 
2.17.1



* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-13 11:05 ` [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission Brian Foster
@ 2018-06-13 22:08   ` Dave Chinner
  2018-06-13 23:29     ` Brian Foster
  2018-06-15 11:28   ` Christoph Hellwig
  1 sibling, 1 reply; 17+ messages in thread
From: Dave Chinner @ 2018-06-13 22:08 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jun 13, 2018 at 07:05:16AM -0400, Brian Foster wrote:
> If a buffer is delwri queued while it still sits on a delwri queue
> wait list, the queue sets _XBF_DELWRI_Q without changing the state
> of ->b_list. This occurs, for example, if another thread beats the
> current delwri waiter thread to the buffer lock after I/O
> completion. Once the waiter acquires the lock, it removes the buffer
> from the wait list and leaves a buffer with _XBF_DELWRI_Q set but
> not populated on a list. This results in a lost buffer submission
> and in turn can result in assert failures due to _XBF_DELWRI_Q being
> set on buffer reclaim or filesystem lockups if the buffer happens to
> cover an item in the AIL.
> 
> This problem has been reproduced by repeated iterations of xfs/305
> on high CPU count (28xcpu) systems with limited memory (~1GB). Dirty
> dquot reclaim races with an xfsaild push of a separate dquot backed
> by the same buffer such that the buffer sits on the reclaim wait
> list at the time xfsaild attempts to queue it. Since the latter
> dquot has been flush locked but the underlying buffer not submitted
> for I/O, the dquot pins the AIL and causes the filesystem to
> livelock.
> 
> This race is essentially made possible by the buffer lock cycle
> involved with waiting on a synchronous delwri queue submission.
> Close the race by using synchronous buffer I/O for respective delwri
> queue submission. This means the buffer remains locked across the
> I/O and so is inaccessible from other contexts while in the
> intermediate wait list state. The sync buffer I/O wait mechanism is
> factored into a helper such that sync delwri buffer submission and
> serialization are batched operations.
> 
> Designed-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---

Just something I noticed on an initial brief scan:

> @@ -2013,21 +2017,22 @@ xfs_buf_delwri_submit_buffers(
>  		trace_xfs_buf_delwri_split(bp, _RET_IP_);
>  
>  		/*
> -		 * We do all IO submission async. This means if we need
> -		 * to wait for IO completion we need to take an extra
> -		 * reference so the buffer is still valid on the other
> -		 * side. We need to move the buffer onto the io_list
> -		 * at this point so the caller can still access it.
> +		 * If we have a wait list, each buffer (and associated delwri
> +		 * queue reference) transfers to it and is submitted
> +		 * synchronously. Otherwise, drop the buffer from the delwri
> +		 * queue and submit async.
>  		 */
>  		bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL);
> -		bp->b_flags |= XBF_WRITE | XBF_ASYNC;
> +		bp->b_flags |= XBF_WRITE;
>  		if (wait_list) {
> -			xfs_buf_hold(bp);
> +			bp->b_flags &= ~XBF_ASYNC;
>  			list_move_tail(&bp->b_list, wait_list);
> -		} else
> +			__xfs_buf_submit(bp);

We lose a buffer submission tracepoint here.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-13 22:08   ` Dave Chinner
@ 2018-06-13 23:29     ` Brian Foster
  2018-06-13 23:37       ` Dave Chinner
  0 siblings, 1 reply; 17+ messages in thread
From: Brian Foster @ 2018-06-13 23:29 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Thu, Jun 14, 2018 at 08:08:07AM +1000, Dave Chinner wrote:
> On Wed, Jun 13, 2018 at 07:05:16AM -0400, Brian Foster wrote:
> > If a buffer is delwri queued while it still sits on a delwri queue
> > wait list, the queue sets _XBF_DELWRI_Q without changing the state
> > of ->b_list. This occurs, for example, if another thread beats the
> > current delwri waiter thread to the buffer lock after I/O
> > completion. Once the waiter acquires the lock, it removes the buffer
> > from the wait list and leaves a buffer with _XBF_DELWRI_Q set but
> > not populated on a list. This results in a lost buffer submission
> > and in turn can result in assert failures due to _XBF_DELWRI_Q being
> > set on buffer reclaim or filesystem lockups if the buffer happens to
> > cover an item in the AIL.
> > 
> > This problem has been reproduced by repeated iterations of xfs/305
> > on high CPU count (28xcpu) systems with limited memory (~1GB). Dirty
> > dquot reclaim races with an xfsaild push of a separate dquot backed
> > by the same buffer such that the buffer sits on the reclaim wait
> > list at the time xfsaild attempts to queue it. Since the latter
> > dquot has been flush locked but the underlying buffer not submitted
> > for I/O, the dquot pins the AIL and causes the filesystem to
> > livelock.
> > 
> > This race is essentially made possible by the buffer lock cycle
> > involved with waiting on a synchronous delwri queue submission.
> > Close the race by using synchronous buffer I/O for respective delwri
> > queue submission. This means the buffer remains locked across the
> > I/O and so is inaccessible from other contexts while in the
> > intermediate wait list state. The sync buffer I/O wait mechanism is
> > factored into a helper such that sync delwri buffer submission and
> > serialization are batched operations.
> > 
> > Designed-by: Dave Chinner <dchinner@redhat.com>
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> 
> Just something I noticed on an initial brief scan:
> 
> > @@ -2013,21 +2017,22 @@ xfs_buf_delwri_submit_buffers(
> >  		trace_xfs_buf_delwri_split(bp, _RET_IP_);
> >  
> >  		/*
> > -		 * We do all IO submission async. This means if we need
> > -		 * to wait for IO completion we need to take an extra
> > -		 * reference so the buffer is still valid on the other
> > -		 * side. We need to move the buffer onto the io_list
> > -		 * at this point so the caller can still access it.
> > +		 * If we have a wait list, each buffer (and associated delwri
> > +		 * queue reference) transfers to it and is submitted
> > +		 * synchronously. Otherwise, drop the buffer from the delwri
> > +		 * queue and submit async.
> >  		 */
> >  		bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL);
> > -		bp->b_flags |= XBF_WRITE | XBF_ASYNC;
> > +		bp->b_flags |= XBF_WRITE;
> >  		if (wait_list) {
> > -			xfs_buf_hold(bp);
> > +			bp->b_flags &= ~XBF_ASYNC;
> >  			list_move_tail(&bp->b_list, wait_list);
> > -		} else
> > +			__xfs_buf_submit(bp);
> 
> We lose a buffer submission tracepoint here.
> 

Yeah, good catch. What do you think about just killing
trace_xfs_buf_submit_wait() and pushing trace_xfs_buf_submit() down into
the new helper? It looks like we can already distinguish the io type
based on ->b_flags.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-13 23:29     ` Brian Foster
@ 2018-06-13 23:37       ` Dave Chinner
  0 siblings, 0 replies; 17+ messages in thread
From: Dave Chinner @ 2018-06-13 23:37 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jun 13, 2018 at 07:29:54PM -0400, Brian Foster wrote:
> On Thu, Jun 14, 2018 at 08:08:07AM +1000, Dave Chinner wrote:
> > On Wed, Jun 13, 2018 at 07:05:16AM -0400, Brian Foster wrote:
> > > If a buffer is delwri queued while it still sits on a delwri queue
> > > wait list, the queue sets _XBF_DELWRI_Q without changing the state
> > > of ->b_list. This occurs, for example, if another thread beats the
> > > current delwri waiter thread to the buffer lock after I/O
> > > completion. Once the waiter acquires the lock, it removes the buffer
> > > from the wait list and leaves a buffer with _XBF_DELWRI_Q set but
> > > not populated on a list. This results in a lost buffer submission
> > > and in turn can result in assert failures due to _XBF_DELWRI_Q being
> > > set on buffer reclaim or filesystem lockups if the buffer happens to
> > > cover an item in the AIL.
> > > 
> > > This problem has been reproduced by repeated iterations of xfs/305
> > > on high CPU count (28xcpu) systems with limited memory (~1GB). Dirty
> > > dquot reclaim races with an xfsaild push of a separate dquot backed
> > > by the same buffer such that the buffer sits on the reclaim wait
> > > list at the time xfsaild attempts to queue it. Since the latter
> > > dquot has been flush locked but the underlying buffer not submitted
> > > for I/O, the dquot pins the AIL and causes the filesystem to
> > > livelock.
> > > 
> > > This race is essentially made possible by the buffer lock cycle
> > > involved with waiting on a synchronous delwri queue submission.
> > > Close the race by using synchronous buffer I/O for respective delwri
> > > queue submission. This means the buffer remains locked across the
> > > I/O and so is inaccessible from other contexts while in the
> > > intermediate wait list state. The sync buffer I/O wait mechanism is
> > > factored into a helper such that sync delwri buffer submission and
> > > serialization are batched operations.
> > > 
> > > Designed-by: Dave Chinner <dchinner@redhat.com>
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > 
> > Just something I noticed on an initial brief scan:
> > 
> > > @@ -2013,21 +2017,22 @@ xfs_buf_delwri_submit_buffers(
> > >  		trace_xfs_buf_delwri_split(bp, _RET_IP_);
> > >  
> > >  		/*
> > > -		 * We do all IO submission async. This means if we need
> > > -		 * to wait for IO completion we need to take an extra
> > > -		 * reference so the buffer is still valid on the other
> > > -		 * side. We need to move the buffer onto the io_list
> > > -		 * at this point so the caller can still access it.
> > > +		 * If we have a wait list, each buffer (and associated delwri
> > > +		 * queue reference) transfers to it and is submitted
> > > +		 * synchronously. Otherwise, drop the buffer from the delwri
> > > +		 * queue and submit async.
> > >  		 */
> > >  		bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL);
> > > -		bp->b_flags |= XBF_WRITE | XBF_ASYNC;
> > > +		bp->b_flags |= XBF_WRITE;
> > >  		if (wait_list) {
> > > -			xfs_buf_hold(bp);
> > > +			bp->b_flags &= ~XBF_ASYNC;
> > >  			list_move_tail(&bp->b_list, wait_list);
> > > -		} else
> > > +			__xfs_buf_submit(bp);
> > 
> > We lose a buffer submission tracepoint here.
> > 
> 
> Yeah, good catch. What do you think about just killing
> trace_xfs_buf_submit_wait() and pushing trace_xfs_buf_submit() down into
> the new helper? It looks like we can already distinguish the io type
> based on ->b_flags.

That's fine by me.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* [PATCH v3 1/2] xfs: refactor buffer submission into a common helper
  2018-06-13 11:05 ` [PATCH v2 1/2] xfs: refactor buffer submission into a common helper Brian Foster
@ 2018-06-14 13:43   ` Brian Foster
  2018-06-15 11:24     ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Brian Foster @ 2018-06-14 13:43 UTC (permalink / raw)
  To: linux-xfs

Sync and async buffer submission both do generally similar things
with a couple odd exceptions. Refactor the core buffer submission
code into a common helper to isolate buffer submission from
completion handling of synchronous buffer I/O.

This patch does not change behavior. It is a step towards support
for using synchronous buffer I/O via synchronous delwri queue
submission.

Designed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
---

v3:
- Leave tracepoint in __xfs_buf_submit and kill
  trace_xfs_buf_submit_wait().

 fs/xfs/xfs_buf.c   | 85 ++++++++++++++++++++--------------------------
 fs/xfs/xfs_trace.h |  1 -
 2 files changed, 37 insertions(+), 49 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index e9c058e3761c..7b0f7c79cd62 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1458,22 +1458,20 @@ _xfs_buf_ioapply(
  * a call to this function unless the caller holds an additional reference
  * itself.
  */
-void
-xfs_buf_submit(
+static int
+__xfs_buf_submit(
 	struct xfs_buf	*bp)
 {
 	trace_xfs_buf_submit(bp, _RET_IP_);
 
 	ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));
-	ASSERT(bp->b_flags & XBF_ASYNC);
 
 	/* on shutdown we stale and complete the buffer immediately */
 	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
 		xfs_buf_ioerror(bp, -EIO);
 		bp->b_flags &= ~XBF_DONE;
 		xfs_buf_stale(bp);
-		xfs_buf_ioend(bp);
-		return;
+		return -EIO;
 	}
 
 	if (bp->b_flags & XBF_WRITE)
@@ -1482,23 +1480,14 @@ xfs_buf_submit(
 	/* clear the internal error state to avoid spurious errors */
 	bp->b_io_error = 0;
 
-	/*
-	 * The caller's reference is released during I/O completion.
-	 * This occurs some time after the last b_io_remaining reference is
-	 * released, so after we drop our Io reference we have to have some
-	 * other reference to ensure the buffer doesn't go away from underneath
-	 * us. Take a direct reference to ensure we have safe access to the
-	 * buffer until we are finished with it.
-	 */
-	xfs_buf_hold(bp);
-
 	/*
 	 * Set the count to 1 initially, this will stop an I/O completion
 	 * callout which happens before we have started all the I/O from calling
 	 * xfs_buf_ioend too early.
 	 */
 	atomic_set(&bp->b_io_remaining, 1);
-	xfs_buf_ioacct_inc(bp);
+	if (bp->b_flags & XBF_ASYNC)
+		xfs_buf_ioacct_inc(bp);
 	_xfs_buf_ioapply(bp);
 
 	/*
@@ -1507,14 +1496,39 @@ xfs_buf_submit(
 	 * that we don't return to the caller with completion still pending.
 	 */
 	if (atomic_dec_and_test(&bp->b_io_remaining) == 1) {
-		if (bp->b_error)
+		if (bp->b_error || !(bp->b_flags & XBF_ASYNC))
 			xfs_buf_ioend(bp);
 		else
 			xfs_buf_ioend_async(bp);
 	}
 
-	xfs_buf_rele(bp);
+	return 0;
+}
+
+void
+xfs_buf_submit(
+	struct xfs_buf	*bp)
+{
+	int		error;
+
+	ASSERT(bp->b_flags & XBF_ASYNC);
+
+	/*
+	 * The caller's reference is released during I/O completion.
+	 * This occurs some time after the last b_io_remaining reference is
+	 * released, so after we drop our Io reference we have to have some
+	 * other reference to ensure the buffer doesn't go away from underneath
+	 * us. Take a direct reference to ensure we have safe access to the
+	 * buffer until we are finished with it.
+	 */
+	xfs_buf_hold(bp);
+
+	error = __xfs_buf_submit(bp);
+	if (error)
+		xfs_buf_ioend(bp);
+
 	/* Note: it is not safe to reference bp now we've dropped our ref */
+	xfs_buf_rele(bp);
 }
 
 /*
@@ -1526,22 +1540,7 @@ xfs_buf_submit_wait(
 {
 	int		error;
 
-	trace_xfs_buf_submit_wait(bp, _RET_IP_);
-
-	ASSERT(!(bp->b_flags & (_XBF_DELWRI_Q | XBF_ASYNC)));
-
-	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
-		xfs_buf_ioerror(bp, -EIO);
-		xfs_buf_stale(bp);
-		bp->b_flags &= ~XBF_DONE;
-		return -EIO;
-	}
-
-	if (bp->b_flags & XBF_WRITE)
-		xfs_buf_wait_unpin(bp);
-
-	/* clear the internal error state to avoid spurious errors */
-	bp->b_io_error = 0;
+	ASSERT(!(bp->b_flags & XBF_ASYNC));
 
 	/*
 	 * For synchronous IO, the IO does not inherit the submitters reference
@@ -1551,20 +1550,9 @@ xfs_buf_submit_wait(
 	 */
 	xfs_buf_hold(bp);
 
-	/*
-	 * Set the count to 1 initially, this will stop an I/O completion
-	 * callout which happens before we have started all the I/O from calling
-	 * xfs_buf_ioend too early.
-	 */
-	atomic_set(&bp->b_io_remaining, 1);
-	_xfs_buf_ioapply(bp);
-
-	/*
-	 * make sure we run completion synchronously if it raced with us and is
-	 * already complete.
-	 */
-	if (atomic_dec_and_test(&bp->b_io_remaining) == 1)
-		xfs_buf_ioend(bp);
+	error = __xfs_buf_submit(bp);
+	if (error)
+		goto out;
 
 	/* wait for completion before gathering the error from the buffer */
 	trace_xfs_buf_iowait(bp, _RET_IP_);
@@ -1572,6 +1560,7 @@ xfs_buf_submit_wait(
 	trace_xfs_buf_iowait_done(bp, _RET_IP_);
 	error = bp->b_error;
 
+out:
 	/*
 	 * all done now, we can release the hold that keeps the buffer
 	 * referenced for the entire IO.
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 972d45d28097..c2d2e513d722 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -310,7 +310,6 @@ DEFINE_BUF_EVENT(xfs_buf_hold);
 DEFINE_BUF_EVENT(xfs_buf_rele);
 DEFINE_BUF_EVENT(xfs_buf_iodone);
 DEFINE_BUF_EVENT(xfs_buf_submit);
-DEFINE_BUF_EVENT(xfs_buf_submit_wait);
 DEFINE_BUF_EVENT(xfs_buf_lock);
 DEFINE_BUF_EVENT(xfs_buf_lock_done);
 DEFINE_BUF_EVENT(xfs_buf_trylock_fail);
-- 
2.17.1



* Re: [PATCH v3 1/2] xfs: refactor buffer submission into a common helper
  2018-06-14 13:43   ` [PATCH v3 " Brian Foster
@ 2018-06-15 11:24     ` Christoph Hellwig
  2018-06-15 11:53       ` Brian Foster
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-06-15 11:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Jun 14, 2018 at 09:43:07AM -0400, Brian Foster wrote:
> Sync and async buffer submission both do generally similar things
> with a couple odd exceptions. Refactor the core buffer submission
> code into a common helper to isolate buffer submission from
> completion handling of synchronous buffer I/O.
> 
> This patch does not change behavior. It is a step towards support
> for using synchronous buffer I/O via synchronous delwri queue
> submission.
> 
> Designed-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> v3:
> - Leave tracepoint in __xfs_buf_submit and kill
>   trace_xfs_buf_submit_wait().
> 
>  fs/xfs/xfs_buf.c   | 85 ++++++++++++++++++++--------------------------
>  fs/xfs/xfs_trace.h |  1 -
>  2 files changed, 37 insertions(+), 49 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index e9c058e3761c..7b0f7c79cd62 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1458,22 +1458,20 @@ _xfs_buf_ioapply(
>   * a call to this function unless the caller holds an additional reference
>   * itself.
>   */
> -void
> -xfs_buf_submit(
> +static int
> +__xfs_buf_submit(
>  	struct xfs_buf	*bp)
>  {
>  	trace_xfs_buf_submit(bp, _RET_IP_);
>  
>  	ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));
> -	ASSERT(bp->b_flags & XBF_ASYNC);
>  
>  	/* on shutdown we stale and complete the buffer immediately */
>  	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
>  		xfs_buf_ioerror(bp, -EIO);
>  		bp->b_flags &= ~XBF_DONE;
>  		xfs_buf_stale(bp);
> -		xfs_buf_ioend(bp);
> -		return;
> +		return -EIO;
>  	}
>  
>  	if (bp->b_flags & XBF_WRITE)
> @@ -1482,23 +1480,14 @@ xfs_buf_submit(
>  	/* clear the internal error state to avoid spurious errors */
>  	bp->b_io_error = 0;
>  
> -	/*
> -	 * The caller's reference is released during I/O completion.
> -	 * This occurs some time after the last b_io_remaining reference is
> -	 * released, so after we drop our Io reference we have to have some
> -	 * other reference to ensure the buffer doesn't go away from underneath
> -	 * us. Take a direct reference to ensure we have safe access to the
> -	 * buffer until we are finished with it.
> -	 */
> -	xfs_buf_hold(bp);
> -
>  	/*
>  	 * Set the count to 1 initially, this will stop an I/O completion
>  	 * callout which happens before we have started all the I/O from calling
>  	 * xfs_buf_ioend too early.
>  	 */
>  	atomic_set(&bp->b_io_remaining, 1);
> -	xfs_buf_ioacct_inc(bp);
> +	if (bp->b_flags & XBF_ASYNC)
> +		xfs_buf_ioacct_inc(bp);
>  	_xfs_buf_ioapply(bp);
>  
>  	/*
> @@ -1507,14 +1496,39 @@ xfs_buf_submit(
>  	 * that we don't return to the caller with completion still pending.
>  	 */
>  	if (atomic_dec_and_test(&bp->b_io_remaining) == 1) {
> -		if (bp->b_error)
> +		if (bp->b_error || !(bp->b_flags & XBF_ASYNC))
>  			xfs_buf_ioend(bp);
>  		else
>  			xfs_buf_ioend_async(bp);
>  	}
>  
> -	xfs_buf_rele(bp);
> +	return 0;
> +}
> +
> +void
> +xfs_buf_submit(
> +	struct xfs_buf	*bp)
> +{
> +	int		error;
> +
> +	ASSERT(bp->b_flags & XBF_ASYNC);
> +
> +	/*
> +	 * The caller's reference is released during I/O completion.
> +	 * This occurs some time after the last b_io_remaining reference is
> +	 * released, so after we drop our Io reference we have to have some
> +	 * other reference to ensure the buffer doesn't go away from underneath
> +	 * us. Take a direct reference to ensure we have safe access to the
> +	 * buffer until we are finished with it.
> +	 */
> +	xfs_buf_hold(bp);
> +
> +	error = __xfs_buf_submit(bp);
> +	if (error)
> +		xfs_buf_ioend(bp);
> +

It seems like we could simply throw in a:

	if (!(bp->b_flags & XBF_ASYNC)) {
		trace_xfs_buf_iowait(bp, _RET_IP_);
		wait_for_completion(&bp->b_iowait);
		trace_xfs_buf_iowait_done(bp, _RET_IP_);
		error = bp->b_error;
	}

here and get rid of the separate xfs_buf_submit_wait function
entirely.  Or even factor the above out into a nice little helper.

Otherwise this looks fine to me:

Signed-off-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-13 11:05 ` [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission Brian Foster
  2018-06-13 22:08   ` Dave Chinner
@ 2018-06-15 11:28   ` Christoph Hellwig
  2018-06-15 11:53     ` Brian Foster
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-06-15 11:28 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jun 13, 2018 at 07:05:16AM -0400, Brian Foster wrote:
> If a delwri queue occurs of a buffer that sits on a delwri queue
> wait list, the queue sets _XBF_DELWRI_Q without changing the state
> of ->b_list. This occurs, for example, if another thread beats the
> current delwri waiter thread to the buffer lock after I/O
> completion. Once the waiter acquires the lock, it removes the buffer
> from the wait list and leaves a buffer with _XBF_DELWRI_Q set but
> not populated on a list. This results in a lost buffer submission
> and in turn can result in assert failures due to _XBF_DELWRI_Q being
> set on buffer reclaim or filesystem lockups if the buffer happens to
> cover an item in the AIL.
> 
> This problem has been reproduced by repeated iterations of xfs/305
> on high CPU count (28xcpu) systems with limited memory (~1GB). Dirty
> dquot reclaim races with an xfsaild push of a separate dquot backed
> by the same buffer such that the buffer sits on the reclaim wait
> list at the time xfsaild attempts to queue it. Since the latter
> dquot has been flush locked but the underlying buffer not submitted
> for I/O, the dquot pins the AIL and causes the filesystem to
> livelock.
> 
> This race is essentially made possible by the buffer lock cycle
> involved with waiting on a synchronous delwri queue submission.
> Close the race by using synchronous buffer I/O for respective delwri
> queue submission. This means the buffer remains locked across the
> I/O and so is inaccessible from other contexts while in the
> intermediate wait list state. The sync buffer I/O wait mechanism is
> factored into a helper such that sync delwri buffer submission and
> serialization are batched operations.
> 
> Designed-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_buf.c | 78 +++++++++++++++++++++++++-----------------------
>  1 file changed, 41 insertions(+), 37 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 112999ddb75e..113ab6426a40 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1530,6 +1530,21 @@ xfs_buf_submit(
>  	xfs_buf_rele(bp);
>  }
>  
> +/*
> + * Wait on a sync buffer.
> + */
> +static int
> +xfs_buf_iowait(
> +	struct xfs_buf	*bp)
> +{
> +	/* wait for completion before gathering the error from the buffer */

That comment seems to state the obvious based on the function name
and top of function comment.  I'd just remove it.

> @@ -2013,21 +2017,22 @@ xfs_buf_delwri_submit_buffers(
>  		trace_xfs_buf_delwri_split(bp, _RET_IP_);
>  
>  		/*
> +		 * If we have a wait list, each buffer (and associated delwri
> +		 * queue reference) transfers to it and is submitted
> +		 * synchronously. Otherwise, drop the buffer from the delwri
> +		 * queue and submit async.
>  		 */
>  		bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL);
> +		bp->b_flags |= XBF_WRITE;
>  		if (wait_list) {
> +			bp->b_flags &= ~XBF_ASYNC;
>  			list_move_tail(&bp->b_list, wait_list);
> +			__xfs_buf_submit(bp);
> +		} else {
> +			bp->b_flags |= XBF_ASYNC;
>  			list_del_init(&bp->b_list);
> +			xfs_buf_submit(bp);
> +		}

Ok, that breaks my idea of just checking XBF_ASYNC in the previous
reply.  But we could still do that with an explicit flag instead of
the duplication.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/2] xfs: refactor buffer submission into a common helper
  2018-06-15 11:24     ` Christoph Hellwig
@ 2018-06-15 11:53       ` Brian Foster
  0 siblings, 0 replies; 17+ messages in thread
From: Brian Foster @ 2018-06-15 11:53 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, Jun 15, 2018 at 04:24:14AM -0700, Christoph Hellwig wrote:
> On Thu, Jun 14, 2018 at 09:43:07AM -0400, Brian Foster wrote:
> > Sync and async buffer submission both do generally similar things
> > with a couple odd exceptions. Refactor the core buffer submission
> > code into a common helper to isolate buffer submission from
> > completion handling of synchronous buffer I/O.
> > 
> > This patch does not change behavior. It is a step towards support
> > for using synchronous buffer I/O via synchronous delwri queue
> > submission.
> > 
> > Designed-by: Dave Chinner <dchinner@redhat.com>
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> > 
> > v3:
> > - Leave tracepoint in __xfs_buf_submit and kill
> >   trace_xfs_buf_submit_wait().
> > 
> >  fs/xfs/xfs_buf.c   | 85 ++++++++++++++++++++--------------------------
> >  fs/xfs/xfs_trace.h |  1 -
> >  2 files changed, 37 insertions(+), 49 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index e9c058e3761c..7b0f7c79cd62 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -1458,22 +1458,20 @@ _xfs_buf_ioapply(
> >   * a call to this function unless the caller holds an additional reference
> >   * itself.
> >   */
> > -void
> > -xfs_buf_submit(
> > +static int
> > +__xfs_buf_submit(
> >  	struct xfs_buf	*bp)
> >  {
> >  	trace_xfs_buf_submit(bp, _RET_IP_);
> >  
> >  	ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));
> > -	ASSERT(bp->b_flags & XBF_ASYNC);
> >  
> >  	/* on shutdown we stale and complete the buffer immediately */
> >  	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
> >  		xfs_buf_ioerror(bp, -EIO);
> >  		bp->b_flags &= ~XBF_DONE;
> >  		xfs_buf_stale(bp);
> > -		xfs_buf_ioend(bp);
> > -		return;
> > +		return -EIO;
> >  	}
> >  
> >  	if (bp->b_flags & XBF_WRITE)
> > @@ -1482,23 +1480,14 @@ xfs_buf_submit(
> >  	/* clear the internal error state to avoid spurious errors */
> >  	bp->b_io_error = 0;
> >  
> > -	/*
> > -	 * The caller's reference is released during I/O completion.
> > -	 * This occurs some time after the last b_io_remaining reference is
> > -	 * released, so after we drop our Io reference we have to have some
> > -	 * other reference to ensure the buffer doesn't go away from underneath
> > -	 * us. Take a direct reference to ensure we have safe access to the
> > -	 * buffer until we are finished with it.
> > -	 */
> > -	xfs_buf_hold(bp);
> > -
> >  	/*
> >  	 * Set the count to 1 initially, this will stop an I/O completion
> >  	 * callout which happens before we have started all the I/O from calling
> >  	 * xfs_buf_ioend too early.
> >  	 */
> >  	atomic_set(&bp->b_io_remaining, 1);
> > -	xfs_buf_ioacct_inc(bp);
> > +	if (bp->b_flags & XBF_ASYNC)
> > +		xfs_buf_ioacct_inc(bp);
> >  	_xfs_buf_ioapply(bp);
> >  
> >  	/*
> > @@ -1507,14 +1496,39 @@ xfs_buf_submit(
> >  	 * that we don't return to the caller with completion still pending.
> >  	 */
> >  	if (atomic_dec_and_test(&bp->b_io_remaining) == 1) {
> > -		if (bp->b_error)
> > +		if (bp->b_error || !(bp->b_flags & XBF_ASYNC))
> >  			xfs_buf_ioend(bp);
> >  		else
> >  			xfs_buf_ioend_async(bp);
> >  	}
> >  
> > -	xfs_buf_rele(bp);
> > +	return 0;
> > +}
> > +
> > +void
> > +xfs_buf_submit(
> > +	struct xfs_buf	*bp)
> > +{
> > +	int		error;
> > +
> > +	ASSERT(bp->b_flags & XBF_ASYNC);
> > +
> > +	/*
> > +	 * The caller's reference is released during I/O completion.
> > +	 * This occurs some time after the last b_io_remaining reference is
> > +	 * released, so after we drop our Io reference we have to have some
> > +	 * other reference to ensure the buffer doesn't go away from underneath
> > +	 * us. Take a direct reference to ensure we have safe access to the
> > +	 * buffer until we are finished with it.
> > +	 */
> > +	xfs_buf_hold(bp);
> > +
> > +	error = __xfs_buf_submit(bp);
> > +	if (error)
> > +		xfs_buf_ioend(bp);
> > +
> 
> It seems like we could simply throw in a:
> 
> 	if (!(bp->b_flags & XBF_ASYNC)) {
> 		trace_xfs_buf_iowait(bp, _RET_IP_);
> 		wait_for_completion(&bp->b_iowait);
> 		trace_xfs_buf_iowait_done(bp, _RET_IP_);
> 		error = bp->b_error;
> 	}
> 
> here and get rid of the separate xfs_buf_submit_wait function
> entirely.  Or even factor the above out into a nice little helper.
> 

As you already noted in patch 2, we need this code split up to support
fixing the delwri queue thing..

> Otherwise this looks fine to me:
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Thanks. I assume you mean Reviewed-by... ;)

Brian

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-15 11:28   ` Christoph Hellwig
@ 2018-06-15 11:53     ` Brian Foster
  2018-06-15 12:16       ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Brian Foster @ 2018-06-15 11:53 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, Jun 15, 2018 at 04:28:20AM -0700, Christoph Hellwig wrote:
> On Wed, Jun 13, 2018 at 07:05:16AM -0400, Brian Foster wrote:
> > If a delwri queue occurs of a buffer that sits on a delwri queue
> > wait list, the queue sets _XBF_DELWRI_Q without changing the state
> > of ->b_list. This occurs, for example, if another thread beats the
> > current delwri waiter thread to the buffer lock after I/O
> > completion. Once the waiter acquires the lock, it removes the buffer
> > from the wait list and leaves a buffer with _XBF_DELWRI_Q set but
> > not populated on a list. This results in a lost buffer submission
> > and in turn can result in assert failures due to _XBF_DELWRI_Q being
> > set on buffer reclaim or filesystem lockups if the buffer happens to
> > cover an item in the AIL.
> > 
> > This problem has been reproduced by repeated iterations of xfs/305
> > on high CPU count (28xcpu) systems with limited memory (~1GB). Dirty
> > dquot reclaim races with an xfsaild push of a separate dquot backed
> > by the same buffer such that the buffer sits on the reclaim wait
> > list at the time xfsaild attempts to queue it. Since the latter
> > dquot has been flush locked but the underlying buffer not submitted
> > for I/O, the dquot pins the AIL and causes the filesystem to
> > livelock.
> > 
> > This race is essentially made possible by the buffer lock cycle
> > involved with waiting on a synchronous delwri queue submission.
> > Close the race by using synchronous buffer I/O for respective delwri
> > queue submission. This means the buffer remains locked across the
> > I/O and so is inaccessible from other contexts while in the
> > intermediate wait list state. The sync buffer I/O wait mechanism is
> > factored into a helper such that sync delwri buffer submission and
> > serialization are batched operations.
> > 
> > Designed-by: Dave Chinner <dchinner@redhat.com>
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_buf.c | 78 +++++++++++++++++++++++++-----------------------
> >  1 file changed, 41 insertions(+), 37 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index 112999ddb75e..113ab6426a40 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -1530,6 +1530,21 @@ xfs_buf_submit(
> >  	xfs_buf_rele(bp);
> >  }
> >  
> > +/*
> > + * Wait on a sync buffer.
> > + */
> > +static int
> > +xfs_buf_iowait(
> > +	struct xfs_buf	*bp)
> > +{
> > +	/* wait for completion before gathering the error from the buffer */
> 
> That comment seems to state the obvious based on the function name
> and top of function comment.  I'd just remove it.
> 

Ok, I'll fold it into the above.

> > @@ -2013,21 +2017,22 @@ xfs_buf_delwri_submit_buffers(
> >  		trace_xfs_buf_delwri_split(bp, _RET_IP_);
> >  
> >  		/*
> > +		 * If we have a wait list, each buffer (and associated delwri
> > +		 * queue reference) transfers to it and is submitted
> > +		 * synchronously. Otherwise, drop the buffer from the delwri
> > +		 * queue and submit async.
> >  		 */
> >  		bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL);
> > +		bp->b_flags |= XBF_WRITE;
> >  		if (wait_list) {
> > +			bp->b_flags &= ~XBF_ASYNC;
> >  			list_move_tail(&bp->b_list, wait_list);
> > +			__xfs_buf_submit(bp);
> > +		} else {
> > +			bp->b_flags |= XBF_ASYNC;
> >  			list_del_init(&bp->b_list);
> > +			xfs_buf_submit(bp);
> > +		}
> 
> Ok, that breaks my idea of just checking XBF_ASYNC in the previous
> reply.  But we could still do that with an explicit flag instead of
> the duplication.

Not totally sure I follow... do you essentially mean to rename
xfs_buf_submit_wait() -> xfs_buf_submit() and make the iowait
conditional on !XBF_ASYNC and absence of some new "sync_nowait"
parameter to the function? Could you clarify how you envision the
updated xfs_buf_submit() function signature to look?

If I'm following correctly, that seems fairly reasonable at first
thought. This is a separate patch though (refactoring the interface vs.
refactoring the implementation to fix a bug).

Brian


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-15 11:53     ` Brian Foster
@ 2018-06-15 12:16       ` Christoph Hellwig
  2018-06-15 12:39         ` Brian Foster
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-06-15 12:16 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Fri, Jun 15, 2018 at 07:53:35AM -0400, Brian Foster wrote:
> Not totally sure I follow... do you essentially mean to rename
> xfs_buf_submit_wait() -> xfs_buf_submit() and make the iowait
> conditional on !XBF_ASYNC and absence of some new "sync_nowait"
> parameter to the function? Could you clarify how you envision the
> updated xfs_buf_submit() function signature to look?
> 
> If I'm following correctly, that seems fairly reasonable at first
> thought. This is a separate patch though (refactoring the interface vs.
> refactoring the implementation to fix a bug).

Well.  I'd suggest something like the patch below for patch 1:

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 55661cbdb51b..5504525d345b 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1417,28 +1417,33 @@ _xfs_buf_ioapply(
 	blk_finish_plug(&plug);
 }
 
-/*
- * Asynchronous IO submission path. This transfers the buffer lock ownership and
- * the current reference to the IO. It is not safe to reference the buffer after
- * a call to this function unless the caller holds an additional reference
- * itself.
- */
-void
-xfs_buf_submit(
+static int
+xfs_buf_iowait(
 	struct xfs_buf	*bp)
+{
+	trace_xfs_buf_iowait(bp, _RET_IP_);
+	wait_for_completion(&bp->b_iowait);
+	trace_xfs_buf_iowait_done(bp, _RET_IP_);
+	return bp->b_error;
+}
+
+static int
+__xfs_buf_submit(
+	struct xfs_buf	*bp,
+	bool		wait)
 {
 	trace_xfs_buf_submit(bp, _RET_IP_);
 
 	ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));
-	ASSERT(bp->b_flags & XBF_ASYNC);
 
 	/* on shutdown we stale and complete the buffer immediately */
 	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
 		xfs_buf_ioerror(bp, -EIO);
 		bp->b_flags &= ~XBF_DONE;
 		xfs_buf_stale(bp);
-		xfs_buf_ioend(bp);
-		return;
+		if (!wait)
+			xfs_buf_ioend(bp);
+		return -EIO;
 	}
 
 	if (bp->b_flags & XBF_WRITE)
@@ -1463,7 +1468,8 @@ xfs_buf_submit(
 	 * xfs_buf_ioend too early.
 	 */
 	atomic_set(&bp->b_io_remaining, 1);
-	xfs_buf_ioacct_inc(bp);
+	if (bp->b_flags & XBF_ASYNC)
+		xfs_buf_ioacct_inc(bp);
 	_xfs_buf_ioapply(bp);
 
 	/*
@@ -1472,14 +1478,31 @@ xfs_buf_submit(
 	 * that we don't return to the caller with completion still pending.
 	 */
 	if (atomic_dec_and_test(&bp->b_io_remaining) == 1) {
-		if (bp->b_error)
+		if (bp->b_error || !(bp->b_flags & XBF_ASYNC))
 			xfs_buf_ioend(bp);
 		else
 			xfs_buf_ioend_async(bp);
 	}
 
+	if (wait)
+		xfs_buf_iowait(bp);
 	xfs_buf_rele(bp);
 	/* Note: it is not safe to reference bp now we've dropped our ref */
+	return 0;
+}
+
+/*
+ * Asynchronous IO submission path. This transfers the buffer lock ownership and
+ * the current reference to the IO. It is not safe to reference the buffer after
+ * a call to this function unless the caller holds an additional reference
+ * itself.
+ */
+void
+xfs_buf_submit(
+	struct xfs_buf	*bp)
+{
+	ASSERT(bp->b_flags & XBF_ASYNC);
+	__xfs_buf_submit(bp, false);
 }
 
 /*
@@ -1489,60 +1512,8 @@ int
 xfs_buf_submit_wait(
 	struct xfs_buf	*bp)
 {
-	int		error;
-
-	trace_xfs_buf_submit_wait(bp, _RET_IP_);
-
-	ASSERT(!(bp->b_flags & (_XBF_DELWRI_Q | XBF_ASYNC)));
-
-	if (XFS_FORCED_SHUTDOWN(bp->b_target->bt_mount)) {
-		xfs_buf_ioerror(bp, -EIO);
-		xfs_buf_stale(bp);
-		bp->b_flags &= ~XBF_DONE;
-		return -EIO;
-	}
-
-	if (bp->b_flags & XBF_WRITE)
-		xfs_buf_wait_unpin(bp);
-
-	/* clear the internal error state to avoid spurious errors */
-	bp->b_io_error = 0;
-
-	/*
-	 * For synchronous IO, the IO does not inherit the submitters reference
-	 * count, nor the buffer lock. Hence we cannot release the reference we
-	 * are about to take until we've waited for all IO completion to occur,
-	 * including any xfs_buf_ioend_async() work that may be pending.
-	 */
-	xfs_buf_hold(bp);
-
-	/*
-	 * Set the count to 1 initially, this will stop an I/O completion
-	 * callout which happens before we have started all the I/O from calling
-	 * xfs_buf_ioend too early.
-	 */
-	atomic_set(&bp->b_io_remaining, 1);
-	_xfs_buf_ioapply(bp);
-
-	/*
-	 * make sure we run completion synchronously if it raced with us and is
-	 * already complete.
-	 */
-	if (atomic_dec_and_test(&bp->b_io_remaining) == 1)
-		xfs_buf_ioend(bp);
-
-	/* wait for completion before gathering the error from the buffer */
-	trace_xfs_buf_iowait(bp, _RET_IP_);
-	wait_for_completion(&bp->b_iowait);
-	trace_xfs_buf_iowait_done(bp, _RET_IP_);
-	error = bp->b_error;
-
-	/*
-	 * all done now, we can release the hold that keeps the buffer
-	 * referenced for the entire IO.
-	 */
-	xfs_buf_rele(bp);
-	return error;
+	ASSERT(!(bp->b_flags & XBF_ASYNC));
+	return __xfs_buf_submit(bp, true);
 }
 
 void *
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 8955254b900e..4d3a9adf0ce3 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -322,7 +322,6 @@ DEFINE_BUF_EVENT(xfs_buf_hold);
 DEFINE_BUF_EVENT(xfs_buf_rele);
 DEFINE_BUF_EVENT(xfs_buf_iodone);
 DEFINE_BUF_EVENT(xfs_buf_submit);
-DEFINE_BUF_EVENT(xfs_buf_submit_wait);
 DEFINE_BUF_EVENT(xfs_buf_lock);
 DEFINE_BUF_EVENT(xfs_buf_lock_done);
 DEFINE_BUF_EVENT(xfs_buf_trylock_fail);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-15 12:16       ` Christoph Hellwig
@ 2018-06-15 12:39         ` Brian Foster
  2018-06-15 16:31           ` Darrick J. Wong
  2018-06-18 11:21           ` Christoph Hellwig
  0 siblings, 2 replies; 17+ messages in thread
From: Brian Foster @ 2018-06-15 12:39 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, Jun 15, 2018 at 05:16:02AM -0700, Christoph Hellwig wrote:
> On Fri, Jun 15, 2018 at 07:53:35AM -0400, Brian Foster wrote:
> > Not totally sure I follow... do you essentially mean to rename
> > xfs_buf_submit_wait() -> xfs_buf_submit() and make the iowait
> > conditional on !XBF_ASYNC and absence of some new "sync_nowait"
> > parameter to the function? Could you clarify how you envision the
> > updated xfs_buf_submit() function signature to look?
> > 
> > If I'm following correctly, that seems fairly reasonable at first
> > thought. This is a separate patch though (refactoring the interface vs.
> > refactoring the implementation to fix a bug).
> 
> Well.  I'd suggest something like the patch below for patch 1:
> 

Ok, codewise I don't have much of a preference, but I don't think it's
worth redoing the regression testing and lowmem testing and whatnot just
to change how the guts are refactored here. What's the endgame? I came
up with the following on top of patch 2. Compile tested only, and I can
refold the _common() helper back into the caller and invert the
nowait logic or whatnot.

Brian

--- 8< ---

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 311ca301b7fd..89d8cedf2828 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -757,11 +757,7 @@ _xfs_buf_read(
 	bp->b_flags &= ~(XBF_WRITE | XBF_ASYNC | XBF_READ_AHEAD);
 	bp->b_flags |= flags & (XBF_READ | XBF_ASYNC | XBF_READ_AHEAD);
 
-	if (flags & XBF_ASYNC) {
-		xfs_buf_submit(bp);
-		return 0;
-	}
-	return xfs_buf_submit_wait(bp);
+	return xfs_buf_submit(bp);
 }
 
 xfs_buf_t *
@@ -846,7 +842,7 @@ xfs_buf_read_uncached(
 	bp->b_flags |= XBF_READ;
 	bp->b_ops = ops;
 
-	xfs_buf_submit_wait(bp);
+	xfs_buf_submit(bp);
 	if (bp->b_error) {
 		int	error = bp->b_error;
 		xfs_buf_relse(bp);
@@ -1249,7 +1245,7 @@ xfs_bwrite(
 	bp->b_flags &= ~(XBF_ASYNC | XBF_READ | _XBF_DELWRI_Q |
 			 XBF_WRITE_FAIL | XBF_DONE);
 
-	error = xfs_buf_submit_wait(bp);
+	error = xfs_buf_submit(bp);
 	if (error) {
 		xfs_force_shutdown(bp->b_target->bt_mount,
 				   SHUTDOWN_META_IO_ERROR);
@@ -1459,7 +1455,7 @@ _xfs_buf_ioapply(
  * itself.
  */
 static int
-__xfs_buf_submit(
+__xfs_buf_submit_common(
 	struct xfs_buf	*bp)
 {
 	trace_xfs_buf_submit(bp, _RET_IP_);
@@ -1505,32 +1501,6 @@ __xfs_buf_submit(
 	return 0;
 }
 
-void
-xfs_buf_submit(
-	struct xfs_buf	*bp)
-{
-	int		error;
-
-	ASSERT(bp->b_flags & XBF_ASYNC);
-
-	/*
-	 * The caller's reference is released during I/O completion.
-	 * This occurs some time after the last b_io_remaining reference is
-	 * released, so after we drop our Io reference we have to have some
-	 * other reference to ensure the buffer doesn't go away from underneath
-	 * us. Take a direct reference to ensure we have safe access to the
-	 * buffer until we are finished with it.
-	 */
-	xfs_buf_hold(bp);
-
-	error = __xfs_buf_submit(bp);
-	if (error)
-		xfs_buf_ioend(bp);
-
-	/* Note: it is not safe to reference bp now we've dropped our ref */
-	xfs_buf_rele(bp);
-}
-
 /*
  * Wait for I/O completion of a sync buffer and return the I/O error code.
  */
@@ -1549,30 +1519,33 @@ xfs_buf_iowait(
  * Synchronous buffer IO submission path, read or write.
  */
 int
-xfs_buf_submit_wait(
-	struct xfs_buf	*bp)
+__xfs_buf_submit(
+	struct xfs_buf	*bp,
+	bool		sync_nowait)
 {
 	int		error;
 
-	ASSERT(!(bp->b_flags & XBF_ASYNC));
-
 	/*
-	 * For synchronous IO, the IO does not inherit the submitters reference
-	 * count, nor the buffer lock. Hence we cannot release the reference we
-	 * are about to take until we've waited for all IO completion to occur,
-	 * including any xfs_buf_ioend_async() work that may be pending.
+	 * Grab a reference so the buffer does not go away underneath us. For
+	 * async buffers, I/O completion drops the callers reference, which
+	 * could occur before submission returns.
 	 */
 	xfs_buf_hold(bp);
 
-	error = __xfs_buf_submit(bp);
-	if (error)
+	error = __xfs_buf_submit_common(bp);
+	if (error) {
+		if (bp->b_flags & XBF_ASYNC)
+			xfs_buf_ioend(bp);
 		goto out;
-	error = xfs_buf_iowait(bp);
+	}
 
+	if (!(bp->b_flags & XBF_ASYNC) && !sync_nowait)
+		error = xfs_buf_iowait(bp);
 out:
 	/*
-	 * all done now, we can release the hold that keeps the buffer
-	 * referenced for the entire IO.
+	 * Release the hold that keeps the buffer referenced for the entire
+	 * I/O. Note that if the buffer is async, it is not safe to reference
+	 * after this release.
 	 */
 	xfs_buf_rele(bp);
 	return error;
@@ -1984,6 +1957,7 @@ xfs_buf_delwri_submit_buffers(
 	LIST_HEAD		(submit_list);
 	int			pinned = 0;
 	struct blk_plug		plug;
+	bool			nowait = false;
 
 	list_sort(NULL, buffer_list, xfs_buf_cmp);
 
@@ -2025,12 +1999,12 @@ xfs_buf_delwri_submit_buffers(
 		if (wait_list) {
 			bp->b_flags &= ~XBF_ASYNC;
 			list_move_tail(&bp->b_list, wait_list);
-			__xfs_buf_submit(bp);
+			nowait = true;
 		} else {
 			bp->b_flags |= XBF_ASYNC;
 			list_del_init(&bp->b_list);
-			xfs_buf_submit(bp);
 		}
+		__xfs_buf_submit(bp, nowait);
 	}
 	blk_finish_plug(&plug);
 
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index d24dbd4dac39..9bae5c201003 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -298,8 +298,13 @@ extern void __xfs_buf_ioerror(struct xfs_buf *bp, int error,
 		xfs_failaddr_t failaddr);
 #define xfs_buf_ioerror(bp, err) __xfs_buf_ioerror((bp), (err), __this_address)
 extern void xfs_buf_ioerror_alert(struct xfs_buf *, const char *func);
-extern void xfs_buf_submit(struct xfs_buf *bp);
-extern int xfs_buf_submit_wait(struct xfs_buf *bp);
+
+extern int __xfs_buf_submit(struct xfs_buf *bp, bool);
+static inline int xfs_buf_submit(struct xfs_buf *bp)
+{
+	return __xfs_buf_submit(bp, false);
+}
+
 extern void xfs_buf_iomove(xfs_buf_t *, size_t, size_t, void *,
 				xfs_buf_rw_t);
 #define xfs_buf_zero(bp, off, len) \
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index b181b5f57a19..724a76d87564 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -196,7 +196,7 @@ xlog_bread_noalign(
 	bp->b_io_length = nbblks;
 	bp->b_error = 0;
 
-	error = xfs_buf_submit_wait(bp);
+	error = xfs_buf_submit(bp);
 	if (error && !XFS_FORCED_SHUTDOWN(log->l_mp))
 		xfs_buf_ioerror_alert(bp, __func__);
 	return error;
@@ -5707,7 +5707,7 @@ xlog_do_recover(
 	bp->b_flags |= XBF_READ;
 	bp->b_ops = &xfs_sb_buf_ops;
 
-	error = xfs_buf_submit_wait(bp);
+	error = xfs_buf_submit(bp);
 	if (error) {
 		if (!XFS_FORCED_SHUTDOWN(mp)) {
 			xfs_buf_ioerror_alert(bp, __func__);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-15 12:39         ` Brian Foster
@ 2018-06-15 16:31           ` Darrick J. Wong
  2018-06-15 17:43             ` Brian Foster
  2018-06-18 11:21           ` Christoph Hellwig
  1 sibling, 1 reply; 17+ messages in thread
From: Darrick J. Wong @ 2018-06-15 16:31 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Fri, Jun 15, 2018 at 08:39:56AM -0400, Brian Foster wrote:
> On Fri, Jun 15, 2018 at 05:16:02AM -0700, Christoph Hellwig wrote:
> > On Fri, Jun 15, 2018 at 07:53:35AM -0400, Brian Foster wrote:
> > > Not totally sure I follow... do you essentially mean to rename
> > > xfs_buf_submit_wait() -> xfs_buf_submit() and make the iowait
> > > conditional on !XBF_ASYNC and absence of some new "sync_nowait"
> > > parameter to the function? Could you clarify how you envision the
> > > updated xfs_buf_submit() function signature to look?
> > > 
> > > If I'm following correctly, that seems fairly reasonable at first
> > > thought. This is a separate patch though (refactoring the interface vs.
> > > refactoring the implementation to fix a bug).
> > 
> > Well.  I'd suggest something like the patch below for patch 1:
> > 
> 
> Ok, codewise I don't have much of a preference, but I don't think it's
> worth redoing the regression testing and lowmem testing and whatnot just
> to change how the guts are refactored here. What's the endgame? I came
> up with the following on top of patch 2. Compile tested only, and I can
> refold the _common() helper back into the caller and invert the
> nowait logic or whatnot..

Whatever you all decide, would you mind adding a comment to xfs_buf.c
describing how delwri buffers are supposed to work, particularly with
regard to what are the buffer states specific to delwri lists, and how
the wait list that gets passed around works?  I need to check my
assumptions about how this mechanism functions. :)

(AKA I looked at both versions, thought 'this delwri queue list stuff
looks rather clever...' and then promptly got distracted by mkfs-config
and other stuff. :/)

--D

> Brian
> 
> --- 8< ---
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 311ca301b7fd..89d8cedf2828 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -757,11 +757,7 @@ _xfs_buf_read(
>  	bp->b_flags &= ~(XBF_WRITE | XBF_ASYNC | XBF_READ_AHEAD);
>  	bp->b_flags |= flags & (XBF_READ | XBF_ASYNC | XBF_READ_AHEAD);
>  
> -	if (flags & XBF_ASYNC) {
> -		xfs_buf_submit(bp);
> -		return 0;
> -	}
> -	return xfs_buf_submit_wait(bp);
> +	return xfs_buf_submit(bp);
>  }
>  
>  xfs_buf_t *
> @@ -846,7 +842,7 @@ xfs_buf_read_uncached(
>  	bp->b_flags |= XBF_READ;
>  	bp->b_ops = ops;
>  
> -	xfs_buf_submit_wait(bp);
> +	xfs_buf_submit(bp);
>  	if (bp->b_error) {
>  		int	error = bp->b_error;
>  		xfs_buf_relse(bp);
> @@ -1249,7 +1245,7 @@ xfs_bwrite(
>  	bp->b_flags &= ~(XBF_ASYNC | XBF_READ | _XBF_DELWRI_Q |
>  			 XBF_WRITE_FAIL | XBF_DONE);
>  
> -	error = xfs_buf_submit_wait(bp);
> +	error = xfs_buf_submit(bp);
>  	if (error) {
>  		xfs_force_shutdown(bp->b_target->bt_mount,
>  				   SHUTDOWN_META_IO_ERROR);
> @@ -1459,7 +1455,7 @@ _xfs_buf_ioapply(
>   * itself.
>   */
>  static int
> -__xfs_buf_submit(
> +__xfs_buf_submit_common(
>  	struct xfs_buf	*bp)
>  {
>  	trace_xfs_buf_submit(bp, _RET_IP_);
> @@ -1505,32 +1501,6 @@ __xfs_buf_submit(
>  	return 0;
>  }
>  
> -void
> -xfs_buf_submit(
> -	struct xfs_buf	*bp)
> -{
> -	int		error;
> -
> -	ASSERT(bp->b_flags & XBF_ASYNC);
> -
> -	/*
> -	 * The caller's reference is released during I/O completion.
> -	 * This occurs some time after the last b_io_remaining reference is
> -	 * released, so after we drop our Io reference we have to have some
> -	 * other reference to ensure the buffer doesn't go away from underneath
> -	 * us. Take a direct reference to ensure we have safe access to the
> -	 * buffer until we are finished with it.
> -	 */
> -	xfs_buf_hold(bp);
> -
> -	error = __xfs_buf_submit(bp);
> -	if (error)
> -		xfs_buf_ioend(bp);
> -
> -	/* Note: it is not safe to reference bp now we've dropped our ref */
> -	xfs_buf_rele(bp);
> -}
> -
>  /*
>   * Wait for I/O completion of a sync buffer and return the I/O error code.
>   */
> @@ -1549,30 +1519,33 @@ xfs_buf_iowait(
>   * Synchronous buffer IO submission path, read or write.
>   */
>  int
> -xfs_buf_submit_wait(
> -	struct xfs_buf	*bp)
> +__xfs_buf_submit(
> +	struct xfs_buf	*bp,
> +	bool		sync_nowait)
>  {
>  	int		error;
>  
> -	ASSERT(!(bp->b_flags & XBF_ASYNC));
> -
>  	/*
> -	 * For synchronous IO, the IO does not inherit the submitters reference
> -	 * count, nor the buffer lock. Hence we cannot release the reference we
> -	 * are about to take until we've waited for all IO completion to occur,
> -	 * including any xfs_buf_ioend_async() work that may be pending.
> +	 * Grab a reference so the buffer does not go away underneath us. For
> +	 * async buffers, I/O completion drops the callers reference, which
> +	 * could occur before submission returns.
>  	 */
>  	xfs_buf_hold(bp);
>  
> -	error = __xfs_buf_submit(bp);
> -	if (error)
> +	error = __xfs_buf_submit_common(bp);
> +	if (error) {
> +		if (bp->b_flags & XBF_ASYNC)
> +			xfs_buf_ioend(bp);
>  		goto out;
> -	error = xfs_buf_iowait(bp);
> +	}
>  
> +	if (!(bp->b_flags & XBF_ASYNC) && !sync_nowait)
> +		error = xfs_buf_iowait(bp);
>  out:
>  	/*
> -	 * all done now, we can release the hold that keeps the buffer
> -	 * referenced for the entire IO.
> +	 * Release the hold that keeps the buffer referenced for the entire
> +	 * I/O. Note that if the buffer is async, it is not safe to reference
> +	 * after this release.
>  	 */
>  	xfs_buf_rele(bp);
>  	return error;
> @@ -1984,6 +1957,7 @@ xfs_buf_delwri_submit_buffers(
>  	LIST_HEAD		(submit_list);
>  	int			pinned = 0;
>  	struct blk_plug		plug;
> +	bool			nowait = false;
>  
>  	list_sort(NULL, buffer_list, xfs_buf_cmp);
>  
> @@ -2025,12 +1999,12 @@ xfs_buf_delwri_submit_buffers(
>  		if (wait_list) {
>  			bp->b_flags &= ~XBF_ASYNC;
>  			list_move_tail(&bp->b_list, wait_list);
> -			__xfs_buf_submit(bp);
> +			nowait = true;
>  		} else {
>  			bp->b_flags |= XBF_ASYNC;
>  			list_del_init(&bp->b_list);
> -			xfs_buf_submit(bp);
>  		}
> +		__xfs_buf_submit(bp, nowait);
>  	}
>  	blk_finish_plug(&plug);
>  
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index d24dbd4dac39..9bae5c201003 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -298,8 +298,13 @@ extern void __xfs_buf_ioerror(struct xfs_buf *bp, int error,
>  		xfs_failaddr_t failaddr);
>  #define xfs_buf_ioerror(bp, err) __xfs_buf_ioerror((bp), (err), __this_address)
>  extern void xfs_buf_ioerror_alert(struct xfs_buf *, const char *func);
> -extern void xfs_buf_submit(struct xfs_buf *bp);
> -extern int xfs_buf_submit_wait(struct xfs_buf *bp);
> +
> +extern int __xfs_buf_submit(struct xfs_buf *bp, bool);
> +static inline int xfs_buf_submit(struct xfs_buf *bp)
> +{
> +	return __xfs_buf_submit(bp, false);
> +}
> +
>  extern void xfs_buf_iomove(xfs_buf_t *, size_t, size_t, void *,
>  				xfs_buf_rw_t);
>  #define xfs_buf_zero(bp, off, len) \
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index b181b5f57a19..724a76d87564 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -196,7 +196,7 @@ xlog_bread_noalign(
>  	bp->b_io_length = nbblks;
>  	bp->b_error = 0;
>  
> -	error = xfs_buf_submit_wait(bp);
> +	error = xfs_buf_submit(bp);
>  	if (error && !XFS_FORCED_SHUTDOWN(log->l_mp))
>  		xfs_buf_ioerror_alert(bp, __func__);
>  	return error;
> @@ -5707,7 +5707,7 @@ xlog_do_recover(
>  	bp->b_flags |= XBF_READ;
>  	bp->b_ops = &xfs_sb_buf_ops;
>  
> -	error = xfs_buf_submit_wait(bp);
> +	error = xfs_buf_submit(bp);
>  	if (error) {
>  		if (!XFS_FORCED_SHUTDOWN(mp)) {
>  			xfs_buf_ioerror_alert(bp, __func__);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-15 16:31           ` Darrick J. Wong
@ 2018-06-15 17:43             ` Brian Foster
  0 siblings, 0 replies; 17+ messages in thread
From: Brian Foster @ 2018-06-15 17:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Fri, Jun 15, 2018 at 09:31:38AM -0700, Darrick J. Wong wrote:
> On Fri, Jun 15, 2018 at 08:39:56AM -0400, Brian Foster wrote:
> > On Fri, Jun 15, 2018 at 05:16:02AM -0700, Christoph Hellwig wrote:
> > > On Fri, Jun 15, 2018 at 07:53:35AM -0400, Brian Foster wrote:
> > > > Not totally sure I follow... do you essentially mean to rename
> > > > xfs_buf_submit_wait() -> xfs_buf_submit() and make the iowait
> > > > conditional on !XBF_ASYNC and absence of some new "sync_nowait"
> > > > parameter to the function? Could you clarify how you envision the
> > > > updated xfs_buf_submit() function signature to look?
> > > > 
> > > > If I'm following correctly, that seems fairly reasonable at first
> > > > thought. This is a separate patch though (refactoring the interface vs.
> > > > refactoring the implementation to fix a bug).
> > > 
> > > Well.  I'd suggest something like the patch below for patch 1:
> > > 
> > 
> > Ok, codewise I don't have much of a preference, but I don't think it's
> > worth redoing the regression testing and lowmem testing and whatnot just
> > to change how the guts are refactored here. What's the endgame? I came
> > up with the following on top of patch 2. Compile tested only, and I can
> > refold the _common() helper back into the caller and invert the
> > nowait logic or whatnot..
> 
> Whatever you all decide, would you mind adding a comment to xfs_buf.c
> describing how delwri buffers are supposed to work, particularly with
> regard to what are the buffer states specific to delwri lists, and how
> the wait list that gets passed around works?  I need to check my
> assumptions about how this mechanism functions. :)
> 

It's really just the DELWRI_Q flag and the list itself, which I think is
covered in the comments in _delwri_queue() (the bit around the buffer
being lazily removed from the queue in some cases).

Otherwise, the fact that the buffer could be DELWRI_Q while on the wait
list is precisely the problem this series is trying to resolve... that's
an invalid state. The change here essentially ensures that once
submitted, the buffer remains locked until it is fully released from the
delwri queue.
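To make that invariant concrete, here is a toy model (illustrative only;
the types and the helper are simplified stand-ins, not the kernel code)
of the state a buffer must be in once it has been moved to the local
wait list:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the real xfs_buf state; illustrative only. */
#define _XBF_DELWRI_Q	(1u << 0)

struct buf {
	unsigned int	flags;		/* _XBF_DELWRI_Q while delwri queued */
	bool		locked;		/* buffer lock held by the submitter */
	bool		on_wait_list;	/* moved to the local wait list */
};

/*
 * Once moved to the wait list, the buffer must have dropped
 * _XBF_DELWRI_Q (DELWRI_Q plus wait list is the invalid state the race
 * produced) and must remain locked until I/O completion releases it.
 */
static bool
wait_list_state_valid(const struct buf *bp)
{
	if (!bp->on_wait_list)
		return true;	/* invariant only applies on the wait list */
	if (bp->flags & _XBF_DELWRI_Q)
		return false;
	return bp->locked;
}
```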

The wait list is really just local to xfs_buf_delwri_submit() and
shouldn't be relevant outside the mechanism itself. I've updated the
comment above xfs_buf_delwri_submit_buffers() to reflect that. I'm not
quite sure whether you're looking for more than that...? Perhaps I'll
just post the v3 with comment tweaks and go from there...

Brian

> (AKA I looked at both versions, thought 'this delwri queue list stuff
> looks rather clever...' and then promptly got distracted by mkfs-config
> and other stuff. :/)
> 
> --D
> 
> > Brian
> > 
> > --- 8< ---
> > 
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index 311ca301b7fd..89d8cedf2828 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -757,11 +757,7 @@ _xfs_buf_read(
> >  	bp->b_flags &= ~(XBF_WRITE | XBF_ASYNC | XBF_READ_AHEAD);
> >  	bp->b_flags |= flags & (XBF_READ | XBF_ASYNC | XBF_READ_AHEAD);
> >  
> > -	if (flags & XBF_ASYNC) {
> > -		xfs_buf_submit(bp);
> > -		return 0;
> > -	}
> > -	return xfs_buf_submit_wait(bp);
> > +	return xfs_buf_submit(bp);
> >  }
> >  
> >  xfs_buf_t *
> > @@ -846,7 +842,7 @@ xfs_buf_read_uncached(
> >  	bp->b_flags |= XBF_READ;
> >  	bp->b_ops = ops;
> >  
> > -	xfs_buf_submit_wait(bp);
> > +	xfs_buf_submit(bp);
> >  	if (bp->b_error) {
> >  		int	error = bp->b_error;
> >  		xfs_buf_relse(bp);
> > @@ -1249,7 +1245,7 @@ xfs_bwrite(
> >  	bp->b_flags &= ~(XBF_ASYNC | XBF_READ | _XBF_DELWRI_Q |
> >  			 XBF_WRITE_FAIL | XBF_DONE);
> >  
> > -	error = xfs_buf_submit_wait(bp);
> > +	error = xfs_buf_submit(bp);
> >  	if (error) {
> >  		xfs_force_shutdown(bp->b_target->bt_mount,
> >  				   SHUTDOWN_META_IO_ERROR);
> > @@ -1459,7 +1455,7 @@ _xfs_buf_ioapply(
> >   * itself.
> >   */
> >  static int
> > -__xfs_buf_submit(
> > +__xfs_buf_submit_common(
> >  	struct xfs_buf	*bp)
> >  {
> >  	trace_xfs_buf_submit(bp, _RET_IP_);
> > @@ -1505,32 +1501,6 @@ __xfs_buf_submit(
> >  	return 0;
> >  }
> >  
> > -void
> > -xfs_buf_submit(
> > -	struct xfs_buf	*bp)
> > -{
> > -	int		error;
> > -
> > -	ASSERT(bp->b_flags & XBF_ASYNC);
> > -
> > -	/*
> > -	 * The caller's reference is released during I/O completion.
> > -	 * This occurs some time after the last b_io_remaining reference is
> > -	 * released, so after we drop our Io reference we have to have some
> > -	 * other reference to ensure the buffer doesn't go away from underneath
> > -	 * us. Take a direct reference to ensure we have safe access to the
> > -	 * buffer until we are finished with it.
> > -	 */
> > -	xfs_buf_hold(bp);
> > -
> > -	error = __xfs_buf_submit(bp);
> > -	if (error)
> > -		xfs_buf_ioend(bp);
> > -
> > -	/* Note: it is not safe to reference bp now we've dropped our ref */
> > -	xfs_buf_rele(bp);
> > -}
> > -
> >  /*
> >   * Wait for I/O completion of a sync buffer and return the I/O error code.
> >   */
> > @@ -1549,30 +1519,33 @@ xfs_buf_iowait(
> >   * Synchronous buffer IO submission path, read or write.
> >   */
> >  int
> > -xfs_buf_submit_wait(
> > -	struct xfs_buf	*bp)
> > +__xfs_buf_submit(
> > +	struct xfs_buf	*bp,
> > +	bool		sync_nowait)
> >  {
> >  	int		error;
> >  
> > -	ASSERT(!(bp->b_flags & XBF_ASYNC));
> > -
> >  	/*
> > -	 * For synchronous IO, the IO does not inherit the submitters reference
> > -	 * count, nor the buffer lock. Hence we cannot release the reference we
> > -	 * are about to take until we've waited for all IO completion to occur,
> > -	 * including any xfs_buf_ioend_async() work that may be pending.
> > +	 * Grab a reference so the buffer does not go away underneath us. For
> > +	 * async buffers, I/O completion drops the callers reference, which
> > +	 * could occur before submission returns.
> >  	 */
> >  	xfs_buf_hold(bp);
> >  
> > -	error = __xfs_buf_submit(bp);
> > -	if (error)
> > +	error = __xfs_buf_submit_common(bp);
> > +	if (error) {
> > +		if (bp->b_flags & XBF_ASYNC)
> > +			xfs_buf_ioend(bp);
> >  		goto out;
> > -	error = xfs_buf_iowait(bp);
> > +	}
> >  
> > +	if (!(bp->b_flags & XBF_ASYNC) && !sync_nowait)
> > +		error = xfs_buf_iowait(bp);
> >  out:
> >  	/*
> > -	 * all done now, we can release the hold that keeps the buffer
> > -	 * referenced for the entire IO.
> > +	 * Release the hold that keeps the buffer referenced for the entire
> > +	 * I/O. Note that if the buffer is async, it is not safe to reference
> > +	 * after this release.
> >  	 */
> >  	xfs_buf_rele(bp);
> >  	return error;
> > @@ -1984,6 +1957,7 @@ xfs_buf_delwri_submit_buffers(
> >  	LIST_HEAD		(submit_list);
> >  	int			pinned = 0;
> >  	struct blk_plug		plug;
> > +	bool			nowait = false;
> >  
> >  	list_sort(NULL, buffer_list, xfs_buf_cmp);
> >  
> > @@ -2025,12 +1999,12 @@ xfs_buf_delwri_submit_buffers(
> >  		if (wait_list) {
> >  			bp->b_flags &= ~XBF_ASYNC;
> >  			list_move_tail(&bp->b_list, wait_list);
> > -			__xfs_buf_submit(bp);
> > +			nowait = true;
> >  		} else {
> >  			bp->b_flags |= XBF_ASYNC;
> >  			list_del_init(&bp->b_list);
> > -			xfs_buf_submit(bp);
> >  		}
> > +		__xfs_buf_submit(bp, nowait);
> >  	}
> >  	blk_finish_plug(&plug);
> >  
> > diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> > index d24dbd4dac39..9bae5c201003 100644
> > --- a/fs/xfs/xfs_buf.h
> > +++ b/fs/xfs/xfs_buf.h
> > @@ -298,8 +298,13 @@ extern void __xfs_buf_ioerror(struct xfs_buf *bp, int error,
> >  		xfs_failaddr_t failaddr);
> >  #define xfs_buf_ioerror(bp, err) __xfs_buf_ioerror((bp), (err), __this_address)
> >  extern void xfs_buf_ioerror_alert(struct xfs_buf *, const char *func);
> > -extern void xfs_buf_submit(struct xfs_buf *bp);
> > -extern int xfs_buf_submit_wait(struct xfs_buf *bp);
> > +
> > +extern int __xfs_buf_submit(struct xfs_buf *bp, bool);
> > +static inline int xfs_buf_submit(struct xfs_buf *bp)
> > +{
> > +	return __xfs_buf_submit(bp, false);
> > +}
> > +
> >  extern void xfs_buf_iomove(xfs_buf_t *, size_t, size_t, void *,
> >  				xfs_buf_rw_t);
> >  #define xfs_buf_zero(bp, off, len) \
> > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> > index b181b5f57a19..724a76d87564 100644
> > --- a/fs/xfs/xfs_log_recover.c
> > +++ b/fs/xfs/xfs_log_recover.c
> > @@ -196,7 +196,7 @@ xlog_bread_noalign(
> >  	bp->b_io_length = nbblks;
> >  	bp->b_error = 0;
> >  
> > -	error = xfs_buf_submit_wait(bp);
> > +	error = xfs_buf_submit(bp);
> >  	if (error && !XFS_FORCED_SHUTDOWN(log->l_mp))
> >  		xfs_buf_ioerror_alert(bp, __func__);
> >  	return error;
> > @@ -5707,7 +5707,7 @@ xlog_do_recover(
> >  	bp->b_flags |= XBF_READ;
> >  	bp->b_ops = &xfs_sb_buf_ops;
> >  
> > -	error = xfs_buf_submit_wait(bp);
> > +	error = xfs_buf_submit(bp);
> >  	if (error) {
> >  		if (!XFS_FORCED_SHUTDOWN(mp)) {
> >  			xfs_buf_ioerror_alert(bp, __func__);


* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-15 12:39         ` Brian Foster
  2018-06-15 16:31           ` Darrick J. Wong
@ 2018-06-18 11:21           ` Christoph Hellwig
  2018-06-18 11:47             ` Brian Foster
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2018-06-18 11:21 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Fri, Jun 15, 2018 at 08:39:56AM -0400, Brian Foster wrote:
> Ok, codewise I don't have much of a preference, but I don't think it's
> worth redoing the regression testing and lowmem testing and whatnot just
> to change how the guts are refactored here. What's the endgame?

Avoid the nasty duplication as much as possible.  I think your patch
already goes a long way towards that, but I'd rather go all the way.

> I came
> up with the following on top of patch 2. Compile tested only, and I can
> refold the _common() helper back into the caller and invert the
> nowait logic or whatnot..

Mostly looks fine, except that it seems pointless to have
__xfs_buf_submit_common around instead of merging it into the only
caller.

> +	if (!(bp->b_flags & XBF_ASYNC) && !sync_nowait)

Also I'd rather pass in a 'bool wait' parameter.

xfs_buf_submit() would set it based on XBF_ASYNC, and the delwri
code would just set it to false explicitly.
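A rough sketch of that interface (an assumption of how it might look,
not the actual patch; the function bodies are stubs and the types are
simplified so the logic can be exercised in isolation):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the real flag and buffer; illustrative only. */
#define XBF_ASYNC	(1u << 4)

struct xfs_buf {
	unsigned int	b_flags;
};

/* Records whether the last submission waited, standing in for real I/O. */
static bool last_submit_waited;

static int
__xfs_buf_submit(struct xfs_buf *bp, bool wait)
{
	(void)bp;
	last_submit_waited = wait;	/* xfs_buf_iowait() would run here */
	return 0;
}

/*
 * As suggested: xfs_buf_submit() derives 'wait' from XBF_ASYNC, while
 * the delwri submission path passes wait = false explicitly even for
 * sync (!XBF_ASYNC) buffers and collects completions via its wait list.
 */
static inline int
xfs_buf_submit(struct xfs_buf *bp)
{
	return __xfs_buf_submit(bp, !(bp->b_flags & XBF_ASYNC));
}
```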


* Re: [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission
  2018-06-18 11:21           ` Christoph Hellwig
@ 2018-06-18 11:47             ` Brian Foster
  0 siblings, 0 replies; 17+ messages in thread
From: Brian Foster @ 2018-06-18 11:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Jun 18, 2018 at 04:21:30AM -0700, Christoph Hellwig wrote:
> On Fri, Jun 15, 2018 at 08:39:56AM -0400, Brian Foster wrote:
> > Ok, codewise I don't have much of a preference, but I don't think it's
> > worth redoing the regression testing and lowmem testing and whatnot just
> > to change how the guts are refactored here. What's the endgame?
> 
> Avoid the nasty duplication as much as possible.  I think your patch
> already goes a long way towards that, but I'd rather go all the way.
> 

That's what I figured, thanks.

> > I came
> > up with the following on top of patch 2. Compile tested only, and I can
> > refold the _common() helper back into the caller and invert the
> > nowait logic or whatnot..
> 
> Mostly looks fine, except that it seems pointless to have
> __xfs_buf_submit_common around instead of merging it into the only
> caller.
> 

Yes..

> > +	if (!(bp->b_flags & XBF_ASYNC) && !sync_nowait)
> 
> Also I'd rather pass in a
> 
> 	'bool wait' parameter.
> 
> xfs_buf_submit() would set it based on XBF_ASYNC, and the delwri
> code would just set it to false explicitly.

A patch with the above changes is essentially what I put through some
testing over the weekend. It doesn't look like anything exploded, so
I'll try to get it cleaned up and posted later today.

Brian



end of thread, other threads:[~2018-06-18 11:47 UTC | newest]

Thread overview: 17+ messages
2018-06-13 11:05 [PATCH v2 0/2] xfs: fix buffer delwri queue state race Brian Foster
2018-06-13 11:05 ` [PATCH v2 1/2] xfs: refactor buffer submission into a common helper Brian Foster
2018-06-14 13:43   ` [PATCH v3 " Brian Foster
2018-06-15 11:24     ` Christoph Hellwig
2018-06-15 11:53       ` Brian Foster
2018-06-13 11:05 ` [PATCH v2 2/2] xfs: use sync buffer I/O for sync delwri queue submission Brian Foster
2018-06-13 22:08   ` Dave Chinner
2018-06-13 23:29     ` Brian Foster
2018-06-13 23:37       ` Dave Chinner
2018-06-15 11:28   ` Christoph Hellwig
2018-06-15 11:53     ` Brian Foster
2018-06-15 12:16       ` Christoph Hellwig
2018-06-15 12:39         ` Brian Foster
2018-06-15 16:31           ` Darrick J. Wong
2018-06-15 17:43             ` Brian Foster
2018-06-18 11:21           ` Christoph Hellwig
2018-06-18 11:47             ` Brian Foster
