linux-xfs.vger.kernel.org archive mirror
* [PATCH 0/2 V5] Resubmit items failed during writeback
@ 2017-07-18 14:54 Carlos Maiolino
  2017-07-18 14:54 ` [PATCH 1/2 V5] xfs: Add infrastructure needed for error propagation during buffer IO failure Carlos Maiolino
  2017-07-18 14:54 ` [PATCH 2/2 V5] xfs: Properly retry failed inode items in case of error during buffer writeback Carlos Maiolino
  0 siblings, 2 replies; 7+ messages in thread
From: Carlos Maiolino @ 2017-07-18 14:54 UTC (permalink / raw)
  To: linux-xfs

Hi folks,

this is the V5 review of the fix for buffer resubmission.

The changes are described in each patch.

The xfstests test for this problem is already on the fstests list, and this
patchset passes it cleanly.

Cheers 

Carlos Maiolino (2):
  xfs: Add infrastructure needed for error propagation during buffer IO
    failure
  xfs: Properly retry failed inode items in case of error during buffer
    writeback

 fs/xfs/xfs_buf_item.c   | 58 ++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_buf_item.h   |  3 +++
 fs/xfs/xfs_inode_item.c | 45 ++++++++++++++++++++++++++++++++++----
 fs/xfs/xfs_trans.h      |  8 +++++--
 fs/xfs/xfs_trans_ail.c  |  4 ++--
 fs/xfs/xfs_trans_priv.h | 31 ++++++++++++++++++++++++++
 6 files changed, 140 insertions(+), 9 deletions(-)

-- 
2.9.4



* [PATCH 1/2 V5] xfs: Add infrastructure needed for error propagation during buffer IO failure
  2017-07-18 14:54 [PATCH 0/2 V5] Resubmit items failed during writeback Carlos Maiolino
@ 2017-07-18 14:54 ` Carlos Maiolino
  2017-07-19 12:01   ` Brian Foster
  2017-07-18 14:54 ` [PATCH 2/2 V5] xfs: Properly retry failed inode items in case of error during buffer writeback Carlos Maiolino
  1 sibling, 1 reply; 7+ messages in thread
From: Carlos Maiolino @ 2017-07-18 14:54 UTC (permalink / raw)
  To: linux-xfs

With the current code, XFS never resubmits a failed buffer for IO,
because the failed item in the buffer is kept in the flush locked state
forever.

To be able to resubmit a log item for IO, we need a way to mark an item
as failed if, for any reason, the buffer to which the item belongs
fails during writeback.

Add a new log item callback to be used after an IO completion failure
and make the needed cleanups.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---

V2:
	- Update commit log to include a better description of why this
	  patch is needed and fix spelling mistakes
	- Move xfs_buf_do_callbacks_fail() call into
	  xfs_buf_iodone_callback_error, so the callbacks can be executed
	  before the buffer is released, and only after it has been
	  retried once

V3:
	- fix some loops according to hch suggestion
	- whitespace cleanup

V4:
	- Invoke failure callbacks before resetting the I/O error
	- Remove bflags field from iop_error callback
	- move spin_lock/unlock xa_lock up in the stack, handling all
	  log items in the same buffer under a single lock

V5:
	- Reorganize variable declarations
	  in xfs_buf_do_callbacks_fail

 fs/xfs/xfs_buf_item.c | 24 +++++++++++++++++++++++-
 fs/xfs/xfs_trans.h    |  7 +++++--
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index f6a8422..d6ca7d6 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -29,6 +29,7 @@
 #include "xfs_error.h"
 #include "xfs_trace.h"
 #include "xfs_log.h"
+#include "xfs_inode.h"
 
 
 kmem_zone_t	*xfs_buf_item_zone;
@@ -1054,6 +1055,23 @@ xfs_buf_do_callbacks(
 	}
 }
 
+STATIC void
+xfs_buf_do_callbacks_fail(
+	struct xfs_buf		*bp)
+{
+	struct xfs_log_item	*next;
+	struct xfs_log_item	*lip = bp->b_fspriv;
+	struct xfs_ail		*ailp = lip->li_ailp;
+
+	spin_lock(&ailp->xa_lock);
+	for (; lip; lip = next) {
+		next = lip->li_bio_list;
+		if (lip->li_ops->iop_error)
+			lip->li_ops->iop_error(lip, bp);
+	}
+	spin_unlock(&ailp->xa_lock);
+}
+
 static bool
 xfs_buf_iodone_callback_error(
 	struct xfs_buf		*bp)
@@ -1123,7 +1141,11 @@ xfs_buf_iodone_callback_error(
 	if ((mp->m_flags & XFS_MOUNT_UNMOUNTING) && mp->m_fail_unmount)
 		goto permanent_error;
 
-	/* still a transient error, higher layers will retry */
+	/*
+	 * Still a transient error, run IO completion failure callbacks and let
+	 * the higher layers retry the buffer.
+	 */
+	xfs_buf_do_callbacks_fail(bp);
 	xfs_buf_ioerror(bp, 0);
 	xfs_buf_relse(bp);
 	return true;
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 6bdad6f..442d679 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -64,11 +64,13 @@ typedef struct xfs_log_item {
 } xfs_log_item_t;
 
 #define	XFS_LI_IN_AIL	0x1
-#define XFS_LI_ABORTED	0x2
+#define	XFS_LI_ABORTED	0x2
+#define	XFS_LI_FAILED	0x4
 
 #define XFS_LI_FLAGS \
 	{ XFS_LI_IN_AIL,	"IN_AIL" }, \
-	{ XFS_LI_ABORTED,	"ABORTED" }
+	{ XFS_LI_ABORTED,	"ABORTED" }, \
+	{ XFS_LI_FAILED,	"FAILED" }
 
 struct xfs_item_ops {
 	void (*iop_size)(xfs_log_item_t *, int *, int *);
@@ -79,6 +81,7 @@ struct xfs_item_ops {
 	void (*iop_unlock)(xfs_log_item_t *);
 	xfs_lsn_t (*iop_committed)(xfs_log_item_t *, xfs_lsn_t);
 	void (*iop_committing)(xfs_log_item_t *, xfs_lsn_t);
+	void (*iop_error)(xfs_log_item_t *, xfs_buf_t *);
 };
 
 void	xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item,
-- 
2.9.4
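
For readers following the thread, the callback walk added by this patch can
be modeled outside the kernel. The sketch below is a userspace illustration
only: struct buf, struct log_item, struct item_ops, mark_failed() and the
lock_depth counter are hypothetical stand-ins (the counter replaces the
xa_lock spinlock), not the XFS definitions. It shows one pass over the items
attached to a buffer, calling the optional error callback on each item under
a single lock/unlock pair.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
#define LI_FAILED 0x4u

struct buf;
struct log_item;

struct item_ops {
	/* Optional: called when the buffer write failed transiently. */
	void (*error)(struct log_item *lip, struct buf *bp);
};

struct log_item {
	struct log_item		*bio_next;	/* next item on the same buffer */
	const struct item_ops	*ops;
	unsigned int		flags;
};

struct buf {
	struct log_item	*items;		/* head of the attached item list */
	int		lock_depth;	/* stands in for the AIL spinlock */
};

/* Example callback: record the failure on the item. */
static void mark_failed(struct log_item *lip, struct buf *bp)
{
	assert(bp->lock_depth == 1);	/* callbacks run with the lock held */
	lip->flags |= LI_FAILED;
}

static const struct item_ops failing_ops = { .error = mark_failed };
static const struct item_ops silent_ops  = { .error = NULL };

/*
 * Walk every attached item once under a single lock/unlock pair and
 * return how many callbacks were invoked.
 */
static int do_callbacks_fail(struct buf *bp)
{
	struct log_item *lip, *next;
	int called = 0;

	bp->lock_depth++;			/* "spin_lock" */
	for (lip = bp->items; lip; lip = next) {
		next = lip->bio_next;		/* read before the callback runs */
		if (lip->ops->error) {
			lip->ops->error(lip, bp);
			called++;
		}
	}
	bp->lock_depth--;			/* "spin_unlock" */
	return called;
}
```

Items whose ops leave the callback NULL are simply skipped, which mirrors how
only item types that implement iop_error take part in the failure handling.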



* [PATCH 2/2 V5] xfs: Properly retry failed inode items in case of error during buffer writeback
  2017-07-18 14:54 [PATCH 0/2 V5] Resubmit items failed during writeback Carlos Maiolino
  2017-07-18 14:54 ` [PATCH 1/2 V5] xfs: Add infrastructure needed for error propagation during buffer IO failure Carlos Maiolino
@ 2017-07-18 14:54 ` Carlos Maiolino
  2017-07-19 12:01   ` Brian Foster
  1 sibling, 1 reply; 7+ messages in thread
From: Carlos Maiolino @ 2017-07-18 14:54 UTC (permalink / raw)
  To: linux-xfs

When a buffer fails during writeback, the inode items in it are kept
flush locked and are never resubmitted because of the flush lock. So,
if any buffer fails to be written, the items in the AIL are never
written to disk and never unlocked.

This causes the unmount operation to hang because of the flush locked
items in the AIL, and it also means the items in the AIL are never
written back, even when the IO device comes back to normal.

I've been testing this patch with a DM-thin device, creating a
filesystem larger than the real device.

When writing enough data to fill the DM-thin device, XFS receives ENOSPC
errors from the device, and keeps spinning in xfsaild (when the 'retry
forever' configuration is set).

At this point, the filesystem cannot be unmounted because of the flush
locked items in the AIL. Worse, the items in the AIL are never retried
at all (since xfs_inode_item_push() skips items that are flush locked),
even if the underlying DM-thin device is expanded to the proper size.

This patch fixes both cases by retrying any item that previously failed,
using the infrastructure provided by the previous patch.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---

V2:
	- Fix XFS_LI_FAILED flag removal
	- Use atomic operations to set and clear XFS_LI_FAILED flag
	- Remove check for XBF_WRITE_FAIL in xfs_inode_item_push
	- Add more comments to the code
	- Add a helper function to resubmit the failed buffers, so this
	  can be also used in dquot system without duplicating code

V3:
	- kill xfs_imap_to_bp call using a pointer in the log item to
	  hold the buffer address
	- use xa_lock instead of atomic operations to handle log item
	  flags
	- Add a hold to the buffer for each log item failed
	- move buffer resubmission up in xfs_inode_item_push()

V4:
	- Remove bflags argument from iop_error callback
	- Remove ip argument from xfs_buf_resubmit_failed_buffers
	- Use helpers to set/clear XFS_LI_FAILED flag
	- remove ->xa_lock from the iop_error callback and move it up
	  on the stack, so all log items are processed under a single
	  lock/unlock pair

V5:
	- fix comments
	- move buf_lock from xfs_buf_resubmit_failed_buffers() up to
	  xfs_inode_item_push, and use trylock instead
	- assert xa_lock is held in xfs_set/clear_li_failed helpers
	  and move such helpers into xfs_trans_priv.h, since
	  lockdep_assert_held() usage requires the xfs_ail definition
	- ASSERT XFS_LI_IN_AIL in xfs_clear_li_failed()
	- assert inode is flush locked in xfs_inode_item_error
	- fix bitwise ops mess
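
The set/clear helpers named in the notes above pair the XFS_LI_FAILED flag
with a hold on the buffer. A minimal userspace sketch of that pairing
follows, assuming a plain integer reference count and no locking (the
helpers in this patch assert that xa_lock is held). struct buf, struct
log_item, buf_hold() and buf_rele() here are hypothetical stand-ins, not the
kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
#define LI_IN_AIL 0x1u
#define LI_FAILED 0x4u

struct buf {
	int hold;			/* reference count */
};

struct log_item {
	unsigned int	flags;
	struct buf	*buf;		/* only set while LI_FAILED is set */
};

static void buf_hold(struct buf *bp)
{
	bp->hold++;
}

static void buf_rele(struct buf *bp)
{
	assert(bp->hold > 0);
	bp->hold--;
}

/* Mark the item failed; take exactly one buffer hold per failed item. */
static void set_li_failed(struct log_item *lip, struct buf *bp)
{
	if (!(lip->flags & LI_FAILED)) {
		buf_hold(bp);
		lip->flags |= LI_FAILED;
		lip->buf = bp;
	}
}

/* Clear the failed state and drop the hold taken by set_li_failed(). */
static void clear_li_failed(struct log_item *lip)
{
	struct buf *bp = lip->buf;

	assert(lip->flags & LI_IN_AIL);
	if (lip->flags & LI_FAILED) {
		lip->flags &= ~LI_FAILED;
		lip->buf = NULL;
		buf_rele(bp);
	}
}
```

Because both helpers test the flag first, calling either one twice in a row
leaves the reference count balanced: a second set takes no extra hold, and a
second clear releases nothing.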

 fs/xfs/xfs_buf_item.c   | 34 ++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_buf_item.h   |  3 +++
 fs/xfs/xfs_inode_item.c | 45 +++++++++++++++++++++++++++++++++++++++++----
 fs/xfs/xfs_trans.h      |  1 +
 fs/xfs/xfs_trans_ail.c  |  4 ++--
 fs/xfs/xfs_trans_priv.h | 31 +++++++++++++++++++++++++++++++
 6 files changed, 112 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index d6ca7d6..940b0eb 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -1226,3 +1226,37 @@ xfs_buf_iodone(
 	xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
 	xfs_buf_item_free(BUF_ITEM(lip));
 }
+
+/*
+ * Requeue a failed buffer for writeback
+ *
+ * Return true if the buffer has been re-queued properly, false otherwise
+ */
+bool
+xfs_buf_resubmit_failed_buffers(
+	struct xfs_buf		*bp,
+	struct xfs_log_item	*lip,
+	struct list_head	*buffer_list)
+{
+	struct xfs_log_item	*next;
+	bool			ret;
+
+	/*
+	 * Clear XFS_LI_FAILED flag from all items before resubmit
+	 *
+	 * XFS_LI_FAILED set/clear is protected by xa_lock; the caller of this
+	 * function must already hold it
+	 */
+	for (; lip; lip = next) {
+		next = lip->li_bio_list;
+		xfs_clear_li_failed(lip);
+	}
+
+	/* Add this buffer back to the delayed write list */
+	if (!xfs_buf_delwri_queue(bp, buffer_list))
+		ret = false;
+	else
+		ret = true;
+
+	return ret;
+}
diff --git a/fs/xfs/xfs_buf_item.h b/fs/xfs/xfs_buf_item.h
index f7eba99..530686e 100644
--- a/fs/xfs/xfs_buf_item.h
+++ b/fs/xfs/xfs_buf_item.h
@@ -70,6 +70,9 @@ void	xfs_buf_attach_iodone(struct xfs_buf *,
 			      xfs_log_item_t *);
 void	xfs_buf_iodone_callbacks(struct xfs_buf *);
 void	xfs_buf_iodone(struct xfs_buf *, struct xfs_log_item *);
+bool	xfs_buf_resubmit_failed_buffers(struct xfs_buf *,
+					struct xfs_log_item *,
+					struct list_head *);
 
 extern kmem_zone_t	*xfs_buf_item_zone;
 
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 08cb7d1..ad7ec64 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -27,6 +27,7 @@
 #include "xfs_error.h"
 #include "xfs_trace.h"
 #include "xfs_trans_priv.h"
+#include "xfs_buf_item.h"
 #include "xfs_log.h"
 
 
@@ -475,6 +476,21 @@ xfs_inode_item_unpin(
 		wake_up_bit(&ip->i_flags, __XFS_IPINNED_BIT);
 }
 
+/*
+ * Callback used to mark a buffer with XFS_LI_FAILED when items in the buffer
+ * have been failed during writeback
+ */
+STATIC void
+xfs_inode_item_error(
+	struct xfs_log_item	*lip,
+	struct xfs_buf		*bp)
+{
+	struct xfs_inode	*ip = INODE_ITEM(lip)->ili_inode;
+
+	ASSERT(xfs_isiflocked(ip));
+	xfs_set_li_failed(lip, bp);
+}
+
 STATIC uint
 xfs_inode_item_push(
 	struct xfs_log_item	*lip,
@@ -484,13 +500,28 @@ xfs_inode_item_push(
 {
 	struct xfs_inode_log_item *iip = INODE_ITEM(lip);
 	struct xfs_inode	*ip = iip->ili_inode;
-	struct xfs_buf		*bp = NULL;
+	struct xfs_buf		*bp = lip->li_buf;
 	uint			rval = XFS_ITEM_SUCCESS;
 	int			error;
 
 	if (xfs_ipincount(ip) > 0)
 		return XFS_ITEM_PINNED;
 
+	/*
+	 * The buffer containing this item failed to be written back
+	 * previously. Resubmit the buffer for IO.
+	 */
+	if (lip->li_flags & XFS_LI_FAILED) {
+		if (!xfs_buf_trylock(bp))
+		    return XFS_ITEM_LOCKED;
+
+		if (!xfs_buf_resubmit_failed_buffers(bp, lip, buffer_list))
+			rval = XFS_ITEM_FLUSHING;
+
+		xfs_buf_unlock(bp);
+		return rval;
+	}
+
 	if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
 		return XFS_ITEM_LOCKED;
 
@@ -622,7 +653,8 @@ static const struct xfs_item_ops xfs_inode_item_ops = {
 	.iop_unlock	= xfs_inode_item_unlock,
 	.iop_committed	= xfs_inode_item_committed,
 	.iop_push	= xfs_inode_item_push,
-	.iop_committing = xfs_inode_item_committing
+	.iop_committing = xfs_inode_item_committing,
+	.iop_error	= xfs_inode_item_error
 };
 
 
@@ -710,7 +742,8 @@ xfs_iflush_done(
 		 * the AIL lock.
 		 */
 		iip = INODE_ITEM(blip);
-		if (iip->ili_logged && blip->li_lsn == iip->ili_flush_lsn)
+		if ((iip->ili_logged && blip->li_lsn == iip->ili_flush_lsn) ||
+		    lip->li_flags & XFS_LI_FAILED)
 			need_ail++;
 
 		blip = next;
@@ -718,7 +751,8 @@ xfs_iflush_done(
 
 	/* make sure we capture the state of the initial inode. */
 	iip = INODE_ITEM(lip);
-	if (iip->ili_logged && lip->li_lsn == iip->ili_flush_lsn)
+	if ((iip->ili_logged && lip->li_lsn == iip->ili_flush_lsn) ||
+	    lip->li_flags & XFS_LI_FAILED)
 		need_ail++;
 
 	/*
@@ -739,6 +773,9 @@ xfs_iflush_done(
 			if (INODE_ITEM(blip)->ili_logged &&
 			    blip->li_lsn == INODE_ITEM(blip)->ili_flush_lsn)
 				mlip_changed |= xfs_ail_delete_one(ailp, blip);
+			else {
+				xfs_clear_li_failed(blip);
+			}
 		}
 
 		if (mlip_changed) {
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 442d679..7d62772 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -49,6 +49,7 @@ typedef struct xfs_log_item {
 	struct xfs_ail			*li_ailp;	/* ptr to AIL */
 	uint				li_type;	/* item type */
 	uint				li_flags;	/* misc flags */
+	struct xfs_buf			*li_buf;	/* real buffer pointer */
 	struct xfs_log_item		*li_bio_list;	/* buffer item list */
 	void				(*li_cb)(struct xfs_buf *,
 						 struct xfs_log_item *);
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 9056c0f..eba21a9 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -687,15 +687,15 @@ xfs_trans_ail_update_bulk(
 bool
 xfs_ail_delete_one(
 	struct xfs_ail		*ailp,
-	struct xfs_log_item 	*lip)
+	struct xfs_log_item	*lip)
 {
 	struct xfs_log_item	*mlip = xfs_ail_min(ailp);
 
 	trace_xfs_ail_delete(lip, mlip->li_lsn, lip->li_lsn);
 	xfs_ail_delete(ailp, lip);
+	xfs_clear_li_failed(lip);
 	lip->li_flags &= ~XFS_LI_IN_AIL;
 	lip->li_lsn = 0;
-
 	return mlip == lip;
 }
 
diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
index d91706c..b317a36 100644
--- a/fs/xfs/xfs_trans_priv.h
+++ b/fs/xfs/xfs_trans_priv.h
@@ -164,4 +164,35 @@ xfs_trans_ail_copy_lsn(
 	*dst = *src;
 }
 #endif
+
+static inline void
+xfs_clear_li_failed(
+	struct xfs_log_item	*lip)
+{
+	struct xfs_buf	*bp = lip->li_buf;
+
+	ASSERT(lip->li_flags & XFS_LI_IN_AIL);
+	lockdep_assert_held(&lip->li_ailp->xa_lock);
+
+	if (lip->li_flags & XFS_LI_FAILED) {
+		lip->li_flags &= ~XFS_LI_FAILED;
+		lip->li_buf = NULL;
+		xfs_buf_rele(bp);
+	}
+}
+
+static inline void
+xfs_set_li_failed(
+	struct xfs_log_item	*lip,
+	struct xfs_buf		*bp)
+{
+	lockdep_assert_held(&lip->li_ailp->xa_lock);
+
+	if (!(lip->li_flags & XFS_LI_FAILED)) {
+		xfs_buf_hold(bp);
+		lip->li_flags |= XFS_LI_FAILED;
+		lip->li_buf = bp;
+	}
+}
+
 #endif	/* __XFS_TRANS_PRIV_H__ */
-- 
2.9.4
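
The early-return path this patch adds to xfs_inode_item_push() can be
summarized as a small decision sequence. What follows is a simplified
userspace model, not the kernel code: it tracks a single item per buffer,
ITEM_NORMAL_PATH is a hypothetical marker for "fall through to the existing
flush logic", and delwri_queue() is assumed to return false when the buffer
is already queued. All struct and function names are stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
#define LI_FAILED 0x4u

enum push_result {
	ITEM_SUCCESS,
	ITEM_PINNED,
	ITEM_LOCKED,
	ITEM_FLUSHING,
	ITEM_NORMAL_PATH	/* model only: continue with the regular flush */
};

struct buf {
	bool locked;
	bool queued;		/* already on a delayed-write list */
};

struct log_item {
	unsigned int	flags;
	int		pin_count;
	struct buf	*buf;	/* buffer recorded when the write failed */
};

static bool buf_trylock(struct buf *bp)
{
	if (bp->locked)
		return false;
	bp->locked = true;
	return true;
}

static void buf_unlock(struct buf *bp)
{
	bp->locked = false;
}

/* Assumed behaviour: false if already queued, true if newly queued. */
static bool delwri_queue(struct buf *bp)
{
	if (bp->queued)
		return false;
	bp->queued = true;
	return true;
}

/* Clear the failed state, then put the buffer back on the write list. */
static bool resubmit_failed(struct buf *bp, struct log_item *lip)
{
	lip->flags &= ~LI_FAILED;	/* the patch also drops the buffer hold */
	return delwri_queue(bp);
}

static enum push_result item_push(struct log_item *lip)
{
	struct buf *bp = lip->buf;
	enum push_result rval = ITEM_SUCCESS;

	if (lip->pin_count > 0)
		return ITEM_PINNED;

	if (lip->flags & LI_FAILED) {
		if (!buf_trylock(bp))
			return ITEM_LOCKED;	/* retry on a later push */
		if (!resubmit_failed(bp, lip))
			rval = ITEM_FLUSHING;
		buf_unlock(bp);
		return rval;
	}

	return ITEM_NORMAL_PATH;
}
```

Using a trylock keeps the push from blocking: a contended buffer leaves the
item marked as failed, so a later push can try again.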



* Re: [PATCH 1/2 V5] xfs: Add infrastructure needed for error propagation during buffer IO failure
  2017-07-18 14:54 ` [PATCH 1/2 V5] xfs: Add infrastructure needed for error propagation during buffer IO failure Carlos Maiolino
@ 2017-07-19 12:01   ` Brian Foster
  2017-07-20 12:01     ` Carlos Maiolino
  0 siblings, 1 reply; 7+ messages in thread
From: Brian Foster @ 2017-07-19 12:01 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-xfs

On Tue, Jul 18, 2017 at 04:54:14PM +0200, Carlos Maiolino wrote:
> With the current code, XFS never resubmits a failed buffer for IO,
> because the failed item in the buffer is kept in the flush locked state
> forever.
> 
> To be able to resubmit a log item for IO, we need a way to mark an item
> as failed if, for any reason, the buffer to which the item belongs
> fails during writeback.
> 
> Add a new log item callback to be used after an IO completion failure
> and make the needed cleanups.
> 
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---

I think a comment would be nice for _do_callbacks_fail() (see below for
an example), but otherwise the code looks good to me:

Reviewed-by: Brian Foster <bfoster@redhat.com>

> 
> V2:
> 	- Update commit log to include a better description of why this
> 	  patch is needed and fix spelling mistakes
> 	- Move xfs_buf_do_callbacks_fail() call into
> 	  xfs_buf_iodone_callback_error, so the callbacks can be executed
> 	  before the buffer is released, and only after it has been
> 	  retried once
> 
> V3:
> 	- fix some loops according to hch suggestion
> 	- whitespace cleanup
> 
> V4:
> 	- Invoke failure callbacks before resetting the I/O error
> 	- Remove bflags field from iop_error callback
> 	- move spin_lock/unlock xa_lock up in the stack, handling all
> 	  log items in the same buffer under a single lock
> 
> V5:
> 	- Reorganize variable declarations
> 	  in xfs_buf_do_callbacks_fail
> 
>  fs/xfs/xfs_buf_item.c | 24 +++++++++++++++++++++++-
>  fs/xfs/xfs_trans.h    |  7 +++++--
>  2 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index f6a8422..d6ca7d6 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -29,6 +29,7 @@
>  #include "xfs_error.h"
>  #include "xfs_trace.h"
>  #include "xfs_log.h"
> +#include "xfs_inode.h"
>  
>  
>  kmem_zone_t	*xfs_buf_item_zone;
> @@ -1054,6 +1055,23 @@ xfs_buf_do_callbacks(
>  	}
>  }
>  

/*
 * Invoke the error state callback for each log item affected by the failed I/O.
 *
 * If a metadata buffer write fails with a non-permanent error, the buffer is
 * eventually resubmitted and so the completion callbacks are not run. The error
 * state may need to be propagated to the log items attached to the buffer,
 * however, so the next AIL push of the item knows how to handle it correctly.
 */

> +STATIC void
> +xfs_buf_do_callbacks_fail(
> +	struct xfs_buf		*bp)
> +{
> +	struct xfs_log_item	*next;
> +	struct xfs_log_item	*lip = bp->b_fspriv;
> +	struct xfs_ail		*ailp = lip->li_ailp;
> +
> +	spin_lock(&ailp->xa_lock);
> +	for (; lip; lip = next) {
> +		next = lip->li_bio_list;
> +		if (lip->li_ops->iop_error)
> +			lip->li_ops->iop_error(lip, bp);
> +	}
> +	spin_unlock(&ailp->xa_lock);
> +}
> +
>  static bool
>  xfs_buf_iodone_callback_error(
>  	struct xfs_buf		*bp)
> @@ -1123,7 +1141,11 @@ xfs_buf_iodone_callback_error(
>  	if ((mp->m_flags & XFS_MOUNT_UNMOUNTING) && mp->m_fail_unmount)
>  		goto permanent_error;
>  
> -	/* still a transient error, higher layers will retry */
> +	/*
> +	 * Still a transient error, run IO completion failure callbacks and let
> +	 * the higher layers retry the buffer.
> +	 */
> +	xfs_buf_do_callbacks_fail(bp);
>  	xfs_buf_ioerror(bp, 0);
>  	xfs_buf_relse(bp);
>  	return true;
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 6bdad6f..442d679 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -64,11 +64,13 @@ typedef struct xfs_log_item {
>  } xfs_log_item_t;
>  
>  #define	XFS_LI_IN_AIL	0x1
> -#define XFS_LI_ABORTED	0x2
> +#define	XFS_LI_ABORTED	0x2
> +#define	XFS_LI_FAILED	0x4
>  
>  #define XFS_LI_FLAGS \
>  	{ XFS_LI_IN_AIL,	"IN_AIL" }, \
> -	{ XFS_LI_ABORTED,	"ABORTED" }
> +	{ XFS_LI_ABORTED,	"ABORTED" }, \
> +	{ XFS_LI_FAILED,	"FAILED" }
>  
>  struct xfs_item_ops {
>  	void (*iop_size)(xfs_log_item_t *, int *, int *);
> @@ -79,6 +81,7 @@ struct xfs_item_ops {
>  	void (*iop_unlock)(xfs_log_item_t *);
>  	xfs_lsn_t (*iop_committed)(xfs_log_item_t *, xfs_lsn_t);
>  	void (*iop_committing)(xfs_log_item_t *, xfs_lsn_t);
> +	void (*iop_error)(xfs_log_item_t *, xfs_buf_t *);
>  };
>  
>  void	xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item,
> -- 
> 2.9.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 2/2 V5] xfs: Properly retry failed inode items in case of error during buffer writeback
  2017-07-18 14:54 ` [PATCH 2/2 V5] xfs: Properly retry failed inode items in case of error during buffer writeback Carlos Maiolino
@ 2017-07-19 12:01   ` Brian Foster
  2017-07-20 12:07     ` Carlos Maiolino
  0 siblings, 1 reply; 7+ messages in thread
From: Brian Foster @ 2017-07-19 12:01 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-xfs

On Tue, Jul 18, 2017 at 04:54:15PM +0200, Carlos Maiolino wrote:
> When a buffer fails during writeback, the inode items in it are kept
> flush locked and are never resubmitted because of the flush lock. So,
> if any buffer fails to be written, the items in the AIL are never
> written to disk and never unlocked.
> 
> This causes the unmount operation to hang because of the flush locked
> items in the AIL, and it also means the items in the AIL are never
> written back, even when the IO device comes back to normal.
> 
> I've been testing this patch with a DM-thin device, creating a
> filesystem larger than the real device.
> 
> When writing enough data to fill the DM-thin device, XFS receives ENOSPC
> errors from the device, and keeps spinning in xfsaild (when the 'retry
> forever' configuration is set).
> 
> At this point, the filesystem cannot be unmounted because of the flush
> locked items in the AIL. Worse, the items in the AIL are never retried
> at all (since xfs_inode_item_push() skips items that are flush locked),
> even if the underlying DM-thin device is expanded to the proper size.
> 
> This patch fixes both cases by retrying any item that previously failed,
> using the infrastructure provided by the previous patch.
> 
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---

This version looks pretty good to me. I've added a few style nits below
and I need to run some testing, but otherwise the code looks good:

Reviewed-by: Brian Foster <bfoster@redhat.com>

> 
> V2:
> 	- Fix XFS_LI_FAILED flag removal
> 	- Use atomic operations to set and clear XFS_LI_FAILED flag
> 	- Remove check for XBF_WRITE_FAIL in xfs_inode_item_push
> 	- Add more comments to the code
> 	- Add a helper function to resubmit the failed buffers, so this
> 	  can be also used in dquot system without duplicating code
> 
> V3:
> 	- kill xfs_imap_to_bp call using a pointer in the log item to
> 	  hold the buffer address
> 	- use xa_lock instead of atomic operations to handle log item
> 	  flags
> 	- Add a hold to the buffer for each log item failed
> 	- move buffer resubmission up in xfs_inode_item_push()
> 
> V4:
> 	- Remove bflags argument from iop_error callback
> 	- Remove ip argument from xfs_buf_resubmit_failed_buffers
> 	- Use helpers to set/clear XFS_LI_FAILED flag
> 	- remove ->xa_lock from the iop_error callback and move it up
> 	  on the stack, so all log items are processed under a single
> 	  lock/unlock pair
> 
> V5:
> 	- fix comments
> 	- move buf_lock from xfs_buf_resubmit_failed_buffers() up to
> 	  xfs_inode_item_push, and use trylock instead
> 	- assert xa_lock is held in xfs_set/clear_li_failed helpers
> 	  and move such helpers into xfs_trans_priv.h, since
> 	  lockdep_assert_held() usage requires the xfs_ail definition
> 	- ASSERT XFS_LI_IN_AIL in xfs_clear_li_failed()
> 	- assert inode is flush locked in xfs_inode_item_error
> 	- fix bitwise ops mess
> 
>  fs/xfs/xfs_buf_item.c   | 34 ++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_buf_item.h   |  3 +++
>  fs/xfs/xfs_inode_item.c | 45 +++++++++++++++++++++++++++++++++++++++++----
>  fs/xfs/xfs_trans.h      |  1 +
>  fs/xfs/xfs_trans_ail.c  |  4 ++--
>  fs/xfs/xfs_trans_priv.h | 31 +++++++++++++++++++++++++++++++
>  6 files changed, 112 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index d6ca7d6..940b0eb 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -1226,3 +1226,37 @@ xfs_buf_iodone(
>  	xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
>  	xfs_buf_item_free(BUF_ITEM(lip));
>  }
> +
> +/*
> + * Requeue a failed buffer for writeback
> + *
> + * Return true if the buffer has been re-queued properly, false otherwise
> + */
> +bool
> +xfs_buf_resubmit_failed_buffers(
> +	struct xfs_buf		*bp,
> +	struct xfs_log_item	*lip,
> +	struct list_head	*buffer_list)
> +{
> +	struct xfs_log_item	*next;
> +	bool			ret;
> +
> +	/*
> +	 * Clear XFS_LI_FAILED flag from all items before resubmit
> +	 *
> +	 * XFS_LI_FAILED set/clear is protected by xa_lock; the caller of this
> +	 * function must already hold it
> +	 */
> +	for (; lip; lip = next) {
> +		next = lip->li_bio_list;
> +		xfs_clear_li_failed(lip);
> +	}
> +
> +	/* Add this buffer back to the delayed write list */
> +	if (!xfs_buf_delwri_queue(bp, buffer_list))
> +		ret = false;
> +	else
> +		ret = true;
> +
> +	return ret;

	return xfs_buf_delwri_queue(bp, buffer_list); ?
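
The suggested one-line return is equivalent to the if/else form whenever the
queue call already returns bool. A small userspace check of that
equivalence, with queue_stub() as a hypothetical stand-in for the real queue
function:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a queue call that returns bool. */
static bool queue_stub(bool already_queued)
{
	return !already_queued;
}

/* Shape used in the patch: copy the result through a local variable. */
static bool resubmit_verbose(bool already_queued)
{
	bool ret;

	if (!queue_stub(already_queued))
		ret = false;
	else
		ret = true;

	return ret;
}

/* Shape suggested in review: return the call's result directly. */
static bool resubmit_direct(bool already_queued)
{
	return queue_stub(already_queued);
}
```

Both functions return the same value for either input, so the shorter form
changes no behaviour and also removes the local variable.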

> +}
> diff --git a/fs/xfs/xfs_buf_item.h b/fs/xfs/xfs_buf_item.h
> index f7eba99..530686e 100644
> --- a/fs/xfs/xfs_buf_item.h
> +++ b/fs/xfs/xfs_buf_item.h
> @@ -70,6 +70,9 @@ void	xfs_buf_attach_iodone(struct xfs_buf *,
>  			      xfs_log_item_t *);
>  void	xfs_buf_iodone_callbacks(struct xfs_buf *);
>  void	xfs_buf_iodone(struct xfs_buf *, struct xfs_log_item *);
> +bool	xfs_buf_resubmit_failed_buffers(struct xfs_buf *,
> +					struct xfs_log_item *,
> +					struct list_head *);
>  
>  extern kmem_zone_t	*xfs_buf_item_zone;
>  
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index 08cb7d1..ad7ec64 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -27,6 +27,7 @@
>  #include "xfs_error.h"
>  #include "xfs_trace.h"
>  #include "xfs_trans_priv.h"
> +#include "xfs_buf_item.h"
>  #include "xfs_log.h"
>  
>  
> @@ -475,6 +476,21 @@ xfs_inode_item_unpin(
>  		wake_up_bit(&ip->i_flags, __XFS_IPINNED_BIT);
>  }
>  
> +/*
> + * Callback used to mark a buffer with XFS_LI_FAILED when items in the buffer
> + * have been failed during writeback

"This informs the AIL that the inode is already flush locked on the next
push and acquires a hold on the buffer to ensure that it isn't reclaimed
before dirty data makes it to disk."

> + */
> +STATIC void
> +xfs_inode_item_error(
> +	struct xfs_log_item	*lip,
> +	struct xfs_buf		*bp)
> +{
> +	struct xfs_inode	*ip = INODE_ITEM(lip)->ili_inode;
> +
> +	ASSERT(xfs_isiflocked(ip));
> +	xfs_set_li_failed(lip, bp);
> +}
> +
>  STATIC uint
>  xfs_inode_item_push(
>  	struct xfs_log_item	*lip,
> @@ -484,13 +500,28 @@ xfs_inode_item_push(
>  {
>  	struct xfs_inode_log_item *iip = INODE_ITEM(lip);
>  	struct xfs_inode	*ip = iip->ili_inode;
> -	struct xfs_buf		*bp = NULL;
> +	struct xfs_buf		*bp = lip->li_buf;
>  	uint			rval = XFS_ITEM_SUCCESS;
>  	int			error;
>  
>  	if (xfs_ipincount(ip) > 0)
>  		return XFS_ITEM_PINNED;
>  
> +	/*
> +	 * The buffer containing this item failed to be written back
> +	 * previously. Resubmit the buffer for IO.
> +	 */
> +	if (lip->li_flags & XFS_LI_FAILED) {
> +		if (!xfs_buf_trylock(bp))
> +		    return XFS_ITEM_LOCKED;

Bad indentation.

> +
> +		if (!xfs_buf_resubmit_failed_buffers(bp, lip, buffer_list))
> +			rval = XFS_ITEM_FLUSHING;
> +
> +		xfs_buf_unlock(bp);
> +		return rval;
> +	}
> +
>  	if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
>  		return XFS_ITEM_LOCKED;
>  
> @@ -622,7 +653,8 @@ static const struct xfs_item_ops xfs_inode_item_ops = {
>  	.iop_unlock	= xfs_inode_item_unlock,
>  	.iop_committed	= xfs_inode_item_committed,
>  	.iop_push	= xfs_inode_item_push,
> -	.iop_committing = xfs_inode_item_committing
> +	.iop_committing = xfs_inode_item_committing,
> +	.iop_error	= xfs_inode_item_error
>  };
>  
>  
> @@ -710,7 +742,8 @@ xfs_iflush_done(
>  		 * the AIL lock.
>  		 */
>  		iip = INODE_ITEM(blip);
> -		if (iip->ili_logged && blip->li_lsn == iip->ili_flush_lsn)
> +		if ((iip->ili_logged && blip->li_lsn == iip->ili_flush_lsn) ||
> +		    lip->li_flags & XFS_LI_FAILED)
>  			need_ail++;
>  
>  		blip = next;
> @@ -718,7 +751,8 @@ xfs_iflush_done(
>  
>  	/* make sure we capture the state of the initial inode. */
>  	iip = INODE_ITEM(lip);
> -	if (iip->ili_logged && lip->li_lsn == iip->ili_flush_lsn)
> +	if ((iip->ili_logged && lip->li_lsn == iip->ili_flush_lsn) ||
> +	    lip->li_flags & XFS_LI_FAILED)
>  		need_ail++;
>  
>  	/*
> @@ -739,6 +773,9 @@ xfs_iflush_done(
>  			if (INODE_ITEM(blip)->ili_logged &&
>  			    blip->li_lsn == INODE_ITEM(blip)->ili_flush_lsn)
>  				mlip_changed |= xfs_ail_delete_one(ailp, blip);
> +			else {
> +				xfs_clear_li_failed(blip);
> +			}
>  		}
>  
>  		if (mlip_changed) {
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 442d679..7d62772 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -49,6 +49,7 @@ typedef struct xfs_log_item {
>  	struct xfs_ail			*li_ailp;	/* ptr to AIL */
>  	uint				li_type;	/* item type */
>  	uint				li_flags;	/* misc flags */
> +	struct xfs_buf			*li_buf;	/* real buffer pointer */
>  	struct xfs_log_item		*li_bio_list;	/* buffer item list */
>  	void				(*li_cb)(struct xfs_buf *,
>  						 struct xfs_log_item *);
> diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> index 9056c0f..eba21a9 100644
> --- a/fs/xfs/xfs_trans_ail.c
> +++ b/fs/xfs/xfs_trans_ail.c
> @@ -687,15 +687,15 @@ xfs_trans_ail_update_bulk(
>  bool
>  xfs_ail_delete_one(
>  	struct xfs_ail		*ailp,
> -	struct xfs_log_item 	*lip)
> +	struct xfs_log_item	*lip)
>  {
>  	struct xfs_log_item	*mlip = xfs_ail_min(ailp);
>  
>  	trace_xfs_ail_delete(lip, mlip->li_lsn, lip->li_lsn);
>  	xfs_ail_delete(ailp, lip);
> +	xfs_clear_li_failed(lip);
>  	lip->li_flags &= ~XFS_LI_IN_AIL;
>  	lip->li_lsn = 0;
> -

Unrelated whitespace changes here and above..?

Brian

>  	return mlip == lip;
>  }
>  
> diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
> index d91706c..b317a36 100644
> --- a/fs/xfs/xfs_trans_priv.h
> +++ b/fs/xfs/xfs_trans_priv.h
> @@ -164,4 +164,35 @@ xfs_trans_ail_copy_lsn(
>  	*dst = *src;
>  }
>  #endif
> +
> +static inline void
> +xfs_clear_li_failed(
> +	struct xfs_log_item	*lip)
> +{
> +	struct xfs_buf	*bp = lip->li_buf;
> +
> +	ASSERT(lip->li_flags & XFS_LI_IN_AIL);
> +	lockdep_assert_held(&lip->li_ailp->xa_lock);
> +
> +	if (lip->li_flags & XFS_LI_FAILED) {
> +		lip->li_flags &= ~XFS_LI_FAILED;
> +		lip->li_buf = NULL;
> +		xfs_buf_rele(bp);
> +	}
> +}
> +
> +static inline void
> +xfs_set_li_failed(
> +	struct xfs_log_item	*lip,
> +	struct xfs_buf		*bp)
> +{
> +	lockdep_assert_held(&lip->li_ailp->xa_lock);
> +
> +	if (!(lip->li_flags & XFS_LI_FAILED)) {
> +		xfs_buf_hold(bp);
> +		lip->li_flags |= XFS_LI_FAILED;
> +		lip->li_buf = bp;
> +	}
> +}
> +
>  #endif	/* __XFS_TRANS_PRIV_H__ */
> -- 
> 2.9.4
> 


* Re: [PATCH 1/2 V5] xfs: Add infrastructure needed for error propagation during buffer IO failure
  2017-07-19 12:01   ` Brian Foster
@ 2017-07-20 12:01     ` Carlos Maiolino
  0 siblings, 0 replies; 7+ messages in thread
From: Carlos Maiolino @ 2017-07-20 12:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jul 19, 2017 at 08:01:04AM -0400, Brian Foster wrote:
> On Tue, Jul 18, 2017 at 04:54:14PM +0200, Carlos Maiolino wrote:
> > With the current code, XFS never resubmits a failed buffer for IO,
> > because the failed item in the buffer is kept in the flush locked state
> > forever.
> > 
> > To be able to resubmit a log item for IO, we need a way to mark an item
> > as failed if, for any reason, the buffer to which the item belongs
> > fails during writeback.
> > 
> > Add a new log item callback to be used after an IO completion failure
> > and make the needed cleanups.
> > 
> > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> > ---
> 
> I think a comment would be nice for _do_callbacks_fail() (see below for
> an example), but otherwise the code looks good to me:
> 
> Reviewed-by: Brian Foster <bfoster@redhat.com>
> 
> > 
> > V2:
> > 	- Update commit log to include a better description of why this
> > 	  patch is needed and fix spelling mistakes
> > 	- Move xfs_buf_do_callbacks_fail() call into
> > 	  xfs_buf_iodone_callback_error, so the callbacks can be executed
> > 	  before the buffer is released, and only after it has been
> > 	  retried once
> > 
> > V3:
> > 	- fix some loops according to hch suggestion
> > 	- whitespace cleanup
> > 
> > V4:
> > > 	- Invoke failure callbacks before resetting the I/O error
> > 	- Remove bflags field from iop_error callback
> > 	- move spin_lock/unlock xa_lock up in the stack, handling all
> > 	  log items in the same buffer into a single lock
> > 
> > V5:
> > 	- Reorganize variable declarations
> > > 	  in xfs_buf_do_callbacks_fail
> > 
> >  fs/xfs/xfs_buf_item.c | 24 +++++++++++++++++++++++-
> >  fs/xfs/xfs_trans.h    |  7 +++++--
> >  2 files changed, 28 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> > index f6a8422..d6ca7d6 100644
> > --- a/fs/xfs/xfs_buf_item.c
> > +++ b/fs/xfs/xfs_buf_item.c
> > @@ -29,6 +29,7 @@
> >  #include "xfs_error.h"
> >  #include "xfs_trace.h"
> >  #include "xfs_log.h"
> > +#include "xfs_inode.h"
> >  
> >  
> >  kmem_zone_t	*xfs_buf_item_zone;
> > @@ -1054,6 +1055,23 @@ xfs_buf_do_callbacks(
> >  	}
> >  }
> >  
> 
> /*
>  * Invoke the error state callback for each log item affected by the failed I/O.
>  *
>  * If a metadata buffer write fails with a non-permanent error, the buffer is
>  * eventually resubmitted and so the completion callbacks are not run. The error
>  * state may need to be propagated to the log items attached to the buffer,
>  * however, so the next AIL push of the item knows how to handle it correctly.
>  */

OK, I have no objection to describing it in more detail. This comment looks
fine; I'll add it in V6 and keep your review tag, if that's OK.

cheers
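
For illustration only, here is a minimal userspace sketch of the pattern the
comment above describes: walk every log item attached to a failed buffer under
a single lock/unlock pair and invoke the error callback only where one is
defined. All demo_* names are hypothetical stand-ins, not kernel types, and the
plain integer "locked" field merely models where the AIL spinlock would be
taken and dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel structures. */
struct demo_buf;
struct demo_item;

struct demo_item_ops {
	/* Optional error callback, analogous to ->iop_error in the patch. */
	void (*error)(struct demo_item *item, struct demo_buf *buf);
};

struct demo_item {
	const struct demo_item_ops *ops;
	struct demo_item *next;		/* plays the role of li_bio_list */
	int failed;
};

struct demo_buf {
	struct demo_item *items;	/* plays the role of b_fspriv */
	int locked;			/* models the AIL lock being held */
};

/* Example callback: just record that the item saw the failure. */
static void demo_mark_failed(struct demo_item *item, struct demo_buf *buf)
{
	(void)buf;
	item->failed = 1;
}

/*
 * Run the error callback for each item on the buffer. The lock is taken
 * once for the whole list rather than once per item. Returns the number
 * of callbacks invoked.
 */
static int demo_do_callbacks_fail(struct demo_buf *buf)
{
	struct demo_item *item, *next;
	int called = 0;

	buf->locked = 1;
	for (item = buf->items; item; item = next) {
		next = item->next;
		if (item->ops->error) {
			item->ops->error(item, buf);
			called++;
		}
	}
	buf->locked = 0;
	return called;
}
```

Items whose ops table leaves the error callback NULL are skipped, which is why
only item types that care about writeback failure need to implement it.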
> 
> > +STATIC void
> > +xfs_buf_do_callbacks_fail(
> > +	struct xfs_buf		*bp)
> > +{
> > +	struct xfs_log_item	*next;
> > +	struct xfs_log_item	*lip = bp->b_fspriv;
> > +	struct xfs_ail		*ailp = lip->li_ailp;
> > +
> > +	spin_lock(&ailp->xa_lock);
> > +	for (; lip; lip = next) {
> > +		next = lip->li_bio_list;
> > +		if (lip->li_ops->iop_error)
> > +			lip->li_ops->iop_error(lip, bp);
> > +	}
> > +	spin_unlock(&ailp->xa_lock);
> > +}
> > +
> >  static bool
> >  xfs_buf_iodone_callback_error(
> >  	struct xfs_buf		*bp)
> > @@ -1123,7 +1141,11 @@ xfs_buf_iodone_callback_error(
> >  	if ((mp->m_flags & XFS_MOUNT_UNMOUNTING) && mp->m_fail_unmount)
> >  		goto permanent_error;
> >  
> > -	/* still a transient error, higher layers will retry */
> > +	/*
> > +	 * Still a transient error, run IO completion failure callbacks and let
> > +	 * the higher layers retry the buffer.
> > +	 */
> > +	xfs_buf_do_callbacks_fail(bp);
> >  	xfs_buf_ioerror(bp, 0);
> >  	xfs_buf_relse(bp);
> >  	return true;
> > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> > index 6bdad6f..442d679 100644
> > --- a/fs/xfs/xfs_trans.h
> > +++ b/fs/xfs/xfs_trans.h
> > @@ -64,11 +64,13 @@ typedef struct xfs_log_item {
> >  } xfs_log_item_t;
> >  
> >  #define	XFS_LI_IN_AIL	0x1
> > -#define XFS_LI_ABORTED	0x2
> > +#define	XFS_LI_ABORTED	0x2
> > +#define	XFS_LI_FAILED	0x4
> >  
> >  #define XFS_LI_FLAGS \
> >  	{ XFS_LI_IN_AIL,	"IN_AIL" }, \
> > -	{ XFS_LI_ABORTED,	"ABORTED" }
> > +	{ XFS_LI_ABORTED,	"ABORTED" }, \
> > +	{ XFS_LI_FAILED,	"FAILED" }
> >  
> >  struct xfs_item_ops {
> >  	void (*iop_size)(xfs_log_item_t *, int *, int *);
> > @@ -79,6 +81,7 @@ struct xfs_item_ops {
> >  	void (*iop_unlock)(xfs_log_item_t *);
> >  	xfs_lsn_t (*iop_committed)(xfs_log_item_t *, xfs_lsn_t);
> >  	void (*iop_committing)(xfs_log_item_t *, xfs_lsn_t);
> > +	void (*iop_error)(xfs_log_item_t *, xfs_buf_t *);
> >  };
> >  
> >  void	xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item,
> > -- 
> > 2.9.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Carlos


* Re: [PATCH 2/2 V5] xfs: Properly retry failed inode items in case of error during buffer writeback
  2017-07-19 12:01   ` Brian Foster
@ 2017-07-20 12:07     ` Carlos Maiolino
  0 siblings, 0 replies; 7+ messages in thread
From: Carlos Maiolino @ 2017-07-20 12:07 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jul 19, 2017 at 08:01:14AM -0400, Brian Foster wrote:
> On Tue, Jul 18, 2017 at 04:54:15PM +0200, Carlos Maiolino wrote:
> > > When a buffer fails during writeback, the inode items attached to it
> > > are kept flush locked and are never resubmitted because of the flush
> > > lock. So, if any buffer fails to be written, the items in the AIL are
> > > never written to disk and never unlocked.
> > > 
> > > This causes the unmount operation to hang because of these flush locked
> > > items in the AIL, and it also means the items in the AIL are never
> > > written back, even when the IO device comes back to normal.
> > > 
> > > I've been testing this patch with a DM-thin device, creating a
> > > filesystem larger than the real device.
> > > 
> > > When writing enough data to fill the DM-thin device, XFS receives ENOSPC
> > > errors from the device, and keeps spinning on xfsaild (when the 'retry
> > > forever' configuration is set).
> > > 
> > > At this point, the filesystem cannot be unmounted because of the flush
> > > locked items in the AIL. Worse, the items in the AIL are never retried
> > > at all (since xfs_inode_item_push() skips items that are flush locked),
> > > even if the underlying DM-thin device is expanded to the proper size.
> > > 
> > > This patch fixes both cases, retrying any item that has previously
> > > failed, using the infrastructure provided by the previous patch.
> > 
> > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> > ---
> 
> This version looks pretty good to me. I've added a few style nits below
> and I need to run some testing, but otherwise the code looks good:
> 
> Reviewed-by: Brian Foster <bfoster@redhat.com>
> 
> > 
> > V2:
> > 	- Fix XFS_LI_FAILED flag removal
> > 	- Use atomic operations to set and clear XFS_LI_FAILED flag
> > 	- Remove check for XBF_WRITE_FAIL in xfs_inode_item_push
> > 	- Add more comments to the code
> > 	- Add a helper function to resubmit the failed buffers, so this
> > 	  can be also used in dquot system without duplicating code
> > 
> > V3:
> > 	- kill xfs_imap_to_bp call using a pointer in the log item to
> > 	  hold the buffer address
> > 	- use xa_lock instead of atomic operations to handle log item
> > 	  flags
> > 	- Add a hold to the buffer for each log item failed
> > 	- move buffer resubmission up in xfs_inode_item_push()
> > 
> > V4:
> > 	- Remove bflags argument from iop_error callback
> > 	- Remove ip argument from xfs_buf_resubmit_failed_buffers
> > 	- Use helpers to set/clear XFS_LI_FAILED flag
> > 	- remove ->xa_lock from the iop->error callback and move it up
> > 	  on the stack, so all log items are processed into a single
> > 	  pair of lock/unlock
> > 
> > V5:
> > 	- fix comments
> > 	- move buf_lock from xfs_buf_resubmit_failed_buffers() up to
> > 	  xfs_inode_item_push, and use trylock instead
> > > 	- assert xa_lock is held in xfs_set/clear_li_failed helpers
> > > 	  and move such helpers into xfs_trans_priv.h, since
> > > 	  lockdep_assert_held() usage requires the xfs_ail definition
> > > 	- ASSERT XFS_LI_IN_AIL in xfs_clear_li_failed()
> > 	- assert inode is flush locked in xfs_inode_item_error
> > 	- fix bitwise ops mess
> > 
> >  fs/xfs/xfs_buf_item.c   | 34 ++++++++++++++++++++++++++++++++++
> >  fs/xfs/xfs_buf_item.h   |  3 +++
> >  fs/xfs/xfs_inode_item.c | 45 +++++++++++++++++++++++++++++++++++++++++----
> >  fs/xfs/xfs_trans.h      |  1 +
> >  fs/xfs/xfs_trans_ail.c  |  4 ++--
> >  fs/xfs/xfs_trans_priv.h | 31 +++++++++++++++++++++++++++++++
> >  6 files changed, 112 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> > index d6ca7d6..940b0eb 100644
> > --- a/fs/xfs/xfs_buf_item.c
> > +++ b/fs/xfs/xfs_buf_item.c
> > @@ -1226,3 +1226,37 @@ xfs_buf_iodone(
> >  	xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
> >  	xfs_buf_item_free(BUF_ITEM(lip));
> >  }
> > +
> > +/*
> > + * Requeue a failed buffer for writeback
> > + *
> > + * Return true if the buffer has been re-queued properly, false otherwise
> > + */
> > +bool
> > +xfs_buf_resubmit_failed_buffers(
> > +	struct xfs_buf		*bp,
> > +	struct xfs_log_item	*lip,
> > +	struct list_head	*buffer_list)
> > +{
> > +	struct xfs_log_item	*next;
> > +	bool			ret;
> > +
> > +	/*
> > +	 * Clear XFS_LI_FAILED flag from all items before resubmit
> > +	 *
> > > +	 * XFS_LI_FAILED set/clear is protected by xa_lock; the caller of
> > > +	 * this function must already hold it
> > +	 */
> > +	for (; lip; lip = next) {
> > +		next = lip->li_bio_list;
> > +		xfs_clear_li_failed(lip);
> > +	}
> > +
> > +	/* Add this buffer back to the delayed write list */
> > +	if (!xfs_buf_delwri_queue(bp, buffer_list))
> > +		ret = false;
> > +	else
> > +		ret = true;
> > +
> > +	return ret;
> 
> 	return xfs_buf_delwri_queue(bp, buffer_list); ?

yup, sounds better
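
Roughly, the simplified shape would look like the userspace sketch below. It
is only a model: the demo_* types and demo_delwri_queue() are hypothetical
stand-ins, and the queue helper is assumed to return false when the buffer is
already queued, which is the case the if/else in the patch was translating
into a bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the buffer and log item; not kernel types. */
struct demo_buf {
	bool queued;			/* already on a delayed write list? */
};

struct demo_item {
	struct demo_item *next;		/* plays the role of li_bio_list */
	bool failed;			/* plays the role of XFS_LI_FAILED */
};

/*
 * Assumed behaviour of the queueing helper: true if the buffer was newly
 * queued, false if it was already queued.
 */
static bool demo_delwri_queue(struct demo_buf *buf)
{
	if (buf->queued)
		return false;
	buf->queued = true;
	return true;
}

/*
 * Clear the failed state on every item, then requeue the buffer. The
 * helper's result is returned directly, so no local "ret" is needed.
 */
static bool demo_resubmit_failed(struct demo_buf *buf, struct demo_item *item)
{
	struct demo_item *next;

	for (; item; item = next) {
		next = item->next;
		item->failed = false;
	}

	return demo_delwri_queue(buf);
}
```

Returning the helper's value directly keeps the meaning of the result (queued
or not) in one place instead of restating it in the caller.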

> 
> > +}
> > diff --git a/fs/xfs/xfs_buf_item.h b/fs/xfs/xfs_buf_item.h
> > index f7eba99..530686e 100644
> > --- a/fs/xfs/xfs_buf_item.h
> > +++ b/fs/xfs/xfs_buf_item.h
> > @@ -70,6 +70,9 @@ void	xfs_buf_attach_iodone(struct xfs_buf *,
> >  			      xfs_log_item_t *);
> >  void	xfs_buf_iodone_callbacks(struct xfs_buf *);
> >  void	xfs_buf_iodone(struct xfs_buf *, struct xfs_log_item *);
> > +bool	xfs_buf_resubmit_failed_buffers(struct xfs_buf *,
> > +					struct xfs_log_item *,
> > +					struct list_head *);
> >  
> >  extern kmem_zone_t	*xfs_buf_item_zone;
> >  
> > diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> > index 08cb7d1..ad7ec64 100644
> > --- a/fs/xfs/xfs_inode_item.c
> > +++ b/fs/xfs/xfs_inode_item.c
> > @@ -27,6 +27,7 @@
> >  #include "xfs_error.h"
> >  #include "xfs_trace.h"
> >  #include "xfs_trans_priv.h"
> > +#include "xfs_buf_item.h"
> >  #include "xfs_log.h"
> >  
> >  
> > @@ -475,6 +476,21 @@ xfs_inode_item_unpin(
> >  		wake_up_bit(&ip->i_flags, __XFS_IPINNED_BIT);
> >  }
> >  
> > +/*
> > + * Callback used to mark a buffer with XFS_LI_FAILED when items in the buffer
> > + * have been failed during writeback
> 
> "This informs the AIL that the inode is already flush locked on the next
> push and acquires a hold on the buffer to ensure that it isn't reclaimed
> before dirty data makes it to disk."
> 

sounds fine
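
To make the "acquires a hold on the buffer" part concrete, here is a small
userspace model of the set/clear helpers from this patch. The demo_* names and
the plain hold_count integer are assumptions made for illustration; the real
helpers use xfs_buf_hold()/xfs_buf_rele() and must run with xa_lock held.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LI_FAILED	0x4	/* same bit value as XFS_LI_FAILED in patch 1 */

/* Hypothetical stand-ins; not kernel types. */
struct demo_buf {
	int hold_count;			/* models the buffer reference count */
};

struct demo_item {
	unsigned int flags;
	struct demo_buf *buf;		/* plays the role of li_buf */
};

/* Mark the item failed and pin the buffer, but only on the first failure. */
static void demo_set_failed(struct demo_item *item, struct demo_buf *buf)
{
	if (!(item->flags & DEMO_LI_FAILED)) {
		buf->hold_count++;
		item->flags |= DEMO_LI_FAILED;
		item->buf = buf;
	}
}

/* Drop the failed state and release the hold taken by demo_set_failed(). */
static void demo_clear_failed(struct demo_item *item)
{
	struct demo_buf *buf = item->buf;

	if (item->flags & DEMO_LI_FAILED) {
		item->flags &= ~DEMO_LI_FAILED;
		item->buf = NULL;
		buf->hold_count--;
	}
}
```

Because both helpers test the flag first, repeated failures of the same item
take only one hold, and a clear on an item that never failed is a no-op, so
holds and releases stay balanced.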

> > + */
> > +STATIC void
> > +xfs_inode_item_error(
> > +	struct xfs_log_item	*lip,
> > +	struct xfs_buf		*bp)
> > +{
> > +	struct xfs_inode	*ip = INODE_ITEM(lip)->ili_inode;
> > +
> > +	ASSERT(xfs_isiflocked(ip));
> > +	xfs_set_li_failed(lip, bp);
> > +}
> > +
> >  STATIC uint
> >  xfs_inode_item_push(
> >  	struct xfs_log_item	*lip,
> > @@ -484,13 +500,28 @@ xfs_inode_item_push(
> >  {
> >  	struct xfs_inode_log_item *iip = INODE_ITEM(lip);
> >  	struct xfs_inode	*ip = iip->ili_inode;
> > -	struct xfs_buf		*bp = NULL;
> > +	struct xfs_buf		*bp = lip->li_buf;
> >  	uint			rval = XFS_ITEM_SUCCESS;
> >  	int			error;
> >  
> >  	if (xfs_ipincount(ip) > 0)
> >  		return XFS_ITEM_PINNED;
> >  
> > +	/*
> > +	 * The buffer containing this item failed to be written back
> > +	 * previously. Resubmit the buffer for IO.
> > +	 */
> > +	if (lip->li_flags & XFS_LI_FAILED) {
> > +		if (!xfs_buf_trylock(bp))
> > +		    return XFS_ITEM_LOCKED;
> 
> Bad indentation.
> 
> > +
> > +		if (!xfs_buf_resubmit_failed_buffers(bp, lip, buffer_list))
> > +			rval = XFS_ITEM_FLUSHING;
> > +
> > +		xfs_buf_unlock(bp);
> > +		return rval;
> > +	}
> > +
> >  	if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
> >  		return XFS_ITEM_LOCKED;
> >  
> > @@ -622,7 +653,8 @@ static const struct xfs_item_ops xfs_inode_item_ops = {
> >  	.iop_unlock	= xfs_inode_item_unlock,
> >  	.iop_committed	= xfs_inode_item_committed,
> >  	.iop_push	= xfs_inode_item_push,
> > -	.iop_committing = xfs_inode_item_committing
> > +	.iop_committing = xfs_inode_item_committing,
> > +	.iop_error	= xfs_inode_item_error
> >  };
> >  
> >  
> > @@ -710,7 +742,8 @@ xfs_iflush_done(
> >  		 * the AIL lock.
> >  		 */
> >  		iip = INODE_ITEM(blip);
> > -		if (iip->ili_logged && blip->li_lsn == iip->ili_flush_lsn)
> > +		if ((iip->ili_logged && blip->li_lsn == iip->ili_flush_lsn) ||
> > +		    lip->li_flags & XFS_LI_FAILED)
> >  			need_ail++;
> >  
> >  		blip = next;
> > @@ -718,7 +751,8 @@ xfs_iflush_done(
> >  
> >  	/* make sure we capture the state of the initial inode. */
> >  	iip = INODE_ITEM(lip);
> > -	if (iip->ili_logged && lip->li_lsn == iip->ili_flush_lsn)
> > +	if ((iip->ili_logged && lip->li_lsn == iip->ili_flush_lsn) ||
> > +	    lip->li_flags & XFS_LI_FAILED)
> >  		need_ail++;
> >  
> >  	/*
> > @@ -739,6 +773,9 @@ xfs_iflush_done(
> >  			if (INODE_ITEM(blip)->ili_logged &&
> >  			    blip->li_lsn == INODE_ITEM(blip)->ili_flush_lsn)
> >  				mlip_changed |= xfs_ail_delete_one(ailp, blip);
> > +			else {
> > +				xfs_clear_li_failed(blip);
> > +			}
> >  		}
> >  
> >  		if (mlip_changed) {
> > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> > index 442d679..7d62772 100644
> > --- a/fs/xfs/xfs_trans.h
> > +++ b/fs/xfs/xfs_trans.h
> > @@ -49,6 +49,7 @@ typedef struct xfs_log_item {
> >  	struct xfs_ail			*li_ailp;	/* ptr to AIL */
> >  	uint				li_type;	/* item type */
> >  	uint				li_flags;	/* misc flags */
> > +	struct xfs_buf			*li_buf;	/* real buffer pointer */
> >  	struct xfs_log_item		*li_bio_list;	/* buffer item list */
> >  	void				(*li_cb)(struct xfs_buf *,
> >  						 struct xfs_log_item *);
> > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> > index 9056c0f..eba21a9 100644
> > --- a/fs/xfs/xfs_trans_ail.c
> > +++ b/fs/xfs/xfs_trans_ail.c
> > @@ -687,15 +687,15 @@ xfs_trans_ail_update_bulk(
> >  bool
> >  xfs_ail_delete_one(
> >  	struct xfs_ail		*ailp,
> > -	struct xfs_log_item 	*lip)
> > +	struct xfs_log_item	*lip)
> >  {
> >  	struct xfs_log_item	*mlip = xfs_ail_min(ailp);
> >  
> >  	trace_xfs_ail_delete(lip, mlip->li_lsn, lip->li_lsn);
> >  	xfs_ail_delete(ailp, lip);
> > +	xfs_clear_li_failed(lip);
> >  	lip->li_flags &= ~XFS_LI_IN_AIL;
> >  	lip->li_lsn = 0;
> > -
> 
> Unrelated whitespace changes here and above..?
> 

Thanks, I'll fix these style mistakes and submit it with your review tag too.
> Brian
> 
> >  	return mlip == lip;
> >  }
> >  
> > diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
> > index d91706c..b317a36 100644
> > --- a/fs/xfs/xfs_trans_priv.h
> > +++ b/fs/xfs/xfs_trans_priv.h
> > @@ -164,4 +164,35 @@ xfs_trans_ail_copy_lsn(
> >  	*dst = *src;
> >  }
> >  #endif
> > +
> > +static inline void
> > +xfs_clear_li_failed(
> > +	struct xfs_log_item	*lip)
> > +{
> > +	struct xfs_buf	*bp = lip->li_buf;
> > +
> > +	ASSERT(lip->li_flags & XFS_LI_IN_AIL);
> > +	lockdep_assert_held(&lip->li_ailp->xa_lock);
> > +
> > +	if (lip->li_flags & XFS_LI_FAILED) {
> > +		lip->li_flags &= ~XFS_LI_FAILED;
> > +		lip->li_buf = NULL;
> > +		xfs_buf_rele(bp);
> > +	}
> > +}
> > +
> > +static inline void
> > +xfs_set_li_failed(
> > +	struct xfs_log_item	*lip,
> > +	struct xfs_buf		*bp)
> > +{
> > +	lockdep_assert_held(&lip->li_ailp->xa_lock);
> > +
> > +	if (!(lip->li_flags & XFS_LI_FAILED)) {
> > +		xfs_buf_hold(bp);
> > +		lip->li_flags |= XFS_LI_FAILED;
> > +		lip->li_buf = bp;
> > +	}
> > +}
> > +
> >  #endif	/* __XFS_TRANS_PRIV_H__ */
> > -- 
> > 2.9.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Carlos


end of thread, other threads:[~2017-07-20 12:07 UTC | newest]

Thread overview: 7+ messages
2017-07-18 14:54 [PATCH 0/2 V5] Resubmit items failed during writeback Carlos Maiolino
2017-07-18 14:54 ` [PATCH 1/2 V5] xfs: Add infrastructure needed for error propagation during buffer IO failure Carlos Maiolino
2017-07-19 12:01   ` Brian Foster
2017-07-20 12:01     ` Carlos Maiolino
2017-07-18 14:54 ` [PATCH 2/2 V5] xfs: Properly retry failed inode items in case of error during buffer writeback Carlos Maiolino
2017-07-19 12:01   ` Brian Foster
2017-07-20 12:07     ` Carlos Maiolino
