All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC, PATCH 00/10] xfs: configurable error behaviours
@ 2015-08-05 11:08 Dave Chinner
  2015-08-05 11:08 ` [PATCH 01/10] xfs: remove XBF_DONE flag wrapper macros Dave Chinner
                   ` (10 more replies)
  0 siblings, 11 replies; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

Hi folks,

This is my first pass at introducing a framework for configuring the
error handling behaviour of various subsystems in XFS. This is being
driven by the need to handle things like ENOSPC from thin
provisioned block devices in a sane manner, as well as from the mm/
side of the kernel for more configurable memory allocation failure
behaviour.

The patchset introduces the concepts of:

	- "failure speed", which defines the obvious behaviour of
	  the error handlers. "fast" will fail immediately, "slow"
	  will fail after a bound number of retries or timeout, and
	  "never" means exactly that. The current behaviour is
	  "never", and this patchset mostly leaves the default
	  behaviour configured this way.

	- "error class", which is the subsystem or error location
	  the error handlers configure the behaviour for. "metadata"
	  configures async write errors in metadata buffers, "kmem"
	  configures memory allocation failure behaviour and we'll
	  add others as we see fit.

	- "error handler", which is the specific error the
	  configuration applies to. For IO errors, it's obvious that
	  EIO, ENOSPC, ENODEV, etc are errors that need to be
	  handled in different ways. But for the kmem class, the
	  only error is ENOMEM, so while that it is currently
	  configured that way, the error handlers are more likely to
	  refer to failures in transaction context, in buffer
	  allocation context, etc. IOWs, the name and failure
	  context can change, but the same "failure speed"
	  definition still applies.

As a result, the infrastructure is quite flexible - anywhere you
have a struct xfs_mount you can look up an error handling
configuration and apply it appropriately for that context. The
patchset really only implements a metadata buffer async write IO
error context handler, but that is just the tip of the iceberg....

I've tried to make the initialisation as generic and simple as
possible. It's all table based, so all the default values are in one
place and easy to find - you don't have to touch the initialisation
code to modify the default error config, or to add new errors. Only
sysfs handlers need to be added for new error classes and handlers.

The current sysfs layout looks like this:

$ find /sys/fs/xfs/vdc/error
/sys/fs/xfs/vdc/error
/sys/fs/xfs/vdc/error/kmem
/sys/fs/xfs/vdc/error/kmem/ENOMEM
/sys/fs/xfs/vdc/error/kmem/ENOMEM/max_retries
/sys/fs/xfs/vdc/error/kmem/ENOMEM/failure_speed
/sys/fs/xfs/vdc/error/kmem/ENOMEM/fail_at_unmount
/sys/fs/xfs/vdc/error/kmem/ENOMEM/retry_timeout_seconds
/sys/fs/xfs/vdc/error/kmem/Default
/sys/fs/xfs/vdc/error/kmem/Default/max_retries
/sys/fs/xfs/vdc/error/kmem/Default/failure_speed
/sys/fs/xfs/vdc/error/kmem/Default/fail_at_unmount
/sys/fs/xfs/vdc/error/kmem/Default/retry_timeout_seconds
/sys/fs/xfs/vdc/error/metadata
/sys/fs/xfs/vdc/error/metadata/EIO
/sys/fs/xfs/vdc/error/metadata/EIO/max_retries
/sys/fs/xfs/vdc/error/metadata/EIO/failure_speed
/sys/fs/xfs/vdc/error/metadata/EIO/fail_at_unmount
/sys/fs/xfs/vdc/error/metadata/EIO/retry_timeout_seconds
/sys/fs/xfs/vdc/error/metadata/ENODEV
/sys/fs/xfs/vdc/error/metadata/ENODEV/max_retries
/sys/fs/xfs/vdc/error/metadata/ENODEV/failure_speed
/sys/fs/xfs/vdc/error/metadata/ENODEV/fail_at_unmount
/sys/fs/xfs/vdc/error/metadata/ENODEV/retry_timeout_seconds
/sys/fs/xfs/vdc/error/metadata/ENOSPC
/sys/fs/xfs/vdc/error/metadata/ENOSPC/max_retries
/sys/fs/xfs/vdc/error/metadata/ENOSPC/failure_speed
/sys/fs/xfs/vdc/error/metadata/ENOSPC/fail_at_unmount
/sys/fs/xfs/vdc/error/metadata/ENOSPC/retry_timeout_seconds
/sys/fs/xfs/vdc/error/metadata/Default
/sys/fs/xfs/vdc/error/metadata/Default/max_retries
/sys/fs/xfs/vdc/error/metadata/Default/failure_speed
/sys/fs/xfs/vdc/error/metadata/Default/fail_at_unmount
/sys/fs/xfs/vdc/error/metadata/Default/retry_timeout_seconds
$

Which shows the classes and error handlers, and how different
classes can have different error handlers.

The patchset runs through xfstests without problems on the default
configuration it sets. Note that the default config for the metadata
class now sets "fail at unmount" so errors that have been retries
indefinitely will fail at unmount and trigger a shutdown at that
point. This is preferable to hanging unmount as currently happens.

Anyway, have a play, have a poke, and let me know what you think...

Cheers,

Dave.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 01/10] xfs: remove XBF_DONE flag wrapper macros
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-05 11:08 ` [PATCH 02/10] xfs: configurable error behaviour via sysfs Dave Chinner
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

They only set/clear/check a flag, no need for obfuscating this
with a macro.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_buf.c         | 2 +-
 fs/xfs/xfs_buf.h         | 4 ----
 fs/xfs/xfs_buf_item.c    | 6 +++---
 fs/xfs/xfs_inode.c       | 2 +-
 fs/xfs/xfs_log.c         | 4 ++--
 fs/xfs/xfs_log_recover.c | 2 +-
 fs/xfs/xfs_mount.c       | 2 +-
 fs/xfs/xfs_trans_buf.c   | 4 ++--
 8 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 004b45c..6252a0b 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -644,7 +644,7 @@ xfs_buf_read_map(
 	if (bp) {
 		trace_xfs_buf_read(bp, flags, _RET_IP_);
 
-		if (!XFS_BUF_ISDONE(bp)) {
+		if (!(bp->b_flags & XBF_DONE)) {
 			XFS_STATS_INC(xb_get_read);
 			bp->b_ops = ops;
 			_xfs_buf_read(bp, flags);
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 331c1cc..68c8f2d 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -319,10 +319,6 @@ void xfs_buf_stale(struct xfs_buf *bp);
 #define XFS_BUF_UNSTALE(bp)	((bp)->b_flags &= ~XBF_STALE)
 #define XFS_BUF_ISSTALE(bp)	((bp)->b_flags & XBF_STALE)
 
-#define XFS_BUF_DONE(bp)	((bp)->b_flags |= XBF_DONE)
-#define XFS_BUF_UNDONE(bp)	((bp)->b_flags &= ~XBF_DONE)
-#define XFS_BUF_ISDONE(bp)	((bp)->b_flags & XBF_DONE)
-
 #define XFS_BUF_ASYNC(bp)	((bp)->b_flags |= XBF_ASYNC)
 #define XFS_BUF_UNASYNC(bp)	((bp)->b_flags &= ~XBF_ASYNC)
 #define XFS_BUF_ISASYNC(bp)	((bp)->b_flags & XBF_ASYNC)
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 1816334..ee63961 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -470,7 +470,7 @@ xfs_buf_item_unpin(
 		xfs_buf_hold(bp);
 		bp->b_flags |= XBF_ASYNC;
 		xfs_buf_ioerror(bp, -EIO);
-		XFS_BUF_UNDONE(bp);
+		bp->b_flags &= ~XBF_DONE;
 		xfs_buf_stale(bp);
 		xfs_buf_ioend(bp);
 	}
@@ -949,7 +949,7 @@ xfs_buf_iodone_callbacks(
 	 */
 	if (XFS_FORCED_SHUTDOWN(mp)) {
 		xfs_buf_stale(bp);
-		XFS_BUF_DONE(bp);
+		bp->b_flags |= XBF_DONE;
 		trace_xfs_buf_item_iodone(bp, _RET_IP_);
 		goto do_callbacks;
 	}
@@ -995,7 +995,7 @@ xfs_buf_iodone_callbacks(
 	 * sure to return the error to the caller of xfs_bwrite().
 	 */
 	xfs_buf_stale(bp);
-	XFS_BUF_DONE(bp);
+	bp->b_flags |= XBF_DONE;
 
 	trace_xfs_buf_error_relse(bp, _RET_IP_);
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 4156e37..b42aa39 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3260,7 +3260,7 @@ cluster_corrupt_out:
 		 * mark it as stale and brelse.
 		 */
 		if (bp->b_iodone) {
-			XFS_BUF_UNDONE(bp);
+			bp->b_flags &= ~XBF_DONE;
 			xfs_buf_stale(bp);
 			xfs_buf_ioerror(bp, -EIO);
 			xfs_buf_ioend(bp);
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 6b5a84a..a3c16b9 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -3906,7 +3906,7 @@ xfs_log_force_umount(
 	    log->l_flags & XLOG_ACTIVE_RECOVERY) {
 		mp->m_flags |= XFS_MOUNT_FS_SHUTDOWN;
 		if (mp->m_sb_bp)
-			XFS_BUF_DONE(mp->m_sb_bp);
+			mp->m_sb_bp->b_flags |= XBF_DONE;
 		return 0;
 	}
 
@@ -3936,7 +3936,7 @@ xfs_log_force_umount(
 	spin_lock(&log->l_icloglock);
 	mp->m_flags |= XFS_MOUNT_FS_SHUTDOWN;
 	if (mp->m_sb_bp)
-		XFS_BUF_DONE(mp->m_sb_bp);
+		mp->m_sb_bp->b_flags |= XBF_DONE;
 
 	/*
 	 * Mark the log and the iclogs with IO error flags to prevent any
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 6f83d12..78abbe45 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -4488,7 +4488,7 @@ xlog_do_recover(
 	 * updates, re-read in the superblock and reverify it.
 	 */
 	bp = xfs_getsb(log->l_mp, 0);
-	XFS_BUF_UNDONE(bp);
+	bp->b_flags &= ~XBF_DONE;
 	ASSERT(!(XFS_BUF_ISWRITE(bp)));
 	XFS_BUF_READ(bp);
 	XFS_BUF_UNASYNC(bp);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 461e791..a094e47 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1261,7 +1261,7 @@ xfs_getsb(
 	}
 
 	xfs_buf_hold(bp);
-	ASSERT(XFS_BUF_ISDONE(bp));
+	ASSERT(bp->b_flags & XBF_DONE);
 	return bp;
 }
 
diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c
index 7579841..ed6f61f 100644
--- a/fs/xfs/xfs_trans_buf.c
+++ b/fs/xfs/xfs_trans_buf.c
@@ -155,7 +155,7 @@ xfs_trans_get_buf_map(
 		ASSERT(xfs_buf_islocked(bp));
 		if (XFS_FORCED_SHUTDOWN(tp->t_mountp)) {
 			xfs_buf_stale(bp);
-			XFS_BUF_DONE(bp);
+			bp->b_flags |= XBF_DONE;
 		}
 
 		ASSERT(bp->b_transp == tp);
@@ -518,7 +518,7 @@ xfs_trans_log_buf(xfs_trans_t	*tp,
 	 * inside the b_bdstrat callback so that this won't get written to
 	 * disk.
 	 */
-	XFS_BUF_DONE(bp);
+	bp->b_flags |= XBF_DONE;
 
 	ASSERT(atomic_read(&bip->bli_refcount) > 0);
 	bp->b_iodone = xfs_buf_iodone_callbacks;
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 02/10] xfs: configurable error behaviour via sysfs
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
  2015-08-05 11:08 ` [PATCH 01/10] xfs: remove XBF_DONE flag wrapper macros Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-05 11:08 ` [PATCH 03/10] xfs: introduce metadata IO error class Dave Chinner
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

We need to be able to change the way XFS behaviours in error
conditions depending on the type of underlying storage. This is
necessary for handling non-traditional block devices with extended
error cases, such as thin provisioned devices that can return ENOSPC
as an IO error.

Introduce the basic sysfs infrastructure needed to define and
configure error behaviours. This is done to be generic enough to
extend to configuring behaviour in other error conditions, such as
ENOMEM, which also has different desired behaviours according to
machine configuration.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_mount.c |  9 +++++-
 fs/xfs/xfs_mount.h | 27 +++++++++++++++++
 fs/xfs/xfs_sysfs.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_sysfs.h |  3 ++
 4 files changed, 123 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index a094e47..4245b7f3 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -693,10 +693,14 @@ xfs_mountfs(
 	if (error)
 		goto out;
 
-	error = xfs_uuid_mount(mp);
+	error = xfs_error_sysfs_init(mp);
 	if (error)
 		goto out_remove_sysfs;
 
+	error = xfs_uuid_mount(mp);
+	if (error)
+		goto out_remove_error_sysfs;
+
 	/*
 	 * Set the minimum read and write sizes
 	 */
@@ -967,6 +971,8 @@ xfs_mountfs(
 	xfs_da_unmount(mp);
  out_remove_uuid:
 	xfs_uuid_unmount(mp);
+ out_remove_error_sysfs:
+	xfs_error_sysfs_del(mp);
  out_remove_sysfs:
 	xfs_sysfs_del(&mp->m_kobj);
  out:
@@ -1052,6 +1058,7 @@ xfs_unmountfs(
 #endif
 	xfs_free_perag(mp);
 
+	xfs_error_sysfs_del(mp);
 	xfs_sysfs_del(&mp->m_kobj);
 }
 
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 7999e91..e51c63c 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -37,6 +37,31 @@ enum {
 	XFS_LOWSP_MAX,
 };
 
+/*
+ * Error Configuration
+ *
+ * Error classes define the subsystem the configuration belongs to.
+ * Error numbers define the errors that are configurable.
+ * Failure types describe when an error is considered fatal.
+ */
+enum {
+	XFS_ERR_CLASS_MAX,
+};
+enum {
+	XFS_ERR_ERRNO_MAX,
+};
+enum {
+	XFS_ERR_FAIL_DEFAULT,
+	XFS_ERR_FAIL_NEVER,
+	XFS_ERR_FAIL_SLOW,
+	XFS_ERR_FAIL_FAST,
+};
+
+struct xfs_error_cfg {
+	struct xfs_kobj	kobj;
+	int		fail_speed;
+};
+
 typedef struct xfs_mount {
 	struct super_block	*m_super;
 	xfs_tid_t		m_tid;		/* next unused tid for fs */
@@ -127,6 +152,8 @@ typedef struct xfs_mount {
 	int64_t			m_low_space[XFS_LOWSP_MAX];
 						/* low free space thresholds */
 	struct xfs_kobj		m_kobj;
+	struct xfs_kobj		m_error_kobj;
+	struct xfs_error_cfg	m_error_cfg[XFS_ERR_CLASS_MAX][XFS_ERR_ERRNO_MAX];
 
 	struct workqueue_struct *m_buf_workqueue;
 	struct workqueue_struct	*m_data_workqueue;
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index aa03670..017dcfb 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -17,10 +17,14 @@
  */
 
 #include "xfs.h"
-#include "xfs_sysfs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
 #include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_sysfs.h"
 #include "xfs_log.h"
 #include "xfs_log_priv.h"
+#include "xfs_mount.h"
 
 struct xfs_sysfs_attr {
 	struct attribute attr;
@@ -237,3 +241,83 @@ struct kobj_type xfs_log_ktype = {
 	.sysfs_ops = &xfs_log_ops,
 	.default_attrs = xfs_log_attrs,
 };
+
+/*
+ * Metadata IO error configuration
+ *
+ * The sysfs structure here is:
+ *	...xfs/<dev>/error/<class>/<errno>/<error_attrs>
+ *
+ * where <class> allows use to discriminate between data IO and metadata IO,
+ * and any other future type of IO (e.g. special inode or directory error
+ * handling) we care to support.
+ */
+static struct attribute *xfs_error_attrs[] = {
+	NULL,
+};
+
+static inline struct xfs_error_cfg *
+to_error_cfg(struct kobject *kobject)
+{
+	struct xfs_kobj *kobj = to_kobj(kobject);
+	return container_of(kobj, struct xfs_error_cfg, kobj);
+}
+
+static ssize_t
+xfs_error_show(
+	struct kobject		*kobject,
+	struct attribute	*attr,
+	char			*buf)
+{
+	struct xfs_error_cfg *cfg = to_error_cfg(kobject);
+	struct xfs_sysfs_attr *xfs_attr = to_attr(attr);
+
+	return xfs_attr->show ? xfs_attr->show(buf, cfg) : 0;
+}
+
+static ssize_t
+xfs_error_store(
+	struct kobject		*kobject,
+	struct attribute	*attr,
+	const char		*buf,
+	size_t			count)
+{
+	struct xfs_error_cfg *cfg = to_error_cfg(kobject);
+	struct xfs_sysfs_attr *xfs_attr = to_attr(attr);
+
+	return xfs_attr->store ? xfs_attr->store(buf, count, cfg) : 0;
+}
+
+static struct sysfs_ops xfs_error_ops = {
+	.show = xfs_error_show,
+	.store = xfs_error_store,
+};
+
+struct kobj_type xfs_error_cfg_ktype = {
+	.release = xfs_sysfs_release,
+	.sysfs_ops = &xfs_error_ops,
+	.default_attrs = xfs_error_attrs,
+};
+
+struct kobj_type xfs_error_ktype = {
+	.release = xfs_sysfs_release,
+};
+
+int
+xfs_error_sysfs_init(
+	struct xfs_mount	*mp)
+{
+	int			error;
+
+	/* .../xfs/<dev>/error/ */
+	error = xfs_sysfs_init(&mp->m_error_kobj, &xfs_error_ktype,
+				&mp->m_kobj, "error");
+	return error;
+}
+
+void
+xfs_error_sysfs_del(
+	struct xfs_mount	*mp)
+{
+	xfs_sysfs_del(&mp->m_error_kobj);
+}
diff --git a/fs/xfs/xfs_sysfs.h b/fs/xfs/xfs_sysfs.h
index 240eee3..1f662d8 100644
--- a/fs/xfs/xfs_sysfs.h
+++ b/fs/xfs/xfs_sysfs.h
@@ -57,4 +57,7 @@ xfs_sysfs_del(
 	wait_for_completion(&kobj->complete);
 }
 
+int	xfs_error_sysfs_init(struct xfs_mount *mp);
+void	xfs_error_sysfs_del(struct xfs_mount *mp);
+
 #endif	/* __XFS_SYSFS_H__ */
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 03/10] xfs: introduce metadata IO error class
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
  2015-08-05 11:08 ` [PATCH 01/10] xfs: remove XBF_DONE flag wrapper macros Dave Chinner
  2015-08-05 11:08 ` [PATCH 02/10] xfs: configurable error behaviour via sysfs Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-05 11:08 ` [PATCH 04/10] xfs: add configurable error support to metadata buffers Dave Chinner
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

Now we have the basic infrastructure, add the first error class so
we can build up the infrastructure in a meaningful way. Add the
metadata async write IO error class and sysfs entry, and introduce a
default configuration that matches the existing behaviour of
XFS_ERR_FAIL_NEVER for async write metadata buffers.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_mount.h |  3 +++
 fs/xfs/xfs_sysfs.c | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index e51c63c..753060f 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -45,9 +45,11 @@ enum {
  * Failure types describe when an error is considered fatal.
  */
 enum {
+	XFS_ERR_METADATA,
 	XFS_ERR_CLASS_MAX,
 };
 enum {
+	XFS_ERR_DEFAULT,
 	XFS_ERR_ERRNO_MAX,
 };
 enum {
@@ -153,6 +155,7 @@ typedef struct xfs_mount {
 						/* low free space thresholds */
 	struct xfs_kobj		m_kobj;
 	struct xfs_kobj		m_error_kobj;
+	struct xfs_kobj		m_error_meta_kobj;
 	struct xfs_error_cfg	m_error_cfg[XFS_ERR_CLASS_MAX][XFS_ERR_ERRNO_MAX];
 
 	struct workqueue_struct *m_buf_workqueue;
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index 017dcfb..cb7ced0 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -307,11 +307,34 @@ int
 xfs_error_sysfs_init(
 	struct xfs_mount	*mp)
 {
+	struct xfs_error_cfg	*cfg;
 	int			error;
 
 	/* .../xfs/<dev>/error/ */
 	error = xfs_sysfs_init(&mp->m_error_kobj, &xfs_error_ktype,
 				&mp->m_kobj, "error");
+	if (error)
+		return error;
+
+	/* .../xfs/<dev>/error/metadata/ */
+	error = xfs_sysfs_init(&mp->m_error_meta_kobj, &xfs_error_ktype,
+				&mp->m_error_kobj, "metadata");
+	if (error)
+		goto out_error;
+
+	cfg = &mp->m_error_cfg[XFS_ERR_METADATA][XFS_ERR_DEFAULT];
+	error = xfs_sysfs_init(&cfg->kobj, &xfs_error_cfg_ktype,
+				&mp->m_error_meta_kobj, "Default");
+	if (error)
+		goto out_error_meta;
+	cfg->fail_speed = XFS_ERR_FAIL_NEVER;
+
+	return 0;
+
+out_error_meta:
+	xfs_sysfs_del(&mp->m_error_meta_kobj);
+out_error:
+	xfs_sysfs_del(&mp->m_error_kobj);
 	return error;
 }
 
@@ -319,5 +342,16 @@ void
 xfs_error_sysfs_del(
 	struct xfs_mount	*mp)
 {
+	struct xfs_error_cfg	*cfg;
+	int			i, j;
+
+	for (i = 0; i < XFS_ERR_CLASS_MAX; i++) {
+		for (j = 0; j < XFS_ERR_ERRNO_MAX; j++) {
+			cfg = &mp->m_error_cfg[i][j];
+
+			xfs_sysfs_del(&cfg->kobj);
+		}
+	}
+	xfs_sysfs_del(&mp->m_error_meta_kobj);
 	xfs_sysfs_del(&mp->m_error_kobj);
 }
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 04/10] xfs: add configurable error support to metadata buffers
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
                   ` (2 preceding siblings ...)
  2015-08-05 11:08 ` [PATCH 03/10] xfs: introduce metadata IO error class Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-11 14:18   ` Brian Foster
  2015-08-05 11:08 ` [PATCH 05/10] xfs: introduce table-based init for error behaviours Dave Chinner
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

With the error configuration handle for async metadata write errors
in place, we can now add initial support to the IO error processing
in xfs_buf_iodone_error().

Add an infrastructure function to look up the configuration handle,
and rearrange the error handling to prepare the way for different
error handling conigurations to be used.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_buf.h      |   1 +
 fs/xfs/xfs_buf_item.c | 112 ++++++++++++++++++++++++++++++--------------------
 fs/xfs/xfs_mount.h    |   3 ++
 fs/xfs/xfs_sysfs.c    |  17 ++++++++
 4 files changed, 88 insertions(+), 45 deletions(-)

diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 68c8f2d..ed0ea41 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -181,6 +181,7 @@ typedef struct xfs_buf {
 	unsigned int		b_page_count;	/* size of page array */
 	unsigned int		b_offset;	/* page offset in first page */
 	int			b_error;	/* error code on I/O */
+	int			b_last_error;	/* previous async I/O error */
 	const struct xfs_buf_ops	*b_ops;
 
 #ifdef XFS_BUF_LOCK_TRACKING
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index ee63961..a09ae26 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -924,35 +924,22 @@ xfs_buf_do_callbacks(
 	}
 }
 
-/*
- * This is the iodone() function for buffers which have had callbacks
- * attached to them by xfs_buf_attach_iodone().  It should remove each
- * log item from the buffer's list and call the callback of each in turn.
- * When done, the buffer's fsprivate field is set to NULL and the buffer
- * is unlocked with a call to iodone().
- */
-void
-xfs_buf_iodone_callbacks(
+static bool
+xfs_buf_iodone_callback_error(
 	struct xfs_buf		*bp)
 {
 	struct xfs_log_item	*lip = bp->b_fspriv;
 	struct xfs_mount	*mp = lip->li_mountp;
 	static ulong		lasttime;
 	static xfs_buftarg_t	*lasttarg;
-
-	if (likely(!bp->b_error))
-		goto do_callbacks;
+	struct xfs_error_cfg	*cfg;
 
 	/*
 	 * If we've already decided to shutdown the filesystem because of
 	 * I/O errors, there's no point in giving this a retry.
 	 */
-	if (XFS_FORCED_SHUTDOWN(mp)) {
-		xfs_buf_stale(bp);
-		bp->b_flags |= XBF_DONE;
-		trace_xfs_buf_item_iodone(bp, _RET_IP_);
-		goto do_callbacks;
-	}
+	if (XFS_FORCED_SHUTDOWN(mp))
+		goto out_stale;
 
 	if (bp->b_target != lasttarg ||
 	    time_after(jiffies, (lasttime + 5*HZ))) {
@@ -961,45 +948,80 @@ xfs_buf_iodone_callbacks(
 	}
 	lasttarg = bp->b_target;
 
+	/* synchronous writes will have callers process the error */
+	if (!(bp->b_flags & XBF_ASYNC))
+		goto out_stale;
+
+	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
+	ASSERT(bp->b_iodone != NULL);
+
 	/*
 	 * If the write was asynchronous then no one will be looking for the
-	 * error.  Clear the error state and write the buffer out again.
-	 *
-	 * XXX: This helps against transient write errors, but we need to find
-	 * a way to shut the filesystem down if the writes keep failing.
-	 *
-	 * In practice we'll shut the filesystem down soon as non-transient
-	 * errors tend to affect the whole device and a failing log write
-	 * will make us give up.  But we really ought to do better here.
+	 * error.  If this is the first failure of this type, clear the error
+	 * state and write the buffer out again. This means we always retry an
+	 * async write failure at least once, but we also need to set the buffer
+	 * up to behave correctly now for repeated failures.
 	 */
-	if (XFS_BUF_ISASYNC(bp)) {
-		ASSERT(bp->b_iodone != NULL);
-
-		trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
+	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);
+	if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL)) ||
+	     bp->b_last_error != bp->b_error) {
+		bp->b_flags |= (XBF_WRITE | XBF_ASYNC |
+			        XBF_DONE | XBF_WRITE_FAIL);
+		bp->b_last_error = bp->b_error;
+		xfs_buf_ioerror(bp, 0);
+		xfs_buf_submit(bp);
+		return true;
+	}
 
-		xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the flag */
+	/*
+	 * Repeated failure on an async write. Take action according to the
+	 * error configuration we have been set up to use.
+	 */
+	if (cfg->fail_speed == XFS_ERR_FAIL_FAST)
+		goto permanent_error;
 
-		if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL))) {
-			bp->b_flags |= XBF_WRITE | XBF_ASYNC |
-				       XBF_DONE | XBF_WRITE_FAIL;
-			xfs_buf_submit(bp);
-		} else {
-			xfs_buf_relse(bp);
-		}
-
-		return;
-	}
+	/* still a transient error, higher layers will retry */
+	xfs_buf_ioerror(bp, 0);
+	xfs_buf_relse(bp);
+	return true;
 
 	/*
-	 * If the write of the buffer was synchronous, we want to make
-	 * sure to return the error to the caller of xfs_bwrite().
+	 * Permanent error - we need to trigger a shutdown if we haven't already
+	 * to indicate that inconsistency will result from this action.
 	 */
+permanent_error:
+	xfs_force_shutdown(mp, SHUTDOWN_META_IO_ERROR);
+out_stale:
 	xfs_buf_stale(bp);
 	bp->b_flags |= XBF_DONE;
-
 	trace_xfs_buf_error_relse(bp, _RET_IP_);
+	return false;
+}
+
+/*
+ * This is the iodone() function for buffers which have had callbacks attached
+ * to them by xfs_buf_attach_iodone(). We need to iterate the items on the
+ * callback list, mark the buffer as having no more callbacks and then push the
+ * buffer through IO completion processing.
+ */
+void
+xfs_buf_iodone_callbacks(
+	struct xfs_buf		*bp)
+{
+	/*
+	 * If there is an error, process it. Some errors require us
+	 * to run callbacks after failure processing is done so we
+	 * detect that and take appropriate action.
+	 */
+	if (bp->b_error && xfs_buf_iodone_callback_error(bp))
+		return;
+
+	/*
+	 * Successful IO or permanent error. Either way, we can clear the
+	 * retry state here in preparation for the next error that may occur.
+	 */
+	bp->b_last_error = 0;
 
-do_callbacks:
 	xfs_buf_do_callbacks(bp);
 	bp->b_fspriv = NULL;
 	bp->b_iodone = NULL;
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 753060f..21caa5a 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -366,4 +366,7 @@ extern int	xfs_dev_is_read_only(struct xfs_mount *, char *);
 
 extern void	xfs_set_low_space_thresholds(struct xfs_mount *);
 
+struct xfs_error_cfg * xfs_error_get_cfg(struct xfs_mount *mp,
+		int error_class, int error);
+
 #endif	/* __XFS_MOUNT_H__ */
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index cb7ced0..3667d33 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -355,3 +355,20 @@ xfs_error_sysfs_del(
 	xfs_sysfs_del(&mp->m_error_meta_kobj);
 	xfs_sysfs_del(&mp->m_error_kobj);
 }
+
+struct xfs_error_cfg *
+xfs_error_get_cfg(
+	struct xfs_mount	*mp,
+	int			error_class,
+	int			error)
+{
+	struct xfs_error_cfg	*cfg;
+
+	switch (error) {
+	default:
+		cfg = &mp->m_error_cfg[error_class][XFS_ERR_DEFAULT];
+		break;
+	}
+
+	return cfg;
+}
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 05/10] xfs: introduce table-based init for error behaviours
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
                   ` (3 preceding siblings ...)
  2015-08-05 11:08 ` [PATCH 04/10] xfs: add configurable error support to metadata buffers Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-11 14:19   ` Brian Foster
  2015-08-05 11:08 ` [PATCH 06/10] xfs: add configuration of error failure speed Dave Chinner
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

Before we start expanding the number of error classes and errors we
can configure behaviour for, we need a simple and clear way to
define the default behaviour that we initialised each mount with.
Introduce a table based method for keeping the initial configuration
in, and apply that to the existing initialisation code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_sysfs.c | 72 +++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 60 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index 3667d33..9d66095 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -303,11 +303,67 @@ struct kobj_type xfs_error_ktype = {
 	.release = xfs_sysfs_release,
 };
 
+/*
+ * Error initialisation tables. These need to be ordered in the same
+ * order as the enums used to index the array. All class init tables need to
+ * define a "default" behaviour as the first entry, all other entries can be
+ * empty.
+ */
+struct xfs_error_init {
+	char		*name;
+	int		fail_speed;
+};
+
+static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
+	{ .name = "Default",
+	  .fail_speed = XFS_ERR_FAIL_NEVER,
+	},
+};
+
+static int
+xfs_error_sysfs_init_class(
+	struct xfs_mount	*mp,
+	int			class,
+	const char		*parent_name,
+	struct xfs_kobj		*parent_kobj,
+	const struct xfs_error_init init[])
+{
+	struct xfs_error_cfg	*cfg;
+	int			error;
+	int			i;
+
+	ASSERT(class < XFS_ERR_CLASS_MAX);
+
+	error = xfs_sysfs_init(parent_kobj, &xfs_error_ktype,
+				&mp->m_error_kobj, parent_name);
+	if (error)
+		return error;
+
+	for (i = 0; i < XFS_ERR_ERRNO_MAX; i++) {
+		cfg = &mp->m_error_cfg[class][i];
+		error = xfs_sysfs_init(&cfg->kobj, &xfs_error_cfg_ktype,
+					parent_kobj, init[i].name);
+		if (error)
+			goto out_error;
+
+		cfg->fail_speed = init[i].fail_speed;
+	}
+	return 0;
+
+out_error:
+	/* unwind the entries that succeeded */
+	for (i--; i >= 0; i--) {
+		cfg = &mp->m_error_cfg[class][i];
+		xfs_sysfs_del(&cfg->kobj);
+	}
+	xfs_sysfs_del(parent_kobj);
+	return error;
+}
+
 int
 xfs_error_sysfs_init(
 	struct xfs_mount	*mp)
 {
-	struct xfs_error_cfg	*cfg;
 	int			error;
 
 	/* .../xfs/<dev>/error/ */
@@ -317,22 +373,14 @@ xfs_error_sysfs_init(
 		return error;
 
 	/* .../xfs/<dev>/error/metadata/ */
-	error = xfs_sysfs_init(&mp->m_error_meta_kobj, &xfs_error_ktype,
-				&mp->m_error_kobj, "metadata");
+	error = xfs_error_sysfs_init_class(mp, XFS_ERR_METADATA,
+				"metadata", &mp->m_error_meta_kobj,
+				xfs_error_meta_init);
 	if (error)
 		goto out_error;
 
-	cfg = &mp->m_error_cfg[XFS_ERR_METADATA][XFS_ERR_DEFAULT];
-	error = xfs_sysfs_init(&cfg->kobj, &xfs_error_cfg_ktype,
-				&mp->m_error_meta_kobj, "Default");
-	if (error)
-		goto out_error_meta;
-	cfg->fail_speed = XFS_ERR_FAIL_NEVER;
-
 	return 0;
 
-out_error_meta:
-	xfs_sysfs_del(&mp->m_error_meta_kobj);
 out_error:
 	xfs_sysfs_del(&mp->m_error_kobj);
 	return error;
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 06/10] xfs: add configuration of error failure speed
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
                   ` (4 preceding siblings ...)
  2015-08-05 11:08 ` [PATCH 05/10] xfs: introduce table-based init for error behaviours Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-11 14:19   ` Brian Foster
  2015-08-05 11:08 ` [PATCH 07/10] xfs: add "fail at unmount" error handling configuration Dave Chinner
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

On reception of an error, we can fail immediately, perform some
bound amount of retries or retry indefinitely. The current behaviour
we have is to retry forever.

However, we'd like the ability to choose what behaviour we have, and
that requires the ability to configure the behaviour through the new
sysfs interfaces. Add configuration options for fail fast, slow or
never to reflect the three choices above. Fail fast or fail never
don't require any other options, but "fail slow" needs configuration
to bound the retry behaviour. Add both a maximum retry count and a
retry timeout so that we can bound by time and/or physical IO
attempts.

Finally, plumb these into xfs_buf_iodone error processing so that
the error behaviour follows the selected configuration.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_buf.h      |  23 +++++++++-
 fs/xfs/xfs_buf_item.c |  22 +++++++++-
 fs/xfs/xfs_mount.h    |   2 +
 fs/xfs/xfs_sysfs.c    | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 164 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index ed0ea41..afc2d2b 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -181,7 +181,28 @@ typedef struct xfs_buf {
 	unsigned int		b_page_count;	/* size of page array */
 	unsigned int		b_offset;	/* page offset in first page */
 	int			b_error;	/* error code on I/O */
-	int			b_last_error;	/* previous async I/O error */
+
+	/*
+	 * async write failure retry count. Initialised to zero on the first
+	 * failure, then when it exceeds the maximum configured without a
+	 * success the write is considered to be failed permanently and the
+	 * iodone handler will take appropriate action.
+	 *
+	 * For retry timeouts, we record the jiffie of the first failure. This
+	 * means that we can change the retry timeout and it on the next error
+	 * it will be checked against the newly configured timeout. This
+	 * prevents buffers getting stuck in retry loops with a really long
+	 * timeout.
+	 *
+	 * last_error is used to ensure that we are getting repeated errors, not
+	 * different errors. e.g. a block device might change ENOSPC to EIO when
+	 * a failure timeout occurs, so we want to re-initialise the error
+	 * retry behaviour appropriately when that happens.
+	 */
+	int			b_retries;
+	unsigned long		b_first_retry_time; /* in jiffies */
+	int			b_last_error;
+
 	const struct xfs_buf_ops	*b_ops;
 
 #ifdef XFS_BUF_LOCK_TRACKING
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index a09ae26..c785698 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -968,6 +968,9 @@ xfs_buf_iodone_callback_error(
 		bp->b_flags |= (XBF_WRITE | XBF_ASYNC |
 			        XBF_DONE | XBF_WRITE_FAIL);
 		bp->b_last_error = bp->b_error;
+		bp->b_retries = 0;
+		bp->b_first_retry_time = jiffies;
+
 		xfs_buf_ioerror(bp, 0);
 		xfs_buf_submit(bp);
 		return true;
@@ -977,9 +980,25 @@ xfs_buf_iodone_callback_error(
 	 * Repeated failure on an async write. Take action according to the
 	 * error configuration we have been set up to use.
 	 */
-	if (cfg->fail_speed == XFS_ERR_FAIL_FAST)
+	switch (cfg->fail_speed) {
+	case XFS_ERR_FAIL_FAST:
 		goto permanent_error;
 
+	case XFS_ERR_FAIL_SLOW:
+		if (++bp->b_retries > cfg->max_retries)
+			goto permanent_error;
+		if (!cfg->retry_timeout)
+			break;
+		if (time_after(jiffies,
+			       cfg->retry_timeout + bp->b_first_retry_time))
+			goto permanent_error;
+		break;
+
+	case XFS_ERR_FAIL_NEVER:
+	default:
+		break;
+	}
+
 	/* still a transient error, higher layers will retry */
 	xfs_buf_ioerror(bp, 0);
 	xfs_buf_relse(bp);
@@ -1021,6 +1040,7 @@ xfs_buf_iodone_callbacks(
 	 * retry state here in preparation for the next error that may occur.
 	 */
 	bp->b_last_error = 0;
+	bp->b_retries = 0;
 
 	xfs_buf_do_callbacks(bp);
 	bp->b_fspriv = NULL;
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 21caa5a..a684a72b 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -62,6 +62,8 @@ enum {
 struct xfs_error_cfg {
 	struct xfs_kobj	kobj;
 	int		fail_speed;
+	int		max_retries;	/* INT_MAX = retry forever */
+	unsigned long	retry_timeout;	/* in jiffies, 0 = no timeout */
 };
 
 typedef struct xfs_mount {
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index 9d66095..1f078e1 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -252,7 +252,119 @@ struct kobj_type xfs_log_ktype = {
  * and any other future type of IO (e.g. special inode or directory error
  * handling) we care to support.
  */
+static ssize_t
+failure_speed_store(
+	const char	*buf,
+	size_t		count,
+	void		*data)
+{
+	struct xfs_error_cfg *cfg = data;
+	char		*str = kstrdup(buf, GFP_KERNEL);
+	char		*sp;
+	int		len;
+
+	if (!str)
+		return -ENOMEM;
+
+	sp = strstrip(str);
+	len = strlen(sp);
+	if (strncmp(sp, "fast", len) == 0)
+		cfg->fail_speed = XFS_ERR_FAIL_FAST;
+	else if (strncmp(sp, "slow", len) == 0)
+		cfg->fail_speed = XFS_ERR_FAIL_SLOW;
+	else if (strncmp(sp, "never", len) == 0)
+		cfg->fail_speed = XFS_ERR_FAIL_NEVER;
+	else
+		count = -EINVAL;
+	kfree(str);
+	return count;
+}
+
+static ssize_t
+failure_speed_show(
+	char	*buf,
+	void	*data)
+{
+	struct xfs_error_cfg *cfg = data;
+
+	if (cfg->fail_speed == XFS_ERR_FAIL_FAST)
+		return snprintf(buf, PAGE_SIZE, "[fast] slow never\n");
+	if (cfg->fail_speed == XFS_ERR_FAIL_SLOW)
+		return snprintf(buf, PAGE_SIZE, "fast [slow] never\n");
+	return snprintf(buf, PAGE_SIZE, "fast slow [never]\n");
+}
+XFS_SYSFS_ATTR_RW(failure_speed);
+
+static ssize_t
+max_retries_store(
+	const char	*buf,
+	size_t		count,
+	void		*data)
+{
+	struct xfs_error_cfg *cfg = data;
+	int		ret;
+	int		val;
+
+	ret = kstrtoint(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	if (val < 0 || val > INT_MAX)
+		return -EINVAL;
+
+	cfg->max_retries = val;
+	return count;
+}
+
+static ssize_t
+max_retries_show(
+	char	*buf,
+	void	*data)
+{
+	struct xfs_error_cfg *cfg = data;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", cfg->max_retries);
+}
+XFS_SYSFS_ATTR_RW(max_retries);
+
+static ssize_t
+retry_timeout_seconds_store(
+	const char	*buf,
+	size_t		count,
+	void		*data)
+{
+	struct xfs_error_cfg *cfg = data;
+	int		ret;
+	int		val;
+
+	ret = kstrtoint(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	/* 1 day timeout maximum */
+	if (val < 0 || val > 86400)
+		return -EINVAL;
+
+	cfg->retry_timeout = msecs_to_jiffies(val * MSEC_PER_SEC);
+	return count;
+}
+
+static ssize_t
+retry_timeout_seconds_show(
+	char	*buf,
+	void	*data)
+{
+	struct xfs_error_cfg *cfg = data;
+
+	return snprintf(buf, PAGE_SIZE, "%ld\n", 
+			jiffies_to_msecs(cfg->retry_timeout) * MSEC_PER_SEC);
+}
+XFS_SYSFS_ATTR_RW(retry_timeout_seconds);
+
 static struct attribute *xfs_error_attrs[] = {
+	ATTR_LIST(failure_speed),
+	ATTR_LIST(max_retries),
+	ATTR_LIST(retry_timeout_seconds),
 	NULL,
 };
 
@@ -312,11 +424,15 @@ struct kobj_type xfs_error_ktype = {
 struct xfs_error_init {
 	char		*name;
 	int		fail_speed;
+	int		max_retries;
+	int		retry_timeout;	/* in seconds */
 };
 
 static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
 	{ .name = "Default",
 	  .fail_speed = XFS_ERR_FAIL_NEVER,
+	  .max_retries = INT_MAX,
+	  .retry_timeout = 0,
 	},
 };
 
@@ -347,6 +463,9 @@ xfs_error_sysfs_init_class(
 			goto out_error;
 
 		cfg->fail_speed = init[i].fail_speed;
+		cfg->max_retries = init[i].max_retries;
+		cfg->retry_timeout = msecs_to_jiffies(
+					init[i].retry_timeout * MSEC_PER_SEC);
 	}
 	return 0;
 
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 07/10] xfs: add "fail at unmount" error handling configuration
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
                   ` (5 preceding siblings ...)
  2015-08-05 11:08 ` [PATCH 06/10] xfs: add configuration of error failure speed Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-05 11:08 ` [PATCH 08/10] xfs: add configuration handles for specific errors Dave Chinner
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

If we take "retry forever" literally on metadata IO errors, we can
hang an unmount retries those writes forever. This is the default
behaviour, unfortunately. Add a error configuration option for this
behaviour and default it to "fail" so that an unmount will trigger
actual errors, a shutdown and allow the unmount to succeed. It will
be noisy, though, as it will log the errors and shutdown that
occurs.

To do this, we need to mark the filesystem as being in the process
of unmounting. Do this with a mount flag that is added at the
appropriate time (i.e. before the blocking AIL sync). We also need
to add this flag if mount fails after the initial phase of log
recovery has been run.

The config is done by a separate boolean sysfs option rather than a
new fail_speed enum, as fail_at_unmount is relevant to both
XFS_ERR_FAIL_NEVER and XFS_ERR_FAIL_SLOW options.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_buf_item.c |  4 ++++
 fs/xfs/xfs_mount.c    |  9 +++++++++
 fs/xfs/xfs_mount.h    |  2 ++
 fs/xfs/xfs_sysfs.c    | 36 ++++++++++++++++++++++++++++++++++++
 4 files changed, 51 insertions(+)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index c785698..cf8aec0 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -999,6 +999,10 @@ xfs_buf_iodone_callback_error(
 		break;
 	}
 
+	/* At unmount we may treat errors differently */
+	if ((mp->m_flags & XFS_MOUNT_UNMOUNTING) && cfg->fail_at_unmount)
+		goto permanent_error;
+
 	/* still a transient error, higher layers will retry */
 	xfs_buf_ioerror(bp, 0);
 	xfs_buf_relse(bp);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 4245b7f3..c2096a0 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -960,6 +960,7 @@ xfs_mountfs(
  out_rele_rip:
 	IRELE(rip);
  out_log_dealloc:
+	mp->m_flags |= XFS_MOUNT_UNMOUNTING;
 	xfs_log_unmount(mp);
  out_fail_wait:
 	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp)
@@ -1009,6 +1010,14 @@ xfs_unmountfs(
 	xfs_log_force(mp, XFS_LOG_SYNC);
 
 	/*
+	 * We now need to tell the world we are unmounting. This will allow
+	 * us to detect that the filesystem is going away and we should error
+	 * out anything that we have been retrying in the background. This will
+	 * prevent neverending retries iin AIL pushing from hanging the unmount.
+	 */
+	mp->m_flags |= XFS_MOUNT_UNMOUNTING;
+
+	/*
 	 * Flush all pending changes from the AIL.
 	 */
 	xfs_ail_push_all_sync(mp->m_ail);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index a684a72b..c86fe43 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -64,6 +64,7 @@ struct xfs_error_cfg {
 	int		fail_speed;
 	int		max_retries;	/* INT_MAX = retry forever */
 	unsigned long	retry_timeout;	/* in jiffies, 0 = no timeout */
+	bool		fail_at_unmount;
 };
 
 typedef struct xfs_mount {
@@ -186,6 +187,7 @@ typedef struct xfs_mount {
 #define XFS_MOUNT_WSYNC		(1ULL << 0)	/* for nfs - all metadata ops
 						   must be synchronous except
 						   for space allocations */
+#define XFS_MOUNT_UNMOUNTING	(1ULL << 1)	/* filesystem is unmounting */
 #define XFS_MOUNT_WAS_CLEAN	(1ULL << 3)
 #define XFS_MOUNT_FS_SHUTDOWN	(1ULL << 4)	/* atomic stop of all filesystem
 						   operations, typically for
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index 1f078e1..921b079 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -361,10 +361,43 @@ retry_timeout_seconds_show(
 }
 XFS_SYSFS_ATTR_RW(retry_timeout_seconds);
 
+static ssize_t
+fail_at_unmount_store(
+	const char	*buf,
+	size_t		count,
+	void		*data)
+{
+	struct xfs_error_cfg *cfg = data;
+	int		ret;
+	int		val;
+
+	ret = kstrtoint(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	if (val < 0 || val > 1)
+		return -EINVAL;
+
+	cfg->fail_at_unmount = val;
+	return count;
+}
+
+static ssize_t
+fail_at_unmount_show(
+	char	*buf,
+	void	*data)
+{
+	struct xfs_error_cfg *cfg = data;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", cfg->fail_at_unmount);
+}
+XFS_SYSFS_ATTR_RW(fail_at_unmount);
+
 static struct attribute *xfs_error_attrs[] = {
 	ATTR_LIST(failure_speed),
 	ATTR_LIST(max_retries),
 	ATTR_LIST(retry_timeout_seconds),
+	ATTR_LIST(fail_at_unmount),
 	NULL,
 };
 
@@ -426,6 +459,7 @@ struct xfs_error_init {
 	int		fail_speed;
 	int		max_retries;
 	int		retry_timeout;	/* in seconds */
+	bool		fail_at_unmount;
 };
 
 static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
@@ -433,6 +467,7 @@ static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
 	  .fail_speed = XFS_ERR_FAIL_NEVER,
 	  .max_retries = INT_MAX,
 	  .retry_timeout = 0,
+	  .fail_at_unmount = true,
 	},
 };
 
@@ -466,6 +501,7 @@ xfs_error_sysfs_init_class(
 		cfg->max_retries = init[i].max_retries;
 		cfg->retry_timeout = msecs_to_jiffies(
 					init[i].retry_timeout * MSEC_PER_SEC);
+		cfg->fail_at_unmount = init[i].fail_at_unmount;
 	}
 	return 0;
 
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 08/10] xfs: add configuration handles for specific errors
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
                   ` (6 preceding siblings ...)
  2015-08-05 11:08 ` [PATCH 07/10] xfs: add "fail at unmount" error handling configuration Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-05 11:08 ` [PATCH 09/10] xfs: disable specific error configurations Dave Chinner
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

now most of the infrastructure is in place, we can start adding
support for configuring specific errors such as ENODEV, ENOSPC, EIO,
etc. Add these error configurations and configure them all to have
appropriate behaviours. That is, all will use the "fail never, fail
at unmount" default except for ENODEV, which is an unrecoverable
error so it will be configured as a "fail fast" error.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_mount.h |  3 +++
 fs/xfs/xfs_sysfs.c | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index c86fe43..7d62450 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -50,6 +50,9 @@ enum {
 };
 enum {
 	XFS_ERR_DEFAULT,
+	XFS_ERR_EIO,
+	XFS_ERR_ENOSPC,
+	XFS_ERR_ENODEV,
 	XFS_ERR_ERRNO_MAX,
 };
 enum {
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index 921b079..019e859 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -469,6 +469,21 @@ static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
 	  .retry_timeout = 0,
 	  .fail_at_unmount = true,
 	},
+	{ .name = "EIO",
+	  .fail_speed = XFS_ERR_FAIL_NEVER,
+	  .max_retries = INT_MAX,
+	  .retry_timeout = 0,
+	  .fail_at_unmount = true,
+	},
+	{ .name = "ENOSPC",
+	  .fail_speed = XFS_ERR_FAIL_NEVER,
+	  .max_retries = INT_MAX,
+	  .retry_timeout = 0,
+	  .fail_at_unmount = true,
+	},
+	{ .name = "ENODEV",
+	  .fail_speed = XFS_ERR_FAIL_FAST,
+	},
 };
 
 static int
@@ -568,6 +583,15 @@ xfs_error_get_cfg(
 	struct xfs_error_cfg	*cfg;
 
 	switch (error) {
+	case EIO:
+		cfg = &mp->m_error_cfg[error_class][XFS_ERR_EIO];
+		break;
+	case ENOSPC:
+		cfg = &mp->m_error_cfg[error_class][XFS_ERR_ENOSPC];
+		break;
+	case ENODEV:
+		cfg = &mp->m_error_cfg[error_class][XFS_ERR_ENODEV];
+		break;
 	default:
 		cfg = &mp->m_error_cfg[error_class][XFS_ERR_DEFAULT];
 		break;
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 09/10] xfs: disable specific error configurations
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
                   ` (7 preceding siblings ...)
  2015-08-05 11:08 ` [PATCH 08/10] xfs: add configuration handles for specific errors Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-05 11:08 ` [PATCH 10/10] xfs: add kmem error configuration class Dave Chinner
  2015-08-11 14:20 ` [RFC, PATCH 00/10] xfs: configurable error behaviours Brian Foster
  10 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

Different error classes are going to need different error to be
configured, so we don't want them all to be visible in sysfs. Add a
configuration check into the config initialisation an lookup
code to determine if the default should be used for a specific
error. If so, the sysfs entry is not created, and on lookup the
default config is returned.

Add ENOMEM at this point to exercise this code, as it will be used
later when adding a kmem error failure class.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_mount.h |  1 +
 fs/xfs/xfs_sysfs.c | 22 ++++++++++++++++++----
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 7d62450..876b86b 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -53,6 +53,7 @@ enum {
 	XFS_ERR_EIO,
 	XFS_ERR_ENOSPC,
 	XFS_ERR_ENODEV,
+	XFS_ERR_ENOMEM,
 	XFS_ERR_ERRNO_MAX,
 };
 enum {
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index 019e859..fa46ca4 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -484,6 +484,9 @@ static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
 	{ .name = "ENODEV",
 	  .fail_speed = XFS_ERR_FAIL_FAST,
 	},
+	{ .name = "ENOMEM",
+	  .fail_speed = XFS_ERR_FAIL_DEFAULT,
+	},
 };
 
 static int
@@ -507,12 +510,17 @@ xfs_error_sysfs_init_class(
 
 	for (i = 0; i < XFS_ERR_ERRNO_MAX; i++) {
 		cfg = &mp->m_error_cfg[class][i];
+
+		/* skip errors that are not configurable for this class */
+		cfg->fail_speed = init[i].fail_speed;
+		if (cfg->fail_speed == XFS_ERR_FAIL_DEFAULT)
+			continue;
+
 		error = xfs_sysfs_init(&cfg->kobj, &xfs_error_cfg_ktype,
 					parent_kobj, init[i].name);
 		if (error)
 			goto out_error;
 
-		cfg->fail_speed = init[i].fail_speed;
 		cfg->max_retries = init[i].max_retries;
 		cfg->retry_timeout = msecs_to_jiffies(
 					init[i].retry_timeout * MSEC_PER_SEC);
@@ -567,7 +575,8 @@ xfs_error_sysfs_del(
 		for (j = 0; j < XFS_ERR_ERRNO_MAX; j++) {
 			cfg = &mp->m_error_cfg[i][j];
 
-			xfs_sysfs_del(&cfg->kobj);
+			if (cfg->fail_speed != XFS_ERR_FAIL_DEFAULT)
+				xfs_sysfs_del(&cfg->kobj);
 		}
 	}
 	xfs_sysfs_del(&mp->m_error_meta_kobj);
@@ -592,10 +601,15 @@ xfs_error_get_cfg(
 	case ENODEV:
 		cfg = &mp->m_error_cfg[error_class][XFS_ERR_ENODEV];
 		break;
-	default:
-		cfg = &mp->m_error_cfg[error_class][XFS_ERR_DEFAULT];
+	case ENOMEM:
+		cfg = &mp->m_error_cfg[error_class][XFS_ERR_ENOMEM];
 		break;
+	default:
+		return &mp->m_error_cfg[error_class][XFS_ERR_DEFAULT];
 	}
 
+	/* The error may not be not configurable, so uses default behaviour */
+	if (cfg->fail_speed == XFS_ERR_FAIL_DEFAULT)
+		return &mp->m_error_cfg[error_class][XFS_ERR_DEFAULT];
 	return cfg;
 }
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 10/10] xfs: add kmem error configuration class
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
                   ` (8 preceding siblings ...)
  2015-08-05 11:08 ` [PATCH 09/10] xfs: disable specific error configurations Dave Chinner
@ 2015-08-05 11:08 ` Dave Chinner
  2015-08-11 14:20 ` [RFC, PATCH 00/10] xfs: configurable error behaviours Brian Foster
  10 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2015-08-05 11:08 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

For configuring how to handle memory allocation failures. I'm not
yet sure how to hook it into the memory allocation calls - that will
be done in a later patch; this just demonstrates how multiple
classes are configured and initialised.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_mount.h |  3 ++-
 fs/xfs/xfs_sysfs.c | 77 +++++++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 876b86b..03162cd 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -46,6 +46,7 @@ enum {
  */
 enum {
 	XFS_ERR_METADATA,
+	XFS_ERR_KMEM,
 	XFS_ERR_CLASS_MAX,
 };
 enum {
@@ -162,7 +163,7 @@ typedef struct xfs_mount {
 						/* low free space thresholds */
 	struct xfs_kobj		m_kobj;
 	struct xfs_kobj		m_error_kobj;
-	struct xfs_kobj		m_error_meta_kobj;
+	struct xfs_kobj		m_error_class_kobj[XFS_ERR_CLASS_MAX];
 	struct xfs_error_cfg	m_error_cfg[XFS_ERR_CLASS_MAX][XFS_ERR_ERRNO_MAX];
 
 	struct workqueue_struct *m_buf_workqueue;
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index fa46ca4..cbbc36a 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -484,26 +484,50 @@ static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
 	{ .name = "ENODEV",
 	  .fail_speed = XFS_ERR_FAIL_FAST,
 	},
-	{ .name = "ENOMEM",
+	{ /* ENOMEM */
 	  .fail_speed = XFS_ERR_FAIL_DEFAULT,
 	},
 };
 
+static const struct xfs_error_init xfs_error_kmem_init[XFS_ERR_ERRNO_MAX] = {
+	{ .name = "Default",
+	  .fail_speed = XFS_ERR_FAIL_NEVER,
+	  .max_retries = INT_MAX,
+	  .retry_timeout = 0,
+	  .fail_at_unmount = false,
+	},
+	{ /* EIO */
+	  .fail_speed = XFS_ERR_FAIL_DEFAULT,
+	},
+	{ /* ENOSPC */
+	  .fail_speed = XFS_ERR_FAIL_DEFAULT,
+	},
+	{ /* ENODEV */
+	  .fail_speed = XFS_ERR_FAIL_DEFAULT,
+	},
+	{ .name = "ENOMEM",
+	  .fail_speed = XFS_ERR_FAIL_NEVER,
+	  .max_retries = INT_MAX,
+	  .retry_timeout = 0,
+	  .fail_at_unmount = false,
+	},
+};
+
 static int
 xfs_error_sysfs_init_class(
 	struct xfs_mount	*mp,
 	int			class,
 	const char		*parent_name,
-	struct xfs_kobj		*parent_kobj,
 	const struct xfs_error_init init[])
 {
+	struct xfs_kobj		*class_kobj = &mp->m_error_class_kobj[class];
 	struct xfs_error_cfg	*cfg;
 	int			error;
 	int			i;
 
 	ASSERT(class < XFS_ERR_CLASS_MAX);
 
-	error = xfs_sysfs_init(parent_kobj, &xfs_error_ktype,
+	error = xfs_sysfs_init(class_kobj, &xfs_error_ktype,
 				&mp->m_error_kobj, parent_name);
 	if (error)
 		return error;
@@ -516,8 +540,9 @@ xfs_error_sysfs_init_class(
 		if (cfg->fail_speed == XFS_ERR_FAIL_DEFAULT)
 			continue;
 
+		ASSERT(init[i].name);
 		error = xfs_sysfs_init(&cfg->kobj, &xfs_error_cfg_ktype,
-					parent_kobj, init[i].name);
+					class_kobj, init[i].name);
 		if (error)
 			goto out_error;
 
@@ -534,10 +559,28 @@ out_error:
 		cfg = &mp->m_error_cfg[class][i];
 		xfs_sysfs_del(&cfg->kobj);
 	}
-	xfs_sysfs_del(parent_kobj);
+	xfs_sysfs_del(class_kobj);
 	return error;
 }
 
+static void
+xfs_error_sysfs_del_class(
+	struct xfs_mount	*mp,
+	int			class)
+{
+	struct xfs_error_cfg	*cfg;
+	int			i;
+
+	for (i = 0; i < XFS_ERR_ERRNO_MAX; i++) {
+		cfg = &mp->m_error_cfg[class][i];
+
+		if (cfg->fail_speed != XFS_ERR_FAIL_DEFAULT)
+			xfs_sysfs_del(&cfg->kobj);
+	}
+
+	xfs_sysfs_del(&mp->m_error_class_kobj[class]);
+}
+
 int
 xfs_error_sysfs_init(
 	struct xfs_mount	*mp)
@@ -551,14 +594,19 @@ xfs_error_sysfs_init(
 		return error;
 
 	/* .../xfs/<dev>/error/metadata/ */
-	error = xfs_error_sysfs_init_class(mp, XFS_ERR_METADATA,
-				"metadata", &mp->m_error_meta_kobj,
-				xfs_error_meta_init);
+	error = xfs_error_sysfs_init_class(mp, XFS_ERR_METADATA, "metadata",
+					   xfs_error_meta_init);
 	if (error)
 		goto out_error;
 
+	error = xfs_error_sysfs_init_class(mp, XFS_ERR_KMEM, "kmem",
+					   xfs_error_kmem_init);
+	if (error)
+		goto out_error_meta;
 	return 0;
 
+out_error_meta:
+	xfs_error_sysfs_del_class(mp, XFS_ERR_METADATA);
 out_error:
 	xfs_sysfs_del(&mp->m_error_kobj);
 	return error;
@@ -568,18 +616,11 @@ void
 xfs_error_sysfs_del(
 	struct xfs_mount	*mp)
 {
-	struct xfs_error_cfg	*cfg;
-	int			i, j;
+	int			i;
 
-	for (i = 0; i < XFS_ERR_CLASS_MAX; i++) {
-		for (j = 0; j < XFS_ERR_ERRNO_MAX; j++) {
-			cfg = &mp->m_error_cfg[i][j];
+	for (i = 0; i < XFS_ERR_CLASS_MAX; i++)
+		xfs_error_sysfs_del_class(mp, i);
 
-			if (cfg->fail_speed != XFS_ERR_FAIL_DEFAULT)
-				xfs_sysfs_del(&cfg->kobj);
-		}
-	}
-	xfs_sysfs_del(&mp->m_error_meta_kobj);
 	xfs_sysfs_del(&mp->m_error_kobj);
 }
 
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 04/10] xfs: add configurable error support to metadata buffers
  2015-08-05 11:08 ` [PATCH 04/10] xfs: add configurable error support to metadata buffers Dave Chinner
@ 2015-08-11 14:18   ` Brian Foster
  0 siblings, 0 replies; 15+ messages in thread
From: Brian Foster @ 2015-08-11 14:18 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Wed, Aug 05, 2015 at 09:08:35PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> With the error configuration handle for async metadata write errors
> in place, we can now add initial support to the IO error processing
> in xfs_buf_iodone_error().
> 
> Add an infrastructure function to look up the configuration handle,
> and rearrange the error handling to prepare the way for different
> error handling conigurations to be used.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_buf.h      |   1 +
>  fs/xfs/xfs_buf_item.c | 112 ++++++++++++++++++++++++++++++--------------------
>  fs/xfs/xfs_mount.h    |   3 ++
>  fs/xfs/xfs_sysfs.c    |  17 ++++++++
>  4 files changed, 88 insertions(+), 45 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index 68c8f2d..ed0ea41 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -181,6 +181,7 @@ typedef struct xfs_buf {
>  	unsigned int		b_page_count;	/* size of page array */
>  	unsigned int		b_offset;	/* page offset in first page */
>  	int			b_error;	/* error code on I/O */
> +	int			b_last_error;	/* previous async I/O error */
>  	const struct xfs_buf_ops	*b_ops;
>  
>  #ifdef XFS_BUF_LOCK_TRACKING
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index ee63961..a09ae26 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -924,35 +924,22 @@ xfs_buf_do_callbacks(
>  	}
>  }
>  
> -/*
> - * This is the iodone() function for buffers which have had callbacks
> - * attached to them by xfs_buf_attach_iodone().  It should remove each
> - * log item from the buffer's list and call the callback of each in turn.
> - * When done, the buffer's fsprivate field is set to NULL and the buffer
> - * is unlocked with a call to iodone().
> - */

A pre-function comment would be nice to at least explain what the return
value means.

> -void
> -xfs_buf_iodone_callbacks(
> +static bool
> +xfs_buf_iodone_callback_error(
>  	struct xfs_buf		*bp)
>  {
>  	struct xfs_log_item	*lip = bp->b_fspriv;
>  	struct xfs_mount	*mp = lip->li_mountp;
>  	static ulong		lasttime;
>  	static xfs_buftarg_t	*lasttarg;
> -
> -	if (likely(!bp->b_error))
> -		goto do_callbacks;
> +	struct xfs_error_cfg	*cfg;
>  
>  	/*
>  	 * If we've already decided to shutdown the filesystem because of
>  	 * I/O errors, there's no point in giving this a retry.
>  	 */
> -	if (XFS_FORCED_SHUTDOWN(mp)) {
> -		xfs_buf_stale(bp);
> -		bp->b_flags |= XBF_DONE;
> -		trace_xfs_buf_item_iodone(bp, _RET_IP_);
> -		goto do_callbacks;
> -	}

Looks like we've lost a tracepoint here.

> +	if (XFS_FORCED_SHUTDOWN(mp))
> +		goto out_stale;
>  
>  	if (bp->b_target != lasttarg ||
>  	    time_after(jiffies, (lasttime + 5*HZ))) {
> @@ -961,45 +948,80 @@ xfs_buf_iodone_callbacks(
>  	}
>  	lasttarg = bp->b_target;
>  
> +	/* synchronous writes will have callers process the error */
> +	if (!(bp->b_flags & XBF_ASYNC))
> +		goto out_stale;
> +

Why mark it stale?

> +	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> +	ASSERT(bp->b_iodone != NULL);
> +
>  	/*
>  	 * If the write was asynchronous then no one will be looking for the
> -	 * error.  Clear the error state and write the buffer out again.
> -	 *
> -	 * XXX: This helps against transient write errors, but we need to find
> -	 * a way to shut the filesystem down if the writes keep failing.
> -	 *
> -	 * In practice we'll shut the filesystem down soon as non-transient
> -	 * errors tend to affect the whole device and a failing log write
> -	 * will make us give up.  But we really ought to do better here.
> +	 * error.  If this is the first failure of this type, clear the error
> +	 * state and write the buffer out again. This means we always retry an
> +	 * async write failure at least once, but we also need to set the buffer
> +	 * up to behave correctly now for repeated failures.
>  	 */
> -	if (XFS_BUF_ISASYNC(bp)) {
> -		ASSERT(bp->b_iodone != NULL);
> -
> -		trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> +	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);

cfg isn't used until after the following block of code.

Brian

> +	if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL)) ||
> +	     bp->b_last_error != bp->b_error) {
> +		bp->b_flags |= (XBF_WRITE | XBF_ASYNC |
> +			        XBF_DONE | XBF_WRITE_FAIL);
> +		bp->b_last_error = bp->b_error;
> +		xfs_buf_ioerror(bp, 0);
> +		xfs_buf_submit(bp);
> +		return true;
> +	}
>  
> -		xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the flag */
> +	/*
> +	 * Repeated failure on an async write. Take action according to the
> +	 * error configuration we have been set up to use.
> +	 */
> +	if (cfg->fail_speed == XFS_ERR_FAIL_FAST)
> +		goto permanent_error;
>  
> -		if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL))) {
> -			bp->b_flags |= XBF_WRITE | XBF_ASYNC |
> -				       XBF_DONE | XBF_WRITE_FAIL;
> -			xfs_buf_submit(bp);
> -		} else {
> -			xfs_buf_relse(bp);
> -		}
> -
> -		return;
> -	}
> +	/* still a transient error, higher layers will retry */
> +	xfs_buf_ioerror(bp, 0);
> +	xfs_buf_relse(bp);
> +	return true;
>  
>  	/*
> -	 * If the write of the buffer was synchronous, we want to make
> -	 * sure to return the error to the caller of xfs_bwrite().
> +	 * Permanent error - we need to trigger a shutdown if we haven't already
> +	 * to indicate that inconsistency will result from this action.
>  	 */
> +permanent_error:
> +	xfs_force_shutdown(mp, SHUTDOWN_META_IO_ERROR);
> +out_stale:
>  	xfs_buf_stale(bp);
>  	bp->b_flags |= XBF_DONE;
> -
>  	trace_xfs_buf_error_relse(bp, _RET_IP_);
> +	return false;
> +}
> +
> +/*
> + * This is the iodone() function for buffers which have had callbacks attached
> + * to them by xfs_buf_attach_iodone(). We need to iterate the items on the
> + * callback list, mark the buffer as having no more callbacks and then push the
> + * buffer through IO completion processing.
> + */
> +void
> +xfs_buf_iodone_callbacks(
> +	struct xfs_buf		*bp)
> +{
> +	/*
> +	 * If there is an error, process it. Some errors require us
> +	 * to run callbacks after failure processing is done so we
> +	 * detect that and take appropriate action.
> +	 */
> +	if (bp->b_error && xfs_buf_iodone_callback_error(bp))
> +		return;
> +
> +	/*
> +	 * Successful IO or permanent error. Either way, we can clear the
> +	 * retry state here in preparation for the next error that may occur.
> +	 */
> +	bp->b_last_error = 0;
>  
> -do_callbacks:
>  	xfs_buf_do_callbacks(bp);
>  	bp->b_fspriv = NULL;
>  	bp->b_iodone = NULL;
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index 753060f..21caa5a 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -366,4 +366,7 @@ extern int	xfs_dev_is_read_only(struct xfs_mount *, char *);
>  
>  extern void	xfs_set_low_space_thresholds(struct xfs_mount *);
>  
> +struct xfs_error_cfg * xfs_error_get_cfg(struct xfs_mount *mp,
> +		int error_class, int error);
> +
>  #endif	/* __XFS_MOUNT_H__ */
> diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
> index cb7ced0..3667d33 100644
> --- a/fs/xfs/xfs_sysfs.c
> +++ b/fs/xfs/xfs_sysfs.c
> @@ -355,3 +355,20 @@ xfs_error_sysfs_del(
>  	xfs_sysfs_del(&mp->m_error_meta_kobj);
>  	xfs_sysfs_del(&mp->m_error_kobj);
>  }
> +
> +struct xfs_error_cfg *
> +xfs_error_get_cfg(
> +	struct xfs_mount	*mp,
> +	int			error_class,
> +	int			error)
> +{
> +	struct xfs_error_cfg	*cfg;
> +
> +	switch (error) {
> +	default:
> +		cfg = &mp->m_error_cfg[error_class][XFS_ERR_DEFAULT];
> +		break;
> +	}
> +
> +	return cfg;
> +}
> -- 
> 2.1.4
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 05/10] xfs: introduce table-based init for error behaviours
  2015-08-05 11:08 ` [PATCH 05/10] xfs: introduce table-based init for error behaviours Dave Chinner
@ 2015-08-11 14:19   ` Brian Foster
  0 siblings, 0 replies; 15+ messages in thread
From: Brian Foster @ 2015-08-11 14:19 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Wed, Aug 05, 2015 at 09:08:36PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Before we start expanding the number of error classes and errors we
> can configure behaviour for, we need a simple and clear way to
> define the default behaviour that we initialised each mount with.
> Introduce a table based method for keeping the initial configuration
> in, and apply that to the existing initialisation code.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_sysfs.c | 72 +++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 60 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
> index 3667d33..9d66095 100644
> --- a/fs/xfs/xfs_sysfs.c
> +++ b/fs/xfs/xfs_sysfs.c
> @@ -303,11 +303,67 @@ struct kobj_type xfs_error_ktype = {
>  	.release = xfs_sysfs_release,
>  };
>  
> +/*
> + * Error initialisation tables. These need to be ordered in the same
> + * order as the enums used to index the array. All class init tables need to
> + * define a "default" behaviour as the first entry, all other entries can be
> + * empty.
> + */
> +struct xfs_error_init {
> +	char		*name;
> +	int		fail_speed;
> +};
> +
> +static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
> +	{ .name = "Default",
> +	  .fail_speed = XFS_ERR_FAIL_NEVER,
> +	},
> +};
> +
> +static int
> +xfs_error_sysfs_init_class(
> +	struct xfs_mount	*mp,
> +	int			class,
> +	const char		*parent_name,
> +	struct xfs_kobj		*parent_kobj,
> +	const struct xfs_error_init init[])
> +{
> +	struct xfs_error_cfg	*cfg;
> +	int			error;
> +	int			i;
> +
> +	ASSERT(class < XFS_ERR_CLASS_MAX);
> +
> +	error = xfs_sysfs_init(parent_kobj, &xfs_error_ktype,
> +				&mp->m_error_kobj, parent_name);

'parent_kobj' is a bit confusing of a name here since you're
initializing it as well. Perhaps 'base_kobj,' 'class_kobj,' or something
along those lines?

Brian

> +	if (error)
> +		return error;
> +
> +	for (i = 0; i < XFS_ERR_ERRNO_MAX; i++) {
> +		cfg = &mp->m_error_cfg[class][i];
> +		error = xfs_sysfs_init(&cfg->kobj, &xfs_error_cfg_ktype,
> +					parent_kobj, init[i].name);
> +		if (error)
> +			goto out_error;
> +
> +		cfg->fail_speed = init[i].fail_speed;
> +	}
> +	return 0;
> +
> +out_error:
> +	/* unwind the entries that succeeded */
> +	for (i--; i >= 0; i--) {
> +		cfg = &mp->m_error_cfg[class][i];
> +		xfs_sysfs_del(&cfg->kobj);
> +	}
> +	xfs_sysfs_del(parent_kobj);
> +	return error;
> +}
> +
>  int
>  xfs_error_sysfs_init(
>  	struct xfs_mount	*mp)
>  {
> -	struct xfs_error_cfg	*cfg;
>  	int			error;
>  
>  	/* .../xfs/<dev>/error/ */
> @@ -317,22 +373,14 @@ xfs_error_sysfs_init(
>  		return error;
>  
>  	/* .../xfs/<dev>/error/metadata/ */
> -	error = xfs_sysfs_init(&mp->m_error_meta_kobj, &xfs_error_ktype,
> -				&mp->m_error_kobj, "metadata");
> +	error = xfs_error_sysfs_init_class(mp, XFS_ERR_METADATA,
> +				"metadata", &mp->m_error_meta_kobj,
> +				xfs_error_meta_init);
>  	if (error)
>  		goto out_error;
>  
> -	cfg = &mp->m_error_cfg[XFS_ERR_METADATA][XFS_ERR_DEFAULT];
> -	error = xfs_sysfs_init(&cfg->kobj, &xfs_error_cfg_ktype,
> -				&mp->m_error_meta_kobj, "Default");
> -	if (error)
> -		goto out_error_meta;
> -	cfg->fail_speed = XFS_ERR_FAIL_NEVER;
> -
>  	return 0;
>  
> -out_error_meta:
> -	xfs_sysfs_del(&mp->m_error_meta_kobj);
>  out_error:
>  	xfs_sysfs_del(&mp->m_error_kobj);
>  	return error;
> -- 
> 2.1.4
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 06/10] xfs: add configuration of error failure speed
  2015-08-05 11:08 ` [PATCH 06/10] xfs: add configuration of error failure speed Dave Chinner
@ 2015-08-11 14:19   ` Brian Foster
  0 siblings, 0 replies; 15+ messages in thread
From: Brian Foster @ 2015-08-11 14:19 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Wed, Aug 05, 2015 at 09:08:37PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> On reception of an error, we can fail immediately, perform some
> bound amount of retries or retry indefinitely. The current behaviour
> we have is to retry forever.
> 
> However, we'd like the ability to choose what behaviour we have, and
> that requires the ability to configure the behaviour through the new
> sysfs interfaces. Add configuration options for fail fast, slow or
> never to reflect the three choices above. Fail fast or fail never
> don't require any other options, but "fail slow" needs configuration
> to bound the retry behaviour. Add both a maximum retry count and a
> retry timeout so that we can bound by time and/or physical IO
> attempts.
> 
> Finally, plumb these into xfs_buf_iodone error processing so that
> the error behaviour follows the selected configuration.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_buf.h      |  23 +++++++++-
>  fs/xfs/xfs_buf_item.c |  22 +++++++++-
>  fs/xfs/xfs_mount.h    |   2 +
>  fs/xfs/xfs_sysfs.c    | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 164 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index ed0ea41..afc2d2b 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -181,7 +181,28 @@ typedef struct xfs_buf {
>  	unsigned int		b_page_count;	/* size of page array */
>  	unsigned int		b_offset;	/* page offset in first page */
>  	int			b_error;	/* error code on I/O */
> -	int			b_last_error;	/* previous async I/O error */
> +
> +	/*
> +	 * async write failure retry count. Initialised to zero on the first
> +	 * failure, then when it exceeds the maximum configured without a
> +	 * success the write is considered to be failed permanently and the
> +	 * iodone handler will take appropriate action.
> +	 *
> +	 * For retry timeouts, we record the jiffie of the first failure. This
> +	 * means that we can change the retry timeout and it on the next error
> +	 * it will be checked against the newly configured timeout. This
> +	 * prevents buffers getting stuck in retry loops with a really long
> +	 * timeout.
> +	 *
> +	 * last_error is used to ensure that we are getting repeated errors, not
> +	 * different errors. e.g. a block device might change ENOSPC to EIO when
> +	 * a failure timeout occurs, so we want to re-initialise the error
> +	 * retry behaviour appropriately when that happens.
> +	 */
> +	int			b_retries;
> +	unsigned long		b_first_retry_time; /* in jiffies */
> +	int			b_last_error;
> +
>  	const struct xfs_buf_ops	*b_ops;
>  
>  #ifdef XFS_BUF_LOCK_TRACKING
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index a09ae26..c785698 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -968,6 +968,9 @@ xfs_buf_iodone_callback_error(

There's a 5 second xfs_buf_ioerror_alert() further up in this function.
I wonder if we should figure out a way to tie that into the failure
speed mechanism..? It might make sense to retain the original alert to
indicate that a failure occurred in the first place, but I'm not sure a
5 second repetition makes sense if an admin has set a 5 minute error
timeout, for example.

It also might be a good idea to make this initial alert more informative
in terms of indicating the error sequence has kicked in and retries are
in progress. Perhaps even indicate what metric (timeout, retries)
triggers a final shutdown if we get to that point. Otherwise, it's not
really clear to the user where we're at when alerts start firing,
particularly for somebody who might not know that fs shutdown is the
endgame.

>  		bp->b_flags |= (XBF_WRITE | XBF_ASYNC |
>  			        XBF_DONE | XBF_WRITE_FAIL);
>  		bp->b_last_error = bp->b_error;
> +		bp->b_retries = 0;
> +		bp->b_first_retry_time = jiffies;
> +
>  		xfs_buf_ioerror(bp, 0);
>  		xfs_buf_submit(bp);
>  		return true;
> @@ -977,9 +980,25 @@ xfs_buf_iodone_callback_error(
>  	 * Repeated failure on an async write. Take action according to the
>  	 * error configuration we have been set up to use.
>  	 */
> -	if (cfg->fail_speed == XFS_ERR_FAIL_FAST)
> +	switch (cfg->fail_speed) {
> +	case XFS_ERR_FAIL_FAST:
>  		goto permanent_error;
>  
> +	case XFS_ERR_FAIL_SLOW:
> +		if (++bp->b_retries > cfg->max_retries)
> +			goto permanent_error;
> +		if (!cfg->retry_timeout)
> +			break;
> +		if (time_after(jiffies,
> +			       cfg->retry_timeout + bp->b_first_retry_time))
> +			goto permanent_error;
> +		break;
> +
> +	case XFS_ERR_FAIL_NEVER:
> +	default:
> +		break;
> +	}
> +

Where is the actual resubmission for the fail slow/never cases?

>  	/* still a transient error, higher layers will retry */
>  	xfs_buf_ioerror(bp, 0);
>  	xfs_buf_relse(bp);
> @@ -1021,6 +1040,7 @@ xfs_buf_iodone_callbacks(
>  	 * retry state here in preparation for the next error that may occur.
>  	 */
>  	bp->b_last_error = 0;
> +	bp->b_retries = 0;
>  
>  	xfs_buf_do_callbacks(bp);
>  	bp->b_fspriv = NULL;
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index 21caa5a..a684a72b 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -62,6 +62,8 @@ enum {
>  struct xfs_error_cfg {
>  	struct xfs_kobj	kobj;
>  	int		fail_speed;
> +	int		max_retries;	/* INT_MAX = retry forever */
> +	unsigned long	retry_timeout;	/* in jiffies, 0 = no timeout */
>  };
>  
>  typedef struct xfs_mount {
> diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
> index 9d66095..1f078e1 100644
> --- a/fs/xfs/xfs_sysfs.c
> +++ b/fs/xfs/xfs_sysfs.c
> @@ -252,7 +252,119 @@ struct kobj_type xfs_log_ktype = {
>   * and any other future type of IO (e.g. special inode or directory error
>   * handling) we care to support.
>   */
> +static ssize_t
> +failure_speed_store(
> +	const char	*buf,
> +	size_t		count,
> +	void		*data)
> +{
> +	struct xfs_error_cfg *cfg = data;
> +	char		*str = kstrdup(buf, GFP_KERNEL);
> +	char		*sp;
> +	int		len;
> +
> +	if (!str)
> +		return -ENOMEM;
> +
> +	sp = strstrip(str);
> +	len = strlen(sp);
> +	if (strncmp(sp, "fast", len) == 0)
> +		cfg->fail_speed = XFS_ERR_FAIL_FAST;
> +	else if (strncmp(sp, "slow", len) == 0)
> +		cfg->fail_speed = XFS_ERR_FAIL_SLOW;
> +	else if (strncmp(sp, "never", len) == 0)
> +		cfg->fail_speed = XFS_ERR_FAIL_NEVER;
> +	else
> +		count = -EINVAL;
> +	kfree(str);
> +	return count;
> +}
> +
> +static ssize_t
> +failure_speed_show(
> +	char	*buf,
> +	void	*data)
> +{
> +	struct xfs_error_cfg *cfg = data;
> +
> +	if (cfg->fail_speed == XFS_ERR_FAIL_FAST)
> +		return snprintf(buf, PAGE_SIZE, "[fast] slow never\n");
> +	if (cfg->fail_speed == XFS_ERR_FAIL_SLOW)
> +		return snprintf(buf, PAGE_SIZE, "fast [slow] never\n");
> +	return snprintf(buf, PAGE_SIZE, "fast slow [never]\n");
> +}
> +XFS_SYSFS_ATTR_RW(failure_speed);
> +
> +static ssize_t
> +max_retries_store(
> +	const char	*buf,
> +	size_t		count,
> +	void		*data)
> +{
> +	struct xfs_error_cfg *cfg = data;
> +	int		ret;
> +	int		val;
> +
> +	ret = kstrtoint(buf, 0, &val);
> +	if (ret)
> +		return ret;
> +
> +	if (val < 0 || val > INT_MAX)
> +		return -EINVAL;
> +
> +	cfg->max_retries = val;
> +	return count;
> +}
> +
> +static ssize_t
> +max_retries_show(
> +	char	*buf,
> +	void	*data)
> +{
> +	struct xfs_error_cfg *cfg = data;
> +
> +	return snprintf(buf, PAGE_SIZE, "%d\n", cfg->max_retries);
> +}
> +XFS_SYSFS_ATTR_RW(max_retries);
> +
> +static ssize_t
> +retry_timeout_seconds_store(
> +	const char	*buf,
> +	size_t		count,
> +	void		*data)
> +{
> +	struct xfs_error_cfg *cfg = data;
> +	int		ret;
> +	int		val;
> +
> +	ret = kstrtoint(buf, 0, &val);
> +	if (ret)
> +		return ret;
> +
> +	/* 1 day timeout maximum */
> +	if (val < 0 || val > 86400)
> +		return -EINVAL;
> +
> +	cfg->retry_timeout = msecs_to_jiffies(val * MSEC_PER_SEC);
> +	return count;
> +}
> +
> +static ssize_t
> +retry_timeout_seconds_show(
> +	char	*buf,
> +	void	*data)
> +{
> +	struct xfs_error_cfg *cfg = data;
> +
> +	return snprintf(buf, PAGE_SIZE, "%ld\n", 

Trailing whitespace above.

> +			jiffies_to_msecs(cfg->retry_timeout) * MSEC_PER_SEC);

... and a conversion to microseconds. ;)

Brian

> +}
> +XFS_SYSFS_ATTR_RW(retry_timeout_seconds);
> +
>  static struct attribute *xfs_error_attrs[] = {
> +	ATTR_LIST(failure_speed),
> +	ATTR_LIST(max_retries),
> +	ATTR_LIST(retry_timeout_seconds),
>  	NULL,
>  };
>  
> @@ -312,11 +424,15 @@ struct kobj_type xfs_error_ktype = {
>  struct xfs_error_init {
>  	char		*name;
>  	int		fail_speed;
> +	int		max_retries;
> +	int		retry_timeout;	/* in seconds */
>  };
>  
>  static const struct xfs_error_init xfs_error_meta_init[XFS_ERR_ERRNO_MAX] = {
>  	{ .name = "Default",
>  	  .fail_speed = XFS_ERR_FAIL_NEVER,
> +	  .max_retries = INT_MAX,
> +	  .retry_timeout = 0,
>  	},
>  };
>  
> @@ -347,6 +463,9 @@ xfs_error_sysfs_init_class(
>  			goto out_error;
>  
>  		cfg->fail_speed = init[i].fail_speed;
> +		cfg->max_retries = init[i].max_retries;
> +		cfg->retry_timeout = msecs_to_jiffies(
> +					init[i].retry_timeout * MSEC_PER_SEC);
>  	}
>  	return 0;
>  
> -- 
> 2.1.4
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 00/10] xfs: configurable error behaviours
  2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
                   ` (9 preceding siblings ...)
  2015-08-05 11:08 ` [PATCH 10/10] xfs: add kmem error configuration class Dave Chinner
@ 2015-08-11 14:20 ` Brian Foster
  10 siblings, 0 replies; 15+ messages in thread
From: Brian Foster @ 2015-08-11 14:20 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Wed, Aug 05, 2015 at 09:08:31PM +1000, Dave Chinner wrote:
> Hi folks,
> 
> This is my first pass at introducing a framework for configuring the
> error handling behaviour of various subsystems in XFS. This is being
> driven by the need to handle things like ENOSPC from thin
> provisioned block devices in a sane manner, as well as from the mm/
> side of the kernel for more configurable memory allocation failure
> behaviour.
> 

By and large, this looks pretty good to me. I sent some comments on the
individual patches separately. A few more random thoughts below...

> The patchset introduces the concepts of:
> 
> 	- "failure speed", which defines the obvious behaviour of
> 	  the error handlers. "fast" will fail immediately, "slow"
> 	  will fail after a bound number of retries or timeout, and
> 	  "never" means exactly that. The current behaviour is
> 	  "never", and this patchset mostly leaves the default
> 	  behaviour configured this way.
> 
> 	- "error class", which is the subsystem or error location
> 	  the error handlers configure the behaviour for. "metadata"
> 	  configures async write errors in metadata buffers, "kmem"
> 	  configures memory allocation failure behaviour and we'll
> 	  add others as we see fit.
> 
> 	- "error handler", which is the specific error the
> 	  configuration applies to. For IO errors, it's obvious that
> 	  EIO, ENOSPC, ENODEV, etc are errors that need to be
> 	  handled in different ways. But for the kmem class, the
> 	  only error is ENOMEM, so while that it is currently
> 	  configured that way, the error handlers are more likely to
> 	  refer to failures in transaction context, in buffer
> 	  allocation context, etc. IOWs, the name and failure
> 	  context can change, but the same "failure speed"
> 	  definition still applies.
> 

At first thought, I'm not a huge fan of encoding the error codes into
the sysfs interface. This stuff probably shouldn't change much, if ever,
but it's not exactly under our control since we're talking about
underlying device error codes.

On one hand, I think it would be nice to have something like
'out_of_space' in the sysfs tree rather than ENOSPC, for example. Then
the filesystem would just do the right thing for whatever error code
that happens to reflect. On the other hand, one could have some 3rd
party storage device that reports EIO or something else in the context
where a thin volume is out of space, for example. The user would have no
sense of how to configure that for this particular device without the
error code based configuration.

> As a result, the infrastructure is quite flexible - anywhere you
> have a struct xfs_mount you can look up an error handling
> configuration and apply it appropriately for that context. The
> patchset really only implements a metadata buffer async write IO
> error context handler, but that is just the tip of the iceberg....
> 
> I've tried to make the initialisation as generic and simple as
> possible. It's all table based, so all the default values are in one
> place and easy to find - you don't have to touch the initialisation
> code to modify the default error config, or to add new errors. Only
> sysfs handlers need to be added for new error classes and handlers.
> 
> The current sysfs layout looks like this:
> 
> $ find /sys/fs/xfs/vdc/error
> /sys/fs/xfs/vdc/error
> /sys/fs/xfs/vdc/error/kmem
> /sys/fs/xfs/vdc/error/kmem/ENOMEM
> /sys/fs/xfs/vdc/error/kmem/ENOMEM/max_retries
> /sys/fs/xfs/vdc/error/kmem/ENOMEM/failure_speed
> /sys/fs/xfs/vdc/error/kmem/ENOMEM/fail_at_unmount
> /sys/fs/xfs/vdc/error/kmem/ENOMEM/retry_timeout_seconds
> /sys/fs/xfs/vdc/error/kmem/Default
> /sys/fs/xfs/vdc/error/kmem/Default/max_retries
> /sys/fs/xfs/vdc/error/kmem/Default/failure_speed
> /sys/fs/xfs/vdc/error/kmem/Default/fail_at_unmount
> /sys/fs/xfs/vdc/error/kmem/Default/retry_timeout_seconds
> /sys/fs/xfs/vdc/error/metadata
> /sys/fs/xfs/vdc/error/metadata/EIO
> /sys/fs/xfs/vdc/error/metadata/EIO/max_retries
> /sys/fs/xfs/vdc/error/metadata/EIO/failure_speed
> /sys/fs/xfs/vdc/error/metadata/EIO/fail_at_unmount
> /sys/fs/xfs/vdc/error/metadata/EIO/retry_timeout_seconds
> /sys/fs/xfs/vdc/error/metadata/ENODEV
> /sys/fs/xfs/vdc/error/metadata/ENODEV/max_retries
> /sys/fs/xfs/vdc/error/metadata/ENODEV/failure_speed
> /sys/fs/xfs/vdc/error/metadata/ENODEV/fail_at_unmount
> /sys/fs/xfs/vdc/error/metadata/ENODEV/retry_timeout_seconds
> /sys/fs/xfs/vdc/error/metadata/ENOSPC
> /sys/fs/xfs/vdc/error/metadata/ENOSPC/max_retries
> /sys/fs/xfs/vdc/error/metadata/ENOSPC/failure_speed
> /sys/fs/xfs/vdc/error/metadata/ENOSPC/fail_at_unmount
> /sys/fs/xfs/vdc/error/metadata/ENOSPC/retry_timeout_seconds
> /sys/fs/xfs/vdc/error/metadata/Default
> /sys/fs/xfs/vdc/error/metadata/Default/max_retries
> /sys/fs/xfs/vdc/error/metadata/Default/failure_speed
> /sys/fs/xfs/vdc/error/metadata/Default/fail_at_unmount
> /sys/fs/xfs/vdc/error/metadata/Default/retry_timeout_seconds
> $
> 

The error context isn't quite clear from this organization. It is
consistent and seems functionally sane from the code. I just wonder
whether users might get confused considering that some of these errors
can originate from the filesystem vs. the underlying device. Clearly
we're interested in the latter, but how is that evident from the sysfs
interface?

Maybe tweaking the class name would help clarify a bit. E.g.,
metadata_io..?

Brian

> Which shows the classes and error handlers, and how different
> classes can have different error handlers.
> 
> The patchset runs through xfstests without problems on the default
> configuration it sets. Note that the default config for the metadata
> class now sets "fail at unmount" so errors that have been retries
> indefinitely will fail at unmount and trigger a shutdown at that
> point. This is preferable to hanging unmount as currently happens.
> 
> Anyway, have a play, have a poke, and let me know what you think...
> 
> Cheers,
> 
> Dave.
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-08-11 14:20 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-05 11:08 [RFC, PATCH 00/10] xfs: configurable error behaviours Dave Chinner
2015-08-05 11:08 ` [PATCH 01/10] xfs: remove XBF_DONE flag wrapper macros Dave Chinner
2015-08-05 11:08 ` [PATCH 02/10] xfs: configurable error behaviour via sysfs Dave Chinner
2015-08-05 11:08 ` [PATCH 03/10] xfs: introduce metadata IO error class Dave Chinner
2015-08-05 11:08 ` [PATCH 04/10] xfs: add configurable error support to metadata buffers Dave Chinner
2015-08-11 14:18   ` Brian Foster
2015-08-05 11:08 ` [PATCH 05/10] xfs: introduce table-based init for error behaviours Dave Chinner
2015-08-11 14:19   ` Brian Foster
2015-08-05 11:08 ` [PATCH 06/10] xfs: add configuration of error failure speed Dave Chinner
2015-08-11 14:19   ` Brian Foster
2015-08-05 11:08 ` [PATCH 07/10] xfs: add "fail at unmount" error handling configuration Dave Chinner
2015-08-05 11:08 ` [PATCH 08/10] xfs: add configuration handles for specific errors Dave Chinner
2015-08-05 11:08 ` [PATCH 09/10] xfs: disable specific error configurations Dave Chinner
2015-08-05 11:08 ` [PATCH 10/10] xfs: add kmem error configuration class Dave Chinner
2015-08-11 14:20 ` [RFC, PATCH 00/10] xfs: configurable error behaviours Brian Foster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.