* [PATCH v2 00/21] xfs: refactor log recovery
@ 2020-04-30  0:47 Darrick J. Wong
  2020-04-30  0:47 ` [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure Darrick J. Wong
                   ` (21 more replies)
  0 siblings, 22 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:47 UTC
  To: darrick.wong; +Cc: linux-xfs

Hi all,

This series refactors log recovery by moving the recovery code for each
log item type into the source file that implements the rest of that log
item type, and by using dispatch function pointers to virtualize the
interactions.  This dramatically reduces the amount of code in
xfs_log_recover.c and increases cohesion throughout the log code.
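
As a rough illustration of the dispatch idea (a userspace sketch, not code
from this series), each item type can carry a const table of hooks, and a
small lookup can attach the right table to each recovered item.  All of the
demo_* names and the DEMO_LI_* codes below are hypothetical stand-ins, and
the -1 error return is a placeholder rather than a kernel error code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the XFS_LI_* type codes; not the kernel values. */
enum demo_item_code {
	DEMO_LI_BUF = 1,
	DEMO_LI_INODE = 2,
	DEMO_LI_BOGUS = 99,
};

struct demo_item;

/* One table of hooks per log item type.  A NULL hook means "nothing to do". */
struct demo_item_ops {
	int	(*recover)(struct demo_item *item);
};

struct demo_item {
	enum demo_item_code		code;
	const struct demo_item_ops	*ops;
	int				replayed;
};

/* Type-specific work lives next to the type, not in the generic code. */
static int demo_buf_recover(struct demo_item *item)
{
	item->replayed = 1;
	return 0;
}

static const struct demo_item_ops demo_buf_ops = {
	.recover	= demo_buf_recover,
};

/* This type supplies no hook, so generic code skips it. */
static const struct demo_item_ops demo_inode_ops = {
	.recover	= NULL,
};

/* Attach the ops table for this item's type code; -1 if the code is unknown. */
static int demo_set_item_ops(struct demo_item *item)
{
	switch (item->code) {
	case DEMO_LI_BUF:
		item->ops = &demo_buf_ops;
		return 0;
	case DEMO_LI_INODE:
		item->ops = &demo_inode_ops;
		return 0;
	default:
		item->ops = NULL;
		return -1;
	}
}

/* Generic caller: it never needs to know which type it is handling. */
static int demo_recover_item(struct demo_item *item)
{
	int error = demo_set_item_ops(item);

	if (error)
		return error;
	if (!item->ops->recover)
		return 0;
	return item->ops->recover(item);
}
```

The generic caller only touches the ops pointer, so adding a new item type
in this sketch means adding one table and one case to the lookup.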

In this second version, we dispense with the extra indirection for log
intent items.  During log recovery pass 2, committing of the recovered
intent and intent-done items is done directly by creating
xlog_recover_item_types for all intent types.  The recovery functions
that do the work are now called directly through the xfs_log_item ops
structure.  Recovery item sorting is less intrusive, and the buffer and
inode recovery code are in separate files now.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=refactor-log-recovery


* [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
@ 2020-04-30  0:47 ` Darrick J. Wong
  2020-04-30  5:53   ` Christoph Hellwig
  2020-05-01 10:40   ` Chandan Rajendra
  2020-04-30  0:47 ` [PATCH 02/21] xfs: refactor log recovery item dispatch for pass2 readhead functions Darrick J. Wong
                   ` (20 subsequent siblings)
  21 siblings, 2 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:47 UTC
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Create a generic dispatch structure to delegate recovery of different
log item types into various code modules.  This will enable us to move
code specific to a particular log item type out of xfs_log_recover.c and
into the log item source.

The first operation we virtualize is the log item sorting.
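
To make the sorting hook concrete, here is a minimal userspace sketch,
assuming simplified stand-in types.  The demo_* names and DEMO_BLF_* flag
values are hypothetical; only the shape (an optional reorder hook, with the
item list as the default fate) follows the patch below.  The sketch tallies
how many items land in each bucket rather than moving list entries.

```c
#include <stddef.h>

/* Mirrors the four sorting buckets; DEMO_REORDER_NR is only a bucket count. */
enum demo_reorder {
	DEMO_REORDER_BUFFER_LIST,
	DEMO_REORDER_ITEM_LIST,
	DEMO_REORDER_INODE_BUFFER_LIST,
	DEMO_REORDER_CANCEL_LIST,
	DEMO_REORDER_NR,
};

/* Hypothetical flag bits standing in for XFS_BLF_CANCEL / XFS_BLF_INODE_BUF. */
#define DEMO_BLF_CANCEL		(1u << 0)
#define DEMO_BLF_INODE_BUF	(1u << 1)

struct demo_item;

struct demo_item_type {
	/* Optional: types that always use the item list leave this NULL. */
	enum demo_reorder (*reorder_fn)(const struct demo_item *item);
};

struct demo_item {
	const struct demo_item_type	*type;
	unsigned int			flags;
};

/* Buffer items are the only type whose fate depends on their contents. */
static enum demo_reorder demo_buf_reorder(const struct demo_item *item)
{
	if (item->flags & DEMO_BLF_CANCEL)
		return DEMO_REORDER_CANCEL_LIST;
	if (item->flags & DEMO_BLF_INODE_BUF)
		return DEMO_REORDER_INODE_BUFFER_LIST;
	return DEMO_REORDER_BUFFER_LIST;
}

static const struct demo_item_type demo_buf_type = {
	.reorder_fn	= demo_buf_reorder,
};

static const struct demo_item_type demo_inode_type = {
	.reorder_fn	= NULL,
};

/* Ask the type where the item goes, falling back to the item list. */
static enum demo_reorder demo_item_fate(const struct demo_item *item)
{
	enum demo_reorder fate = DEMO_REORDER_ITEM_LIST;

	if (item->type->reorder_fn)
		fate = item->type->reorder_fn(item);
	return fate;
}

/* Count how many items each bucket receives. */
static void demo_sort(const struct demo_item *items, size_t nr,
		      size_t counts[DEMO_REORDER_NR])
{
	size_t i;

	for (i = 0; i < nr; i++)
		counts[demo_item_fate(&items[i])]++;
}
```

Because the cancel flag is tested first, an item carrying both flags in
this sketch is counted as cancelled, matching the order of the checks in
xlog_buf_reorder_fn.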

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/Makefile                 |    2 +
 fs/xfs/libxfs/xfs_log_recover.h |   41 ++++++++++++++
 fs/xfs/xfs_bmap_item.c          |    7 ++
 fs/xfs/xfs_buf_item.c           |    1 
 fs/xfs/xfs_buf_item_recover.c   |   37 +++++++++++++
 fs/xfs/xfs_dquot_item.c         |    8 +++
 fs/xfs/xfs_extfree_item.c       |    7 ++
 fs/xfs/xfs_icreate_item.c       |   13 ++++
 fs/xfs/xfs_inode_item_recover.c |   25 ++++++++
 fs/xfs/xfs_log_recover.c        |  115 ++++++++++++++++++++++++++-------------
 fs/xfs/xfs_refcount_item.c      |    7 ++
 fs/xfs/xfs_rmap_item.c          |    7 ++
 12 files changed, 231 insertions(+), 39 deletions(-)
 create mode 100644 fs/xfs/xfs_buf_item_recover.c
 create mode 100644 fs/xfs/xfs_inode_item_recover.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index ee375b67ac71..5e52c2dc6078 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -120,9 +120,11 @@ xfs-y				+= xfs_log.o \
 				   xfs_log_cil.o \
 				   xfs_bmap_item.o \
 				   xfs_buf_item.o \
+				   xfs_buf_item_recover.o \
 				   xfs_extfree_item.o \
 				   xfs_icreate_item.o \
 				   xfs_inode_item.o \
+				   xfs_inode_item_recover.o \
 				   xfs_refcount_item.o \
 				   xfs_rmap_item.o \
 				   xfs_log_recover.o \
diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index 3bf671637a91..38ae9c371edb 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -6,6 +6,45 @@
 #ifndef	__XFS_LOG_RECOVER_H__
 #define __XFS_LOG_RECOVER_H__
 
+/*
+ * Each log item type (XFS_LI_*) gets its own xlog_recover_item_type to
+ * define how recovery should work for that type of log item.
+ */
+struct xlog_recover_item;
+
+/* Sorting hat for log items as they're read in. */
+enum xlog_recover_reorder {
+	XLOG_REORDER_BUFFER_LIST,
+	XLOG_REORDER_ITEM_LIST,
+	XLOG_REORDER_INODE_BUFFER_LIST,
+	XLOG_REORDER_CANCEL_LIST,
+};
+
+struct xlog_recover_item_type {
+	/*
+	 * Help sort recovered log items into the order required to replay them
+	 * correctly.  Log item types that always use XLOG_REORDER_ITEM_LIST do
+	 * not have to supply a function here.  See the comment preceding
+	 * xlog_recover_reorder_trans for more details about what the return
+	 * values mean.
+	 */
+	enum xlog_recover_reorder (*reorder_fn)(struct xlog_recover_item *item);
+};
+
+extern const struct xlog_recover_item_type xlog_icreate_item_type;
+extern const struct xlog_recover_item_type xlog_buf_item_type;
+extern const struct xlog_recover_item_type xlog_inode_item_type;
+extern const struct xlog_recover_item_type xlog_dquot_item_type;
+extern const struct xlog_recover_item_type xlog_quotaoff_item_type;
+extern const struct xlog_recover_item_type xlog_bmap_intent_item_type;
+extern const struct xlog_recover_item_type xlog_bmap_done_item_type;
+extern const struct xlog_recover_item_type xlog_extfree_intent_item_type;
+extern const struct xlog_recover_item_type xlog_extfree_done_item_type;
+extern const struct xlog_recover_item_type xlog_rmap_intent_item_type;
+extern const struct xlog_recover_item_type xlog_rmap_done_item_type;
+extern const struct xlog_recover_item_type xlog_refcount_intent_item_type;
+extern const struct xlog_recover_item_type xlog_refcount_done_item_type;
+
 /*
  * Macros, structures, prototypes for internal log manager use.
  */
@@ -24,10 +63,10 @@
  */
 typedef struct xlog_recover_item {
 	struct list_head	ri_list;
-	int			ri_type;
 	int			ri_cnt;	/* count of regions found */
 	int			ri_total;	/* total regions */
 	xfs_log_iovec_t		*ri_buf;	/* ptr to regions buffer */
+	const struct xlog_recover_item_type *ri_type;
 } xlog_recover_item_t;
 
 struct xlog_recover {
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index ee6f4229cebc..a2824013e2cb 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -22,6 +22,7 @@
 #include "xfs_bmap_btree.h"
 #include "xfs_trans_space.h"
 #include "xfs_error.h"
+#include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_bui_zone;
 kmem_zone_t	*xfs_bud_zone;
@@ -563,3 +564,9 @@ xfs_bui_recover(
 	}
 	return error;
 }
+
+const struct xlog_recover_item_type xlog_bmap_intent_item_type = {
+};
+
+const struct xlog_recover_item_type xlog_bmap_done_item_type = {
+};
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 1545657c3ca0..a416fc35e444 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -17,7 +17,6 @@
 #include "xfs_trace.h"
 #include "xfs_log.h"
 
-
 kmem_zone_t	*xfs_buf_item_zone;
 
 static inline struct xfs_buf_log_item *BUF_ITEM(struct xfs_log_item *lip)
diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
new file mode 100644
index 000000000000..07ddf58209c3
--- /dev/null
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2000-2006 Silicon Graphics, Inc.
+ * All Rights Reserved.
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_mount.h"
+#include "xfs_trans.h"
+#include "xfs_buf_item.h"
+#include "xfs_trans_priv.h"
+#include "xfs_trace.h"
+#include "xfs_log.h"
+#include "xfs_log_priv.h"
+#include "xfs_log_recover.h"
+
+STATIC enum xlog_recover_reorder
+xlog_buf_reorder_fn(
+	struct xlog_recover_item	*item)
+{
+	struct xfs_buf_log_format	*buf_f = item->ri_buf[0].i_addr;
+
+	if (buf_f->blf_flags & XFS_BLF_CANCEL)
+		return XLOG_REORDER_CANCEL_LIST;
+	if (buf_f->blf_flags & XFS_BLF_INODE_BUF)
+		return XLOG_REORDER_INODE_BUFFER_LIST;
+	return XLOG_REORDER_BUFFER_LIST;
+}
+
+const struct xlog_recover_item_type xlog_buf_item_type = {
+	.reorder_fn		= xlog_buf_reorder_fn,
+};
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index baad1748d0d1..3bd5b6c7e235 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -17,6 +17,8 @@
 #include "xfs_trans_priv.h"
 #include "xfs_qm.h"
 #include "xfs_log.h"
+#include "xfs_log_priv.h"
+#include "xfs_log_recover.h"
 
 static inline struct xfs_dq_logitem *DQUOT_ITEM(struct xfs_log_item *lip)
 {
@@ -383,3 +385,9 @@ xfs_qm_qoff_logitem_init(
 	qf->qql_flags = flags;
 	return qf;
 }
+
+const struct xlog_recover_item_type xlog_dquot_item_type = {
+};
+
+const struct xlog_recover_item_type xlog_quotaoff_item_type = {
+};
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 6ea847f6e298..c53e5f46ee26 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -22,6 +22,7 @@
 #include "xfs_bmap.h"
 #include "xfs_trace.h"
 #include "xfs_error.h"
+#include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_efi_zone;
 kmem_zone_t	*xfs_efd_zone;
@@ -652,3 +653,9 @@ xfs_efi_recover(
 	xfs_trans_cancel(tp);
 	return error;
 }
+
+const struct xlog_recover_item_type xlog_extfree_intent_item_type = {
+};
+
+const struct xlog_recover_item_type xlog_extfree_done_item_type = {
+};
diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c
index 490fee22b878..9f38a3c200a3 100644
--- a/fs/xfs/xfs_icreate_item.c
+++ b/fs/xfs/xfs_icreate_item.c
@@ -11,6 +11,8 @@
 #include "xfs_trans_priv.h"
 #include "xfs_icreate_item.h"
 #include "xfs_log.h"
+#include "xfs_log_priv.h"
+#include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_icreate_zone;		/* inode create item zone */
 
@@ -107,3 +109,14 @@ xfs_icreate_log(
 	tp->t_flags |= XFS_TRANS_DIRTY;
 	set_bit(XFS_LI_DIRTY, &icp->ic_item.li_flags);
 }
+
+static enum xlog_recover_reorder
+xlog_icreate_reorder(
+		struct xlog_recover_item *item)
+{
+	return XLOG_REORDER_BUFFER_LIST;
+}
+
+const struct xlog_recover_item_type xlog_icreate_item_type = {
+	.reorder_fn		= xlog_icreate_reorder,
+};
diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
new file mode 100644
index 000000000000..478f0a5c08ab
--- /dev/null
+++ b/fs/xfs/xfs_inode_item_recover.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2000-2006 Silicon Graphics, Inc.
+ * All Rights Reserved.
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_inode.h"
+#include "xfs_trans.h"
+#include "xfs_inode_item.h"
+#include "xfs_trace.h"
+#include "xfs_trans_priv.h"
+#include "xfs_buf_item.h"
+#include "xfs_log.h"
+#include "xfs_error.h"
+#include "xfs_log_priv.h"
+#include "xfs_log_recover.h"
+
+const struct xlog_recover_item_type xlog_inode_item_type = {
+};
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index db47dfc0cada..8ab107680883 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1786,6 +1786,57 @@ xlog_clear_stale_blocks(
  ******************************************************************************
  */
 
+static int
+xlog_set_item_type(
+	struct xlog_recover_item		*item)
+{
+	switch (ITEM_TYPE(item)) {
+	case XFS_LI_ICREATE:
+		item->ri_type = &xlog_icreate_item_type;
+		return 0;
+	case XFS_LI_BUF:
+		item->ri_type = &xlog_buf_item_type;
+		return 0;
+	case XFS_LI_EFI:
+		item->ri_type = &xlog_extfree_intent_item_type;
+		return 0;
+	case XFS_LI_EFD:
+		item->ri_type = &xlog_extfree_done_item_type;
+		return 0;
+	case XFS_LI_RUI:
+		item->ri_type = &xlog_rmap_intent_item_type;
+		return 0;
+	case XFS_LI_RUD:
+		item->ri_type = &xlog_rmap_done_item_type;
+		return 0;
+	case XFS_LI_CUI:
+		item->ri_type = &xlog_refcount_intent_item_type;
+		return 0;
+	case XFS_LI_CUD:
+		item->ri_type = &xlog_refcount_done_item_type;
+		return 0;
+	case XFS_LI_BUI:
+		item->ri_type = &xlog_bmap_intent_item_type;
+		return 0;
+	case XFS_LI_BUD:
+		item->ri_type = &xlog_bmap_done_item_type;
+		return 0;
+	case XFS_LI_INODE:
+		item->ri_type = &xlog_inode_item_type;
+		return 0;
+#ifdef CONFIG_XFS_QUOTA
+	case XFS_LI_DQUOT:
+		item->ri_type = &xlog_dquot_item_type;
+		return 0;
+	case XFS_LI_QUOTAOFF:
+		item->ri_type = &xlog_quotaoff_item_type;
+		return 0;
+#endif /* CONFIG_XFS_QUOTA */
+	default:
+		return -EFSCORRUPTED;
+	}
+}
+
 /*
  * Sort the log items in the transaction.
  *
@@ -1851,41 +1902,10 @@ xlog_recover_reorder_trans(
 
 	list_splice_init(&trans->r_itemq, &sort_list);
 	list_for_each_entry_safe(item, n, &sort_list, ri_list) {
-		xfs_buf_log_format_t	*buf_f = item->ri_buf[0].i_addr;
+		enum xlog_recover_reorder	fate = XLOG_REORDER_ITEM_LIST;
 
-		switch (ITEM_TYPE(item)) {
-		case XFS_LI_ICREATE:
-			list_move_tail(&item->ri_list, &buffer_list);
-			break;
-		case XFS_LI_BUF:
-			if (buf_f->blf_flags & XFS_BLF_CANCEL) {
-				trace_xfs_log_recover_item_reorder_head(log,
-							trans, item, pass);
-				list_move(&item->ri_list, &cancel_list);
-				break;
-			}
-			if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
-				list_move(&item->ri_list, &inode_buffer_list);
-				break;
-			}
-			list_move_tail(&item->ri_list, &buffer_list);
-			break;
-		case XFS_LI_INODE:
-		case XFS_LI_DQUOT:
-		case XFS_LI_QUOTAOFF:
-		case XFS_LI_EFD:
-		case XFS_LI_EFI:
-		case XFS_LI_RUI:
-		case XFS_LI_RUD:
-		case XFS_LI_CUI:
-		case XFS_LI_CUD:
-		case XFS_LI_BUI:
-		case XFS_LI_BUD:
-			trace_xfs_log_recover_item_reorder_tail(log,
-							trans, item, pass);
-			list_move_tail(&item->ri_list, &item_list);
-			break;
-		default:
+		error = xlog_set_item_type(item);
+		if (error) {
 			xfs_warn(log->l_mp,
 				"%s: unrecognized type of log operation (%d)",
 				__func__, ITEM_TYPE(item));
@@ -1896,11 +1916,32 @@ xlog_recover_reorder_trans(
 			 */
 			if (!list_empty(&sort_list))
 				list_splice_init(&sort_list, &trans->r_itemq);
-			error = -EIO;
-			goto out;
+			break;
+		}
+
+		if (item->ri_type->reorder_fn)
+			fate = item->ri_type->reorder_fn(item);
+
+		switch (fate) {
+		case XLOG_REORDER_BUFFER_LIST:
+			list_move_tail(&item->ri_list, &buffer_list);
+			break;
+		case XLOG_REORDER_CANCEL_LIST:
+			trace_xfs_log_recover_item_reorder_head(log,
+					trans, item, pass);
+			list_move(&item->ri_list, &cancel_list);
+			break;
+		case XLOG_REORDER_INODE_BUFFER_LIST:
+			list_move(&item->ri_list, &inode_buffer_list);
+			break;
+		case XLOG_REORDER_ITEM_LIST:
+			trace_xfs_log_recover_item_reorder_tail(log,
+							trans, item, pass);
+			list_move_tail(&item->ri_list, &item_list);
+			break;
 		}
 	}
-out:
+
 	ASSERT(list_empty(&sort_list));
 	if (!list_empty(&buffer_list))
 		list_splice(&buffer_list, &trans->r_itemq);
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 8eeed73928cd..ddab09385bfb 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -18,6 +18,7 @@
 #include "xfs_log.h"
 #include "xfs_refcount.h"
 #include "xfs_error.h"
+#include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_cui_zone;
 kmem_zone_t	*xfs_cud_zone;
@@ -590,3 +591,9 @@ xfs_cui_recover(
 	xfs_trans_cancel(tp);
 	return error;
 }
+
+const struct xlog_recover_item_type xlog_refcount_intent_item_type = {
+};
+
+const struct xlog_recover_item_type xlog_refcount_done_item_type = {
+};
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 4911b68f95dd..bcad3db1f3a4 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -18,6 +18,7 @@
 #include "xfs_log.h"
 #include "xfs_rmap.h"
 #include "xfs_error.h"
+#include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_rui_zone;
 kmem_zone_t	*xfs_rud_zone;
@@ -606,3 +607,9 @@ xfs_rui_recover(
 	xfs_trans_cancel(tp);
 	return error;
 }
+
+const struct xlog_recover_item_type xlog_rmap_intent_item_type = {
+};
+
+const struct xlog_recover_item_type xlog_rmap_done_item_type = {
+};



* [PATCH 02/21] xfs: refactor log recovery item dispatch for pass2 readhead functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
  2020-04-30  0:47 ` [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure Darrick J. Wong
@ 2020-04-30  0:47 ` Darrick J. Wong
  2020-05-01 12:10   ` Chandan Rajendra
  2020-04-30  0:47 ` [PATCH 03/21] xfs: refactor log recovery item dispatch for pass1 commit functions Darrick J. Wong
                   ` (19 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:47 UTC
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the pass2 readahead functions into the per-item source code files
and use the dispatch function to call them.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_log_recover.h |    6 ++
 fs/xfs/xfs_buf_item_recover.c   |   11 +++++
 fs/xfs/xfs_dquot_item.c         |   34 ++++++++++++++
 fs/xfs/xfs_inode_item_recover.c |   19 ++++++++
 fs/xfs/xfs_log_recover.c        |   95 +--------------------------------------
 5 files changed, 73 insertions(+), 92 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index 38ae9c371edb..1463eba47254 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -29,6 +29,9 @@ struct xlog_recover_item_type {
 	 * values mean.
 	 */
 	enum xlog_recover_reorder (*reorder_fn)(struct xlog_recover_item *item);
+
+	/* Start readahead for pass2, if provided. */
+	void (*ra_pass2_fn)(struct xlog *log, struct xlog_recover_item *item);
 };
 
 extern const struct xlog_recover_item_type xlog_icreate_item_type;
@@ -90,4 +93,7 @@ struct xlog_recover {
 #define	XLOG_RECOVER_PASS1	1
 #define	XLOG_RECOVER_PASS2	2
 
+void xlog_buf_readahead(struct xlog *log, xfs_daddr_t blkno, uint len,
+		const struct xfs_buf_ops *ops);
+
 #endif	/* __XFS_LOG_RECOVER_H__ */
diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index 07ddf58209c3..c756b8e55fde 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -32,6 +32,17 @@ xlog_buf_reorder_fn(
 	return XLOG_REORDER_BUFFER_LIST;
 }
 
+STATIC void
+xlog_recover_buffer_ra_pass2(
+	struct xlog                     *log,
+	struct xlog_recover_item        *item)
+{
+	struct xfs_buf_log_format	*buf_f = item->ri_buf[0].i_addr;
+
+	xlog_buf_readahead(log, buf_f->blf_blkno, buf_f->blf_len, NULL);
+}
+
 const struct xlog_recover_item_type xlog_buf_item_type = {
 	.reorder_fn		= xlog_buf_reorder_fn,
+	.ra_pass2_fn		= xlog_recover_buffer_ra_pass2,
 };
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index 3bd5b6c7e235..2a05d1239423 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -386,7 +386,41 @@ xfs_qm_qoff_logitem_init(
 	return qf;
 }
 
+STATIC void
+xlog_recover_dquot_ra_pass2(
+	struct xlog			*log,
+	struct xlog_recover_item	*item)
+{
+	struct xfs_mount	*mp = log->l_mp;
+	struct xfs_disk_dquot	*recddq;
+	struct xfs_dq_logformat	*dq_f;
+	uint			type;
+
+	if (mp->m_qflags == 0)
+		return;
+
+	recddq = item->ri_buf[1].i_addr;
+	if (recddq == NULL)
+		return;
+	if (item->ri_buf[1].i_len < sizeof(struct xfs_disk_dquot))
+		return;
+
+	type = recddq->d_flags & (XFS_DQ_USER | XFS_DQ_PROJ | XFS_DQ_GROUP);
+	ASSERT(type);
+	if (log->l_quotaoffs_flag & type)
+		return;
+
+	dq_f = item->ri_buf[0].i_addr;
+	ASSERT(dq_f);
+	ASSERT(dq_f->qlf_len == 1);
+
+	xlog_buf_readahead(log, dq_f->qlf_blkno,
+			XFS_FSB_TO_BB(mp, dq_f->qlf_len),
+			&xfs_dquot_buf_ra_ops);
+}
+
 const struct xlog_recover_item_type xlog_dquot_item_type = {
+	.ra_pass2_fn		= xlog_recover_dquot_ra_pass2,
 };
 
 const struct xlog_recover_item_type xlog_quotaoff_item_type = {
diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
index 478f0a5c08ab..d97d8caa4652 100644
--- a/fs/xfs/xfs_inode_item_recover.c
+++ b/fs/xfs/xfs_inode_item_recover.c
@@ -21,5 +21,24 @@
 #include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
 
+STATIC void
+xlog_recover_inode_ra_pass2(
+	struct xlog                     *log,
+	struct xlog_recover_item        *item)
+{
+	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
+		struct xfs_inode_log_format	*ilfp = item->ri_buf[0].i_addr;
+
+		xlog_buf_readahead(log, ilfp->ilf_blkno, ilfp->ilf_len,
+				   &xfs_inode_buf_ra_ops);
+	} else {
+		struct xfs_inode_log_format_32	*ilfp = item->ri_buf[0].i_addr;
+
+		xlog_buf_readahead(log, ilfp->ilf_blkno, ilfp->ilf_len,
+				   &xfs_inode_buf_ra_ops);
+	}
+}
+
 const struct xlog_recover_item_type xlog_inode_item_type = {
+	.ra_pass2_fn		= xlog_recover_inode_ra_pass2,
 };
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 8ab107680883..b61323cc5a11 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2045,7 +2045,7 @@ xlog_put_buffer_cancelled(
 	return true;
 }
 
-static void
+void
 xlog_buf_readahead(
 	struct xlog		*log,
 	xfs_daddr_t		blkno,
@@ -3912,96 +3912,6 @@ xlog_recover_do_icreate_pass2(
 				     length, be32_to_cpu(icl->icl_gen));
 }
 
-STATIC void
-xlog_recover_buffer_ra_pass2(
-	struct xlog                     *log,
-	struct xlog_recover_item        *item)
-{
-	struct xfs_buf_log_format	*buf_f = item->ri_buf[0].i_addr;
-
-	xlog_buf_readahead(log, buf_f->blf_blkno, buf_f->blf_len, NULL);
-}
-
-STATIC void
-xlog_recover_inode_ra_pass2(
-	struct xlog                     *log,
-	struct xlog_recover_item        *item)
-{
-	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
-		struct xfs_inode_log_format	*ilfp = item->ri_buf[0].i_addr;
-
-		xlog_buf_readahead(log, ilfp->ilf_blkno, ilfp->ilf_len,
-				   &xfs_inode_buf_ra_ops);
-	} else {
-		struct xfs_inode_log_format_32	*ilfp = item->ri_buf[0].i_addr;
-
-		xlog_buf_readahead(log, ilfp->ilf_blkno, ilfp->ilf_len,
-				   &xfs_inode_buf_ra_ops);
-	}
-}
-
-STATIC void
-xlog_recover_dquot_ra_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item)
-{
-	struct xfs_mount	*mp = log->l_mp;
-	struct xfs_disk_dquot	*recddq;
-	struct xfs_dq_logformat	*dq_f;
-	uint			type;
-
-	if (mp->m_qflags == 0)
-		return;
-
-	recddq = item->ri_buf[1].i_addr;
-	if (recddq == NULL)
-		return;
-	if (item->ri_buf[1].i_len < sizeof(struct xfs_disk_dquot))
-		return;
-
-	type = recddq->d_flags & (XFS_DQ_USER | XFS_DQ_PROJ | XFS_DQ_GROUP);
-	ASSERT(type);
-	if (log->l_quotaoffs_flag & type)
-		return;
-
-	dq_f = item->ri_buf[0].i_addr;
-	ASSERT(dq_f);
-	ASSERT(dq_f->qlf_len == 1);
-
-	xlog_buf_readahead(log, dq_f->qlf_blkno,
-			XFS_FSB_TO_BB(mp, dq_f->qlf_len),
-			&xfs_dquot_buf_ra_ops);
-}
-
-STATIC void
-xlog_recover_ra_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item)
-{
-	switch (ITEM_TYPE(item)) {
-	case XFS_LI_BUF:
-		xlog_recover_buffer_ra_pass2(log, item);
-		break;
-	case XFS_LI_INODE:
-		xlog_recover_inode_ra_pass2(log, item);
-		break;
-	case XFS_LI_DQUOT:
-		xlog_recover_dquot_ra_pass2(log, item);
-		break;
-	case XFS_LI_EFI:
-	case XFS_LI_EFD:
-	case XFS_LI_QUOTAOFF:
-	case XFS_LI_RUI:
-	case XFS_LI_RUD:
-	case XFS_LI_CUI:
-	case XFS_LI_CUD:
-	case XFS_LI_BUI:
-	case XFS_LI_BUD:
-	default:
-		break;
-	}
-}
-
 STATIC int
 xlog_recover_commit_pass1(
 	struct xlog			*log,
@@ -4138,7 +4048,8 @@ xlog_recover_commit_trans(
 			error = xlog_recover_commit_pass1(log, trans, item);
 			break;
 		case XLOG_RECOVER_PASS2:
-			xlog_recover_ra_pass2(log, item);
+			if (item->ri_type && item->ri_type->ra_pass2_fn)
+				item->ri_type->ra_pass2_fn(log, item);
 			list_move_tail(&item->ri_list, &ra_list);
 			items_queued++;
 			if (items_queued >= XLOG_RECOVER_COMMIT_QUEUE_MAX) {



* [PATCH 03/21] xfs: refactor log recovery item dispatch for pass1 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
  2020-04-30  0:47 ` [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure Darrick J. Wong
  2020-04-30  0:47 ` [PATCH 02/21] xfs: refactor log recovery item dispatch for pass2 readhead functions Darrick J. Wong
@ 2020-04-30  0:47 ` Darrick J. Wong
  2020-04-30  0:48 ` [PATCH 04/21] xfs: refactor log recovery buffer item dispatch for pass2 " Darrick J. Wong
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:47 UTC
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the pass1 commit functions into the per-item source code files and
use the dispatch function to call them.
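
The calling convention can be sketched in userspace as follows, under
simplified assumptions.  The demo_* types, the DEMO_* flag bits, and the
DEMO_EFSCORRUPTED value are hypothetical placeholders; the sketch only
mirrors the three outcomes of the reworked xlog_recover_commit_pass1: no
type means corruption, no hook means nothing to do, otherwise call the hook.

```c
#include <stddef.h>

/* Hypothetical placeholder for the kernel's EFSCORRUPTED error code. */
#define DEMO_EFSCORRUPTED	117

/* Hypothetical bits standing in for XFS_*QUOTA_ACCT and XFS_DQ_* flags. */
#define DEMO_UQUOTA_ACCT	(1u << 0)
#define DEMO_GQUOTA_ACCT	(1u << 1)
#define DEMO_DQ_USER		(1u << 0)
#define DEMO_DQ_GROUP		(1u << 1)

/* Shared recovery state that pass1 hooks are allowed to update. */
struct demo_log {
	unsigned int	quotaoffs_flag;
};

struct demo_item;

struct demo_item_type {
	/* Optional: most item types have no pass1 work and leave this NULL. */
	int (*commit_pass1_fn)(struct demo_log *log, struct demo_item *item);
};

struct demo_item {
	const struct demo_item_type	*type;
	unsigned int			qf_flags;
};

/* Note which quota types were turned off so later passes can skip them. */
static int demo_quotaoff_commit_pass1(struct demo_log *log,
				      struct demo_item *item)
{
	if (item->qf_flags & DEMO_UQUOTA_ACCT)
		log->quotaoffs_flag |= DEMO_DQ_USER;
	if (item->qf_flags & DEMO_GQUOTA_ACCT)
		log->quotaoffs_flag |= DEMO_DQ_GROUP;
	return 0;
}

static const struct demo_item_type demo_quotaoff_type = {
	.commit_pass1_fn	= demo_quotaoff_commit_pass1,
};

static const struct demo_item_type demo_inode_type = {
	.commit_pass1_fn	= NULL,
};

/* Generic pass1 entry point: no per-type switch statement needed. */
static int demo_commit_pass1(struct demo_log *log, struct demo_item *item)
{
	if (!item->type)
		return -DEMO_EFSCORRUPTED;
	if (!item->type->commit_pass1_fn)
		return 0;
	return item->type->commit_pass1_fn(log, item);
}
```

In this sketch the long list of "nothing to do in pass 1" cases collapses
into the single NULL-hook check.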

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_log_recover.h |    5 +++
 fs/xfs/xfs_buf_item_recover.c   |   27 ++++++++++++++
 fs/xfs/xfs_dquot_item.c         |   28 ++++++++++++++
 fs/xfs/xfs_log_recover.c        |   78 +++------------------------------------
 4 files changed, 65 insertions(+), 73 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index 1463eba47254..b933dc8bb8a3 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -32,6 +32,10 @@ struct xlog_recover_item_type {
 
 	/* Start readahead for pass2, if provided. */
 	void (*ra_pass2_fn)(struct xlog *log, struct xlog_recover_item *item);
+
+	/* Do whatever work we need to do for pass1, if provided. */
+	int (*commit_pass1_fn)(struct xlog *log,
+			       struct xlog_recover_item *item);
 };
 
 extern const struct xlog_recover_item_type xlog_icreate_item_type;
@@ -95,5 +99,6 @@ struct xlog_recover {
 
 void xlog_buf_readahead(struct xlog *log, xfs_daddr_t blkno, uint len,
 		const struct xfs_buf_ops *ops);
+bool xlog_add_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
 
 #endif	/* __XFS_LOG_RECOVER_H__ */
diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index c756b8e55fde..deda3ad32d95 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -42,7 +42,34 @@ xlog_recover_buffer_ra_pass2(
 	xlog_buf_readahead(log, buf_f->blf_blkno, buf_f->blf_len, NULL);
 }
 
+/*
+ * Build up the table of buf cancel records so that we don't replay cancelled
+ * data in the second pass.
+ */
+static int
+xlog_recover_buffer_commit_pass1(
+	struct xlog			*log,
+	struct xlog_recover_item	*item)
+{
+	struct xfs_buf_log_format	*bf = item->ri_buf[0].i_addr;
+
+	if (!xfs_buf_log_check_iovec(&item->ri_buf[0])) {
+		xfs_err(log->l_mp, "bad buffer log item size (%d)",
+				item->ri_buf[0].i_len);
+		return -EFSCORRUPTED;
+	}
+
+	if (!(bf->blf_flags & XFS_BLF_CANCEL))
+		trace_xfs_log_recover_buf_not_cancel(log, bf);
+	else if (xlog_add_buffer_cancelled(log, bf->blf_blkno, bf->blf_len))
+		trace_xfs_log_recover_buf_cancel_add(log, bf);
+	else
+		trace_xfs_log_recover_buf_cancel_ref_inc(log, bf);
+	return 0;
+}
+
 const struct xlog_recover_item_type xlog_buf_item_type = {
 	.reorder_fn		= xlog_buf_reorder_fn,
 	.ra_pass2_fn		= xlog_recover_buffer_ra_pass2,
+	.commit_pass1_fn	= xlog_recover_buffer_commit_pass1,
 };
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index 2a05d1239423..4d18af49adfe 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -423,5 +423,33 @@ const struct xlog_recover_item_type xlog_dquot_item_type = {
 	.ra_pass2_fn		= xlog_recover_dquot_ra_pass2,
 };
 
+/*
+ * Recover QUOTAOFF records. We simply make a note of it in the xlog
+ * structure, so that we know not to do any dquot item or dquot buffer recovery,
+ * of that type.
+ */
+STATIC int
+xlog_recover_quotaoff_commit_pass1(
+	struct xlog			*log,
+	struct xlog_recover_item	*item)
+{
+	struct xfs_qoff_logformat	*qoff_f = item->ri_buf[0].i_addr;
+	ASSERT(qoff_f);
+
+	/*
+	 * The logitem format's flag tells us if this was user quotaoff,
+	 * group/project quotaoff or both.
+	 */
+	if (qoff_f->qf_flags & XFS_UQUOTA_ACCT)
+		log->l_quotaoffs_flag |= XFS_DQ_USER;
+	if (qoff_f->qf_flags & XFS_PQUOTA_ACCT)
+		log->l_quotaoffs_flag |= XFS_DQ_PROJ;
+	if (qoff_f->qf_flags & XFS_GQUOTA_ACCT)
+		log->l_quotaoffs_flag |= XFS_DQ_GROUP;
+
+	return 0;
+}
+
 const struct xlog_recover_item_type xlog_quotaoff_item_type = {
+	.commit_pass1_fn	= xlog_recover_quotaoff_commit_pass1,
 };
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index b61323cc5a11..fbd1f7d6f1c9 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1975,7 +1975,7 @@ xlog_find_buffer_cancelled(
 	return NULL;
 }
 
-static bool
+bool
 xlog_add_buffer_cancelled(
 	struct xlog		*log,
 	xfs_daddr_t		blkno,
@@ -2056,32 +2056,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-/*
- * Build up the table of buf cancel records so that we don't replay cancelled
- * data in the second pass.
- */
-static int
-xlog_recover_buffer_pass1(
-	struct xlog			*log,
-	struct xlog_recover_item	*item)
-{
-	struct xfs_buf_log_format	*bf = item->ri_buf[0].i_addr;
-
-	if (!xfs_buf_log_check_iovec(&item->ri_buf[0])) {
-		xfs_err(log->l_mp, "bad buffer log item size (%d)",
-				item->ri_buf[0].i_len);
-		return -EFSCORRUPTED;
-	}
-
-	if (!(bf->blf_flags & XFS_BLF_CANCEL))
-		trace_xfs_log_recover_buf_not_cancel(log, bf);
-	else if (xlog_add_buffer_cancelled(log, bf->blf_blkno, bf->blf_len))
-		trace_xfs_log_recover_buf_cancel_add(log, bf);
-	else
-		trace_xfs_log_recover_buf_cancel_ref_inc(log, bf);
-	return 0;
-}
-
 /*
  * Perform recovery for a buffer full of inodes.  In these buffers, the only
  * data which should be recovered is that which corresponds to the
@@ -3219,33 +3193,6 @@ xlog_recover_inode_pass2(
 	return error;
 }
 
-/*
- * Recover QUOTAOFF records. We simply make a note of it in the xlog
- * structure, so that we know not to do any dquot item or dquot buffer recovery,
- * of that type.
- */
-STATIC int
-xlog_recover_quotaoff_pass1(
-	struct xlog			*log,
-	struct xlog_recover_item	*item)
-{
-	xfs_qoff_logformat_t	*qoff_f = item->ri_buf[0].i_addr;
-	ASSERT(qoff_f);
-
-	/*
-	 * The logitem format's flag tells us if this was user quotaoff,
-	 * group/project quotaoff or both.
-	 */
-	if (qoff_f->qf_flags & XFS_UQUOTA_ACCT)
-		log->l_quotaoffs_flag |= XFS_DQ_USER;
-	if (qoff_f->qf_flags & XFS_PQUOTA_ACCT)
-		log->l_quotaoffs_flag |= XFS_DQ_PROJ;
-	if (qoff_f->qf_flags & XFS_GQUOTA_ACCT)
-		log->l_quotaoffs_flag |= XFS_DQ_GROUP;
-
-	return 0;
-}
-
 /*
  * Recover a dquot record
  */
@@ -3920,30 +3867,15 @@ xlog_recover_commit_pass1(
 {
 	trace_xfs_log_recover_item_recover(log, trans, item, XLOG_RECOVER_PASS1);
 
-	switch (ITEM_TYPE(item)) {
-	case XFS_LI_BUF:
-		return xlog_recover_buffer_pass1(log, item);
-	case XFS_LI_QUOTAOFF:
-		return xlog_recover_quotaoff_pass1(log, item);
-	case XFS_LI_INODE:
-	case XFS_LI_EFI:
-	case XFS_LI_EFD:
-	case XFS_LI_DQUOT:
-	case XFS_LI_ICREATE:
-	case XFS_LI_RUI:
-	case XFS_LI_RUD:
-	case XFS_LI_CUI:
-	case XFS_LI_CUD:
-	case XFS_LI_BUI:
-	case XFS_LI_BUD:
-		/* nothing to do in pass 1 */
-		return 0;
-	default:
+	if (!item->ri_type) {
 		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
 			__func__, ITEM_TYPE(item));
 		ASSERT(0);
 		return -EFSCORRUPTED;
 	}
+	if (!item->ri_type->commit_pass1_fn)
+		return 0;
+	return item->ri_type->commit_pass1_fn(log, item);
 }
 
 STATIC int



* [PATCH 04/21] xfs: refactor log recovery buffer item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (2 preceding siblings ...)
  2020-04-30  0:47 ` [PATCH 03/21] xfs: refactor log recovery item dispatch for pass1 commit functions Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-05-01 13:43   ` Chandan Rajendra
  2020-04-30  0:48 ` [PATCH 05/21] xfs: refactor log recovery inode " Darrick J. Wong
                   ` (17 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the log buffer item pass2 commit code into the per-item source code
files and use the dispatch function to call it.  We do these one at a
time because there's a lot of code to move.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
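For anyone skimming the diffstat first, here is a minimal userspace sketch of
the dispatch pattern that commit_pass2_fn plugs into.  It assumes the pass2
dispatcher has the same shape as the pass1 dispatcher from the previous patch
(an item with no type is treated as corruption, a NULL hook means there is
nothing to do).  Every demo_* name and the DEMO_EFSCORRUPTED value are
hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct demo_item;

struct demo_item_type {
	/* Both hooks are optional: NULL means "nothing to do in this pass". */
	int (*commit_pass1_fn)(struct demo_item *item);
	int (*commit_pass2_fn)(struct demo_item *item, long long lsn);
};

struct demo_item {
	const struct demo_item_type	*ri_type;	/* NULL if type unknown */
	int				replayed;
};

#define DEMO_EFSCORRUPTED	(-117)	/* placeholder error value */

/* Per-item-type pass2 handler; the real one lives next to the item code. */
static int demo_buf_commit_pass2(struct demo_item *item, long long lsn)
{
	(void)lsn;
	item->replayed = 1;
	return 0;
}

static const struct demo_item_type demo_buf_item_type = {
	.commit_pass2_fn	= demo_buf_commit_pass2,
};

/* Generic dispatcher: no switch statement, just the ops table. */
static int demo_commit_pass2(struct demo_item *item, long long lsn)
{
	assert(item != NULL);

	if (!item->ri_type)
		return DEMO_EFSCORRUPTED;
	if (!item->ri_type->commit_pass2_fn)
		return 0;
	return item->ri_type->commit_pass2_fn(item, lsn);
}
```

The point of the layout is that adding a new log item type only means filling
in another ops table; the dispatcher itself stays untouched.
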
 fs/xfs/libxfs/xfs_log_recover.h |   23 +
 fs/xfs/xfs_buf_item_recover.c   |  790 +++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_log_recover.c        |  798 ---------------------------------------
 3 files changed, 820 insertions(+), 791 deletions(-)
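
The "skip replay if the on-disk buffer is already newer" test in
xlog_recover_buffer_commit_pass2() can be sketched in userspace as follows.
This assumes an LSN packs the cycle number into the upper 32 bits and the
block number into the lower 32, and that XFS_LSN_CMP orders by cycle first and
block second.  The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t demo_lsn_t;	/* stand-in for xfs_lsn_t */

/* Assumed layout: cycle in the high 32 bits, block in the low 32 bits. */
static uint32_t demo_cycle(demo_lsn_t lsn)
{
	return (uint32_t)((uint64_t)lsn >> 32);
}

static uint32_t demo_block(demo_lsn_t lsn)
{
	return (uint32_t)lsn;
}

/* Returns <0, 0 or >0, comparing by cycle first and then by block. */
static int demo_lsn_cmp(demo_lsn_t a, demo_lsn_t b)
{
	if (demo_cycle(a) != demo_cycle(b))
		return demo_cycle(a) < demo_cycle(b) ? -1 : 1;
	if (demo_block(a) != demo_block(b))
		return demo_block(a) < demo_block(b) ? -1 : 1;
	return 0;
}

/*
 * Skip replay only when the buffer carries a usable LSN (neither 0 nor the
 * -1 "recover immediately" marker) that is at or beyond the LSN of the
 * transaction being replayed.
 */
static bool demo_should_skip_replay(demo_lsn_t buf_lsn, demo_lsn_t current_lsn)
{
	/* Assumption: the transaction being replayed always has a real LSN. */
	assert(current_lsn != (demo_lsn_t)-1);

	return buf_lsn != 0 && buf_lsn != (demo_lsn_t)-1 &&
	       demo_lsn_cmp(buf_lsn, current_lsn) >= 0;
}
```

A buffer whose LSN equals the transaction LSN is skipped as well, since the
comparison is >= 0 and not > 0.
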


diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index b933dc8bb8a3..5017d80c0f4b 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -36,6 +36,26 @@ struct xlog_recover_item_type {
 	/* Do whatever work we need to do for pass1, if provided. */
 	int (*commit_pass1_fn)(struct xlog *log,
 			       struct xlog_recover_item *item);
+
+	/*
+	 * This function should do whatever work is needed for pass2 of log
+	 * recovery, if provided.
+	 *
+	 * If the recovered item is an intent item, this function should parse
+	 * the recovered item to construct an in-core log intent item and
+	 * insert it into the AIL.  The in-core log intent item should have 1
+	 * refcount so that the item is freed either (a) when we commit the
+	 * recovered log item for the intent-done item; (b) replay the work and
+	 * log a new intent-done item; or (c) recovery fails and we have to
+	 * abort.
+	 *
+	 * If the recovered item is an intent-done item, this function should
+	 * parse the recovered item to find the id of the corresponding intent
+	 * log item.  Next, it should find the in-core log intent item in the
+	 * AIL and release it.
+	 */
+	int (*commit_pass2_fn)(struct xlog *log, struct list_head *buffer_list,
+			       struct xlog_recover_item *item, xfs_lsn_t lsn);
 };
 
 extern const struct xlog_recover_item_type xlog_icreate_item_type;
@@ -100,5 +120,8 @@ struct xlog_recover {
 void xlog_buf_readahead(struct xlog *log, xfs_daddr_t blkno, uint len,
 		const struct xfs_buf_ops *ops);
 bool xlog_add_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
+bool xlog_is_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
+bool xlog_put_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
+void xlog_recover_iodone(struct xfs_buf *bp);
 
 #endif	/* __XFS_LOG_RECOVER_H__ */
diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index deda3ad32d95..d324f810819d 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -18,6 +18,10 @@
 #include "xfs_log.h"
 #include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
+#include "xfs_error.h"
+#include "xfs_inode.h"
+#include "xfs_dir2.h"
+#include "xfs_quota.h"
 
 STATIC enum xlog_recover_reorder
 xlog_buf_reorder_fn(
@@ -68,8 +72,794 @@ xlog_recover_buffer_commit_pass1(
 	return 0;
 }
 
+/*
+ * Validate the recovered buffer is of the correct type and attach the
+ * appropriate buffer operations to it for writeback. Magic numbers are in a
+ * few places:
+ *	the first 16 bits of the buffer (inode buffer, dquot buffer),
+ *	the first 32 bits of the buffer (most blocks),
+ *	inside a struct xfs_da_blkinfo at the start of the buffer.
+ */
+static void
+xlog_recover_validate_buf_type(
+	struct xfs_mount		*mp,
+	struct xfs_buf			*bp,
+	struct xfs_buf_log_format	*buf_f,
+	xfs_lsn_t			current_lsn)
+{
+	struct xfs_da_blkinfo		*info = bp->b_addr;
+	uint32_t			magic32;
+	uint16_t			magic16;
+	uint16_t			magicda;
+	char				*warnmsg = NULL;
+
+	/*
+	 * We can only do post recovery validation on items on CRC enabled
+	 * filesystems as we need to know when the buffer was written to be able
+	 * to determine if we should have replayed the item. If we replay old
+	 * metadata over a newer buffer, then it will enter a temporarily
+	 * inconsistent state resulting in verification failures. Hence for now
+	 * just avoid the verification stage for non-crc filesystems.
+	 */
+	if (!xfs_sb_version_hascrc(&mp->m_sb))
+		return;
+
+	magic32 = be32_to_cpu(*(__be32 *)bp->b_addr);
+	magic16 = be16_to_cpu(*(__be16*)bp->b_addr);
+	magicda = be16_to_cpu(info->magic);
+	switch (xfs_blft_from_flags(buf_f)) {
+	case XFS_BLFT_BTREE_BUF:
+		switch (magic32) {
+		case XFS_ABTB_CRC_MAGIC:
+		case XFS_ABTB_MAGIC:
+			bp->b_ops = &xfs_bnobt_buf_ops;
+			break;
+		case XFS_ABTC_CRC_MAGIC:
+		case XFS_ABTC_MAGIC:
+			bp->b_ops = &xfs_cntbt_buf_ops;
+			break;
+		case XFS_IBT_CRC_MAGIC:
+		case XFS_IBT_MAGIC:
+			bp->b_ops = &xfs_inobt_buf_ops;
+			break;
+		case XFS_FIBT_CRC_MAGIC:
+		case XFS_FIBT_MAGIC:
+			bp->b_ops = &xfs_finobt_buf_ops;
+			break;
+		case XFS_BMAP_CRC_MAGIC:
+		case XFS_BMAP_MAGIC:
+			bp->b_ops = &xfs_bmbt_buf_ops;
+			break;
+		case XFS_RMAP_CRC_MAGIC:
+			bp->b_ops = &xfs_rmapbt_buf_ops;
+			break;
+		case XFS_REFC_CRC_MAGIC:
+			bp->b_ops = &xfs_refcountbt_buf_ops;
+			break;
+		default:
+			warnmsg = "Bad btree block magic!";
+			break;
+		}
+		break;
+	case XFS_BLFT_AGF_BUF:
+		if (magic32 != XFS_AGF_MAGIC) {
+			warnmsg = "Bad AGF block magic!";
+			break;
+		}
+		bp->b_ops = &xfs_agf_buf_ops;
+		break;
+	case XFS_BLFT_AGFL_BUF:
+		if (magic32 != XFS_AGFL_MAGIC) {
+			warnmsg = "Bad AGFL block magic!";
+			break;
+		}
+		bp->b_ops = &xfs_agfl_buf_ops;
+		break;
+	case XFS_BLFT_AGI_BUF:
+		if (magic32 != XFS_AGI_MAGIC) {
+			warnmsg = "Bad AGI block magic!";
+			break;
+		}
+		bp->b_ops = &xfs_agi_buf_ops;
+		break;
+	case XFS_BLFT_UDQUOT_BUF:
+	case XFS_BLFT_PDQUOT_BUF:
+	case XFS_BLFT_GDQUOT_BUF:
+#ifdef CONFIG_XFS_QUOTA
+		if (magic16 != XFS_DQUOT_MAGIC) {
+			warnmsg = "Bad DQUOT block magic!";
+			break;
+		}
+		bp->b_ops = &xfs_dquot_buf_ops;
+#else
+		xfs_alert(mp,
+	"Trying to recover dquots without QUOTA support built in!");
+		ASSERT(0);
+#endif
+		break;
+	case XFS_BLFT_DINO_BUF:
+		if (magic16 != XFS_DINODE_MAGIC) {
+			warnmsg = "Bad INODE block magic!";
+			break;
+		}
+		bp->b_ops = &xfs_inode_buf_ops;
+		break;
+	case XFS_BLFT_SYMLINK_BUF:
+		if (magic32 != XFS_SYMLINK_MAGIC) {
+			warnmsg = "Bad symlink block magic!";
+			break;
+		}
+		bp->b_ops = &xfs_symlink_buf_ops;
+		break;
+	case XFS_BLFT_DIR_BLOCK_BUF:
+		if (magic32 != XFS_DIR2_BLOCK_MAGIC &&
+		    magic32 != XFS_DIR3_BLOCK_MAGIC) {
+			warnmsg = "Bad dir block magic!";
+			break;
+		}
+		bp->b_ops = &xfs_dir3_block_buf_ops;
+		break;
+	case XFS_BLFT_DIR_DATA_BUF:
+		if (magic32 != XFS_DIR2_DATA_MAGIC &&
+		    magic32 != XFS_DIR3_DATA_MAGIC) {
+			warnmsg = "Bad dir data magic!";
+			break;
+		}
+		bp->b_ops = &xfs_dir3_data_buf_ops;
+		break;
+	case XFS_BLFT_DIR_FREE_BUF:
+		if (magic32 != XFS_DIR2_FREE_MAGIC &&
+		    magic32 != XFS_DIR3_FREE_MAGIC) {
+			warnmsg = "Bad dir3 free magic!";
+			break;
+		}
+		bp->b_ops = &xfs_dir3_free_buf_ops;
+		break;
+	case XFS_BLFT_DIR_LEAF1_BUF:
+		if (magicda != XFS_DIR2_LEAF1_MAGIC &&
+		    magicda != XFS_DIR3_LEAF1_MAGIC) {
+			warnmsg = "Bad dir leaf1 magic!";
+			break;
+		}
+		bp->b_ops = &xfs_dir3_leaf1_buf_ops;
+		break;
+	case XFS_BLFT_DIR_LEAFN_BUF:
+		if (magicda != XFS_DIR2_LEAFN_MAGIC &&
+		    magicda != XFS_DIR3_LEAFN_MAGIC) {
+			warnmsg = "Bad dir leafn magic!";
+			break;
+		}
+		bp->b_ops = &xfs_dir3_leafn_buf_ops;
+		break;
+	case XFS_BLFT_DA_NODE_BUF:
+		if (magicda != XFS_DA_NODE_MAGIC &&
+		    magicda != XFS_DA3_NODE_MAGIC) {
+			warnmsg = "Bad da node magic!";
+			break;
+		}
+		bp->b_ops = &xfs_da3_node_buf_ops;
+		break;
+	case XFS_BLFT_ATTR_LEAF_BUF:
+		if (magicda != XFS_ATTR_LEAF_MAGIC &&
+		    magicda != XFS_ATTR3_LEAF_MAGIC) {
+			warnmsg = "Bad attr leaf magic!";
+			break;
+		}
+		bp->b_ops = &xfs_attr3_leaf_buf_ops;
+		break;
+	case XFS_BLFT_ATTR_RMT_BUF:
+		if (magic32 != XFS_ATTR3_RMT_MAGIC) {
+			warnmsg = "Bad attr remote magic!";
+			break;
+		}
+		bp->b_ops = &xfs_attr3_rmt_buf_ops;
+		break;
+	case XFS_BLFT_SB_BUF:
+		if (magic32 != XFS_SB_MAGIC) {
+			warnmsg = "Bad SB block magic!";
+			break;
+		}
+		bp->b_ops = &xfs_sb_buf_ops;
+		break;
+#ifdef CONFIG_XFS_RT
+	case XFS_BLFT_RTBITMAP_BUF:
+	case XFS_BLFT_RTSUMMARY_BUF:
+		/* no magic numbers for verification of RT buffers */
+		bp->b_ops = &xfs_rtbuf_ops;
+		break;
+#endif /* CONFIG_XFS_RT */
+	default:
+		xfs_warn(mp, "Unknown buffer type %d!",
+			 xfs_blft_from_flags(buf_f));
+		break;
+	}
+
+	/*
+	 * Nothing else to do in the case of a NULL current LSN as this means
+	 * the buffer is more recent than the change in the log and will be
+	 * skipped.
+	 */
+	if (current_lsn == NULLCOMMITLSN)
+		return;
+
+	if (warnmsg) {
+		xfs_warn(mp, warnmsg);
+		ASSERT(0);
+	}
+
+	/*
+	 * We must update the metadata LSN of the buffer as it is written out to
+	 * ensure that older transactions never replay over this one and corrupt
+	 * the buffer. This can occur if log recovery is interrupted at some
+	 * point after the current transaction completes, at which point a
+	 * subsequent mount starts recovery from the beginning.
+	 *
+	 * Write verifiers update the metadata LSN from log items attached to
+	 * the buffer. Therefore, initialize a bli purely to carry the LSN to
+	 * the verifier. We'll clean it up in our ->iodone() callback.
+	 */
+	if (bp->b_ops) {
+		struct xfs_buf_log_item	*bip;
+
+		ASSERT(!bp->b_iodone || bp->b_iodone == xlog_recover_iodone);
+		bp->b_iodone = xlog_recover_iodone;
+		xfs_buf_item_init(bp, mp);
+		bip = bp->b_log_item;
+		bip->bli_item.li_lsn = current_lsn;
+	}
+}
+
+/*
+ * Perform a 'normal' buffer recovery.  Each logged region of the
+ * buffer should be copied over the corresponding region in the
+ * given buffer.  The bitmap in the buf log format structure indicates
+ * where to place the logged data.
+ */
+STATIC void
+xlog_recover_do_reg_buffer(
+	struct xfs_mount		*mp,
+	struct xlog_recover_item	*item,
+	struct xfs_buf			*bp,
+	struct xfs_buf_log_format	*buf_f,
+	xfs_lsn_t			current_lsn)
+{
+	int			i;
+	int			bit;
+	int			nbits;
+	xfs_failaddr_t		fa;
+	const size_t		size_disk_dquot = sizeof(struct xfs_disk_dquot);
+
+	trace_xfs_log_recover_buf_reg_buf(mp->m_log, buf_f);
+
+	bit = 0;
+	i = 1;  /* 0 is the buf format structure */
+	while (1) {
+		bit = xfs_next_bit(buf_f->blf_data_map,
+				   buf_f->blf_map_size, bit);
+		if (bit == -1)
+			break;
+		nbits = xfs_contig_bits(buf_f->blf_data_map,
+					buf_f->blf_map_size, bit);
+		ASSERT(nbits > 0);
+		ASSERT(item->ri_buf[i].i_addr != NULL);
+		ASSERT(item->ri_buf[i].i_len % XFS_BLF_CHUNK == 0);
+		ASSERT(BBTOB(bp->b_length) >=
+		       ((uint)bit << XFS_BLF_SHIFT) + (nbits << XFS_BLF_SHIFT));
+
+		/*
+		 * The dirty regions logged in the buffer, even though
+		 * contiguous, may span multiple chunks. This is because the
+		 * dirty region may span a physical page boundary in a buffer
+		 * and hence be split into two separate vectors for writing into
+		 * the log. Hence we need to trim nbits back to the length of
+		 * the current region being copied out of the log.
+		 */
+		if (item->ri_buf[i].i_len < (nbits << XFS_BLF_SHIFT))
+			nbits = item->ri_buf[i].i_len >> XFS_BLF_SHIFT;
+
+		/*
+		 * Do a sanity check if this is a dquot buffer. Just checking
+		 * the first dquot in the buffer should do. XXXThis is
+		 * probably a good thing to do for other buf types also.
+		 */
+		fa = NULL;
+		if (buf_f->blf_flags &
+		   (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
+			if (item->ri_buf[i].i_addr == NULL) {
+				xfs_alert(mp,
+					"XFS: NULL dquot in %s.", __func__);
+				goto next;
+			}
+			if (item->ri_buf[i].i_len < size_disk_dquot) {
+				xfs_alert(mp,
+					"XFS: dquot too small (%d) in %s.",
+					item->ri_buf[i].i_len, __func__);
+				goto next;
+			}
+			fa = xfs_dquot_verify(mp, item->ri_buf[i].i_addr,
+					       -1, 0);
+			if (fa) {
+				xfs_alert(mp,
+	"dquot corrupt at %pS trying to replay into block 0x%llx",
+					fa, bp->b_bn);
+				goto next;
+			}
+		}
+
+		memcpy(xfs_buf_offset(bp,
+			(uint)bit << XFS_BLF_SHIFT),	/* dest */
+			item->ri_buf[i].i_addr,		/* source */
+			nbits<<XFS_BLF_SHIFT);		/* length */
+ next:
+		i++;
+		bit += nbits;
+	}
+
+	/* Shouldn't be any more regions */
+	ASSERT(i == item->ri_total);
+
+	xlog_recover_validate_buf_type(mp, bp, buf_f, current_lsn);
+}
+
+/*
+ * Perform a dquot buffer recovery.
+ * Simple algorithm: if we have found a QUOTAOFF log item of the same type
+ * (ie. USR or GRP), then just toss this buffer away; don't recover it.
+ * Else, treat it as a regular buffer and do recovery.
+ *
+ * Return false if the buffer was tossed and true if we recovered the buffer to
+ * indicate to the caller if the buffer needs writing.
+ */
+STATIC bool
+xlog_recover_do_dquot_buffer(
+	struct xfs_mount		*mp,
+	struct xlog			*log,
+	struct xlog_recover_item	*item,
+	struct xfs_buf			*bp,
+	struct xfs_buf_log_format	*buf_f)
+{
+	uint			type;
+
+	trace_xfs_log_recover_buf_dquot_buf(log, buf_f);
+
+	/*
+	 * Filesystems are required to send in quota flags at mount time.
+	 */
+	if (!mp->m_qflags)
+		return false;
+
+	type = 0;
+	if (buf_f->blf_flags & XFS_BLF_UDQUOT_BUF)
+		type |= XFS_DQ_USER;
+	if (buf_f->blf_flags & XFS_BLF_PDQUOT_BUF)
+		type |= XFS_DQ_PROJ;
+	if (buf_f->blf_flags & XFS_BLF_GDQUOT_BUF)
+		type |= XFS_DQ_GROUP;
+	/*
+	 * This type of quotas was turned off, so ignore this buffer
+	 */
+	if (log->l_quotaoffs_flag & type)
+		return false;
+
+	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, NULLCOMMITLSN);
+	return true;
+}
+
+/*
+ * Perform recovery for a buffer full of inodes.  In these buffers, the only
+ * data which should be recovered is that which corresponds to the
+ * di_next_unlinked pointers in the on disk inode structures.  The rest of the
+ * data for the inodes is always logged through the inodes themselves rather
+ * than the inode buffer and is recovered in xlog_recover_inode_pass2().
+ *
+ * The only time when buffers full of inodes are fully recovered is when the
+ * buffer is full of newly allocated inodes.  In this case the buffer will
+ * not be marked as an inode buffer and so will be sent to
+ * xlog_recover_do_reg_buffer() below during recovery.
+ */
+STATIC int
+xlog_recover_do_inode_buffer(
+	struct xfs_mount		*mp,
+	struct xlog_recover_item	*item,
+	struct xfs_buf			*bp,
+	struct xfs_buf_log_format	*buf_f)
+{
+	int				i;
+	int				item_index = 0;
+	int				bit = 0;
+	int				nbits = 0;
+	int				reg_buf_offset = 0;
+	int				reg_buf_bytes = 0;
+	int				next_unlinked_offset;
+	int				inodes_per_buf;
+	xfs_agino_t			*logged_nextp;
+	xfs_agino_t			*buffer_nextp;
+
+	trace_xfs_log_recover_buf_inode_buf(mp->m_log, buf_f);
+
+	/*
+	 * Post recovery validation only works properly on CRC enabled
+	 * filesystems.
+	 */
+	if (xfs_sb_version_hascrc(&mp->m_sb))
+		bp->b_ops = &xfs_inode_buf_ops;
+
+	inodes_per_buf = BBTOB(bp->b_length) >> mp->m_sb.sb_inodelog;
+	for (i = 0; i < inodes_per_buf; i++) {
+		next_unlinked_offset = (i * mp->m_sb.sb_inodesize) +
+			offsetof(xfs_dinode_t, di_next_unlinked);
+
+		while (next_unlinked_offset >=
+		       (reg_buf_offset + reg_buf_bytes)) {
+			/*
+			 * The next di_next_unlinked field is beyond
+			 * the current logged region.  Find the next
+			 * logged region that contains or is beyond
+			 * the current di_next_unlinked field.
+			 */
+			bit += nbits;
+			bit = xfs_next_bit(buf_f->blf_data_map,
+					   buf_f->blf_map_size, bit);
+
+			/*
+			 * If there are no more logged regions in the
+			 * buffer, then we're done.
+			 */
+			if (bit == -1)
+				return 0;
+
+			nbits = xfs_contig_bits(buf_f->blf_data_map,
+						buf_f->blf_map_size, bit);
+			ASSERT(nbits > 0);
+			reg_buf_offset = bit << XFS_BLF_SHIFT;
+			reg_buf_bytes = nbits << XFS_BLF_SHIFT;
+			item_index++;
+		}
+
+		/*
+		 * If the current logged region starts after the current
+		 * di_next_unlinked field, then move on to the next
+		 * di_next_unlinked field.
+		 */
+		if (next_unlinked_offset < reg_buf_offset)
+			continue;
+
+		ASSERT(item->ri_buf[item_index].i_addr != NULL);
+		ASSERT((item->ri_buf[item_index].i_len % XFS_BLF_CHUNK) == 0);
+		ASSERT((reg_buf_offset + reg_buf_bytes) <= BBTOB(bp->b_length));
+
+		/*
+		 * The current logged region contains a copy of the
+		 * current di_next_unlinked field.  Extract its value
+		 * and copy it to the buffer copy.
+		 */
+		logged_nextp = item->ri_buf[item_index].i_addr +
+				next_unlinked_offset - reg_buf_offset;
+		if (XFS_IS_CORRUPT(mp, *logged_nextp == 0)) {
+			xfs_alert(mp,
+		"Bad inode buffer log record (ptr = "PTR_FMT", bp = "PTR_FMT"). "
+		"Trying to replay bad (0) inode di_next_unlinked field.",
+				item, bp);
+			return -EFSCORRUPTED;
+		}
+
+		buffer_nextp = xfs_buf_offset(bp, next_unlinked_offset);
+		*buffer_nextp = *logged_nextp;
+
+		/*
+		 * If necessary, recalculate the CRC in the on-disk inode. We
+		 * have to leave the inode in a consistent state for whoever
+		 * reads it next....
+		 */
+		xfs_dinode_calc_crc(mp,
+				xfs_buf_offset(bp, i * mp->m_sb.sb_inodesize));
+
+	}
+
+	return 0;
+}
+
+/*
+ * V5 filesystems know the age of the buffer on disk being recovered. We can
+ * have newer objects on disk than we are replaying, and so for these cases we
+ * don't want to replay the current change as that will make the buffer contents
+ * temporarily invalid on disk.
+ *
+ * The magic number might not match the buffer type we are going to recover
+ * (e.g. reallocated blocks), so we ignore the xfs_buf_log_format flags.  Hence
+ * extract the LSN of the existing object in the buffer based on its current
+ * magic number.  If we don't recognise the magic number in the buffer, then
+ * return a LSN of -1 so that the caller knows it was an unrecognised block and
+ * so can recover the buffer.
+ *
+ * Note: we cannot rely solely on magic number matches to determine that the
+ * buffer has a valid LSN - we also need to verify that it belongs to this
+ * filesystem, so we need to extract the object's LSN and compare it to that
+ * which we read from the superblock. If the UUIDs don't match, then we've got a
+ * stale metadata block from an old filesystem instance that we need to recover
+ * over the top of.
+ */
+static xfs_lsn_t
+xlog_recover_get_buf_lsn(
+	struct xfs_mount	*mp,
+	struct xfs_buf		*bp)
+{
+	uint32_t		magic32;
+	uint16_t		magic16;
+	uint16_t		magicda;
+	void			*blk = bp->b_addr;
+	uuid_t			*uuid;
+	xfs_lsn_t		lsn = -1;
+
+	/* v4 filesystems always recover immediately */
+	if (!xfs_sb_version_hascrc(&mp->m_sb))
+		goto recover_immediately;
+
+	magic32 = be32_to_cpu(*(__be32 *)blk);
+	switch (magic32) {
+	case XFS_ABTB_CRC_MAGIC:
+	case XFS_ABTC_CRC_MAGIC:
+	case XFS_ABTB_MAGIC:
+	case XFS_ABTC_MAGIC:
+	case XFS_RMAP_CRC_MAGIC:
+	case XFS_REFC_CRC_MAGIC:
+	case XFS_IBT_CRC_MAGIC:
+	case XFS_IBT_MAGIC: {
+		struct xfs_btree_block *btb = blk;
+
+		lsn = be64_to_cpu(btb->bb_u.s.bb_lsn);
+		uuid = &btb->bb_u.s.bb_uuid;
+		break;
+	}
+	case XFS_BMAP_CRC_MAGIC:
+	case XFS_BMAP_MAGIC: {
+		struct xfs_btree_block *btb = blk;
+
+		lsn = be64_to_cpu(btb->bb_u.l.bb_lsn);
+		uuid = &btb->bb_u.l.bb_uuid;
+		break;
+	}
+	case XFS_AGF_MAGIC:
+		lsn = be64_to_cpu(((struct xfs_agf *)blk)->agf_lsn);
+		uuid = &((struct xfs_agf *)blk)->agf_uuid;
+		break;
+	case XFS_AGFL_MAGIC:
+		lsn = be64_to_cpu(((struct xfs_agfl *)blk)->agfl_lsn);
+		uuid = &((struct xfs_agfl *)blk)->agfl_uuid;
+		break;
+	case XFS_AGI_MAGIC:
+		lsn = be64_to_cpu(((struct xfs_agi *)blk)->agi_lsn);
+		uuid = &((struct xfs_agi *)blk)->agi_uuid;
+		break;
+	case XFS_SYMLINK_MAGIC:
+		lsn = be64_to_cpu(((struct xfs_dsymlink_hdr *)blk)->sl_lsn);
+		uuid = &((struct xfs_dsymlink_hdr *)blk)->sl_uuid;
+		break;
+	case XFS_DIR3_BLOCK_MAGIC:
+	case XFS_DIR3_DATA_MAGIC:
+	case XFS_DIR3_FREE_MAGIC:
+		lsn = be64_to_cpu(((struct xfs_dir3_blk_hdr *)blk)->lsn);
+		uuid = &((struct xfs_dir3_blk_hdr *)blk)->uuid;
+		break;
+	case XFS_ATTR3_RMT_MAGIC:
+		/*
+		 * Remote attr blocks are written synchronously, rather than
+		 * being logged. That means they do not contain a valid LSN
+		 * (i.e. transactionally ordered) in them, and hence any time we
+		 * see a buffer to replay over the top of a remote attribute
+		 * block we should simply do so.
+		 */
+		goto recover_immediately;
+	case XFS_SB_MAGIC:
+		/*
+		 * superblock uuids are magic. We may or may not have a
+		 * sb_meta_uuid on disk, but it will be set in the in-core
+		 * superblock. We set the uuid pointer for verification
+		 * according to the superblock feature mask to ensure we check
+		 * the relevant UUID in the superblock.
+		 */
+		lsn = be64_to_cpu(((struct xfs_dsb *)blk)->sb_lsn);
+		if (xfs_sb_version_hasmetauuid(&mp->m_sb))
+			uuid = &((struct xfs_dsb *)blk)->sb_meta_uuid;
+		else
+			uuid = &((struct xfs_dsb *)blk)->sb_uuid;
+		break;
+	default:
+		break;
+	}
+
+	if (lsn != (xfs_lsn_t)-1) {
+		if (!uuid_equal(&mp->m_sb.sb_meta_uuid, uuid))
+			goto recover_immediately;
+		return lsn;
+	}
+
+	magicda = be16_to_cpu(((struct xfs_da_blkinfo *)blk)->magic);
+	switch (magicda) {
+	case XFS_DIR3_LEAF1_MAGIC:
+	case XFS_DIR3_LEAFN_MAGIC:
+	case XFS_DA3_NODE_MAGIC:
+		lsn = be64_to_cpu(((struct xfs_da3_blkinfo *)blk)->lsn);
+		uuid = &((struct xfs_da3_blkinfo *)blk)->uuid;
+		break;
+	default:
+		break;
+	}
+
+	if (lsn != (xfs_lsn_t)-1) {
+		if (!uuid_equal(&mp->m_sb.sb_uuid, uuid))
+			goto recover_immediately;
+		return lsn;
+	}
+
+	/*
+	 * We do individual object checks on dquot and inode buffers as they
+	 * have their own individual LSN records. Also, we could have a stale
+	 * buffer here, so we have to at least recognise these buffer types.
+	 *
+	 * A noted complexity here is inode unlinked list processing - it logs
+	 * the inode directly in the buffer, but we don't know which inodes have
+	 * been modified, and there is no global buffer LSN. Hence we need to
+	 * recover all inode buffer types immediately. This problem will be
+	 * fixed by logical logging of the unlinked list modifications.
+	 */
+	magic16 = be16_to_cpu(*(__be16 *)blk);
+	switch (magic16) {
+	case XFS_DQUOT_MAGIC:
+	case XFS_DINODE_MAGIC:
+		goto recover_immediately;
+	default:
+		break;
+	}
+
+	/* unknown buffer contents, recover immediately */
+
+recover_immediately:
+	return (xfs_lsn_t)-1;
+
+}
+
+/*
+ * This routine replays a modification made to a buffer at runtime.
+ * There are actually two types of buffer, regular and inode, which
+ * are handled differently.  Inode buffers are handled differently
+ * in that we only recover a specific set of data from them, namely
+ * the inode di_next_unlinked fields.  This is because all other inode
+ * data is actually logged via inode records and any data we replay
+ * here which overlaps that may be stale.
+ *
+ * When meta-data buffers are freed at run time we log a buffer item
+ * with the XFS_BLF_CANCEL bit set to indicate that previous copies
+ * of the buffer in the log should not be replayed at recovery time.
+ * This is so that if the blocks covered by the buffer are reused for
+ * file data before we crash we don't end up replaying old, freed
+ * meta-data into a user's file.
+ *
+ * To handle the cancellation of buffer log items, we make two passes
+ * over the log during recovery.  During the first we build a table of
+ * those buffers which have been cancelled, and during the second we
+ * only replay those buffers which do not have corresponding cancel
+ * records in the table.  See xlog_recover_buffer_pass[1,2] above
+ * for more details on the implementation of the table of cancel records.
+ */
+STATIC int
+xlog_recover_buffer_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			current_lsn)
+{
+	struct xfs_buf_log_format	*buf_f = item->ri_buf[0].i_addr;
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_buf			*bp;
+	int				error;
+	uint				buf_flags;
+	xfs_lsn_t			lsn;
+
+	/*
+	 * In this pass we only want to recover all the buffers which have
+	 * not been cancelled and are not cancellation buffers themselves.
+	 */
+	if (buf_f->blf_flags & XFS_BLF_CANCEL) {
+		if (xlog_put_buffer_cancelled(log, buf_f->blf_blkno,
+				buf_f->blf_len))
+			goto cancelled;
+	} else {
+
+		if (xlog_is_buffer_cancelled(log, buf_f->blf_blkno,
+				buf_f->blf_len))
+			goto cancelled;
+	}
+
+	trace_xfs_log_recover_buf_recover(log, buf_f);
+
+	buf_flags = 0;
+	if (buf_f->blf_flags & XFS_BLF_INODE_BUF)
+		buf_flags |= XBF_UNMAPPED;
+
+	error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno, buf_f->blf_len,
+			  buf_flags, &bp, NULL);
+	if (error)
+		return error;
+
+	/*
+	 * Recover the buffer only if we get an LSN from it and it's less than
+	 * the lsn of the transaction we are replaying.
+	 *
+	 * Note that we have to be extremely careful of readahead here.
+	 * Readahead does not attach verifiers to the buffers, so if we don't
+	 * actually do any replay after readahead because the LSN we found
+	 * in the buffer is more recent than the current transaction, then we
+	 * need to attach the verifier directly. Otherwise, future recovery
+	 * actions (e.g. EFI and unlinked list recovery) can operate on the
+	 * buffers without a verifier ever being attached. This can lead to
+	 * blocks on disk having the correct content but a stale
+	 * CRC.
+	 *
+	 * It is safe to assume these clean buffers are currently up to date.
+	 * If the buffer is dirtied by a later transaction being replayed, then
+	 * the verifier will be reset to match whatever recovery turns that
+	 * buffer into.
+	 */
+	lsn = xlog_recover_get_buf_lsn(mp, bp);
+	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
+		trace_xfs_log_recover_buf_skip(log, buf_f);
+		xlog_recover_validate_buf_type(mp, bp, buf_f, NULLCOMMITLSN);
+		goto out_release;
+	}
+
+	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
+		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
+		if (error)
+			goto out_release;
+	} else if (buf_f->blf_flags &
+		  (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
+		bool	dirty;
+
+		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
+		if (!dirty)
+			goto out_release;
+	} else {
+		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
+	}
+
+	/*
+	 * Perform delayed write on the buffer.  Asynchronous writes will be
+	 * slower when taking into account all the buffers to be flushed.
+	 *
+	 * Also make sure that only inode buffers with good sizes stay in
+	 * the buffer cache.  The kernel moves inodes in buffers of 1 block
+	 * or inode_cluster_size bytes, whichever is bigger.  The inode
+	 * buffers in the log can be a different size if the log was generated
+	 * by an older kernel using unclustered inode buffers or a newer kernel
+	 * running with a different inode cluster size.  Regardless, if the
+	 * inode buffer size isn't max(blocksize, inode_cluster_size)
+	 * for *our* value of inode_cluster_size, then we need to keep
+	 * the buffer out of the buffer cache so that the buffer won't
+	 * overlap with future reads of those inodes.
+	 */
+	if (XFS_DINODE_MAGIC ==
+	    be16_to_cpu(*((__be16 *)xfs_buf_offset(bp, 0))) &&
+	    (BBTOB(bp->b_length) != M_IGEO(log->l_mp)->inode_cluster_size)) {
+		xfs_buf_stale(bp);
+		error = xfs_bwrite(bp);
+	} else {
+		ASSERT(bp->b_mount == mp);
+		bp->b_iodone = xlog_recover_iodone;
+		xfs_buf_delwri_queue(bp, buffer_list);
+	}
+
+out_release:
+	xfs_buf_relse(bp);
+	return error;
+cancelled:
+	trace_xfs_log_recover_buf_cancel(log, buf_f);
+	return 0;
+}
+
 const struct xlog_recover_item_type xlog_buf_item_type = {
 	.reorder_fn		= xlog_buf_reorder_fn,
 	.ra_pass2_fn		= xlog_recover_buffer_ra_pass2,
 	.commit_pass1_fn	= xlog_recover_buffer_commit_pass1,
+	.commit_pass2_fn	= xlog_recover_buffer_commit_pass2,
 };
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index fbd1f7d6f1c9..0a241f1c371a 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -284,7 +284,7 @@ xlog_header_check_mount(
 	return 0;
 }
 
-STATIC void
+void
 xlog_recover_iodone(
 	struct xfs_buf	*bp)
 {
@@ -2007,7 +2007,7 @@ xlog_add_buffer_cancelled(
 /*
  * Check if there is and entry for blkno, len in the buffer cancel record table.
  */
-static bool
+bool
 xlog_is_buffer_cancelled(
 	struct xlog		*log,
 	xfs_daddr_t		blkno,
@@ -2024,7 +2024,7 @@ xlog_is_buffer_cancelled(
  * buffer is re-used again after its last cancellation we actually replay the
  * changes made at that point.
  */
-static bool
+bool
 xlog_put_buffer_cancelled(
 	struct xlog		*log,
 	xfs_daddr_t		blkno,
@@ -2056,791 +2056,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-/*
- * Perform recovery for a buffer full of inodes.  In these buffers, the only
- * data which should be recovered is that which corresponds to the
- * di_next_unlinked pointers in the on disk inode structures.  The rest of the
- * data for the inodes is always logged through the inodes themselves rather
- * than the inode buffer and is recovered in xlog_recover_inode_pass2().
- *
- * The only time when buffers full of inodes are fully recovered is when the
- * buffer is full of newly allocated inodes.  In this case the buffer will
- * not be marked as an inode buffer and so will be sent to
- * xlog_recover_do_reg_buffer() below during recovery.
- */
-STATIC int
-xlog_recover_do_inode_buffer(
-	struct xfs_mount	*mp,
-	xlog_recover_item_t	*item,
-	struct xfs_buf		*bp,
-	xfs_buf_log_format_t	*buf_f)
-{
-	int			i;
-	int			item_index = 0;
-	int			bit = 0;
-	int			nbits = 0;
-	int			reg_buf_offset = 0;
-	int			reg_buf_bytes = 0;
-	int			next_unlinked_offset;
-	int			inodes_per_buf;
-	xfs_agino_t		*logged_nextp;
-	xfs_agino_t		*buffer_nextp;
-
-	trace_xfs_log_recover_buf_inode_buf(mp->m_log, buf_f);
-
-	/*
-	 * Post recovery validation only works properly on CRC enabled
-	 * filesystems.
-	 */
-	if (xfs_sb_version_hascrc(&mp->m_sb))
-		bp->b_ops = &xfs_inode_buf_ops;
-
-	inodes_per_buf = BBTOB(bp->b_length) >> mp->m_sb.sb_inodelog;
-	for (i = 0; i < inodes_per_buf; i++) {
-		next_unlinked_offset = (i * mp->m_sb.sb_inodesize) +
-			offsetof(xfs_dinode_t, di_next_unlinked);
-
-		while (next_unlinked_offset >=
-		       (reg_buf_offset + reg_buf_bytes)) {
-			/*
-			 * The next di_next_unlinked field is beyond
-			 * the current logged region.  Find the next
-			 * logged region that contains or is beyond
-			 * the current di_next_unlinked field.
-			 */
-			bit += nbits;
-			bit = xfs_next_bit(buf_f->blf_data_map,
-					   buf_f->blf_map_size, bit);
-
-			/*
-			 * If there are no more logged regions in the
-			 * buffer, then we're done.
-			 */
-			if (bit == -1)
-				return 0;
-
-			nbits = xfs_contig_bits(buf_f->blf_data_map,
-						buf_f->blf_map_size, bit);
-			ASSERT(nbits > 0);
-			reg_buf_offset = bit << XFS_BLF_SHIFT;
-			reg_buf_bytes = nbits << XFS_BLF_SHIFT;
-			item_index++;
-		}
-
-		/*
-		 * If the current logged region starts after the current
-		 * di_next_unlinked field, then move on to the next
-		 * di_next_unlinked field.
-		 */
-		if (next_unlinked_offset < reg_buf_offset)
-			continue;
-
-		ASSERT(item->ri_buf[item_index].i_addr != NULL);
-		ASSERT((item->ri_buf[item_index].i_len % XFS_BLF_CHUNK) == 0);
-		ASSERT((reg_buf_offset + reg_buf_bytes) <= BBTOB(bp->b_length));
-
-		/*
-		 * The current logged region contains a copy of the
-		 * current di_next_unlinked field.  Extract its value
-		 * and copy it to the buffer copy.
-		 */
-		logged_nextp = item->ri_buf[item_index].i_addr +
-				next_unlinked_offset - reg_buf_offset;
-		if (XFS_IS_CORRUPT(mp, *logged_nextp == 0)) {
-			xfs_alert(mp,
-		"Bad inode buffer log record (ptr = "PTR_FMT", bp = "PTR_FMT"). "
-		"Trying to replay bad (0) inode di_next_unlinked field.",
-				item, bp);
-			return -EFSCORRUPTED;
-		}
-
-		buffer_nextp = xfs_buf_offset(bp, next_unlinked_offset);
-		*buffer_nextp = *logged_nextp;
-
-		/*
-		 * If necessary, recalculate the CRC in the on-disk inode. We
-		 * have to leave the inode in a consistent state for whoever
-		 * reads it next....
-		 */
-		xfs_dinode_calc_crc(mp,
-				xfs_buf_offset(bp, i * mp->m_sb.sb_inodesize));
-
-	}
-
-	return 0;
-}
-
-/*
- * V5 filesystems know the age of the buffer on disk being recovered. We can
- * have newer objects on disk than we are replaying, and so for these cases we
- * don't want to replay the current change as that will make the buffer contents
- * temporarily invalid on disk.
- *
- * The magic number might not match the buffer type we are going to recover
- * (e.g. reallocated blocks), so we ignore the xfs_buf_log_format flags.  Hence
- * extract the LSN of the existing object in the buffer based on it's current
- * magic number.  If we don't recognise the magic number in the buffer, then
- * return a LSN of -1 so that the caller knows it was an unrecognised block and
- * so can recover the buffer.
- *
- * Note: we cannot rely solely on magic number matches to determine that the
- * buffer has a valid LSN - we also need to verify that it belongs to this
- * filesystem, so we need to extract the object's LSN and compare it to that
- * which we read from the superblock. If the UUIDs don't match, then we've got a
- * stale metadata block from an old filesystem instance that we need to recover
- * over the top of.
- */
-static xfs_lsn_t
-xlog_recover_get_buf_lsn(
-	struct xfs_mount	*mp,
-	struct xfs_buf		*bp)
-{
-	uint32_t		magic32;
-	uint16_t		magic16;
-	uint16_t		magicda;
-	void			*blk = bp->b_addr;
-	uuid_t			*uuid;
-	xfs_lsn_t		lsn = -1;
-
-	/* v4 filesystems always recover immediately */
-	if (!xfs_sb_version_hascrc(&mp->m_sb))
-		goto recover_immediately;
-
-	magic32 = be32_to_cpu(*(__be32 *)blk);
-	switch (magic32) {
-	case XFS_ABTB_CRC_MAGIC:
-	case XFS_ABTC_CRC_MAGIC:
-	case XFS_ABTB_MAGIC:
-	case XFS_ABTC_MAGIC:
-	case XFS_RMAP_CRC_MAGIC:
-	case XFS_REFC_CRC_MAGIC:
-	case XFS_IBT_CRC_MAGIC:
-	case XFS_IBT_MAGIC: {
-		struct xfs_btree_block *btb = blk;
-
-		lsn = be64_to_cpu(btb->bb_u.s.bb_lsn);
-		uuid = &btb->bb_u.s.bb_uuid;
-		break;
-	}
-	case XFS_BMAP_CRC_MAGIC:
-	case XFS_BMAP_MAGIC: {
-		struct xfs_btree_block *btb = blk;
-
-		lsn = be64_to_cpu(btb->bb_u.l.bb_lsn);
-		uuid = &btb->bb_u.l.bb_uuid;
-		break;
-	}
-	case XFS_AGF_MAGIC:
-		lsn = be64_to_cpu(((struct xfs_agf *)blk)->agf_lsn);
-		uuid = &((struct xfs_agf *)blk)->agf_uuid;
-		break;
-	case XFS_AGFL_MAGIC:
-		lsn = be64_to_cpu(((struct xfs_agfl *)blk)->agfl_lsn);
-		uuid = &((struct xfs_agfl *)blk)->agfl_uuid;
-		break;
-	case XFS_AGI_MAGIC:
-		lsn = be64_to_cpu(((struct xfs_agi *)blk)->agi_lsn);
-		uuid = &((struct xfs_agi *)blk)->agi_uuid;
-		break;
-	case XFS_SYMLINK_MAGIC:
-		lsn = be64_to_cpu(((struct xfs_dsymlink_hdr *)blk)->sl_lsn);
-		uuid = &((struct xfs_dsymlink_hdr *)blk)->sl_uuid;
-		break;
-	case XFS_DIR3_BLOCK_MAGIC:
-	case XFS_DIR3_DATA_MAGIC:
-	case XFS_DIR3_FREE_MAGIC:
-		lsn = be64_to_cpu(((struct xfs_dir3_blk_hdr *)blk)->lsn);
-		uuid = &((struct xfs_dir3_blk_hdr *)blk)->uuid;
-		break;
-	case XFS_ATTR3_RMT_MAGIC:
-		/*
-		 * Remote attr blocks are written synchronously, rather than
-		 * being logged. That means they do not contain a valid LSN
-		 * (i.e. transactionally ordered) in them, and hence any time we
-		 * see a buffer to replay over the top of a remote attribute
-		 * block we should simply do so.
-		 */
-		goto recover_immediately;
-	case XFS_SB_MAGIC:
-		/*
-		 * superblock uuids are magic. We may or may not have a
-		 * sb_meta_uuid on disk, but it will be set in the in-core
-		 * superblock. We set the uuid pointer for verification
-		 * according to the superblock feature mask to ensure we check
-		 * the relevant UUID in the superblock.
-		 */
-		lsn = be64_to_cpu(((struct xfs_dsb *)blk)->sb_lsn);
-		if (xfs_sb_version_hasmetauuid(&mp->m_sb))
-			uuid = &((struct xfs_dsb *)blk)->sb_meta_uuid;
-		else
-			uuid = &((struct xfs_dsb *)blk)->sb_uuid;
-		break;
-	default:
-		break;
-	}
-
-	if (lsn != (xfs_lsn_t)-1) {
-		if (!uuid_equal(&mp->m_sb.sb_meta_uuid, uuid))
-			goto recover_immediately;
-		return lsn;
-	}
-
-	magicda = be16_to_cpu(((struct xfs_da_blkinfo *)blk)->magic);
-	switch (magicda) {
-	case XFS_DIR3_LEAF1_MAGIC:
-	case XFS_DIR3_LEAFN_MAGIC:
-	case XFS_DA3_NODE_MAGIC:
-		lsn = be64_to_cpu(((struct xfs_da3_blkinfo *)blk)->lsn);
-		uuid = &((struct xfs_da3_blkinfo *)blk)->uuid;
-		break;
-	default:
-		break;
-	}
-
-	if (lsn != (xfs_lsn_t)-1) {
-		if (!uuid_equal(&mp->m_sb.sb_uuid, uuid))
-			goto recover_immediately;
-		return lsn;
-	}
-
-	/*
-	 * We do individual object checks on dquot and inode buffers as they
-	 * have their own individual LSN records. Also, we could have a stale
-	 * buffer here, so we have to at least recognise these buffer types.
-	 *
-	 * A notd complexity here is inode unlinked list processing - it logs
-	 * the inode directly in the buffer, but we don't know which inodes have
-	 * been modified, and there is no global buffer LSN. Hence we need to
-	 * recover all inode buffer types immediately. This problem will be
-	 * fixed by logical logging of the unlinked list modifications.
-	 */
-	magic16 = be16_to_cpu(*(__be16 *)blk);
-	switch (magic16) {
-	case XFS_DQUOT_MAGIC:
-	case XFS_DINODE_MAGIC:
-		goto recover_immediately;
-	default:
-		break;
-	}
-
-	/* unknown buffer contents, recover immediately */
-
-recover_immediately:
-	return (xfs_lsn_t)-1;
-
-}
-
-/*
- * Validate the recovered buffer is of the correct type and attach the
- * appropriate buffer operations to them for writeback. Magic numbers are in a
- * few places:
- *	the first 16 bits of the buffer (inode buffer, dquot buffer),
- *	the first 32 bits of the buffer (most blocks),
- *	inside a struct xfs_da_blkinfo at the start of the buffer.
- */
-static void
-xlog_recover_validate_buf_type(
-	struct xfs_mount	*mp,
-	struct xfs_buf		*bp,
-	xfs_buf_log_format_t	*buf_f,
-	xfs_lsn_t		current_lsn)
-{
-	struct xfs_da_blkinfo	*info = bp->b_addr;
-	uint32_t		magic32;
-	uint16_t		magic16;
-	uint16_t		magicda;
-	char			*warnmsg = NULL;
-
-	/*
-	 * We can only do post recovery validation on items on CRC enabled
-	 * fielsystems as we need to know when the buffer was written to be able
-	 * to determine if we should have replayed the item. If we replay old
-	 * metadata over a newer buffer, then it will enter a temporarily
-	 * inconsistent state resulting in verification failures. Hence for now
-	 * just avoid the verification stage for non-crc filesystems
-	 */
-	if (!xfs_sb_version_hascrc(&mp->m_sb))
-		return;
-
-	magic32 = be32_to_cpu(*(__be32 *)bp->b_addr);
-	magic16 = be16_to_cpu(*(__be16*)bp->b_addr);
-	magicda = be16_to_cpu(info->magic);
-	switch (xfs_blft_from_flags(buf_f)) {
-	case XFS_BLFT_BTREE_BUF:
-		switch (magic32) {
-		case XFS_ABTB_CRC_MAGIC:
-		case XFS_ABTB_MAGIC:
-			bp->b_ops = &xfs_bnobt_buf_ops;
-			break;
-		case XFS_ABTC_CRC_MAGIC:
-		case XFS_ABTC_MAGIC:
-			bp->b_ops = &xfs_cntbt_buf_ops;
-			break;
-		case XFS_IBT_CRC_MAGIC:
-		case XFS_IBT_MAGIC:
-			bp->b_ops = &xfs_inobt_buf_ops;
-			break;
-		case XFS_FIBT_CRC_MAGIC:
-		case XFS_FIBT_MAGIC:
-			bp->b_ops = &xfs_finobt_buf_ops;
-			break;
-		case XFS_BMAP_CRC_MAGIC:
-		case XFS_BMAP_MAGIC:
-			bp->b_ops = &xfs_bmbt_buf_ops;
-			break;
-		case XFS_RMAP_CRC_MAGIC:
-			bp->b_ops = &xfs_rmapbt_buf_ops;
-			break;
-		case XFS_REFC_CRC_MAGIC:
-			bp->b_ops = &xfs_refcountbt_buf_ops;
-			break;
-		default:
-			warnmsg = "Bad btree block magic!";
-			break;
-		}
-		break;
-	case XFS_BLFT_AGF_BUF:
-		if (magic32 != XFS_AGF_MAGIC) {
-			warnmsg = "Bad AGF block magic!";
-			break;
-		}
-		bp->b_ops = &xfs_agf_buf_ops;
-		break;
-	case XFS_BLFT_AGFL_BUF:
-		if (magic32 != XFS_AGFL_MAGIC) {
-			warnmsg = "Bad AGFL block magic!";
-			break;
-		}
-		bp->b_ops = &xfs_agfl_buf_ops;
-		break;
-	case XFS_BLFT_AGI_BUF:
-		if (magic32 != XFS_AGI_MAGIC) {
-			warnmsg = "Bad AGI block magic!";
-			break;
-		}
-		bp->b_ops = &xfs_agi_buf_ops;
-		break;
-	case XFS_BLFT_UDQUOT_BUF:
-	case XFS_BLFT_PDQUOT_BUF:
-	case XFS_BLFT_GDQUOT_BUF:
-#ifdef CONFIG_XFS_QUOTA
-		if (magic16 != XFS_DQUOT_MAGIC) {
-			warnmsg = "Bad DQUOT block magic!";
-			break;
-		}
-		bp->b_ops = &xfs_dquot_buf_ops;
-#else
-		xfs_alert(mp,
-	"Trying to recover dquots without QUOTA support built in!");
-		ASSERT(0);
-#endif
-		break;
-	case XFS_BLFT_DINO_BUF:
-		if (magic16 != XFS_DINODE_MAGIC) {
-			warnmsg = "Bad INODE block magic!";
-			break;
-		}
-		bp->b_ops = &xfs_inode_buf_ops;
-		break;
-	case XFS_BLFT_SYMLINK_BUF:
-		if (magic32 != XFS_SYMLINK_MAGIC) {
-			warnmsg = "Bad symlink block magic!";
-			break;
-		}
-		bp->b_ops = &xfs_symlink_buf_ops;
-		break;
-	case XFS_BLFT_DIR_BLOCK_BUF:
-		if (magic32 != XFS_DIR2_BLOCK_MAGIC &&
-		    magic32 != XFS_DIR3_BLOCK_MAGIC) {
-			warnmsg = "Bad dir block magic!";
-			break;
-		}
-		bp->b_ops = &xfs_dir3_block_buf_ops;
-		break;
-	case XFS_BLFT_DIR_DATA_BUF:
-		if (magic32 != XFS_DIR2_DATA_MAGIC &&
-		    magic32 != XFS_DIR3_DATA_MAGIC) {
-			warnmsg = "Bad dir data magic!";
-			break;
-		}
-		bp->b_ops = &xfs_dir3_data_buf_ops;
-		break;
-	case XFS_BLFT_DIR_FREE_BUF:
-		if (magic32 != XFS_DIR2_FREE_MAGIC &&
-		    magic32 != XFS_DIR3_FREE_MAGIC) {
-			warnmsg = "Bad dir3 free magic!";
-			break;
-		}
-		bp->b_ops = &xfs_dir3_free_buf_ops;
-		break;
-	case XFS_BLFT_DIR_LEAF1_BUF:
-		if (magicda != XFS_DIR2_LEAF1_MAGIC &&
-		    magicda != XFS_DIR3_LEAF1_MAGIC) {
-			warnmsg = "Bad dir leaf1 magic!";
-			break;
-		}
-		bp->b_ops = &xfs_dir3_leaf1_buf_ops;
-		break;
-	case XFS_BLFT_DIR_LEAFN_BUF:
-		if (magicda != XFS_DIR2_LEAFN_MAGIC &&
-		    magicda != XFS_DIR3_LEAFN_MAGIC) {
-			warnmsg = "Bad dir leafn magic!";
-			break;
-		}
-		bp->b_ops = &xfs_dir3_leafn_buf_ops;
-		break;
-	case XFS_BLFT_DA_NODE_BUF:
-		if (magicda != XFS_DA_NODE_MAGIC &&
-		    magicda != XFS_DA3_NODE_MAGIC) {
-			warnmsg = "Bad da node magic!";
-			break;
-		}
-		bp->b_ops = &xfs_da3_node_buf_ops;
-		break;
-	case XFS_BLFT_ATTR_LEAF_BUF:
-		if (magicda != XFS_ATTR_LEAF_MAGIC &&
-		    magicda != XFS_ATTR3_LEAF_MAGIC) {
-			warnmsg = "Bad attr leaf magic!";
-			break;
-		}
-		bp->b_ops = &xfs_attr3_leaf_buf_ops;
-		break;
-	case XFS_BLFT_ATTR_RMT_BUF:
-		if (magic32 != XFS_ATTR3_RMT_MAGIC) {
-			warnmsg = "Bad attr remote magic!";
-			break;
-		}
-		bp->b_ops = &xfs_attr3_rmt_buf_ops;
-		break;
-	case XFS_BLFT_SB_BUF:
-		if (magic32 != XFS_SB_MAGIC) {
-			warnmsg = "Bad SB block magic!";
-			break;
-		}
-		bp->b_ops = &xfs_sb_buf_ops;
-		break;
-#ifdef CONFIG_XFS_RT
-	case XFS_BLFT_RTBITMAP_BUF:
-	case XFS_BLFT_RTSUMMARY_BUF:
-		/* no magic numbers for verification of RT buffers */
-		bp->b_ops = &xfs_rtbuf_ops;
-		break;
-#endif /* CONFIG_XFS_RT */
-	default:
-		xfs_warn(mp, "Unknown buffer type %d!",
-			 xfs_blft_from_flags(buf_f));
-		break;
-	}
-
-	/*
-	 * Nothing else to do in the case of a NULL current LSN as this means
-	 * the buffer is more recent than the change in the log and will be
-	 * skipped.
-	 */
-	if (current_lsn == NULLCOMMITLSN)
-		return;
-
-	if (warnmsg) {
-		xfs_warn(mp, warnmsg);
-		ASSERT(0);
-	}
-
-	/*
-	 * We must update the metadata LSN of the buffer as it is written out to
-	 * ensure that older transactions never replay over this one and corrupt
-	 * the buffer. This can occur if log recovery is interrupted at some
-	 * point after the current transaction completes, at which point a
-	 * subsequent mount starts recovery from the beginning.
-	 *
-	 * Write verifiers update the metadata LSN from log items attached to
-	 * the buffer. Therefore, initialize a bli purely to carry the LSN to
-	 * the verifier. We'll clean it up in our ->iodone() callback.
-	 */
-	if (bp->b_ops) {
-		struct xfs_buf_log_item	*bip;
-
-		ASSERT(!bp->b_iodone || bp->b_iodone == xlog_recover_iodone);
-		bp->b_iodone = xlog_recover_iodone;
-		xfs_buf_item_init(bp, mp);
-		bip = bp->b_log_item;
-		bip->bli_item.li_lsn = current_lsn;
-	}
-}
-
-/*
- * Perform a 'normal' buffer recovery.  Each logged region of the
- * buffer should be copied over the corresponding region in the
- * given buffer.  The bitmap in the buf log format structure indicates
- * where to place the logged data.
- */
-STATIC void
-xlog_recover_do_reg_buffer(
-	struct xfs_mount	*mp,
-	xlog_recover_item_t	*item,
-	struct xfs_buf		*bp,
-	xfs_buf_log_format_t	*buf_f,
-	xfs_lsn_t		current_lsn)
-{
-	int			i;
-	int			bit;
-	int			nbits;
-	xfs_failaddr_t		fa;
-	const size_t		size_disk_dquot = sizeof(struct xfs_disk_dquot);
-
-	trace_xfs_log_recover_buf_reg_buf(mp->m_log, buf_f);
-
-	bit = 0;
-	i = 1;  /* 0 is the buf format structure */
-	while (1) {
-		bit = xfs_next_bit(buf_f->blf_data_map,
-				   buf_f->blf_map_size, bit);
-		if (bit == -1)
-			break;
-		nbits = xfs_contig_bits(buf_f->blf_data_map,
-					buf_f->blf_map_size, bit);
-		ASSERT(nbits > 0);
-		ASSERT(item->ri_buf[i].i_addr != NULL);
-		ASSERT(item->ri_buf[i].i_len % XFS_BLF_CHUNK == 0);
-		ASSERT(BBTOB(bp->b_length) >=
-		       ((uint)bit << XFS_BLF_SHIFT) + (nbits << XFS_BLF_SHIFT));
-
-		/*
-		 * The dirty regions logged in the buffer, even though
-		 * contiguous, may span multiple chunks. This is because the
-		 * dirty region may span a physical page boundary in a buffer
-		 * and hence be split into two separate vectors for writing into
-		 * the log. Hence we need to trim nbits back to the length of
-		 * the current region being copied out of the log.
-		 */
-		if (item->ri_buf[i].i_len < (nbits << XFS_BLF_SHIFT))
-			nbits = item->ri_buf[i].i_len >> XFS_BLF_SHIFT;
-
-		/*
-		 * Do a sanity check if this is a dquot buffer. Just checking
-		 * the first dquot in the buffer should do. XXXThis is
-		 * probably a good thing to do for other buf types also.
-		 */
-		fa = NULL;
-		if (buf_f->blf_flags &
-		   (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
-			if (item->ri_buf[i].i_addr == NULL) {
-				xfs_alert(mp,
-					"XFS: NULL dquot in %s.", __func__);
-				goto next;
-			}
-			if (item->ri_buf[i].i_len < size_disk_dquot) {
-				xfs_alert(mp,
-					"XFS: dquot too small (%d) in %s.",
-					item->ri_buf[i].i_len, __func__);
-				goto next;
-			}
-			fa = xfs_dquot_verify(mp, item->ri_buf[i].i_addr,
-					       -1, 0);
-			if (fa) {
-				xfs_alert(mp,
-	"dquot corrupt at %pS trying to replay into block 0x%llx",
-					fa, bp->b_bn);
-				goto next;
-			}
-		}
-
-		memcpy(xfs_buf_offset(bp,
-			(uint)bit << XFS_BLF_SHIFT),	/* dest */
-			item->ri_buf[i].i_addr,		/* source */
-			nbits<<XFS_BLF_SHIFT);		/* length */
- next:
-		i++;
-		bit += nbits;
-	}
-
-	/* Shouldn't be any more regions */
-	ASSERT(i == item->ri_total);
-
-	xlog_recover_validate_buf_type(mp, bp, buf_f, current_lsn);
-}
-
-/*
- * Perform a dquot buffer recovery.
- * Simple algorithm: if we have found a QUOTAOFF log item of the same type
- * (ie. USR or GRP), then just toss this buffer away; don't recover it.
- * Else, treat it as a regular buffer and do recovery.
- *
- * Return false if the buffer was tossed and true if we recovered the buffer to
- * indicate to the caller if the buffer needs writing.
- */
-STATIC bool
-xlog_recover_do_dquot_buffer(
-	struct xfs_mount		*mp,
-	struct xlog			*log,
-	struct xlog_recover_item	*item,
-	struct xfs_buf			*bp,
-	struct xfs_buf_log_format	*buf_f)
-{
-	uint			type;
-
-	trace_xfs_log_recover_buf_dquot_buf(log, buf_f);
-
-	/*
-	 * Filesystems are required to send in quota flags at mount time.
-	 */
-	if (!mp->m_qflags)
-		return false;
-
-	type = 0;
-	if (buf_f->blf_flags & XFS_BLF_UDQUOT_BUF)
-		type |= XFS_DQ_USER;
-	if (buf_f->blf_flags & XFS_BLF_PDQUOT_BUF)
-		type |= XFS_DQ_PROJ;
-	if (buf_f->blf_flags & XFS_BLF_GDQUOT_BUF)
-		type |= XFS_DQ_GROUP;
-	/*
-	 * This type of quotas was turned off, so ignore this buffer
-	 */
-	if (log->l_quotaoffs_flag & type)
-		return false;
-
-	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, NULLCOMMITLSN);
-	return true;
-}
-
-/*
- * This routine replays a modification made to a buffer at runtime.
- * There are actually two types of buffer, regular and inode, which
- * are handled differently.  Inode buffers are handled differently
- * in that we only recover a specific set of data from them, namely
- * the inode di_next_unlinked fields.  This is because all other inode
- * data is actually logged via inode records and any data we replay
- * here which overlaps that may be stale.
- *
- * When meta-data buffers are freed at run time we log a buffer item
- * with the XFS_BLF_CANCEL bit set to indicate that previous copies
- * of the buffer in the log should not be replayed at recovery time.
- * This is so that if the blocks covered by the buffer are reused for
- * file data before we crash we don't end up replaying old, freed
- * meta-data into a user's file.
- *
- * To handle the cancellation of buffer log items, we make two passes
- * over the log during recovery.  During the first we build a table of
- * those buffers which have been cancelled, and during the second we
- * only replay those buffers which do not have corresponding cancel
- * records in the table.  See xlog_recover_buffer_pass[1,2] above
- * for more details on the implementation of the table of cancel records.
- */
-STATIC int
-xlog_recover_buffer_pass2(
-	struct xlog			*log,
-	struct list_head		*buffer_list,
-	struct xlog_recover_item	*item,
-	xfs_lsn_t			current_lsn)
-{
-	xfs_buf_log_format_t	*buf_f = item->ri_buf[0].i_addr;
-	xfs_mount_t		*mp = log->l_mp;
-	xfs_buf_t		*bp;
-	int			error;
-	uint			buf_flags;
-	xfs_lsn_t		lsn;
-
-	/*
-	 * In this pass we only want to recover all the buffers which have
-	 * not been cancelled and are not cancellation buffers themselves.
-	 */
-	if (buf_f->blf_flags & XFS_BLF_CANCEL) {
-		if (xlog_put_buffer_cancelled(log, buf_f->blf_blkno,
-				buf_f->blf_len))
-			goto cancelled;
-	} else {
-
-		if (xlog_is_buffer_cancelled(log, buf_f->blf_blkno,
-				buf_f->blf_len))
-			goto cancelled;
-	}
-
-	trace_xfs_log_recover_buf_recover(log, buf_f);
-
-	buf_flags = 0;
-	if (buf_f->blf_flags & XFS_BLF_INODE_BUF)
-		buf_flags |= XBF_UNMAPPED;
-
-	error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno, buf_f->blf_len,
-			  buf_flags, &bp, NULL);
-	if (error)
-		return error;
-
-	/*
-	 * Recover the buffer only if we get an LSN from it and it's less than
-	 * the lsn of the transaction we are replaying.
-	 *
-	 * Note that we have to be extremely careful of readahead here.
-	 * Readahead does not attach verfiers to the buffers so if we don't
-	 * actually do any replay after readahead because of the LSN we found
-	 * in the buffer if more recent than that current transaction then we
-	 * need to attach the verifier directly. Failure to do so can lead to
-	 * future recovery actions (e.g. EFI and unlinked list recovery) can
-	 * operate on the buffers and they won't get the verifier attached. This
-	 * can lead to blocks on disk having the correct content but a stale
-	 * CRC.
-	 *
-	 * It is safe to assume these clean buffers are currently up to date.
-	 * If the buffer is dirtied by a later transaction being replayed, then
-	 * the verifier will be reset to match whatever recover turns that
-	 * buffer into.
-	 */
-	lsn = xlog_recover_get_buf_lsn(mp, bp);
-	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
-		trace_xfs_log_recover_buf_skip(log, buf_f);
-		xlog_recover_validate_buf_type(mp, bp, buf_f, NULLCOMMITLSN);
-		goto out_release;
-	}
-
-	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
-		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
-		if (error)
-			goto out_release;
-	} else if (buf_f->blf_flags &
-		  (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
-		bool	dirty;
-
-		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
-		if (!dirty)
-			goto out_release;
-	} else {
-		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
-	}
-
-	/*
-	 * Perform delayed write on the buffer.  Asynchronous writes will be
-	 * slower when taking into account all the buffers to be flushed.
-	 *
-	 * Also make sure that only inode buffers with good sizes stay in
-	 * the buffer cache.  The kernel moves inodes in buffers of 1 block
-	 * or inode_cluster_size bytes, whichever is bigger.  The inode
-	 * buffers in the log can be a different size if the log was generated
-	 * by an older kernel using unclustered inode buffers or a newer kernel
-	 * running with a different inode cluster size.  Regardless, if the
-	 * the inode buffer size isn't max(blocksize, inode_cluster_size)
-	 * for *our* value of inode_cluster_size, then we need to keep
-	 * the buffer out of the buffer cache so that the buffer won't
-	 * overlap with future reads of those inodes.
-	 */
-	if (XFS_DINODE_MAGIC ==
-	    be16_to_cpu(*((__be16 *)xfs_buf_offset(bp, 0))) &&
-	    (BBTOB(bp->b_length) != M_IGEO(log->l_mp)->inode_cluster_size)) {
-		xfs_buf_stale(bp);
-		error = xfs_bwrite(bp);
-	} else {
-		ASSERT(bp->b_mount == mp);
-		bp->b_iodone = xlog_recover_iodone;
-		xfs_buf_delwri_queue(bp, buffer_list);
-	}
-
-out_release:
-	xfs_buf_relse(bp);
-	return error;
-cancelled:
-	trace_xfs_log_recover_buf_cancel(log, buf_f);
-	return 0;
-}
-
 /*
  * Inode fork owner changes
  *
@@ -3887,10 +3102,11 @@ xlog_recover_commit_pass2(
 {
 	trace_xfs_log_recover_item_recover(log, trans, item, XLOG_RECOVER_PASS2);
 
+	if (item->ri_type && item->ri_type->commit_pass2_fn)
+		return item->ri_type->commit_pass2_fn(log, buffer_list, item,
+				trans->r_lsn);
+
 	switch (ITEM_TYPE(item)) {
-	case XFS_LI_BUF:
-		return xlog_recover_buffer_pass2(log, buffer_list, item,
-						 trans->r_lsn);
 	case XFS_LI_INODE:
 		return xlog_recover_inode_pass2(log, buffer_list, item,
 						 trans->r_lsn);



* [PATCH 05/21] xfs: refactor log recovery inode item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (3 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 04/21] xfs: refactor log recovery buffer item dispatch for pass2 " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-05-01 14:03   ` Chandan Rajendra
  2020-04-30  0:48 ` [PATCH 06/21] xfs: refactor log recovery dquot " Darrick J. Wong
                   ` (16 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the log inode item pass2 commit code into the per-item source code
files and use the dispatch function to call it.  We do these one at a
time because there's a lot of code to move.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
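
For readers new to the series, the "dispatch function" mentioned above can
be pictured with a small userspace sketch.  This is only an illustration of
the pattern, not the kernel code: every demo_* name, the simplified
two-argument hook signature and the integer return markers are hypothetical.
The one thing taken from the series is the shape of the check, which prefers
a per-type commit_pass2_fn hook and falls back to the legacy switch for item
types that have not been converted yet.

```c
#include <assert.h>
#include <stddef.h>

struct demo_item;

/* Hypothetical stand-in for the per-item-type ops table. */
struct demo_item_type {
	/* Optional pass2 commit hook; NULL means "not converted yet". */
	int (*commit_pass2_fn)(struct demo_item *item, long lsn);
};

/* Hypothetical stand-in for struct xlog_recover_item. */
struct demo_item {
	int				legacy_type;	/* used by the fallback switch */
	const struct demo_item_type	*ri_type;	/* may be NULL */
};

static int demo_buf_commit_pass2(struct demo_item *item, long lsn)
{
	(void)item;
	(void)lsn;
	return 1;	/* marker: handled by the dispatch hook */
}

static const struct demo_item_type demo_buf_item_type = {
	.commit_pass2_fn = demo_buf_commit_pass2,
};

/* Prefer the per-type hook; otherwise fall back to the legacy switch. */
static int demo_commit_pass2(struct demo_item *item, long lsn)
{
	assert(item != NULL);

	if (item->ri_type && item->ri_type->commit_pass2_fn)
		return item->ri_type->commit_pass2_fn(item, lsn);

	switch (item->legacy_type) {
	case 2:
		return 2;	/* marker: handled by the legacy switch */
	default:
		return -1;	/* unrecognised item type */
	}
}
```

An item whose ri_type carries a hook never reaches the switch, which is why
each patch in the series can delete one case label at a time without
touching the remaining ones.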
 fs/xfs/xfs_inode_item_recover.c |  355 +++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_log_recover.c        |  355 ---------------------------------------
 2 files changed, 355 insertions(+), 355 deletions(-)
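
The code being moved decides whether to replay an inode in two different
ways: by LSN when the on-disk inode is v3 (di_version >= 3), and by
di_flushiter, including a wrap case, when the superblock does not have v3
inodes.  A rough userspace sketch of those two tests follows.  All demo_*
names are hypothetical, the LSN layout (cycle in the upper 32 bits, block in
the lower 32 bits) and the 0xffff value used for the flush counter maximum
are assumptions made for the illustration, and the comparison helper is a
simplified stand-in for XFS_LSN_CMP rather than a copy of it.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_FLUSH	0xffff	/* stand-in for DI_MAX_FLUSH; value assumed */

/* Simplified LSN compare: cycle (high 32 bits) first, then block. */
static int demo_lsn_cmp(uint64_t a, uint64_t b)
{
	uint32_t cycle_a = (uint32_t)(a >> 32), cycle_b = (uint32_t)(b >> 32);
	uint32_t block_a = (uint32_t)a, block_b = (uint32_t)b;

	if (cycle_a != cycle_b)
		return cycle_a < cycle_b ? -1 : 1;
	if (block_a != block_b)
		return block_a < block_b ? -1 : 1;
	return 0;
}

/* v3 inodes: skip replay when the on-disk LSN is valid and not older. */
static bool demo_skip_by_lsn(uint64_t disk_lsn, uint64_t current_lsn)
{
	return disk_lsn && disk_lsn != UINT64_MAX &&
	       demo_lsn_cmp(disk_lsn, current_lsn) >= 0;
}

/* Older inodes: skip when the logged counter is behind the on-disk one. */
static bool demo_skip_by_flushiter(uint16_t log_iter, uint16_t disk_iter)
{
	if (log_iter >= disk_iter)
		return false;	/* log copy is not older: replay */

	/* Wrap case: disk sits at the maximum, log has restarted low. */
	if (disk_iter == DEMO_MAX_FLUSH && log_iter < (DEMO_MAX_FLUSH >> 1))
		return false;

	return true;
}
```

Note that in the patch an inode skipped by LSN still jumps to
out_owner_change, whereas one skipped by di_flushiter goes straight to
out_release; the sketch only models the yes/no decision.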


diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
index d97d8caa4652..46fc8a4b9ac6 100644
--- a/fs/xfs/xfs_inode_item_recover.c
+++ b/fs/xfs/xfs_inode_item_recover.c
@@ -20,6 +20,8 @@
 #include "xfs_error.h"
 #include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
+#include "xfs_icache.h"
+#include "xfs_bmap_btree.h"
 
 STATIC void
 xlog_recover_inode_ra_pass2(
@@ -39,6 +41,359 @@ xlog_recover_inode_ra_pass2(
 	}
 }
 
+/*
+ * Inode fork owner changes
+ *
+ * If we have been told that we have to reparent the inode fork, it's because an
+ * extent swap operation on a CRC enabled filesystem has been done and we are
+ * replaying it. We need to walk the BMBT of the appropriate fork and change the
+ * owners of it.
+ *
+ * The complexity here is that we don't have an inode context to work with, so
+ * after we've replayed the inode we need to instantiate one.  This is where the
+ * fun begins.
+ *
+ * We are in the middle of log recovery, so we can't run transactions. That
+ * means we cannot use cache coherent inode instantiation via xfs_iget(), as
+ * that will result in the corresponding iput() running the inode through
+ * xfs_inactive(). If we've just replayed an inode core that changes the link
+ * count to zero (i.e. it's been unlinked), then xfs_inactive() will run
+ * transactions (bad!).
+ *
+ * So, to avoid this, we instantiate an inode directly from the inode core we've
+ * just recovered. We have the buffer still locked, and all we really need to
+ * instantiate is the inode core and the forks being modified. We can do this
+ * manually, then run the inode btree owner change, and then tear down the
+ * xfs_inode without having to run any transactions at all.
+ *
+ * Also, we don't have a transaction context available here but still need to
+ * gather all the buffers we modify for writeback, so we pass the buffer_list
+ * for the operation to use instead.
+ */
+
+STATIC int
+xfs_recover_inode_owner_change(
+	struct xfs_mount	*mp,
+	struct xfs_dinode	*dip,
+	struct xfs_inode_log_format *in_f,
+	struct list_head	*buffer_list)
+{
+	struct xfs_inode	*ip;
+	int			error;
+
+	ASSERT(in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER));
+
+	ip = xfs_inode_alloc(mp, in_f->ilf_ino);
+	if (!ip)
+		return -ENOMEM;
+
+	/* instantiate the inode */
+	ASSERT(dip->di_version >= 3);
+	xfs_inode_from_disk(ip, dip);
+
+	error = xfs_iformat_fork(ip, dip);
+	if (error)
+		goto out_free_ip;
+
+	if (!xfs_inode_verify_forks(ip)) {
+		error = -EFSCORRUPTED;
+		goto out_free_ip;
+	}
+
+	if (in_f->ilf_fields & XFS_ILOG_DOWNER) {
+		ASSERT(in_f->ilf_fields & XFS_ILOG_DBROOT);
+		error = xfs_bmbt_change_owner(NULL, ip, XFS_DATA_FORK,
+					      ip->i_ino, buffer_list);
+		if (error)
+			goto out_free_ip;
+	}
+
+	if (in_f->ilf_fields & XFS_ILOG_AOWNER) {
+		ASSERT(in_f->ilf_fields & XFS_ILOG_ABROOT);
+		error = xfs_bmbt_change_owner(NULL, ip, XFS_ATTR_FORK,
+					      ip->i_ino, buffer_list);
+		if (error)
+			goto out_free_ip;
+	}
+
+out_free_ip:
+	xfs_inode_free(ip);
+	return error;
+}
+
+STATIC int
+xlog_recover_inode_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			current_lsn)
+{
+	struct xfs_inode_log_format	*in_f;
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_buf			*bp;
+	struct xfs_dinode		*dip;
+	int				len;
+	char				*src;
+	char				*dest;
+	int				error;
+	int				attr_index;
+	uint				fields;
+	struct xfs_log_dinode		*ldip;
+	uint				isize;
+	int				need_free = 0;
+
+	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
+		in_f = item->ri_buf[0].i_addr;
+	} else {
+		in_f = kmem_alloc(sizeof(struct xfs_inode_log_format), 0);
+		need_free = 1;
+		error = xfs_inode_item_format_convert(&item->ri_buf[0], in_f);
+		if (error)
+			goto error;
+	}
+
+	/*
+	 * Inode buffers can be freed; look out for that
+	 * and do not replay the inode in that case.
+	 */
+	if (xlog_is_buffer_cancelled(log, in_f->ilf_blkno, in_f->ilf_len)) {
+		error = 0;
+		trace_xfs_log_recover_inode_cancel(log, in_f);
+		goto error;
+	}
+	trace_xfs_log_recover_inode_recover(log, in_f);
+
+	error = xfs_buf_read(mp->m_ddev_targp, in_f->ilf_blkno, in_f->ilf_len,
+			0, &bp, &xfs_inode_buf_ops);
+	if (error)
+		goto error;
+	ASSERT(in_f->ilf_fields & XFS_ILOG_CORE);
+	dip = xfs_buf_offset(bp, in_f->ilf_boffset);
+
+	/*
+	 * Make sure the place we're flushing out to really looks
+	 * like an inode!
+	 */
+	if (XFS_IS_CORRUPT(mp, !xfs_verify_magic16(bp, dip->di_magic))) {
+		xfs_alert(mp,
+	"%s: Bad inode magic number, dip = "PTR_FMT", dino bp = "PTR_FMT", ino = %Ld",
+			__func__, dip, bp, in_f->ilf_ino);
+		error = -EFSCORRUPTED;
+		goto out_release;
+	}
+	ldip = item->ri_buf[1].i_addr;
+	if (XFS_IS_CORRUPT(mp, ldip->di_magic != XFS_DINODE_MAGIC)) {
+		xfs_alert(mp,
+			"%s: Bad inode log record, rec ptr "PTR_FMT", ino %Ld",
+			__func__, item, in_f->ilf_ino);
+		error = -EFSCORRUPTED;
+		goto out_release;
+	}
+
+	/*
+	 * If the inode has an LSN in it, recover the inode only if it's less
+	 * than the lsn of the transaction we are replaying. Note: we still
+	 * need to replay an owner change even though the inode is more recent
+	 * than the transaction as there is no guarantee that all the btree
+	 * blocks are more recent than this transaction, too.
+	 */
+	if (dip->di_version >= 3) {
+		xfs_lsn_t	lsn = be64_to_cpu(dip->di_lsn);
+
+		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
+			trace_xfs_log_recover_inode_skip(log, in_f);
+			error = 0;
+			goto out_owner_change;
+		}
+	}
+
+	/*
+	 * di_flushiter is only valid for v1/2 inodes. All changes for v3 inodes
+	 * are transactional and if ordering is necessary we can determine that
+	 * more accurately by the LSN field in the V3 inode core. Don't trust
+	 * the inode versions, as we might be changing them here - use the
+	 * superblock flag to determine whether we need to look at di_flushiter
+	 * to skip replay when the on disk inode is newer than the log one.
+	 */
+	if (!xfs_sb_version_has_v3inode(&mp->m_sb) &&
+	    ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
+		/*
+		 * Deal with the wrap case: DI_MAX_FLUSH counts as less
+		 * than smaller numbers.
+		 */
+		if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
+		    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
+			/* do nothing */
+		} else {
+			trace_xfs_log_recover_inode_skip(log, in_f);
+			error = 0;
+			goto out_release;
+		}
+	}
+
+	/* Take the opportunity to reset the flush iteration count */
+	ldip->di_flushiter = 0;
+
+	if (unlikely(S_ISREG(ldip->di_mode))) {
+		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
+		    (ldip->di_format != XFS_DINODE_FMT_BTREE)) {
+			XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(3)",
+					 XFS_ERRLEVEL_LOW, mp, ldip,
+					 sizeof(*ldip));
+			xfs_alert(mp,
+		"%s: Bad regular inode log record, rec ptr "PTR_FMT", "
+		"ino ptr = "PTR_FMT", ino bp = "PTR_FMT", ino %Ld",
+				__func__, item, dip, bp, in_f->ilf_ino);
+			error = -EFSCORRUPTED;
+			goto out_release;
+		}
+	} else if (unlikely(S_ISDIR(ldip->di_mode))) {
+		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
+		    (ldip->di_format != XFS_DINODE_FMT_BTREE) &&
+		    (ldip->di_format != XFS_DINODE_FMT_LOCAL)) {
+			XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(4)",
+					     XFS_ERRLEVEL_LOW, mp, ldip,
+					     sizeof(*ldip));
+			xfs_alert(mp,
+		"%s: Bad dir inode log record, rec ptr "PTR_FMT", "
+		"ino ptr = "PTR_FMT", ino bp = "PTR_FMT", ino %Ld",
+				__func__, item, dip, bp, in_f->ilf_ino);
+			error = -EFSCORRUPTED;
+			goto out_release;
+		}
+	}
+	if (unlikely(ldip->di_nextents + ldip->di_anextents > ldip->di_nblocks)){
+		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(5)",
+				     XFS_ERRLEVEL_LOW, mp, ldip,
+				     sizeof(*ldip));
+		xfs_alert(mp,
+	"%s: Bad inode log record, rec ptr "PTR_FMT", dino ptr "PTR_FMT", "
+	"dino bp "PTR_FMT", ino %Ld, total extents = %d, nblocks = %Ld",
+			__func__, item, dip, bp, in_f->ilf_ino,
+			ldip->di_nextents + ldip->di_anextents,
+			ldip->di_nblocks);
+		error = -EFSCORRUPTED;
+		goto out_release;
+	}
+	if (unlikely(ldip->di_forkoff > mp->m_sb.sb_inodesize)) {
+		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(6)",
+				     XFS_ERRLEVEL_LOW, mp, ldip,
+				     sizeof(*ldip));
+		xfs_alert(mp,
+	"%s: Bad inode log record, rec ptr "PTR_FMT", dino ptr "PTR_FMT", "
+	"dino bp "PTR_FMT", ino %Ld, forkoff 0x%x", __func__,
+			item, dip, bp, in_f->ilf_ino, ldip->di_forkoff);
+		error = -EFSCORRUPTED;
+		goto out_release;
+	}
+	isize = xfs_log_dinode_size(mp);
+	if (unlikely(item->ri_buf[1].i_len > isize)) {
+		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(7)",
+				     XFS_ERRLEVEL_LOW, mp, ldip,
+				     sizeof(*ldip));
+		xfs_alert(mp,
+			"%s: Bad inode log record length %d, rec ptr "PTR_FMT,
+			__func__, item->ri_buf[1].i_len, item);
+		error = -EFSCORRUPTED;
+		goto out_release;
+	}
+
+	/* recover the log dinode inode into the on disk inode */
+	xfs_log_dinode_to_disk(ldip, dip);
+
+	fields = in_f->ilf_fields;
+	if (fields & XFS_ILOG_DEV)
+		xfs_dinode_put_rdev(dip, in_f->ilf_u.ilfu_rdev);
+
+	if (in_f->ilf_size == 2)
+		goto out_owner_change;
+	len = item->ri_buf[2].i_len;
+	src = item->ri_buf[2].i_addr;
+	ASSERT(in_f->ilf_size <= 4);
+	ASSERT((in_f->ilf_size == 3) || (fields & XFS_ILOG_AFORK));
+	ASSERT(!(fields & XFS_ILOG_DFORK) ||
+	       (len == in_f->ilf_dsize));
+
+	switch (fields & XFS_ILOG_DFORK) {
+	case XFS_ILOG_DDATA:
+	case XFS_ILOG_DEXT:
+		memcpy(XFS_DFORK_DPTR(dip), src, len);
+		break;
+
+	case XFS_ILOG_DBROOT:
+		xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src, len,
+				 (struct xfs_bmdr_block *)XFS_DFORK_DPTR(dip),
+				 XFS_DFORK_DSIZE(dip, mp));
+		break;
+
+	default:
+		/*
+		 * There are no data fork flags set.
+		 */
+		ASSERT((fields & XFS_ILOG_DFORK) == 0);
+		break;
+	}
+
+	/*
+	 * If we logged any attribute data, recover it.  There may or
+	 * may not have been any other non-core data logged in this
+	 * transaction.
+	 */
+	if (in_f->ilf_fields & XFS_ILOG_AFORK) {
+		if (in_f->ilf_fields & XFS_ILOG_DFORK) {
+			attr_index = 3;
+		} else {
+			attr_index = 2;
+		}
+		len = item->ri_buf[attr_index].i_len;
+		src = item->ri_buf[attr_index].i_addr;
+		ASSERT(len == in_f->ilf_asize);
+
+		switch (in_f->ilf_fields & XFS_ILOG_AFORK) {
+		case XFS_ILOG_ADATA:
+		case XFS_ILOG_AEXT:
+			dest = XFS_DFORK_APTR(dip);
+			ASSERT(len <= XFS_DFORK_ASIZE(dip, mp));
+			memcpy(dest, src, len);
+			break;
+
+		case XFS_ILOG_ABROOT:
+			dest = XFS_DFORK_APTR(dip);
+			xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src,
+					 len, (struct xfs_bmdr_block *)dest,
+					 XFS_DFORK_ASIZE(dip, mp));
+			break;
+
+		default:
+			xfs_warn(log->l_mp, "%s: Invalid flag", __func__);
+			ASSERT(0);
+			error = -EFSCORRUPTED;
+			goto out_release;
+		}
+	}
+
+out_owner_change:
+	/* Recover the swapext owner change unless inode has been deleted */
+	if ((in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER)) &&
+	    (dip->di_mode != 0))
+		error = xfs_recover_inode_owner_change(mp, dip, in_f,
+						       buffer_list);
+	/* re-generate the checksum. */
+	xfs_dinode_calc_crc(log->l_mp, dip);
+
+	ASSERT(bp->b_mount == mp);
+	bp->b_iodone = xlog_recover_iodone;
+	xfs_buf_delwri_queue(bp, buffer_list);
+
+out_release:
+	xfs_buf_relse(bp);
+error:
+	if (need_free)
+		kmem_free(in_f);
+	return error;
+}
+
 const struct xlog_recover_item_type xlog_inode_item_type = {
 	.ra_pass2_fn		= xlog_recover_inode_ra_pass2,
+	.commit_pass2_fn	= xlog_recover_inode_commit_pass2,
 };
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 0a241f1c371a..57e5dac0f510 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2056,358 +2056,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-/*
- * Inode fork owner changes
- *
- * If we have been told that we have to reparent the inode fork, it's because an
- * extent swap operation on a CRC enabled filesystem has been done and we are
- * replaying it. We need to walk the BMBT of the appropriate fork and change the
- * owners of it.
- *
- * The complexity here is that we don't have an inode context to work with, so
- * after we've replayed the inode we need to instantiate one.  This is where the
- * fun begins.
- *
- * We are in the middle of log recovery, so we can't run transactions. That
- * means we cannot use cache coherent inode instantiation via xfs_iget(), as
- * that will result in the corresponding iput() running the inode through
- * xfs_inactive(). If we've just replayed an inode core that changes the link
- * count to zero (i.e. it's been unlinked), then xfs_inactive() will run
- * transactions (bad!).
- *
- * So, to avoid this, we instantiate an inode directly from the inode core we've
- * just recovered. We have the buffer still locked, and all we really need to
- * instantiate is the inode core and the forks being modified. We can do this
- * manually, then run the inode btree owner change, and then tear down the
- * xfs_inode without having to run any transactions at all.
- *
- * Also, because we don't have a transaction context available here but need to
- * gather all the buffers we modify for writeback so we pass the buffer_list
- * instead for the operation to use.
- */
-
-STATIC int
-xfs_recover_inode_owner_change(
-	struct xfs_mount	*mp,
-	struct xfs_dinode	*dip,
-	struct xfs_inode_log_format *in_f,
-	struct list_head	*buffer_list)
-{
-	struct xfs_inode	*ip;
-	int			error;
-
-	ASSERT(in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER));
-
-	ip = xfs_inode_alloc(mp, in_f->ilf_ino);
-	if (!ip)
-		return -ENOMEM;
-
-	/* instantiate the inode */
-	ASSERT(dip->di_version >= 3);
-	xfs_inode_from_disk(ip, dip);
-
-	error = xfs_iformat_fork(ip, dip);
-	if (error)
-		goto out_free_ip;
-
-	if (!xfs_inode_verify_forks(ip)) {
-		error = -EFSCORRUPTED;
-		goto out_free_ip;
-	}
-
-	if (in_f->ilf_fields & XFS_ILOG_DOWNER) {
-		ASSERT(in_f->ilf_fields & XFS_ILOG_DBROOT);
-		error = xfs_bmbt_change_owner(NULL, ip, XFS_DATA_FORK,
-					      ip->i_ino, buffer_list);
-		if (error)
-			goto out_free_ip;
-	}
-
-	if (in_f->ilf_fields & XFS_ILOG_AOWNER) {
-		ASSERT(in_f->ilf_fields & XFS_ILOG_ABROOT);
-		error = xfs_bmbt_change_owner(NULL, ip, XFS_ATTR_FORK,
-					      ip->i_ino, buffer_list);
-		if (error)
-			goto out_free_ip;
-	}
-
-out_free_ip:
-	xfs_inode_free(ip);
-	return error;
-}
-
-STATIC int
-xlog_recover_inode_pass2(
-	struct xlog			*log,
-	struct list_head		*buffer_list,
-	struct xlog_recover_item	*item,
-	xfs_lsn_t			current_lsn)
-{
-	struct xfs_inode_log_format	*in_f;
-	xfs_mount_t		*mp = log->l_mp;
-	xfs_buf_t		*bp;
-	xfs_dinode_t		*dip;
-	int			len;
-	char			*src;
-	char			*dest;
-	int			error;
-	int			attr_index;
-	uint			fields;
-	struct xfs_log_dinode	*ldip;
-	uint			isize;
-	int			need_free = 0;
-
-	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
-		in_f = item->ri_buf[0].i_addr;
-	} else {
-		in_f = kmem_alloc(sizeof(struct xfs_inode_log_format), 0);
-		need_free = 1;
-		error = xfs_inode_item_format_convert(&item->ri_buf[0], in_f);
-		if (error)
-			goto error;
-	}
-
-	/*
-	 * Inode buffers can be freed, look out for it,
-	 * and do not replay the inode.
-	 */
-	if (xlog_is_buffer_cancelled(log, in_f->ilf_blkno, in_f->ilf_len)) {
-		error = 0;
-		trace_xfs_log_recover_inode_cancel(log, in_f);
-		goto error;
-	}
-	trace_xfs_log_recover_inode_recover(log, in_f);
-
-	error = xfs_buf_read(mp->m_ddev_targp, in_f->ilf_blkno, in_f->ilf_len,
-			0, &bp, &xfs_inode_buf_ops);
-	if (error)
-		goto error;
-	ASSERT(in_f->ilf_fields & XFS_ILOG_CORE);
-	dip = xfs_buf_offset(bp, in_f->ilf_boffset);
-
-	/*
-	 * Make sure the place we're flushing out to really looks
-	 * like an inode!
-	 */
-	if (XFS_IS_CORRUPT(mp, !xfs_verify_magic16(bp, dip->di_magic))) {
-		xfs_alert(mp,
-	"%s: Bad inode magic number, dip = "PTR_FMT", dino bp = "PTR_FMT", ino = %Ld",
-			__func__, dip, bp, in_f->ilf_ino);
-		error = -EFSCORRUPTED;
-		goto out_release;
-	}
-	ldip = item->ri_buf[1].i_addr;
-	if (XFS_IS_CORRUPT(mp, ldip->di_magic != XFS_DINODE_MAGIC)) {
-		xfs_alert(mp,
-			"%s: Bad inode log record, rec ptr "PTR_FMT", ino %Ld",
-			__func__, item, in_f->ilf_ino);
-		error = -EFSCORRUPTED;
-		goto out_release;
-	}
-
-	/*
-	 * If the inode has an LSN in it, recover the inode only if it's less
-	 * than the lsn of the transaction we are replaying. Note: we still
-	 * need to replay an owner change even though the inode is more recent
-	 * than the transaction as there is no guarantee that all the btree
-	 * blocks are more recent than this transaction, too.
-	 */
-	if (dip->di_version >= 3) {
-		xfs_lsn_t	lsn = be64_to_cpu(dip->di_lsn);
-
-		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
-			trace_xfs_log_recover_inode_skip(log, in_f);
-			error = 0;
-			goto out_owner_change;
-		}
-	}
-
-	/*
-	 * di_flushiter is only valid for v1/2 inodes. All changes for v3 inodes
-	 * are transactional and if ordering is necessary we can determine that
-	 * more accurately by the LSN field in the V3 inode core. Don't trust
-	 * the inode versions we might be changing them here - use the
-	 * superblock flag to determine whether we need to look at di_flushiter
-	 * to skip replay when the on disk inode is newer than the log one
-	 */
-	if (!xfs_sb_version_has_v3inode(&mp->m_sb) &&
-	    ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
-		/*
-		 * Deal with the wrap case, DI_MAX_FLUSH is less
-		 * than smaller numbers
-		 */
-		if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
-		    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
-			/* do nothing */
-		} else {
-			trace_xfs_log_recover_inode_skip(log, in_f);
-			error = 0;
-			goto out_release;
-		}
-	}
-
-	/* Take the opportunity to reset the flush iteration count */
-	ldip->di_flushiter = 0;
-
-	if (unlikely(S_ISREG(ldip->di_mode))) {
-		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
-		    (ldip->di_format != XFS_DINODE_FMT_BTREE)) {
-			XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(3)",
-					 XFS_ERRLEVEL_LOW, mp, ldip,
-					 sizeof(*ldip));
-			xfs_alert(mp,
-		"%s: Bad regular inode log record, rec ptr "PTR_FMT", "
-		"ino ptr = "PTR_FMT", ino bp = "PTR_FMT", ino %Ld",
-				__func__, item, dip, bp, in_f->ilf_ino);
-			error = -EFSCORRUPTED;
-			goto out_release;
-		}
-	} else if (unlikely(S_ISDIR(ldip->di_mode))) {
-		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
-		    (ldip->di_format != XFS_DINODE_FMT_BTREE) &&
-		    (ldip->di_format != XFS_DINODE_FMT_LOCAL)) {
-			XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(4)",
-					     XFS_ERRLEVEL_LOW, mp, ldip,
-					     sizeof(*ldip));
-			xfs_alert(mp,
-		"%s: Bad dir inode log record, rec ptr "PTR_FMT", "
-		"ino ptr = "PTR_FMT", ino bp = "PTR_FMT", ino %Ld",
-				__func__, item, dip, bp, in_f->ilf_ino);
-			error = -EFSCORRUPTED;
-			goto out_release;
-		}
-	}
-	if (unlikely(ldip->di_nextents + ldip->di_anextents > ldip->di_nblocks)){
-		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(5)",
-				     XFS_ERRLEVEL_LOW, mp, ldip,
-				     sizeof(*ldip));
-		xfs_alert(mp,
-	"%s: Bad inode log record, rec ptr "PTR_FMT", dino ptr "PTR_FMT", "
-	"dino bp "PTR_FMT", ino %Ld, total extents = %d, nblocks = %Ld",
-			__func__, item, dip, bp, in_f->ilf_ino,
-			ldip->di_nextents + ldip->di_anextents,
-			ldip->di_nblocks);
-		error = -EFSCORRUPTED;
-		goto out_release;
-	}
-	if (unlikely(ldip->di_forkoff > mp->m_sb.sb_inodesize)) {
-		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(6)",
-				     XFS_ERRLEVEL_LOW, mp, ldip,
-				     sizeof(*ldip));
-		xfs_alert(mp,
-	"%s: Bad inode log record, rec ptr "PTR_FMT", dino ptr "PTR_FMT", "
-	"dino bp "PTR_FMT", ino %Ld, forkoff 0x%x", __func__,
-			item, dip, bp, in_f->ilf_ino, ldip->di_forkoff);
-		error = -EFSCORRUPTED;
-		goto out_release;
-	}
-	isize = xfs_log_dinode_size(mp);
-	if (unlikely(item->ri_buf[1].i_len > isize)) {
-		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(7)",
-				     XFS_ERRLEVEL_LOW, mp, ldip,
-				     sizeof(*ldip));
-		xfs_alert(mp,
-			"%s: Bad inode log record length %d, rec ptr "PTR_FMT,
-			__func__, item->ri_buf[1].i_len, item);
-		error = -EFSCORRUPTED;
-		goto out_release;
-	}
-
-	/* recover the log dinode inode into the on disk inode */
-	xfs_log_dinode_to_disk(ldip, dip);
-
-	fields = in_f->ilf_fields;
-	if (fields & XFS_ILOG_DEV)
-		xfs_dinode_put_rdev(dip, in_f->ilf_u.ilfu_rdev);
-
-	if (in_f->ilf_size == 2)
-		goto out_owner_change;
-	len = item->ri_buf[2].i_len;
-	src = item->ri_buf[2].i_addr;
-	ASSERT(in_f->ilf_size <= 4);
-	ASSERT((in_f->ilf_size == 3) || (fields & XFS_ILOG_AFORK));
-	ASSERT(!(fields & XFS_ILOG_DFORK) ||
-	       (len == in_f->ilf_dsize));
-
-	switch (fields & XFS_ILOG_DFORK) {
-	case XFS_ILOG_DDATA:
-	case XFS_ILOG_DEXT:
-		memcpy(XFS_DFORK_DPTR(dip), src, len);
-		break;
-
-	case XFS_ILOG_DBROOT:
-		xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src, len,
-				 (xfs_bmdr_block_t *)XFS_DFORK_DPTR(dip),
-				 XFS_DFORK_DSIZE(dip, mp));
-		break;
-
-	default:
-		/*
-		 * There are no data fork flags set.
-		 */
-		ASSERT((fields & XFS_ILOG_DFORK) == 0);
-		break;
-	}
-
-	/*
-	 * If we logged any attribute data, recover it.  There may or
-	 * may not have been any other non-core data logged in this
-	 * transaction.
-	 */
-	if (in_f->ilf_fields & XFS_ILOG_AFORK) {
-		if (in_f->ilf_fields & XFS_ILOG_DFORK) {
-			attr_index = 3;
-		} else {
-			attr_index = 2;
-		}
-		len = item->ri_buf[attr_index].i_len;
-		src = item->ri_buf[attr_index].i_addr;
-		ASSERT(len == in_f->ilf_asize);
-
-		switch (in_f->ilf_fields & XFS_ILOG_AFORK) {
-		case XFS_ILOG_ADATA:
-		case XFS_ILOG_AEXT:
-			dest = XFS_DFORK_APTR(dip);
-			ASSERT(len <= XFS_DFORK_ASIZE(dip, mp));
-			memcpy(dest, src, len);
-			break;
-
-		case XFS_ILOG_ABROOT:
-			dest = XFS_DFORK_APTR(dip);
-			xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src,
-					 len, (xfs_bmdr_block_t*)dest,
-					 XFS_DFORK_ASIZE(dip, mp));
-			break;
-
-		default:
-			xfs_warn(log->l_mp, "%s: Invalid flag", __func__);
-			ASSERT(0);
-			error = -EFSCORRUPTED;
-			goto out_release;
-		}
-	}
-
-out_owner_change:
-	/* Recover the swapext owner change unless inode has been deleted */
-	if ((in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER)) &&
-	    (dip->di_mode != 0))
-		error = xfs_recover_inode_owner_change(mp, dip, in_f,
-						       buffer_list);
-	/* re-generate the checksum. */
-	xfs_dinode_calc_crc(log->l_mp, dip);
-
-	ASSERT(bp->b_mount == mp);
-	bp->b_iodone = xlog_recover_iodone;
-	xfs_buf_delwri_queue(bp, buffer_list);
-
-out_release:
-	xfs_buf_relse(bp);
-error:
-	if (need_free)
-		kmem_free(in_f);
-	return error;
-}
-
 /*
  * Recover a dquot record
  */
@@ -3107,9 +2755,6 @@ xlog_recover_commit_pass2(
 				trans->r_lsn);
 
 	switch (ITEM_TYPE(item)) {
-	case XFS_LI_INODE:
-		return xlog_recover_inode_pass2(log, buffer_list, item,
-						 trans->r_lsn);
 	case XFS_LI_EFI:
 		return xlog_recover_efi_pass2(log, item, trans->r_lsn);
 	case XFS_LI_EFD:
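
The di_flushiter check moved by this patch is easy to misread, so here
is a minimal user-space sketch of it.  This is not the kernel code:
flushiter_skip() and MAX_FLUSH are hypothetical names, MAX_FLUSH is
assumed to be 0xffff (standing in for DI_MAX_FLUSH), and both counters
are taken in host byte order, whereas the real code converts the
on-disk value with be16_to_cpu().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for DI_MAX_FLUSH; hypothetical name. */
#define MAX_FLUSH	0xffffu

/*
 * Decide whether replay of a pre-v3 inode should be skipped because the
 * on-disk copy has a newer flush iteration than the logged copy.
 */
static bool
flushiter_skip(
	uint16_t	log_iter,	/* flushiter in the log record */
	uint16_t	disk_iter)	/* flushiter in the on-disk inode */
{
	/* The log copy is at least as new: replay it. */
	if (log_iter >= disk_iter)
		return false;

	/*
	 * Wrap case: the disk counter sits at the maximum and the log
	 * counter is small, so the log copy was written after the
	 * counter wrapped and is really the newer one.
	 */
	if (disk_iter == MAX_FLUSH && log_iter < (MAX_FLUSH >> 1))
		return false;

	/* The disk copy is newer: skip replay. */
	return true;
}
```

With these assumptions, flushiter_skip(5, 10) skips replay, while
flushiter_skip(3, 0xffff) replays because it is treated as a wrap.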



* [PATCH 06/21] xfs: refactor log recovery dquot item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (4 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 05/21] xfs: refactor log recovery inode " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-05-01 14:14   ` Chandan Rajendra
  2020-04-30  0:48 ` [PATCH 07/21] xfs: refactor log recovery icreate " Darrick J. Wong
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the log dquot item pass2 commit code into the per-item source code
files and use the dispatch function to call it.  We do these one at a
time because there's a lot of code to move.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_dquot_item.c  |  109 +++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_log_recover.c |  112 ----------------------------------------------
 2 files changed, 109 insertions(+), 112 deletions(-)
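
For readers following the conversion, the dispatch pattern can be
sketched in user-space C as below.  This is a simplified model, not the
kernel code: struct recover_item_ops, struct recover_item,
LEGACY_SWITCH and the two-argument hook are hypothetical stand-ins for
xlog_recover_item_type, xlog_recover_item and the real
->commit_pass2_fn signature.  The fallback to the old switch statement
is an assumption about how item types that have not been converted yet
are handled while the series is in flight.

```c
#include <stddef.h>
#include <stdint.h>

struct recover_item;

/* Per-item-type hooks; a NULL hook means "not converted yet". */
struct recover_item_ops {
	int (*commit_pass2)(struct recover_item *item, int64_t current_lsn);
};

struct recover_item {
	const struct recover_item_ops	*ops;
	int				replayed;
};

/* Hypothetical marker: the caller should use the old switch statement. */
#define LEGACY_SWITCH	1

/* Lives next to the rest of the (hypothetical) dquot item code. */
static int
dquot_commit_pass2(
	struct recover_item	*item,
	int64_t			current_lsn)
{
	(void)current_lsn;
	item->replayed = 1;
	return 0;
}

static const struct recover_item_ops dquot_item_ops = {
	.commit_pass2	= dquot_commit_pass2,
};

/* Generic dispatcher: no knowledge of individual item types. */
static int
commit_pass2(
	struct recover_item	*item,
	int64_t			current_lsn)
{
	if (item->ops && item->ops->commit_pass2)
		return item->ops->commit_pass2(item, current_lsn);
	return LEGACY_SWITCH;
}
```

The point of the design is that the dispatcher never grows: adding or
converting an item type only touches that type's own source file.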


diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index 4d18af49adfe..83bd7ded9185 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -419,8 +419,117 @@ xlog_recover_dquot_ra_pass2(
 			&xfs_dquot_buf_ra_ops);
 }
 
+/*
+ * Recover a dquot record
+ */
+STATIC int
+xlog_recover_dquot_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			current_lsn)
+{
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_buf			*bp;
+	struct xfs_disk_dquot		*ddq, *recddq;
+	struct xfs_dq_logformat		*dq_f;
+	xfs_failaddr_t			fa;
+	int				error;
+	uint				type;
+
+	/*
+	 * Filesystems are required to send in quota flags at mount time.
+	 */
+	if (mp->m_qflags == 0)
+		return 0;
+
+	recddq = item->ri_buf[1].i_addr;
+	if (recddq == NULL) {
+		xfs_alert(log->l_mp, "NULL dquot in %s.", __func__);
+		return -EFSCORRUPTED;
+	}
+	if (item->ri_buf[1].i_len < sizeof(struct xfs_disk_dquot)) {
+		xfs_alert(log->l_mp, "dquot too small (%d) in %s.",
+			item->ri_buf[1].i_len, __func__);
+		return -EFSCORRUPTED;
+	}
+
+	/*
+	 * This type of quotas was turned off, so ignore this record.
+	 */
+	type = recddq->d_flags & (XFS_DQ_USER | XFS_DQ_PROJ | XFS_DQ_GROUP);
+	ASSERT(type);
+	if (log->l_quotaoffs_flag & type)
+		return 0;
+
+	/*
+	 * At this point we know that quota was _not_ turned off.
+	 * Since the mount flags are not indicating to us otherwise, this
+	 * must mean that quota is on, and the dquot needs to be replayed.
+	 * Remember that we may not have fully recovered the superblock yet,
+	 * so we can't do the usual trick of looking at the SB quota bits.
+	 *
+	 * The other possibility, of course, is that the quota subsystem was
+	 * removed since the last mount - ENOSYS.
+	 */
+	dq_f = item->ri_buf[0].i_addr;
+	ASSERT(dq_f);
+	fa = xfs_dquot_verify(mp, recddq, dq_f->qlf_id, 0);
+	if (fa) {
+		xfs_alert(mp, "corrupt dquot ID 0x%x in log at %pS",
+				dq_f->qlf_id, fa);
+		return -EFSCORRUPTED;
+	}
+	ASSERT(dq_f->qlf_len == 1);
+
+	/*
+	 * At this point we are assuming that the dquots have been allocated
+	 * and hence the buffer has valid dquots stamped in it. It should,
+	 * therefore, pass verifier validation. If the dquot is bad, then
+	 * we'll return an error here, so we don't need to specifically check
+	 * the dquot in the buffer after the verifier has run.
+	 */
+	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dq_f->qlf_blkno,
+				   XFS_FSB_TO_BB(mp, dq_f->qlf_len), 0, &bp,
+				   &xfs_dquot_buf_ops);
+	if (error)
+		return error;
+
+	ASSERT(bp);
+	ddq = xfs_buf_offset(bp, dq_f->qlf_boffset);
+
+	/*
+	 * If the dquot has an LSN in it, recover the dquot only if it's less
+	 * than the lsn of the transaction we are replaying.
+	 */
+	if (xfs_sb_version_hascrc(&mp->m_sb)) {
+		struct xfs_dqblk *dqb = (struct xfs_dqblk *)ddq;
+		xfs_lsn_t	lsn = be64_to_cpu(dqb->dd_lsn);
+
+		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
+			goto out_release;
+		}
+	}
+
+	memcpy(ddq, recddq, item->ri_buf[1].i_len);
+	if (xfs_sb_version_hascrc(&mp->m_sb)) {
+		xfs_update_cksum((char *)ddq, sizeof(struct xfs_dqblk),
+				 XFS_DQUOT_CRC_OFF);
+	}
+
+	ASSERT(dq_f->qlf_size == 2);
+	ASSERT(bp->b_mount == mp);
+	bp->b_iodone = xlog_recover_iodone;
+	xfs_buf_delwri_queue(bp, buffer_list);
+
+out_release:
+	xfs_buf_relse(bp);
+	return 0;
+}
+
 const struct xlog_recover_item_type xlog_dquot_item_type = {
 	.ra_pass2_fn		= xlog_recover_dquot_ra_pass2,
+	.commit_pass2_fn	= xlog_recover_dquot_commit_pass2,
 };
 
 /*
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 57e5dac0f510..58a54d9e6847 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2056,115 +2056,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-/*
- * Recover a dquot record
- */
-STATIC int
-xlog_recover_dquot_pass2(
-	struct xlog			*log,
-	struct list_head		*buffer_list,
-	struct xlog_recover_item	*item,
-	xfs_lsn_t			current_lsn)
-{
-	xfs_mount_t		*mp = log->l_mp;
-	xfs_buf_t		*bp;
-	struct xfs_disk_dquot	*ddq, *recddq;
-	xfs_failaddr_t		fa;
-	int			error;
-	xfs_dq_logformat_t	*dq_f;
-	uint			type;
-
-
-	/*
-	 * Filesystems are required to send in quota flags at mount time.
-	 */
-	if (mp->m_qflags == 0)
-		return 0;
-
-	recddq = item->ri_buf[1].i_addr;
-	if (recddq == NULL) {
-		xfs_alert(log->l_mp, "NULL dquot in %s.", __func__);
-		return -EFSCORRUPTED;
-	}
-	if (item->ri_buf[1].i_len < sizeof(struct xfs_disk_dquot)) {
-		xfs_alert(log->l_mp, "dquot too small (%d) in %s.",
-			item->ri_buf[1].i_len, __func__);
-		return -EFSCORRUPTED;
-	}
-
-	/*
-	 * This type of quotas was turned off, so ignore this record.
-	 */
-	type = recddq->d_flags & (XFS_DQ_USER | XFS_DQ_PROJ | XFS_DQ_GROUP);
-	ASSERT(type);
-	if (log->l_quotaoffs_flag & type)
-		return 0;
-
-	/*
-	 * At this point we know that quota was _not_ turned off.
-	 * Since the mount flags are not indicating to us otherwise, this
-	 * must mean that quota is on, and the dquot needs to be replayed.
-	 * Remember that we may not have fully recovered the superblock yet,
-	 * so we can't do the usual trick of looking at the SB quota bits.
-	 *
-	 * The other possibility, of course, is that the quota subsystem was
-	 * removed since the last mount - ENOSYS.
-	 */
-	dq_f = item->ri_buf[0].i_addr;
-	ASSERT(dq_f);
-	fa = xfs_dquot_verify(mp, recddq, dq_f->qlf_id, 0);
-	if (fa) {
-		xfs_alert(mp, "corrupt dquot ID 0x%x in log at %pS",
-				dq_f->qlf_id, fa);
-		return -EFSCORRUPTED;
-	}
-	ASSERT(dq_f->qlf_len == 1);
-
-	/*
-	 * At this point we are assuming that the dquots have been allocated
-	 * and hence the buffer has valid dquots stamped in it. It should,
-	 * therefore, pass verifier validation. If the dquot is bad, then the
-	 * we'll return an error here, so we don't need to specifically check
-	 * the dquot in the buffer after the verifier has run.
-	 */
-	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dq_f->qlf_blkno,
-				   XFS_FSB_TO_BB(mp, dq_f->qlf_len), 0, &bp,
-				   &xfs_dquot_buf_ops);
-	if (error)
-		return error;
-
-	ASSERT(bp);
-	ddq = xfs_buf_offset(bp, dq_f->qlf_boffset);
-
-	/*
-	 * If the dquot has an LSN in it, recover the dquot only if it's less
-	 * than the lsn of the transaction we are replaying.
-	 */
-	if (xfs_sb_version_hascrc(&mp->m_sb)) {
-		struct xfs_dqblk *dqb = (struct xfs_dqblk *)ddq;
-		xfs_lsn_t	lsn = be64_to_cpu(dqb->dd_lsn);
-
-		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
-			goto out_release;
-		}
-	}
-
-	memcpy(ddq, recddq, item->ri_buf[1].i_len);
-	if (xfs_sb_version_hascrc(&mp->m_sb)) {
-		xfs_update_cksum((char *)ddq, sizeof(struct xfs_dqblk),
-				 XFS_DQUOT_CRC_OFF);
-	}
-
-	ASSERT(dq_f->qlf_size == 2);
-	ASSERT(bp->b_mount == mp);
-	bp->b_iodone = xlog_recover_iodone;
-	xfs_buf_delwri_queue(bp, buffer_list);
-
-out_release:
-	xfs_buf_relse(bp);
-	return 0;
-}
-
 /*
  * This routine is called to create an in-core extent free intent
  * item from the efi format structure which was logged on disk.
@@ -2771,9 +2662,6 @@ xlog_recover_commit_pass2(
 		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
 	case XFS_LI_BUD:
 		return xlog_recover_bud_pass2(log, item);
-	case XFS_LI_DQUOT:
-		return xlog_recover_dquot_pass2(log, buffer_list, item,
-						trans->r_lsn);
 	case XFS_LI_ICREATE:
 		return xlog_recover_do_icreate_pass2(log, buffer_list, item);
 	case XFS_LI_QUOTAOFF:
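
The LSN test that decides whether the dquot is replayed can be modelled
as follows.  This is a sketch under stated assumptions, not the kernel
implementation: lsn_t, make_lsn(), lsn_cmp() and skip_replay() are
hypothetical names, and the layout (cycle number in the upper 32 bits,
block number in the lower 32 bits) is assumed to mirror what
XFS_LSN_CMP operates on.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t lsn_t;

/* Assumed layout: cycle in the high 32 bits, block in the low 32 bits. */
static lsn_t make_lsn(uint32_t cycle, uint32_t block)
{
	return (lsn_t)(((uint64_t)cycle << 32) | block);
}

static uint32_t lsn_cycle(lsn_t lsn)
{
	return (uint32_t)((uint64_t)lsn >> 32);
}

static uint32_t lsn_block(lsn_t lsn)
{
	return (uint32_t)lsn;
}

/* Compare cycles first, then blocks; returns <0, 0 or >0. */
static int lsn_cmp(lsn_t a, lsn_t b)
{
	if (lsn_cycle(a) != lsn_cycle(b))
		return lsn_cycle(a) < lsn_cycle(b) ? -1 : 1;
	if (lsn_block(a) != lsn_block(b))
		return lsn_block(a) < lsn_block(b) ? -1 : 1;
	return 0;
}

/*
 * Skip replay when the on-disk object carries a valid LSN (neither 0
 * nor -1) that is at or beyond the LSN of the transaction being
 * replayed, i.e. the disk copy is already at least as new.
 */
static bool skip_replay(lsn_t disk_lsn, lsn_t current_lsn)
{
	return disk_lsn != 0 && disk_lsn != -1 &&
	       lsn_cmp(disk_lsn, current_lsn) >= 0;
}
```

Note that an equal LSN also skips replay, matching the ">= 0" in the
moved code.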



* [PATCH 07/21] xfs: refactor log recovery icreate item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (5 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 06/21] xfs: refactor log recovery dquot " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-05-01 14:18   ` Chandan Rajendra
  2020-04-30  0:48 ` [PATCH 08/21] xfs: remove log recovery quotaoff " Darrick J. Wong
                   ` (14 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the log icreate item pass2 commit code into the per-item source code
files and use the dispatch function to call it.  We do these one at a
time because there's a lot of code to move.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_icreate_item.c |  132 +++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_log_recover.c  |  126 -------------------------------------------
 2 files changed, 132 insertions(+), 126 deletions(-)
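
The cancellation policy in the moved icreate code (skip replay if any
cluster buffer was cancelled, and warn if only some were) can be
sketched as below.  This is an illustration only: enum icreate_action
and icreate_decide() are hypothetical names, and the bool array stands
in for the per-buffer xlog_is_buffer_cancelled() lookups.

```c
#include <stdbool.h>
#include <stddef.h>

enum icreate_action {
	ICREATE_REPLAY,		/* no buffer cancelled: initialise inodes */
	ICREATE_SKIP,		/* every buffer cancelled: nothing to do */
	ICREATE_SKIP_PARTIAL,	/* mixed state: skip, but warn the user */
};

/*
 * cancelled[i] says whether cluster buffer i of the chunk was
 * cancelled later in the log.
 */
static enum icreate_action
icreate_decide(
	const bool	*cancelled,
	size_t		nbufs)
{
	size_t		cancel_count = 0;
	size_t		i;

	for (i = 0; i < nbufs; i++) {
		if (cancelled[i])
			cancel_count++;
	}

	if (cancel_count == 0)
		return ICREATE_REPLAY;

	/* Be conservative: any cancellation means no replay at all. */
	return cancel_count == nbufs ? ICREATE_SKIP : ICREATE_SKIP_PARTIAL;
}
```

The partial case is the one the moved code both asserts against and
warns about, since icreate currently covers a single allocation.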


diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c
index 9f38a3c200a3..602a8c91371f 100644
--- a/fs/xfs/xfs_icreate_item.c
+++ b/fs/xfs/xfs_icreate_item.c
@@ -6,13 +6,19 @@
 #include "xfs.h"
 #include "xfs_fs.h"
 #include "xfs_shared.h"
+#include "xfs_format.h"
 #include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_inode.h"
 #include "xfs_trans.h"
 #include "xfs_trans_priv.h"
 #include "xfs_icreate_item.h"
 #include "xfs_log.h"
 #include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
+#include "xfs_ialloc.h"
+#include "xfs_trace.h"
 
 kmem_zone_t	*xfs_icreate_zone;		/* inode create item zone */
 
@@ -117,6 +123,132 @@ xlog_icreate_reorder(
 	return XLOG_REORDER_BUFFER_LIST;
 }
 
+/*
+ * This routine is called when an inode create format structure is found in a
+ * committed transaction in the log.  Its purpose is to initialise the inodes
+ * being allocated on disk. This requires us to get inode cluster buffers that
+ * match the range to be initialised, stamped with inode templates and written
+ * by delayed write so that subsequent modifications will hit the cached buffer
+ * and only need writing out at the end of recovery.
+ */
+STATIC int
+xlog_recover_do_icreate_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_icreate_log		*icl;
+	struct xfs_ino_geometry		*igeo = M_IGEO(mp);
+	xfs_agnumber_t			agno;
+	xfs_agblock_t			agbno;
+	unsigned int			count;
+	unsigned int			isize;
+	xfs_agblock_t			length;
+	int				bb_per_cluster;
+	int				cancel_count;
+	int				nbufs;
+	int				i;
+
+	icl = (struct xfs_icreate_log *)item->ri_buf[0].i_addr;
+	if (icl->icl_type != XFS_LI_ICREATE) {
+		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad type");
+		return -EINVAL;
+	}
+
+	if (icl->icl_size != 1) {
+		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad icl size");
+		return -EINVAL;
+	}
+
+	agno = be32_to_cpu(icl->icl_ag);
+	if (agno >= mp->m_sb.sb_agcount) {
+		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad agno");
+		return -EINVAL;
+	}
+	agbno = be32_to_cpu(icl->icl_agbno);
+	if (!agbno || agbno == NULLAGBLOCK || agbno >= mp->m_sb.sb_agblocks) {
+		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad agbno");
+		return -EINVAL;
+	}
+	isize = be32_to_cpu(icl->icl_isize);
+	if (isize != mp->m_sb.sb_inodesize) {
+		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad isize");
+		return -EINVAL;
+	}
+	count = be32_to_cpu(icl->icl_count);
+	if (!count) {
+		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad count");
+		return -EINVAL;
+	}
+	length = be32_to_cpu(icl->icl_length);
+	if (!length || length >= mp->m_sb.sb_agblocks) {
+		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad length");
+		return -EINVAL;
+	}
+
+	/*
+	 * The inode chunk is either full or sparse and we only support
+	 * m_ino_geo.ialloc_min_blks sized sparse allocations at this time.
+	 */
+	if (length != igeo->ialloc_blks &&
+	    length != igeo->ialloc_min_blks) {
+		xfs_warn(log->l_mp,
+			 "%s: unsupported chunk length", __FUNCTION__);
+		return -EINVAL;
+	}
+
+	/* verify inode count is consistent with extent length */
+	if ((count >> mp->m_sb.sb_inopblog) != length) {
+		xfs_warn(log->l_mp,
+			 "%s: inconsistent inode count and chunk length",
+			 __FUNCTION__);
+		return -EINVAL;
+	}
+
+	/*
+	 * The icreate transaction can cover multiple cluster buffers and these
+	 * buffers could have been freed and reused. Check the individual
+	 * buffers for cancellation so we don't overwrite anything written after
+	 * a cancellation.
+	 */
+	bb_per_cluster = XFS_FSB_TO_BB(mp, igeo->blocks_per_cluster);
+	nbufs = length / igeo->blocks_per_cluster;
+	for (i = 0, cancel_count = 0; i < nbufs; i++) {
+		xfs_daddr_t	daddr;
+
+		daddr = XFS_AGB_TO_DADDR(mp, agno,
+				agbno + i * igeo->blocks_per_cluster);
+		if (xlog_is_buffer_cancelled(log, daddr, bb_per_cluster))
+			cancel_count++;
+	}
+
+	/*
+	 * We currently only use icreate for a single allocation at a time. This
+	 * means we should expect either all or none of the buffers to be
+	 * cancelled. Be conservative and skip replay if at least one buffer is
+	 * cancelled, but warn the user that something is awry if the buffers
+	 * are not consistent.
+	 *
+	 * XXX: This must be refined to only skip cancelled clusters once we use
+	 * icreate for multiple chunk allocations.
+	 */
+	ASSERT(!cancel_count || cancel_count == nbufs);
+	if (cancel_count) {
+		if (cancel_count != nbufs)
+			xfs_warn(mp,
+	"WARNING: partial inode chunk cancellation, skipped icreate.");
+		trace_xfs_log_recover_icreate_cancel(log, icl);
+		return 0;
+	}
+
+	trace_xfs_log_recover_icreate_recover(log, icl);
+	return xfs_ialloc_inode_init(mp, NULL, buffer_list, count, agno, agbno,
+				     length, be32_to_cpu(icl->icl_gen));
+}
+
 const struct xlog_recover_item_type xlog_icreate_item_type = {
 	.reorder_fn		= xlog_icreate_reorder,
+	.commit_pass2_fn	= xlog_recover_do_icreate_commit_pass2,
 };
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 58a54d9e6847..6ba3d64d08de 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2489,130 +2489,6 @@ xlog_recover_bud_pass2(
 	return 0;
 }
 
-/*
- * This routine is called when an inode create format structure is found in a
- * committed transaction in the log.  It's purpose is to initialise the inodes
- * being allocated on disk. This requires us to get inode cluster buffers that
- * match the range to be initialised, stamped with inode templates and written
- * by delayed write so that subsequent modifications will hit the cached buffer
- * and only need writing out at the end of recovery.
- */
-STATIC int
-xlog_recover_do_icreate_pass2(
-	struct xlog		*log,
-	struct list_head	*buffer_list,
-	xlog_recover_item_t	*item)
-{
-	struct xfs_mount	*mp = log->l_mp;
-	struct xfs_icreate_log	*icl;
-	struct xfs_ino_geometry	*igeo = M_IGEO(mp);
-	xfs_agnumber_t		agno;
-	xfs_agblock_t		agbno;
-	unsigned int		count;
-	unsigned int		isize;
-	xfs_agblock_t		length;
-	int			bb_per_cluster;
-	int			cancel_count;
-	int			nbufs;
-	int			i;
-
-	icl = (struct xfs_icreate_log *)item->ri_buf[0].i_addr;
-	if (icl->icl_type != XFS_LI_ICREATE) {
-		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad type");
-		return -EINVAL;
-	}
-
-	if (icl->icl_size != 1) {
-		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad icl size");
-		return -EINVAL;
-	}
-
-	agno = be32_to_cpu(icl->icl_ag);
-	if (agno >= mp->m_sb.sb_agcount) {
-		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad agno");
-		return -EINVAL;
-	}
-	agbno = be32_to_cpu(icl->icl_agbno);
-	if (!agbno || agbno == NULLAGBLOCK || agbno >= mp->m_sb.sb_agblocks) {
-		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad agbno");
-		return -EINVAL;
-	}
-	isize = be32_to_cpu(icl->icl_isize);
-	if (isize != mp->m_sb.sb_inodesize) {
-		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad isize");
-		return -EINVAL;
-	}
-	count = be32_to_cpu(icl->icl_count);
-	if (!count) {
-		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad count");
-		return -EINVAL;
-	}
-	length = be32_to_cpu(icl->icl_length);
-	if (!length || length >= mp->m_sb.sb_agblocks) {
-		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad length");
-		return -EINVAL;
-	}
-
-	/*
-	 * The inode chunk is either full or sparse and we only support
-	 * m_ino_geo.ialloc_min_blks sized sparse allocations at this time.
-	 */
-	if (length != igeo->ialloc_blks &&
-	    length != igeo->ialloc_min_blks) {
-		xfs_warn(log->l_mp,
-			 "%s: unsupported chunk length", __FUNCTION__);
-		return -EINVAL;
-	}
-
-	/* verify inode count is consistent with extent length */
-	if ((count >> mp->m_sb.sb_inopblog) != length) {
-		xfs_warn(log->l_mp,
-			 "%s: inconsistent inode count and chunk length",
-			 __FUNCTION__);
-		return -EINVAL;
-	}
-
-	/*
-	 * The icreate transaction can cover multiple cluster buffers and these
-	 * buffers could have been freed and reused. Check the individual
-	 * buffers for cancellation so we don't overwrite anything written after
-	 * a cancellation.
-	 */
-	bb_per_cluster = XFS_FSB_TO_BB(mp, igeo->blocks_per_cluster);
-	nbufs = length / igeo->blocks_per_cluster;
-	for (i = 0, cancel_count = 0; i < nbufs; i++) {
-		xfs_daddr_t	daddr;
-
-		daddr = XFS_AGB_TO_DADDR(mp, agno,
-				agbno + i * igeo->blocks_per_cluster);
-		if (xlog_is_buffer_cancelled(log, daddr, bb_per_cluster))
-			cancel_count++;
-	}
-
-	/*
-	 * We currently only use icreate for a single allocation at a time. This
-	 * means we should expect either all or none of the buffers to be
-	 * cancelled. Be conservative and skip replay if at least one buffer is
-	 * cancelled, but warn the user that something is awry if the buffers
-	 * are not consistent.
-	 *
-	 * XXX: This must be refined to only skip cancelled clusters once we use
-	 * icreate for multiple chunk allocations.
-	 */
-	ASSERT(!cancel_count || cancel_count == nbufs);
-	if (cancel_count) {
-		if (cancel_count != nbufs)
-			xfs_warn(mp,
-	"WARNING: partial inode chunk cancellation, skipped icreate.");
-		trace_xfs_log_recover_icreate_cancel(log, icl);
-		return 0;
-	}
-
-	trace_xfs_log_recover_icreate_recover(log, icl);
-	return xfs_ialloc_inode_init(mp, NULL, buffer_list, count, agno, agbno,
-				     length, be32_to_cpu(icl->icl_gen));
-}
-
 STATIC int
 xlog_recover_commit_pass1(
 	struct xlog			*log,
@@ -2662,8 +2538,6 @@ xlog_recover_commit_pass2(
 		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
 	case XFS_LI_BUD:
 		return xlog_recover_bud_pass2(log, item);
-	case XFS_LI_ICREATE:
-		return xlog_recover_do_icreate_pass2(log, buffer_list, item);
 	case XFS_LI_QUOTAOFF:
 		/* nothing to do in pass2 */
 		return 0;



* [PATCH 08/21] xfs: remove log recovery quotaoff item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (6 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 07/21] xfs: refactor log recovery icreate " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-05-01 15:09   ` Chandan Rajendra
  2020-04-30  0:48 ` [PATCH 09/21] xfs: refactor log recovery EFI " Darrick J. Wong
                   ` (13 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Quotaoff items need no work during pass 2 of log recovery, so take
advantage of the optional commit_pass2_fn pointer and remove the
quotaoff case from the switch statement.
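
For illustration only, here is a minimal userspace sketch of dispatch
through an optional function pointer.  The demo_* names are hypothetical
stand-ins, not the kernel structures, and the sketch assumes that a NULL
handler means "nothing to do, report success":

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the recovery item structures. */
struct demo_item;

struct demo_item_type {
	/* Optional: NULL means this item type has no pass 2 work. */
	int (*commit_pass2_fn)(struct demo_item *item);
};

struct demo_item {
	const struct demo_item_type	*type;
	int				replayed;
};

/* A handler for an item type that does have pass 2 work to do. */
static int demo_buf_commit_pass2(struct demo_item *item)
{
	item->replayed = 1;
	return 0;
}

static const struct demo_item_type demo_buf_type = {
	.commit_pass2_fn	= demo_buf_commit_pass2,
};

/* Quotaoff-like type: no pass 2 handler is provided at all. */
static const struct demo_item_type demo_quotaoff_type = {
	.commit_pass2_fn	= NULL,
};

/* Call the handler if there is one; otherwise succeed without doing work. */
static int demo_commit_pass2(struct demo_item *item)
{
	assert(item != NULL && item->type != NULL);

	if (!item->type->commit_pass2_fn)
		return 0;
	return item->type->commit_pass2_fn(item);
}
```

In this sketch, calling demo_commit_pass2() on an item of
demo_quotaoff_type returns 0 and leaves the item untouched, which is the
same outcome the removed switch case produced.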

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_log_recover.c |    3 ---
 1 file changed, 3 deletions(-)


diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 6ba3d64d08de..dba38fb99af7 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2538,9 +2538,6 @@ xlog_recover_commit_pass2(
 		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
 	case XFS_LI_BUD:
 		return xlog_recover_bud_pass2(log, item);
-	case XFS_LI_QUOTAOFF:
-		/* nothing to do in pass2 */
-		return 0;
 	default:
 		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
 			__func__, ITEM_TYPE(item));



* [PATCH 09/21] xfs: refactor log recovery EFI item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (7 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 08/21] xfs: remove log recovery quotaoff " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-05-01 10:28   ` Christoph Hellwig
  2020-04-30  0:48 ` [PATCH 10/21] xfs: refactor log recovery RUI " Darrick J. Wong
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the extent free intent and intent-done pass2 commit code into the
per-item source code files and use dispatch functions to call them.  We
do these one at a time because there's a lot of code to move.  No
functional changes.
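
As a rough userspace model of what the two handlers do: the intent
handler records a pending intent, and the intent-done handler looks up
the intent with the matching id and drops it.  Everything named demo_*
is hypothetical; a fixed array stands in for the AIL, and the locking
and reference counting of the real code are left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_INTENTS	8

/* Hypothetical stand-in for a recovered intent item. */
struct demo_intent {
	uint64_t	id;
	bool		in_use;
};

/* Hypothetical stand-in for the AIL: a fixed table of pending intents. */
struct demo_ail {
	struct demo_intent	slots[DEMO_MAX_INTENTS];
};

/* Intent recovered from the log: remember it until a done item shows up. */
static int demo_recover_intent(struct demo_ail *ail, uint64_t id)
{
	assert(ail != NULL);

	for (size_t i = 0; i < DEMO_MAX_INTENTS; i++) {
		if (!ail->slots[i].in_use) {
			ail->slots[i].id = id;
			ail->slots[i].in_use = true;
			return 0;
		}
	}
	return -1;	/* table full; an assumption of this sketch only */
}

/*
 * Done item recovered from the log: cancel the matching intent if it is
 * still pending.  Not finding one is not an error.
 */
static int demo_recover_done(struct demo_ail *ail, uint64_t id)
{
	assert(ail != NULL);

	for (size_t i = 0; i < DEMO_MAX_INTENTS; i++) {
		if (ail->slots[i].in_use && ail->slots[i].id == id) {
			ail->slots[i].in_use = false;
			break;
		}
	}
	return 0;
}

/* Count the intents that were never cancelled by a done item. */
static size_t demo_pending(const struct demo_ail *ail)
{
	size_t	n = 0;

	for (size_t i = 0; i < DEMO_MAX_INTENTS; i++)
		if (ail->slots[i].in_use)
			n++;
	return n;
}
```

Whatever is still pending after all transactions have been replayed is
the set of intents that recovery would still have to finish.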

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_extfree_item.c |  108 ++++++++++++++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_extfree_item.h |    4 --
 fs/xfs/xfs_log_recover.c  |  100 ------------------------------------------
 3 files changed, 105 insertions(+), 107 deletions(-)


diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index c53e5f46ee26..53c1d3b9b957 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -22,6 +22,7 @@
 #include "xfs_bmap.h"
 #include "xfs_trace.h"
 #include "xfs_error.h"
+#include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_efi_zone;
@@ -32,7 +33,7 @@ static inline struct xfs_efi_log_item *EFI_ITEM(struct xfs_log_item *lip)
 	return container_of(lip, struct xfs_efi_log_item, efi_item);
 }
 
-void
+STATIC void
 xfs_efi_item_free(
 	struct xfs_efi_log_item	*efip)
 {
@@ -151,7 +152,7 @@ static const struct xfs_item_ops xfs_efi_item_ops = {
 /*
  * Allocate and initialize an efi item with the given number of extents.
  */
-struct xfs_efi_log_item *
+STATIC struct xfs_efi_log_item *
 xfs_efi_init(
 	struct xfs_mount	*mp,
 	uint			nextents)
@@ -185,7 +186,7 @@ xfs_efi_init(
  * one of which will be the native format for this kernel.
  * It will handle the conversion of formats if necessary.
  */
-int
+STATIC int
 xfs_efi_copy_format(xfs_log_iovec_t *buf, xfs_efi_log_format_t *dst_efi_fmt)
 {
 	xfs_efi_log_format_t *src_efi_fmt = buf->i_addr;
@@ -654,8 +655,109 @@ xfs_efi_recover(
 	return error;
 }
 
+/*
+ * This routine is called to create an in-core extent free intent
+ * item from the efi format structure which was logged on disk.
+ * It allocates an in-core efi, copies the extents from the format
+ * structure into it, and adds the efi to the AIL with the given
+ * LSN.
+ */
+STATIC int
+xlog_recover_extfree_intent_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_efi_log_item		*efip;
+	struct xfs_efi_log_format	*efi_formatp;
+	int				error;
+
+	efi_formatp = item->ri_buf[0].i_addr;
+
+	efip = xfs_efi_init(mp, efi_formatp->efi_nextents);
+	error = xfs_efi_copy_format(&item->ri_buf[0], &efip->efi_format);
+	if (error) {
+		xfs_efi_item_free(efip);
+		return error;
+	}
+	atomic_set(&efip->efi_next_extent, efi_formatp->efi_nextents);
+
+	spin_lock(&log->l_ailp->ail_lock);
+	/*
+	 * The EFI has two references. One for the EFD and one for EFI to ensure
+	 * it makes it into the AIL. Insert the EFI into the AIL directly and
+	 * drop the EFI reference. Note that xfs_trans_ail_update() drops the
+	 * AIL lock.
+	 */
+	xfs_trans_ail_update(log->l_ailp, &efip->efi_item, lsn);
+	xfs_efi_release(efip);
+	return 0;
+}
+
+
+/*
+ * This routine is called when an EFD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding EFI if it
+ * was still in the log. To do this it searches the AIL for the EFI with an id
+ * equal to that in the EFD format structure. If we find it we drop the EFD
+ * reference, which removes the EFI from the AIL and frees it.
+ */
+STATIC int
+xlog_recover_extfree_done_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_ail_cursor		cur;
+	struct xfs_efd_log_format	*efd_formatp;
+	struct xfs_efi_log_item		*efip = NULL;
+	struct xfs_log_item		*lip;
+	struct xfs_ail			*ailp = log->l_ailp;
+	uint64_t			efi_id;
+
+	efd_formatp = item->ri_buf[0].i_addr;
+	ASSERT((item->ri_buf[0].i_len == (sizeof(xfs_efd_log_format_32_t) +
+		((efd_formatp->efd_nextents - 1) * sizeof(xfs_extent_32_t)))) ||
+	       (item->ri_buf[0].i_len == (sizeof(xfs_efd_log_format_64_t) +
+		((efd_formatp->efd_nextents - 1) * sizeof(xfs_extent_64_t)))));
+	efi_id = efd_formatp->efd_efi_id;
+
+	/*
+	 * Search for the EFI with the id in the EFD format structure in the
+	 * AIL.
+	 */
+	spin_lock(&ailp->ail_lock);
+	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
+	while (lip != NULL) {
+		if (lip->li_type == XFS_LI_EFI) {
+			efip = (struct xfs_efi_log_item *)lip;
+			if (efip->efi_format.efi_id == efi_id) {
+				/*
+				 * Drop the EFD reference to the EFI. This
+				 * removes the EFI from the AIL and frees it.
+				 */
+				spin_unlock(&ailp->ail_lock);
+				xfs_efi_release(efip);
+				spin_lock(&ailp->ail_lock);
+				break;
+			}
+		}
+		lip = xfs_trans_ail_cursor_next(ailp, &cur);
+	}
+
+	xfs_trans_ail_cursor_done(&cur);
+	spin_unlock(&ailp->ail_lock);
+
+	return 0;
+}
+
 const struct xlog_recover_item_type xlog_extfree_intent_item_type = {
+	.commit_pass2_fn	= xlog_recover_extfree_intent_commit_pass2,
 };
 
 const struct xlog_recover_item_type xlog_extfree_done_item_type = {
+	.commit_pass2_fn	= xlog_recover_extfree_done_commit_pass2,
 };
diff --git a/fs/xfs/xfs_extfree_item.h b/fs/xfs/xfs_extfree_item.h
index 16aaab06d4ec..ecbe937952d8 100644
--- a/fs/xfs/xfs_extfree_item.h
+++ b/fs/xfs/xfs_extfree_item.h
@@ -78,10 +78,6 @@ typedef struct xfs_efd_log_item {
 extern struct kmem_zone	*xfs_efi_zone;
 extern struct kmem_zone	*xfs_efd_zone;
 
-xfs_efi_log_item_t	*xfs_efi_init(struct xfs_mount *, uint);
-int			xfs_efi_copy_format(xfs_log_iovec_t *buf,
-					    xfs_efi_log_format_t *dst_efi_fmt);
-void			xfs_efi_item_free(xfs_efi_log_item_t *);
 void			xfs_efi_release(struct xfs_efi_log_item *);
 
 int			xfs_efi_recover(struct xfs_mount *mp,
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index dba38fb99af7..2d34d2692b83 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2056,102 +2056,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-/*
- * This routine is called to create an in-core extent free intent
- * item from the efi format structure which was logged on disk.
- * It allocates an in-core efi, copies the extents from the format
- * structure into it, and adds the efi to the AIL with the given
- * LSN.
- */
-STATIC int
-xlog_recover_efi_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item,
-	xfs_lsn_t			lsn)
-{
-	int				error;
-	struct xfs_mount		*mp = log->l_mp;
-	struct xfs_efi_log_item		*efip;
-	struct xfs_efi_log_format	*efi_formatp;
-
-	efi_formatp = item->ri_buf[0].i_addr;
-
-	efip = xfs_efi_init(mp, efi_formatp->efi_nextents);
-	error = xfs_efi_copy_format(&item->ri_buf[0], &efip->efi_format);
-	if (error) {
-		xfs_efi_item_free(efip);
-		return error;
-	}
-	atomic_set(&efip->efi_next_extent, efi_formatp->efi_nextents);
-
-	spin_lock(&log->l_ailp->ail_lock);
-	/*
-	 * The EFI has two references. One for the EFD and one for EFI to ensure
-	 * it makes it into the AIL. Insert the EFI into the AIL directly and
-	 * drop the EFI reference. Note that xfs_trans_ail_update() drops the
-	 * AIL lock.
-	 */
-	xfs_trans_ail_update(log->l_ailp, &efip->efi_item, lsn);
-	xfs_efi_release(efip);
-	return 0;
-}
-
-
-/*
- * This routine is called when an EFD format structure is found in a committed
- * transaction in the log. Its purpose is to cancel the corresponding EFI if it
- * was still in the log. To do this it searches the AIL for the EFI with an id
- * equal to that in the EFD format structure. If we find it we drop the EFD
- * reference, which removes the EFI from the AIL and frees it.
- */
-STATIC int
-xlog_recover_efd_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item)
-{
-	xfs_efd_log_format_t	*efd_formatp;
-	xfs_efi_log_item_t	*efip = NULL;
-	struct xfs_log_item	*lip;
-	uint64_t		efi_id;
-	struct xfs_ail_cursor	cur;
-	struct xfs_ail		*ailp = log->l_ailp;
-
-	efd_formatp = item->ri_buf[0].i_addr;
-	ASSERT((item->ri_buf[0].i_len == (sizeof(xfs_efd_log_format_32_t) +
-		((efd_formatp->efd_nextents - 1) * sizeof(xfs_extent_32_t)))) ||
-	       (item->ri_buf[0].i_len == (sizeof(xfs_efd_log_format_64_t) +
-		((efd_formatp->efd_nextents - 1) * sizeof(xfs_extent_64_t)))));
-	efi_id = efd_formatp->efd_efi_id;
-
-	/*
-	 * Search for the EFI with the id in the EFD format structure in the
-	 * AIL.
-	 */
-	spin_lock(&ailp->ail_lock);
-	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
-	while (lip != NULL) {
-		if (lip->li_type == XFS_LI_EFI) {
-			efip = (xfs_efi_log_item_t *)lip;
-			if (efip->efi_format.efi_id == efi_id) {
-				/*
-				 * Drop the EFD reference to the EFI. This
-				 * removes the EFI from the AIL and frees it.
-				 */
-				spin_unlock(&ailp->ail_lock);
-				xfs_efi_release(efip);
-				spin_lock(&ailp->ail_lock);
-				break;
-			}
-		}
-		lip = xfs_trans_ail_cursor_next(ailp, &cur);
-	}
-
-	xfs_trans_ail_cursor_done(&cur);
-	spin_unlock(&ailp->ail_lock);
-
-	return 0;
-}
-
 /*
  * This routine is called to create an in-core extent rmap update
  * item from the rui format structure which was logged on disk.
@@ -2522,10 +2426,6 @@ xlog_recover_commit_pass2(
 				trans->r_lsn);
 
 	switch (ITEM_TYPE(item)) {
-	case XFS_LI_EFI:
-		return xlog_recover_efi_pass2(log, item, trans->r_lsn);
-	case XFS_LI_EFD:
-		return xlog_recover_efd_pass2(log, item);
 	case XFS_LI_RUI:
 		return xlog_recover_rui_pass2(log, item, trans->r_lsn);
 	case XFS_LI_RUD:



* [PATCH 10/21] xfs: refactor log recovery RUI item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (8 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 09/21] xfs: refactor log recovery EFI " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-04-30  0:48 ` [PATCH 11/21] xfs: refactor log recovery CUI " Darrick J. Wong
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the rmap update intent and intent-done pass2 commit code into the
per-item source code files and use dispatch functions to call them.  We
do these one at a time because there's a lot of code to move.  No
functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_log_recover.c |   97 ------------------------------------------
 fs/xfs/xfs_rmap_item.c   |  105 +++++++++++++++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_rmap_item.h   |    4 --
 3 files changed, 102 insertions(+), 104 deletions(-)


diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 2d34d2692b83..31f8449f2866 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2056,99 +2056,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-/*
- * This routine is called to create an in-core extent rmap update
- * item from the rui format structure which was logged on disk.
- * It allocates an in-core rui, copies the extents from the format
- * structure into it, and adds the rui to the AIL with the given
- * LSN.
- */
-STATIC int
-xlog_recover_rui_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item,
-	xfs_lsn_t			lsn)
-{
-	int				error;
-	struct xfs_mount		*mp = log->l_mp;
-	struct xfs_rui_log_item		*ruip;
-	struct xfs_rui_log_format	*rui_formatp;
-
-	rui_formatp = item->ri_buf[0].i_addr;
-
-	ruip = xfs_rui_init(mp, rui_formatp->rui_nextents);
-	error = xfs_rui_copy_format(&item->ri_buf[0], &ruip->rui_format);
-	if (error) {
-		xfs_rui_item_free(ruip);
-		return error;
-	}
-	atomic_set(&ruip->rui_next_extent, rui_formatp->rui_nextents);
-
-	spin_lock(&log->l_ailp->ail_lock);
-	/*
-	 * The RUI has two references. One for the RUD and one for RUI to ensure
-	 * it makes it into the AIL. Insert the RUI into the AIL directly and
-	 * drop the RUI reference. Note that xfs_trans_ail_update() drops the
-	 * AIL lock.
-	 */
-	xfs_trans_ail_update(log->l_ailp, &ruip->rui_item, lsn);
-	xfs_rui_release(ruip);
-	return 0;
-}
-
-
-/*
- * This routine is called when an RUD format structure is found in a committed
- * transaction in the log. Its purpose is to cancel the corresponding RUI if it
- * was still in the log. To do this it searches the AIL for the RUI with an id
- * equal to that in the RUD format structure. If we find it we drop the RUD
- * reference, which removes the RUI from the AIL and frees it.
- */
-STATIC int
-xlog_recover_rud_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item)
-{
-	struct xfs_rud_log_format	*rud_formatp;
-	struct xfs_rui_log_item		*ruip = NULL;
-	struct xfs_log_item		*lip;
-	uint64_t			rui_id;
-	struct xfs_ail_cursor		cur;
-	struct xfs_ail			*ailp = log->l_ailp;
-
-	rud_formatp = item->ri_buf[0].i_addr;
-	ASSERT(item->ri_buf[0].i_len == sizeof(struct xfs_rud_log_format));
-	rui_id = rud_formatp->rud_rui_id;
-
-	/*
-	 * Search for the RUI with the id in the RUD format structure in the
-	 * AIL.
-	 */
-	spin_lock(&ailp->ail_lock);
-	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
-	while (lip != NULL) {
-		if (lip->li_type == XFS_LI_RUI) {
-			ruip = (struct xfs_rui_log_item *)lip;
-			if (ruip->rui_format.rui_id == rui_id) {
-				/*
-				 * Drop the RUD reference to the RUI. This
-				 * removes the RUI from the AIL and frees it.
-				 */
-				spin_unlock(&ailp->ail_lock);
-				xfs_rui_release(ruip);
-				spin_lock(&ailp->ail_lock);
-				break;
-			}
-		}
-		lip = xfs_trans_ail_cursor_next(ailp, &cur);
-	}
-
-	xfs_trans_ail_cursor_done(&cur);
-	spin_unlock(&ailp->ail_lock);
-
-	return 0;
-}
-
 /*
  * Copy an CUI format buffer from the given buf, and into the destination
  * CUI format structure.  The CUI/CUD items were designed not to need any
@@ -2426,10 +2333,6 @@ xlog_recover_commit_pass2(
 				trans->r_lsn);
 
 	switch (ITEM_TYPE(item)) {
-	case XFS_LI_RUI:
-		return xlog_recover_rui_pass2(log, item, trans->r_lsn);
-	case XFS_LI_RUD:
-		return xlog_recover_rud_pass2(log, item);
 	case XFS_LI_CUI:
 		return xlog_recover_cui_pass2(log, item, trans->r_lsn);
 	case XFS_LI_CUD:
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index bcad3db1f3a4..51d9226c043e 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -18,6 +18,7 @@
 #include "xfs_log.h"
 #include "xfs_rmap.h"
 #include "xfs_error.h"
+#include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_rui_zone;
@@ -28,7 +29,7 @@ static inline struct xfs_rui_log_item *RUI_ITEM(struct xfs_log_item *lip)
 	return container_of(lip, struct xfs_rui_log_item, rui_item);
 }
 
-void
+STATIC void
 xfs_rui_item_free(
 	struct xfs_rui_log_item	*ruip)
 {
@@ -133,7 +134,7 @@ static const struct xfs_item_ops xfs_rui_item_ops = {
 /*
  * Allocate and initialize an rui item with the given number of extents.
  */
-struct xfs_rui_log_item *
+STATIC struct xfs_rui_log_item *
 xfs_rui_init(
 	struct xfs_mount		*mp,
 	uint				nextents)
@@ -161,7 +162,7 @@ xfs_rui_init(
  * RUI format structure.  The RUI/RUD items were designed not to need any
  * special alignment handling.
  */
-int
+STATIC int
 xfs_rui_copy_format(
 	struct xfs_log_iovec		*buf,
 	struct xfs_rui_log_format	*dst_rui_fmt)
@@ -608,8 +609,106 @@ xfs_rui_recover(
 	return error;
 }
 
+/*
+ * This routine is called to create an in-core extent rmap update
+ * item from the rui format structure which was logged on disk.
+ * It allocates an in-core rui, copies the extents from the format
+ * structure into it, and adds the rui to the AIL with the given
+ * LSN.
+ */
+STATIC int
+xlog_recover_rmap_intent_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	int				error;
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_rui_log_item		*ruip;
+	struct xfs_rui_log_format	*rui_formatp;
+
+	rui_formatp = item->ri_buf[0].i_addr;
+
+	ruip = xfs_rui_init(mp, rui_formatp->rui_nextents);
+	error = xfs_rui_copy_format(&item->ri_buf[0], &ruip->rui_format);
+	if (error) {
+		xfs_rui_item_free(ruip);
+		return error;
+	}
+	atomic_set(&ruip->rui_next_extent, rui_formatp->rui_nextents);
+
+	spin_lock(&log->l_ailp->ail_lock);
+	/*
+	 * The RUI has two references. One for the RUD and one for RUI to ensure
+	 * it makes it into the AIL. Insert the RUI into the AIL directly and
+	 * drop the RUI reference. Note that xfs_trans_ail_update() drops the
+	 * AIL lock.
+	 */
+	xfs_trans_ail_update(log->l_ailp, &ruip->rui_item, lsn);
+	xfs_rui_release(ruip);
+	return 0;
+}
+
+
+/*
+ * This routine is called when an RUD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding RUI if it
+ * was still in the log. To do this it searches the AIL for the RUI with an id
+ * equal to that in the RUD format structure. If we find it we drop the RUD
+ * reference, which removes the RUI from the AIL and frees it.
+ */
+STATIC int
+xlog_recover_rmap_done_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_rud_log_format	*rud_formatp;
+	struct xfs_rui_log_item		*ruip = NULL;
+	struct xfs_log_item		*lip;
+	uint64_t			rui_id;
+	struct xfs_ail_cursor		cur;
+	struct xfs_ail			*ailp = log->l_ailp;
+
+	rud_formatp = item->ri_buf[0].i_addr;
+	ASSERT(item->ri_buf[0].i_len == sizeof(struct xfs_rud_log_format));
+	rui_id = rud_formatp->rud_rui_id;
+
+	/*
+	 * Search for the RUI with the id in the RUD format structure in the
+	 * AIL.
+	 */
+	spin_lock(&ailp->ail_lock);
+	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
+	while (lip != NULL) {
+		if (lip->li_type == XFS_LI_RUI) {
+			ruip = (struct xfs_rui_log_item *)lip;
+			if (ruip->rui_format.rui_id == rui_id) {
+				/*
+				 * Drop the RUD reference to the RUI. This
+				 * removes the RUI from the AIL and frees it.
+				 */
+				spin_unlock(&ailp->ail_lock);
+				xfs_rui_release(ruip);
+				spin_lock(&ailp->ail_lock);
+				break;
+			}
+		}
+		lip = xfs_trans_ail_cursor_next(ailp, &cur);
+	}
+
+	xfs_trans_ail_cursor_done(&cur);
+	spin_unlock(&ailp->ail_lock);
+
+	return 0;
+}
+
 const struct xlog_recover_item_type xlog_rmap_intent_item_type = {
+	.commit_pass2_fn	= xlog_recover_rmap_intent_commit_pass2,
 };
 
 const struct xlog_recover_item_type xlog_rmap_done_item_type = {
+	.commit_pass2_fn	= xlog_recover_rmap_done_commit_pass2,
 };
diff --git a/fs/xfs/xfs_rmap_item.h b/fs/xfs/xfs_rmap_item.h
index 8708e4a5aa5c..89bd192779f8 100644
--- a/fs/xfs/xfs_rmap_item.h
+++ b/fs/xfs/xfs_rmap_item.h
@@ -77,10 +77,6 @@ struct xfs_rud_log_item {
 extern struct kmem_zone	*xfs_rui_zone;
 extern struct kmem_zone	*xfs_rud_zone;
 
-struct xfs_rui_log_item *xfs_rui_init(struct xfs_mount *, uint);
-int xfs_rui_copy_format(struct xfs_log_iovec *buf,
-		struct xfs_rui_log_format *dst_rui_fmt);
-void xfs_rui_item_free(struct xfs_rui_log_item *);
 void xfs_rui_release(struct xfs_rui_log_item *);
 int xfs_rui_recover(struct xfs_mount *mp, struct xfs_rui_log_item *ruip);
 



* [PATCH 11/21] xfs: refactor log recovery CUI item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (9 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 10/21] xfs: refactor log recovery RUI " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-04-30  0:48 ` [PATCH 12/21] xfs: refactor log recovery BUI " Darrick J. Wong
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the refcount update intent and intent-done pass2 commit code into
the per-item source code files and use dispatch functions to call them.
We do these one at a time because there's a lot of code to move.  No
functional changes.
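
Part of the code being moved is the format copy helper, which only
copies the logged structure when the buffer length matches the size
implied by its extent count.  A simplified userspace sketch of that
check follows.  The demo_* names, the fixed-size extent array, the
extra bounds checks and the error value are assumptions made for this
example, not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_EXTENTS	4
#define DEMO_ECORRUPT		(-1)	/* hypothetical error value */

/* Hypothetical, simplified extent and format layouts. */
struct demo_extent {
	uint64_t	start;
	uint32_t	len;
	uint32_t	flags;
};

struct demo_format {
	uint32_t		nextents;
	uint32_t		pad;
	struct demo_extent	extents[DEMO_MAX_EXTENTS];
};

/* Size of a logged format structure that carries @nextents extents. */
static size_t demo_format_sizeof(uint32_t nextents)
{
	return offsetof(struct demo_format, extents) +
	       (size_t)nextents * sizeof(struct demo_extent);
}

/*
 * Copy a logged format structure into @dst, but only if the buffer
 * length agrees with the extent count stored in the buffer itself.
 */
static int demo_copy_format(const void *buf, size_t buf_len,
			    struct demo_format *dst)
{
	const struct demo_format	*src = buf;
	size_t				len;

	assert(buf != NULL && dst != NULL);

	/* Extra checks for the sketch: header present, count in range. */
	if (buf_len < offsetof(struct demo_format, extents))
		return DEMO_ECORRUPT;
	if (src->nextents > DEMO_MAX_EXTENTS)
		return DEMO_ECORRUPT;

	len = demo_format_sizeof(src->nextents);
	if (buf_len != len)
		return DEMO_ECORRUPT;

	memcpy(dst, src, len);
	return 0;
}
```

A buffer that is one byte short, or whose length corresponds to a
different extent count, is rejected before anything is copied.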

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_log_recover.c   |  124 ------------------------------------------
 fs/xfs/xfs_refcount_item.c |  130 +++++++++++++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_refcount_item.h |    2 -
 3 files changed, 128 insertions(+), 128 deletions(-)


diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 31f8449f2866..9292623bbdb4 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2056,126 +2056,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-/*
- * Copy an CUI format buffer from the given buf, and into the destination
- * CUI format structure.  The CUI/CUD items were designed not to need any
- * special alignment handling.
- */
-static int
-xfs_cui_copy_format(
-	struct xfs_log_iovec		*buf,
-	struct xfs_cui_log_format	*dst_cui_fmt)
-{
-	struct xfs_cui_log_format	*src_cui_fmt;
-	uint				len;
-
-	src_cui_fmt = buf->i_addr;
-	len = xfs_cui_log_format_sizeof(src_cui_fmt->cui_nextents);
-
-	if (buf->i_len == len) {
-		memcpy(dst_cui_fmt, src_cui_fmt, len);
-		return 0;
-	}
-	XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, NULL);
-	return -EFSCORRUPTED;
-}
-
-/*
- * This routine is called to create an in-core extent refcount update
- * item from the cui format structure which was logged on disk.
- * It allocates an in-core cui, copies the extents from the format
- * structure into it, and adds the cui to the AIL with the given
- * LSN.
- */
-STATIC int
-xlog_recover_cui_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item,
-	xfs_lsn_t			lsn)
-{
-	int				error;
-	struct xfs_mount		*mp = log->l_mp;
-	struct xfs_cui_log_item		*cuip;
-	struct xfs_cui_log_format	*cui_formatp;
-
-	cui_formatp = item->ri_buf[0].i_addr;
-
-	cuip = xfs_cui_init(mp, cui_formatp->cui_nextents);
-	error = xfs_cui_copy_format(&item->ri_buf[0], &cuip->cui_format);
-	if (error) {
-		xfs_cui_item_free(cuip);
-		return error;
-	}
-	atomic_set(&cuip->cui_next_extent, cui_formatp->cui_nextents);
-
-	spin_lock(&log->l_ailp->ail_lock);
-	/*
-	 * The CUI has two references. One for the CUD and one for CUI to ensure
-	 * it makes it into the AIL. Insert the CUI into the AIL directly and
-	 * drop the CUI reference. Note that xfs_trans_ail_update() drops the
-	 * AIL lock.
-	 */
-	xfs_trans_ail_update(log->l_ailp, &cuip->cui_item, lsn);
-	xfs_cui_release(cuip);
-	return 0;
-}
-
-
-/*
- * This routine is called when an CUD format structure is found in a committed
- * transaction in the log. Its purpose is to cancel the corresponding CUI if it
- * was still in the log. To do this it searches the AIL for the CUI with an id
- * equal to that in the CUD format structure. If we find it we drop the CUD
- * reference, which removes the CUI from the AIL and frees it.
- */
-STATIC int
-xlog_recover_cud_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item)
-{
-	struct xfs_cud_log_format	*cud_formatp;
-	struct xfs_cui_log_item		*cuip = NULL;
-	struct xfs_log_item		*lip;
-	uint64_t			cui_id;
-	struct xfs_ail_cursor		cur;
-	struct xfs_ail			*ailp = log->l_ailp;
-
-	cud_formatp = item->ri_buf[0].i_addr;
-	if (item->ri_buf[0].i_len != sizeof(struct xfs_cud_log_format)) {
-		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
-		return -EFSCORRUPTED;
-	}
-	cui_id = cud_formatp->cud_cui_id;
-
-	/*
-	 * Search for the CUI with the id in the CUD format structure in the
-	 * AIL.
-	 */
-	spin_lock(&ailp->ail_lock);
-	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
-	while (lip != NULL) {
-		if (lip->li_type == XFS_LI_CUI) {
-			cuip = (struct xfs_cui_log_item *)lip;
-			if (cuip->cui_format.cui_id == cui_id) {
-				/*
-				 * Drop the CUD reference to the CUI. This
-				 * removes the CUI from the AIL and frees it.
-				 */
-				spin_unlock(&ailp->ail_lock);
-				xfs_cui_release(cuip);
-				spin_lock(&ailp->ail_lock);
-				break;
-			}
-		}
-		lip = xfs_trans_ail_cursor_next(ailp, &cur);
-	}
-
-	xfs_trans_ail_cursor_done(&cur);
-	spin_unlock(&ailp->ail_lock);
-
-	return 0;
-}
-
 /*
  * Copy an BUI format buffer from the given buf, and into the destination
  * BUI format structure.  The BUI/BUD items were designed not to need any
@@ -2333,10 +2213,6 @@ xlog_recover_commit_pass2(
 				trans->r_lsn);
 
 	switch (ITEM_TYPE(item)) {
-	case XFS_LI_CUI:
-		return xlog_recover_cui_pass2(log, item, trans->r_lsn);
-	case XFS_LI_CUD:
-		return xlog_recover_cud_pass2(log, item);
 	case XFS_LI_BUI:
 		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
 	case XFS_LI_BUD:
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index ddab09385bfb..a76a5e9b862e 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -18,6 +18,7 @@
 #include "xfs_log.h"
 #include "xfs_refcount.h"
 #include "xfs_error.h"
+#include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_cui_zone;
@@ -28,7 +29,7 @@ static inline struct xfs_cui_log_item *CUI_ITEM(struct xfs_log_item *lip)
 	return container_of(lip, struct xfs_cui_log_item, cui_item);
 }
 
-void
+STATIC void
 xfs_cui_item_free(
 	struct xfs_cui_log_item	*cuip)
 {
@@ -134,7 +135,7 @@ static const struct xfs_item_ops xfs_cui_item_ops = {
 /*
  * Allocate and initialize an cui item with the given number of extents.
  */
-struct xfs_cui_log_item *
+STATIC struct xfs_cui_log_item *
 xfs_cui_init(
 	struct xfs_mount		*mp,
 	uint				nextents)
@@ -592,8 +593,133 @@ xfs_cui_recover(
 	return error;
 }
 
+/*
+ * Copy a CUI format buffer from the given buf into the destination
+ * CUI format structure.  The CUI/CUD items were designed not to need any
+ * special alignment handling.
+ */
+static int
+xfs_cui_copy_format(
+	struct xfs_log_iovec		*buf,
+	struct xfs_cui_log_format	*dst_cui_fmt)
+{
+	struct xfs_cui_log_format	*src_cui_fmt;
+	uint				len;
+
+	src_cui_fmt = buf->i_addr;
+	len = xfs_cui_log_format_sizeof(src_cui_fmt->cui_nextents);
+
+	if (buf->i_len == len) {
+		memcpy(dst_cui_fmt, src_cui_fmt, len);
+		return 0;
+	}
+	XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, NULL);
+	return -EFSCORRUPTED;
+}
+
+/*
+ * This routine is called to create an in-core extent refcount update
+ * item from the cui format structure which was logged on disk.
+ * It allocates an in-core cui, copies the extents from the format
+ * structure into it, and adds the cui to the AIL with the given
+ * LSN.
+ */
+STATIC int
+xlog_recover_refcount_intent_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	int				error;
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_cui_log_item		*cuip;
+	struct xfs_cui_log_format	*cui_formatp;
+
+	cui_formatp = item->ri_buf[0].i_addr;
+
+	cuip = xfs_cui_init(mp, cui_formatp->cui_nextents);
+	error = xfs_cui_copy_format(&item->ri_buf[0], &cuip->cui_format);
+	if (error) {
+		xfs_cui_item_free(cuip);
+		return error;
+	}
+	atomic_set(&cuip->cui_next_extent, cui_formatp->cui_nextents);
+
+	spin_lock(&log->l_ailp->ail_lock);
+	/*
+	 * The CUI has two references. One for the CUD and one for CUI to ensure
+	 * it makes it into the AIL. Insert the CUI into the AIL directly and
+	 * drop the CUI reference. Note that xfs_trans_ail_update() drops the
+	 * AIL lock.
+	 */
+	xfs_trans_ail_update(log->l_ailp, &cuip->cui_item, lsn);
+	xfs_cui_release(cuip);
+	return 0;
+}
+
+
+/*
+ * This routine is called when a CUD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding CUI if it
+ * was still in the log. To do this it searches the AIL for the CUI with an id
+ * equal to that in the CUD format structure. If we find it we drop the CUD
+ * reference, which removes the CUI from the AIL and frees it.
+ */
+STATIC int
+xlog_recover_refcount_done_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_cud_log_format	*cud_formatp;
+	struct xfs_cui_log_item		*cuip = NULL;
+	struct xfs_log_item		*lip;
+	uint64_t			cui_id;
+	struct xfs_ail_cursor		cur;
+	struct xfs_ail			*ailp = log->l_ailp;
+
+	cud_formatp = item->ri_buf[0].i_addr;
+	if (item->ri_buf[0].i_len != sizeof(struct xfs_cud_log_format)) {
+		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
+		return -EFSCORRUPTED;
+	}
+	cui_id = cud_formatp->cud_cui_id;
+
+	/*
+	 * Search for the CUI with the id in the CUD format structure in the
+	 * AIL.
+	 */
+	spin_lock(&ailp->ail_lock);
+	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
+	while (lip != NULL) {
+		if (lip->li_type == XFS_LI_CUI) {
+			cuip = (struct xfs_cui_log_item *)lip;
+			if (cuip->cui_format.cui_id == cui_id) {
+				/*
+				 * Drop the CUD reference to the CUI. This
+				 * removes the CUI from the AIL and frees it.
+				 */
+				spin_unlock(&ailp->ail_lock);
+				xfs_cui_release(cuip);
+				spin_lock(&ailp->ail_lock);
+				break;
+			}
+		}
+		lip = xfs_trans_ail_cursor_next(ailp, &cur);
+	}
+
+	xfs_trans_ail_cursor_done(&cur);
+	spin_unlock(&ailp->ail_lock);
+
+	return 0;
+}
+
 const struct xlog_recover_item_type xlog_refcount_intent_item_type = {
+	.commit_pass2_fn	= xlog_recover_refcount_intent_commit_pass2,
 };
 
 const struct xlog_recover_item_type xlog_refcount_done_item_type = {
+	.commit_pass2_fn	= xlog_recover_refcount_done_commit_pass2,
 };
diff --git a/fs/xfs/xfs_refcount_item.h b/fs/xfs/xfs_refcount_item.h
index e47530f30489..ebe12779eaac 100644
--- a/fs/xfs/xfs_refcount_item.h
+++ b/fs/xfs/xfs_refcount_item.h
@@ -77,8 +77,6 @@ struct xfs_cud_log_item {
 extern struct kmem_zone	*xfs_cui_zone;
 extern struct kmem_zone	*xfs_cud_zone;
 
-struct xfs_cui_log_item *xfs_cui_init(struct xfs_mount *, uint);
-void xfs_cui_item_free(struct xfs_cui_log_item *);
 void xfs_cui_release(struct xfs_cui_log_item *);
 int xfs_cui_recover(struct xfs_trans *parent_tp, struct xfs_cui_log_item *cuip);
 



* [PATCH 12/21] xfs: refactor log recovery BUI item dispatch for pass2 commit functions
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (10 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 11/21] xfs: refactor log recovery CUI " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-04-30  0:48 ` [PATCH 13/21] xfs: refactor recovered EFI log item playback Darrick J. Wong
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the bmap update intent and intent-done pass2 commit code into the
per-item source code files and use dispatch functions to call them.  We
do these one at a time because there's a lot of code to move.  No
functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_item.c   |  134 ++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_bmap_item.h   |    2 -
 fs/xfs/xfs_log_recover.c |  139 ++--------------------------------------------
 3 files changed, 137 insertions(+), 138 deletions(-)
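
For readers skimming the diff, the pass2 dispatch that this patch ends up with can be sketched in plain userspace C. This is only an illustration under stated assumptions: every demo_* name and the DEMO_ECORRUPT constant are hypothetical stand-ins, not kernel symbols, and the AIL, buffer list and tracing are left out.

```c
#include <stddef.h>

#define DEMO_ECORRUPT	117	/* hypothetical stand-in for EFSCORRUPTED */

struct demo_item;

/* Per-type operations; a NULL hook means "nothing to do in pass 2". */
struct demo_item_type {
	int (*commit_pass2_fn)(struct demo_item *item, long lsn);
};

struct demo_item {
	const struct demo_item_type	*type;	/* NULL: unrecognized type */
	long				committed_lsn;
};

/* Example hook: record the LSN the item was committed at. */
static int demo_intent_commit_pass2(struct demo_item *item, long lsn)
{
	item->committed_lsn = lsn;
	return 0;
}

static const struct demo_item_type demo_intent_type = {
	.commit_pass2_fn = demo_intent_commit_pass2,
};

/* A type that is known but has no pass 2 work. */
static const struct demo_item_type demo_noop_type = { NULL };

/* Same three-way decision as the reworked xlog_recover_commit_pass2(). */
static int demo_commit_pass2(struct demo_item *item, long lsn)
{
	if (!item->type)
		return -DEMO_ECORRUPT;	/* unknown item type */
	if (!item->type->commit_pass2_fn)
		return 0;		/* known type, no hook */
	return item->type->commit_pass2_fn(item, lsn);
}
```

The point of the sketch is that an unknown type is the only error case; a known type without a hook is silently skipped, so the switch statement is no longer needed.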


diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index a2824013e2cb..f5c19ea4affb 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -22,6 +22,7 @@
 #include "xfs_bmap_btree.h"
 #include "xfs_trans_space.h"
 #include "xfs_error.h"
+#include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
 
 kmem_zone_t	*xfs_bui_zone;
@@ -32,7 +33,7 @@ static inline struct xfs_bui_log_item *BUI_ITEM(struct xfs_log_item *lip)
 	return container_of(lip, struct xfs_bui_log_item, bui_item);
 }
 
-void
+STATIC void
 xfs_bui_item_free(
 	struct xfs_bui_log_item	*buip)
 {
@@ -135,7 +136,7 @@ static const struct xfs_item_ops xfs_bui_item_ops = {
 /*
  * Allocate and initialize an bui item with the given number of extents.
  */
-struct xfs_bui_log_item *
+STATIC struct xfs_bui_log_item *
 xfs_bui_init(
 	struct xfs_mount		*mp)
 
@@ -565,8 +566,137 @@ xfs_bui_recover(
 	return error;
 }
 
+/*
+ * Copy a BUI format buffer from the given buf into the destination
+ * BUI format structure.  The BUI/BUD items were designed not to need any
+ * special alignment handling.
+ */
+static int
+xfs_bui_copy_format(
+	struct xfs_log_iovec		*buf,
+	struct xfs_bui_log_format	*dst_bui_fmt)
+{
+	struct xfs_bui_log_format	*src_bui_fmt;
+	uint				len;
+
+	src_bui_fmt = buf->i_addr;
+	len = xfs_bui_log_format_sizeof(src_bui_fmt->bui_nextents);
+
+	if (buf->i_len == len) {
+		memcpy(dst_bui_fmt, src_bui_fmt, len);
+		return 0;
+	}
+	XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, NULL);
+	return -EFSCORRUPTED;
+}
+
+/*
+ * This routine is called to create an in-core extent bmap update
+ * item from the bui format structure which was logged on disk.
+ * It allocates an in-core bui, copies the extents from the format
+ * structure into it, and adds the bui to the AIL with the given
+ * LSN.
+ */
+STATIC int
+xlog_recover_bmap_intent_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	int				error;
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_bui_log_item		*buip;
+	struct xfs_bui_log_format	*bui_formatp;
+
+	bui_formatp = item->ri_buf[0].i_addr;
+
+	if (bui_formatp->bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) {
+		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
+		return -EFSCORRUPTED;
+	}
+	buip = xfs_bui_init(mp);
+	error = xfs_bui_copy_format(&item->ri_buf[0], &buip->bui_format);
+	if (error) {
+		xfs_bui_item_free(buip);
+		return error;
+	}
+	atomic_set(&buip->bui_next_extent, bui_formatp->bui_nextents);
+
+	spin_lock(&log->l_ailp->ail_lock);
+	/*
+	 * The BUI has two references. One for the BUD and one for BUI to ensure
+	 * it makes it into the AIL. Insert the BUI into the AIL directly and
+	 * drop the BUI reference. Note that xfs_trans_ail_update() drops the
+	 * AIL lock.
+	 */
+	xfs_trans_ail_update(log->l_ailp, &buip->bui_item, lsn);
+	xfs_bui_release(buip);
+	return 0;
+}
+
+
+/*
+ * This routine is called when a BUD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding BUI if it
+ * was still in the log. To do this it searches the AIL for the BUI with an id
+ * equal to that in the BUD format structure. If we find it we drop the BUD
+ * reference, which removes the BUI from the AIL and frees it.
+ */
+STATIC int
+xlog_recover_bmap_done_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_bud_log_format	*bud_formatp;
+	struct xfs_bui_log_item		*buip = NULL;
+	struct xfs_log_item		*lip;
+	uint64_t			bui_id;
+	struct xfs_ail_cursor		cur;
+	struct xfs_ail			*ailp = log->l_ailp;
+
+	bud_formatp = item->ri_buf[0].i_addr;
+	if (item->ri_buf[0].i_len != sizeof(struct xfs_bud_log_format)) {
+		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
+		return -EFSCORRUPTED;
+	}
+	bui_id = bud_formatp->bud_bui_id;
+
+	/*
+	 * Search for the BUI with the id in the BUD format structure in the
+	 * AIL.
+	 */
+	spin_lock(&ailp->ail_lock);
+	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
+	while (lip != NULL) {
+		if (lip->li_type == XFS_LI_BUI) {
+			buip = (struct xfs_bui_log_item *)lip;
+			if (buip->bui_format.bui_id == bui_id) {
+				/*
+				 * Drop the BUD reference to the BUI. This
+				 * removes the BUI from the AIL and frees it.
+				 */
+				spin_unlock(&ailp->ail_lock);
+				xfs_bui_release(buip);
+				spin_lock(&ailp->ail_lock);
+				break;
+			}
+		}
+		lip = xfs_trans_ail_cursor_next(ailp, &cur);
+	}
+
+	xfs_trans_ail_cursor_done(&cur);
+	spin_unlock(&ailp->ail_lock);
+
+	return 0;
+}
+
 const struct xlog_recover_item_type xlog_bmap_intent_item_type = {
+	.commit_pass2_fn	= xlog_recover_bmap_intent_commit_pass2,
 };
 
 const struct xlog_recover_item_type xlog_bmap_done_item_type = {
+	.commit_pass2_fn	= xlog_recover_bmap_done_commit_pass2,
 };
diff --git a/fs/xfs/xfs_bmap_item.h b/fs/xfs/xfs_bmap_item.h
index ad479cc73de8..515b1d5d6ab7 100644
--- a/fs/xfs/xfs_bmap_item.h
+++ b/fs/xfs/xfs_bmap_item.h
@@ -74,8 +74,6 @@ struct xfs_bud_log_item {
 extern struct kmem_zone	*xfs_bui_zone;
 extern struct kmem_zone	*xfs_bud_zone;
 
-struct xfs_bui_log_item *xfs_bui_init(struct xfs_mount *);
-void xfs_bui_item_free(struct xfs_bui_log_item *);
 void xfs_bui_release(struct xfs_bui_log_item *);
 int xfs_bui_recover(struct xfs_trans *parent_tp, struct xfs_bui_log_item *buip);
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 9292623bbdb4..d28a46888efb 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2056,130 +2056,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-/*
- * Copy an BUI format buffer from the given buf, and into the destination
- * BUI format structure.  The BUI/BUD items were designed not to need any
- * special alignment handling.
- */
-static int
-xfs_bui_copy_format(
-	struct xfs_log_iovec		*buf,
-	struct xfs_bui_log_format	*dst_bui_fmt)
-{
-	struct xfs_bui_log_format	*src_bui_fmt;
-	uint				len;
-
-	src_bui_fmt = buf->i_addr;
-	len = xfs_bui_log_format_sizeof(src_bui_fmt->bui_nextents);
-
-	if (buf->i_len == len) {
-		memcpy(dst_bui_fmt, src_bui_fmt, len);
-		return 0;
-	}
-	XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, NULL);
-	return -EFSCORRUPTED;
-}
-
-/*
- * This routine is called to create an in-core extent bmap update
- * item from the bui format structure which was logged on disk.
- * It allocates an in-core bui, copies the extents from the format
- * structure into it, and adds the bui to the AIL with the given
- * LSN.
- */
-STATIC int
-xlog_recover_bui_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item,
-	xfs_lsn_t			lsn)
-{
-	int				error;
-	struct xfs_mount		*mp = log->l_mp;
-	struct xfs_bui_log_item		*buip;
-	struct xfs_bui_log_format	*bui_formatp;
-
-	bui_formatp = item->ri_buf[0].i_addr;
-
-	if (bui_formatp->bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) {
-		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
-		return -EFSCORRUPTED;
-	}
-	buip = xfs_bui_init(mp);
-	error = xfs_bui_copy_format(&item->ri_buf[0], &buip->bui_format);
-	if (error) {
-		xfs_bui_item_free(buip);
-		return error;
-	}
-	atomic_set(&buip->bui_next_extent, bui_formatp->bui_nextents);
-
-	spin_lock(&log->l_ailp->ail_lock);
-	/*
-	 * The RUI has two references. One for the RUD and one for RUI to ensure
-	 * it makes it into the AIL. Insert the RUI into the AIL directly and
-	 * drop the RUI reference. Note that xfs_trans_ail_update() drops the
-	 * AIL lock.
-	 */
-	xfs_trans_ail_update(log->l_ailp, &buip->bui_item, lsn);
-	xfs_bui_release(buip);
-	return 0;
-}
-
-
-/*
- * This routine is called when an BUD format structure is found in a committed
- * transaction in the log. Its purpose is to cancel the corresponding BUI if it
- * was still in the log. To do this it searches the AIL for the BUI with an id
- * equal to that in the BUD format structure. If we find it we drop the BUD
- * reference, which removes the BUI from the AIL and frees it.
- */
-STATIC int
-xlog_recover_bud_pass2(
-	struct xlog			*log,
-	struct xlog_recover_item	*item)
-{
-	struct xfs_bud_log_format	*bud_formatp;
-	struct xfs_bui_log_item		*buip = NULL;
-	struct xfs_log_item		*lip;
-	uint64_t			bui_id;
-	struct xfs_ail_cursor		cur;
-	struct xfs_ail			*ailp = log->l_ailp;
-
-	bud_formatp = item->ri_buf[0].i_addr;
-	if (item->ri_buf[0].i_len != sizeof(struct xfs_bud_log_format)) {
-		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
-		return -EFSCORRUPTED;
-	}
-	bui_id = bud_formatp->bud_bui_id;
-
-	/*
-	 * Search for the BUI with the id in the BUD format structure in the
-	 * AIL.
-	 */
-	spin_lock(&ailp->ail_lock);
-	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
-	while (lip != NULL) {
-		if (lip->li_type == XFS_LI_BUI) {
-			buip = (struct xfs_bui_log_item *)lip;
-			if (buip->bui_format.bui_id == bui_id) {
-				/*
-				 * Drop the BUD reference to the BUI. This
-				 * removes the BUI from the AIL and frees it.
-				 */
-				spin_unlock(&ailp->ail_lock);
-				xfs_bui_release(buip);
-				spin_lock(&ailp->ail_lock);
-				break;
-			}
-		}
-		lip = xfs_trans_ail_cursor_next(ailp, &cur);
-	}
-
-	xfs_trans_ail_cursor_done(&cur);
-	spin_unlock(&ailp->ail_lock);
-
-	return 0;
-}
-
 STATIC int
 xlog_recover_commit_pass1(
 	struct xlog			*log,
@@ -2208,21 +2084,16 @@ xlog_recover_commit_pass2(
 {
 	trace_xfs_log_recover_item_recover(log, trans, item, XLOG_RECOVER_PASS2);
 
-	if (item->ri_type && item->ri_type->commit_pass2_fn)
-		return item->ri_type->commit_pass2_fn(log, buffer_list, item,
-				trans->r_lsn);
-
-	switch (ITEM_TYPE(item)) {
-	case XFS_LI_BUI:
-		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
-	case XFS_LI_BUD:
-		return xlog_recover_bud_pass2(log, item);
-	default:
+	if (!item->ri_type) {
 		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
 			__func__, ITEM_TYPE(item));
 		ASSERT(0);
 		return -EFSCORRUPTED;
 	}
+	if (!item->ri_type->commit_pass2_fn)
+		return 0;
+	return item->ri_type->commit_pass2_fn(log, buffer_list, item,
+			trans->r_lsn);
 }
 
 STATIC int



* [PATCH 13/21] xfs: refactor recovered EFI log item playback
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (11 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 12/21] xfs: refactor log recovery BUI " Darrick J. Wong
@ 2020-04-30  0:48 ` Darrick J. Wong
  2020-05-01 10:19   ` Christoph Hellwig
  2020-04-30  0:49 ` [PATCH 14/21] xfs: refactor recovered RUI " Darrick J. Wong
                   ` (8 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:48 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the code that processes the log items created from the recovered
log items into the per-item source code files and use dispatch functions
to call them.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_extfree_item.c |   32 +++++++++++++++++++++++++++++--
 fs/xfs/xfs_extfree_item.h |    5 -----
 fs/xfs/xfs_log_recover.c  |   46 ++++-----------------------------------------
 fs/xfs/xfs_trans.h        |    4 ++++
 4 files changed, 38 insertions(+), 49 deletions(-)
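
The shape of the new ->iop_recover hook can be sketched outside the kernel roughly as follows. This is a minimal model, assuming hypothetical demo_* names throughout; the real hook also drops and retakes the AIL lock around the replay and takes a transaction argument, both of which are omitted here.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_log_item;

struct demo_item_ops {
	int (*iop_recover)(struct demo_log_item *lip);
};

/* Generic log item, embedded in each specific intent item. */
struct demo_log_item {
	const struct demo_item_ops	*ops;
};

struct demo_efi {
	int			recovered;	/* set once the intent is replayed */
	int			replay_count;	/* how many times replay really ran */
	struct demo_log_item	item;
};

/* Replay the intent unless it has already been processed. */
static int demo_efi_item_recover(struct demo_log_item *lip)
{
	struct demo_efi *efip = demo_container_of(lip, struct demo_efi, item);

	if (efip->recovered)
		return 0;
	efip->replay_count++;
	efip->recovered = 1;
	return 0;
}

static const struct demo_item_ops demo_efi_ops = {
	.iop_recover = demo_efi_item_recover,
};

/* The caller only sees the generic item and its ops table. */
static int demo_process_intent(struct demo_log_item *lip)
{
	return lip->ops->iop_recover(lip);
}
```

Calling demo_process_intent() twice on the same item replays it only once, which mirrors the XFS_EFI_RECOVERED check that moves into xfs_efi_item_recover().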


diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 53c1d3b9b957..229c6dee0f85 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -28,6 +28,8 @@
 kmem_zone_t	*xfs_efi_zone;
 kmem_zone_t	*xfs_efd_zone;
 
+STATIC int xfs_efi_recover(struct xfs_mount *mp, struct xfs_efi_log_item *efip);
+
 static inline struct xfs_efi_log_item *EFI_ITEM(struct xfs_log_item *lip)
 {
 	return container_of(lip, struct xfs_efi_log_item, efi_item);
@@ -51,7 +53,7 @@ xfs_efi_item_free(
  * committed vs unpin operations in bulk insert operations. Hence the reference
  * count to ensure only the last caller frees the EFI.
  */
-void
+STATIC void
 xfs_efi_release(
 	struct xfs_efi_log_item	*efip)
 {
@@ -141,11 +143,37 @@ xfs_efi_item_release(
 	xfs_efi_release(EFI_ITEM(lip));
 }
 
+/* Recover the EFI if necessary. */
+STATIC int
+xfs_efi_item_recover(
+	struct xfs_log_item		*lip,
+	struct xfs_trans		*tp)
+{
+	struct xfs_ail			*ailp = lip->li_ailp;
+	struct xfs_efi_log_item		*efip;
+	int				error;
+
+	/*
+	 * Skip EFIs that we've already processed.
+	 */
+	efip = container_of(lip, struct xfs_efi_log_item, efi_item);
+	if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags))
+		return 0;
+
+	spin_unlock(&ailp->ail_lock);
+	error = xfs_efi_recover(tp->t_mountp, efip);
+	spin_lock(&ailp->ail_lock);
+
+	return error;
+}
+
 static const struct xfs_item_ops xfs_efi_item_ops = {
+	.flags		= XFS_ITEM_TYPE_IS_INTENT,
 	.iop_size	= xfs_efi_item_size,
 	.iop_format	= xfs_efi_item_format,
 	.iop_unpin	= xfs_efi_item_unpin,
 	.iop_release	= xfs_efi_item_release,
+	.iop_recover	= xfs_efi_item_recover,
 };
 
 
@@ -594,7 +622,7 @@ const struct xfs_defer_op_type xfs_agfl_free_defer_type = {
  * Process an extent free intent item that was recovered from
  * the log.  We need to free the extents that it describes.
  */
-int
+STATIC int
 xfs_efi_recover(
 	struct xfs_mount	*mp,
 	struct xfs_efi_log_item	*efip)
diff --git a/fs/xfs/xfs_extfree_item.h b/fs/xfs/xfs_extfree_item.h
index ecbe937952d8..23e3758b5dbb 100644
--- a/fs/xfs/xfs_extfree_item.h
+++ b/fs/xfs/xfs_extfree_item.h
@@ -78,9 +78,4 @@ typedef struct xfs_efd_log_item {
 extern struct kmem_zone	*xfs_efi_zone;
 extern struct kmem_zone	*xfs_efd_zone;
 
-void			xfs_efi_release(struct xfs_efi_log_item *);
-
-int			xfs_efi_recover(struct xfs_mount *mp,
-					struct xfs_efi_log_item *efip);
-
 #endif	/* __XFS_EXTFREE_ITEM_H__ */
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index d28a46888efb..06f30ce1d02d 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2599,46 +2599,6 @@ xlog_recover_process_data(
 	return 0;
 }
 
-/* Recover the EFI if necessary. */
-STATIC int
-xlog_recover_process_efi(
-	struct xfs_mount		*mp,
-	struct xfs_ail			*ailp,
-	struct xfs_log_item		*lip)
-{
-	struct xfs_efi_log_item		*efip;
-	int				error;
-
-	/*
-	 * Skip EFIs that we've already processed.
-	 */
-	efip = container_of(lip, struct xfs_efi_log_item, efi_item);
-	if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags))
-		return 0;
-
-	spin_unlock(&ailp->ail_lock);
-	error = xfs_efi_recover(mp, efip);
-	spin_lock(&ailp->ail_lock);
-
-	return error;
-}
-
-/* Release the EFI since we're cancelling everything. */
-STATIC void
-xlog_recover_cancel_efi(
-	struct xfs_mount		*mp,
-	struct xfs_ail			*ailp,
-	struct xfs_log_item		*lip)
-{
-	struct xfs_efi_log_item		*efip;
-
-	efip = container_of(lip, struct xfs_efi_log_item, efi_item);
-
-	spin_unlock(&ailp->ail_lock);
-	xfs_efi_release(efip);
-	spin_lock(&ailp->ail_lock);
-}
-
 /* Recover the RUI if necessary. */
 STATIC int
 xlog_recover_process_rui(
@@ -2883,7 +2843,7 @@ xlog_recover_process_intents(
 		 */
 		switch (lip->li_type) {
 		case XFS_LI_EFI:
-			error = xlog_recover_process_efi(log->l_mp, ailp, lip);
+			error = lip->li_ops->iop_recover(lip, parent_tp);
 			break;
 		case XFS_LI_RUI:
 			error = xlog_recover_process_rui(log->l_mp, ailp, lip);
@@ -2939,7 +2899,9 @@ xlog_recover_cancel_intents(
 
 		switch (lip->li_type) {
 		case XFS_LI_EFI:
-			xlog_recover_cancel_efi(log->l_mp, ailp, lip);
+			spin_unlock(&ailp->ail_lock);
+			lip->li_ops->iop_release(lip);
+			spin_lock(&ailp->ail_lock);
 			break;
 		case XFS_LI_RUI:
 			xlog_recover_cancel_rui(log->l_mp, ailp, lip);
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 752c7fef9de7..9ac5483d2187 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -77,6 +77,7 @@ struct xfs_item_ops {
 	void (*iop_release)(struct xfs_log_item *);
 	xfs_lsn_t (*iop_committed)(struct xfs_log_item *, xfs_lsn_t);
 	void (*iop_error)(struct xfs_log_item *, xfs_buf_t *);
+	int (*iop_recover)(struct xfs_log_item *lip, struct xfs_trans *tp);
 };
 
 /*
@@ -85,6 +86,9 @@ struct xfs_item_ops {
  */
 #define XFS_ITEM_RELEASE_WHEN_COMMITTED	(1 << 0)
 
+/* This log item type is an intent log item. */
+#define XFS_ITEM_TYPE_IS_INTENT		(1 << 1)
+
 void	xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item,
 			  int type, const struct xfs_item_ops *ops);
 



* [PATCH 14/21] xfs: refactor recovered RUI log item playback
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (12 preceding siblings ...)
  2020-04-30  0:48 ` [PATCH 13/21] xfs: refactor recovered EFI log item playback Darrick J. Wong
@ 2020-04-30  0:49 ` Darrick J. Wong
  2020-04-30  0:49 ` [PATCH 15/21] xfs: refactor recovered CUI " Darrick J. Wong
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the code that processes the log items created from the recovered
log items into the per-item source code files and use dispatch functions
to call them.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_log_recover.c |   48 ++--------------------------------------------
 fs/xfs/xfs_rmap_item.c   |   31 ++++++++++++++++++++++++++++--
 fs/xfs/xfs_rmap_item.h   |    3 ---
 3 files changed, 31 insertions(+), 51 deletions(-)
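
The incremental conversion of the cancel path, where converted item types share one case body and unconverted ones still go through a per-type helper, might be modelled like this. All names are hypothetical, the helper merely stands in for a not-yet-converted xlog_recover_cancel_*() function, and the AIL locking is omitted.

```c
#include <stddef.h>

enum demo_li_type { DEMO_LI_EFI, DEMO_LI_RUI, DEMO_LI_CUI };

struct demo_log_item;

struct demo_item_ops {
	void (*iop_release)(struct demo_log_item *lip);
};

struct demo_log_item {
	enum demo_li_type		type;
	const struct demo_item_ops	*ops;
	int				released;
};

static void demo_release(struct demo_log_item *lip)
{
	lip->released = 1;
}

static const struct demo_item_ops demo_ops = {
	.iop_release = demo_release,
};

/* Counts calls to the legacy, per-type cancel helper. */
static int demo_legacy_calls;

static void demo_legacy_cancel(struct demo_log_item *lip)
{
	demo_legacy_calls++;
	lip->released = 1;
}

static void demo_cancel_intent(struct demo_log_item *lip)
{
	switch (lip->type) {
	case DEMO_LI_EFI:
	case DEMO_LI_RUI:
		/* Converted types: release through the ops table. */
		lip->ops->iop_release(lip);
		break;
	case DEMO_LI_CUI:
		/* Not converted yet: still uses its own helper. */
		demo_legacy_cancel(lip);
		break;
	}
}
```

Each later patch in this style moves one more case label into the shared body and deletes the corresponding helper.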


diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 06f30ce1d02d..cf790d02ee92 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2599,46 +2599,6 @@ xlog_recover_process_data(
 	return 0;
 }
 
-/* Recover the RUI if necessary. */
-STATIC int
-xlog_recover_process_rui(
-	struct xfs_mount		*mp,
-	struct xfs_ail			*ailp,
-	struct xfs_log_item		*lip)
-{
-	struct xfs_rui_log_item		*ruip;
-	int				error;
-
-	/*
-	 * Skip RUIs that we've already processed.
-	 */
-	ruip = container_of(lip, struct xfs_rui_log_item, rui_item);
-	if (test_bit(XFS_RUI_RECOVERED, &ruip->rui_flags))
-		return 0;
-
-	spin_unlock(&ailp->ail_lock);
-	error = xfs_rui_recover(mp, ruip);
-	spin_lock(&ailp->ail_lock);
-
-	return error;
-}
-
-/* Release the RUI since we're cancelling everything. */
-STATIC void
-xlog_recover_cancel_rui(
-	struct xfs_mount		*mp,
-	struct xfs_ail			*ailp,
-	struct xfs_log_item		*lip)
-{
-	struct xfs_rui_log_item		*ruip;
-
-	ruip = container_of(lip, struct xfs_rui_log_item, rui_item);
-
-	spin_unlock(&ailp->ail_lock);
-	xfs_rui_release(ruip);
-	spin_lock(&ailp->ail_lock);
-}
-
 /* Recover the CUI if necessary. */
 STATIC int
 xlog_recover_process_cui(
@@ -2843,10 +2803,8 @@ xlog_recover_process_intents(
 		 */
 		switch (lip->li_type) {
 		case XFS_LI_EFI:
-			error = lip->li_ops->iop_recover(lip, parent_tp);
-			break;
 		case XFS_LI_RUI:
-			error = xlog_recover_process_rui(log->l_mp, ailp, lip);
+			error = lip->li_ops->iop_recover(lip, parent_tp);
 			break;
 		case XFS_LI_CUI:
 			error = xlog_recover_process_cui(parent_tp, ailp, lip);
@@ -2899,13 +2857,11 @@ xlog_recover_cancel_intents(
 
 		switch (lip->li_type) {
 		case XFS_LI_EFI:
+		case XFS_LI_RUI:
 			spin_unlock(&ailp->ail_lock);
 			lip->li_ops->iop_release(lip);
 			spin_lock(&ailp->ail_lock);
 			break;
-		case XFS_LI_RUI:
-			xlog_recover_cancel_rui(log->l_mp, ailp, lip);
-			break;
 		case XFS_LI_CUI:
 			xlog_recover_cancel_cui(log->l_mp, ailp, lip);
 			break;
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 51d9226c043e..53348d48bf67 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -24,6 +24,8 @@
 kmem_zone_t	*xfs_rui_zone;
 kmem_zone_t	*xfs_rud_zone;
 
+STATIC int xfs_rui_recover(struct xfs_mount *mp, struct xfs_rui_log_item *ruip);
+
 static inline struct xfs_rui_log_item *RUI_ITEM(struct xfs_log_item *lip)
 {
 	return container_of(lip, struct xfs_rui_log_item, rui_item);
@@ -46,7 +48,7 @@ xfs_rui_item_free(
  * committed vs unpin operations in bulk insert operations. Hence the reference
  * count to ensure only the last caller frees the RUI.
  */
-void
+STATIC void
 xfs_rui_release(
 	struct xfs_rui_log_item	*ruip)
 {
@@ -124,11 +126,36 @@ xfs_rui_item_release(
 	xfs_rui_release(RUI_ITEM(lip));
 }
 
+/* Recover the RUI if necessary. */
+STATIC int
+xfs_rui_item_recover(
+	struct xfs_log_item		*lip,
+	struct xfs_trans		*tp)
+{
+	struct xfs_ail			*ailp = lip->li_ailp;
+	struct xfs_rui_log_item		*ruip = RUI_ITEM(lip);
+	int				error;
+
+	/*
+	 * Skip RUIs that we've already processed.
+	 */
+	if (test_bit(XFS_RUI_RECOVERED, &ruip->rui_flags))
+		return 0;
+
+	spin_unlock(&ailp->ail_lock);
+	error = xfs_rui_recover(tp->t_mountp, ruip);
+	spin_lock(&ailp->ail_lock);
+
+	return error;
+}
+
 static const struct xfs_item_ops xfs_rui_item_ops = {
+	.flags		= XFS_ITEM_TYPE_IS_INTENT,
 	.iop_size	= xfs_rui_item_size,
 	.iop_format	= xfs_rui_item_format,
 	.iop_unpin	= xfs_rui_item_unpin,
 	.iop_release	= xfs_rui_item_release,
+	.iop_recover	= xfs_rui_item_recover,
 };
 
 /*
@@ -489,7 +516,7 @@ const struct xfs_defer_op_type xfs_rmap_update_defer_type = {
  * Process an rmap update intent item that was recovered from the log.
  * We need to update the rmapbt.
  */
-int
+STATIC int
 xfs_rui_recover(
 	struct xfs_mount		*mp,
 	struct xfs_rui_log_item		*ruip)
diff --git a/fs/xfs/xfs_rmap_item.h b/fs/xfs/xfs_rmap_item.h
index 89bd192779f8..48a77a6f5c94 100644
--- a/fs/xfs/xfs_rmap_item.h
+++ b/fs/xfs/xfs_rmap_item.h
@@ -77,7 +77,4 @@ struct xfs_rud_log_item {
 extern struct kmem_zone	*xfs_rui_zone;
 extern struct kmem_zone	*xfs_rud_zone;
 
-void xfs_rui_release(struct xfs_rui_log_item *);
-int xfs_rui_recover(struct xfs_mount *mp, struct xfs_rui_log_item *ruip);
-
 #endif	/* __XFS_RMAP_ITEM_H__ */



* [PATCH 15/21] xfs: refactor recovered CUI log item playback
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (13 preceding siblings ...)
  2020-04-30  0:49 ` [PATCH 14/21] xfs: refactor recovered RUI " Darrick J. Wong
@ 2020-04-30  0:49 ` Darrick J. Wong
  2020-04-30  0:49 ` [PATCH 16/21] xfs: refactor recovered BUI " Darrick J. Wong
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the code that processes the log items created from the recovered
log items into the per-item source code files and use dispatch functions
to call them.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_log_recover.c   |   48 ++------------------------------------------
 fs/xfs/xfs_refcount_item.c |   31 +++++++++++++++++++++++++++-
 fs/xfs/xfs_refcount_item.h |    3 ---
 3 files changed, 31 insertions(+), 51 deletions(-)


diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index cf790d02ee92..9323bb5800d6 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2599,46 +2599,6 @@ xlog_recover_process_data(
 	return 0;
 }
 
-/* Recover the CUI if necessary. */
-STATIC int
-xlog_recover_process_cui(
-	struct xfs_trans		*parent_tp,
-	struct xfs_ail			*ailp,
-	struct xfs_log_item		*lip)
-{
-	struct xfs_cui_log_item		*cuip;
-	int				error;
-
-	/*
-	 * Skip CUIs that we've already processed.
-	 */
-	cuip = container_of(lip, struct xfs_cui_log_item, cui_item);
-	if (test_bit(XFS_CUI_RECOVERED, &cuip->cui_flags))
-		return 0;
-
-	spin_unlock(&ailp->ail_lock);
-	error = xfs_cui_recover(parent_tp, cuip);
-	spin_lock(&ailp->ail_lock);
-
-	return error;
-}
-
-/* Release the CUI since we're cancelling everything. */
-STATIC void
-xlog_recover_cancel_cui(
-	struct xfs_mount		*mp,
-	struct xfs_ail			*ailp,
-	struct xfs_log_item		*lip)
-{
-	struct xfs_cui_log_item		*cuip;
-
-	cuip = container_of(lip, struct xfs_cui_log_item, cui_item);
-
-	spin_unlock(&ailp->ail_lock);
-	xfs_cui_release(cuip);
-	spin_lock(&ailp->ail_lock);
-}
-
 /* Recover the BUI if necessary. */
 STATIC int
 xlog_recover_process_bui(
@@ -2804,10 +2764,8 @@ xlog_recover_process_intents(
 		switch (lip->li_type) {
 		case XFS_LI_EFI:
 		case XFS_LI_RUI:
-			error = lip->li_ops->iop_recover(lip, parent_tp);
-			break;
 		case XFS_LI_CUI:
-			error = xlog_recover_process_cui(parent_tp, ailp, lip);
+			error = lip->li_ops->iop_recover(lip, parent_tp);
 			break;
 		case XFS_LI_BUI:
 			error = xlog_recover_process_bui(parent_tp, ailp, lip);
@@ -2858,13 +2816,11 @@ xlog_recover_cancel_intents(
 		switch (lip->li_type) {
 		case XFS_LI_EFI:
 		case XFS_LI_RUI:
+		case XFS_LI_CUI:
 			spin_unlock(&ailp->ail_lock);
 			lip->li_ops->iop_release(lip);
 			spin_lock(&ailp->ail_lock);
 			break;
-		case XFS_LI_CUI:
-			xlog_recover_cancel_cui(log->l_mp, ailp, lip);
-			break;
 		case XFS_LI_BUI:
 			xlog_recover_cancel_bui(log->l_mp, ailp, lip);
 			break;
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index a76a5e9b862e..d291a640ce31 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -24,6 +24,8 @@
 kmem_zone_t	*xfs_cui_zone;
 kmem_zone_t	*xfs_cud_zone;
 
+STATIC int xfs_cui_recover(struct xfs_trans *tp, struct xfs_cui_log_item *cuip);
+
 static inline struct xfs_cui_log_item *CUI_ITEM(struct xfs_log_item *lip)
 {
 	return container_of(lip, struct xfs_cui_log_item, cui_item);
@@ -46,7 +48,7 @@ xfs_cui_item_free(
  * committed vs unpin operations in bulk insert operations. Hence the reference
  * count to ensure only the last caller frees the CUI.
  */
-void
+STATIC void
 xfs_cui_release(
 	struct xfs_cui_log_item	*cuip)
 {
@@ -125,11 +127,36 @@ xfs_cui_item_release(
 	xfs_cui_release(CUI_ITEM(lip));
 }
 
+/* Recover the CUI if necessary. */
+STATIC int
+xfs_cui_item_recover(
+	struct xfs_log_item		*lip,
+	struct xfs_trans		*tp)
+{
+	struct xfs_ail			*ailp = lip->li_ailp;
+	struct xfs_cui_log_item		*cuip = CUI_ITEM(lip);
+	int				error;
+
+	/*
+	 * Skip CUIs that we've already processed.
+	 */
+	if (test_bit(XFS_CUI_RECOVERED, &cuip->cui_flags))
+		return 0;
+
+	spin_unlock(&ailp->ail_lock);
+	error = xfs_cui_recover(tp, cuip);
+	spin_lock(&ailp->ail_lock);
+
+	return error;
+}
+
 static const struct xfs_item_ops xfs_cui_item_ops = {
+	.flags		= XFS_ITEM_TYPE_IS_INTENT,
 	.iop_size	= xfs_cui_item_size,
 	.iop_format	= xfs_cui_item_format,
 	.iop_unpin	= xfs_cui_item_unpin,
 	.iop_release	= xfs_cui_item_release,
+	.iop_recover	= xfs_cui_item_recover,
 };
 
 /*
@@ -445,7 +472,7 @@ const struct xfs_defer_op_type xfs_refcount_update_defer_type = {
  * Process a refcount update intent item that was recovered from the log.
  * We need to update the refcountbt.
  */
-int
+STATIC int
 xfs_cui_recover(
 	struct xfs_trans		*parent_tp,
 	struct xfs_cui_log_item		*cuip)
diff --git a/fs/xfs/xfs_refcount_item.h b/fs/xfs/xfs_refcount_item.h
index ebe12779eaac..cfaa857673a6 100644
--- a/fs/xfs/xfs_refcount_item.h
+++ b/fs/xfs/xfs_refcount_item.h
@@ -77,7 +77,4 @@ struct xfs_cud_log_item {
 extern struct kmem_zone	*xfs_cui_zone;
 extern struct kmem_zone	*xfs_cud_zone;
 
-void xfs_cui_release(struct xfs_cui_log_item *);
-int xfs_cui_recover(struct xfs_trans *parent_tp, struct xfs_cui_log_item *cuip);
-
 #endif	/* __XFS_REFCOUNT_ITEM_H__ */



* [PATCH 16/21] xfs: refactor recovered BUI log item playback
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (14 preceding siblings ...)
  2020-04-30  0:49 ` [PATCH 15/21] xfs: refactor recovered CUI " Darrick J. Wong
@ 2020-04-30  0:49 ` Darrick J. Wong
  2020-04-30  0:49 ` [PATCH 17/21] xfs: refactor releasing finished intents during log recovery Darrick J. Wong
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Move the code that processes the log items created from the recovered
log items into the per-item source code files and use dispatch functions
to call them.  No functional changes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_item.c   |   31 ++++++++++++++-
 fs/xfs/xfs_bmap_item.h   |    3 -
 fs/xfs/xfs_log_recover.c |   95 ++++++----------------------------------------
 3 files changed, 41 insertions(+), 88 deletions(-)


diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index f5c19ea4affb..a0fb79e1d09f 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -28,6 +28,8 @@
 kmem_zone_t	*xfs_bui_zone;
 kmem_zone_t	*xfs_bud_zone;
 
+STATIC int xfs_bui_recover(struct xfs_trans *tp, struct xfs_bui_log_item *buip);
+
 static inline struct xfs_bui_log_item *BUI_ITEM(struct xfs_log_item *lip)
 {
 	return container_of(lip, struct xfs_bui_log_item, bui_item);
@@ -47,7 +49,7 @@ xfs_bui_item_free(
  * committed vs unpin operations in bulk insert operations. Hence the reference
  * count to ensure only the last caller frees the BUI.
  */
-void
+STATIC void
 xfs_bui_release(
 	struct xfs_bui_log_item	*buip)
 {
@@ -126,11 +128,36 @@ xfs_bui_item_release(
 	xfs_bui_release(BUI_ITEM(lip));
 }
 
+/* Recover the BUI if necessary. */
+STATIC int
+xfs_bui_item_recover(
+	struct xfs_log_item		*lip,
+	struct xfs_trans		*tp)
+{
+	struct xfs_ail			*ailp = lip->li_ailp;
+	struct xfs_bui_log_item		*buip = BUI_ITEM(lip);
+	int				error;
+
+	/*
+	 * Skip BUIs that we've already processed.
+	 */
+	if (test_bit(XFS_BUI_RECOVERED, &buip->bui_flags))
+		return 0;
+
+	spin_unlock(&ailp->ail_lock);
+	error = xfs_bui_recover(tp, buip);
+	spin_lock(&ailp->ail_lock);
+
+	return error;
+}
+
 static const struct xfs_item_ops xfs_bui_item_ops = {
+	.flags		= XFS_ITEM_TYPE_IS_INTENT,
 	.iop_size	= xfs_bui_item_size,
 	.iop_format	= xfs_bui_item_format,
 	.iop_unpin	= xfs_bui_item_unpin,
 	.iop_release	= xfs_bui_item_release,
+	.iop_recover	= xfs_bui_item_recover,
 };
 
 /*
@@ -431,7 +458,7 @@ const struct xfs_defer_op_type xfs_bmap_update_defer_type = {
  * Process a bmap update intent item that was recovered from the log.
  * We need to update some inode's bmbt.
  */
-int
+STATIC int
 xfs_bui_recover(
 	struct xfs_trans		*parent_tp,
 	struct xfs_bui_log_item		*buip)
diff --git a/fs/xfs/xfs_bmap_item.h b/fs/xfs/xfs_bmap_item.h
index 515b1d5d6ab7..44d06e62f8f9 100644
--- a/fs/xfs/xfs_bmap_item.h
+++ b/fs/xfs/xfs_bmap_item.h
@@ -74,7 +74,4 @@ struct xfs_bud_log_item {
 extern struct kmem_zone	*xfs_bui_zone;
 extern struct kmem_zone	*xfs_bud_zone;
 
-void xfs_bui_release(struct xfs_bui_log_item *);
-int xfs_bui_recover(struct xfs_trans *parent_tp, struct xfs_bui_log_item *buip);
-
 #endif	/* __XFS_BMAP_ITEM_H__ */
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 9323bb5800d6..db4535cd74c1 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2599,60 +2599,6 @@ xlog_recover_process_data(
 	return 0;
 }
 
-/* Recover the BUI if necessary. */
-STATIC int
-xlog_recover_process_bui(
-	struct xfs_trans		*parent_tp,
-	struct xfs_ail			*ailp,
-	struct xfs_log_item		*lip)
-{
-	struct xfs_bui_log_item		*buip;
-	int				error;
-
-	/*
-	 * Skip BUIs that we've already processed.
-	 */
-	buip = container_of(lip, struct xfs_bui_log_item, bui_item);
-	if (test_bit(XFS_BUI_RECOVERED, &buip->bui_flags))
-		return 0;
-
-	spin_unlock(&ailp->ail_lock);
-	error = xfs_bui_recover(parent_tp, buip);
-	spin_lock(&ailp->ail_lock);
-
-	return error;
-}
-
-/* Release the BUI since we're cancelling everything. */
-STATIC void
-xlog_recover_cancel_bui(
-	struct xfs_mount		*mp,
-	struct xfs_ail			*ailp,
-	struct xfs_log_item		*lip)
-{
-	struct xfs_bui_log_item		*buip;
-
-	buip = container_of(lip, struct xfs_bui_log_item, bui_item);
-
-	spin_unlock(&ailp->ail_lock);
-	xfs_bui_release(buip);
-	spin_lock(&ailp->ail_lock);
-}
-
-/* Is this log item a deferred action intent? */
-static inline bool xlog_item_is_intent(struct xfs_log_item *lip)
-{
-	switch (lip->li_type) {
-	case XFS_LI_EFI:
-	case XFS_LI_RUI:
-	case XFS_LI_CUI:
-	case XFS_LI_BUI:
-		return true;
-	default:
-		return false;
-	}
-}
-
 /* Take all the collected deferred ops and finish them in order. */
 static int
 xlog_finish_defer_ops(
@@ -2740,10 +2686,11 @@ xlog_recover_process_intents(
 		 * We're done when we see something other than an intent.
 		 * There should be no intents left in the AIL now.
 		 */
-		if (!xlog_item_is_intent(lip)) {
+		if (!(lip->li_ops->flags & XFS_ITEM_TYPE_IS_INTENT)) {
 #ifdef DEBUG
 			for (; lip; lip = xfs_trans_ail_cursor_next(ailp, &cur))
-				ASSERT(!xlog_item_is_intent(lip));
+				ASSERT(!(lip->li_ops->flags &
+						XFS_ITEM_TYPE_IS_INTENT));
 #endif
 			break;
 		}
@@ -2757,20 +2704,11 @@ xlog_recover_process_intents(
 
 		/*
 		 * NOTE: If your intent processing routine can create more
-		 * deferred ops, you /must/ attach them to the dfops in this
-		 * routine or else those subsequent intents will get
+		 * deferred ops, you /must/ attach them to the transaction in
+		 * this routine or else those subsequent intents will get
 		 * replayed in the wrong order!
 		 */
-		switch (lip->li_type) {
-		case XFS_LI_EFI:
-		case XFS_LI_RUI:
-		case XFS_LI_CUI:
-			error = lip->li_ops->iop_recover(lip, parent_tp);
-			break;
-		case XFS_LI_BUI:
-			error = xlog_recover_process_bui(parent_tp, ailp, lip);
-			break;
-		}
+		error = lip->li_ops->iop_recover(lip, parent_tp);
 		if (error)
 			goto out;
 		lip = xfs_trans_ail_cursor_next(ailp, &cur);
@@ -2805,27 +2743,18 @@ xlog_recover_cancel_intents(
 		 * We're done when we see something other than an intent.
 		 * There should be no intents left in the AIL now.
 		 */
-		if (!xlog_item_is_intent(lip)) {
+		if (!(lip->li_ops->flags & XFS_ITEM_TYPE_IS_INTENT)) {
 #ifdef DEBUG
 			for (; lip; lip = xfs_trans_ail_cursor_next(ailp, &cur))
-				ASSERT(!xlog_item_is_intent(lip));
+				ASSERT(!(lip->li_ops->flags &
+						XFS_ITEM_TYPE_IS_INTENT));
 #endif
 			break;
 		}
 
-		switch (lip->li_type) {
-		case XFS_LI_EFI:
-		case XFS_LI_RUI:
-		case XFS_LI_CUI:
-			spin_unlock(&ailp->ail_lock);
-			lip->li_ops->iop_release(lip);
-			spin_lock(&ailp->ail_lock);
-			break;
-		case XFS_LI_BUI:
-			xlog_recover_cancel_bui(log->l_mp, ailp, lip);
-			break;
-		}
-
+		spin_unlock(&ailp->ail_lock);
+		lip->li_ops->iop_release(lip);
+		spin_lock(&ailp->ail_lock);
 		lip = xfs_trans_ail_cursor_next(ailp, &cur);
 	}
 


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 17/21] xfs: refactor releasing finished intents during log recovery
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (15 preceding siblings ...)
  2020-04-30  0:49 ` [PATCH 16/21] xfs: refactor recovered BUI " Darrick J. Wong
@ 2020-04-30  0:49 ` Darrick J. Wong
  2020-04-30  0:49 ` [PATCH 18/21] xfs: refactor adding recovered intent items to the log Darrick J. Wong
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Replace the open-coded AIL item walking with a proper helper when we're
trying to release an intent item that has been finished.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
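A note for readers, not part of the commit: the shape of the helper can
be sketched in userspace as below, under some loud assumptions.  The AIL
is reduced to an array of pointers, there is no locking, "release" just
sets a flag, and every name (release_intent, struct a_item, TYPE_A, ...)
is hypothetical.  What the sketch tries to show is why the type check
has to come before the match callback: the callback uses container_of()
and is only valid for its own item type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum { TYPE_A = 1, TYPE_B = 2 };	/* hypothetical item types */

struct log_item {
	unsigned short	type;
	bool		released;
};

/* Two item types that keep their id in type-specific places. */
struct a_item {
	uint64_t	a_id;
	struct log_item	item;
};

struct b_item {
	struct log_item	item;
	uint64_t	b_id;
};

typedef bool (*item_match_fn)(struct log_item *lip, uint64_t id);

static bool a_item_match(struct log_item *lip, uint64_t id)
{
	return container_of(lip, struct a_item, item)->a_id == id;
}

static bool b_item_match(struct log_item *lip, uint64_t id)
{
	return container_of(lip, struct b_item, item)->b_id == id;
}

/*
 * Release the first item of the given type whose id matches.
 * Returns true if an item was released.
 */
static bool release_intent(struct log_item **ail, size_t n,
			   unsigned short type, uint64_t id,
			   item_match_fn fn)
{
	for (size_t i = 0; i < n; i++) {
		struct log_item *lip = ail[i];

		/* Filter by type first; fn is only safe for its own type. */
		if (lip->type != type)
			continue;
		if (!fn(lip, id))
			continue;

		lip->released = true;
		return true;
	}
	return false;
}
```

Each item type then only has to supply a one-line match function, which
is what the four xfs_*_item_match() helpers in this patch do.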
 fs/xfs/libxfs/xfs_log_recover.h |    4 ++++
 fs/xfs/xfs_bmap_item.c          |   41 +++++++++------------------------------
 fs/xfs/xfs_extfree_item.c       |   41 +++++++++------------------------------
 fs/xfs/xfs_log_recover.c        |   33 +++++++++++++++++++++++++++++++
 fs/xfs/xfs_refcount_item.c      |   41 +++++++++------------------------------
 fs/xfs/xfs_rmap_item.c          |   41 +++++++++------------------------------
 6 files changed, 73 insertions(+), 128 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index 5017d80c0f4b..bb21d512a824 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -124,4 +124,8 @@ bool xlog_is_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
 bool xlog_put_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
 void xlog_recover_iodone(struct xfs_buf *bp);
 
+typedef bool (*xlog_item_match_fn)(struct xfs_log_item *item, uint64_t id);
+void xlog_recover_release_intent(struct xlog *log, unsigned short intent_type,
+		uint64_t intent_id, xlog_item_match_fn fn);
+
 #endif	/* __XFS_LOG_RECOVER_H__ */
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index a0fb79e1d09f..bf5997f616a4 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -662,6 +662,13 @@ xlog_recover_bmap_intent_commit_pass2(
 	return 0;
 }
 
+STATIC bool
+xfs_bui_item_match(
+	struct xfs_log_item	*lip,
+	uint64_t		intent_id)
+{
+	return BUI_ITEM(lip)->bui_format.bui_id == intent_id;
+}
 
 /*
  * This routine is called when an BUD format structure is found in a committed
@@ -678,45 +685,15 @@ xlog_recover_bmap_done_commit_pass2(
 	xfs_lsn_t			lsn)
 {
 	struct xfs_bud_log_format	*bud_formatp;
-	struct xfs_bui_log_item		*buip = NULL;
-	struct xfs_log_item		*lip;
-	uint64_t			bui_id;
-	struct xfs_ail_cursor		cur;
-	struct xfs_ail			*ailp = log->l_ailp;
 
 	bud_formatp = item->ri_buf[0].i_addr;
 	if (item->ri_buf[0].i_len != sizeof(struct xfs_bud_log_format)) {
 		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
 		return -EFSCORRUPTED;
 	}
-	bui_id = bud_formatp->bud_bui_id;
-
-	/*
-	 * Search for the BUI with the id in the BUD format structure in the
-	 * AIL.
-	 */
-	spin_lock(&ailp->ail_lock);
-	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
-	while (lip != NULL) {
-		if (lip->li_type == XFS_LI_BUI) {
-			buip = (struct xfs_bui_log_item *)lip;
-			if (buip->bui_format.bui_id == bui_id) {
-				/*
-				 * Drop the BUD reference to the BUI. This
-				 * removes the BUI from the AIL and frees it.
-				 */
-				spin_unlock(&ailp->ail_lock);
-				xfs_bui_release(buip);
-				spin_lock(&ailp->ail_lock);
-				break;
-			}
-		}
-		lip = xfs_trans_ail_cursor_next(ailp, &cur);
-	}
-
-	xfs_trans_ail_cursor_done(&cur);
-	spin_unlock(&ailp->ail_lock);
 
+	xlog_recover_release_intent(log, XFS_LI_BUI, bud_formatp->bud_bui_id,
+			 xfs_bui_item_match);
 	return 0;
 }
 
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 229c6dee0f85..57d33a5a42c5 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -724,6 +724,13 @@ xlog_recover_extfree_intent_commit_pass2(
 	return 0;
 }
 
+STATIC bool
+xfs_efi_item_match(
+	struct xfs_log_item	*lip,
+	uint64_t		intent_id)
+{
+	return EFI_ITEM(lip)->efi_format.efi_id == intent_id;
+}
 
 /*
  * This routine is called when an EFD format structure is found in a committed
@@ -739,46 +746,16 @@ xlog_recover_extfree_done_commit_pass2(
 	struct xlog_recover_item	*item,
 	xfs_lsn_t			lsn)
 {
-	struct xfs_ail_cursor		cur;
 	struct xfs_efd_log_format	*efd_formatp;
-	struct xfs_efi_log_item		*efip = NULL;
-	struct xfs_log_item		*lip;
-	struct xfs_ail			*ailp = log->l_ailp;
-	uint64_t			efi_id;
 
 	efd_formatp = item->ri_buf[0].i_addr;
 	ASSERT((item->ri_buf[0].i_len == (sizeof(xfs_efd_log_format_32_t) +
 		((efd_formatp->efd_nextents - 1) * sizeof(xfs_extent_32_t)))) ||
 	       (item->ri_buf[0].i_len == (sizeof(xfs_efd_log_format_64_t) +
 		((efd_formatp->efd_nextents - 1) * sizeof(xfs_extent_64_t)))));
-	efi_id = efd_formatp->efd_efi_id;
-
-	/*
-	 * Search for the EFI with the id in the EFD format structure in the
-	 * AIL.
-	 */
-	spin_lock(&ailp->ail_lock);
-	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
-	while (lip != NULL) {
-		if (lip->li_type == XFS_LI_EFI) {
-			efip = (struct xfs_efi_log_item *)lip;
-			if (efip->efi_format.efi_id == efi_id) {
-				/*
-				 * Drop the EFD reference to the EFI. This
-				 * removes the EFI from the AIL and frees it.
-				 */
-				spin_unlock(&ailp->ail_lock);
-				xfs_efi_release(efip);
-				spin_lock(&ailp->ail_lock);
-				break;
-			}
-		}
-		lip = xfs_trans_ail_cursor_next(ailp, &cur);
-	}
-
-	xfs_trans_ail_cursor_done(&cur);
-	spin_unlock(&ailp->ail_lock);
 
+	xlog_recover_release_intent(log, XFS_LI_EFI, efd_formatp->efd_efi_id,
+			 xfs_efi_item_match);
 	return 0;
 }
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index db4535cd74c1..853500a51762 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1779,6 +1779,39 @@ xlog_clear_stale_blocks(
 	return 0;
 }
 
+/*
+ * Release the recovered intent item in the AIL that matches the given intent
+ * type and intent id.
+ */
+void
+xlog_recover_release_intent(
+	struct xlog		*log,
+	unsigned short		intent_type,
+	uint64_t		intent_id,
+	xlog_item_match_fn	fn)
+{
+	struct xfs_ail_cursor	cur;
+	struct xfs_log_item	*lip;
+	struct xfs_ail		*ailp = log->l_ailp;
+
+	spin_lock(&ailp->ail_lock);
+	for (lip = xfs_trans_ail_cursor_first(ailp, &cur, 0); lip != NULL;
+	     lip = xfs_trans_ail_cursor_next(ailp, &cur)) {
+		if (lip->li_type != intent_type)
+			continue;
+		if (!fn(lip, intent_id))
+			continue;
+
+		spin_unlock(&ailp->ail_lock);
+		lip->li_ops->iop_release(lip);
+		spin_lock(&ailp->ail_lock);
+		break;
+	}
+
+	xfs_trans_ail_cursor_done(&cur);
+	spin_unlock(&ailp->ail_lock);
+}
+
 /******************************************************************************
  *
  *		Log recover routines
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index d291a640ce31..3c469a49e620 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -685,6 +685,13 @@ xlog_recover_refcount_intent_commit_pass2(
 	return 0;
 }
 
+STATIC bool
+xfs_cui_item_match(
+	struct xfs_log_item	*lip,
+	uint64_t		intent_id)
+{
+	return CUI_ITEM(lip)->cui_format.cui_id == intent_id;
+}
 
 /*
  * This routine is called when an CUD format structure is found in a committed
@@ -701,45 +708,15 @@ xlog_recover_refcount_done_commit_pass2(
 	xfs_lsn_t			lsn)
 {
 	struct xfs_cud_log_format	*cud_formatp;
-	struct xfs_cui_log_item		*cuip = NULL;
-	struct xfs_log_item		*lip;
-	uint64_t			cui_id;
-	struct xfs_ail_cursor		cur;
-	struct xfs_ail			*ailp = log->l_ailp;
 
 	cud_formatp = item->ri_buf[0].i_addr;
 	if (item->ri_buf[0].i_len != sizeof(struct xfs_cud_log_format)) {
 		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
 		return -EFSCORRUPTED;
 	}
-	cui_id = cud_formatp->cud_cui_id;
-
-	/*
-	 * Search for the CUI with the id in the CUD format structure in the
-	 * AIL.
-	 */
-	spin_lock(&ailp->ail_lock);
-	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
-	while (lip != NULL) {
-		if (lip->li_type == XFS_LI_CUI) {
-			cuip = (struct xfs_cui_log_item *)lip;
-			if (cuip->cui_format.cui_id == cui_id) {
-				/*
-				 * Drop the CUD reference to the CUI. This
-				 * removes the CUI from the AIL and frees it.
-				 */
-				spin_unlock(&ailp->ail_lock);
-				xfs_cui_release(cuip);
-				spin_lock(&ailp->ail_lock);
-				break;
-			}
-		}
-		lip = xfs_trans_ail_cursor_next(ailp, &cur);
-	}
-
-	xfs_trans_ail_cursor_done(&cur);
-	spin_unlock(&ailp->ail_lock);
 
+	xlog_recover_release_intent(log, XFS_LI_CUI, cud_formatp->cud_cui_id,
+			 xfs_cui_item_match);
 	return 0;
 }
 
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 53348d48bf67..1de5c20cb624 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -677,6 +677,13 @@ xlog_recover_rmap_intent_commit_pass2(
 	return 0;
 }
 
+STATIC bool
+xfs_rui_item_match(
+	struct xfs_log_item	*lip,
+	uint64_t		intent_id)
+{
+	return RUI_ITEM(lip)->rui_format.rui_id == intent_id;
+}
 
 /*
  * This routine is called when an RUD format structure is found in a committed
@@ -693,42 +700,12 @@ xlog_recover_rmap_done_commit_pass2(
 	xfs_lsn_t			lsn)
 {
 	struct xfs_rud_log_format	*rud_formatp;
-	struct xfs_rui_log_item		*ruip = NULL;
-	struct xfs_log_item		*lip;
-	uint64_t			rui_id;
-	struct xfs_ail_cursor		cur;
-	struct xfs_ail			*ailp = log->l_ailp;
 
 	rud_formatp = item->ri_buf[0].i_addr;
 	ASSERT(item->ri_buf[0].i_len == sizeof(struct xfs_rud_log_format));
-	rui_id = rud_formatp->rud_rui_id;
-
-	/*
-	 * Search for the RUI with the id in the RUD format structure in the
-	 * AIL.
-	 */
-	spin_lock(&ailp->ail_lock);
-	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
-	while (lip != NULL) {
-		if (lip->li_type == XFS_LI_RUI) {
-			ruip = (struct xfs_rui_log_item *)lip;
-			if (ruip->rui_format.rui_id == rui_id) {
-				/*
-				 * Drop the RUD reference to the RUI. This
-				 * removes the RUI from the AIL and frees it.
-				 */
-				spin_unlock(&ailp->ail_lock);
-				xfs_rui_release(ruip);
-				spin_lock(&ailp->ail_lock);
-				break;
-			}
-		}
-		lip = xfs_trans_ail_cursor_next(ailp, &cur);
-	}
-
-	xfs_trans_ail_cursor_done(&cur);
-	spin_unlock(&ailp->ail_lock);
 
+	xlog_recover_release_intent(log, XFS_LI_RUI, rud_formatp->rud_rui_id,
+			 xfs_rui_item_match);
 	return 0;
 }
 


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 18/21] xfs: refactor adding recovered intent items to the log
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (16 preceding siblings ...)
  2020-04-30  0:49 ` [PATCH 17/21] xfs: refactor releasing finished intents during log recovery Darrick J. Wong
@ 2020-04-30  0:49 ` Darrick J. Wong
  2020-04-30  0:49 ` [PATCH 19/21] xfs: refactor intent item RECOVERED flag into the log item Darrick J. Wong
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, Christoph Hellwig

From: Darrick J. Wong <darrick.wong@oracle.com>

During recovery, every intent that we recover from the log has to be
added to the AIL.  Replace the open-coded addition with a helper.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
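A note for readers, not part of the commit: the two-reference lifetime
that the helper's comment describes can be modelled in a few lines.
This is a toy sketch with hypothetical names (struct intent,
intent_release, ...).  It uses plain integers rather than atomics and
booleans in place of the real AIL, so it only illustrates the ordering.

```c
#include <assert.h>
#include <stdbool.h>

struct intent {
	int	refcount;
	bool	in_ail;
	bool	freed;
};

/* A recovered intent starts with two references. */
static void intent_init(struct intent *ip)
{
	ip->refcount = 2;	/* one for the intent, one for the done item */
	ip->in_ail = false;
	ip->freed = false;
}

/* Stand-in for inserting the item into the AIL. */
static void insert_ail(struct intent *ip)
{
	ip->in_ail = true;
}

/* Drop one reference; the last one pulls the item from the AIL. */
static void intent_release(struct intent *ip)
{
	assert(ip->refcount > 0);
	if (--ip->refcount > 0)
		return;
	ip->in_ail = false;
	ip->freed = true;
}
```

After insert_ail() and the first intent_release(), the item stays in the
AIL with one reference left; the second release, on behalf of the done
item, removes and frees it.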
 fs/xfs/libxfs/xfs_log_recover.h |    2 ++
 fs/xfs/xfs_bmap_item.c          |   10 +---------
 fs/xfs/xfs_extfree_item.c       |   10 +---------
 fs/xfs/xfs_log_recover.c        |   17 +++++++++++++++++
 fs/xfs/xfs_refcount_item.c      |   10 +---------
 fs/xfs/xfs_rmap_item.c          |   10 +---------
 6 files changed, 23 insertions(+), 36 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index bb21d512a824..ba172eb454c8 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -127,5 +127,7 @@ void xlog_recover_iodone(struct xfs_buf *bp);
 typedef bool (*xlog_item_match_fn)(struct xfs_log_item *item, uint64_t id);
 void xlog_recover_release_intent(struct xlog *log, unsigned short intent_type,
 		uint64_t intent_id, xlog_item_match_fn fn);
+void xlog_recover_insert_ail(struct xlog *log, struct xfs_log_item *lip,
+		xfs_lsn_t lsn);
 
 #endif	/* __XFS_LOG_RECOVER_H__ */
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index bf5997f616a4..e38efdb0ed71 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -649,15 +649,7 @@ xlog_recover_bmap_intent_commit_pass2(
 		return error;
 	}
 	atomic_set(&buip->bui_next_extent, bui_formatp->bui_nextents);
-
-	spin_lock(&log->l_ailp->ail_lock);
-	/*
-	 * The RUI has two references. One for the RUD and one for RUI to ensure
-	 * it makes it into the AIL. Insert the RUI into the AIL directly and
-	 * drop the RUI reference. Note that xfs_trans_ail_update() drops the
-	 * AIL lock.
-	 */
-	xfs_trans_ail_update(log->l_ailp, &buip->bui_item, lsn);
+	xlog_recover_insert_ail(log, &buip->bui_item, lsn);
 	xfs_bui_release(buip);
 	return 0;
 }
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 57d33a5a42c5..9264ec0817cc 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -711,15 +711,7 @@ xlog_recover_extfree_intent_commit_pass2(
 		return error;
 	}
 	atomic_set(&efip->efi_next_extent, efi_formatp->efi_nextents);
-
-	spin_lock(&log->l_ailp->ail_lock);
-	/*
-	 * The EFI has two references. One for the EFD and one for EFI to ensure
-	 * it makes it into the AIL. Insert the EFI into the AIL directly and
-	 * drop the EFI reference. Note that xfs_trans_ail_update() drops the
-	 * AIL lock.
-	 */
-	xfs_trans_ail_update(log->l_ailp, &efip->efi_item, lsn);
+	xlog_recover_insert_ail(log, &efip->efi_item, lsn);
 	xfs_efi_release(efip);
 	return 0;
 }
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 853500a51762..527c74fa5cc3 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1812,6 +1812,23 @@ xlog_recover_release_intent(
 	spin_unlock(&ailp->ail_lock);
 }
 
+/* Insert a recovered intent item into the AIL. */
+void
+xlog_recover_insert_ail(
+	struct xlog		*log,
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+	/*
+	 * The intent has two references. One for the done item and one for the
+	 * intent to ensure it makes it into the AIL. Insert the intent into
+	 * the AIL directly; the caller then drops the intent reference. Note
+	 * that xfs_trans_ail_update() drops the AIL lock.
+	 */
+	spin_lock(&log->l_ailp->ail_lock);
+	xfs_trans_ail_update(log->l_ailp, lip, lsn);
+}
+
 /******************************************************************************
  *
  *		Log recover routines
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 3c469a49e620..408d9312035a 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -672,15 +672,7 @@ xlog_recover_refcount_intent_commit_pass2(
 		return error;
 	}
 	atomic_set(&cuip->cui_next_extent, cui_formatp->cui_nextents);
-
-	spin_lock(&log->l_ailp->ail_lock);
-	/*
-	 * The CUI has two references. One for the CUD and one for CUI to ensure
-	 * it makes it into the AIL. Insert the CUI into the AIL directly and
-	 * drop the CUI reference. Note that xfs_trans_ail_update() drops the
-	 * AIL lock.
-	 */
-	xfs_trans_ail_update(log->l_ailp, &cuip->cui_item, lsn);
+	xlog_recover_insert_ail(log, &cuip->cui_item, lsn);
 	xfs_cui_release(cuip);
 	return 0;
 }
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 1de5c20cb624..ffb06960b12c 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -664,15 +664,7 @@ xlog_recover_rmap_intent_commit_pass2(
 		return error;
 	}
 	atomic_set(&ruip->rui_next_extent, rui_formatp->rui_nextents);
-
-	spin_lock(&log->l_ailp->ail_lock);
-	/*
-	 * The RUI has two references. One for the RUD and one for RUI to ensure
-	 * it makes it into the AIL. Insert the RUI into the AIL directly and
-	 * drop the RUI reference. Note that xfs_trans_ail_update() drops the
-	 * AIL lock.
-	 */
-	xfs_trans_ail_update(log->l_ailp, &ruip->rui_item, lsn);
+	xlog_recover_insert_ail(log, &ruip->rui_item, lsn);
 	xfs_rui_release(ruip);
 	return 0;
 }


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH 19/21] xfs: refactor intent item RECOVERED flag into the log item
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (17 preceding siblings ...)
  2020-04-30  0:49 ` [PATCH 18/21] xfs: refactor adding recovered intent items to the log Darrick J. Wong
@ 2020-04-30  0:49 ` Darrick J. Wong
  2020-04-30  0:49 ` [PATCH 20/21] xfs: refactor intent item iop_recover calls Darrick J. Wong
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Rename XFS_{EFI,BUI,RUI,CUI}_RECOVERED to XFS_LI_RECOVERED so that we
track recovery status in the log item, then get rid of the now-unused
flags fields in each of those log item types.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
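A note for readers, not part of the commit: the effect of moving the
flag can be sketched as below.  The structures are hypothetical cut-down
versions, and li_test_bit()/li_set_bit() are simple non-atomic stand-ins
for the kernel's test_bit()/set_bit().  The bit number 4 mirrors the
XFS_LI_RECOVERED value added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

#define LI_RECOVERED	4	/* bit number, not a mask */

/* Common base embedded in every item type; it owns the flags word. */
struct log_item {
	unsigned long	li_flags;
};

/* Item types no longer carry a private flags field. */
struct bui_item {
	struct log_item	bui_item;
};

struct efi_item {
	struct log_item	efi_item;
};

static bool li_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

static void li_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

/* One type-independent test and setter replace four per-type copies. */
static bool item_recovered(const struct log_item *lip)
{
	return li_test_bit(LI_RECOVERED, &lip->li_flags);
}

static void item_mark_recovered(struct log_item *lip)
{
	li_set_bit(LI_RECOVERED, &lip->li_flags);
}
```

Because the test now takes only the base log item, code that does not
know the concrete item type can ask whether an item has been recovered.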
 fs/xfs/xfs_bmap_item.c     |   10 +++++-----
 fs/xfs/xfs_bmap_item.h     |    6 ------
 fs/xfs/xfs_extfree_item.c  |    8 ++++----
 fs/xfs/xfs_extfree_item.h  |    6 ------
 fs/xfs/xfs_refcount_item.c |    8 ++++----
 fs/xfs/xfs_refcount_item.h |    6 ------
 fs/xfs/xfs_rmap_item.c     |    8 ++++----
 fs/xfs/xfs_rmap_item.h     |    6 ------
 fs/xfs/xfs_trans.h         |    4 +++-
 9 files changed, 20 insertions(+), 42 deletions(-)


diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index e38efdb0ed71..5b99ffb70659 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -141,7 +141,7 @@ xfs_bui_item_recover(
 	/*
 	 * Skip BUIs that we've already processed.
 	 */
-	if (test_bit(XFS_BUI_RECOVERED, &buip->bui_flags))
+	if (test_bit(XFS_LI_RECOVERED, &buip->bui_item.li_flags))
 		return 0;
 
 	spin_unlock(&ailp->ail_lock);
@@ -479,11 +479,11 @@ xfs_bui_recover(
 	struct xfs_bmbt_irec		irec;
 	struct xfs_mount		*mp = parent_tp->t_mountp;
 
-	ASSERT(!test_bit(XFS_BUI_RECOVERED, &buip->bui_flags));
+	ASSERT(!test_bit(XFS_LI_RECOVERED, &buip->bui_item.li_flags));
 
 	/* Only one mapping operation per BUI... */
 	if (buip->bui_format.bui_nextents != XFS_BUI_MAX_FAST_EXTENTS) {
-		set_bit(XFS_BUI_RECOVERED, &buip->bui_flags);
+		set_bit(XFS_LI_RECOVERED, &buip->bui_item.li_flags);
 		xfs_bui_release(buip);
 		return -EFSCORRUPTED;
 	}
@@ -517,7 +517,7 @@ xfs_bui_recover(
 		 * This will pull the BUI from the AIL and
 		 * free the memory associated with it.
 		 */
-		set_bit(XFS_BUI_RECOVERED, &buip->bui_flags);
+		set_bit(XFS_LI_RECOVERED, &buip->bui_item.li_flags);
 		xfs_bui_release(buip);
 		return -EFSCORRUPTED;
 	}
@@ -575,7 +575,7 @@ xfs_bui_recover(
 		xfs_bmap_unmap_extent(tp, ip, &irec);
 	}
 
-	set_bit(XFS_BUI_RECOVERED, &buip->bui_flags);
+	set_bit(XFS_LI_RECOVERED, &buip->bui_item.li_flags);
 	xfs_defer_move(parent_tp, tp);
 	error = xfs_trans_commit(tp);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
diff --git a/fs/xfs/xfs_bmap_item.h b/fs/xfs/xfs_bmap_item.h
index 44d06e62f8f9..b9be62f8bd52 100644
--- a/fs/xfs/xfs_bmap_item.h
+++ b/fs/xfs/xfs_bmap_item.h
@@ -32,11 +32,6 @@ struct kmem_zone;
  */
 #define	XFS_BUI_MAX_FAST_EXTENTS	1
 
-/*
- * Define BUI flag bits. Manipulated by set/clear/test_bit operators.
- */
-#define	XFS_BUI_RECOVERED		1
-
 /*
  * This is the "bmap update intent" log item.  It is used to log the fact that
  * some reverse mappings need to change.  It is used in conjunction with the
@@ -49,7 +44,6 @@ struct xfs_bui_log_item {
 	struct xfs_log_item		bui_item;
 	atomic_t			bui_refcount;
 	atomic_t			bui_next_extent;
-	unsigned long			bui_flags;	/* misc flags */
 	struct xfs_bui_log_format	bui_format;
 };
 
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 9264ec0817cc..6b5c2b263e5b 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -157,7 +157,7 @@ xfs_efi_item_recover(
 	 * Skip EFIs that we've already processed.
 	 */
 	efip = container_of(lip, struct xfs_efi_log_item, efi_item);
-	if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags))
+	if (test_bit(XFS_LI_RECOVERED, &efip->efi_item.li_flags))
 		return 0;
 
 	spin_unlock(&ailp->ail_lock);
@@ -634,7 +634,7 @@ xfs_efi_recover(
 	xfs_extent_t		*extp;
 	xfs_fsblock_t		startblock_fsb;
 
-	ASSERT(!test_bit(XFS_EFI_RECOVERED, &efip->efi_flags));
+	ASSERT(!test_bit(XFS_LI_RECOVERED, &efip->efi_item.li_flags));
 
 	/*
 	 * First check the validity of the extents described by the
@@ -653,7 +653,7 @@ xfs_efi_recover(
 			 * This will pull the EFI from the AIL and
 			 * free the memory associated with it.
 			 */
-			set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
+			set_bit(XFS_LI_RECOVERED, &efip->efi_item.li_flags);
 			xfs_efi_release(efip);
 			return -EFSCORRUPTED;
 		}
@@ -674,7 +674,7 @@ xfs_efi_recover(
 
 	}
 
-	set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
+	set_bit(XFS_LI_RECOVERED, &efip->efi_item.li_flags);
 	error = xfs_trans_commit(tp);
 	return error;
 
diff --git a/fs/xfs/xfs_extfree_item.h b/fs/xfs/xfs_extfree_item.h
index 23e3758b5dbb..30a9c7069233 100644
--- a/fs/xfs/xfs_extfree_item.h
+++ b/fs/xfs/xfs_extfree_item.h
@@ -16,11 +16,6 @@ struct kmem_zone;
  */
 #define	XFS_EFI_MAX_FAST_EXTENTS	16
 
-/*
- * Define EFI flag bits. Manipulated by set/clear/test_bit operators.
- */
-#define	XFS_EFI_RECOVERED	1
-
 /*
  * This is the "extent free intention" log item.  It is used to log the fact
  * that some extents need to be free.  It is used in conjunction with the
@@ -54,7 +49,6 @@ typedef struct xfs_efi_log_item {
 	struct xfs_log_item	efi_item;
 	atomic_t		efi_refcount;
 	atomic_t		efi_next_extent;
-	unsigned long		efi_flags;	/* misc flags */
 	xfs_efi_log_format_t	efi_format;
 } xfs_efi_log_item_t;
 
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 408d9312035a..493f1d16a6ce 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -140,7 +140,7 @@ xfs_cui_item_recover(
 	/*
 	 * Skip CUIs that we've already processed.
 	 */
-	if (test_bit(XFS_CUI_RECOVERED, &cuip->cui_flags))
+	if (test_bit(XFS_LI_RECOVERED, &cuip->cui_item.li_flags))
 		return 0;
 
 	spin_unlock(&ailp->ail_lock);
@@ -493,7 +493,7 @@ xfs_cui_recover(
 	bool				requeue_only = false;
 	struct xfs_mount		*mp = parent_tp->t_mountp;
 
-	ASSERT(!test_bit(XFS_CUI_RECOVERED, &cuip->cui_flags));
+	ASSERT(!test_bit(XFS_LI_RECOVERED, &cuip->cui_item.li_flags));
 
 	/*
 	 * First check the validity of the extents described by the
@@ -524,7 +524,7 @@ xfs_cui_recover(
 			 * This will pull the CUI from the AIL and
 			 * free the memory associated with it.
 			 */
-			set_bit(XFS_CUI_RECOVERED, &cuip->cui_flags);
+			set_bit(XFS_LI_RECOVERED, &cuip->cui_item.li_flags);
 			xfs_cui_release(cuip);
 			return -EFSCORRUPTED;
 		}
@@ -608,7 +608,7 @@ xfs_cui_recover(
 	}
 
 	xfs_refcount_finish_one_cleanup(tp, rcur, error);
-	set_bit(XFS_CUI_RECOVERED, &cuip->cui_flags);
+	set_bit(XFS_LI_RECOVERED, &cuip->cui_item.li_flags);
 	xfs_defer_move(parent_tp, tp);
 	error = xfs_trans_commit(tp);
 	return error;
diff --git a/fs/xfs/xfs_refcount_item.h b/fs/xfs/xfs_refcount_item.h
index cfaa857673a6..f4f2e836540b 100644
--- a/fs/xfs/xfs_refcount_item.h
+++ b/fs/xfs/xfs_refcount_item.h
@@ -32,11 +32,6 @@ struct kmem_zone;
  */
 #define	XFS_CUI_MAX_FAST_EXTENTS	16
 
-/*
- * Define CUI flag bits. Manipulated by set/clear/test_bit operators.
- */
-#define	XFS_CUI_RECOVERED		1
-
 /*
  * This is the "refcount update intent" log item.  It is used to log
  * the fact that some reverse mappings need to change.  It is used in
@@ -51,7 +46,6 @@ struct xfs_cui_log_item {
 	struct xfs_log_item		cui_item;
 	atomic_t			cui_refcount;
 	atomic_t			cui_next_extent;
-	unsigned long			cui_flags;	/* misc flags */
 	struct xfs_cui_log_format	cui_format;
 };
 
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index ffb06960b12c..69be891e85e8 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -139,7 +139,7 @@ xfs_rui_item_recover(
 	/*
 	 * Skip RUIs that we've already processed.
 	 */
-	if (test_bit(XFS_RUI_RECOVERED, &ruip->rui_flags))
+	if (test_bit(XFS_LI_RECOVERED, &ruip->rui_item.li_flags))
 		return 0;
 
 	spin_unlock(&ailp->ail_lock);
@@ -533,7 +533,7 @@ xfs_rui_recover(
 	struct xfs_trans		*tp;
 	struct xfs_btree_cur		*rcur = NULL;
 
-	ASSERT(!test_bit(XFS_RUI_RECOVERED, &ruip->rui_flags));
+	ASSERT(!test_bit(XFS_LI_RECOVERED, &ruip->rui_item.li_flags));
 
 	/*
 	 * First check the validity of the extents described by the
@@ -568,7 +568,7 @@ xfs_rui_recover(
 			 * This will pull the RUI from the AIL and
 			 * free the memory associated with it.
 			 */
-			set_bit(XFS_RUI_RECOVERED, &ruip->rui_flags);
+			set_bit(XFS_LI_RECOVERED, &ruip->rui_item.li_flags);
 			xfs_rui_release(ruip);
 			return -EFSCORRUPTED;
 		}
@@ -626,7 +626,7 @@ xfs_rui_recover(
 	}
 
 	xfs_rmap_finish_one_cleanup(tp, rcur, error);
-	set_bit(XFS_RUI_RECOVERED, &ruip->rui_flags);
+	set_bit(XFS_LI_RECOVERED, &ruip->rui_item.li_flags);
 	error = xfs_trans_commit(tp);
 	return error;
 
diff --git a/fs/xfs/xfs_rmap_item.h b/fs/xfs/xfs_rmap_item.h
index 48a77a6f5c94..31e6cdfff71f 100644
--- a/fs/xfs/xfs_rmap_item.h
+++ b/fs/xfs/xfs_rmap_item.h
@@ -35,11 +35,6 @@ struct kmem_zone;
  */
 #define	XFS_RUI_MAX_FAST_EXTENTS	16
 
-/*
- * Define RUI flag bits. Manipulated by set/clear/test_bit operators.
- */
-#define	XFS_RUI_RECOVERED		1
-
 /*
  * This is the "rmap update intent" log item.  It is used to log the fact that
  * some reverse mappings need to change.  It is used in conjunction with the
@@ -52,7 +47,6 @@ struct xfs_rui_log_item {
 	struct xfs_log_item		rui_item;
 	atomic_t			rui_refcount;
 	atomic_t			rui_next_extent;
-	unsigned long			rui_flags;	/* misc flags */
 	struct xfs_rui_log_format	rui_format;
 };
 
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 9ac5483d2187..5bdbae090f5d 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -59,12 +59,14 @@ struct xfs_log_item {
 #define	XFS_LI_ABORTED	1
 #define	XFS_LI_FAILED	2
 #define	XFS_LI_DIRTY	3	/* log item dirty in transaction */
+#define	XFS_LI_RECOVERED 4	/* log intent item has been recovered */
 
 #define XFS_LI_FLAGS \
 	{ (1 << XFS_LI_IN_AIL),		"IN_AIL" }, \
 	{ (1 << XFS_LI_ABORTED),	"ABORTED" }, \
 	{ (1 << XFS_LI_FAILED),		"FAILED" }, \
-	{ (1 << XFS_LI_DIRTY),		"DIRTY" }
+	{ (1 << XFS_LI_DIRTY),		"DIRTY" }, \
+	{ (1 << XFS_LI_RECOVERED),	"RECOVERED" }
 
 struct xfs_item_ops {
 	unsigned flags;



* [PATCH 20/21] xfs: refactor intent item iop_recover calls
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (18 preceding siblings ...)
  2020-04-30  0:49 ` [PATCH 19/21] xfs: refactor intent item RECOVERED flag into the log item Darrick J. Wong
@ 2020-04-30  0:49 ` Darrick J. Wong
  2020-04-30  0:49 ` [PATCH 21/21] xfs: remove unnecessary includes from xfs_log_recover.c Darrick J. Wong
  2020-05-01 10:15 ` [PATCH v2 00/21] xfs: refactor log recovery Christoph Hellwig
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Now that we've made the recovered item tests all the same, we can hoist
the test and the ail locking code to the ->iop_recover caller and call
the recovery function directly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_item.c     |   50 ++++++++++++--------------------------------
 fs/xfs/xfs_extfree_item.c  |   46 +++++++++++-----------------------------
 fs/xfs/xfs_log_recover.c   |    8 +++++--
 fs/xfs/xfs_refcount_item.c |   48 +++++++++++-------------------------------
 fs/xfs/xfs_rmap_item.c     |   47 +++++++++++------------------------------
 5 files changed, 58 insertions(+), 141 deletions(-)
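
As a rough illustration of the hoisting described in the commit message,
here is a small userspace sketch in C.  Everything prefixed demo_ or DEMO_
is a hypothetical stand-in rather than a real XFS symbol, and the AIL
spinlock is modelled as a plain boolean.  It only shows the shape of the
change: the caller tests the recovered flag and drops the lock around the
callback, so the per-item recovery functions no longer need a wrapper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real XFS definitions. */
#define DEMO_LI_RECOVERED	(1UL << 4)

struct demo_item;

struct demo_item_ops {
	int (*recover)(struct demo_item *item);
};

struct demo_item {
	unsigned long			flags;
	const struct demo_item_ops	*ops;
	int				recover_calls;
};

struct demo_ail {
	bool	locked;		/* models the AIL lock being held */
};

/* Walk the items with the "lock" held, dropping it around each ->recover. */
static int
demo_process_intents(
	struct demo_ail		*ail,
	struct demo_item	**items,
	size_t			nr)
{
	int			error = 0;
	size_t			i;

	ail->locked = true;
	for (i = 0; i < nr; i++) {
		struct demo_item	*item = items[i];

		/* The test that used to live in every per-item wrapper. */
		if (!(item->flags & DEMO_LI_RECOVERED)) {
			ail->locked = false;
			error = item->ops->recover(item);
			ail->locked = true;
		}
		if (error)
			break;
	}
	ail->locked = false;
	return error;
}

/* Example callback: count the call and mark the item recovered. */
static int
demo_recover(
	struct demo_item	*item)
{
	item->recover_calls++;
	item->flags |= DEMO_LI_RECOVERED;
	return 0;
}
```

Note that error has to start at zero in the caller once the callback
becomes conditional, which matches the "int error = 0" change in the
xfs_log_recover.c hunk below.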


diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 5b99ffb70659..58f0904e4504 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -28,7 +28,7 @@
 kmem_zone_t	*xfs_bui_zone;
 kmem_zone_t	*xfs_bud_zone;
 
-STATIC int xfs_bui_recover(struct xfs_trans *tp, struct xfs_bui_log_item *buip);
+STATIC int xfs_bui_item_recover(struct xfs_log_item *lip, struct xfs_trans *tp);
 
 static inline struct xfs_bui_log_item *BUI_ITEM(struct xfs_log_item *lip)
 {
@@ -128,29 +128,6 @@ xfs_bui_item_release(
 	xfs_bui_release(BUI_ITEM(lip));
 }
 
-/* Recover the BUI if necessary. */
-STATIC int
-xfs_bui_item_recover(
-	struct xfs_log_item		*lip,
-	struct xfs_trans		*tp)
-{
-	struct xfs_ail			*ailp = lip->li_ailp;
-	struct xfs_bui_log_item		*buip = BUI_ITEM(lip);
-	int				error;
-
-	/*
-	 * Skip BUIs that we've already processed.
-	 */
-	if (test_bit(XFS_LI_RECOVERED, &buip->bui_item.li_flags))
-		return 0;
-
-	spin_unlock(&ailp->ail_lock);
-	error = xfs_bui_recover(tp, buip);
-	spin_lock(&ailp->ail_lock);
-
-	return error;
-}
-
 static const struct xfs_item_ops xfs_bui_item_ops = {
 	.flags		= XFS_ITEM_TYPE_IS_INTENT,
 	.iop_size	= xfs_bui_item_size,
@@ -459,25 +436,26 @@ const struct xfs_defer_op_type xfs_bmap_update_defer_type = {
  * We need to update some inode's bmbt.
  */
 STATIC int
-xfs_bui_recover(
-	struct xfs_trans		*parent_tp,
-	struct xfs_bui_log_item		*buip)
+xfs_bui_item_recover(
+	struct xfs_log_item		*lip,
+	struct xfs_trans		*parent_tp)
 {
-	int				error = 0;
-	unsigned int			bui_type;
+	struct xfs_bmbt_irec		irec;
+	struct xfs_bui_log_item		*buip = BUI_ITEM(lip);
+	struct xfs_trans		*tp;
+	struct xfs_inode		*ip = NULL;
+	struct xfs_mount		*mp = parent_tp->t_mountp;
 	struct xfs_map_extent		*bmap;
+	struct xfs_bud_log_item		*budp;
 	xfs_fsblock_t			startblock_fsb;
 	xfs_fsblock_t			inode_fsb;
 	xfs_filblks_t			count;
-	bool				op_ok;
-	struct xfs_bud_log_item		*budp;
+	xfs_exntst_t			state;
 	enum xfs_bmap_intent_type	type;
+	bool				op_ok;
+	unsigned int			bui_type;
 	int				whichfork;
-	xfs_exntst_t			state;
-	struct xfs_trans		*tp;
-	struct xfs_inode		*ip = NULL;
-	struct xfs_bmbt_irec		irec;
-	struct xfs_mount		*mp = parent_tp->t_mountp;
+	int				error = 0;
 
 	ASSERT(!test_bit(XFS_LI_RECOVERED, &buip->bui_item.li_flags));
 
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 6b5c2b263e5b..d6f2c88570de 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -28,7 +28,7 @@
 kmem_zone_t	*xfs_efi_zone;
 kmem_zone_t	*xfs_efd_zone;
 
-STATIC int xfs_efi_recover(struct xfs_mount *mp, struct xfs_efi_log_item *efip);
+STATIC int xfs_efi_item_recover(struct xfs_log_item *lip, struct xfs_trans *tp);
 
 static inline struct xfs_efi_log_item *EFI_ITEM(struct xfs_log_item *lip)
 {
@@ -143,30 +143,6 @@ xfs_efi_item_release(
 	xfs_efi_release(EFI_ITEM(lip));
 }
 
-/* Recover the EFI if necessary. */
-STATIC int
-xfs_efi_item_recover(
-	struct xfs_log_item		*lip,
-	struct xfs_trans		*tp)
-{
-	struct xfs_ail			*ailp = lip->li_ailp;
-	struct xfs_efi_log_item		*efip;
-	int				error;
-
-	/*
-	 * Skip EFIs that we've already processed.
-	 */
-	efip = container_of(lip, struct xfs_efi_log_item, efi_item);
-	if (test_bit(XFS_LI_RECOVERED, &efip->efi_item.li_flags))
-		return 0;
-
-	spin_unlock(&ailp->ail_lock);
-	error = xfs_efi_recover(tp->t_mountp, efip);
-	spin_lock(&ailp->ail_lock);
-
-	return error;
-}
-
 static const struct xfs_item_ops xfs_efi_item_ops = {
 	.flags		= XFS_ITEM_TYPE_IS_INTENT,
 	.iop_size	= xfs_efi_item_size,
@@ -623,16 +599,18 @@ const struct xfs_defer_op_type xfs_agfl_free_defer_type = {
  * the log.  We need to free the extents that it describes.
  */
 STATIC int
-xfs_efi_recover(
-	struct xfs_mount	*mp,
-	struct xfs_efi_log_item	*efip)
+xfs_efi_item_recover(
+	struct xfs_log_item		*lip,
+	struct xfs_trans		*parent_tp)
 {
-	struct xfs_efd_log_item	*efdp;
-	struct xfs_trans	*tp;
-	int			i;
-	int			error = 0;
-	xfs_extent_t		*extp;
-	xfs_fsblock_t		startblock_fsb;
+	struct xfs_efi_log_item		*efip = EFI_ITEM(lip);
+	struct xfs_mount		*mp = parent_tp->t_mountp;
+	struct xfs_efd_log_item		*efdp;
+	struct xfs_trans		*tp;
+	struct xfs_extent		*extp;
+	xfs_fsblock_t			startblock_fsb;
+	int				i;
+	int				error = 0;
 
 	ASSERT(!test_bit(XFS_LI_RECOVERED, &efip->efi_item.li_flags));
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 527c74fa5cc3..277e83e74344 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2707,7 +2707,7 @@ xlog_recover_process_intents(
 	struct xfs_ail_cursor	cur;
 	struct xfs_log_item	*lip;
 	struct xfs_ail		*ailp;
-	int			error;
+	int			error = 0;
 #if defined(DEBUG) || defined(XFS_WARN)
 	xfs_lsn_t		last_lsn;
 #endif
@@ -2758,7 +2758,11 @@ xlog_recover_process_intents(
 		 * this routine or else those subsequent intents will get
 		 * replayed in the wrong order!
 		 */
-		error = lip->li_ops->iop_recover(lip, parent_tp);
+		if (!test_bit(XFS_LI_RECOVERED, &lip->li_flags)) {
+			spin_unlock(&ailp->ail_lock);
+			error = lip->li_ops->iop_recover(lip, parent_tp);
+			spin_lock(&ailp->ail_lock);
+		}
 		if (error)
 			goto out;
 		lip = xfs_trans_ail_cursor_next(ailp, &cur);
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 493f1d16a6ce..53a79dc618f7 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -24,7 +24,7 @@
 kmem_zone_t	*xfs_cui_zone;
 kmem_zone_t	*xfs_cud_zone;
 
-STATIC int xfs_cui_recover(struct xfs_trans *tp, struct xfs_cui_log_item *cuip);
+STATIC int xfs_cui_item_recover(struct xfs_log_item *lip, struct xfs_trans *tp);
 
 static inline struct xfs_cui_log_item *CUI_ITEM(struct xfs_log_item *lip)
 {
@@ -127,29 +127,6 @@ xfs_cui_item_release(
 	xfs_cui_release(CUI_ITEM(lip));
 }
 
-/* Recover the CUI if necessary. */
-STATIC int
-xfs_cui_item_recover(
-	struct xfs_log_item		*lip,
-	struct xfs_trans		*tp)
-{
-	struct xfs_ail			*ailp = lip->li_ailp;
-	struct xfs_cui_log_item		*cuip = CUI_ITEM(lip);
-	int				error;
-
-	/*
-	 * Skip CUIs that we've already processed.
-	 */
-	if (test_bit(XFS_LI_RECOVERED, &cuip->cui_item.li_flags))
-		return 0;
-
-	spin_unlock(&ailp->ail_lock);
-	error = xfs_cui_recover(tp, cuip);
-	spin_lock(&ailp->ail_lock);
-
-	return error;
-}
-
 static const struct xfs_item_ops xfs_cui_item_ops = {
 	.flags		= XFS_ITEM_TYPE_IS_INTENT,
 	.iop_size	= xfs_cui_item_size,
@@ -473,25 +450,26 @@ const struct xfs_defer_op_type xfs_refcount_update_defer_type = {
  * We need to update the refcountbt.
  */
 STATIC int
-xfs_cui_recover(
-	struct xfs_trans		*parent_tp,
-	struct xfs_cui_log_item		*cuip)
+xfs_cui_item_recover(
+	struct xfs_log_item		*lip,
+	struct xfs_trans		*parent_tp)
 {
-	int				i;
-	int				error = 0;
-	unsigned int			refc_type;
+	struct xfs_bmbt_irec		irec;
+	struct xfs_cui_log_item		*cuip = CUI_ITEM(lip);
 	struct xfs_phys_extent		*refc;
-	xfs_fsblock_t			startblock_fsb;
-	bool				op_ok;
 	struct xfs_cud_log_item		*cudp;
 	struct xfs_trans		*tp;
 	struct xfs_btree_cur		*rcur = NULL;
-	enum xfs_refcount_intent_type	type;
+	struct xfs_mount		*mp = parent_tp->t_mountp;
+	xfs_fsblock_t			startblock_fsb;
 	xfs_fsblock_t			new_fsb;
 	xfs_extlen_t			new_len;
-	struct xfs_bmbt_irec		irec;
+	unsigned int			refc_type;
+	bool				op_ok;
 	bool				requeue_only = false;
-	struct xfs_mount		*mp = parent_tp->t_mountp;
+	enum xfs_refcount_intent_type	type;
+	int				i;
+	int				error = 0;
 
 	ASSERT(!test_bit(XFS_LI_RECOVERED, &cuip->cui_item.li_flags));
 
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 69be891e85e8..cee5c6155032 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -24,7 +24,7 @@
 kmem_zone_t	*xfs_rui_zone;
 kmem_zone_t	*xfs_rud_zone;
 
-STATIC int xfs_rui_recover(struct xfs_mount *mp, struct xfs_rui_log_item *ruip);
+STATIC int xfs_rui_item_recover(struct xfs_log_item *lip, struct xfs_trans *tp);
 
 static inline struct xfs_rui_log_item *RUI_ITEM(struct xfs_log_item *lip)
 {
@@ -126,29 +126,6 @@ xfs_rui_item_release(
 	xfs_rui_release(RUI_ITEM(lip));
 }
 
-/* Recover the RUI if necessary. */
-STATIC int
-xfs_rui_item_recover(
-	struct xfs_log_item		*lip,
-	struct xfs_trans		*tp)
-{
-	struct xfs_ail			*ailp = lip->li_ailp;
-	struct xfs_rui_log_item		*ruip = RUI_ITEM(lip);
-	int				error;
-
-	/*
-	 * Skip RUIs that we've already processed.
-	 */
-	if (test_bit(XFS_LI_RECOVERED, &ruip->rui_item.li_flags))
-		return 0;
-
-	spin_unlock(&ailp->ail_lock);
-	error = xfs_rui_recover(tp->t_mountp, ruip);
-	spin_lock(&ailp->ail_lock);
-
-	return error;
-}
-
 static const struct xfs_item_ops xfs_rui_item_ops = {
 	.flags		= XFS_ITEM_TYPE_IS_INTENT,
 	.iop_size	= xfs_rui_item_size,
@@ -517,21 +494,23 @@ const struct xfs_defer_op_type xfs_rmap_update_defer_type = {
  * We need to update the rmapbt.
  */
 STATIC int
-xfs_rui_recover(
-	struct xfs_mount		*mp,
-	struct xfs_rui_log_item		*ruip)
+xfs_rui_item_recover(
+	struct xfs_log_item		*lip,
+	struct xfs_trans		*parent_tp)
 {
-	int				i;
-	int				error = 0;
+	struct xfs_rui_log_item		*ruip = RUI_ITEM(lip);
 	struct xfs_map_extent		*rmap;
-	xfs_fsblock_t			startblock_fsb;
-	bool				op_ok;
 	struct xfs_rud_log_item		*rudp;
-	enum xfs_rmap_intent_type	type;
-	int				whichfork;
-	xfs_exntst_t			state;
 	struct xfs_trans		*tp;
 	struct xfs_btree_cur		*rcur = NULL;
+	struct xfs_mount		*mp = parent_tp->t_mountp;
+	xfs_fsblock_t			startblock_fsb;
+	enum xfs_rmap_intent_type	type;
+	xfs_exntst_t			state;
+	bool				op_ok;
+	int				i;
+	int				whichfork;
+	int				error = 0;
 
 	ASSERT(!test_bit(XFS_LI_RECOVERED, &ruip->rui_item.li_flags));
 



* [PATCH 21/21] xfs: remove unnecessary includes from xfs_log_recover.c
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (19 preceding siblings ...)
  2020-04-30  0:49 ` [PATCH 20/21] xfs: refactor intent item iop_recover calls Darrick J. Wong
@ 2020-04-30  0:49 ` Darrick J. Wong
  2020-05-01 10:15 ` [PATCH v2 00/21] xfs: refactor log recovery Christoph Hellwig
  21 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30  0:49 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Remove unnecessary includes from the log recovery code.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_log_recover.c |    8 --------
 1 file changed, 8 deletions(-)


diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 277e83e74344..09dd514a3498 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -18,21 +18,13 @@
 #include "xfs_log.h"
 #include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
-#include "xfs_inode_item.h"
-#include "xfs_extfree_item.h"
 #include "xfs_trans_priv.h"
 #include "xfs_alloc.h"
 #include "xfs_ialloc.h"
-#include "xfs_quota.h"
 #include "xfs_trace.h"
 #include "xfs_icache.h"
-#include "xfs_bmap_btree.h"
 #include "xfs_error.h"
-#include "xfs_dir2.h"
-#include "xfs_rmap_item.h"
 #include "xfs_buf_item.h"
-#include "xfs_refcount_item.h"
-#include "xfs_bmap_item.h"
 
 #define BLK_AVG(blk1, blk2)	((blk1+blk2) >> 1)
 



* Re: [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure
  2020-04-30  0:47 ` [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure Darrick J. Wong
@ 2020-04-30  5:53   ` Christoph Hellwig
  2020-04-30 15:08     ` Darrick J. Wong
  2020-05-01 10:40   ` Chandan Rajendra
  1 sibling, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-04-30  5:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Apr 29, 2020 at 05:47:41PM -0700, Darrick J. Wong wrote:
> +extern const struct xlog_recover_item_type xlog_icreate_item_type;
> +extern const struct xlog_recover_item_type xlog_buf_item_type;
> +extern const struct xlog_recover_item_type xlog_inode_item_type;
> +extern const struct xlog_recover_item_type xlog_dquot_item_type;
> +extern const struct xlog_recover_item_type xlog_quotaoff_item_type;
> +extern const struct xlog_recover_item_type xlog_bmap_intent_item_type;
> +extern const struct xlog_recover_item_type xlog_bmap_done_item_type;
> +extern const struct xlog_recover_item_type xlog_extfree_intent_item_type;
> +extern const struct xlog_recover_item_type xlog_extfree_done_item_type;
> +extern const struct xlog_recover_item_type xlog_rmap_intent_item_type;
> +extern const struct xlog_recover_item_type xlog_rmap_done_item_type;
> +extern const struct xlog_recover_item_type xlog_refcount_intent_item_type;
> +extern const struct xlog_recover_item_type xlog_refcount_done_item_type;

I'd prefer if we didn't have to expose these structures, but had a
xlog_register_recovery_item helper that just adds them to a list or
array.

>  typedef struct xlog_recover_item {
>  	struct list_head	ri_list;
> -	int			ri_type;
>  	int			ri_cnt;	/* count of regions found */
>  	int			ri_total;	/* total regions */
>  	xfs_log_iovec_t		*ri_buf;	/* ptr to regions buffer */
> +	const struct xlog_recover_item_type *ri_type;
>  } xlog_recover_item_t;

Btw, killing the xlog_recover_item_t typedef might be a worthwhile prep
patch.

> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -17,7 +17,6 @@
>  #include "xfs_trace.h"
>  #include "xfs_log.h"
>  
> -
>  kmem_zone_t	*xfs_buf_item_zone;
>  
>  static inline struct xfs_buf_log_item *BUF_ITEM(struct xfs_log_item *lip)

Spurious whitespace change in a file not otherwise touched.

>  
> @@ -107,3 +109,14 @@ xfs_icreate_log(
>  	tp->t_flags |= XFS_TRANS_DIRTY;
>  	set_bit(XFS_LI_DIRTY, &icp->ic_item.li_flags);
>  }
> +
> +static enum xlog_recover_reorder
> +xlog_icreate_reorder(
> +		struct xlog_recover_item *item)
> +{
> +	return XLOG_REORDER_BUFFER_LIST;
> +}

It might be worth throwing in a comment on why icreate items go to
the buffer list.

> +		return 0;
> +#ifdef CONFIG_XFS_QUOTA
> +	case XFS_LI_DQUOT:
> +		item->ri_type = &xlog_dquot_item_type;
> +		return 0;
> +	case XFS_LI_QUOTAOFF:
> +		item->ri_type = &xlog_quotaoff_item_type;
> +		return 0;
> +#endif /* CONFIG_XFS_QUOTA */
> +	default:
> +		return -EFSCORRUPTED;
> +	}
> +}

Quota recovery support is currently unconditional.  Making it
conditional on CONFIG_XFS_QUOTA means a kernel without that config
will now fail to recover a file system with quota updates in the log.
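
To make the concern concrete, here is a hedged userspace sketch in C of a
type lookup that keeps the quota cases unconditional.  The demo_ names and
the numeric type codes are made up for illustration, and -EINVAL stands in
for the kernel's -EFSCORRUPTED.  If the two quota cases were wrapped in a
config #ifdef, a build without that option would fall through to the
default case and reject a log that contains quota items.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical item type codes and type table; not the real XFS values. */
enum demo_li_type {
	DEMO_LI_INODE = 1,
	DEMO_LI_DQUOT,
	DEMO_LI_QUOTAOFF,
};

struct demo_item_type {
	const char	*name;
};

static const struct demo_item_type demo_inode_type    = { "inode" };
static const struct demo_item_type demo_dquot_type    = { "dquot" };
static const struct demo_item_type demo_quotaoff_type = { "quotaoff" };

/*
 * Map a logged type code to its recovery type.  The quota cases are
 * deliberately not wrapped in a config #ifdef, so a log containing quota
 * items is still recognised by a build without quota support.
 */
static int
demo_find_item_type(
	enum demo_li_type		type,
	const struct demo_item_type	**out)
{
	switch (type) {
	case DEMO_LI_INODE:
		*out = &demo_inode_type;
		return 0;
	case DEMO_LI_DQUOT:
		*out = &demo_dquot_type;
		return 0;
	case DEMO_LI_QUOTAOFF:
		*out = &demo_quotaoff_type;
		return 0;
	default:
		/* The kernel code returns -EFSCORRUPTED at this point. */
		return -EINVAL;
	}
}
```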


* Re: [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure
  2020-04-30  5:53   ` Christoph Hellwig
@ 2020-04-30 15:08     ` Darrick J. Wong
  2020-04-30 18:16       ` Darrick J. Wong
  0 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30 15:08 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Wed, Apr 29, 2020 at 10:53:09PM -0700, Christoph Hellwig wrote:
> On Wed, Apr 29, 2020 at 05:47:41PM -0700, Darrick J. Wong wrote:
> > +extern const struct xlog_recover_item_type xlog_icreate_item_type;
> > +extern const struct xlog_recover_item_type xlog_buf_item_type;
> > +extern const struct xlog_recover_item_type xlog_inode_item_type;
> > +extern const struct xlog_recover_item_type xlog_dquot_item_type;
> > +extern const struct xlog_recover_item_type xlog_quotaoff_item_type;
> > +extern const struct xlog_recover_item_type xlog_bmap_intent_item_type;
> > +extern const struct xlog_recover_item_type xlog_bmap_done_item_type;
> > +extern const struct xlog_recover_item_type xlog_extfree_intent_item_type;
> > +extern const struct xlog_recover_item_type xlog_extfree_done_item_type;
> > +extern const struct xlog_recover_item_type xlog_rmap_intent_item_type;
> > +extern const struct xlog_recover_item_type xlog_rmap_done_item_type;
> > +extern const struct xlog_recover_item_type xlog_refcount_intent_item_type;
> > +extern const struct xlog_recover_item_type xlog_refcount_done_item_type;
> 
> I'd prefer if we didn't have to expose these structures, but had a
> xlog_register_recovery_item helper that just adds them to a list or
> array.

I can look into making a register function and do lookups, but that's
a lot of indirection to save ~15 or so externs.

> >  typedef struct xlog_recover_item {
> >  	struct list_head	ri_list;
> > -	int			ri_type;
> >  	int			ri_cnt;	/* count of regions found */
> >  	int			ri_total;	/* total regions */
> >  	xfs_log_iovec_t		*ri_buf;	/* ptr to regions buffer */
> > +	const struct xlog_recover_item_type *ri_type;
> >  } xlog_recover_item_t;
> 
> Btw, killing the xlog_recover_item_t typedef might be a worthwhile prep
> patch.

Hrm, ok....

> > --- a/fs/xfs/xfs_buf_item.c
> > +++ b/fs/xfs/xfs_buf_item.c
> > @@ -17,7 +17,6 @@
> >  #include "xfs_trace.h"
> >  #include "xfs_log.h"
> >  
> > -
> >  kmem_zone_t	*xfs_buf_item_zone;
> >  
> >  static inline struct xfs_buf_log_item *BUF_ITEM(struct xfs_log_item *lip)
> 
> Spurious whitespace change in a file not otherwise touched.

Fixed.

> >  
> > @@ -107,3 +109,14 @@ xfs_icreate_log(
> >  	tp->t_flags |= XFS_TRANS_DIRTY;
> >  	set_bit(XFS_LI_DIRTY, &icp->ic_item.li_flags);
> >  }
> > +
> > +static enum xlog_recover_reorder
> > +xlog_icreate_reorder(
> > +		struct xlog_recover_item *item)
> > +{
> > +	return XLOG_REORDER_BUFFER_LIST;
> > +}
> 
> > It might be worth throwing in a comment on why icreate items go to
> > the buffer list.
> 
> > +		return 0;
> > +#ifdef CONFIG_XFS_QUOTA
> > +	case XFS_LI_DQUOT:
> > +		item->ri_type = &xlog_dquot_item_type;
> > +		return 0;
> > +	case XFS_LI_QUOTAOFF:
> > +		item->ri_type = &xlog_quotaoff_item_type;
> > +		return 0;
> > +#endif /* CONFIG_XFS_QUOTA */
> > +	default:
> > +		return -EFSCORRUPTED;
> > +	}
> > +}
> 
> Quota recovery support is currently unconditional.  Making it
> conditional on CONFIG_XFS_QUOTA means a kernel without that config
> will now fail to recover a file system with quota updates in the log.

Heh, I hadn't realized that quota recovery always works even if quotas
are disabled.  Ok, xfs_dquot_recovery.c it is...

--D


* Re: [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure
  2020-04-30 15:08     ` Darrick J. Wong
@ 2020-04-30 18:16       ` Darrick J. Wong
  2020-05-01  8:08         ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-04-30 18:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Apr 30, 2020 at 08:08:21AM -0700, Darrick J. Wong wrote:
> On Wed, Apr 29, 2020 at 10:53:09PM -0700, Christoph Hellwig wrote:
> > On Wed, Apr 29, 2020 at 05:47:41PM -0700, Darrick J. Wong wrote:
> > > +extern const struct xlog_recover_item_type xlog_icreate_item_type;
> > > +extern const struct xlog_recover_item_type xlog_buf_item_type;
> > > +extern const struct xlog_recover_item_type xlog_inode_item_type;
> > > +extern const struct xlog_recover_item_type xlog_dquot_item_type;
> > > +extern const struct xlog_recover_item_type xlog_quotaoff_item_type;
> > > +extern const struct xlog_recover_item_type xlog_bmap_intent_item_type;
> > > +extern const struct xlog_recover_item_type xlog_bmap_done_item_type;
> > > +extern const struct xlog_recover_item_type xlog_extfree_intent_item_type;
> > > +extern const struct xlog_recover_item_type xlog_extfree_done_item_type;
> > > +extern const struct xlog_recover_item_type xlog_rmap_intent_item_type;
> > > +extern const struct xlog_recover_item_type xlog_rmap_done_item_type;
> > > +extern const struct xlog_recover_item_type xlog_refcount_intent_item_type;
> > > +extern const struct xlog_recover_item_type xlog_refcount_done_item_type;
> > 
> > I'd prefer if we didn't have to expose these structures, but had a
> > xlog_register_recovery_item helper that just adds them to a list or
> > array.
> 
> I can look into making a register function and do lookups, but that's
> a lot of indirection to save ~15 or so externs.

Ok, so I looked into this, and I don't know of a good way to avoid
exporting 14 *somethings*.  If we require a xlog_register_recovery_item
call to link the item types when the module loads, something has to call
that registration function.  That can be an __init function in each item
recovery source file, but now we have to export all fourteen of those
functions so that we can call them from init_xfs_fs.  Unless you've got
a better suggestion, I don't think this is worth the effort.
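
For readers following along, the trade-off can be sketched in userspace C
roughly as below.  All demo_ names are hypothetical, the item type struct
is cut down to a single field, and two item types stand in for the
fourteen.  The type structures become file-private, but each file then
needs a non-static init hook so that the module init code can call it, so
the number of exported symbols does not go down.

```c
#include <stddef.h>

#define DEMO_MAX_ITEM_TYPES	16

/* Hypothetical, cut-down recovery type; the real structure holds callbacks. */
struct demo_recover_item_type {
	unsigned short	item_type;
};

static const struct demo_recover_item_type *demo_registry[DEMO_MAX_ITEM_TYPES];
static size_t demo_nr_registered;

/* Add one item type to the registry; fails when the table is full. */
static int
demo_register_recovery_item(
	const struct demo_recover_item_type	*type)
{
	if (demo_nr_registered >= DEMO_MAX_ITEM_TYPES)
		return -1;
	demo_registry[demo_nr_registered++] = type;
	return 0;
}

/* Linear lookup by logged type code; NULL means "unknown item type". */
static const struct demo_recover_item_type *
demo_find_recovery_item(
	unsigned short		item_type)
{
	size_t			i;

	for (i = 0; i < demo_nr_registered; i++)
		if (demo_registry[i]->item_type == item_type)
			return demo_registry[i];
	return NULL;
}

/* Imagine this part lives in its own source file: the type is private... */
static const struct demo_recover_item_type demo_buf_item_type = {
	.item_type	= 1,
};

/* ...but the init hook has to be visible to the module init code. */
int
demo_buf_item_recover_init(void)
{
	return demo_register_recovery_item(&demo_buf_item_type);
}

/* A second "file", with the same pattern repeated. */
static const struct demo_recover_item_type demo_inode_item_type = {
	.item_type	= 2,
};

int
demo_inode_item_recover_init(void)
{
	return demo_register_recovery_item(&demo_inode_item_type);
}

/* Module init: one call per item type, so one exported symbol per type. */
int
demo_init_fs(void)
{
	int	error;

	error = demo_buf_item_recover_init();
	if (error)
		return error;
	return demo_inode_item_recover_init();
}
```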

--D

> > >  typedef struct xlog_recover_item {
> > >  	struct list_head	ri_list;
> > > -	int			ri_type;
> > >  	int			ri_cnt;	/* count of regions found */
> > >  	int			ri_total;	/* total regions */
> > >  	xfs_log_iovec_t		*ri_buf;	/* ptr to regions buffer */
> > > +	const struct xlog_recover_item_type *ri_type;
> > >  } xlog_recover_item_t;
> > 
> > Btw, killing the xlog_recover_item_t typedef might be a worthwhile prep
> > patch.
> 
> Hrm, ok....
> 
> > > --- a/fs/xfs/xfs_buf_item.c
> > > +++ b/fs/xfs/xfs_buf_item.c
> > > @@ -17,7 +17,6 @@
> > >  #include "xfs_trace.h"
> > >  #include "xfs_log.h"
> > >  
> > > -
> > >  kmem_zone_t	*xfs_buf_item_zone;
> > >  
> > >  static inline struct xfs_buf_log_item *BUF_ITEM(struct xfs_log_item *lip)
> > 
> > Spurious whitespace change in a file not otherwise touched.
> 
> Fixed.
> 
> > >  
> > > @@ -107,3 +109,14 @@ xfs_icreate_log(
> > >  	tp->t_flags |= XFS_TRANS_DIRTY;
> > >  	set_bit(XFS_LI_DIRTY, &icp->ic_item.li_flags);
> > >  }
> > > +
> > > +static enum xlog_recover_reorder
> > > +xlog_icreate_reorder(
> > > +		struct xlog_recover_item *item)
> > > +{
> > > +	return XLOG_REORDER_BUFFER_LIST;
> > > +}
> > 
> > > It might be worth throwing in a comment on why icreate items go to
> > > the buffer list.
> > 
> > > +		return 0;
> > > +#ifdef CONFIG_XFS_QUOTA
> > > +	case XFS_LI_DQUOT:
> > > +		item->ri_type = &xlog_dquot_item_type;
> > > +		return 0;
> > > +	case XFS_LI_QUOTAOFF:
> > > +		item->ri_type = &xlog_quotaoff_item_type;
> > > +		return 0;
> > > +#endif /* CONFIG_XFS_QUOTA */
> > > +	default:
> > > +		return -EFSCORRUPTED;
> > > +	}
> > > +}
> > 
> > > Quota recovery support is currently unconditional.  Making it
> > conditional on CONFIG_XFS_QUOTA means a kernel without that config
> > will now fail to recover a file system with quota updates in the log.
> 
> Heh, I hadn't realized that quota recovery always works even if quotas
> are disabled.  Ok, xfs_dquot_recovery.c it is...
> 
> --D


* Re: [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure
  2020-04-30 18:16       ` Darrick J. Wong
@ 2020-05-01  8:08         ` Christoph Hellwig
  0 siblings, 0 replies; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Thu, Apr 30, 2020 at 11:16:28AM -0700, Darrick J. Wong wrote:
> Ok, so I looked into this, and I don't know of a good way to avoid
> exporting 14 *somethings*.  If we require a xlog_register_recovery_item
> call to link the item types when the module loads, something has to call
> that registration function.  That can be an __init function in each item
> recovery source file, but now we have to export all fourteen of those
> functions so that we can call them from init_xfs_fs.  Unless you've got
> a better suggestion, I don't think this is worth the effort.

Ok, let's keep the externs for now.


* Re: [PATCH v2 00/21] xfs: refactor log recovery
  2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
                   ` (20 preceding siblings ...)
  2020-04-30  0:49 ` [PATCH 21/21] xfs: remove unnecessary includes from xfs_log_recover.c Darrick J. Wong
@ 2020-05-01 10:15 ` Christoph Hellwig
  2020-05-01 16:53   ` Darrick J. Wong
  21 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01 10:15 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

I've looked a bit over the total diff and final result and really like
it.

A few comments from that without going into the individual patches:

 - I don't think the buffer cancellation table should remain in
   xfs_log_recover.c.  I can either move it into a new file
   as part of resending my prep series, or you could move it into
   xfs_buf_item_recover.c.  Let me know what you prefer.
 - Should the match callback also move into struct xfs_item_ops?  That
   would also match iop_recover.
 - Based on that we could also kill XFS_ITEM_TYPE_IS_INTENT by just
   checking for iop_recover and/or iop_match.
 - Setting XFS_LI_RECOVERED could also move to common code, basically
   set it whenever iop_recover returns.  Also we can remove the
   XFS_LI_RECOVERED asserts in ->iop_recover when the caller checks
   it just before.
 - We still have a few redundant ri_type checks.
 - ri_type maybe should be ri_ops?
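
Two of the points above (dropping the IS_INTENT flag in favour of checking
for iop_recover, and setting the recovered flag in common code) could look
roughly like the following userspace sketch in C.  This is only one
reading of the suggestion; the demo_ types, the flag value and the helper
names are hypothetical and are not taken from the series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real XFS definitions. */
#define DEMO_LI_RECOVERED	(1UL << 4)

struct demo_log_item;

struct demo_item_ops {
	int (*iop_recover)(struct demo_log_item *lip);
};

struct demo_log_item {
	unsigned long			li_flags;
	const struct demo_item_ops	*li_ops;
};

/* An item counts as an intent iff its ops provide a recover method. */
static bool
demo_item_is_intent(
	const struct demo_log_item	*lip)
{
	return lip->li_ops->iop_recover != NULL;
}

/*
 * Common-code wrapper: skip non-intents and already recovered items, and
 * set the recovered flag whenever ->iop_recover returns, success or not.
 */
static int
demo_recover_intent(
	struct demo_log_item	*lip)
{
	int			error;

	if (!demo_item_is_intent(lip) || (lip->li_flags & DEMO_LI_RECOVERED))
		return 0;

	error = lip->li_ops->iop_recover(lip);
	lip->li_flags |= DEMO_LI_RECOVERED;
	return error;
}

/* Example callbacks used to exercise the wrapper. */
static int
demo_recover_ok(
	struct demo_log_item	*lip)
{
	(void)lip;
	return 0;
}

static int
demo_recover_fail(
	struct demo_log_item	*lip)
{
	(void)lip;
	return -1;
}
```

With the flag set by the wrapper, the individual recover functions would
no longer need to touch it, and the asserts on it become redundant.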

See this patch below for my take on cleaning up the recovery ops
handling a bit:

diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index ba172eb454c8f..f97946cf94f11 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -7,7 +7,7 @@
 #define __XFS_LOG_RECOVER_H__
 
 /*
- * Each log item type (XFS_LI_*) gets its own xlog_recover_item_type to
+ * Each log item type (XFS_LI_*) gets its own xlog_recover_item_ops to
  * define how recovery should work for that type of log item.
  */
 struct xlog_recover_item;
@@ -20,7 +20,9 @@ enum xlog_recover_reorder {
 	XLOG_REORDER_CANCEL_LIST,
 };
 
-struct xlog_recover_item_type {
+struct xlog_recover_item_ops {
+	uint16_t		item_type;
+
 	/*
 	 * Help sort recovered log items into the order required to replay them
 	 * correctly.  Log item types that always use XLOG_REORDER_ITEM_LIST do
@@ -58,19 +60,19 @@ struct xlog_recover_item_type {
 			       struct xlog_recover_item *item, xfs_lsn_t lsn);
 };
 
-extern const struct xlog_recover_item_type xlog_icreate_item_type;
-extern const struct xlog_recover_item_type xlog_buf_item_type;
-extern const struct xlog_recover_item_type xlog_inode_item_type;
-extern const struct xlog_recover_item_type xlog_dquot_item_type;
-extern const struct xlog_recover_item_type xlog_quotaoff_item_type;
-extern const struct xlog_recover_item_type xlog_bmap_intent_item_type;
-extern const struct xlog_recover_item_type xlog_bmap_done_item_type;
-extern const struct xlog_recover_item_type xlog_extfree_intent_item_type;
-extern const struct xlog_recover_item_type xlog_extfree_done_item_type;
-extern const struct xlog_recover_item_type xlog_rmap_intent_item_type;
-extern const struct xlog_recover_item_type xlog_rmap_done_item_type;
-extern const struct xlog_recover_item_type xlog_refcount_intent_item_type;
-extern const struct xlog_recover_item_type xlog_refcount_done_item_type;
+extern const struct xlog_recover_item_ops xlog_icreate_item_type;
+extern const struct xlog_recover_item_ops xlog_buf_item_type;
+extern const struct xlog_recover_item_ops xlog_inode_item_type;
+extern const struct xlog_recover_item_ops xlog_dquot_item_type;
+extern const struct xlog_recover_item_ops xlog_quotaoff_item_type;
+extern const struct xlog_recover_item_ops xlog_bmap_intent_item_type;
+extern const struct xlog_recover_item_ops xlog_bmap_done_item_type;
+extern const struct xlog_recover_item_ops xlog_extfree_intent_item_type;
+extern const struct xlog_recover_item_ops xlog_extfree_done_item_type;
+extern const struct xlog_recover_item_ops xlog_rmap_intent_item_type;
+extern const struct xlog_recover_item_ops xlog_rmap_done_item_type;
+extern const struct xlog_recover_item_ops xlog_refcount_intent_item_type;
+extern const struct xlog_recover_item_ops xlog_refcount_done_item_type;
 
 /*
  * Macros, structures, prototypes for internal log manager use.
@@ -93,7 +95,7 @@ typedef struct xlog_recover_item {
 	int			ri_cnt;	/* count of regions found */
 	int			ri_total;	/* total regions */
 	xfs_log_iovec_t		*ri_buf;	/* ptr to regions buffer */
-	const struct xlog_recover_item_type *ri_type;
+	const struct xlog_recover_item_ops *ri_ops;
 } xlog_recover_item_t;
 
 struct xlog_recover {
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 58f0904e4504d..952b4ce40433e 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -667,10 +667,12 @@ xlog_recover_bmap_done_commit_pass2(
 	return 0;
 }
 
-const struct xlog_recover_item_type xlog_bmap_intent_item_type = {
+const struct xlog_recover_item_ops xlog_bmap_intent_item_type = {
+	.item_type		= XFS_LI_BUI,
 	.commit_pass2_fn	= xlog_recover_bmap_intent_commit_pass2,
 };
 
-const struct xlog_recover_item_type xlog_bmap_done_item_type = {
+const struct xlog_recover_item_ops xlog_bmap_done_item_type = {
+	.item_type		= XFS_LI_BUD,
 	.commit_pass2_fn	= xlog_recover_bmap_done_commit_pass2,
 };
diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index d324f810819df..954e0e96af5dc 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -857,7 +857,8 @@ xlog_recover_buffer_commit_pass2(
 	return 0;
 }
 
-const struct xlog_recover_item_type xlog_buf_item_type = {
+const struct xlog_recover_item_ops xlog_buf_item_type = {
+	.item_type		= XFS_LI_BUF,
 	.reorder_fn		= xlog_buf_reorder_fn,
 	.ra_pass2_fn		= xlog_recover_buffer_ra_pass2,
 	.commit_pass1_fn	= xlog_recover_buffer_commit_pass1,
diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
index 83bd7ded9185f..6c6216bdc432c 100644
--- a/fs/xfs/xfs_dquot_item.c
+++ b/fs/xfs/xfs_dquot_item.c
@@ -527,7 +527,8 @@ xlog_recover_dquot_commit_pass2(
 	return 0;
 }
 
-const struct xlog_recover_item_type xlog_dquot_item_type = {
+const struct xlog_recover_item_ops xlog_dquot_item_type = {
+	.item_type		= XFS_LI_DQUOT,
 	.ra_pass2_fn		= xlog_recover_dquot_ra_pass2,
 	.commit_pass2_fn	= xlog_recover_dquot_commit_pass2,
 };
@@ -559,6 +560,7 @@ xlog_recover_quotaoff_commit_pass1(
 	return 0;
 }
 
-const struct xlog_recover_item_type xlog_quotaoff_item_type = {
+const struct xlog_recover_item_ops xlog_quotaoff_item_type = {
+	.item_type		= XFS_LI_QUOTAOFF,
 	.commit_pass1_fn	= xlog_recover_quotaoff_commit_pass1,
 };
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index d6f2c88570de1..5d1fb5e05b781 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -729,10 +729,12 @@ xlog_recover_extfree_done_commit_pass2(
 	return 0;
 }
 
-const struct xlog_recover_item_type xlog_extfree_intent_item_type = {
+const struct xlog_recover_item_ops xlog_extfree_intent_item_type = {
+	.item_type		= XFS_LI_EFI,
 	.commit_pass2_fn	= xlog_recover_extfree_intent_commit_pass2,
 };
 
-const struct xlog_recover_item_type xlog_extfree_done_item_type = {
+const struct xlog_recover_item_ops xlog_extfree_done_item_type = {
+	.item_type		= XFS_LI_EFD,
 	.commit_pass2_fn	= xlog_recover_extfree_done_commit_pass2,
 };
diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c
index 602a8c91371fe..34805bdbc2e12 100644
--- a/fs/xfs/xfs_icreate_item.c
+++ b/fs/xfs/xfs_icreate_item.c
@@ -248,7 +248,8 @@ xlog_recover_do_icreate_commit_pass2(
 				     length, be32_to_cpu(icl->icl_gen));
 }
 
-const struct xlog_recover_item_type xlog_icreate_item_type = {
+const struct xlog_recover_item_ops xlog_icreate_item_type = {
+	.item_type		= XFS_LI_ICREATE,
 	.reorder_fn		= xlog_icreate_reorder,
 	.commit_pass2_fn	= xlog_recover_do_icreate_commit_pass2,
 };
diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
index 46fc8a4b9ac61..9dff80783fe12 100644
--- a/fs/xfs/xfs_inode_item_recover.c
+++ b/fs/xfs/xfs_inode_item_recover.c
@@ -393,7 +393,8 @@ xlog_recover_inode_commit_pass2(
 	return error;
 }
 
-const struct xlog_recover_item_type xlog_inode_item_type = {
+const struct xlog_recover_item_ops xlog_inode_item_type = {
+	.item_type		= XFS_LI_INODE,
 	.ra_pass2_fn		= xlog_recover_inode_ra_pass2,
 	.commit_pass2_fn	= xlog_recover_inode_commit_pass2,
 };
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 09dd514a34980..e3f13866deb08 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1828,55 +1828,35 @@ xlog_recover_insert_ail(
  ******************************************************************************
  */
 
-static int
-xlog_set_item_type(
-	struct xlog_recover_item		*item)
-{
-	switch (ITEM_TYPE(item)) {
-	case XFS_LI_ICREATE:
-		item->ri_type = &xlog_icreate_item_type;
-		return 0;
-	case XFS_LI_BUF:
-		item->ri_type = &xlog_buf_item_type;
-		return 0;
-	case XFS_LI_EFI:
-		item->ri_type = &xlog_extfree_intent_item_type;
-		return 0;
-	case XFS_LI_EFD:
-		item->ri_type = &xlog_extfree_done_item_type;
-		return 0;
-	case XFS_LI_RUI:
-		item->ri_type = &xlog_rmap_intent_item_type;
-		return 0;
-	case XFS_LI_RUD:
-		item->ri_type = &xlog_rmap_done_item_type;
-		return 0;
-	case XFS_LI_CUI:
-		item->ri_type = &xlog_refcount_intent_item_type;
-		return 0;
-	case XFS_LI_CUD:
-		item->ri_type = &xlog_refcount_done_item_type;
-		return 0;
-	case XFS_LI_BUI:
-		item->ri_type = &xlog_bmap_intent_item_type;
-		return 0;
-	case XFS_LI_BUD:
-		item->ri_type = &xlog_bmap_done_item_type;
-		return 0;
-	case XFS_LI_INODE:
-		item->ri_type = &xlog_inode_item_type;
-		return 0;
+static const struct xlog_recover_item_ops *xlog_recover_item_ops[] = {
+	&xlog_icreate_item_type,
+	&xlog_buf_item_type,
+	&xlog_extfree_intent_item_type,
+	&xlog_extfree_done_item_type,
+	&xlog_rmap_intent_item_type,
+	&xlog_rmap_done_item_type,
+	&xlog_refcount_intent_item_type,
+	&xlog_refcount_done_item_type,
+	&xlog_bmap_intent_item_type,
+	&xlog_bmap_done_item_type,
+	&xlog_inode_item_type,
 #ifdef CONFIG_XFS_QUOTA
-	case XFS_LI_DQUOT:
-		item->ri_type = &xlog_dquot_item_type;
-		return 0;
-	case XFS_LI_QUOTAOFF:
-		item->ri_type = &xlog_quotaoff_item_type;
-		return 0;
+	&xlog_dquot_item_type,
+	&xlog_quotaoff_item_type,
 #endif /* CONFIG_XFS_QUOTA */
-	default:
-		return -EFSCORRUPTED;
-	}
+};
+
+static const struct xlog_recover_item_ops *
+xlog_find_item_ops(
+	struct xlog_recover_item	*item)
+{
+	int				i;
+
+	for (i = 0; i < ARRAY_SIZE(xlog_recover_item_ops); i++)
+		if (ITEM_TYPE(item) == xlog_recover_item_ops[i]->item_type)
+			return xlog_recover_item_ops[i];
+
+	return NULL;
 }
 
 /*
@@ -1946,8 +1926,8 @@ xlog_recover_reorder_trans(
 	list_for_each_entry_safe(item, n, &sort_list, ri_list) {
 		enum xlog_recover_reorder	fate = XLOG_REORDER_ITEM_LIST;
 
-		error = xlog_set_item_type(item);
-		if (error) {
+		item->ri_ops = xlog_find_item_ops(item);
+		if (!item->ri_ops) {
 			xfs_warn(log->l_mp,
 				"%s: unrecognized type of log operation (%d)",
 				__func__, ITEM_TYPE(item));
@@ -1958,11 +1938,12 @@ xlog_recover_reorder_trans(
 			 */
 			if (!list_empty(&sort_list))
 				list_splice_init(&sort_list, &trans->r_itemq);
+			error = -EFSCORRUPTED;
 			break;
 		}
 
-		if (item->ri_type->reorder_fn)
-			fate = item->ri_type->reorder_fn(item);
+		if (item->ri_ops->reorder_fn)
+			fate = item->ri_ops->reorder_fn(item);
 
 		switch (fate) {
 		case XLOG_REORDER_BUFFER_LIST:
@@ -2098,46 +2079,6 @@ xlog_buf_readahead(
 		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
 }
 
-STATIC int
-xlog_recover_commit_pass1(
-	struct xlog			*log,
-	struct xlog_recover		*trans,
-	struct xlog_recover_item	*item)
-{
-	trace_xfs_log_recover_item_recover(log, trans, item, XLOG_RECOVER_PASS1);
-
-	if (!item->ri_type) {
-		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
-			__func__, ITEM_TYPE(item));
-		ASSERT(0);
-		return -EFSCORRUPTED;
-	}
-	if (!item->ri_type->commit_pass1_fn)
-		return 0;
-	return item->ri_type->commit_pass1_fn(log, item);
-}
-
-STATIC int
-xlog_recover_commit_pass2(
-	struct xlog			*log,
-	struct xlog_recover		*trans,
-	struct list_head		*buffer_list,
-	struct xlog_recover_item	*item)
-{
-	trace_xfs_log_recover_item_recover(log, trans, item, XLOG_RECOVER_PASS2);
-
-	if (!item->ri_type) {
-		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
-			__func__, ITEM_TYPE(item));
-		ASSERT(0);
-		return -EFSCORRUPTED;
-	}
-	if (!item->ri_type->commit_pass2_fn)
-		return 0;
-	return item->ri_type->commit_pass2_fn(log, buffer_list, item,
-			trans->r_lsn);
-}
-
 STATIC int
 xlog_recover_items_pass2(
 	struct xlog                     *log,
@@ -2146,16 +2087,18 @@ xlog_recover_items_pass2(
 	struct list_head                *item_list)
 {
 	struct xlog_recover_item	*item;
-	int				error = 0;
+	int				error;
 
 	list_for_each_entry(item, item_list, ri_list) {
-		error = xlog_recover_commit_pass2(log, trans,
-					  buffer_list, item);
+		if (!item->ri_ops->commit_pass2_fn)
+			continue;
+		error = item->ri_ops->commit_pass2_fn(log, buffer_list, item,
+				trans->r_lsn);
 		if (error)
 			return error;
 	}
 
-	return error;
+	return 0;
 }
 
 /*
@@ -2187,13 +2130,16 @@ xlog_recover_commit_trans(
 		return error;
 
 	list_for_each_entry_safe(item, next, &trans->r_itemq, ri_list) {
+		trace_xfs_log_recover_item_recover(log, trans, item, pass);
+
 		switch (pass) {
 		case XLOG_RECOVER_PASS1:
-			error = xlog_recover_commit_pass1(log, trans, item);
+			if (item->ri_ops->commit_pass1_fn)
+				error = item->ri_ops->commit_pass1_fn(log, item);
 			break;
 		case XLOG_RECOVER_PASS2:
-			if (item->ri_type && item->ri_type->ra_pass2_fn)
-				item->ri_type->ra_pass2_fn(log, item);
+			if (item->ri_ops->ra_pass2_fn)
+				item->ri_ops->ra_pass2_fn(log, item);
 			list_move_tail(&item->ri_list, &ra_list);
 			items_queued++;
 			if (items_queued >= XLOG_RECOVER_COMMIT_QUEUE_MAX) {
diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
index 53a79dc618f76..5703d5fdf4eeb 100644
--- a/fs/xfs/xfs_refcount_item.c
+++ b/fs/xfs/xfs_refcount_item.c
@@ -690,10 +690,12 @@ xlog_recover_refcount_done_commit_pass2(
 	return 0;
 }
 
-const struct xlog_recover_item_type xlog_refcount_intent_item_type = {
+const struct xlog_recover_item_ops xlog_refcount_intent_item_type = {
+	.item_type		= XFS_LI_CUI,
 	.commit_pass2_fn	= xlog_recover_refcount_intent_commit_pass2,
 };
 
-const struct xlog_recover_item_type xlog_refcount_done_item_type = {
+const struct xlog_recover_item_ops xlog_refcount_done_item_type = {
+	.item_type		= XFS_LI_CUD,
 	.commit_pass2_fn	= xlog_recover_refcount_done_commit_pass2,
 };
diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index cee5c61550321..12e035ff7bb2d 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -680,10 +680,12 @@ xlog_recover_rmap_done_commit_pass2(
 	return 0;
 }
 
-const struct xlog_recover_item_type xlog_rmap_intent_item_type = {
+const struct xlog_recover_item_ops xlog_rmap_intent_item_type = {
+	.item_type		= XFS_LI_RUI,
 	.commit_pass2_fn	= xlog_recover_rmap_intent_commit_pass2,
 };
 
-const struct xlog_recover_item_type xlog_rmap_done_item_type = {
+const struct xlog_recover_item_ops xlog_rmap_done_item_type = {
+	.item_type		= XFS_LI_RUD,
 	.commit_pass2_fn	= xlog_recover_rmap_done_commit_pass2,
 };


* Re: [PATCH 13/21] xfs: refactor recovered EFI log item playback
  2020-04-30  0:48 ` [PATCH 13/21] xfs: refactor recovered EFI log item playback Darrick J. Wong
@ 2020-05-01 10:19   ` Christoph Hellwig
  2020-05-01 17:58     ` Darrick J. Wong
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01 10:19 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Apr 29, 2020 at 05:48:59PM -0700, Darrick J. Wong wrote:
> +STATIC int xfs_efi_recover(struct xfs_mount *mp, struct xfs_efi_log_item *efip);

Can you just move xfs_efi_item_ops down a bit to avoid the forward
declaration?  Same for the other patches doing the same.
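
A minimal sketch of the ordering being asked for, assuming nothing about
the real xfs_efi_item_ops layout: every name below (demo_item,
demo_item_ops, demo_item_recover, demo_run_recovery) is hypothetical and
only illustrates that defining the callback above the ops table removes
the need for a forward declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real XFS structures. */
struct demo_item {
	int	id;
	int	recovered;
};

struct demo_item_ops {
	int	(*recover)(struct demo_item *item);
};

/* The callback is defined first ... */
static int
demo_item_recover(
	struct demo_item	*item)
{
	item->recovered = 1;
	return 0;
}

/* ... so the ops table can reference it without a forward declaration. */
static const struct demo_item_ops demo_item_ops = {
	.recover	= demo_item_recover,
};

/* Callers reach the callback only through the ops table. */
static int
demo_run_recovery(
	struct demo_item	*item)
{
	assert(demo_item_ops.recover != NULL);
	return demo_item_ops.recover(item);
}
```

The cost is purely one of placement: the ops table ends up below the
functions it points at, which is where most readers would look for it
anyway.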


* Re: [PATCH 09/21] xfs: refactor log recovery EFI item dispatch for pass2 commit functions
  2020-04-30  0:48 ` [PATCH 09/21] xfs: refactor log recovery EFI " Darrick J. Wong
@ 2020-05-01 10:28   ` Christoph Hellwig
  2020-05-01 17:56     ` Darrick J. Wong
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01 10:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

> +STATIC int
> +xlog_recover_extfree_done_commit_pass2(
> +	struct xlog			*log,
> +	struct list_head		*buffer_list,
> +	struct xlog_recover_item	*item,
> +	xfs_lsn_t			lsn)
> +{

...

> +	return 0;
> +}
> +
>  const struct xlog_recover_item_type xlog_extfree_intent_item_type = {
> +	.commit_pass2_fn	= xlog_recover_extfree_intent_commit_pass2,
>  };
>  
>  const struct xlog_recover_item_type xlog_extfree_done_item_type = {
> +	.commit_pass2_fn	= xlog_recover_extfree_done_commit_pass2,
>  };

Nitpick: it would be nice to keep the EFI code and the EFD code each
together with its own ops vector.  Same for the other intent ops.


* Re: [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure
  2020-04-30  0:47 ` [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure Darrick J. Wong
  2020-04-30  5:53   ` Christoph Hellwig
@ 2020-05-01 10:40   ` Chandan Rajendra
  1 sibling, 0 replies; 41+ messages in thread
From: Chandan Rajendra @ 2020-05-01 10:40 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thursday, April 30, 2020 6:17 AM Darrick J. Wong wrote: 
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Create a generic dispatch structure to delegate recovery of different
> log item types into various code modules.  This will enable us to move
> code specific to a particular log item type out of xfs_log_recover.c and
> into the log item source.
> 
> The first operation we virtualize is the log item sorting.

The item sorting is logically the same as it was before this patch. Hence,

Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
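
For reference, the sorting dispatch can be reduced to the following
standalone sketch.  It is not the kernel code: the demo_* names and flag
values are made up for illustration, and only the shape (an optional
reorder_fn, with the plain item list as the default fate) mirrors the
quoted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of enum xlog_recover_reorder. */
enum demo_reorder {
	DEMO_REORDER_BUFFER_LIST,
	DEMO_REORDER_ITEM_LIST,
	DEMO_REORDER_INODE_BUFFER_LIST,
	DEMO_REORDER_CANCEL_LIST,
};

/* Made-up flag values standing in for XFS_BLF_CANCEL / XFS_BLF_INODE_BUF. */
#define DEMO_FLAG_CANCEL	0x1
#define DEMO_FLAG_INODE_BUF	0x2

struct demo_item;

struct demo_item_type {
	/* Optional; types that always use the item list leave this NULL. */
	enum demo_reorder (*reorder_fn)(const struct demo_item *item);
};

struct demo_item {
	unsigned int			flags;
	const struct demo_item_type	*type;
};

/* Buffer items choose their list from the flags, as in the old switch. */
static enum demo_reorder
demo_buf_reorder(
	const struct demo_item	*item)
{
	if (item->flags & DEMO_FLAG_CANCEL)
		return DEMO_REORDER_CANCEL_LIST;
	if (item->flags & DEMO_FLAG_INODE_BUF)
		return DEMO_REORDER_INODE_BUFFER_LIST;
	return DEMO_REORDER_BUFFER_LIST;
}

static const struct demo_item_type demo_buf_type = {
	.reorder_fn	= demo_buf_reorder,
};

/* No reorder_fn: falls through to the default fate. */
static const struct demo_item_type demo_inode_type = {
	.reorder_fn	= NULL,
};

/* Same decision the reorder loop makes before moving the item. */
static enum demo_reorder
demo_sort_fate(
	const struct demo_item	*item)
{
	enum demo_reorder	fate = DEMO_REORDER_ITEM_LIST;

	if (item->type->reorder_fn)
		fate = item->type->reorder_fn(item);
	return fate;
}
```

A cancelled buffer still lands on the cancel list and an inode item still
lands on the item list, which is the equivalence checked above.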

> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/Makefile                 |    2 +
>  fs/xfs/libxfs/xfs_log_recover.h |   41 ++++++++++++++
>  fs/xfs/xfs_bmap_item.c          |    7 ++
>  fs/xfs/xfs_buf_item.c           |    1 
>  fs/xfs/xfs_buf_item_recover.c   |   37 +++++++++++++
>  fs/xfs/xfs_dquot_item.c         |    8 +++
>  fs/xfs/xfs_extfree_item.c       |    7 ++
>  fs/xfs/xfs_icreate_item.c       |   13 ++++
>  fs/xfs/xfs_inode_item_recover.c |   25 ++++++++
>  fs/xfs/xfs_log_recover.c        |  115 ++++++++++++++++++++++++++-------------
>  fs/xfs/xfs_refcount_item.c      |    7 ++
>  fs/xfs/xfs_rmap_item.c          |    7 ++
>  12 files changed, 231 insertions(+), 39 deletions(-)
>  create mode 100644 fs/xfs/xfs_buf_item_recover.c
>  create mode 100644 fs/xfs/xfs_inode_item_recover.c
> 
> 
> diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
> index ee375b67ac71..5e52c2dc6078 100644
> --- a/fs/xfs/Makefile
> +++ b/fs/xfs/Makefile
> @@ -120,9 +120,11 @@ xfs-y				+= xfs_log.o \
>  				   xfs_log_cil.o \
>  				   xfs_bmap_item.o \
>  				   xfs_buf_item.o \
> +				   xfs_buf_item_recover.o \
>  				   xfs_extfree_item.o \
>  				   xfs_icreate_item.o \
>  				   xfs_inode_item.o \
> +				   xfs_inode_item_recover.o \
>  				   xfs_refcount_item.o \
>  				   xfs_rmap_item.o \
>  				   xfs_log_recover.o \
> diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
> index 3bf671637a91..38ae9c371edb 100644
> --- a/fs/xfs/libxfs/xfs_log_recover.h
> +++ b/fs/xfs/libxfs/xfs_log_recover.h
> @@ -6,6 +6,45 @@
>  #ifndef	__XFS_LOG_RECOVER_H__
>  #define __XFS_LOG_RECOVER_H__
>  
> +/*
> + * Each log item type (XFS_LI_*) gets its own xlog_recover_item_type to
> + * define how recovery should work for that type of log item.
> + */
> +struct xlog_recover_item;
> +
> +/* Sorting hat for log items as they're read in. */
> +enum xlog_recover_reorder {
> +	XLOG_REORDER_BUFFER_LIST,
> +	XLOG_REORDER_ITEM_LIST,
> +	XLOG_REORDER_INODE_BUFFER_LIST,
> +	XLOG_REORDER_CANCEL_LIST,
> +};
> +
> +struct xlog_recover_item_type {
> +	/*
> +	 * Help sort recovered log items into the order required to replay them
> +	 * correctly.  Log item types that always use XLOG_REORDER_ITEM_LIST do
> +	 * not have to supply a function here.  See the comment preceding
> +	 * xlog_recover_reorder_trans for more details about what the return
> +	 * values mean.
> +	 */
> +	enum xlog_recover_reorder (*reorder_fn)(struct xlog_recover_item *item);
> +};
> +
> +extern const struct xlog_recover_item_type xlog_icreate_item_type;
> +extern const struct xlog_recover_item_type xlog_buf_item_type;
> +extern const struct xlog_recover_item_type xlog_inode_item_type;
> +extern const struct xlog_recover_item_type xlog_dquot_item_type;
> +extern const struct xlog_recover_item_type xlog_quotaoff_item_type;
> +extern const struct xlog_recover_item_type xlog_bmap_intent_item_type;
> +extern const struct xlog_recover_item_type xlog_bmap_done_item_type;
> +extern const struct xlog_recover_item_type xlog_extfree_intent_item_type;
> +extern const struct xlog_recover_item_type xlog_extfree_done_item_type;
> +extern const struct xlog_recover_item_type xlog_rmap_intent_item_type;
> +extern const struct xlog_recover_item_type xlog_rmap_done_item_type;
> +extern const struct xlog_recover_item_type xlog_refcount_intent_item_type;
> +extern const struct xlog_recover_item_type xlog_refcount_done_item_type;
> +
>  /*
>   * Macros, structures, prototypes for internal log manager use.
>   */
> @@ -24,10 +63,10 @@
>   */
>  typedef struct xlog_recover_item {
>  	struct list_head	ri_list;
> -	int			ri_type;
>  	int			ri_cnt;	/* count of regions found */
>  	int			ri_total;	/* total regions */
>  	xfs_log_iovec_t		*ri_buf;	/* ptr to regions buffer */
> +	const struct xlog_recover_item_type *ri_type;
>  } xlog_recover_item_t;
>  
>  struct xlog_recover {
> diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
> index ee6f4229cebc..a2824013e2cb 100644
> --- a/fs/xfs/xfs_bmap_item.c
> +++ b/fs/xfs/xfs_bmap_item.c
> @@ -22,6 +22,7 @@
>  #include "xfs_bmap_btree.h"
>  #include "xfs_trans_space.h"
>  #include "xfs_error.h"
> +#include "xfs_log_recover.h"
>  
>  kmem_zone_t	*xfs_bui_zone;
>  kmem_zone_t	*xfs_bud_zone;
> @@ -563,3 +564,9 @@ xfs_bui_recover(
>  	}
>  	return error;
>  }
> +
> +const struct xlog_recover_item_type xlog_bmap_intent_item_type = {
> +};
> +
> +const struct xlog_recover_item_type xlog_bmap_done_item_type = {
> +};
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index 1545657c3ca0..a416fc35e444 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -17,7 +17,6 @@
>  #include "xfs_trace.h"
>  #include "xfs_log.h"
>  
> -
>  kmem_zone_t	*xfs_buf_item_zone;
>  
>  static inline struct xfs_buf_log_item *BUF_ITEM(struct xfs_log_item *lip)
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> new file mode 100644
> index 000000000000..07ddf58209c3
> --- /dev/null
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -0,0 +1,37 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2000-2006 Silicon Graphics, Inc.
> + * All Rights Reserved.
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_shared.h"
> +#include "xfs_format.h"
> +#include "xfs_log_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_bit.h"
> +#include "xfs_mount.h"
> +#include "xfs_trans.h"
> +#include "xfs_buf_item.h"
> +#include "xfs_trans_priv.h"
> +#include "xfs_trace.h"
> +#include "xfs_log.h"
> +#include "xfs_log_priv.h"
> +#include "xfs_log_recover.h"
> +
> +STATIC enum xlog_recover_reorder
> +xlog_buf_reorder_fn(
> +	struct xlog_recover_item	*item)
> +{
> +	struct xfs_buf_log_format	*buf_f = item->ri_buf[0].i_addr;
> +
> +	if (buf_f->blf_flags & XFS_BLF_CANCEL)
> +		return XLOG_REORDER_CANCEL_LIST;
> +	if (buf_f->blf_flags & XFS_BLF_INODE_BUF)
> +		return XLOG_REORDER_INODE_BUFFER_LIST;
> +	return XLOG_REORDER_BUFFER_LIST;
> +}
> +
> +const struct xlog_recover_item_type xlog_buf_item_type = {
> +	.reorder_fn		= xlog_buf_reorder_fn,
> +};
> diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> index baad1748d0d1..3bd5b6c7e235 100644
> --- a/fs/xfs/xfs_dquot_item.c
> +++ b/fs/xfs/xfs_dquot_item.c
> @@ -17,6 +17,8 @@
>  #include "xfs_trans_priv.h"
>  #include "xfs_qm.h"
>  #include "xfs_log.h"
> +#include "xfs_log_priv.h"
> +#include "xfs_log_recover.h"
>  
>  static inline struct xfs_dq_logitem *DQUOT_ITEM(struct xfs_log_item *lip)
>  {
> @@ -383,3 +385,9 @@ xfs_qm_qoff_logitem_init(
>  	qf->qql_flags = flags;
>  	return qf;
>  }
> +
> +const struct xlog_recover_item_type xlog_dquot_item_type = {
> +};
> +
> +const struct xlog_recover_item_type xlog_quotaoff_item_type = {
> +};
> diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
> index 6ea847f6e298..c53e5f46ee26 100644
> --- a/fs/xfs/xfs_extfree_item.c
> +++ b/fs/xfs/xfs_extfree_item.c
> @@ -22,6 +22,7 @@
>  #include "xfs_bmap.h"
>  #include "xfs_trace.h"
>  #include "xfs_error.h"
> +#include "xfs_log_recover.h"
>  
>  kmem_zone_t	*xfs_efi_zone;
>  kmem_zone_t	*xfs_efd_zone;
> @@ -652,3 +653,9 @@ xfs_efi_recover(
>  	xfs_trans_cancel(tp);
>  	return error;
>  }
> +
> +const struct xlog_recover_item_type xlog_extfree_intent_item_type = {
> +};
> +
> +const struct xlog_recover_item_type xlog_extfree_done_item_type = {
> +};
> diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c
> index 490fee22b878..9f38a3c200a3 100644
> --- a/fs/xfs/xfs_icreate_item.c
> +++ b/fs/xfs/xfs_icreate_item.c
> @@ -11,6 +11,8 @@
>  #include "xfs_trans_priv.h"
>  #include "xfs_icreate_item.h"
>  #include "xfs_log.h"
> +#include "xfs_log_priv.h"
> +#include "xfs_log_recover.h"
>  
>  kmem_zone_t	*xfs_icreate_zone;		/* inode create item zone */
>  
> @@ -107,3 +109,14 @@ xfs_icreate_log(
>  	tp->t_flags |= XFS_TRANS_DIRTY;
>  	set_bit(XFS_LI_DIRTY, &icp->ic_item.li_flags);
>  }
> +
> +static enum xlog_recover_reorder
> +xlog_icreate_reorder(
> +		struct xlog_recover_item *item)
> +{
> +	return XLOG_REORDER_BUFFER_LIST;
> +}
> +
> +const struct xlog_recover_item_type xlog_icreate_item_type = {
> +	.reorder_fn		= xlog_icreate_reorder,
> +};
> diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> new file mode 100644
> index 000000000000..478f0a5c08ab
> --- /dev/null
> +++ b/fs/xfs/xfs_inode_item_recover.c
> @@ -0,0 +1,25 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2000-2006 Silicon Graphics, Inc.
> + * All Rights Reserved.
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_shared.h"
> +#include "xfs_format.h"
> +#include "xfs_log_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_mount.h"
> +#include "xfs_inode.h"
> +#include "xfs_trans.h"
> +#include "xfs_inode_item.h"
> +#include "xfs_trace.h"
> +#include "xfs_trans_priv.h"
> +#include "xfs_buf_item.h"
> +#include "xfs_log.h"
> +#include "xfs_error.h"
> +#include "xfs_log_priv.h"
> +#include "xfs_log_recover.h"
> +
> +const struct xlog_recover_item_type xlog_inode_item_type = {
> +};
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index db47dfc0cada..8ab107680883 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -1786,6 +1786,57 @@ xlog_clear_stale_blocks(
>   ******************************************************************************
>   */
>  
> +static int
> +xlog_set_item_type(
> +	struct xlog_recover_item		*item)
> +{
> +	switch (ITEM_TYPE(item)) {
> +	case XFS_LI_ICREATE:
> +		item->ri_type = &xlog_icreate_item_type;
> +		return 0;
> +	case XFS_LI_BUF:
> +		item->ri_type = &xlog_buf_item_type;
> +		return 0;
> +	case XFS_LI_EFI:
> +		item->ri_type = &xlog_extfree_intent_item_type;
> +		return 0;
> +	case XFS_LI_EFD:
> +		item->ri_type = &xlog_extfree_done_item_type;
> +		return 0;
> +	case XFS_LI_RUI:
> +		item->ri_type = &xlog_rmap_intent_item_type;
> +		return 0;
> +	case XFS_LI_RUD:
> +		item->ri_type = &xlog_rmap_done_item_type;
> +		return 0;
> +	case XFS_LI_CUI:
> +		item->ri_type = &xlog_refcount_intent_item_type;
> +		return 0;
> +	case XFS_LI_CUD:
> +		item->ri_type = &xlog_refcount_done_item_type;
> +		return 0;
> +	case XFS_LI_BUI:
> +		item->ri_type = &xlog_bmap_intent_item_type;
> +		return 0;
> +	case XFS_LI_BUD:
> +		item->ri_type = &xlog_bmap_done_item_type;
> +		return 0;
> +	case XFS_LI_INODE:
> +		item->ri_type = &xlog_inode_item_type;
> +		return 0;
> +#ifdef CONFIG_XFS_QUOTA
> +	case XFS_LI_DQUOT:
> +		item->ri_type = &xlog_dquot_item_type;
> +		return 0;
> +	case XFS_LI_QUOTAOFF:
> +		item->ri_type = &xlog_quotaoff_item_type;
> +		return 0;
> +#endif /* CONFIG_XFS_QUOTA */
> +	default:
> +		return -EFSCORRUPTED;
> +	}
> +}
> +
>  /*
>   * Sort the log items in the transaction.
>   *
> @@ -1851,41 +1902,10 @@ xlog_recover_reorder_trans(
>  
>  	list_splice_init(&trans->r_itemq, &sort_list);
>  	list_for_each_entry_safe(item, n, &sort_list, ri_list) {
> -		xfs_buf_log_format_t	*buf_f = item->ri_buf[0].i_addr;
> +		enum xlog_recover_reorder	fate = XLOG_REORDER_ITEM_LIST;
>  
> -		switch (ITEM_TYPE(item)) {
> -		case XFS_LI_ICREATE:
> -			list_move_tail(&item->ri_list, &buffer_list);
> -			break;
> -		case XFS_LI_BUF:
> -			if (buf_f->blf_flags & XFS_BLF_CANCEL) {
> -				trace_xfs_log_recover_item_reorder_head(log,
> -							trans, item, pass);
> -				list_move(&item->ri_list, &cancel_list);
> -				break;
> -			}
> -			if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
> -				list_move(&item->ri_list, &inode_buffer_list);
> -				break;
> -			}
> -			list_move_tail(&item->ri_list, &buffer_list);
> -			break;
> -		case XFS_LI_INODE:
> -		case XFS_LI_DQUOT:
> -		case XFS_LI_QUOTAOFF:
> -		case XFS_LI_EFD:
> -		case XFS_LI_EFI:
> -		case XFS_LI_RUI:
> -		case XFS_LI_RUD:
> -		case XFS_LI_CUI:
> -		case XFS_LI_CUD:
> -		case XFS_LI_BUI:
> -		case XFS_LI_BUD:
> -			trace_xfs_log_recover_item_reorder_tail(log,
> -							trans, item, pass);
> -			list_move_tail(&item->ri_list, &item_list);
> -			break;
> -		default:
> +		error = xlog_set_item_type(item);
> +		if (error) {
>  			xfs_warn(log->l_mp,
>  				"%s: unrecognized type of log operation (%d)",
>  				__func__, ITEM_TYPE(item));
> @@ -1896,11 +1916,32 @@ xlog_recover_reorder_trans(
>  			 */
>  			if (!list_empty(&sort_list))
>  				list_splice_init(&sort_list, &trans->r_itemq);
> -			error = -EIO;
> -			goto out;
> +			break;
> +		}
> +
> +		if (item->ri_type->reorder_fn)
> +			fate = item->ri_type->reorder_fn(item);
> +
> +		switch (fate) {
> +		case XLOG_REORDER_BUFFER_LIST:
> +			list_move_tail(&item->ri_list, &buffer_list);
> +			break;
> +		case XLOG_REORDER_CANCEL_LIST:
> +			trace_xfs_log_recover_item_reorder_head(log,
> +					trans, item, pass);
> +			list_move(&item->ri_list, &cancel_list);
> +			break;
> +		case XLOG_REORDER_INODE_BUFFER_LIST:
> +			list_move(&item->ri_list, &inode_buffer_list);
> +			break;
> +		case XLOG_REORDER_ITEM_LIST:
> +			trace_xfs_log_recover_item_reorder_tail(log,
> +							trans, item, pass);
> +			list_move_tail(&item->ri_list, &item_list);
> +			break;
>  		}
>  	}
> -out:
> +
>  	ASSERT(list_empty(&sort_list));
>  	if (!list_empty(&buffer_list))
>  		list_splice(&buffer_list, &trans->r_itemq);
> diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
> index 8eeed73928cd..ddab09385bfb 100644
> --- a/fs/xfs/xfs_refcount_item.c
> +++ b/fs/xfs/xfs_refcount_item.c
> @@ -18,6 +18,7 @@
>  #include "xfs_log.h"
>  #include "xfs_refcount.h"
>  #include "xfs_error.h"
> +#include "xfs_log_recover.h"
>  
>  kmem_zone_t	*xfs_cui_zone;
>  kmem_zone_t	*xfs_cud_zone;
> @@ -590,3 +591,9 @@ xfs_cui_recover(
>  	xfs_trans_cancel(tp);
>  	return error;
>  }
> +
> +const struct xlog_recover_item_type xlog_refcount_intent_item_type = {
> +};
> +
> +const struct xlog_recover_item_type xlog_refcount_done_item_type = {
> +};
> diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
> index 4911b68f95dd..bcad3db1f3a4 100644
> --- a/fs/xfs/xfs_rmap_item.c
> +++ b/fs/xfs/xfs_rmap_item.c
> @@ -18,6 +18,7 @@
>  #include "xfs_log.h"
>  #include "xfs_rmap.h"
>  #include "xfs_error.h"
> +#include "xfs_log_recover.h"
>  
>  kmem_zone_t	*xfs_rui_zone;
>  kmem_zone_t	*xfs_rud_zone;
> @@ -606,3 +607,9 @@ xfs_rui_recover(
>  	xfs_trans_cancel(tp);
>  	return error;
>  }
> +
> +const struct xlog_recover_item_type xlog_rmap_intent_item_type = {
> +};
> +
> +const struct xlog_recover_item_type xlog_rmap_done_item_type = {
> +};
> 
> 


-- 
chandan

* Re: [PATCH 02/21] xfs: refactor log recovery item dispatch for pass2 readhead functions
  2020-04-30  0:47 ` [PATCH 02/21] xfs: refactor log recovery item dispatch for pass2 readhead functions Darrick J. Wong
@ 2020-05-01 12:10   ` Chandan Rajendra
  0 siblings, 0 replies; 41+ messages in thread
From: Chandan Rajendra @ 2020-05-01 12:10 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thursday, April 30, 2020 6:17 AM Darrick J. Wong wrote: 
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Move the pass2 readhead code into the per-item source code files and use
> the dispatch function to call them.
>

Readahead is issued for buf, inode and dquot items, just as it is in the
existing code.

Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
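
A standalone sketch of the optional-hook pattern, for illustration only:
the demo_* names are hypothetical, and demo_buf_readahead merely records
the request instead of issuing I/O the way xlog_buf_readahead does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical log context that just counts readahead requests. */
struct demo_log {
	int		readahead_issued;
	unsigned long	last_blkno;
};

struct demo_item;

struct demo_item_type {
	/* Start readahead for pass2, if provided. */
	void (*ra_pass2_fn)(struct demo_log *log,
			    const struct demo_item *item);
};

struct demo_item {
	unsigned long			blkno;
	const struct demo_item_type	*type;
};

/* Stand-in for the shared readahead helper; records instead of doing I/O. */
static void
demo_buf_readahead(
	struct demo_log	*log,
	unsigned long	blkno)
{
	log->readahead_issued++;
	log->last_blkno = blkno;
}

static void
demo_buf_ra_pass2(
	struct demo_log		*log,
	const struct demo_item	*item)
{
	demo_buf_readahead(log, item->blkno);
}

/* Buffer-like items supply the hook ... */
static const struct demo_item_type demo_buf_type = {
	.ra_pass2_fn	= demo_buf_ra_pass2,
};

/* ... intent-like items do not, so nothing is read ahead for them. */
static const struct demo_item_type demo_intent_type = {
	.ra_pass2_fn	= NULL,
};

/* The caller no longer switches on the item type. */
static void
demo_queue_readahead(
	struct demo_log		*log,
	const struct demo_item	*item)
{
	if (item->type->ra_pass2_fn)
		item->type->ra_pass2_fn(log, item);
}
```

Item types without a hook are skipped silently, matching the old switch
statement's behaviour for types it did not list.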

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_log_recover.h |    6 ++
>  fs/xfs/xfs_buf_item_recover.c   |   11 +++++
>  fs/xfs/xfs_dquot_item.c         |   34 ++++++++++++++
>  fs/xfs/xfs_inode_item_recover.c |   19 ++++++++
>  fs/xfs/xfs_log_recover.c        |   95 +--------------------------------------
>  5 files changed, 73 insertions(+), 92 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
> index 38ae9c371edb..1463eba47254 100644
> --- a/fs/xfs/libxfs/xfs_log_recover.h
> +++ b/fs/xfs/libxfs/xfs_log_recover.h
> @@ -29,6 +29,9 @@ struct xlog_recover_item_type {
>  	 * values mean.
>  	 */
>  	enum xlog_recover_reorder (*reorder_fn)(struct xlog_recover_item *item);
> +
> +	/* Start readahead for pass2, if provided. */
> +	void (*ra_pass2_fn)(struct xlog *log, struct xlog_recover_item *item);
>  };
>  
>  extern const struct xlog_recover_item_type xlog_icreate_item_type;
> @@ -90,4 +93,7 @@ struct xlog_recover {
>  #define	XLOG_RECOVER_PASS1	1
>  #define	XLOG_RECOVER_PASS2	2
>  
> +void xlog_buf_readahead(struct xlog *log, xfs_daddr_t blkno, uint len,
> +		const struct xfs_buf_ops *ops);
> +
>  #endif	/* __XFS_LOG_RECOVER_H__ */
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> index 07ddf58209c3..c756b8e55fde 100644
> --- a/fs/xfs/xfs_buf_item_recover.c
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -32,6 +32,17 @@ xlog_buf_reorder_fn(
>  	return XLOG_REORDER_BUFFER_LIST;
>  }
>  
> +STATIC void
> +xlog_recover_buffer_ra_pass2(
> +	struct xlog                     *log,
> +	struct xlog_recover_item        *item)
> +{
> +	struct xfs_buf_log_format	*buf_f = item->ri_buf[0].i_addr;
> +
> +	xlog_buf_readahead(log, buf_f->blf_blkno, buf_f->blf_len, NULL);
> +}
> +
>  const struct xlog_recover_item_type xlog_buf_item_type = {
>  	.reorder_fn		= xlog_buf_reorder_fn,
> +	.ra_pass2_fn		= xlog_recover_buffer_ra_pass2,
>  };
> diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> index 3bd5b6c7e235..2a05d1239423 100644
> --- a/fs/xfs/xfs_dquot_item.c
> +++ b/fs/xfs/xfs_dquot_item.c
> @@ -386,7 +386,41 @@ xfs_qm_qoff_logitem_init(
>  	return qf;
>  }
>  
> +STATIC void
> +xlog_recover_dquot_ra_pass2(
> +	struct xlog			*log,
> +	struct xlog_recover_item	*item)
> +{
> +	struct xfs_mount	*mp = log->l_mp;
> +	struct xfs_disk_dquot	*recddq;
> +	struct xfs_dq_logformat	*dq_f;
> +	uint			type;
> +
> +	if (mp->m_qflags == 0)
> +		return;
> +
> +	recddq = item->ri_buf[1].i_addr;
> +	if (recddq == NULL)
> +		return;
> +	if (item->ri_buf[1].i_len < sizeof(struct xfs_disk_dquot))
> +		return;
> +
> +	type = recddq->d_flags & (XFS_DQ_USER | XFS_DQ_PROJ | XFS_DQ_GROUP);
> +	ASSERT(type);
> +	if (log->l_quotaoffs_flag & type)
> +		return;
> +
> +	dq_f = item->ri_buf[0].i_addr;
> +	ASSERT(dq_f);
> +	ASSERT(dq_f->qlf_len == 1);
> +
> +	xlog_buf_readahead(log, dq_f->qlf_blkno,
> +			XFS_FSB_TO_BB(mp, dq_f->qlf_len),
> +			&xfs_dquot_buf_ra_ops);
> +}
> +
>  const struct xlog_recover_item_type xlog_dquot_item_type = {
> +	.ra_pass2_fn		= xlog_recover_dquot_ra_pass2,
>  };
>  
>  const struct xlog_recover_item_type xlog_quotaoff_item_type = {
> diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> index 478f0a5c08ab..d97d8caa4652 100644
> --- a/fs/xfs/xfs_inode_item_recover.c
> +++ b/fs/xfs/xfs_inode_item_recover.c
> @@ -21,5 +21,24 @@
>  #include "xfs_log_priv.h"
>  #include "xfs_log_recover.h"
>  
> +STATIC void
> +xlog_recover_inode_ra_pass2(
> +	struct xlog                     *log,
> +	struct xlog_recover_item        *item)
> +{
> +	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
> +		struct xfs_inode_log_format	*ilfp = item->ri_buf[0].i_addr;
> +
> +		xlog_buf_readahead(log, ilfp->ilf_blkno, ilfp->ilf_len,
> +				   &xfs_inode_buf_ra_ops);
> +	} else {
> +		struct xfs_inode_log_format_32	*ilfp = item->ri_buf[0].i_addr;
> +
> +		xlog_buf_readahead(log, ilfp->ilf_blkno, ilfp->ilf_len,
> +				   &xfs_inode_buf_ra_ops);
> +	}
> +}
> +
>  const struct xlog_recover_item_type xlog_inode_item_type = {
> +	.ra_pass2_fn		= xlog_recover_inode_ra_pass2,
>  };
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 8ab107680883..b61323cc5a11 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2045,7 +2045,7 @@ xlog_put_buffer_cancelled(
>  	return true;
>  }
>  
> -static void
> +void
>  xlog_buf_readahead(
>  	struct xlog		*log,
>  	xfs_daddr_t		blkno,
> @@ -3912,96 +3912,6 @@ xlog_recover_do_icreate_pass2(
>  				     length, be32_to_cpu(icl->icl_gen));
>  }
>  
> -STATIC void
> -xlog_recover_buffer_ra_pass2(
> -	struct xlog                     *log,
> -	struct xlog_recover_item        *item)
> -{
> -	struct xfs_buf_log_format	*buf_f = item->ri_buf[0].i_addr;
> -
> -	xlog_buf_readahead(log, buf_f->blf_blkno, buf_f->blf_len, NULL);
> -}
> -
> -STATIC void
> -xlog_recover_inode_ra_pass2(
> -	struct xlog                     *log,
> -	struct xlog_recover_item        *item)
> -{
> -	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
> -		struct xfs_inode_log_format	*ilfp = item->ri_buf[0].i_addr;
> -
> -		xlog_buf_readahead(log, ilfp->ilf_blkno, ilfp->ilf_len,
> -				   &xfs_inode_buf_ra_ops);
> -	} else {
> -		struct xfs_inode_log_format_32	*ilfp = item->ri_buf[0].i_addr;
> -
> -		xlog_buf_readahead(log, ilfp->ilf_blkno, ilfp->ilf_len,
> -				   &xfs_inode_buf_ra_ops);
> -	}
> -}
> -
> -STATIC void
> -xlog_recover_dquot_ra_pass2(
> -	struct xlog			*log,
> -	struct xlog_recover_item	*item)
> -{
> -	struct xfs_mount	*mp = log->l_mp;
> -	struct xfs_disk_dquot	*recddq;
> -	struct xfs_dq_logformat	*dq_f;
> -	uint			type;
> -
> -	if (mp->m_qflags == 0)
> -		return;
> -
> -	recddq = item->ri_buf[1].i_addr;
> -	if (recddq == NULL)
> -		return;
> -	if (item->ri_buf[1].i_len < sizeof(struct xfs_disk_dquot))
> -		return;
> -
> -	type = recddq->d_flags & (XFS_DQ_USER | XFS_DQ_PROJ | XFS_DQ_GROUP);
> -	ASSERT(type);
> -	if (log->l_quotaoffs_flag & type)
> -		return;
> -
> -	dq_f = item->ri_buf[0].i_addr;
> -	ASSERT(dq_f);
> -	ASSERT(dq_f->qlf_len == 1);
> -
> -	xlog_buf_readahead(log, dq_f->qlf_blkno,
> -			XFS_FSB_TO_BB(mp, dq_f->qlf_len),
> -			&xfs_dquot_buf_ra_ops);
> -}
> -
> -STATIC void
> -xlog_recover_ra_pass2(
> -	struct xlog			*log,
> -	struct xlog_recover_item	*item)
> -{
> -	switch (ITEM_TYPE(item)) {
> -	case XFS_LI_BUF:
> -		xlog_recover_buffer_ra_pass2(log, item);
> -		break;
> -	case XFS_LI_INODE:
> -		xlog_recover_inode_ra_pass2(log, item);
> -		break;
> -	case XFS_LI_DQUOT:
> -		xlog_recover_dquot_ra_pass2(log, item);
> -		break;
> -	case XFS_LI_EFI:
> -	case XFS_LI_EFD:
> -	case XFS_LI_QUOTAOFF:
> -	case XFS_LI_RUI:
> -	case XFS_LI_RUD:
> -	case XFS_LI_CUI:
> -	case XFS_LI_CUD:
> -	case XFS_LI_BUI:
> -	case XFS_LI_BUD:
> -	default:
> -		break;
> -	}
> -}
> -
>  STATIC int
>  xlog_recover_commit_pass1(
>  	struct xlog			*log,
> @@ -4138,7 +4048,8 @@ xlog_recover_commit_trans(
>  			error = xlog_recover_commit_pass1(log, trans, item);
>  			break;
>  		case XLOG_RECOVER_PASS2:
> -			xlog_recover_ra_pass2(log, item);
> +			if (item->ri_type && item->ri_type->ra_pass2_fn)
> +				item->ri_type->ra_pass2_fn(log, item);
>  			list_move_tail(&item->ri_list, &ra_list);
>  			items_queued++;
>  			if (items_queued >= XLOG_RECOVER_COMMIT_QUEUE_MAX) {
> 
> 


-- 
chandan





* Re: [PATCH 04/21] xfs: refactor log recovery buffer item dispatch for pass2 commit functions
  2020-04-30  0:48 ` [PATCH 04/21] xfs: refactor log recovery buffer item dispatch for pass2 " Darrick J. Wong
@ 2020-05-01 13:43   ` Chandan Rajendra
  0 siblings, 0 replies; 41+ messages in thread
From: Chandan Rajendra @ 2020-05-01 13:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thursday, April 30, 2020 6:18 AM Darrick J. Wong wrote: 
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Move the log buffer item pass2 commit code into the per-item source code
> files and use the dispatch function to call it.  We do these one at a
> time because there's a lot of code to move.  No functional changes.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

The changes look good to me.

Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
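
For anyone following along, the dispatch pattern described in the commit
message can be sketched in userspace as below. This is only an illustration
under stated assumptions: every demo_* name is hypothetical and the structures
are heavily simplified stand-ins, not the kernel's xlog_recover_item_type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_item;

struct demo_item_type {
	/* Optional hook; NULL means "no pass2 work for this item type". */
	int (*commit_pass2_fn)(struct demo_item *item);
};

struct demo_item {
	const struct demo_item_type	*type;
	int				replayed;
};

/* Per-item-type code lives next to the rest of that item's code. */
static int demo_buf_commit_pass2(struct demo_item *item)
{
	item->replayed = 1;
	return 0;
}

static const struct demo_item_type demo_buf_item_type = {
	.commit_pass2_fn = demo_buf_commit_pass2,
};

/* An item type with nothing to do in pass2 leaves the hook unset. */
static const struct demo_item_type demo_noop_item_type = { 0 };

/* Generic caller: no switch on the item type, just a NULL-checked call. */
static int demo_commit_pass2(struct demo_item *item)
{
	assert(item != NULL);
	if (!item->type || !item->type->commit_pass2_fn)
		return 0;
	return item->type->commit_pass2_fn(item);
}
```

The point of the design is that adding a new log item type only requires
filling in a new ops structure; the generic recovery loop stays untouched.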

> ---
>  fs/xfs/libxfs/xfs_log_recover.h |   23 +
>  fs/xfs/xfs_buf_item_recover.c   |  790 +++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_log_recover.c        |  798 ---------------------------------------
>  3 files changed, 820 insertions(+), 791 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
> index b933dc8bb8a3..5017d80c0f4b 100644
> --- a/fs/xfs/libxfs/xfs_log_recover.h
> +++ b/fs/xfs/libxfs/xfs_log_recover.h
> @@ -36,6 +36,26 @@ struct xlog_recover_item_type {
>  	/* Do whatever work we need to do for pass1, if provided. */
>  	int (*commit_pass1_fn)(struct xlog *log,
>  			       struct xlog_recover_item *item);
> +
> +	/*
> +	 * This function should do whatever work is needed for pass2 of log
> +	 * recovery, if provided.
> +	 *
> +	 * If the recovered item is an intent item, this function should parse
> +	 * the recovered item to construct an in-core log intent item and
> +	 * insert it into the AIL.  The in-core log intent item should have 1
> +	 * refcount so that the item is freed either (a) when we commit the
> +	 * recovered log item for the intent-done item; (b) replay the work and
> +	 * log a new intent-done item; or (c) recovery fails and we have to
> +	 * abort.
> +	 *
> +	 * If the recovered item is an intent-done item, this function should
> +	 * parse the recovered item to find the id of the corresponding intent
> +	 * log item.  Next, it should find the in-core log intent item in the
> +	 * AIL and release it.
> +	 */
> +	int (*commit_pass2_fn)(struct xlog *log, struct list_head *buffer_list,
> +			       struct xlog_recover_item *item, xfs_lsn_t lsn);
>  };
>  
>  extern const struct xlog_recover_item_type xlog_icreate_item_type;
> @@ -100,5 +120,8 @@ struct xlog_recover {
>  void xlog_buf_readahead(struct xlog *log, xfs_daddr_t blkno, uint len,
>  		const struct xfs_buf_ops *ops);
>  bool xlog_add_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
> +bool xlog_is_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
> +bool xlog_put_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
> +void xlog_recover_iodone(struct xfs_buf *bp);
>  
>  #endif	/* __XFS_LOG_RECOVER_H__ */
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> index deda3ad32d95..d324f810819d 100644
> --- a/fs/xfs/xfs_buf_item_recover.c
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -18,6 +18,10 @@
>  #include "xfs_log.h"
>  #include "xfs_log_priv.h"
>  #include "xfs_log_recover.h"
> +#include "xfs_error.h"
> +#include "xfs_inode.h"
> +#include "xfs_dir2.h"
> +#include "xfs_quota.h"
>  
>  STATIC enum xlog_recover_reorder
>  xlog_buf_reorder_fn(
> @@ -68,8 +72,794 @@ xlog_recover_buffer_commit_pass1(
>  	return 0;
>  }
>  
> +/*
> + * Validate the recovered buffer is of the correct type and attach the
> + * appropriate buffer operations to it for writeback. Magic numbers are in a
> + * few places:
> + *	the first 16 bits of the buffer (inode buffer, dquot buffer),
> + *	the first 32 bits of the buffer (most blocks),
> + *	inside a struct xfs_da_blkinfo at the start of the buffer.
> + */
> +static void
> +xlog_recover_validate_buf_type(
> +	struct xfs_mount		*mp,
> +	struct xfs_buf			*bp,
> +	struct xfs_buf_log_format	*buf_f,
> +	xfs_lsn_t			current_lsn)
> +{
> +	struct xfs_da_blkinfo		*info = bp->b_addr;
> +	uint32_t			magic32;
> +	uint16_t			magic16;
> +	uint16_t			magicda;
> +	char				*warnmsg = NULL;
> +
> +	/*
> +	 * We can only do post recovery validation on items on CRC enabled
> +	 * filesystems as we need to know when the buffer was written to be able
> +	 * to determine if we should have replayed the item. If we replay old
> +	 * metadata over a newer buffer, then it will enter a temporarily
> +	 * inconsistent state resulting in verification failures. Hence for now
> +	 * just avoid the verification stage for non-crc filesystems.
> +	 */
> +	if (!xfs_sb_version_hascrc(&mp->m_sb))
> +		return;
> +
> +	magic32 = be32_to_cpu(*(__be32 *)bp->b_addr);
> +	magic16 = be16_to_cpu(*(__be16*)bp->b_addr);
> +	magicda = be16_to_cpu(info->magic);
> +	switch (xfs_blft_from_flags(buf_f)) {
> +	case XFS_BLFT_BTREE_BUF:
> +		switch (magic32) {
> +		case XFS_ABTB_CRC_MAGIC:
> +		case XFS_ABTB_MAGIC:
> +			bp->b_ops = &xfs_bnobt_buf_ops;
> +			break;
> +		case XFS_ABTC_CRC_MAGIC:
> +		case XFS_ABTC_MAGIC:
> +			bp->b_ops = &xfs_cntbt_buf_ops;
> +			break;
> +		case XFS_IBT_CRC_MAGIC:
> +		case XFS_IBT_MAGIC:
> +			bp->b_ops = &xfs_inobt_buf_ops;
> +			break;
> +		case XFS_FIBT_CRC_MAGIC:
> +		case XFS_FIBT_MAGIC:
> +			bp->b_ops = &xfs_finobt_buf_ops;
> +			break;
> +		case XFS_BMAP_CRC_MAGIC:
> +		case XFS_BMAP_MAGIC:
> +			bp->b_ops = &xfs_bmbt_buf_ops;
> +			break;
> +		case XFS_RMAP_CRC_MAGIC:
> +			bp->b_ops = &xfs_rmapbt_buf_ops;
> +			break;
> +		case XFS_REFC_CRC_MAGIC:
> +			bp->b_ops = &xfs_refcountbt_buf_ops;
> +			break;
> +		default:
> +			warnmsg = "Bad btree block magic!";
> +			break;
> +		}
> +		break;
> +	case XFS_BLFT_AGF_BUF:
> +		if (magic32 != XFS_AGF_MAGIC) {
> +			warnmsg = "Bad AGF block magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_agf_buf_ops;
> +		break;
> +	case XFS_BLFT_AGFL_BUF:
> +		if (magic32 != XFS_AGFL_MAGIC) {
> +			warnmsg = "Bad AGFL block magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_agfl_buf_ops;
> +		break;
> +	case XFS_BLFT_AGI_BUF:
> +		if (magic32 != XFS_AGI_MAGIC) {
> +			warnmsg = "Bad AGI block magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_agi_buf_ops;
> +		break;
> +	case XFS_BLFT_UDQUOT_BUF:
> +	case XFS_BLFT_PDQUOT_BUF:
> +	case XFS_BLFT_GDQUOT_BUF:
> +#ifdef CONFIG_XFS_QUOTA
> +		if (magic16 != XFS_DQUOT_MAGIC) {
> +			warnmsg = "Bad DQUOT block magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_dquot_buf_ops;
> +#else
> +		xfs_alert(mp,
> +	"Trying to recover dquots without QUOTA support built in!");
> +		ASSERT(0);
> +#endif
> +		break;
> +	case XFS_BLFT_DINO_BUF:
> +		if (magic16 != XFS_DINODE_MAGIC) {
> +			warnmsg = "Bad INODE block magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_inode_buf_ops;
> +		break;
> +	case XFS_BLFT_SYMLINK_BUF:
> +		if (magic32 != XFS_SYMLINK_MAGIC) {
> +			warnmsg = "Bad symlink block magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_symlink_buf_ops;
> +		break;
> +	case XFS_BLFT_DIR_BLOCK_BUF:
> +		if (magic32 != XFS_DIR2_BLOCK_MAGIC &&
> +		    magic32 != XFS_DIR3_BLOCK_MAGIC) {
> +			warnmsg = "Bad dir block magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_dir3_block_buf_ops;
> +		break;
> +	case XFS_BLFT_DIR_DATA_BUF:
> +		if (magic32 != XFS_DIR2_DATA_MAGIC &&
> +		    magic32 != XFS_DIR3_DATA_MAGIC) {
> +			warnmsg = "Bad dir data magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_dir3_data_buf_ops;
> +		break;
> +	case XFS_BLFT_DIR_FREE_BUF:
> +		if (magic32 != XFS_DIR2_FREE_MAGIC &&
> +		    magic32 != XFS_DIR3_FREE_MAGIC) {
> +			warnmsg = "Bad dir3 free magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_dir3_free_buf_ops;
> +		break;
> +	case XFS_BLFT_DIR_LEAF1_BUF:
> +		if (magicda != XFS_DIR2_LEAF1_MAGIC &&
> +		    magicda != XFS_DIR3_LEAF1_MAGIC) {
> +			warnmsg = "Bad dir leaf1 magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_dir3_leaf1_buf_ops;
> +		break;
> +	case XFS_BLFT_DIR_LEAFN_BUF:
> +		if (magicda != XFS_DIR2_LEAFN_MAGIC &&
> +		    magicda != XFS_DIR3_LEAFN_MAGIC) {
> +			warnmsg = "Bad dir leafn magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_dir3_leafn_buf_ops;
> +		break;
> +	case XFS_BLFT_DA_NODE_BUF:
> +		if (magicda != XFS_DA_NODE_MAGIC &&
> +		    magicda != XFS_DA3_NODE_MAGIC) {
> +			warnmsg = "Bad da node magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_da3_node_buf_ops;
> +		break;
> +	case XFS_BLFT_ATTR_LEAF_BUF:
> +		if (magicda != XFS_ATTR_LEAF_MAGIC &&
> +		    magicda != XFS_ATTR3_LEAF_MAGIC) {
> +			warnmsg = "Bad attr leaf magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_attr3_leaf_buf_ops;
> +		break;
> +	case XFS_BLFT_ATTR_RMT_BUF:
> +		if (magic32 != XFS_ATTR3_RMT_MAGIC) {
> +			warnmsg = "Bad attr remote magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_attr3_rmt_buf_ops;
> +		break;
> +	case XFS_BLFT_SB_BUF:
> +		if (magic32 != XFS_SB_MAGIC) {
> +			warnmsg = "Bad SB block magic!";
> +			break;
> +		}
> +		bp->b_ops = &xfs_sb_buf_ops;
> +		break;
> +#ifdef CONFIG_XFS_RT
> +	case XFS_BLFT_RTBITMAP_BUF:
> +	case XFS_BLFT_RTSUMMARY_BUF:
> +		/* no magic numbers for verification of RT buffers */
> +		bp->b_ops = &xfs_rtbuf_ops;
> +		break;
> +#endif /* CONFIG_XFS_RT */
> +	default:
> +		xfs_warn(mp, "Unknown buffer type %d!",
> +			 xfs_blft_from_flags(buf_f));
> +		break;
> +	}
> +
> +	/*
> +	 * Nothing else to do in the case of a NULL current LSN as this means
> +	 * the buffer is more recent than the change in the log and will be
> +	 * skipped.
> +	 */
> +	if (current_lsn == NULLCOMMITLSN)
> +		return;
> +
> +	if (warnmsg) {
> +		xfs_warn(mp, warnmsg);
> +		ASSERT(0);
> +	}
> +
> +	/*
> +	 * We must update the metadata LSN of the buffer as it is written out to
> +	 * ensure that older transactions never replay over this one and corrupt
> +	 * the buffer. This can occur if log recovery is interrupted at some
> +	 * point after the current transaction completes, at which point a
> +	 * subsequent mount starts recovery from the beginning.
> +	 *
> +	 * Write verifiers update the metadata LSN from log items attached to
> +	 * the buffer. Therefore, initialize a bli purely to carry the LSN to
> +	 * the verifier. We'll clean it up in our ->iodone() callback.
> +	 */
> +	if (bp->b_ops) {
> +		struct xfs_buf_log_item	*bip;
> +
> +		ASSERT(!bp->b_iodone || bp->b_iodone == xlog_recover_iodone);
> +		bp->b_iodone = xlog_recover_iodone;
> +		xfs_buf_item_init(bp, mp);
> +		bip = bp->b_log_item;
> +		bip->bli_item.li_lsn = current_lsn;
> +	}
> +}
> +
> +/*
> + * Perform a 'normal' buffer recovery.  Each logged region of the
> + * buffer should be copied over the corresponding region in the
> + * given buffer.  The bitmap in the buf log format structure indicates
> + * where to place the logged data.
> + */
> +STATIC void
> +xlog_recover_do_reg_buffer(
> +	struct xfs_mount		*mp,
> +	struct xlog_recover_item	*item,
> +	struct xfs_buf			*bp,
> +	struct xfs_buf_log_format	*buf_f,
> +	xfs_lsn_t			current_lsn)
> +{
> +	int			i;
> +	int			bit;
> +	int			nbits;
> +	xfs_failaddr_t		fa;
> +	const size_t		size_disk_dquot = sizeof(struct xfs_disk_dquot);
> +
> +	trace_xfs_log_recover_buf_reg_buf(mp->m_log, buf_f);
> +
> +	bit = 0;
> +	i = 1;  /* 0 is the buf format structure */
> +	while (1) {
> +		bit = xfs_next_bit(buf_f->blf_data_map,
> +				   buf_f->blf_map_size, bit);
> +		if (bit == -1)
> +			break;
> +		nbits = xfs_contig_bits(buf_f->blf_data_map,
> +					buf_f->blf_map_size, bit);
> +		ASSERT(nbits > 0);
> +		ASSERT(item->ri_buf[i].i_addr != NULL);
> +		ASSERT(item->ri_buf[i].i_len % XFS_BLF_CHUNK == 0);
> +		ASSERT(BBTOB(bp->b_length) >=
> +		       ((uint)bit << XFS_BLF_SHIFT) + (nbits << XFS_BLF_SHIFT));
> +
> +		/*
> +		 * The dirty regions logged in the buffer, even though
> +		 * contiguous, may span multiple chunks. This is because the
> +		 * dirty region may span a physical page boundary in a buffer
> +		 * and hence be split into two separate vectors for writing into
> +		 * the log. Hence we need to trim nbits back to the length of
> +		 * the current region being copied out of the log.
> +		 */
> +		if (item->ri_buf[i].i_len < (nbits << XFS_BLF_SHIFT))
> +			nbits = item->ri_buf[i].i_len >> XFS_BLF_SHIFT;
> +
> +		/*
> +		 * Do a sanity check if this is a dquot buffer. Just checking
> +		 * the first dquot in the buffer should do. XXXThis is
> +		 * probably a good thing to do for other buf types also.
> +		 */
> +		fa = NULL;
> +		if (buf_f->blf_flags &
> +		   (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
> +			if (item->ri_buf[i].i_addr == NULL) {
> +				xfs_alert(mp,
> +					"XFS: NULL dquot in %s.", __func__);
> +				goto next;
> +			}
> +			if (item->ri_buf[i].i_len < size_disk_dquot) {
> +				xfs_alert(mp,
> +					"XFS: dquot too small (%d) in %s.",
> +					item->ri_buf[i].i_len, __func__);
> +				goto next;
> +			}
> +			fa = xfs_dquot_verify(mp, item->ri_buf[i].i_addr,
> +					       -1, 0);
> +			if (fa) {
> +				xfs_alert(mp,
> +	"dquot corrupt at %pS trying to replay into block 0x%llx",
> +					fa, bp->b_bn);
> +				goto next;
> +			}
> +		}
> +
> +		memcpy(xfs_buf_offset(bp,
> +			(uint)bit << XFS_BLF_SHIFT),	/* dest */
> +			item->ri_buf[i].i_addr,		/* source */
> +			nbits<<XFS_BLF_SHIFT);		/* length */
> + next:
> +		i++;
> +		bit += nbits;
> +	}
> +
> +	/* Shouldn't be any more regions */
> +	ASSERT(i == item->ri_total);
> +
> +	xlog_recover_validate_buf_type(mp, bp, buf_f, current_lsn);
> +}
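
The bitmap walk in xlog_recover_do_reg_buffer() above can be sketched in
userspace roughly as follows. This is a minimal illustration, not the kernel
code: the demo_* helpers are hypothetical replacements for xfs_next_bit() and
xfs_contig_bits(), and the 128-byte chunk size (shift of 7) is an assumption
about XFS_BLF_SHIFT. The dquot checks are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CHUNK_SHIFT	7	/* assumed: one map bit covers 128 bytes */
#define DEMO_NBWORD		32	/* bits per bitmap word */

static int demo_test_bit(const uint32_t *map, int bit)
{
	return (map[bit / DEMO_NBWORD] >> (bit % DEMO_NBWORD)) & 1;
}

/* Return the first set bit at or after start_bit, or -1 if there is none. */
static int demo_next_bit(const uint32_t *map, int nwords, int start_bit)
{
	int bit;

	for (bit = start_bit; bit < nwords * DEMO_NBWORD; bit++)
		if (demo_test_bit(map, bit))
			return bit;
	return -1;
}

/* Count the run of consecutive set bits beginning at start_bit. */
static int demo_contig_bits(const uint32_t *map, int nwords, int start_bit)
{
	int bit = start_bit;

	while (bit < nwords * DEMO_NBWORD && demo_test_bit(map, bit))
		bit++;
	return bit - start_bit;
}

struct demo_region {
	const void	*addr;	/* logged bytes */
	size_t		len;	/* multiple of the chunk size */
};

/*
 * Copy each logged region to the buffer offset named by the dirty bitmap.
 * Returns the number of regions consumed.
 */
static int demo_replay(uint8_t *buf, const uint32_t *map, int nwords,
		       const struct demo_region *regions, int nregions)
{
	int bit = 0;
	int i = 0;

	while (i < nregions) {
		int nbits;

		bit = demo_next_bit(map, nwords, bit);
		if (bit == -1)
			break;
		nbits = demo_contig_bits(map, nwords, bit);
		assert(nbits > 0);

		/* A contiguous dirty run may be split over several regions. */
		if (regions[i].len < ((size_t)nbits << DEMO_CHUNK_SHIFT))
			nbits = (int)(regions[i].len >> DEMO_CHUNK_SHIFT);

		memcpy(buf + ((size_t)bit << DEMO_CHUNK_SHIFT),
		       regions[i].addr, (size_t)nbits << DEMO_CHUNK_SHIFT);
		i++;
		bit += nbits;
	}
	return i;
}
```

Trimming nbits to the region length is what lets a single run of dirty bits
be satisfied by two or more log vectors, as the comment in the patch explains.
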
> +
> +/*
> + * Perform a dquot buffer recovery.
> + * Simple algorithm: if we have found a QUOTAOFF log item of the same type
> + * (ie. USR or GRP), then just toss this buffer away; don't recover it.
> + * Else, treat it as a regular buffer and do recovery.
> + *
> + * Return false if the buffer was tossed and true if we recovered the buffer to
> + * indicate to the caller if the buffer needs writing.
> + */
> +STATIC bool
> +xlog_recover_do_dquot_buffer(
> +	struct xfs_mount		*mp,
> +	struct xlog			*log,
> +	struct xlog_recover_item	*item,
> +	struct xfs_buf			*bp,
> +	struct xfs_buf_log_format	*buf_f)
> +{
> +	uint			type;
> +
> +	trace_xfs_log_recover_buf_dquot_buf(log, buf_f);
> +
> +	/*
> +	 * Filesystems are required to send in quota flags at mount time.
> +	 */
> +	if (!mp->m_qflags)
> +		return false;
> +
> +	type = 0;
> +	if (buf_f->blf_flags & XFS_BLF_UDQUOT_BUF)
> +		type |= XFS_DQ_USER;
> +	if (buf_f->blf_flags & XFS_BLF_PDQUOT_BUF)
> +		type |= XFS_DQ_PROJ;
> +	if (buf_f->blf_flags & XFS_BLF_GDQUOT_BUF)
> +		type |= XFS_DQ_GROUP;
> +	/*
> +	 * This type of quotas was turned off, so ignore this buffer
> +	 */
> +	if (log->l_quotaoffs_flag & type)
> +		return false;
> +
> +	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, NULLCOMMITLSN);
> +	return true;
> +}
> +
> +/*
> + * Perform recovery for a buffer full of inodes.  In these buffers, the only
> + * data which should be recovered is that which corresponds to the
> + * di_next_unlinked pointers in the on disk inode structures.  The rest of the
> + * data for the inodes is always logged through the inodes themselves rather
> + * than the inode buffer and is recovered in xlog_recover_inode_pass2().
> + *
> + * The only time when buffers full of inodes are fully recovered is when the
> + * buffer is full of newly allocated inodes.  In this case the buffer will
> + * not be marked as an inode buffer and so will be sent to
> + * xlog_recover_do_reg_buffer() below during recovery.
> + */
> +STATIC int
> +xlog_recover_do_inode_buffer(
> +	struct xfs_mount		*mp,
> +	struct xlog_recover_item	*item,
> +	struct xfs_buf			*bp,
> +	struct xfs_buf_log_format	*buf_f)
> +{
> +	int				i;
> +	int				item_index = 0;
> +	int				bit = 0;
> +	int				nbits = 0;
> +	int				reg_buf_offset = 0;
> +	int				reg_buf_bytes = 0;
> +	int				next_unlinked_offset;
> +	int				inodes_per_buf;
> +	xfs_agino_t			*logged_nextp;
> +	xfs_agino_t			*buffer_nextp;
> +
> +	trace_xfs_log_recover_buf_inode_buf(mp->m_log, buf_f);
> +
> +	/*
> +	 * Post recovery validation only works properly on CRC enabled
> +	 * filesystems.
> +	 */
> +	if (xfs_sb_version_hascrc(&mp->m_sb))
> +		bp->b_ops = &xfs_inode_buf_ops;
> +
> +	inodes_per_buf = BBTOB(bp->b_length) >> mp->m_sb.sb_inodelog;
> +	for (i = 0; i < inodes_per_buf; i++) {
> +		next_unlinked_offset = (i * mp->m_sb.sb_inodesize) +
> +			offsetof(xfs_dinode_t, di_next_unlinked);
> +
> +		while (next_unlinked_offset >=
> +		       (reg_buf_offset + reg_buf_bytes)) {
> +			/*
> +			 * The next di_next_unlinked field is beyond
> +			 * the current logged region.  Find the next
> +			 * logged region that contains or is beyond
> +			 * the current di_next_unlinked field.
> +			 */
> +			bit += nbits;
> +			bit = xfs_next_bit(buf_f->blf_data_map,
> +					   buf_f->blf_map_size, bit);
> +
> +			/*
> +			 * If there are no more logged regions in the
> +			 * buffer, then we're done.
> +			 */
> +			if (bit == -1)
> +				return 0;
> +
> +			nbits = xfs_contig_bits(buf_f->blf_data_map,
> +						buf_f->blf_map_size, bit);
> +			ASSERT(nbits > 0);
> +			reg_buf_offset = bit << XFS_BLF_SHIFT;
> +			reg_buf_bytes = nbits << XFS_BLF_SHIFT;
> +			item_index++;
> +		}
> +
> +		/*
> +		 * If the current logged region starts after the current
> +		 * di_next_unlinked field, then move on to the next
> +		 * di_next_unlinked field.
> +		 */
> +		if (next_unlinked_offset < reg_buf_offset)
> +			continue;
> +
> +		ASSERT(item->ri_buf[item_index].i_addr != NULL);
> +		ASSERT((item->ri_buf[item_index].i_len % XFS_BLF_CHUNK) == 0);
> +		ASSERT((reg_buf_offset + reg_buf_bytes) <= BBTOB(bp->b_length));
> +
> +		/*
> +		 * The current logged region contains a copy of the
> +		 * current di_next_unlinked field.  Extract its value
> +		 * and copy it to the buffer copy.
> +		 */
> +		logged_nextp = item->ri_buf[item_index].i_addr +
> +				next_unlinked_offset - reg_buf_offset;
> +		if (XFS_IS_CORRUPT(mp, *logged_nextp == 0)) {
> +			xfs_alert(mp,
> +		"Bad inode buffer log record (ptr = "PTR_FMT", bp = "PTR_FMT"). "
> +		"Trying to replay bad (0) inode di_next_unlinked field.",
> +				item, bp);
> +			return -EFSCORRUPTED;
> +		}
> +
> +		buffer_nextp = xfs_buf_offset(bp, next_unlinked_offset);
> +		*buffer_nextp = *logged_nextp;
> +
> +		/*
> +		 * If necessary, recalculate the CRC in the on-disk inode. We
> +		 * have to leave the inode in a consistent state for whoever
> +		 * reads it next....
> +		 */
> +		xfs_dinode_calc_crc(mp,
> +				xfs_buf_offset(bp, i * mp->m_sb.sb_inodesize));
> +
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * V5 filesystems know the age of the buffer on disk being recovered. We can
> + * have newer objects on disk than we are replaying, and so for these cases we
> + * don't want to replay the current change as that will make the buffer contents
> + * temporarily invalid on disk.
> + *
> + * The magic number might not match the buffer type we are going to recover
> + * (e.g. reallocated blocks), so we ignore the xfs_buf_log_format flags.  Hence
> + * extract the LSN of the existing object in the buffer based on its current
> + * magic number.  If we don't recognise the magic number in the buffer, then
> + * return a LSN of -1 so that the caller knows it was an unrecognised block and
> + * so can recover the buffer.
> + *
> + * Note: we cannot rely solely on magic number matches to determine that the
> + * buffer has a valid LSN - we also need to verify that it belongs to this
> + * filesystem, so we need to extract the object's LSN and compare it to that
> + * which we read from the superblock. If the UUIDs don't match, then we've got a
> + * stale metadata block from an old filesystem instance that we need to recover
> + * over the top of.
> + */
> +static xfs_lsn_t
> +xlog_recover_get_buf_lsn(
> +	struct xfs_mount	*mp,
> +	struct xfs_buf		*bp)
> +{
> +	uint32_t		magic32;
> +	uint16_t		magic16;
> +	uint16_t		magicda;
> +	void			*blk = bp->b_addr;
> +	uuid_t			*uuid;
> +	xfs_lsn_t		lsn = -1;
> +
> +	/* v4 filesystems always recover immediately */
> +	if (!xfs_sb_version_hascrc(&mp->m_sb))
> +		goto recover_immediately;
> +
> +	magic32 = be32_to_cpu(*(__be32 *)blk);
> +	switch (magic32) {
> +	case XFS_ABTB_CRC_MAGIC:
> +	case XFS_ABTC_CRC_MAGIC:
> +	case XFS_ABTB_MAGIC:
> +	case XFS_ABTC_MAGIC:
> +	case XFS_RMAP_CRC_MAGIC:
> +	case XFS_REFC_CRC_MAGIC:
> +	case XFS_IBT_CRC_MAGIC:
> +	case XFS_IBT_MAGIC: {
> +		struct xfs_btree_block *btb = blk;
> +
> +		lsn = be64_to_cpu(btb->bb_u.s.bb_lsn);
> +		uuid = &btb->bb_u.s.bb_uuid;
> +		break;
> +	}
> +	case XFS_BMAP_CRC_MAGIC:
> +	case XFS_BMAP_MAGIC: {
> +		struct xfs_btree_block *btb = blk;
> +
> +		lsn = be64_to_cpu(btb->bb_u.l.bb_lsn);
> +		uuid = &btb->bb_u.l.bb_uuid;
> +		break;
> +	}
> +	case XFS_AGF_MAGIC:
> +		lsn = be64_to_cpu(((struct xfs_agf *)blk)->agf_lsn);
> +		uuid = &((struct xfs_agf *)blk)->agf_uuid;
> +		break;
> +	case XFS_AGFL_MAGIC:
> +		lsn = be64_to_cpu(((struct xfs_agfl *)blk)->agfl_lsn);
> +		uuid = &((struct xfs_agfl *)blk)->agfl_uuid;
> +		break;
> +	case XFS_AGI_MAGIC:
> +		lsn = be64_to_cpu(((struct xfs_agi *)blk)->agi_lsn);
> +		uuid = &((struct xfs_agi *)blk)->agi_uuid;
> +		break;
> +	case XFS_SYMLINK_MAGIC:
> +		lsn = be64_to_cpu(((struct xfs_dsymlink_hdr *)blk)->sl_lsn);
> +		uuid = &((struct xfs_dsymlink_hdr *)blk)->sl_uuid;
> +		break;
> +	case XFS_DIR3_BLOCK_MAGIC:
> +	case XFS_DIR3_DATA_MAGIC:
> +	case XFS_DIR3_FREE_MAGIC:
> +		lsn = be64_to_cpu(((struct xfs_dir3_blk_hdr *)blk)->lsn);
> +		uuid = &((struct xfs_dir3_blk_hdr *)blk)->uuid;
> +		break;
> +	case XFS_ATTR3_RMT_MAGIC:
> +		/*
> +		 * Remote attr blocks are written synchronously, rather than
> +		 * being logged. That means they do not contain a valid LSN
> +		 * (i.e. transactionally ordered) in them, and hence any time we
> +		 * see a buffer to replay over the top of a remote attribute
> +		 * block we should simply do so.
> +		 */
> +		goto recover_immediately;
> +	case XFS_SB_MAGIC:
> +		/*
> +		 * superblock uuids are magic. We may or may not have a
> +		 * sb_meta_uuid on disk, but it will be set in the in-core
> +		 * superblock. We set the uuid pointer for verification
> +		 * according to the superblock feature mask to ensure we check
> +		 * the relevant UUID in the superblock.
> +		 */
> +		lsn = be64_to_cpu(((struct xfs_dsb *)blk)->sb_lsn);
> +		if (xfs_sb_version_hasmetauuid(&mp->m_sb))
> +			uuid = &((struct xfs_dsb *)blk)->sb_meta_uuid;
> +		else
> +			uuid = &((struct xfs_dsb *)blk)->sb_uuid;
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	if (lsn != (xfs_lsn_t)-1) {
> +		if (!uuid_equal(&mp->m_sb.sb_meta_uuid, uuid))
> +			goto recover_immediately;
> +		return lsn;
> +	}
> +
> +	magicda = be16_to_cpu(((struct xfs_da_blkinfo *)blk)->magic);
> +	switch (magicda) {
> +	case XFS_DIR3_LEAF1_MAGIC:
> +	case XFS_DIR3_LEAFN_MAGIC:
> +	case XFS_DA3_NODE_MAGIC:
> +		lsn = be64_to_cpu(((struct xfs_da3_blkinfo *)blk)->lsn);
> +		uuid = &((struct xfs_da3_blkinfo *)blk)->uuid;
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	if (lsn != (xfs_lsn_t)-1) {
> +		if (!uuid_equal(&mp->m_sb.sb_uuid, uuid))
> +			goto recover_immediately;
> +		return lsn;
> +	}
> +
> +	/*
> +	 * We do individual object checks on dquot and inode buffers as they
> +	 * have their own individual LSN records. Also, we could have a stale
> +	 * buffer here, so we have to at least recognise these buffer types.
> +	 *
> +	 * A noted complexity here is inode unlinked list processing - it logs
> +	 * the inode directly in the buffer, but we don't know which inodes have
> +	 * been modified, and there is no global buffer LSN. Hence we need to
> +	 * recover all inode buffer types immediately. This problem will be
> +	 * fixed by logical logging of the unlinked list modifications.
> +	 */
> +	magic16 = be16_to_cpu(*(__be16 *)blk);
> +	switch (magic16) {
> +	case XFS_DQUOT_MAGIC:
> +	case XFS_DINODE_MAGIC:
> +		goto recover_immediately;
> +	default:
> +		break;
> +	}
> +
> +	/* unknown buffer contents, recover immediately */
> +
> +recover_immediately:
> +	return (xfs_lsn_t)-1;
> +
> +}
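
A small userspace sketch of how the LSN returned here feeds the replay
decision in the pass2 commit function. It assumes an LSN packs a cycle number
in the upper 32 bits and a block number in the lower 32 bits, and that
comparison goes cycle first, then block. The demo_* names are hypothetical and
the UUID check is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t demo_lsn_t;

/* "No usable LSN in the buffer": always recover over it. */
#define DEMO_NULL_LSN	((demo_lsn_t)-1)

static demo_lsn_t demo_make_lsn(uint32_t cycle, uint32_t block)
{
	return (demo_lsn_t)(((uint64_t)cycle << 32) | block);
}

/* Compare by cycle first, then by block within the cycle. */
static int demo_lsn_cmp(demo_lsn_t a, demo_lsn_t b)
{
	uint32_t cycle_a = (uint32_t)((uint64_t)a >> 32);
	uint32_t cycle_b = (uint32_t)((uint64_t)b >> 32);
	uint32_t block_a = (uint32_t)a;
	uint32_t block_b = (uint32_t)b;

	if (cycle_a != cycle_b)
		return cycle_a < cycle_b ? -1 : 1;
	if (block_a != block_b)
		return block_a < block_b ? -1 : 1;
	return 0;
}

/*
 * Replay the logged change unless the on-disk object carries a valid LSN
 * that is at or beyond the transaction being replayed.
 */
static bool demo_should_replay(demo_lsn_t buf_lsn, demo_lsn_t current_lsn)
{
	/* Sketch-only precondition: the caller passes a real transaction LSN. */
	assert(current_lsn != DEMO_NULL_LSN);

	if (buf_lsn == 0 || buf_lsn == DEMO_NULL_LSN)
		return true;
	return demo_lsn_cmp(buf_lsn, current_lsn) < 0;
}
```
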
> +
> +/*
> + * This routine replays a modification made to a buffer at runtime.
> + * There are actually two types of buffer, regular and inode, which
> + * are handled differently.  Inode buffers are handled differently
> + * in that we only recover a specific set of data from them, namely
> + * the inode di_next_unlinked fields.  This is because all other inode
> + * data is actually logged via inode records and any data we replay
> + * here which overlaps that may be stale.
> + *
> + * When meta-data buffers are freed at run time we log a buffer item
> + * with the XFS_BLF_CANCEL bit set to indicate that previous copies
> + * of the buffer in the log should not be replayed at recovery time.
> + * This is so that if the blocks covered by the buffer are reused for
> + * file data before we crash we don't end up replaying old, freed
> + * meta-data into a user's file.
> + *
> + * To handle the cancellation of buffer log items, we make two passes
> + * over the log during recovery.  During the first we build a table of
> + * those buffers which have been cancelled, and during the second we
> + * only replay those buffers which do not have corresponding cancel
> + * records in the table.  See xlog_recover_buffer_pass[1,2] above
> + * for more details on the implementation of the table of cancel records.
> + */
> +STATIC int
> +xlog_recover_buffer_commit_pass2(
> +	struct xlog			*log,
> +	struct list_head		*buffer_list,
> +	struct xlog_recover_item	*item,
> +	xfs_lsn_t			current_lsn)
> +{
> +	struct xfs_buf_log_format	*buf_f = item->ri_buf[0].i_addr;
> +	struct xfs_mount		*mp = log->l_mp;
> +	struct xfs_buf			*bp;
> +	int				error;
> +	uint				buf_flags;
> +	xfs_lsn_t			lsn;
> +
> +	/*
> +	 * In this pass we only want to recover all the buffers which have
> +	 * not been cancelled and are not cancellation buffers themselves.
> +	 */
> +	if (buf_f->blf_flags & XFS_BLF_CANCEL) {
> +		if (xlog_put_buffer_cancelled(log, buf_f->blf_blkno,
> +				buf_f->blf_len))
> +			goto cancelled;
> +	} else {
> +
> +		if (xlog_is_buffer_cancelled(log, buf_f->blf_blkno,
> +				buf_f->blf_len))
> +			goto cancelled;
> +	}
> +
> +	trace_xfs_log_recover_buf_recover(log, buf_f);
> +
> +	buf_flags = 0;
> +	if (buf_f->blf_flags & XFS_BLF_INODE_BUF)
> +		buf_flags |= XBF_UNMAPPED;
> +
> +	error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno, buf_f->blf_len,
> +			  buf_flags, &bp, NULL);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Recover the buffer only if we get an LSN from it and it's less than
> +	 * the lsn of the transaction we are replaying.
> +	 *
> +	 * Note that we have to be extremely careful of readahead here.
> +	 * Readahead does not attach verifiers to the buffers so if we don't
> +	 * actually do any replay after readahead because the LSN we found
> +	 * in the buffer is more recent than the current transaction then we
> +	 * need to attach the verifier directly. Failure to do so means that
> +	 * future recovery actions (e.g. EFI and unlinked list recovery) can
> +	 * operate on the buffers and they won't get the verifier attached. This
> +	 * can lead to blocks on disk having the correct content but a stale
> +	 * CRC.
> +	 *
> +	 * It is safe to assume these clean buffers are currently up to date.
> +	 * If the buffer is dirtied by a later transaction being replayed, then
> +	 * the verifier will be reset to match whatever recovery turns that
> +	 * buffer into.
> +	 */
> +	lsn = xlog_recover_get_buf_lsn(mp, bp);
> +	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
> +		trace_xfs_log_recover_buf_skip(log, buf_f);
> +		xlog_recover_validate_buf_type(mp, bp, buf_f, NULLCOMMITLSN);
> +		goto out_release;
> +	}
> +
> +	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
> +		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
> +		if (error)
> +			goto out_release;
> +	} else if (buf_f->blf_flags &
> +		  (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
> +		bool	dirty;
> +
> +		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
> +		if (!dirty)
> +			goto out_release;
> +	} else {
> +		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
> +	}
> +
> +	/*
> +	 * Perform delayed write on the buffer.  Asynchronous writes will be
> +	 * slower when taking into account all the buffers to be flushed.
> +	 *
> +	 * Also make sure that only inode buffers with good sizes stay in
> +	 * the buffer cache.  The kernel moves inodes in buffers of 1 block
> +	 * or inode_cluster_size bytes, whichever is bigger.  The inode
> +	 * buffers in the log can be a different size if the log was generated
> +	 * by an older kernel using unclustered inode buffers or a newer kernel
> +	 * running with a different inode cluster size.  Regardless, if the
> +	 * inode buffer size isn't max(blocksize, inode_cluster_size)
> +	 * for *our* value of inode_cluster_size, then we need to keep
> +	 * the buffer out of the buffer cache so that the buffer won't
> +	 * overlap with future reads of those inodes.
> +	 */
> +	if (XFS_DINODE_MAGIC ==
> +	    be16_to_cpu(*((__be16 *)xfs_buf_offset(bp, 0))) &&
> +	    (BBTOB(bp->b_length) != M_IGEO(log->l_mp)->inode_cluster_size)) {
> +		xfs_buf_stale(bp);
> +		error = xfs_bwrite(bp);
> +	} else {
> +		ASSERT(bp->b_mount == mp);
> +		bp->b_iodone = xlog_recover_iodone;
> +		xfs_buf_delwri_queue(bp, buffer_list);
> +	}
> +
> +out_release:
> +	xfs_buf_relse(bp);
> +	return error;
> +cancelled:
> +	trace_xfs_log_recover_buf_cancel(log, buf_f);
> +	return 0;
> +}
> +
>  const struct xlog_recover_item_type xlog_buf_item_type = {
>  	.reorder_fn		= xlog_buf_reorder_fn,
>  	.ra_pass2_fn		= xlog_recover_buffer_ra_pass2,
>  	.commit_pass1_fn	= xlog_recover_buffer_commit_pass1,
> +	.commit_pass2_fn	= xlog_recover_buffer_commit_pass2,
>  };
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index fbd1f7d6f1c9..0a241f1c371a 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -284,7 +284,7 @@ xlog_header_check_mount(
>  	return 0;
>  }
>  
> -STATIC void
> +void
>  xlog_recover_iodone(
>  	struct xfs_buf	*bp)
>  {
> @@ -2007,7 +2007,7 @@ xlog_add_buffer_cancelled(
>  /*
>   * Check if there is and entry for blkno, len in the buffer cancel record table.
>   */
> -static bool
> +bool
>  xlog_is_buffer_cancelled(
>  	struct xlog		*log,
>  	xfs_daddr_t		blkno,
> @@ -2024,7 +2024,7 @@ xlog_is_buffer_cancelled(
>   * buffer is re-used again after its last cancellation we actually replay the
>   * changes made at that point.
>   */
> -static bool
> +bool
>  xlog_put_buffer_cancelled(
>  	struct xlog		*log,
>  	xfs_daddr_t		blkno,
> @@ -2056,791 +2056,6 @@ xlog_buf_readahead(
>  		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
>  }
>  
> -/*
> - * Perform recovery for a buffer full of inodes.  In these buffers, the only
> - * data which should be recovered is that which corresponds to the
> - * di_next_unlinked pointers in the on disk inode structures.  The rest of the
> - * data for the inodes is always logged through the inodes themselves rather
> - * than the inode buffer and is recovered in xlog_recover_inode_pass2().
> - *
> - * The only time when buffers full of inodes are fully recovered is when the
> - * buffer is full of newly allocated inodes.  In this case the buffer will
> - * not be marked as an inode buffer and so will be sent to
> - * xlog_recover_do_reg_buffer() below during recovery.
> - */
> -STATIC int
> -xlog_recover_do_inode_buffer(
> -	struct xfs_mount	*mp,
> -	xlog_recover_item_t	*item,
> -	struct xfs_buf		*bp,
> -	xfs_buf_log_format_t	*buf_f)
> -{
> -	int			i;
> -	int			item_index = 0;
> -	int			bit = 0;
> -	int			nbits = 0;
> -	int			reg_buf_offset = 0;
> -	int			reg_buf_bytes = 0;
> -	int			next_unlinked_offset;
> -	int			inodes_per_buf;
> -	xfs_agino_t		*logged_nextp;
> -	xfs_agino_t		*buffer_nextp;
> -
> -	trace_xfs_log_recover_buf_inode_buf(mp->m_log, buf_f);
> -
> -	/*
> -	 * Post recovery validation only works properly on CRC enabled
> -	 * filesystems.
> -	 */
> -	if (xfs_sb_version_hascrc(&mp->m_sb))
> -		bp->b_ops = &xfs_inode_buf_ops;
> -
> -	inodes_per_buf = BBTOB(bp->b_length) >> mp->m_sb.sb_inodelog;
> -	for (i = 0; i < inodes_per_buf; i++) {
> -		next_unlinked_offset = (i * mp->m_sb.sb_inodesize) +
> -			offsetof(xfs_dinode_t, di_next_unlinked);
> -
> -		while (next_unlinked_offset >=
> -		       (reg_buf_offset + reg_buf_bytes)) {
> -			/*
> -			 * The next di_next_unlinked field is beyond
> -			 * the current logged region.  Find the next
> -			 * logged region that contains or is beyond
> -			 * the current di_next_unlinked field.
> -			 */
> -			bit += nbits;
> -			bit = xfs_next_bit(buf_f->blf_data_map,
> -					   buf_f->blf_map_size, bit);
> -
> -			/*
> -			 * If there are no more logged regions in the
> -			 * buffer, then we're done.
> -			 */
> -			if (bit == -1)
> -				return 0;
> -
> -			nbits = xfs_contig_bits(buf_f->blf_data_map,
> -						buf_f->blf_map_size, bit);
> -			ASSERT(nbits > 0);
> -			reg_buf_offset = bit << XFS_BLF_SHIFT;
> -			reg_buf_bytes = nbits << XFS_BLF_SHIFT;
> -			item_index++;
> -		}
> -
> -		/*
> -		 * If the current logged region starts after the current
> -		 * di_next_unlinked field, then move on to the next
> -		 * di_next_unlinked field.
> -		 */
> -		if (next_unlinked_offset < reg_buf_offset)
> -			continue;
> -
> -		ASSERT(item->ri_buf[item_index].i_addr != NULL);
> -		ASSERT((item->ri_buf[item_index].i_len % XFS_BLF_CHUNK) == 0);
> -		ASSERT((reg_buf_offset + reg_buf_bytes) <= BBTOB(bp->b_length));
> -
> -		/*
> -		 * The current logged region contains a copy of the
> -		 * current di_next_unlinked field.  Extract its value
> -		 * and copy it to the buffer copy.
> -		 */
> -		logged_nextp = item->ri_buf[item_index].i_addr +
> -				next_unlinked_offset - reg_buf_offset;
> -		if (XFS_IS_CORRUPT(mp, *logged_nextp == 0)) {
> -			xfs_alert(mp,
> -		"Bad inode buffer log record (ptr = "PTR_FMT", bp = "PTR_FMT"). "
> -		"Trying to replay bad (0) inode di_next_unlinked field.",
> -				item, bp);
> -			return -EFSCORRUPTED;
> -		}
> -
> -		buffer_nextp = xfs_buf_offset(bp, next_unlinked_offset);
> -		*buffer_nextp = *logged_nextp;
> -
> -		/*
> -		 * If necessary, recalculate the CRC in the on-disk inode. We
> -		 * have to leave the inode in a consistent state for whoever
> -		 * reads it next....
> -		 */
> -		xfs_dinode_calc_crc(mp,
> -				xfs_buf_offset(bp, i * mp->m_sb.sb_inodesize));
> -
> -	}
> -
> -	return 0;
> -}
> -
> -/*
> - * V5 filesystems know the age of the buffer on disk being recovered. We can
> - * have newer objects on disk than we are replaying, and so for these cases we
> - * don't want to replay the current change as that will make the buffer contents
> - * temporarily invalid on disk.
> - *
> - * The magic number might not match the buffer type we are going to recover
> - * (e.g. reallocated blocks), so we ignore the xfs_buf_log_format flags.  Hence
> - * extract the LSN of the existing object in the buffer based on it's current
> - * magic number.  If we don't recognise the magic number in the buffer, then
> - * return a LSN of -1 so that the caller knows it was an unrecognised block and
> - * so can recover the buffer.
> - *
> - * Note: we cannot rely solely on magic number matches to determine that the
> - * buffer has a valid LSN - we also need to verify that it belongs to this
> - * filesystem, so we need to extract the object's LSN and compare it to that
> - * which we read from the superblock. If the UUIDs don't match, then we've got a
> - * stale metadata block from an old filesystem instance that we need to recover
> - * over the top of.
> - */
> -static xfs_lsn_t
> -xlog_recover_get_buf_lsn(
> -	struct xfs_mount	*mp,
> -	struct xfs_buf		*bp)
> -{
> -	uint32_t		magic32;
> -	uint16_t		magic16;
> -	uint16_t		magicda;
> -	void			*blk = bp->b_addr;
> -	uuid_t			*uuid;
> -	xfs_lsn_t		lsn = -1;
> -
> -	/* v4 filesystems always recover immediately */
> -	if (!xfs_sb_version_hascrc(&mp->m_sb))
> -		goto recover_immediately;
> -
> -	magic32 = be32_to_cpu(*(__be32 *)blk);
> -	switch (magic32) {
> -	case XFS_ABTB_CRC_MAGIC:
> -	case XFS_ABTC_CRC_MAGIC:
> -	case XFS_ABTB_MAGIC:
> -	case XFS_ABTC_MAGIC:
> -	case XFS_RMAP_CRC_MAGIC:
> -	case XFS_REFC_CRC_MAGIC:
> -	case XFS_IBT_CRC_MAGIC:
> -	case XFS_IBT_MAGIC: {
> -		struct xfs_btree_block *btb = blk;
> -
> -		lsn = be64_to_cpu(btb->bb_u.s.bb_lsn);
> -		uuid = &btb->bb_u.s.bb_uuid;
> -		break;
> -	}
> -	case XFS_BMAP_CRC_MAGIC:
> -	case XFS_BMAP_MAGIC: {
> -		struct xfs_btree_block *btb = blk;
> -
> -		lsn = be64_to_cpu(btb->bb_u.l.bb_lsn);
> -		uuid = &btb->bb_u.l.bb_uuid;
> -		break;
> -	}
> -	case XFS_AGF_MAGIC:
> -		lsn = be64_to_cpu(((struct xfs_agf *)blk)->agf_lsn);
> -		uuid = &((struct xfs_agf *)blk)->agf_uuid;
> -		break;
> -	case XFS_AGFL_MAGIC:
> -		lsn = be64_to_cpu(((struct xfs_agfl *)blk)->agfl_lsn);
> -		uuid = &((struct xfs_agfl *)blk)->agfl_uuid;
> -		break;
> -	case XFS_AGI_MAGIC:
> -		lsn = be64_to_cpu(((struct xfs_agi *)blk)->agi_lsn);
> -		uuid = &((struct xfs_agi *)blk)->agi_uuid;
> -		break;
> -	case XFS_SYMLINK_MAGIC:
> -		lsn = be64_to_cpu(((struct xfs_dsymlink_hdr *)blk)->sl_lsn);
> -		uuid = &((struct xfs_dsymlink_hdr *)blk)->sl_uuid;
> -		break;
> -	case XFS_DIR3_BLOCK_MAGIC:
> -	case XFS_DIR3_DATA_MAGIC:
> -	case XFS_DIR3_FREE_MAGIC:
> -		lsn = be64_to_cpu(((struct xfs_dir3_blk_hdr *)blk)->lsn);
> -		uuid = &((struct xfs_dir3_blk_hdr *)blk)->uuid;
> -		break;
> -	case XFS_ATTR3_RMT_MAGIC:
> -		/*
> -		 * Remote attr blocks are written synchronously, rather than
> -		 * being logged. That means they do not contain a valid LSN
> -		 * (i.e. transactionally ordered) in them, and hence any time we
> -		 * see a buffer to replay over the top of a remote attribute
> -		 * block we should simply do so.
> -		 */
> -		goto recover_immediately;
> -	case XFS_SB_MAGIC:
> -		/*
> -		 * superblock uuids are magic. We may or may not have a
> -		 * sb_meta_uuid on disk, but it will be set in the in-core
> -		 * superblock. We set the uuid pointer for verification
> -		 * according to the superblock feature mask to ensure we check
> -		 * the relevant UUID in the superblock.
> -		 */
> -		lsn = be64_to_cpu(((struct xfs_dsb *)blk)->sb_lsn);
> -		if (xfs_sb_version_hasmetauuid(&mp->m_sb))
> -			uuid = &((struct xfs_dsb *)blk)->sb_meta_uuid;
> -		else
> -			uuid = &((struct xfs_dsb *)blk)->sb_uuid;
> -		break;
> -	default:
> -		break;
> -	}
> -
> -	if (lsn != (xfs_lsn_t)-1) {
> -		if (!uuid_equal(&mp->m_sb.sb_meta_uuid, uuid))
> -			goto recover_immediately;
> -		return lsn;
> -	}
> -
> -	magicda = be16_to_cpu(((struct xfs_da_blkinfo *)blk)->magic);
> -	switch (magicda) {
> -	case XFS_DIR3_LEAF1_MAGIC:
> -	case XFS_DIR3_LEAFN_MAGIC:
> -	case XFS_DA3_NODE_MAGIC:
> -		lsn = be64_to_cpu(((struct xfs_da3_blkinfo *)blk)->lsn);
> -		uuid = &((struct xfs_da3_blkinfo *)blk)->uuid;
> -		break;
> -	default:
> -		break;
> -	}
> -
> -	if (lsn != (xfs_lsn_t)-1) {
> -		if (!uuid_equal(&mp->m_sb.sb_uuid, uuid))
> -			goto recover_immediately;
> -		return lsn;
> -	}
> -
> -	/*
> -	 * We do individual object checks on dquot and inode buffers as they
> -	 * have their own individual LSN records. Also, we could have a stale
> -	 * buffer here, so we have to at least recognise these buffer types.
> -	 *
> -	 * A notd complexity here is inode unlinked list processing - it logs
> -	 * the inode directly in the buffer, but we don't know which inodes have
> -	 * been modified, and there is no global buffer LSN. Hence we need to
> -	 * recover all inode buffer types immediately. This problem will be
> -	 * fixed by logical logging of the unlinked list modifications.
> -	 */
> -	magic16 = be16_to_cpu(*(__be16 *)blk);
> -	switch (magic16) {
> -	case XFS_DQUOT_MAGIC:
> -	case XFS_DINODE_MAGIC:
> -		goto recover_immediately;
> -	default:
> -		break;
> -	}
> -
> -	/* unknown buffer contents, recover immediately */
> -
> -recover_immediately:
> -	return (xfs_lsn_t)-1;
> -
> -}
> -
> -/*
> - * Validate the recovered buffer is of the correct type and attach the
> - * appropriate buffer operations to them for writeback. Magic numbers are in a
> - * few places:
> - *	the first 16 bits of the buffer (inode buffer, dquot buffer),
> - *	the first 32 bits of the buffer (most blocks),
> - *	inside a struct xfs_da_blkinfo at the start of the buffer.
> - */
> -static void
> -xlog_recover_validate_buf_type(
> -	struct xfs_mount	*mp,
> -	struct xfs_buf		*bp,
> -	xfs_buf_log_format_t	*buf_f,
> -	xfs_lsn_t		current_lsn)
> -{
> -	struct xfs_da_blkinfo	*info = bp->b_addr;
> -	uint32_t		magic32;
> -	uint16_t		magic16;
> -	uint16_t		magicda;
> -	char			*warnmsg = NULL;
> -
> -	/*
> -	 * We can only do post recovery validation on items on CRC enabled
> -	 * fielsystems as we need to know when the buffer was written to be able
> -	 * to determine if we should have replayed the item. If we replay old
> -	 * metadata over a newer buffer, then it will enter a temporarily
> -	 * inconsistent state resulting in verification failures. Hence for now
> -	 * just avoid the verification stage for non-crc filesystems
> -	 */
> -	if (!xfs_sb_version_hascrc(&mp->m_sb))
> -		return;
> -
> -	magic32 = be32_to_cpu(*(__be32 *)bp->b_addr);
> -	magic16 = be16_to_cpu(*(__be16*)bp->b_addr);
> -	magicda = be16_to_cpu(info->magic);
> -	switch (xfs_blft_from_flags(buf_f)) {
> -	case XFS_BLFT_BTREE_BUF:
> -		switch (magic32) {
> -		case XFS_ABTB_CRC_MAGIC:
> -		case XFS_ABTB_MAGIC:
> -			bp->b_ops = &xfs_bnobt_buf_ops;
> -			break;
> -		case XFS_ABTC_CRC_MAGIC:
> -		case XFS_ABTC_MAGIC:
> -			bp->b_ops = &xfs_cntbt_buf_ops;
> -			break;
> -		case XFS_IBT_CRC_MAGIC:
> -		case XFS_IBT_MAGIC:
> -			bp->b_ops = &xfs_inobt_buf_ops;
> -			break;
> -		case XFS_FIBT_CRC_MAGIC:
> -		case XFS_FIBT_MAGIC:
> -			bp->b_ops = &xfs_finobt_buf_ops;
> -			break;
> -		case XFS_BMAP_CRC_MAGIC:
> -		case XFS_BMAP_MAGIC:
> -			bp->b_ops = &xfs_bmbt_buf_ops;
> -			break;
> -		case XFS_RMAP_CRC_MAGIC:
> -			bp->b_ops = &xfs_rmapbt_buf_ops;
> -			break;
> -		case XFS_REFC_CRC_MAGIC:
> -			bp->b_ops = &xfs_refcountbt_buf_ops;
> -			break;
> -		default:
> -			warnmsg = "Bad btree block magic!";
> -			break;
> -		}
> -		break;
> -	case XFS_BLFT_AGF_BUF:
> -		if (magic32 != XFS_AGF_MAGIC) {
> -			warnmsg = "Bad AGF block magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_agf_buf_ops;
> -		break;
> -	case XFS_BLFT_AGFL_BUF:
> -		if (magic32 != XFS_AGFL_MAGIC) {
> -			warnmsg = "Bad AGFL block magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_agfl_buf_ops;
> -		break;
> -	case XFS_BLFT_AGI_BUF:
> -		if (magic32 != XFS_AGI_MAGIC) {
> -			warnmsg = "Bad AGI block magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_agi_buf_ops;
> -		break;
> -	case XFS_BLFT_UDQUOT_BUF:
> -	case XFS_BLFT_PDQUOT_BUF:
> -	case XFS_BLFT_GDQUOT_BUF:
> -#ifdef CONFIG_XFS_QUOTA
> -		if (magic16 != XFS_DQUOT_MAGIC) {
> -			warnmsg = "Bad DQUOT block magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_dquot_buf_ops;
> -#else
> -		xfs_alert(mp,
> -	"Trying to recover dquots without QUOTA support built in!");
> -		ASSERT(0);
> -#endif
> -		break;
> -	case XFS_BLFT_DINO_BUF:
> -		if (magic16 != XFS_DINODE_MAGIC) {
> -			warnmsg = "Bad INODE block magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_inode_buf_ops;
> -		break;
> -	case XFS_BLFT_SYMLINK_BUF:
> -		if (magic32 != XFS_SYMLINK_MAGIC) {
> -			warnmsg = "Bad symlink block magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_symlink_buf_ops;
> -		break;
> -	case XFS_BLFT_DIR_BLOCK_BUF:
> -		if (magic32 != XFS_DIR2_BLOCK_MAGIC &&
> -		    magic32 != XFS_DIR3_BLOCK_MAGIC) {
> -			warnmsg = "Bad dir block magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_dir3_block_buf_ops;
> -		break;
> -	case XFS_BLFT_DIR_DATA_BUF:
> -		if (magic32 != XFS_DIR2_DATA_MAGIC &&
> -		    magic32 != XFS_DIR3_DATA_MAGIC) {
> -			warnmsg = "Bad dir data magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_dir3_data_buf_ops;
> -		break;
> -	case XFS_BLFT_DIR_FREE_BUF:
> -		if (magic32 != XFS_DIR2_FREE_MAGIC &&
> -		    magic32 != XFS_DIR3_FREE_MAGIC) {
> -			warnmsg = "Bad dir3 free magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_dir3_free_buf_ops;
> -		break;
> -	case XFS_BLFT_DIR_LEAF1_BUF:
> -		if (magicda != XFS_DIR2_LEAF1_MAGIC &&
> -		    magicda != XFS_DIR3_LEAF1_MAGIC) {
> -			warnmsg = "Bad dir leaf1 magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_dir3_leaf1_buf_ops;
> -		break;
> -	case XFS_BLFT_DIR_LEAFN_BUF:
> -		if (magicda != XFS_DIR2_LEAFN_MAGIC &&
> -		    magicda != XFS_DIR3_LEAFN_MAGIC) {
> -			warnmsg = "Bad dir leafn magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_dir3_leafn_buf_ops;
> -		break;
> -	case XFS_BLFT_DA_NODE_BUF:
> -		if (magicda != XFS_DA_NODE_MAGIC &&
> -		    magicda != XFS_DA3_NODE_MAGIC) {
> -			warnmsg = "Bad da node magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_da3_node_buf_ops;
> -		break;
> -	case XFS_BLFT_ATTR_LEAF_BUF:
> -		if (magicda != XFS_ATTR_LEAF_MAGIC &&
> -		    magicda != XFS_ATTR3_LEAF_MAGIC) {
> -			warnmsg = "Bad attr leaf magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_attr3_leaf_buf_ops;
> -		break;
> -	case XFS_BLFT_ATTR_RMT_BUF:
> -		if (magic32 != XFS_ATTR3_RMT_MAGIC) {
> -			warnmsg = "Bad attr remote magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_attr3_rmt_buf_ops;
> -		break;
> -	case XFS_BLFT_SB_BUF:
> -		if (magic32 != XFS_SB_MAGIC) {
> -			warnmsg = "Bad SB block magic!";
> -			break;
> -		}
> -		bp->b_ops = &xfs_sb_buf_ops;
> -		break;
> -#ifdef CONFIG_XFS_RT
> -	case XFS_BLFT_RTBITMAP_BUF:
> -	case XFS_BLFT_RTSUMMARY_BUF:
> -		/* no magic numbers for verification of RT buffers */
> -		bp->b_ops = &xfs_rtbuf_ops;
> -		break;
> -#endif /* CONFIG_XFS_RT */
> -	default:
> -		xfs_warn(mp, "Unknown buffer type %d!",
> -			 xfs_blft_from_flags(buf_f));
> -		break;
> -	}
> -
> -	/*
> -	 * Nothing else to do in the case of a NULL current LSN as this means
> -	 * the buffer is more recent than the change in the log and will be
> -	 * skipped.
> -	 */
> -	if (current_lsn == NULLCOMMITLSN)
> -		return;
> -
> -	if (warnmsg) {
> -		xfs_warn(mp, warnmsg);
> -		ASSERT(0);
> -	}
> -
> -	/*
> -	 * We must update the metadata LSN of the buffer as it is written out to
> -	 * ensure that older transactions never replay over this one and corrupt
> -	 * the buffer. This can occur if log recovery is interrupted at some
> -	 * point after the current transaction completes, at which point a
> -	 * subsequent mount starts recovery from the beginning.
> -	 *
> -	 * Write verifiers update the metadata LSN from log items attached to
> -	 * the buffer. Therefore, initialize a bli purely to carry the LSN to
> -	 * the verifier. We'll clean it up in our ->iodone() callback.
> -	 */
> -	if (bp->b_ops) {
> -		struct xfs_buf_log_item	*bip;
> -
> -		ASSERT(!bp->b_iodone || bp->b_iodone == xlog_recover_iodone);
> -		bp->b_iodone = xlog_recover_iodone;
> -		xfs_buf_item_init(bp, mp);
> -		bip = bp->b_log_item;
> -		bip->bli_item.li_lsn = current_lsn;
> -	}
> -}
> -
> -/*
> - * Perform a 'normal' buffer recovery.  Each logged region of the
> - * buffer should be copied over the corresponding region in the
> - * given buffer.  The bitmap in the buf log format structure indicates
> - * where to place the logged data.
> - */
> -STATIC void
> -xlog_recover_do_reg_buffer(
> -	struct xfs_mount	*mp,
> -	xlog_recover_item_t	*item,
> -	struct xfs_buf		*bp,
> -	xfs_buf_log_format_t	*buf_f,
> -	xfs_lsn_t		current_lsn)
> -{
> -	int			i;
> -	int			bit;
> -	int			nbits;
> -	xfs_failaddr_t		fa;
> -	const size_t		size_disk_dquot = sizeof(struct xfs_disk_dquot);
> -
> -	trace_xfs_log_recover_buf_reg_buf(mp->m_log, buf_f);
> -
> -	bit = 0;
> -	i = 1;  /* 0 is the buf format structure */
> -	while (1) {
> -		bit = xfs_next_bit(buf_f->blf_data_map,
> -				   buf_f->blf_map_size, bit);
> -		if (bit == -1)
> -			break;
> -		nbits = xfs_contig_bits(buf_f->blf_data_map,
> -					buf_f->blf_map_size, bit);
> -		ASSERT(nbits > 0);
> -		ASSERT(item->ri_buf[i].i_addr != NULL);
> -		ASSERT(item->ri_buf[i].i_len % XFS_BLF_CHUNK == 0);
> -		ASSERT(BBTOB(bp->b_length) >=
> -		       ((uint)bit << XFS_BLF_SHIFT) + (nbits << XFS_BLF_SHIFT));
> -
> -		/*
> -		 * The dirty regions logged in the buffer, even though
> -		 * contiguous, may span multiple chunks. This is because the
> -		 * dirty region may span a physical page boundary in a buffer
> -		 * and hence be split into two separate vectors for writing into
> -		 * the log. Hence we need to trim nbits back to the length of
> -		 * the current region being copied out of the log.
> -		 */
> -		if (item->ri_buf[i].i_len < (nbits << XFS_BLF_SHIFT))
> -			nbits = item->ri_buf[i].i_len >> XFS_BLF_SHIFT;
> -
> -		/*
> -		 * Do a sanity check if this is a dquot buffer. Just checking
> -		 * the first dquot in the buffer should do. XXXThis is
> -		 * probably a good thing to do for other buf types also.
> -		 */
> -		fa = NULL;
> -		if (buf_f->blf_flags &
> -		   (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
> -			if (item->ri_buf[i].i_addr == NULL) {
> -				xfs_alert(mp,
> -					"XFS: NULL dquot in %s.", __func__);
> -				goto next;
> -			}
> -			if (item->ri_buf[i].i_len < size_disk_dquot) {
> -				xfs_alert(mp,
> -					"XFS: dquot too small (%d) in %s.",
> -					item->ri_buf[i].i_len, __func__);
> -				goto next;
> -			}
> -			fa = xfs_dquot_verify(mp, item->ri_buf[i].i_addr,
> -					       -1, 0);
> -			if (fa) {
> -				xfs_alert(mp,
> -	"dquot corrupt at %pS trying to replay into block 0x%llx",
> -					fa, bp->b_bn);
> -				goto next;
> -			}
> -		}
> -
> -		memcpy(xfs_buf_offset(bp,
> -			(uint)bit << XFS_BLF_SHIFT),	/* dest */
> -			item->ri_buf[i].i_addr,		/* source */
> -			nbits<<XFS_BLF_SHIFT);		/* length */
> - next:
> -		i++;
> -		bit += nbits;
> -	}
> -
> -	/* Shouldn't be any more regions */
> -	ASSERT(i == item->ri_total);
> -
> -	xlog_recover_validate_buf_type(mp, bp, buf_f, current_lsn);
> -}
> -
> -/*
> - * Perform a dquot buffer recovery.
> - * Simple algorithm: if we have found a QUOTAOFF log item of the same type
> - * (ie. USR or GRP), then just toss this buffer away; don't recover it.
> - * Else, treat it as a regular buffer and do recovery.
> - *
> - * Return false if the buffer was tossed and true if we recovered the buffer to
> - * indicate to the caller if the buffer needs writing.
> - */
> -STATIC bool
> -xlog_recover_do_dquot_buffer(
> -	struct xfs_mount		*mp,
> -	struct xlog			*log,
> -	struct xlog_recover_item	*item,
> -	struct xfs_buf			*bp,
> -	struct xfs_buf_log_format	*buf_f)
> -{
> -	uint			type;
> -
> -	trace_xfs_log_recover_buf_dquot_buf(log, buf_f);
> -
> -	/*
> -	 * Filesystems are required to send in quota flags at mount time.
> -	 */
> -	if (!mp->m_qflags)
> -		return false;
> -
> -	type = 0;
> -	if (buf_f->blf_flags & XFS_BLF_UDQUOT_BUF)
> -		type |= XFS_DQ_USER;
> -	if (buf_f->blf_flags & XFS_BLF_PDQUOT_BUF)
> -		type |= XFS_DQ_PROJ;
> -	if (buf_f->blf_flags & XFS_BLF_GDQUOT_BUF)
> -		type |= XFS_DQ_GROUP;
> -	/*
> -	 * This type of quotas was turned off, so ignore this buffer
> -	 */
> -	if (log->l_quotaoffs_flag & type)
> -		return false;
> -
> -	xlog_recover_do_reg_buffer(mp, item, bp, buf_f, NULLCOMMITLSN);
> -	return true;
> -}
> -
> -/*
> - * This routine replays a modification made to a buffer at runtime.
> - * There are actually two types of buffer, regular and inode, which
> - * are handled differently.  Inode buffers are handled differently
> - * in that we only recover a specific set of data from them, namely
> - * the inode di_next_unlinked fields.  This is because all other inode
> - * data is actually logged via inode records and any data we replay
> - * here which overlaps that may be stale.
> - *
> - * When meta-data buffers are freed at run time we log a buffer item
> - * with the XFS_BLF_CANCEL bit set to indicate that previous copies
> - * of the buffer in the log should not be replayed at recovery time.
> - * This is so that if the blocks covered by the buffer are reused for
> - * file data before we crash we don't end up replaying old, freed
> - * meta-data into a user's file.
> - *
> - * To handle the cancellation of buffer log items, we make two passes
> - * over the log during recovery.  During the first we build a table of
> - * those buffers which have been cancelled, and during the second we
> - * only replay those buffers which do not have corresponding cancel
> - * records in the table.  See xlog_recover_buffer_pass[1,2] above
> - * for more details on the implementation of the table of cancel records.
> - */
> -STATIC int
> -xlog_recover_buffer_pass2(
> -	struct xlog			*log,
> -	struct list_head		*buffer_list,
> -	struct xlog_recover_item	*item,
> -	xfs_lsn_t			current_lsn)
> -{
> -	xfs_buf_log_format_t	*buf_f = item->ri_buf[0].i_addr;
> -	xfs_mount_t		*mp = log->l_mp;
> -	xfs_buf_t		*bp;
> -	int			error;
> -	uint			buf_flags;
> -	xfs_lsn_t		lsn;
> -
> -	/*
> -	 * In this pass we only want to recover all the buffers which have
> -	 * not been cancelled and are not cancellation buffers themselves.
> -	 */
> -	if (buf_f->blf_flags & XFS_BLF_CANCEL) {
> -		if (xlog_put_buffer_cancelled(log, buf_f->blf_blkno,
> -				buf_f->blf_len))
> -			goto cancelled;
> -	} else {
> -
> -		if (xlog_is_buffer_cancelled(log, buf_f->blf_blkno,
> -				buf_f->blf_len))
> -			goto cancelled;
> -	}
> -
> -	trace_xfs_log_recover_buf_recover(log, buf_f);
> -
> -	buf_flags = 0;
> -	if (buf_f->blf_flags & XFS_BLF_INODE_BUF)
> -		buf_flags |= XBF_UNMAPPED;
> -
> -	error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno, buf_f->blf_len,
> -			  buf_flags, &bp, NULL);
> -	if (error)
> -		return error;
> -
> -	/*
> -	 * Recover the buffer only if we get an LSN from it and it's less than
> -	 * the lsn of the transaction we are replaying.
> -	 *
> -	 * Note that we have to be extremely careful of readahead here.
> -	 * Readahead does not attach verfiers to the buffers so if we don't
> -	 * actually do any replay after readahead because of the LSN we found
> -	 * in the buffer if more recent than that current transaction then we
> -	 * need to attach the verifier directly. Failure to do so can lead to
> -	 * future recovery actions (e.g. EFI and unlinked list recovery) can
> -	 * operate on the buffers and they won't get the verifier attached. This
> -	 * can lead to blocks on disk having the correct content but a stale
> -	 * CRC.
> -	 *
> -	 * It is safe to assume these clean buffers are currently up to date.
> -	 * If the buffer is dirtied by a later transaction being replayed, then
> -	 * the verifier will be reset to match whatever recover turns that
> -	 * buffer into.
> -	 */
> -	lsn = xlog_recover_get_buf_lsn(mp, bp);
> -	if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
> -		trace_xfs_log_recover_buf_skip(log, buf_f);
> -		xlog_recover_validate_buf_type(mp, bp, buf_f, NULLCOMMITLSN);
> -		goto out_release;
> -	}
> -
> -	if (buf_f->blf_flags & XFS_BLF_INODE_BUF) {
> -		error = xlog_recover_do_inode_buffer(mp, item, bp, buf_f);
> -		if (error)
> -			goto out_release;
> -	} else if (buf_f->blf_flags &
> -		  (XFS_BLF_UDQUOT_BUF|XFS_BLF_PDQUOT_BUF|XFS_BLF_GDQUOT_BUF)) {
> -		bool	dirty;
> -
> -		dirty = xlog_recover_do_dquot_buffer(mp, log, item, bp, buf_f);
> -		if (!dirty)
> -			goto out_release;
> -	} else {
> -		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
> -	}
> -
> -	/*
> -	 * Perform delayed write on the buffer.  Asynchronous writes will be
> -	 * slower when taking into account all the buffers to be flushed.
> -	 *
> -	 * Also make sure that only inode buffers with good sizes stay in
> -	 * the buffer cache.  The kernel moves inodes in buffers of 1 block
> -	 * or inode_cluster_size bytes, whichever is bigger.  The inode
> -	 * buffers in the log can be a different size if the log was generated
> -	 * by an older kernel using unclustered inode buffers or a newer kernel
> -	 * running with a different inode cluster size.  Regardless, if the
> -	 * the inode buffer size isn't max(blocksize, inode_cluster_size)
> -	 * for *our* value of inode_cluster_size, then we need to keep
> -	 * the buffer out of the buffer cache so that the buffer won't
> -	 * overlap with future reads of those inodes.
> -	 */
> -	if (XFS_DINODE_MAGIC ==
> -	    be16_to_cpu(*((__be16 *)xfs_buf_offset(bp, 0))) &&
> -	    (BBTOB(bp->b_length) != M_IGEO(log->l_mp)->inode_cluster_size)) {
> -		xfs_buf_stale(bp);
> -		error = xfs_bwrite(bp);
> -	} else {
> -		ASSERT(bp->b_mount == mp);
> -		bp->b_iodone = xlog_recover_iodone;
> -		xfs_buf_delwri_queue(bp, buffer_list);
> -	}
> -
> -out_release:
> -	xfs_buf_relse(bp);
> -	return error;
> -cancelled:
> -	trace_xfs_log_recover_buf_cancel(log, buf_f);
> -	return 0;
> -}
> -
>  /*
>   * Inode fork owner changes
>   *
> @@ -3887,10 +3102,11 @@ xlog_recover_commit_pass2(
>  {
>  	trace_xfs_log_recover_item_recover(log, trans, item, XLOG_RECOVER_PASS2);
>  
> +	if (item->ri_type && item->ri_type->commit_pass2_fn)
> +		return item->ri_type->commit_pass2_fn(log, buffer_list, item,
> +				trans->r_lsn);
> +
>  	switch (ITEM_TYPE(item)) {
> -	case XFS_LI_BUF:
> -		return xlog_recover_buffer_pass2(log, buffer_list, item,
> -						 trans->r_lsn);
>  	case XFS_LI_INODE:
>  		return xlog_recover_inode_pass2(log, buffer_list, item,
>  						 trans->r_lsn);
> 
> 
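For anyone following along, the pass2 hunk at the end of the quoted patch boils down to "call the per-type hook if the item has one, otherwise fall through to the old switch". Below is a minimal userspace sketch of that shape. It is not the kernel code: struct item, struct item_type, LEGACY_INODE and commit_pass2() are hypothetical stand-ins for struct xlog_recover_item, struct xlog_recover_item_type, XFS_LI_INODE and xlog_recover_commit_pass2(), and the buffer_list and log arguments are left out.

```c
#include <assert.h>
#include <stddef.h>

struct item;

/* Hypothetical stand-in for struct xlog_recover_item_type. */
struct item_type {
	int	(*commit_pass2_fn)(struct item *item, long lsn);
};

/* Hypothetical stand-in for struct xlog_recover_item. */
struct item {
	int			legacy_type;	/* plays the role of ITEM_TYPE(item) */
	const struct item_type	*type;		/* plays the role of item->ri_type */
	long			replayed_lsn;	/* records which path ran, for the sketch only */
};

enum { LEGACY_INODE = 1 };

/* A converted item type supplies its own pass2 commit function. */
static int buf_commit_pass2(struct item *item, long lsn)
{
	item->replayed_lsn = lsn;
	return 0;
}

static const struct item_type buf_item_type = {
	.commit_pass2_fn	= buf_commit_pass2,
};

/*
 * Try the per-type hook first; item types that have not been converted
 * yet still go through the switch statement.
 */
static int commit_pass2(struct item *item, long lsn)
{
	assert(item != NULL);

	if (item->type && item->type->commit_pass2_fn)
		return item->type->commit_pass2_fn(item, lsn);

	switch (item->legacy_type) {
	case LEGACY_INODE:
		/* negated only so the sketch can tell the two paths apart */
		item->replayed_lsn = -lsn;
		return 0;
	default:
		return -1;
	}
}
```

Each later patch in the series can then delete one case from the switch as its item type gains a commit_pass2_fn, without touching the callers.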

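For readers unfamiliar with the region copy loop in xlog_recover_do_reg_buffer() quoted above, here is a rough standalone sketch of the bitmap walk. Assumptions are mine: CHUNK_SHIFT mirrors XFS_BLF_SHIFT, next_bit() and contig_bits() are simplified stand-ins for xfs_next_bit() and xfs_contig_bits(), struct region stands in for the ri_buf[] entries, and regions are indexed from 0 here whereas the kernel starts at 1 because ri_buf[0] holds the buf log format structure. The dquot sanity checks are omitted.

```c
#include <assert.h>
#include <string.h>

#define CHUNK_SHIFT	7	/* 128-byte chunks; assumed to match XFS_BLF_SHIFT */
#define WORD_BITS	((int)(8 * sizeof(unsigned int)))

/* Hypothetical stand-in for one logged region (an ri_buf[] entry). */
struct region {
	const void	*addr;
	size_t		len;
};

static int test_bit(const unsigned int *map, int bit)
{
	return (int)((map[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1u);
}

/* First set bit at or after start, or -1 if there is none. */
static int next_bit(const unsigned int *map, int nwords, int start)
{
	for (int bit = start; bit < nwords * WORD_BITS; bit++)
		if (test_bit(map, bit))
			return bit;
	return -1;
}

/* Length of the run of set bits beginning at start. */
static int contig_bits(const unsigned int *map, int nwords, int start)
{
	int	n = 0;

	while (start + n < nwords * WORD_BITS && test_bit(map, start + n))
		n++;
	return n;
}

/*
 * Copy each logged region to the buffer offset encoded by its run of
 * bitmap bits.  Returns the number of regions consumed.
 */
static int replay_regions(unsigned char *buf, size_t buflen,
			  const unsigned int *map, int nwords,
			  const struct region *regions, int nregions)
{
	int	bit = 0;
	int	i = 0;

	while (i < nregions && (bit = next_bit(map, nwords, bit)) != -1) {
		int	nbits = contig_bits(map, nwords, bit);

		assert(nbits > 0);
		assert(regions[i].len > 0);
		assert(regions[i].len % ((size_t)1 << CHUNK_SHIFT) == 0);

		/* One contiguous run may be split over several regions. */
		if (regions[i].len < ((size_t)nbits << CHUNK_SHIFT))
			nbits = (int)(regions[i].len >> CHUNK_SHIFT);

		assert(((size_t)(bit + nbits) << CHUNK_SHIFT) <= buflen);
		memcpy(buf + ((size_t)bit << CHUNK_SHIFT), regions[i].addr,
		       (size_t)nbits << CHUNK_SHIFT);
		i++;
		bit += nbits;
	}
	return i;
}
```

The trim step is the part that is easy to miss: when a dirty run was split into two log vectors, the second half is picked up on the next iteration because bit only advances by the trimmed length.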

-- 
chandan





* Re: [PATCH 05/21] xfs: refactor log recovery inode item dispatch for pass2 commit functions
  2020-04-30  0:48 ` [PATCH 05/21] xfs: refactor log recovery inode " Darrick J. Wong
@ 2020-05-01 14:03   ` Chandan Rajendra
  0 siblings, 0 replies; 41+ messages in thread
From: Chandan Rajendra @ 2020-05-01 14:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thursday, April 30, 2020 6:18 AM Darrick J. Wong wrote: 
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Move the log inode item pass2 commit code into the per-item source code
> files and use the dispatch function to call it.  We do these one at a
> time because there's a lot of code to move.  No functional changes.
>

The changes look good to me.

Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_inode_item_recover.c |  355 +++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_log_recover.c        |  355 ---------------------------------------
>  2 files changed, 355 insertions(+), 355 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> index d97d8caa4652..46fc8a4b9ac6 100644
> --- a/fs/xfs/xfs_inode_item_recover.c
> +++ b/fs/xfs/xfs_inode_item_recover.c
> @@ -20,6 +20,8 @@
>  #include "xfs_error.h"
>  #include "xfs_log_priv.h"
>  #include "xfs_log_recover.h"
> +#include "xfs_icache.h"
> +#include "xfs_bmap_btree.h"
>  
>  STATIC void
>  xlog_recover_inode_ra_pass2(
> @@ -39,6 +41,359 @@ xlog_recover_inode_ra_pass2(
>  	}
>  }
>  
> +/*
> + * Inode fork owner changes
> + *
> + * If we have been told that we have to reparent the inode fork, it's because an
> + * extent swap operation on a CRC enabled filesystem has been done and we are
> + * replaying it. We need to walk the BMBT of the appropriate fork and change the
> + * owners of it.
> + *
> + * The complexity here is that we don't have an inode context to work with, so
> + * after we've replayed the inode we need to instantiate one.  This is where the
> + * fun begins.
> + *
> + * We are in the middle of log recovery, so we can't run transactions. That
> + * means we cannot use cache coherent inode instantiation via xfs_iget(), as
> + * that will result in the corresponding iput() running the inode through
> + * xfs_inactive(). If we've just replayed an inode core that changes the link
> + * count to zero (i.e. it's been unlinked), then xfs_inactive() will run
> + * transactions (bad!).
> + *
> + * So, to avoid this, we instantiate an inode directly from the inode core we've
> + * just recovered. We have the buffer still locked, and all we really need to
> + * instantiate is the inode core and the forks being modified. We can do this
> + * manually, then run the inode btree owner change, and then tear down the
> + * xfs_inode without having to run any transactions at all.
> + *
> + * Also, because we don't have a transaction context available here but need to
> + * gather all the buffers we modify for writeback so we pass the buffer_list
> + * instead for the operation to use.
> + */
> +
> +STATIC int
> +xfs_recover_inode_owner_change(
> +	struct xfs_mount	*mp,
> +	struct xfs_dinode	*dip,
> +	struct xfs_inode_log_format *in_f,
> +	struct list_head	*buffer_list)
> +{
> +	struct xfs_inode	*ip;
> +	int			error;
> +
> +	ASSERT(in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER));
> +
> +	ip = xfs_inode_alloc(mp, in_f->ilf_ino);
> +	if (!ip)
> +		return -ENOMEM;
> +
> +	/* instantiate the inode */
> +	ASSERT(dip->di_version >= 3);
> +	xfs_inode_from_disk(ip, dip);
> +
> +	error = xfs_iformat_fork(ip, dip);
> +	if (error)
> +		goto out_free_ip;
> +
> +	if (!xfs_inode_verify_forks(ip)) {
> +		error = -EFSCORRUPTED;
> +		goto out_free_ip;
> +	}
> +
> +	if (in_f->ilf_fields & XFS_ILOG_DOWNER) {
> +		ASSERT(in_f->ilf_fields & XFS_ILOG_DBROOT);
> +		error = xfs_bmbt_change_owner(NULL, ip, XFS_DATA_FORK,
> +					      ip->i_ino, buffer_list);
> +		if (error)
> +			goto out_free_ip;
> +	}
> +
> +	if (in_f->ilf_fields & XFS_ILOG_AOWNER) {
> +		ASSERT(in_f->ilf_fields & XFS_ILOG_ABROOT);
> +		error = xfs_bmbt_change_owner(NULL, ip, XFS_ATTR_FORK,
> +					      ip->i_ino, buffer_list);
> +		if (error)
> +			goto out_free_ip;
> +	}
> +
> +out_free_ip:
> +	xfs_inode_free(ip);
> +	return error;
> +}
> +
> +STATIC int
> +xlog_recover_inode_commit_pass2(
> +	struct xlog			*log,
> +	struct list_head		*buffer_list,
> +	struct xlog_recover_item	*item,
> +	xfs_lsn_t			current_lsn)
> +{
> +	struct xfs_inode_log_format	*in_f;
> +	struct xfs_mount		*mp = log->l_mp;
> +	struct xfs_buf			*bp;
> +	struct xfs_dinode		*dip;
> +	int				len;
> +	char				*src;
> +	char				*dest;
> +	int				error;
> +	int				attr_index;
> +	uint				fields;
> +	struct xfs_log_dinode		*ldip;
> +	uint				isize;
> +	int				need_free = 0;
> +
> +	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
> +		in_f = item->ri_buf[0].i_addr;
> +	} else {
> +		in_f = kmem_alloc(sizeof(struct xfs_inode_log_format), 0);
> +		need_free = 1;
> +		error = xfs_inode_item_format_convert(&item->ri_buf[0], in_f);
> +		if (error)
> +			goto error;
> +	}
> +
> +	/*
> +	 * Inode buffers can be freed, look out for it,
> +	 * and do not replay the inode.
> +	 */
> +	if (xlog_is_buffer_cancelled(log, in_f->ilf_blkno, in_f->ilf_len)) {
> +		error = 0;
> +		trace_xfs_log_recover_inode_cancel(log, in_f);
> +		goto error;
> +	}
> +	trace_xfs_log_recover_inode_recover(log, in_f);
> +
> +	error = xfs_buf_read(mp->m_ddev_targp, in_f->ilf_blkno, in_f->ilf_len,
> +			0, &bp, &xfs_inode_buf_ops);
> +	if (error)
> +		goto error;
> +	ASSERT(in_f->ilf_fields & XFS_ILOG_CORE);
> +	dip = xfs_buf_offset(bp, in_f->ilf_boffset);
> +
> +	/*
> +	 * Make sure the place we're flushing out to really looks
> +	 * like an inode!
> +	 */
> +	if (XFS_IS_CORRUPT(mp, !xfs_verify_magic16(bp, dip->di_magic))) {
> +		xfs_alert(mp,
> +	"%s: Bad inode magic number, dip = "PTR_FMT", dino bp = "PTR_FMT", ino = %Ld",
> +			__func__, dip, bp, in_f->ilf_ino);
> +		error = -EFSCORRUPTED;
> +		goto out_release;
> +	}
> +	ldip = item->ri_buf[1].i_addr;
> +	if (XFS_IS_CORRUPT(mp, ldip->di_magic != XFS_DINODE_MAGIC)) {
> +		xfs_alert(mp,
> +			"%s: Bad inode log record, rec ptr "PTR_FMT", ino %Ld",
> +			__func__, item, in_f->ilf_ino);
> +		error = -EFSCORRUPTED;
> +		goto out_release;
> +	}
> +
> +	/*
> +	 * If the inode has an LSN in it, recover the inode only if it's less
> +	 * than the lsn of the transaction we are replaying. Note: we still
> +	 * need to replay an owner change even though the inode is more recent
> +	 * than the transaction as there is no guarantee that all the btree
> +	 * blocks are more recent than this transaction, too.
> +	 */
> +	if (dip->di_version >= 3) {
> +		xfs_lsn_t	lsn = be64_to_cpu(dip->di_lsn);
> +
> +		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
> +			trace_xfs_log_recover_inode_skip(log, in_f);
> +			error = 0;
> +			goto out_owner_change;
> +		}
> +	}
> +
> +	/*
> +	 * di_flushiter is only valid for v1/2 inodes. All changes for v3 inodes
> +	 * are transactional and if ordering is necessary we can determine that
> +	 * more accurately by the LSN field in the V3 inode core. Don't trust
> +	 * the inode versions, as we might be changing them here - use the
> +	 * superblock flag to determine whether we need to look at di_flushiter
> +	 * to skip replay when the on disk inode is newer than the log one
> +	 */
> +	if (!xfs_sb_version_has_v3inode(&mp->m_sb) &&
> +	    ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> +		/*
> +		 * Deal with the wrap case, DI_MAX_FLUSH is less
> +		 * than smaller numbers
> +		 */
> +		if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> +		    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> +			/* do nothing */
> +		} else {
> +			trace_xfs_log_recover_inode_skip(log, in_f);
> +			error = 0;
> +			goto out_release;
> +		}
> +	}
> +
> +	/* Take the opportunity to reset the flush iteration count */
> +	ldip->di_flushiter = 0;
> +
> +	if (unlikely(S_ISREG(ldip->di_mode))) {
> +		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
> +		    (ldip->di_format != XFS_DINODE_FMT_BTREE)) {
> +			XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(3)",
> +					 XFS_ERRLEVEL_LOW, mp, ldip,
> +					 sizeof(*ldip));
> +			xfs_alert(mp,
> +		"%s: Bad regular inode log record, rec ptr "PTR_FMT", "
> +		"ino ptr = "PTR_FMT", ino bp = "PTR_FMT", ino %Ld",
> +				__func__, item, dip, bp, in_f->ilf_ino);
> +			error = -EFSCORRUPTED;
> +			goto out_release;
> +		}
> +	} else if (unlikely(S_ISDIR(ldip->di_mode))) {
> +		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
> +		    (ldip->di_format != XFS_DINODE_FMT_BTREE) &&
> +		    (ldip->di_format != XFS_DINODE_FMT_LOCAL)) {
> +			XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(4)",
> +					     XFS_ERRLEVEL_LOW, mp, ldip,
> +					     sizeof(*ldip));
> +			xfs_alert(mp,
> +		"%s: Bad dir inode log record, rec ptr "PTR_FMT", "
> +		"ino ptr = "PTR_FMT", ino bp = "PTR_FMT", ino %Ld",
> +				__func__, item, dip, bp, in_f->ilf_ino);
> +			error = -EFSCORRUPTED;
> +			goto out_release;
> +		}
> +	}
> +	if (unlikely(ldip->di_nextents + ldip->di_anextents > ldip->di_nblocks)){
> +		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(5)",
> +				     XFS_ERRLEVEL_LOW, mp, ldip,
> +				     sizeof(*ldip));
> +		xfs_alert(mp,
> +	"%s: Bad inode log record, rec ptr "PTR_FMT", dino ptr "PTR_FMT", "
> +	"dino bp "PTR_FMT", ino %Ld, total extents = %d, nblocks = %Ld",
> +			__func__, item, dip, bp, in_f->ilf_ino,
> +			ldip->di_nextents + ldip->di_anextents,
> +			ldip->di_nblocks);
> +		error = -EFSCORRUPTED;
> +		goto out_release;
> +	}
> +	if (unlikely(ldip->di_forkoff > mp->m_sb.sb_inodesize)) {
> +		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(6)",
> +				     XFS_ERRLEVEL_LOW, mp, ldip,
> +				     sizeof(*ldip));
> +		xfs_alert(mp,
> +	"%s: Bad inode log record, rec ptr "PTR_FMT", dino ptr "PTR_FMT", "
> +	"dino bp "PTR_FMT", ino %Ld, forkoff 0x%x", __func__,
> +			item, dip, bp, in_f->ilf_ino, ldip->di_forkoff);
> +		error = -EFSCORRUPTED;
> +		goto out_release;
> +	}
> +	isize = xfs_log_dinode_size(mp);
> +	if (unlikely(item->ri_buf[1].i_len > isize)) {
> +		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(7)",
> +				     XFS_ERRLEVEL_LOW, mp, ldip,
> +				     sizeof(*ldip));
> +		xfs_alert(mp,
> +			"%s: Bad inode log record length %d, rec ptr "PTR_FMT,
> +			__func__, item->ri_buf[1].i_len, item);
> +		error = -EFSCORRUPTED;
> +		goto out_release;
> +	}
> +
> +	/* recover the log dinode inode into the on disk inode */
> +	xfs_log_dinode_to_disk(ldip, dip);
> +
> +	fields = in_f->ilf_fields;
> +	if (fields & XFS_ILOG_DEV)
> +		xfs_dinode_put_rdev(dip, in_f->ilf_u.ilfu_rdev);
> +
> +	if (in_f->ilf_size == 2)
> +		goto out_owner_change;
> +	len = item->ri_buf[2].i_len;
> +	src = item->ri_buf[2].i_addr;
> +	ASSERT(in_f->ilf_size <= 4);
> +	ASSERT((in_f->ilf_size == 3) || (fields & XFS_ILOG_AFORK));
> +	ASSERT(!(fields & XFS_ILOG_DFORK) ||
> +	       (len == in_f->ilf_dsize));
> +
> +	switch (fields & XFS_ILOG_DFORK) {
> +	case XFS_ILOG_DDATA:
> +	case XFS_ILOG_DEXT:
> +		memcpy(XFS_DFORK_DPTR(dip), src, len);
> +		break;
> +
> +	case XFS_ILOG_DBROOT:
> +		xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src, len,
> +				 (struct xfs_bmdr_block *)XFS_DFORK_DPTR(dip),
> +				 XFS_DFORK_DSIZE(dip, mp));
> +		break;
> +
> +	default:
> +		/*
> +		 * There are no data fork flags set.
> +		 */
> +		ASSERT((fields & XFS_ILOG_DFORK) == 0);
> +		break;
> +	}
> +
> +	/*
> +	 * If we logged any attribute data, recover it.  There may or
> +	 * may not have been any other non-core data logged in this
> +	 * transaction.
> +	 */
> +	if (in_f->ilf_fields & XFS_ILOG_AFORK) {
> +		if (in_f->ilf_fields & XFS_ILOG_DFORK) {
> +			attr_index = 3;
> +		} else {
> +			attr_index = 2;
> +		}
> +		len = item->ri_buf[attr_index].i_len;
> +		src = item->ri_buf[attr_index].i_addr;
> +		ASSERT(len == in_f->ilf_asize);
> +
> +		switch (in_f->ilf_fields & XFS_ILOG_AFORK) {
> +		case XFS_ILOG_ADATA:
> +		case XFS_ILOG_AEXT:
> +			dest = XFS_DFORK_APTR(dip);
> +			ASSERT(len <= XFS_DFORK_ASIZE(dip, mp));
> +			memcpy(dest, src, len);
> +			break;
> +
> +		case XFS_ILOG_ABROOT:
> +			dest = XFS_DFORK_APTR(dip);
> +			xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src,
> +					 len, (struct xfs_bmdr_block *)dest,
> +					 XFS_DFORK_ASIZE(dip, mp));
> +			break;
> +
> +		default:
> +			xfs_warn(log->l_mp, "%s: Invalid flag", __func__);
> +			ASSERT(0);
> +			error = -EFSCORRUPTED;
> +			goto out_release;
> +		}
> +	}
> +
> +out_owner_change:
> +	/* Recover the swapext owner change unless inode has been deleted */
> +	if ((in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER)) &&
> +	    (dip->di_mode != 0))
> +		error = xfs_recover_inode_owner_change(mp, dip, in_f,
> +						       buffer_list);
> +	/* re-generate the checksum. */
> +	xfs_dinode_calc_crc(log->l_mp, dip);
> +
> +	ASSERT(bp->b_mount == mp);
> +	bp->b_iodone = xlog_recover_iodone;
> +	xfs_buf_delwri_queue(bp, buffer_list);
> +
> +out_release:
> +	xfs_buf_relse(bp);
> +error:
> +	if (need_free)
> +		kmem_free(in_f);
> +	return error;
> +}
> +
>  const struct xlog_recover_item_type xlog_inode_item_type = {
>  	.ra_pass2_fn		= xlog_recover_inode_ra_pass2,
> +	.commit_pass2_fn	= xlog_recover_inode_commit_pass2,
>  };
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 0a241f1c371a..57e5dac0f510 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2056,358 +2056,6 @@ xlog_buf_readahead(
>  		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
>  }
>  
> -/*
> - * Inode fork owner changes
> - *
> - * If we have been told that we have to reparent the inode fork, it's because an
> - * extent swap operation on a CRC enabled filesystem has been done and we are
> - * replaying it. We need to walk the BMBT of the appropriate fork and change the
> - * owners of it.
> - *
> - * The complexity here is that we don't have an inode context to work with, so
> - * after we've replayed the inode we need to instantiate one.  This is where the
> - * fun begins.
> - *
> - * We are in the middle of log recovery, so we can't run transactions. That
> - * means we cannot use cache coherent inode instantiation via xfs_iget(), as
> - * that will result in the corresponding iput() running the inode through
> - * xfs_inactive(). If we've just replayed an inode core that changes the link
> - * count to zero (i.e. it's been unlinked), then xfs_inactive() will run
> - * transactions (bad!).
> - *
> - * So, to avoid this, we instantiate an inode directly from the inode core we've
> - * just recovered. We have the buffer still locked, and all we really need to
> - * instantiate is the inode core and the forks being modified. We can do this
> - * manually, then run the inode btree owner change, and then tear down the
> - * xfs_inode without having to run any transactions at all.
> - *
> - * Also, because we don't have a transaction context available here but need to
> - * gather all the buffers we modify for writeback so we pass the buffer_list
> - * instead for the operation to use.
> - */
> -
> -STATIC int
> -xfs_recover_inode_owner_change(
> -	struct xfs_mount	*mp,
> -	struct xfs_dinode	*dip,
> -	struct xfs_inode_log_format *in_f,
> -	struct list_head	*buffer_list)
> -{
> -	struct xfs_inode	*ip;
> -	int			error;
> -
> -	ASSERT(in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER));
> -
> -	ip = xfs_inode_alloc(mp, in_f->ilf_ino);
> -	if (!ip)
> -		return -ENOMEM;
> -
> -	/* instantiate the inode */
> -	ASSERT(dip->di_version >= 3);
> -	xfs_inode_from_disk(ip, dip);
> -
> -	error = xfs_iformat_fork(ip, dip);
> -	if (error)
> -		goto out_free_ip;
> -
> -	if (!xfs_inode_verify_forks(ip)) {
> -		error = -EFSCORRUPTED;
> -		goto out_free_ip;
> -	}
> -
> -	if (in_f->ilf_fields & XFS_ILOG_DOWNER) {
> -		ASSERT(in_f->ilf_fields & XFS_ILOG_DBROOT);
> -		error = xfs_bmbt_change_owner(NULL, ip, XFS_DATA_FORK,
> -					      ip->i_ino, buffer_list);
> -		if (error)
> -			goto out_free_ip;
> -	}
> -
> -	if (in_f->ilf_fields & XFS_ILOG_AOWNER) {
> -		ASSERT(in_f->ilf_fields & XFS_ILOG_ABROOT);
> -		error = xfs_bmbt_change_owner(NULL, ip, XFS_ATTR_FORK,
> -					      ip->i_ino, buffer_list);
> -		if (error)
> -			goto out_free_ip;
> -	}
> -
> -out_free_ip:
> -	xfs_inode_free(ip);
> -	return error;
> -}
> -
> -STATIC int
> -xlog_recover_inode_pass2(
> -	struct xlog			*log,
> -	struct list_head		*buffer_list,
> -	struct xlog_recover_item	*item,
> -	xfs_lsn_t			current_lsn)
> -{
> -	struct xfs_inode_log_format	*in_f;
> -	xfs_mount_t		*mp = log->l_mp;
> -	xfs_buf_t		*bp;
> -	xfs_dinode_t		*dip;
> -	int			len;
> -	char			*src;
> -	char			*dest;
> -	int			error;
> -	int			attr_index;
> -	uint			fields;
> -	struct xfs_log_dinode	*ldip;
> -	uint			isize;
> -	int			need_free = 0;
> -
> -	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
> -		in_f = item->ri_buf[0].i_addr;
> -	} else {
> -		in_f = kmem_alloc(sizeof(struct xfs_inode_log_format), 0);
> -		need_free = 1;
> -		error = xfs_inode_item_format_convert(&item->ri_buf[0], in_f);
> -		if (error)
> -			goto error;
> -	}
> -
> -	/*
> -	 * Inode buffers can be freed, look out for it,
> -	 * and do not replay the inode.
> -	 */
> -	if (xlog_is_buffer_cancelled(log, in_f->ilf_blkno, in_f->ilf_len)) {
> -		error = 0;
> -		trace_xfs_log_recover_inode_cancel(log, in_f);
> -		goto error;
> -	}
> -	trace_xfs_log_recover_inode_recover(log, in_f);
> -
> -	error = xfs_buf_read(mp->m_ddev_targp, in_f->ilf_blkno, in_f->ilf_len,
> -			0, &bp, &xfs_inode_buf_ops);
> -	if (error)
> -		goto error;
> -	ASSERT(in_f->ilf_fields & XFS_ILOG_CORE);
> -	dip = xfs_buf_offset(bp, in_f->ilf_boffset);
> -
> -	/*
> -	 * Make sure the place we're flushing out to really looks
> -	 * like an inode!
> -	 */
> -	if (XFS_IS_CORRUPT(mp, !xfs_verify_magic16(bp, dip->di_magic))) {
> -		xfs_alert(mp,
> -	"%s: Bad inode magic number, dip = "PTR_FMT", dino bp = "PTR_FMT", ino = %Ld",
> -			__func__, dip, bp, in_f->ilf_ino);
> -		error = -EFSCORRUPTED;
> -		goto out_release;
> -	}
> -	ldip = item->ri_buf[1].i_addr;
> -	if (XFS_IS_CORRUPT(mp, ldip->di_magic != XFS_DINODE_MAGIC)) {
> -		xfs_alert(mp,
> -			"%s: Bad inode log record, rec ptr "PTR_FMT", ino %Ld",
> -			__func__, item, in_f->ilf_ino);
> -		error = -EFSCORRUPTED;
> -		goto out_release;
> -	}
> -
> -	/*
> -	 * If the inode has an LSN in it, recover the inode only if it's less
> -	 * than the lsn of the transaction we are replaying. Note: we still
> -	 * need to replay an owner change even though the inode is more recent
> -	 * than the transaction as there is no guarantee that all the btree
> -	 * blocks are more recent than this transaction, too.
> -	 */
> -	if (dip->di_version >= 3) {
> -		xfs_lsn_t	lsn = be64_to_cpu(dip->di_lsn);
> -
> -		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
> -			trace_xfs_log_recover_inode_skip(log, in_f);
> -			error = 0;
> -			goto out_owner_change;
> -		}
> -	}
> -
> -	/*
> -	 * di_flushiter is only valid for v1/2 inodes. All changes for v3 inodes
> -	 * are transactional and if ordering is necessary we can determine that
> -	 * more accurately by the LSN field in the V3 inode core. Don't trust
> -	 * the inode versions we might be changing them here - use the
> -	 * superblock flag to determine whether we need to look at di_flushiter
> -	 * to skip replay when the on disk inode is newer than the log one
> -	 */
> -	if (!xfs_sb_version_has_v3inode(&mp->m_sb) &&
> -	    ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
> -		/*
> -		 * Deal with the wrap case, DI_MAX_FLUSH is less
> -		 * than smaller numbers
> -		 */
> -		if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
> -		    ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
> -			/* do nothing */
> -		} else {
> -			trace_xfs_log_recover_inode_skip(log, in_f);
> -			error = 0;
> -			goto out_release;
> -		}
> -	}
> -
> -	/* Take the opportunity to reset the flush iteration count */
> -	ldip->di_flushiter = 0;
> -
> -	if (unlikely(S_ISREG(ldip->di_mode))) {
> -		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
> -		    (ldip->di_format != XFS_DINODE_FMT_BTREE)) {
> -			XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(3)",
> -					 XFS_ERRLEVEL_LOW, mp, ldip,
> -					 sizeof(*ldip));
> -			xfs_alert(mp,
> -		"%s: Bad regular inode log record, rec ptr "PTR_FMT", "
> -		"ino ptr = "PTR_FMT", ino bp = "PTR_FMT", ino %Ld",
> -				__func__, item, dip, bp, in_f->ilf_ino);
> -			error = -EFSCORRUPTED;
> -			goto out_release;
> -		}
> -	} else if (unlikely(S_ISDIR(ldip->di_mode))) {
> -		if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
> -		    (ldip->di_format != XFS_DINODE_FMT_BTREE) &&
> -		    (ldip->di_format != XFS_DINODE_FMT_LOCAL)) {
> -			XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(4)",
> -					     XFS_ERRLEVEL_LOW, mp, ldip,
> -					     sizeof(*ldip));
> -			xfs_alert(mp,
> -		"%s: Bad dir inode log record, rec ptr "PTR_FMT", "
> -		"ino ptr = "PTR_FMT", ino bp = "PTR_FMT", ino %Ld",
> -				__func__, item, dip, bp, in_f->ilf_ino);
> -			error = -EFSCORRUPTED;
> -			goto out_release;
> -		}
> -	}
> -	if (unlikely(ldip->di_nextents + ldip->di_anextents > ldip->di_nblocks)){
> -		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(5)",
> -				     XFS_ERRLEVEL_LOW, mp, ldip,
> -				     sizeof(*ldip));
> -		xfs_alert(mp,
> -	"%s: Bad inode log record, rec ptr "PTR_FMT", dino ptr "PTR_FMT", "
> -	"dino bp "PTR_FMT", ino %Ld, total extents = %d, nblocks = %Ld",
> -			__func__, item, dip, bp, in_f->ilf_ino,
> -			ldip->di_nextents + ldip->di_anextents,
> -			ldip->di_nblocks);
> -		error = -EFSCORRUPTED;
> -		goto out_release;
> -	}
> -	if (unlikely(ldip->di_forkoff > mp->m_sb.sb_inodesize)) {
> -		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(6)",
> -				     XFS_ERRLEVEL_LOW, mp, ldip,
> -				     sizeof(*ldip));
> -		xfs_alert(mp,
> -	"%s: Bad inode log record, rec ptr "PTR_FMT", dino ptr "PTR_FMT", "
> -	"dino bp "PTR_FMT", ino %Ld, forkoff 0x%x", __func__,
> -			item, dip, bp, in_f->ilf_ino, ldip->di_forkoff);
> -		error = -EFSCORRUPTED;
> -		goto out_release;
> -	}
> -	isize = xfs_log_dinode_size(mp);
> -	if (unlikely(item->ri_buf[1].i_len > isize)) {
> -		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(7)",
> -				     XFS_ERRLEVEL_LOW, mp, ldip,
> -				     sizeof(*ldip));
> -		xfs_alert(mp,
> -			"%s: Bad inode log record length %d, rec ptr "PTR_FMT,
> -			__func__, item->ri_buf[1].i_len, item);
> -		error = -EFSCORRUPTED;
> -		goto out_release;
> -	}
> -
> -	/* recover the log dinode inode into the on disk inode */
> -	xfs_log_dinode_to_disk(ldip, dip);
> -
> -	fields = in_f->ilf_fields;
> -	if (fields & XFS_ILOG_DEV)
> -		xfs_dinode_put_rdev(dip, in_f->ilf_u.ilfu_rdev);
> -
> -	if (in_f->ilf_size == 2)
> -		goto out_owner_change;
> -	len = item->ri_buf[2].i_len;
> -	src = item->ri_buf[2].i_addr;
> -	ASSERT(in_f->ilf_size <= 4);
> -	ASSERT((in_f->ilf_size == 3) || (fields & XFS_ILOG_AFORK));
> -	ASSERT(!(fields & XFS_ILOG_DFORK) ||
> -	       (len == in_f->ilf_dsize));
> -
> -	switch (fields & XFS_ILOG_DFORK) {
> -	case XFS_ILOG_DDATA:
> -	case XFS_ILOG_DEXT:
> -		memcpy(XFS_DFORK_DPTR(dip), src, len);
> -		break;
> -
> -	case XFS_ILOG_DBROOT:
> -		xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src, len,
> -				 (xfs_bmdr_block_t *)XFS_DFORK_DPTR(dip),
> -				 XFS_DFORK_DSIZE(dip, mp));
> -		break;
> -
> -	default:
> -		/*
> -		 * There are no data fork flags set.
> -		 */
> -		ASSERT((fields & XFS_ILOG_DFORK) == 0);
> -		break;
> -	}
> -
> -	/*
> -	 * If we logged any attribute data, recover it.  There may or
> -	 * may not have been any other non-core data logged in this
> -	 * transaction.
> -	 */
> -	if (in_f->ilf_fields & XFS_ILOG_AFORK) {
> -		if (in_f->ilf_fields & XFS_ILOG_DFORK) {
> -			attr_index = 3;
> -		} else {
> -			attr_index = 2;
> -		}
> -		len = item->ri_buf[attr_index].i_len;
> -		src = item->ri_buf[attr_index].i_addr;
> -		ASSERT(len == in_f->ilf_asize);
> -
> -		switch (in_f->ilf_fields & XFS_ILOG_AFORK) {
> -		case XFS_ILOG_ADATA:
> -		case XFS_ILOG_AEXT:
> -			dest = XFS_DFORK_APTR(dip);
> -			ASSERT(len <= XFS_DFORK_ASIZE(dip, mp));
> -			memcpy(dest, src, len);
> -			break;
> -
> -		case XFS_ILOG_ABROOT:
> -			dest = XFS_DFORK_APTR(dip);
> -			xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src,
> -					 len, (xfs_bmdr_block_t*)dest,
> -					 XFS_DFORK_ASIZE(dip, mp));
> -			break;
> -
> -		default:
> -			xfs_warn(log->l_mp, "%s: Invalid flag", __func__);
> -			ASSERT(0);
> -			error = -EFSCORRUPTED;
> -			goto out_release;
> -		}
> -	}
> -
> -out_owner_change:
> -	/* Recover the swapext owner change unless inode has been deleted */
> -	if ((in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER)) &&
> -	    (dip->di_mode != 0))
> -		error = xfs_recover_inode_owner_change(mp, dip, in_f,
> -						       buffer_list);
> -	/* re-generate the checksum. */
> -	xfs_dinode_calc_crc(log->l_mp, dip);
> -
> -	ASSERT(bp->b_mount == mp);
> -	bp->b_iodone = xlog_recover_iodone;
> -	xfs_buf_delwri_queue(bp, buffer_list);
> -
> -out_release:
> -	xfs_buf_relse(bp);
> -error:
> -	if (need_free)
> -		kmem_free(in_f);
> -	return error;
> -}
> -
>  /*
>   * Recover a dquot record
>   */
> @@ -3107,9 +2755,6 @@ xlog_recover_commit_pass2(
>  				trans->r_lsn);
>  
>  	switch (ITEM_TYPE(item)) {
> -	case XFS_LI_INODE:
> -		return xlog_recover_inode_pass2(log, buffer_list, item,
> -						 trans->r_lsn);
>  	case XFS_LI_EFI:
>  		return xlog_recover_efi_pass2(log, item, trans->r_lsn);
>  	case XFS_LI_EFD:
> 
> 
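For readers tracing the two replay-skip checks in
xlog_recover_inode_commit_pass2() above, here is a minimal userspace sketch,
not the kernel code.  It assumes that an LSN packs the cycle number in the
upper 32 bits and the block number in the lower 32 bits, and that
DI_MAX_FLUSH is 0xffff.  The names lsn_t, lsn_cmp(), skip_by_lsn(),
skip_by_flushiter() and MAX_FLUSH are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t lsn_t;		/* hypothetical stand-in for xfs_lsn_t */

/* Assumed layout: cycle in the high 32 bits, block in the low 32 bits. */
static uint32_t lsn_cycle(lsn_t lsn) { return (uint32_t)((uint64_t)lsn >> 32); }
static uint32_t lsn_block(lsn_t lsn) { return (uint32_t)lsn; }

/* Three-way compare in the spirit of XFS_LSN_CMP(): cycle first, then block. */
static int lsn_cmp(lsn_t a, lsn_t b)
{
	if (lsn_cycle(a) != lsn_cycle(b))
		return lsn_cycle(a) < lsn_cycle(b) ? -1 : 1;
	if (lsn_block(a) != lsn_block(b))
		return lsn_block(a) < lsn_block(b) ? -1 : 1;
	return 0;
}

/* v3 inodes: skip replay if the on-disk LSN is valid and not older. */
static bool skip_by_lsn(lsn_t disk_lsn, lsn_t current_lsn)
{
	return disk_lsn && disk_lsn != -1 &&
	       lsn_cmp(disk_lsn, current_lsn) >= 0;
}

#define MAX_FLUSH	0xffff	/* assumed value of DI_MAX_FLUSH */

/* v1/v2 inodes: skip if the on-disk counter is newer, unless it wrapped. */
static bool skip_by_flushiter(uint16_t log_iter, uint16_t disk_iter)
{
	if (log_iter >= disk_iter)
		return false;		/* log copy is not older: replay */
	if (disk_iter == MAX_FLUSH && log_iter < (MAX_FLUSH >> 1))
		return false;		/* wrap case: replay anyway */
	return true;
}
```

Note that in the patch an LSN-based skip still jumps to out_owner_change, so
a logged owner change is replayed and the buffer is requeued, whereas a
di_flushiter-based skip goes straight to out_release.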


-- 
chandan




* Re: [PATCH 06/21] xfs: refactor log recovery dquot item dispatch for pass2 commit functions
  2020-04-30  0:48 ` [PATCH 06/21] xfs: refactor log recovery dquot " Darrick J. Wong
@ 2020-05-01 14:14   ` Chandan Rajendra
  0 siblings, 0 replies; 41+ messages in thread
From: Chandan Rajendra @ 2020-05-01 14:14 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thursday, April 30, 2020 6:18 AM Darrick J. Wong wrote: 
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Move the log dquot item pass2 commit code into the per-item source code
> files and use the dispatch function to call it.  We do these one at a
> time because there's a lot of code to move.  No functional changes.
>

The changes look good to me.

Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
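
For anyone new to the series, the dispatch idea described in the commit
message can be modelled roughly as below.  This is a simplified userspace
sketch under my own assumptions, not the kernel implementation: the struct
and function names (recover_item_type, recover_item, recover_commit_pass2,
and so on) are hypothetical, the argument lists are trimmed, and the -1
return only stands in for "fall through to the remaining switch statement".

```c
#include <assert.h>
#include <stddef.h>

struct recover_item;

/* Per-item-type operations, loosely modelled on xlog_recover_item_type. */
struct recover_item_type {
	void	(*ra_pass2_fn)(struct recover_item *item);
	int	(*commit_pass2_fn)(struct recover_item *item, long lsn);
};

/* A recovered log item that carries a pointer to its type's ops table. */
struct recover_item {
	const struct recover_item_type	*ops;
	int				replayed;
};

/* The dquot-specific handler lives next to the rest of the dquot code. */
static int dquot_commit_pass2(struct recover_item *item, long lsn)
{
	(void)lsn;		/* unused in this toy model */
	item->replayed = 1;
	return 0;
}

static const struct recover_item_type dquot_item_type = {
	.commit_pass2_fn	= dquot_commit_pass2,
};

/* An item type that has not been converted to the dispatch table yet. */
static const struct recover_item_type legacy_item_type = {
	.commit_pass2_fn	= NULL,
};

/* Generic code calls through the table instead of switching on the type. */
static int recover_commit_pass2(struct recover_item *item, long lsn)
{
	if (item->ops && item->ops->commit_pass2_fn)
		return item->ops->commit_pass2_fn(item, lsn);
	return -1;		/* not converted: caller uses the old switch */
}
```

Each patch in this part of the series then amounts to filling in one more
.commit_pass2_fn slot and deleting the matching case from the switch.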

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_dquot_item.c  |  109 +++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_log_recover.c |  112 ----------------------------------------------
>  2 files changed, 109 insertions(+), 112 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> index 4d18af49adfe..83bd7ded9185 100644
> --- a/fs/xfs/xfs_dquot_item.c
> +++ b/fs/xfs/xfs_dquot_item.c
> @@ -419,8 +419,117 @@ xlog_recover_dquot_ra_pass2(
>  			&xfs_dquot_buf_ra_ops);
>  }
>  
> +/*
> + * Recover a dquot record
> + */
> +STATIC int
> +xlog_recover_dquot_commit_pass2(
> +	struct xlog			*log,
> +	struct list_head		*buffer_list,
> +	struct xlog_recover_item	*item,
> +	xfs_lsn_t			current_lsn)
> +{
> +	struct xfs_mount		*mp = log->l_mp;
> +	struct xfs_buf			*bp;
> +	struct xfs_disk_dquot		*ddq, *recddq;
> +	struct xfs_dq_logformat		*dq_f;
> +	xfs_failaddr_t			fa;
> +	int				error;
> +	uint				type;
> +
> +	/*
> +	 * Filesystems are required to send in quota flags at mount time.
> +	 */
> +	if (mp->m_qflags == 0)
> +		return 0;
> +
> +	recddq = item->ri_buf[1].i_addr;
> +	if (recddq == NULL) {
> +		xfs_alert(log->l_mp, "NULL dquot in %s.", __func__);
> +		return -EFSCORRUPTED;
> +	}
> +	if (item->ri_buf[1].i_len < sizeof(struct xfs_disk_dquot)) {
> +		xfs_alert(log->l_mp, "dquot too small (%d) in %s.",
> +			item->ri_buf[1].i_len, __func__);
> +		return -EFSCORRUPTED;
> +	}
> +
> +	/*
> +	 * This type of quotas was turned off, so ignore this record.
> +	 */
> +	type = recddq->d_flags & (XFS_DQ_USER | XFS_DQ_PROJ | XFS_DQ_GROUP);
> +	ASSERT(type);
> +	if (log->l_quotaoffs_flag & type)
> +		return 0;
> +
> +	/*
> +	 * At this point we know that quota was _not_ turned off.
> +	 * Since the mount flags are not indicating to us otherwise, this
> +	 * must mean that quota is on, and the dquot needs to be replayed.
> +	 * Remember that we may not have fully recovered the superblock yet,
> +	 * so we can't do the usual trick of looking at the SB quota bits.
> +	 *
> +	 * The other possibility, of course, is that the quota subsystem was
> +	 * removed since the last mount - ENOSYS.
> +	 */
> +	dq_f = item->ri_buf[0].i_addr;
> +	ASSERT(dq_f);
> +	fa = xfs_dquot_verify(mp, recddq, dq_f->qlf_id, 0);
> +	if (fa) {
> +		xfs_alert(mp, "corrupt dquot ID 0x%x in log at %pS",
> +				dq_f->qlf_id, fa);
> +		return -EFSCORRUPTED;
> +	}
> +	ASSERT(dq_f->qlf_len == 1);
> +
> +	/*
> +	 * At this point we are assuming that the dquots have been allocated
> +	 * and hence the buffer has valid dquots stamped in it. It should,
> +	 * therefore, pass verifier validation. If the dquot is bad, then
> +	 * we'll return an error here, so we don't need to specifically check
> +	 * the dquot in the buffer after the verifier has run.
> +	 */
> +	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dq_f->qlf_blkno,
> +				   XFS_FSB_TO_BB(mp, dq_f->qlf_len), 0, &bp,
> +				   &xfs_dquot_buf_ops);
> +	if (error)
> +		return error;
> +
> +	ASSERT(bp);
> +	ddq = xfs_buf_offset(bp, dq_f->qlf_boffset);
> +
> +	/*
> +	 * If the dquot has an LSN in it, recover the dquot only if it's less
> +	 * than the lsn of the transaction we are replaying.
> +	 */
> +	if (xfs_sb_version_hascrc(&mp->m_sb)) {
> +		struct xfs_dqblk *dqb = (struct xfs_dqblk *)ddq;
> +		xfs_lsn_t	lsn = be64_to_cpu(dqb->dd_lsn);
> +
> +		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
> +			goto out_release;
> +		}
> +	}
> +
> +	memcpy(ddq, recddq, item->ri_buf[1].i_len);
> +	if (xfs_sb_version_hascrc(&mp->m_sb)) {
> +		xfs_update_cksum((char *)ddq, sizeof(struct xfs_dqblk),
> +				 XFS_DQUOT_CRC_OFF);
> +	}
> +
> +	ASSERT(dq_f->qlf_size == 2);
> +	ASSERT(bp->b_mount == mp);
> +	bp->b_iodone = xlog_recover_iodone;
> +	xfs_buf_delwri_queue(bp, buffer_list);
> +
> +out_release:
> +	xfs_buf_relse(bp);
> +	return 0;
> +}
> +
>  const struct xlog_recover_item_type xlog_dquot_item_type = {
>  	.ra_pass2_fn		= xlog_recover_dquot_ra_pass2,
> +	.commit_pass2_fn	= xlog_recover_dquot_commit_pass2,
>  };
>  
>  /*
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 57e5dac0f510..58a54d9e6847 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2056,115 +2056,6 @@ xlog_buf_readahead(
>  		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
>  }
>  
> -/*
> - * Recover a dquot record
> - */
> -STATIC int
> -xlog_recover_dquot_pass2(
> -	struct xlog			*log,
> -	struct list_head		*buffer_list,
> -	struct xlog_recover_item	*item,
> -	xfs_lsn_t			current_lsn)
> -{
> -	xfs_mount_t		*mp = log->l_mp;
> -	xfs_buf_t		*bp;
> -	struct xfs_disk_dquot	*ddq, *recddq;
> -	xfs_failaddr_t		fa;
> -	int			error;
> -	xfs_dq_logformat_t	*dq_f;
> -	uint			type;
> -
> -
> -	/*
> -	 * Filesystems are required to send in quota flags at mount time.
> -	 */
> -	if (mp->m_qflags == 0)
> -		return 0;
> -
> -	recddq = item->ri_buf[1].i_addr;
> -	if (recddq == NULL) {
> -		xfs_alert(log->l_mp, "NULL dquot in %s.", __func__);
> -		return -EFSCORRUPTED;
> -	}
> -	if (item->ri_buf[1].i_len < sizeof(struct xfs_disk_dquot)) {
> -		xfs_alert(log->l_mp, "dquot too small (%d) in %s.",
> -			item->ri_buf[1].i_len, __func__);
> -		return -EFSCORRUPTED;
> -	}
> -
> -	/*
> -	 * This type of quotas was turned off, so ignore this record.
> -	 */
> -	type = recddq->d_flags & (XFS_DQ_USER | XFS_DQ_PROJ | XFS_DQ_GROUP);
> -	ASSERT(type);
> -	if (log->l_quotaoffs_flag & type)
> -		return 0;
> -
> -	/*
> -	 * At this point we know that quota was _not_ turned off.
> -	 * Since the mount flags are not indicating to us otherwise, this
> -	 * must mean that quota is on, and the dquot needs to be replayed.
> -	 * Remember that we may not have fully recovered the superblock yet,
> -	 * so we can't do the usual trick of looking at the SB quota bits.
> -	 *
> -	 * The other possibility, of course, is that the quota subsystem was
> -	 * removed since the last mount - ENOSYS.
> -	 */
> -	dq_f = item->ri_buf[0].i_addr;
> -	ASSERT(dq_f);
> -	fa = xfs_dquot_verify(mp, recddq, dq_f->qlf_id, 0);
> -	if (fa) {
> -		xfs_alert(mp, "corrupt dquot ID 0x%x in log at %pS",
> -				dq_f->qlf_id, fa);
> -		return -EFSCORRUPTED;
> -	}
> -	ASSERT(dq_f->qlf_len == 1);
> -
> -	/*
> -	 * At this point we are assuming that the dquots have been allocated
> -	 * and hence the buffer has valid dquots stamped in it. It should,
> -	 * therefore, pass verifier validation. If the dquot is bad, then the
> -	 * we'll return an error here, so we don't need to specifically check
> -	 * the dquot in the buffer after the verifier has run.
> -	 */
> -	error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dq_f->qlf_blkno,
> -				   XFS_FSB_TO_BB(mp, dq_f->qlf_len), 0, &bp,
> -				   &xfs_dquot_buf_ops);
> -	if (error)
> -		return error;
> -
> -	ASSERT(bp);
> -	ddq = xfs_buf_offset(bp, dq_f->qlf_boffset);
> -
> -	/*
> -	 * If the dquot has an LSN in it, recover the dquot only if it's less
> -	 * than the lsn of the transaction we are replaying.
> -	 */
> -	if (xfs_sb_version_hascrc(&mp->m_sb)) {
> -		struct xfs_dqblk *dqb = (struct xfs_dqblk *)ddq;
> -		xfs_lsn_t	lsn = be64_to_cpu(dqb->dd_lsn);
> -
> -		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
> -			goto out_release;
> -		}
> -	}
> -
> -	memcpy(ddq, recddq, item->ri_buf[1].i_len);
> -	if (xfs_sb_version_hascrc(&mp->m_sb)) {
> -		xfs_update_cksum((char *)ddq, sizeof(struct xfs_dqblk),
> -				 XFS_DQUOT_CRC_OFF);
> -	}
> -
> -	ASSERT(dq_f->qlf_size == 2);
> -	ASSERT(bp->b_mount == mp);
> -	bp->b_iodone = xlog_recover_iodone;
> -	xfs_buf_delwri_queue(bp, buffer_list);
> -
> -out_release:
> -	xfs_buf_relse(bp);
> -	return 0;
> -}
> -
>  /*
>   * This routine is called to create an in-core extent free intent
>   * item from the efi format structure which was logged on disk.
> @@ -2771,9 +2662,6 @@ xlog_recover_commit_pass2(
>  		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
>  	case XFS_LI_BUD:
>  		return xlog_recover_bud_pass2(log, item);
> -	case XFS_LI_DQUOT:
> -		return xlog_recover_dquot_pass2(log, buffer_list, item,
> -						trans->r_lsn);
>  	case XFS_LI_ICREATE:
>  		return xlog_recover_do_icreate_pass2(log, buffer_list, item);
>  	case XFS_LI_QUOTAOFF:
> 
> 
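The early exits at the top of xlog_recover_dquot_commit_pass2() boil down to
two bitmask tests.  A minimal sketch follows, assuming the flag values shown
(chosen for illustration only) and assuming that l_quotaoffs_flag holds the
quota types that were turned off according to the log; DQ_USER, DQ_PROJ,
DQ_GROUP and dquot_should_replay() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for XFS_DQ_USER, XFS_DQ_PROJ and XFS_DQ_GROUP. */
#define DQ_USER		0x1u
#define DQ_PROJ		0x2u
#define DQ_GROUP	0x4u

/*
 * Decide whether a recovered dquot record should be replayed at all.
 *   mount_qflags   - quota flags supplied at mount time (0 means none)
 *   quotaoffs_flag - quota types recorded as turned off
 *   dquot_flags    - d_flags field of the recovered dquot
 */
static bool dquot_should_replay(unsigned int mount_qflags,
				unsigned int quotaoffs_flag,
				unsigned int dquot_flags)
{
	unsigned int type;

	/* No quota flags at mount time: nothing to recover. */
	if (mount_qflags == 0)
		return false;

	/* Keep only the quota type bits of the record. */
	type = dquot_flags & (DQ_USER | DQ_PROJ | DQ_GROUP);

	/* This type of quota was turned off, so ignore the record. */
	if (quotaoffs_flag & type)
		return false;

	return true;
}
```

Only records that pass both tests go on to the verifier, the buffer read
and the LSN comparison.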


-- 
chandan




* Re: [PATCH 07/21] xfs: refactor log recovery icreate item dispatch for pass2 commit functions
  2020-04-30  0:48 ` [PATCH 07/21] xfs: refactor log recovery icreate " Darrick J. Wong
@ 2020-05-01 14:18   ` Chandan Rajendra
  0 siblings, 0 replies; 41+ messages in thread
From: Chandan Rajendra @ 2020-05-01 14:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thursday, April 30, 2020 6:18 AM Darrick J. Wong wrote: 
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Move the log icreate item pass2 commit code into the per-item source code
> files and use the dispatch function to call it.  We do these one at a
> time because there's a lot of code to move.  No functional changes.
>

The changes look good to me.

Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
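
For readers following along, the dispatch pattern this patch relies on can
be sketched in a few lines of standalone C.  This is a simplified
illustration, not the kernel code: the demo_* names are hypothetical
stand-ins for xlog_recover_item, xlog_recover_item_type and the icreate
commit function.  The only behaviour taken from the patch is that
commit_pass2_fn is an optional pointer that common code calls when set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_item;

struct demo_item_type {
	/* Optional: NULL means "nothing to do in pass 2". */
	int (*commit_pass2_fn)(struct demo_item *item);
};

struct demo_item {
	const struct demo_item_type	*type;
	int				replayed;
};

/* Per-item code: lives next to the rest of that item type's source. */
static int demo_icreate_commit_pass2(struct demo_item *item)
{
	item->replayed = 1;	/* stands in for the real replay work */
	return 0;
}

static const struct demo_item_type demo_icreate_type = {
	.commit_pass2_fn	= demo_icreate_commit_pass2,
};

/* A type with no pass 2 work simply leaves the pointer unset. */
static const struct demo_item_type demo_quotaoff_type = {
	.commit_pass2_fn	= NULL,
};

/* Common code: no per-type switch statement is needed here. */
static int demo_commit_pass2(struct demo_item *item)
{
	assert(item->type != NULL);
	if (!item->type->commit_pass2_fn)
		return 0;
	return item->type->commit_pass2_fn(item);
}
```

Calling demo_commit_pass2() on an item whose type is demo_icreate_type runs
the per-item function; calling it on a demo_quotaoff_type item is a no-op
that returns 0.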

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_icreate_item.c |  132 +++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_log_recover.c  |  126 -------------------------------------------
>  2 files changed, 132 insertions(+), 126 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c
> index 9f38a3c200a3..602a8c91371f 100644
> --- a/fs/xfs/xfs_icreate_item.c
> +++ b/fs/xfs/xfs_icreate_item.c
> @@ -6,13 +6,19 @@
>  #include "xfs.h"
>  #include "xfs_fs.h"
>  #include "xfs_shared.h"
> +#include "xfs_format.h"
>  #include "xfs_log_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_mount.h"
> +#include "xfs_inode.h"
>  #include "xfs_trans.h"
>  #include "xfs_trans_priv.h"
>  #include "xfs_icreate_item.h"
>  #include "xfs_log.h"
>  #include "xfs_log_priv.h"
>  #include "xfs_log_recover.h"
> +#include "xfs_ialloc.h"
> +#include "xfs_trace.h"
>  
>  kmem_zone_t	*xfs_icreate_zone;		/* inode create item zone */
>  
> @@ -117,6 +123,132 @@ xlog_icreate_reorder(
>  	return XLOG_REORDER_BUFFER_LIST;
>  }
>  
> +/*
> + * This routine is called when an inode create format structure is found in a
> + * committed transaction in the log.  It's purpose is to initialise the inodes
> + * being allocated on disk. This requires us to get inode cluster buffers that
> + * match the range to be initialised, stamped with inode templates and written
> + * by delayed write so that subsequent modifications will hit the cached buffer
> + * and only need writing out at the end of recovery.
> + */
> +STATIC int
> +xlog_recover_do_icreate_commit_pass2(
> +	struct xlog			*log,
> +	struct list_head		*buffer_list,
> +	struct xlog_recover_item	*item,
> +	xfs_lsn_t			lsn)
> +{
> +	struct xfs_mount		*mp = log->l_mp;
> +	struct xfs_icreate_log		*icl;
> +	struct xfs_ino_geometry		*igeo = M_IGEO(mp);
> +	xfs_agnumber_t			agno;
> +	xfs_agblock_t			agbno;
> +	unsigned int			count;
> +	unsigned int			isize;
> +	xfs_agblock_t			length;
> +	int				bb_per_cluster;
> +	int				cancel_count;
> +	int				nbufs;
> +	int				i;
> +
> +	icl = (struct xfs_icreate_log *)item->ri_buf[0].i_addr;
> +	if (icl->icl_type != XFS_LI_ICREATE) {
> +		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad type");
> +		return -EINVAL;
> +	}
> +
> +	if (icl->icl_size != 1) {
> +		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad icl size");
> +		return -EINVAL;
> +	}
> +
> +	agno = be32_to_cpu(icl->icl_ag);
> +	if (agno >= mp->m_sb.sb_agcount) {
> +		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad agno");
> +		return -EINVAL;
> +	}
> +	agbno = be32_to_cpu(icl->icl_agbno);
> +	if (!agbno || agbno == NULLAGBLOCK || agbno >= mp->m_sb.sb_agblocks) {
> +		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad agbno");
> +		return -EINVAL;
> +	}
> +	isize = be32_to_cpu(icl->icl_isize);
> +	if (isize != mp->m_sb.sb_inodesize) {
> +		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad isize");
> +		return -EINVAL;
> +	}
> +	count = be32_to_cpu(icl->icl_count);
> +	if (!count) {
> +		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad count");
> +		return -EINVAL;
> +	}
> +	length = be32_to_cpu(icl->icl_length);
> +	if (!length || length >= mp->m_sb.sb_agblocks) {
> +		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad length");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * The inode chunk is either full or sparse and we only support
> +	 * m_ino_geo.ialloc_min_blks sized sparse allocations at this time.
> +	 */
> +	if (length != igeo->ialloc_blks &&
> +	    length != igeo->ialloc_min_blks) {
> +		xfs_warn(log->l_mp,
> +			 "%s: unsupported chunk length", __FUNCTION__);
> +		return -EINVAL;
> +	}
> +
> +	/* verify inode count is consistent with extent length */
> +	if ((count >> mp->m_sb.sb_inopblog) != length) {
> +		xfs_warn(log->l_mp,
> +			 "%s: inconsistent inode count and chunk length",
> +			 __FUNCTION__);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * The icreate transaction can cover multiple cluster buffers and these
> +	 * buffers could have been freed and reused. Check the individual
> +	 * buffers for cancellation so we don't overwrite anything written after
> +	 * a cancellation.
> +	 */
> +	bb_per_cluster = XFS_FSB_TO_BB(mp, igeo->blocks_per_cluster);
> +	nbufs = length / igeo->blocks_per_cluster;
> +	for (i = 0, cancel_count = 0; i < nbufs; i++) {
> +		xfs_daddr_t	daddr;
> +
> +		daddr = XFS_AGB_TO_DADDR(mp, agno,
> +				agbno + i * igeo->blocks_per_cluster);
> +		if (xlog_is_buffer_cancelled(log, daddr, bb_per_cluster))
> +			cancel_count++;
> +	}
> +
> +	/*
> +	 * We currently only use icreate for a single allocation at a time. This
> +	 * means we should expect either all or none of the buffers to be
> +	 * cancelled. Be conservative and skip replay if at least one buffer is
> +	 * cancelled, but warn the user that something is awry if the buffers
> +	 * are not consistent.
> +	 *
> +	 * XXX: This must be refined to only skip cancelled clusters once we use
> +	 * icreate for multiple chunk allocations.
> +	 */
> +	ASSERT(!cancel_count || cancel_count == nbufs);
> +	if (cancel_count) {
> +		if (cancel_count != nbufs)
> +			xfs_warn(mp,
> +	"WARNING: partial inode chunk cancellation, skipped icreate.");
> +		trace_xfs_log_recover_icreate_cancel(log, icl);
> +		return 0;
> +	}
> +
> +	trace_xfs_log_recover_icreate_recover(log, icl);
> +	return xfs_ialloc_inode_init(mp, NULL, buffer_list, count, agno, agbno,
> +				     length, be32_to_cpu(icl->icl_gen));
> +}
> +
>  const struct xlog_recover_item_type xlog_icreate_item_type = {
>  	.reorder_fn		= xlog_icreate_reorder,
> +	.commit_pass2_fn	= xlog_recover_do_icreate_commit_pass2,
>  };
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 58a54d9e6847..6ba3d64d08de 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2489,130 +2489,6 @@ xlog_recover_bud_pass2(
>  	return 0;
>  }
>  
> -/*
> - * This routine is called when an inode create format structure is found in a
> - * committed transaction in the log.  It's purpose is to initialise the inodes
> - * being allocated on disk. This requires us to get inode cluster buffers that
> - * match the range to be initialised, stamped with inode templates and written
> - * by delayed write so that subsequent modifications will hit the cached buffer
> - * and only need writing out at the end of recovery.
> - */
> -STATIC int
> -xlog_recover_do_icreate_pass2(
> -	struct xlog		*log,
> -	struct list_head	*buffer_list,
> -	xlog_recover_item_t	*item)
> -{
> -	struct xfs_mount	*mp = log->l_mp;
> -	struct xfs_icreate_log	*icl;
> -	struct xfs_ino_geometry	*igeo = M_IGEO(mp);
> -	xfs_agnumber_t		agno;
> -	xfs_agblock_t		agbno;
> -	unsigned int		count;
> -	unsigned int		isize;
> -	xfs_agblock_t		length;
> -	int			bb_per_cluster;
> -	int			cancel_count;
> -	int			nbufs;
> -	int			i;
> -
> -	icl = (struct xfs_icreate_log *)item->ri_buf[0].i_addr;
> -	if (icl->icl_type != XFS_LI_ICREATE) {
> -		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad type");
> -		return -EINVAL;
> -	}
> -
> -	if (icl->icl_size != 1) {
> -		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad icl size");
> -		return -EINVAL;
> -	}
> -
> -	agno = be32_to_cpu(icl->icl_ag);
> -	if (agno >= mp->m_sb.sb_agcount) {
> -		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad agno");
> -		return -EINVAL;
> -	}
> -	agbno = be32_to_cpu(icl->icl_agbno);
> -	if (!agbno || agbno == NULLAGBLOCK || agbno >= mp->m_sb.sb_agblocks) {
> -		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad agbno");
> -		return -EINVAL;
> -	}
> -	isize = be32_to_cpu(icl->icl_isize);
> -	if (isize != mp->m_sb.sb_inodesize) {
> -		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad isize");
> -		return -EINVAL;
> -	}
> -	count = be32_to_cpu(icl->icl_count);
> -	if (!count) {
> -		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad count");
> -		return -EINVAL;
> -	}
> -	length = be32_to_cpu(icl->icl_length);
> -	if (!length || length >= mp->m_sb.sb_agblocks) {
> -		xfs_warn(log->l_mp, "xlog_recover_do_icreate_trans: bad length");
> -		return -EINVAL;
> -	}
> -
> -	/*
> -	 * The inode chunk is either full or sparse and we only support
> -	 * m_ino_geo.ialloc_min_blks sized sparse allocations at this time.
> -	 */
> -	if (length != igeo->ialloc_blks &&
> -	    length != igeo->ialloc_min_blks) {
> -		xfs_warn(log->l_mp,
> -			 "%s: unsupported chunk length", __FUNCTION__);
> -		return -EINVAL;
> -	}
> -
> -	/* verify inode count is consistent with extent length */
> -	if ((count >> mp->m_sb.sb_inopblog) != length) {
> -		xfs_warn(log->l_mp,
> -			 "%s: inconsistent inode count and chunk length",
> -			 __FUNCTION__);
> -		return -EINVAL;
> -	}
> -
> -	/*
> -	 * The icreate transaction can cover multiple cluster buffers and these
> -	 * buffers could have been freed and reused. Check the individual
> -	 * buffers for cancellation so we don't overwrite anything written after
> -	 * a cancellation.
> -	 */
> -	bb_per_cluster = XFS_FSB_TO_BB(mp, igeo->blocks_per_cluster);
> -	nbufs = length / igeo->blocks_per_cluster;
> -	for (i = 0, cancel_count = 0; i < nbufs; i++) {
> -		xfs_daddr_t	daddr;
> -
> -		daddr = XFS_AGB_TO_DADDR(mp, agno,
> -				agbno + i * igeo->blocks_per_cluster);
> -		if (xlog_is_buffer_cancelled(log, daddr, bb_per_cluster))
> -			cancel_count++;
> -	}
> -
> -	/*
> -	 * We currently only use icreate for a single allocation at a time. This
> -	 * means we should expect either all or none of the buffers to be
> -	 * cancelled. Be conservative and skip replay if at least one buffer is
> -	 * cancelled, but warn the user that something is awry if the buffers
> -	 * are not consistent.
> -	 *
> -	 * XXX: This must be refined to only skip cancelled clusters once we use
> -	 * icreate for multiple chunk allocations.
> -	 */
> -	ASSERT(!cancel_count || cancel_count == nbufs);
> -	if (cancel_count) {
> -		if (cancel_count != nbufs)
> -			xfs_warn(mp,
> -	"WARNING: partial inode chunk cancellation, skipped icreate.");
> -		trace_xfs_log_recover_icreate_cancel(log, icl);
> -		return 0;
> -	}
> -
> -	trace_xfs_log_recover_icreate_recover(log, icl);
> -	return xfs_ialloc_inode_init(mp, NULL, buffer_list, count, agno, agbno,
> -				     length, be32_to_cpu(icl->icl_gen));
> -}
> -
>  STATIC int
>  xlog_recover_commit_pass1(
>  	struct xlog			*log,
> @@ -2662,8 +2538,6 @@ xlog_recover_commit_pass2(
>  		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
>  	case XFS_LI_BUD:
>  		return xlog_recover_bud_pass2(log, item);
> -	case XFS_LI_ICREATE:
> -		return xlog_recover_do_icreate_pass2(log, buffer_list, item);
>  	case XFS_LI_QUOTAOFF:
>  		/* nothing to do in pass2 */
>  		return 0;
> 
> 


-- 
chandan





* Re: [PATCH 08/21] xfs: remove log recovery quotaoff item dispatch for pass2 commit functions
  2020-04-30  0:48 ` [PATCH 08/21] xfs: remove log recovery quotaoff " Darrick J. Wong
@ 2020-05-01 15:09   ` Chandan Rajendra
  2020-05-01 17:41     ` Darrick J. Wong
  0 siblings, 1 reply; 41+ messages in thread
From: Chandan Rajendra @ 2020-05-01 15:09 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thursday, April 30, 2020 6:18 AM Darrick J. Wong wrote: 
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Quotaoff doesn't actually do anything, so take advantage of the
> commit_pass2_fn pointer being optional and get rid of the switch
> statement clause.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_log_recover.c |    3 ---
>  1 file changed, 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 6ba3d64d08de..dba38fb99af7 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2538,9 +2538,6 @@ xlog_recover_commit_pass2(
>  		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
>  	case XFS_LI_BUD:
>  		return xlog_recover_bud_pass2(log, item);
> -	case XFS_LI_QUOTAOFF:
> -		/* nothing to do in pass2 */
> -		return 0;

If there is an XFS_LI_QUOTAOFF item in the log, wouldn't the XLOG_RECOVER_PASS2
step end up executing the statements under the "default" case given below?

>  	default:
>  		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
>  			__func__, ITEM_TYPE(item));
> 
> 


-- 
chandan





* Re: [PATCH v2 00/21] xfs: refactor log recovery
  2020-05-01 10:15 ` [PATCH v2 00/21] xfs: refactor log recovery Christoph Hellwig
@ 2020-05-01 16:53   ` Darrick J. Wong
  2020-05-01 17:03     ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-05-01 16:53 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 03:15:39AM -0700, Christoph Hellwig wrote:
> I've looked a bit over the total diff and the final result and really like
> it.
> 
> A few comments from that without going into the individual patches:
> 
>  - I don't think the buffer cancellation table should remain in
>    xfs_log_recover.c.  I can either move it into a new file
>    as part of resending my prep series, or you could move it into
>    xfs_buf_item_recover.c.  Let me know what you prefer.

I'll look into moving it as an addition to this series.

>  - Should the match callback also move into struct xfs_item_ops?  That
>    would also match iop_recover.

Hmm, good idea!

>  - Based on that we could also kill XFS_ITEM_TYPE_IS_INTENT by just
>    checking for iop_recover and/or iop_match.

Yep.
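
A minimal sketch of that idea, assuming the match callback ends up next to
iop_recover in the item ops: "is this an intent?" becomes a question of
whether the ops provide a recover callback, so no separate type test is
needed.  All demo_* names are hypothetical and the structures are reduced
to the bare minimum; this is not the kernel's xfs_item_ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_log_item;

/* Hypothetical cut-down ops structure with the two callbacks discussed. */
struct demo_item_ops {
	int	(*iop_recover)(struct demo_log_item *lip);
	bool	(*iop_match)(struct demo_log_item *lip,
			     unsigned long long intent_id);
};

struct demo_log_item {
	const struct demo_item_ops	*ops;
	unsigned long long		id;
};

static int demo_efi_recover(struct demo_log_item *lip)
{
	(void)lip;	/* the real function would replay the intent */
	return 0;
}

static bool demo_efi_match(struct demo_log_item *lip,
			   unsigned long long intent_id)
{
	return lip->id == intent_id;
}

/* Intent items provide both callbacks ... */
static const struct demo_item_ops demo_efi_ops = {
	.iop_recover	= demo_efi_recover,
	.iop_match	= demo_efi_match,
};

/* ... while ordinary items provide neither. */
static const struct demo_item_ops demo_inode_ops = {
	.iop_recover	= NULL,
	.iop_match	= NULL,
};

/* Replaces a type-based "is intent" test with a capability check. */
static bool demo_item_is_intent(const struct demo_log_item *lip)
{
	assert(lip->ops != NULL);
	return lip->ops->iop_recover != NULL;
}
```

The capability check means a new intent type is recognised as soon as it
supplies iop_recover, without touching a central list of type codes.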

>  - Setting XFS_LI_RECOVERED could also move to common code, basically
>    set it whenever iop_recover returns.  Also we can remove the
>    XFS_LI_RECOVERED asserts in ->iop_recover when the caller checks
>    it just before.

I've noticed two weird things about the xfs_*_recover functions:

1. We'll set LI_RECOVERED if the intent is corrupt or if the final
commit succeeds (or fails), but we won't set it for other error bailouts
during recovery (e.g. xfs_trans_alloc fails).

2. If the intent is corrupt, iop_recover also releases the intent item,
but we don't do that for any of the other error returns from the
->iop_recover function.  AFAICT those items (including the one that
failed recovery) are still on the AIL list and get released when we call
cancel_intents, which means that iop_recover should /not/ be releasing
the item, right?

>  - we are still having a few redundant ri_type checks.
>  - ri_type maybe should be ri_ops?

Yeah.

> 
> See this patch below for my take on cleaning up the recovery ops
> handling a bit:

Looks decent; I was moving towards putting the XFS_LI_ code into the
xlog_recover_item_ops anyway.

--D

> diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
> index ba172eb454c8f..f97946cf94f11 100644
> --- a/fs/xfs/libxfs/xfs_log_recover.h
> +++ b/fs/xfs/libxfs/xfs_log_recover.h
> @@ -7,7 +7,7 @@
>  #define __XFS_LOG_RECOVER_H__
>  
>  /*
> - * Each log item type (XFS_LI_*) gets its own xlog_recover_item_type to
> + * Each log item type (XFS_LI_*) gets its own xlog_recover_item_ops to
>   * define how recovery should work for that type of log item.
>   */
>  struct xlog_recover_item;
> @@ -20,7 +20,9 @@ enum xlog_recover_reorder {
>  	XLOG_REORDER_CANCEL_LIST,
>  };
>  
> -struct xlog_recover_item_type {
> +struct xlog_recover_item_ops {
> +	uint16_t		item_type;
> +
>  	/*
>  	 * Help sort recovered log items into the order required to replay them
>  	 * correctly.  Log item types that always use XLOG_REORDER_ITEM_LIST do
> @@ -58,19 +60,19 @@ struct xlog_recover_item_type {
>  			       struct xlog_recover_item *item, xfs_lsn_t lsn);
>  };
>  
> -extern const struct xlog_recover_item_type xlog_icreate_item_type;
> -extern const struct xlog_recover_item_type xlog_buf_item_type;
> -extern const struct xlog_recover_item_type xlog_inode_item_type;
> -extern const struct xlog_recover_item_type xlog_dquot_item_type;
> -extern const struct xlog_recover_item_type xlog_quotaoff_item_type;
> -extern const struct xlog_recover_item_type xlog_bmap_intent_item_type;
> -extern const struct xlog_recover_item_type xlog_bmap_done_item_type;
> -extern const struct xlog_recover_item_type xlog_extfree_intent_item_type;
> -extern const struct xlog_recover_item_type xlog_extfree_done_item_type;
> -extern const struct xlog_recover_item_type xlog_rmap_intent_item_type;
> -extern const struct xlog_recover_item_type xlog_rmap_done_item_type;
> -extern const struct xlog_recover_item_type xlog_refcount_intent_item_type;
> -extern const struct xlog_recover_item_type xlog_refcount_done_item_type;
> +extern const struct xlog_recover_item_ops xlog_icreate_item_type;
> +extern const struct xlog_recover_item_ops xlog_buf_item_type;
> +extern const struct xlog_recover_item_ops xlog_inode_item_type;
> +extern const struct xlog_recover_item_ops xlog_dquot_item_type;
> +extern const struct xlog_recover_item_ops xlog_quotaoff_item_type;
> +extern const struct xlog_recover_item_ops xlog_bmap_intent_item_type;
> +extern const struct xlog_recover_item_ops xlog_bmap_done_item_type;
> +extern const struct xlog_recover_item_ops xlog_extfree_intent_item_type;
> +extern const struct xlog_recover_item_ops xlog_extfree_done_item_type;
> +extern const struct xlog_recover_item_ops xlog_rmap_intent_item_type;
> +extern const struct xlog_recover_item_ops xlog_rmap_done_item_type;
> +extern const struct xlog_recover_item_ops xlog_refcount_intent_item_type;
> +extern const struct xlog_recover_item_ops xlog_refcount_done_item_type;
>  
>  /*
>   * Macros, structures, prototypes for internal log manager use.
> @@ -93,7 +95,7 @@ typedef struct xlog_recover_item {
>  	int			ri_cnt;	/* count of regions found */
>  	int			ri_total;	/* total regions */
>  	xfs_log_iovec_t		*ri_buf;	/* ptr to regions buffer */
> -	const struct xlog_recover_item_type *ri_type;
> +	const struct xlog_recover_item_ops *ri_ops;
>  } xlog_recover_item_t;
>  
>  struct xlog_recover {
> diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
> index 58f0904e4504d..952b4ce40433e 100644
> --- a/fs/xfs/xfs_bmap_item.c
> +++ b/fs/xfs/xfs_bmap_item.c
> @@ -667,10 +667,12 @@ xlog_recover_bmap_done_commit_pass2(
>  	return 0;
>  }
>  
> -const struct xlog_recover_item_type xlog_bmap_intent_item_type = {
> +const struct xlog_recover_item_ops xlog_bmap_intent_item_type = {
> +	.item_type		= XFS_LI_BUI,
>  	.commit_pass2_fn	= xlog_recover_bmap_intent_commit_pass2,
>  };
>  
> -const struct xlog_recover_item_type xlog_bmap_done_item_type = {
> +const struct xlog_recover_item_ops xlog_bmap_done_item_type = {
> +	.item_type		= XFS_LI_BUD,
>  	.commit_pass2_fn	= xlog_recover_bmap_done_commit_pass2,
>  };
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> index d324f810819df..954e0e96af5dc 100644
> --- a/fs/xfs/xfs_buf_item_recover.c
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -857,7 +857,8 @@ xlog_recover_buffer_commit_pass2(
>  	return 0;
>  }
>  
> -const struct xlog_recover_item_type xlog_buf_item_type = {
> +const struct xlog_recover_item_ops xlog_buf_item_type = {
> +	.item_type		= XFS_LI_BUF,
>  	.reorder_fn		= xlog_buf_reorder_fn,
>  	.ra_pass2_fn		= xlog_recover_buffer_ra_pass2,
>  	.commit_pass1_fn	= xlog_recover_buffer_commit_pass1,
> diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c
> index 83bd7ded9185f..6c6216bdc432c 100644
> --- a/fs/xfs/xfs_dquot_item.c
> +++ b/fs/xfs/xfs_dquot_item.c
> @@ -527,7 +527,8 @@ xlog_recover_dquot_commit_pass2(
>  	return 0;
>  }
>  
> -const struct xlog_recover_item_type xlog_dquot_item_type = {
> +const struct xlog_recover_item_ops xlog_dquot_item_type = {
> +	.item_type		= XFS_LI_DQUOT,
>  	.ra_pass2_fn		= xlog_recover_dquot_ra_pass2,
>  	.commit_pass2_fn	= xlog_recover_dquot_commit_pass2,
>  };
> @@ -559,6 +560,7 @@ xlog_recover_quotaoff_commit_pass1(
>  	return 0;
>  }
>  
> -const struct xlog_recover_item_type xlog_quotaoff_item_type = {
> +const struct xlog_recover_item_ops xlog_quotaoff_item_type = {
> +	.item_type		= XFS_LI_QUOTAOFF,
>  	.commit_pass1_fn	= xlog_recover_quotaoff_commit_pass1,
>  };
> diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
> index d6f2c88570de1..5d1fb5e05b781 100644
> --- a/fs/xfs/xfs_extfree_item.c
> +++ b/fs/xfs/xfs_extfree_item.c
> @@ -729,10 +729,12 @@ xlog_recover_extfree_done_commit_pass2(
>  	return 0;
>  }
>  
> -const struct xlog_recover_item_type xlog_extfree_intent_item_type = {
> +const struct xlog_recover_item_ops xlog_extfree_intent_item_type = {
> +	.item_type		= XFS_LI_EFI,
>  	.commit_pass2_fn	= xlog_recover_extfree_intent_commit_pass2,
>  };
>  
> -const struct xlog_recover_item_type xlog_extfree_done_item_type = {
> +const struct xlog_recover_item_ops xlog_extfree_done_item_type = {
> +	.item_type		= XFS_LI_EFD,
>  	.commit_pass2_fn	= xlog_recover_extfree_done_commit_pass2,
>  };
> diff --git a/fs/xfs/xfs_icreate_item.c b/fs/xfs/xfs_icreate_item.c
> index 602a8c91371fe..34805bdbc2e12 100644
> --- a/fs/xfs/xfs_icreate_item.c
> +++ b/fs/xfs/xfs_icreate_item.c
> @@ -248,7 +248,8 @@ xlog_recover_do_icreate_commit_pass2(
>  				     length, be32_to_cpu(icl->icl_gen));
>  }
>  
> -const struct xlog_recover_item_type xlog_icreate_item_type = {
> +const struct xlog_recover_item_ops xlog_icreate_item_type = {
> +	.item_type		= XFS_LI_ICREATE,
>  	.reorder_fn		= xlog_icreate_reorder,
>  	.commit_pass2_fn	= xlog_recover_do_icreate_commit_pass2,
>  };
> diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
> index 46fc8a4b9ac61..9dff80783fe12 100644
> --- a/fs/xfs/xfs_inode_item_recover.c
> +++ b/fs/xfs/xfs_inode_item_recover.c
> @@ -393,7 +393,8 @@ xlog_recover_inode_commit_pass2(
>  	return error;
>  }
>  
> -const struct xlog_recover_item_type xlog_inode_item_type = {
> +const struct xlog_recover_item_ops xlog_inode_item_type = {
> +	.item_type		= XFS_LI_INODE,
>  	.ra_pass2_fn		= xlog_recover_inode_ra_pass2,
>  	.commit_pass2_fn	= xlog_recover_inode_commit_pass2,
>  };
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 09dd514a34980..e3f13866deb08 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -1828,55 +1828,35 @@ xlog_recover_insert_ail(
>   ******************************************************************************
>   */
>  
> -static int
> -xlog_set_item_type(
> -	struct xlog_recover_item		*item)
> -{
> -	switch (ITEM_TYPE(item)) {
> -	case XFS_LI_ICREATE:
> -		item->ri_type = &xlog_icreate_item_type;
> -		return 0;
> -	case XFS_LI_BUF:
> -		item->ri_type = &xlog_buf_item_type;
> -		return 0;
> -	case XFS_LI_EFI:
> -		item->ri_type = &xlog_extfree_intent_item_type;
> -		return 0;
> -	case XFS_LI_EFD:
> -		item->ri_type = &xlog_extfree_done_item_type;
> -		return 0;
> -	case XFS_LI_RUI:
> -		item->ri_type = &xlog_rmap_intent_item_type;
> -		return 0;
> -	case XFS_LI_RUD:
> -		item->ri_type = &xlog_rmap_done_item_type;
> -		return 0;
> -	case XFS_LI_CUI:
> -		item->ri_type = &xlog_refcount_intent_item_type;
> -		return 0;
> -	case XFS_LI_CUD:
> -		item->ri_type = &xlog_refcount_done_item_type;
> -		return 0;
> -	case XFS_LI_BUI:
> -		item->ri_type = &xlog_bmap_intent_item_type;
> -		return 0;
> -	case XFS_LI_BUD:
> -		item->ri_type = &xlog_bmap_done_item_type;
> -		return 0;
> -	case XFS_LI_INODE:
> -		item->ri_type = &xlog_inode_item_type;
> -		return 0;
> +static const struct xlog_recover_item_ops *xlog_recover_item_ops[] = {
> +	&xlog_icreate_item_type,
> +	&xlog_buf_item_type,
> +	&xlog_extfree_intent_item_type,
> +	&xlog_extfree_done_item_type,
> +	&xlog_rmap_intent_item_type,
> +	&xlog_rmap_done_item_type,
> +	&xlog_refcount_intent_item_type,
> +	&xlog_refcount_done_item_type,
> +	&xlog_bmap_intent_item_type,
> +	&xlog_bmap_done_item_type,
> +	&xlog_inode_item_type,
>  #ifdef CONFIG_XFS_QUOTA
> -	case XFS_LI_DQUOT:
> -		item->ri_type = &xlog_dquot_item_type;
> -		return 0;
> -	case XFS_LI_QUOTAOFF:
> -		item->ri_type = &xlog_quotaoff_item_type;
> -		return 0;
> +	&xlog_dquot_item_type,
> +	&xlog_quotaoff_item_type,
>  #endif /* CONFIG_XFS_QUOTA */
> -	default:
> -		return -EFSCORRUPTED;
> -	}
> +};
> +
> +static const struct xlog_recover_item_ops *
> +xlog_find_item_ops(
> +	struct xlog_recover_item	*item)
> +{
> +	int				i;
> +
> +	for (i = 0; i < ARRAY_SIZE(xlog_recover_item_ops); i++)
> +		if (ITEM_TYPE(item) == xlog_recover_item_ops[i]->item_type)
> +			return xlog_recover_item_ops[i];
> +
> +	return NULL;
>  }
>  
>  /*
> @@ -1946,8 +1926,8 @@ xlog_recover_reorder_trans(
>  	list_for_each_entry_safe(item, n, &sort_list, ri_list) {
>  		enum xlog_recover_reorder	fate = XLOG_REORDER_ITEM_LIST;
>  
> -		error = xlog_set_item_type(item);
> -		if (error) {
> +		item->ri_ops = xlog_find_item_ops(item);
> +		if (!item->ri_ops) {
>  			xfs_warn(log->l_mp,
>  				"%s: unrecognized type of log operation (%d)",
>  				__func__, ITEM_TYPE(item));
> @@ -1958,11 +1938,12 @@ xlog_recover_reorder_trans(
>  			 */
>  			if (!list_empty(&sort_list))
>  				list_splice_init(&sort_list, &trans->r_itemq);
> +			error = -EFSCORRUPTED;
>  			break;
>  		}
>  
> -		if (item->ri_type->reorder_fn)
> -			fate = item->ri_type->reorder_fn(item);
> +		if (item->ri_ops->reorder_fn)
> +			fate = item->ri_ops->reorder_fn(item);
>  
>  		switch (fate) {
>  		case XLOG_REORDER_BUFFER_LIST:
> @@ -2098,46 +2079,6 @@ xlog_buf_readahead(
>  		xfs_buf_readahead(log->l_mp->m_ddev_targp, blkno, len, ops);
>  }
>  
> -STATIC int
> -xlog_recover_commit_pass1(
> -	struct xlog			*log,
> -	struct xlog_recover		*trans,
> -	struct xlog_recover_item	*item)
> -{
> -	trace_xfs_log_recover_item_recover(log, trans, item, XLOG_RECOVER_PASS1);
> -
> -	if (!item->ri_type) {
> -		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
> -			__func__, ITEM_TYPE(item));
> -		ASSERT(0);
> -		return -EFSCORRUPTED;
> -	}
> -	if (!item->ri_type->commit_pass1_fn)
> -		return 0;
> -	return item->ri_type->commit_pass1_fn(log, item);
> -}
> -
> -STATIC int
> -xlog_recover_commit_pass2(
> -	struct xlog			*log,
> -	struct xlog_recover		*trans,
> -	struct list_head		*buffer_list,
> -	struct xlog_recover_item	*item)
> -{
> -	trace_xfs_log_recover_item_recover(log, trans, item, XLOG_RECOVER_PASS2);
> -
> -	if (!item->ri_type) {
> -		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
> -			__func__, ITEM_TYPE(item));
> -		ASSERT(0);
> -		return -EFSCORRUPTED;
> -	}
> -	if (!item->ri_type->commit_pass2_fn)
> -		return 0;
> -	return item->ri_type->commit_pass2_fn(log, buffer_list, item,
> -			trans->r_lsn);
> -}
> -
>  STATIC int
>  xlog_recover_items_pass2(
>  	struct xlog                     *log,
> @@ -2146,16 +2087,18 @@ xlog_recover_items_pass2(
>  	struct list_head                *item_list)
>  {
>  	struct xlog_recover_item	*item;
> -	int				error = 0;
> +	int				error;
>  
>  	list_for_each_entry(item, item_list, ri_list) {
> -		error = xlog_recover_commit_pass2(log, trans,
> -					  buffer_list, item);
> +		if (!item->ri_ops->commit_pass2_fn)
> +			continue;
> +		error = item->ri_ops->commit_pass2_fn(log, buffer_list, item,
> +				trans->r_lsn);
>  		if (error)
>  			return error;
>  	}
>  
> -	return error;
> +	return 0;
>  }
>  
>  /*
> @@ -2187,13 +2130,16 @@ xlog_recover_commit_trans(
>  		return error;
>  
>  	list_for_each_entry_safe(item, next, &trans->r_itemq, ri_list) {
> +		trace_xfs_log_recover_item_recover(log, trans, item, pass);
> +
>  		switch (pass) {
>  		case XLOG_RECOVER_PASS1:
> -			error = xlog_recover_commit_pass1(log, trans, item);
> +			if (item->ri_ops->commit_pass1_fn)
> +				error = item->ri_ops->commit_pass1_fn(log, item);
>  			break;
>  		case XLOG_RECOVER_PASS2:
> -			if (item->ri_type && item->ri_type->ra_pass2_fn)
> -				item->ri_type->ra_pass2_fn(log, item);
> +			if (item->ri_ops->ra_pass2_fn)
> +				item->ri_ops->ra_pass2_fn(log, item);
>  			list_move_tail(&item->ri_list, &ra_list);
>  			items_queued++;
>  			if (items_queued >= XLOG_RECOVER_COMMIT_QUEUE_MAX) {
> diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
> index 53a79dc618f76..5703d5fdf4eeb 100644
> --- a/fs/xfs/xfs_refcount_item.c
> +++ b/fs/xfs/xfs_refcount_item.c
> @@ -690,10 +690,12 @@ xlog_recover_refcount_done_commit_pass2(
>  	return 0;
>  }
>  
> -const struct xlog_recover_item_type xlog_refcount_intent_item_type = {
> +const struct xlog_recover_item_ops xlog_refcount_intent_item_type = {
> +	.item_type		= XFS_LI_CUI,
>  	.commit_pass2_fn	= xlog_recover_refcount_intent_commit_pass2,
>  };
>  
> -const struct xlog_recover_item_type xlog_refcount_done_item_type = {
> +const struct xlog_recover_item_ops xlog_refcount_done_item_type = {
> +	.item_type		= XFS_LI_CUD,
>  	.commit_pass2_fn	= xlog_recover_refcount_done_commit_pass2,
>  };
> diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
> index cee5c61550321..12e035ff7bb2d 100644
> --- a/fs/xfs/xfs_rmap_item.c
> +++ b/fs/xfs/xfs_rmap_item.c
> @@ -680,10 +680,12 @@ xlog_recover_rmap_done_commit_pass2(
>  	return 0;
>  }
>  
> -const struct xlog_recover_item_type xlog_rmap_intent_item_type = {
> +const struct xlog_recover_item_ops xlog_rmap_intent_item_type = {
> +	.item_type		= XFS_LI_RUI,
>  	.commit_pass2_fn	= xlog_recover_rmap_intent_commit_pass2,
>  };
>  
> -const struct xlog_recover_item_type xlog_rmap_done_item_type = {
> +const struct xlog_recover_item_ops xlog_rmap_done_item_type = {
> +	.item_type		= XFS_LI_RUD,
>  	.commit_pass2_fn	= xlog_recover_rmap_done_commit_pass2,
>  };


* Re: [PATCH v2 00/21] xfs: refactor log recovery
  2020-05-01 16:53   ` Darrick J. Wong
@ 2020-05-01 17:03     ` Christoph Hellwig
  0 siblings, 0 replies; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01 17:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Fri, May 01, 2020 at 09:53:57AM -0700, Darrick J. Wong wrote:
> >  - Setting XFS_LI_RECOVERED could also move to common code, basically
> >    set it whenever iop_recover returns.  Also we can remove the
> >    XFS_LI_RECOVERED asserts in ->iop_recover when the caller checks
> >    it just before.
> 
> I've noticed two weird things about the xfs_*_recover functions:
> 
> 1. We'll set LI_RECOVERED if the intent is corrupt or if the final
> commit succeeds (or fails), but we won't set it for other error bailouts
> during recovery (e.g. xfs_trans_alloc fails).
> 
> > 2. If the intent is corrupt, iop_recover also releases the intent item,
> > but we don't do that for any of the other error returns from the
> > ->iop_recover function.  AFAICT those items (including the one that
> > failed recovery) are still on the AIL list and get released when we call
> > cancel_intents, which means that iop_recover should /not/ be releasing
> > the item, right?

LI_RECOVERED just prevents entering ->iop_recover again.  Given that
we give up after any failed recovery I don't think it matters if we set
it or not.  That being said, we should be consistent, and taking the
setting into the caller will force them to be consistent.
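
As a rough sketch of what setting the flag in the caller could look like
(standalone C, with hypothetical demo_* names and a made-up flag bit; the
real change would live in the common intent recovery loop):

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LI_RECOVERED	(1u << 0)	/* hypothetical flag bit */

struct demo_intent;

struct demo_intent_ops {
	int (*iop_recover)(struct demo_intent *ip);
};

struct demo_intent {
	const struct demo_intent_ops	*ops;
	unsigned int			flags;
	int				calls;	/* times recover ran */
	int				result;	/* value recover returns */
};

/* Per-item recovery: no longer touches the RECOVERED flag at all. */
static int demo_recover(struct demo_intent *ip)
{
	ip->calls++;
	return ip->result;
}

static const struct demo_intent_ops demo_ops = {
	.iop_recover	= demo_recover,
};

/*
 * Common code: test the flag before the call and set it afterwards,
 * whether recovery succeeded or failed, so every item type behaves
 * the same way on every return path.
 */
static int demo_recover_one(struct demo_intent *ip)
{
	int error;

	assert(ip->ops != NULL && ip->ops->iop_recover != NULL);
	if (ip->flags & DEMO_LI_RECOVERED)
		return 0;

	error = ip->ops->iop_recover(ip);
	ip->flags |= DEMO_LI_RECOVERED;
	return error;
}
```

With the check and the set both in demo_recover_one(), the per-item
functions need neither the assert nor their own flag handling.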

Well, releasing them will remove them from the AIL.  So I think the
manual release is pointless, but not actively harmful.  But again,
removing it is probably an improvement, as that means all the
releasing from the AIL is driven from the common code.


* Re: [PATCH 08/21] xfs: remove log recovery quotaoff item dispatch for pass2 commit functions
  2020-05-01 15:09   ` Chandan Rajendra
@ 2020-05-01 17:41     ` Darrick J. Wong
  0 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-05-01 17:41 UTC (permalink / raw)
  To: Chandan Rajendra; +Cc: linux-xfs

On Fri, May 01, 2020 at 08:39:21PM +0530, Chandan Rajendra wrote:
> On Thursday, April 30, 2020 6:18 AM Darrick J. Wong wrote: 
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Quotaoff doesn't actually do anything, so take advantage of the
> > commit_pass2_fn pointer being optional and get rid of the switch
> > statement clause.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_log_recover.c |    3 ---
> >  1 file changed, 3 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> > index 6ba3d64d08de..dba38fb99af7 100644
> > --- a/fs/xfs/xfs_log_recover.c
> > +++ b/fs/xfs/xfs_log_recover.c
> > @@ -2538,9 +2538,6 @@ xlog_recover_commit_pass2(
> >  		return xlog_recover_bui_pass2(log, item, trans->r_lsn);
> >  	case XFS_LI_BUD:
> >  		return xlog_recover_bud_pass2(log, item);
> > -	case XFS_LI_QUOTAOFF:
> > -		/* nothing to do in pass2 */
> > -		return 0;
> 
> If there is an XFS_LI_QUOTAOFF item in the log, wouldn't XLOG_RECOVER_PASS2
> step end up executing the statements under the "default" case given below?

Hmm, good point, this breaks bisectability.  This patch should be the
last of the pass2 conversion patches, and it can take care of removing
all the old function dispatch infrastructure and whatnot.

--D

> >  	default:
> >  		xfs_warn(log->l_mp, "%s: invalid item type (%d)",
> >  			__func__, ITEM_TYPE(item));
> > 
> > 
> 
> 
> -- 
> chandan
> 
> 
> 
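
For reference, a rough sketch of what an optional commit_pass2_fn
dispatch could look like once the switch statement is gone.  The types
and names here (demo_item_type, demo_find_type, demo_commit_pass2) are
made up for illustration and are not the code from this series.  The
point is that an item type with no pass2 work is distinguished from an
unrecognized item type by the type lookup, not by a switch default
case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item kinds; not the real XFS_LI_* values. */
enum demo_item_kind { DEMO_LI_BUFFER, DEMO_LI_QUOTAOFF, DEMO_LI_UNKNOWN };

struct demo_item_type {
	/* Optional: NULL means this item type has nothing to do in pass 2. */
	int (*commit_pass2_fn)(int *applied);
};

static int demo_buffer_commit_pass2(int *applied)
{
	(*applied)++;
	return 0;
}

static const struct demo_item_type demo_buffer_type = {
	.commit_pass2_fn	= demo_buffer_commit_pass2,
};

/* Quotaoff is a known type, but it registers no pass 2 handler. */
static const struct demo_item_type demo_quotaoff_type = {
	.commit_pass2_fn	= NULL,
};

static const struct demo_item_type *demo_find_type(enum demo_item_kind kind)
{
	switch (kind) {
	case DEMO_LI_BUFFER:
		return &demo_buffer_type;
	case DEMO_LI_QUOTAOFF:
		return &demo_quotaoff_type;
	default:
		return NULL;
	}
}

int demo_commit_pass2(enum demo_item_kind kind, int *applied)
{
	const struct demo_item_type *type = demo_find_type(kind);

	assert(applied != NULL);
	if (!type)
		return -1;	/* unrecognized item type */
	if (!type->commit_pass2_fn)
		return 0;	/* known type, nothing to do in pass 2 */
	return type->commit_pass2_fn(applied);
}
```

While the old switch is still in place, dropping a case only sends that
item type to the default branch, which is the bisectability problem
raised above.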


* Re: [PATCH 09/21] xfs: refactor log recovery EFI item dispatch for pass2 commit functions
  2020-05-01 10:28   ` Christoph Hellwig
@ 2020-05-01 17:56     ` Darrick J. Wong
  0 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-05-01 17:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 03:28:44AM -0700, Christoph Hellwig wrote:
> > +STATIC int
> > +xlog_recover_extfree_done_commit_pass2(
> > +	struct xlog			*log,
> > +	struct list_head		*buffer_list,
> > +	struct xlog_recover_item	*item,
> > +	xfs_lsn_t			lsn)
> > +{
> 
> ...
> 
> > +	return 0;
> > +}
> > +
> >  const struct xlog_recover_item_type xlog_extfree_intent_item_type = {
> > +	.commit_pass2_fn	= xlog_recover_extfree_intent_commit_pass2,
> >  };
> >  
> >  const struct xlog_recover_item_type xlog_extfree_done_item_type = {
> > +	.commit_pass2_fn	= xlog_recover_extfree_done_commit_pass2,
> >  };
> 
> Nitpick: It would be nice to keep all the efi vs efd code together
> with their ops vectors.  Same for the other intent ops.

Ok, will do.

--D


* Re: [PATCH 13/21] xfs: refactor recovered EFI log item playback
  2020-05-01 10:19   ` Christoph Hellwig
@ 2020-05-01 17:58     ` Darrick J. Wong
  0 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-05-01 17:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 03:19:47AM -0700, Christoph Hellwig wrote:
> On Wed, Apr 29, 2020 at 05:48:59PM -0700, Darrick J. Wong wrote:
> > +STATIC int xfs_efi_recover(struct xfs_mount *mp, struct xfs_efi_log_item *efip);
> 
> Can you just move xfs_efi_item_ops down a bit to avoid the forward
> declaration?  Same for the other patches doing the same.

I can, but then I need a forward declaration of xfs_efi_item_ops,
because xfs_efi_init needs the symbol, and the ->create_intent function
inside the item_ops needs xfs_efi_init.

Still, a forward declaration of a static variable is easier to maintain
than a function decl, so I'll change it.

--D
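
A small sketch of the circular dependency described above, using
invented names (demo_item, demo_efi_init, demo_efi_item_ops) in place of
the real EFI code.  The init helper needs the address of the ops table,
and the table's ->create_intent needs the init helper, so one of them
has to be declared early.  Here the static ops variable is forward
declared, which C permits as a tentative definition.

```c
#include <assert.h>
#include <stddef.h>

struct demo_item;

struct demo_item_ops {
	struct demo_item *(*create_intent)(struct demo_item *storage);
	int (*iop_recover)(struct demo_item *item);
};

struct demo_item {
	const struct demo_item_ops	*ops;
	int				recovered;
};

/* Forward declaration of the static ops table (a tentative definition). */
static const struct demo_item_ops demo_efi_item_ops;

/* The init helper needs the ops symbol before the table is filled in. */
static struct demo_item *demo_efi_init(struct demo_item *storage)
{
	assert(storage != NULL);
	storage->ops = &demo_efi_item_ops;
	storage->recovered = 0;
	return storage;
}

static int demo_efi_recover(struct demo_item *item)
{
	item->recovered = 1;
	return 0;
}

/* ->create_intent in turn needs the init helper. */
static struct demo_item *demo_efi_create_intent(struct demo_item *storage)
{
	return demo_efi_init(storage);
}

/* The real definition, placed after every function it refers to. */
static const struct demo_item_ops demo_efi_item_ops = {
	.create_intent	= demo_efi_create_intent,
	.iop_recover	= demo_efi_recover,
};
```

The single variable declaration has no parameter list to keep in sync,
whereas a function prototype must be updated every time the function
signature changes.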


end of thread, other threads:[~2020-05-01 17:58 UTC | newest]

Thread overview: 41+ messages
2020-04-30  0:47 [PATCH v2 00/21] xfs: refactor log recovery Darrick J. Wong
2020-04-30  0:47 ` [PATCH 01/21] xfs: refactor log recovery item sorting into a generic dispatch structure Darrick J. Wong
2020-04-30  5:53   ` Christoph Hellwig
2020-04-30 15:08     ` Darrick J. Wong
2020-04-30 18:16       ` Darrick J. Wong
2020-05-01  8:08         ` Christoph Hellwig
2020-05-01 10:40   ` Chandan Rajendra
2020-04-30  0:47 ` [PATCH 02/21] xfs: refactor log recovery item dispatch for pass2 readhead functions Darrick J. Wong
2020-05-01 12:10   ` Chandan Rajendra
2020-04-30  0:47 ` [PATCH 03/21] xfs: refactor log recovery item dispatch for pass1 commit functions Darrick J. Wong
2020-04-30  0:48 ` [PATCH 04/21] xfs: refactor log recovery buffer item dispatch for pass2 " Darrick J. Wong
2020-05-01 13:43   ` Chandan Rajendra
2020-04-30  0:48 ` [PATCH 05/21] xfs: refactor log recovery inode " Darrick J. Wong
2020-05-01 14:03   ` Chandan Rajendra
2020-04-30  0:48 ` [PATCH 06/21] xfs: refactor log recovery dquot " Darrick J. Wong
2020-05-01 14:14   ` Chandan Rajendra
2020-04-30  0:48 ` [PATCH 07/21] xfs: refactor log recovery icreate " Darrick J. Wong
2020-05-01 14:18   ` Chandan Rajendra
2020-04-30  0:48 ` [PATCH 08/21] xfs: remove log recovery quotaoff " Darrick J. Wong
2020-05-01 15:09   ` Chandan Rajendra
2020-05-01 17:41     ` Darrick J. Wong
2020-04-30  0:48 ` [PATCH 09/21] xfs: refactor log recovery EFI " Darrick J. Wong
2020-05-01 10:28   ` Christoph Hellwig
2020-05-01 17:56     ` Darrick J. Wong
2020-04-30  0:48 ` [PATCH 10/21] xfs: refactor log recovery RUI " Darrick J. Wong
2020-04-30  0:48 ` [PATCH 11/21] xfs: refactor log recovery CUI " Darrick J. Wong
2020-04-30  0:48 ` [PATCH 12/21] xfs: refactor log recovery BUI " Darrick J. Wong
2020-04-30  0:48 ` [PATCH 13/21] xfs: refactor recovered EFI log item playback Darrick J. Wong
2020-05-01 10:19   ` Christoph Hellwig
2020-05-01 17:58     ` Darrick J. Wong
2020-04-30  0:49 ` [PATCH 14/21] xfs: refactor recovered RUI " Darrick J. Wong
2020-04-30  0:49 ` [PATCH 15/21] xfs: refactor recovered CUI " Darrick J. Wong
2020-04-30  0:49 ` [PATCH 16/21] xfs: refactor recovered BUI " Darrick J. Wong
2020-04-30  0:49 ` [PATCH 17/21] xfs: refactor releasing finished intents during log recovery Darrick J. Wong
2020-04-30  0:49 ` [PATCH 18/21] xfs: refactor adding recovered intent items to the log Darrick J. Wong
2020-04-30  0:49 ` [PATCH 19/21] xfs: refactor intent item RECOVERED flag into the log item Darrick J. Wong
2020-04-30  0:49 ` [PATCH 20/21] xfs: refactor intent item iop_recover calls Darrick J. Wong
2020-04-30  0:49 ` [PATCH 21/21] xfs: remove unnecessary includes from xfs_log_recover.c Darrick J. Wong
2020-05-01 10:15 ` [PATCH v2 00/21] xfs: refactor log recovery Christoph Hellwig
2020-05-01 16:53   ` Darrick J. Wong
2020-05-01 17:03     ` Christoph Hellwig
