* [PATCH v2 00/20] xfsprogs: introduce the free inode btree
@ 2013-11-13 15:56 Brian Foster
  2013-11-13 15:56 ` [PATCH v2 01/20] xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers Brian Foster
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Hi all,

This is the v2 userspace portion of finobt support corresponding to v2
of the kernel series.

Patches 1-10 are straight applications of the corresponding kernel patches,
with omissions where appropriate. At this point, I'd suggest that review of
those patches target the kernel equivalents, as this set will progress
using the kernel set as a base.

Patch 11 adds mkfs support. Patches 12 and 13 provide a couple of minor db
and repair fixes to support the new agi fields and calculate the fs
format, respectively. Patches 14-18 add real repair support for the
finobt. Patch 19 adds support to report finobt state in xfs_info. Patch
20 adds support for metadump.

Note that this series is based on Dave's latest (v5) CRC write support
series for userspace:

http://oss.sgi.com/archives/xfs/2013-11/msg00351.html

This is required for metadump support in particular.

I think this set is now fairly comprehensive in terms of finobt support.
My biggest question at the moment is with regard to how far to enhance
repair support. Repair currently scans the finobt in phase 2, attempts
to call out inconsistencies and regenerates the finobt based on the
in-core data in phase 5. Once basic support is ironed out, we have a
duplicate source of a subset of inode metadata (chunks with free inodes)
from which to potentially make more intelligent repair decisions.
Thoughts appreciated.

Brian

v2:
- Rebased onto the CRC v5 series and v2 kernel finobt bits.
- Core finobt repair support.
- xfs_info support.
- xfs_metadump support.

Brian Foster (20):
  xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers
  xfs: reserve v5 superblock read-only compat. feature bit for finobt
  xfs: support the XFS_BTNUM_FINOBT free inode btree type
  xfs: update inode allocation/free transaction reservations for finobt
  xfs: insert newly allocated inode chunks into the finobt
  xfs: use and update the finobt on inode allocation
  xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helper
  xfs: update the finobt on inode free
  xfs: report finobt status in fs geometry
  xfs: enable the finobt feature on v5 superblocks
  xfsprogs/mkfs: finobt mkfs support
  xfsprogs/db: finobt support
  xfsprogs/repair: account for finobt in ag 0 geometry pre-calculation
  xfsprogs/repair: phase 2 finobt scan
  xfsprogs/repair: pass btree block magic as param to build_ino_tree()
  xfsprogs/repair: pull the build_agi() call up out of the inode tree
    build
  xfsprogs/repair: helpers for finding in-core inode records w/ free
    inodes
  xfsprogs/repair: reconstruct the finobt in phase 5
  xfsprogs/growfs: report finobt status in fs geometry (xfs_info)
  xfsprogs/db: add finobt support to metadump

 db/agi.c                   |   2 +
 db/btblock.c               |  12 +
 db/metadump.c              |  25 +-
 growfs/xfs_growfs.c        |  14 +-
 include/xfs_ag.h           |  32 ++-
 include/xfs_btree.h        |   3 +
 include/xfs_format.h       |  14 +-
 include/xfs_fs.h           |   1 +
 include/xfs_ialloc_btree.h |   3 +-
 include/xfs_sb.h           |  10 +-
 include/xfs_trans_space.h  |   7 +-
 include/xfs_types.h        |   2 +-
 libxfs/xfs_btree.c         |   6 +-
 libxfs/xfs_ialloc.c        | 616 ++++++++++++++++++++++++++++++++++++++-------
 libxfs/xfs_ialloc_btree.c  |  68 ++++-
 libxfs/xfs_trans_resv.c    |  47 +++-
 mkfs/xfs_mkfs.c            |  83 ++++--
 repair/incore.h            |  27 ++
 repair/phase5.c            | 109 ++++++--
 repair/scan.c              | 239 +++++++++++++++++-
 repair/xfs_repair.c        |   2 +
 21 files changed, 1144 insertions(+), 178 deletions(-)

-- 
1.8.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* [PATCH v2 01/20] xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 02/20] xfs: reserve v5 superblock read-only compat. feature bit for finobt Brian Foster
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

The introduction of the free inode btree (finobt) requires that
xfs_ialloc_btree.c handle multiple trees. Refactor xfs_ialloc_btree.c
so the caller specifies the btree type on cursor initialization to
prepare for addition of the finobt.
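
As a rough illustration of the refactor (a minimal sketch, not the libxfs
code): the caller names the tree when the cursor is initialized, and a
duplicated cursor inherits the tree type of the original. All toy_* names
below are hypothetical stand-ins for the real XFS types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types live in the XFS headers. */
enum toy_btnum { TOY_BTNUM_INO, TOY_BTNUM_FINO };

struct toy_cursor {
	unsigned int	agno;	/* allocation group number */
	enum toy_btnum	btnum;	/* which inode btree this cursor walks */
};

/* Initialize a cursor; the caller specifies the tree explicitly. */
static void
toy_init_cursor(struct toy_cursor *cur, unsigned int agno,
		enum toy_btnum btnum)
{
	assert(cur != NULL);
	cur->agno = agno;
	cur->btnum = btnum;
}

/* Duplicate a cursor, preserving the tree type of the original. */
static void
toy_dup_cursor(const struct toy_cursor *src, struct toy_cursor *dst)
{
	assert(src != NULL);
	toy_init_cursor(dst, src->agno, src->btnum);
}
```

Passing the type at init time means the dup path needs no special casing:
it simply forwards whatever type the source cursor carries.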

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>
---
 include/xfs_ialloc_btree.h | 3 ++-
 libxfs/xfs_ialloc.c        | 8 ++++----
 libxfs/xfs_ialloc_btree.c  | 8 +++++---
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/xfs_ialloc_btree.h b/include/xfs_ialloc_btree.h
index f38b220..d7ebea7 100644
--- a/include/xfs_ialloc_btree.h
+++ b/include/xfs_ialloc_btree.h
@@ -58,7 +58,8 @@ struct xfs_mount;
 		 ((index) - 1) * sizeof(xfs_inobt_ptr_t)))
 
 extern struct xfs_btree_cur *xfs_inobt_init_cursor(struct xfs_mount *,
-		struct xfs_trans *, struct xfs_buf *, xfs_agnumber_t);
+		struct xfs_trans *, struct xfs_buf *, xfs_agnumber_t,
+		xfs_btnum_t);
 extern int xfs_inobt_maxrecs(struct xfs_mount *, int, int);
 
 #endif	/* __XFS_IALLOC_BTREE_H__ */
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index afe1a82..337a4c6 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -432,7 +432,7 @@ xfs_ialloc_ag_alloc(
 	/*
 	 * Insert records describing the new inode chunk into the btree.
 	 */
-	cur = xfs_inobt_init_cursor(args.mp, tp, agbp, agno);
+	cur = xfs_inobt_init_cursor(args.mp, tp, agbp, agno, XFS_BTNUM_INO);
 	for (thisino = newino;
 	     thisino < newino + newlen;
 	     thisino += XFS_INODES_PER_CHUNK) {
@@ -678,7 +678,7 @@ xfs_dialloc_ag(
 	ASSERT(pag->pagi_freecount > 0);
 
  restart_pagno:
-	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno);
+	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_INO);
 	/*
 	 * If pagino is 0 (this is the root inode allocation) use newino.
 	 * This must work because we've just allocated some.
@@ -1140,7 +1140,7 @@ xfs_difree(
 	/*
 	 * Initialize the cursor.
 	 */
-	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno);
+	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_INO);
 
 	error = xfs_check_agi_freecount(cur, agi);
 	if (error)
@@ -1271,7 +1271,7 @@ xfs_imap_lookup(
 	 * we have a record, we need to ensure it contains the inode number
 	 * we are looking up.
 	 */
-	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno);
+	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_INO);
 	error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &i);
 	if (!error) {
 		if (i)
diff --git a/libxfs/xfs_ialloc_btree.c b/libxfs/xfs_ialloc_btree.c
index 27a5dd9..0b9b91a 100644
--- a/libxfs/xfs_ialloc_btree.c
+++ b/libxfs/xfs_ialloc_btree.c
@@ -30,7 +30,8 @@ xfs_inobt_dup_cursor(
 	struct xfs_btree_cur	*cur)
 {
 	return xfs_inobt_init_cursor(cur->bc_mp, cur->bc_tp,
-			cur->bc_private.a.agbp, cur->bc_private.a.agno);
+			cur->bc_private.a.agbp, cur->bc_private.a.agno,
+			cur->bc_btnum);
 }
 
 STATIC void
@@ -377,7 +378,8 @@ xfs_inobt_init_cursor(
 	struct xfs_mount	*mp,		/* file system mount point */
 	struct xfs_trans	*tp,		/* transaction pointer */
 	struct xfs_buf		*agbp,		/* buffer for agi structure */
-	xfs_agnumber_t		agno)		/* allocation group number */
+	xfs_agnumber_t		agno,		/* allocation group number */
+	xfs_btnum_t		btnum)		/* ialloc or free ino btree */
 {
 	struct xfs_agi		*agi = XFS_BUF_TO_AGI(agbp);
 	struct xfs_btree_cur	*cur;
@@ -387,7 +389,7 @@ xfs_inobt_init_cursor(
 	cur->bc_tp = tp;
 	cur->bc_mp = mp;
 	cur->bc_nlevels = be32_to_cpu(agi->agi_level);
-	cur->bc_btnum = XFS_BTNUM_INO;
+	cur->bc_btnum = btnum;
 	cur->bc_blocklog = mp->m_sb.sb_blocklog;
 
 	cur->bc_ops = &xfs_inobt_ops;
-- 
1.8.1.4


* [PATCH v2 02/20] xfs: reserve v5 superblock read-only compat. feature bit for finobt
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
  2013-11-13 15:56 ` [PATCH v2 01/20] xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 03/20] xfs: support the XFS_BTNUM_FINOBT free inode btree type Brian Foster
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Reserve a v5 read-only compatibility feature bit for the finobt and
create the xfs_sb_version_hasfinobt() helper to determine whether
an fs has the feature enabled.

The finobt does not change existing on-disk structures, but must
remain consistent with the ialloc btree. Modifications from older
kernels would violate that constraint. Therefore, we restrict older
kernels to read-only mounts of finobt-enabled filesystems.

Note that this does not yet enable the ability to rw mount a finobt
fs (by setting the feature bit in the XFS_SB_FEAT_RO_COMPAT_ALL
mask).

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 include/xfs_sb.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/xfs_sb.h b/include/xfs_sb.h
index 35061d4..070a7f6 100644
--- a/include/xfs_sb.h
+++ b/include/xfs_sb.h
@@ -585,6 +585,7 @@ xfs_sb_has_compat_feature(
 	return (sbp->sb_features_compat & feature) != 0;
 }
 
+#define XFS_SB_FEAT_RO_COMPAT_FINOBT   (1 << 0)		/* free inode btree */
 #define XFS_SB_FEAT_RO_COMPAT_ALL 0
 #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN	~XFS_SB_FEAT_RO_COMPAT_ALL
 static inline bool
@@ -639,6 +640,12 @@ static inline int xfs_sb_version_hasftype(struct xfs_sb *sbp)
 		 (sbp->sb_features2 & XFS_SB_VERSION2_FTYPE));
 }
 
+static inline int xfs_sb_version_hasfinobt(xfs_sb_t *sbp)
+{
+	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
+		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_FINOBT);
+}
+
 /*
  * end of superblock version macros
  */
-- 
1.8.1.4


* [PATCH v2 03/20] xfs: support the XFS_BTNUM_FINOBT free inode btree type
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
  2013-11-13 15:56 ` [PATCH v2 01/20] xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers Brian Foster
  2013-11-13 15:56 ` [PATCH v2 02/20] xfs: reserve v5 superblock read-only compat. feature bit for finobt Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 04/20] xfs: update inode allocation/free transaction reservations for finobt Brian Foster
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Define the AGI fields for the finobt root/level and add magic
numbers. Update the btree code to add support for the new
XFS_BTNUM_FINOBT inode btree.
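
For reference, the magic lookup can be sketched as a two-dimensional
table indexed by CRC mode and btree number. This is a simplified
stand-in covering only the two inode btrees; the toy_* names are
hypothetical, while the magic values are the ones this patch adds to
xfs_format.h.

```c
#include <assert.h>
#include <stdint.h>

enum toy_btnum { TOY_BTNUM_INO, TOY_BTNUM_FINO, TOY_BTNUM_MAX };

/* Magic values as they appear in this patch's xfs_format.h hunk. */
#define TOY_IBT_MAGIC		0x49414254u	/* 'IABT' */
#define TOY_IBT_CRC_MAGIC	0x49414233u	/* 'IAB3' */
#define TOY_FIBT_MAGIC		0x46494254u	/* 'FIBT' */
#define TOY_FIBT_CRC_MAGIC	0x46494233u	/* 'FIB3' */

/* Row 0: non-CRC magics; row 1: CRC magics. Columns follow toy_btnum. */
static const uint32_t toy_magics[2][TOY_BTNUM_MAX] = {
	{ TOY_IBT_MAGIC, TOY_FIBT_MAGIC },
	{ TOY_IBT_CRC_MAGIC, TOY_FIBT_CRC_MAGIC },
};

/* Return the expected block magic for the given tree and CRC mode. */
static uint32_t
toy_btree_magic(int crc_enabled, enum toy_btnum btnum)
{
	assert(btnum < TOY_BTNUM_MAX);
	return toy_magics[!!crc_enabled][btnum];
}
```

Because the columns are indexed by the enum, every new btree type needs
an entry in both rows, which is why the patch extends both initializers.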

The finobt root block is reserved immediately following the inobt
root block in the AG. Update XFS_PREALLOC_BLOCKS() to determine the
starting AG data block based on whether finobt support is enabled.
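
The block arithmetic above can be sketched as follows. This is an
illustration only: the helpers are hypothetical, and the position of the
cntbt root block is taken as an input rather than derived from the mount
structure.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int toy_agblock_t;

/* The inobt root block directly follows the cntbt root block. */
static toy_agblock_t
toy_ibt_block(toy_agblock_t cnt_block)
{
	return cnt_block + 1;
}

/* The finobt root block, if present, directly follows the inobt root. */
static toy_agblock_t
toy_fibt_block(toy_agblock_t cnt_block)
{
	return toy_ibt_block(cnt_block) + 1;
}

/* First AG block available for data, one past the last reserved root. */
static toy_agblock_t
toy_prealloc_blocks(toy_agblock_t cnt_block, bool has_finobt)
{
	return (has_finobt ? toy_fibt_block(cnt_block)
			   : toy_ibt_block(cnt_block)) + 1;
}
```

With the feature enabled, the first data block moves out by exactly one
block, so anything that precomputes AG geometry has to take the feature
bit into account.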

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 include/xfs_ag.h          | 32 +++++++++++++++----------
 include/xfs_btree.h       |  3 +++
 include/xfs_format.h      | 14 ++++++++++-
 include/xfs_types.h       |  2 +-
 libxfs/xfs_btree.c        |  6 +++--
 libxfs/xfs_ialloc.c       | 37 +++++++++++++++++++++++++----
 libxfs/xfs_ialloc_btree.c | 60 +++++++++++++++++++++++++++++++++++++++++++++--
 7 files changed, 130 insertions(+), 24 deletions(-)

diff --git a/include/xfs_ag.h b/include/xfs_ag.h
index 3fc1098..5d3011f 100644
--- a/include/xfs_ag.h
+++ b/include/xfs_ag.h
@@ -164,22 +164,28 @@ typedef struct xfs_agi {
 	__be32		agi_pad32;
 	__be64		agi_lsn;	/* last write sequence */
 
+	__be32		agi_free_root; /* root of the free inode btree */
+	__be32		agi_free_level;/* levels in free inode btree */
+
 	/* structure must be padded to 64 bit alignment */
 } xfs_agi_t;
 
-#define	XFS_AGI_MAGICNUM	0x00000001
-#define	XFS_AGI_VERSIONNUM	0x00000002
-#define	XFS_AGI_SEQNO		0x00000004
-#define	XFS_AGI_LENGTH		0x00000008
-#define	XFS_AGI_COUNT		0x00000010
-#define	XFS_AGI_ROOT		0x00000020
-#define	XFS_AGI_LEVEL		0x00000040
-#define	XFS_AGI_FREECOUNT	0x00000080
-#define	XFS_AGI_NEWINO		0x00000100
-#define	XFS_AGI_DIRINO		0x00000200
-#define	XFS_AGI_UNLINKED	0x00000400
-#define	XFS_AGI_NUM_BITS	11
-#define	XFS_AGI_ALL_BITS	((1 << XFS_AGI_NUM_BITS) - 1)
+#define	XFS_AGI_MAGICNUM	(1 << 0)
+#define	XFS_AGI_VERSIONNUM	(1 << 1)
+#define	XFS_AGI_SEQNO		(1 << 2)
+#define	XFS_AGI_LENGTH		(1 << 3)
+#define	XFS_AGI_COUNT		(1 << 4)
+#define	XFS_AGI_ROOT		(1 << 5)
+#define	XFS_AGI_LEVEL		(1 << 6)
+#define	XFS_AGI_FREECOUNT	(1 << 7)
+#define	XFS_AGI_NEWINO		(1 << 8)
+#define	XFS_AGI_DIRINO		(1 << 9)
+#define	XFS_AGI_UNLINKED	(1 << 10)
+#define	XFS_AGI_NUM_BITS_R1	11	/* end of the 1st agi logging region */
+#define	XFS_AGI_ALL_BITS_R1	((1 << XFS_AGI_NUM_BITS_R1) - 1)
+#define	XFS_AGI_FREE_ROOT	(1 << 11)
+#define	XFS_AGI_FREE_LEVEL	(1 << 12)
+#define	XFS_AGI_NUM_BITS_R2	13
 
 /* disk block (xfs_daddr_t) in the AG */
 #define XFS_AGI_DADDR(mp)	((xfs_daddr_t)(2 << (mp)->m_sectbb_log))
diff --git a/include/xfs_btree.h b/include/xfs_btree.h
index 6afe0b2..2590d40 100644
--- a/include/xfs_btree.h
+++ b/include/xfs_btree.h
@@ -37,6 +37,7 @@ extern kmem_zone_t	*xfs_btree_cur_zone;
 #define	XFS_BTNUM_CNT	((xfs_btnum_t)XFS_BTNUM_CNTi)
 #define	XFS_BTNUM_BMAP	((xfs_btnum_t)XFS_BTNUM_BMAPi)
 #define	XFS_BTNUM_INO	((xfs_btnum_t)XFS_BTNUM_INOi)
+#define	XFS_BTNUM_FINO	((xfs_btnum_t)XFS_BTNUM_FINOi)
 
 /*
  * For logging record fields.
@@ -67,6 +68,7 @@ do {    \
 	case XFS_BTNUM_CNT: __XFS_BTREE_STATS_INC(abtc, stat); break;	\
 	case XFS_BTNUM_BMAP: __XFS_BTREE_STATS_INC(bmbt, stat); break;	\
 	case XFS_BTNUM_INO: __XFS_BTREE_STATS_INC(ibt, stat); break;	\
+	case XFS_BTNUM_FINO: __XFS_BTREE_STATS_INC(fibt, stat); break;	\
 	case XFS_BTNUM_MAX: ASSERT(0); /* fucking gcc */ ; break;	\
 	}       \
 } while (0)
@@ -80,6 +82,7 @@ do {    \
 	case XFS_BTNUM_CNT: __XFS_BTREE_STATS_ADD(abtc, stat, val); break; \
 	case XFS_BTNUM_BMAP: __XFS_BTREE_STATS_ADD(bmbt, stat, val); break; \
 	case XFS_BTNUM_INO: __XFS_BTREE_STATS_ADD(ibt, stat, val); break; \
+	case XFS_BTNUM_FINO: __XFS_BTREE_STATS_ADD(fibt, stat, val); break; \
 	case XFS_BTNUM_MAX: ASSERT(0); /* fucking gcc */ ; break;	\
 	}       \
 } while (0)
diff --git a/include/xfs_format.h b/include/xfs_format.h
index 997c770..f8e1834 100644
--- a/include/xfs_format.h
+++ b/include/xfs_format.h
@@ -200,6 +200,8 @@ typedef __be32 xfs_alloc_ptr_t;
  */
 #define	XFS_IBT_MAGIC		0x49414254	/* 'IABT' */
 #define	XFS_IBT_CRC_MAGIC	0x49414233	/* 'IAB3' */
+#define	XFS_FIBT_MAGIC		0x46494254	/* 'FIBT' */
+#define	XFS_FIBT_CRC_MAGIC	0x46494233	/* 'FIB3' */
 
 typedef	__uint64_t	xfs_inofree_t;
 #define	XFS_INODES_PER_CHUNK		(NBBY * sizeof(xfs_inofree_t))
@@ -242,7 +244,17 @@ typedef __be32 xfs_inobt_ptr_t;
  * block numbers in the AG.
  */
 #define	XFS_IBT_BLOCK(mp)		((xfs_agblock_t)(XFS_CNT_BLOCK(mp) + 1))
-#define	XFS_PREALLOC_BLOCKS(mp)		((xfs_agblock_t)(XFS_IBT_BLOCK(mp) + 1))
+#define	XFS_FIBT_BLOCK(mp)		((xfs_agblock_t)(XFS_IBT_BLOCK(mp) + 1))
+
+/*
+ * The first data block of an AG depends on whether the filesystem was formatted
+ * with the finobt feature. If so, account for the finobt reserved root btree
+ * block.
+ */
+#define XFS_PREALLOC_BLOCKS(mp) \
+	(xfs_sb_version_hasfinobt(&((mp)->m_sb)) ? \
+	 XFS_FIBT_BLOCK(mp) + 1 : \
+	 XFS_IBT_BLOCK(mp) + 1)
 
 
 
diff --git a/include/xfs_types.h b/include/xfs_types.h
index 82bbc34..65c6e66 100644
--- a/include/xfs_types.h
+++ b/include/xfs_types.h
@@ -134,7 +134,7 @@ typedef enum {
 
 typedef enum {
 	XFS_BTNUM_BNOi, XFS_BTNUM_CNTi, XFS_BTNUM_BMAPi, XFS_BTNUM_INOi,
-	XFS_BTNUM_MAX
+	XFS_BTNUM_FINOi, XFS_BTNUM_MAX
 } xfs_btnum_t;
 
 struct xfs_name {
diff --git a/libxfs/xfs_btree.c b/libxfs/xfs_btree.c
index 2dd6fb7..871323a 100644
--- a/libxfs/xfs_btree.c
+++ b/libxfs/xfs_btree.c
@@ -27,9 +27,10 @@ kmem_zone_t	*xfs_btree_cur_zone;
  * Btree magic numbers.
  */
 static const __uint32_t xfs_magics[2][XFS_BTNUM_MAX] = {
-	{ XFS_ABTB_MAGIC, XFS_ABTC_MAGIC, XFS_BMAP_MAGIC, XFS_IBT_MAGIC },
+	{ XFS_ABTB_MAGIC, XFS_ABTC_MAGIC, XFS_BMAP_MAGIC, XFS_IBT_MAGIC,
+	  XFS_FIBT_MAGIC },
 	{ XFS_ABTB_CRC_MAGIC, XFS_ABTC_CRC_MAGIC,
-	  XFS_BMAP_CRC_MAGIC, XFS_IBT_CRC_MAGIC }
+	  XFS_BMAP_CRC_MAGIC, XFS_IBT_CRC_MAGIC, XFS_FIBT_CRC_MAGIC }
 };
 #define xfs_btree_magic(cur) \
 	xfs_magics[!!((cur)->bc_flags & XFS_BTREE_CRC_BLOCKS)][cur->bc_btnum]
@@ -1101,6 +1102,7 @@ xfs_btree_set_refs(
 		xfs_buf_set_ref(bp, XFS_ALLOC_BTREE_REF);
 		break;
 	case XFS_BTNUM_INO:
+	case XFS_BTNUM_FINO:
 		xfs_buf_set_ref(bp, XFS_INO_BTREE_REF);
 		break;
 	case XFS_BTNUM_BMAP:
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index 337a4c6..1bb30c6 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -1482,6 +1482,8 @@ xfs_ialloc_log_agi(
 		offsetof(xfs_agi_t, agi_newino),
 		offsetof(xfs_agi_t, agi_dirino),
 		offsetof(xfs_agi_t, agi_unlinked),
+		offsetof(xfs_agi_t, agi_free_root),
+		offsetof(xfs_agi_t, agi_free_level),
 		sizeof(xfs_agi_t)
 	};
 #ifdef DEBUG
@@ -1491,14 +1493,39 @@ xfs_ialloc_log_agi(
 	ASSERT(agi->agi_magicnum == cpu_to_be32(XFS_AGI_MAGIC));
 #endif
 	/*
-	 * Compute byte offsets for the first and last fields.
+	 * The growth of the agi buffer over time now requires that we interpret
+	 * the buffer as two logical regions delineated at the end of the unlinked
+	 * list. This is due to the size of the hash table and its location in the
+	 * middle of the agi.
+	 *
+	 * For example, a request to log a field before agi_unlinked and a field
+	 * after agi_unlinked could cause us to log the entire hash table and use
+	 * an excessive amount of log space. To avoid this behavior, log the
+	 * region up through agi_unlinked in one call and the region after
+	 * agi_unlinked through the end of the structure in another.
 	 */
-	xfs_btree_offsets(fields, offsets, XFS_AGI_NUM_BITS, &first, &last);
+	xfs_trans_buf_set_type(tp, bp, XFS_BLFT_AGI_BUF);
+
 	/*
-	 * Log the allocation group inode header buffer.
+	 * Compute byte offsets for the first and last fields in the first
+	 * region and log agi buffer. This only logs up through agi_unlinked.
 	 */
-	xfs_trans_buf_set_type(tp, bp, XFS_BLFT_AGI_BUF);
-	xfs_trans_log_buf(tp, bp, first, last);
+	if (fields & XFS_AGI_ALL_BITS_R1) {
+		xfs_btree_offsets(fields, offsets, XFS_AGI_NUM_BITS_R1,
+				  &first, &last);
+		xfs_trans_log_buf(tp, bp, first, last);
+	}
+
+	/*
+	 * Mask off the bits in the first region and calculate the first and last
+	 * field offsets for any bits in the second region.
+	 */
+	fields &= ~XFS_AGI_ALL_BITS_R1;
+	if (fields) {
+		xfs_btree_offsets(fields, offsets, XFS_AGI_NUM_BITS_R2,
+				  &first, &last);
+		xfs_trans_log_buf(tp, bp, first, last);
+	}
 }
 
 #ifdef DEBUG
diff --git a/libxfs/xfs_ialloc_btree.c b/libxfs/xfs_ialloc_btree.c
index 0b9b91a..3e9425c 100644
--- a/libxfs/xfs_ialloc_btree.c
+++ b/libxfs/xfs_ialloc_btree.c
@@ -48,6 +48,21 @@ xfs_inobt_set_root(
 	xfs_ialloc_log_agi(cur->bc_tp, agbp, XFS_AGI_ROOT | XFS_AGI_LEVEL);
 }
 
+STATIC void
+xfs_finobt_set_root(
+	struct xfs_btree_cur	*cur,
+	union xfs_btree_ptr	*nptr,
+	int			inc)	/* level change */
+{
+	struct xfs_buf		*agbp = cur->bc_private.a.agbp;
+	struct xfs_agi		*agi = XFS_BUF_TO_AGI(agbp);
+
+	agi->agi_free_root = nptr->s;
+	be32_add_cpu(&agi->agi_free_level, inc);
+	xfs_ialloc_log_agi(cur->bc_tp, agbp,
+			   XFS_AGI_FREE_ROOT | XFS_AGI_FREE_LEVEL);
+}
+
 STATIC int
 xfs_inobt_alloc_block(
 	struct xfs_btree_cur	*cur,
@@ -155,6 +170,17 @@ xfs_inobt_init_ptr_from_cur(
 	ptr->s = agi->agi_root;
 }
 
+STATIC void
+xfs_finobt_init_ptr_from_cur(
+	struct xfs_btree_cur	*cur,
+	union xfs_btree_ptr	*ptr)
+{
+	struct xfs_agi		*agi = XFS_BUF_TO_AGI(cur->bc_private.a.agbp);
+
+	ASSERT(cur->bc_private.a.agno == be32_to_cpu(agi->agi_seqno));
+	ptr->s = agi->agi_free_root;
+}
+
 STATIC __int64_t
 xfs_inobt_key_diff(
 	struct xfs_btree_cur	*cur,
@@ -185,6 +211,7 @@ xfs_inobt_verify(
 	 */
 	switch (block->bb_magic) {
 	case cpu_to_be32(XFS_IBT_CRC_MAGIC):
+	case cpu_to_be32(XFS_FIBT_CRC_MAGIC):
 		if (!xfs_sb_version_hascrc(&mp->m_sb))
 			return false;
 		if (!uuid_equal(&block->bb_u.s.bb_uuid, &mp->m_sb.sb_uuid))
@@ -196,6 +223,7 @@ xfs_inobt_verify(
 			return false;
 		/* fall through */
 	case cpu_to_be32(XFS_IBT_MAGIC):
+	case cpu_to_be32(XFS_FIBT_MAGIC):
 		break;
 	default:
 		return 0;
@@ -370,6 +398,28 @@ static const struct xfs_btree_ops xfs_inobt_ops = {
 #endif
 };
 
+static const struct xfs_btree_ops xfs_finobt_ops = {
+	.rec_len		= sizeof(xfs_inobt_rec_t),
+	.key_len		= sizeof(xfs_inobt_key_t),
+
+	.dup_cursor		= xfs_inobt_dup_cursor,
+	.set_root		= xfs_finobt_set_root,
+	.alloc_block		= xfs_inobt_alloc_block,
+	.free_block		= xfs_inobt_free_block,
+	.get_minrecs		= xfs_inobt_get_minrecs,
+	.get_maxrecs		= xfs_inobt_get_maxrecs,
+	.init_key_from_rec	= xfs_inobt_init_key_from_rec,
+	.init_rec_from_key	= xfs_inobt_init_rec_from_key,
+	.init_rec_from_cur	= xfs_inobt_init_rec_from_cur,
+	.init_ptr_from_cur	= xfs_finobt_init_ptr_from_cur,
+	.key_diff		= xfs_inobt_key_diff,
+	.buf_ops		= &xfs_inobt_buf_ops,
+#if defined(DEBUG) || defined(XFS_WARN)
+	.keys_inorder		= xfs_inobt_keys_inorder,
+	.recs_inorder		= xfs_inobt_recs_inorder,
+#endif
+};
+
 /*
  * Allocate a new inode btree cursor.
  */
@@ -388,11 +438,17 @@ xfs_inobt_init_cursor(
 
 	cur->bc_tp = tp;
 	cur->bc_mp = mp;
-	cur->bc_nlevels = be32_to_cpu(agi->agi_level);
 	cur->bc_btnum = btnum;
+	if (btnum == XFS_BTNUM_INO) {
+		cur->bc_nlevels = be32_to_cpu(agi->agi_level);
+		cur->bc_ops = &xfs_inobt_ops;
+	} else {
+		cur->bc_nlevels = be32_to_cpu(agi->agi_free_level);
+		cur->bc_ops = &xfs_finobt_ops;
+	}
+
 	cur->bc_blocklog = mp->m_sb.sb_blocklog;
 
-	cur->bc_ops = &xfs_inobt_ops;
 	if (xfs_sb_version_hascrc(&mp->m_sb))
 		cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
 
-- 
1.8.1.4


* [PATCH v2 04/20] xfs: update inode allocation/free transaction reservations for finobt
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (2 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 03/20] xfs: support the XFS_BTNUM_FINOBT free inode btree type Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 05/20] xfs: insert newly allocated inode chunks into the finobt Brian Foster
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Create the xfs_calc_finobt_res() helper to calculate the finobt log
reservation for inode allocation and free. Update
XFS_IALLOC_SPACE_RES() to reserve blocks for the additional finobt
insertion on inode allocation. Create XFS_IFREE_SPACE_RES() to
reserve blocks for the potential finobt record insertion on inode
free (i.e., if an inode chunk was previously fully allocated).
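
A sketch of the block reservation arithmetic, assuming the intent is to
double the per-tree split term (maxlevels - 1) when the finobt is
enabled. The function names and the sample inputs are hypothetical. Note
the explicit parentheses around the conditional: ?: binds more loosely
than *, so leaving them out changes the result.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Blocks to reserve for an inode chunk allocation: the chunk itself plus
 * a possible split in each inode btree that receives a new record.
 */
static unsigned int
toy_ialloc_space_res(unsigned int ialloc_blocks, unsigned int in_maxlevels,
		     bool has_finobt)
{
	assert(in_maxlevels >= 1);
	return ialloc_blocks + (has_finobt ? 2u : 1u) * (in_maxlevels - 1);
}

/*
 * Blocks to reserve for an inode free: only the finobt can need a new
 * record here, when a fully allocated chunk gains its first free inode.
 */
static unsigned int
toy_ifree_space_res(unsigned int in_maxlevels, bool has_finobt)
{
	return has_finobt ? in_maxlevels : 0;
}
```

For example, with 4 chunk blocks and a maximum tree height of 3, the
allocation reservation is 6 blocks without the finobt and 8 with it.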

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 include/xfs_trans_space.h |  7 ++++++-
 libxfs/xfs_trans_resv.c   | 47 +++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/include/xfs_trans_space.h b/include/xfs_trans_space.h
index 7d2c920..a7d1721 100644
--- a/include/xfs_trans_space.h
+++ b/include/xfs_trans_space.h
@@ -47,7 +47,9 @@
 #define	XFS_DIRREMOVE_SPACE_RES(mp)	\
 	XFS_DAREMOVE_SPACE_RES(mp, XFS_DATA_FORK)
 #define	XFS_IALLOC_SPACE_RES(mp)	\
-	(XFS_IALLOC_BLOCKS(mp) + (mp)->m_in_maxlevels - 1)
+	(XFS_IALLOC_BLOCKS(mp) + \
+	 ((xfs_sb_version_hasfinobt(&mp->m_sb) ? 2 : 1) * \
+	  ((mp)->m_in_maxlevels - 1)))
 
 /*
  * Space reservation values for various transactions.
@@ -82,5 +84,8 @@
 	(XFS_DIRREMOVE_SPACE_RES(mp) + XFS_DIRENTER_SPACE_RES(mp,nl))
 #define	XFS_SYMLINK_SPACE_RES(mp,nl,b)	\
 	(XFS_IALLOC_SPACE_RES(mp) + XFS_DIRENTER_SPACE_RES(mp,nl) + (b))
+#define XFS_IFREE_SPACE_RES(mp)		\
+	(xfs_sb_version_hasfinobt(&mp->m_sb) ? (mp)->m_in_maxlevels : 0)
+
 
 #endif	/* __XFS_TRANS_SPACE_H__ */
diff --git a/libxfs/xfs_trans_resv.c b/libxfs/xfs_trans_resv.c
index 1e59fad..870d4fc 100644
--- a/libxfs/xfs_trans_resv.c
+++ b/libxfs/xfs_trans_resv.c
@@ -81,6 +81,37 @@ xfs_calc_inode_res(
 }
 
 /*
+ * The free inode btree is a conditional feature and the log reservation
+ * requirements differ slightly from that of the traditional inode allocation
+ * btree. The finobt tracks records for inode chunks with at least one free inode.
+ * Therefore, a record can be removed from the tree for an inode allocation or
+ * free and the associated merge reservation is unconditional. This also covers
+ * the possibility of a split on record insertion.
+ *
+ * the free inode btree: max depth * block size
+ * the free inode btree entry: block size
+ *
+ * TODO: is the modify res really necessary? covered by the merge/split res?
+ * This seems to be the pattern of ifree, but not create_resv_alloc. Why?
+ */
+STATIC uint
+xfs_calc_finobt_res(
+	struct xfs_mount 	*mp,
+	int			modify)
+{
+	uint res;
+
+	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
+		return 0;
+
+	res = xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1));
+	if (modify)
+		res += (uint)XFS_FSB_TO_B(mp, 1);
+
+	return res;
+}
+
+/*
  * Various log reservation values.
  *
  * These are based on the size of the file system block because that is what
@@ -250,6 +281,7 @@ xfs_calc_remove_reservation(
  *    the superblock for the nlink flag: sector size
  *    the directory btree: (max depth + v2) * dir block size
  *    the directory inode's bmap btree: (max depth + v2) * block size
+ *    the finobt
  */
 STATIC uint
 xfs_calc_create_resv_modify(
@@ -258,7 +290,8 @@ xfs_calc_create_resv_modify(
 	return xfs_calc_inode_res(mp, 2) +
 		xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) +
 		(uint)XFS_FSB_TO_B(mp, 1) +
-		xfs_calc_buf_res(XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1));
+		xfs_calc_buf_res(XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1)) +
+		xfs_calc_finobt_res(mp, 1);
 }
 
 /*
@@ -268,6 +301,7 @@ xfs_calc_create_resv_modify(
  *    the inode blocks allocated: XFS_IALLOC_BLOCKS * blocksize
  *    the inode btree: max depth * blocksize
  *    the allocation btrees: 2 trees * (max depth - 1) * block size
+ *    the finobt
  */
 STATIC uint
 xfs_calc_create_resv_alloc(
@@ -278,7 +312,8 @@ xfs_calc_create_resv_alloc(
 		xfs_calc_buf_res(XFS_IALLOC_BLOCKS(mp), XFS_FSB_TO_B(mp, 1)) +
 		xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1)) +
 		xfs_calc_buf_res(XFS_ALLOCFREE_LOG_COUNT(mp, 1),
-				 XFS_FSB_TO_B(mp, 1));
+				 XFS_FSB_TO_B(mp, 1)) +
+		xfs_calc_finobt_res(mp, 0);
 }
 
 STATIC uint
@@ -296,6 +331,7 @@ __xfs_calc_create_reservation(
  *    the superblock for the nlink flag: sector size
  *    the inode btree: max depth * blocksize
  *    the allocation btrees: 2 trees * (max depth - 1) * block size
+ *    the finobt
  */
 STATIC uint
 xfs_calc_icreate_resv_alloc(
@@ -305,7 +341,8 @@ xfs_calc_icreate_resv_alloc(
 		mp->m_sb.sb_sectsize +
 		xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1)) +
 		xfs_calc_buf_res(XFS_ALLOCFREE_LOG_COUNT(mp, 1),
-				 XFS_FSB_TO_B(mp, 1));
+				 XFS_FSB_TO_B(mp, 1)) +
+		xfs_calc_finobt_res(mp, 0);
 }
 
 STATIC uint
@@ -359,6 +396,7 @@ xfs_calc_symlink_reservation(
  *    the on disk inode before ours in the agi hash list: inode cluster size
  *    the inode btree: max depth * blocksize
  *    the allocation btrees: 2 trees * (max depth - 1) * block size
+ *    the finobt
  */
 STATIC uint
 xfs_calc_ifree_reservation(
@@ -374,7 +412,8 @@ xfs_calc_ifree_reservation(
 		xfs_calc_buf_res(2 + XFS_IALLOC_BLOCKS(mp) +
 				 mp->m_in_maxlevels, 0) +
 		xfs_calc_buf_res(XFS_ALLOCFREE_LOG_COUNT(mp, 1),
-				 XFS_FSB_TO_B(mp, 1));
+				 XFS_FSB_TO_B(mp, 1)) +
+		xfs_calc_finobt_res(mp, 1);
 }
 
 /*
-- 
1.8.1.4


* [PATCH v2 05/20] xfs: insert newly allocated inode chunks into the finobt
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (3 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 04/20] xfs: update inode allocation/free transaction reservations for finobt Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 06/20] xfs: use and update the finobt on inode allocation Brian Foster
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

A newly allocated inode chunk, by definition, has at least one
free inode, so a record is always inserted into the finobt.

Create the xfs_inobt_insert() helper from existing code to insert
a record in an inobt based on the provided BTNUM. Update
xfs_ialloc_ag_alloc() to invoke the helper for the existing
XFS_BTNUM_INO tree and XFS_BTNUM_FINO tree, if enabled.
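
The overall flow can be sketched with a toy model in which a flat array
stands in for each btree. Everything named toy_* is hypothetical; only
the shape of the loop (one all-free record per 64-inode chunk, inserted
into the inobt and then, if enabled, into the finobt) follows the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INODES_PER_CHUNK	64u
#define TOY_MAX_RECS		16u

struct toy_rec {
	unsigned int	startino;	/* first inode of the chunk */
	unsigned int	freecount;	/* number of free inodes */
	uint64_t	free;		/* free inode bitmask */
};

/* A flat array standing in for a btree. */
struct toy_tree {
	struct toy_rec	recs[TOY_MAX_RECS];
	unsigned int	nrecs;
};

/* Insert one all-free record per chunk; returns 0 on success, -1 on error. */
static int
toy_insert_chunks(struct toy_tree *tree, unsigned int newino,
		  unsigned int newlen)
{
	unsigned int	thisino, i;

	for (thisino = newino; thisino < newino + newlen;
	     thisino += TOY_INODES_PER_CHUNK) {
		/* The record must not exist yet (the lookup must miss). */
		for (i = 0; i < tree->nrecs; i++)
			if (tree->recs[i].startino == thisino)
				return -1;
		if (tree->nrecs == TOY_MAX_RECS)
			return -1;
		tree->recs[tree->nrecs++] = (struct toy_rec){
			thisino, TOY_INODES_PER_CHUNK, ~(uint64_t)0 };
	}
	return 0;
}

/* Insert into the inobt, then into the finobt if one was passed. */
static int
toy_ialloc(struct toy_tree *inobt, struct toy_tree *finobt,
	   unsigned int newino, unsigned int newlen)
{
	int	error = toy_insert_chunks(inobt, newino, newlen);

	if (error)
		return error;
	return finobt ? toy_insert_chunks(finobt, newino, newlen) : 0;
}
```

Passing NULL for the finobt models a filesystem without the feature, in
which case only the inobt is updated.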

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 libxfs/xfs_ialloc.c | 93 ++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 70 insertions(+), 23 deletions(-)

diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index 1bb30c6..e1f88ec 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -88,6 +88,66 @@ xfs_inobt_get_rec(
 }
 
 /*
+ * Insert a single inobt record. Cursor must already point to desired location.
+ */
+STATIC int
+xfs_inobt_insert_rec(
+	struct xfs_btree_cur	*cur,
+	__int32_t		freecount,
+	xfs_inofree_t		free,
+	int			*stat)
+{
+	cur->bc_rec.i.ir_freecount = freecount;
+	cur->bc_rec.i.ir_free = free;
+	return xfs_btree_insert(cur, stat);
+}
+
+/*
+ * Insert records describing a newly allocated inode chunk into the inobt.
+ */
+STATIC int
+xfs_inobt_insert(
+	struct xfs_mount	*mp,
+	struct xfs_trans	*tp,
+	struct xfs_buf		*agbp,
+	xfs_agino_t		newino,
+	xfs_agino_t		newlen,
+	xfs_btnum_t		btnum)
+{
+	struct xfs_btree_cur	*cur;
+	struct xfs_agi		*agi = XFS_BUF_TO_AGI(agbp);
+	xfs_agnumber_t		agno = be32_to_cpu(agi->agi_seqno);
+	xfs_agino_t		thisino;
+	int			i;
+	int			error;
+
+	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno, btnum);
+
+	for (thisino = newino;
+	     thisino < newino + newlen;
+	     thisino += XFS_INODES_PER_CHUNK) {
+		error = xfs_inobt_lookup(cur, thisino, XFS_LOOKUP_EQ, &i);
+		if (error) {
+			xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+			return error;
+		}
+		ASSERT(i == 0);
+
+		error = xfs_inobt_insert_rec(cur, XFS_INODES_PER_CHUNK,
+					     XFS_INOBT_ALL_FREE, &i);
+		if (error) {
+			xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+			return error;
+		}
+		ASSERT(i == 1);
+	}
+
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	return 0;
+}
+
+/*
  * Verify that the number of free inodes in the AGI is correct.
  */
 #ifdef DEBUG
@@ -286,13 +346,10 @@ xfs_ialloc_ag_alloc(
 {
 	xfs_agi_t	*agi;		/* allocation group header */
 	xfs_alloc_arg_t	args;		/* allocation argument structure */
-	xfs_btree_cur_t	*cur;		/* inode btree cursor */
 	xfs_agnumber_t	agno;
 	int		error;
-	int		i;
 	xfs_agino_t	newino;		/* new first inode's number */
 	xfs_agino_t	newlen;		/* new number of inodes */
-	xfs_agino_t	thisino;	/* current inode number, for loop */
 	int		isaligned = 0;	/* inode allocation at stripe unit */
 					/* boundary */
 	struct xfs_perag *pag;
@@ -430,29 +487,19 @@ xfs_ialloc_ag_alloc(
 	agi->agi_newino = cpu_to_be32(newino);
 
 	/*
-	 * Insert records describing the new inode chunk into the btree.
+	 * Insert records describing the new inode chunk into the btrees.
 	 */
-	cur = xfs_inobt_init_cursor(args.mp, tp, agbp, agno, XFS_BTNUM_INO);
-	for (thisino = newino;
-	     thisino < newino + newlen;
-	     thisino += XFS_INODES_PER_CHUNK) {
-		cur->bc_rec.i.ir_startino = thisino;
-		cur->bc_rec.i.ir_freecount = XFS_INODES_PER_CHUNK;
-		cur->bc_rec.i.ir_free = XFS_INOBT_ALL_FREE;
-		error = xfs_btree_lookup(cur, XFS_LOOKUP_EQ, &i);
-		if (error) {
-			xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
-			return error;
-		}
-		ASSERT(i == 0);
-		error = xfs_btree_insert(cur, &i);
-		if (error) {
-			xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	error = xfs_inobt_insert(args.mp, tp, agbp, newino, newlen,
+				 XFS_BTNUM_INO);
+	if (error)
+		return error;
+
+	if (xfs_sb_version_hasfinobt(&args.mp->m_sb)) {
+		error = xfs_inobt_insert(args.mp, tp, agbp, newino, newlen,
+					 XFS_BTNUM_FINO);
+		if (error)
 			return error;
-		}
-		ASSERT(i == 1);
 	}
-	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
 	/*
 	 * Log allocation group header fields
 	 */
-- 
1.8.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* [PATCH v2 06/20] xfs: use and update the finobt on inode allocation
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (4 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 05/20] xfs: insert newly allocated inode chunks into the finobt Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 07/20] xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helper Brian Foster
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Replace xfs_dialloc_ag() with an implementation that looks for a
record in the finobt. The finobt only tracks records with at least
one free inode. This eliminates the need for the intra-ag scan in
the original algorithm. Once the inode is allocated, update the
finobt appropriately (possibly removing the record) as well as the
inobt.

Move the original xfs_dialloc_ag() algorithm to
xfs_dialloc_ag_slow() and fall back to it if finobt support is
not enabled.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
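Reader's note, not part of the commit: once a finobt record has been
chosen, the allocation step comes down to taking the lowest set bit of the
free mask, clearing it, and deciding whether the record is updated or
removed. The sketch below shows only that bookkeeping. `struct free_rec`,
`lowest_free_bit()` and `alloc_from_rec()` are hypothetical names, and the
cursor handling and the matching inobt update are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an in-core inobt/finobt record. */
struct free_rec {
	uint32_t startino;
	int32_t  freecount;
	uint64_t free;		/* bit n set: inode startino + n is free */
};

enum finobt_action { FINOBT_UPDATE, FINOBT_DELETE };

/* Index of the lowest set bit; the mask must be non-zero. */
int
lowest_free_bit(uint64_t mask)
{
	int n = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		n++;
	}
	return n;
}

/*
 * Take the first free inode from *rec, store its AG-relative number in
 * *agino, and report what should happen to the finobt record.
 */
enum finobt_action
alloc_from_rec(struct free_rec *rec, uint32_t *agino)
{
	int offset;

	assert(rec->freecount > 0);
	offset = lowest_free_bit(rec->free);

	*agino = rec->startino + (uint32_t)offset;
	rec->free &= ~(UINT64_C(1) << offset);
	rec->freecount--;

	/* A record with no free inodes left does not belong in the finobt. */
	return rec->freecount ? FINOBT_UPDATE : FINOBT_DELETE;
}
```

The same mask and count change would then be applied to the inobt record,
which is always updated in place and never removed at this point.
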
 libxfs/xfs_ialloc.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 210 insertions(+), 1 deletion(-)

diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index e1f88ec..e95b847 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -699,7 +699,7 @@ xfs_ialloc_get_rec(
  * available.
  */
 STATIC int
-xfs_dialloc_ag(
+xfs_dialloc_ag_slow(
 	struct xfs_trans	*tp,
 	struct xfs_buf		*agbp,
 	xfs_ino_t		parent,
@@ -957,6 +957,215 @@ error0:
 	return error;
 }
 
+STATIC int
+xfs_dialloc_ag(
+	struct xfs_trans	*tp,
+	struct xfs_buf		*agbp,
+	xfs_ino_t		parent,
+	xfs_ino_t		*inop)
+{
+	struct xfs_mount		*mp = tp->t_mountp;
+	struct xfs_agi			*agi = XFS_BUF_TO_AGI(agbp);
+	xfs_agnumber_t			agno = be32_to_cpu(agi->agi_seqno);
+	xfs_agnumber_t			pagno = XFS_INO_TO_AGNO(mp, parent);
+	xfs_agino_t			pagino = XFS_INO_TO_AGINO(mp, parent);
+	struct xfs_perag		*pag;
+	struct xfs_btree_cur		*cur;
+	struct xfs_btree_cur		*tcur;
+	struct xfs_inobt_rec_incore	rec;
+	struct xfs_inobt_rec_incore	trec;
+	xfs_ino_t			ino;
+	int				error;
+	int				offset;
+	int				i, j;
+
+	if (!xfs_sb_version_hasfinobt(&mp->m_sb))
+		return xfs_dialloc_ag_slow(tp, agbp, parent, inop);
+
+	pag = xfs_perag_get(mp, agno);
+
+	/*
+	 * If pagino is 0 (this is the root inode allocation) use newino.
+	 * This must work because we've just allocated some.
+	 */
+	if (!pagino)
+		pagino = be32_to_cpu(agi->agi_newino);
+
+	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_FINO);
+
+	error = xfs_check_agi_freecount(cur, agi);
+	if (error)
+		goto error_cur;
+
+	if (agno == pagno) {
+		/*
+		 * We're in the same AG as the parent inode so allocate the
+		 * closest inode to the parent.
+		 */
+		error = xfs_inobt_lookup(cur, pagino, XFS_LOOKUP_LE, &i);
+		if (error)
+			goto error_cur;
+		if (i == 1) {
+			error = xfs_inobt_get_rec(cur, &rec, &i);
+			if (error)
+				goto error_cur;
+			XFS_WANT_CORRUPTED_GOTO(i == 1, error_cur);
+
+			/*
+			 * See if we've landed in the parent inode record. The
+			 * finobt only tracks chunks with at least one free
+			 * inode, so record existence is enough.
+			 */
+			if (pagino >= rec.ir_startino &&
+			    pagino < (rec.ir_startino + XFS_INODES_PER_CHUNK))
+				goto alloc_inode;
+		}
+
+		error = xfs_btree_dup_cursor(cur, &tcur);
+		if (error)
+			goto error_cur;
+
+		error = xfs_inobt_lookup(tcur, pagino, XFS_LOOKUP_GE, &j);
+		if (error)
+			goto error_tcur;
+		if (j == 1) {
+			error = xfs_inobt_get_rec(tcur, &trec, &j);
+			if (error)
+				goto error_tcur;
+			XFS_WANT_CORRUPTED_GOTO(j == 1, error_tcur);
+		}
+
+		if (i == 1 && j == 1) {
+			if ((pagino - rec.ir_startino + XFS_INODES_PER_CHUNK - 1) >
+			    (trec.ir_startino - pagino)) {
+				rec = trec;
+				xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+				cur = tcur;
+			} else {
+				xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR);
+			}
+		} else if (j == 1) {
+			rec = trec;
+			xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+			cur = tcur;
+		} else {
+			xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR);
+		}
+	} else {
+		/*
+		 * Different AG from the parent inode. Check the record for the
+		 * most recently allocated inode.
+		 */
+		if (agi->agi_newino != cpu_to_be32(NULLAGINO)) {
+			error = xfs_inobt_lookup(cur,
+			    be32_to_cpu(agi->agi_newino), XFS_LOOKUP_EQ, &i);
+			if (error)
+				goto error_cur;
+			if (i == 1) {
+				error = xfs_inobt_get_rec(cur, &rec, &i);
+				if (error)
+					goto error_cur;
+				XFS_WANT_CORRUPTED_GOTO(i == 1, error_cur);
+				goto alloc_inode;
+			}
+		}
+
+		/*
+		 * Allocate the first inode available in the AG.
+		 */
+		error = xfs_inobt_lookup(cur, 0, XFS_LOOKUP_GE, &i);
+		if (error)
+			goto error_cur;
+		XFS_WANT_CORRUPTED_GOTO(i == 1, error_cur);
+
+		error = xfs_inobt_get_rec(cur, &rec, &i);
+		if (error)
+			goto error_cur;
+		XFS_WANT_CORRUPTED_GOTO(i == 1, error_cur);
+	}
+
+alloc_inode:
+	offset = xfs_lowbit64(rec.ir_free);
+	ASSERT(offset >= 0);
+	ASSERT(offset < XFS_INODES_PER_CHUNK);
+	ASSERT((XFS_AGINO_TO_OFFSET(mp, rec.ir_startino) %
+				   XFS_INODES_PER_CHUNK) == 0);
+	ino = XFS_AGINO_TO_INO(mp, agno, rec.ir_startino + offset);
+
+	/*
+	 * Modify or remove the finobt record.
+	 */
+	rec.ir_free &= ~XFS_INOBT_MASK(offset);
+	rec.ir_freecount--;
+	if (rec.ir_freecount)
+		error = xfs_inobt_update(cur, &rec);
+	else
+		error = xfs_btree_delete(cur, &i);
+	if (error)
+		goto error_cur;
+
+	/*
+	 * Lookup and modify the equivalent record in the inobt.
+	 */
+	tcur = xfs_inobt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_INO);
+
+	error = xfs_check_agi_freecount(tcur, agi);
+	if (error)
+		goto error_tcur;
+
+	error = xfs_inobt_lookup(tcur, rec.ir_startino, XFS_LOOKUP_EQ, &i);
+	if (error)
+		goto error_tcur;
+	XFS_WANT_CORRUPTED_GOTO(i == 1, error_tcur);
+
+	error = xfs_inobt_get_rec(tcur, &trec, &i);
+	if (error)
+		goto error_tcur;
+	XFS_WANT_CORRUPTED_GOTO(i == 1, error_tcur);
+	ASSERT((XFS_AGINO_TO_OFFSET(mp, trec.ir_startino) %
+				   XFS_INODES_PER_CHUNK) == 0);
+
+	trec.ir_free &= ~XFS_INOBT_MASK(offset);
+	trec.ir_freecount--;
+
+	XFS_WANT_CORRUPTED_GOTO((rec.ir_free == trec.ir_free) &&
+				(rec.ir_freecount == trec.ir_freecount),
+				error_tcur);
+
+	error = xfs_inobt_update(tcur, &trec);
+	if (error)
+		goto error_tcur;
+
+	/*
+	 * Update the perag and superblock.
+	 */
+	be32_add_cpu(&agi->agi_freecount, -1);
+	xfs_ialloc_log_agi(tp, agbp, XFS_AGI_FREECOUNT);
+	pag->pagi_freecount--;
+
+	xfs_trans_mod_sb(tp, XFS_TRANS_SB_IFREE, -1);
+
+	error = xfs_check_agi_freecount(tcur, agi);
+	if (error)
+		goto error_tcur;
+	error = xfs_check_agi_freecount(cur, agi);
+	if (error)
+		goto error_tcur;
+
+	xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR);
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	xfs_perag_put(pag);
+	*inop = ino;
+	return 0;
+
+error_tcur:
+	xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR);
+error_cur:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	xfs_perag_put(pag);
+	return error;
+}
+
 /*
  * Allocate an inode on disk.
  *
-- 
1.8.1.4


* [PATCH v2 07/20] xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helper
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (5 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 06/20] xfs: use and update the finobt on inode allocation Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 08/20] xfs: update the finobt on inode free Brian Foster
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Refactor xfs_difree() in preparation for the finobt. xfs_difree()
performs the validity checks against the ag and reads the agi
header. The work of physically updating the inode allocation btree
is pushed down into the new xfs_difree_inobt() helper.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 libxfs/xfs_ialloc.c | 160 +++++++++++++++++++++++++++++++---------------------
 1 file changed, 96 insertions(+), 64 deletions(-)

diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index e95b847..d8405e7 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -1325,74 +1325,31 @@ out_error:
 	return XFS_ERROR(error);
 }
 
-/*
- * Free disk inode.  Carefully avoids touching the incore inode, all
- * manipulations incore are the caller's responsibility.
- * The on-disk inode is not changed by this operation, only the
- * btree (free inode mask) is changed.
- */
-int
-xfs_difree(
-	xfs_trans_t	*tp,		/* transaction pointer */
-	xfs_ino_t	inode,		/* inode to be freed */
-	xfs_bmap_free_t	*flist,		/* extents to free */
-	int		*delete,	/* set if inode cluster was deleted */
-	xfs_ino_t	*first_ino)	/* first inode in deleted cluster */
+STATIC int
+xfs_difree_inobt(
+	struct xfs_mount		*mp,
+	struct xfs_trans		*tp,
+	struct xfs_buf			*agbp,
+	xfs_agino_t			agino,
+	struct xfs_bmap_free		*flist,
+	int				*delete,
+	xfs_ino_t			*first_ino,
+	struct xfs_inobt_rec_incore	*orec)
 {
-	/* REFERENCED */
-	xfs_agblock_t	agbno;	/* block number containing inode */
-	xfs_buf_t	*agbp;	/* buffer containing allocation group header */
-	xfs_agino_t	agino;	/* inode number relative to allocation group */
-	xfs_agnumber_t	agno;	/* allocation group number */
-	xfs_agi_t	*agi;	/* allocation group header */
-	xfs_btree_cur_t	*cur;	/* inode btree cursor */
-	int		error;	/* error return value */
-	int		i;	/* result code */
-	int		ilen;	/* inodes in an inode cluster */
-	xfs_mount_t	*mp;	/* mount structure for filesystem */
-	int		off;	/* offset of inode in inode chunk */
-	xfs_inobt_rec_incore_t rec;	/* btree record */
-	struct xfs_perag *pag;
-
-	mp = tp->t_mountp;
+	struct xfs_agi			*agi = XFS_BUF_TO_AGI(agbp);
+	xfs_agnumber_t			agno = be32_to_cpu(agi->agi_seqno);
+	xfs_agblock_t			agbno = XFS_AGINO_TO_AGBNO(mp, agino);
+	struct xfs_perag		*pag;
+	struct xfs_btree_cur		*cur;
+	struct xfs_inobt_rec_incore	rec;
+	int				ilen;
+	int				error;
+	int				i;
+	int				off;
 
-	/*
-	 * Break up inode number into its components.
-	 */
-	agno = XFS_INO_TO_AGNO(mp, inode);
-	if (agno >= mp->m_sb.sb_agcount)  {
-		xfs_warn(mp, "%s: agno >= mp->m_sb.sb_agcount (%d >= %d).",
-			__func__, agno, mp->m_sb.sb_agcount);
-		ASSERT(0);
-		return XFS_ERROR(EINVAL);
-	}
-	agino = XFS_INO_TO_AGINO(mp, inode);
-	if (inode != XFS_AGINO_TO_INO(mp, agno, agino))  {
-		xfs_warn(mp, "%s: inode != XFS_AGINO_TO_INO() (%llu != %llu).",
-			__func__, (unsigned long long)inode,
-			(unsigned long long)XFS_AGINO_TO_INO(mp, agno, agino));
-		ASSERT(0);
-		return XFS_ERROR(EINVAL);
-	}
-	agbno = XFS_AGINO_TO_AGBNO(mp, agino);
-	if (agbno >= mp->m_sb.sb_agblocks)  {
-		xfs_warn(mp, "%s: agbno >= mp->m_sb.sb_agblocks (%d >= %d).",
-			__func__, agbno, mp->m_sb.sb_agblocks);
-		ASSERT(0);
-		return XFS_ERROR(EINVAL);
-	}
-	/*
-	 * Get the allocation group header.
-	 */
-	error = xfs_ialloc_read_agi(mp, tp, agno, &agbp);
-	if (error) {
-		xfs_warn(mp, "%s: xfs_ialloc_read_agi() returned error %d.",
-			__func__, error);
-		return error;
-	}
-	agi = XFS_BUF_TO_AGI(agbp);
 	ASSERT(agi->agi_magicnum == cpu_to_be32(XFS_AGI_MAGIC));
 	ASSERT(agbno < be32_to_cpu(agi->agi_length));
+
 	/*
 	 * Initialize the cursor.
 	 */
@@ -1488,6 +1445,7 @@ xfs_difree(
 	if (error)
 		goto error0;
 
+	*orec = rec;
 	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
 	return 0;
 
@@ -1496,6 +1454,80 @@ error0:
 	return error;
 }
 
+/*
+ * Free disk inode.  Carefully avoids touching the incore inode, all
+ * manipulations incore are the caller's responsibility.
+ * The on-disk inode is not changed by this operation, only the
+ * btree (free inode mask) is changed.
+ */
+int
+xfs_difree(
+	xfs_trans_t	*tp,		/* transaction pointer */
+	xfs_ino_t	inode,		/* inode to be freed */
+	xfs_bmap_free_t	*flist,		/* extents to free */
+	int		*delete,	/* set if inode cluster was deleted */
+	xfs_ino_t	*first_ino)	/* first inode in deleted cluster */
+{
+	/* REFERENCED */
+	xfs_agblock_t	agbno;	/* block number containing inode */
+	xfs_buf_t	*agbp;	/* buffer containing allocation group header */
+	xfs_agino_t	agino;	/* inode number relative to allocation group */
+	xfs_agnumber_t	agno;	/* allocation group number */
+	int		error;	/* error return value */
+	xfs_mount_t	*mp;	/* mount structure for filesystem */
+	xfs_inobt_rec_incore_t rec;	/* btree record */
+
+	mp = tp->t_mountp;
+
+	/*
+	 * Break up inode number into its components.
+	 */
+	agno = XFS_INO_TO_AGNO(mp, inode);
+	if (agno >= mp->m_sb.sb_agcount)  {
+		xfs_warn(mp, "%s: agno >= mp->m_sb.sb_agcount (%d >= %d).",
+			__func__, agno, mp->m_sb.sb_agcount);
+		ASSERT(0);
+		return XFS_ERROR(EINVAL);
+	}
+	agino = XFS_INO_TO_AGINO(mp, inode);
+	if (inode != XFS_AGINO_TO_INO(mp, agno, agino))  {
+		xfs_warn(mp, "%s: inode != XFS_AGINO_TO_INO() (%llu != %llu).",
+			__func__, (unsigned long long)inode,
+			(unsigned long long)XFS_AGINO_TO_INO(mp, agno, agino));
+		ASSERT(0);
+		return XFS_ERROR(EINVAL);
+	}
+	agbno = XFS_AGINO_TO_AGBNO(mp, agino);
+	if (agbno >= mp->m_sb.sb_agblocks)  {
+		xfs_warn(mp, "%s: agbno >= mp->m_sb.sb_agblocks (%d >= %d).",
+			__func__, agbno, mp->m_sb.sb_agblocks);
+		ASSERT(0);
+		return XFS_ERROR(EINVAL);
+	}
+	/*
+	 * Get the allocation group header.
+	 */
+	error = xfs_ialloc_read_agi(mp, tp, agno, &agbp);
+	if (error) {
+		xfs_warn(mp, "%s: xfs_ialloc_read_agi() returned error %d.",
+			__func__, error);
+		return error;
+	}
+
+	/*
+	 * Fix up the inode allocation btree.
+	 */
+	error = xfs_difree_inobt(mp, tp, agbp, agino, flist, delete, first_ino,
+				 &rec);
+	if (error)
+		goto error0;
+
+	return 0;
+
+error0:
+	return error;
+}
+
 STATIC int
 xfs_imap_lookup(
 	struct xfs_mount	*mp,
-- 
1.8.1.4


* [PATCH v2 08/20] xfs: update the finobt on inode free
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (6 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 07/20] xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helper Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 09/20] xfs: report finobt status in fs geometry Brian Foster
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

An inode free operation can have several effects on the finobt. If
all inodes have been freed and the chunk deallocated, we remove the
finobt record. If the inode chunk was previously full, we must
insert a new record based on the existing inobt record. Otherwise,
we modify the record in place.

Create the xfs_difree_finobt() function to identify the potential
scenarios and update the finobt appropriately.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
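Reader's note, not part of the commit: the three scenarios in the commit
message can be written as a small decision function. This is a sketch
under stated assumptions. `classify_free()` and `enum finobt_free_op` are
hypothetical names, the free count passed in is the value after the free
(as carried by the inobt record), and where the patch reports corruption
through XFS_WANT_CORRUPTED_GOTO() the sketch simply asserts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum finobt_free_op { FINOBT_INSERT, FINOBT_MODIFY, FINOBT_REMOVE };

/*
 * Decide how the finobt changes after one inode has been freed.
 *   in_finobt         a finobt record for the chunk already exists
 *   new_freecount     free inodes in the chunk after this free
 *   inodes_per_chunk  number of inodes in a full chunk
 *   ikeep             empty chunks are kept rather than deallocated
 */
enum finobt_free_op
classify_free(bool in_finobt, int new_freecount, int inodes_per_chunk,
	      bool ikeep)
{
	assert(new_freecount >= 1 && new_freecount <= inodes_per_chunk);

	if (!in_finobt) {
		/* The chunk was full, so exactly one inode is free now. */
		assert(new_freecount == 1);
		return FINOBT_INSERT;
	}

	/* Every inode is free and the chunk goes away: drop the record. */
	if (new_freecount == inodes_per_chunk && !ikeep)
		return FINOBT_REMOVE;

	/* Partially free, or fully free with ikeep: update in place. */
	return FINOBT_MODIFY;
}
```
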
 libxfs/xfs_ialloc.c | 109 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)

diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index d8405e7..834a740 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -1455,6 +1455,106 @@ error0:
 }
 
 /*
+ * Free an inode in the free inode btree.
+ */
+STATIC int
+xfs_difree_finobt(
+	struct xfs_mount		*mp,
+	struct xfs_trans		*tp,
+	struct xfs_buf			*agbp,
+	xfs_agino_t			agino,
+	struct xfs_inobt_rec_incore	*ibtrec) /* inobt record */
+{
+	struct xfs_agi			*agi = XFS_BUF_TO_AGI(agbp);
+	xfs_agnumber_t			agno = be32_to_cpu(agi->agi_seqno);
+	struct xfs_btree_cur		*cur;
+	struct xfs_inobt_rec_incore	rec;
+	int				offset = agino - ibtrec->ir_startino;
+	int				error;
+	int				i;
+
+	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_FINO);
+
+	error = xfs_inobt_lookup(cur, ibtrec->ir_startino, XFS_LOOKUP_EQ, &i);
+	if (error)
+		goto error;
+	if (i == 0) {
+		/*
+		 * If the record does not exist in the finobt, we must have just
+		 * freed an inode in a previously fully allocated chunk. If not,
+		 * something is out of sync.
+		 */
+		XFS_WANT_CORRUPTED_GOTO(ibtrec->ir_freecount == 1, error);
+
+		error = xfs_inobt_insert_rec(cur, ibtrec->ir_freecount,
+					     ibtrec->ir_free, &i);
+		if (error)
+			goto error;
+		ASSERT(i == 1);
+
+		goto out;
+	}
+
+	/*
+	 * Read and update the existing record.
+	 */
+	error = xfs_inobt_get_rec(cur, &rec, &i);
+	if (error)
+		goto error;
+	XFS_WANT_CORRUPTED_GOTO(i == 1, error);
+
+	rec.ir_free |= XFS_INOBT_MASK(offset);
+	rec.ir_freecount++;
+
+	XFS_WANT_CORRUPTED_GOTO((rec.ir_free == ibtrec->ir_free) &&
+				(rec.ir_freecount == ibtrec->ir_freecount),
+				error);
+
+	/*
+	 * The content of inobt records should always match between the inobt
+	 * and finobt. The lifecycle of records in the finobt is different from
+	 * the inobt in that the finobt only tracks records with at least one
+	 * free inode. This is to optimize lookup for inode allocation purposes.
+	 * The following checks determine whether to update the existing record or
+	 * remove it entirely.
+	 */
+
+	if (rec.ir_freecount == XFS_IALLOC_INODES(mp) &&
+	    !(mp->m_flags & XFS_MOUNT_IKEEP)) {
+		/*
+		 * If all inodes are free and we're in !ikeep mode, the entire
+		 * inode chunk has been deallocated. Remove the record from the
+		 * finobt.
+		 */
+		error = xfs_btree_delete(cur, &i);
+		if (error)
+			goto error;
+		ASSERT(i == 1);
+	} else {
+		/*
+		 * The existing finobt record was modified and has a combination
+		 * of allocated and free inodes or is completely free and ikeep
+		 * is enabled. Update the record.
+		 */
+		error = xfs_inobt_update(cur, &rec);
+		if (error)
+			goto error;
+	}
+
+out:
+	error = xfs_check_agi_freecount(cur, agi);
+	if (error)
+		goto error;
+
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	return 0;
+
+error:
+	xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
+	return error;
+}
+
+/*
  * Free disk inode.  Carefully avoids touching the incore inode, all
  * manipulations incore are the caller's responsibility.
  * The on-disk inode is not changed by this operation, only the
@@ -1522,6 +1622,15 @@ xfs_difree(
 	if (error)
 		goto error0;
 
+	/*
+	 * Fix up the free inode btree.
+	 */
+	if (xfs_sb_version_hasfinobt(&mp->m_sb)) {
+		error = xfs_difree_finobt(mp, tp, agbp, agino, &rec);
+		if (error)
+			goto error0;
+	}
+
 	return 0;
 
 error0:
-- 
1.8.1.4


* [PATCH v2 09/20] xfs: report finobt status in fs geometry
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (7 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 08/20] xfs: update the finobt on inode free Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 10/20] xfs: enable the finobt feature on v5 superblocks Brian Foster
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Define the XFS_FSOP_GEOM_FLAGS_FINOBT fs geometry flag and set the
associated bit if the filesystem supports the free inode btree.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
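Reader's note, not part of the commit: a consumer of the geometry flags
only needs a bit test to find out whether the filesystem has a finobt.
The sketch below redefines the flag values from the hunk under shortened,
hypothetical names so that it builds on its own. How the flags word is
obtained (for example from the flags field returned by the fs geometry
call) is outside the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as in the hunk; renamed so the sketch builds standalone. */
#define GEOM_FLAGS_V5SB		0x8000u		/* version 5 superblock */
#define GEOM_FLAGS_FTYPE	0x10000u	/* inode directory types */
#define GEOM_FLAGS_FINOBT	0x20000u	/* free inode btree */

/* True if the geometry flags word reports free inode btree support. */
bool
geom_has_finobt(uint32_t geom_flags)
{
	return (geom_flags & GEOM_FLAGS_FINOBT) != 0;
}
```
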
 include/xfs_fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/xfs_fs.h b/include/xfs_fs.h
index 554fd66..59c40fc 100644
--- a/include/xfs_fs.h
+++ b/include/xfs_fs.h
@@ -238,6 +238,7 @@ typedef struct xfs_fsop_resblks {
 #define XFS_FSOP_GEOM_FLAGS_LAZYSB	0x4000	/* lazy superblock counters */
 #define XFS_FSOP_GEOM_FLAGS_V5SB	0x8000	/* version 5 superblock */
 #define XFS_FSOP_GEOM_FLAGS_FTYPE	0x10000	/* inode directory types */
+#define XFS_FSOP_GEOM_FLAGS_FINOBT	0x20000	/* free inode btree */
 
 
 /*
-- 
1.8.1.4


* [PATCH v2 10/20] xfs: enable the finobt feature on v5 superblocks
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (8 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 09/20] xfs: report finobt status in fs geometry Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 11/20] xfsprogs/mkfs: finobt mkfs support Brian Foster
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Add the finobt feature bit to the list of known features. As of
this point, the kernel code knows how to mount and manage both
finobt and non-finobt formatted filesystems.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
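Reader's note, not part of the commit: the effect of adding the bit to
the "all known" mask is easiest to see in the unknown-feature test. With
the mask at 0, every read-only compatible bit, finobt included, counts as
unknown. With the finobt bit in the mask, only other bits do. The names
below are shortened, hypothetical copies of the macros in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FEAT_RO_COMPAT_FINOBT	(UINT32_C(1) << 0)	/* free inode btree */
#define FEAT_RO_COMPAT_ALL	(FEAT_RO_COMPAT_FINOBT)
#define FEAT_RO_COMPAT_UNKNOWN	(~FEAT_RO_COMPAT_ALL)

/* True if the superblock sets an ro-compat bit this code does not know. */
bool
has_unknown_ro_compat(uint32_t sb_features_ro_compat)
{
	return (sb_features_ro_compat & FEAT_RO_COMPAT_UNKNOWN) != 0;
}
```
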
 include/xfs_sb.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/xfs_sb.h b/include/xfs_sb.h
index 070a7f6..9919fb8 100644
--- a/include/xfs_sb.h
+++ b/include/xfs_sb.h
@@ -586,7 +586,8 @@ xfs_sb_has_compat_feature(
 }
 
 #define XFS_SB_FEAT_RO_COMPAT_FINOBT   (1 << 0)		/* free inode btree */
-#define XFS_SB_FEAT_RO_COMPAT_ALL 0
+#define XFS_SB_FEAT_RO_COMPAT_ALL \
+		(XFS_SB_FEAT_RO_COMPAT_FINOBT)
 #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN	~XFS_SB_FEAT_RO_COMPAT_ALL
 static inline bool
 xfs_sb_has_ro_compat_feature(
-- 
1.8.1.4


* [PATCH v2 11/20] xfsprogs/mkfs: finobt mkfs support
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (9 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 10/20] xfs: enable the finobt feature on v5 superblocks Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 12/20] xfsprogs/db: finobt support Brian Foster
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Add the 'finobt' metadata option to mkfs to format an fs with free
inode btree support. If enabled, initialize the associated AGI
header fields and btree root block.

Also, do the initialization of the superblock version and feature
bits (including the new finobt flag) a bit earlier. These fields
must now be initialized prior to the use of XFS_PREALLOC_BLOCKS(),
as the latter returns a value that depends on whether a finobt root
btree block is reserved.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
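Reader's note, not part of the commit: the ordering constraint exists
because the count of preallocated AG blocks grows by one when a finobt
root is reserved. The sketch below assumes the btree root blocks follow
the AGFL block back to back, with the finobt root last. That layout and
all function names are assumptions made for illustration, not the libxfs
macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: one root block per btree, directly after the AGFL. */
uint32_t bno_block(uint32_t agfl_block)  { return agfl_block + 1; }
uint32_t cnt_block(uint32_t agfl_block)  { return bno_block(agfl_block) + 1; }
uint32_t ibt_block(uint32_t agfl_block)  { return cnt_block(agfl_block) + 1; }
uint32_t fibt_block(uint32_t agfl_block) { return ibt_block(agfl_block) + 1; }

/*
 * Number of AG blocks reserved up front. The result depends on the
 * finobt feature, so the feature bit has to be known before this runs.
 */
uint32_t
prealloc_blocks(uint32_t agfl_block, bool has_finobt)
{
	uint32_t last_root;

	last_root = has_finobt ? fibt_block(agfl_block)
			       : ibt_block(agfl_block);
	return last_root + 1;
}
```

If the feature bit were set only after the first use, early callers would
see the smaller value and the two results would disagree by one block.
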
 mkfs/xfs_mkfs.c | 83 +++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 63 insertions(+), 20 deletions(-)

diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index d82128c..f28832a 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -183,6 +183,8 @@ char	*sopts[] = {
 char	*mopts[] = {
 #define	M_CRC		0
 	"crc",
+#define M_FINOBT	1
+	"finobt",
 	NULL
 };
 
@@ -962,6 +964,7 @@ main(
 	struct fs_topology	ft;
 	int			lazy_sb_counters;
 	int			crcs_enabled;
+	int			finobt;
 
 	progname = basename(argv[0]);
 	setlocale(LC_ALL, "");
@@ -995,6 +998,7 @@ main(
 	worst_freelist = 0;
 	lazy_sb_counters = 1;
 	crcs_enabled = 0;
+	finobt = 0;
 	memset(&fsx, 0, sizeof(fsx));
 
 	memset(&xi, 0, sizeof(xi));
@@ -1486,6 +1490,14 @@ _("cannot specify both crc and ftype\n"));
 						usage();
 					}
 					break;
+				case M_FINOBT:
+					if (!value || *value == '\0')
+						reqval('m', mopts, M_FINOBT);
+					c = atoi(value);
+					if (c < 0 || c > 1)
+						illegal(value, "m finobt");
+					finobt = c;
+					break;
 				default:
 					unknown('m', value);
 				}
@@ -2407,6 +2419,30 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
 	mp->m_blkbb_log = sbp->sb_blocklog - BBSHIFT;
 	mp->m_sectbb_log = sbp->sb_sectlog - BBSHIFT;
 
+	/*
+	 * sb_versionnum and finobt flags must be set before we use
+	 * XFS_PREALLOC_BLOCKS().
+	 */
+	sbp->sb_features2 = XFS_SB_VERSION2_MKFS(crcs_enabled, lazy_sb_counters,
+					attrversion == 2, !projid16bit, 0,
+					(!crcs_enabled && dirftype));
+	sbp->sb_versionnum = XFS_SB_VERSION_MKFS(crcs_enabled, iaflag,
+					dsunit != 0,
+					logversion == 2, attrversion == 1,
+					(sectorsize != BBSIZE ||
+							lsectorsize != BBSIZE),
+					nci, sbp->sb_features2 != 0);
+	/*
+	 * Due to a structure alignment issue, sb_features2 ended up in one
+	 * of two locations, the second "incorrect" location represented by
+	 * the sb_bad_features2 field. To avoid older kernels mounting
+	 * filesystems they shouldn't, set both fields to the same value.
+	 */
+	sbp->sb_bad_features2 = sbp->sb_features2;
+
+	if (finobt)
+		sbp->sb_features_ro_compat = XFS_SB_FEAT_RO_COMPAT_FINOBT;
+
 	if (loginternal) {
 		/*
 		 * Readjust the log size to fit within an AG if it was sized
@@ -2469,7 +2505,7 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
 		printf(_(
 		   "meta-data=%-22s isize=%-6d agcount=%lld, agsize=%lld blks\n"
 		   "         =%-22s sectsz=%-5u attr=%u, projid32bit=%u\n"
-		   "         =%-22s crc=%u\n"
+		   "         =%-22s crc=%-8u finobt=%u\n"
 		   "data     =%-22s bsize=%-6u blocks=%llu, imaxpct=%u\n"
 		   "         =%-22s sunit=%-6u swidth=%u blks\n"
 		   "naming   =version %-14u bsize=%-6u ascii-ci=%d ftype=%d\n"
@@ -2478,7 +2514,7 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
 		   "realtime =%-22s extsz=%-6d blocks=%lld, rtextents=%lld\n"),
 			dfile, isize, (long long)agcount, (long long)agsize,
 			"", sectorsize, attrversion, !projid16bit,
-			"", crcs_enabled,
+			"", crcs_enabled, finobt,
 			"", blocksize, (long long)dblocks, imaxpct,
 			"", dsunit, dswidth,
 			dirversion, dirblocksize, nci, dirftype,
@@ -2547,23 +2583,6 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
 		sbp->sb_logsectsize = 0;
 	}
 
-	sbp->sb_features2 = XFS_SB_VERSION2_MKFS(crcs_enabled, lazy_sb_counters,
-				   attrversion == 2, !projid16bit, 0,
-				   (!crcs_enabled && dirftype));
-	sbp->sb_versionnum = XFS_SB_VERSION_MKFS(crcs_enabled, iaflag,
-					dsunit != 0,
-					logversion == 2, attrversion == 1,
-					(sectorsize != BBSIZE ||
-							lsectorsize != BBSIZE),
-					nci, sbp->sb_features2 != 0);
-	/*
-	 * Due to a structure alignment issue, sb_features2 ended up in one
-	 * of two locations, the second "incorrect" location represented by
-	 * the sb_bad_features2 field. To avoid older kernels mounting
-	 * filesystems they shouldn't, set both field to the same value.
-	 */
-	sbp->sb_bad_features2 = sbp->sb_features2;
-
 	if (force_overwrite)
 		zero_old_xfs_structures(&xi, sbp);
 
@@ -2720,6 +2739,10 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
 		agi->agi_count = 0;
 		agi->agi_root = cpu_to_be32(XFS_IBT_BLOCK(mp));
 		agi->agi_level = cpu_to_be32(1);
+		if (finobt) {
+			agi->agi_free_root = cpu_to_be32(XFS_FIBT_BLOCK(mp));
+			agi->agi_free_level = cpu_to_be32(1);
+		}
 		agi->agi_freecount = 0;
 		agi->agi_newino = cpu_to_be32(NULLAGINO);
 		agi->agi_dirino = cpu_to_be32(NULLAGINO);
@@ -2845,6 +2868,26 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
 			xfs_btree_init_block(mp, buf, XFS_IBT_MAGIC, 0, 0,
 						agno, 0);
 		libxfs_writebuf(buf, LIBXFS_EXIT_ON_FAILURE);
+
+		/*
+		 * Free INO btree root block
+		 */
+		if (!finobt)
+			continue;
+
+		buf = libxfs_getbuf(mp->m_ddev_targp,
+				XFS_AGB_TO_DADDR(mp, agno, XFS_FIBT_BLOCK(mp)),
+				bsize);
+		buf->b_ops = &xfs_inobt_buf_ops;
+		block = XFS_BUF_TO_BLOCK(buf);
+		memset(block, 0, blocksize);
+		if (xfs_sb_version_hascrc(&mp->m_sb))
+			xfs_btree_init_block(mp, buf, XFS_FIBT_CRC_MAGIC, 0, 0,
+						agno, XFS_BTREE_CRC_BLOCKS);
+		else
+			xfs_btree_init_block(mp, buf, XFS_FIBT_MAGIC, 0, 0,
+						agno, 0);
+		libxfs_writebuf(buf, LIBXFS_EXIT_ON_FAILURE);
 	}
 
 	/*
@@ -3081,7 +3124,7 @@ usage( void )
 {
 	fprintf(stderr, _("Usage: %s\n\
 /* blocksize */		[-b log=n|size=num]\n\
-/* metadata */		[-m crc=[0|1]\n\
+/* metadata */		[-m crc=0|1,finobt=0|1]\n\
 /* data subvol */	[-d agcount=n,agsize=n,file,name=xxx,size=num,\n\
 			    (sunit=value,swidth=value|su=num,sw=num|noalign),\n\
 			    sectlog=n|sectsize=num\n\
-- 
1.8.1.4


* [PATCH v2 12/20] xfsprogs/db: finobt support
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (10 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 11/20] xfsprogs/mkfs: finobt mkfs support Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 13/20] xfsprogs/repair: account for finobt in ag 0 geometry pre-calculation Brian Foster
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Add the AGI finobt fields and fibt layouts.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 db/agi.c     |  2 ++
 db/btblock.c | 12 ++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/db/agi.c b/db/agi.c
index 398bdbb..6f167ac 100644
--- a/db/agi.c
+++ b/db/agi.c
@@ -57,6 +57,8 @@ const field_t	agi_flds[] = {
 	{ "uuid", FLDT_UUID, OI(OFF(uuid)), C1, 0, TYP_NONE },
 	{ "lsn", FLDT_UINT64X, OI(OFF(lsn)), C1, 0, TYP_NONE },
 	{ "crc", FLDT_CRC, OI(OFF(crc)), C1, 0, TYP_NONE },
+	{ "free_root", FLDT_AGBLOCK, OI(OFF(free_root)), C1, 0, TYP_INOBT },
+	{ "free_level", FLDT_UINT32D, OI(OFF(free_level)), C1, 0, TYP_NONE },
 	{ NULL }
 };
 
diff --git a/db/btblock.c b/db/btblock.c
index 1ea0cff..cdb8b1d 100644
--- a/db/btblock.c
+++ b/db/btblock.c
@@ -60,6 +60,12 @@ struct xfs_db_btree {
 		sizeof(xfs_inobt_rec_t),
 		sizeof(__be32),
 	},
+	{	XFS_FIBT_MAGIC,
+		XFS_BTREE_SBLOCK_LEN,
+		sizeof(xfs_inobt_key_t),
+		sizeof(xfs_inobt_rec_t),
+		sizeof(__be32),
+	},
 	{	XFS_BMAP_CRC_MAGIC,
 		XFS_BTREE_LBLOCK_CRC_LEN,
 		sizeof(xfs_bmbt_key_t),
@@ -84,6 +90,12 @@ struct xfs_db_btree {
 		sizeof(xfs_inobt_rec_t),
 		sizeof(__be32),
 	},
+	{	XFS_FIBT_CRC_MAGIC,
+		XFS_BTREE_SBLOCK_CRC_LEN,
+		sizeof(xfs_inobt_key_t),
+		sizeof(xfs_inobt_rec_t),
+		sizeof(__be32),
+	},
 	{	0,
 	},
 };
-- 
1.8.1.4


* [PATCH v2 13/20] xfsprogs/repair: account for finobt in ag 0 geometry pre-calculation
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (11 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 12/20] xfsprogs/db: finobt support Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 14/20] xfsprogs/repair: phase 2 finobt scan Brian Foster
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Account for the finobt in calc_mkfs(). On a finobt-enabled filesystem
the first inode block in ag 0 is expected one block later.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
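
For reviewers, a rough standalone sketch of the arithmetic this hunk
touches. The helper name and its parameters are hypothetical;
min_freelist stands in for XFS_MIN_FREELIST_RAW(1, 1, mp) and nothing
here is read from a real superblock.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the arithmetic in calc_mkfs(). All
 * parameters are assumptions supplied by the caller.
 */
static uint32_t
demo_first_inode_block(uint32_t bnobt_root, uint32_t min_freelist,
		       int has_finobt)
{
	/* the bnobt, bcntbt and inobt roots occupy consecutive blocks */
	uint32_t inobt_root = bnobt_root + 2;
	uint32_t fino_bno = inobt_root + min_freelist + 1;

	if (has_finobt)
		fino_bno++;	/* one more block is consumed by the finobt */

	return fino_bno;
}
```

With bnobt_root = 1 and min_freelist = 4 this gives block 8 without
the finobt and block 9 with it.
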
 repair/xfs_repair.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index a863337..fbc2d7d 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -404,6 +404,8 @@ calc_mkfs(xfs_mount_t *mp)
 	bcntbt_root = bnobt_root + 1;
 	inobt_root = bnobt_root + 2;
 	fino_bno = inobt_root + XFS_MIN_FREELIST_RAW(1, 1, mp) + 1;
+	if (xfs_sb_version_hasfinobt(&mp->m_sb))
+		fino_bno++;
 
 	/*
 	 * If the log is allocated in the first allocation group we need to
-- 
1.8.1.4


* [PATCH v2 14/20] xfsprogs/repair: phase 2 finobt scan
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (12 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 13/20] xfsprogs/repair: account for finobt in ag 0 geometry pre-calculation Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 15/20] xfsprogs/repair: pass btree block magic as param to build_ino_tree() Brian Foster
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

If one exists, scan the free inode btree in phase 2 of xfs_repair.
We use the same general infrastructure as for the inobt scan, but
trigger the finobt chunk scan logic in scan_inobt() via the magic
value.

The new scan_single_finobt_chunk() function is similar to the inobt
equivalent with some finobt specific logic. We can expect that
underlying inode chunk blocks are already marked used due to the
previous inobt scan. We can also expect to find every record
tracked by the finobt already accounted for in the in-core tree
with equivalent (and internally consistent) inobt record data.

Spit out a warning on any divergences from the above and add the
inodes referenced by the current finobt record to the appropriate
in-core tree.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
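
A minimal sketch of the per-record consistency checks described above,
for illustration only. The struct and function names are hypothetical
and the record is held in CPU byte order, whereas the real code reads
big-endian on-disk fields. Unlike the patch, which only warns on a
freecount mismatch or an empty record, the sketch simply counts every
divergence it finds.

```c
#include <stdint.h>

#define DEMO_INODES_PER_CHUNK	64	/* stand-in for XFS_INODES_PER_CHUNK */

/* Hypothetical, CPU-endian stand-in for an on-disk inobt record. */
struct demo_inobt_rec {
	uint32_t	startino;
	uint32_t	freecount;
	uint64_t	free;	/* bit j set => inode j of the chunk is free */
};

/* Count the inodes that the record's free mask marks as free. */
static int
demo_count_free(const struct demo_inobt_rec *rec)
{
	int	j;
	int	nfree = 0;

	for (j = 0; j < DEMO_INODES_PER_CHUNK; j++)
		if (rec->free & (1ULL << j))
			nfree++;
	return nfree;
}

/*
 * Compare a finobt record against the free mask already recorded
 * in-core by the inobt scan. Returns the number of divergences.
 */
static int
demo_check_finobt_rec(const struct demo_inobt_rec *rec, uint64_t incore_free)
{
	int	problems = 0;
	int	nfree = demo_count_free(rec);

	if (nfree != (int)rec->freecount)
		problems++;	/* freecount disagrees with the free mask */
	if (nfree == 0)
		problems++;	/* the finobt should not track full chunks */
	if (rec->free & ~incore_free)
		problems++;	/* free in the finobt but in use in-core */

	return problems;
}
```
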
 repair/scan.c | 239 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 234 insertions(+), 5 deletions(-)

diff --git a/repair/scan.c b/repair/scan.c
index 49ed194..1035f01 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -46,6 +46,7 @@ struct aghdr_cnts {
 	__uint64_t	fdblocks;
 	__uint64_t	icount;
 	__uint64_t	ifreecount;
+	__uint32_t	fibtfreecount;
 };
 
 void
@@ -882,6 +883,196 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
 	return suspect;
 }
 
+static int
+scan_single_finobt_chunk(
+	xfs_agnumber_t		agno,
+	xfs_inobt_rec_t		*rp,
+	int			suspect)
+{
+	xfs_ino_t		lino;
+	xfs_agino_t		ino;
+	xfs_agblock_t		agbno;
+	int			j;
+	int			nfree;
+	int			off;
+	int			state;
+	ino_tree_node_t		*first_rec, *last_rec, *ino_rec;
+
+	ino = be32_to_cpu(rp->ir_startino);
+	off = XFS_AGINO_TO_OFFSET(mp, ino);
+	agbno = XFS_AGINO_TO_AGBNO(mp, ino);
+	lino = XFS_AGINO_TO_INO(mp, agno, ino);
+
+	/*
+	 * on multi-block chunks, all chunks start
+	 * at the beginning of the block.  with multi-chunk
+	 * blocks, all chunks must start on 64-inode boundaries
+	 * since each block can hold N complete chunks. if
+	 * fs has aligned inodes, all chunks must start
+	 * at a fs_ino_alignment*N'th agbno.  skip recs
+	 * with badly aligned starting inodes.
+	 */
+	if (ino == 0 ||
+	    (inodes_per_block <= XFS_INODES_PER_CHUNK && off !=  0) ||
+	    (inodes_per_block > XFS_INODES_PER_CHUNK &&
+	     off % XFS_INODES_PER_CHUNK != 0) ||
+	    (fs_aligned_inodes && agbno % fs_ino_alignment != 0)) {
+		do_warn(
+	_("badly aligned finobt inode rec (starting inode = %" PRIu64 ")\n"),
+			lino);
+		suspect++;
+	}
+
+	/*
+	 * verify numeric validity of inode chunk first
+	 * before inserting into a tree.  don't have to
+	 * worry about the overflow case because the
+	 * starting ino number of a chunk can only get
+	 * within 255 inodes of max (NULLAGINO).  if it
+	 * gets closer, the agino number will be illegal
+	 * as the agbno will be too large.
+	 */
+	if (verify_aginum(mp, agno, ino)) {
+		do_warn(
+_("bad starting inode # (%" PRIu64 " (0x%x 0x%x)) in finobt rec, skipping rec\n"),
+			lino, agno, ino);
+		return ++suspect;
+	}
+
+	if (verify_aginum(mp, agno,
+			ino + XFS_INODES_PER_CHUNK - 1)) {
+		do_warn(
+_("bad ending inode # (%" PRIu64 " (0x%x 0x%zx)) in finobt rec, skipping rec\n"),
+			lino + XFS_INODES_PER_CHUNK - 1,
+			agno,
+			ino + XFS_INODES_PER_CHUNK - 1);
+		return ++suspect;
+	}
+
+	/*
+	 * cross check state of each block containing inodes referenced by the
+	 * finobt against what we have already scanned from the alloc inobt.
+	 */
+	if (off == 0 && !suspect) {
+		for (j = 0;
+		     j < XFS_INODES_PER_CHUNK;
+		     j += mp->m_sb.sb_inopblock) {
+			agbno = XFS_AGINO_TO_AGBNO(mp, ino + j);
+
+			state = get_bmap(agno, agbno);
+			if (state == XR_E_INO) {
+				continue;
+			} else if ((state == XR_E_UNKNOWN) ||
+				   (state == XR_E_INUSE_FS && agno == 0 &&
+				    ino + j >= first_prealloc_ino &&
+				    ino + j < last_prealloc_ino)) {
+				do_warn(
+_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"),
+					agno, agbno, mp->m_sb.sb_inopblock);
+
+				set_bmap(agno, agbno, XR_E_INO);
+				suspect++;
+			} else {
+				do_warn(
+_("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"),
+					agno, agbno, mp->m_sb.sb_inopblock);
+				return ++suspect;
+			}
+		}
+	}
+
+	/*
+	 * ensure we have an incore entry for each chunk
+	 */
+	find_inode_rec_range(mp, agno, ino, ino + XFS_INODES_PER_CHUNK,
+			     &first_rec, &last_rec);
+
+	if (first_rec) {
+		if (suspect)
+			return suspect;
+
+		/*
+		 * verify consistency between finobt record and incore state
+		 */
+		if (first_rec->ino_startnum != ino) {
+			do_warn(
+_("finobt rec for ino %" PRIu64 " (%d/%u) does not match existing rec (%d/%d)\n"),
+				lino, agno, ino, agno, first_rec->ino_startnum);
+			return ++suspect;
+		}
+
+		nfree = 0;
+		for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
+			if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
+				nfree++;
+				if (!suspect && !is_inode_free(first_rec, j))
+					suspect++;
+			}
+		}
+
+		goto check_freecount;
+	}
+
+	/*
+	 * the finobt contains a record that the previous alloc inobt scan never
+	 * found. insert the inodes into the appropriate tree.
+	 */
+
+	do_warn(
+		_("undiscovered finobt record, ino %" PRIu64 " (%d/%u)\n"),
+		lino, agno, ino);
+
+	if (!suspect) {
+		/*
+		 * inodes previously inserted into the uncertain tree should be
+		 * superseded by these when the uncertain tree is processed
+		 */
+		nfree = 0;
+		if (XFS_INOBT_IS_FREE_DISK(rp, 0)) {
+			nfree++;
+			ino_rec = set_inode_free_alloc(mp, agno, ino);
+		} else  {
+			ino_rec = set_inode_used_alloc(mp, agno, ino);
+		}
+		for (j = 1; j < XFS_INODES_PER_CHUNK; j++) {
+			if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
+				nfree++;
+				set_inode_free(ino_rec, j);
+			} else  {
+				set_inode_used(ino_rec, j);
+			}
+		}
+	} else {
+		/*
+		 * this should handle the case where the inobt scan may have
+		 * already added uncertain inodes
+		 */
+		nfree = 0;
+		for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
+			if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
+				add_aginode_uncertain(agno, ino + j, 1);
+				nfree++;
+			} else {
+				add_aginode_uncertain(agno, ino + j, 0);
+			}
+		}
+	}
+
+check_freecount:
+
+	if (nfree != be32_to_cpu(rp->ir_freecount)) {
+		do_warn(
+_("finobt ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n"),
+			agno, ino, be32_to_cpu(rp->ir_freecount), nfree);
+	}
+
+	if (!nfree) {
+		do_warn(
+_("finobt record with no free inodes, inode chunk %d/%u\n"), agno, ino);
+	}
+
+	return suspect;
+}
 
 /*
  * this one walks the inode btrees sucking the info there into
@@ -990,12 +1181,29 @@ _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
 		 * the block.  skip processing of bogus records.
 		 */
 		for (i = 0; i < numrecs; i++) {
-			agcnts->agicount += XFS_INODES_PER_CHUNK;
-			agcnts->icount += XFS_INODES_PER_CHUNK;
-			agcnts->agifreecount += be32_to_cpu(rp[i].ir_freecount);
-			agcnts->ifreecount += be32_to_cpu(rp[i].ir_freecount);
+			if (magic == XFS_IBT_MAGIC ||
+			    magic == XFS_IBT_CRC_MAGIC) {
+				agcnts->agicount += XFS_INODES_PER_CHUNK;
+				agcnts->icount += XFS_INODES_PER_CHUNK;
+				agcnts->agifreecount +=
+					be32_to_cpu(rp[i].ir_freecount);
+				agcnts->ifreecount +=
+					be32_to_cpu(rp[i].ir_freecount);
+
+				suspect = scan_single_ino_chunk(agno, &rp[i],
+						suspect);
+			} else {
+				/*
+				 * the finobt tracks records with free inodes,
+				 * so only the free inode count is expected to be
+				 * consistent with the agi
+				 */
+				agcnts->fibtfreecount +=
+					be32_to_cpu(rp[i].ir_freecount);
 
-			suspect = scan_single_ino_chunk(agno, &rp[i], suspect);
+				suspect = scan_single_finobt_chunk(agno, &rp[i],
+						suspect);
+			}
 		}
 
 		if (suspect)
@@ -1180,6 +1388,20 @@ validate_agi(
 			be32_to_cpu(agi->agi_root), agno);
 	}
 
+	if (xfs_sb_version_hasfinobt(&mp->m_sb)) {
+		bno = be32_to_cpu(agi->agi_free_root);
+		if (bno != 0 && verify_agbno(mp, agno, bno)) {
+			magic = xfs_sb_version_hascrc(&mp->m_sb) ?
+					XFS_FIBT_CRC_MAGIC : XFS_FIBT_MAGIC;
+			scan_sbtree(bno, be32_to_cpu(agi->agi_free_level),
+				    agno, 0, scan_inobt, 1, magic, agcnts,
+				    &xfs_inobt_buf_ops);
+		} else {
+			do_warn(_("bad agbno %u for finobt root, agno %d\n"),
+				be32_to_cpu(agi->agi_free_root), agno);
+		}
+	}
+
 	if (be32_to_cpu(agi->agi_count) != agcnts->agicount) {
 		do_warn(_("agi_count %u, counted %u in ag %u\n"),
 			 be32_to_cpu(agi->agi_count), agcnts->agicount, agno);
@@ -1190,6 +1412,13 @@ validate_agi(
 			be32_to_cpu(agi->agi_freecount), agcnts->agifreecount, agno);
 	}
 
+	if (xfs_sb_version_hasfinobt(&mp->m_sb) &&
+	    be32_to_cpu(agi->agi_freecount) != agcnts->fibtfreecount) {
+		do_warn(_("agi_freecount %u, counted %u in ag %u finobt\n"),
+			be32_to_cpu(agi->agi_freecount), agcnts->fibtfreecount,
+			agno);
+	}
+
 	for (i = 0; i < XFS_AGI_UNLINKED_BUCKETS; i++) {
 		xfs_agino_t	agino = be32_to_cpu(agi->agi_unlinked[i]);
 
-- 
1.8.1.4


* [PATCH v2 15/20] xfsprogs/repair: pass btree block magic as param to build_ino_tree()
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (13 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 14/20] xfsprogs/repair: phase 2 finobt scan Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 16/20] xfsprogs/repair: pull the build_agi() call up out of the inode tree build Brian Foster
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

A minor cleanup to build_ino_tree() to provide the appropriate
magic value for btree block initialization from the caller. This
facilitates use of separate magic values for finobt blocks when
building the free inode btree.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
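
As an illustration of what the caller is expected to select, a small
standalone sketch follows. The helper name is hypothetical and the
DEMO_* values are my recollection of the ASCII magics in the format
headers; please check them against the tree before relying on them.

```c
#include <stdint.h>

/* Assumed to mirror the XFS_IBT and XFS_FIBT magics; verify before use. */
#define DEMO_IBT_MAGIC		0x49414254u	/* "IABT" */
#define DEMO_IBT_CRC_MAGIC	0x49414233u	/* "IAB3" */
#define DEMO_FIBT_MAGIC		0x46494254u	/* "FIBT" */
#define DEMO_FIBT_CRC_MAGIC	0x46494233u	/* "FIB3" */

/*
 * Hypothetical helper: choose the block magic for an inode btree
 * rebuild from the CRC feature bit and the tree being built.
 */
static uint32_t
demo_ino_btree_magic(int has_crc, int finobt)
{
	if (finobt)
		return has_crc ? DEMO_FIBT_CRC_MAGIC : DEMO_FIBT_MAGIC;
	return has_crc ? DEMO_IBT_CRC_MAGIC : DEMO_IBT_MAGIC;
}
```
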
 repair/phase5.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/repair/phase5.c b/repair/phase5.c
index 77eb125..10ed1eb 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -1097,7 +1097,7 @@ build_agi(xfs_mount_t *mp, xfs_agnumber_t agno,
  */
 static void
 build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
-		bt_status_t *btree_curs)
+		bt_status_t *btree_curs, __uint32_t magic)
 {
 	xfs_agnumber_t		i;
 	xfs_agblock_t		j;
@@ -1135,11 +1135,11 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
 		bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p);
 		memset(bt_hdr, 0, mp->m_sb.sb_blocksize);
 		if (xfs_sb_version_hascrc(&mp->m_sb))
-			xfs_btree_init_block(mp, lptr->buf_p, XFS_IBT_CRC_MAGIC,
+			xfs_btree_init_block(mp, lptr->buf_p, magic,
 						i, 0, agno,
 						XFS_BTREE_CRC_BLOCKS);
 		else
-			xfs_btree_init_block(mp, lptr->buf_p, XFS_IBT_MAGIC,
+			xfs_btree_init_block(mp, lptr->buf_p, magic,
 						i, 0, agno, 0);
 	}
 
@@ -1166,11 +1166,11 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
 		bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p);
 		memset(bt_hdr, 0, mp->m_sb.sb_blocksize);
 		if (xfs_sb_version_hascrc(&mp->m_sb))
-			xfs_btree_init_block(mp, lptr->buf_p, XFS_IBT_CRC_MAGIC,
+			xfs_btree_init_block(mp, lptr->buf_p, magic,
 						0, 0, agno,
 						XFS_BTREE_CRC_BLOCKS);
 		else
-			xfs_btree_init_block(mp, lptr->buf_p, XFS_IBT_MAGIC,
+			xfs_btree_init_block(mp, lptr->buf_p, magic,
 						0, 0, agno, 0);
 
 		bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno);
@@ -1483,6 +1483,7 @@ phase5_func(
 	xfs_extlen_t	freeblks2;
 #endif
 	xfs_agblock_t	num_extents;
+	__uint32_t	magic;
 
 	if (verbose)
 		do_log(_("        - agno = %d\n"), agno);
@@ -1616,7 +1617,9 @@ phase5_func(
 		/*
 		 * build inode allocation tree.  this also build the agi
 		 */
-		build_ino_tree(mp, agno, &ino_btree_curs);
+		magic = xfs_sb_version_hascrc(&mp->m_sb) ?
+				XFS_IBT_CRC_MAGIC : XFS_IBT_MAGIC;
+		build_ino_tree(mp, agno, &ino_btree_curs, magic);
 		write_cursor(&ino_btree_curs);
 		/*
 		 * tear down cursors
-- 
1.8.1.4


* [PATCH v2 16/20] xfsprogs/repair: pull the build_agi() call up out of the inode tree build
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (14 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 15/20] xfsprogs/repair: pass btree block magic as param to build_ino_tree() Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 17/20] xfsprogs/repair: helpers for finding in-core inode records w/ free inodes Brian Foster
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Pull the build_agi() call out of build_ino_tree() in phase 5. This
is to prepare for finobt support, in which build_agi() will require
context from multiple inode tree reconstructions (both the inode
allocation tree and free inode tree, when it exists).

Create the new 'agi_stat' structure to carry the requisite state
from the build_ino_tree() operation to build_agi().

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 repair/phase5.c | 36 +++++++++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/repair/phase5.c b/repair/phase5.c
index 10ed1eb..9632d2c 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -74,6 +74,15 @@ typedef struct bt_status  {
 	bt_stat_level_t		level[XFS_BTREE_MAXLEVELS];
 } bt_status_t;
 
+/*
+ * extra metadata for the agi
+ */
+struct agi_stat {
+	xfs_agino_t		first_agino;
+	xfs_agino_t		count;
+	xfs_agino_t		freecount;
+};
+
 static __uint64_t	*sb_icount_ag;		/* allocated inodes per ag */
 static __uint64_t	*sb_ifree_ag;		/* free inodes per ag */
 static __uint64_t	*sb_fdblocks_ag;	/* free data blocks per ag */
@@ -1053,8 +1062,7 @@ prop_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs,
  */
 static void
 build_agi(xfs_mount_t *mp, xfs_agnumber_t agno,
-		bt_status_t *btree_curs, xfs_agino_t first_agino,
-		xfs_agino_t count, xfs_agino_t freecount)
+		bt_status_t *btree_curs, struct agi_stat *agi_stat)
 {
 	xfs_buf_t	*agi_buf;
 	xfs_agi_t	*agi;
@@ -1075,11 +1083,11 @@ build_agi(xfs_mount_t *mp, xfs_agnumber_t agno,
 	else
 		agi->agi_length = cpu_to_be32(mp->m_sb.sb_dblocks -
 			(xfs_drfsbno_t) mp->m_sb.sb_agblocks * agno);
-	agi->agi_count = cpu_to_be32(count);
+	agi->agi_count = cpu_to_be32(agi_stat->count);
 	agi->agi_root = cpu_to_be32(btree_curs->root);
 	agi->agi_level = cpu_to_be32(btree_curs->num_levels);
-	agi->agi_freecount = cpu_to_be32(freecount);
-	agi->agi_newino = cpu_to_be32(first_agino);
+	agi->agi_freecount = cpu_to_be32(agi_stat->freecount);
+	agi->agi_newino = cpu_to_be32(agi_stat->first_agino);
 	agi->agi_dirino = cpu_to_be32(NULLAGINO);
 
 	for (i = 0; i < XFS_AGI_UNLINKED_BUCKETS; i++)  
@@ -1097,7 +1105,8 @@ build_agi(xfs_mount_t *mp, xfs_agnumber_t agno,
  */
 static void
 build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
-		bt_status_t *btree_curs, __uint32_t magic)
+		bt_status_t *btree_curs, __uint32_t magic,
+		struct agi_stat *agi_stat)
 {
 	xfs_agnumber_t		i;
 	xfs_agblock_t		j;
@@ -1227,7 +1236,11 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
 		}
 	}
 
-	build_agi(mp, agno, btree_curs, first_agino, count, freecount);
+	if (agi_stat) {
+		agi_stat->first_agino = first_agino;
+		agi_stat->count = count;
+		agi_stat->freecount = freecount;
+	}
 }
 
 /*
@@ -1484,6 +1497,7 @@ phase5_func(
 #endif
 	xfs_agblock_t	num_extents;
 	__uint32_t	magic;
+	struct agi_stat	agi_stat = {0,};
 
 	if (verbose)
 		do_log(_("        - agno = %d\n"), agno);
@@ -1615,12 +1629,16 @@ phase5_func(
 		build_agf_agfl(mp, agno, &bno_btree_curs,
 				&bcnt_btree_curs, freeblks1, extra_blocks);
 		/*
-		 * build inode allocation tree.  this also build the agi
+		 * build inode allocation tree.
 		 */
 		magic = xfs_sb_version_hascrc(&mp->m_sb) ?
 				XFS_IBT_CRC_MAGIC : XFS_IBT_MAGIC;
-		build_ino_tree(mp, agno, &ino_btree_curs, magic);
+		build_ino_tree(mp, agno, &ino_btree_curs, magic, &agi_stat);
 		write_cursor(&ino_btree_curs);
+
+		/* build the agi */
+		build_agi(mp, agno, &ino_btree_curs, &agi_stat);
+
 		/*
 		 * tear down cursors
 		 */
-- 
1.8.1.4


* [PATCH v2 17/20] xfsprogs/repair: helpers for finding in-core inode records w/ free inodes
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (15 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 16/20] xfsprogs/repair: pull the build_agi() call up out of the inode tree build Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 18/20] xfsprogs/repair: reconstruct the finobt in phase 5 Brian Foster
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Add the findfirst_free_inode_rec() and next_free_ino_rec() helpers
to assist scanning the in-core inode records for records with at
least one free inode. These will be used to determine what records
are included in the free inode btree.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
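
A standalone sketch of the iteration pattern, assuming a singly linked
list in place of the in-core AVL tree. The demo_* names and the list
layout are hypothetical; only the "skip records with an empty free
mask" logic is meant to match the helpers in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ino_tree_node_t, linked in inode order. */
struct demo_ino_rec {
	uint32_t		startnum;
	uint64_t		ir_free;	/* non-zero => has free inodes */
	struct demo_ino_rec	*next;
};

/* Advance past records that have no free inodes. */
static struct demo_ino_rec *
demo_skip_full(struct demo_ino_rec *rec)
{
	while (rec && !rec->ir_free)
		rec = rec->next;
	return rec;
}

/* First record with at least one free inode, or NULL. */
static struct demo_ino_rec *
demo_first_free_rec(struct demo_ino_rec *head)
{
	return demo_skip_full(head);
}

/* Next record after 'rec' with at least one free inode, or NULL. */
static struct demo_ino_rec *
demo_next_free_rec(struct demo_ino_rec *rec)
{
	return demo_skip_full(rec->next);
}
```

A caller would loop with
"for (r = demo_first_free_rec(head); r; r = demo_next_free_rec(r))".
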
 repair/incore.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/repair/incore.h b/repair/incore.h
index 38caa6d..5a563db 100644
--- a/repair/incore.h
+++ b/repair/incore.h
@@ -379,6 +379,33 @@ void			clear_uncertain_ino_cache(xfs_agnumber_t agno);
 		((ino_tree_node_t *) ((ino_node_ptr)->avl_node.avl_forw))
 
 /*
+ * finobt helpers
+ */
+static inline ino_tree_node_t *
+findfirst_free_inode_rec(xfs_agnumber_t agno)
+{
+	ino_tree_node_t *ino_rec;
+
+	ino_rec = findfirst_inode_rec(agno);
+
+	while (ino_rec && !ino_rec->ir_free)
+		ino_rec = next_ino_rec(ino_rec);
+
+	return ino_rec;
+}
+
+static inline ino_tree_node_t *
+next_free_ino_rec(ino_tree_node_t *ino_rec)
+{
+	ino_rec = next_ino_rec(ino_rec);
+
+	while (ino_rec && !ino_rec->ir_free)
+		ino_rec = next_ino_rec(ino_rec);
+
+	return ino_rec;
+}
+
+/*
  * Has an inode been processed for phase 6 (reference count checking)?
  *
  * add_inode_refchecked() is set on an inode when it gets traversed
-- 
1.8.1.4


* [PATCH v2 18/20] xfsprogs/repair: reconstruct the finobt in phase 5
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (16 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 17/20] xfsprogs/repair: helpers for finding in-core inode records w/ free inodes Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 19/20] xfsprogs/growfs: report finobt status in fs geometry (xfs_info) Brian Foster
  2013-11-13 15:56 ` [PATCH v2 20/20] xfsprogs/db: add finobt support to metadump Brian Foster
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Support reconstruction of the finobt in phase 5 of xfs_repair. We
create a new cursor for the finobt and write the in-core records
that contain free inodes to the tree. Finally, pass the cursor
along to build_agi() to include the finobt root and level count in
the agi header.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
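
To illustrate how the cursor sizing differs between the two trees, a
rough sketch under simplified assumptions: the in-core records are
reduced to an array of free masks, and the demo_* names are
hypothetical. In finobt mode, records without free inodes are not
counted, so the tree is sized for fewer records.

```c
#include <stdint.h>

#define DEMO_INODES_PER_CHUNK	64	/* stand-in for XFS_INODES_PER_CHUNK */

struct demo_ino_stats {
	uint64_t	num_recs;	/* records the tree will hold */
	uint64_t	ninos;		/* inodes covered by those records */
	uint64_t	nfinos;		/* free inodes in those records */
};

/* Count set bits in a 64-bit free mask. */
static int
demo_popcount64(uint64_t v)
{
	int	n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Gather statistics for an inobt (finobt == 0) or finobt (finobt != 0). */
static struct demo_ino_stats
demo_count_recs(const uint64_t *free_masks, int nrecs, int finobt)
{
	struct demo_ino_stats	st = { 0, 0, 0 };
	int			i;

	for (i = 0; i < nrecs; i++) {
		int	rec_nfinos = demo_popcount64(free_masks[i]);

		/* the finobt only considers records with free inodes */
		if (finobt && !rec_nfinos)
			continue;

		st.nfinos += rec_nfinos;
		st.ninos += DEMO_INODES_PER_CHUNK;
		st.num_recs++;
	}
	return st;
}

/* Leaf blocks needed for num_recs records, rounding up. */
static uint64_t
demo_leaf_blocks(uint64_t num_recs, uint64_t recs_per_block)
{
	return (num_recs + recs_per_block - 1) / recs_per_block;
}
```
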
 repair/phase5.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 57 insertions(+), 13 deletions(-)

diff --git a/repair/phase5.c b/repair/phase5.c
index 9632d2c..e138a6a 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -881,10 +881,11 @@ build_freespace_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
  */
 static void
 init_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs,
-		__uint64_t *num_inos, __uint64_t *num_free_inos)
+		__uint64_t *num_inos, __uint64_t *num_free_inos, int finobt)
 {
 	__uint64_t		ninos;
 	__uint64_t		nfinos;
+	__uint64_t		rec_nfinos;
 	ino_tree_node_t		*ino_rec;
 	int			num_recs;
 	int			level;
@@ -920,13 +921,22 @@ init_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs,
 	 * build up statistics
 	 */
 	for (num_recs = 0; ino_rec != NULL; ino_rec = next_ino_rec(ino_rec))  {
-		ninos += XFS_INODES_PER_CHUNK;
-		num_recs++;
+		rec_nfinos = 0;
 		for (i = 0; i < XFS_INODES_PER_CHUNK; i++)  {
 			ASSERT(is_inode_confirmed(ino_rec, i));
 			if (is_inode_free(ino_rec, i))
-				nfinos++;
+				rec_nfinos++;
 		}
+
+		/*
+		 * finobt only considers records with free inodes
+		 */
+		if (finobt && !rec_nfinos)
+			continue;
+
+		nfinos += rec_nfinos;
+		ninos += XFS_INODES_PER_CHUNK;
+		num_recs++;
 	}
 
 	blocks_allocated = lptr->num_blocks = howmany(num_recs,
@@ -1061,8 +1071,8 @@ prop_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs,
  * XXX: yet more code that can be shared with mkfs, growfs.
  */
 static void
-build_agi(xfs_mount_t *mp, xfs_agnumber_t agno,
-		bt_status_t *btree_curs, struct agi_stat *agi_stat)
+build_agi(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs,
+		bt_status_t *finobt_curs, struct agi_stat *agi_stat)
 {
 	xfs_buf_t	*agi_buf;
 	xfs_agi_t	*agi;
@@ -1096,6 +1106,11 @@ build_agi(xfs_mount_t *mp, xfs_agnumber_t agno,
 	if (xfs_sb_version_hascrc(&mp->m_sb))
 		platform_uuid_copy(&agi->agi_uuid, &mp->m_sb.sb_uuid);
 
+	if (xfs_sb_version_hasfinobt(&mp->m_sb)) {
+		agi->agi_free_root = cpu_to_be32(finobt_curs->root);
+		agi->agi_free_level = cpu_to_be32(finobt_curs->num_levels);
+	}
+
 	libxfs_writebuf(agi_buf, 0);
 }
 
@@ -1106,7 +1121,7 @@ build_agi(xfs_mount_t *mp, xfs_agnumber_t agno,
 static void
 build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
 		bt_status_t *btree_curs, __uint32_t magic,
-		struct agi_stat *agi_stat)
+		struct agi_stat *agi_stat, int finobt)
 {
 	xfs_agnumber_t		i;
 	xfs_agblock_t		j;
@@ -1158,7 +1173,10 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
 	 * pointers for the parent.  that can recurse up to the root
 	 * if required.  set the sibling pointers for leaf level here.
 	 */
-	ino_rec = findfirst_inode_rec(agno);
+	if (finobt)
+		ino_rec = findfirst_free_inode_rec(agno);
+	else
+		ino_rec = findfirst_inode_rec(agno);
 
 	if (ino_rec != NULL)
 		first_agino = ino_rec->ino_startnum;
@@ -1210,7 +1228,11 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
 			bt_rec[j].ir_freecount = cpu_to_be32(inocnt);
 			freecount += inocnt;
 			count += XFS_INODES_PER_CHUNK;
-			ino_rec = next_ino_rec(ino_rec);
+
+			if (finobt)
+				ino_rec = next_free_ino_rec(ino_rec);
+			else
+				ino_rec = next_ino_rec(ino_rec);
 		}
 
 		if (ino_rec != NULL)  {
@@ -1486,9 +1508,12 @@ phase5_func(
 {
 	__uint64_t	num_inos;
 	__uint64_t	num_free_inos;
+	__uint64_t	finobt_num_inos;
+	__uint64_t	finobt_num_free_inos;
 	bt_status_t	bno_btree_curs;
 	bt_status_t	bcnt_btree_curs;
 	bt_status_t	ino_btree_curs;
+	bt_status_t	fino_btree_curs;
 	int		extra_blocks = 0;
 	uint		num_freeblocks;
 	xfs_extlen_t	freeblks1;
@@ -1533,8 +1558,13 @@ phase5_func(
 		 * on-disk btrees (includs pre-allocating all
 		 * required blocks for the trees themselves)
 		 */
-		init_ino_cursor(mp, agno, &ino_btree_curs,
-				&num_inos, &num_free_inos);
+		init_ino_cursor(mp, agno, &ino_btree_curs, &num_inos,
+				&num_free_inos, 0);
+
+		if (xfs_sb_version_hasfinobt(&mp->m_sb))
+			init_ino_cursor(mp, agno, &fino_btree_curs,
+					&finobt_num_inos, &finobt_num_free_inos,
+					1);
 
 		sb_icount_ag[agno] += num_inos;
 		sb_ifree_ag[agno] += num_free_inos;
@@ -1633,17 +1663,31 @@ phase5_func(
 		 */
 		magic = xfs_sb_version_hascrc(&mp->m_sb) ?
 				XFS_IBT_CRC_MAGIC : XFS_IBT_MAGIC;
-		build_ino_tree(mp, agno, &ino_btree_curs, magic, &agi_stat);
+		build_ino_tree(mp, agno, &ino_btree_curs, magic, &agi_stat, 0);
 		write_cursor(&ino_btree_curs);
 
+		/*
+		 * build free inode tree
+		 */
+		if (xfs_sb_version_hasfinobt(&mp->m_sb)) {
+			magic = xfs_sb_version_hascrc(&mp->m_sb) ?
+					XFS_FIBT_CRC_MAGIC : XFS_FIBT_MAGIC;
+			build_ino_tree(mp, agno, &fino_btree_curs, magic,
+					NULL, 1);
+			write_cursor(&fino_btree_curs);
+		}
+
 		/* build the agi */
-		build_agi(mp, agno, &ino_btree_curs, &agi_stat);
+		build_agi(mp, agno, &ino_btree_curs, &fino_btree_curs,
+			  &agi_stat);
 
 		/*
 		 * tear down cursors
 		 */
 		finish_cursor(&bno_btree_curs);
 		finish_cursor(&ino_btree_curs);
+		if (xfs_sb_version_hasfinobt(&mp->m_sb))
+			finish_cursor(&fino_btree_curs);
 		finish_cursor(&bcnt_btree_curs);
 		/*
 		 * release the incore per-AG bno/bcnt trees so
-- 
1.8.1.4


* [PATCH v2 19/20] xfsprogs/growfs: report finobt status in fs geometry (xfs_info)
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (17 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 18/20] xfsprogs/repair: reconstruct the finobt in phase 5 Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  2013-11-13 15:56 ` [PATCH v2 20/20] xfsprogs/db: add finobt support to metadump Brian Foster
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Check and report on the free inode btree status bit in the fs
geometry.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 growfs/xfs_growfs.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/growfs/xfs_growfs.c b/growfs/xfs_growfs.c
index 2df68fb..87f689f 100644
--- a/growfs/xfs_growfs.c
+++ b/growfs/xfs_growfs.c
@@ -56,12 +56,13 @@ report_info(
 	int		projid32bit,
 	int		crcs_enabled,
 	int		cimode,
-	int		ftype_enabled)
+	int		ftype_enabled,
+	int		finobt_enabled)
 {
 	printf(_(
 	    "meta-data=%-22s isize=%-6u agcount=%u, agsize=%u blks\n"
 	    "         =%-22s sectsz=%-5u attr=%u, projid32bit=%u\n"
-	    "         =%-22s crc=%u\n"
+	    "         =%-22s crc=%-8u finobt=%u\n"
 	    "data     =%-22s bsize=%-6u blocks=%llu, imaxpct=%u\n"
 	    "         =%-22s sunit=%-6u swidth=%u blks\n"
 	    "naming   =version %-14u bsize=%-6u ascii-ci=%d ftype=%d\n"
@@ -71,7 +72,7 @@ report_info(
 
 		mntpoint, geo.inodesize, geo.agcount, geo.agblocks,
 		"", geo.sectsize, attrversion, projid32bit,
-		"", crcs_enabled,
+		"", crcs_enabled, finobt_enabled,
 		"", geo.blocksize, (unsigned long long)geo.datablocks,
 			geo.imaxpct,
 		"", geo.sunit, geo.swidth,
@@ -123,6 +124,7 @@ main(int argc, char **argv)
 	int			projid32bit;
 	int			crcs_enabled;
 	int			ftype_enabled = 0;
+	int			finobt_enabled;	/* free inode btree */
 
 	progname = basename(argv[0]);
 	setlocale(LC_ALL, "");
@@ -245,11 +247,12 @@ main(int argc, char **argv)
 	projid32bit = geo.flags & XFS_FSOP_GEOM_FLAGS_PROJID32 ? 1 : 0;
 	crcs_enabled = geo.flags & XFS_FSOP_GEOM_FLAGS_V5SB ? 1 : 0;
 	ftype_enabled = geo.flags & XFS_FSOP_GEOM_FLAGS_FTYPE ? 1 : 0;
+	finobt_enabled = geo.flags & XFS_FSOP_GEOM_FLAGS_FINOBT ? 1 : 0;
 	if (nflag) {
 		report_info(geo, datadev, isint, logdev, rtdev,
 				lazycount, dirversion, logversion,
 				attrversion, projid32bit, crcs_enabled, ci,
-				ftype_enabled);
+				ftype_enabled, finobt_enabled);
 		exit(0);
 	}
 
@@ -286,7 +289,8 @@ main(int argc, char **argv)
 
 	report_info(geo, datadev, isint, logdev, rtdev,
 			lazycount, dirversion, logversion,
-			attrversion, projid32bit, crcs_enabled, ci, ftype_enabled);
+			attrversion, projid32bit, crcs_enabled, ci, ftype_enabled,
+			finobt_enabled);
 
 	ddsize = xi.dsize;
 	dlsize = ( xi.logBBsize? xi.logBBsize :
-- 
1.8.1.4


* [PATCH v2 20/20] xfsprogs/db: add finobt support to metadump
  2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
                   ` (18 preceding siblings ...)
  2013-11-13 15:56 ` [PATCH v2 19/20] xfsprogs/growfs: report finobt status in fs geometry (xfs_info) Brian Foster
@ 2013-11-13 15:56 ` Brian Foster
  19 siblings, 0 replies; 21+ messages in thread
From: Brian Foster @ 2013-11-13 15:56 UTC (permalink / raw)
  To: xfs

Include the free inode btree in metadump images. If the source fs
is finobt-enabled, run an additional scan_btree() pass over the
finobt. scanfunc_ino() never uses its private 'agi' argument, so
replace that argument with a flag that indicates whether the current
scan is of the inobt or the finobt. For a finobt scan, copy only the
btree blocks and skip the inode chunks, since the inobt scan has
already copied them.
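
As a rough illustration of the approach described above (not the code
from this patch), the sketch below shows one callback serving two scans,
with a flag passed through the opaque private argument. Every demo_*
name, the leaf layout and the two counters are hypothetical stand-ins
and are not xfsprogs interfaces; the real scan_btree() walks on-disk
blocks rather than an in-memory array.

```c
#include <stdio.h>

/* Hypothetical stand-ins for illustration only; not xfsprogs types. */
struct demo_rec {
	unsigned int	startino;	/* first inode of a chunk */
};

struct demo_leaf {
	int		numrecs;
	struct demo_rec	recs[4];
};

typedef int (*demo_scan_fn)(const struct demo_leaf *leaf, void *arg);

static int blocks_copied;	/* btree blocks "written" to the dump */
static int chunks_copied;	/* inode chunks "written" to the dump */

/* One callback for both trees; *arg says which tree is being scanned. */
static int
demo_scanfunc_ino(const struct demo_leaf *leaf, void *arg)
{
	int	finobt = *(int *)arg;
	int	i;

	blocks_copied++;		/* the btree block is always copied */

	/* finobt records point at chunks the inobt scan already copied */
	if (finobt)
		return 1;

	for (i = 0; i < leaf->numrecs; i++)
		chunks_copied++;	/* stands in for copying one chunk */
	return 1;
}

/* Visit every leaf, stopping at the first callback failure. */
static int
demo_scan_btree(const struct demo_leaf *leaves, int nleaves, void *arg,
		demo_scan_fn fn)
{
	int	i;

	for (i = 0; i < nleaves; i++) {
		if (!fn(&leaves[i], arg))
			return 0;
	}
	return 1;
}

/* Scan the inobt first, then the finobt only if the feature is set. */
static int
demo_copy_inodes(const struct demo_leaf *inobt, int n_inobt,
		 const struct demo_leaf *finobt_leaves, int n_finobt,
		 int has_finobt)
{
	int	finobt = 0;

	if (!demo_scan_btree(inobt, n_inobt, &finobt, demo_scanfunc_ino))
		return 0;

	if (has_finobt) {
		finobt = 1;
		if (!demo_scan_btree(finobt_leaves, n_finobt, &finobt,
				     demo_scanfunc_ino))
			return 0;
	}
	return 1;
}
```

Passing the flag by pointer keeps the callback signature unchanged, so
the same scan helper can drive both passes; the caller only flips the
flag between them.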

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 db/metadump.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/db/metadump.c b/db/metadump.c
index 117dc42..bb52caf 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -1776,6 +1776,7 @@ scanfunc_ino(
 	xfs_inobt_ptr_t		*pp;
 	int			i;
 	int			numrecs;
+	int			finobt = *(int *) arg;
 
 	numrecs = be16_to_cpu(block->bb_numrecs);
 
@@ -1787,6 +1788,14 @@ scanfunc_ino(
 					typtab[btype].name, agno, agbno);
 			numrecs = mp->m_inobt_mxr[0];
 		}
+
+		/*
+		 * Only copy the btree blocks for the finobt. The inobt scan
+		 * copies the inode chunks.
+		 */
+		if (finobt)
+			return 1;
+
 		rp = XFS_INOBT_REC_ADDR(mp, block, 1);
 		for (i = 0; i < numrecs; i++, rp++) {
 			if (!copy_inode_chunk(agno, rp))
@@ -1826,6 +1835,7 @@ copy_inodes(
 {
 	xfs_agblock_t		root;
 	int			levels;
+	int			finobt = 0;
 
 	root = be32_to_cpu(agi->agi_root);
 	levels = be32_to_cpu(agi->agi_level);
@@ -1844,7 +1854,20 @@ copy_inodes(
 		return 1;
 	}
 
-	return scan_btree(agno, root, levels, TYP_INOBT, agi, scanfunc_ino);
+	if (!scan_btree(agno, root, levels, TYP_INOBT, &finobt, scanfunc_ino))
+		return 0;
+
+	if (xfs_sb_version_hasfinobt(&mp->m_sb)) {
+		root = be32_to_cpu(agi->agi_free_root);
+		levels = be32_to_cpu(agi->agi_free_level);
+
+		finobt = 1;
+		if (!scan_btree(agno, root, levels, TYP_INOBT, &finobt,
+				scanfunc_ino))
+			return 0;
+	}
+
+	return 1;
 }
 
 static int
-- 
1.8.1.4


Thread overview: 21+ messages
2013-11-13 15:56 [PATCH v2 00/20] xfsprogs: introduce the free inode btree Brian Foster
2013-11-13 15:56 ` [PATCH v2 01/20] xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers Brian Foster
2013-11-13 15:56 ` [PATCH v2 02/20] xfs: reserve v5 superblock read-only compat. feature bit for finobt Brian Foster
2013-11-13 15:56 ` [PATCH v2 03/20] xfs: support the XFS_BTNUM_FINOBT free inode btree type Brian Foster
2013-11-13 15:56 ` [PATCH v2 04/20] xfs: update inode allocation/free transaction reservations for finobt Brian Foster
2013-11-13 15:56 ` [PATCH v2 05/20] xfs: insert newly allocated inode chunks into the finobt Brian Foster
2013-11-13 15:56 ` [PATCH v2 06/20] xfs: use and update the finobt on inode allocation Brian Foster
2013-11-13 15:56 ` [PATCH v2 07/20] xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helper Brian Foster
2013-11-13 15:56 ` [PATCH v2 08/20] xfs: update the finobt on inode free Brian Foster
2013-11-13 15:56 ` [PATCH v2 09/20] xfs: report finobt status in fs geometry Brian Foster
2013-11-13 15:56 ` [PATCH v2 10/20] xfs: enable the finobt feature on v5 superblocks Brian Foster
2013-11-13 15:56 ` [PATCH v2 11/20] xfsprogs/mkfs: finobt mkfs support Brian Foster
2013-11-13 15:56 ` [PATCH v2 12/20] xfsprogs/db: finobt support Brian Foster
2013-11-13 15:56 ` [PATCH v2 13/20] xfsprogs/repair: account for finobt in ag 0 geometry pre-calculation Brian Foster
2013-11-13 15:56 ` [PATCH v2 14/20] xfsprogs/repair: phase 2 finobt scan Brian Foster
2013-11-13 15:56 ` [PATCH v2 15/20] xfsprogs/repair: pass btree block magic as param to build_ino_tree() Brian Foster
2013-11-13 15:56 ` [PATCH v2 16/20] xfsprogs/repair: pull the build_agi() call up out of the inode tree build Brian Foster
2013-11-13 15:56 ` [PATCH v2 17/20] xfsprogs/repair: helpers for finding in-core inode records w/ free inodes Brian Foster
2013-11-13 15:56 ` [PATCH v2 18/20] xfsprogs/repair: reconstruct the finobt in phase 5 Brian Foster
2013-11-13 15:56 ` [PATCH v2 19/20] xfsprogs/growfs: report finobt status in fs geometry (xfs_info) Brian Foster
2013-11-13 15:56 ` [PATCH v2 20/20] xfsprogs/db: add finobt support to metadump Brian Foster
