* [PATCH v7 00/13] xfs: preparing for online scrub support
@ 2017-06-02 21:24 Darrick J. Wong
  2017-06-02 21:24 ` [PATCH 01/13] xfs: optimize _btree_query_all Darrick J. Wong
                   ` (14 more replies)
  0 siblings, 15 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC
  To: darrick.wong; +Cc: linux-xfs

Hi all,

This is the seventh revision of a patchset that adds kernel support for
online metadata scrubbing and repair to XFS.  There aren't any
on-disk format changes.  Changes since v6 include refactoring the scrub
setup code, fixing a deadlock problem in the xattr scrubber, and
strengthening the cross-referencing checks.  I have been performing
weekly online scrubs of my XFS filesystems for several months now, with
surprisingly few problems.

Online scrub/repair support consists of four major pieces -- first, an
ioctl that maps physical extents to their owners (GETFSMAP; already in
4.12); second, various in-kernel metadata scrubbing ioctls to examine
metadata records and cross-reference them with other filesystem
metadata; third, an in-kernel mechanism for rebuilding damaged metadata
objects and btrees; and fourth, a userspace component to coordinate
scrubbing and repair operations.

This new utility, xfs_scrub, is separate from the existing offline
xfs_repair tool.  The program uses various XFS ioctls to iterate over
all XFS metadata, asking the kernel to check each piece of metadata and
repair it if necessary.

Per reviewer request, the v7 patch series has been broken into multiple
smaller series -- the first one makes all the libxfs changes necessary
to support scrub and the second series adds the scrub functionality.  A
similar split will be applied to the cross-referencing checks and the
repair functions the next time they are posted.

If you're going to start using this mess, you probably ought to just
pull from my git trees.  The kernel patches[1] should apply against
4.12-rc3.  xfsprogs[2] and xfstests[3] can be found in their usual
places.  The git trees contain all four series' worth of changes.

This is an extraordinary way to eat your data.  Enjoy!
Comments and questions are, as always, welcome.

--D

[1] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=djwong-devel
[2] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=djwong-devel
[3] https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=djwong-devel


* [PATCH 01/13] xfs: optimize _btree_query_all
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-06 13:32   ` Brian Foster
  2017-06-07  1:18   ` [PATCH v2 " Darrick J. Wong
  2017-06-02 21:24 ` [PATCH 02/13] xfs: remove double-underscore integer types Darrick J. Wong
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Don't bother wandering our way through the leaf nodes when the caller
issues a query_all; just zoom down the left side of the tree and walk
rightwards along level zero.
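
For illustration only, here is a minimal userspace sketch of the same
idea: descend the left edge of a tree, then follow right-sibling links
across the leaf level, calling a callback on every record.  The struct
node layout, query_all(), visit() and QUERY_ABORT are hypothetical
stand-ins invented for this sketch; they are not the XFS btree cursor
API, which the diff below uses instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory node; level 0 means leaf. */
struct node {
	int		level;
	int		nrecs;		/* records in a leaf */
	int		recs[4];	/* leaf records, in key order */
	struct node	*child0;	/* leftmost child of an interior node */
	struct node	*right;		/* right sibling on the same level */
};

/* Plays the role that XFS_BTREE_QUERY_RANGE_ABORT plays in the patch. */
#define QUERY_ABORT	1

typedef int (*query_fn)(int rec, void *priv);

/* Visit every record in key order without re-descending the tree. */
static int
query_all(struct node *root, query_fn fn, void *priv)
{
	struct node	*n = root;
	int		error;
	int		i;

	assert(fn != NULL);

	/* Zoom down the left side of the tree. */
	while (n && n->level > 0)
		n = n->child0;

	/* Walk rightwards along level zero. */
	for (; n; n = n->right) {
		for (i = 0; i < n->nrecs; i++) {
			error = fn(n->recs[i], priv);
			if (error)
				return error;	/* failure or abort request */
		}
	}
	return 0;
}

/* Example callback state: count and sum records, optionally stop early. */
struct visit_state {
	int	count;
	int	sum;
	int	limit;		/* 0 means no limit */
};

static int
visit(int rec, void *priv)
{
	struct visit_state	*s = priv;

	if (s->limit && s->count == s->limit)
		return QUERY_ABORT;
	s->count++;
	s->sum += rec;
	return 0;
}
```

The sketch assumes every leaf carries a valid right-sibling pointer,
which is what lets the walk skip the interior levels after the first
descent.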

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_btree.c |   44 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 5 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 3a673ba..07d75bc 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -4849,12 +4849,46 @@ xfs_btree_query_all(
 	xfs_btree_query_range_fn	fn,
 	void				*priv)
 {
-	union xfs_btree_irec		low_rec;
-	union xfs_btree_irec		high_rec;
+	union xfs_btree_rec		*recp;
+	int				stat;
+	int				error;
+
+	/*
+	 * Find the leftmost record.  The btree cursor must be set
+	 * to the low record used to generate low_key.
+	 */
+	memset(&cur->bc_rec, 0, sizeof(cur->bc_rec));
+	stat = 0;
+	error = xfs_btree_lookup(cur, XFS_LOOKUP_LE, &stat);
+	if (error)
+		goto out;
+
+	/* Nothing?  See if there's anything to the right. */
+	if (!stat) {
+		error = xfs_btree_increment(cur, 0, &stat);
+		if (error)
+			goto out;
+	}
 
-	memset(&low_rec, 0, sizeof(low_rec));
-	memset(&high_rec, 0xFF, sizeof(high_rec));
-	return xfs_btree_query_range(cur, &low_rec, &high_rec, fn, priv);
+	while (stat) {
+		/* Find the record. */
+		error = xfs_btree_get_rec(cur, &recp, &stat);
+		if (error || !stat)
+			break;
+
+		/* Callback */
+		error = fn(cur, recp, priv);
+		if (error < 0 || error == XFS_BTREE_QUERY_RANGE_ABORT)
+			break;
+
+		/* Move on to the next record. */
+		error = xfs_btree_increment(cur, 0, &stat);
+		if (error)
+			break;
+	}
+
+out:
+	return error;
 }
 
 /*



* [PATCH 02/13] xfs: remove double-underscore integer types
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
  2017-06-02 21:24 ` [PATCH 01/13] xfs: optimize _btree_query_all Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-02 21:24 ` [PATCH 03/13] xfs: always compile the btree inorder check functions Darrick J. Wong
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC
  To: darrick.wong; +Cc: linux-xfs, Christoph Hellwig

From: Darrick J. Wong <darrick.wong@oracle.com>

This is a purely mechanical patch that removes the private
__{u,}int{8,16,32,64}_t typedefs in favor of using the system
{u,}int{8,16,32,64}_t typedefs.  This is the sed script used to perform
the transformation and fix the resulting whitespace and indentation
errors:

s/typedef\t__uint8_t/typedef __uint8_t\t/g
s/typedef\t__uint/typedef __uint/g
s/typedef\t__int\([0-9]*\)_t/typedef int\1_t\t/g
s/__uint8_t\t/__uint8_t\t\t/g
s/__uint/uint/g
s/__int\([0-9]*\)_t\t/__int\1_t\t\t/g
s/__int/int/g
/^typedef.*int[0-9]*_t;$/d
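
As a rough illustration of why such a rename can be purely mechanical,
the sketch below defines a few hypothetical aliases (old_uint8_t,
old_uint64_t, old_int64_t) standing in for private fixed-width
typedefs, checks at compile time that they match the <stdint.h> types
in width and signedness, and writes the arithmetic of xfs_mask64lo()
once with each spelling.  The alias names and their underlying types
are assumptions made for this example; they are not copied from the
XFS headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the private typedefs being removed. */
typedef unsigned char		old_uint8_t;
typedef unsigned long long	old_uint64_t;
typedef long long		old_int64_t;

/* Same width ... */
_Static_assert(sizeof(old_uint8_t) == sizeof(uint8_t), "8-bit width");
_Static_assert(sizeof(old_uint64_t) == sizeof(uint64_t), "64-bit width");
_Static_assert(sizeof(old_int64_t) == sizeof(int64_t), "64-bit width");

/* ... and same signedness, so casts and comparisons behave alike. */
_Static_assert((old_uint64_t)-1 > 0 && (uint64_t)-1 > 0, "unsigned");
_Static_assert((old_int64_t)-1 < 0 && (int64_t)-1 < 0, "signed");

/* Mask with the n low bits set (0 <= n < 64), old spelling. */
static old_uint64_t
mask64lo_old(int n)
{
	return ((old_uint64_t)1 << n) - 1;
}

/* The same function after the rename. */
static uint64_t
mask64lo_new(int n)
{
	return ((uint64_t)1 << n) - 1;
}
```

If the static assertions hold, swapping one spelling for the other
leaves the generated arithmetic unchanged, so only whitespace and
alignment need fixing up afterwards.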

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_alloc_btree.c    |   20 +--
 fs/xfs/libxfs/xfs_attr_remote.c    |    8 +
 fs/xfs/libxfs/xfs_attr_sf.h        |   10 +
 fs/xfs/libxfs/xfs_bit.h            |   24 ++-
 fs/xfs/libxfs/xfs_bmap_btree.c     |    8 +
 fs/xfs/libxfs/xfs_btree.c          |   22 ++-
 fs/xfs/libxfs/xfs_btree.h          |   18 +--
 fs/xfs/libxfs/xfs_cksum.h          |   16 +-
 fs/xfs/libxfs/xfs_da_btree.c       |    2 
 fs/xfs/libxfs/xfs_da_btree.h       |    8 +
 fs/xfs/libxfs/xfs_da_format.c      |   28 ++--
 fs/xfs/libxfs/xfs_da_format.h      |   64 +++++----
 fs/xfs/libxfs/xfs_dir2.h           |    8 +
 fs/xfs/libxfs/xfs_dir2_leaf.c      |   12 +-
 fs/xfs/libxfs/xfs_dir2_priv.h      |    2 
 fs/xfs/libxfs/xfs_dir2_sf.c        |    2 
 fs/xfs/libxfs/xfs_format.h         |  112 ++++++++--------
 fs/xfs/libxfs/xfs_fs.h             |   12 +-
 fs/xfs/libxfs/xfs_ialloc.c         |    6 -
 fs/xfs/libxfs/xfs_ialloc_btree.c   |    4 -
 fs/xfs/libxfs/xfs_inode_buf.c      |    2 
 fs/xfs/libxfs/xfs_inode_buf.h      |   28 ++--
 fs/xfs/libxfs/xfs_log_format.h     |  256 ++++++++++++++++++------------------
 fs/xfs/libxfs/xfs_log_recover.h    |    2 
 fs/xfs/libxfs/xfs_quota_defs.h     |    4 -
 fs/xfs/libxfs/xfs_refcount_btree.c |    8 +
 fs/xfs/libxfs/xfs_rmap.c           |    8 +
 fs/xfs/libxfs/xfs_rmap.h           |    8 +
 fs/xfs/libxfs/xfs_rmap_btree.c     |   30 ++--
 fs/xfs/libxfs/xfs_rtbitmap.c       |    2 
 fs/xfs/libxfs/xfs_sb.c             |    4 -
 fs/xfs/libxfs/xfs_types.h          |   46 +++---
 fs/xfs/xfs_aops.c                  |    4 -
 fs/xfs/xfs_attr_list.c             |    2 
 fs/xfs/xfs_bmap_util.c             |   24 ++-
 fs/xfs/xfs_buf.c                   |    2 
 fs/xfs/xfs_dir2_readdir.c          |    8 +
 fs/xfs/xfs_discard.c               |    4 -
 fs/xfs/xfs_dquot.c                 |    2 
 fs/xfs/xfs_fsops.c                 |   16 +-
 fs/xfs/xfs_fsops.h                 |    4 -
 fs/xfs/xfs_inode.c                 |    6 -
 fs/xfs/xfs_inode.h                 |    4 -
 fs/xfs/xfs_ioctl.c                 |   20 +--
 fs/xfs/xfs_ioctl.h                 |   10 +
 fs/xfs/xfs_ioctl32.h               |    6 -
 fs/xfs/xfs_linux.h                 |   20 +--
 fs/xfs/xfs_log.c                   |   20 +--
 fs/xfs/xfs_log.h                   |    2 
 fs/xfs/xfs_log_priv.h              |    2 
 fs/xfs/xfs_log_recover.c           |   28 ++--
 fs/xfs/xfs_mount.c                 |   16 +-
 fs/xfs/xfs_mount.h                 |   34 ++---
 fs/xfs/xfs_qm_bhv.c                |    2 
 fs/xfs/xfs_rtalloc.c               |    8 +
 fs/xfs/xfs_stats.c                 |    8 +
 fs/xfs/xfs_stats.h                 |  190 +++++++++++++--------------
 fs/xfs/xfs_super.c                 |   26 ++--
 fs/xfs/xfs_trace.h                 |   20 +--
 fs/xfs/xfs_trans.h                 |    2 
 fs/xfs/xfs_trans_rmap.c            |    2 
 61 files changed, 634 insertions(+), 642 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
index e1fcfe7..5020cbc 100644
--- a/fs/xfs/libxfs/xfs_alloc_btree.c
+++ b/fs/xfs/libxfs/xfs_alloc_btree.c
@@ -253,7 +253,7 @@ xfs_allocbt_init_ptr_from_cur(
 	ptr->s = agf->agf_roots[cur->bc_btnum];
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_bnobt_key_diff(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*key)
@@ -261,42 +261,42 @@ xfs_bnobt_key_diff(
 	xfs_alloc_rec_incore_t	*rec = &cur->bc_rec.a;
 	xfs_alloc_key_t		*kp = &key->alloc;
 
-	return (__int64_t)be32_to_cpu(kp->ar_startblock) - rec->ar_startblock;
+	return (int64_t)be32_to_cpu(kp->ar_startblock) - rec->ar_startblock;
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_cntbt_key_diff(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*key)
 {
 	xfs_alloc_rec_incore_t	*rec = &cur->bc_rec.a;
 	xfs_alloc_key_t		*kp = &key->alloc;
-	__int64_t		diff;
+	int64_t			diff;
 
-	diff = (__int64_t)be32_to_cpu(kp->ar_blockcount) - rec->ar_blockcount;
+	diff = (int64_t)be32_to_cpu(kp->ar_blockcount) - rec->ar_blockcount;
 	if (diff)
 		return diff;
 
-	return (__int64_t)be32_to_cpu(kp->ar_startblock) - rec->ar_startblock;
+	return (int64_t)be32_to_cpu(kp->ar_startblock) - rec->ar_startblock;
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_bnobt_diff_two_keys(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*k1,
 	union xfs_btree_key	*k2)
 {
-	return (__int64_t)be32_to_cpu(k1->alloc.ar_startblock) -
+	return (int64_t)be32_to_cpu(k1->alloc.ar_startblock) -
 			  be32_to_cpu(k2->alloc.ar_startblock);
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_cntbt_diff_two_keys(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*k1,
 	union xfs_btree_key	*k2)
 {
-	__int64_t		diff;
+	int64_t			diff;
 
 	diff =  be32_to_cpu(k1->alloc.ar_blockcount) -
 		be32_to_cpu(k2->alloc.ar_blockcount);
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index d52f525..da72b16 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -253,7 +253,7 @@ xfs_attr_rmtval_copyout(
 	xfs_ino_t	ino,
 	int		*offset,
 	int		*valuelen,
-	__uint8_t	**dst)
+	uint8_t		**dst)
 {
 	char		*src = bp->b_addr;
 	xfs_daddr_t	bno = bp->b_bn;
@@ -301,7 +301,7 @@ xfs_attr_rmtval_copyin(
 	xfs_ino_t	ino,
 	int		*offset,
 	int		*valuelen,
-	__uint8_t	**src)
+	uint8_t		**src)
 {
 	char		*dst = bp->b_addr;
 	xfs_daddr_t	bno = bp->b_bn;
@@ -355,7 +355,7 @@ xfs_attr_rmtval_get(
 	struct xfs_mount	*mp = args->dp->i_mount;
 	struct xfs_buf		*bp;
 	xfs_dablk_t		lblkno = args->rmtblkno;
-	__uint8_t		*dst = args->value;
+	uint8_t			*dst = args->value;
 	int			valuelen;
 	int			nmap;
 	int			error;
@@ -421,7 +421,7 @@ xfs_attr_rmtval_set(
 	struct xfs_bmbt_irec	map;
 	xfs_dablk_t		lblkno;
 	xfs_fileoff_t		lfileoff = 0;
-	__uint8_t		*src = args->value;
+	uint8_t			*src = args->value;
 	int			blkcnt;
 	int			valuelen;
 	int			nmap;
diff --git a/fs/xfs/libxfs/xfs_attr_sf.h b/fs/xfs/libxfs/xfs_attr_sf.h
index 90928bb..afd684a 100644
--- a/fs/xfs/libxfs/xfs_attr_sf.h
+++ b/fs/xfs/libxfs/xfs_attr_sf.h
@@ -31,10 +31,10 @@ typedef struct xfs_attr_sf_entry xfs_attr_sf_entry_t;
  * We generate this then sort it, attr_list() must return things in hash-order.
  */
 typedef struct xfs_attr_sf_sort {
-	__uint8_t	entno;		/* entry number in original list */
-	__uint8_t	namelen;	/* length of name value (no null) */
-	__uint8_t	valuelen;	/* length of value */
-	__uint8_t	flags;		/* flags bits (see xfs_attr_leaf.h) */
+	uint8_t		entno;		/* entry number in original list */
+	uint8_t		namelen;	/* length of name value (no null) */
+	uint8_t		valuelen;	/* length of value */
+	uint8_t		flags;		/* flags bits (see xfs_attr_leaf.h) */
 	xfs_dahash_t	hash;		/* this entry's hash value */
 	unsigned char	*name;		/* name value, pointer into buffer */
 } xfs_attr_sf_sort_t;
@@ -42,7 +42,7 @@ typedef struct xfs_attr_sf_sort {
 #define XFS_ATTR_SF_ENTSIZE_BYNAME(nlen,vlen)	/* space name/value uses */ \
 	(((int)sizeof(xfs_attr_sf_entry_t)-1 + (nlen)+(vlen)))
 #define XFS_ATTR_SF_ENTSIZE_MAX			/* max space for name&value */ \
-	((1 << (NBBY*(int)sizeof(__uint8_t))) - 1)
+	((1 << (NBBY*(int)sizeof(uint8_t))) - 1)
 #define XFS_ATTR_SF_ENTSIZE(sfep)		/* space an entry uses */ \
 	((int)sizeof(xfs_attr_sf_entry_t)-1 + (sfep)->namelen+(sfep)->valuelen)
 #define XFS_ATTR_SF_NEXTENTRY(sfep)		/* next entry in struct */ \
diff --git a/fs/xfs/libxfs/xfs_bit.h b/fs/xfs/libxfs/xfs_bit.h
index e1649c0..61c6b20 100644
--- a/fs/xfs/libxfs/xfs_bit.h
+++ b/fs/xfs/libxfs/xfs_bit.h
@@ -25,47 +25,47 @@
 /*
  * masks with n high/low bits set, 64-bit values
  */
-static inline __uint64_t xfs_mask64hi(int n)
+static inline uint64_t xfs_mask64hi(int n)
 {
-	return (__uint64_t)-1 << (64 - (n));
+	return (uint64_t)-1 << (64 - (n));
 }
-static inline __uint32_t xfs_mask32lo(int n)
+static inline uint32_t xfs_mask32lo(int n)
 {
-	return ((__uint32_t)1 << (n)) - 1;
+	return ((uint32_t)1 << (n)) - 1;
 }
-static inline __uint64_t xfs_mask64lo(int n)
+static inline uint64_t xfs_mask64lo(int n)
 {
-	return ((__uint64_t)1 << (n)) - 1;
+	return ((uint64_t)1 << (n)) - 1;
 }
 
 /* Get high bit set out of 32-bit argument, -1 if none set */
-static inline int xfs_highbit32(__uint32_t v)
+static inline int xfs_highbit32(uint32_t v)
 {
 	return fls(v) - 1;
 }
 
 /* Get high bit set out of 64-bit argument, -1 if none set */
-static inline int xfs_highbit64(__uint64_t v)
+static inline int xfs_highbit64(uint64_t v)
 {
 	return fls64(v) - 1;
 }
 
 /* Get low bit set out of 32-bit argument, -1 if none set */
-static inline int xfs_lowbit32(__uint32_t v)
+static inline int xfs_lowbit32(uint32_t v)
 {
 	return ffs(v) - 1;
 }
 
 /* Get low bit set out of 64-bit argument, -1 if none set */
-static inline int xfs_lowbit64(__uint64_t v)
+static inline int xfs_lowbit64(uint64_t v)
 {
-	__uint32_t	w = (__uint32_t)v;
+	uint32_t	w = (uint32_t)v;
 	int		n = 0;
 
 	if (w) {	/* lower bits */
 		n = ffs(w);
 	} else {	/* upper bits */
-		w = (__uint32_t)(v >> 32);
+		w = (uint32_t)(v >> 32);
 		if (w) {
 			n = ffs(w);
 			if (n)
diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
index 6cba69a..5e2b3dc 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.c
+++ b/fs/xfs/libxfs/xfs_bmap_btree.c
@@ -94,8 +94,8 @@ xfs_bmdr_to_bmbt(
  */
 STATIC void
 __xfs_bmbt_get_all(
-		__uint64_t l0,
-		__uint64_t l1,
+		uint64_t l0,
+		uint64_t l1,
 		xfs_bmbt_irec_t *s)
 {
 	int	ext_flag;
@@ -588,12 +588,12 @@ xfs_bmbt_init_ptr_from_cur(
 	ptr->l = 0;
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_bmbt_key_diff(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*key)
 {
-	return (__int64_t)be64_to_cpu(key->bmbt.br_startoff) -
+	return (int64_t)be64_to_cpu(key->bmbt.br_startoff) -
 				      cur->bc_rec.b.br_startoff;
 }
 
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 07d75bc..302dd4c 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -43,7 +43,7 @@ kmem_zone_t	*xfs_btree_cur_zone;
 /*
  * Btree magic numbers.
  */
-static const __uint32_t xfs_magics[2][XFS_BTNUM_MAX] = {
+static const uint32_t xfs_magics[2][XFS_BTNUM_MAX] = {
 	{ XFS_ABTB_MAGIC, XFS_ABTC_MAGIC, 0, XFS_BMAP_MAGIC, XFS_IBT_MAGIC,
 	  XFS_FIBT_MAGIC, 0 },
 	{ XFS_ABTB_CRC_MAGIC, XFS_ABTC_CRC_MAGIC, XFS_RMAP_CRC_MAGIC,
@@ -51,12 +51,12 @@ static const __uint32_t xfs_magics[2][XFS_BTNUM_MAX] = {
 	  XFS_REFC_CRC_MAGIC }
 };
 
-__uint32_t
+uint32_t
 xfs_btree_magic(
 	int			crc,
 	xfs_btnum_t		btnum)
 {
-	__uint32_t		magic = xfs_magics[crc][btnum];
+	uint32_t		magic = xfs_magics[crc][btnum];
 
 	/* Ensure we asked for crc for crc-only magics. */
 	ASSERT(magic != 0);
@@ -778,14 +778,14 @@ xfs_btree_lastrec(
  */
 void
 xfs_btree_offsets(
-	__int64_t	fields,		/* bitmask of fields */
+	int64_t		fields,		/* bitmask of fields */
 	const short	*offsets,	/* table of field offsets */
 	int		nbits,		/* number of bits to inspect */
 	int		*first,		/* output: first byte offset */
 	int		*last)		/* output: last byte offset */
 {
 	int		i;		/* current bit number */
-	__int64_t	imask;		/* mask for current bit number */
+	int64_t		imask;		/* mask for current bit number */
 
 	ASSERT(fields != 0);
 	/*
@@ -1846,7 +1846,7 @@ xfs_btree_lookup(
 	int			*stat)	/* success/failure */
 {
 	struct xfs_btree_block	*block;	/* current btree block */
-	__int64_t		diff;	/* difference for the current key */
+	int64_t			diff;	/* difference for the current key */
 	int			error;	/* error return value */
 	int			keyno;	/* current key number */
 	int			level;	/* level in the btree */
@@ -4435,7 +4435,7 @@ xfs_btree_visit_blocks(
  * recovery completion writes the changes to disk.
  */
 struct xfs_btree_block_change_owner_info {
-	__uint64_t		new_owner;
+	uint64_t		new_owner;
 	struct list_head	*buffer_list;
 };
 
@@ -4481,7 +4481,7 @@ xfs_btree_block_change_owner(
 int
 xfs_btree_change_owner(
 	struct xfs_btree_cur	*cur,
-	__uint64_t		new_owner,
+	uint64_t		new_owner,
 	struct list_head	*buffer_list)
 {
 	struct xfs_btree_block_change_owner_info	bbcoi;
@@ -4585,7 +4585,7 @@ xfs_btree_simple_query_range(
 {
 	union xfs_btree_rec		*recp;
 	union xfs_btree_key		rec_key;
-	__int64_t			diff;
+	int64_t				diff;
 	int				stat;
 	bool				firstrec = true;
 	int				error;
@@ -4682,8 +4682,8 @@ xfs_btree_overlapped_query_range(
 	union xfs_btree_key		*hkp;
 	union xfs_btree_rec		*recp;
 	struct xfs_btree_block		*block;
-	__int64_t			ldiff;
-	__int64_t			hdiff;
+	int64_t				ldiff;
+	int64_t				hdiff;
 	int				level;
 	struct xfs_buf			*bp;
 	int				i;
diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h
index 27bed08..0a931f6 100644
--- a/fs/xfs/libxfs/xfs_btree.h
+++ b/fs/xfs/libxfs/xfs_btree.h
@@ -76,7 +76,7 @@ union xfs_btree_rec {
 #define	XFS_BTNUM_RMAP	((xfs_btnum_t)XFS_BTNUM_RMAPi)
 #define	XFS_BTNUM_REFC	((xfs_btnum_t)XFS_BTNUM_REFCi)
 
-__uint32_t xfs_btree_magic(int crc, xfs_btnum_t btnum);
+uint32_t xfs_btree_magic(int crc, xfs_btnum_t btnum);
 
 /*
  * For logging record fields.
@@ -150,14 +150,14 @@ struct xfs_btree_ops {
 					  union xfs_btree_rec *rec);
 
 	/* difference between key value and cursor value */
-	__int64_t (*key_diff)(struct xfs_btree_cur *cur,
+	int64_t (*key_diff)(struct xfs_btree_cur *cur,
 			      union xfs_btree_key *key);
 
 	/*
 	 * Difference between key2 and key1 -- positive if key1 > key2,
 	 * negative if key1 < key2, and zero if equal.
 	 */
-	__int64_t (*diff_two_keys)(struct xfs_btree_cur *cur,
+	int64_t (*diff_two_keys)(struct xfs_btree_cur *cur,
 				   union xfs_btree_key *key1,
 				   union xfs_btree_key *key2);
 
@@ -213,11 +213,11 @@ typedef struct xfs_btree_cur
 	union xfs_btree_irec	bc_rec;	/* current insert/search record value */
 	struct xfs_buf	*bc_bufs[XFS_BTREE_MAXLEVELS];	/* buf ptr per level */
 	int		bc_ptrs[XFS_BTREE_MAXLEVELS];	/* key/record # */
-	__uint8_t	bc_ra[XFS_BTREE_MAXLEVELS];	/* readahead bits */
+	uint8_t		bc_ra[XFS_BTREE_MAXLEVELS];	/* readahead bits */
 #define	XFS_BTCUR_LEFTRA	1	/* left sibling has been read-ahead */
 #define	XFS_BTCUR_RIGHTRA	2	/* right sibling has been read-ahead */
-	__uint8_t	bc_nlevels;	/* number of levels in the tree */
-	__uint8_t	bc_blocklog;	/* log2(blocksize) of btree blocks */
+	uint8_t		bc_nlevels;	/* number of levels in the tree */
+	uint8_t		bc_blocklog;	/* log2(blocksize) of btree blocks */
 	xfs_btnum_t	bc_btnum;	/* identifies which btree type */
 	int		bc_statoff;	/* offset of btre stats array */
 	union {
@@ -330,7 +330,7 @@ xfs_btree_islastblock(
  */
 void
 xfs_btree_offsets(
-	__int64_t		fields,	/* bitmask of fields */
+	int64_t			fields,	/* bitmask of fields */
 	const short		*offsets,/* table of field offsets */
 	int			nbits,	/* number of bits to inspect */
 	int			*first,	/* output: first byte offset */
@@ -408,7 +408,7 @@ int xfs_btree_new_iroot(struct xfs_btree_cur *, int *, int *);
 int xfs_btree_insert(struct xfs_btree_cur *, int *);
 int xfs_btree_delete(struct xfs_btree_cur *, int *);
 int xfs_btree_get_rec(struct xfs_btree_cur *, union xfs_btree_rec **, int *);
-int xfs_btree_change_owner(struct xfs_btree_cur *cur, __uint64_t new_owner,
+int xfs_btree_change_owner(struct xfs_btree_cur *cur, uint64_t new_owner,
 			   struct list_head *buffer_list);
 
 /*
@@ -434,7 +434,7 @@ static inline int xfs_btree_get_numrecs(struct xfs_btree_block *block)
 }
 
 static inline void xfs_btree_set_numrecs(struct xfs_btree_block *block,
-		__uint16_t numrecs)
+		uint16_t numrecs)
 {
 	block->bb_numrecs = cpu_to_be16(numrecs);
 }
diff --git a/fs/xfs/libxfs/xfs_cksum.h b/fs/xfs/libxfs/xfs_cksum.h
index a416c7c..8211f48 100644
--- a/fs/xfs/libxfs/xfs_cksum.h
+++ b/fs/xfs/libxfs/xfs_cksum.h
@@ -1,7 +1,7 @@
 #ifndef _XFS_CKSUM_H
 #define _XFS_CKSUM_H 1
 
-#define XFS_CRC_SEED	(~(__uint32_t)0)
+#define XFS_CRC_SEED	(~(uint32_t)0)
 
 /*
  * Calculate the intermediate checksum for a buffer that has the CRC field
@@ -9,11 +9,11 @@
  * cksum_offset parameter. We do not modify the buffer during verification,
  * hence we have to split the CRC calculation across the cksum_offset.
  */
-static inline __uint32_t
+static inline uint32_t
 xfs_start_cksum_safe(char *buffer, size_t length, unsigned long cksum_offset)
 {
-	__uint32_t zero = 0;
-	__uint32_t crc;
+	uint32_t zero = 0;
+	uint32_t crc;
 
 	/* Calculate CRC up to the checksum. */
 	crc = crc32c(XFS_CRC_SEED, buffer, cksum_offset);
@@ -30,7 +30,7 @@ xfs_start_cksum_safe(char *buffer, size_t length, unsigned long cksum_offset)
  * Fast CRC method where the buffer is modified. Callers must have exclusive
  * access to the buffer while the calculation takes place.
  */
-static inline __uint32_t
+static inline uint32_t
 xfs_start_cksum_update(char *buffer, size_t length, unsigned long cksum_offset)
 {
 	/* zero the CRC field */
@@ -48,7 +48,7 @@ xfs_start_cksum_update(char *buffer, size_t length, unsigned long cksum_offset)
  * so that it is consistent on disk.
  */
 static inline __le32
-xfs_end_cksum(__uint32_t crc)
+xfs_end_cksum(uint32_t crc)
 {
 	return ~cpu_to_le32(crc);
 }
@@ -62,7 +62,7 @@ xfs_end_cksum(__uint32_t crc)
 static inline void
 xfs_update_cksum(char *buffer, size_t length, unsigned long cksum_offset)
 {
-	__uint32_t crc = xfs_start_cksum_update(buffer, length, cksum_offset);
+	uint32_t crc = xfs_start_cksum_update(buffer, length, cksum_offset);
 
 	*(__le32 *)(buffer + cksum_offset) = xfs_end_cksum(crc);
 }
@@ -73,7 +73,7 @@ xfs_update_cksum(char *buffer, size_t length, unsigned long cksum_offset)
 static inline int
 xfs_verify_cksum(char *buffer, size_t length, unsigned long cksum_offset)
 {
-	__uint32_t crc = xfs_start_cksum_safe(buffer, length, cksum_offset);
+	uint32_t crc = xfs_start_cksum_safe(buffer, length, cksum_offset);
 
 	return *(__le32 *)(buffer + cksum_offset) == xfs_end_cksum(crc);
 }
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 1bdf288..48f1136 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -1952,7 +1952,7 @@ xfs_da3_path_shift(
  * This is implemented with some source-level loop unrolling.
  */
 xfs_dahash_t
-xfs_da_hashname(const __uint8_t *name, int namelen)
+xfs_da_hashname(const uint8_t *name, int namelen)
 {
 	xfs_dahash_t hash;
 
diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h
index 4e29cb6..ae6de17 100644
--- a/fs/xfs/libxfs/xfs_da_btree.h
+++ b/fs/xfs/libxfs/xfs_da_btree.h
@@ -60,10 +60,10 @@ enum xfs_dacmp {
  */
 typedef struct xfs_da_args {
 	struct xfs_da_geometry *geo;	/* da block geometry */
-	const __uint8_t	*name;		/* string (maybe not NULL terminated) */
+	const uint8_t		*name;		/* string (maybe not NULL terminated) */
 	int		namelen;	/* length of string (maybe no NULL) */
-	__uint8_t	filetype;	/* filetype of inode for directories */
-	__uint8_t	*value;		/* set of bytes (maybe contain NULLs) */
+	uint8_t		filetype;	/* filetype of inode for directories */
+	uint8_t		*value;		/* set of bytes (maybe contain NULLs) */
 	int		valuelen;	/* length of value */
 	int		flags;		/* argument flags (eg: ATTR_NOCREATE) */
 	xfs_dahash_t	hashval;	/* hash value of name */
@@ -207,7 +207,7 @@ int	xfs_da_reada_buf(struct xfs_inode *dp, xfs_dablk_t bno,
 int	xfs_da_shrink_inode(xfs_da_args_t *args, xfs_dablk_t dead_blkno,
 					  struct xfs_buf *dead_buf);
 
-uint xfs_da_hashname(const __uint8_t *name_string, int name_length);
+uint xfs_da_hashname(const uint8_t *name_string, int name_length);
 enum xfs_dacmp xfs_da_compname(struct xfs_da_args *args,
 				const unsigned char *name, int len);
 
diff --git a/fs/xfs/libxfs/xfs_da_format.c b/fs/xfs/libxfs/xfs_da_format.c
index f1e8d4d..6d77d1a 100644
--- a/fs/xfs/libxfs/xfs_da_format.c
+++ b/fs/xfs/libxfs/xfs_da_format.c
@@ -49,7 +49,7 @@ xfs_dir3_sf_entsize(
 	struct xfs_dir2_sf_hdr	*hdr,
 	int			len)
 {
-	return xfs_dir2_sf_entsize(hdr, len) + sizeof(__uint8_t);
+	return xfs_dir2_sf_entsize(hdr, len) + sizeof(uint8_t);
 }
 
 static struct xfs_dir2_sf_entry *
@@ -77,7 +77,7 @@ xfs_dir3_sf_nextentry(
  * not necessary. For non-filetype enable directories, the type is always
  * unknown and we never store the value.
  */
-static __uint8_t
+static uint8_t
 xfs_dir2_sfe_get_ftype(
 	struct xfs_dir2_sf_entry *sfep)
 {
@@ -87,16 +87,16 @@ xfs_dir2_sfe_get_ftype(
 static void
 xfs_dir2_sfe_put_ftype(
 	struct xfs_dir2_sf_entry *sfep,
-	__uint8_t		ftype)
+	uint8_t			ftype)
 {
 	ASSERT(ftype < XFS_DIR3_FT_MAX);
 }
 
-static __uint8_t
+static uint8_t
 xfs_dir3_sfe_get_ftype(
 	struct xfs_dir2_sf_entry *sfep)
 {
-	__uint8_t	ftype;
+	uint8_t		ftype;
 
 	ftype = sfep->name[sfep->namelen];
 	if (ftype >= XFS_DIR3_FT_MAX)
@@ -107,7 +107,7 @@ xfs_dir3_sfe_get_ftype(
 static void
 xfs_dir3_sfe_put_ftype(
 	struct xfs_dir2_sf_entry *sfep,
-	__uint8_t		ftype)
+	uint8_t			ftype)
 {
 	ASSERT(ftype < XFS_DIR3_FT_MAX);
 
@@ -124,7 +124,7 @@ xfs_dir3_sfe_put_ftype(
 static xfs_ino_t
 xfs_dir2_sf_get_ino(
 	struct xfs_dir2_sf_hdr	*hdr,
-	__uint8_t		*from)
+	uint8_t			*from)
 {
 	if (hdr->i8count)
 		return get_unaligned_be64(from) & 0x00ffffffffffffffULL;
@@ -135,7 +135,7 @@ xfs_dir2_sf_get_ino(
 static void
 xfs_dir2_sf_put_ino(
 	struct xfs_dir2_sf_hdr	*hdr,
-	__uint8_t		*to,
+	uint8_t			*to,
 	xfs_ino_t		ino)
 {
 	ASSERT((ino & 0xff00000000000000ULL) == 0);
@@ -225,7 +225,7 @@ xfs_dir3_sfe_put_ino(
 
 #define XFS_DIR3_DATA_ENTSIZE(n)					\
 	round_up((offsetof(struct xfs_dir2_data_entry, name[0]) + (n) +	\
-		 sizeof(xfs_dir2_data_off_t) + sizeof(__uint8_t)),	\
+		 sizeof(xfs_dir2_data_off_t) + sizeof(uint8_t)),	\
 		XFS_DIR2_DATA_ALIGN)
 
 static int
@@ -242,7 +242,7 @@ xfs_dir3_data_entsize(
 	return XFS_DIR3_DATA_ENTSIZE(n);
 }
 
-static __uint8_t
+static uint8_t
 xfs_dir2_data_get_ftype(
 	struct xfs_dir2_data_entry *dep)
 {
@@ -252,16 +252,16 @@ xfs_dir2_data_get_ftype(
 static void
 xfs_dir2_data_put_ftype(
 	struct xfs_dir2_data_entry *dep,
-	__uint8_t		ftype)
+	uint8_t			ftype)
 {
 	ASSERT(ftype < XFS_DIR3_FT_MAX);
 }
 
-static __uint8_t
+static uint8_t
 xfs_dir3_data_get_ftype(
 	struct xfs_dir2_data_entry *dep)
 {
-	__uint8_t	ftype = dep->name[dep->namelen];
+	uint8_t		ftype = dep->name[dep->namelen];
 
 	if (ftype >= XFS_DIR3_FT_MAX)
 		return XFS_DIR3_FT_UNKNOWN;
@@ -271,7 +271,7 @@ xfs_dir3_data_get_ftype(
 static void
 xfs_dir3_data_put_ftype(
 	struct xfs_dir2_data_entry *dep,
-	__uint8_t		type)
+	uint8_t			type)
 {
 	ASSERT(type < XFS_DIR3_FT_MAX);
 	ASSERT(dep->namelen != 0);
diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index 9a492a9..3771edc 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -111,11 +111,11 @@ struct xfs_da3_intnode {
  * appropriate.
  */
 struct xfs_da3_icnode_hdr {
-	__uint32_t	forw;
-	__uint32_t	back;
-	__uint16_t	magic;
-	__uint16_t	count;
-	__uint16_t	level;
+	uint32_t	forw;
+	uint32_t	back;
+	uint16_t	magic;
+	uint16_t	count;
+	uint16_t	level;
 };
 
 /*
@@ -187,14 +187,14 @@ struct xfs_da3_icnode_hdr {
 /*
  * Byte offset in data block and shortform entry.
  */
-typedef	__uint16_t	xfs_dir2_data_off_t;
+typedef uint16_t	xfs_dir2_data_off_t;
 #define	NULLDATAOFF	0xffffU
 typedef uint		xfs_dir2_data_aoff_t;	/* argument form */
 
 /*
  * Offset in data space of a data entry.
  */
-typedef	__uint32_t	xfs_dir2_dataptr_t;
+typedef uint32_t	xfs_dir2_dataptr_t;
 #define	XFS_DIR2_MAX_DATAPTR	((xfs_dir2_dataptr_t)0xffffffff)
 #define	XFS_DIR2_NULL_DATAPTR	((xfs_dir2_dataptr_t)0)
 
@@ -206,7 +206,7 @@ typedef	xfs_off_t	xfs_dir2_off_t;
 /*
  * Directory block number (logical dirblk in file)
  */
-typedef	__uint32_t	xfs_dir2_db_t;
+typedef uint32_t	xfs_dir2_db_t;
 
 #define XFS_INO32_SIZE	4
 #define XFS_INO64_SIZE	8
@@ -226,9 +226,9 @@ typedef	__uint32_t	xfs_dir2_db_t;
  * over them.
  */
 typedef struct xfs_dir2_sf_hdr {
-	__uint8_t		count;		/* count of entries */
-	__uint8_t		i8count;	/* count of 8-byte inode #s */
-	__uint8_t		parent[8];	/* parent dir inode number */
+	uint8_t			count;		/* count of entries */
+	uint8_t			i8count;	/* count of 8-byte inode #s */
+	uint8_t			parent[8];	/* parent dir inode number */
 } __packed xfs_dir2_sf_hdr_t;
 
 typedef struct xfs_dir2_sf_entry {
@@ -447,11 +447,11 @@ struct xfs_dir3_leaf_hdr {
 };
 
 struct xfs_dir3_icleaf_hdr {
-	__uint32_t		forw;
-	__uint32_t		back;
-	__uint16_t		magic;
-	__uint16_t		count;
-	__uint16_t		stale;
+	uint32_t		forw;
+	uint32_t		back;
+	uint16_t		magic;
+	uint16_t		count;
+	uint16_t		stale;
 };
 
 /*
@@ -538,10 +538,10 @@ struct xfs_dir3_free {
  * xfs_dir3_free_hdr_from_disk/xfs_dir3_free_hdr_to_disk.
  */
 struct xfs_dir3_icfree_hdr {
-	__uint32_t	magic;
-	__uint32_t	firstdb;
-	__uint32_t	nvalid;
-	__uint32_t	nused;
+	uint32_t	magic;
+	uint32_t	firstdb;
+	uint32_t	nvalid;
+	uint32_t	nused;
 
 };
 
@@ -632,10 +632,10 @@ typedef struct xfs_attr_shortform {
 		__u8	padding;
 	} hdr;
 	struct xfs_attr_sf_entry {
-		__uint8_t namelen;	/* actual length of name (no NULL) */
-		__uint8_t valuelen;	/* actual length of value (no NULL) */
-		__uint8_t flags;	/* flags bits (see xfs_attr_leaf.h) */
-		__uint8_t nameval[1];	/* name & value bytes concatenated */
+		uint8_t namelen;	/* actual length of name (no NULL) */
+		uint8_t valuelen;	/* actual length of value (no NULL) */
+		uint8_t flags;	/* flags bits (see xfs_attr_leaf.h) */
+		uint8_t nameval[1];	/* name & value bytes concatenated */
 	} list[1];			/* variable sized array */
 } xfs_attr_shortform_t;
 
@@ -725,22 +725,22 @@ struct xfs_attr3_leafblock {
  * incore, neutral version of the attribute leaf header
  */
 struct xfs_attr3_icleaf_hdr {
-	__uint32_t	forw;
-	__uint32_t	back;
-	__uint16_t	magic;
-	__uint16_t	count;
-	__uint16_t	usedbytes;
+	uint32_t	forw;
+	uint32_t	back;
+	uint16_t	magic;
+	uint16_t	count;
+	uint16_t	usedbytes;
 	/*
 	 * firstused is 32-bit here instead of 16-bit like the on-disk variant
 	 * to support maximum fsb size of 64k without overflow issues throughout
 	 * the attr code. Instead, the overflow condition is handled on
 	 * conversion to/from disk.
 	 */
-	__uint32_t	firstused;
+	uint32_t	firstused;
 	__u8		holes;
 	struct {
-		__uint16_t	base;
-		__uint16_t	size;
+		uint16_t	base;
+		uint16_t	size;
 	} freemap[XFS_ATTR_LEAF_MAPSIZE];
 };
 
diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
index d6e6d9d..21c8f8b 100644
--- a/fs/xfs/libxfs/xfs_dir2.h
+++ b/fs/xfs/libxfs/xfs_dir2.h
@@ -47,9 +47,9 @@ struct xfs_dir_ops {
 	struct xfs_dir2_sf_entry *
 		(*sf_nextentry)(struct xfs_dir2_sf_hdr *hdr,
 				struct xfs_dir2_sf_entry *sfep);
-	__uint8_t (*sf_get_ftype)(struct xfs_dir2_sf_entry *sfep);
+	uint8_t (*sf_get_ftype)(struct xfs_dir2_sf_entry *sfep);
 	void	(*sf_put_ftype)(struct xfs_dir2_sf_entry *sfep,
-				__uint8_t ftype);
+				uint8_t ftype);
 	xfs_ino_t (*sf_get_ino)(struct xfs_dir2_sf_hdr *hdr,
 				struct xfs_dir2_sf_entry *sfep);
 	void	(*sf_put_ino)(struct xfs_dir2_sf_hdr *hdr,
@@ -60,9 +60,9 @@ struct xfs_dir_ops {
 				     xfs_ino_t ino);
 
 	int	(*data_entsize)(int len);
-	__uint8_t (*data_get_ftype)(struct xfs_dir2_data_entry *dep);
+	uint8_t (*data_get_ftype)(struct xfs_dir2_data_entry *dep);
 	void	(*data_put_ftype)(struct xfs_dir2_data_entry *dep,
-				__uint8_t ftype);
+				uint8_t ftype);
 	__be16 * (*data_entry_tag_p)(struct xfs_dir2_data_entry *dep);
 	struct xfs_dir2_data_free *
 		(*data_bestfree_p)(struct xfs_dir2_data_hdr *hdr);
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
index b887fb2..68bf3e8 100644
--- a/fs/xfs/libxfs/xfs_dir2_leaf.c
+++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
@@ -145,7 +145,7 @@ xfs_dir3_leaf_check_int(
 static bool
 xfs_dir3_leaf_verify(
 	struct xfs_buf		*bp,
-	__uint16_t		magic)
+	uint16_t		magic)
 {
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 	struct xfs_dir2_leaf	*leaf = bp->b_addr;
@@ -154,7 +154,7 @@ xfs_dir3_leaf_verify(
 
 	if (xfs_sb_version_hascrc(&mp->m_sb)) {
 		struct xfs_dir3_leaf_hdr *leaf3 = bp->b_addr;
-		__uint16_t		magic3;
+		uint16_t		magic3;
 
 		magic3 = (magic == XFS_DIR2_LEAF1_MAGIC) ? XFS_DIR3_LEAF1_MAGIC
 							 : XFS_DIR3_LEAFN_MAGIC;
@@ -178,7 +178,7 @@ xfs_dir3_leaf_verify(
 static void
 __read_verify(
 	struct xfs_buf  *bp,
-	__uint16_t	magic)
+	uint16_t	magic)
 {
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 
@@ -195,7 +195,7 @@ __read_verify(
 static void
 __write_verify(
 	struct xfs_buf  *bp,
-	__uint16_t	magic)
+	uint16_t	magic)
 {
 	struct xfs_mount	*mp = bp->b_target->bt_mount;
 	struct xfs_buf_log_item	*bip = bp->b_fspriv;
@@ -299,7 +299,7 @@ xfs_dir3_leaf_init(
 	struct xfs_trans	*tp,
 	struct xfs_buf		*bp,
 	xfs_ino_t		owner,
-	__uint16_t		type)
+	uint16_t		type)
 {
 	struct xfs_dir2_leaf	*leaf = bp->b_addr;
 
@@ -343,7 +343,7 @@ xfs_dir3_leaf_get_buf(
 	xfs_da_args_t		*args,
 	xfs_dir2_db_t		bno,
 	struct xfs_buf		**bpp,
-	__uint16_t		magic)
+	uint16_t		magic)
 {
 	struct xfs_inode	*dp = args->dp;
 	struct xfs_trans	*tp = args->trans;
diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
index 39f8604..011df4d 100644
--- a/fs/xfs/libxfs/xfs_dir2_priv.h
+++ b/fs/xfs/libxfs/xfs_dir2_priv.h
@@ -69,7 +69,7 @@ extern void xfs_dir3_leaf_compact_x1(struct xfs_dir3_icleaf_hdr *leafhdr,
 		struct xfs_dir2_leaf_entry *ents, int *indexp,
 		int *lowstalep, int *highstalep, int *lowlogp, int *highlogp);
 extern int xfs_dir3_leaf_get_buf(struct xfs_da_args *args, xfs_dir2_db_t bno,
-		struct xfs_buf **bpp, __uint16_t magic);
+		struct xfs_buf **bpp, uint16_t magic);
 extern void xfs_dir3_leaf_log_ents(struct xfs_da_args *args,
 		struct xfs_buf *bp, int first, int last);
 extern void xfs_dir3_leaf_log_header(struct xfs_da_args *args,
diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
index e84af09..be8b975 100644
--- a/fs/xfs/libxfs/xfs_dir2_sf.c
+++ b/fs/xfs/libxfs/xfs_dir2_sf.c
@@ -647,7 +647,7 @@ xfs_dir2_sf_verify(
 	int				offset;
 	int				size;
 	int				error;
-	__uint8_t			filetype;
+	uint8_t				filetype;
 
 	ASSERT(ip->i_d.di_format == XFS_DINODE_FMT_LOCAL);
 	/*
diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index a1dccd8..e204a94 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -103,8 +103,8 @@ struct xfs_ifork;
  * Must be padded to 64 bit alignment.
  */
 typedef struct xfs_sb {
-	__uint32_t	sb_magicnum;	/* magic number == XFS_SB_MAGIC */
-	__uint32_t	sb_blocksize;	/* logical block size, bytes */
+	uint32_t	sb_magicnum;	/* magic number == XFS_SB_MAGIC */
+	uint32_t	sb_blocksize;	/* logical block size, bytes */
 	xfs_rfsblock_t	sb_dblocks;	/* number of data blocks */
 	xfs_rfsblock_t	sb_rblocks;	/* number of realtime blocks */
 	xfs_rtblock_t	sb_rextents;	/* number of realtime extents */
@@ -118,45 +118,45 @@ typedef struct xfs_sb {
 	xfs_agnumber_t	sb_agcount;	/* number of allocation groups */
 	xfs_extlen_t	sb_rbmblocks;	/* number of rt bitmap blocks */
 	xfs_extlen_t	sb_logblocks;	/* number of log blocks */
-	__uint16_t	sb_versionnum;	/* header version == XFS_SB_VERSION */
-	__uint16_t	sb_sectsize;	/* volume sector size, bytes */
-	__uint16_t	sb_inodesize;	/* inode size, bytes */
-	__uint16_t	sb_inopblock;	/* inodes per block */
+	uint16_t	sb_versionnum;	/* header version == XFS_SB_VERSION */
+	uint16_t	sb_sectsize;	/* volume sector size, bytes */
+	uint16_t	sb_inodesize;	/* inode size, bytes */
+	uint16_t	sb_inopblock;	/* inodes per block */
 	char		sb_fname[12];	/* file system name */
-	__uint8_t	sb_blocklog;	/* log2 of sb_blocksize */
-	__uint8_t	sb_sectlog;	/* log2 of sb_sectsize */
-	__uint8_t	sb_inodelog;	/* log2 of sb_inodesize */
-	__uint8_t	sb_inopblog;	/* log2 of sb_inopblock */
-	__uint8_t	sb_agblklog;	/* log2 of sb_agblocks (rounded up) */
-	__uint8_t	sb_rextslog;	/* log2 of sb_rextents */
-	__uint8_t	sb_inprogress;	/* mkfs is in progress, don't mount */
-	__uint8_t	sb_imax_pct;	/* max % of fs for inode space */
+	uint8_t		sb_blocklog;	/* log2 of sb_blocksize */
+	uint8_t		sb_sectlog;	/* log2 of sb_sectsize */
+	uint8_t		sb_inodelog;	/* log2 of sb_inodesize */
+	uint8_t		sb_inopblog;	/* log2 of sb_inopblock */
+	uint8_t		sb_agblklog;	/* log2 of sb_agblocks (rounded up) */
+	uint8_t		sb_rextslog;	/* log2 of sb_rextents */
+	uint8_t		sb_inprogress;	/* mkfs is in progress, don't mount */
+	uint8_t		sb_imax_pct;	/* max % of fs for inode space */
 					/* statistics */
 	/*
 	 * These fields must remain contiguous.  If you really
 	 * want to change their layout, make sure you fix the
 	 * code in xfs_trans_apply_sb_deltas().
 	 */
-	__uint64_t	sb_icount;	/* allocated inodes */
-	__uint64_t	sb_ifree;	/* free inodes */
-	__uint64_t	sb_fdblocks;	/* free data blocks */
-	__uint64_t	sb_frextents;	/* free realtime extents */
+	uint64_t	sb_icount;	/* allocated inodes */
+	uint64_t	sb_ifree;	/* free inodes */
+	uint64_t	sb_fdblocks;	/* free data blocks */
+	uint64_t	sb_frextents;	/* free realtime extents */
 	/*
 	 * End contiguous fields.
 	 */
 	xfs_ino_t	sb_uquotino;	/* user quota inode */
 	xfs_ino_t	sb_gquotino;	/* group quota inode */
-	__uint16_t	sb_qflags;	/* quota flags */
-	__uint8_t	sb_flags;	/* misc. flags */
-	__uint8_t	sb_shared_vn;	/* shared version number */
+	uint16_t	sb_qflags;	/* quota flags */
+	uint8_t		sb_flags;	/* misc. flags */
+	uint8_t		sb_shared_vn;	/* shared version number */
 	xfs_extlen_t	sb_inoalignmt;	/* inode chunk alignment, fsblocks */
-	__uint32_t	sb_unit;	/* stripe or raid unit */
-	__uint32_t	sb_width;	/* stripe or raid width */
-	__uint8_t	sb_dirblklog;	/* log2 of dir block size (fsbs) */
-	__uint8_t	sb_logsectlog;	/* log2 of the log sector size */
-	__uint16_t	sb_logsectsize;	/* sector size for the log, bytes */
-	__uint32_t	sb_logsunit;	/* stripe unit size for the log */
-	__uint32_t	sb_features2;	/* additional feature bits */
+	uint32_t	sb_unit;	/* stripe or raid unit */
+	uint32_t	sb_width;	/* stripe or raid width */
+	uint8_t		sb_dirblklog;	/* log2 of dir block size (fsbs) */
+	uint8_t		sb_logsectlog;	/* log2 of the log sector size */
+	uint16_t	sb_logsectsize;	/* sector size for the log, bytes */
+	uint32_t	sb_logsunit;	/* stripe unit size for the log */
+	uint32_t	sb_features2;	/* additional feature bits */
 
 	/*
 	 * bad features2 field as a result of failing to pad the sb structure to
@@ -167,17 +167,17 @@ typedef struct xfs_sb {
 	 * the value in sb_features2 when formatting the incore superblock to
 	 * the disk buffer.
 	 */
-	__uint32_t	sb_bad_features2;
+	uint32_t	sb_bad_features2;
 
 	/* version 5 superblock fields start here */
 
 	/* feature masks */
-	__uint32_t	sb_features_compat;
-	__uint32_t	sb_features_ro_compat;
-	__uint32_t	sb_features_incompat;
-	__uint32_t	sb_features_log_incompat;
+	uint32_t	sb_features_compat;
+	uint32_t	sb_features_ro_compat;
+	uint32_t	sb_features_incompat;
+	uint32_t	sb_features_log_incompat;
 
-	__uint32_t	sb_crc;		/* superblock crc */
+	uint32_t	sb_crc;		/* superblock crc */
 	xfs_extlen_t	sb_spino_align;	/* sparse inode chunk alignment */
 
 	xfs_ino_t	sb_pquotino;	/* project quota inode */
@@ -449,7 +449,7 @@ static inline void xfs_sb_version_addprojid32bit(struct xfs_sb *sbp)
 static inline bool
 xfs_sb_has_compat_feature(
 	struct xfs_sb	*sbp,
-	__uint32_t	feature)
+	uint32_t	feature)
 {
 	return (sbp->sb_features_compat & feature) != 0;
 }
@@ -465,7 +465,7 @@ xfs_sb_has_compat_feature(
 static inline bool
 xfs_sb_has_ro_compat_feature(
 	struct xfs_sb	*sbp,
-	__uint32_t	feature)
+	uint32_t	feature)
 {
 	return (sbp->sb_features_ro_compat & feature) != 0;
 }
@@ -482,7 +482,7 @@ xfs_sb_has_ro_compat_feature(
 static inline bool
 xfs_sb_has_incompat_feature(
 	struct xfs_sb	*sbp,
-	__uint32_t	feature)
+	uint32_t	feature)
 {
 	return (sbp->sb_features_incompat & feature) != 0;
 }
@@ -492,7 +492,7 @@ xfs_sb_has_incompat_feature(
 static inline bool
 xfs_sb_has_incompat_log_feature(
 	struct xfs_sb	*sbp,
-	__uint32_t	feature)
+	uint32_t	feature)
 {
 	return (sbp->sb_features_log_incompat & feature) != 0;
 }
@@ -594,8 +594,8 @@ xfs_is_quota_inode(struct xfs_sb *sbp, xfs_ino_t ino)
  */
 #define XFS_FSB_TO_B(mp,fsbno)	((xfs_fsize_t)(fsbno) << (mp)->m_sb.sb_blocklog)
 #define XFS_B_TO_FSB(mp,b)	\
-	((((__uint64_t)(b)) + (mp)->m_blockmask) >> (mp)->m_sb.sb_blocklog)
-#define XFS_B_TO_FSBT(mp,b)	(((__uint64_t)(b)) >> (mp)->m_sb.sb_blocklog)
+	((((uint64_t)(b)) + (mp)->m_blockmask) >> (mp)->m_sb.sb_blocklog)
+#define XFS_B_TO_FSBT(mp,b)	(((uint64_t)(b)) >> (mp)->m_sb.sb_blocklog)
 #define XFS_B_FSB_OFFSET(mp,b)	((b) & (mp)->m_blockmask)
 
 /*
@@ -1072,7 +1072,7 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev)
  * next agno_log bits - ag number
  * high agno_log-agblklog-inopblog bits - 0
  */
-#define	XFS_INO_MASK(k)			(__uint32_t)((1ULL << (k)) - 1)
+#define	XFS_INO_MASK(k)			(uint32_t)((1ULL << (k)) - 1)
 #define	XFS_INO_OFFSET_BITS(mp)		(mp)->m_sb.sb_inopblog
 #define	XFS_INO_AGBNO_BITS(mp)		(mp)->m_sb.sb_agblklog
 #define	XFS_INO_AGINO_BITS(mp)		(mp)->m_agino_log
@@ -1269,16 +1269,16 @@ typedef __be32 xfs_alloc_ptr_t;
 #define	XFS_FIBT_MAGIC		0x46494254	/* 'FIBT' */
 #define	XFS_FIBT_CRC_MAGIC	0x46494233	/* 'FIB3' */
 
-typedef	__uint64_t	xfs_inofree_t;
+typedef uint64_t	xfs_inofree_t;
 #define	XFS_INODES_PER_CHUNK		(NBBY * sizeof(xfs_inofree_t))
 #define	XFS_INODES_PER_CHUNK_LOG	(XFS_NBBYLOG + 3)
 #define	XFS_INOBT_ALL_FREE		((xfs_inofree_t)-1)
 #define	XFS_INOBT_MASK(i)		((xfs_inofree_t)1 << (i))
 
 #define XFS_INOBT_HOLEMASK_FULL		0	/* holemask for full chunk */
-#define XFS_INOBT_HOLEMASK_BITS		(NBBY * sizeof(__uint16_t))
+#define XFS_INOBT_HOLEMASK_BITS		(NBBY * sizeof(uint16_t))
 #define XFS_INODES_PER_HOLEMASK_BIT	\
-	(XFS_INODES_PER_CHUNK / (NBBY * sizeof(__uint16_t)))
+	(XFS_INODES_PER_CHUNK / (NBBY * sizeof(uint16_t)))
 
 static inline xfs_inofree_t xfs_inobt_maskn(int i, int n)
 {
@@ -1312,9 +1312,9 @@ typedef struct xfs_inobt_rec {
 
 typedef struct xfs_inobt_rec_incore {
 	xfs_agino_t	ir_startino;	/* starting inode number */
-	__uint16_t	ir_holemask;	/* hole mask for sparse chunks */
-	__uint8_t	ir_count;	/* total inode count */
-	__uint8_t	ir_freecount;	/* count of free inodes (set bits) */
+	uint16_t	ir_holemask;	/* hole mask for sparse chunks */
+	uint8_t		ir_count;	/* total inode count */
+	uint8_t		ir_freecount;	/* count of free inodes (set bits) */
 	xfs_inofree_t	ir_free;	/* free inode mask */
 } xfs_inobt_rec_incore_t;
 
@@ -1397,15 +1397,15 @@ struct xfs_rmap_rec {
  *  rm_offset:54-60 aren't used and should be zero
  *  rm_offset:0-53 is the block offset within the inode
  */
-#define XFS_RMAP_OFF_ATTR_FORK	((__uint64_t)1ULL << 63)
-#define XFS_RMAP_OFF_BMBT_BLOCK	((__uint64_t)1ULL << 62)
-#define XFS_RMAP_OFF_UNWRITTEN	((__uint64_t)1ULL << 61)
+#define XFS_RMAP_OFF_ATTR_FORK	((uint64_t)1ULL << 63)
+#define XFS_RMAP_OFF_BMBT_BLOCK	((uint64_t)1ULL << 62)
+#define XFS_RMAP_OFF_UNWRITTEN	((uint64_t)1ULL << 61)
 
-#define XFS_RMAP_LEN_MAX	((__uint32_t)~0U)
+#define XFS_RMAP_LEN_MAX	((uint32_t)~0U)
 #define XFS_RMAP_OFF_FLAGS	(XFS_RMAP_OFF_ATTR_FORK | \
 				 XFS_RMAP_OFF_BMBT_BLOCK | \
 				 XFS_RMAP_OFF_UNWRITTEN)
-#define XFS_RMAP_OFF_MASK	((__uint64_t)0x3FFFFFFFFFFFFFULL)
+#define XFS_RMAP_OFF_MASK	((uint64_t)0x3FFFFFFFFFFFFFULL)
 
 #define XFS_RMAP_OFF(off)		((off) & XFS_RMAP_OFF_MASK)
 
@@ -1431,8 +1431,8 @@ struct xfs_rmap_rec {
 struct xfs_rmap_irec {
 	xfs_agblock_t	rm_startblock;	/* extent start block */
 	xfs_extlen_t	rm_blockcount;	/* extent length */
-	__uint64_t	rm_owner;	/* extent owner */
-	__uint64_t	rm_offset;	/* offset within the owner */
+	uint64_t	rm_owner;	/* extent owner */
+	uint64_t	rm_offset;	/* offset within the owner */
 	unsigned int	rm_flags;	/* state flags */
 };
 
@@ -1544,11 +1544,11 @@ typedef struct xfs_bmbt_rec {
 	__be64			l0, l1;
 } xfs_bmbt_rec_t;
 
-typedef __uint64_t	xfs_bmbt_rec_base_t;	/* use this for casts */
+typedef uint64_t	xfs_bmbt_rec_base_t;	/* use this for casts */
 typedef xfs_bmbt_rec_t xfs_bmdr_rec_t;
 
 typedef struct xfs_bmbt_rec_host {
-	__uint64_t		l0, l1;
+	uint64_t		l0, l1;
 } xfs_bmbt_rec_host_t;
 
 /*
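For illustration only (not part of this patch): a minimal userspace sketch of how the rm_offset flag bits and offset mask in the xfs_format.h hunk above fit together. The constant values are copied from the XFS_RMAP_OFF_* definitions; the shortened macro names and the three helper functions are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the XFS_RMAP_OFF_* definitions; names shortened here. */
#define RMAP_OFF_ATTR_FORK	((uint64_t)1ULL << 63)
#define RMAP_OFF_BMBT_BLOCK	((uint64_t)1ULL << 62)
#define RMAP_OFF_UNWRITTEN	((uint64_t)1ULL << 61)
#define RMAP_OFF_FLAGS		(RMAP_OFF_ATTR_FORK | \
				 RMAP_OFF_BMBT_BLOCK | \
				 RMAP_OFF_UNWRITTEN)
#define RMAP_OFF_MASK		((uint64_t)0x3FFFFFFFFFFFFFULL)

/* Hypothetical helper: combine a block offset (bits 0-53) with flag bits. */
static inline uint64_t rmap_pack_offset(uint64_t off, uint64_t flags)
{
	return (off & RMAP_OFF_MASK) | (flags & RMAP_OFF_FLAGS);
}

/* Hypothetical helper: recover the block offset, dropping the flag bits. */
static inline uint64_t rmap_unpack_offset(uint64_t packed)
{
	return packed & RMAP_OFF_MASK;
}

/* Hypothetical check: bits 54-60 are unused and should be zero. */
static inline bool rmap_offset_is_sane(uint64_t packed)
{
	return (packed & ~(RMAP_OFF_MASK | RMAP_OFF_FLAGS)) == 0;
}
```

The mask covers 54 bits, which matches the "rm_offset:0-53" comment, so any value with a bit set in the 54-60 range fails the sanity check.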
diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index 095bdf0..e895d32 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -302,10 +302,10 @@ typedef struct xfs_bstat {
  * and using two 16bit values to hold new 32bit projid was choosen
  * to retain compatibility with "old" filesystems).
  */
-static inline __uint32_t
+static inline uint32_t
 bstat_get_projid(struct xfs_bstat *bs)
 {
-	return (__uint32_t)bs->bs_projid_hi << 16 | bs->bs_projid_lo;
+	return (uint32_t)bs->bs_projid_hi << 16 | bs->bs_projid_lo;
 }
 
 /*
@@ -455,10 +455,10 @@ typedef struct xfs_handle {
  */
 typedef struct xfs_swapext
 {
-	__int64_t	sx_version;	/* version */
+	int64_t		sx_version;	/* version */
 #define XFS_SX_VERSION		0
-	__int64_t	sx_fdtarget;	/* fd of target file */
-	__int64_t	sx_fdtmp;	/* fd of tmp file */
+	int64_t		sx_fdtarget;	/* fd of target file */
+	int64_t		sx_fdtmp;	/* fd of tmp file */
 	xfs_off_t	sx_offset;	/* offset into file */
 	xfs_off_t	sx_length;	/* leng from offset */
 	char		sx_pad[16];	/* pad space, unused */
@@ -546,7 +546,7 @@ typedef struct xfs_swapext
 #define XFS_IOC_ATTRLIST_BY_HANDLE   _IOW ('X', 122, struct xfs_fsop_attrlist_handlereq)
 #define XFS_IOC_ATTRMULTI_BY_HANDLE  _IOW ('X', 123, struct xfs_fsop_attrmulti_handlereq)
 #define XFS_IOC_FSGEOMETRY	     _IOR ('X', 124, struct xfs_fsop_geom)
-#define XFS_IOC_GOINGDOWN	     _IOR ('X', 125, __uint32_t)
+#define XFS_IOC_GOINGDOWN	     _IOR ('X', 125, uint32_t)
 /*	XFS_IOC_GETFSUUID ---------- deprecated 140	 */
 
 
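For illustration only (not part of this patch): a small userspace sketch of the two-halves project id scheme that bstat_get_projid() in the xfs_fs.h hunk above relies on. The struct and both helpers are hypothetical stand-ins; only the join expression mirrors the patched function.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two 16-bit projid fields of struct xfs_bstat. */
struct projid_halves {
	uint16_t lo;
	uint16_t hi;
};

/* Same expression shape as bstat_get_projid(). */
static inline uint32_t projid_join(const struct projid_halves *p)
{
	/* Cast before shifting so the high half is not promoted to signed int. */
	return (uint32_t)p->hi << 16 | p->lo;
}

/* Hypothetical inverse: split a 32-bit project id into its two halves. */
static inline struct projid_halves projid_split(uint32_t projid)
{
	struct projid_halves p = {
		.lo = (uint16_t)(projid & 0xffff),
		.hi = (uint16_t)(projid >> 16),
	};
	return p;
}
```

The type rename does not change this arithmetic, since the cast target has the same width and signedness before and after the patch.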
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index d41ade5..1e5ed94 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -140,9 +140,9 @@ xfs_inobt_get_rec(
 STATIC int
 xfs_inobt_insert_rec(
 	struct xfs_btree_cur	*cur,
-	__uint16_t		holemask,
-	__uint8_t		count,
-	__int32_t		freecount,
+	uint16_t		holemask,
+	uint8_t			count,
+	int32_t			freecount,
 	xfs_inofree_t		free,
 	int			*stat)
 {
diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
index 7c47188..ed52d99 100644
--- a/fs/xfs/libxfs/xfs_ialloc_btree.c
+++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
@@ -219,12 +219,12 @@ xfs_finobt_init_ptr_from_cur(
 	ptr->s = agi->agi_free_root;
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_inobt_key_diff(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*key)
 {
-	return (__int64_t)be32_to_cpu(key->inobt.ir_startino) -
+	return (int64_t)be32_to_cpu(key->inobt.ir_startino) -
 			  cur->bc_rec.i.ir_startino;
 }
 
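For illustration only (not part of this patch): a userspace sketch of why xfs_inobt_key_diff() above widens to int64_t before subtracting. Both function names are hypothetical, and the endian conversion done by be32_to_cpu() is left out, so the inputs are assumed to be in CPU byte order already.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the shape of xfs_inobt_key_diff(). */
static inline int64_t key_diff(uint32_t key_startino, uint32_t cur_startino)
{
	/* Widen first so the subtraction is done in 64-bit signed arithmetic. */
	return (int64_t)key_startino - cur_startino;
}

/* Without the widening cast the result wraps modulo 2^32 and loses its sign. */
static inline uint32_t key_diff_wrapped(uint32_t key_startino,
					uint32_t cur_startino)
{
	return key_startino - cur_startino;
}
```

With the cast, a key that sorts before the cursor yields a negative value, which is what a comparison callback needs; the wrapped variant would report a large positive difference instead.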
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 09c3d1a..d887af9 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -444,7 +444,7 @@ xfs_dinode_calc_crc(
 	struct xfs_mount	*mp,
 	struct xfs_dinode	*dip)
 {
-	__uint32_t		crc;
+	uint32_t		crc;
 
 	if (dip->di_version < 3)
 		return;
diff --git a/fs/xfs/libxfs/xfs_inode_buf.h b/fs/xfs/libxfs/xfs_inode_buf.h
index 6848a0a..0827d7d 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.h
+++ b/fs/xfs/libxfs/xfs_inode_buf.h
@@ -28,26 +28,26 @@ struct xfs_dinode;
  * format specific structures at the appropriate time.
  */
 struct xfs_icdinode {
-	__int8_t	di_version;	/* inode version */
-	__int8_t	di_format;	/* format of di_c data */
-	__uint16_t	di_flushiter;	/* incremented on flush */
-	__uint32_t	di_uid;		/* owner's user id */
-	__uint32_t	di_gid;		/* owner's group id */
-	__uint16_t	di_projid_lo;	/* lower part of owner's project id */
-	__uint16_t	di_projid_hi;	/* higher part of owner's project id */
+	int8_t		di_version;	/* inode version */
+	int8_t		di_format;	/* format of di_c data */
+	uint16_t	di_flushiter;	/* incremented on flush */
+	uint32_t	di_uid;		/* owner's user id */
+	uint32_t	di_gid;		/* owner's group id */
+	uint16_t	di_projid_lo;	/* lower part of owner's project id */
+	uint16_t	di_projid_hi;	/* higher part of owner's project id */
 	xfs_fsize_t	di_size;	/* number of bytes in file */
 	xfs_rfsblock_t	di_nblocks;	/* # of direct & btree blocks used */
 	xfs_extlen_t	di_extsize;	/* basic/minimum extent size for file */
 	xfs_extnum_t	di_nextents;	/* number of extents in data fork */
 	xfs_aextnum_t	di_anextents;	/* number of extents in attribute fork*/
-	__uint8_t	di_forkoff;	/* attr fork offs, <<3 for 64b align */
-	__int8_t	di_aformat;	/* format of attr fork's data */
-	__uint32_t	di_dmevmask;	/* DMIG event mask */
-	__uint16_t	di_dmstate;	/* DMIG state info */
-	__uint16_t	di_flags;	/* random flags, XFS_DIFLAG_... */
+	uint8_t		di_forkoff;	/* attr fork offs, <<3 for 64b align */
+	int8_t		di_aformat;	/* format of attr fork's data */
+	uint32_t	di_dmevmask;	/* DMIG event mask */
+	uint16_t	di_dmstate;	/* DMIG state info */
+	uint16_t	di_flags;	/* random flags, XFS_DIFLAG_... */
 
-	__uint64_t	di_flags2;	/* more random flags */
-	__uint32_t	di_cowextsize;	/* basic cow extent size for file */
+	uint64_t	di_flags2;	/* more random flags */
+	uint32_t	di_cowextsize;	/* basic cow extent size for file */
 
 	xfs_ictimestamp_t di_crtime;	/* time created */
 };
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 7ae571f..8372e9b 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -31,7 +31,7 @@ struct xfs_trans_res;
  * through all the log items definitions and everything they encode into the
  * log.
  */
-typedef __uint32_t xlog_tid_t;
+typedef uint32_t xlog_tid_t;
 
 #define XLOG_MIN_ICLOGS		2
 #define XLOG_MAX_ICLOGS		8
@@ -211,7 +211,7 @@ typedef struct xfs_log_iovec {
 typedef struct xfs_trans_header {
 	uint		th_magic;		/* magic number */
 	uint		th_type;		/* transaction type */
-	__int32_t	th_tid;			/* transaction id (unused) */
+	int32_t		th_tid;			/* transaction id (unused) */
 	uint		th_num_items;		/* num items logged by trans */
 } xfs_trans_header_t;
 
@@ -265,52 +265,52 @@ typedef struct xfs_trans_header {
  * must be added on to the end.
  */
 typedef struct xfs_inode_log_format {
-	__uint16_t		ilf_type;	/* inode log item type */
-	__uint16_t		ilf_size;	/* size of this item */
-	__uint32_t		ilf_fields;	/* flags for fields logged */
-	__uint16_t		ilf_asize;	/* size of attr d/ext/root */
-	__uint16_t		ilf_dsize;	/* size of data/ext/root */
-	__uint64_t		ilf_ino;	/* inode number */
+	uint16_t		ilf_type;	/* inode log item type */
+	uint16_t		ilf_size;	/* size of this item */
+	uint32_t		ilf_fields;	/* flags for fields logged */
+	uint16_t		ilf_asize;	/* size of attr d/ext/root */
+	uint16_t		ilf_dsize;	/* size of data/ext/root */
+	uint64_t		ilf_ino;	/* inode number */
 	union {
-		__uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
+		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
 		uuid_t		ilfu_uuid;	/* mount point value */
 	} ilf_u;
-	__int64_t		ilf_blkno;	/* blkno of inode buffer */
-	__int32_t		ilf_len;	/* len of inode buffer */
-	__int32_t		ilf_boffset;	/* off of inode in buffer */
+	int64_t			ilf_blkno;	/* blkno of inode buffer */
+	int32_t			ilf_len;	/* len of inode buffer */
+	int32_t			ilf_boffset;	/* off of inode in buffer */
 } xfs_inode_log_format_t;
 
 typedef struct xfs_inode_log_format_32 {
-	__uint16_t		ilf_type;	/* inode log item type */
-	__uint16_t		ilf_size;	/* size of this item */
-	__uint32_t		ilf_fields;	/* flags for fields logged */
-	__uint16_t		ilf_asize;	/* size of attr d/ext/root */
-	__uint16_t		ilf_dsize;	/* size of data/ext/root */
-	__uint64_t		ilf_ino;	/* inode number */
+	uint16_t		ilf_type;	/* inode log item type */
+	uint16_t		ilf_size;	/* size of this item */
+	uint32_t		ilf_fields;	/* flags for fields logged */
+	uint16_t		ilf_asize;	/* size of attr d/ext/root */
+	uint16_t		ilf_dsize;	/* size of data/ext/root */
+	uint64_t		ilf_ino;	/* inode number */
 	union {
-		__uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
+		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
 		uuid_t		ilfu_uuid;	/* mount point value */
 	} ilf_u;
-	__int64_t		ilf_blkno;	/* blkno of inode buffer */
-	__int32_t		ilf_len;	/* len of inode buffer */
-	__int32_t		ilf_boffset;	/* off of inode in buffer */
+	int64_t			ilf_blkno;	/* blkno of inode buffer */
+	int32_t			ilf_len;	/* len of inode buffer */
+	int32_t			ilf_boffset;	/* off of inode in buffer */
 } __attribute__((packed)) xfs_inode_log_format_32_t;
 
 typedef struct xfs_inode_log_format_64 {
-	__uint16_t		ilf_type;	/* inode log item type */
-	__uint16_t		ilf_size;	/* size of this item */
-	__uint32_t		ilf_fields;	/* flags for fields logged */
-	__uint16_t		ilf_asize;	/* size of attr d/ext/root */
-	__uint16_t		ilf_dsize;	/* size of data/ext/root */
-	__uint32_t		ilf_pad;	/* pad for 64 bit boundary */
-	__uint64_t		ilf_ino;	/* inode number */
+	uint16_t		ilf_type;	/* inode log item type */
+	uint16_t		ilf_size;	/* size of this item */
+	uint32_t		ilf_fields;	/* flags for fields logged */
+	uint16_t		ilf_asize;	/* size of attr d/ext/root */
+	uint16_t		ilf_dsize;	/* size of data/ext/root */
+	uint32_t		ilf_pad;	/* pad for 64 bit boundary */
+	uint64_t		ilf_ino;	/* inode number */
 	union {
-		__uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
+		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
 		uuid_t		ilfu_uuid;	/* mount point value */
 	} ilf_u;
-	__int64_t		ilf_blkno;	/* blkno of inode buffer */
-	__int32_t		ilf_len;	/* len of inode buffer */
-	__int32_t		ilf_boffset;	/* off of inode in buffer */
+	int64_t			ilf_blkno;	/* blkno of inode buffer */
+	int32_t			ilf_len;	/* len of inode buffer */
+	int32_t			ilf_boffset;	/* off of inode in buffer */
 } xfs_inode_log_format_64_t;
 
 
@@ -379,8 +379,8 @@ static inline int xfs_ilog_fdata(int w)
  * information.
  */
 typedef struct xfs_ictimestamp {
-	__int32_t	t_sec;		/* timestamp seconds */
-	__int32_t	t_nsec;		/* timestamp nanoseconds */
+	int32_t		t_sec;		/* timestamp seconds */
+	int32_t		t_nsec;		/* timestamp nanoseconds */
 } xfs_ictimestamp_t;
 
 /*
@@ -388,18 +388,18 @@ typedef struct xfs_ictimestamp {
  * kept identical to struct xfs_dinode except for the endianness annotations.
  */
 struct xfs_log_dinode {
-	__uint16_t	di_magic;	/* inode magic # = XFS_DINODE_MAGIC */
-	__uint16_t	di_mode;	/* mode and type of file */
-	__int8_t	di_version;	/* inode version */
-	__int8_t	di_format;	/* format of di_c data */
-	__uint8_t	di_pad3[2];	/* unused in v2/3 inodes */
-	__uint32_t	di_uid;		/* owner's user id */
-	__uint32_t	di_gid;		/* owner's group id */
-	__uint32_t	di_nlink;	/* number of links to file */
-	__uint16_t	di_projid_lo;	/* lower part of owner's project id */
-	__uint16_t	di_projid_hi;	/* higher part of owner's project id */
-	__uint8_t	di_pad[6];	/* unused, zeroed space */
-	__uint16_t	di_flushiter;	/* incremented on flush */
+	uint16_t	di_magic;	/* inode magic # = XFS_DINODE_MAGIC */
+	uint16_t	di_mode;	/* mode and type of file */
+	int8_t		di_version;	/* inode version */
+	int8_t		di_format;	/* format of di_c data */
+	uint8_t		di_pad3[2];	/* unused in v2/3 inodes */
+	uint32_t	di_uid;		/* owner's user id */
+	uint32_t	di_gid;		/* owner's group id */
+	uint32_t	di_nlink;	/* number of links to file */
+	uint16_t	di_projid_lo;	/* lower part of owner's project id */
+	uint16_t	di_projid_hi;	/* higher part of owner's project id */
+	uint8_t		di_pad[6];	/* unused, zeroed space */
+	uint16_t	di_flushiter;	/* incremented on flush */
 	xfs_ictimestamp_t di_atime;	/* time last accessed */
 	xfs_ictimestamp_t di_mtime;	/* time last modified */
 	xfs_ictimestamp_t di_ctime;	/* time created/inode modified */
@@ -408,23 +408,23 @@ struct xfs_log_dinode {
 	xfs_extlen_t	di_extsize;	/* basic/minimum extent size for file */
 	xfs_extnum_t	di_nextents;	/* number of extents in data fork */
 	xfs_aextnum_t	di_anextents;	/* number of extents in attribute fork*/
-	__uint8_t	di_forkoff;	/* attr fork offs, <<3 for 64b align */
-	__int8_t	di_aformat;	/* format of attr fork's data */
-	__uint32_t	di_dmevmask;	/* DMIG event mask */
-	__uint16_t	di_dmstate;	/* DMIG state info */
-	__uint16_t	di_flags;	/* random flags, XFS_DIFLAG_... */
-	__uint32_t	di_gen;		/* generation number */
+	uint8_t		di_forkoff;	/* attr fork offs, <<3 for 64b align */
+	int8_t		di_aformat;	/* format of attr fork's data */
+	uint32_t	di_dmevmask;	/* DMIG event mask */
+	uint16_t	di_dmstate;	/* DMIG state info */
+	uint16_t	di_flags;	/* random flags, XFS_DIFLAG_... */
+	uint32_t	di_gen;		/* generation number */
 
 	/* di_next_unlinked is the only non-core field in the old dinode */
 	xfs_agino_t	di_next_unlinked;/* agi unlinked list ptr */
 
 	/* start of the extended dinode, writable fields */
-	__uint32_t	di_crc;		/* CRC of the inode */
-	__uint64_t	di_changecount;	/* number of attribute changes */
+	uint32_t	di_crc;		/* CRC of the inode */
+	uint64_t	di_changecount;	/* number of attribute changes */
 	xfs_lsn_t	di_lsn;		/* flush sequence */
-	__uint64_t	di_flags2;	/* more random flags */
-	__uint32_t	di_cowextsize;	/* basic cow extent size for file */
-	__uint8_t	di_pad2[12];	/* more padding for future expansion */
+	uint64_t	di_flags2;	/* more random flags */
+	uint32_t	di_cowextsize;	/* basic cow extent size for file */
+	uint8_t		di_pad2[12];	/* more padding for future expansion */
 
 	/* fields only written to during inode creation */
 	xfs_ictimestamp_t di_crtime;	/* time created */
@@ -483,7 +483,7 @@ typedef struct xfs_buf_log_format {
 	unsigned short	blf_size;	/* size of this item */
 	unsigned short	blf_flags;	/* misc state */
 	unsigned short	blf_len;	/* number of blocks in this buf */
-	__int64_t	blf_blkno;	/* starting blkno of this buf */
+	int64_t		blf_blkno;	/* starting blkno of this buf */
 	unsigned int	blf_map_size;	/* used size of data bitmap in words */
 	unsigned int	blf_data_map[XFS_BLF_DATAMAP_SIZE]; /* dirty bitmap */
 } xfs_buf_log_format_t;
@@ -533,7 +533,7 @@ xfs_blft_to_flags(struct xfs_buf_log_format *blf, enum xfs_blft type)
 	blf->blf_flags |= ((type << XFS_BLFT_SHIFT) & XFS_BLFT_MASK);
 }
 
-static inline __uint16_t
+static inline uint16_t
 xfs_blft_from_flags(struct xfs_buf_log_format *blf)
 {
 	return (blf->blf_flags & XFS_BLFT_MASK) >> XFS_BLFT_SHIFT;
@@ -554,14 +554,14 @@ typedef struct xfs_extent {
  * conversion routine.
  */
 typedef struct xfs_extent_32 {
-	__uint64_t	ext_start;
-	__uint32_t	ext_len;
+	uint64_t	ext_start;
+	uint32_t	ext_len;
 } __attribute__((packed)) xfs_extent_32_t;
 
 typedef struct xfs_extent_64 {
-	__uint64_t	ext_start;
-	__uint32_t	ext_len;
-	__uint32_t	ext_pad;
+	uint64_t	ext_start;
+	uint32_t	ext_len;
+	uint32_t	ext_pad;
 } xfs_extent_64_t;
 
 /*
@@ -570,26 +570,26 @@ typedef struct xfs_extent_64 {
  * size is given by efi_nextents.
  */
 typedef struct xfs_efi_log_format {
-	__uint16_t		efi_type;	/* efi log item type */
-	__uint16_t		efi_size;	/* size of this item */
-	__uint32_t		efi_nextents;	/* # extents to free */
-	__uint64_t		efi_id;		/* efi identifier */
+	uint16_t		efi_type;	/* efi log item type */
+	uint16_t		efi_size;	/* size of this item */
+	uint32_t		efi_nextents;	/* # extents to free */
+	uint64_t		efi_id;		/* efi identifier */
 	xfs_extent_t		efi_extents[1];	/* array of extents to free */
 } xfs_efi_log_format_t;
 
 typedef struct xfs_efi_log_format_32 {
-	__uint16_t		efi_type;	/* efi log item type */
-	__uint16_t		efi_size;	/* size of this item */
-	__uint32_t		efi_nextents;	/* # extents to free */
-	__uint64_t		efi_id;		/* efi identifier */
+	uint16_t		efi_type;	/* efi log item type */
+	uint16_t		efi_size;	/* size of this item */
+	uint32_t		efi_nextents;	/* # extents to free */
+	uint64_t		efi_id;		/* efi identifier */
 	xfs_extent_32_t		efi_extents[1];	/* array of extents to free */
 } __attribute__((packed)) xfs_efi_log_format_32_t;
 
 typedef struct xfs_efi_log_format_64 {
-	__uint16_t		efi_type;	/* efi log item type */
-	__uint16_t		efi_size;	/* size of this item */
-	__uint32_t		efi_nextents;	/* # extents to free */
-	__uint64_t		efi_id;		/* efi identifier */
+	uint16_t		efi_type;	/* efi log item type */
+	uint16_t		efi_size;	/* size of this item */
+	uint32_t		efi_nextents;	/* # extents to free */
+	uint64_t		efi_id;		/* efi identifier */
 	xfs_extent_64_t		efi_extents[1];	/* array of extents to free */
 } xfs_efi_log_format_64_t;
 
@@ -599,26 +599,26 @@ typedef struct xfs_efi_log_format_64 {
  * size is given by efd_nextents;
  */
 typedef struct xfs_efd_log_format {
-	__uint16_t		efd_type;	/* efd log item type */
-	__uint16_t		efd_size;	/* size of this item */
-	__uint32_t		efd_nextents;	/* # of extents freed */
-	__uint64_t		efd_efi_id;	/* id of corresponding efi */
+	uint16_t		efd_type;	/* efd log item type */
+	uint16_t		efd_size;	/* size of this item */
+	uint32_t		efd_nextents;	/* # of extents freed */
+	uint64_t		efd_efi_id;	/* id of corresponding efi */
 	xfs_extent_t		efd_extents[1];	/* array of extents freed */
 } xfs_efd_log_format_t;
 
 typedef struct xfs_efd_log_format_32 {
-	__uint16_t		efd_type;	/* efd log item type */
-	__uint16_t		efd_size;	/* size of this item */
-	__uint32_t		efd_nextents;	/* # of extents freed */
-	__uint64_t		efd_efi_id;	/* id of corresponding efi */
+	uint16_t		efd_type;	/* efd log item type */
+	uint16_t		efd_size;	/* size of this item */
+	uint32_t		efd_nextents;	/* # of extents freed */
+	uint64_t		efd_efi_id;	/* id of corresponding efi */
 	xfs_extent_32_t		efd_extents[1];	/* array of extents freed */
 } __attribute__((packed)) xfs_efd_log_format_32_t;
 
 typedef struct xfs_efd_log_format_64 {
-	__uint16_t		efd_type;	/* efd log item type */
-	__uint16_t		efd_size;	/* size of this item */
-	__uint32_t		efd_nextents;	/* # of extents freed */
-	__uint64_t		efd_efi_id;	/* id of corresponding efi */
+	uint16_t		efd_type;	/* efd log item type */
+	uint16_t		efd_size;	/* size of this item */
+	uint32_t		efd_nextents;	/* # of extents freed */
+	uint64_t		efd_efi_id;	/* id of corresponding efi */
 	xfs_extent_64_t		efd_extents[1];	/* array of extents freed */
 } xfs_efd_log_format_64_t;
 
@@ -626,11 +626,11 @@ typedef struct xfs_efd_log_format_64 {
  * RUI/RUD (reverse mapping) log format definitions
  */
 struct xfs_map_extent {
-	__uint64_t		me_owner;
-	__uint64_t		me_startblock;
-	__uint64_t		me_startoff;
-	__uint32_t		me_len;
-	__uint32_t		me_flags;
+	uint64_t		me_owner;
+	uint64_t		me_startblock;
+	uint64_t		me_startoff;
+	uint32_t		me_len;
+	uint32_t		me_flags;
 };
 
 /* rmap me_flags: upper bits are flags, lower byte is type code */
@@ -659,10 +659,10 @@ struct xfs_map_extent {
  * size is given by rui_nextents.
  */
 struct xfs_rui_log_format {
-	__uint16_t		rui_type;	/* rui log item type */
-	__uint16_t		rui_size;	/* size of this item */
-	__uint32_t		rui_nextents;	/* # extents to free */
-	__uint64_t		rui_id;		/* rui identifier */
+	uint16_t		rui_type;	/* rui log item type */
+	uint16_t		rui_size;	/* size of this item */
+	uint32_t		rui_nextents;	/* # extents to free */
+	uint64_t		rui_id;		/* rui identifier */
 	struct xfs_map_extent	rui_extents[];	/* array of extents to rmap */
 };
 
@@ -680,19 +680,19 @@ xfs_rui_log_format_sizeof(
  * size is given by rud_nextents;
  */
 struct xfs_rud_log_format {
-	__uint16_t		rud_type;	/* rud log item type */
-	__uint16_t		rud_size;	/* size of this item */
-	__uint32_t		__pad;
-	__uint64_t		rud_rui_id;	/* id of corresponding rui */
+	uint16_t		rud_type;	/* rud log item type */
+	uint16_t		rud_size;	/* size of this item */
+	uint32_t		__pad;
+	uint64_t		rud_rui_id;	/* id of corresponding rui */
 };
 
 /*
  * CUI/CUD (refcount update) log format definitions
  */
 struct xfs_phys_extent {
-	__uint64_t		pe_startblock;
-	__uint32_t		pe_len;
-	__uint32_t		pe_flags;
+	uint64_t		pe_startblock;
+	uint32_t		pe_len;
+	uint32_t		pe_flags;
 };
 
 /* refcount pe_flags: upper bits are flags, lower byte is type code */
@@ -707,10 +707,10 @@ struct xfs_phys_extent {
  * size is given by cui_nextents.
  */
 struct xfs_cui_log_format {
-	__uint16_t		cui_type;	/* cui log item type */
-	__uint16_t		cui_size;	/* size of this item */
-	__uint32_t		cui_nextents;	/* # extents to free */
-	__uint64_t		cui_id;		/* cui identifier */
+	uint16_t		cui_type;	/* cui log item type */
+	uint16_t		cui_size;	/* size of this item */
+	uint32_t		cui_nextents;	/* # extents to free */
+	uint64_t		cui_id;		/* cui identifier */
 	struct xfs_phys_extent	cui_extents[];	/* array of extents */
 };
 
@@ -728,10 +728,10 @@ xfs_cui_log_format_sizeof(
  * size is given by cud_nextents;
  */
 struct xfs_cud_log_format {
-	__uint16_t		cud_type;	/* cud log item type */
-	__uint16_t		cud_size;	/* size of this item */
-	__uint32_t		__pad;
-	__uint64_t		cud_cui_id;	/* id of corresponding cui */
+	uint16_t		cud_type;	/* cud log item type */
+	uint16_t		cud_size;	/* size of this item */
+	uint32_t		__pad;
+	uint64_t		cud_cui_id;	/* id of corresponding cui */
 };
 
 /*
@@ -755,10 +755,10 @@ struct xfs_cud_log_format {
  * size is given by bui_nextents.
  */
 struct xfs_bui_log_format {
-	__uint16_t		bui_type;	/* bui log item type */
-	__uint16_t		bui_size;	/* size of this item */
-	__uint32_t		bui_nextents;	/* # extents to free */
-	__uint64_t		bui_id;		/* bui identifier */
+	uint16_t		bui_type;	/* bui log item type */
+	uint16_t		bui_size;	/* size of this item */
+	uint32_t		bui_nextents;	/* # extents to free */
+	uint64_t		bui_id;		/* bui identifier */
 	struct xfs_map_extent	bui_extents[];	/* array of extents to bmap */
 };
 
@@ -776,10 +776,10 @@ xfs_bui_log_format_sizeof(
  * size is given by bud_nextents;
  */
 struct xfs_bud_log_format {
-	__uint16_t		bud_type;	/* bud log item type */
-	__uint16_t		bud_size;	/* size of this item */
-	__uint32_t		__pad;
-	__uint64_t		bud_bui_id;	/* id of corresponding bui */
+	uint16_t		bud_type;	/* bud log item type */
+	uint16_t		bud_size;	/* size of this item */
+	uint32_t		__pad;
+	uint64_t		bud_bui_id;	/* id of corresponding bui */
 };
 
 /*
@@ -789,12 +789,12 @@ struct xfs_bud_log_format {
  * 32 bits : log_recovery code assumes that.
  */
 typedef struct xfs_dq_logformat {
-	__uint16_t		qlf_type;      /* dquot log item type */
-	__uint16_t		qlf_size;      /* size of this item */
+	uint16_t		qlf_type;      /* dquot log item type */
+	uint16_t		qlf_size;      /* size of this item */
 	xfs_dqid_t		qlf_id;	       /* usr/grp/proj id : 32 bits */
-	__int64_t		qlf_blkno;     /* blkno of dquot buffer */
-	__int32_t		qlf_len;       /* len of dquot buffer */
-	__uint32_t		qlf_boffset;   /* off of dquot in buffer */
+	int64_t			qlf_blkno;     /* blkno of dquot buffer */
+	int32_t			qlf_len;       /* len of dquot buffer */
+	uint32_t		qlf_boffset;   /* off of dquot in buffer */
 } xfs_dq_logformat_t;
 
 /*
@@ -853,8 +853,8 @@ typedef struct xfs_qoff_logformat {
  * decoding can be done correctly.
  */
 struct xfs_icreate_log {
-	__uint16_t	icl_type;	/* type of log format structure */
-	__uint16_t	icl_size;	/* size of log format structure */
+	uint16_t	icl_type;	/* type of log format structure */
+	uint16_t	icl_size;	/* size of log format structure */
 	__be32		icl_ag;		/* ag being allocated in */
 	__be32		icl_agbno;	/* start block of inode range */
 	__be32		icl_count;	/* number of inodes to initialise */
diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index 29a01ec..66948a9 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -26,7 +26,7 @@
 #define XLOG_RHASH_SIZE	16
 #define XLOG_RHASH_SHIFT 2
 #define XLOG_RHASH(tid)	\
-	((((__uint32_t)tid)>>XLOG_RHASH_SHIFT) & (XLOG_RHASH_SIZE-1))
+	((((uint32_t)tid)>>XLOG_RHASH_SHIFT) & (XLOG_RHASH_SIZE-1))
 
 #define XLOG_MAX_REGIONS_IN_ITEM   (XFS_MAX_BLOCKSIZE / XFS_BLF_CHUNK / 2 + 1)
 
diff --git a/fs/xfs/libxfs/xfs_quota_defs.h b/fs/xfs/libxfs/xfs_quota_defs.h
index 8eed512..d69c772 100644
--- a/fs/xfs/libxfs/xfs_quota_defs.h
+++ b/fs/xfs/libxfs/xfs_quota_defs.h
@@ -27,8 +27,8 @@
  * they may need 64-bit accounting. Hence, 64-bit quota-counters,
  * and quota-limits. This is a waste in the common case, but hey ...
  */
-typedef __uint64_t	xfs_qcnt_t;
-typedef __uint16_t	xfs_qwarncnt_t;
+typedef uint64_t	xfs_qcnt_t;
+typedef uint16_t	xfs_qwarncnt_t;
 
 /*
  * flags for q_flags field in the dquot.
diff --git a/fs/xfs/libxfs/xfs_refcount_btree.c b/fs/xfs/libxfs/xfs_refcount_btree.c
index 50add52..65c222a 100644
--- a/fs/xfs/libxfs/xfs_refcount_btree.c
+++ b/fs/xfs/libxfs/xfs_refcount_btree.c
@@ -202,7 +202,7 @@ xfs_refcountbt_init_ptr_from_cur(
 	ptr->s = agf->agf_refcount_root;
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_refcountbt_key_diff(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*key)
@@ -210,16 +210,16 @@ xfs_refcountbt_key_diff(
 	struct xfs_refcount_irec	*rec = &cur->bc_rec.rc;
 	struct xfs_refcount_key		*kp = &key->refc;
 
-	return (__int64_t)be32_to_cpu(kp->rc_startblock) - rec->rc_startblock;
+	return (int64_t)be32_to_cpu(kp->rc_startblock) - rec->rc_startblock;
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_refcountbt_diff_two_keys(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*k1,
 	union xfs_btree_key	*k2)
 {
-	return (__int64_t)be32_to_cpu(k1->refc.rc_startblock) -
+	return (int64_t)be32_to_cpu(k1->refc.rc_startblock) -
 			  be32_to_cpu(k2->refc.rc_startblock);
 }
 
diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
index 06cfb93..1bcb41f 100644
--- a/fs/xfs/libxfs/xfs_rmap.c
+++ b/fs/xfs/libxfs/xfs_rmap.c
@@ -2061,7 +2061,7 @@ int
 xfs_rmap_finish_one(
 	struct xfs_trans		*tp,
 	enum xfs_rmap_intent_type	type,
-	__uint64_t			owner,
+	uint64_t			owner,
 	int				whichfork,
 	xfs_fileoff_t			startoff,
 	xfs_fsblock_t			startblock,
@@ -2182,7 +2182,7 @@ __xfs_rmap_add(
 	struct xfs_mount		*mp,
 	struct xfs_defer_ops		*dfops,
 	enum xfs_rmap_intent_type	type,
-	__uint64_t			owner,
+	uint64_t			owner,
 	int				whichfork,
 	struct xfs_bmbt_irec		*bmap)
 {
@@ -2266,7 +2266,7 @@ xfs_rmap_alloc_extent(
 	xfs_agnumber_t		agno,
 	xfs_agblock_t		bno,
 	xfs_extlen_t		len,
-	__uint64_t		owner)
+	uint64_t		owner)
 {
 	struct xfs_bmbt_irec	bmap;
 
@@ -2290,7 +2290,7 @@ xfs_rmap_free_extent(
 	xfs_agnumber_t		agno,
 	xfs_agblock_t		bno,
 	xfs_extlen_t		len,
-	__uint64_t		owner)
+	uint64_t		owner)
 {
 	struct xfs_bmbt_irec	bmap;
 
diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h
index 98f908f..265116d 100644
--- a/fs/xfs/libxfs/xfs_rmap.h
+++ b/fs/xfs/libxfs/xfs_rmap.h
@@ -179,7 +179,7 @@ enum xfs_rmap_intent_type {
 struct xfs_rmap_intent {
 	struct list_head			ri_list;
 	enum xfs_rmap_intent_type		ri_type;
-	__uint64_t				ri_owner;
+	uint64_t				ri_owner;
 	int					ri_whichfork;
 	struct xfs_bmbt_irec			ri_bmap;
 };
@@ -196,15 +196,15 @@ int xfs_rmap_convert_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
 		struct xfs_bmbt_irec *imap);
 int xfs_rmap_alloc_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
 		xfs_agnumber_t agno, xfs_agblock_t bno, xfs_extlen_t len,
-		__uint64_t owner);
+		uint64_t owner);
 int xfs_rmap_free_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
 		xfs_agnumber_t agno, xfs_agblock_t bno, xfs_extlen_t len,
-		__uint64_t owner);
+		uint64_t owner);
 
 void xfs_rmap_finish_one_cleanup(struct xfs_trans *tp,
 		struct xfs_btree_cur *rcur, int error);
 int xfs_rmap_finish_one(struct xfs_trans *tp, enum xfs_rmap_intent_type type,
-		__uint64_t owner, int whichfork, xfs_fileoff_t startoff,
+		uint64_t owner, int whichfork, xfs_fileoff_t startoff,
 		xfs_fsblock_t startblock, xfs_filblks_t blockcount,
 		xfs_exntst_t state, struct xfs_btree_cur **pcur);
 
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
index 74e5a54..c5b4a1c8 100644
--- a/fs/xfs/libxfs/xfs_rmap_btree.c
+++ b/fs/xfs/libxfs/xfs_rmap_btree.c
@@ -199,7 +199,7 @@ xfs_rmapbt_init_high_key_from_rec(
 	union xfs_btree_key	*key,
 	union xfs_btree_rec	*rec)
 {
-	__uint64_t		off;
+	uint64_t		off;
 	int			adj;
 
 	adj = be32_to_cpu(rec->rmap.rm_blockcount) - 1;
@@ -241,7 +241,7 @@ xfs_rmapbt_init_ptr_from_cur(
 	ptr->s = agf->agf_roots[cur->bc_btnum];
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_rmapbt_key_diff(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*key)
@@ -249,9 +249,9 @@ xfs_rmapbt_key_diff(
 	struct xfs_rmap_irec	*rec = &cur->bc_rec.r;
 	struct xfs_rmap_key	*kp = &key->rmap;
 	__u64			x, y;
-	__int64_t		d;
+	int64_t			d;
 
-	d = (__int64_t)be32_to_cpu(kp->rm_startblock) - rec->rm_startblock;
+	d = (int64_t)be32_to_cpu(kp->rm_startblock) - rec->rm_startblock;
 	if (d)
 		return d;
 
@@ -271,7 +271,7 @@ xfs_rmapbt_key_diff(
 	return 0;
 }
 
-STATIC __int64_t
+STATIC int64_t
 xfs_rmapbt_diff_two_keys(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_key	*k1,
@@ -279,10 +279,10 @@ xfs_rmapbt_diff_two_keys(
 {
 	struct xfs_rmap_key	*kp1 = &k1->rmap;
 	struct xfs_rmap_key	*kp2 = &k2->rmap;
-	__int64_t		d;
+	int64_t			d;
 	__u64			x, y;
 
-	d = (__int64_t)be32_to_cpu(kp1->rm_startblock) -
+	d = (int64_t)be32_to_cpu(kp1->rm_startblock) -
 		       be32_to_cpu(kp2->rm_startblock);
 	if (d)
 		return d;
@@ -384,10 +384,10 @@ xfs_rmapbt_keys_inorder(
 	union xfs_btree_key	*k1,
 	union xfs_btree_key	*k2)
 {
-	__uint32_t		x;
-	__uint32_t		y;
-	__uint64_t		a;
-	__uint64_t		b;
+	uint32_t		x;
+	uint32_t		y;
+	uint64_t		a;
+	uint64_t		b;
 
 	x = be32_to_cpu(k1->rmap.rm_startblock);
 	y = be32_to_cpu(k2->rmap.rm_startblock);
@@ -414,10 +414,10 @@ xfs_rmapbt_recs_inorder(
 	union xfs_btree_rec	*r1,
 	union xfs_btree_rec	*r2)
 {
-	__uint32_t		x;
-	__uint32_t		y;
-	__uint64_t		a;
-	__uint64_t		b;
+	uint32_t		x;
+	uint32_t		y;
+	uint64_t		a;
+	uint64_t		b;
 
 	x = be32_to_cpu(r1->rmap.rm_startblock);
 	y = be32_to_cpu(r2->rmap.rm_startblock);
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index e47b99e..26bba7f 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -1011,7 +1011,7 @@ xfs_rtfree_extent(
 	    mp->m_sb.sb_rextents) {
 		if (!(mp->m_rbmip->i_d.di_flags & XFS_DIFLAG_NEWRTBM))
 			mp->m_rbmip->i_d.di_flags |= XFS_DIFLAG_NEWRTBM;
-		*(__uint64_t *)&VFS_I(mp->m_rbmip)->i_atime = 0;
+		*(uint64_t *)&VFS_I(mp->m_rbmip)->i_atime = 0;
 		xfs_trans_log_inode(tp, mp->m_rbmip, XFS_ILOG_CORE);
 	}
 	return 0;
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 584ec89..9b5aae2 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -448,7 +448,7 @@ xfs_sb_quota_to_disk(
 	struct xfs_dsb	*to,
 	struct xfs_sb	*from)
 {
-	__uint16_t	qflags = from->sb_qflags;
+	uint16_t	qflags = from->sb_qflags;
 
 	to->sb_uquotino = cpu_to_be64(from->sb_uquotino);
 	if (xfs_sb_version_has_pquotino(from)) {
@@ -756,7 +756,7 @@ xfs_sb_mount_common(
 	mp->m_refc_mnr[1] = mp->m_refc_mxr[1] / 2;
 
 	mp->m_bsize = XFS_FSB_TO_BB(mp, 1);
-	mp->m_ialloc_inos = (int)MAX((__uint16_t)XFS_INODES_PER_CHUNK,
+	mp->m_ialloc_inos = (int)MAX((uint16_t)XFS_INODES_PER_CHUNK,
 					sbp->sb_inopblock);
 	mp->m_ialloc_blks = mp->m_ialloc_inos >> sbp->sb_inopblog;
 
diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
index 717909f..0220159 100644
--- a/fs/xfs/libxfs/xfs_types.h
+++ b/fs/xfs/libxfs/xfs_types.h
@@ -18,34 +18,34 @@
 #ifndef __XFS_TYPES_H__
 #define	__XFS_TYPES_H__
 
-typedef __uint32_t	prid_t;		/* project ID */
+typedef uint32_t	prid_t;		/* project ID */
 
-typedef __uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
-typedef	__uint32_t	xfs_agino_t;	/* inode # within allocation grp */
-typedef	__uint32_t	xfs_extlen_t;	/* extent length in blocks */
-typedef	__uint32_t	xfs_agnumber_t;	/* allocation group number */
-typedef __int32_t	xfs_extnum_t;	/* # of extents in a file */
-typedef __int16_t	xfs_aextnum_t;	/* # extents in an attribute fork */
-typedef	__int64_t	xfs_fsize_t;	/* bytes in a file */
-typedef __uint64_t	xfs_ufsize_t;	/* unsigned bytes in a file */
+typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
+typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
+typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
+typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
+typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
+typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
+typedef int64_t		xfs_fsize_t;	/* bytes in a file */
+typedef uint64_t	xfs_ufsize_t;	/* unsigned bytes in a file */
 
-typedef	__int32_t	xfs_suminfo_t;	/* type of bitmap summary info */
-typedef	__int32_t	xfs_rtword_t;	/* word type for bitmap manipulations */
+typedef int32_t		xfs_suminfo_t;	/* type of bitmap summary info */
+typedef int32_t		xfs_rtword_t;	/* word type for bitmap manipulations */
 
-typedef	__int64_t	xfs_lsn_t;	/* log sequence number */
-typedef	__int32_t	xfs_tid_t;	/* transaction identifier */
+typedef int64_t		xfs_lsn_t;	/* log sequence number */
+typedef int32_t		xfs_tid_t;	/* transaction identifier */
 
-typedef	__uint32_t	xfs_dablk_t;	/* dir/attr block number (in file) */
-typedef	__uint32_t	xfs_dahash_t;	/* dir/attr hash value */
+typedef uint32_t	xfs_dablk_t;	/* dir/attr block number (in file) */
+typedef uint32_t	xfs_dahash_t;	/* dir/attr hash value */
 
-typedef	__uint64_t	xfs_fsblock_t;	/* blockno in filesystem (agno|agbno) */
-typedef __uint64_t	xfs_rfsblock_t;	/* blockno in filesystem (raw) */
-typedef __uint64_t	xfs_rtblock_t;	/* extent (block) in realtime area */
-typedef __uint64_t	xfs_fileoff_t;	/* block number in a file */
-typedef __uint64_t	xfs_filblks_t;	/* number of blocks in a file */
+typedef uint64_t	xfs_fsblock_t;	/* blockno in filesystem (agno|agbno) */
+typedef uint64_t	xfs_rfsblock_t;	/* blockno in filesystem (raw) */
+typedef uint64_t	xfs_rtblock_t;	/* extent (block) in realtime area */
+typedef uint64_t	xfs_fileoff_t;	/* block number in a file */
+typedef uint64_t	xfs_filblks_t;	/* number of blocks in a file */
 
-typedef	__int64_t	xfs_srtblock_t;	/* signed version of xfs_rtblock_t */
-typedef __int64_t	xfs_sfiloff_t;	/* signed block number in a file */
+typedef int64_t		xfs_srtblock_t;	/* signed version of xfs_rtblock_t */
+typedef int64_t		xfs_sfiloff_t;	/* signed block number in a file */
 
 /*
  * Null values for the types.
@@ -125,7 +125,7 @@ struct xfs_name {
  * uid_t and gid_t are hard-coded to 32 bits in the inode.
  * Hence, an 'id' in a dquot is 32 bits..
  */
-typedef __uint32_t	xfs_dqid_t;
+typedef uint32_t	xfs_dqid_t;
 
 /*
  * Constants for bit manipulations.
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 09af0f7..6bab8c4 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -836,7 +836,7 @@ xfs_writepage_map(
 	struct inode		*inode,
 	struct page		*page,
 	loff_t			offset,
-	__uint64_t              end_offset)
+	uint64_t              end_offset)
 {
 	LIST_HEAD(submit_list);
 	struct xfs_ioend	*ioend, *next;
@@ -991,7 +991,7 @@ xfs_do_writepage(
 	struct xfs_writepage_ctx *wpc = data;
 	struct inode		*inode = page->mapping->host;
 	loff_t			offset;
-	__uint64_t              end_offset;
+	uint64_t              end_offset;
 	pgoff_t                 end_index;
 
 	trace_xfs_writepage(inode, page, 0, 0);
diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index 97c45b6..9bc1e12 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -279,7 +279,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
 	if (bp == NULL) {
 		cursor->blkno = 0;
 		for (;;) {
-			__uint16_t magic;
+			uint16_t magic;
 
 			error = xfs_da3_node_read(NULL, dp,
 						      cursor->blkno, -1, &bp,
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 9e3cc21..308428d 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -389,11 +389,11 @@ xfs_getbmapx_fix_eof_hole(
 	struct getbmapx		*out,		/* output structure */
 	int			prealloced,	/* this is a file with
 						 * preallocated data space */
-	__int64_t		end,		/* last block requested */
+	int64_t			end,		/* last block requested */
 	xfs_fsblock_t		startblock,
 	bool			moretocome)
 {
-	__int64_t		fixlen;
+	int64_t			fixlen;
 	xfs_mount_t		*mp;		/* file system mount point */
 	xfs_ifork_t		*ifp;		/* inode fork pointer */
 	xfs_extnum_t		lastx;		/* last extent pointer */
@@ -514,9 +514,9 @@ xfs_getbmap(
 	xfs_bmap_format_t	formatter,	/* format to user */
 	void			*arg)		/* formatter arg */
 {
-	__int64_t		bmvend;		/* last block requested */
+	int64_t			bmvend;		/* last block requested */
 	int			error = 0;	/* return value */
-	__int64_t		fixlen;		/* length for -1 case */
+	int64_t			fixlen;		/* length for -1 case */
 	int			i;		/* extent number */
 	int			lock;		/* lock state */
 	xfs_bmbt_irec_t		*map;		/* buffer for user's data */
@@ -605,7 +605,7 @@ xfs_getbmap(
 	if (bmv->bmv_length == -1) {
 		fixlen = XFS_FSB_TO_BB(mp, XFS_B_TO_FSB(mp, fixlen));
 		bmv->bmv_length =
-			max_t(__int64_t, fixlen - bmv->bmv_offset, 0);
+			max_t(int64_t, fixlen - bmv->bmv_offset, 0);
 	} else if (bmv->bmv_length == 0) {
 		bmv->bmv_entries = 0;
 		return 0;
@@ -742,7 +742,7 @@ xfs_getbmap(
 				out[cur_ext].bmv_offset +
 				out[cur_ext].bmv_length;
 			bmv->bmv_length =
-				max_t(__int64_t, 0, bmvend - bmv->bmv_offset);
+				max_t(int64_t, 0, bmvend - bmv->bmv_offset);
 
 			/*
 			 * In case we don't want to return the hole,
@@ -1676,7 +1676,7 @@ xfs_swap_extent_rmap(
 	xfs_filblks_t			ilen;
 	xfs_filblks_t			rlen;
 	int				nimaps;
-	__uint64_t			tip_flags2;
+	uint64_t			tip_flags2;
 
 	/*
 	 * If the source file has shared blocks, we must flag the donor
@@ -1792,7 +1792,7 @@ xfs_swap_extent_forks(
 	int			aforkblks = 0;
 	int			taforkblks = 0;
 	xfs_extnum_t		nextents;
-	__uint64_t		tmp;
+	uint64_t		tmp;
 	int			error;
 
 	/*
@@ -1850,15 +1850,15 @@ xfs_swap_extent_forks(
 	/*
 	 * Fix the on-disk inode values
 	 */
-	tmp = (__uint64_t)ip->i_d.di_nblocks;
+	tmp = (uint64_t)ip->i_d.di_nblocks;
 	ip->i_d.di_nblocks = tip->i_d.di_nblocks - taforkblks + aforkblks;
 	tip->i_d.di_nblocks = tmp + taforkblks - aforkblks;
 
-	tmp = (__uint64_t) ip->i_d.di_nextents;
+	tmp = (uint64_t) ip->i_d.di_nextents;
 	ip->i_d.di_nextents = tip->i_d.di_nextents;
 	tip->i_d.di_nextents = tmp;
 
-	tmp = (__uint64_t) ip->i_d.di_format;
+	tmp = (uint64_t) ip->i_d.di_format;
 	ip->i_d.di_format = tip->i_d.di_format;
 	tip->i_d.di_format = tmp;
 
@@ -1927,7 +1927,7 @@ xfs_swap_extents(
 	int			error = 0;
 	int			lock_flags;
 	struct xfs_ifork	*cowfp;
-	__uint64_t		f;
+	uint64_t		f;
 	int			resblks;
 
 	/*
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 62fa392..e4e254b 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1180,7 +1180,7 @@ xfs_buf_ioerror_alert(
 {
 	xfs_alert(bp->b_target->bt_mount,
 "metadata I/O error: block 0x%llx (\"%s\") error %d numblks %d",
-		(__uint64_t)XFS_BUF_ADDR(bp), func, -bp->b_error, bp->b_length);
+		(uint64_t)XFS_BUF_ADDR(bp), func, -bp->b_error, bp->b_length);
 }
 
 int
diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
index 1bc7401..ede4790 100644
--- a/fs/xfs/xfs_dir2_readdir.c
+++ b/fs/xfs/xfs_dir2_readdir.c
@@ -44,7 +44,7 @@ static unsigned char xfs_dir3_filetype_table[] = {
 static unsigned char
 xfs_dir3_get_dtype(
 	struct xfs_mount	*mp,
-	__uint8_t		filetype)
+	uint8_t			filetype)
 {
 	if (!xfs_sb_version_hasftype(&mp->m_sb))
 		return DT_UNKNOWN;
@@ -117,7 +117,7 @@ xfs_dir2_sf_getdents(
 	 */
 	sfep = xfs_dir2_sf_firstentry(sfp);
 	for (i = 0; i < sfp->count; i++) {
-		__uint8_t filetype;
+		uint8_t filetype;
 
 		off = xfs_dir2_db_off_to_dataptr(geo, geo->datablk,
 				xfs_dir2_sf_get_offset(sfep));
@@ -194,7 +194,7 @@ xfs_dir2_block_getdents(
 	 * Each object is a real entry (dep) or an unused one (dup).
 	 */
 	while (ptr < endptr) {
-		__uint8_t filetype;
+		uint8_t filetype;
 
 		dup = (xfs_dir2_data_unused_t *)ptr;
 		/*
@@ -391,7 +391,7 @@ xfs_dir2_leaf_getdents(
 	 * Get more blocks and readahead as necessary.
 	 */
 	while (curoff < XFS_DIR2_LEAF_OFFSET) {
-		__uint8_t filetype;
+		uint8_t filetype;
 
 		/*
 		 * If we have no buffer, or we're off the end of the
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index 6a05d27..b2cde54 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -39,7 +39,7 @@ xfs_trim_extents(
 	xfs_daddr_t		start,
 	xfs_daddr_t		end,
 	xfs_daddr_t		minlen,
-	__uint64_t		*blocks_trimmed)
+	uint64_t		*blocks_trimmed)
 {
 	struct block_device	*bdev = mp->m_ddev_targp->bt_bdev;
 	struct xfs_btree_cur	*cur;
@@ -166,7 +166,7 @@ xfs_ioc_trim(
 	struct fstrim_range	range;
 	xfs_daddr_t		start, end, minlen;
 	xfs_agnumber_t		start_agno, end_agno, agno;
-	__uint64_t		blocks_trimmed = 0;
+	uint64_t		blocks_trimmed = 0;
 	int			error, last_error = 0;
 
 	if (!capable(CAP_SYS_ADMIN))
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 9d06cc3..e57c6cc 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -276,7 +276,7 @@ xfs_qm_init_dquot_blk(
 void
 xfs_dquot_set_prealloc_limits(struct xfs_dquot *dqp)
 {
-	__uint64_t space;
+	uint64_t space;
 
 	dqp->q_prealloc_hi_wmark = be64_to_cpu(dqp->q_core.d_blk_hardlimit);
 	dqp->q_prealloc_lo_wmark = be64_to_cpu(dqp->q_core.d_blk_softlimit);
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 6ccaae9..8f22fc5 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -602,7 +602,7 @@ xfs_growfs_data_private(
 	if (nagimax)
 		mp->m_maxagi = nagimax;
 	if (mp->m_sb.sb_imax_pct) {
-		__uint64_t icount = mp->m_sb.sb_dblocks * mp->m_sb.sb_imax_pct;
+		uint64_t icount = mp->m_sb.sb_dblocks * mp->m_sb.sb_imax_pct;
 		do_div(icount, 100);
 		mp->m_maxicount = icount << mp->m_sb.sb_inopblog;
 	} else
@@ -793,17 +793,17 @@ xfs_fs_counts(
 int
 xfs_reserve_blocks(
 	xfs_mount_t             *mp,
-	__uint64_t              *inval,
+	uint64_t              *inval,
 	xfs_fsop_resblks_t      *outval)
 {
-	__int64_t		lcounter, delta;
-	__int64_t		fdblks_delta = 0;
-	__uint64_t		request;
-	__int64_t		free;
+	int64_t			lcounter, delta;
+	int64_t			fdblks_delta = 0;
+	uint64_t		request;
+	int64_t			free;
 	int			error = 0;
 
 	/* If inval is null, report current values and return */
-	if (inval == (__uint64_t *)NULL) {
+	if (inval == (uint64_t *)NULL) {
 		if (!outval)
 			return -EINVAL;
 		outval->resblks = mp->m_resblks;
@@ -904,7 +904,7 @@ xfs_reserve_blocks(
 int
 xfs_fs_goingdown(
 	xfs_mount_t	*mp,
-	__uint32_t	inflags)
+	uint32_t	inflags)
 {
 	switch (inflags) {
 	case XFS_FSOP_GOING_FLAGS_DEFAULT: {
diff --git a/fs/xfs/xfs_fsops.h b/fs/xfs/xfs_fsops.h
index f349158..2954c13 100644
--- a/fs/xfs/xfs_fsops.h
+++ b/fs/xfs/xfs_fsops.h
@@ -22,9 +22,9 @@ extern int xfs_fs_geometry(xfs_mount_t *mp, xfs_fsop_geom_t *geo, int nversion);
 extern int xfs_growfs_data(xfs_mount_t *mp, xfs_growfs_data_t *in);
 extern int xfs_growfs_log(xfs_mount_t *mp, xfs_growfs_log_t *in);
 extern int xfs_fs_counts(xfs_mount_t *mp, xfs_fsop_counts_t *cnt);
-extern int xfs_reserve_blocks(xfs_mount_t *mp, __uint64_t *inval,
+extern int xfs_reserve_blocks(xfs_mount_t *mp, uint64_t *inval,
 				xfs_fsop_resblks_t *outval);
-extern int xfs_fs_goingdown(xfs_mount_t *mp, __uint32_t inflags);
+extern int xfs_fs_goingdown(xfs_mount_t *mp, uint32_t inflags);
 
 extern int xfs_fs_reserve_ag_blocks(struct xfs_mount *mp);
 extern int xfs_fs_unreserve_ag_blocks(struct xfs_mount *mp);
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index ec9826c..ffbfe7d 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -632,7 +632,7 @@ __xfs_iflock(
 
 STATIC uint
 _xfs_dic2xflags(
-	__uint16_t		di_flags,
+	uint16_t		di_flags,
 	uint64_t		di_flags2,
 	bool			has_attr)
 {
@@ -855,8 +855,8 @@ xfs_ialloc(
 		inode->i_version = 1;
 		ip->i_d.di_flags2 = 0;
 		ip->i_d.di_cowextsize = 0;
-		ip->i_d.di_crtime.t_sec = (__int32_t)tv.tv_sec;
-		ip->i_d.di_crtime.t_nsec = (__int32_t)tv.tv_nsec;
+		ip->i_d.di_crtime.t_sec = (int32_t)tv.tv_sec;
+		ip->i_d.di_crtime.t_nsec = (int32_t)tv.tv_nsec;
 	}
 
 
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 10e89fc..677d0bf 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -192,8 +192,8 @@ static inline void
 xfs_set_projid(struct xfs_inode *ip,
 		prid_t projid)
 {
-	ip->i_d.di_projid_hi = (__uint16_t) (projid >> 16);
-	ip->i_d.di_projid_lo = (__uint16_t) (projid & 0xffff);
+	ip->i_d.di_projid_hi = (uint16_t) (projid >> 16);
+	ip->i_d.di_projid_lo = (uint16_t) (projid & 0xffff);
 }
 
 static inline prid_t
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 6190697..c8d5523 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -444,8 +444,8 @@ xfs_attrmulti_attr_get(
 	struct inode		*inode,
 	unsigned char		*name,
 	unsigned char		__user *ubuf,
-	__uint32_t		*len,
-	__uint32_t		flags)
+	uint32_t		*len,
+	uint32_t		flags)
 {
 	unsigned char		*kbuf;
 	int			error = -EFAULT;
@@ -473,8 +473,8 @@ xfs_attrmulti_attr_set(
 	struct inode		*inode,
 	unsigned char		*name,
 	const unsigned char	__user *ubuf,
-	__uint32_t		len,
-	__uint32_t		flags)
+	uint32_t		len,
+	uint32_t		flags)
 {
 	unsigned char		*kbuf;
 	int			error;
@@ -499,7 +499,7 @@ int
 xfs_attrmulti_attr_remove(
 	struct inode		*inode,
 	unsigned char		*name,
-	__uint32_t		flags)
+	uint32_t		flags)
 {
 	int			error;
 
@@ -877,7 +877,7 @@ xfs_merge_ioc_xflags(
 
 STATIC unsigned int
 xfs_di2lxflags(
-	__uint16_t	di_flags)
+	uint16_t	di_flags)
 {
 	unsigned int	flags = 0;
 
@@ -1288,7 +1288,7 @@ xfs_ioctl_setattr_check_projid(
 	struct fsxattr		*fa)
 {
 	/* Disallow 32bit project ids if projid32bit feature is not enabled. */
-	if (fa->fsx_projid > (__uint16_t)-1 &&
+	if (fa->fsx_projid > (uint16_t)-1 &&
 	    !xfs_sb_version_hasprojid32bit(&ip->i_mount->m_sb))
 		return -EINVAL;
 
@@ -1932,7 +1932,7 @@ xfs_file_ioctl(
 
 	case XFS_IOC_SET_RESBLKS: {
 		xfs_fsop_resblks_t inout;
-		__uint64_t	   in;
+		uint64_t	   in;
 
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
@@ -2018,12 +2018,12 @@ xfs_file_ioctl(
 	}
 
 	case XFS_IOC_GOINGDOWN: {
-		__uint32_t in;
+		uint32_t in;
 
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 
-		if (get_user(in, (__uint32_t __user *)arg))
+		if (get_user(in, (uint32_t __user *)arg))
 			return -EFAULT;
 
 		return xfs_fs_goingdown(mp, in);
diff --git a/fs/xfs/xfs_ioctl.h b/fs/xfs/xfs_ioctl.h
index 8b52881..e86c3ea 100644
--- a/fs/xfs/xfs_ioctl.h
+++ b/fs/xfs/xfs_ioctl.h
@@ -48,22 +48,22 @@ xfs_attrmulti_attr_get(
 	struct inode		*inode,
 	unsigned char		*name,
 	unsigned char		__user *ubuf,
-	__uint32_t		*len,
-	__uint32_t		flags);
+	uint32_t		*len,
+	uint32_t		flags);
 
 extern int
 xfs_attrmulti_attr_set(
 	struct inode		*inode,
 	unsigned char		*name,
 	const unsigned char	__user *ubuf,
-	__uint32_t		len,
-	__uint32_t		flags);
+	uint32_t		len,
+	uint32_t		flags);
 
 extern int
 xfs_attrmulti_attr_remove(
 	struct inode		*inode,
 	unsigned char		*name,
-	__uint32_t		flags);
+	uint32_t		flags);
 
 extern struct dentry *
 xfs_handle_to_dentry(
diff --git a/fs/xfs/xfs_ioctl32.h b/fs/xfs/xfs_ioctl32.h
index b1bb454..5492bcf 100644
--- a/fs/xfs/xfs_ioctl32.h
+++ b/fs/xfs/xfs_ioctl32.h
@@ -112,9 +112,9 @@ typedef struct compat_xfs_fsop_handlereq {
 
 /* The bstat field in the swapext struct needs translation */
 typedef struct compat_xfs_swapext {
-	__int64_t		sx_version;	/* version */
-	__int64_t		sx_fdtarget;	/* fd of target file */
-	__int64_t		sx_fdtmp;	/* fd of tmp file */
+	int64_t			sx_version;	/* version */
+	int64_t			sx_fdtarget;	/* fd of target file */
+	int64_t			sx_fdtmp;	/* fd of tmp file */
 	xfs_off_t		sx_offset;	/* offset into file */
 	xfs_off_t		sx_length;	/* leng from offset */
 	char			sx_pad[16];	/* pad space, unused */
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index 044fb0e..ecdae42 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -23,14 +23,6 @@
 /*
  * Kernel specific type declarations for XFS
  */
-typedef signed char		__int8_t;
-typedef unsigned char		__uint8_t;
-typedef signed short int	__int16_t;
-typedef unsigned short int	__uint16_t;
-typedef signed int		__int32_t;
-typedef unsigned int		__uint32_t;
-typedef signed long long int	__int64_t;
-typedef unsigned long long int	__uint64_t;
 
 typedef __s64			xfs_off_t;	/* <file offset> type */
 typedef unsigned long long	xfs_ino_t;	/* <inode> type */
@@ -186,22 +178,22 @@ extern struct xstats xfsstats;
  * are converting to the init_user_ns. The uid is later mapped to a particular
  * user namespace value when crossing the kernel/user boundary.
  */
-static inline __uint32_t xfs_kuid_to_uid(kuid_t uid)
+static inline uint32_t xfs_kuid_to_uid(kuid_t uid)
 {
 	return from_kuid(&init_user_ns, uid);
 }
 
-static inline kuid_t xfs_uid_to_kuid(__uint32_t uid)
+static inline kuid_t xfs_uid_to_kuid(uint32_t uid)
 {
 	return make_kuid(&init_user_ns, uid);
 }
 
-static inline __uint32_t xfs_kgid_to_gid(kgid_t gid)
+static inline uint32_t xfs_kgid_to_gid(kgid_t gid)
 {
 	return from_kgid(&init_user_ns, gid);
 }
 
-static inline kgid_t xfs_gid_to_kgid(__uint32_t gid)
+static inline kgid_t xfs_gid_to_kgid(uint32_t gid)
 {
 	return make_kgid(&init_user_ns, gid);
 }
@@ -231,14 +223,14 @@ static inline __u32 xfs_do_mod(void *a, __u32 b, int n)
 
 #define do_mod(a, b)	xfs_do_mod(&(a), (b), sizeof(a))
 
-static inline __uint64_t roundup_64(__uint64_t x, __uint32_t y)
+static inline uint64_t roundup_64(uint64_t x, uint32_t y)
 {
 	x += y - 1;
 	do_div(x, y);
 	return x * y;
 }
 
-static inline __uint64_t howmany_64(__uint64_t x, __uint32_t y)
+static inline uint64_t howmany_64(uint64_t x, uint32_t y)
 {
 	x += y - 1;
 	do_div(x, y);
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 3731f13..12a905b 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -434,7 +434,7 @@ xfs_log_reserve(
 	int		 	unit_bytes,
 	int		 	cnt,
 	struct xlog_ticket	**ticp,
-	__uint8_t	 	client,
+	uint8_t		 	client,
 	bool			permanent)
 {
 	struct xlog		*log = mp->m_log;
@@ -825,9 +825,9 @@ xfs_log_unmount_write(xfs_mount_t *mp)
 		if (!error) {
 			/* the data section must be 32 bit size aligned */
 			struct {
-			    __uint16_t magic;
-			    __uint16_t pad1;
-			    __uint32_t pad2; /* may as well make it 64 bits */
+			    uint16_t magic;
+			    uint16_t pad1;
+			    uint32_t pad2; /* may as well make it 64 bits */
 			} magic = {
 				.magic = XLOG_UNMOUNT_TYPE,
 			};
@@ -1665,7 +1665,7 @@ xlog_cksum(
 	char			*dp,
 	int			size)
 {
-	__uint32_t		crc;
+	uint32_t		crc;
 
 	/* first generate the crc for the record header ... */
 	crc = xfs_start_cksum_update((char *)rhead,
@@ -1828,7 +1828,7 @@ xlog_sync(
 		 */
 		dptr = (char *)&iclog->ic_header + count;
 		for (i = 0; i < split; i += BBSIZE) {
-			__uint32_t cycle = be32_to_cpu(*(__be32 *)dptr);
+			uint32_t cycle = be32_to_cpu(*(__be32 *)dptr);
 			if (++cycle == XLOG_HEADER_MAGIC_NUM)
 				cycle++;
 			*(__be32 *)dptr = cpu_to_be32(cycle);
@@ -2363,8 +2363,8 @@ xlog_write(
 			}
 
 			reg = &vecp[index];
-			ASSERT(reg->i_len % sizeof(__int32_t) == 0);
-			ASSERT((unsigned long)ptr % sizeof(__int32_t) == 0);
+			ASSERT(reg->i_len % sizeof(int32_t) == 0);
+			ASSERT((unsigned long)ptr % sizeof(int32_t) == 0);
 
 			start_rec_copy = xlog_write_start_rec(ptr, ticket);
 			if (start_rec_copy) {
@@ -3143,7 +3143,7 @@ xlog_state_switch_iclogs(
 	/* Round up to next log-sunit */
 	if (xfs_sb_version_haslogv2(&log->l_mp->m_sb) &&
 	    log->l_mp->m_sb.sb_logsunit > 1) {
-		__uint32_t sunit_bb = BTOBB(log->l_mp->m_sb.sb_logsunit);
+		uint32_t sunit_bb = BTOBB(log->l_mp->m_sb.sb_logsunit);
 		log->l_curr_block = roundup(log->l_curr_block, sunit_bb);
 	}
 
@@ -3771,7 +3771,7 @@ xlog_verify_iclog(
 	xlog_in_core_2_t	*xhdr;
 	void			*base_ptr, *ptr, *p;
 	ptrdiff_t		field_offset;
-	__uint8_t		clientid;
+	uint8_t			clientid;
 	int			len, i, j, k, op_len;
 	int			idx;
 
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index cc5a9f1..bf21277 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -159,7 +159,7 @@ int	  xfs_log_reserve(struct xfs_mount *mp,
 			  int		   length,
 			  int		   count,
 			  struct xlog_ticket **ticket,
-			  __uint8_t	   clientid,
+			  uint8_t		   clientid,
 			  bool		   permanent);
 int	  xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
 void      xfs_log_unmount(struct xfs_mount *mp);
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index c2604a5..1accc2c 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -419,7 +419,7 @@ struct xlog {
 };
 
 #define XLOG_BUF_CANCEL_BUCKET(log, blkno) \
-	((log)->l_buf_cancel_table + ((__uint64_t)blkno % XLOG_BC_TABLE_SIZE))
+	((log)->l_buf_cancel_table + ((uint64_t)blkno % XLOG_BC_TABLE_SIZE))
 
 #define XLOG_FORCED_SHUTDOWN(log)	((log)->l_flags & XLOG_IO_ERROR)
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index cd0b077..e19b20c 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2230,9 +2230,9 @@ xlog_recover_get_buf_lsn(
 	struct xfs_mount	*mp,
 	struct xfs_buf		*bp)
 {
-	__uint32_t		magic32;
-	__uint16_t		magic16;
-	__uint16_t		magicda;
+	uint32_t		magic32;
+	uint16_t		magic16;
+	uint16_t		magicda;
 	void			*blk = bp->b_addr;
 	uuid_t			*uuid;
 	xfs_lsn_t		lsn = -1;
@@ -2381,9 +2381,9 @@ xlog_recover_validate_buf_type(
 	xfs_lsn_t		current_lsn)
 {
 	struct xfs_da_blkinfo	*info = bp->b_addr;
-	__uint32_t		magic32;
-	__uint16_t		magic16;
-	__uint16_t		magicda;
+	uint32_t		magic32;
+	uint16_t		magic16;
+	uint16_t		magicda;
 	char			*warnmsg = NULL;
 
 	/*
@@ -2852,7 +2852,7 @@ xlog_recover_buffer_pass2(
 	if (XFS_DINODE_MAGIC ==
 	    be16_to_cpu(*((__be16 *)xfs_buf_offset(bp, 0))) &&
 	    (BBTOB(bp->b_io_length) != MAX(log->l_mp->m_sb.sb_blocksize,
-			(__uint32_t)log->l_mp->m_inode_cluster_size))) {
+			(uint32_t)log->l_mp->m_inode_cluster_size))) {
 		xfs_buf_stale(bp);
 		error = xfs_bwrite(bp);
 	} else {
@@ -3423,7 +3423,7 @@ xlog_recover_efd_pass2(
 	xfs_efd_log_format_t	*efd_formatp;
 	xfs_efi_log_item_t	*efip = NULL;
 	xfs_log_item_t		*lip;
-	__uint64_t		efi_id;
+	uint64_t		efi_id;
 	struct xfs_ail_cursor	cur;
 	struct xfs_ail		*ailp = log->l_ailp;
 
@@ -3519,7 +3519,7 @@ xlog_recover_rud_pass2(
 	struct xfs_rud_log_format	*rud_formatp;
 	struct xfs_rui_log_item		*ruip = NULL;
 	struct xfs_log_item		*lip;
-	__uint64_t			rui_id;
+	uint64_t			rui_id;
 	struct xfs_ail_cursor		cur;
 	struct xfs_ail			*ailp = log->l_ailp;
 
@@ -3635,7 +3635,7 @@ xlog_recover_cud_pass2(
 	struct xfs_cud_log_format	*cud_formatp;
 	struct xfs_cui_log_item		*cuip = NULL;
 	struct xfs_log_item		*lip;
-	__uint64_t			cui_id;
+	uint64_t			cui_id;
 	struct xfs_ail_cursor		cur;
 	struct xfs_ail			*ailp = log->l_ailp;
 
@@ -3754,7 +3754,7 @@ xlog_recover_bud_pass2(
 	struct xfs_bud_log_format	*bud_formatp;
 	struct xfs_bui_log_item		*buip = NULL;
 	struct xfs_log_item		*lip;
-	__uint64_t			bui_id;
+	uint64_t			bui_id;
 	struct xfs_ail_cursor		cur;
 	struct xfs_ail			*ailp = log->l_ailp;
 
@@ -5772,9 +5772,9 @@ xlog_recover_check_summary(
 	xfs_buf_t	*agfbp;
 	xfs_buf_t	*agibp;
 	xfs_agnumber_t	agno;
-	__uint64_t	freeblks;
-	__uint64_t	itotal;
-	__uint64_t	ifree;
+	uint64_t	freeblks;
+	uint64_t	itotal;
+	uint64_t	ifree;
 	int		error;
 
 	mp = log->l_mp;
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 2eaf818..cc6789d 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -174,7 +174,7 @@ xfs_free_perag(
 int
 xfs_sb_validate_fsb_count(
 	xfs_sb_t	*sbp,
-	__uint64_t	nblocks)
+	uint64_t	nblocks)
 {
 	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
@@ -436,7 +436,7 @@ STATIC void
 xfs_set_maxicount(xfs_mount_t *mp)
 {
 	xfs_sb_t	*sbp = &(mp->m_sb);
-	__uint64_t	icount;
+	uint64_t	icount;
 
 	if (sbp->sb_imax_pct) {
 		/*
@@ -502,7 +502,7 @@ xfs_set_low_space_thresholds(
 	int i;
 
 	for (i = 0; i < XFS_LOWSP_MAX; i++) {
-		__uint64_t space = mp->m_sb.sb_dblocks;
+		uint64_t space = mp->m_sb.sb_dblocks;
 
 		do_div(space, 100);
 		mp->m_low_space[i] = space * (i + 1);
@@ -598,10 +598,10 @@ xfs_mount_reset_sbqflags(
 	return xfs_sync_sb(mp, false);
 }
 
-__uint64_t
+uint64_t
 xfs_default_resblks(xfs_mount_t *mp)
 {
-	__uint64_t resblks;
+	uint64_t resblks;
 
 	/*
 	 * We default to 5% or 8192 fsbs of space reserved, whichever is
@@ -612,7 +612,7 @@ xfs_default_resblks(xfs_mount_t *mp)
 	 */
 	resblks = mp->m_sb.sb_dblocks;
 	do_div(resblks, 20);
-	resblks = min_t(__uint64_t, resblks, 8192);
+	resblks = min_t(uint64_t, resblks, 8192);
 	return resblks;
 }
 
@@ -632,7 +632,7 @@ xfs_mountfs(
 {
 	struct xfs_sb		*sbp = &(mp->m_sb);
 	struct xfs_inode	*rip;
-	__uint64_t		resblks;
+	uint64_t		resblks;
 	uint			quotamount = 0;
 	uint			quotaflags = 0;
 	int			error = 0;
@@ -1060,7 +1060,7 @@ void
 xfs_unmountfs(
 	struct xfs_mount	*mp)
 {
-	__uint64_t		resblks;
+	uint64_t		resblks;
 	int			error;
 
 	cancel_delayed_work_sync(&mp->m_eofblocks_work);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 9fa312a..305d953 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -108,10 +108,10 @@ typedef struct xfs_mount {
 	xfs_buftarg_t		*m_ddev_targp;	/* saves taking the address */
 	xfs_buftarg_t		*m_logdev_targp;/* ptr to log device */
 	xfs_buftarg_t		*m_rtdev_targp;	/* ptr to rt device */
-	__uint8_t		m_blkbit_log;	/* blocklog + NBBY */
-	__uint8_t		m_blkbb_log;	/* blocklog - BBSHIFT */
-	__uint8_t		m_agno_log;	/* log #ag's */
-	__uint8_t		m_agino_log;	/* #bits for agino in inum */
+	uint8_t			m_blkbit_log;	/* blocklog + NBBY */
+	uint8_t			m_blkbb_log;	/* blocklog - BBSHIFT */
+	uint8_t			m_agno_log;	/* log #ag's */
+	uint8_t			m_agino_log;	/* #bits for agino in inum */
 	uint			m_inode_cluster_size;/* min inode buf size */
 	uint			m_blockmask;	/* sb_blocksize-1 */
 	uint			m_blockwsize;	/* sb_blocksize in words */
@@ -139,7 +139,7 @@ typedef struct xfs_mount {
 	struct mutex		m_growlock;	/* growfs mutex */
 	int			m_fixedfsid[2];	/* unchanged for life of FS */
 	uint			m_dmevmask;	/* DMI events for this FS */
-	__uint64_t		m_flags;	/* global mount flags */
+	uint64_t		m_flags;	/* global mount flags */
 	bool			m_inotbt_nores; /* no per-AG finobt resv. */
 	int			m_ialloc_inos;	/* inodes in inode allocation */
 	int			m_ialloc_blks;	/* blocks in inode allocation */
@@ -148,14 +148,14 @@ typedef struct xfs_mount {
 	int			m_inoalign_mask;/* mask sb_inoalignmt if used */
 	uint			m_qflags;	/* quota status flags */
 	struct xfs_trans_resv	m_resv;		/* precomputed res values */
-	__uint64_t		m_maxicount;	/* maximum inode count */
-	__uint64_t		m_resblks;	/* total reserved blocks */
-	__uint64_t		m_resblks_avail;/* available reserved blocks */
-	__uint64_t		m_resblks_save;	/* reserved blks @ remount,ro */
+	uint64_t		m_maxicount;	/* maximum inode count */
+	uint64_t		m_resblks;	/* total reserved blocks */
+	uint64_t		m_resblks_avail;/* available reserved blocks */
+	uint64_t		m_resblks_save;	/* reserved blks @ remount,ro */
 	int			m_dalign;	/* stripe unit */
 	int			m_swidth;	/* stripe width */
 	int			m_sinoalign;	/* stripe unit inode alignment */
-	__uint8_t		m_sectbb_log;	/* sectlog - BBSHIFT */
+	uint8_t			m_sectbb_log;	/* sectlog - BBSHIFT */
 	const struct xfs_nameops *m_dirnameops;	/* vector of dir name ops */
 	const struct xfs_dir_ops *m_dir_inode_ops; /* vector of dir inode ops */
 	const struct xfs_dir_ops *m_nondir_inode_ops; /* !dir inode ops */
@@ -194,7 +194,7 @@ typedef struct xfs_mount {
 	 * ever support shrinks it would have to be persisted in addition
 	 * to various other kinds of pain inflicted on the pNFS server.
 	 */
-	__uint32_t		m_generation;
+	uint32_t		m_generation;
 
 	bool			m_fail_unmount;
 #ifdef DEBUG
@@ -367,12 +367,12 @@ typedef struct xfs_perag {
 	char		pagi_init;	/* this agi's entry is initialized */
 	char		pagf_metadata;	/* the agf is preferred to be metadata */
 	char		pagi_inodeok;	/* The agi is ok for inodes */
-	__uint8_t	pagf_levels[XFS_BTNUM_AGF];
+	uint8_t		pagf_levels[XFS_BTNUM_AGF];
 					/* # of levels in bno & cnt btree */
-	__uint32_t	pagf_flcount;	/* count of blocks in freelist */
+	uint32_t	pagf_flcount;	/* count of blocks in freelist */
 	xfs_extlen_t	pagf_freeblks;	/* total free blocks */
 	xfs_extlen_t	pagf_longest;	/* longest free space */
-	__uint32_t	pagf_btreeblks;	/* # of blocks held in AGF btrees */
+	uint32_t	pagf_btreeblks;	/* # of blocks held in AGF btrees */
 	xfs_agino_t	pagi_freecount;	/* number of free inodes */
 	xfs_agino_t	pagi_count;	/* number of allocated inodes */
 
@@ -411,7 +411,7 @@ typedef struct xfs_perag {
 	struct xfs_ag_resv	pag_agfl_resv;
 
 	/* reference count */
-	__uint8_t		pagf_refcount_level;
+	uint8_t			pagf_refcount_level;
 } xfs_perag_t;
 
 static inline struct xfs_ag_resv *
@@ -434,7 +434,7 @@ void xfs_buf_hash_destroy(xfs_perag_t *pag);
 
 extern void	xfs_uuid_table_free(void);
 extern int	xfs_log_sbcount(xfs_mount_t *);
-extern __uint64_t xfs_default_resblks(xfs_mount_t *mp);
+extern uint64_t xfs_default_resblks(xfs_mount_t *mp);
 extern int	xfs_mountfs(xfs_mount_t *mp);
 extern int	xfs_initialize_perag(xfs_mount_t *mp, xfs_agnumber_t agcount,
 				     xfs_agnumber_t *maxagi);
@@ -450,7 +450,7 @@ extern struct xfs_buf *xfs_getsb(xfs_mount_t *, int);
 extern int	xfs_readsb(xfs_mount_t *, int);
 extern void	xfs_freesb(xfs_mount_t *);
 extern bool	xfs_fs_writable(struct xfs_mount *mp, int level);
-extern int	xfs_sb_validate_fsb_count(struct xfs_sb *, __uint64_t);
+extern int	xfs_sb_validate_fsb_count(struct xfs_sb *, uint64_t);
 
 extern int	xfs_dev_is_read_only(struct xfs_mount *, char *);
 
diff --git a/fs/xfs/xfs_qm_bhv.c b/fs/xfs/xfs_qm_bhv.c
index 3e52d5d..2be6d27 100644
--- a/fs/xfs/xfs_qm_bhv.c
+++ b/fs/xfs/xfs_qm_bhv.c
@@ -33,7 +33,7 @@ xfs_fill_statvfs_from_dquot(
 	struct kstatfs		*statp,
 	struct xfs_dquot	*dqp)
 {
-	__uint64_t		limit;
+	uint64_t		limit;
 
 	limit = dqp->q_core.d_blk_softlimit ?
 		be64_to_cpu(dqp->q_core.d_blk_softlimit) :
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index c57aa7f..9147219 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1256,13 +1256,13 @@ xfs_rtpick_extent(
 {
 	xfs_rtblock_t	b;		/* result block */
 	int		log2;		/* log of sequence number */
-	__uint64_t	resid;		/* residual after log removed */
-	__uint64_t	seq;		/* sequence number of file creation */
-	__uint64_t	*seqp;		/* pointer to seqno in inode */
+	uint64_t	resid;		/* residual after log removed */
+	uint64_t	seq;		/* sequence number of file creation */
+	uint64_t	*seqp;		/* pointer to seqno in inode */
 
 	ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
 
-	seqp = (__uint64_t *)&VFS_I(mp->m_rbmip)->i_atime;
+	seqp = (uint64_t *)&VFS_I(mp->m_rbmip)->i_atime;
 	if (!(mp->m_rbmip->i_d.di_flags & XFS_DIFLAG_NEWRTBM)) {
 		mp->m_rbmip->i_d.di_flags |= XFS_DIFLAG_NEWRTBM;
 		*seqp = 0;
diff --git a/fs/xfs/xfs_stats.c b/fs/xfs/xfs_stats.c
index f11282c..056e12b 100644
--- a/fs/xfs/xfs_stats.c
+++ b/fs/xfs/xfs_stats.c
@@ -33,9 +33,9 @@ int xfs_stats_format(struct xfsstats __percpu *stats, char *buf)
 {
 	int		i, j;
 	int		len = 0;
-	__uint64_t	xs_xstrat_bytes = 0;
-	__uint64_t	xs_write_bytes = 0;
-	__uint64_t	xs_read_bytes = 0;
+	uint64_t	xs_xstrat_bytes = 0;
+	uint64_t	xs_write_bytes = 0;
+	uint64_t	xs_read_bytes = 0;
 
 	static const struct xstats_entry {
 		char	*desc;
@@ -100,7 +100,7 @@ int xfs_stats_format(struct xfsstats __percpu *stats, char *buf)
 void xfs_stats_clearall(struct xfsstats __percpu *stats)
 {
 	int		c;
-	__uint32_t	vn_active;
+	uint32_t	vn_active;
 
 	xfs_notice(NULL, "Clearing xfsstats");
 	for_each_possible_cpu(c) {
diff --git a/fs/xfs/xfs_stats.h b/fs/xfs/xfs_stats.h
index 375840f..f64d0ae 100644
--- a/fs/xfs/xfs_stats.h
+++ b/fs/xfs/xfs_stats.h
@@ -54,125 +54,125 @@ enum {
  */
 struct __xfsstats {
 # define XFSSTAT_END_EXTENT_ALLOC	4
-	__uint32_t		xs_allocx;
-	__uint32_t		xs_allocb;
-	__uint32_t		xs_freex;
-	__uint32_t		xs_freeb;
+	uint32_t		xs_allocx;
+	uint32_t		xs_allocb;
+	uint32_t		xs_freex;
+	uint32_t		xs_freeb;
 # define XFSSTAT_END_ALLOC_BTREE	(XFSSTAT_END_EXTENT_ALLOC+4)
-	__uint32_t		xs_abt_lookup;
-	__uint32_t		xs_abt_compare;
-	__uint32_t		xs_abt_insrec;
-	__uint32_t		xs_abt_delrec;
+	uint32_t		xs_abt_lookup;
+	uint32_t		xs_abt_compare;
+	uint32_t		xs_abt_insrec;
+	uint32_t		xs_abt_delrec;
 # define XFSSTAT_END_BLOCK_MAPPING	(XFSSTAT_END_ALLOC_BTREE+7)
-	__uint32_t		xs_blk_mapr;
-	__uint32_t		xs_blk_mapw;
-	__uint32_t		xs_blk_unmap;
-	__uint32_t		xs_add_exlist;
-	__uint32_t		xs_del_exlist;
-	__uint32_t		xs_look_exlist;
-	__uint32_t		xs_cmp_exlist;
+	uint32_t		xs_blk_mapr;
+	uint32_t		xs_blk_mapw;
+	uint32_t		xs_blk_unmap;
+	uint32_t		xs_add_exlist;
+	uint32_t		xs_del_exlist;
+	uint32_t		xs_look_exlist;
+	uint32_t		xs_cmp_exlist;
 # define XFSSTAT_END_BLOCK_MAP_BTREE	(XFSSTAT_END_BLOCK_MAPPING+4)
-	__uint32_t		xs_bmbt_lookup;
-	__uint32_t		xs_bmbt_compare;
-	__uint32_t		xs_bmbt_insrec;
-	__uint32_t		xs_bmbt_delrec;
+	uint32_t		xs_bmbt_lookup;
+	uint32_t		xs_bmbt_compare;
+	uint32_t		xs_bmbt_insrec;
+	uint32_t		xs_bmbt_delrec;
 # define XFSSTAT_END_DIRECTORY_OPS	(XFSSTAT_END_BLOCK_MAP_BTREE+4)
-	__uint32_t		xs_dir_lookup;
-	__uint32_t		xs_dir_create;
-	__uint32_t		xs_dir_remove;
-	__uint32_t		xs_dir_getdents;
+	uint32_t		xs_dir_lookup;
+	uint32_t		xs_dir_create;
+	uint32_t		xs_dir_remove;
+	uint32_t		xs_dir_getdents;
 # define XFSSTAT_END_TRANSACTIONS	(XFSSTAT_END_DIRECTORY_OPS+3)
-	__uint32_t		xs_trans_sync;
-	__uint32_t		xs_trans_async;
-	__uint32_t		xs_trans_empty;
+	uint32_t		xs_trans_sync;
+	uint32_t		xs_trans_async;
+	uint32_t		xs_trans_empty;
 # define XFSSTAT_END_INODE_OPS		(XFSSTAT_END_TRANSACTIONS+7)
-	__uint32_t		xs_ig_attempts;
-	__uint32_t		xs_ig_found;
-	__uint32_t		xs_ig_frecycle;
-	__uint32_t		xs_ig_missed;
-	__uint32_t		xs_ig_dup;
-	__uint32_t		xs_ig_reclaims;
-	__uint32_t		xs_ig_attrchg;
+	uint32_t		xs_ig_attempts;
+	uint32_t		xs_ig_found;
+	uint32_t		xs_ig_frecycle;
+	uint32_t		xs_ig_missed;
+	uint32_t		xs_ig_dup;
+	uint32_t		xs_ig_reclaims;
+	uint32_t		xs_ig_attrchg;
 # define XFSSTAT_END_LOG_OPS		(XFSSTAT_END_INODE_OPS+5)
-	__uint32_t		xs_log_writes;
-	__uint32_t		xs_log_blocks;
-	__uint32_t		xs_log_noiclogs;
-	__uint32_t		xs_log_force;
-	__uint32_t		xs_log_force_sleep;
+	uint32_t		xs_log_writes;
+	uint32_t		xs_log_blocks;
+	uint32_t		xs_log_noiclogs;
+	uint32_t		xs_log_force;
+	uint32_t		xs_log_force_sleep;
 # define XFSSTAT_END_TAIL_PUSHING	(XFSSTAT_END_LOG_OPS+10)
-	__uint32_t		xs_try_logspace;
-	__uint32_t		xs_sleep_logspace;
-	__uint32_t		xs_push_ail;
-	__uint32_t		xs_push_ail_success;
-	__uint32_t		xs_push_ail_pushbuf;
-	__uint32_t		xs_push_ail_pinned;
-	__uint32_t		xs_push_ail_locked;
-	__uint32_t		xs_push_ail_flushing;
-	__uint32_t		xs_push_ail_restarts;
-	__uint32_t		xs_push_ail_flush;
+	uint32_t		xs_try_logspace;
+	uint32_t		xs_sleep_logspace;
+	uint32_t		xs_push_ail;
+	uint32_t		xs_push_ail_success;
+	uint32_t		xs_push_ail_pushbuf;
+	uint32_t		xs_push_ail_pinned;
+	uint32_t		xs_push_ail_locked;
+	uint32_t		xs_push_ail_flushing;
+	uint32_t		xs_push_ail_restarts;
+	uint32_t		xs_push_ail_flush;
 # define XFSSTAT_END_WRITE_CONVERT	(XFSSTAT_END_TAIL_PUSHING+2)
-	__uint32_t		xs_xstrat_quick;
-	__uint32_t		xs_xstrat_split;
+	uint32_t		xs_xstrat_quick;
+	uint32_t		xs_xstrat_split;
 # define XFSSTAT_END_READ_WRITE_OPS	(XFSSTAT_END_WRITE_CONVERT+2)
-	__uint32_t		xs_write_calls;
-	__uint32_t		xs_read_calls;
+	uint32_t		xs_write_calls;
+	uint32_t		xs_read_calls;
 # define XFSSTAT_END_ATTRIBUTE_OPS	(XFSSTAT_END_READ_WRITE_OPS+4)
-	__uint32_t		xs_attr_get;
-	__uint32_t		xs_attr_set;
-	__uint32_t		xs_attr_remove;
-	__uint32_t		xs_attr_list;
+	uint32_t		xs_attr_get;
+	uint32_t		xs_attr_set;
+	uint32_t		xs_attr_remove;
+	uint32_t		xs_attr_list;
 # define XFSSTAT_END_INODE_CLUSTER	(XFSSTAT_END_ATTRIBUTE_OPS+3)
-	__uint32_t		xs_iflush_count;
-	__uint32_t		xs_icluster_flushcnt;
-	__uint32_t		xs_icluster_flushinode;
+	uint32_t		xs_iflush_count;
+	uint32_t		xs_icluster_flushcnt;
+	uint32_t		xs_icluster_flushinode;
 # define XFSSTAT_END_VNODE_OPS		(XFSSTAT_END_INODE_CLUSTER+8)
-	__uint32_t		vn_active;	/* # vnodes not on free lists */
-	__uint32_t		vn_alloc;	/* # times vn_alloc called */
-	__uint32_t		vn_get;		/* # times vn_get called */
-	__uint32_t		vn_hold;	/* # times vn_hold called */
-	__uint32_t		vn_rele;	/* # times vn_rele called */
-	__uint32_t		vn_reclaim;	/* # times vn_reclaim called */
-	__uint32_t		vn_remove;	/* # times vn_remove called */
-	__uint32_t		vn_free;	/* # times vn_free called */
+	uint32_t		vn_active;	/* # vnodes not on free lists */
+	uint32_t		vn_alloc;	/* # times vn_alloc called */
+	uint32_t		vn_get;		/* # times vn_get called */
+	uint32_t		vn_hold;	/* # times vn_hold called */
+	uint32_t		vn_rele;	/* # times vn_rele called */
+	uint32_t		vn_reclaim;	/* # times vn_reclaim called */
+	uint32_t		vn_remove;	/* # times vn_remove called */
+	uint32_t		vn_free;	/* # times vn_free called */
 #define XFSSTAT_END_BUF			(XFSSTAT_END_VNODE_OPS+9)
-	__uint32_t		xb_get;
-	__uint32_t		xb_create;
-	__uint32_t		xb_get_locked;
-	__uint32_t		xb_get_locked_waited;
-	__uint32_t		xb_busy_locked;
-	__uint32_t		xb_miss_locked;
-	__uint32_t		xb_page_retries;
-	__uint32_t		xb_page_found;
-	__uint32_t		xb_get_read;
+	uint32_t		xb_get;
+	uint32_t		xb_create;
+	uint32_t		xb_get_locked;
+	uint32_t		xb_get_locked_waited;
+	uint32_t		xb_busy_locked;
+	uint32_t		xb_miss_locked;
+	uint32_t		xb_page_retries;
+	uint32_t		xb_page_found;
+	uint32_t		xb_get_read;
 /* Version 2 btree counters */
 #define XFSSTAT_END_ABTB_V2		(XFSSTAT_END_BUF + __XBTS_MAX)
-	__uint32_t		xs_abtb_2[__XBTS_MAX];
+	uint32_t		xs_abtb_2[__XBTS_MAX];
 #define XFSSTAT_END_ABTC_V2		(XFSSTAT_END_ABTB_V2 + __XBTS_MAX)
-	__uint32_t		xs_abtc_2[__XBTS_MAX];
+	uint32_t		xs_abtc_2[__XBTS_MAX];
 #define XFSSTAT_END_BMBT_V2		(XFSSTAT_END_ABTC_V2 + __XBTS_MAX)
-	__uint32_t		xs_bmbt_2[__XBTS_MAX];
+	uint32_t		xs_bmbt_2[__XBTS_MAX];
 #define XFSSTAT_END_IBT_V2		(XFSSTAT_END_BMBT_V2 + __XBTS_MAX)
-	__uint32_t		xs_ibt_2[__XBTS_MAX];
+	uint32_t		xs_ibt_2[__XBTS_MAX];
 #define XFSSTAT_END_FIBT_V2		(XFSSTAT_END_IBT_V2 + __XBTS_MAX)
-	__uint32_t		xs_fibt_2[__XBTS_MAX];
+	uint32_t		xs_fibt_2[__XBTS_MAX];
 #define XFSSTAT_END_RMAP_V2		(XFSSTAT_END_FIBT_V2 + __XBTS_MAX)
-	__uint32_t		xs_rmap_2[__XBTS_MAX];
+	uint32_t		xs_rmap_2[__XBTS_MAX];
 #define XFSSTAT_END_REFCOUNT		(XFSSTAT_END_RMAP_V2 + __XBTS_MAX)
-	__uint32_t		xs_refcbt_2[__XBTS_MAX];
+	uint32_t		xs_refcbt_2[__XBTS_MAX];
 #define XFSSTAT_END_XQMSTAT		(XFSSTAT_END_REFCOUNT + 6)
-	__uint32_t		xs_qm_dqreclaims;
-	__uint32_t		xs_qm_dqreclaim_misses;
-	__uint32_t		xs_qm_dquot_dups;
-	__uint32_t		xs_qm_dqcachemisses;
-	__uint32_t		xs_qm_dqcachehits;
-	__uint32_t		xs_qm_dqwants;
+	uint32_t		xs_qm_dqreclaims;
+	uint32_t		xs_qm_dqreclaim_misses;
+	uint32_t		xs_qm_dquot_dups;
+	uint32_t		xs_qm_dqcachemisses;
+	uint32_t		xs_qm_dqcachehits;
+	uint32_t		xs_qm_dqwants;
 #define XFSSTAT_END_QM			(XFSSTAT_END_XQMSTAT+2)
-	__uint32_t		xs_qm_dquot;
-	__uint32_t		xs_qm_dquot_unused;
+	uint32_t		xs_qm_dquot;
+	uint32_t		xs_qm_dquot_unused;
 /* Extra precision counters */
-	__uint64_t		xs_xstrat_bytes;
-	__uint64_t		xs_write_bytes;
-	__uint64_t		xs_read_bytes;
+	uint64_t		xs_xstrat_bytes;
+	uint64_t		xs_write_bytes;
+	uint64_t		xs_read_bytes;
 };
 
 struct xfsstats {
@@ -186,7 +186,7 @@ struct xfsstats {
  * simple wrapper for getting the array index of s struct member offset
  */
 #define XFS_STATS_CALC_INDEX(member)	\
-	(offsetof(struct __xfsstats, member) / (int)sizeof(__uint32_t))
+	(offsetof(struct __xfsstats, member) / (int)sizeof(uint32_t))
 
 
 int xfs_stats_format(struct xfsstats __percpu *stats, char *buf);
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 455a575..a19aab8 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -196,7 +196,7 @@ xfs_parseargs(
 	int			dsunit = 0;
 	int			dswidth = 0;
 	int			iosize = 0;
-	__uint8_t		iosizelog = 0;
+	uint8_t			iosizelog = 0;
 
 	/*
 	 * set up the mount name first so all the errors will refer to the
@@ -556,7 +556,7 @@ xfs_showargs(
 
 	return 0;
 }
-static __uint64_t
+static uint64_t
 xfs_max_file_offset(
 	unsigned int		blockshift)
 {
@@ -587,7 +587,7 @@ xfs_max_file_offset(
 # endif
 #endif
 
-	return (((__uint64_t)pagefactor) << bitshift) - 1;
+	return (((uint64_t)pagefactor) << bitshift) - 1;
 }
 
 /*
@@ -622,7 +622,7 @@ xfs_set_inode_alloc(
 	 * the max inode percentage.  Used only for inode32.
 	 */
 	if (mp->m_maxicount) {
-		__uint64_t	icount;
+		uint64_t	icount;
 
 		icount = sbp->sb_dblocks * sbp->sb_imax_pct;
 		do_div(icount, 100);
@@ -1088,12 +1088,12 @@ xfs_fs_statfs(
 	struct xfs_mount	*mp = XFS_M(dentry->d_sb);
 	xfs_sb_t		*sbp = &mp->m_sb;
 	struct xfs_inode	*ip = XFS_I(d_inode(dentry));
-	__uint64_t		fakeinos, id;
-	__uint64_t		icount;
-	__uint64_t		ifree;
-	__uint64_t		fdblocks;
+	uint64_t		fakeinos, id;
+	uint64_t		icount;
+	uint64_t		ifree;
+	uint64_t		fdblocks;
 	xfs_extlen_t		lsize;
-	__int64_t		ffree;
+	int64_t			ffree;
 
 	statp->f_type = XFS_SB_MAGIC;
 	statp->f_namelen = MAXNAMELEN - 1;
@@ -1116,7 +1116,7 @@ xfs_fs_statfs(
 	statp->f_bavail = statp->f_bfree;
 
 	fakeinos = statp->f_bfree << sbp->sb_inopblog;
-	statp->f_files = MIN(icount + fakeinos, (__uint64_t)XFS_MAXINUMBER);
+	statp->f_files = MIN(icount + fakeinos, (uint64_t)XFS_MAXINUMBER);
 	if (mp->m_maxicount)
 		statp->f_files = min_t(typeof(statp->f_files),
 					statp->f_files,
@@ -1129,7 +1129,7 @@ xfs_fs_statfs(
 
 	/* make sure statp->f_ffree does not underflow */
 	ffree = statp->f_files - (icount - ifree);
-	statp->f_ffree = max_t(__int64_t, ffree, 0);
+	statp->f_ffree = max_t(int64_t, ffree, 0);
 
 
 	if ((ip->i_d.di_flags & XFS_DIFLAG_PROJINHERIT) &&
@@ -1142,7 +1142,7 @@ xfs_fs_statfs(
 STATIC void
 xfs_save_resvblks(struct xfs_mount *mp)
 {
-	__uint64_t resblks = 0;
+	uint64_t resblks = 0;
 
 	mp->m_resblks_save = mp->m_resblks;
 	xfs_reserve_blocks(mp, &resblks, NULL);
@@ -1151,7 +1151,7 @@ xfs_save_resvblks(struct xfs_mount *mp)
 STATIC void
 xfs_restore_resvblks(struct xfs_mount *mp)
 {
-	__uint64_t resblks;
+	uint64_t resblks;
 
 	if (mp->m_resblks_save) {
 		resblks = mp->m_resblks_save;
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 7c5a165..1a63919 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -251,7 +251,7 @@ TRACE_EVENT(xfs_iext_insert,
 		  __print_flags(__entry->bmap_state, "|", XFS_BMAP_EXT_FLAGS),
 		  (long)__entry->idx,
 		  __entry->startoff,
-		  (__int64_t)__entry->startblock,
+		  (int64_t)__entry->startblock,
 		  __entry->blockcount,
 		  __entry->state,
 		  (char *)__entry->caller_ip)
@@ -295,7 +295,7 @@ DECLARE_EVENT_CLASS(xfs_bmap_class,
 		  __print_flags(__entry->bmap_state, "|", XFS_BMAP_EXT_FLAGS),
 		  (long)__entry->idx,
 		  __entry->startoff,
-		  (__int64_t)__entry->startblock,
+		  (int64_t)__entry->startblock,
 		  __entry->blockcount,
 		  __entry->state,
 		  (char *)__entry->caller_ip)
@@ -1280,7 +1280,7 @@ DECLARE_EVENT_CLASS(xfs_imap_class,
 		  __entry->count,
 		  __print_symbolic(__entry->type, XFS_IO_TYPES),
 		  __entry->startoff,
-		  (__int64_t)__entry->startblock,
+		  (int64_t)__entry->startblock,
 		  __entry->blockcount)
 )
 
@@ -2057,7 +2057,7 @@ DECLARE_EVENT_CLASS(xfs_log_recover_buf_item_class,
 	TP_ARGS(log, buf_f),
 	TP_STRUCT__entry(
 		__field(dev_t, dev)
-		__field(__int64_t, blkno)
+		__field(int64_t, blkno)
 		__field(unsigned short, len)
 		__field(unsigned short, flags)
 		__field(unsigned short, size)
@@ -2106,7 +2106,7 @@ DECLARE_EVENT_CLASS(xfs_log_recover_ino_item_class,
 		__field(int, fields)
 		__field(unsigned short, asize)
 		__field(unsigned short, dsize)
-		__field(__int64_t, blkno)
+		__field(int64_t, blkno)
 		__field(int, len)
 		__field(int, boffset)
 	),
@@ -3256,8 +3256,8 @@ DECLARE_EVENT_CLASS(xfs_fsmap_class,
 		__field(xfs_agnumber_t, agno)
 		__field(xfs_fsblock_t, bno)
 		__field(xfs_filblks_t, len)
-		__field(__uint64_t, owner)
-		__field(__uint64_t, offset)
+		__field(uint64_t, owner)
+		__field(uint64_t, offset)
 		__field(unsigned int, flags)
 	),
 	TP_fast_assign(
@@ -3297,9 +3297,9 @@ DECLARE_EVENT_CLASS(xfs_getfsmap_class,
 		__field(dev_t, keydev)
 		__field(xfs_daddr_t, block)
 		__field(xfs_daddr_t, len)
-		__field(__uint64_t, owner)
-		__field(__uint64_t, offset)
-		__field(__uint64_t, flags)
+		__field(uint64_t, owner)
+		__field(uint64_t, offset)
+		__field(uint64_t, flags)
 	),
 	TP_fast_assign(
 		__entry->dev = mp->m_super->s_dev;
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index a07acbf..92db14e 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -249,7 +249,7 @@ struct xfs_rud_log_item *xfs_trans_get_rud(struct xfs_trans *tp,
 		struct xfs_rui_log_item *ruip);
 int xfs_trans_log_finish_rmap_update(struct xfs_trans *tp,
 		struct xfs_rud_log_item *rudp, enum xfs_rmap_intent_type type,
-		__uint64_t owner, int whichfork, xfs_fileoff_t startoff,
+		uint64_t owner, int whichfork, xfs_fileoff_t startoff,
 		xfs_fsblock_t startblock, xfs_filblks_t blockcount,
 		xfs_exntst_t state, struct xfs_btree_cur **pcur);
 
diff --git a/fs/xfs/xfs_trans_rmap.c b/fs/xfs/xfs_trans_rmap.c
index 9ead064..9b577be 100644
--- a/fs/xfs/xfs_trans_rmap.c
+++ b/fs/xfs/xfs_trans_rmap.c
@@ -96,7 +96,7 @@ xfs_trans_log_finish_rmap_update(
 	struct xfs_trans		*tp,
 	struct xfs_rud_log_item		*rudp,
 	enum xfs_rmap_intent_type	type,
-	__uint64_t			owner,
+	uint64_t			owner,
 	int				whichfork,
 	xfs_fileoff_t			startoff,
 	xfs_fsblock_t			startblock,

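A note for readers skimming the diff: among the helpers touched here,
roundup_64() and howmany_64() are a pure type rename, so their results
should be unchanged. The following is only a userspace sketch of that
arithmetic, not the kernel code. The demo_ names are made up for
illustration, and plain 64-bit division stands in for the kernel's
do_div().

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins for howmany_64() and roundup_64().
 * Plain division replaces the kernel's do_div(); y must be nonzero.
 */

/* Number of y-sized units needed to hold x. */
static inline uint64_t demo_howmany_64(uint64_t x, uint32_t y)
{
	x += y - 1;
	return x / y;
}

/* Round x up to the next multiple of y. */
static inline uint64_t demo_roundup_64(uint64_t x, uint32_t y)
{
	return demo_howmany_64(x, y) * y;
}
```

Under these assumptions, demo_roundup_64(10, 4) yields 12 and
demo_howmany_64(10, 4) yields 3. Both helpers do their arithmetic in 64
bits, so values above 4GiB round correctly as well.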


* [PATCH 03/13] xfs: always compile the btree inorder check functions
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
  2017-06-02 21:24 ` [PATCH 01/13] xfs: optimize _btree_query_all Darrick J. Wong
  2017-06-02 21:24 ` [PATCH 02/13] xfs: remove double-underscore integer types Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-06 13:32   ` Brian Foster
  2017-06-02 21:24 ` [PATCH 04/13] xfs: export various function for the online scrubber Darrick J. Wong
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

The btree record and key inorder check functions will be used by the
btree scrubber code, so build them unconditionally instead of only when
DEBUG or XFS_WARN is defined.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_alloc_btree.c    |    6 ------
 fs/xfs/libxfs/xfs_bmap_btree.c     |    4 ----
 fs/xfs/libxfs/xfs_btree.h          |    2 --
 fs/xfs/libxfs/xfs_ialloc_btree.c   |    6 ------
 fs/xfs/libxfs/xfs_refcount_btree.c |    4 ----
 fs/xfs/libxfs/xfs_rmap_btree.c     |    4 ----
 6 files changed, 26 deletions(-)
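
To illustrate why a scrubber wants these callbacks in every build, here
is a minimal userspace sketch, not the actual scrub code. All demo_
names are hypothetical. The ordering rule is modelled on the
refcountbt check (start + length of one record must not pass the start
of the next), and the walk simply reports the first record that breaks
the rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one btree record: extent [start, start + len). */
struct demo_rec {
	uint32_t	start;
	uint32_t	len;
};

/* Hypothetical ops table; the callback is present in every build. */
struct demo_btree_ops {
	/* return nonzero if r1 ends at or before the start of r2 */
	int	(*recs_inorder)(const struct demo_rec *r1,
				const struct demo_rec *r2);
};

static int
demo_recs_inorder(
	const struct demo_rec	*r1,
	const struct demo_rec	*r2)
{
	/* widen to 64 bits so start + len cannot wrap */
	return (uint64_t)r1->start + r1->len <= r2->start;
}

static const struct demo_btree_ops demo_ops = {
	.recs_inorder	= demo_recs_inorder,
};

/*
 * Scrub-style walk over one block of records: return the index of the
 * first record that is out of order relative to its predecessor, or -1
 * if every adjacent pair is in order.
 */
static int
demo_scrub_leaf(
	const struct demo_btree_ops	*ops,
	const struct demo_rec		*recs,
	size_t				nr)
{
	size_t				i;

	for (i = 1; i < nr; i++) {
		if (!ops->recs_inorder(&recs[i - 1], &recs[i]))
			return (int)i;
	}
	return -1;
}
```

If the callback pointers only existed under DEBUG or XFS_WARN, a walk
like demo_scrub_leaf() could not be compiled into a production kernel,
which is the situation this patch avoids.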


diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
index 5020cbc..cfde0a0 100644
--- a/fs/xfs/libxfs/xfs_alloc_btree.c
+++ b/fs/xfs/libxfs/xfs_alloc_btree.c
@@ -395,7 +395,6 @@ const struct xfs_buf_ops xfs_allocbt_buf_ops = {
 };
 
 
-#if defined(DEBUG) || defined(XFS_WARN)
 STATIC int
 xfs_bnobt_keys_inorder(
 	struct xfs_btree_cur	*cur,
@@ -442,7 +441,6 @@ xfs_cntbt_recs_inorder(
 		 be32_to_cpu(r1->alloc.ar_startblock) <
 		 be32_to_cpu(r2->alloc.ar_startblock));
 }
-#endif /* DEBUG */
 
 static const struct xfs_btree_ops xfs_bnobt_ops = {
 	.rec_len		= sizeof(xfs_alloc_rec_t),
@@ -462,10 +460,8 @@ static const struct xfs_btree_ops xfs_bnobt_ops = {
 	.key_diff		= xfs_bnobt_key_diff,
 	.buf_ops		= &xfs_allocbt_buf_ops,
 	.diff_two_keys		= xfs_bnobt_diff_two_keys,
-#if defined(DEBUG) || defined(XFS_WARN)
 	.keys_inorder		= xfs_bnobt_keys_inorder,
 	.recs_inorder		= xfs_bnobt_recs_inorder,
-#endif
 };
 
 static const struct xfs_btree_ops xfs_cntbt_ops = {
@@ -486,10 +482,8 @@ static const struct xfs_btree_ops xfs_cntbt_ops = {
 	.key_diff		= xfs_cntbt_key_diff,
 	.buf_ops		= &xfs_allocbt_buf_ops,
 	.diff_two_keys		= xfs_cntbt_diff_two_keys,
-#if defined(DEBUG) || defined(XFS_WARN)
 	.keys_inorder		= xfs_cntbt_keys_inorder,
 	.recs_inorder		= xfs_cntbt_recs_inorder,
-#endif
 };
 
 /*
diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
index 5e2b3dc..e23495e 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.c
+++ b/fs/xfs/libxfs/xfs_bmap_btree.c
@@ -687,7 +687,6 @@ const struct xfs_buf_ops xfs_bmbt_buf_ops = {
 };
 
 
-#if defined(DEBUG) || defined(XFS_WARN)
 STATIC int
 xfs_bmbt_keys_inorder(
 	struct xfs_btree_cur	*cur,
@@ -708,7 +707,6 @@ xfs_bmbt_recs_inorder(
 		xfs_bmbt_disk_get_blockcount(&r1->bmbt) <=
 		xfs_bmbt_disk_get_startoff(&r2->bmbt);
 }
-#endif	/* DEBUG */
 
 static const struct xfs_btree_ops xfs_bmbt_ops = {
 	.rec_len		= sizeof(xfs_bmbt_rec_t),
@@ -726,10 +724,8 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
 	.init_ptr_from_cur	= xfs_bmbt_init_ptr_from_cur,
 	.key_diff		= xfs_bmbt_key_diff,
 	.buf_ops		= &xfs_bmbt_buf_ops,
-#if defined(DEBUG) || defined(XFS_WARN)
 	.keys_inorder		= xfs_bmbt_keys_inorder,
 	.recs_inorder		= xfs_bmbt_recs_inorder,
-#endif
 };
 
 /*
diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h
index 0a931f6..177a364 100644
--- a/fs/xfs/libxfs/xfs_btree.h
+++ b/fs/xfs/libxfs/xfs_btree.h
@@ -163,7 +163,6 @@ struct xfs_btree_ops {
 
 	const struct xfs_buf_ops	*buf_ops;
 
-#if defined(DEBUG) || defined(XFS_WARN)
 	/* check that k1 is lower than k2 */
 	int	(*keys_inorder)(struct xfs_btree_cur *cur,
 				union xfs_btree_key *k1,
@@ -173,7 +172,6 @@ struct xfs_btree_ops {
 	int	(*recs_inorder)(struct xfs_btree_cur *cur,
 				union xfs_btree_rec *r1,
 				union xfs_btree_rec *r2);
-#endif
 };
 
 /*
diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
index ed52d99..6b1ddeb 100644
--- a/fs/xfs/libxfs/xfs_ialloc_btree.c
+++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
@@ -302,7 +302,6 @@ const struct xfs_buf_ops xfs_inobt_buf_ops = {
 	.verify_write = xfs_inobt_write_verify,
 };
 
-#if defined(DEBUG) || defined(XFS_WARN)
 STATIC int
 xfs_inobt_keys_inorder(
 	struct xfs_btree_cur	*cur,
@@ -322,7 +321,6 @@ xfs_inobt_recs_inorder(
 	return be32_to_cpu(r1->inobt.ir_startino) + XFS_INODES_PER_CHUNK <=
 		be32_to_cpu(r2->inobt.ir_startino);
 }
-#endif	/* DEBUG */
 
 static const struct xfs_btree_ops xfs_inobt_ops = {
 	.rec_len		= sizeof(xfs_inobt_rec_t),
@@ -339,10 +337,8 @@ static const struct xfs_btree_ops xfs_inobt_ops = {
 	.init_ptr_from_cur	= xfs_inobt_init_ptr_from_cur,
 	.key_diff		= xfs_inobt_key_diff,
 	.buf_ops		= &xfs_inobt_buf_ops,
-#if defined(DEBUG) || defined(XFS_WARN)
 	.keys_inorder		= xfs_inobt_keys_inorder,
 	.recs_inorder		= xfs_inobt_recs_inorder,
-#endif
 };
 
 static const struct xfs_btree_ops xfs_finobt_ops = {
@@ -360,10 +356,8 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
 	.init_ptr_from_cur	= xfs_finobt_init_ptr_from_cur,
 	.key_diff		= xfs_inobt_key_diff,
 	.buf_ops		= &xfs_inobt_buf_ops,
-#if defined(DEBUG) || defined(XFS_WARN)
 	.keys_inorder		= xfs_inobt_keys_inorder,
 	.recs_inorder		= xfs_inobt_recs_inorder,
-#endif
 };
 
 /*
diff --git a/fs/xfs/libxfs/xfs_refcount_btree.c b/fs/xfs/libxfs/xfs_refcount_btree.c
index 65c222a..3c59dd3 100644
--- a/fs/xfs/libxfs/xfs_refcount_btree.c
+++ b/fs/xfs/libxfs/xfs_refcount_btree.c
@@ -285,7 +285,6 @@ const struct xfs_buf_ops xfs_refcountbt_buf_ops = {
 	.verify_write		= xfs_refcountbt_write_verify,
 };
 
-#if defined(DEBUG) || defined(XFS_WARN)
 STATIC int
 xfs_refcountbt_keys_inorder(
 	struct xfs_btree_cur	*cur,
@@ -306,7 +305,6 @@ xfs_refcountbt_recs_inorder(
 		be32_to_cpu(r1->refc.rc_blockcount) <=
 		be32_to_cpu(r2->refc.rc_startblock);
 }
-#endif
 
 static const struct xfs_btree_ops xfs_refcountbt_ops = {
 	.rec_len		= sizeof(struct xfs_refcount_rec),
@@ -325,10 +323,8 @@ static const struct xfs_btree_ops xfs_refcountbt_ops = {
 	.key_diff		= xfs_refcountbt_key_diff,
 	.buf_ops		= &xfs_refcountbt_buf_ops,
 	.diff_two_keys		= xfs_refcountbt_diff_two_keys,
-#if defined(DEBUG) || defined(XFS_WARN)
 	.keys_inorder		= xfs_refcountbt_keys_inorder,
 	.recs_inorder		= xfs_refcountbt_recs_inorder,
-#endif
 };
 
 /*
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
index c5b4a1c8..9d9c919 100644
--- a/fs/xfs/libxfs/xfs_rmap_btree.c
+++ b/fs/xfs/libxfs/xfs_rmap_btree.c
@@ -377,7 +377,6 @@ const struct xfs_buf_ops xfs_rmapbt_buf_ops = {
 	.verify_write		= xfs_rmapbt_write_verify,
 };
 
-#if defined(DEBUG) || defined(XFS_WARN)
 STATIC int
 xfs_rmapbt_keys_inorder(
 	struct xfs_btree_cur	*cur,
@@ -437,7 +436,6 @@ xfs_rmapbt_recs_inorder(
 		return 1;
 	return 0;
 }
-#endif	/* DEBUG */
 
 static const struct xfs_btree_ops xfs_rmapbt_ops = {
 	.rec_len		= sizeof(struct xfs_rmap_rec),
@@ -456,10 +454,8 @@ static const struct xfs_btree_ops xfs_rmapbt_ops = {
 	.key_diff		= xfs_rmapbt_key_diff,
 	.buf_ops		= &xfs_rmapbt_buf_ops,
 	.diff_two_keys		= xfs_rmapbt_diff_two_keys,
-#if defined(DEBUG) || defined(XFS_WARN)
 	.keys_inorder		= xfs_rmapbt_keys_inorder,
 	.recs_inorder		= xfs_rmapbt_recs_inorder,
-#endif
 };
 
 /*



* [PATCH 04/13] xfs: export various function for the online scrubber
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (2 preceding siblings ...)
  2017-06-02 21:24 ` [PATCH 03/13] xfs: always compile the btree inorder check functions Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-06 13:32   ` Brian Foster
  2017-06-02 21:24 ` [PATCH 05/13] xfs: plumb in needed functions for range querying of various btrees Darrick J. Wong
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Export various internal functions so that the online scrubber can use
them to check the state of metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_alloc.c     |    2 +-
 fs/xfs/libxfs/xfs_alloc.h     |    2 ++
 fs/xfs/libxfs/xfs_btree.c     |   12 ++++++------
 fs/xfs/libxfs/xfs_btree.h     |   13 +++++++++++++
 fs/xfs/libxfs/xfs_dir2_leaf.c |    2 +-
 fs/xfs/libxfs/xfs_dir2_priv.h |    2 ++
 fs/xfs/libxfs/xfs_inode_buf.c |    2 +-
 fs/xfs/libxfs/xfs_inode_buf.h |    3 +++
 fs/xfs/libxfs/xfs_rmap.c      |    3 ++-
 fs/xfs/libxfs/xfs_rmap.h      |    3 +++
 fs/xfs/libxfs/xfs_rtbitmap.c  |    2 +-
 fs/xfs/xfs_itable.c           |    2 +-
 fs/xfs/xfs_itable.h           |    2 ++
 fs/xfs/xfs_rtalloc.h          |    3 +++
 14 files changed, 41 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 7486401..fefa8da 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -606,7 +606,7 @@ const struct xfs_buf_ops xfs_agfl_buf_ops = {
 /*
  * Read in the allocation group free block array.
  */
-STATIC int				/* error */
+int					/* error */
 xfs_alloc_read_agfl(
 	xfs_mount_t	*mp,		/* mount point structure */
 	xfs_trans_t	*tp,		/* transaction pointer */
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index 77d9c27..ef26edc 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -213,6 +213,8 @@ xfs_alloc_get_rec(
 
 int xfs_read_agf(struct xfs_mount *mp, struct xfs_trans *tp,
 			xfs_agnumber_t agno, int flags, struct xfs_buf **bpp);
+int xfs_alloc_read_agfl(struct xfs_mount *mp, struct xfs_trans *tp,
+			xfs_agnumber_t agno, struct xfs_buf **bpp);
 int xfs_alloc_fix_freelist(struct xfs_alloc_arg *args, int flags);
 int xfs_free_extent_fix_freelist(struct xfs_trans *tp, xfs_agnumber_t agno,
 		struct xfs_buf **agbp);
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 302dd4c..302dac5 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -568,7 +568,7 @@ xfs_btree_ptr_offset(
 /*
  * Return a pointer to the n-th record in the btree block.
  */
-STATIC union xfs_btree_rec *
+union xfs_btree_rec *
 xfs_btree_rec_addr(
 	struct xfs_btree_cur	*cur,
 	int			n,
@@ -581,7 +581,7 @@ xfs_btree_rec_addr(
 /*
  * Return a pointer to the n-th key in the btree block.
  */
-STATIC union xfs_btree_key *
+union xfs_btree_key *
 xfs_btree_key_addr(
 	struct xfs_btree_cur	*cur,
 	int			n,
@@ -594,7 +594,7 @@ xfs_btree_key_addr(
 /*
  * Return a pointer to the n-th high key in the btree block.
  */
-STATIC union xfs_btree_key *
+union xfs_btree_key *
 xfs_btree_high_key_addr(
 	struct xfs_btree_cur	*cur,
 	int			n,
@@ -607,7 +607,7 @@ xfs_btree_high_key_addr(
 /*
  * Return a pointer to the n-th block pointer in the btree block.
  */
-STATIC union xfs_btree_ptr *
+union xfs_btree_ptr *
 xfs_btree_ptr_addr(
 	struct xfs_btree_cur	*cur,
 	int			n,
@@ -641,7 +641,7 @@ xfs_btree_get_iroot(
  * Retrieve the block pointer from the cursor at the given level.
  * This may be an inode btree root or from a buffer.
  */
-STATIC struct xfs_btree_block *		/* generic btree block pointer */
+struct xfs_btree_block *		/* generic btree block pointer */
 xfs_btree_get_block(
 	struct xfs_btree_cur	*cur,	/* btree cursor */
 	int			level,	/* level in btree */
@@ -1756,7 +1756,7 @@ xfs_btree_decrement(
 	return error;
 }
 
-STATIC int
+int
 xfs_btree_lookup_get_block(
 	struct xfs_btree_cur	*cur,	/* btree cursor */
 	int			level,	/* level in the btree */
diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h
index 177a364..9c95e96 100644
--- a/fs/xfs/libxfs/xfs_btree.h
+++ b/fs/xfs/libxfs/xfs_btree.h
@@ -504,4 +504,17 @@ int xfs_btree_visit_blocks(struct xfs_btree_cur *cur,
 
 int xfs_btree_count_blocks(struct xfs_btree_cur *cur, xfs_extlen_t *blocks);
 
+union xfs_btree_rec *xfs_btree_rec_addr(struct xfs_btree_cur *cur, int n,
+		struct xfs_btree_block *block);
+union xfs_btree_key *xfs_btree_key_addr(struct xfs_btree_cur *cur, int n,
+		struct xfs_btree_block *block);
+union xfs_btree_key *xfs_btree_high_key_addr(struct xfs_btree_cur *cur, int n,
+		struct xfs_btree_block *block);
+union xfs_btree_ptr *xfs_btree_ptr_addr(struct xfs_btree_cur *cur, int n,
+		struct xfs_btree_block *block);
+int xfs_btree_lookup_get_block(struct xfs_btree_cur *cur, int level,
+		union xfs_btree_ptr *pp, struct xfs_btree_block **blkp);
+struct xfs_btree_block *xfs_btree_get_block(struct xfs_btree_cur *cur,
+		int level, struct xfs_buf **bpp);
+
 #endif	/* __XFS_BTREE_H__ */
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
index 68bf3e8..7002024 100644
--- a/fs/xfs/libxfs/xfs_dir2_leaf.c
+++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
@@ -256,7 +256,7 @@ const struct xfs_buf_ops xfs_dir3_leafn_buf_ops = {
 	.verify_write = xfs_dir3_leafn_write_verify,
 };
 
-static int
+int
 xfs_dir3_leaf_read(
 	struct xfs_trans	*tp,
 	struct xfs_inode	*dp,
diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
index 011df4d..576f2d2 100644
--- a/fs/xfs/libxfs/xfs_dir2_priv.h
+++ b/fs/xfs/libxfs/xfs_dir2_priv.h
@@ -58,6 +58,8 @@ extern int xfs_dir3_data_init(struct xfs_da_args *args, xfs_dir2_db_t blkno,
 		struct xfs_buf **bpp);
 
 /* xfs_dir2_leaf.c */
+extern int xfs_dir3_leaf_read(struct xfs_trans *tp, struct xfs_inode *dp,
+		xfs_dablk_t fbno, xfs_daddr_t mappedbno, struct xfs_buf **bpp);
 extern int xfs_dir3_leafn_read(struct xfs_trans *tp, struct xfs_inode *dp,
 		xfs_dablk_t fbno, xfs_daddr_t mappedbno, struct xfs_buf **bpp);
 extern int xfs_dir2_block_to_leaf(struct xfs_da_args *args,
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index d887af9..0c970cf 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -381,7 +381,7 @@ xfs_log_dinode_to_disk(
 	}
 }
 
-static bool
+bool
 xfs_dinode_verify(
 	struct xfs_mount	*mp,
 	xfs_ino_t		ino,
diff --git a/fs/xfs/libxfs/xfs_inode_buf.h b/fs/xfs/libxfs/xfs_inode_buf.h
index 0827d7d..a9c97a3 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.h
+++ b/fs/xfs/libxfs/xfs_inode_buf.h
@@ -82,4 +82,7 @@ void	xfs_inobp_check(struct xfs_mount *, struct xfs_buf *);
 #define	xfs_inobp_check(mp, bp)
 #endif /* DEBUG */
 
+bool	xfs_dinode_verify(struct xfs_mount *mp, xfs_ino_t ino,
+			  struct xfs_dinode *dip);
+
 #endif	/* __XFS_INODE_BUF_H__ */
diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
index 1bcb41f..eda275b 100644
--- a/fs/xfs/libxfs/xfs_rmap.c
+++ b/fs/xfs/libxfs/xfs_rmap.c
@@ -179,7 +179,8 @@ xfs_rmap_delete(
 	return error;
 }
 
-static int
+/* Convert an internal btree record to an rmap record. */
+int
 xfs_rmap_btrec_to_irec(
 	union xfs_btree_rec	*rec,
 	struct xfs_rmap_irec	*irec)
diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h
index 265116d..466ede6 100644
--- a/fs/xfs/libxfs/xfs_rmap.h
+++ b/fs/xfs/libxfs/xfs_rmap.h
@@ -216,5 +216,8 @@ int xfs_rmap_lookup_le_range(struct xfs_btree_cur *cur, xfs_agblock_t bno,
 		struct xfs_rmap_irec *irec, int	*stat);
 int xfs_rmap_compare(const struct xfs_rmap_irec *a,
 		const struct xfs_rmap_irec *b);
+union xfs_btree_rec;
+int xfs_rmap_btrec_to_irec(union xfs_btree_rec *rec,
+		struct xfs_rmap_irec *irec);
 
 #endif	/* __XFS_RMAP_H__ */
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 26bba7f..5d4e43e 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -70,7 +70,7 @@ const struct xfs_buf_ops xfs_rtbuf_ops = {
  * Get a buffer for the bitmap or summary file block specified.
  * The buffer is returned read and locked.
  */
-static int
+int
 xfs_rtbuf_get(
 	xfs_mount_t	*mp,		/* file system mount structure */
 	xfs_trans_t	*tp,		/* transaction pointer */
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index 26d67ce..c393a2f 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -31,7 +31,7 @@
 #include "xfs_trace.h"
 #include "xfs_icache.h"
 
-STATIC int
+int
 xfs_internal_inum(
 	xfs_mount_t	*mp,
 	xfs_ino_t	ino)
diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h
index 6ea8b39..17e86e0 100644
--- a/fs/xfs/xfs_itable.h
+++ b/fs/xfs/xfs_itable.h
@@ -96,4 +96,6 @@ xfs_inumbers(
 	void			__user *buffer, /* buffer with inode info */
 	inumbers_fmt_pf		formatter);
 
+int xfs_internal_inum(struct xfs_mount *mp, xfs_ino_t ino);
+
 #endif	/* __XFS_ITABLE_H__ */
diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
index f13133e..79defa7 100644
--- a/fs/xfs/xfs_rtalloc.h
+++ b/fs/xfs/xfs_rtalloc.h
@@ -107,6 +107,8 @@ xfs_growfs_rt(
 /*
  * From xfs_rtbitmap.c
  */
+int xfs_rtbuf_get(struct xfs_mount *mp, struct xfs_trans *tp,
+		  xfs_rtblock_t block, int issum, struct xfs_buf **bpp);
 int xfs_rtcheck_range(struct xfs_mount *mp, struct xfs_trans *tp,
 		      xfs_rtblock_t start, xfs_extlen_t len, int val,
 		      xfs_rtblock_t *new, int *stat);
@@ -143,6 +145,7 @@ int xfs_rtalloc_query_all(struct xfs_trans *tp,
 # define xfs_growfs_rt(mp,in)                           (ENOSYS)
 # define xfs_rtalloc_query_range(t,l,h,f,p)             (ENOSYS)
 # define xfs_rtalloc_query_all(t,f,p)                   (ENOSYS)
+# define xfs_rtbuf_get(m,t,b,i,p)                       (ENOSYS)
 static inline int		/* error */
 xfs_rtmount_init(
 	xfs_mount_t	*mp)	/* file system mount structure */



* [PATCH 05/13] xfs: plumb in needed functions for range querying of various btrees
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (3 preceding siblings ...)
  2017-06-02 21:24 ` [PATCH 04/13] xfs: export various function for the online scrubber Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-06 13:33   ` Brian Foster
  2017-06-02 21:24 ` [PATCH 06/13] xfs: export _inobt_btrec_to_irec and _ialloc_cluster_alignment for scrub Darrick J. Wong
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Plumb in the pieces (init_high_key, diff_two_keys) necessary to call
query_range on the inode space and block mapping btrees and to extract
raw btree records.  This will eventually be used by the inobt and bmbt
scrubbers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
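Reader's note: a rough userspace sketch of what the two new hooks
compute, not kernel code.  The demo_* names and the simplified record
are hypothetical; only the arithmetic (high key = start + length - 1,
key difference as a signed subtraction) comes from the patch below.
The overlap test is one plausible way a range query could consume the
hooks and is an assumption, not a copy of the query_range logic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-core record; not the on-disk XFS layout. */
struct demo_rec {
	uint64_t	startoff;	/* first offset covered */
	uint64_t	blockcount;	/* number of blocks, nonzero */
};

/* Low key: the first offset the record covers. */
static uint64_t demo_low_key(const struct demo_rec *rec)
{
	return rec->startoff;
}

/* High key: the last offset the record covers, inclusive. */
static uint64_t demo_high_key(const struct demo_rec *rec)
{
	assert(rec->blockcount > 0);
	return rec->startoff + rec->blockcount - 1;
}

/* Signed difference of two keys: negative, zero, or positive. */
static int64_t demo_diff_two_keys(uint64_t k1, uint64_t k2)
{
	return (int64_t)k1 - (int64_t)k2;
}

/* Does the record overlap the inclusive query range [lo, hi]? */
static int demo_rec_in_range(const struct demo_rec *rec,
			     uint64_t lo, uint64_t hi)
{
	return demo_diff_two_keys(demo_high_key(rec), lo) >= 0 &&
	       demo_diff_two_keys(demo_low_key(rec), hi) <= 0;
}
```

With a record starting at offset 10 and covering 5 blocks, the high key
is 14, so a query for [14, 20] overlaps it and a query for [15, 20]
does not.
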
 fs/xfs/libxfs/xfs_bmap_btree.c   |   22 ++++++++++++++++++++++
 fs/xfs/libxfs/xfs_ialloc_btree.c |   26 ++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)


diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
index e23495e..85de225 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.c
+++ b/fs/xfs/libxfs/xfs_bmap_btree.c
@@ -573,6 +573,16 @@ xfs_bmbt_init_key_from_rec(
 }
 
 STATIC void
+xfs_bmbt_init_high_key_from_rec(
+	union xfs_btree_key	*key,
+	union xfs_btree_rec	*rec)
+{
+	key->bmbt.br_startoff = cpu_to_be64(
+			xfs_bmbt_disk_get_startoff(&rec->bmbt) +
+			xfs_bmbt_disk_get_blockcount(&rec->bmbt) - 1);
+}
+
+STATIC void
 xfs_bmbt_init_rec_from_cur(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_rec	*rec)
@@ -597,6 +607,16 @@ xfs_bmbt_key_diff(
 				      cur->bc_rec.b.br_startoff;
 }
 
+STATIC int64_t
+xfs_bmbt_diff_two_keys(
+	struct xfs_btree_cur	*cur,
+	union xfs_btree_key	*k1,
+	union xfs_btree_key	*k2)
+{
+	return (int64_t)be64_to_cpu(k1->bmbt.br_startoff) -
+			  be64_to_cpu(k2->bmbt.br_startoff);
+}
+
 static bool
 xfs_bmbt_verify(
 	struct xfs_buf		*bp)
@@ -720,9 +740,11 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
 	.get_minrecs		= xfs_bmbt_get_minrecs,
 	.get_dmaxrecs		= xfs_bmbt_get_dmaxrecs,
 	.init_key_from_rec	= xfs_bmbt_init_key_from_rec,
+	.init_high_key_from_rec	= xfs_bmbt_init_high_key_from_rec,
 	.init_rec_from_cur	= xfs_bmbt_init_rec_from_cur,
 	.init_ptr_from_cur	= xfs_bmbt_init_ptr_from_cur,
 	.key_diff		= xfs_bmbt_key_diff,
+	.diff_two_keys		= xfs_bmbt_diff_two_keys,
 	.buf_ops		= &xfs_bmbt_buf_ops,
 	.keys_inorder		= xfs_bmbt_keys_inorder,
 	.recs_inorder		= xfs_bmbt_recs_inorder,
diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
index 6b1ddeb..317caba 100644
--- a/fs/xfs/libxfs/xfs_ialloc_btree.c
+++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
@@ -175,6 +175,18 @@ xfs_inobt_init_key_from_rec(
 }
 
 STATIC void
+xfs_inobt_init_high_key_from_rec(
+	union xfs_btree_key	*key,
+	union xfs_btree_rec	*rec)
+{
+	__u32			x;
+
+	x = be32_to_cpu(rec->inobt.ir_startino);
+	x += XFS_INODES_PER_CHUNK - 1;
+	key->inobt.ir_startino = cpu_to_be32(x);
+}
+
+STATIC void
 xfs_inobt_init_rec_from_cur(
 	struct xfs_btree_cur	*cur,
 	union xfs_btree_rec	*rec)
@@ -228,6 +240,16 @@ xfs_inobt_key_diff(
 			  cur->bc_rec.i.ir_startino;
 }
 
+STATIC int64_t
+xfs_inobt_diff_two_keys(
+	struct xfs_btree_cur	*cur,
+	union xfs_btree_key	*k1,
+	union xfs_btree_key	*k2)
+{
+	return (int64_t)be32_to_cpu(k1->inobt.ir_startino) -
+			  be32_to_cpu(k2->inobt.ir_startino);
+}
+
 static int
 xfs_inobt_verify(
 	struct xfs_buf		*bp)
@@ -333,10 +355,12 @@ static const struct xfs_btree_ops xfs_inobt_ops = {
 	.get_minrecs		= xfs_inobt_get_minrecs,
 	.get_maxrecs		= xfs_inobt_get_maxrecs,
 	.init_key_from_rec	= xfs_inobt_init_key_from_rec,
+	.init_high_key_from_rec	= xfs_inobt_init_high_key_from_rec,
 	.init_rec_from_cur	= xfs_inobt_init_rec_from_cur,
 	.init_ptr_from_cur	= xfs_inobt_init_ptr_from_cur,
 	.key_diff		= xfs_inobt_key_diff,
 	.buf_ops		= &xfs_inobt_buf_ops,
+	.diff_two_keys		= xfs_inobt_diff_two_keys,
 	.keys_inorder		= xfs_inobt_keys_inorder,
 	.recs_inorder		= xfs_inobt_recs_inorder,
 };
@@ -352,10 +376,12 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
 	.get_minrecs		= xfs_inobt_get_minrecs,
 	.get_maxrecs		= xfs_inobt_get_maxrecs,
 	.init_key_from_rec	= xfs_inobt_init_key_from_rec,
+	.init_high_key_from_rec	= xfs_inobt_init_high_key_from_rec,
 	.init_rec_from_cur	= xfs_inobt_init_rec_from_cur,
 	.init_ptr_from_cur	= xfs_finobt_init_ptr_from_cur,
 	.key_diff		= xfs_inobt_key_diff,
 	.buf_ops		= &xfs_inobt_buf_ops,
+	.diff_two_keys		= xfs_inobt_diff_two_keys,
 	.keys_inorder		= xfs_inobt_keys_inorder,
 	.recs_inorder		= xfs_inobt_recs_inorder,
 };



* [PATCH 06/13] xfs: export _inobt_btrec_to_irec and _ialloc_cluster_alignment for scrub
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (4 preceding siblings ...)
  2017-06-02 21:24 ` [PATCH 05/13] xfs: plumb in needed functions for range querying of various btrees Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-06 16:27   ` Brian Foster
  2017-06-02 21:24 ` [PATCH 07/13] xfs: check if an inode is cached and allocated Darrick J. Wong
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Create a function to extract an in-core inobt record from a generic
btree_rec union so that scrub will be able to check inobt records, and
export xfs_ialloc_cluster_alignment so that scrub can check inode block
alignment.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
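Reader's note: a minimal userspace sketch of the refactoring pattern
used below, assuming a simplified two-field record.  The demo_* types
and helpers are hypothetical and do not match the real XFS structures.
The point is only that the byte-order conversion becomes a pure
function that needs no cursor, and the cursor-based getter shrinks to
"fetch the raw record, then convert".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layouts; not the real XFS structures. */
struct demo_disk_rec {
	uint8_t		startino[4];	/* big-endian on disk */
	uint8_t		freecount[4];	/* big-endian on disk */
};

struct demo_incore_rec {
	uint32_t	startino;
	uint32_t	freecount;
};

/* Read a 32-bit big-endian value regardless of host byte order. */
static uint32_t demo_be32_to_cpu(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Pure conversion: anyone holding a raw record can call this. */
static void demo_btrec_to_irec(const struct demo_disk_rec *rec,
			       struct demo_incore_rec *irec)
{
	assert(rec != NULL && irec != NULL);
	irec->startino = demo_be32_to_cpu(rec->startino);
	irec->freecount = demo_be32_to_cpu(rec->freecount);
}

/* Getter: a NULL record stands in for "cursor points at nothing". */
static int demo_get_rec(const struct demo_disk_rec *cur_rec,
			struct demo_incore_rec *irec, int *stat)
{
	*stat = (cur_rec != NULL);
	if (*stat == 0)
		return 0;
	demo_btrec_to_irec(cur_rec, irec);
	return 0;
}
```

A caller that walks raw btree blocks itself, as a scrubber presumably
would, can call the conversion helper directly and skip the getter.
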
 fs/xfs/libxfs/xfs_ialloc.c |   43 ++++++++++++++++++++++++++-----------------
 fs/xfs/libxfs/xfs_ialloc.h |    5 +++++
 2 files changed, 31 insertions(+), 17 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 1e5ed94..33626373 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -46,7 +46,7 @@
 /*
  * Allocation group level functions.
  */
-static inline int
+int
 xfs_ialloc_cluster_alignment(
 	struct xfs_mount	*mp)
 {
@@ -98,24 +98,14 @@ xfs_inobt_update(
 	return xfs_btree_update(cur, &rec);
 }
 
-/*
- * Get the data from the pointed-to record.
- */
-int					/* error */
-xfs_inobt_get_rec(
-	struct xfs_btree_cur	*cur,	/* btree cursor */
-	xfs_inobt_rec_incore_t	*irec,	/* btree record */
-	int			*stat)	/* output: success/failure */
+void
+xfs_inobt_btrec_to_irec(
+	struct xfs_mount		*mp,
+	union xfs_btree_rec		*rec,
+	struct xfs_inobt_rec_incore	*irec)
 {
-	union xfs_btree_rec	*rec;
-	int			error;
-
-	error = xfs_btree_get_rec(cur, &rec, stat);
-	if (error || *stat == 0)
-		return error;
-
 	irec->ir_startino = be32_to_cpu(rec->inobt.ir_startino);
-	if (xfs_sb_version_hassparseinodes(&cur->bc_mp->m_sb)) {
+	if (xfs_sb_version_hassparseinodes(&mp->m_sb)) {
 		irec->ir_holemask = be16_to_cpu(rec->inobt.ir_u.sp.ir_holemask);
 		irec->ir_count = rec->inobt.ir_u.sp.ir_count;
 		irec->ir_freecount = rec->inobt.ir_u.sp.ir_freecount;
@@ -130,6 +120,25 @@ xfs_inobt_get_rec(
 				be32_to_cpu(rec->inobt.ir_u.f.ir_freecount);
 	}
 	irec->ir_free = be64_to_cpu(rec->inobt.ir_free);
+}
+
+/*
+ * Get the data from the pointed-to record.
+ */
+int					/* error */
+xfs_inobt_get_rec(
+	struct xfs_btree_cur	*cur,	/* btree cursor */
+	xfs_inobt_rec_incore_t	*irec,	/* btree record */
+	int			*stat)	/* output: success/failure */
+{
+	union xfs_btree_rec	*rec;
+	int			error;
+
+	error = xfs_btree_get_rec(cur, &rec, stat);
+	if (error || *stat == 0)
+		return error;
+
+	xfs_inobt_btrec_to_irec(cur->bc_mp, rec, irec);
 
 	return 0;
 }
diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
index 0bb8966..b32cfb5 100644
--- a/fs/xfs/libxfs/xfs_ialloc.h
+++ b/fs/xfs/libxfs/xfs_ialloc.h
@@ -168,5 +168,10 @@ int xfs_ialloc_inode_init(struct xfs_mount *mp, struct xfs_trans *tp,
 int xfs_read_agi(struct xfs_mount *mp, struct xfs_trans *tp,
 		xfs_agnumber_t agno, struct xfs_buf **bpp);
 
+union xfs_btree_rec;
+void xfs_inobt_btrec_to_irec(struct xfs_mount *mp, union xfs_btree_rec *rec,
+		struct xfs_inobt_rec_incore *irec);
+
+int xfs_ialloc_cluster_alignment(struct xfs_mount *mp);
 
 #endif	/* __XFS_IALLOC_H__ */



* [PATCH 07/13] xfs: check if an inode is cached and allocated
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (5 preceding siblings ...)
  2017-06-02 21:24 ` [PATCH 06/13] xfs: export _inobt_btrec_to_irec and _ialloc_cluster_alignment for scrub Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-06 16:28   ` Brian Foster
                     ` (2 more replies)
  2017-06-02 21:24 ` [PATCH 08/13] xfs: reflink find shared should take a transaction Darrick J. Wong
                   ` (7 subsequent siblings)
  14 siblings, 3 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Check the inode cache for a particular inode number.  If the inode is in
the cache and is not new, reclaimable, or being reclaimed, return zero
and report through *inuse whether the inode is allocated; otherwise
return a negative error code.  Various scrubbers will use this function
to decide whether the cache is more up to date than the disk when
checking if an inode is allocated.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
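Reader's note: a userspace sketch of the calling convention, assuming
the behaviour visible in the code below (0 plus *inuse on a definite
answer, -ENOENT when the inode is not cached, -EAGAIN when it is in a
transitional state).  The demo_* names and the fallback policy in the
caller are hypothetical; the patch does not say what a scrubber does
when the cache cannot answer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache states, for illustration only. */
enum demo_cache_state {
	DEMO_NOT_CACHED,	/* lookup finds nothing */
	DEMO_IN_FLUX,		/* new, reclaimable, or being torn down */
	DEMO_CACHED_FREE,	/* cached, mode == 0 */
	DEMO_CACHED_INUSE,	/* cached, mode != 0 */
};

/* Stand-in that mirrors the return convention described above. */
static int demo_inode_is_allocated(enum demo_cache_state st, bool *inuse)
{
	assert(inuse != NULL);
	switch (st) {
	case DEMO_NOT_CACHED:
		return -ENOENT;
	case DEMO_IN_FLUX:
		return -EAGAIN;
	case DEMO_CACHED_FREE:
		*inuse = false;
		return 0;
	case DEMO_CACHED_INUSE:
		*inuse = true;
		return 0;
	}
	return -EINVAL;
}

/*
 * Hypothetical caller: trust the cache when it gives a definite
 * answer, otherwise fall back to what the on-disk metadata says.
 */
static bool demo_inode_allocated(enum demo_cache_state st, bool ondisk_inuse)
{
	bool	inuse;

	if (demo_inode_is_allocated(st, &inuse) == 0)
		return inuse;
	return ondisk_inuse;
}
```

Note that *inuse is only meaningful when the return value is zero, so
the caller must check the error before reading it.
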
 fs/xfs/xfs_icache.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_icache.h |    3 ++
 2 files changed, 86 insertions(+)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index f61c84f8..d610a7e 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -633,6 +633,89 @@ xfs_iget(
 }
 
 /*
+ * "Is this a cached inode that's also allocated?"
+ *
+ * Look up an inode by number in the given file system.  If the inode is
+ * in cache and isn't in purgatory, return 0 and set *inuse to whether
+ * the inode is allocated.  For all other cases (not in cache, being
+ * torn down, etc.), return a negative error code.
+ *
+ * (The caller has to prevent inode allocation activity.)
+ */
+int
+xfs_icache_inode_is_allocated(
+	struct xfs_mount	*mp,
+	struct xfs_trans	*tp,
+	xfs_ino_t		ino,
+	bool			*inuse)
+{
+	struct xfs_inode	*ip;
+	struct xfs_perag	*pag;
+	xfs_agino_t		agino;
+	int			ret = 0;
+
+	/* reject inode numbers outside existing AGs */
+	if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
+		return -EINVAL;
+
+	/* get the perag structure and ensure that it's inode capable */
+	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ino));
+	agino = XFS_INO_TO_AGINO(mp, ino);
+
+	rcu_read_lock();
+	ip = radix_tree_lookup(&pag->pag_ici_root, agino);
+	if (!ip) {
+		ret = -ENOENT;
+		goto out;
+	}
+
+	/*
+	 * Is the inode being reused?  Is it new?  Is it being
+	 * reclaimed?  Is it being torn down?  For any of those cases,
+	 * fall back.
+	 */
+	spin_lock(&ip->i_flags_lock);
+	if (ip->i_ino != ino ||
+	    (ip->i_flags & (XFS_INEW | XFS_IRECLAIM | XFS_IRECLAIMABLE))) {
+		ret = -EAGAIN;
+		goto out_istate;
+	}
+
+	/*
+	 * If lookup is racing with unlink, jump out immediately.
+	 */
+	if (VFS_I(ip)->i_mode == 0) {
+		*inuse = false;
+		ret = 0;
+		goto out_istate;
+	}
+
+	/* If the VFS inode is being torn down, forget it. */
+	if (!igrab(VFS_I(ip))) {
+		ret = -EAGAIN;
+		goto out_istate;
+	}
+
+	/* We've got a live one. */
+	spin_unlock(&ip->i_flags_lock);
+	rcu_read_unlock();
+	xfs_perag_put(pag);
+
+	*inuse = !!(VFS_I(ip)->i_mode);
+	ret = 0;
+	IRELE(ip);
+
+	return ret;
+
+out_istate:
+	spin_unlock(&ip->i_flags_lock);
+out:
+	rcu_read_unlock();
+	xfs_perag_put(pag);
+	return ret;
+}
+
+/*
  * The inode lookup is done in batches to keep the amount of lock traffic and
  * radix tree lookups to a minimum. The batch size is a trade off between
  * lookup reduction and stack usage. This is in the reclaim path, so we can't
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 9183f77..eadf718 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -126,4 +126,7 @@ xfs_fs_eofblocks_from_user(
 	return 0;
 }
 
+int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
+				  xfs_ino_t ino, bool *inuse);
+
 #endif



* [PATCH 08/13] xfs: reflink find shared should take a transaction
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (6 preceding siblings ...)
  2017-06-02 21:24 ` [PATCH 07/13] xfs: check if an inode is cached and allocated Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-06 16:28   ` Brian Foster
  2017-06-02 21:24 ` [PATCH 09/13] xfs: separate function to check if reflink flag needed Darrick J. Wong
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Adapt _reflink_find_shared to take an optional transaction pointer.  The
inode scrubber code will need to decide (within transaction context) if
a file has shared blocks.  To avoid buffer deadlocks, we must pass the
tp through to this function's utility calls.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
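Reader's note: a toy userspace model of why the transaction pointer
has to be threaded through.  Everything here (demo_trans, demo_buf,
the -EDEADLK return standing in for "would block forever") is a
hypothetical simplification, not the XFS buffer cache.  It assumes only
that a buffer already locked by a transaction can be re-acquired by
that same transaction, while a lookup with no transaction cannot
recognise the existing lock.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_trans {
	int			id;
};

struct demo_buf {
	int			locked;
	struct demo_trans	*owner;	/* transaction holding the lock */
	int			recur;	/* extra holds taken by the owner */
};

/* Lock a buffer; -EDEADLK models a lookup that would never return. */
static int demo_read_buf(struct demo_trans *tp, struct demo_buf *bp)
{
	if (bp->locked) {
		if (tp != NULL && bp->owner == tp) {
			bp->recur++;	/* same transaction: just nest */
			return 0;
		}
		return -EDEADLK;
	}
	bp->locked = 1;
	bp->owner = tp;
	bp->recur = 0;
	return 0;
}

/* Drop one hold; the last drop unlocks.  A NULL tp releases directly. */
static void demo_trans_brelse(struct demo_trans *tp, struct demo_buf *bp)
{
	assert(bp->locked);
	if (tp != NULL && bp->owner == tp && bp->recur > 0) {
		bp->recur--;
		return;
	}
	bp->locked = 0;
	bp->owner = NULL;
}
```

In this model a caller that already holds the buffer in its transaction
succeeds only when it passes that same transaction down; passing NULL
hits the deadlock case.
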
 fs/xfs/xfs_bmap_util.c |    4 ++--
 fs/xfs/xfs_reflink.c   |   15 ++++++++-------
 fs/xfs/xfs_reflink.h   |    6 +++---
 3 files changed, 13 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 308428d..fe83bbc 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -455,8 +455,8 @@ xfs_getbmap_adjust_shared(
 
 	agno = XFS_FSB_TO_AGNO(mp, map->br_startblock);
 	agbno = XFS_FSB_TO_AGBNO(mp, map->br_startblock);
-	error = xfs_reflink_find_shared(mp, agno, agbno, map->br_blockcount,
-			&ebno, &elen, true);
+	error = xfs_reflink_find_shared(mp, NULL, agno, agbno,
+			map->br_blockcount, &ebno, &elen, true);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index ffe6fe7..e25c995 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -155,6 +155,7 @@
 int
 xfs_reflink_find_shared(
 	struct xfs_mount	*mp,
+	struct xfs_trans	*tp,
 	xfs_agnumber_t		agno,
 	xfs_agblock_t		agbno,
 	xfs_extlen_t		aglen,
@@ -166,18 +167,18 @@ xfs_reflink_find_shared(
 	struct xfs_btree_cur	*cur;
 	int			error;
 
-	error = xfs_alloc_read_agf(mp, NULL, agno, 0, &agbp);
+	error = xfs_alloc_read_agf(mp, tp, agno, 0, &agbp);
 	if (error)
 		return error;
 
-	cur = xfs_refcountbt_init_cursor(mp, NULL, agbp, agno, NULL);
+	cur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno, NULL);
 
 	error = xfs_refcount_find_shared(cur, agbno, aglen, fbno, flen,
 			find_end_of_shared);
 
 	xfs_btree_del_cursor(cur, error ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
 
-	xfs_buf_relse(agbp);
+	xfs_trans_brelse(tp, agbp);
 	return error;
 }
 
@@ -217,7 +218,7 @@ xfs_reflink_trim_around_shared(
 	agbno = XFS_FSB_TO_AGBNO(ip->i_mount, irec->br_startblock);
 	aglen = irec->br_blockcount;
 
-	error = xfs_reflink_find_shared(ip->i_mount, agno, agbno,
+	error = xfs_reflink_find_shared(ip->i_mount, NULL, agno, agbno,
 			aglen, &fbno, &flen, true);
 	if (error)
 		return error;
@@ -1373,8 +1374,8 @@ xfs_reflink_dirty_extents(
 			agbno = XFS_FSB_TO_AGBNO(mp, map[1].br_startblock);
 			aglen = map[1].br_blockcount;
 
-			error = xfs_reflink_find_shared(mp, agno, agbno, aglen,
-					&rbno, &rlen, true);
+			error = xfs_reflink_find_shared(mp, NULL, agno, agbno,
+					aglen, &rbno, &rlen, true);
 			if (error)
 				goto out;
 			if (rbno == NULLAGBLOCK)
@@ -1445,7 +1446,7 @@ xfs_reflink_clear_inode_flag(
 		agbno = XFS_FSB_TO_AGBNO(mp, map.br_startblock);
 		aglen = map.br_blockcount;
 
-		error = xfs_reflink_find_shared(mp, agno, agbno, aglen,
+		error = xfs_reflink_find_shared(mp, *tpp, agno, agbno, aglen,
 				&rbno, &rlen, false);
 		if (error)
 			return error;
diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
index d29a796..b8cc5c3 100644
--- a/fs/xfs/xfs_reflink.h
+++ b/fs/xfs/xfs_reflink.h
@@ -20,9 +20,9 @@
 #ifndef __XFS_REFLINK_H
 #define __XFS_REFLINK_H 1
 
-extern int xfs_reflink_find_shared(struct xfs_mount *mp, xfs_agnumber_t agno,
-		xfs_agblock_t agbno, xfs_extlen_t aglen, xfs_agblock_t *fbno,
-		xfs_extlen_t *flen, bool find_maximal);
+extern int xfs_reflink_find_shared(struct xfs_mount *mp, struct xfs_trans *tp,
+		xfs_agnumber_t agno, xfs_agblock_t agbno, xfs_extlen_t aglen,
+		xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_maximal);
 extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip,
 		struct xfs_bmbt_irec *irec, bool *shared, bool *trimmed);
 



* [PATCH 09/13] xfs: separate function to check if reflink flag needed
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (7 preceding siblings ...)
  2017-06-02 21:24 ` [PATCH 08/13] xfs: reflink find shared should take a transaction Darrick J. Wong
@ 2017-06-02 21:24 ` Darrick J. Wong
  2017-06-06 16:28   ` Brian Foster
  2017-06-07  1:26   ` [PATCH v2 " Darrick J. Wong
  2017-06-02 21:25 ` [PATCH 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:24 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Separate the "clear reflink flag" function into one function that checks
if the flag is needed, and a second function that checks and clears the
flag.  The inode scrub code will want to check the necessity of the flag
without clearing it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
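Reader's note: a small userspace sketch of the split described above,
assuming a toy inode whose extents are reduced to an array of "is this
extent shared?" booleans.  The demo_* names are hypothetical.  The
shape follows the patch: a read-only query that reports through an
output parameter, and a mutator that calls the query and returns early
on error or when the flag is still needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inode: extents reduced to shared/unshared booleans. */
struct demo_inode {
	bool		reflink_flag;
	const bool	*extent_shared;
	size_t		nextents;
};

/* Read-only check: does any extent still share blocks? */
static int demo_needs_inode_flag(const struct demo_inode *ip,
				 bool *needs_flag)
{
	size_t	i;

	*needs_flag = false;
	for (i = 0; i < ip->nextents; i++) {
		if (ip->extent_shared[i]) {
			*needs_flag = true;
			return 0;
		}
	}
	return 0;
}

/* Check, then clear the flag only if nothing is shared any more. */
static int demo_clear_inode_flag(struct demo_inode *ip)
{
	bool	needs;
	int	error;

	assert(ip->reflink_flag);

	error = demo_needs_inode_flag(ip, &needs);
	if (error || needs)
		return error;

	ip->reflink_flag = false;
	return 0;
}
```

A scrubber-style caller can use the read-only check to compare the flag
against reality without ever modifying the inode.
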
 fs/xfs/xfs_reflink.c |   88 ++++++++++++++++++++++++++++++--------------------
 fs/xfs/xfs_reflink.h |    2 +
 2 files changed, 54 insertions(+), 36 deletions(-)


diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index e25c995..133ee02 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1406,57 +1406,73 @@ xfs_reflink_dirty_extents(
 	return error;
 }
 
-/* Clear the inode reflink flag if there are no shared extents. */
+/* Does this inode need the reflink flag? */
 int
-xfs_reflink_clear_inode_flag(
-	struct xfs_inode	*ip,
-	struct xfs_trans	**tpp)
+xfs_reflink_needs_inode_flag(
+	struct xfs_trans		*tp,
+	struct xfs_inode		*ip,
+	bool				*needs_flag)
 {
-	struct xfs_mount	*mp = ip->i_mount;
-	xfs_fileoff_t		fbno;
-	xfs_filblks_t		end;
-	xfs_agnumber_t		agno;
-	xfs_agblock_t		agbno;
-	xfs_extlen_t		aglen;
-	xfs_agblock_t		rbno;
-	xfs_extlen_t		rlen;
-	struct xfs_bmbt_irec	map;
-	int			nmaps;
-	int			error = 0;
-
-	ASSERT(xfs_is_reflink_inode(ip));
+	struct xfs_bmbt_irec		got;
+	struct xfs_mount		*mp = ip->i_mount;
+	struct xfs_ifork		*ifp;
+	xfs_agnumber_t			agno;
+	xfs_agblock_t			agbno;
+	xfs_extlen_t			aglen;
+	xfs_agblock_t			rbno;
+	xfs_extlen_t			rlen;
+	xfs_extnum_t			idx;
+	bool				found;
+	int				error;
 
-	fbno = 0;
-	end = XFS_B_TO_FSB(mp, i_size_read(VFS_I(ip)));
-	while (end - fbno > 0) {
-		nmaps = 1;
-		/*
-		 * Look for extents in the file.  Skip holes, delalloc, or
-		 * unwritten extents; they can't be reflinked.
-		 */
-		error = xfs_bmapi_read(ip, fbno, end - fbno, &map, &nmaps, 0);
+	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+		error = xfs_iread_extents(tp, ip, XFS_DATA_FORK);
 		if (error)
 			return error;
-		if (nmaps == 0)
-			break;
-		if (!xfs_bmap_is_real_extent(&map))
-			goto next;
+	}
 
-		agno = XFS_FSB_TO_AGNO(mp, map.br_startblock);
-		agbno = XFS_FSB_TO_AGBNO(mp, map.br_startblock);
-		aglen = map.br_blockcount;
+	*needs_flag = false;
+	found = xfs_iext_lookup_extent(ip, ifp, 0, &idx, &got);
+	while (found) {
+		if (isnullstartblock(got.br_startblock) ||
+		    got.br_state != XFS_EXT_NORM)
+			goto next;
+		agno = XFS_FSB_TO_AGNO(mp, got.br_startblock);
+		agbno = XFS_FSB_TO_AGBNO(mp, got.br_startblock);
+		aglen = got.br_blockcount;
 
-		error = xfs_reflink_find_shared(mp, *tpp, agno, agbno, aglen,
+		error = xfs_reflink_find_shared(mp, tp, agno, agbno, aglen,
 				&rbno, &rlen, false);
 		if (error)
 			return error;
 		/* Is there still a shared block here? */
-		if (rbno != NULLAGBLOCK)
+		if (rbno != NULLAGBLOCK) {
+			*needs_flag = true;
 			return 0;
+		}
 next:
-		fbno = map.br_startoff + map.br_blockcount;
+		found = xfs_iext_get_extent(ifp, ++idx, &got);
 	}
 
+	return 0;
+}
+
+/* Clear the inode reflink flag if there are no shared extents. */
+int
+xfs_reflink_clear_inode_flag(
+	struct xfs_inode	*ip,
+	struct xfs_trans	**tpp)
+{
+	bool			needs;
+	int			error = 0;
+
+	ASSERT(xfs_is_reflink_inode(ip));
+
+	error = xfs_reflink_needs_inode_flag(*tpp, ip, &needs);
+	if (error || needs)
+		return error;
+
 	/*
 	 * We didn't find any shared blocks so turn off the reflink flag.
 	 * First, get rid of any leftover CoW mappings.
diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
index b8cc5c3..a26d795 100644
--- a/fs/xfs/xfs_reflink.h
+++ b/fs/xfs/xfs_reflink.h
@@ -47,6 +47,8 @@ extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset,
 extern int xfs_reflink_recover_cow(struct xfs_mount *mp);
 extern int xfs_reflink_remap_range(struct file *file_in, loff_t pos_in,
 		struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe);
+extern int xfs_reflink_needs_inode_flag(struct xfs_trans *tp,
+		struct xfs_inode *ip, bool *needs_flag);
 extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip,
 		struct xfs_trans **tpp);
 extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,

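The split described in the commit message for patch 09 (a read-only
"is the flag needed?" query, plus a wrapper that checks and clears) can
be sketched outside the kernel as follows.  This is only an
illustration under stated assumptions: the demo_* types and functions
are hypothetical stand-ins, not XFS structures, and the "shared" bit
replaces the real refcount btree lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one file extent; not an XFS type. */
struct demo_extent {
	bool		real;	/* false for holes/delalloc/unwritten */
	bool		shared;	/* true if some other owner references it */
};

/* Hypothetical stand-in for an inode with a reflink flag. */
struct demo_inode {
	bool				reflink_flag;
	const struct demo_extent	*extents;
	size_t				nextents;
};

/* Read-only query: does any real extent still contain a shared block? */
int
demo_needs_flag(
	const struct demo_inode	*ip,
	bool			*needs_flag)
{
	size_t			i;

	*needs_flag = false;
	for (i = 0; i < ip->nextents; i++) {
		/* Extents that are not real cannot be shared; skip them. */
		if (!ip->extents[i].real)
			continue;
		if (ip->extents[i].shared) {
			*needs_flag = true;
			return 0;
		}
	}
	return 0;
}

/* Mutating wrapper: clear the flag only when the query says it is unneeded. */
int
demo_clear_flag(
	struct demo_inode	*ip)
{
	bool			needs;
	int			error;

	assert(ip->reflink_flag);

	error = demo_needs_flag(ip, &needs);
	if (error || needs)
		return error;

	ip->reflink_flag = false;
	return 0;
}
```

A checker can call demo_needs_flag() alone and compare the answer with
the flag's current state, which is the property the scrub code wants.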


* [PATCH 10/13] xfs: refactor the ifork block counting function
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (8 preceding siblings ...)
  2017-06-02 21:24 ` [PATCH 09/13] xfs: separate function to check if reflink flag needed Darrick J. Wong
@ 2017-06-02 21:25 ` Darrick J. Wong
  2017-06-06 16:29   ` Brian Foster
                     ` (2 more replies)
  2017-06-02 21:25 ` [PATCH 11/13] xfs: return the hash value of a leaf1 directory block Darrick J. Wong
                   ` (4 subsequent siblings)
  14 siblings, 3 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Refactor the inode fork block counting function to count extents for us
at the same time.  This will be used by the bmbt scrubber function.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_util.c |  105 +++++++++++++++++++++++++++++-------------------
 fs/xfs/xfs_bmap_util.h |    4 ++
 2 files changed, 67 insertions(+), 42 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index fe83bbc..fc15305 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -226,7 +226,7 @@ xfs_bmap_count_leaves(
 	xfs_ifork_t		*ifp,
 	xfs_extnum_t		idx,
 	int			numrecs,
-	int			*count)
+	unsigned long long	*count)
 {
 	int		b;
 
@@ -245,7 +245,7 @@ xfs_bmap_disk_count_leaves(
 	struct xfs_mount	*mp,
 	struct xfs_btree_block	*block,
 	int			numrecs,
-	int			*count)
+	unsigned long long	*count)
 {
 	int		b;
 	xfs_bmbt_rec_t	*frp;
@@ -260,17 +260,18 @@ xfs_bmap_disk_count_leaves(
  * Recursively walks each level of a btree
  * to count total fsblocks in use.
  */
-STATIC int                                     /* error */
+STATIC int
 xfs_bmap_count_tree(
-	xfs_mount_t     *mp,            /* file system mount point */
-	xfs_trans_t     *tp,            /* transaction pointer */
-	xfs_ifork_t	*ifp,		/* inode fork pointer */
-	xfs_fsblock_t   blockno,	/* file system block number */
-	int             levelin,	/* level in btree */
-	int		*count)		/* Count of blocks */
+	struct xfs_mount	*mp,
+	struct xfs_trans	*tp,
+	struct xfs_ifork	*ifp,
+	xfs_fsblock_t		blockno,
+	int			levelin,
+	unsigned int		*nextents,
+	unsigned long long	*count)
 {
 	int			error;
-	xfs_buf_t		*bp, *nbp;
+	struct xfs_buf		*bp, *nbp;
 	int			level = levelin;
 	__be64			*pp;
 	xfs_fsblock_t           bno = blockno;
@@ -303,8 +304,9 @@ xfs_bmap_count_tree(
 		/* Dive to the next level */
 		pp = XFS_BMBT_PTR_ADDR(mp, block, 1, mp->m_bmap_dmxr[1]);
 		bno = be64_to_cpu(*pp);
-		if (unlikely((error =
-		     xfs_bmap_count_tree(mp, tp, ifp, bno, level, count)) < 0)) {
+		error = xfs_bmap_count_tree(mp, tp, ifp, bno, level, nextents,
+				count);
+		if (error) {
 			xfs_trans_brelse(tp, bp);
 			XFS_ERROR_REPORT("xfs_bmap_count_tree(1)",
 					 XFS_ERRLEVEL_LOW, mp);
@@ -316,6 +318,7 @@ xfs_bmap_count_tree(
 		for (;;) {
 			nextbno = be64_to_cpu(block->bb_u.l.bb_rightsib);
 			numrecs = be16_to_cpu(block->bb_numrecs);
+			(*nextents) += numrecs;
 			xfs_bmap_disk_count_leaves(mp, block, numrecs, count);
 			xfs_trans_brelse(tp, bp);
 			if (nextbno == NULLFSBLOCK)
@@ -336,44 +339,61 @@ xfs_bmap_count_tree(
 /*
  * Count fsblocks of the given fork.
  */
-static int					/* error */
+int
 xfs_bmap_count_blocks(
-	xfs_trans_t		*tp,		/* transaction pointer */
-	xfs_inode_t		*ip,		/* incore inode */
-	int			whichfork,	/* data or attr fork */
-	int			*count)		/* out: count of blocks */
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	int			whichfork,
+	unsigned int		*nextents,
+	unsigned long long	*count)
 {
 	struct xfs_btree_block	*block;	/* current btree block */
 	xfs_fsblock_t		bno;	/* block # of "block" */
-	xfs_ifork_t		*ifp;	/* fork structure */
+	struct xfs_ifork	*ifp;	/* fork structure */
 	int			level;	/* btree level, for checking */
-	xfs_mount_t		*mp;	/* file system mount structure */
+	struct xfs_mount	*mp;	/* file system mount structure */
 	__be64			*pp;	/* pointer to block address */
+	int			error;
 
 	bno = NULLFSBLOCK;
 	mp = ip->i_mount;
+	*nextents = 0;
 	ifp = XFS_IFORK_PTR(ip, whichfork);
-	if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) {
-		xfs_bmap_count_leaves(ifp, 0, xfs_iext_count(ifp), count);
+	if (!ifp)
 		return 0;
-	}
 
-	/*
-	 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
-	 */
-	block = ifp->if_broot;
-	level = be16_to_cpu(block->bb_level);
-	ASSERT(level > 0);
-	pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
-	bno = be64_to_cpu(*pp);
-	ASSERT(bno != NULLFSBLOCK);
-	ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
-	ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
-
-	if (unlikely(xfs_bmap_count_tree(mp, tp, ifp, bno, level, count) < 0)) {
-		XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)", XFS_ERRLEVEL_LOW,
-				 mp);
-		return -EFSCORRUPTED;
+	switch (XFS_IFORK_FORMAT(ip, whichfork)) {
+	case XFS_DINODE_FMT_EXTENTS:
+		*nextents = xfs_iext_count(ifp);
+		xfs_bmap_count_leaves(ifp, 0, (*nextents), count);
+		return 0;
+	case XFS_DINODE_FMT_BTREE:
+		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+			error = xfs_iread_extents(tp, ip, whichfork);
+			if (error)
+				return error;
+		}
+
+		/*
+		 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
+		 */
+		block = ifp->if_broot;
+		level = be16_to_cpu(block->bb_level);
+		ASSERT(level > 0);
+		pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
+		bno = be64_to_cpu(*pp);
+		ASSERT(bno != NULLFSBLOCK);
+		ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
+		ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
+
+		error = xfs_bmap_count_tree(mp, tp, ifp, bno, level,
+				nextents, count);
+		if (error) {
+			XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)",
+					XFS_ERRLEVEL_LOW, mp);
+			return -EFSCORRUPTED;
+		}
+		return 0;
 	}
 
 	return 0;
@@ -1789,8 +1809,9 @@ xfs_swap_extent_forks(
 	int			*target_log_flags)
 {
 	struct xfs_ifork	tempifp, *ifp, *tifp;
-	int			aforkblks = 0;
-	int			taforkblks = 0;
+	unsigned long long	aforkblks = 0;
+	unsigned long long	taforkblks = 0;
+	unsigned int		junk;
 	xfs_extnum_t		nextents;
 	uint64_t		tmp;
 	int			error;
@@ -1800,14 +1821,14 @@ xfs_swap_extent_forks(
 	 */
 	if ( ((XFS_IFORK_Q(ip) != 0) && (ip->i_d.di_anextents > 0)) &&
 	     (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
-		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK,
+		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK, &junk,
 				&aforkblks);
 		if (error)
 			return error;
 	}
 	if ( ((XFS_IFORK_Q(tip) != 0) && (tip->i_d.di_anextents > 0)) &&
 	     (tip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
-		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK,
+		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK, &junk,
 				&taforkblks);
 		if (error)
 			return error;
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 135d826..993973c 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -70,4 +70,8 @@ int	xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
 
 xfs_daddr_t xfs_fsb_to_db(struct xfs_inode *ip, xfs_fsblock_t fsb);
 
+int xfs_bmap_count_blocks(struct xfs_trans *tp, struct xfs_inode *ip,
+			  int whichfork, unsigned int *nextents,
+			  unsigned long long *count);
+
 #endif	/* __XFS_BMAP_UTIL_H__ */

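As a rough model of what patch 10 changes, the sketch below counts
extents and blocks in one pass and uses a 64-bit block counter.  It
assumes a flat array of records rather than a real bmap btree, and the
demo_* names are hypothetical.  As in the patch, the extent count is
reset by the callee while the block count is accumulated into whatever
the caller passed in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one extent record; not an XFS type. */
struct demo_bmap_rec {
	unsigned long long	startoff;
	unsigned int		blockcount;
};

/*
 * Walk the records once, reporting both how many extents there are and
 * how many blocks they cover.  *count is added to, not overwritten.
 */
int
demo_count_blocks(
	const struct demo_bmap_rec	*recs,
	size_t				nrecs,
	unsigned int			*nextents,
	unsigned long long		*count)
{
	size_t				i;

	assert(nextents != NULL && count != NULL);

	*nextents = 0;
	for (i = 0; i < nrecs; i++) {
		(*nextents)++;
		*count += recs[i].blockcount;
	}
	return 0;
}
```

Two extents of 2^31 blocks each already sum to 2^32, which a 32-bit int
counter cannot represent; the wider type avoids that.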


* [PATCH 11/13] xfs: return the hash value of a leaf1 directory block
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (9 preceding siblings ...)
  2017-06-02 21:25 ` [PATCH 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
@ 2017-06-02 21:25 ` Darrick J. Wong
  2017-06-08 13:02   ` Brian Foster
  2017-06-08 18:22   ` [PATCH v2 " Darrick J. Wong
  2017-06-02 21:25 ` [PATCH 12/13] xfs: pass along transaction context when reading directory block buffers Darrick J. Wong
                   ` (3 subsequent siblings)
  14 siblings, 2 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Provide a way to calculate the highest hash value of a leaf1 block.
This will be used by the directory scrubbing code to check the sanity
of hashes in leaf1 directory blocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_dir2_node.c |   28 ++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_dir2_priv.h |    2 ++
 2 files changed, 30 insertions(+)


diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index bbd1238..15c1881 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -524,6 +524,34 @@ xfs_dir2_free_hdr_check(
 #endif	/* DEBUG */
 
 /*
+ * Return the last hash value in the leaf1.
+ * Stale entries are ok.
+ */
+xfs_dahash_t					/* hash value */
+xfs_dir2_leaf1_lasthash(
+	struct xfs_inode	*dp,
+	struct xfs_buf		*bp,		/* leaf buffer */
+	int			*count)		/* count of entries in leaf */
+{
+	struct xfs_dir2_leaf	*leaf = bp->b_addr;
+	struct xfs_dir2_leaf_entry *ents;
+	struct xfs_dir3_icleaf_hdr leafhdr;
+
+	dp->d_ops->leaf_hdr_from_disk(&leafhdr, leaf);
+
+	ASSERT(leafhdr.magic == XFS_DIR2_LEAF1_MAGIC ||
+	       leafhdr.magic == XFS_DIR3_LEAF1_MAGIC);
+
+	if (count)
+		*count = leafhdr.count;
+	if (!leafhdr.count)
+		return 0;
+
+	ents = dp->d_ops->leaf_ents_p(leaf);
+	return be32_to_cpu(ents[leafhdr.count - 1].hashval);
+}
+
+/*
  * Return the last hash value in the leaf.
  * Stale entries are ok.
  */
diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
index 576f2d2..c09bca1 100644
--- a/fs/xfs/libxfs/xfs_dir2_priv.h
+++ b/fs/xfs/libxfs/xfs_dir2_priv.h
@@ -95,6 +95,8 @@ extern bool xfs_dir3_leaf_check_int(struct xfs_mount *mp, struct xfs_inode *dp,
 /* xfs_dir2_node.c */
 extern int xfs_dir2_leaf_to_node(struct xfs_da_args *args,
 		struct xfs_buf *lbp);
+extern xfs_dahash_t xfs_dir2_leaf1_lasthash(struct xfs_inode *dp,
+		struct xfs_buf *bp, int *count);
 extern xfs_dahash_t xfs_dir2_leafn_lasthash(struct xfs_inode *dp,
 		struct xfs_buf *bp, int *count);
 extern int xfs_dir2_leafn_lookup_int(struct xfs_buf *bp,

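The helper added in patch 11 depends on leaf entries being sorted by
hash value, so the last entry holds the highest hash.  A minimal
userspace sketch of the same idea follows.  The demo_* names are
hypothetical, the entry layout is simplified to the hash field only,
and the big-endian decode is written out by hand instead of using the
kernel's be32_to_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk entry: only the big-endian hash is modelled. */
struct demo_leaf_entry {
	uint8_t		hashval[4];
};

/* Decode a big-endian 32-bit value regardless of host byte order. */
uint32_t
demo_be32_to_cpu(
	const uint8_t	b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/*
 * Return the hash of the last entry, which is the highest hash when
 * the entries are sorted in ascending hash order.  An empty leaf
 * yields 0.  The entry count is optionally reported through *count.
 */
uint32_t
demo_leaf_lasthash(
	const struct demo_leaf_entry	*ents,
	int				nents,
	int				*count)
{
	assert(nents >= 0);

	if (count)
		*count = nents;
	if (!nents)
		return 0;
	return demo_be32_to_cpu(ents[nents - 1].hashval);
}
```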


* [PATCH 12/13] xfs: pass along transaction context when reading directory block buffers
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (10 preceding siblings ...)
  2017-06-02 21:25 ` [PATCH 11/13] xfs: return the hash value of a leaf1 directory block Darrick J. Wong
@ 2017-06-02 21:25 ` Darrick J. Wong
  2017-06-08 13:02   ` Brian Foster
  2017-06-02 21:25 ` [PATCH 13/13] xfs: pass along transaction context when reading xattr " Darrick J. Wong
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Teach the directory reading functions to pass along a transaction context
if one was supplied.  The directory scrub code will use transactions to
lock buffers and avoid deadlocking with itself in the case of loops.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_dir2_priv.h |    4 ++--
 fs/xfs/xfs_dir2_readdir.c     |   15 +++++++++++----
 fs/xfs/xfs_file.c             |    2 +-
 3 files changed, 14 insertions(+), 7 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
index c09bca1..cb679cf 100644
--- a/fs/xfs/libxfs/xfs_dir2_priv.h
+++ b/fs/xfs/libxfs/xfs_dir2_priv.h
@@ -132,7 +132,7 @@ extern int xfs_dir2_sf_replace(struct xfs_da_args *args);
 extern int xfs_dir2_sf_verify(struct xfs_inode *ip);
 
 /* xfs_dir2_readdir.c */
-extern int xfs_readdir(struct xfs_inode *dp, struct dir_context *ctx,
-		       size_t bufsize);
+extern int xfs_readdir(struct xfs_trans *tp, struct xfs_inode *dp,
+		       struct dir_context *ctx, size_t bufsize);
 
 #endif /* __XFS_DIR2_PRIV_H__ */
diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
index ede4790..ba2638d 100644
--- a/fs/xfs/xfs_dir2_readdir.c
+++ b/fs/xfs/xfs_dir2_readdir.c
@@ -170,7 +170,7 @@ xfs_dir2_block_getdents(
 		return 0;
 
 	lock_mode = xfs_ilock_data_map_shared(dp);
-	error = xfs_dir3_block_read(NULL, dp, &bp);
+	error = xfs_dir3_block_read(args->trans, dp, &bp);
 	xfs_iunlock(dp, lock_mode);
 	if (error)
 		return error;
@@ -228,7 +228,7 @@ xfs_dir2_block_getdents(
 		if (!dir_emit(ctx, (char *)dep->name, dep->namelen,
 			    be64_to_cpu(dep->inumber),
 			    xfs_dir3_get_dtype(dp->i_mount, filetype))) {
-			xfs_trans_brelse(NULL, bp);
+			xfs_trans_brelse(args->trans, bp);
 			return 0;
 		}
 	}
@@ -239,7 +239,7 @@ xfs_dir2_block_getdents(
 	 */
 	ctx->pos = xfs_dir2_db_off_to_dataptr(geo, geo->datablk + 1, 0) &
 								0x7fffffff;
-	xfs_trans_brelse(NULL, bp);
+	xfs_trans_brelse(args->trans, bp);
 	return 0;
 }
 
@@ -495,15 +495,21 @@ xfs_dir2_leaf_getdents(
 	else
 		ctx->pos = xfs_dir2_byte_to_dataptr(curoff) & 0x7fffffff;
 	if (bp)
-		xfs_trans_brelse(NULL, bp);
+		xfs_trans_brelse(args->trans, bp);
 	return error;
 }
 
 /*
  * Read a directory.
+ *
+ * If supplied, the transaction collects locked dir buffers to avoid
+ * nested buffer deadlocks.  This function does not dirty the
+ * transaction.  The caller should ensure that the inode is locked
+ * before calling this function.
  */
 int
 xfs_readdir(
+	struct xfs_trans	*tp,
 	struct xfs_inode	*dp,
 	struct dir_context	*ctx,
 	size_t			bufsize)
@@ -522,6 +528,7 @@ xfs_readdir(
 
 	args.dp = dp;
 	args.geo = dp->i_mount->m_dir_geo;
+	args.trans = tp;
 
 	if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL)
 		rval = xfs_dir2_sf_getdents(&args, ctx);
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5fb5a09..36c1293 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -950,7 +950,7 @@ xfs_file_readdir(
 	 */
 	bufsize = (size_t)min_t(loff_t, 32768, ip->i_d.di_size);
 
-	return xfs_readdir(ip, ctx, bufsize);
+	return xfs_readdir(NULL, ip, ctx, bufsize);
 }
 
 /*

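To illustrate why patch 12 threads a transaction through the read
path, here is a deliberately simplified model of buffer locking.  It
assumes one lock per buffer and treats "would block forever" as an
-EDEADLK return so the effect is observable; the demo_* names are
hypothetical and the real buffer cache is considerably more involved.
The point is only that a buffer already held by the caller's
transaction can be handed back with a bumped recursion count, whereas
an untracked second read of the same buffer cannot proceed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical transaction handle. */
struct demo_trans {
	int			id;
};

/* Hypothetical buffer with a single lock and an optional owner. */
struct demo_buf {
	int			blkno;
	int			locked;
	struct demo_trans	*owner;	/* transaction holding the lock */
	int			recur;	/* extra holds by that owner */
};

/*
 * Lock a buffer for reading.  With a transaction, re-reading a buffer
 * the transaction already holds just bumps the recursion count.
 * Without one, a second read of a locked buffer would block; this
 * model reports that as -EDEADLK.
 */
int
demo_read_buf(
	struct demo_trans	*tp,
	struct demo_buf		*bp)
{
	assert(bp != NULL);

	if (tp && bp->owner == tp) {
		bp->recur++;
		return 0;
	}
	if (bp->locked)
		return -EDEADLK;
	bp->locked = 1;
	bp->owner = tp;
	bp->recur = 0;
	return 0;
}

/* Drop one hold; the lock is released when the last hold goes away. */
void
demo_brelse(
	struct demo_trans	*tp,
	struct demo_buf		*bp)
{
	assert(bp != NULL);

	if (tp && bp->owner == tp && bp->recur > 0) {
		bp->recur--;
		return;
	}
	bp->locked = 0;
	bp->owner = NULL;
}
```

In this model a scrubber that follows a looping pointer back to a block
it already holds gets the same buffer again instead of hanging.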


* [PATCH 13/13] xfs: pass along transaction context when reading xattr block buffers
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (11 preceding siblings ...)
  2017-06-02 21:25 ` [PATCH 12/13] xfs: pass along transaction context when reading directory block buffers Darrick J. Wong
@ 2017-06-02 21:25 ` Darrick J. Wong
  2017-06-08 13:02   ` Brian Foster
  2017-06-02 22:19 ` [PATCH 14/13] xfs: allow reading of already-locked remote symbolic link Darrick J. Wong
  2017-06-26  6:04 ` [PATCH 15/13] xfs: grab dquots without taking the ilock Darrick J. Wong
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 21:25 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Teach the extended attribute reading functions to pass along a
transaction context if one was supplied.  The extended attribute scrub
code will use transactions to lock buffers and avoid deadlocking with
itself in the case of loops; since it will already have the inode
locked, also create xattr get/list helpers that don't take locks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        |   26 ++++++++++++-----
 fs/xfs/libxfs/xfs_attr_remote.c |    5 ++-
 fs/xfs/xfs_attr.h               |    3 ++
 fs/xfs/xfs_attr_list.c          |   59 ++++++++++++++++++++++-----------------
 4 files changed, 57 insertions(+), 36 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 6622d46..ef8a1c7 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -114,6 +114,23 @@ xfs_inode_hasattr(
  * Overall external interface routines.
  *========================================================================*/
 
+/* Retrieve an extended attribute and its value.  Must have ilock. */
+int
+xfs_attr_get_ilocked(
+	struct xfs_inode	*ip,
+	struct xfs_da_args	*args)
+{
+	if (!xfs_inode_hasattr(ip))
+		return -ENOATTR;
+	else if (ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
+		return xfs_attr_shortform_getvalue(args);
+	else if (xfs_bmap_one_block(ip, XFS_ATTR_FORK))
+		return xfs_attr_leaf_get(args);
+	else
+		return xfs_attr_node_get(args);
+}
+
+/* Retrieve an extended attribute by name, and its value. */
 int
 xfs_attr_get(
 	struct xfs_inode	*ip,
@@ -141,14 +158,7 @@ xfs_attr_get(
 	args.op_flags = XFS_DA_OP_OKNOENT;
 
 	lock_mode = xfs_ilock_attr_map_shared(ip);
-	if (!xfs_inode_hasattr(ip))
-		error = -ENOATTR;
-	else if (ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
-		error = xfs_attr_shortform_getvalue(&args);
-	else if (xfs_bmap_one_block(ip, XFS_ATTR_FORK))
-		error = xfs_attr_leaf_get(&args);
-	else
-		error = xfs_attr_node_get(&args);
+	error = xfs_attr_get_ilocked(ip, &args);
 	xfs_iunlock(ip, lock_mode);
 
 	*valuelenp = args.valuelen;
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index da72b16..5236d8e 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -386,7 +386,8 @@ xfs_attr_rmtval_get(
 			       (map[i].br_startblock != HOLESTARTBLOCK));
 			dblkno = XFS_FSB_TO_DADDR(mp, map[i].br_startblock);
 			dblkcnt = XFS_FSB_TO_BB(mp, map[i].br_blockcount);
-			error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp,
+			error = xfs_trans_read_buf(mp, args->trans,
+						   mp->m_ddev_targp,
 						   dblkno, dblkcnt, 0, &bp,
 						   &xfs_attr3_rmt_buf_ops);
 			if (error)
@@ -395,7 +396,7 @@ xfs_attr_rmtval_get(
 			error = xfs_attr_rmtval_copyout(mp, bp, args->dp->i_ino,
 							&offset, &valuelen,
 							&dst);
-			xfs_buf_relse(bp);
+			xfs_trans_brelse(args->trans, bp);
 			if (error)
 				return error;
 
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index d14691a..5d5a5e2 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -117,6 +117,7 @@ typedef void (*put_listent_func_t)(struct xfs_attr_list_context *, int,
 			      unsigned char *, int, int);
 
 typedef struct xfs_attr_list_context {
+	struct xfs_trans		*tp;
 	struct xfs_inode		*dp;		/* inode */
 	struct attrlist_cursor_kern	*cursor;	/* position in list */
 	char				*alist;		/* output buffer */
@@ -140,8 +141,10 @@ typedef struct xfs_attr_list_context {
  * Overall external interface routines.
  */
 int xfs_attr_inactive(struct xfs_inode *dp);
+int xfs_attr_list_int_ilocked(struct xfs_attr_list_context *);
 int xfs_attr_list_int(struct xfs_attr_list_context *);
 int xfs_inode_hasattr(struct xfs_inode *ip);
+int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
 int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
 		 unsigned char *value, int *valuelenp, int flags);
 int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index 9bc1e12..545eca5 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -230,7 +230,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
 	 */
 	bp = NULL;
 	if (cursor->blkno > 0) {
-		error = xfs_da3_node_read(NULL, dp, cursor->blkno, -1,
+		error = xfs_da3_node_read(context->tp, dp, cursor->blkno, -1,
 					      &bp, XFS_ATTR_FORK);
 		if ((error != 0) && (error != -EFSCORRUPTED))
 			return error;
@@ -242,7 +242,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
 			case XFS_DA_NODE_MAGIC:
 			case XFS_DA3_NODE_MAGIC:
 				trace_xfs_attr_list_wrong_blk(context);
-				xfs_trans_brelse(NULL, bp);
+				xfs_trans_brelse(context->tp, bp);
 				bp = NULL;
 				break;
 			case XFS_ATTR_LEAF_MAGIC:
@@ -254,18 +254,18 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
 				if (cursor->hashval > be32_to_cpu(
 						entries[leafhdr.count - 1].hashval)) {
 					trace_xfs_attr_list_wrong_blk(context);
-					xfs_trans_brelse(NULL, bp);
+					xfs_trans_brelse(context->tp, bp);
 					bp = NULL;
 				} else if (cursor->hashval <= be32_to_cpu(
 						entries[0].hashval)) {
 					trace_xfs_attr_list_wrong_blk(context);
-					xfs_trans_brelse(NULL, bp);
+					xfs_trans_brelse(context->tp, bp);
 					bp = NULL;
 				}
 				break;
 			default:
 				trace_xfs_attr_list_wrong_blk(context);
-				xfs_trans_brelse(NULL, bp);
+				xfs_trans_brelse(context->tp, bp);
 				bp = NULL;
 			}
 		}
@@ -281,7 +281,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
 		for (;;) {
 			uint16_t magic;
 
-			error = xfs_da3_node_read(NULL, dp,
+			error = xfs_da3_node_read(context->tp, dp,
 						      cursor->blkno, -1, &bp,
 						      XFS_ATTR_FORK);
 			if (error)
@@ -297,7 +297,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
 						     XFS_ERRLEVEL_LOW,
 						     context->dp->i_mount,
 						     node);
-				xfs_trans_brelse(NULL, bp);
+				xfs_trans_brelse(context->tp, bp);
 				return -EFSCORRUPTED;
 			}
 
@@ -313,10 +313,10 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
 				}
 			}
 			if (i == nodehdr.count) {
-				xfs_trans_brelse(NULL, bp);
+				xfs_trans_brelse(context->tp, bp);
 				return 0;
 			}
-			xfs_trans_brelse(NULL, bp);
+			xfs_trans_brelse(context->tp, bp);
 		}
 	}
 	ASSERT(bp != NULL);
@@ -333,12 +333,12 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
 		if (context->seen_enough || leafhdr.forw == 0)
 			break;
 		cursor->blkno = leafhdr.forw;
-		xfs_trans_brelse(NULL, bp);
-		error = xfs_attr3_leaf_read(NULL, dp, cursor->blkno, -1, &bp);
+		xfs_trans_brelse(context->tp, bp);
+		error = xfs_attr3_leaf_read(context->tp, dp, cursor->blkno, -1, &bp);
 		if (error)
 			return error;
 	}
-	xfs_trans_brelse(NULL, bp);
+	xfs_trans_brelse(context->tp, bp);
 	return 0;
 }
 
@@ -448,16 +448,34 @@ xfs_attr_leaf_list(xfs_attr_list_context_t *context)
 	trace_xfs_attr_leaf_list(context);
 
 	context->cursor->blkno = 0;
-	error = xfs_attr3_leaf_read(NULL, context->dp, 0, -1, &bp);
+	error = xfs_attr3_leaf_read(context->tp, context->dp, 0, -1, &bp);
 	if (error)
 		return error;
 
 	xfs_attr3_leaf_list_int(bp, context);
-	xfs_trans_brelse(NULL, bp);
+	xfs_trans_brelse(context->tp, bp);
 	return 0;
 }
 
 int
+xfs_attr_list_int_ilocked(
+	struct xfs_attr_list_context	*context)
+{
+	struct xfs_inode		*dp = context->dp;
+
+	/*
+	 * Decide on what work routines to call based on the inode size.
+	 */
+	if (!xfs_inode_hasattr(dp))
+		return 0;
+	else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
+		return xfs_attr_shortform_list(context);
+	else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
+		return xfs_attr_leaf_list(context);
+	return xfs_attr_node_list(context);
+}
+
+int
 xfs_attr_list_int(
 	xfs_attr_list_context_t *context)
 {
@@ -470,19 +488,8 @@ xfs_attr_list_int(
 	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
 		return -EIO;
 
-	/*
-	 * Decide on what work routines to call based on the inode size.
-	 */
 	lock_mode = xfs_ilock_attr_map_shared(dp);
-	if (!xfs_inode_hasattr(dp)) {
-		error = 0;
-	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
-		error = xfs_attr_shortform_list(context);
-	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
-		error = xfs_attr_leaf_list(context);
-	} else {
-		error = xfs_attr_node_list(context);
-	}
+	error = xfs_attr_list_int_ilocked(context);
 	xfs_iunlock(dp, lock_mode);
 	return error;
 }

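The "_ilocked" helpers in patch 13 follow a common pattern: the
original entry point keeps taking the lock, and the body moves into a
variant that expects the caller to hold it already.  The sketch below
shows that shape with a plain integer standing in for the inode lock.
All demo_* names are hypothetical, and -ENODATA is used where the
kernel code returns -ENOATTR.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical inode with a single attribute value. */
struct demo_xattr_inode {
	int	ilock_held;	/* stand-in for the inode lock */
	int	has_attr;
	int	value;
};

/* Caller must already hold the lock; this variant never touches it. */
int
demo_attr_get_ilocked(
	struct demo_xattr_inode	*ip,
	int			*value)
{
	assert(ip->ilock_held);

	if (!ip->has_attr)
		return -ENODATA;
	*value = ip->value;
	return 0;
}

/* Ordinary entry point: take the lock, do the work, drop the lock. */
int
demo_attr_get(
	struct demo_xattr_inode	*ip,
	int			*value)
{
	int			error;

	ip->ilock_held = 1;
	error = demo_attr_get_ilocked(ip, value);
	ip->ilock_held = 0;
	return error;
}
```

A caller that already holds the lock, as the xattr scrubber would,
calls the _ilocked variant directly and the lock state is left alone.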


* [PATCH 14/13] xfs: allow reading of already-locked remote symbolic link
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (12 preceding siblings ...)
  2017-06-02 21:25 ` [PATCH 13/13] xfs: pass along transaction context when reading xattr " Darrick J. Wong
@ 2017-06-02 22:19 ` Darrick J. Wong
  2017-06-08 13:02   ` Brian Foster
  2017-06-26  6:04 ` [PATCH 15/13] xfs: grab dquots without taking the ilock Darrick J. Wong
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-02 22:19 UTC (permalink / raw)
  To: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Expose the readlink variant that doesn't take the inode lock so that
the scrubber can inspect symlink contents.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_symlink.c |    6 +++---
 fs/xfs/xfs_symlink.h |    1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index f2cb45e..49380485 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -43,8 +43,8 @@
 #include "xfs_log.h"
 
 /* ----- Kernel only functions below ----- */
-STATIC int
-xfs_readlink_bmap(
+int
+xfs_readlink_bmap_ilocked(
 	struct xfs_inode	*ip,
 	char			*link)
 {
@@ -153,7 +153,7 @@ xfs_readlink(
 	}
 
 
-	error = xfs_readlink_bmap(ip, link);
+	error = xfs_readlink_bmap_ilocked(ip, link);
 
  out:
 	xfs_iunlock(ip, XFS_ILOCK_SHARED);
diff --git a/fs/xfs/xfs_symlink.h b/fs/xfs/xfs_symlink.h
index e75245d..aeaee89 100644
--- a/fs/xfs/xfs_symlink.h
+++ b/fs/xfs/xfs_symlink.h
@@ -21,6 +21,7 @@
 
 int xfs_symlink(struct xfs_inode *dp, struct xfs_name *link_name,
 		const char *target_path, umode_t mode, struct xfs_inode **ipp);
+int xfs_readlink_bmap_ilocked(struct xfs_inode *ip, char *link);
 int xfs_readlink(struct xfs_inode *ip, char *link);
 int xfs_inactive_symlink(struct xfs_inode *ip);
 


* Re: [PATCH 01/13] xfs: optimize _btree_query_all
  2017-06-02 21:24 ` [PATCH 01/13] xfs: optimize _btree_query_all Darrick J. Wong
@ 2017-06-06 13:32   ` Brian Foster
  2017-06-06 17:43     ` Darrick J. Wong
  2017-06-07  1:18   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-06 13:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:24:06PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Don't bother wandering our way through the leaf nodes when the caller
> issues a query_all; just zoom down the left side of the tree and walk
> rightwards along level zero.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_btree.c |   44 +++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 39 insertions(+), 5 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
> index 3a673ba..07d75bc 100644
> --- a/fs/xfs/libxfs/xfs_btree.c
> +++ b/fs/xfs/libxfs/xfs_btree.c
> @@ -4849,12 +4849,46 @@ xfs_btree_query_all(
>  	xfs_btree_query_range_fn	fn,
>  	void				*priv)
>  {
> -	union xfs_btree_irec		low_rec;
> -	union xfs_btree_irec		high_rec;
> +	union xfs_btree_rec		*recp;
> +	int				stat;
> +	int				error;
> +
> +	/*
> +	 * Find the leftmost record.  The btree cursor must be set
> +	 * to the low record used to generate low_key.
> +	 */
> +	memset(&cur->bc_rec, 0, sizeof(cur->bc_rec));
> +	stat = 0;
> +	error = xfs_btree_lookup(cur, XFS_LOOKUP_LE, &stat);
> +	if (error)
> +		goto out;
> +
> +	/* Nothing?  See if there's anything to the right. */
> +	if (!stat) {
> +		error = xfs_btree_increment(cur, 0, &stat);
> +		if (error)
> +			goto out;
> +	}
>  
> -	memset(&low_rec, 0, sizeof(low_rec));
> -	memset(&high_rec, 0xFF, sizeof(high_rec));
> -	return xfs_btree_query_range(cur, &low_rec, &high_rec, fn, priv);
> +	while (stat) {
> +		/* Find the record. */
> +		error = xfs_btree_get_rec(cur, &recp, &stat);
> +		if (error || !stat)
> +			break;
> +
> +		/* Callback */
> +		error = fn(cur, recp, priv);
> +		if (error < 0 || error == XFS_BTREE_QUERY_RANGE_ABORT)
> +			break;
> +
> +		/* Move on to the next record. */
> +		error = xfs_btree_increment(cur, 0, &stat);
> +		if (error)
> +			break;
> +	}
> +
> +out:
> +	return error;

This all looks quite similar to xfs_btree_simple_query_range(), minus
the associated key checks. I doubt the latter measurably affects the
performance of a btree walk. Could we call that function directly here?

Brian

>  }
>  
>  /*
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

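For readers following the query_all discussion above, the traversal in
the quoted patch (descend the left edge, then walk level zero to the
right) can be modelled with a toy tree whose leaves carry sibling
pointers.  This is a sketch under assumptions: the demo_* types are
hypothetical, records are plain integers, and any nonzero callback
return stops the walk, which is looser than the error/abort test in
the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_QUERY_ABORT	1
#define DEMO_MAXRECS		4

/* Hypothetical btree block; leaves are level 0 and are chained. */
struct demo_block {
	int			level;
	int			numrecs;
	int			recs[DEMO_MAXRECS];	/* leaf records */
	struct demo_block	*ptrs[DEMO_MAXRECS];	/* child pointers */
	struct demo_block	*rightsib;
};

typedef int (*demo_query_fn)(int rec, void *priv);

/* Visit every record in key order without consulting interior keys. */
int
demo_query_all(
	struct demo_block	*root,
	demo_query_fn		fn,
	void			*priv)
{
	struct demo_block	*block = root;
	int			error;
	int			i;

	assert(root != NULL);

	/* Zoom down the left side of the tree. */
	while (block->level > 0) {
		if (block->numrecs == 0)
			return 0;
		block = block->ptrs[0];
	}

	/* Walk rightwards along level zero. */
	for (; block != NULL; block = block->rightsib) {
		for (i = 0; i < block->numrecs; i++) {
			error = fn(block->recs[i], priv);
			if (error)
				return error;
		}
	}
	return 0;
}

/* Example callback: sum the records, aborting on a negative one. */
int
demo_sum_cb(
	int			rec,
	void			*priv)
{
	if (rec < 0)
		return DEMO_QUERY_ABORT;
	*(long *)priv += rec;
	return 0;
}
```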

* Re: [PATCH 03/13] xfs: always compile the btree inorder check functions
  2017-06-02 21:24 ` [PATCH 03/13] xfs: always compile the btree inorder check functions Darrick J. Wong
@ 2017-06-06 13:32   ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-06 13:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:24:18PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The btree record and key inorder check functions will be used by the
> btree scrubber code, so make sure they're always built.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_alloc_btree.c    |    6 ------
>  fs/xfs/libxfs/xfs_bmap_btree.c     |    4 ----
>  fs/xfs/libxfs/xfs_btree.h          |    2 --
>  fs/xfs/libxfs/xfs_ialloc_btree.c   |    6 ------
>  fs/xfs/libxfs/xfs_refcount_btree.c |    4 ----
>  fs/xfs/libxfs/xfs_rmap_btree.c     |    4 ----
>  6 files changed, 26 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
> index 5020cbc..cfde0a0 100644
> --- a/fs/xfs/libxfs/xfs_alloc_btree.c
> +++ b/fs/xfs/libxfs/xfs_alloc_btree.c
> @@ -395,7 +395,6 @@ const struct xfs_buf_ops xfs_allocbt_buf_ops = {
>  };
>  
>  
> -#if defined(DEBUG) || defined(XFS_WARN)
>  STATIC int
>  xfs_bnobt_keys_inorder(
>  	struct xfs_btree_cur	*cur,
> @@ -442,7 +441,6 @@ xfs_cntbt_recs_inorder(
>  		 be32_to_cpu(r1->alloc.ar_startblock) <
>  		 be32_to_cpu(r2->alloc.ar_startblock));
>  }
> -#endif /* DEBUG */
>  
>  static const struct xfs_btree_ops xfs_bnobt_ops = {
>  	.rec_len		= sizeof(xfs_alloc_rec_t),
> @@ -462,10 +460,8 @@ static const struct xfs_btree_ops xfs_bnobt_ops = {
>  	.key_diff		= xfs_bnobt_key_diff,
>  	.buf_ops		= &xfs_allocbt_buf_ops,
>  	.diff_two_keys		= xfs_bnobt_diff_two_keys,
> -#if defined(DEBUG) || defined(XFS_WARN)
>  	.keys_inorder		= xfs_bnobt_keys_inorder,
>  	.recs_inorder		= xfs_bnobt_recs_inorder,
> -#endif
>  };
>  
>  static const struct xfs_btree_ops xfs_cntbt_ops = {
> @@ -486,10 +482,8 @@ static const struct xfs_btree_ops xfs_cntbt_ops = {
>  	.key_diff		= xfs_cntbt_key_diff,
>  	.buf_ops		= &xfs_allocbt_buf_ops,
>  	.diff_two_keys		= xfs_cntbt_diff_two_keys,
> -#if defined(DEBUG) || defined(XFS_WARN)
>  	.keys_inorder		= xfs_cntbt_keys_inorder,
>  	.recs_inorder		= xfs_cntbt_recs_inorder,
> -#endif
>  };
>  
>  /*
> diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
> index 5e2b3dc..e23495e 100644
> --- a/fs/xfs/libxfs/xfs_bmap_btree.c
> +++ b/fs/xfs/libxfs/xfs_bmap_btree.c
> @@ -687,7 +687,6 @@ const struct xfs_buf_ops xfs_bmbt_buf_ops = {
>  };
>  
>  
> -#if defined(DEBUG) || defined(XFS_WARN)
>  STATIC int
>  xfs_bmbt_keys_inorder(
>  	struct xfs_btree_cur	*cur,
> @@ -708,7 +707,6 @@ xfs_bmbt_recs_inorder(
>  		xfs_bmbt_disk_get_blockcount(&r1->bmbt) <=
>  		xfs_bmbt_disk_get_startoff(&r2->bmbt);
>  }
> -#endif	/* DEBUG */
>  
>  static const struct xfs_btree_ops xfs_bmbt_ops = {
>  	.rec_len		= sizeof(xfs_bmbt_rec_t),
> @@ -726,10 +724,8 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
>  	.init_ptr_from_cur	= xfs_bmbt_init_ptr_from_cur,
>  	.key_diff		= xfs_bmbt_key_diff,
>  	.buf_ops		= &xfs_bmbt_buf_ops,
> -#if defined(DEBUG) || defined(XFS_WARN)
>  	.keys_inorder		= xfs_bmbt_keys_inorder,
>  	.recs_inorder		= xfs_bmbt_recs_inorder,
> -#endif
>  };
>  
>  /*
> diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h
> index 0a931f6..177a364 100644
> --- a/fs/xfs/libxfs/xfs_btree.h
> +++ b/fs/xfs/libxfs/xfs_btree.h
> @@ -163,7 +163,6 @@ struct xfs_btree_ops {
>  
>  	const struct xfs_buf_ops	*buf_ops;
>  
> -#if defined(DEBUG) || defined(XFS_WARN)
>  	/* check that k1 is lower than k2 */
>  	int	(*keys_inorder)(struct xfs_btree_cur *cur,
>  				union xfs_btree_key *k1,
> @@ -173,7 +172,6 @@ struct xfs_btree_ops {
>  	int	(*recs_inorder)(struct xfs_btree_cur *cur,
>  				union xfs_btree_rec *r1,
>  				union xfs_btree_rec *r2);
> -#endif
>  };
>  
>  /*
> diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
> index ed52d99..6b1ddeb 100644
> --- a/fs/xfs/libxfs/xfs_ialloc_btree.c
> +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
> @@ -302,7 +302,6 @@ const struct xfs_buf_ops xfs_inobt_buf_ops = {
>  	.verify_write = xfs_inobt_write_verify,
>  };
>  
> -#if defined(DEBUG) || defined(XFS_WARN)
>  STATIC int
>  xfs_inobt_keys_inorder(
>  	struct xfs_btree_cur	*cur,
> @@ -322,7 +321,6 @@ xfs_inobt_recs_inorder(
>  	return be32_to_cpu(r1->inobt.ir_startino) + XFS_INODES_PER_CHUNK <=
>  		be32_to_cpu(r2->inobt.ir_startino);
>  }
> -#endif	/* DEBUG */
>  
>  static const struct xfs_btree_ops xfs_inobt_ops = {
>  	.rec_len		= sizeof(xfs_inobt_rec_t),
> @@ -339,10 +337,8 @@ static const struct xfs_btree_ops xfs_inobt_ops = {
>  	.init_ptr_from_cur	= xfs_inobt_init_ptr_from_cur,
>  	.key_diff		= xfs_inobt_key_diff,
>  	.buf_ops		= &xfs_inobt_buf_ops,
> -#if defined(DEBUG) || defined(XFS_WARN)
>  	.keys_inorder		= xfs_inobt_keys_inorder,
>  	.recs_inorder		= xfs_inobt_recs_inorder,
> -#endif
>  };
>  
>  static const struct xfs_btree_ops xfs_finobt_ops = {
> @@ -360,10 +356,8 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
>  	.init_ptr_from_cur	= xfs_finobt_init_ptr_from_cur,
>  	.key_diff		= xfs_inobt_key_diff,
>  	.buf_ops		= &xfs_inobt_buf_ops,
> -#if defined(DEBUG) || defined(XFS_WARN)
>  	.keys_inorder		= xfs_inobt_keys_inorder,
>  	.recs_inorder		= xfs_inobt_recs_inorder,
> -#endif
>  };
>  
>  /*
> diff --git a/fs/xfs/libxfs/xfs_refcount_btree.c b/fs/xfs/libxfs/xfs_refcount_btree.c
> index 65c222a..3c59dd3 100644
> --- a/fs/xfs/libxfs/xfs_refcount_btree.c
> +++ b/fs/xfs/libxfs/xfs_refcount_btree.c
> @@ -285,7 +285,6 @@ const struct xfs_buf_ops xfs_refcountbt_buf_ops = {
>  	.verify_write		= xfs_refcountbt_write_verify,
>  };
>  
> -#if defined(DEBUG) || defined(XFS_WARN)
>  STATIC int
>  xfs_refcountbt_keys_inorder(
>  	struct xfs_btree_cur	*cur,
> @@ -306,7 +305,6 @@ xfs_refcountbt_recs_inorder(
>  		be32_to_cpu(r1->refc.rc_blockcount) <=
>  		be32_to_cpu(r2->refc.rc_startblock);
>  }
> -#endif
>  
>  static const struct xfs_btree_ops xfs_refcountbt_ops = {
>  	.rec_len		= sizeof(struct xfs_refcount_rec),
> @@ -325,10 +323,8 @@ static const struct xfs_btree_ops xfs_refcountbt_ops = {
>  	.key_diff		= xfs_refcountbt_key_diff,
>  	.buf_ops		= &xfs_refcountbt_buf_ops,
>  	.diff_two_keys		= xfs_refcountbt_diff_two_keys,
> -#if defined(DEBUG) || defined(XFS_WARN)
>  	.keys_inorder		= xfs_refcountbt_keys_inorder,
>  	.recs_inorder		= xfs_refcountbt_recs_inorder,
> -#endif
>  };
>  
>  /*
> diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
> index c5b4a1c8..9d9c919 100644
> --- a/fs/xfs/libxfs/xfs_rmap_btree.c
> +++ b/fs/xfs/libxfs/xfs_rmap_btree.c
> @@ -377,7 +377,6 @@ const struct xfs_buf_ops xfs_rmapbt_buf_ops = {
>  	.verify_write		= xfs_rmapbt_write_verify,
>  };
>  
> -#if defined(DEBUG) || defined(XFS_WARN)
>  STATIC int
>  xfs_rmapbt_keys_inorder(
>  	struct xfs_btree_cur	*cur,
> @@ -437,7 +436,6 @@ xfs_rmapbt_recs_inorder(
>  		return 1;
>  	return 0;
>  }
> -#endif	/* DEBUG */
>  
>  static const struct xfs_btree_ops xfs_rmapbt_ops = {
>  	.rec_len		= sizeof(struct xfs_rmap_rec),
> @@ -456,10 +454,8 @@ static const struct xfs_btree_ops xfs_rmapbt_ops = {
>  	.key_diff		= xfs_rmapbt_key_diff,
>  	.buf_ops		= &xfs_rmapbt_buf_ops,
>  	.diff_two_keys		= xfs_rmapbt_diff_two_keys,
> -#if defined(DEBUG) || defined(XFS_WARN)
>  	.keys_inorder		= xfs_rmapbt_keys_inorder,
>  	.recs_inorder		= xfs_rmapbt_recs_inorder,
> -#endif
>  };
>  
>  /*
> 

* Re: [PATCH 04/13] xfs: export various function for the online scrubber
  2017-06-02 21:24 ` [PATCH 04/13] xfs: export various function for the online scrubber Darrick J. Wong
@ 2017-06-06 13:32   ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-06 13:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:24:24PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Export various internal functions so that the online scrubber can use
> them to check the state of metadata.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_alloc.c     |    2 +-
>  fs/xfs/libxfs/xfs_alloc.h     |    2 ++
>  fs/xfs/libxfs/xfs_btree.c     |   12 ++++++------
>  fs/xfs/libxfs/xfs_btree.h     |   13 +++++++++++++
>  fs/xfs/libxfs/xfs_dir2_leaf.c |    2 +-
>  fs/xfs/libxfs/xfs_dir2_priv.h |    2 ++
>  fs/xfs/libxfs/xfs_inode_buf.c |    2 +-
>  fs/xfs/libxfs/xfs_inode_buf.h |    3 +++
>  fs/xfs/libxfs/xfs_rmap.c      |    3 ++-
>  fs/xfs/libxfs/xfs_rmap.h      |    3 +++
>  fs/xfs/libxfs/xfs_rtbitmap.c  |    2 +-
>  fs/xfs/xfs_itable.c           |    2 +-
>  fs/xfs/xfs_itable.h           |    2 ++
>  fs/xfs/xfs_rtalloc.h          |    3 +++
>  14 files changed, 41 insertions(+), 12 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 7486401..fefa8da 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -606,7 +606,7 @@ const struct xfs_buf_ops xfs_agfl_buf_ops = {
>  /*
>   * Read in the allocation group free block array.
>   */
> -STATIC int				/* error */
> +int					/* error */
>  xfs_alloc_read_agfl(
>  	xfs_mount_t	*mp,		/* mount point structure */
>  	xfs_trans_t	*tp,		/* transaction pointer */
> diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
> index 77d9c27..ef26edc 100644
> --- a/fs/xfs/libxfs/xfs_alloc.h
> +++ b/fs/xfs/libxfs/xfs_alloc.h
> @@ -213,6 +213,8 @@ xfs_alloc_get_rec(
>  
>  int xfs_read_agf(struct xfs_mount *mp, struct xfs_trans *tp,
>  			xfs_agnumber_t agno, int flags, struct xfs_buf **bpp);
> +int xfs_alloc_read_agfl(struct xfs_mount *mp, struct xfs_trans *tp,
> +			xfs_agnumber_t agno, struct xfs_buf **bpp);
>  int xfs_alloc_fix_freelist(struct xfs_alloc_arg *args, int flags);
>  int xfs_free_extent_fix_freelist(struct xfs_trans *tp, xfs_agnumber_t agno,
>  		struct xfs_buf **agbp);
> diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
> index 302dd4c..302dac5 100644
> --- a/fs/xfs/libxfs/xfs_btree.c
> +++ b/fs/xfs/libxfs/xfs_btree.c
> @@ -568,7 +568,7 @@ xfs_btree_ptr_offset(
>  /*
>   * Return a pointer to the n-th record in the btree block.
>   */
> -STATIC union xfs_btree_rec *
> +union xfs_btree_rec *
>  xfs_btree_rec_addr(
>  	struct xfs_btree_cur	*cur,
>  	int			n,
> @@ -581,7 +581,7 @@ xfs_btree_rec_addr(
>  /*
>   * Return a pointer to the n-th key in the btree block.
>   */
> -STATIC union xfs_btree_key *
> +union xfs_btree_key *
>  xfs_btree_key_addr(
>  	struct xfs_btree_cur	*cur,
>  	int			n,
> @@ -594,7 +594,7 @@ xfs_btree_key_addr(
>  /*
>   * Return a pointer to the n-th high key in the btree block.
>   */
> -STATIC union xfs_btree_key *
> +union xfs_btree_key *
>  xfs_btree_high_key_addr(
>  	struct xfs_btree_cur	*cur,
>  	int			n,
> @@ -607,7 +607,7 @@ xfs_btree_high_key_addr(
>  /*
>   * Return a pointer to the n-th block pointer in the btree block.
>   */
> -STATIC union xfs_btree_ptr *
> +union xfs_btree_ptr *
>  xfs_btree_ptr_addr(
>  	struct xfs_btree_cur	*cur,
>  	int			n,
> @@ -641,7 +641,7 @@ xfs_btree_get_iroot(
>   * Retrieve the block pointer from the cursor at the given level.
>   * This may be an inode btree root or from a buffer.
>   */
> -STATIC struct xfs_btree_block *		/* generic btree block pointer */
> +struct xfs_btree_block *		/* generic btree block pointer */
>  xfs_btree_get_block(
>  	struct xfs_btree_cur	*cur,	/* btree cursor */
>  	int			level,	/* level in btree */
> @@ -1756,7 +1756,7 @@ xfs_btree_decrement(
>  	return error;
>  }
>  
> -STATIC int
> +int
>  xfs_btree_lookup_get_block(
>  	struct xfs_btree_cur	*cur,	/* btree cursor */
>  	int			level,	/* level in the btree */
> diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h
> index 177a364..9c95e96 100644
> --- a/fs/xfs/libxfs/xfs_btree.h
> +++ b/fs/xfs/libxfs/xfs_btree.h
> @@ -504,4 +504,17 @@ int xfs_btree_visit_blocks(struct xfs_btree_cur *cur,
>  
>  int xfs_btree_count_blocks(struct xfs_btree_cur *cur, xfs_extlen_t *blocks);
>  
> +union xfs_btree_rec *xfs_btree_rec_addr(struct xfs_btree_cur *cur, int n,
> +		struct xfs_btree_block *block);
> +union xfs_btree_key *xfs_btree_key_addr(struct xfs_btree_cur *cur, int n,
> +		struct xfs_btree_block *block);
> +union xfs_btree_key *xfs_btree_high_key_addr(struct xfs_btree_cur *cur, int n,
> +		struct xfs_btree_block *block);
> +union xfs_btree_ptr *xfs_btree_ptr_addr(struct xfs_btree_cur *cur, int n,
> +		struct xfs_btree_block *block);
> +int xfs_btree_lookup_get_block(struct xfs_btree_cur *cur, int level,
> +		union xfs_btree_ptr *pp, struct xfs_btree_block **blkp);
> +struct xfs_btree_block *xfs_btree_get_block(struct xfs_btree_cur *cur,
> +		int level, struct xfs_buf **bpp);
> +
>  #endif	/* __XFS_BTREE_H__ */
> diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
> index 68bf3e8..7002024 100644
> --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
> +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
> @@ -256,7 +256,7 @@ const struct xfs_buf_ops xfs_dir3_leafn_buf_ops = {
>  	.verify_write = xfs_dir3_leafn_write_verify,
>  };
>  
> -static int
> +int
>  xfs_dir3_leaf_read(
>  	struct xfs_trans	*tp,
>  	struct xfs_inode	*dp,
> diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
> index 011df4d..576f2d2 100644
> --- a/fs/xfs/libxfs/xfs_dir2_priv.h
> +++ b/fs/xfs/libxfs/xfs_dir2_priv.h
> @@ -58,6 +58,8 @@ extern int xfs_dir3_data_init(struct xfs_da_args *args, xfs_dir2_db_t blkno,
>  		struct xfs_buf **bpp);
>  
>  /* xfs_dir2_leaf.c */
> +extern int xfs_dir3_leaf_read(struct xfs_trans *tp, struct xfs_inode *dp,
> +		xfs_dablk_t fbno, xfs_daddr_t mappedbno, struct xfs_buf **bpp);
>  extern int xfs_dir3_leafn_read(struct xfs_trans *tp, struct xfs_inode *dp,
>  		xfs_dablk_t fbno, xfs_daddr_t mappedbno, struct xfs_buf **bpp);
>  extern int xfs_dir2_block_to_leaf(struct xfs_da_args *args,
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index d887af9..0c970cf 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -381,7 +381,7 @@ xfs_log_dinode_to_disk(
>  	}
>  }
>  
> -static bool
> +bool
>  xfs_dinode_verify(
>  	struct xfs_mount	*mp,
>  	xfs_ino_t		ino,
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.h b/fs/xfs/libxfs/xfs_inode_buf.h
> index 0827d7d..a9c97a3 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.h
> +++ b/fs/xfs/libxfs/xfs_inode_buf.h
> @@ -82,4 +82,7 @@ void	xfs_inobp_check(struct xfs_mount *, struct xfs_buf *);
>  #define	xfs_inobp_check(mp, bp)
>  #endif /* DEBUG */
>  
> +bool	xfs_dinode_verify(struct xfs_mount *mp, xfs_ino_t ino,
> +			  struct xfs_dinode *dip);
> +
>  #endif	/* __XFS_INODE_BUF_H__ */
> diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
> index 1bcb41f..eda275b 100644
> --- a/fs/xfs/libxfs/xfs_rmap.c
> +++ b/fs/xfs/libxfs/xfs_rmap.c
> @@ -179,7 +179,8 @@ xfs_rmap_delete(
>  	return error;
>  }
>  
> -static int
> +/* Convert an internal btree record to an rmap record. */
> +int
>  xfs_rmap_btrec_to_irec(
>  	union xfs_btree_rec	*rec,
>  	struct xfs_rmap_irec	*irec)
> diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h
> index 265116d..466ede6 100644
> --- a/fs/xfs/libxfs/xfs_rmap.h
> +++ b/fs/xfs/libxfs/xfs_rmap.h
> @@ -216,5 +216,8 @@ int xfs_rmap_lookup_le_range(struct xfs_btree_cur *cur, xfs_agblock_t bno,
>  		struct xfs_rmap_irec *irec, int	*stat);
>  int xfs_rmap_compare(const struct xfs_rmap_irec *a,
>  		const struct xfs_rmap_irec *b);
> +union xfs_btree_rec;
> +int xfs_rmap_btrec_to_irec(union xfs_btree_rec *rec,
> +		struct xfs_rmap_irec *irec);
>  
>  #endif	/* __XFS_RMAP_H__ */
> diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
> index 26bba7f..5d4e43e 100644
> --- a/fs/xfs/libxfs/xfs_rtbitmap.c
> +++ b/fs/xfs/libxfs/xfs_rtbitmap.c
> @@ -70,7 +70,7 @@ const struct xfs_buf_ops xfs_rtbuf_ops = {
>   * Get a buffer for the bitmap or summary file block specified.
>   * The buffer is returned read and locked.
>   */
> -static int
> +int
>  xfs_rtbuf_get(
>  	xfs_mount_t	*mp,		/* file system mount structure */
>  	xfs_trans_t	*tp,		/* transaction pointer */
> diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> index 26d67ce..c393a2f 100644
> --- a/fs/xfs/xfs_itable.c
> +++ b/fs/xfs/xfs_itable.c
> @@ -31,7 +31,7 @@
>  #include "xfs_trace.h"
>  #include "xfs_icache.h"
>  
> -STATIC int
> +int
>  xfs_internal_inum(
>  	xfs_mount_t	*mp,
>  	xfs_ino_t	ino)
> diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h
> index 6ea8b39..17e86e0 100644
> --- a/fs/xfs/xfs_itable.h
> +++ b/fs/xfs/xfs_itable.h
> @@ -96,4 +96,6 @@ xfs_inumbers(
>  	void			__user *buffer, /* buffer with inode info */
>  	inumbers_fmt_pf		formatter);
>  
> +int xfs_internal_inum(struct xfs_mount *mp, xfs_ino_t ino);
> +
>  #endif	/* __XFS_ITABLE_H__ */
> diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
> index f13133e..79defa7 100644
> --- a/fs/xfs/xfs_rtalloc.h
> +++ b/fs/xfs/xfs_rtalloc.h
> @@ -107,6 +107,8 @@ xfs_growfs_rt(
>  /*
>   * From xfs_rtbitmap.c
>   */
> +int xfs_rtbuf_get(struct xfs_mount *mp, struct xfs_trans *tp,
> +		  xfs_rtblock_t block, int issum, struct xfs_buf **bpp);
>  int xfs_rtcheck_range(struct xfs_mount *mp, struct xfs_trans *tp,
>  		      xfs_rtblock_t start, xfs_extlen_t len, int val,
>  		      xfs_rtblock_t *new, int *stat);
> @@ -143,6 +145,7 @@ int xfs_rtalloc_query_all(struct xfs_trans *tp,
>  # define xfs_growfs_rt(mp,in)                           (ENOSYS)
>  # define xfs_rtalloc_query_range(t,l,h,f,p)             (ENOSYS)
>  # define xfs_rtalloc_query_all(t,f,p)                   (ENOSYS)
> +# define xfs_rtbuf_get(m,t,b,i,p)                       (ENOSYS)
>  static inline int		/* error */
>  xfs_rtmount_init(
>  	xfs_mount_t	*mp)	/* file system mount structure */
> 

* Re: [PATCH 05/13] xfs: plumb in needed functions for range querying of various btrees
  2017-06-02 21:24 ` [PATCH 05/13] xfs: plumb in needed functions for range querying of various btrees Darrick J. Wong
@ 2017-06-06 13:33   ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-06 13:33 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:24:30PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Plumb in the pieces (init_high_key, diff_two_keys) necessary to call
> query_range on the inode space and block mapping btrees and to extract
> raw btree records.  This will eventually be used by the inobt and bmbt
> scrubbers.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>
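
As a rough illustration of what the two new callbacks provide, here is a userspace sketch of a high key and a signed key difference for an inobt-like record, and how a range query could use them to test for overlap. The toy_* names are hypothetical, the keys are plain host-endian integers rather than on-disk big-endian fields, and TOY_INODES_PER_CHUNK is assumed to be 64 to match XFS_INODES_PER_CHUNK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match XFS_INODES_PER_CHUNK. */
#define TOY_INODES_PER_CHUNK	64

struct toy_inobt_rec {
	uint32_t	startino;
};

/* Low key: the first inode covered by the record. */
uint32_t
toy_low_key(
	const struct toy_inobt_rec	*rec)
{
	return rec->startino;
}

/* High key: the last inode covered by the record. */
uint32_t
toy_high_key(
	const struct toy_inobt_rec	*rec)
{
	return rec->startino + TOY_INODES_PER_CHUNK - 1;
}

/* Signed difference k1 - k2, widened so it cannot wrap. */
int64_t
toy_diff_two_keys(
	uint32_t	k1,
	uint32_t	k2)
{
	return (int64_t)k1 - (int64_t)k2;
}

/* A record overlaps [lo, hi] if its high key >= lo and low key <= hi. */
bool
toy_rec_overlaps(
	const struct toy_inobt_rec	*rec,
	uint32_t			lo,
	uint32_t			hi)
{
	return toy_diff_two_keys(toy_high_key(rec), lo) >= 0 &&
	       toy_diff_two_keys(toy_low_key(rec), hi) <= 0;
}
```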

>  fs/xfs/libxfs/xfs_bmap_btree.c   |   22 ++++++++++++++++++++++
>  fs/xfs/libxfs/xfs_ialloc_btree.c |   26 ++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
> index e23495e..85de225 100644
> --- a/fs/xfs/libxfs/xfs_bmap_btree.c
> +++ b/fs/xfs/libxfs/xfs_bmap_btree.c
> @@ -573,6 +573,16 @@ xfs_bmbt_init_key_from_rec(
>  }
>  
>  STATIC void
> +xfs_bmbt_init_high_key_from_rec(
> +	union xfs_btree_key	*key,
> +	union xfs_btree_rec	*rec)
> +{
> +	key->bmbt.br_startoff = cpu_to_be64(
> +			xfs_bmbt_disk_get_startoff(&rec->bmbt) +
> +			xfs_bmbt_disk_get_blockcount(&rec->bmbt) - 1);
> +}
> +
> +STATIC void
>  xfs_bmbt_init_rec_from_cur(
>  	struct xfs_btree_cur	*cur,
>  	union xfs_btree_rec	*rec)
> @@ -597,6 +607,16 @@ xfs_bmbt_key_diff(
>  				      cur->bc_rec.b.br_startoff;
>  }
>  
> +STATIC int64_t
> +xfs_bmbt_diff_two_keys(
> +	struct xfs_btree_cur	*cur,
> +	union xfs_btree_key	*k1,
> +	union xfs_btree_key	*k2)
> +{
> +	return (int64_t)be64_to_cpu(k1->bmbt.br_startoff) -
> +			  be64_to_cpu(k2->bmbt.br_startoff);
> +}
> +
>  static bool
>  xfs_bmbt_verify(
>  	struct xfs_buf		*bp)
> @@ -720,9 +740,11 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
>  	.get_minrecs		= xfs_bmbt_get_minrecs,
>  	.get_dmaxrecs		= xfs_bmbt_get_dmaxrecs,
>  	.init_key_from_rec	= xfs_bmbt_init_key_from_rec,
> +	.init_high_key_from_rec	= xfs_bmbt_init_high_key_from_rec,
>  	.init_rec_from_cur	= xfs_bmbt_init_rec_from_cur,
>  	.init_ptr_from_cur	= xfs_bmbt_init_ptr_from_cur,
>  	.key_diff		= xfs_bmbt_key_diff,
> +	.diff_two_keys		= xfs_bmbt_diff_two_keys,
>  	.buf_ops		= &xfs_bmbt_buf_ops,
>  	.keys_inorder		= xfs_bmbt_keys_inorder,
>  	.recs_inorder		= xfs_bmbt_recs_inorder,
> diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
> index 6b1ddeb..317caba 100644
> --- a/fs/xfs/libxfs/xfs_ialloc_btree.c
> +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
> @@ -175,6 +175,18 @@ xfs_inobt_init_key_from_rec(
>  }
>  
>  STATIC void
> +xfs_inobt_init_high_key_from_rec(
> +	union xfs_btree_key	*key,
> +	union xfs_btree_rec	*rec)
> +{
> +	__u32			x;
> +
> +	x = be32_to_cpu(rec->inobt.ir_startino);
> +	x += XFS_INODES_PER_CHUNK - 1;
> +	key->inobt.ir_startino = cpu_to_be32(x);
> +}
> +
> +STATIC void
>  xfs_inobt_init_rec_from_cur(
>  	struct xfs_btree_cur	*cur,
>  	union xfs_btree_rec	*rec)
> @@ -228,6 +240,16 @@ xfs_inobt_key_diff(
>  			  cur->bc_rec.i.ir_startino;
>  }
>  
> +STATIC int64_t
> +xfs_inobt_diff_two_keys(
> +	struct xfs_btree_cur	*cur,
> +	union xfs_btree_key	*k1,
> +	union xfs_btree_key	*k2)
> +{
> +	return (int64_t)be32_to_cpu(k1->inobt.ir_startino) -
> +			  be32_to_cpu(k2->inobt.ir_startino);
> +}
> +
>  static int
>  xfs_inobt_verify(
>  	struct xfs_buf		*bp)
> @@ -333,10 +355,12 @@ static const struct xfs_btree_ops xfs_inobt_ops = {
>  	.get_minrecs		= xfs_inobt_get_minrecs,
>  	.get_maxrecs		= xfs_inobt_get_maxrecs,
>  	.init_key_from_rec	= xfs_inobt_init_key_from_rec,
> +	.init_high_key_from_rec	= xfs_inobt_init_high_key_from_rec,
>  	.init_rec_from_cur	= xfs_inobt_init_rec_from_cur,
>  	.init_ptr_from_cur	= xfs_inobt_init_ptr_from_cur,
>  	.key_diff		= xfs_inobt_key_diff,
>  	.buf_ops		= &xfs_inobt_buf_ops,
> +	.diff_two_keys		= xfs_inobt_diff_two_keys,
>  	.keys_inorder		= xfs_inobt_keys_inorder,
>  	.recs_inorder		= xfs_inobt_recs_inorder,
>  };
> @@ -352,10 +376,12 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
>  	.get_minrecs		= xfs_inobt_get_minrecs,
>  	.get_maxrecs		= xfs_inobt_get_maxrecs,
>  	.init_key_from_rec	= xfs_inobt_init_key_from_rec,
> +	.init_high_key_from_rec	= xfs_inobt_init_high_key_from_rec,
>  	.init_rec_from_cur	= xfs_inobt_init_rec_from_cur,
>  	.init_ptr_from_cur	= xfs_finobt_init_ptr_from_cur,
>  	.key_diff		= xfs_inobt_key_diff,
>  	.buf_ops		= &xfs_inobt_buf_ops,
> +	.diff_two_keys		= xfs_inobt_diff_two_keys,
>  	.keys_inorder		= xfs_inobt_keys_inorder,
>  	.recs_inorder		= xfs_inobt_recs_inorder,
>  };
> 

* Re: [PATCH 06/13] xfs: export _inobt_btrec_to_irec and _ialloc_cluster_alignment for scrub
  2017-06-02 21:24 ` [PATCH 06/13] xfs: export _inobt_btrec_to_irec and _ialloc_cluster_alignment for scrub Darrick J. Wong
@ 2017-06-06 16:27   ` Brian Foster
  2017-06-06 17:46     ` Darrick J. Wong
  0 siblings, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-06 16:27 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:24:36PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Create a function to extract an in-core inobt record from a generic
> btree_rec union so that scrub will be able to check inobt records
> and check inode block alignment.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_ialloc.c |   43 ++++++++++++++++++++++++++-----------------
>  fs/xfs/libxfs/xfs_ialloc.h |    5 +++++
>  2 files changed, 31 insertions(+), 17 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> index 1e5ed94..33626373 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.c
> +++ b/fs/xfs/libxfs/xfs_ialloc.c
> @@ -46,7 +46,7 @@
>  /*
>   * Allocation group level functions.
>   */
> -static inline int
> +int
>  xfs_ialloc_cluster_alignment(
>  	struct xfs_mount	*mp)
>  {
> @@ -98,24 +98,14 @@ xfs_inobt_update(
>  	return xfs_btree_update(cur, &rec);
>  }
>  
> -/*
> - * Get the data from the pointed-to record.
> - */
> -int					/* error */
> -xfs_inobt_get_rec(
> -	struct xfs_btree_cur	*cur,	/* btree cursor */
> -	xfs_inobt_rec_incore_t	*irec,	/* btree record */
> -	int			*stat)	/* output: success/failure */
> +void
> +xfs_inobt_btrec_to_irec(
> +	struct xfs_mount		*mp,
> +	union xfs_btree_rec		*rec,
> +	struct xfs_inobt_rec_incore	*irec)
>  {
> -	union xfs_btree_rec	*rec;
> -	int			error;
> -
> -	error = xfs_btree_get_rec(cur, &rec, stat);
> -	if (error || *stat == 0)
> -		return error;
> -
>  	irec->ir_startino = be32_to_cpu(rec->inobt.ir_startino);
> -	if (xfs_sb_version_hassparseinodes(&cur->bc_mp->m_sb)) {
> +	if (xfs_sb_version_hassparseinodes(&mp->m_sb)) {
>  		irec->ir_holemask = be16_to_cpu(rec->inobt.ir_u.sp.ir_holemask);
>  		irec->ir_count = rec->inobt.ir_u.sp.ir_count;
>  		irec->ir_freecount = rec->inobt.ir_u.sp.ir_freecount;
> @@ -130,6 +120,25 @@ xfs_inobt_get_rec(
>  				be32_to_cpu(rec->inobt.ir_u.f.ir_freecount);
>  	}
>  	irec->ir_free = be64_to_cpu(rec->inobt.ir_free);
> +}
> +
> +/*
> + * Get the data from the pointed-to record.
> + */
> +int					/* error */
> +xfs_inobt_get_rec(
> +	struct xfs_btree_cur	*cur,	/* btree cursor */
> +	xfs_inobt_rec_incore_t	*irec,	/* btree record */

Might as well kill the typedef usage while we're here. Otherwise looks
good:

Reviewed-by: Brian Foster <bfoster@redhat.com>
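
A small userspace sketch of the idea behind the split, under stated assumptions: the conversion from a raw big-endian record to an in-core record takes only the record bytes and a "sparse inodes" flag, so it can be called on records that did not come from a cursor lookup. The toy_* names, the 16-byte layout, and the non-sparse defaults (holemask 0, count 64) are assumptions for this example and are not copied from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy in-core record, loosely modelled on xfs_inobt_rec_incore. */
struct toy_irec {
	uint32_t	startino;
	uint16_t	holemask;
	uint8_t		count;
	uint8_t		freecount;
	uint64_t	free;
};

/* Portable big-endian decoders (no dependence on <endian.h>). */
uint16_t
toy_get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

uint32_t
toy_get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

uint64_t
toy_get_be64(const uint8_t *p)
{
	return ((uint64_t)toy_get_be32(p) << 32) | toy_get_be32(p + 4);
}

/*
 * Assumed toy layout: bytes 0-3 startino; bytes 4-7 either a 32-bit
 * freecount (non-sparse) or holemask/count/freecount (sparse); bytes
 * 8-15 the free mask.
 */
void
toy_btrec_to_irec(
	bool		sparse,
	const uint8_t	disk[16],
	struct toy_irec	*irec)
{
	irec->startino = toy_get_be32(disk);
	if (sparse) {
		irec->holemask = toy_get_be16(disk + 4);
		irec->count = disk[6];
		irec->freecount = disk[7];
	} else {
		irec->holemask = 0;	/* no holes */
		irec->count = 64;	/* full chunk */
		irec->freecount = (uint8_t)toy_get_be32(disk + 4);
	}
	irec->free = toy_get_be64(disk + 8);
}
```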

> +	int			*stat)	/* output: success/failure */
> +{
> +	union xfs_btree_rec	*rec;
> +	int			error;
> +
> +	error = xfs_btree_get_rec(cur, &rec, stat);
> +	if (error || *stat == 0)
> +		return error;
> +
> +	xfs_inobt_btrec_to_irec(cur->bc_mp, rec, irec);
>  
>  	return 0;
>  }
> diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
> index 0bb8966..b32cfb5 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.h
> +++ b/fs/xfs/libxfs/xfs_ialloc.h
> @@ -168,5 +168,10 @@ int xfs_ialloc_inode_init(struct xfs_mount *mp, struct xfs_trans *tp,
>  int xfs_read_agi(struct xfs_mount *mp, struct xfs_trans *tp,
>  		xfs_agnumber_t agno, struct xfs_buf **bpp);
>  
> +union xfs_btree_rec;
> +void xfs_inobt_btrec_to_irec(struct xfs_mount *mp, union xfs_btree_rec *rec,
> +		struct xfs_inobt_rec_incore *irec);
> +
> +int xfs_ialloc_cluster_alignment(struct xfs_mount *mp);
>  
>  #endif	/* __XFS_IALLOC_H__ */
> 

* Re: [PATCH 07/13] xfs: check if an inode is cached and allocated
  2017-06-02 21:24 ` [PATCH 07/13] xfs: check if an inode is cached and allocated Darrick J. Wong
@ 2017-06-06 16:28   ` Brian Foster
  2017-06-06 18:40     ` Darrick J. Wong
  2017-06-07  1:21   ` [PATCH v2 " Darrick J. Wong
  2017-06-16 17:59   ` [PATCH v3 " Darrick J. Wong
  2 siblings, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-06 16:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:24:43PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Check the inode cache for a particular inode number.  If it's in the
> cache, check that it's not currently being reclaimed.  If it's not being
> reclaimed, return zero if the inode is allocated.  This function will be
> used by various scrubbers to decide if the cache is more up to date
> than the disk in terms of checking if an inode is allocated.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_icache.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_icache.h |    3 ++
>  2 files changed, 86 insertions(+)
> 
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index f61c84f8..d610a7e 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -633,6 +633,89 @@ xfs_iget(
>  }
>  
>  /*
> + * "Is this a cached inode that's also allocated?"
> + *
> + * Look up an inode by number in the given file system.  If the inode is
> + * in cache and isn't in purgatory, return 1 if the inode is allocated
> + * and 0 if it is not.  For all other cases (not in cache, being torn
> + * down, etc.), return a negative error code.
> + *
> + * (The caller has to prevent inode allocation activity.)
> + */

Hmm.. so isn't the data returned here potentially invalid once we drop
the inode reference? In other words, couldn't an inode where we return
inuse == true be reclaimed immediately after? Perhaps I'm just not far
enough along to understand how this is used. If that's the case, a note
about the lifetime/rules of this value might be useful.

FWIW, I'm also kind of wondering if rather than open code the bits of
the inode lookup, we could accomplish the same thing with a new flag to
the existing xfs_iget() lookup mechanism that implements the associated
semantics (i.e., don't read from disk, don't reinit, sort of a read-only
semantic).
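
To make the flag idea concrete, a minimal userspace sketch, with every name hypothetical: a lookup that takes an "in-core only" flag and fails with -ENOENT on a cache miss instead of going to disk, plus an is-allocated helper built on top of it. TOY_IGET_INCORE, the flat array cache, and the disk_reads counter are inventions for illustration and say nothing about how xfs_iget() would actually be changed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: only consult the cache, never read from disk. */
#define TOY_IGET_INCORE	0x1

struct toy_inode {
	uint64_t	ino;
	uint16_t	mode;		/* 0 means not allocated */
	bool		reclaiming;
};

struct toy_cache {
	struct toy_inode	*slots;
	size_t			nr;
	unsigned int		disk_reads;	/* counts simulated reads */
};

int
toy_iget(
	struct toy_cache	*cache,
	uint64_t		ino,
	unsigned int		flags,
	struct toy_inode	**ipp)
{
	size_t			i;

	for (i = 0; i < cache->nr; i++) {
		if (cache->slots[i].ino != ino)
			continue;
		if (cache->slots[i].reclaiming)
			return -EAGAIN;		/* being torn down */
		*ipp = &cache->slots[i];
		return 0;
	}

	/* Cache miss: with the flag set, do not fall back to disk. */
	if (flags & TOY_IGET_INCORE)
		return -ENOENT;

	cache->disk_reads++;	/* stand-in for a disk read; toy disk is empty */
	return -ENOENT;
}

/*
 * The answer is only a snapshot: it stays meaningful only while the
 * caller prevents inode allocation and freeing from running.
 */
int
toy_inode_is_allocated(
	struct toy_cache	*cache,
	uint64_t		ino,
	bool			*inuse)
{
	struct toy_inode	*ip;
	int			error;

	error = toy_iget(cache, ino, TOY_IGET_INCORE, &ip);
	if (error)
		return error;
	*inuse = ip->mode != 0;
	return 0;
}
```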

Brian

> +int
> +xfs_icache_inode_is_allocated(
> +	struct xfs_mount	*mp,
> +	struct xfs_trans	*tp,
> +	xfs_ino_t		ino,
> +	bool			*inuse)
> +{
> +	struct xfs_inode	*ip;
> +	struct xfs_perag	*pag;
> +	xfs_agino_t		agino;
> +	int			ret = 0;
> +
> +	/* reject inode numbers outside existing AGs */
> +	if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
> +		return -EINVAL;
> +
> +	/* get the perag structure and ensure that it's inode capable */
> +	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ino));
> +	agino = XFS_INO_TO_AGINO(mp, ino);
> +
> +	rcu_read_lock();
> +	ip = radix_tree_lookup(&pag->pag_ici_root, agino);
> +	if (!ip) {
> +		ret = -ENOENT;
> +		goto out;
> +	}
> +
> +	/*
> +	 * Is the inode being reused?  Is it new?  Is it being
> +	 * reclaimed?  Is it being torn down?  For any of those cases,
> +	 * fall back.
> +	 */
> +	spin_lock(&ip->i_flags_lock);
> +	if (ip->i_ino != ino ||
> +	    (ip->i_flags & (XFS_INEW | XFS_IRECLAIM | XFS_IRECLAIMABLE))) {
> +		ret = -EAGAIN;
> +		goto out_istate;
> +	}
> +
> +	/*
> +	 * If lookup is racing with unlink, jump out immediately.
> +	 */
> +	if (VFS_I(ip)->i_mode == 0) {
> +		*inuse = false;
> +		ret = 0;
> +		goto out_istate;
> +	}
> +
> +	/* If the VFS inode is being torn down, forget it. */
> +	if (!igrab(VFS_I(ip))) {
> +		ret = -EAGAIN;
> +		goto out_istate;
> +	}
> +
> +	/* We've got a live one. */
> +	spin_unlock(&ip->i_flags_lock);
> +	rcu_read_unlock();
> +	xfs_perag_put(pag);
> +
> +	*inuse = !!(VFS_I(ip)->i_mode);
> +	ret = 0;
> +	IRELE(ip);
> +
> +	return ret;
> +
> +out_istate:
> +	spin_unlock(&ip->i_flags_lock);
> +out:
> +	rcu_read_unlock();
> +	xfs_perag_put(pag);
> +	return ret;
> +}
> +
> +/*
>   * The inode lookup is done in batches to keep the amount of lock traffic and
>   * radix tree lookups to a minimum. The batch size is a trade off between
>   * lookup reduction and stack usage. This is in the reclaim path, so we can't
> diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> index 9183f77..eadf718 100644
> --- a/fs/xfs/xfs_icache.h
> +++ b/fs/xfs/xfs_icache.h
> @@ -126,4 +126,7 @@ xfs_fs_eofblocks_from_user(
>  	return 0;
>  }
>  
> +int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
> +				  xfs_ino_t ino, bool *inuse);
> +
>  #endif
> 

* Re: [PATCH 08/13] xfs: reflink find shared should take a transaction
  2017-06-02 21:24 ` [PATCH 08/13] xfs: reflink find shared should take a transaction Darrick J. Wong
@ 2017-06-06 16:28   ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-06 16:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:24:49PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Adapt _reflink_find_shared to take an optional transaction pointer.  The
> inode scrubber code will need to decide (within transaction context) if
> a file has shared blocks.  To avoid buffer deadlocks, we must pass the
> tp through to this function's utility calls.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_bmap_util.c |    4 ++--
>  fs/xfs/xfs_reflink.c   |   15 ++++++++-------
>  fs/xfs/xfs_reflink.h   |    6 +++---
>  3 files changed, 13 insertions(+), 12 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 308428d..fe83bbc 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -455,8 +455,8 @@ xfs_getbmap_adjust_shared(
>  
>  	agno = XFS_FSB_TO_AGNO(mp, map->br_startblock);
>  	agbno = XFS_FSB_TO_AGBNO(mp, map->br_startblock);
> -	error = xfs_reflink_find_shared(mp, agno, agbno, map->br_blockcount,
> -			&ebno, &elen, true);
> +	error = xfs_reflink_find_shared(mp, NULL, agno, agbno,
> +			map->br_blockcount, &ebno, &elen, true);
>  	if (error)
>  		return error;
>  
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index ffe6fe7..e25c995 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -155,6 +155,7 @@
>  int
>  xfs_reflink_find_shared(
>  	struct xfs_mount	*mp,
> +	struct xfs_trans	*tp,
>  	xfs_agnumber_t		agno,
>  	xfs_agblock_t		agbno,
>  	xfs_extlen_t		aglen,
> @@ -166,18 +167,18 @@ xfs_reflink_find_shared(
>  	struct xfs_btree_cur	*cur;
>  	int			error;
>  
> -	error = xfs_alloc_read_agf(mp, NULL, agno, 0, &agbp);
> +	error = xfs_alloc_read_agf(mp, tp, agno, 0, &agbp);
>  	if (error)
>  		return error;
>  
> -	cur = xfs_refcountbt_init_cursor(mp, NULL, agbp, agno, NULL);
> +	cur = xfs_refcountbt_init_cursor(mp, tp, agbp, agno, NULL);
>  
>  	error = xfs_refcount_find_shared(cur, agbno, aglen, fbno, flen,
>  			find_end_of_shared);
>  
>  	xfs_btree_del_cursor(cur, error ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
>  
> -	xfs_buf_relse(agbp);
> +	xfs_trans_brelse(tp, agbp);
>  	return error;
>  }
>  
> @@ -217,7 +218,7 @@ xfs_reflink_trim_around_shared(
>  	agbno = XFS_FSB_TO_AGBNO(ip->i_mount, irec->br_startblock);
>  	aglen = irec->br_blockcount;
>  
> -	error = xfs_reflink_find_shared(ip->i_mount, agno, agbno,
> +	error = xfs_reflink_find_shared(ip->i_mount, NULL, agno, agbno,
>  			aglen, &fbno, &flen, true);
>  	if (error)
>  		return error;
> @@ -1373,8 +1374,8 @@ xfs_reflink_dirty_extents(
>  			agbno = XFS_FSB_TO_AGBNO(mp, map[1].br_startblock);
>  			aglen = map[1].br_blockcount;
>  
> -			error = xfs_reflink_find_shared(mp, agno, agbno, aglen,
> -					&rbno, &rlen, true);
> +			error = xfs_reflink_find_shared(mp, NULL, agno, agbno,
> +					aglen, &rbno, &rlen, true);
>  			if (error)
>  				goto out;
>  			if (rbno == NULLAGBLOCK)
> @@ -1445,7 +1446,7 @@ xfs_reflink_clear_inode_flag(
>  		agbno = XFS_FSB_TO_AGBNO(mp, map.br_startblock);
>  		aglen = map.br_blockcount;
>  
> -		error = xfs_reflink_find_shared(mp, agno, agbno, aglen,
> +		error = xfs_reflink_find_shared(mp, *tpp, agno, agbno, aglen,
>  				&rbno, &rlen, false);
>  		if (error)
>  			return error;
> diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
> index d29a796..b8cc5c3 100644
> --- a/fs/xfs/xfs_reflink.h
> +++ b/fs/xfs/xfs_reflink.h
> @@ -20,9 +20,9 @@
>  #ifndef __XFS_REFLINK_H
>  #define __XFS_REFLINK_H 1
>  
> -extern int xfs_reflink_find_shared(struct xfs_mount *mp, xfs_agnumber_t agno,
> -		xfs_agblock_t agbno, xfs_extlen_t aglen, xfs_agblock_t *fbno,
> -		xfs_extlen_t *flen, bool find_maximal);
> +extern int xfs_reflink_find_shared(struct xfs_mount *mp, struct xfs_trans *tp,
> +		xfs_agnumber_t agno, xfs_agblock_t agbno, xfs_extlen_t aglen,
> +		xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_maximal);
>  extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip,
>  		struct xfs_bmbt_irec *irec, bool *shared, bool *trimmed);
>  
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 09/13] xfs: separate function to check if reflink flag needed
  2017-06-02 21:24 ` [PATCH 09/13] xfs: separate function to check if reflink flag needed Darrick J. Wong
@ 2017-06-06 16:28   ` Brian Foster
  2017-06-06 18:05     ` Darrick J. Wong
  2017-06-07  1:26   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-06 16:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:24:55PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Separate the "clear reflink flag" function into one function that checks
> if the flag is needed, and a second function that checks and clears the
> flag.  The inode scrub code will want to check the necessity of the flag
> without clearing it.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_reflink.c |   88 ++++++++++++++++++++++++++++++--------------------
>  fs/xfs/xfs_reflink.h |    2 +
>  2 files changed, 54 insertions(+), 36 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index e25c995..133ee02 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1406,57 +1406,73 @@ xfs_reflink_dirty_extents(
>  	return error;
>  }
>  
> -/* Clear the inode reflink flag if there are no shared extents. */
> +/* Does this inode need the reflink flag? */
>  int
> -xfs_reflink_clear_inode_flag(
> -	struct xfs_inode	*ip,
> -	struct xfs_trans	**tpp)
> +xfs_reflink_needs_inode_flag(
> +	struct xfs_trans		*tp,
> +	struct xfs_inode		*ip,
> +	bool				*needs_flag)

This looks Ok to me, but just a nit that the _needs_inode_flag() name
sounds slightly confusing to me because any context around the flag is
now isolated to the caller. Could we call this something more generic
like _has_shared_extents() (and rename needs_flag appropriately as
well)?

Brian

>  {
> -	struct xfs_mount	*mp = ip->i_mount;
> -	xfs_fileoff_t		fbno;
> -	xfs_filblks_t		end;
> -	xfs_agnumber_t		agno;
> -	xfs_agblock_t		agbno;
> -	xfs_extlen_t		aglen;
> -	xfs_agblock_t		rbno;
> -	xfs_extlen_t		rlen;
> -	struct xfs_bmbt_irec	map;
> -	int			nmaps;
> -	int			error = 0;
> -
> -	ASSERT(xfs_is_reflink_inode(ip));
> +	struct xfs_bmbt_irec		got;
> +	struct xfs_mount		*mp = ip->i_mount;
> +	struct xfs_ifork		*ifp;
> +	xfs_agnumber_t			agno;
> +	xfs_agblock_t			agbno;
> +	xfs_extlen_t			aglen;
> +	xfs_agblock_t			rbno;
> +	xfs_extlen_t			rlen;
> +	xfs_extnum_t			idx;
> +	bool				found;
> +	int				error;
>  
> -	fbno = 0;
> -	end = XFS_B_TO_FSB(mp, i_size_read(VFS_I(ip)));
> -	while (end - fbno > 0) {
> -		nmaps = 1;
> -		/*
> -		 * Look for extents in the file.  Skip holes, delalloc, or
> -		 * unwritten extents; they can't be reflinked.
> -		 */
> -		error = xfs_bmapi_read(ip, fbno, end - fbno, &map, &nmaps, 0);
> +	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> +	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> +		error = xfs_iread_extents(tp, ip, XFS_DATA_FORK);
>  		if (error)
>  			return error;
> -		if (nmaps == 0)
> -			break;
> -		if (!xfs_bmap_is_real_extent(&map))
> -			goto next;
> +	}
>  
> -		agno = XFS_FSB_TO_AGNO(mp, map.br_startblock);
> -		agbno = XFS_FSB_TO_AGBNO(mp, map.br_startblock);
> -		aglen = map.br_blockcount;
> +	*needs_flag = false;
> +	found = xfs_iext_lookup_extent(ip, ifp, 0, &idx, &got);
> +	while (found) {
> +		if (isnullstartblock(got.br_startblock) ||
> +		    got.br_state != XFS_EXT_NORM)
> +			goto next;
> +		agno = XFS_FSB_TO_AGNO(mp, got.br_startblock);
> +		agbno = XFS_FSB_TO_AGBNO(mp, got.br_startblock);
> +		aglen = got.br_blockcount;
>  
> -		error = xfs_reflink_find_shared(mp, *tpp, agno, agbno, aglen,
> +		error = xfs_reflink_find_shared(mp, tp, agno, agbno, aglen,
>  				&rbno, &rlen, false);
>  		if (error)
>  			return error;
>  		/* Is there still a shared block here? */
> -		if (rbno != NULLAGBLOCK)
> +		if (rbno != NULLAGBLOCK) {
> +			*needs_flag = true;
>  			return 0;
> +		}
>  next:
> -		fbno = map.br_startoff + map.br_blockcount;
> +		found = xfs_iext_get_extent(ifp, ++idx, &got);
>  	}
>  
> +	return 0;
> +}
> +
> +/* Clear the inode reflink flag if there are no shared extents. */
> +int
> +xfs_reflink_clear_inode_flag(
> +	struct xfs_inode	*ip,
> +	struct xfs_trans	**tpp)
> +{
> +	bool			needs;
> +	int			error = 0;
> +
> +	ASSERT(xfs_is_reflink_inode(ip));
> +
> +	error = xfs_reflink_needs_inode_flag(*tpp, ip, &needs);
> +	if (error || needs)
> +		return error;
> +
>  	/*
>  	 * We didn't find any shared blocks so turn off the reflink flag.
>  	 * First, get rid of any leftover CoW mappings.
> diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
> index b8cc5c3..a26d795 100644
> --- a/fs/xfs/xfs_reflink.h
> +++ b/fs/xfs/xfs_reflink.h
> @@ -47,6 +47,8 @@ extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset,
>  extern int xfs_reflink_recover_cow(struct xfs_mount *mp);
>  extern int xfs_reflink_remap_range(struct file *file_in, loff_t pos_in,
>  		struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe);
> +extern int xfs_reflink_needs_inode_flag(struct xfs_trans *tp,
> +		struct xfs_inode *ip, bool *needs_flag);
>  extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip,
>  		struct xfs_trans **tpp);
>  extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 10/13] xfs: refactor the ifork block counting function
  2017-06-02 21:25 ` [PATCH 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
@ 2017-06-06 16:29   ` Brian Foster
  2017-06-06 18:51     ` Darrick J. Wong
  2017-06-07  1:29   ` [PATCH v2 9.9/13] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior Darrick J. Wong
  2017-06-07  1:29   ` [PATCH v2 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
  2 siblings, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-06 16:29 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:25:01PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Refactor the inode fork block counting function to count extents for us
> at the same time.  This will be used by the bmbt scrubber function.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_bmap_util.c |  105 +++++++++++++++++++++++++++++-------------------
>  fs/xfs/xfs_bmap_util.h |    4 ++
>  2 files changed, 67 insertions(+), 42 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index fe83bbc..fc15305 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
...
> @@ -336,44 +339,61 @@ xfs_bmap_count_tree(
>  /*
>   * Count fsblocks of the given fork.
>   */
> -static int					/* error */
> +int
>  xfs_bmap_count_blocks(
> -	xfs_trans_t		*tp,		/* transaction pointer */
> -	xfs_inode_t		*ip,		/* incore inode */
> -	int			whichfork,	/* data or attr fork */
> -	int			*count)		/* out: count of blocks */
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*ip,
> +	int			whichfork,
> +	unsigned int		*nextents,
> +	unsigned long long	*count)
>  {
>  	struct xfs_btree_block	*block;	/* current btree block */
>  	xfs_fsblock_t		bno;	/* block # of "block" */
> -	xfs_ifork_t		*ifp;	/* fork structure */
> +	struct xfs_ifork	*ifp;	/* fork structure */
>  	int			level;	/* btree level, for checking */
> -	xfs_mount_t		*mp;	/* file system mount structure */
> +	struct xfs_mount	*mp;	/* file system mount structure */
>  	__be64			*pp;	/* pointer to block address */
> +	int			error;
>  
>  	bno = NULLFSBLOCK;
>  	mp = ip->i_mount;
> +	*nextents = 0;

I think we should be consistent between how we initialize nextents and
count, whether we initialize both here or expect the caller to do it.
That aside, the rest looks good to me:

Reviewed-by: Brian Foster <bfoster@redhat.com>
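
To illustrate the consistency point, here is a rough userspace sketch (not
the patch itself) in which the callee zeroes both output parameters up
front, so callers never have to pre-initialize either one.  The names
toy_count_blocks and extent_lens are made up for illustration; the real
function walks the incore extent list or the bmbt.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of a block counter with two output parameters.  Both
 * outputs are initialized in the same place, by the callee.
 */
int
toy_count_blocks(
	const unsigned int	*extent_lens,	/* hypothetical extent lengths */
	size_t			n,
	unsigned int		*nextents,
	unsigned long long	*count)
{
	size_t			i;

	assert(nextents != NULL && count != NULL);

	/* Callee owns both outputs: zero them together, up front. */
	*nextents = 0;
	*count = 0;

	/* Models the "no fork" early return; outputs are already valid. */
	if (!extent_lens)
		return 0;

	for (i = 0; i < n; i++) {
		(*nextents)++;
		*count += extent_lens[i];
	}
	return 0;
}
```

With this shape a caller can pass uninitialized locals and still get
well-defined values on every return path, including the early one.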

>  	ifp = XFS_IFORK_PTR(ip, whichfork);
> -	if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) {
> -		xfs_bmap_count_leaves(ifp, 0, xfs_iext_count(ifp), count);
> +	if (!ifp)
>  		return 0;
> -	}
>  
> -	/*
> -	 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
> -	 */
> -	block = ifp->if_broot;
> -	level = be16_to_cpu(block->bb_level);
> -	ASSERT(level > 0);
> -	pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
> -	bno = be64_to_cpu(*pp);
> -	ASSERT(bno != NULLFSBLOCK);
> -	ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
> -	ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
> -
> -	if (unlikely(xfs_bmap_count_tree(mp, tp, ifp, bno, level, count) < 0)) {
> -		XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)", XFS_ERRLEVEL_LOW,
> -				 mp);
> -		return -EFSCORRUPTED;
> +	switch (XFS_IFORK_FORMAT(ip, whichfork)) {
> +	case XFS_DINODE_FMT_EXTENTS:
> +		*nextents = xfs_iext_count(ifp);
> +		xfs_bmap_count_leaves(ifp, 0, (*nextents), count);
> +		return 0;
> +	case XFS_DINODE_FMT_BTREE:
> +		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> +			error = xfs_iread_extents(tp, ip, whichfork);
> +			if (error)
> +				return error;
> +		}
> +
> +		/*
> +		 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
> +		 */
> +		block = ifp->if_broot;
> +		level = be16_to_cpu(block->bb_level);
> +		ASSERT(level > 0);
> +		pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
> +		bno = be64_to_cpu(*pp);
> +		ASSERT(bno != NULLFSBLOCK);
> +		ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
> +		ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
> +
> +		error = xfs_bmap_count_tree(mp, tp, ifp, bno, level,
> +				nextents, count);
> +		if (error) {
> +			XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)",
> +					XFS_ERRLEVEL_LOW, mp);
> +			return -EFSCORRUPTED;
> +		}
> +		return 0;
>  	}
>  
>  	return 0;
> @@ -1789,8 +1809,9 @@ xfs_swap_extent_forks(
>  	int			*target_log_flags)
>  {
>  	struct xfs_ifork	tempifp, *ifp, *tifp;
> -	int			aforkblks = 0;
> -	int			taforkblks = 0;
> +	unsigned long long	aforkblks = 0;
> +	unsigned long long	taforkblks = 0;
> +	unsigned int		junk;
>  	xfs_extnum_t		nextents;
>  	uint64_t		tmp;
>  	int			error;
> @@ -1800,14 +1821,14 @@ xfs_swap_extent_forks(
>  	 */
>  	if ( ((XFS_IFORK_Q(ip) != 0) && (ip->i_d.di_anextents > 0)) &&
>  	     (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
> -		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK,
> +		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK, &junk,
>  				&aforkblks);
>  		if (error)
>  			return error;
>  	}
>  	if ( ((XFS_IFORK_Q(tip) != 0) && (tip->i_d.di_anextents > 0)) &&
>  	     (tip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
> -		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK,
> +		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK, &junk,
>  				&taforkblks);
>  		if (error)
>  			return error;
> diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
> index 135d826..993973c 100644
> --- a/fs/xfs/xfs_bmap_util.h
> +++ b/fs/xfs/xfs_bmap_util.h
> @@ -70,4 +70,8 @@ int	xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
>  
>  xfs_daddr_t xfs_fsb_to_db(struct xfs_inode *ip, xfs_fsblock_t fsb);
>  
> +int xfs_bmap_count_blocks(struct xfs_trans *tp, struct xfs_inode *ip,
> +			  int whichfork, unsigned int *nextents,
> +			  unsigned long long *count);
> +
>  #endif	/* __XFS_BMAP_UTIL_H__ */
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/13] xfs: optimize _btree_query_all
  2017-06-06 13:32   ` Brian Foster
@ 2017-06-06 17:43     ` Darrick J. Wong
  0 siblings, 0 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-06 17:43 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 09:32:42AM -0400, Brian Foster wrote:
> On Fri, Jun 02, 2017 at 02:24:06PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Don't bother wandering our way through the leaf nodes when the caller
> > issues a query_all; just zoom down the left side of the tree and walk
> > rightwards along level zero.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_btree.c |   44 +++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 39 insertions(+), 5 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
> > index 3a673ba..07d75bc 100644
> > --- a/fs/xfs/libxfs/xfs_btree.c
> > +++ b/fs/xfs/libxfs/xfs_btree.c
> > @@ -4849,12 +4849,46 @@ xfs_btree_query_all(
> >  	xfs_btree_query_range_fn	fn,
> >  	void				*priv)
> >  {
> > -	union xfs_btree_irec		low_rec;
> > -	union xfs_btree_irec		high_rec;
> > +	union xfs_btree_rec		*recp;
> > +	int				stat;
> > +	int				error;
> > +
> > +	/*
> > +	 * Find the leftmost record.  The btree cursor must be set
> > +	 * to the low record used to generate low_key.
> > +	 */
> > +	memset(&cur->bc_rec, 0, sizeof(cur->bc_rec));
> > +	stat = 0;
> > +	error = xfs_btree_lookup(cur, XFS_LOOKUP_LE, &stat);
> > +	if (error)
> > +		goto out;
> > +
> > +	/* Nothing?  See if there's anything to the right. */
> > +	if (!stat) {
> > +		error = xfs_btree_increment(cur, 0, &stat);
> > +		if (error)
> > +			goto out;
> > +	}
> >  
> > -	memset(&low_rec, 0, sizeof(low_rec));
> > -	memset(&high_rec, 0xFF, sizeof(high_rec));
> > -	return xfs_btree_query_range(cur, &low_rec, &high_rec, fn, priv);
> > +	while (stat) {
> > +		/* Find the record. */
> > +		error = xfs_btree_get_rec(cur, &recp, &stat);
> > +		if (error || !stat)
> > +			break;
> > +
> > +		/* Callback */
> > +		error = fn(cur, recp, priv);
> > +		if (error < 0 || error == XFS_BTREE_QUERY_RANGE_ABORT)
> > +			break;
> > +
> > +		/* Move on to the next record. */
> > +		error = xfs_btree_increment(cur, 0, &stat);
> > +		if (error)
> > +			break;
> > +	}
> > +
> > +out:
> > +	return error;
> 
> This all looks quite similar to xfs_btree_simple_query_range(), minus
> the associated key checks. I doubt the latter measurably affects the
> performance of a btree walk. Could we call that function directly here?

Yes, we could.  Will fix and resend.
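
For anyone following along, the walk described in the commit message
(zoom down the left side of the tree, then go rightwards along level zero)
can be modelled in userspace roughly as below.  This is only a sketch under
simplifying assumptions: struct toy_node, toy_query_all and toy_sum_rec are
invented names, records are plain ints, and any nonzero callback return
stops the walk, whereas the real code uses a btree cursor and treats
XFS_BTREE_QUERY_RANGE_ABORT specially.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RECS	4

/* Toy btree block: leaves hold records, interior nodes point downwards. */
struct toy_node {
	int		level;			/* 0 == leaf */
	int		nrecs;			/* records in a leaf */
	int		recs[TOY_MAX_RECS];
	struct toy_node	*first_child;		/* leftmost child, level > 0 */
	struct toy_node	*right;			/* right sibling, same level */
};

typedef int (*toy_query_fn)(int rec, void *priv);

/* Visit every record: descend the left edge, then walk level zero. */
int
toy_query_all(
	struct toy_node	*root,
	toy_query_fn	fn,
	void		*priv)
{
	struct toy_node	*node = root;
	int		i;
	int		error;

	assert(fn != NULL);

	/* Zoom down the left side of the tree. */
	while (node && node->level > 0)
		node = node->first_child;

	/* Walk rightwards along the leaves, calling fn on each record. */
	for (; node; node = node->right) {
		for (i = 0; i < node->nrecs; i++) {
			error = fn(node->recs[i], priv);
			if (error)
				return error;
		}
	}
	return 0;
}

/* Example callback: accumulate all records into a long. */
int
toy_sum_rec(
	int	rec,
	void	*priv)
{
	*(long *)priv += rec;
	return 0;
}
```

No key comparisons happen during the walk, which is the part the patch was
trying to skip; whether that is measurable is exactly Brian's question.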

--D

> 
> Brian
> 
> >  }
> >  
> >  /*
> > 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 06/13] xfs: export _inobt_btrec_to_irec and _ialloc_cluster_alignment for scrub
  2017-06-06 16:27   ` Brian Foster
@ 2017-06-06 17:46     ` Darrick J. Wong
  0 siblings, 0 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-06 17:46 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 12:27:41PM -0400, Brian Foster wrote:
> On Fri, Jun 02, 2017 at 02:24:36PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Create a function to extract an in-core inobt record from a generic
> > btree_rec union so that scrub will be able to check inobt records
> > and check inode block alignment.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_ialloc.c |   43 ++++++++++++++++++++++++++-----------------
> >  fs/xfs/libxfs/xfs_ialloc.h |    5 +++++
> >  2 files changed, 31 insertions(+), 17 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> > index 1e5ed94..33626373 100644
> > --- a/fs/xfs/libxfs/xfs_ialloc.c
> > +++ b/fs/xfs/libxfs/xfs_ialloc.c
> > @@ -46,7 +46,7 @@
> >  /*
> >   * Allocation group level functions.
> >   */
> > -static inline int
> > +int
> >  xfs_ialloc_cluster_alignment(
> >  	struct xfs_mount	*mp)
> >  {
> > @@ -98,24 +98,14 @@ xfs_inobt_update(
> >  	return xfs_btree_update(cur, &rec);
> >  }
> >  
> > -/*
> > - * Get the data from the pointed-to record.
> > - */
> > -int					/* error */
> > -xfs_inobt_get_rec(
> > -	struct xfs_btree_cur	*cur,	/* btree cursor */
> > -	xfs_inobt_rec_incore_t	*irec,	/* btree record */
> > -	int			*stat)	/* output: success/failure */
> > +void
> > +xfs_inobt_btrec_to_irec(
> > +	struct xfs_mount		*mp,
> > +	union xfs_btree_rec		*rec,
> > +	struct xfs_inobt_rec_incore	*irec)
> >  {
> > -	union xfs_btree_rec	*rec;
> > -	int			error;
> > -
> > -	error = xfs_btree_get_rec(cur, &rec, stat);
> > -	if (error || *stat == 0)
> > -		return error;
> > -
> >  	irec->ir_startino = be32_to_cpu(rec->inobt.ir_startino);
> > -	if (xfs_sb_version_hassparseinodes(&cur->bc_mp->m_sb)) {
> > +	if (xfs_sb_version_hassparseinodes(&mp->m_sb)) {
> >  		irec->ir_holemask = be16_to_cpu(rec->inobt.ir_u.sp.ir_holemask);
> >  		irec->ir_count = rec->inobt.ir_u.sp.ir_count;
> >  		irec->ir_freecount = rec->inobt.ir_u.sp.ir_freecount;
> > @@ -130,6 +120,25 @@ xfs_inobt_get_rec(
> >  				be32_to_cpu(rec->inobt.ir_u.f.ir_freecount);
> >  	}
> >  	irec->ir_free = be64_to_cpu(rec->inobt.ir_free);
> > +}
> > +
> > +/*
> > + * Get the data from the pointed-to record.
> > + */
> > +int					/* error */
> > +xfs_inobt_get_rec(
> > +	struct xfs_btree_cur	*cur,	/* btree cursor */
> > +	xfs_inobt_rec_incore_t	*irec,	/* btree record */
> 
> Might as well kill the typedef usage while we're here. Otherwise looks
> good:

Ok, will fix.

--D
> 
> Reviewed-by: Brian Foster <bfoster@redhat.com>
> 
> > +	int			*stat)	/* output: success/failure */
> > +{
> > +	union xfs_btree_rec	*rec;
> > +	int			error;
> > +
> > +	error = xfs_btree_get_rec(cur, &rec, stat);
> > +	if (error || *stat == 0)
> > +		return error;
> > +
> > +	xfs_inobt_btrec_to_irec(cur->bc_mp, rec, irec);
> >  
> >  	return 0;
> >  }
> > diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
> > index 0bb8966..b32cfb5 100644
> > --- a/fs/xfs/libxfs/xfs_ialloc.h
> > +++ b/fs/xfs/libxfs/xfs_ialloc.h
> > @@ -168,5 +168,10 @@ int xfs_ialloc_inode_init(struct xfs_mount *mp, struct xfs_trans *tp,
> >  int xfs_read_agi(struct xfs_mount *mp, struct xfs_trans *tp,
> >  		xfs_agnumber_t agno, struct xfs_buf **bpp);
> >  
> > +union xfs_btree_rec;
> > +void xfs_inobt_btrec_to_irec(struct xfs_mount *mp, union xfs_btree_rec *rec,
> > +		struct xfs_inobt_rec_incore *irec);
> > +
> > +int xfs_ialloc_cluster_alignment(struct xfs_mount *mp);
> >  
> >  #endif	/* __XFS_IALLOC_H__ */
> > 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 09/13] xfs: separate function to check if reflink flag needed
  2017-06-06 16:28   ` Brian Foster
@ 2017-06-06 18:05     ` Darrick J. Wong
  0 siblings, 0 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-06 18:05 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 12:28:33PM -0400, Brian Foster wrote:
> On Fri, Jun 02, 2017 at 02:24:55PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Separate the "clear reflink flag" function into one function that checks
> > if the flag is needed, and a second function that checks and clears the
> > flag.  The inode scrub code will want to check the necessity of the flag
> > without clearing it.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_reflink.c |   88 ++++++++++++++++++++++++++++++--------------------
> >  fs/xfs/xfs_reflink.h |    2 +
> >  2 files changed, 54 insertions(+), 36 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> > index e25c995..133ee02 100644
> > --- a/fs/xfs/xfs_reflink.c
> > +++ b/fs/xfs/xfs_reflink.c
> > @@ -1406,57 +1406,73 @@ xfs_reflink_dirty_extents(
> >  	return error;
> >  }
> >  
> > -/* Clear the inode reflink flag if there are no shared extents. */
> > +/* Does this inode need the reflink flag? */
> >  int
> > -xfs_reflink_clear_inode_flag(
> > -	struct xfs_inode	*ip,
> > -	struct xfs_trans	**tpp)
> > +xfs_reflink_needs_inode_flag(
> > +	struct xfs_trans		*tp,
> > +	struct xfs_inode		*ip,
> > +	bool				*needs_flag)
> 
> This looks Ok to me, but just a nit that the _needs_inode_flag() name
> sounds slightly confusing to me because any context around the flag is
> now isolated to the caller. Could we call this something more generic
> like _has_shared_extents() (and rename needs_flag appropriately as
> well)?

Yes, that's a much better name. :)

int xfs_reflink_inode_has_shared_extents(..., bool *has_shared);
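
As a rough userspace illustration of the split (a sketch, not the kernel
code), the generic predicate knows nothing about the flag, and only the
caller decides what to do with the answer.  All toy_* names and the
struct layouts below are invented for this example; in the kernel the
"shared" answer comes from the refcount btree lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_extent {
	bool	real;		/* false for holes/delalloc/unwritten */
	bool	shared;		/* what a refcount lookup would report */
};

struct toy_inode {
	bool			reflink_flag;
	size_t			nextents;
	const struct toy_extent	*extents;
};

/* Generic check: does any real extent of this inode share blocks? */
int
toy_inode_has_shared_extents(
	const struct toy_inode	*ip,
	bool			*has_shared)
{
	size_t			i;

	assert(ip != NULL && has_shared != NULL);

	*has_shared = false;
	for (i = 0; i < ip->nextents; i++) {
		/* Extents that aren't real can't be reflinked; skip them. */
		if (!ip->extents[i].real)
			continue;
		if (ip->extents[i].shared) {
			*has_shared = true;
			return 0;
		}
	}
	return 0;
}

/* Flag policy lives in the caller: clear only if nothing is shared. */
int
toy_clear_reflink_flag(
	struct toy_inode	*ip)
{
	bool			has_shared;
	int			error;

	error = toy_inode_has_shared_extents(ip, &has_shared);
	if (error || has_shared)
		return error;

	ip->reflink_flag = false;
	return 0;
}
```

A scrubber-like caller can use the first function alone and leave the
flag untouched, which is the reason for separating the two.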

--D

> 
> Brian
> 
> >  {
> > -	struct xfs_mount	*mp = ip->i_mount;
> > -	xfs_fileoff_t		fbno;
> > -	xfs_filblks_t		end;
> > -	xfs_agnumber_t		agno;
> > -	xfs_agblock_t		agbno;
> > -	xfs_extlen_t		aglen;
> > -	xfs_agblock_t		rbno;
> > -	xfs_extlen_t		rlen;
> > -	struct xfs_bmbt_irec	map;
> > -	int			nmaps;
> > -	int			error = 0;
> > -
> > -	ASSERT(xfs_is_reflink_inode(ip));
> > +	struct xfs_bmbt_irec		got;
> > +	struct xfs_mount		*mp = ip->i_mount;
> > +	struct xfs_ifork		*ifp;
> > +	xfs_agnumber_t			agno;
> > +	xfs_agblock_t			agbno;
> > +	xfs_extlen_t			aglen;
> > +	xfs_agblock_t			rbno;
> > +	xfs_extlen_t			rlen;
> > +	xfs_extnum_t			idx;
> > +	bool				found;
> > +	int				error;
> >  
> > -	fbno = 0;
> > -	end = XFS_B_TO_FSB(mp, i_size_read(VFS_I(ip)));
> > -	while (end - fbno > 0) {
> > -		nmaps = 1;
> > -		/*
> > -		 * Look for extents in the file.  Skip holes, delalloc, or
> > -		 * unwritten extents; they can't be reflinked.
> > -		 */
> > -		error = xfs_bmapi_read(ip, fbno, end - fbno, &map, &nmaps, 0);
> > +	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> > +	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> > +		error = xfs_iread_extents(tp, ip, XFS_DATA_FORK);
> >  		if (error)
> >  			return error;
> > -		if (nmaps == 0)
> > -			break;
> > -		if (!xfs_bmap_is_real_extent(&map))
> > -			goto next;
> > +	}
> >  
> > -		agno = XFS_FSB_TO_AGNO(mp, map.br_startblock);
> > -		agbno = XFS_FSB_TO_AGBNO(mp, map.br_startblock);
> > -		aglen = map.br_blockcount;
> > +	*needs_flag = false;
> > +	found = xfs_iext_lookup_extent(ip, ifp, 0, &idx, &got);
> > +	while (found) {
> > +		if (isnullstartblock(got.br_startblock) ||
> > +		    got.br_state != XFS_EXT_NORM)
> > +			goto next;
> > +		agno = XFS_FSB_TO_AGNO(mp, got.br_startblock);
> > +		agbno = XFS_FSB_TO_AGBNO(mp, got.br_startblock);
> > +		aglen = got.br_blockcount;
> >  
> > -		error = xfs_reflink_find_shared(mp, *tpp, agno, agbno, aglen,
> > +		error = xfs_reflink_find_shared(mp, tp, agno, agbno, aglen,
> >  				&rbno, &rlen, false);
> >  		if (error)
> >  			return error;
> >  		/* Is there still a shared block here? */
> > -		if (rbno != NULLAGBLOCK)
> > +		if (rbno != NULLAGBLOCK) {
> > +			*needs_flag = true;
> >  			return 0;
> > +		}
> >  next:
> > -		fbno = map.br_startoff + map.br_blockcount;
> > +		found = xfs_iext_get_extent(ifp, ++idx, &got);
> >  	}
> >  
> > +	return 0;
> > +}
> > +
> > +/* Clear the inode reflink flag if there are no shared extents. */
> > +int
> > +xfs_reflink_clear_inode_flag(
> > +	struct xfs_inode	*ip,
> > +	struct xfs_trans	**tpp)
> > +{
> > +	bool			needs;
> > +	int			error = 0;
> > +
> > +	ASSERT(xfs_is_reflink_inode(ip));
> > +
> > +	error = xfs_reflink_needs_inode_flag(*tpp, ip, &needs);
> > +	if (error || needs)
> > +		return error;
> > +
> >  	/*
> >  	 * We didn't find any shared blocks so turn off the reflink flag.
> >  	 * First, get rid of any leftover CoW mappings.
> > diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
> > index b8cc5c3..a26d795 100644
> > --- a/fs/xfs/xfs_reflink.h
> > +++ b/fs/xfs/xfs_reflink.h
> > @@ -47,6 +47,8 @@ extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset,
> >  extern int xfs_reflink_recover_cow(struct xfs_mount *mp);
> >  extern int xfs_reflink_remap_range(struct file *file_in, loff_t pos_in,
> >  		struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe);
> > +extern int xfs_reflink_needs_inode_flag(struct xfs_trans *tp,
> > +		struct xfs_inode *ip, bool *needs_flag);
> >  extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip,
> >  		struct xfs_trans **tpp);
> >  extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,
> > 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 07/13] xfs: check if an inode is cached and allocated
  2017-06-06 16:28   ` Brian Foster
@ 2017-06-06 18:40     ` Darrick J. Wong
  2017-06-07 14:22       ` Brian Foster
  0 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-06 18:40 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 12:28:13PM -0400, Brian Foster wrote:
> On Fri, Jun 02, 2017 at 02:24:43PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Check the inode cache for a particular inode number.  If it's in the
> > cache, check that it's not currently being reclaimed.  If it's not being
> > reclaimed, return zero if the inode is allocated.  This function will be
> > used by various scrubbers to decide if the cache is more up to date
> > than the disk in terms of checking if an inode is allocated.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_icache.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  fs/xfs/xfs_icache.h |    3 ++
> >  2 files changed, 86 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index f61c84f8..d610a7e 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -633,6 +633,89 @@ xfs_iget(
> >  }
> >  
> >  /*
> > + * "Is this a cached inode that's also allocated?"
> > + *
> > + * Look up an inode by number in the given file system.  If the inode is
> > + * in cache and isn't in purgatory, return 1 if the inode is allocated
> > + * and 0 if it is not.  For all other cases (not in cache, being torn
> > + * down, etc.), return a negative error code.
> > + *
> > + * (The caller has to prevent inode allocation activity.)
> > + */
> 
> Hmm.. so isn't the data returned here potentially invalid once we drop
> the inode reference? In other words, couldn't an inode where we return
> inuse == true be reclaimed immediately after? Perhaps I'm just not far
> enough along to understand how this is used. If that's the case, a note
> about the lifetime/rules of this value might be useful.

The comment could state more explicitly what we're assuming the caller
has done to prevent inode allocation or freeing activity.  The scrubber
that calls this function will have locked the AGI buffer for this AG so
that it can compare the inobt ir_free bits against di_mode to make sure
that there aren't any discrepancies.  Even if the inode is immediately
reclaimed/deleted after we release the inode, the corresponding inobt
update will block on the AGI until the scrubber finishes, so from the
scrubber's point of view things are still consistent.  If the scrubber
finds the inode in some intermediate state of being created or torn
down, it doesn't bother checking the free mask on the assumption that
the thread modifying the inode will ensure the consistency or shut down.

tldr: We assume the caller has the AGI locked so that inodes stay stable
wrt allocation or freeing, or only end up in an intermediate state;
we also assume the caller can handle inodes in an intermediate state.
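
To make the caller's side of that contract concrete, here is a minimal
userspace sketch of how a scrubber might act on each outcome.  The enum,
the helper name, and the "fall back to the on-disk inode" reaction to
-ENOENT are assumptions for illustration only; none of them are part of
the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical verdicts a scrubber could reach for one inobt free bit. */
enum demo_verdict {
	DEMO_OK,		/* cache and inobt agree */
	DEMO_MISMATCH,		/* cache and inobt disagree */
	DEMO_SKIP,		/* inode in an intermediate state */
	DEMO_CHECK_DISK,	/* not cached (assumed fallback) */
	DEMO_ERROR,		/* any other failure */
};

/*
 * 'ret' and 'inuse' model what xfs_icache_inode_is_allocated() hands back.
 * 'inobt_free' models the ir_free bit that the caller read while holding
 * the AGI buffer lock.
 */
static enum demo_verdict
demo_check_inode_bit(int ret, bool inuse, bool inobt_free)
{
	assert(ret <= 0);	/* kernel convention: 0 or a negative errno */

	switch (ret) {
	case 0:
		/* An allocated inode must not be marked free, and vice versa. */
		return (inuse == !inobt_free) ? DEMO_OK : DEMO_MISMATCH;
	case -EAGAIN:
		/* Being created or torn down; the other thread owns it. */
		return DEMO_SKIP;
	case -ENOENT:
		/* Not in the cache, so the cache has nothing to say. */
		return DEMO_CHECK_DISK;
	default:
		return DEMO_ERROR;
	}
}
```

For instance, demo_check_inode_bit(0, true, true) reports a mismatch,
because the cache says the inode is allocated while the inobt says it is
free.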

> FWIW, I'm also kind of wondering if rather than open code the bits of
> the inode lookup, we could accomplish the same thing with a new flag to
> the existing xfs_iget() lookup mechanism that implements the associated
> semantics (i.e., don't read from disk, don't reinit, sort of a read-only
> semantic).

Originally it was just an iget flag, but the flag ended up special
casing a lot of the existing iget functionality.  Basically, we need to
disable the xfs_iget_cache_miss call; avoid the out_error_or_again case;
do our i_mode testing, release the inode, and jump out of the function
prior to the bit that can call xfs_setup_existing_inode; and change the
lock_flags assert to require lock_flags == 0 when we're just checking.

All that turned xfs_iget into such a muddy mess that I decided it was
cleaner to separate this specialized case into its own function and hope
that we're not really going to modify _iget a whole lot.

Anyway, thank you for reviewing!

--D

> 
> Brian
> 
> > +int
> > +xfs_icache_inode_is_allocated(
> > +	struct xfs_mount	*mp,
> > +	struct xfs_trans	*tp,
> > +	xfs_ino_t		ino,
> > +	bool			*inuse)
> > +{
> > +	struct xfs_inode	*ip;
> > +	struct xfs_perag	*pag;
> > +	xfs_agino_t		agino;
> > +	int			ret = 0;
> > +
> > +	/* reject inode numbers outside existing AGs */
> > +	if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
> > +		return -EINVAL;
> > +
> > +	/* get the perag structure and ensure that it's inode capable */
> > +	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ino));
> > +	agino = XFS_INO_TO_AGINO(mp, ino);
> > +
> > +	rcu_read_lock();
> > +	ip = radix_tree_lookup(&pag->pag_ici_root, agino);
> > +	if (!ip) {
> > +		ret = -ENOENT;
> > +		goto out;
> > +	}
> > +
> > +	/*
> > +	 * Is the inode being reused?  Is it new?  Is it being
> > +	 * reclaimed?  Is it being torn down?  For any of those cases,
> > +	 * fall back.
> > +	 */
> > +	spin_lock(&ip->i_flags_lock);
> > +	if (ip->i_ino != ino ||
> > +	    (ip->i_flags & (XFS_INEW | XFS_IRECLAIM | XFS_IRECLAIMABLE))) {
> > +		ret = -EAGAIN;
> > +		goto out_istate;
> > +	}
> > +
> > +	/*
> > +	 * If lookup is racing with unlink, jump out immediately.
> > +	 */
> > +	if (VFS_I(ip)->i_mode == 0) {
> > +		*inuse = false;
> > +		ret = 0;
> > +		goto out_istate;
> > +	}
> > +
> > +	/* If the VFS inode is being torn down, forget it. */
> > +	if (!igrab(VFS_I(ip))) {
> > +		ret = -EAGAIN;
> > +		goto out_istate;
> > +	}
> > +
> > +	/* We've got a live one. */
> > +	spin_unlock(&ip->i_flags_lock);
> > +	rcu_read_unlock();
> > +	xfs_perag_put(pag);
> > +
> > +	*inuse = !!(VFS_I(ip)->i_mode);
> > +	ret = 0;
> > +	IRELE(ip);
> > +
> > +	return ret;
> > +
> > +out_istate:
> > +	spin_unlock(&ip->i_flags_lock);
> > +out:
> > +	rcu_read_unlock();
> > +	xfs_perag_put(pag);
> > +	return ret;
> > +}
> > +
> > +/*
> >   * The inode lookup is done in batches to keep the amount of lock traffic and
> >   * radix tree lookups to a minimum. The batch size is a trade off between
> >   * lookup reduction and stack usage. This is in the reclaim path, so we can't
> > diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> > index 9183f77..eadf718 100644
> > --- a/fs/xfs/xfs_icache.h
> > +++ b/fs/xfs/xfs_icache.h
> > @@ -126,4 +126,7 @@ xfs_fs_eofblocks_from_user(
> >  	return 0;
> >  }
> >  
> > +int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
> > +				  xfs_ino_t ino, bool *inuse);
> > +
> >  #endif
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH 10/13] xfs: refactor the ifork block counting function
  2017-06-06 16:29   ` Brian Foster
@ 2017-06-06 18:51     ` Darrick J. Wong
  2017-06-06 20:35       ` Darrick J. Wong
  0 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-06 18:51 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 12:29:21PM -0400, Brian Foster wrote:
> On Fri, Jun 02, 2017 at 02:25:01PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Refactor the inode fork block counting function to count extents for us
> > at the same time.  This will be used by the bmbt scrubber function.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_bmap_util.c |  105 +++++++++++++++++++++++++++++-------------------
> >  fs/xfs/xfs_bmap_util.h |    4 ++
> >  2 files changed, 67 insertions(+), 42 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > index fe83bbc..fc15305 100644
> > --- a/fs/xfs/xfs_bmap_util.c
> > +++ b/fs/xfs/xfs_bmap_util.c
> ...
> > @@ -336,44 +339,61 @@ xfs_bmap_count_tree(
> >  /*
> >   * Count fsblocks of the given fork.
> >   */
> > -static int					/* error */
> > +int
> >  xfs_bmap_count_blocks(
> > -	xfs_trans_t		*tp,		/* transaction pointer */
> > -	xfs_inode_t		*ip,		/* incore inode */
> > -	int			whichfork,	/* data or attr fork */
> > -	int			*count)		/* out: count of blocks */
> > +	struct xfs_trans	*tp,
> > +	struct xfs_inode	*ip,
> > +	int			whichfork,
> > +	unsigned int		*nextents,
> > +	unsigned long long	*count)
> >  {
> >  	struct xfs_btree_block	*block;	/* current btree block */
> >  	xfs_fsblock_t		bno;	/* block # of "block" */
> > -	xfs_ifork_t		*ifp;	/* fork structure */
> > +	struct xfs_ifork	*ifp;	/* fork structure */
> >  	int			level;	/* btree level, for checking */
> > -	xfs_mount_t		*mp;	/* file system mount structure */
> > +	struct xfs_mount	*mp;	/* file system mount structure */
> >  	__be64			*pp;	/* pointer to block address */
> > +	int			error;
> >  
> >  	bno = NULLFSBLOCK;
> >  	mp = ip->i_mount;
> > +	*nextents = 0;
> 
> I think we should be consistent between how we initialize nextents and
> count, whether we initialize both here or expect the caller to do it.

I later take advantage of the "doesn't initialize count" behavior for
inode extent and i_nblocks checking, but yes, this needs to be
consistent.  AFAICT the other XFS functions either return errors or
initialize the parameters, so this function should do so too.  I'll
update the scrub code to reflect this.
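
The "either return an error or initialize the parameters" convention can
be sketched in plain C as follows.  This is a userspace toy under stated
assumptions: struct demo_extent and demo_count_blocks() are hypothetical
names, not the real xfs_bmap_count_blocks().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical extent record: a run of 'len' blocks. */
struct demo_extent {
	unsigned long long	len;
};

/*
 * Count extents and blocks.  Both out parameters are zeroed on entry, so
 * the caller never has to pre-initialize one but not the other.
 */
static int
demo_count_blocks(
	const struct demo_extent	*exts,
	unsigned int			nr,
	unsigned int			*nextents,
	unsigned long long		*count)
{
	unsigned int			i;

	assert(nextents != NULL && count != NULL);
	*nextents = 0;
	*count = 0;

	/* A non-empty list with no backing array is a caller bug. */
	if (!exts && nr)
		return -EINVAL;

	for (i = 0; i < nr; i++) {
		(*nextents)++;
		*count += exts[i].len;
	}
	return 0;
}
```

A caller that wants a running total across forks would then add the
returned count to its own accumulator instead of relying on the callee
leaving *count untouched.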

--D

> That aside, the rest looks good to me:
> 
> Reviewed-by: Brian Foster <bfoster@redhat.com>
> 
> >  	ifp = XFS_IFORK_PTR(ip, whichfork);
> > -	if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) {
> > -		xfs_bmap_count_leaves(ifp, 0, xfs_iext_count(ifp), count);
> > +	if (!ifp)
> >  		return 0;
> > -	}
> >  
> > -	/*
> > -	 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
> > -	 */
> > -	block = ifp->if_broot;
> > -	level = be16_to_cpu(block->bb_level);
> > -	ASSERT(level > 0);
> > -	pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
> > -	bno = be64_to_cpu(*pp);
> > -	ASSERT(bno != NULLFSBLOCK);
> > -	ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
> > -	ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
> > -
> > -	if (unlikely(xfs_bmap_count_tree(mp, tp, ifp, bno, level, count) < 0)) {
> > -		XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)", XFS_ERRLEVEL_LOW,
> > -				 mp);
> > -		return -EFSCORRUPTED;
> > +	switch (XFS_IFORK_FORMAT(ip, whichfork)) {
> > +	case XFS_DINODE_FMT_EXTENTS:
> > +		*nextents = xfs_iext_count(ifp);
> > +		xfs_bmap_count_leaves(ifp, 0, (*nextents), count);
> > +		return 0;
> > +	case XFS_DINODE_FMT_BTREE:
> > +		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> > +			error = xfs_iread_extents(tp, ip, whichfork);
> > +			if (error)
> > +				return error;
> > +		}
> > +
> > +		/*
> > +		 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
> > +		 */
> > +		block = ifp->if_broot;
> > +		level = be16_to_cpu(block->bb_level);
> > +		ASSERT(level > 0);
> > +		pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
> > +		bno = be64_to_cpu(*pp);
> > +		ASSERT(bno != NULLFSBLOCK);
> > +		ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
> > +		ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
> > +
> > +		error = xfs_bmap_count_tree(mp, tp, ifp, bno, level,
> > +				nextents, count);
> > +		if (error) {
> > +			XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)",
> > +					XFS_ERRLEVEL_LOW, mp);
> > +			return -EFSCORRUPTED;
> > +		}
> > +		return 0;
> >  	}
> >  
> >  	return 0;
> > @@ -1789,8 +1809,9 @@ xfs_swap_extent_forks(
> >  	int			*target_log_flags)
> >  {
> >  	struct xfs_ifork	tempifp, *ifp, *tifp;
> > -	int			aforkblks = 0;
> > -	int			taforkblks = 0;
> > +	unsigned long long	aforkblks = 0;
> > +	unsigned long long	taforkblks = 0;
> > +	unsigned int		junk;
> >  	xfs_extnum_t		nextents;
> >  	uint64_t		tmp;
> >  	int			error;
> > @@ -1800,14 +1821,14 @@ xfs_swap_extent_forks(
> >  	 */
> >  	if ( ((XFS_IFORK_Q(ip) != 0) && (ip->i_d.di_anextents > 0)) &&
> >  	     (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
> > -		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK,
> > +		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK, &junk,
> >  				&aforkblks);
> >  		if (error)
> >  			return error;
> >  	}
> >  	if ( ((XFS_IFORK_Q(tip) != 0) && (tip->i_d.di_anextents > 0)) &&
> >  	     (tip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
> > -		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK,
> > +		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK, &junk,
> >  				&taforkblks);
> >  		if (error)
> >  			return error;
> > diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
> > index 135d826..993973c 100644
> > --- a/fs/xfs/xfs_bmap_util.h
> > +++ b/fs/xfs/xfs_bmap_util.h
> > @@ -70,4 +70,8 @@ int	xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
> >  
> >  xfs_daddr_t xfs_fsb_to_db(struct xfs_inode *ip, xfs_fsblock_t fsb);
> >  
> > +int xfs_bmap_count_blocks(struct xfs_trans *tp, struct xfs_inode *ip,
> > +			  int whichfork, unsigned int *nextents,
> > +			  unsigned long long *count);
> > +
> >  #endif	/* __XFS_BMAP_UTIL_H__ */
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH 10/13] xfs: refactor the ifork block counting function
  2017-06-06 18:51     ` Darrick J. Wong
@ 2017-06-06 20:35       ` Darrick J. Wong
  0 siblings, 0 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-06 20:35 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 11:51:12AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 06, 2017 at 12:29:21PM -0400, Brian Foster wrote:
> > On Fri, Jun 02, 2017 at 02:25:01PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Refactor the inode fork block counting function to count extents for us
> > > at the same time.  This will be used by the bmbt scrubber function.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/xfs_bmap_util.c |  105 +++++++++++++++++++++++++++++-------------------
> > >  fs/xfs/xfs_bmap_util.h |    4 ++
> > >  2 files changed, 67 insertions(+), 42 deletions(-)
> > > 
> > > 
> > > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > > index fe83bbc..fc15305 100644
> > > --- a/fs/xfs/xfs_bmap_util.c
> > > +++ b/fs/xfs/xfs_bmap_util.c
> > ...
> > > @@ -336,44 +339,61 @@ xfs_bmap_count_tree(
> > >  /*
> > >   * Count fsblocks of the given fork.
> > >   */
> > > -static int					/* error */
> > > +int
> > >  xfs_bmap_count_blocks(
> > > -	xfs_trans_t		*tp,		/* transaction pointer */
> > > -	xfs_inode_t		*ip,		/* incore inode */
> > > -	int			whichfork,	/* data or attr fork */
> > > -	int			*count)		/* out: count of blocks */
> > > +	struct xfs_trans	*tp,
> > > +	struct xfs_inode	*ip,
> > > +	int			whichfork,
> > > +	unsigned int		*nextents,
> > > +	unsigned long long	*count)
> > >  {
> > >  	struct xfs_btree_block	*block;	/* current btree block */
> > >  	xfs_fsblock_t		bno;	/* block # of "block" */
> > > -	xfs_ifork_t		*ifp;	/* fork structure */
> > > +	struct xfs_ifork	*ifp;	/* fork structure */
> > >  	int			level;	/* btree level, for checking */
> > > -	xfs_mount_t		*mp;	/* file system mount structure */
> > > +	struct xfs_mount	*mp;	/* file system mount structure */
> > >  	__be64			*pp;	/* pointer to block address */
> > > +	int			error;
> > >  
> > >  	bno = NULLFSBLOCK;
> > >  	mp = ip->i_mount;
> > > +	*nextents = 0;
> > 
> > I think we should be consistent between how we initialize nextents and
> > count, whether we initialize both here or expect the caller to do it.
> 
> I later take advantage of the "doesn't initialize count" behavior for
> inode extent and i_nblocks checking, but yes, this needs to be
> consistent.  AFAICT the other XFS functions either return errors or
> initialize the parameters, so this function should do so too.  I'll
> update the scrub code to reflect this.

I found a discrepancy, too -- xfs_bmap_count_leaves counts all the
blocks in the in-core inode fork, including delalloc reservations.
However, xfs_bmap_count_tree iterates the on-disk bmap btree, which
means that it does /not/ count delalloc reservations.  For the only
caller so far (xfs_swap_extents) this isn't a problem because we've
flushed the dirty data and locked the inode so there aren't any da
reservations.  However, for scrub we don't necessarily flush the page
cache, so the da reservations sometimes get counted and sometimes don't,
which causes a cross-referencing error if scrub happens to hit a btree
format file that has been written to.

So.... one more patch to fix that. :(
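
A toy model of the discrepancy, assuming a hypothetical in-core record
with an explicit delalloc flag (the real code tests the start block with
isnullstartblock() instead):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-core extent; 'delalloc' stands in for isnullstartblock(). */
struct demo_irec {
	unsigned long long	blockcount;
	bool			delalloc;
};

/*
 * Sum the block counts of an in-core extent list.  With skip_delalloc
 * false this behaves like the old leaf counter (reservations included);
 * with it true the result matches what a walk of the on-disk btree would
 * see, since delalloc reservations have no on-disk record yet.
 */
static unsigned long long
demo_count_leaves(const struct demo_irec *recs, size_t nr, bool skip_delalloc)
{
	unsigned long long	count = 0;
	size_t			i;

	assert(recs != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (skip_delalloc && recs[i].delalloc)
			continue;
		count += recs[i].blockcount;
	}
	return count;
}
```

With extents of 10, 4 (delalloc), and 6 blocks, the two modes return 20
and 16 respectively, which is exactly the kind of mismatch that trips the
cross-referencing check.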

--D

* [PATCH v2 01/13] xfs: optimize _btree_query_all
  2017-06-02 21:24 ` [PATCH 01/13] xfs: optimize _btree_query_all Darrick J. Wong
  2017-06-06 13:32   ` Brian Foster
@ 2017-06-07  1:18   ` Darrick J. Wong
  2017-06-07 14:22     ` Brian Foster
  1 sibling, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-07  1:18 UTC (permalink / raw)
  To: linux-xfs; +Cc: Brian Foster

Don't bother wandering our way through the leaf nodes when the caller
issues a query_all; just zoom down the left side of the tree and walk
rightwards along level zero.  In other words, use the simple query
range implementation.
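
As a rough illustration of that traversal (not the real
xfs_btree_simple_query_range; struct demo_block, the callback type, and
the sibling pointer layout are all assumptions made for this sketch):

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAXRECS	4

/* Hypothetical btree block; level 0 is a leaf that holds records. */
struct demo_block {
	int			level;
	int			nrecs;
	int			recs[DEMO_MAXRECS];	/* leaf records */
	struct demo_block	*child[DEMO_MAXRECS];	/* node pointers */
	struct demo_block	*right;			/* right sibling */
};

typedef int (*demo_query_fn)(int rec, void *priv);

/* Example callback: accumulate every record into a long. */
static int
demo_sum_fn(int rec, void *priv)
{
	*(long *)priv += rec;
	return 0;
}

static int
demo_query_all(struct demo_block *root, demo_query_fn fn, void *priv)
{
	struct demo_block	*blk = root;
	int			i;
	int			error;

	assert(fn != NULL);

	/* Zoom down the left edge of the tree until we hit level zero. */
	while (blk && blk->level > 0)
		blk = blk->child[0];

	/* Walk rightwards along the leaves, visiting every record. */
	for (; blk; blk = blk->right) {
		for (i = 0; i < blk->nrecs; i++) {
			error = fn(blk->recs[i], priv);
			if (error)
				return error;
		}
	}
	return 0;
}
```

No key comparisons are needed on the way down, which is the point of
skipping the general range query when the caller wants everything.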

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_btree.c |   12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 3a673ba..d505179 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -4849,12 +4849,14 @@ xfs_btree_query_all(
 	xfs_btree_query_range_fn	fn,
 	void				*priv)
 {
-	union xfs_btree_irec		low_rec;
-	union xfs_btree_irec		high_rec;
+	union xfs_btree_key		low_key;
+	union xfs_btree_key		high_key;
+
+	memset(&cur->bc_rec, 0, sizeof(cur->bc_rec));
+	memset(&low_key, 0, sizeof(low_key));
+	memset(&high_key, 0xFF, sizeof(high_key));
 
-	memset(&low_rec, 0, sizeof(low_rec));
-	memset(&high_rec, 0xFF, sizeof(high_rec));
-	return xfs_btree_query_range(cur, &low_rec, &high_rec, fn, priv);
+	return xfs_btree_simple_query_range(cur, &low_key, &high_key, fn, priv);
 }
 
 /*

* [PATCH v2 07/13] xfs: check if an inode is cached and allocated
  2017-06-02 21:24 ` [PATCH 07/13] xfs: check if an inode is cached and allocated Darrick J. Wong
  2017-06-06 16:28   ` Brian Foster
@ 2017-06-07  1:21   ` Darrick J. Wong
  2017-06-16 17:59   ` [PATCH v3 " Darrick J. Wong
  2 siblings, 0 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-07  1:21 UTC (permalink / raw)
  To: linux-xfs; +Cc: Brian Foster

Check the inode cache for a particular inode number.  If it's in the
cache, check that it's not currently being reclaimed.  If it's not being
reclaimed, return zero and report through *inuse whether the inode is
allocated.  This function will be used by various scrubbers to decide if
the cache is more up to date than the disk in terms of checking if an
inode is allocated.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_icache.c |   92 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_icache.h |    3 ++
 2 files changed, 95 insertions(+)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index f61c84f8..cd8e228 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -633,6 +633,98 @@ xfs_iget(
 }
 
 /*
+ * "Is this a cached inode that's also allocated?"
+ *
+ * Look up an inode by number in the given file system.  If the inode is
+ * in cache and isn't in purgatory, return 0 and set *inuse to whether the
+ * inode is allocated.  For all other cases (not in cache, being torn
+ * down, etc.), return a negative error code.
+ *
+ * The caller has to prevent inode allocation and freeing activity,
+ * presumably by locking the AGI buffer.  This is to ensure that an
+ * inode cannot transition from allocated to freed until the caller is
+ * ready to allow that.  If the inode is in an intermediate state (new,
+ * reclaimable, or being reclaimed), -EAGAIN will be returned; if the
+ * inode is not in the cache, -ENOENT will be returned.  The caller must
+ * deal with these scenarios appropriately.
+ *
+ * This is a specialized use case for the online scrubber; if you're
+ * reading this, you probably want xfs_iget.
+ */
+int
+xfs_icache_inode_is_allocated(
+	struct xfs_mount	*mp,
+	struct xfs_trans	*tp,
+	xfs_ino_t		ino,
+	bool			*inuse)
+{
+	struct xfs_inode	*ip;
+	struct xfs_perag	*pag;
+	xfs_agino_t		agino;
+	int			ret = 0;
+
+	/* reject inode numbers outside existing AGs */
+	if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
+		return -EINVAL;
+
+	/* get the perag structure and ensure that it's inode capable */
+	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ino));
+	agino = XFS_INO_TO_AGINO(mp, ino);
+
+	rcu_read_lock();
+	ip = radix_tree_lookup(&pag->pag_ici_root, agino);
+	if (!ip) {
+		ret = -ENOENT;
+		goto out;
+	}
+
+	/*
+	 * Is the inode being reused?  Is it new?  Is it being
+	 * reclaimed?  Is it being torn down?  For any of those cases,
+	 * fall back.
+	 */
+	spin_lock(&ip->i_flags_lock);
+	if (ip->i_ino != ino ||
+	    (ip->i_flags & (XFS_INEW | XFS_IRECLAIM | XFS_IRECLAIMABLE))) {
+		ret = -EAGAIN;
+		goto out_istate;
+	}
+
+	/*
+	 * If lookup is racing with unlink, jump out immediately.
+	 */
+	if (VFS_I(ip)->i_mode == 0) {
+		*inuse = false;
+		ret = 0;
+		goto out_istate;
+	}
+
+	/* If the VFS inode is being torn down, forget it. */
+	if (!igrab(VFS_I(ip))) {
+		ret = -EAGAIN;
+		goto out_istate;
+	}
+
+	/* We've got a live one. */
+	spin_unlock(&ip->i_flags_lock);
+	rcu_read_unlock();
+	xfs_perag_put(pag);
+
+	*inuse = !!(VFS_I(ip)->i_mode);
+	ret = 0;
+	IRELE(ip);
+
+	return ret;
+
+out_istate:
+	spin_unlock(&ip->i_flags_lock);
+out:
+	rcu_read_unlock();
+	xfs_perag_put(pag);
+	return ret;
+}
+
+/*
  * The inode lookup is done in batches to keep the amount of lock traffic and
  * radix tree lookups to a minimum. The batch size is a trade off between
  * lookup reduction and stack usage. This is in the reclaim path, so we can't
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 9183f77..eadf718 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -126,4 +126,7 @@ xfs_fs_eofblocks_from_user(
 	return 0;
 }
 
+int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
+				  xfs_ino_t ino, bool *inuse);
+
 #endif

* [PATCH v2 09/13] xfs: separate function to check if reflink flag needed
  2017-06-02 21:24 ` [PATCH 09/13] xfs: separate function to check if reflink flag needed Darrick J. Wong
  2017-06-06 16:28   ` Brian Foster
@ 2017-06-07  1:26   ` Darrick J. Wong
  2017-06-07 14:22     ` Brian Foster
  1 sibling, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-07  1:26 UTC (permalink / raw)
  To: linux-xfs; +Cc: Brian Foster

Separate the "clear reflink flag" function into one function that checks
if the flag is needed, and a second function that checks and clears the
flag.  The inode scrub code will want to check the necessity of the flag
without clearing it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_reflink.c |   88 ++++++++++++++++++++++++++++++--------------------
 fs/xfs/xfs_reflink.h |    2 +
 2 files changed, 54 insertions(+), 36 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index e25c995..ab2270a 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1406,57 +1406,73 @@ xfs_reflink_dirty_extents(
 	return error;
 }
 
-/* Clear the inode reflink flag if there are no shared extents. */
+/* Does this inode need the reflink flag? */
 int
-xfs_reflink_clear_inode_flag(
-	struct xfs_inode	*ip,
-	struct xfs_trans	**tpp)
+xfs_reflink_inode_has_shared_extents(
+	struct xfs_trans		*tp,
+	struct xfs_inode		*ip,
+	bool				*has_shared)
 {
-	struct xfs_mount	*mp = ip->i_mount;
-	xfs_fileoff_t		fbno;
-	xfs_filblks_t		end;
-	xfs_agnumber_t		agno;
-	xfs_agblock_t		agbno;
-	xfs_extlen_t		aglen;
-	xfs_agblock_t		rbno;
-	xfs_extlen_t		rlen;
-	struct xfs_bmbt_irec	map;
-	int			nmaps;
-	int			error = 0;
-
-	ASSERT(xfs_is_reflink_inode(ip));
+	struct xfs_bmbt_irec		got;
+	struct xfs_mount		*mp = ip->i_mount;
+	struct xfs_ifork		*ifp;
+	xfs_agnumber_t			agno;
+	xfs_agblock_t			agbno;
+	xfs_extlen_t			aglen;
+	xfs_agblock_t			rbno;
+	xfs_extlen_t			rlen;
+	xfs_extnum_t			idx;
+	bool				found;
+	int				error;
 
-	fbno = 0;
-	end = XFS_B_TO_FSB(mp, i_size_read(VFS_I(ip)));
-	while (end - fbno > 0) {
-		nmaps = 1;
-		/*
-		 * Look for extents in the file.  Skip holes, delalloc, or
-		 * unwritten extents; they can't be reflinked.
-		 */
-		error = xfs_bmapi_read(ip, fbno, end - fbno, &map, &nmaps, 0);
+	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+		error = xfs_iread_extents(tp, ip, XFS_DATA_FORK);
 		if (error)
 			return error;
-		if (nmaps == 0)
-			break;
-		if (!xfs_bmap_is_real_extent(&map))
-			goto next;
+	}
 
-		agno = XFS_FSB_TO_AGNO(mp, map.br_startblock);
-		agbno = XFS_FSB_TO_AGBNO(mp, map.br_startblock);
-		aglen = map.br_blockcount;
+	*has_shared = false;
+	found = xfs_iext_lookup_extent(ip, ifp, 0, &idx, &got);
+	while (found) {
+		if (isnullstartblock(got.br_startblock) ||
+		    got.br_state != XFS_EXT_NORM)
+			goto next;
+		agno = XFS_FSB_TO_AGNO(mp, got.br_startblock);
+		agbno = XFS_FSB_TO_AGBNO(mp, got.br_startblock);
+		aglen = got.br_blockcount;
 
-		error = xfs_reflink_find_shared(mp, *tpp, agno, agbno, aglen,
+		error = xfs_reflink_find_shared(mp, tp, agno, agbno, aglen,
 				&rbno, &rlen, false);
 		if (error)
 			return error;
 		/* Is there still a shared block here? */
-		if (rbno != NULLAGBLOCK)
+		if (rbno != NULLAGBLOCK) {
+			*has_shared = true;
 			return 0;
+		}
 next:
-		fbno = map.br_startoff + map.br_blockcount;
+		found = xfs_iext_get_extent(ifp, ++idx, &got);
 	}
 
+	return 0;
+}
+
+/* Clear the inode reflink flag if there are no shared extents. */
+int
+xfs_reflink_clear_inode_flag(
+	struct xfs_inode	*ip,
+	struct xfs_trans	**tpp)
+{
+	bool			needs_flag;
+	int			error = 0;
+
+	ASSERT(xfs_is_reflink_inode(ip));
+
+	error = xfs_reflink_inode_has_shared_extents(*tpp, ip, &needs_flag);
+	if (error || needs_flag)
+		return error;
+
 	/*
 	 * We didn't find any shared blocks so turn off the reflink flag.
 	 * First, get rid of any leftover CoW mappings.
diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
index b8cc5c3..701487b 100644
--- a/fs/xfs/xfs_reflink.h
+++ b/fs/xfs/xfs_reflink.h
@@ -47,6 +47,8 @@ extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset,
 extern int xfs_reflink_recover_cow(struct xfs_mount *mp);
 extern int xfs_reflink_remap_range(struct file *file_in, loff_t pos_in,
 		struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe);
+extern int xfs_reflink_inode_has_shared_extents(struct xfs_trans *tp,
+		struct xfs_inode *ip, bool *has_shared);
 extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip,
 		struct xfs_trans **tpp);
 extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,

* [PATCH v2 9.9/13] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior
  2017-06-02 21:25 ` [PATCH 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
  2017-06-06 16:29   ` Brian Foster
@ 2017-06-07  1:29   ` Darrick J. Wong
  2017-06-07 15:11     ` Brian Foster
  2017-06-07  1:29   ` [PATCH v2 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
  2 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-07  1:29 UTC (permalink / raw)
  To: linux-xfs; +Cc: Brian Foster

There is an inconsistency in the way that _bmap_count_blocks deals with
delalloc reservations -- if the specified fork is in extents format,
*count is set to the total number of blocks referenced by the in-core
fork, including delalloc extents.  However, if the fork is in btree
format, *count is set to the number of blocks referenced by the on-disk
fork, which does /not/ include delalloc extents.

For the lone existing caller of _bmap_count_blocks this hasn't been an
issue because the function is only used to count xattr fork blocks
(where there aren't any delalloc reservations).  However, when scrub
comes along it will use this same function to check di_nblocks against
both on-disk extent maps, so we need this behavior to be consistent.

Therefore, fix _bmap_count_leaves not to include delalloc extents and
remove unnecessary parameters.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_util.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index fe83bbc..a34c3ce 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -223,16 +223,17 @@ xfs_bmap_eof(
  */
 STATIC void
 xfs_bmap_count_leaves(
-	xfs_ifork_t		*ifp,
-	xfs_extnum_t		idx,
-	int			numrecs,
+	struct xfs_ifork	*ifp,
 	int			*count)
 {
-	int		b;
+	xfs_extnum_t		i;
+	xfs_extnum_t		nr_exts = xfs_iext_count(ifp);
 
-	for (b = 0; b < numrecs; b++) {
-		xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, idx + b);
-		*count += xfs_bmbt_get_blockcount(frp);
+	for (i = 0; i < nr_exts; i++) {
+		xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, i);
+		if (!isnullstartblock(xfs_bmbt_get_startblock(frp))) {
+			*count += xfs_bmbt_get_blockcount(frp);
+		}
 	}
 }
 
@@ -354,7 +355,7 @@ xfs_bmap_count_blocks(
 	mp = ip->i_mount;
 	ifp = XFS_IFORK_PTR(ip, whichfork);
 	if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) {
-		xfs_bmap_count_leaves(ifp, 0, xfs_iext_count(ifp), count);
+		xfs_bmap_count_leaves(ifp, count);
 		return 0;
 	}
 

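The delalloc-skipping behavior described in the commit message can be
illustrated outside the kernel.  What follows is a minimal sketch, not
the XFS code: struct toy_extent, TOY_DELALLOC and
toy_count_real_blocks() are hypothetical names, and the sentinel start
block only stands in for the test that isnullstartblock() performs in
the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an in-core extent record; not the XFS layout. */
struct toy_extent {
	uint64_t	startblock;	/* TOY_DELALLOC if no disk blocks yet */
	uint64_t	blockcount;
};

/* Hypothetical sentinel marking a delayed-allocation reservation. */
#define TOY_DELALLOC	UINT64_MAX

/* Sum the lengths of the extents that are backed by real disk blocks. */
static uint64_t
toy_count_real_blocks(
	const struct toy_extent	*exts,
	size_t			nr_exts)
{
	uint64_t		count = 0;
	size_t			i;

	for (i = 0; i < nr_exts; i++) {
		/* Delalloc extents exist only in memory, so skip them. */
		if (exts[i].startblock == TOY_DELALLOC)
			continue;
		count += exts[i].blockcount;
	}
	return count;
}
```

Given extents of 8, 4 (delalloc) and 2 blocks, the sketch counts 10
blocks, which matches what a walk of on-disk records that never sees the
delalloc reservation would report.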

* [PATCH v2 10/13] xfs: refactor the ifork block counting function
  2017-06-02 21:25 ` [PATCH 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
  2017-06-06 16:29   ` Brian Foster
  2017-06-07  1:29   ` [PATCH v2 9.9/13] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior Darrick J. Wong
@ 2017-06-07  1:29   ` Darrick J. Wong
  2017-06-07 15:11     ` Brian Foster
  2 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-07  1:29 UTC (permalink / raw)
  To: linux-xfs

Refactor the inode fork block counting function to count extents for us
at the same time.  This will be used by the bmbt scrubber function.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_bmap_util.c |  109 +++++++++++++++++++++++++++++-------------------
 fs/xfs/xfs_bmap_util.h |    4 ++
 2 files changed, 70 insertions(+), 43 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index a34c3ce..4baaff1 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -224,7 +224,8 @@ xfs_bmap_eof(
 STATIC void
 xfs_bmap_count_leaves(
 	struct xfs_ifork	*ifp,
-	int			*count)
+	xfs_extnum_t		*numrecs,
+	xfs_filblks_t		*count)
 {
 	xfs_extnum_t		i;
 	xfs_extnum_t		nr_exts = xfs_iext_count(ifp);
@@ -232,6 +233,7 @@ xfs_bmap_count_leaves(
 	for (i = 0; i < nr_exts; i++) {
 		xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, i);
 		if (!isnullstartblock(xfs_bmbt_get_startblock(frp))) {
+			(*numrecs)++;
 			*count += xfs_bmbt_get_blockcount(frp);
 		}
 	}
@@ -246,7 +248,7 @@ xfs_bmap_disk_count_leaves(
 	struct xfs_mount	*mp,
 	struct xfs_btree_block	*block,
 	int			numrecs,
-	int			*count)
+	xfs_filblks_t		*count)
 {
 	int		b;
 	xfs_bmbt_rec_t	*frp;
@@ -261,17 +263,18 @@ xfs_bmap_disk_count_leaves(
  * Recursively walks each level of a btree
  * to count total fsblocks in use.
  */
-STATIC int                                     /* error */
+STATIC int
 xfs_bmap_count_tree(
-	xfs_mount_t     *mp,            /* file system mount point */
-	xfs_trans_t     *tp,            /* transaction pointer */
-	xfs_ifork_t	*ifp,		/* inode fork pointer */
-	xfs_fsblock_t   blockno,	/* file system block number */
-	int             levelin,	/* level in btree */
-	int		*count)		/* Count of blocks */
+	struct xfs_mount	*mp,
+	struct xfs_trans	*tp,
+	struct xfs_ifork	*ifp,
+	xfs_fsblock_t		blockno,
+	int			levelin,
+	xfs_extnum_t		*nextents,
+	xfs_filblks_t		*count)
 {
 	int			error;
-	xfs_buf_t		*bp, *nbp;
+	struct xfs_buf		*bp, *nbp;
 	int			level = levelin;
 	__be64			*pp;
 	xfs_fsblock_t           bno = blockno;
@@ -304,8 +307,9 @@ xfs_bmap_count_tree(
 		/* Dive to the next level */
 		pp = XFS_BMBT_PTR_ADDR(mp, block, 1, mp->m_bmap_dmxr[1]);
 		bno = be64_to_cpu(*pp);
-		if (unlikely((error =
-		     xfs_bmap_count_tree(mp, tp, ifp, bno, level, count)) < 0)) {
+		error = xfs_bmap_count_tree(mp, tp, ifp, bno, level, nextents,
+				count);
+		if (error) {
 			xfs_trans_brelse(tp, bp);
 			XFS_ERROR_REPORT("xfs_bmap_count_tree(1)",
 					 XFS_ERRLEVEL_LOW, mp);
@@ -317,6 +321,7 @@ xfs_bmap_count_tree(
 		for (;;) {
 			nextbno = be64_to_cpu(block->bb_u.l.bb_rightsib);
 			numrecs = be16_to_cpu(block->bb_numrecs);
+			(*nextents) += numrecs;
 			xfs_bmap_disk_count_leaves(mp, block, numrecs, count);
 			xfs_trans_brelse(tp, bp);
 			if (nextbno == NULLFSBLOCK)
@@ -337,44 +342,61 @@ xfs_bmap_count_tree(
 /*
  * Count fsblocks of the given fork.
  */
-static int					/* error */
+int
 xfs_bmap_count_blocks(
-	xfs_trans_t		*tp,		/* transaction pointer */
-	xfs_inode_t		*ip,		/* incore inode */
-	int			whichfork,	/* data or attr fork */
-	int			*count)		/* out: count of blocks */
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	int			whichfork,
+	xfs_extnum_t		*nextents,
+	xfs_filblks_t		*count)
 {
+	struct xfs_mount	*mp;	/* file system mount structure */
+	__be64			*pp;	/* pointer to block address */
 	struct xfs_btree_block	*block;	/* current btree block */
+	struct xfs_ifork	*ifp;	/* fork structure */
 	xfs_fsblock_t		bno;	/* block # of "block" */
-	xfs_ifork_t		*ifp;	/* fork structure */
 	int			level;	/* btree level, for checking */
-	xfs_mount_t		*mp;	/* file system mount structure */
-	__be64			*pp;	/* pointer to block address */
+	int			error;
 
 	bno = NULLFSBLOCK;
 	mp = ip->i_mount;
+	*nextents = 0;
+	*count = 0;
 	ifp = XFS_IFORK_PTR(ip, whichfork);
-	if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) {
-		xfs_bmap_count_leaves(ifp, count);
+	if (!ifp)
 		return 0;
-	}
 
-	/*
-	 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
-	 */
-	block = ifp->if_broot;
-	level = be16_to_cpu(block->bb_level);
-	ASSERT(level > 0);
-	pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
-	bno = be64_to_cpu(*pp);
-	ASSERT(bno != NULLFSBLOCK);
-	ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
-	ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
-
-	if (unlikely(xfs_bmap_count_tree(mp, tp, ifp, bno, level, count) < 0)) {
-		XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)", XFS_ERRLEVEL_LOW,
-				 mp);
-		return -EFSCORRUPTED;
+	switch (XFS_IFORK_FORMAT(ip, whichfork)) {
+	case XFS_DINODE_FMT_EXTENTS:
+		xfs_bmap_count_leaves(ifp, nextents, count);
+		return 0;
+	case XFS_DINODE_FMT_BTREE:
+		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+			error = xfs_iread_extents(tp, ip, whichfork);
+			if (error)
+				return error;
+		}
+
+		/*
+		 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
+		 */
+		block = ifp->if_broot;
+		level = be16_to_cpu(block->bb_level);
+		ASSERT(level > 0);
+		pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
+		bno = be64_to_cpu(*pp);
+		ASSERT(bno != NULLFSBLOCK);
+		ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
+		ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
+
+		error = xfs_bmap_count_tree(mp, tp, ifp, bno, level,
+				nextents, count);
+		if (error) {
+			XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)",
+					XFS_ERRLEVEL_LOW, mp);
+			return -EFSCORRUPTED;
+		}
+		return 0;
 	}
 
 	return 0;
@@ -1790,8 +1812,9 @@ xfs_swap_extent_forks(
 	int			*target_log_flags)
 {
 	struct xfs_ifork	tempifp, *ifp, *tifp;
-	int			aforkblks = 0;
-	int			taforkblks = 0;
+	xfs_filblks_t		aforkblks;
+	xfs_filblks_t		taforkblks;
+	xfs_extnum_t		junk;
 	xfs_extnum_t		nextents;
 	uint64_t		tmp;
 	int			error;
@@ -1801,14 +1824,14 @@ xfs_swap_extent_forks(
 	 */
 	if ( ((XFS_IFORK_Q(ip) != 0) && (ip->i_d.di_anextents > 0)) &&
 	     (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
-		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK,
+		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK, &junk,
 				&aforkblks);
 		if (error)
 			return error;
 	}
 	if ( ((XFS_IFORK_Q(tip) != 0) && (tip->i_d.di_anextents > 0)) &&
 	     (tip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
-		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK,
+		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK, &junk,
 				&taforkblks);
 		if (error)
 			return error;
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 135d826..0cede10 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -70,4 +70,8 @@ int	xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
 
 xfs_daddr_t xfs_fsb_to_db(struct xfs_inode *ip, xfs_fsblock_t fsb);
 
+int xfs_bmap_count_blocks(struct xfs_trans *tp, struct xfs_inode *ip,
+			  int whichfork, xfs_extnum_t *nextents,
+			  xfs_filblks_t *count);
+
 #endif	/* __XFS_BMAP_UTIL_H__ */


* Re: [PATCH v2 01/13] xfs: optimize _btree_query_all
  2017-06-07  1:18   ` [PATCH v2 " Darrick J. Wong
@ 2017-06-07 14:22     ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-07 14:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 06:18:53PM -0700, Darrick J. Wong wrote:
> Don't bother wandering our way through the leaf nodes when the caller
> issues a query_all; just zoom down the left side of the tree and walk
> rightwards along level zero.  In other words, use the simple query
> range implementation.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_btree.c |   12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
> index 3a673ba..d505179 100644
> --- a/fs/xfs/libxfs/xfs_btree.c
> +++ b/fs/xfs/libxfs/xfs_btree.c
> @@ -4849,12 +4849,14 @@ xfs_btree_query_all(
>  	xfs_btree_query_range_fn	fn,
>  	void				*priv)
>  {
> -	union xfs_btree_irec		low_rec;
> -	union xfs_btree_irec		high_rec;
> +	union xfs_btree_key		low_key;
> +	union xfs_btree_key		high_key;
> +
> +	memset(&cur->bc_rec, 0, sizeof(cur->bc_rec));
> +	memset(&low_key, 0, sizeof(low_key));
> +	memset(&high_key, 0xFF, sizeof(high_key));
>  
> -	memset(&low_rec, 0, sizeof(low_rec));
> -	memset(&high_rec, 0xFF, sizeof(high_rec));
> -	return xfs_btree_query_range(cur, &low_rec, &high_rec, fn, priv);
> +	return xfs_btree_simple_query_range(cur, &low_key, &high_key, fn, priv);
>  }
>  
>  /*
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

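The traversal named in the commit message (descend the left side of the
tree, then walk rightwards along level zero) can be modelled with a toy
btree.  This is only a sketch under stated assumptions: struct
toy_block, toy_query_all() and toy_sum_rec() are hypothetical, and none
of the key handling or cursor state of the real xfs_btree code appears
here.

```c
#include <stddef.h>

#define TOY_MAX_RECS	4

/* Hypothetical btree block; level 0 blocks are leaves holding records. */
struct toy_block {
	int			level;
	int			numrecs;
	int			recs[TOY_MAX_RECS];		/* leaf records */
	struct toy_block	*children[TOY_MAX_RECS];	/* node pointers */
	struct toy_block	*rightsib;	/* next block at this level */
};

typedef int (*toy_query_fn)(int rec, void *priv);

/* Visit every record: descend the left edge, then walk level zero. */
static int
toy_query_all(
	struct toy_block	*root,
	toy_query_fn		fn,
	void			*priv)
{
	struct toy_block	*block = root;
	int			error;
	int			i;

	/* Follow the first pointer of each node down to the leftmost leaf. */
	while (block && block->level > 0)
		block = block->children[0];

	/* Walk rightwards along the leaf level via the sibling pointers. */
	for (; block; block = block->rightsib) {
		for (i = 0; i < block->numrecs; i++) {
			error = fn(block->recs[i], priv);
			if (error)
				return error;
		}
	}
	return 0;
}

/* Example callback: accumulate each record into the int behind priv. */
static int
toy_sum_rec(
	int			rec,
	void			*priv)
{
	*(int *)priv += rec;
	return 0;
}
```

The interior nodes are touched only once each on the way down; every
record is then reached through sibling pointers, so no per-record key
comparison is needed.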

* Re: [PATCH 07/13] xfs: check if an inode is cached and allocated
  2017-06-06 18:40     ` Darrick J. Wong
@ 2017-06-07 14:22       ` Brian Foster
  2017-06-15  5:00         ` Darrick J. Wong
  0 siblings, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-07 14:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 11:40:06AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 06, 2017 at 12:28:13PM -0400, Brian Foster wrote:
> > On Fri, Jun 02, 2017 at 02:24:43PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Check the inode cache for a particular inode number.  If it's in the
> > > cache, check that it's not currently being reclaimed.  If it's not being
> > > reclaimed, return zero if the inode is allocated.  This function will be
> > > used by various scrubbers to decide if the cache is more up to date
> > > than the disk in terms of checking if an inode is allocated.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/xfs_icache.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  fs/xfs/xfs_icache.h |    3 ++
> > >  2 files changed, 86 insertions(+)
> > > 
> > > 
> > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > > index f61c84f8..d610a7e 100644
> > > --- a/fs/xfs/xfs_icache.c
> > > +++ b/fs/xfs/xfs_icache.c
> > > @@ -633,6 +633,89 @@ xfs_iget(
> > >  }
> > >  
> > >  /*
> > > + * "Is this a cached inode that's also allocated?"
> > > + *
> > > + * Look up an inode by number in the given file system.  If the inode is
> > > + * in cache and isn't in purgatory, return 1 if the inode is allocated
> > > + * and 0 if it is not.  For all other cases (not in cache, being torn
> > > + * down, etc.), return a negative error code.
> > > + *
> > > + * (The caller has to prevent inode allocation activity.)
> > > + */
> > 
> > Hmm.. so isn't the data returned here potentially invalid once we drop
> > the inode reference? In other words, couldn't an inode where we return
> > inuse == true be reclaimed immediately after? Perhaps I'm just not far
> > enough along to understand how this is used. If that's the case, a note
> > about the lifetime/rules of this value might be useful.
> 
> The comment could state more explicitly what we're assuming the caller
> has done to prevent inode allocation or freeing activity.  The scrubber
> that calls this function will have locked the AGI buffer for this AG so
> that it can compare the inobt ir_free bits against di_mode to make sure
> that there aren't any discrepancies.  Even if the inode is immediately
> reclaimed/deleted after we release the inode, the corresponding inobt
> update will block on the AGI until the scrubber finishes, so from the
> scrubber's point of view things are still consistent.  If the scrubber
> finds the inode in some intermediate state of being created or torn
> down, it doesn't bother checking the free mask on the assumption that
> the thread modifying the inode will ensure the consistency or shut down.
> 
> tldr: We assume the caller has the AGI locked so that inodes stay stable
> wrt allocation or freeing, or only end up in an intermediate state;
> we also assume the caller can handle inodes in an intermediate state.
> 

Ok, thanks for the explanation. The bits about reclaim are still a bit
unclear to me, but that will probably make more sense when I see how
this is used.

> > FWIW, I'm also kind of wondering if rather than open code the bits of
> > the inode lookup, we could accomplish the same thing with a new flag to
> > the existing xfs_iget() lookup mechanism that implements the associated
> > semantics (i.e., don't read from disk, don't reinit, sort of a read-only
> > semantic).
> 
> Originally it was just an iget flag, but the flag ended up special
> casing a lot of the existing iget functionality.  Basically, we need to
> disable the xfs_iget_cache_miss call; avoid the out_error_or_again case;
> do our i_mode testing, release the inode, and jump out of the function
> prior to the bit that can call xfs_setup_existing_inode; and change the
> lock_flags assert to require lock_flags == 0 when we're just checking.
> 
> All that turned xfs_iget into such a muddy mess that I decided it was
> cleaner to separate this specialized case into its own function and hope
> that we're not really going to modify _iget a whole lot.
> 

Hmm, so obviously I would expect some tweaks in that code, but I'm
curious how messy it really has to be. Walking through some of the
changes...

- The lock_flags check is already conditional in the code, so I'm not
  sure we really need the assert. I'd be fine with dropping it at least
  if we had a lock_flags == 0 caller. We could alternatively adjust it
  to accommodate the new xfs_iget() flag, which might be safer.
- I'm not sure that xfs_iget() really needs to be responsible for the
  release. What about a helper function on top that actually receives
  the xfs_inode from xfs_iget() and does the resulting checks, sets
  inuse appropriately and then releases the inode?
- With the above changes, would that reduce the necessary xfs_iget()
  changes to basically skipping out in a few places? For example,
  consider an XFS_IGET_INCORE flag that skips the -EAGAIN retry, skips
  the IRECLAIMABLE reinit in _iget_cache_hit() (returns -EAGAIN) and
  returns -ENOENT rather than calling _iget_cache_miss(). The code flow
  of the helper might look something like the following:

int
xfs_icache_inode_is_allocated(
	...
	xfs_ino_t		ino,
	bool			*inuse)
{
	...

	*inuse = false;
	error = xfs_iget(..., ino, XFS_IGET_INCORE, 0, &ip);
	if (error)
		return error;

	if (<ip checks>)
		*inuse = true;

	IRELE(ip);
	return 0;
}

... and may only require fairly straightforward tweaks to xfs_iget().
Thoughts?
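
For readers who want to experiment with the idea, the flag semantics
listed above can be modelled in userspace.  This is a sketch under
stated assumptions, not a proposed patch: struct toy_inode, struct
toy_cache, toy_iget(), TOY_IGET_INCORE and TOY_IRECLAIMABLE are all
hypothetical, and the toy has no locking or reference counting, so
nothing corresponds to IRELE().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag and inode state for a toy cache; not XFS definitions. */
#define TOY_IGET_INCORE		(1u << 0)	/* never go to disk */
#define TOY_IRECLAIMABLE	(1u << 0)	/* inode is awaiting reclaim */

struct toy_inode {
	uint64_t		ino;
	unsigned int		flags;
	unsigned int		mode;	/* zero means "not allocated" */
};

struct toy_cache {
	struct toy_inode	*inodes;
	size_t			nr;
};

/* Look up an inode; with TOY_IGET_INCORE only the cache is consulted. */
static int
toy_iget(
	struct toy_cache	*cache,
	uint64_t		ino,
	unsigned int		flags,
	struct toy_inode	**ipp)
{
	size_t			i;

	for (i = 0; i < cache->nr; i++) {
		struct toy_inode	*ip = &cache->inodes[i];

		if (ip->ino != ino)
			continue;
		/* Cache hit: refuse to reinitialize a reclaimable inode. */
		if ((flags & TOY_IGET_INCORE) &&
		    (ip->flags & TOY_IRECLAIMABLE))
			return -EAGAIN;
		*ipp = ip;
		return 0;
	}

	/* Cache miss: the toy has no disk, so every miss is -ENOENT. */
	return -ENOENT;
}

/* Helper layered on top of the lookup, as in the outline above. */
static int
toy_inode_is_allocated(
	struct toy_cache	*cache,
	uint64_t		ino,
	bool			*inuse)
{
	struct toy_inode	*ip;
	int			error;

	*inuse = false;
	error = toy_iget(cache, ino, TOY_IGET_INCORE, &ip);
	if (error)
		return error;

	*inuse = ip->mode != 0;
	return 0;
}
```

The point of the layering is that the lookup only learns to bail out
early, while the allocation test lives entirely in the helper.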

Brian

> Anyway, thank you for reviewing!
> 
> --D
> 
> > 
> > Brian
> > 
> > > +int
> > > +xfs_icache_inode_is_allocated(
> > > +	struct xfs_mount	*mp,
> > > +	struct xfs_trans	*tp,
> > > +	xfs_ino_t		ino,
> > > +	bool			*inuse)
> > > +{
> > > +	struct xfs_inode	*ip;
> > > +	struct xfs_perag	*pag;
> > > +	xfs_agino_t		agino;
> > > +	int			ret = 0;
> > > +
> > > +	/* reject inode numbers outside existing AGs */
> > > +	if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
> > > +		return -EINVAL;
> > > +
> > > +	/* get the perag structure and ensure that it's inode capable */
> > > +	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ino));
> > > +	agino = XFS_INO_TO_AGINO(mp, ino);
> > > +
> > > +	rcu_read_lock();
> > > +	ip = radix_tree_lookup(&pag->pag_ici_root, agino);
> > > +	if (!ip) {
> > > +		ret = -ENOENT;
> > > +		goto out;
> > > +	}
> > > +
> > > +	/*
> > > +	 * Is the inode being reused?  Is it new?  Is it being
> > > +	 * reclaimed?  Is it being torn down?  For any of those cases,
> > > +	 * fall back.
> > > +	 */
> > > +	spin_lock(&ip->i_flags_lock);
> > > +	if (ip->i_ino != ino ||
> > > +	    (ip->i_flags & (XFS_INEW | XFS_IRECLAIM | XFS_IRECLAIMABLE))) {
> > > +		ret = -EAGAIN;
> > > +		goto out_istate;
> > > +	}
> > > +
> > > +	/*
> > > +	 * If lookup is racing with unlink, jump out immediately.
> > > +	 */
> > > +	if (VFS_I(ip)->i_mode == 0) {
> > > +		*inuse = false;
> > > +		ret = 0;
> > > +		goto out_istate;
> > > +	}
> > > +
> > > +	/* If the VFS inode is being torn down, forget it. */
> > > +	if (!igrab(VFS_I(ip))) {
> > > +		ret = -EAGAIN;
> > > +		goto out_istate;
> > > +	}
> > > +
> > > +	/* We've got a live one. */
> > > +	spin_unlock(&ip->i_flags_lock);
> > > +	rcu_read_unlock();
> > > +	xfs_perag_put(pag);
> > > +
> > > +	*inuse = !!(VFS_I(ip)->i_mode);
> > > +	ret = 0;
> > > +	IRELE(ip);
> > > +
> > > +	return ret;
> > > +
> > > +out_istate:
> > > +	spin_unlock(&ip->i_flags_lock);
> > > +out:
> > > +	rcu_read_unlock();
> > > +	xfs_perag_put(pag);
> > > +	return ret;
> > > +}
> > > +
> > > +/*
> > >   * The inode lookup is done in batches to keep the amount of lock traffic and
> > >   * radix tree lookups to a minimum. The batch size is a trade off between
> > >   * lookup reduction and stack usage. This is in the reclaim path, so we can't
> > > diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> > > index 9183f77..eadf718 100644
> > > --- a/fs/xfs/xfs_icache.h
> > > +++ b/fs/xfs/xfs_icache.h
> > > @@ -126,4 +126,7 @@ xfs_fs_eofblocks_from_user(
> > >  	return 0;
> > >  }
> > >  
> > > +int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
> > > +				  xfs_ino_t ino, bool *inuse);
> > > +
> > >  #endif
> > > 


* Re: [PATCH v2 09/13] xfs: separate function to check if reflink flag needed
  2017-06-07  1:26   ` [PATCH v2 " Darrick J. Wong
@ 2017-06-07 14:22     ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-07 14:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 06:26:46PM -0700, Darrick J. Wong wrote:
> Separate the "clear reflink flag" function into one function that checks
> if the flag is needed, and a second function that checks and clears the
> flag.  The inode scrub code will want to check the necessity of the flag
> without clearing it.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_reflink.c |   88 ++++++++++++++++++++++++++++++--------------------
>  fs/xfs/xfs_reflink.h |    2 +
>  2 files changed, 54 insertions(+), 36 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index e25c995..ab2270a 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1406,57 +1406,73 @@ xfs_reflink_dirty_extents(
>  	return error;
>  }
>  
> -/* Clear the inode reflink flag if there are no shared extents. */
> +/* Does this inode need the reflink flag? */
>  int
> -xfs_reflink_clear_inode_flag(
> -	struct xfs_inode	*ip,
> -	struct xfs_trans	**tpp)
> +xfs_reflink_inode_has_shared_extents(
> +	struct xfs_trans		*tp,
> +	struct xfs_inode		*ip,
> +	bool				*has_shared)
>  {
> -	struct xfs_mount	*mp = ip->i_mount;
> -	xfs_fileoff_t		fbno;
> -	xfs_filblks_t		end;
> -	xfs_agnumber_t		agno;
> -	xfs_agblock_t		agbno;
> -	xfs_extlen_t		aglen;
> -	xfs_agblock_t		rbno;
> -	xfs_extlen_t		rlen;
> -	struct xfs_bmbt_irec	map;
> -	int			nmaps;
> -	int			error = 0;
> -
> -	ASSERT(xfs_is_reflink_inode(ip));
> +	struct xfs_bmbt_irec		got;
> +	struct xfs_mount		*mp = ip->i_mount;
> +	struct xfs_ifork		*ifp;
> +	xfs_agnumber_t			agno;
> +	xfs_agblock_t			agbno;
> +	xfs_extlen_t			aglen;
> +	xfs_agblock_t			rbno;
> +	xfs_extlen_t			rlen;
> +	xfs_extnum_t			idx;
> +	bool				found;
> +	int				error;
>  
> -	fbno = 0;
> -	end = XFS_B_TO_FSB(mp, i_size_read(VFS_I(ip)));
> -	while (end - fbno > 0) {
> -		nmaps = 1;
> -		/*
> -		 * Look for extents in the file.  Skip holes, delalloc, or
> -		 * unwritten extents; they can't be reflinked.
> -		 */
> -		error = xfs_bmapi_read(ip, fbno, end - fbno, &map, &nmaps, 0);
> +	ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> +	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> +		error = xfs_iread_extents(tp, ip, XFS_DATA_FORK);
>  		if (error)
>  			return error;
> -		if (nmaps == 0)
> -			break;
> -		if (!xfs_bmap_is_real_extent(&map))
> -			goto next;
> +	}
>  
> -		agno = XFS_FSB_TO_AGNO(mp, map.br_startblock);
> -		agbno = XFS_FSB_TO_AGBNO(mp, map.br_startblock);
> -		aglen = map.br_blockcount;
> +	*has_shared = false;
> +	found = xfs_iext_lookup_extent(ip, ifp, 0, &idx, &got);
> +	while (found) {
> +		if (isnullstartblock(got.br_startblock) ||
> +		    got.br_state != XFS_EXT_NORM)
> +			goto next;
> +		agno = XFS_FSB_TO_AGNO(mp, got.br_startblock);
> +		agbno = XFS_FSB_TO_AGBNO(mp, got.br_startblock);
> +		aglen = got.br_blockcount;
>  
> -		error = xfs_reflink_find_shared(mp, *tpp, agno, agbno, aglen,
> +		error = xfs_reflink_find_shared(mp, tp, agno, agbno, aglen,
>  				&rbno, &rlen, false);
>  		if (error)
>  			return error;
>  		/* Is there still a shared block here? */
> -		if (rbno != NULLAGBLOCK)
> +		if (rbno != NULLAGBLOCK) {
> +			*has_shared = true;
>  			return 0;
> +		}
>  next:
> -		fbno = map.br_startoff + map.br_blockcount;
> +		found = xfs_iext_get_extent(ifp, ++idx, &got);
>  	}
>  
> +	return 0;
> +}
> +
> +/* Clear the inode reflink flag if there are no shared extents. */
> +int
> +xfs_reflink_clear_inode_flag(
> +	struct xfs_inode	*ip,
> +	struct xfs_trans	**tpp)
> +{
> +	bool			needs_flag;
> +	int			error = 0;
> +
> +	ASSERT(xfs_is_reflink_inode(ip));
> +
> +	error = xfs_reflink_inode_has_shared_extents(*tpp, ip, &needs_flag);
> +	if (error || needs_flag)
> +		return error;
> +
>  	/*
>  	 * We didn't find any shared blocks so turn off the reflink flag.
>  	 * First, get rid of any leftover CoW mappings.
> diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
> index b8cc5c3..701487b 100644
> --- a/fs/xfs/xfs_reflink.h
> +++ b/fs/xfs/xfs_reflink.h
> @@ -47,6 +47,8 @@ extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset,
>  extern int xfs_reflink_recover_cow(struct xfs_mount *mp);
>  extern int xfs_reflink_remap_range(struct file *file_in, loff_t pos_in,
>  		struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe);
> +extern int xfs_reflink_inode_has_shared_extents(struct xfs_trans *tp,
> +		struct xfs_inode *ip, bool *has_shared);
>  extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip,
>  		struct xfs_trans **tpp);
>  extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,

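The split described in the commit message (a read-only check, plus a
second function that checks and then clears) follows a common pattern
that can be shown on a toy structure.  This is a hedged sketch: struct
toy_file, toy_has_shared_extents() and toy_clear_reflink_flag() are
hypothetical, and the per-extent boolean stands in for the refcount
lookup that xfs_reflink_find_shared() does in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy file: a flag plus per-extent "is shared" markers. */
struct toy_file {
	bool		reflink_flag;
	const bool	*extent_shared;
	size_t		nr_extents;
};

/* Read-only check: does any extent still share blocks? */
static int
toy_has_shared_extents(
	const struct toy_file	*fp,
	bool			*has_shared)
{
	size_t			i;

	*has_shared = false;
	for (i = 0; i < fp->nr_extents; i++) {
		if (fp->extent_shared[i]) {
			*has_shared = true;
			return 0;
		}
	}
	return 0;
}

/* Check first, then clear the flag only when nothing is shared. */
static int
toy_clear_reflink_flag(
	struct toy_file		*fp)
{
	bool			needs_flag;
	int			error;

	error = toy_has_shared_extents(fp, &needs_flag);
	if (error || needs_flag)
		return error;

	fp->reflink_flag = false;
	return 0;
}
```

A checker such as scrub can call the first function alone and compare
its answer with the flag, without the side effect of the second.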

* Re: [PATCH v2 9.9/13] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior
  2017-06-07  1:29   ` [PATCH v2 9.9/13] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior Darrick J. Wong
@ 2017-06-07 15:11     ` Brian Foster
  2017-06-07 16:19       ` Darrick J. Wong
  0 siblings, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-07 15:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 06:29:10PM -0700, Darrick J. Wong wrote:
> There is an inconsistency in the way that _bmap_count_blocks deals with
> delalloc reservations -- if the specified fork is in extents format,
> *count is set to the total number of blocks referenced by the in-core
> fork, including delalloc extents.  However, if the fork is in btree
> format, *count is set to the number of blocks referenced by the on-disk
> fork, which does /not/ include delalloc extents.
> 
> For the lone existing caller of _bmap_count_blocks this hasn't been an
> issue because the function is only used to count xattr fork blocks
> (where there aren't any delalloc reservations).  However, when scrub
> comes along it will use this same function to check di_nblocks against
> both on-disk extent maps, so we need this behavior to be consistent.
> 
> Therefore, fix _bmap_count_leaves not to include delalloc extents and
> remove unnecessary parameters.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_bmap_util.c |   17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index fe83bbc..a34c3ce 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -223,16 +223,17 @@ xfs_bmap_eof(
>   */

A quick update to the function comment would be nice just to point out
that we skip delalloc blocks to be consistent with the bmbt case.
Otherwise looks good to me:

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  STATIC void
>  xfs_bmap_count_leaves(
> -	xfs_ifork_t		*ifp,
> -	xfs_extnum_t		idx,
> -	int			numrecs,
> +	struct xfs_ifork	*ifp,
>  	int			*count)
>  {
> -	int		b;
> +	xfs_extnum_t		i;
> +	xfs_extnum_t		nr_exts = xfs_iext_count(ifp);
>  
> -	for (b = 0; b < numrecs; b++) {
> -		xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, idx + b);
> -		*count += xfs_bmbt_get_blockcount(frp);
> +	for (i = 0; i < nr_exts; i++) {
> +		xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, i);
> +		if (!isnullstartblock(xfs_bmbt_get_startblock(frp))) {
> +			*count += xfs_bmbt_get_blockcount(frp);
> +		}
>  	}
>  }
>  
> @@ -354,7 +355,7 @@ xfs_bmap_count_blocks(
>  	mp = ip->i_mount;
>  	ifp = XFS_IFORK_PTR(ip, whichfork);
>  	if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) {
> -		xfs_bmap_count_leaves(ifp, 0, xfs_iext_count(ifp), count);
> +		xfs_bmap_count_leaves(ifp, count);
>  		return 0;
>  	}
>  


* Re: [PATCH v2 10/13] xfs: refactor the ifork block counting function
  2017-06-07  1:29   ` [PATCH v2 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
@ 2017-06-07 15:11     ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-07 15:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jun 06, 2017 at 06:29:16PM -0700, Darrick J. Wong wrote:
> Refactor the inode fork block counting function to count extents for us
> at the same time.  This will be used by the bmbt scrubber function.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_bmap_util.c |  109 +++++++++++++++++++++++++++++-------------------
>  fs/xfs/xfs_bmap_util.h |    4 ++
>  2 files changed, 70 insertions(+), 43 deletions(-)
> 
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index a34c3ce..4baaff1 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -224,7 +224,8 @@ xfs_bmap_eof(
>  STATIC void
>  xfs_bmap_count_leaves(
>  	struct xfs_ifork	*ifp,
> -	int			*count)
> +	xfs_extnum_t		*numrecs,
> +	xfs_filblks_t		*count)
>  {
>  	xfs_extnum_t		i;
>  	xfs_extnum_t		nr_exts = xfs_iext_count(ifp);
> @@ -232,6 +233,7 @@ xfs_bmap_count_leaves(
>  	for (i = 0; i < nr_exts; i++) {
>  		xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, i);
>  		if (!isnullstartblock(xfs_bmbt_get_startblock(frp))) {
> +			(*numrecs)++;
>  			*count += xfs_bmbt_get_blockcount(frp);
>  		}
>  	}
> @@ -246,7 +248,7 @@ xfs_bmap_disk_count_leaves(
>  	struct xfs_mount	*mp,
>  	struct xfs_btree_block	*block,
>  	int			numrecs,
> -	int			*count)
> +	xfs_filblks_t		*count)
>  {
>  	int		b;
>  	xfs_bmbt_rec_t	*frp;
> @@ -261,17 +263,18 @@ xfs_bmap_disk_count_leaves(
>   * Recursively walks each level of a btree
>   * to count total fsblocks in use.
>   */
> -STATIC int                                     /* error */
> +STATIC int
>  xfs_bmap_count_tree(
> -	xfs_mount_t     *mp,            /* file system mount point */
> -	xfs_trans_t     *tp,            /* transaction pointer */
> -	xfs_ifork_t	*ifp,		/* inode fork pointer */
> -	xfs_fsblock_t   blockno,	/* file system block number */
> -	int             levelin,	/* level in btree */
> -	int		*count)		/* Count of blocks */
> +	struct xfs_mount	*mp,
> +	struct xfs_trans	*tp,
> +	struct xfs_ifork	*ifp,
> +	xfs_fsblock_t		blockno,
> +	int			levelin,
> +	xfs_extnum_t		*nextents,
> +	xfs_filblks_t		*count)
>  {
>  	int			error;
> -	xfs_buf_t		*bp, *nbp;
> +	struct xfs_buf		*bp, *nbp;
>  	int			level = levelin;
>  	__be64			*pp;
>  	xfs_fsblock_t           bno = blockno;
> @@ -304,8 +307,9 @@ xfs_bmap_count_tree(
>  		/* Dive to the next level */
>  		pp = XFS_BMBT_PTR_ADDR(mp, block, 1, mp->m_bmap_dmxr[1]);
>  		bno = be64_to_cpu(*pp);
> -		if (unlikely((error =
> -		     xfs_bmap_count_tree(mp, tp, ifp, bno, level, count)) < 0)) {
> +		error = xfs_bmap_count_tree(mp, tp, ifp, bno, level, nextents,
> +				count);
> +		if (error) {
>  			xfs_trans_brelse(tp, bp);
>  			XFS_ERROR_REPORT("xfs_bmap_count_tree(1)",
>  					 XFS_ERRLEVEL_LOW, mp);
> @@ -317,6 +321,7 @@ xfs_bmap_count_tree(
>  		for (;;) {
>  			nextbno = be64_to_cpu(block->bb_u.l.bb_rightsib);
>  			numrecs = be16_to_cpu(block->bb_numrecs);
> +			(*nextents) += numrecs;
>  			xfs_bmap_disk_count_leaves(mp, block, numrecs, count);
>  			xfs_trans_brelse(tp, bp);
>  			if (nextbno == NULLFSBLOCK)
> @@ -337,44 +342,61 @@ xfs_bmap_count_tree(
>  /*
>   * Count fsblocks of the given fork.
>   */
> -static int					/* error */
> +int
>  xfs_bmap_count_blocks(
> -	xfs_trans_t		*tp,		/* transaction pointer */
> -	xfs_inode_t		*ip,		/* incore inode */
> -	int			whichfork,	/* data or attr fork */
> -	int			*count)		/* out: count of blocks */
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*ip,
> +	int			whichfork,
> +	xfs_extnum_t		*nextents,
> +	xfs_filblks_t		*count)
>  {
> +	struct xfs_mount	*mp;	/* file system mount structure */
> +	__be64			*pp;	/* pointer to block address */
>  	struct xfs_btree_block	*block;	/* current btree block */
> +	struct xfs_ifork	*ifp;	/* fork structure */
>  	xfs_fsblock_t		bno;	/* block # of "block" */
> -	xfs_ifork_t		*ifp;	/* fork structure */
>  	int			level;	/* btree level, for checking */
> -	xfs_mount_t		*mp;	/* file system mount structure */
> -	__be64			*pp;	/* pointer to block address */
> +	int			error;
>  
>  	bno = NULLFSBLOCK;
>  	mp = ip->i_mount;
> +	*nextents = 0;
> +	*count = 0;
>  	ifp = XFS_IFORK_PTR(ip, whichfork);
> -	if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) {
> -		xfs_bmap_count_leaves(ifp, count);
> +	if (!ifp)
>  		return 0;
> -	}
>  
> -	/*
> -	 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
> -	 */
> -	block = ifp->if_broot;
> -	level = be16_to_cpu(block->bb_level);
> -	ASSERT(level > 0);
> -	pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
> -	bno = be64_to_cpu(*pp);
> -	ASSERT(bno != NULLFSBLOCK);
> -	ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
> -	ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
> -
> -	if (unlikely(xfs_bmap_count_tree(mp, tp, ifp, bno, level, count) < 0)) {
> -		XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)", XFS_ERRLEVEL_LOW,
> -				 mp);
> -		return -EFSCORRUPTED;
> +	switch (XFS_IFORK_FORMAT(ip, whichfork)) {
> +	case XFS_DINODE_FMT_EXTENTS:
> +		xfs_bmap_count_leaves(ifp, nextents, count);
> +		return 0;
> +	case XFS_DINODE_FMT_BTREE:
> +		if (!(ifp->if_flags & XFS_IFEXTENTS)) {
> +			error = xfs_iread_extents(tp, ip, whichfork);
> +			if (error)
> +				return error;
> +		}
> +
> +		/*
> +		 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
> +		 */
> +		block = ifp->if_broot;
> +		level = be16_to_cpu(block->bb_level);
> +		ASSERT(level > 0);
> +		pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
> +		bno = be64_to_cpu(*pp);
> +		ASSERT(bno != NULLFSBLOCK);
> +		ASSERT(XFS_FSB_TO_AGNO(mp, bno) < mp->m_sb.sb_agcount);
> +		ASSERT(XFS_FSB_TO_AGBNO(mp, bno) < mp->m_sb.sb_agblocks);
> +
> +		error = xfs_bmap_count_tree(mp, tp, ifp, bno, level,
> +				nextents, count);
> +		if (error) {
> +			XFS_ERROR_REPORT("xfs_bmap_count_blocks(2)",
> +					XFS_ERRLEVEL_LOW, mp);
> +			return -EFSCORRUPTED;
> +		}
> +		return 0;
>  	}
>  
>  	return 0;
> @@ -1790,8 +1812,9 @@ xfs_swap_extent_forks(
>  	int			*target_log_flags)
>  {
>  	struct xfs_ifork	tempifp, *ifp, *tifp;
> -	int			aforkblks = 0;
> -	int			taforkblks = 0;
> +	xfs_filblks_t		aforkblks;
> +	xfs_filblks_t		taforkblks;
> +	xfs_extnum_t		junk;
>  	xfs_extnum_t		nextents;
>  	uint64_t		tmp;
>  	int			error;
> @@ -1801,14 +1824,14 @@ xfs_swap_extent_forks(
>  	 */
>  	if ( ((XFS_IFORK_Q(ip) != 0) && (ip->i_d.di_anextents > 0)) &&
>  	     (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
> -		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK,
> +		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK, &junk,
>  				&aforkblks);
>  		if (error)
>  			return error;
>  	}
>  	if ( ((XFS_IFORK_Q(tip) != 0) && (tip->i_d.di_anextents > 0)) &&
>  	     (tip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
> -		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK,
> +		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK, &junk,
>  				&taforkblks);
>  		if (error)
>  			return error;
> diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
> index 135d826..0cede10 100644
> --- a/fs/xfs/xfs_bmap_util.h
> +++ b/fs/xfs/xfs_bmap_util.h
> @@ -70,4 +70,8 @@ int	xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
>  
>  xfs_daddr_t xfs_fsb_to_db(struct xfs_inode *ip, xfs_fsblock_t fsb);
>  
> +int xfs_bmap_count_blocks(struct xfs_trans *tp, struct xfs_inode *ip,
> +			  int whichfork, xfs_extnum_t *nextents,
> +			  xfs_filblks_t *count);
> +
>  #endif	/* __XFS_BMAP_UTIL_H__ */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH v2 9.9/13] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior
  2017-06-07 15:11     ` Brian Foster
@ 2017-06-07 16:19       ` Darrick J. Wong
  0 siblings, 0 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-07 16:19 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jun 07, 2017 at 11:11:27AM -0400, Brian Foster wrote:
> On Tue, Jun 06, 2017 at 06:29:10PM -0700, Darrick J. Wong wrote:
> > There is an inconsistency in the way that _bmap_count_blocks deals with
> > delalloc reservations -- if the specified fork is in extents format,
> > *count is set to the total number of blocks referenced by the in-core
> > fork, including delalloc extents.  However, if the fork is in btree
> > format, *count is set to the number of blocks referenced by the on-disk
> > fork, which does /not/ include delalloc extents.
> > 
> > For the lone existing caller of _bmap_count_blocks this hasn't been an
> > issue because the function is only used to count xattr fork blocks
> > (where there aren't any delalloc reservations).  However, when scrub
> > comes along it will use this same function to check di_nblocks against
> > both on-disk extent maps, so we need this behavior to be consistent.
> > 
> > Therefore, fix _bmap_count_leaves not to include delalloc extents and
> > remove unnecessary parameters.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_bmap_util.c |   17 +++++++++--------
> >  1 file changed, 9 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > index fe83bbc..a34c3ce 100644
> > --- a/fs/xfs/xfs_bmap_util.c
> > +++ b/fs/xfs/xfs_bmap_util.c
> > @@ -223,16 +223,17 @@ xfs_bmap_eof(
> >   */
> 
> A quick update to the function comment would be nice just to point out
> that we skip delalloc blocks to be consistent with the bmbt case.
> Otherwise looks good to me:

Ok, done.  Thx for the re-review!  I'll update both _bmap_count_leaves and
_bmap_count_blocks like so:

/*
 * Count leaf blocks given a range of extent records.  Delayed allocation
 * extents are not counted towards the totals.
 */
STATIC void
xfs_bmap_count_leaves(...

/*
 * Count fsblocks of the given fork.  Delayed allocation extents are
 * not counted towards the totals.
 */
static int					/* error */
xfs_bmap_count_blocks(...
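
The counting rule in those two comments can be sketched in plain userspace
C.  This is only an illustration under stated assumptions, not the kernel
code: struct demo_extent, DEMO_DELALLOC and demo_count_leaves() are
hypothetical names, and a single sentinel start block stands in for the
isnullstartblock() test used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an in-core extent record; not the XFS type. */
struct demo_extent {
	uint64_t	startblock;	/* DEMO_DELALLOC marks a delalloc reservation */
	uint64_t	blockcount;	/* length of the extent in fs blocks */
};

/* Assumed sentinel; the patch itself tests isnullstartblock() instead. */
#define DEMO_DELALLOC	UINT64_MAX

/*
 * Add up the blocks of every extent that has a real start block.
 * Delayed allocation extents are skipped, so the total matches what a
 * walk of the on-disk mapping would report.
 */
void
demo_count_leaves(
	const struct demo_extent	*recs,
	size_t				nr_exts,
	uint64_t			*count)
{
	size_t				i;

	assert(count != NULL);
	for (i = 0; i < nr_exts; i++) {
		if (recs[i].startblock == DEMO_DELALLOC)
			continue;
		*count += recs[i].blockcount;
	}
}
```

As in the patch, the helper accumulates into *count rather than resetting
it, so the caller is expected to zero the counter first.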

--D

> 
> Reviewed-by: Brian Foster <bfoster@redhat.com>
> 
> >  STATIC void
> >  xfs_bmap_count_leaves(
> > -	xfs_ifork_t		*ifp,
> > -	xfs_extnum_t		idx,
> > -	int			numrecs,
> > +	struct xfs_ifork	*ifp,
> >  	int			*count)
> >  {
> > -	int		b;
> > +	xfs_extnum_t		i;
> > +	xfs_extnum_t		nr_exts = xfs_iext_count(ifp);
> >  
> > -	for (b = 0; b < numrecs; b++) {
> > -		xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, idx + b);
> > -		*count += xfs_bmbt_get_blockcount(frp);
> > +	for (i = 0; i < nr_exts; i++) {
> > +		xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, i);
> > +		if (!isnullstartblock(xfs_bmbt_get_startblock(frp))) {
> > +			*count += xfs_bmbt_get_blockcount(frp);
> > +		}
> >  	}
> >  }
> >  
> > @@ -354,7 +355,7 @@ xfs_bmap_count_blocks(
> >  	mp = ip->i_mount;
> >  	ifp = XFS_IFORK_PTR(ip, whichfork);
> >  	if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) {
> > -		xfs_bmap_count_leaves(ifp, 0, xfs_iext_count(ifp), count);
> > +		xfs_bmap_count_leaves(ifp, count);
> >  		return 0;
> >  	}
> >  

* Re: [PATCH 11/13] xfs: return the hash value of a leaf1 directory block
  2017-06-02 21:25 ` [PATCH 11/13] xfs: return the hash value of a leaf1 directory block Darrick J. Wong
@ 2017-06-08 13:02   ` Brian Foster
  2017-06-08 15:53     ` Darrick J. Wong
  2017-06-08 18:22   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-08 13:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:25:08PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Provide a way to calculate the highest hash value of a leaf1 block.
> This will be used by the directory scrubbing code to check the sanity
> of hashes in leaf1 directory blocks.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_dir2_node.c |   28 ++++++++++++++++++++++++++++
>  fs/xfs/libxfs/xfs_dir2_priv.h |    2 ++
>  2 files changed, 30 insertions(+)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> index bbd1238..15c1881 100644
> --- a/fs/xfs/libxfs/xfs_dir2_node.c
> +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> @@ -524,6 +524,34 @@ xfs_dir2_free_hdr_check(
>  #endif	/* DEBUG */
>  
>  /*
> + * Return the last hash value in the leaf1.
> + * Stale entries are ok.
> + */
> +xfs_dahash_t					/* hash value */
> +xfs_dir2_leaf1_lasthash(
> +	struct xfs_inode	*dp,
> +	struct xfs_buf		*bp,		/* leaf buffer */
> +	int			*count)		/* count of entries in leaf */
> +{
> +	struct xfs_dir2_leaf	*leaf = bp->b_addr;
> +	struct xfs_dir2_leaf_entry *ents;
> +	struct xfs_dir3_icleaf_hdr leafhdr;
> +
> +	dp->d_ops->leaf_hdr_from_disk(&leafhdr, leaf);
> +
> +	ASSERT(leafhdr.magic == XFS_DIR2_LEAF1_MAGIC ||
> +	       leafhdr.magic == XFS_DIR3_LEAF1_MAGIC);
> +

It looks like the assert is the only difference between this function
and xfs_dir2_leafn_lasthash(). It seems like overkill to me to duplicate
just for that. How about we fix up the assert to cover the additional
magics (and maybe rename _leafn_lasthash() to _leaf_lasthash() if
appropriate)?

Actually, taking a closer look, ->leaf_hdr_from_disk() already asserts
on the appropriate LEAF1/LEAFN magic based on the callback that is
specified. ISTM that we could also just kill the _lasthash() assert.

Brian

> +	if (count)
> +		*count = leafhdr.count;
> +	if (!leafhdr.count)
> +		return 0;
> +
> +	ents = dp->d_ops->leaf_ents_p(leaf);
> +	return be32_to_cpu(ents[leafhdr.count - 1].hashval);
> +}
> +
> +/*
>   * Return the last hash value in the leaf.
>   * Stale entries are ok.
>   */
> diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
> index 576f2d2..c09bca1 100644
> --- a/fs/xfs/libxfs/xfs_dir2_priv.h
> +++ b/fs/xfs/libxfs/xfs_dir2_priv.h
> @@ -95,6 +95,8 @@ extern bool xfs_dir3_leaf_check_int(struct xfs_mount *mp, struct xfs_inode *dp,
>  /* xfs_dir2_node.c */
>  extern int xfs_dir2_leaf_to_node(struct xfs_da_args *args,
>  		struct xfs_buf *lbp);
> +extern xfs_dahash_t xfs_dir2_leaf1_lasthash(struct xfs_inode *dp,
> +		struct xfs_buf *bp, int *count);
>  extern xfs_dahash_t xfs_dir2_leafn_lasthash(struct xfs_inode *dp,
>  		struct xfs_buf *bp, int *count);
>  extern int xfs_dir2_leafn_lookup_int(struct xfs_buf *bp,
> 

* Re: [PATCH 12/13] xfs: pass along transaction context when reading directory block buffers
  2017-06-02 21:25 ` [PATCH 12/13] xfs: pass along transaction context when reading directory block buffers Darrick J. Wong
@ 2017-06-08 13:02   ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-08 13:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:25:14PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Teach the directory reading functions to pass along a transaction context
> if one was supplied.  The directory scrub code will use transactions to
> lock buffers and avoid deadlocking with itself in the case of loops.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_dir2_priv.h |    4 ++--
>  fs/xfs/xfs_dir2_readdir.c     |   15 +++++++++++----
>  fs/xfs/xfs_file.c             |    2 +-
>  3 files changed, 14 insertions(+), 7 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
> index c09bca1..cb679cf 100644
> --- a/fs/xfs/libxfs/xfs_dir2_priv.h
> +++ b/fs/xfs/libxfs/xfs_dir2_priv.h
> @@ -132,7 +132,7 @@ extern int xfs_dir2_sf_replace(struct xfs_da_args *args);
>  extern int xfs_dir2_sf_verify(struct xfs_inode *ip);
>  
>  /* xfs_dir2_readdir.c */
> -extern int xfs_readdir(struct xfs_inode *dp, struct dir_context *ctx,
> -		       size_t bufsize);
> +extern int xfs_readdir(struct xfs_trans *tp, struct xfs_inode *dp,
> +		       struct dir_context *ctx, size_t bufsize);
>  
>  #endif /* __XFS_DIR2_PRIV_H__ */
> diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> index ede4790..ba2638d 100644
> --- a/fs/xfs/xfs_dir2_readdir.c
> +++ b/fs/xfs/xfs_dir2_readdir.c
> @@ -170,7 +170,7 @@ xfs_dir2_block_getdents(
>  		return 0;
>  
>  	lock_mode = xfs_ilock_data_map_shared(dp);
> -	error = xfs_dir3_block_read(NULL, dp, &bp);
> +	error = xfs_dir3_block_read(args->trans, dp, &bp);
>  	xfs_iunlock(dp, lock_mode);
>  	if (error)
>  		return error;
> @@ -228,7 +228,7 @@ xfs_dir2_block_getdents(
>  		if (!dir_emit(ctx, (char *)dep->name, dep->namelen,
>  			    be64_to_cpu(dep->inumber),
>  			    xfs_dir3_get_dtype(dp->i_mount, filetype))) {
> -			xfs_trans_brelse(NULL, bp);
> +			xfs_trans_brelse(args->trans, bp);
>  			return 0;
>  		}
>  	}
> @@ -239,7 +239,7 @@ xfs_dir2_block_getdents(
>  	 */
>  	ctx->pos = xfs_dir2_db_off_to_dataptr(geo, geo->datablk + 1, 0) &
>  								0x7fffffff;
> -	xfs_trans_brelse(NULL, bp);
> +	xfs_trans_brelse(args->trans, bp);
>  	return 0;
>  }
>  
> @@ -495,15 +495,21 @@ xfs_dir2_leaf_getdents(
>  	else
>  		ctx->pos = xfs_dir2_byte_to_dataptr(curoff) & 0x7fffffff;
>  	if (bp)
> -		xfs_trans_brelse(NULL, bp);
> +		xfs_trans_brelse(args->trans, bp);
>  	return error;
>  }
>  
>  /*
>   * Read a directory.
> + *
> + * If supplied, the transaction collects locked dir buffers to avoid
> + * nested buffer deadlocks.  This function does not dirty the
> + * transaction.  The caller should ensure that the inode is locked
> + * before calling this function.
>   */
>  int
>  xfs_readdir(
> +	struct xfs_trans	*tp,
>  	struct xfs_inode	*dp,
>  	struct dir_context	*ctx,
>  	size_t			bufsize)
> @@ -522,6 +528,7 @@ xfs_readdir(
>  
>  	args.dp = dp;
>  	args.geo = dp->i_mount->m_dir_geo;
> +	args.trans = tp;
>  
>  	if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL)
>  		rval = xfs_dir2_sf_getdents(&args, ctx);
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 5fb5a09..36c1293 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -950,7 +950,7 @@ xfs_file_readdir(
>  	 */
>  	bufsize = (size_t)min_t(loff_t, 32768, ip->i_d.di_size);
>  
> -	return xfs_readdir(ip, ctx, bufsize);
> +	return xfs_readdir(NULL, ip, ctx, bufsize);
>  }
>  
>  /*
> 

* Re: [PATCH 13/13] xfs: pass along transaction context when reading xattr block buffers
  2017-06-02 21:25 ` [PATCH 13/13] xfs: pass along transaction context when reading xattr " Darrick J. Wong
@ 2017-06-08 13:02   ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-08 13:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 02:25:20PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Teach the extended attribute reading functions to pass along a
> transaction context if one was supplied.  The extended attribute scrub
> code will use transactions to lock buffers and avoid deadlocking with
> itself in the case of loops; since it will already have the inode
> locked, also create xattr get/list helpers that don't take locks.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_attr.c        |   26 ++++++++++++-----
>  fs/xfs/libxfs/xfs_attr_remote.c |    5 ++-
>  fs/xfs/xfs_attr.h               |    3 ++
>  fs/xfs/xfs_attr_list.c          |   59 ++++++++++++++++++++++-----------------
>  4 files changed, 57 insertions(+), 36 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 6622d46..ef8a1c7 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -114,6 +114,23 @@ xfs_inode_hasattr(
>   * Overall external interface routines.
>   *========================================================================*/
>  
> +/* Retrieve an extended attribute and its value.  Must have iolock. */
> +int
> +xfs_attr_get_ilocked(
> +	struct xfs_inode	*ip,
> +	struct xfs_da_args	*args)
> +{
> +	if (!xfs_inode_hasattr(ip))
> +		return -ENOATTR;
> +	else if (ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
> +		return xfs_attr_shortform_getvalue(args);
> +	else if (xfs_bmap_one_block(ip, XFS_ATTR_FORK))
> +		return xfs_attr_leaf_get(args);
> +	else
> +		return xfs_attr_node_get(args);
> +}
> +
> +/* Retrieve an extended attribute by name, and its value. */
>  int
>  xfs_attr_get(
>  	struct xfs_inode	*ip,
> @@ -141,14 +158,7 @@ xfs_attr_get(
>  	args.op_flags = XFS_DA_OP_OKNOENT;
>  
>  	lock_mode = xfs_ilock_attr_map_shared(ip);
> -	if (!xfs_inode_hasattr(ip))
> -		error = -ENOATTR;
> -	else if (ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
> -		error = xfs_attr_shortform_getvalue(&args);
> -	else if (xfs_bmap_one_block(ip, XFS_ATTR_FORK))
> -		error = xfs_attr_leaf_get(&args);
> -	else
> -		error = xfs_attr_node_get(&args);
> +	error = xfs_attr_get_ilocked(ip, &args);
>  	xfs_iunlock(ip, lock_mode);
>  
>  	*valuelenp = args.valuelen;
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> index da72b16..5236d8e 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> @@ -386,7 +386,8 @@ xfs_attr_rmtval_get(
>  			       (map[i].br_startblock != HOLESTARTBLOCK));
>  			dblkno = XFS_FSB_TO_DADDR(mp, map[i].br_startblock);
>  			dblkcnt = XFS_FSB_TO_BB(mp, map[i].br_blockcount);
> -			error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp,
> +			error = xfs_trans_read_buf(mp, args->trans,
> +						   mp->m_ddev_targp,
>  						   dblkno, dblkcnt, 0, &bp,
>  						   &xfs_attr3_rmt_buf_ops);
>  			if (error)
> @@ -395,7 +396,7 @@ xfs_attr_rmtval_get(
>  			error = xfs_attr_rmtval_copyout(mp, bp, args->dp->i_ino,
>  							&offset, &valuelen,
>  							&dst);
> -			xfs_buf_relse(bp);
> +			xfs_trans_brelse(args->trans, bp);
>  			if (error)
>  				return error;
>  
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index d14691a..5d5a5e2 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -117,6 +117,7 @@ typedef void (*put_listent_func_t)(struct xfs_attr_list_context *, int,
>  			      unsigned char *, int, int);
>  
>  typedef struct xfs_attr_list_context {
> +	struct xfs_trans		*tp;
>  	struct xfs_inode		*dp;		/* inode */
>  	struct attrlist_cursor_kern	*cursor;	/* position in list */
>  	char				*alist;		/* output buffer */
> @@ -140,8 +141,10 @@ typedef struct xfs_attr_list_context {
>   * Overall external interface routines.
>   */
>  int xfs_attr_inactive(struct xfs_inode *dp);
> +int xfs_attr_list_int_ilocked(struct xfs_attr_list_context *);
>  int xfs_attr_list_int(struct xfs_attr_list_context *);
>  int xfs_inode_hasattr(struct xfs_inode *ip);
> +int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
>  int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
>  		 unsigned char *value, int *valuelenp, int flags);
>  int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
> diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> index 9bc1e12..545eca5 100644
> --- a/fs/xfs/xfs_attr_list.c
> +++ b/fs/xfs/xfs_attr_list.c
> @@ -230,7 +230,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
>  	 */
>  	bp = NULL;
>  	if (cursor->blkno > 0) {
> -		error = xfs_da3_node_read(NULL, dp, cursor->blkno, -1,
> +		error = xfs_da3_node_read(context->tp, dp, cursor->blkno, -1,
>  					      &bp, XFS_ATTR_FORK);
>  		if ((error != 0) && (error != -EFSCORRUPTED))
>  			return error;
> @@ -242,7 +242,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
>  			case XFS_DA_NODE_MAGIC:
>  			case XFS_DA3_NODE_MAGIC:
>  				trace_xfs_attr_list_wrong_blk(context);
> -				xfs_trans_brelse(NULL, bp);
> +				xfs_trans_brelse(context->tp, bp);
>  				bp = NULL;
>  				break;
>  			case XFS_ATTR_LEAF_MAGIC:
> @@ -254,18 +254,18 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
>  				if (cursor->hashval > be32_to_cpu(
>  						entries[leafhdr.count - 1].hashval)) {
>  					trace_xfs_attr_list_wrong_blk(context);
> -					xfs_trans_brelse(NULL, bp);
> +					xfs_trans_brelse(context->tp, bp);
>  					bp = NULL;
>  				} else if (cursor->hashval <= be32_to_cpu(
>  						entries[0].hashval)) {
>  					trace_xfs_attr_list_wrong_blk(context);
> -					xfs_trans_brelse(NULL, bp);
> +					xfs_trans_brelse(context->tp, bp);
>  					bp = NULL;
>  				}
>  				break;
>  			default:
>  				trace_xfs_attr_list_wrong_blk(context);
> -				xfs_trans_brelse(NULL, bp);
> +				xfs_trans_brelse(context->tp, bp);
>  				bp = NULL;
>  			}
>  		}
> @@ -281,7 +281,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
>  		for (;;) {
>  			uint16_t magic;
>  
> -			error = xfs_da3_node_read(NULL, dp,
> +			error = xfs_da3_node_read(context->tp, dp,
>  						      cursor->blkno, -1, &bp,
>  						      XFS_ATTR_FORK);
>  			if (error)
> @@ -297,7 +297,7 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
>  						     XFS_ERRLEVEL_LOW,
>  						     context->dp->i_mount,
>  						     node);
> -				xfs_trans_brelse(NULL, bp);
> +				xfs_trans_brelse(context->tp, bp);
>  				return -EFSCORRUPTED;
>  			}
>  
> @@ -313,10 +313,10 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
>  				}
>  			}
>  			if (i == nodehdr.count) {
> -				xfs_trans_brelse(NULL, bp);
> +				xfs_trans_brelse(context->tp, bp);
>  				return 0;
>  			}
> -			xfs_trans_brelse(NULL, bp);
> +			xfs_trans_brelse(context->tp, bp);
>  		}
>  	}
>  	ASSERT(bp != NULL);
> @@ -333,12 +333,12 @@ xfs_attr_node_list(xfs_attr_list_context_t *context)
>  		if (context->seen_enough || leafhdr.forw == 0)
>  			break;
>  		cursor->blkno = leafhdr.forw;
> -		xfs_trans_brelse(NULL, bp);
> -		error = xfs_attr3_leaf_read(NULL, dp, cursor->blkno, -1, &bp);
> +		xfs_trans_brelse(context->tp, bp);
> +		error = xfs_attr3_leaf_read(context->tp, dp, cursor->blkno, -1, &bp);
>  		if (error)
>  			return error;
>  	}
> -	xfs_trans_brelse(NULL, bp);
> +	xfs_trans_brelse(context->tp, bp);
>  	return 0;
>  }
>  
> @@ -448,16 +448,34 @@ xfs_attr_leaf_list(xfs_attr_list_context_t *context)
>  	trace_xfs_attr_leaf_list(context);
>  
>  	context->cursor->blkno = 0;
> -	error = xfs_attr3_leaf_read(NULL, context->dp, 0, -1, &bp);
> +	error = xfs_attr3_leaf_read(context->tp, context->dp, 0, -1, &bp);
>  	if (error)
>  		return error;
>  
>  	xfs_attr3_leaf_list_int(bp, context);
> -	xfs_trans_brelse(NULL, bp);
> +	xfs_trans_brelse(context->tp, bp);
>  	return 0;
>  }
>  
>  int
> +xfs_attr_list_int_ilocked(
> +	struct xfs_attr_list_context	*context)
> +{
> +	struct xfs_inode		*dp = context->dp;
> +
> +	/*
> +	 * Decide on what work routines to call based on the inode size.
> +	 */
> +	if (!xfs_inode_hasattr(dp))
> +		return 0;
> +	else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
> +		return xfs_attr_shortform_list(context);
> +	else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> +		return xfs_attr_leaf_list(context);
> +	return xfs_attr_node_list(context);
> +}
> +
> +int
>  xfs_attr_list_int(
>  	xfs_attr_list_context_t *context)
>  {
> @@ -470,19 +488,8 @@ xfs_attr_list_int(
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>  		return -EIO;
>  
> -	/*
> -	 * Decide on what work routines to call based on the inode size.
> -	 */
>  	lock_mode = xfs_ilock_attr_map_shared(dp);
> -	if (!xfs_inode_hasattr(dp)) {
> -		error = 0;
> -	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
> -		error = xfs_attr_shortform_list(context);
> -	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> -		error = xfs_attr_leaf_list(context);
> -	} else {
> -		error = xfs_attr_node_list(context);
> -	}
> +	error = xfs_attr_list_int_ilocked(context);
>  	xfs_iunlock(dp, lock_mode);
>  	return error;
>  }
> 

* Re: [PATCH 14/13] xfs: allow reading of already-locked remote symbolic link
  2017-06-02 22:19 ` [PATCH 14/13] xfs: allow reading of already-locked remote symbolic link Darrick J. Wong
@ 2017-06-08 13:02   ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-08 13:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 02, 2017 at 03:19:42PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Expose the readlink variant that doesn't take the inode lock so that
> the scrubber can inspect symlink contents.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_symlink.c |    6 +++---
>  fs/xfs/xfs_symlink.h |    1 +
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
> index f2cb45e..49380485 100644
> --- a/fs/xfs/xfs_symlink.c
> +++ b/fs/xfs/xfs_symlink.c
> @@ -43,8 +43,8 @@
>  #include "xfs_log.h"
>  
>  /* ----- Kernel only functions below ----- */
> -STATIC int
> -xfs_readlink_bmap(
> +int
> +xfs_readlink_bmap_ilocked(
>  	struct xfs_inode	*ip,
>  	char			*link)
>  {
> @@ -153,7 +153,7 @@ xfs_readlink(
>  	}
>  
>  
> -	error = xfs_readlink_bmap(ip, link);
> +	error = xfs_readlink_bmap_ilocked(ip, link);
>  
>   out:
>  	xfs_iunlock(ip, XFS_ILOCK_SHARED);
> diff --git a/fs/xfs/xfs_symlink.h b/fs/xfs/xfs_symlink.h
> index e75245d..aeaee89 100644
> --- a/fs/xfs/xfs_symlink.h
> +++ b/fs/xfs/xfs_symlink.h
> @@ -21,6 +21,7 @@
>  
>  int xfs_symlink(struct xfs_inode *dp, struct xfs_name *link_name,
>  		const char *target_path, umode_t mode, struct xfs_inode **ipp);
> +int xfs_readlink_bmap_ilocked(struct xfs_inode *ip, char *link);
>  int xfs_readlink(struct xfs_inode *ip, char *link);
>  int xfs_inactive_symlink(struct xfs_inode *ip);
>  

* Re: [PATCH 11/13] xfs: return the hash value of a leaf1 directory block
  2017-06-08 13:02   ` Brian Foster
@ 2017-06-08 15:53     ` Darrick J. Wong
  2017-06-08 16:31       ` Brian Foster
  0 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-08 15:53 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Jun 08, 2017 at 09:02:26AM -0400, Brian Foster wrote:
> On Fri, Jun 02, 2017 at 02:25:08PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Provide a way to calculate the highest hash value of a leaf1 block.
> > This will be used by the directory scrubbing code to check the sanity
> > of hashes in leaf1 directory blocks.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_dir2_node.c |   28 ++++++++++++++++++++++++++++
> >  fs/xfs/libxfs/xfs_dir2_priv.h |    2 ++
> >  2 files changed, 30 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > index bbd1238..15c1881 100644
> > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > @@ -524,6 +524,34 @@ xfs_dir2_free_hdr_check(
> >  #endif	/* DEBUG */
> >  
> >  /*
> > + * Return the last hash value in the leaf1.
> > + * Stale entries are ok.
> > + */
> > +xfs_dahash_t					/* hash value */
> > +xfs_dir2_leaf1_lasthash(
> > +	struct xfs_inode	*dp,
> > +	struct xfs_buf		*bp,		/* leaf buffer */
> > +	int			*count)		/* count of entries in leaf */
> > +{
> > +	struct xfs_dir2_leaf	*leaf = bp->b_addr;
> > +	struct xfs_dir2_leaf_entry *ents;
> > +	struct xfs_dir3_icleaf_hdr leafhdr;
> > +
> > +	dp->d_ops->leaf_hdr_from_disk(&leafhdr, leaf);
> > +
> > +	ASSERT(leafhdr.magic == XFS_DIR2_LEAF1_MAGIC ||
> > +	       leafhdr.magic == XFS_DIR3_LEAF1_MAGIC);
> > +
> 
> It looks like the assert is the only difference between this function
> and xfs_dir2_leafn_lasthash(). It seems like overkill to me to duplicate
> just for that. How about we fix up the assert to cover the additional
> magics (and maybe rename _leafn_lasthash() to _leaf_lasthash() if
> appropriate)?
> 
> Actually, taking a closer look, ->leaf_hdr_from_disk() already asserts
> on the appropriate LEAF1/LEAFN magic based on the callback that is
> specified. ISTM that we could also just kill the _lasthash() assert.

The ASSERTs in the _lasthash functions and in leaf_hdr_from_disk aren't
testing quite the same things.  The asserts in _dir2_leaf[1n]_lasthash
check that we actually passed it a leaf1 or leafn block, respectively.
The asserts in _dir[23]_leaf_hdr_from_disk check that we actually fed it
a dir2 or dir3 leaf* block without caring whether it's leaf1 or leafn.
That's why I didn't just get rid of the assert and rename the function
xfs_dir2_leaf_lasthash().

I suppose we could just make a single parent function that takes the two
magics it wants to see and have _dir2_leaf[1n]_lasthash call the parent
function with the magic numbers they want to check.  How does that
sound?

--D

> 
> Brian
> 
> > +	if (count)
> > +		*count = leafhdr.count;
> > +	if (!leafhdr.count)
> > +		return 0;
> > +
> > +	ents = dp->d_ops->leaf_ents_p(leaf);
> > +	return be32_to_cpu(ents[leafhdr.count - 1].hashval);
> > +}
> > +
> > +/*
> >   * Return the last hash value in the leaf.
> >   * Stale entries are ok.
> >   */
> > diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
> > index 576f2d2..c09bca1 100644
> > --- a/fs/xfs/libxfs/xfs_dir2_priv.h
> > +++ b/fs/xfs/libxfs/xfs_dir2_priv.h
> > @@ -95,6 +95,8 @@ extern bool xfs_dir3_leaf_check_int(struct xfs_mount *mp, struct xfs_inode *dp,
> >  /* xfs_dir2_node.c */
> >  extern int xfs_dir2_leaf_to_node(struct xfs_da_args *args,
> >  		struct xfs_buf *lbp);
> > +extern xfs_dahash_t xfs_dir2_leaf1_lasthash(struct xfs_inode *dp,
> > +		struct xfs_buf *bp, int *count);
> >  extern xfs_dahash_t xfs_dir2_leafn_lasthash(struct xfs_inode *dp,
> >  		struct xfs_buf *bp, int *count);
> >  extern int xfs_dir2_leafn_lookup_int(struct xfs_buf *bp,
> > 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 11/13] xfs: return the hash value of a leaf1 directory block
  2017-06-08 15:53     ` Darrick J. Wong
@ 2017-06-08 16:31       ` Brian Foster
  2017-06-08 16:43         ` Darrick J. Wong
  0 siblings, 1 reply; 56+ messages in thread
From: Brian Foster @ 2017-06-08 16:31 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Jun 08, 2017 at 08:53:59AM -0700, Darrick J. Wong wrote:
> On Thu, Jun 08, 2017 at 09:02:26AM -0400, Brian Foster wrote:
> > On Fri, Jun 02, 2017 at 02:25:08PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Provide a way to calculate the highest hash value of a leaf1 block.
> > > This will be used by the directory scrubbing code to check the sanity
> > > of hashes in leaf1 directory blocks.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_dir2_node.c |   28 ++++++++++++++++++++++++++++
> > >  fs/xfs/libxfs/xfs_dir2_priv.h |    2 ++
> > >  2 files changed, 30 insertions(+)
> > > 
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > > index bbd1238..15c1881 100644
> > > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > > @@ -524,6 +524,34 @@ xfs_dir2_free_hdr_check(
> > >  #endif	/* DEBUG */
> > >  
> > >  /*
> > > + * Return the last hash value in the leaf1.
> > > + * Stale entries are ok.
> > > + */
> > > +xfs_dahash_t					/* hash value */
> > > +xfs_dir2_leaf1_lasthash(
> > > +	struct xfs_inode	*dp,
> > > +	struct xfs_buf		*bp,		/* leaf buffer */
> > > +	int			*count)		/* count of entries in leaf */
> > > +{
> > > +	struct xfs_dir2_leaf	*leaf = bp->b_addr;
> > > +	struct xfs_dir2_leaf_entry *ents;
> > > +	struct xfs_dir3_icleaf_hdr leafhdr;
> > > +
> > > +	dp->d_ops->leaf_hdr_from_disk(&leafhdr, leaf);
> > > +
> > > +	ASSERT(leafhdr.magic == XFS_DIR2_LEAF1_MAGIC ||
> > > +	       leafhdr.magic == XFS_DIR3_LEAF1_MAGIC);
> > > +
> > 
> > It looks like the assert is the only difference between this function
> > and xfs_dir2_leafn_lasthash(). It seems like overkill to me to duplicate
> > just for that. How about we fix up the assert to cover the additional
> > magics (and maybe rename _leafn_lasthash() to _leaf_lasthash() if
> > appropriate)?
> > 
> > Actually, taking a closer look, ->leaf_hdr_from_disk() already asserts
> > on the appropriate LEAF1/LEAFN magic based on the callback that is
> > specified. ISTM that we could also just kill the _lasthash() assert.
> 
> The ASSERTs in the _lasthash functions and in leaf_hdr_from_disk aren't
> testing quite the same things.  The asserts in _dir2_leaf[1n]_lasthash
> check that we actually passed it a leaf1 or leafn block, respectively.
> The asserts in _dir[23]_leaf_hdr_from_disk check that we actually fed it
> a dir2 or dir3 leaf* block without caring whether it's leaf1 or leafn.
> That's why I didn't just get rid of the assert and rename the function
> xfs_dir2_leaf_lasthash().
> 

Yeah, I'm aware they are not exactly equivalent. I was more thinking
that we still have some assert protection if something is blatantly
wrong (e.g., corruption or some non-dir block). As it is, if the
_leaf1_lasthash() assert fails because we passed a leafn block to the
function, the solution presumably is to use the leafn function that
basically does the same thing (modulo the assert), right?

In other words, what's the value of asserting on the magics between two
functions that handle either format in the exact same way? If there is
value somewhere, it sounds like perhaps it's more for the benefit of the
caller than of the helper itself (which is reasonable, I think, but
still doesn't justify the duplication IMO).

> I suppose we could just make a single parent function that takes the two
> magics it wants to see and have _dir2_leaf[1n]_lasthash call the parent
> function with the magic numbers they want to check.  How does that
> sound?
> 

Do I understand correctly that you mean an "internal" function that
receives the expected magic as a param (for the assert) and a couple
leaf[1|n]_lasthash() wrappers that pass the associated LEAF[1|N] magics?
If so, that sounds reasonable to me if you'd really prefer to keep the
isolated asserts. I'm more just trying to avoid the code duplication.
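
To make sure we mean the same thing, the wrapper arrangement could be
sketched roughly as below.  This is a standalone userspace model, not
kernel code: struct demo_leaf, the DEMO_* magic values, and every demo_*
function name are hypothetical stand-ins, and the leaf is assumed to be
already decoded to host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magics; the values are placeholders, not the on-disk ones. */
enum demo_magic {
	DEMO_DIR2_LEAF1_MAGIC = 1,
	DEMO_DIR3_LEAF1_MAGIC,
	DEMO_DIR2_LEAFN_MAGIC,
	DEMO_DIR3_LEAFN_MAGIC,
};

/* Simplified leaf block, already converted to host byte order. */
struct demo_leaf {
	enum demo_magic	magic;
	int		count;		/* number of entries */
	const uint32_t	*hashvals;	/* sorted in ascending order */
};

/* Shared body: the caller names the two magics it is willing to accept. */
static uint32_t
demo_leaf_lasthash_int(
	const struct demo_leaf	*leaf,
	enum demo_magic		magic2,
	enum demo_magic		magic3,
	int			*count)
{
	assert(leaf->magic == magic2 || leaf->magic == magic3);

	if (count)
		*count = leaf->count;
	if (!leaf->count)
		return 0;
	return leaf->hashvals[leaf->count - 1];
}

/* Wrapper that only accepts leaf1 blocks. */
uint32_t
demo_leaf1_lasthash(
	const struct demo_leaf	*leaf,
	int			*count)
{
	return demo_leaf_lasthash_int(leaf, DEMO_DIR2_LEAF1_MAGIC,
				      DEMO_DIR3_LEAF1_MAGIC, count);
}

/* Wrapper that only accepts leafn blocks. */
uint32_t
demo_leafn_lasthash(
	const struct demo_leaf	*leaf,
	int			*count)
{
	return demo_leaf_lasthash_int(leaf, DEMO_DIR2_LEAFN_MAGIC,
				      DEMO_DIR3_LEAFN_MAGIC, count);
}
```

With this shape, handing a leafn block to demo_leaf1_lasthash() trips the
assert, while the hash lookup itself lives in exactly one place.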

Brian

> --D
> 
> > 
> > Brian
> > 
> > > +	if (count)
> > > +		*count = leafhdr.count;
> > > +	if (!leafhdr.count)
> > > +		return 0;
> > > +
> > > +	ents = dp->d_ops->leaf_ents_p(leaf);
> > > +	return be32_to_cpu(ents[leafhdr.count - 1].hashval);
> > > +}
> > > +
> > > +/*
> > >   * Return the last hash value in the leaf.
> > >   * Stale entries are ok.
> > >   */
> > > diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
> > > index 576f2d2..c09bca1 100644
> > > --- a/fs/xfs/libxfs/xfs_dir2_priv.h
> > > +++ b/fs/xfs/libxfs/xfs_dir2_priv.h
> > > @@ -95,6 +95,8 @@ extern bool xfs_dir3_leaf_check_int(struct xfs_mount *mp, struct xfs_inode *dp,
> > >  /* xfs_dir2_node.c */
> > >  extern int xfs_dir2_leaf_to_node(struct xfs_da_args *args,
> > >  		struct xfs_buf *lbp);
> > > +extern xfs_dahash_t xfs_dir2_leaf1_lasthash(struct xfs_inode *dp,
> > > +		struct xfs_buf *bp, int *count);
> > >  extern xfs_dahash_t xfs_dir2_leafn_lasthash(struct xfs_inode *dp,
> > >  		struct xfs_buf *bp, int *count);
> > >  extern int xfs_dir2_leafn_lookup_int(struct xfs_buf *bp,
> > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 11/13] xfs: return the hash value of a leaf1 directory block
  2017-06-08 16:31       ` Brian Foster
@ 2017-06-08 16:43         ` Darrick J. Wong
  2017-06-08 16:52           ` Brian Foster
  0 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-08 16:43 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Jun 08, 2017 at 12:31:50PM -0400, Brian Foster wrote:
> On Thu, Jun 08, 2017 at 08:53:59AM -0700, Darrick J. Wong wrote:
> > On Thu, Jun 08, 2017 at 09:02:26AM -0400, Brian Foster wrote:
> > > On Fri, Jun 02, 2017 at 02:25:08PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Provide a way to calculate the highest hash value of a leaf1 block.
> > > > This will be used by the directory scrubbing code to check the sanity
> > > > of hashes in leaf1 directory blocks.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  fs/xfs/libxfs/xfs_dir2_node.c |   28 ++++++++++++++++++++++++++++
> > > >  fs/xfs/libxfs/xfs_dir2_priv.h |    2 ++
> > > >  2 files changed, 30 insertions(+)
> > > > 
> > > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > index bbd1238..15c1881 100644
> > > > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > > > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > @@ -524,6 +524,34 @@ xfs_dir2_free_hdr_check(
> > > >  #endif	/* DEBUG */
> > > >  
> > > >  /*
> > > > + * Return the last hash value in the leaf1.
> > > > + * Stale entries are ok.
> > > > + */
> > > > +xfs_dahash_t					/* hash value */
> > > > +xfs_dir2_leaf1_lasthash(
> > > > +	struct xfs_inode	*dp,
> > > > +	struct xfs_buf		*bp,		/* leaf buffer */
> > > > +	int			*count)		/* count of entries in leaf */
> > > > +{
> > > > +	struct xfs_dir2_leaf	*leaf = bp->b_addr;
> > > > +	struct xfs_dir2_leaf_entry *ents;
> > > > +	struct xfs_dir3_icleaf_hdr leafhdr;
> > > > +
> > > > +	dp->d_ops->leaf_hdr_from_disk(&leafhdr, leaf);
> > > > +
> > > > +	ASSERT(leafhdr.magic == XFS_DIR2_LEAF1_MAGIC ||
> > > > +	       leafhdr.magic == XFS_DIR3_LEAF1_MAGIC);
> > > > +
> > > 
> > > It looks like the assert is the only difference between this function
> > > and xfs_dir2_leafn_lasthash(). It seems like overkill to me to duplicate
> > > just for that. How about we fix up the assert to cover the additional
> > > magics (and maybe rename _leafn_lasthash() to _leaf_lasthash() if
> > > appropriate)?
> > > 
> > > Actually, taking a closer look, ->leaf_hdr_from_disk() already asserts
> > > on the appropriate LEAF1/LEAFN magic based on the callback that is
> > > specified. ISTM that we could also just kill the _lasthash() assert.
> > 
> > The ASSERTs in the _lasthash functions and in leaf_hdr_from_disk aren't
> > testing quite the same things.  The asserts in _dir2_leaf[1n]_lasthash
> > check that we actually passed it a leaf1 or leafn block, respectively.
> > The asserts in _dir[23]_leaf_hdr_from_disk check that we actually fed it
> > a dir2 or dir3 leaf* block without caring whether it's leaf1 or leafn.
> > That's why I didn't just get rid of the assert and rename the function
> > xfs_dir2_leaf_lasthash().
> > 
> 
> Yeah, I'm aware they are not exactly equivalent. I was more thinking
> that we still have some assert protection if something is blatantly
> wrong (e.g., corruption or some non-dir block). As it is, if the
> _leaf1_lasthash() assert fails because we passed a leafn block to the
> function, the solution presumably is to use the leafn function that
> basically does the same thing (modulo the assert), right?

Certainly that seems like the correct caller code fix.

> In other words, what's the value of asserting on the magics between two
> functions that handle either format in the exact same way? If there is
> value somewhere, it sounds like perhaps it's more for the benefit of the
> caller than of the helper itself (which is reasonable, I think, but
> still doesn't justify the duplication IMO).

I wanted to be cautious about removing ASSERTs from functions.  Having
talked about this with you, I now feel emboldened enough to take your
original suggestion to simply combine the two functions. :)

> > I suppose we could just make a single parent function that takes the two
> > magics it wants to see and have _dir2_leaf[1n]_lasthash call the parent
> > function with the magic numbers they want to check.  How does that
> > sound?
> > 
> 
> Do I understand correctly that you mean an "internal" function that
> receives the expected magic as a param (for the assert) and a couple
> leaf[1|n]_lasthash() wrappers that pass the associated LEAF[1|N] magics?
> If so, that sounds reasonable to me if you'd really prefer to keep the
> isolated asserts. I'm more just trying to avoid the code duplication.

<nod> Eh, I'll just collapse both of them into a single _lasthash
function that doesn't care if it's passed a leaf1 or a leafn.

--D
> 
> Brian
> 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > > +	if (count)
> > > > +		*count = leafhdr.count;
> > > > +	if (!leafhdr.count)
> > > > +		return 0;
> > > > +
> > > > +	ents = dp->d_ops->leaf_ents_p(leaf);
> > > > +	return be32_to_cpu(ents[leafhdr.count - 1].hashval);
> > > > +}
> > > > +
> > > > +/*
> > > >   * Return the last hash value in the leaf.
> > > >   * Stale entries are ok.
> > > >   */
> > > > diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
> > > > index 576f2d2..c09bca1 100644
> > > > --- a/fs/xfs/libxfs/xfs_dir2_priv.h
> > > > +++ b/fs/xfs/libxfs/xfs_dir2_priv.h
> > > > @@ -95,6 +95,8 @@ extern bool xfs_dir3_leaf_check_int(struct xfs_mount *mp, struct xfs_inode *dp,
> > > >  /* xfs_dir2_node.c */
> > > >  extern int xfs_dir2_leaf_to_node(struct xfs_da_args *args,
> > > >  		struct xfs_buf *lbp);
> > > > +extern xfs_dahash_t xfs_dir2_leaf1_lasthash(struct xfs_inode *dp,
> > > > +		struct xfs_buf *bp, int *count);
> > > >  extern xfs_dahash_t xfs_dir2_leafn_lasthash(struct xfs_inode *dp,
> > > >  		struct xfs_buf *bp, int *count);
> > > >  extern int xfs_dir2_leafn_lookup_int(struct xfs_buf *bp,
> > > > 
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 11/13] xfs: return the hash value of a leaf1 directory block
  2017-06-08 16:43         ` Darrick J. Wong
@ 2017-06-08 16:52           ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-08 16:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Jun 08, 2017 at 09:43:07AM -0700, Darrick J. Wong wrote:
> On Thu, Jun 08, 2017 at 12:31:50PM -0400, Brian Foster wrote:
> > On Thu, Jun 08, 2017 at 08:53:59AM -0700, Darrick J. Wong wrote:
> > > On Thu, Jun 08, 2017 at 09:02:26AM -0400, Brian Foster wrote:
> > > > On Fri, Jun 02, 2017 at 02:25:08PM -0700, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > Provide a way to calculate the highest hash value of a leaf1 block.
> > > > > This will be used by the directory scrubbing code to check the sanity
> > > > > of hashes in leaf1 directory blocks.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > ---
> > > > >  fs/xfs/libxfs/xfs_dir2_node.c |   28 ++++++++++++++++++++++++++++
> > > > >  fs/xfs/libxfs/xfs_dir2_priv.h |    2 ++
> > > > >  2 files changed, 30 insertions(+)
> > > > > 
> > > > > 
> > > > > diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > index bbd1238..15c1881 100644
> > > > > --- a/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> > > > > @@ -524,6 +524,34 @@ xfs_dir2_free_hdr_check(
> > > > >  #endif	/* DEBUG */
> > > > >  
> > > > >  /*
> > > > > + * Return the last hash value in the leaf1.
> > > > > + * Stale entries are ok.
> > > > > + */
> > > > > +xfs_dahash_t					/* hash value */
> > > > > +xfs_dir2_leaf1_lasthash(
> > > > > +	struct xfs_inode	*dp,
> > > > > +	struct xfs_buf		*bp,		/* leaf buffer */
> > > > > +	int			*count)		/* count of entries in leaf */
> > > > > +{
> > > > > +	struct xfs_dir2_leaf	*leaf = bp->b_addr;
> > > > > +	struct xfs_dir2_leaf_entry *ents;
> > > > > +	struct xfs_dir3_icleaf_hdr leafhdr;
> > > > > +
> > > > > +	dp->d_ops->leaf_hdr_from_disk(&leafhdr, leaf);
> > > > > +
> > > > > +	ASSERT(leafhdr.magic == XFS_DIR2_LEAF1_MAGIC ||
> > > > > +	       leafhdr.magic == XFS_DIR3_LEAF1_MAGIC);
> > > > > +
> > > > 
> > > > It looks like the assert is the only difference between this function
> > > > and xfs_dir2_leafn_lasthash(). It seems like overkill to me to duplicate
> > > > just for that. How about we fix up the assert to cover the additional
> > > > magics (and maybe rename _leafn_lasthash() to _leaf_lasthash() if
> > > > appropriate)?
> > > > 
> > > > Actually, taking a closer look, ->leaf_hdr_from_disk() already asserts
> > > > on the appropriate LEAF1/LEAFN magic based on the callback that is
> > > > specified. ISTM that we could also just kill the _lasthash() assert.
> > > 
> > > The ASSERTs in the _lasthash functions and in leaf_hdr_from_disk aren't
> > > testing quite the same things.  The asserts in _dir2_leaf[1n]_lasthash
> > > check that we actually passed it a leaf1 or leafn block, respectively.
> > > The asserts in _dir[23]_leaf_hdr_from_disk check that we actually fed it
> > > a dir2 or dir3 leaf* block without caring whether it's leaf1 or leafn.
> > > That's why I didn't just get rid of the assert and rename the function
> > > xfs_dir2_leaf_lasthash().
> > > 
> > 
> > Yeah, I'm aware they are not exactly equivalent. I was more thinking
> > that we still have some assert protection if something is blatantly
> > wrong (e.g., corruption or some non-dir block). As it is, if the
> > _leaf1_lasthash() assert fails because we passed a leafn block to the
> > function, the solution presumably is to use the leafn function that
> > basically does the same thing (modulo the assert), right?
> 
> Certainly that seems like the correct caller code fix.
> 
> > In other words, what's the value of asserting on the magics between two
> > functions that handle either format in the exact same way? If there is
> > value somewhere, it sounds like perhaps it's more for the benefit of the
> > caller than of the helper itself (which is reasonable, I think, but
> > still doesn't justify the duplication IMO).
> 
> I wanted to be cautious about removing ASSERTs from functions.  Having
> talked about this with you, I now feel emboldened enough to take your
> original suggestion to simply combine the two functions. :)
> 

Heh, sounds good. Note that we can just update the assert to cover the
additional magics rather than remove it entirely (if that isn't what you
planned to do already).
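
For illustration, the combined variant with the widened assert might
look something like the following.  It is a self-contained userspace
sketch under stated assumptions: the DEMO_* magics, struct demo_leaf,
struct demo_leaf_entry, and demo_be32_to_cpu() are made-up stand-ins for
the real definitions, and each hash is stored as four big-endian bytes
to mimic the on-disk layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magics; the values are placeholders, not the on-disk ones. */
enum demo_magic {
	DEMO_DIR2_LEAF1_MAGIC = 1,
	DEMO_DIR3_LEAF1_MAGIC,
	DEMO_DIR2_LEAFN_MAGIC,
	DEMO_DIR3_LEAFN_MAGIC,
};

/* One leaf entry; the hash is kept as four big-endian bytes. */
struct demo_leaf_entry {
	uint8_t		hashval_be[4];
};

struct demo_leaf {
	enum demo_magic			magic;
	int				count;	/* number of entries */
	const struct demo_leaf_entry	*ents;	/* sorted by hash */
};

/* Portable big-endian decode, standing in for the kernel's be32_to_cpu(). */
static uint32_t
demo_be32_to_cpu(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Single helper that accepts either a leaf1 or a leafn block. */
uint32_t
demo_leaf_lasthash(
	const struct demo_leaf	*leaf,
	int			*count)
{
	assert(leaf->magic == DEMO_DIR2_LEAFN_MAGIC ||
	       leaf->magic == DEMO_DIR3_LEAFN_MAGIC ||
	       leaf->magic == DEMO_DIR2_LEAF1_MAGIC ||
	       leaf->magic == DEMO_DIR3_LEAF1_MAGIC);

	if (count)
		*count = leaf->count;
	if (!leaf->count)
		return 0;	/* empty leaf: no last hash to report */
	return demo_be32_to_cpu(leaf->ents[leaf->count - 1].hashval_be);
}
```

That keeps the protection against a blatantly wrong block while letting
a caller pass either leaf format, and an empty leaf still reports a
count of zero.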

Brian

> > > I suppose we could just make a single parent function that takes the two
> > > magics it wants to see and have _dir2_leaf[1n]_lasthash call the parent
> > > function with the magic numbers they want to check.  How does that
> > > sound?
> > > 
> > 
> > Do I understand correctly that you mean an "internal" function that
> > receives the expected magic as a param (for the assert) and a couple
> > leaf[1|n]_lasthash() wrappers that pass the associated LEAF[1|N] magics?
> > If so, that sounds reasonable to me if you'd really prefer to keep the
> > isolated asserts. I'm more just trying to avoid the code duplication.
> 
> <nod> Eh, I'll just collapse both of them into a single _lasthash
> function that doesn't care if it's passed a leaf1 or a leafn.
> 
> --D
> > 
> > Brian
> > 
> > > --D
> > > 
> > > > 
> > > > Brian
> > > > 
> > > > > +	if (count)
> > > > > +		*count = leafhdr.count;
> > > > > +	if (!leafhdr.count)
> > > > > +		return 0;
> > > > > +
> > > > > +	ents = dp->d_ops->leaf_ents_p(leaf);
> > > > > +	return be32_to_cpu(ents[leafhdr.count - 1].hashval);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > >   * Return the last hash value in the leaf.
> > > > >   * Stale entries are ok.
> > > > >   */
> > > > > diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
> > > > > index 576f2d2..c09bca1 100644
> > > > > --- a/fs/xfs/libxfs/xfs_dir2_priv.h
> > > > > +++ b/fs/xfs/libxfs/xfs_dir2_priv.h
> > > > > @@ -95,6 +95,8 @@ extern bool xfs_dir3_leaf_check_int(struct xfs_mount *mp, struct xfs_inode *dp,
> > > > >  /* xfs_dir2_node.c */
> > > > >  extern int xfs_dir2_leaf_to_node(struct xfs_da_args *args,
> > > > >  		struct xfs_buf *lbp);
> > > > > +extern xfs_dahash_t xfs_dir2_leaf1_lasthash(struct xfs_inode *dp,
> > > > > +		struct xfs_buf *bp, int *count);
> > > > >  extern xfs_dahash_t xfs_dir2_leafn_lasthash(struct xfs_inode *dp,
> > > > >  		struct xfs_buf *bp, int *count);
> > > > >  extern int xfs_dir2_leafn_lookup_int(struct xfs_buf *bp,
> > > > > 
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v2 11/13] xfs: return the hash value of a leaf1 directory block
  2017-06-02 21:25 ` [PATCH 11/13] xfs: return the hash value of a leaf1 directory block Darrick J. Wong
  2017-06-08 13:02   ` Brian Foster
@ 2017-06-08 18:22   ` Darrick J. Wong
  2017-06-09 12:54     ` Brian Foster
  1 sibling, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-08 18:22 UTC (permalink / raw)
  To: linux-xfs; +Cc: Brian Foster

Modify the existing dir leafn lasthash function to enable us to
calculate the highest hash value of a leaf1 block.  This will be used by
the directory scrubbing code to check the sanity of hashes in leaf1
directory blocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_da_btree.c  |   10 +++++-----
 fs/xfs/libxfs/xfs_dir2_node.c |   10 ++++++----
 fs/xfs/libxfs/xfs_dir2_priv.h |    2 +-
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 48f1136..356f21d 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -1282,7 +1282,7 @@ xfs_da3_fixhashpath(
 			return;
 		break;
 	case XFS_DIR2_LEAFN_MAGIC:
-		lasthash = xfs_dir2_leafn_lasthash(dp, blk->bp, &count);
+		lasthash = xfs_dir2_leaf_lasthash(dp, blk->bp, &count);
 		if (count == 0)
 			return;
 		break;
@@ -1502,8 +1502,8 @@ xfs_da3_node_lookup_int(
 		if (blk->magic == XFS_DIR2_LEAFN_MAGIC ||
 		    blk->magic == XFS_DIR3_LEAFN_MAGIC) {
 			blk->magic = XFS_DIR2_LEAFN_MAGIC;
-			blk->hashval = xfs_dir2_leafn_lasthash(args->dp,
-							       blk->bp, NULL);
+			blk->hashval = xfs_dir2_leaf_lasthash(args->dp,
+							      blk->bp, NULL);
 			break;
 		}
 
@@ -1929,8 +1929,8 @@ xfs_da3_path_shift(
 			blk->magic = XFS_DIR2_LEAFN_MAGIC;
 			ASSERT(level == path->active-1);
 			blk->index = 0;
-			blk->hashval = xfs_dir2_leafn_lasthash(args->dp,
-							       blk->bp, NULL);
+			blk->hashval = xfs_dir2_leaf_lasthash(args->dp,
+							      blk->bp, NULL);
 			break;
 		default:
 			ASSERT(0);
diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index bbd1238..682e2bf 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -528,7 +528,7 @@ xfs_dir2_free_hdr_check(
  * Stale entries are ok.
  */
 xfs_dahash_t					/* hash value */
-xfs_dir2_leafn_lasthash(
+xfs_dir2_leaf_lasthash(
 	struct xfs_inode *dp,
 	struct xfs_buf	*bp,			/* leaf buffer */
 	int		*count)			/* count of entries in leaf */
@@ -540,7 +540,9 @@ xfs_dir2_leafn_lasthash(
 	dp->d_ops->leaf_hdr_from_disk(&leafhdr, leaf);
 
 	ASSERT(leafhdr.magic == XFS_DIR2_LEAFN_MAGIC ||
-	       leafhdr.magic == XFS_DIR3_LEAFN_MAGIC);
+	       leafhdr.magic == XFS_DIR3_LEAFN_MAGIC ||
+	       leafhdr.magic == XFS_DIR2_LEAF1_MAGIC ||
+	       leafhdr.magic == XFS_DIR3_LEAF1_MAGIC);
 
 	if (count)
 		*count = leafhdr.count;
@@ -1405,8 +1407,8 @@ xfs_dir2_leafn_split(
 	/*
 	 * Update last hashval in each block since we added the name.
 	 */
-	oldblk->hashval = xfs_dir2_leafn_lasthash(dp, oldblk->bp, NULL);
-	newblk->hashval = xfs_dir2_leafn_lasthash(dp, newblk->bp, NULL);
+	oldblk->hashval = xfs_dir2_leaf_lasthash(dp, oldblk->bp, NULL);
+	newblk->hashval = xfs_dir2_leaf_lasthash(dp, newblk->bp, NULL);
 	xfs_dir3_leaf_check(dp, oldblk->bp);
 	xfs_dir3_leaf_check(dp, newblk->bp);
 	return error;
diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
index 576f2d2..6d24209 100644
--- a/fs/xfs/libxfs/xfs_dir2_priv.h
+++ b/fs/xfs/libxfs/xfs_dir2_priv.h
@@ -95,7 +95,7 @@ extern bool xfs_dir3_leaf_check_int(struct xfs_mount *mp, struct xfs_inode *dp,
 /* xfs_dir2_node.c */
 extern int xfs_dir2_leaf_to_node(struct xfs_da_args *args,
 		struct xfs_buf *lbp);
-extern xfs_dahash_t xfs_dir2_leafn_lasthash(struct xfs_inode *dp,
+extern xfs_dahash_t xfs_dir2_leaf_lasthash(struct xfs_inode *dp,
 		struct xfs_buf *bp, int *count);
 extern int xfs_dir2_leafn_lookup_int(struct xfs_buf *bp,
 		struct xfs_da_args *args, int *indexp,

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 11/13] xfs: return the hash value of a leaf1 directory block
  2017-06-08 18:22   ` [PATCH v2 " Darrick J. Wong
@ 2017-06-09 12:54     ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-09 12:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Jun 08, 2017 at 11:22:36AM -0700, Darrick J. Wong wrote:
> Modify the existing dir leafn lasthash function to enable us to
> calculate the highest hash value of a leaf1 block.  This will be used by
> the directory scrubbing code to check the sanity of hashes in leaf1
> directory blocks.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Thanks for the update.

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_da_btree.c  |   10 +++++-----
>  fs/xfs/libxfs/xfs_dir2_node.c |   10 ++++++----
>  fs/xfs/libxfs/xfs_dir2_priv.h |    2 +-
>  3 files changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
> index 48f1136..356f21d 100644
> --- a/fs/xfs/libxfs/xfs_da_btree.c
> +++ b/fs/xfs/libxfs/xfs_da_btree.c
> @@ -1282,7 +1282,7 @@ xfs_da3_fixhashpath(
>  			return;
>  		break;
>  	case XFS_DIR2_LEAFN_MAGIC:
> -		lasthash = xfs_dir2_leafn_lasthash(dp, blk->bp, &count);
> +		lasthash = xfs_dir2_leaf_lasthash(dp, blk->bp, &count);
>  		if (count == 0)
>  			return;
>  		break;
> @@ -1502,8 +1502,8 @@ xfs_da3_node_lookup_int(
>  		if (blk->magic == XFS_DIR2_LEAFN_MAGIC ||
>  		    blk->magic == XFS_DIR3_LEAFN_MAGIC) {
>  			blk->magic = XFS_DIR2_LEAFN_MAGIC;
> -			blk->hashval = xfs_dir2_leafn_lasthash(args->dp,
> -							       blk->bp, NULL);
> +			blk->hashval = xfs_dir2_leaf_lasthash(args->dp,
> +							      blk->bp, NULL);
>  			break;
>  		}
>  
> @@ -1929,8 +1929,8 @@ xfs_da3_path_shift(
>  			blk->magic = XFS_DIR2_LEAFN_MAGIC;
>  			ASSERT(level == path->active-1);
>  			blk->index = 0;
> -			blk->hashval = xfs_dir2_leafn_lasthash(args->dp,
> -							       blk->bp, NULL);
> +			blk->hashval = xfs_dir2_leaf_lasthash(args->dp,
> +							      blk->bp, NULL);
>  			break;
>  		default:
>  			ASSERT(0);
> diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> index bbd1238..682e2bf 100644
> --- a/fs/xfs/libxfs/xfs_dir2_node.c
> +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> @@ -528,7 +528,7 @@ xfs_dir2_free_hdr_check(
>   * Stale entries are ok.
>   */
>  xfs_dahash_t					/* hash value */
> -xfs_dir2_leafn_lasthash(
> +xfs_dir2_leaf_lasthash(
>  	struct xfs_inode *dp,
>  	struct xfs_buf	*bp,			/* leaf buffer */
>  	int		*count)			/* count of entries in leaf */
> @@ -540,7 +540,9 @@ xfs_dir2_leafn_lasthash(
>  	dp->d_ops->leaf_hdr_from_disk(&leafhdr, leaf);
>  
>  	ASSERT(leafhdr.magic == XFS_DIR2_LEAFN_MAGIC ||
> -	       leafhdr.magic == XFS_DIR3_LEAFN_MAGIC);
> +	       leafhdr.magic == XFS_DIR3_LEAFN_MAGIC ||
> +	       leafhdr.magic == XFS_DIR2_LEAF1_MAGIC ||
> +	       leafhdr.magic == XFS_DIR3_LEAF1_MAGIC);
>  
>  	if (count)
>  		*count = leafhdr.count;
> @@ -1405,8 +1407,8 @@ xfs_dir2_leafn_split(
>  	/*
>  	 * Update last hashval in each block since we added the name.
>  	 */
> -	oldblk->hashval = xfs_dir2_leafn_lasthash(dp, oldblk->bp, NULL);
> -	newblk->hashval = xfs_dir2_leafn_lasthash(dp, newblk->bp, NULL);
> +	oldblk->hashval = xfs_dir2_leaf_lasthash(dp, oldblk->bp, NULL);
> +	newblk->hashval = xfs_dir2_leaf_lasthash(dp, newblk->bp, NULL);
>  	xfs_dir3_leaf_check(dp, oldblk->bp);
>  	xfs_dir3_leaf_check(dp, newblk->bp);
>  	return error;
> diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h
> index 576f2d2..6d24209 100644
> --- a/fs/xfs/libxfs/xfs_dir2_priv.h
> +++ b/fs/xfs/libxfs/xfs_dir2_priv.h
> @@ -95,7 +95,7 @@ extern bool xfs_dir3_leaf_check_int(struct xfs_mount *mp, struct xfs_inode *dp,
>  /* xfs_dir2_node.c */
>  extern int xfs_dir2_leaf_to_node(struct xfs_da_args *args,
>  		struct xfs_buf *lbp);
> -extern xfs_dahash_t xfs_dir2_leafn_lasthash(struct xfs_inode *dp,
> +extern xfs_dahash_t xfs_dir2_leaf_lasthash(struct xfs_inode *dp,
>  		struct xfs_buf *bp, int *count);
>  extern int xfs_dir2_leafn_lookup_int(struct xfs_buf *bp,
>  		struct xfs_da_args *args, int *indexp,

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 07/13] xfs: check if an inode is cached and allocated
  2017-06-07 14:22       ` Brian Foster
@ 2017-06-15  5:00         ` Darrick J. Wong
  0 siblings, 0 replies; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-15  5:00 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jun 07, 2017 at 10:22:44AM -0400, Brian Foster wrote:
> On Tue, Jun 06, 2017 at 11:40:06AM -0700, Darrick J. Wong wrote:
> > On Tue, Jun 06, 2017 at 12:28:13PM -0400, Brian Foster wrote:
> > > On Fri, Jun 02, 2017 at 02:24:43PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Check the inode cache for a particular inode number.  If it's in the
> > > > cache, check that it's not currently being reclaimed.  If it's not being
> > > > reclaimed, return zero if the inode is allocated.  This function will be
> > > > used by various scrubbers to decide if the cache is more up to date
> > > > than the disk in terms of checking if an inode is allocated.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  fs/xfs/xfs_icache.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  fs/xfs/xfs_icache.h |    3 ++
> > > >  2 files changed, 86 insertions(+)
> > > > 
> > > > 
> > > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > > > index f61c84f8..d610a7e 100644
> > > > --- a/fs/xfs/xfs_icache.c
> > > > +++ b/fs/xfs/xfs_icache.c
> > > > @@ -633,6 +633,89 @@ xfs_iget(
> > > >  }
> > > >  
> > > >  /*
> > > > + * "Is this a cached inode that's also allocated?"
> > > > + *
> > > > + * Look up an inode by number in the given file system.  If the inode is
> > > > + * in cache and isn't in purgatory, return 1 if the inode is allocated
> > > > + * and 0 if it is not.  For all other cases (not in cache, being torn
> > > > + * down, etc.), return a negative error code.
> > > > + *
> > > > + * (The caller has to prevent inode allocation activity.)
> > > > + */
> > > 
> > > Hmm.. so isn't the data returned here potentially invalid once we drop
> > > the inode reference? In other words, couldn't an inode where we return
> > > inuse == true be reclaimed immediately after? Perhaps I'm just not far
> > > enough along to understand how this is used. If that's the case, a note
> > > about the lifetime/rules of this value might be useful.
> > 
> > The comment could state more explicitly what we're assuming the caller
> > has done to prevent inode allocation or freeing activity.  The scrubber
> > that calls this function will have locked the AGI buffer for this AG so
> > that it can compare the inobt ir_free bits against di_mode to make sure
> > that there aren't any discrepancies.  Even if the inode is immediately
> > reclaimed/deleted after we release the inode, the corresponding inobt
> > update will block on the AGI until the scrubber finishes, so from the
> > scrubber's point of view things are still consistent.  If the scrubber
> > finds the inode in some intermediate state of being created or torn
> > down, it doesn't bother checking the free mask on the assumption that
> > the thread modifying the inode will ensure the consistency or shut down.
> > 
> > tldr: We assume the caller has the AGI locked so that inodes stay stable
> > wrt allocation or freeing, or only end up in an intermediate state;
> > we also assume the caller can handle inodes in an intermediate state.
> > 
> 
> Ok, thanks for the explanation. The bits about reclaim are still a bit
> unclear to me, but that will probably make more sense when I see how
> this is used.
> 
> > > FWIW, I'm also kind of wondering if rather than open code the bits of
> > > the inode lookup, we could accomplish the same thing with a new flag to
> > > the existing xfs_iget() lookup mechanism that implements the associated
> > > semantics (i.e., don't read from disk, don't reinit, sort of a read-only
> > > semantic).
> > 
> > Originally it was just an iget flag, but the flag ended up special
> > casing a lot of the existing iget functionality.  Basically, we need to
> > disable the xfs_iget_cache_miss call; avoid the out_error_or_again case;
> > do our i_mode testing, release the inode, and jump out of the function
> > prior to the bit that can call xfs_setup_existing_inode; and change the
> > lock_flags assert to require lock_flags == 0 when we're just checking.
> > 
> > All that turned xfs_iget into such a muddy mess that I decided it was
> > cleaner to separate this specialized case into its own function and hope
> > that we're not really going to modify _iget a whole lot.
> > 
> 
> Hmm, so obviously I would expect some tweaks in that code, but I'm
> curious how messy it really has to be. Walking through some of the
> changes...
> 
> - The lock_flags check is already conditional in the code, so I'm not
>   sure we really need the assert. I'd be fine with dropping it at least
>   if we had a lock_flags == 0 caller. We could alternatively adjust it
>   to accommodate the new xfs_iget() flag, which might be safer.
> - I'm not sure that xfs_iget() really needs to be responsible for the
>   release. What about a helper function on top that actually receives
>   the xfs_inode from xfs_iget() and does the resulting checks, sets
>   inuse appropriately and then releases the inode?
> - With the above changes, would that reduce the necessary xfs_iget()
>   changes to basically skipping out in a few places? For example,
>   consider an XFS_IGET_INCORE flag that skips the -EAGAIN retry, skips
>   the IRECLAIMABLE reinit in _iget_cache_hit() (returns -EAGAIN) and
>   returns -ENOENT rather than calling _iget_cache_miss(). The code flow
>   of the helper might look something like the following:
> 
> int
> xfs_icache_inode_is_allocated(
> 	...
> 	xfs_ino_t		ino,
> 	bool			*inuse)
> {
> 	...
> 
> 	*inuse = false;
> 	error = xfs_iget(..., ino, XFS_IGET_INCORE, 0, &ip);
> 	if (error)
> 		return error;
> 
> 	if (<ip checks>)
> 		*inuse = true;
> 
> 	IRELE(ip);
> 	return 0;
> }
> 
> ... and may only require fairly straightforward tweaks to xfs_iget().
> Thoughts?

That could work too.  I'll give it a spin and post a v3 if it succeeds.

--D

> 
> Brian
> 
> > Anyway, thank you for reviewing!
> > 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > > +int
> > > > +xfs_icache_inode_is_allocated(
> > > > +	struct xfs_mount	*mp,
> > > > +	struct xfs_trans	*tp,
> > > > +	xfs_ino_t		ino,
> > > > +	bool			*inuse)
> > > > +{
> > > > +	struct xfs_inode	*ip;
> > > > +	struct xfs_perag	*pag;
> > > > +	xfs_agino_t		agino;
> > > > +	int			ret = 0;
> > > > +
> > > > +	/* reject inode numbers outside existing AGs */
> > > > +	if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
> > > > +		return -EINVAL;
> > > > +
> > > > +	/* get the perag structure and ensure that it's inode capable */
> > > > +	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ino));
> > > > +	agino = XFS_INO_TO_AGINO(mp, ino);
> > > > +
> > > > +	rcu_read_lock();
> > > > +	ip = radix_tree_lookup(&pag->pag_ici_root, agino);
> > > > +	if (!ip) {
> > > > +		ret = -ENOENT;
> > > > +		goto out;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * Is the inode being reused?  Is it new?  Is it being
> > > > +	 * reclaimed?  Is it being torn down?  For any of those cases,
> > > > +	 * fall back.
> > > > +	 */
> > > > +	spin_lock(&ip->i_flags_lock);
> > > > +	if (ip->i_ino != ino ||
> > > > +	    (ip->i_flags & (XFS_INEW | XFS_IRECLAIM | XFS_IRECLAIMABLE))) {
> > > > +		ret = -EAGAIN;
> > > > +		goto out_istate;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * If lookup is racing with unlink, jump out immediately.
> > > > +	 */
> > > > +	if (VFS_I(ip)->i_mode == 0) {
> > > > +		*inuse = false;
> > > > +		ret = 0;
> > > > +		goto out_istate;
> > > > +	}
> > > > +
> > > > +	/* If the VFS inode is being torn down, forget it. */
> > > > +	if (!igrab(VFS_I(ip))) {
> > > > +		ret = -EAGAIN;
> > > > +		goto out_istate;
> > > > +	}
> > > > +
> > > > +	/* We've got a live one. */
> > > > +	spin_unlock(&ip->i_flags_lock);
> > > > +	rcu_read_unlock();
> > > > +	xfs_perag_put(pag);
> > > > +
> > > > +	*inuse = !!(VFS_I(ip)->i_mode);
> > > > +	ret = 0;
> > > > +	IRELE(ip);
> > > > +
> > > > +	return ret;
> > > > +
> > > > +out_istate:
> > > > +	spin_unlock(&ip->i_flags_lock);
> > > > +out:
> > > > +	rcu_read_unlock();
> > > > +	xfs_perag_put(pag);
> > > > +	return ret;
> > > > +}
> > > > +
> > > > +/*
> > > >   * The inode lookup is done in batches to keep the amount of lock traffic and
> > > >   * radix tree lookups to a minimum. The batch size is a trade off between
> > > >   * lookup reduction and stack usage. This is in the reclaim path, so we can't
> > > > diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> > > > index 9183f77..eadf718 100644
> > > > --- a/fs/xfs/xfs_icache.h
> > > > +++ b/fs/xfs/xfs_icache.h
> > > > @@ -126,4 +126,7 @@ xfs_fs_eofblocks_from_user(
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
> > > > +				  xfs_ino_t ino, bool *inuse);
> > > > +
> > > >  #endif
> > > > 
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html

* [PATCH v3 07/13] xfs: check if an inode is cached and allocated
  2017-06-02 21:24 ` [PATCH 07/13] xfs: check if an inode is cached and allocated Darrick J. Wong
  2017-06-06 16:28   ` Brian Foster
  2017-06-07  1:21   ` [PATCH v2 " Darrick J. Wong
@ 2017-06-16 17:59   ` Darrick J. Wong
  2017-06-19 12:07     ` Brian Foster
  2 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-16 17:59 UTC (permalink / raw)
  To: linux-xfs; +Cc: Brian Foster

Check the inode cache for a particular inode number.  If it's in the
cache, check that it's not currently being reclaimed.  If it's not being
reclaimed, return zero and report through *inuse whether the inode is
allocated.  This function will be used by various scrubbers to decide if
the cache is more up to date than the disk in terms of checking if an
inode is allocated.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v3: plumb in an opportunistic "get it from the cache" flag to _iget
    and refactor the helper to use it.
---
 fs/xfs/xfs_icache.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++++--
 fs/xfs/xfs_icache.h |    4 ++++
 2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index f61c84f8..45845b7 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -369,6 +369,11 @@ xfs_iget_cache_hit(
 	if (ip->i_flags & XFS_IRECLAIMABLE) {
 		trace_xfs_iget_reclaim(ip);
 
+		if (flags & XFS_IGET_INCORE) {
+			error = -EAGAIN;
+			goto out_error;
+		}
+
 		/*
 		 * We need to set XFS_IRECLAIM to prevent xfs_reclaim_inode
 		 * from stomping over us while we recycle the inode.  We can't
@@ -433,7 +438,8 @@ xfs_iget_cache_hit(
 	if (lock_flags != 0)
 		xfs_ilock(ip, lock_flags);
 
-	xfs_iflags_clear(ip, XFS_ISTALE | XFS_IDONTCACHE);
+	if (!(flags & XFS_IGET_INCORE))
+		xfs_iflags_clear(ip, XFS_ISTALE | XFS_IDONTCACHE);
 	XFS_STATS_INC(mp, xs_ig_found);
 
 	return 0;
@@ -604,6 +610,10 @@ xfs_iget(
 			goto out_error_or_again;
 	} else {
 		rcu_read_unlock();
+		if (flags & XFS_IGET_INCORE) {
+			error = -ENOENT;
+			goto out_error_or_again;
+		}
 		XFS_STATS_INC(mp, xs_ig_missed);
 
 		error = xfs_iget_cache_miss(mp, pag, tp, ino, &ip,
@@ -624,7 +634,7 @@ xfs_iget(
 	return 0;
 
 out_error_or_again:
-	if (error == -EAGAIN) {
+	if (!(flags & XFS_IGET_INCORE) && error == -EAGAIN) {
 		delay(1);
 		goto again;
 	}
@@ -633,6 +643,44 @@ xfs_iget(
 }
 
 /*
+ * "Is this a cached inode that's also allocated?"
+ *
+ * Look up an inode by number in the given file system.  If the inode is
+ * in cache and isn't in purgatory, set *inuse to whether the inode is
+ * allocated and return 0.  For all other cases (not in cache, being
+ * torn down, etc.), return a negative error code.
+ *
+ * The caller has to prevent inode allocation and freeing activity,
+ * presumably by locking the AGI buffer.   This is to ensure that an
+ * inode cannot transition from allocated to freed until the caller is
+ * ready to allow that.  If the inode is in an intermediate state (new,
+ * reclaimable, or being reclaimed), -EAGAIN will be returned; if the
+ * inode is not in the cache, -ENOENT will be returned.  The caller must
+ * deal with these scenarios appropriately.
+ *
+ * This is a specialized use case for the online scrubber; if you're
+ * reading this, you probably want xfs_iget.
+ */
+int
+xfs_icache_inode_is_allocated(
+	struct xfs_mount	*mp,
+	struct xfs_trans	*tp,
+	xfs_ino_t		ino,
+	bool			*inuse)
+{
+	struct xfs_inode	*ip;
+	int			error;
+
+	error = xfs_iget(mp, tp, ino, XFS_IGET_INCORE, 0, &ip);
+	if (error)
+		return error;
+
+	*inuse = !!(VFS_I(ip)->i_mode);
+	IRELE(ip);
+	return 0;
+}
+
+/*
  * The inode lookup is done in batches to keep the amount of lock traffic and
  * radix tree lookups to a minimum. The batch size is a trade off between
  * lookup reduction and stack usage. This is in the reclaim path, so we can't
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 9183f77..bff4d85 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -47,6 +47,7 @@ struct xfs_eofblocks {
 #define XFS_IGET_CREATE		0x1
 #define XFS_IGET_UNTRUSTED	0x2
 #define XFS_IGET_DONTCACHE	0x4
+#define XFS_IGET_INCORE		0x8	/* don't read from disk or reinit */
 
 /*
  * flags for AG inode iterator
@@ -126,4 +127,7 @@ xfs_fs_eofblocks_from_user(
 	return 0;
 }
 
+int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
+				  xfs_ino_t ino, bool *inuse);
+
 #endif
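
For readers following the thread, here is a rough userspace sketch of
how a caller might fold the helper's outcomes into a scrub decision.
It is not kernel code.  Only the return convention (0 with *inuse set,
-EAGAIN, -ENOENT) comes from the patch; check_inobt_bit(), the verdict
enum, and the choice to fall back to disk on a cache miss are
hypothetical names and assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical verdicts; not part of the patch. */
enum scrub_verdict {
	SCRUB_OK,		/* cache and inobt agree */
	SCRUB_CORRUPT,		/* cache and inobt disagree */
	SCRUB_SKIP,		/* intermediate state; trust the modifying thread */
	SCRUB_CHECK_DISK,	/* not cached; assume caller reads the disk inode */
	SCRUB_ERROR,		/* any other failure, e.g. -EINVAL */
};

/*
 * Hypothetical caller-side logic.  'error' and 'inuse' stand in for the
 * results of xfs_icache_inode_is_allocated(); 'irec_free' is the ir_free
 * bit the scrubber read from the inobt while holding the AGI buffer.
 */
enum scrub_verdict
check_inobt_bit(int error, bool inuse, bool irec_free)
{
	if (error == -EAGAIN)
		return SCRUB_SKIP;
	if (error == -ENOENT)
		return SCRUB_CHECK_DISK;
	if (error)
		return SCRUB_ERROR;

	/* An allocated inode must not be marked free, and vice versa. */
	return (inuse == irec_free) ? SCRUB_CORRUPT : SCRUB_OK;
}
```

The point of the sketch is only that -EAGAIN and -ENOENT are distinct,
non-fatal answers that the caller has to route somewhere.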

* Re: [PATCH v3 07/13] xfs: check if an inode is cached and allocated
  2017-06-16 17:59   ` [PATCH v3 " Darrick J. Wong
@ 2017-06-19 12:07     ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-19 12:07 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, Jun 16, 2017 at 10:59:32AM -0700, Darrick J. Wong wrote:
> Check the inode cache for a particular inode number.  If it's in the
> cache, check that it's not currently being reclaimed.  If it's not being
> reclaimed, return zero and report through *inuse whether the inode is
> allocated.  This function will be used by various scrubbers to decide if
> the cache is more up to date than the disk in terms of checking if an
> inode is allocated.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> v3: plumb in an opportunistic "get it from the cache" flag to _iget
>     and refactor the helper to use it.
> ---
>  fs/xfs/xfs_icache.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++++--
>  fs/xfs/xfs_icache.h |    4 ++++
>  2 files changed, 54 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index f61c84f8..45845b7 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -369,6 +369,11 @@ xfs_iget_cache_hit(
>  	if (ip->i_flags & XFS_IRECLAIMABLE) {
>  		trace_xfs_iget_reclaim(ip);
>  
> +		if (flags & XFS_IGET_INCORE) {
> +			error = -EAGAIN;
> +			goto out_error;
> +		}
> +
>  		/*
>  		 * We need to set XFS_IRECLAIM to prevent xfs_reclaim_inode
>  		 * from stomping over us while we recycle the inode.  We can't
> @@ -433,7 +438,8 @@ xfs_iget_cache_hit(
>  	if (lock_flags != 0)
>  		xfs_ilock(ip, lock_flags);
>  
> -	xfs_iflags_clear(ip, XFS_ISTALE | XFS_IDONTCACHE);
> +	if (!(flags & XFS_IGET_INCORE))
> +		xfs_iflags_clear(ip, XFS_ISTALE | XFS_IDONTCACHE);
>  	XFS_STATS_INC(mp, xs_ig_found);
>  
>  	return 0;
> @@ -604,6 +610,10 @@ xfs_iget(

It might be a good idea to check (or assert) that _IGET_INCORE isn't
specified with any other flags that don't make sense. Otherwise this
looks good to me, thanks...

Reviewed-by: Brian Foster <bfoster@redhat.com>
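
A minimal sketch of the kind of flag check suggested above, written as
standalone C rather than kernel code.  The flag values are the ones in
the xfs_icache.h hunk of this patch.  Which flags actually conflict with
_IGET_INCORE is not settled in this thread, so treating only
XFS_IGET_CREATE as a conflict is an assumption, and iget_flags_sane()
is a hypothetical name.

```c
#include <stdbool.h>

/* Flag values as defined in the patch's xfs_icache.h hunk. */
#define XFS_IGET_CREATE		0x1
#define XFS_IGET_UNTRUSTED	0x2
#define XFS_IGET_DONTCACHE	0x4
#define XFS_IGET_INCORE		0x8

/*
 * Assumption: an in-core-only lookup never allocates or reads from
 * disk, so asking for CREATE at the same time makes no sense.  Other
 * conflicts may exist; extend this mask if so.
 */
#define IGET_INCORE_CONFLICTS	(XFS_IGET_CREATE)

/* Hypothetical helper: true if the flag combination is acceptable. */
bool
iget_flags_sane(unsigned int flags)
{
	if (!(flags & XFS_IGET_INCORE))
		return true;
	return (flags & IGET_INCORE_CONFLICTS) == 0;
}
```

In the kernel this would presumably be an ASSERT() near the top of
xfs_iget() rather than a separate helper.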

>  			goto out_error_or_again;
>  	} else {
>  		rcu_read_unlock();
> +		if (flags & XFS_IGET_INCORE) {
> +			error = -ENOENT;
> +			goto out_error_or_again;
> +		}
>  		XFS_STATS_INC(mp, xs_ig_missed);
>  
>  		error = xfs_iget_cache_miss(mp, pag, tp, ino, &ip,
> @@ -624,7 +634,7 @@ xfs_iget(
>  	return 0;
>  
>  out_error_or_again:
> -	if (error == -EAGAIN) {
> +	if (!(flags & XFS_IGET_INCORE) && error == -EAGAIN) {
>  		delay(1);
>  		goto again;
>  	}
> @@ -633,6 +643,44 @@ xfs_iget(
>  }
>  
>  /*
> + * "Is this a cached inode that's also allocated?"
> + *
> + * Look up an inode by number in the given file system.  If the inode is
> + * in cache and isn't in purgatory, set *inuse to whether the inode is
> + * allocated and return 0.  For all other cases (not in cache, being
> + * torn down, etc.), return a negative error code.
> + *
> + * The caller has to prevent inode allocation and freeing activity,
> + * presumably by locking the AGI buffer.   This is to ensure that an
> + * inode cannot transition from allocated to freed until the caller is
> + * ready to allow that.  If the inode is in an intermediate state (new,
> + * reclaimable, or being reclaimed), -EAGAIN will be returned; if the
> + * inode is not in the cache, -ENOENT will be returned.  The caller must
> + * deal with these scenarios appropriately.
> + *
> + * This is a specialized use case for the online scrubber; if you're
> + * reading this, you probably want xfs_iget.
> + */
> +int
> +xfs_icache_inode_is_allocated(
> +	struct xfs_mount	*mp,
> +	struct xfs_trans	*tp,
> +	xfs_ino_t		ino,
> +	bool			*inuse)
> +{
> +	struct xfs_inode	*ip;
> +	int			error;
> +
> +	error = xfs_iget(mp, tp, ino, XFS_IGET_INCORE, 0, &ip);
> +	if (error)
> +		return error;
> +
> +	*inuse = !!(VFS_I(ip)->i_mode);
> +	IRELE(ip);
> +	return 0;
> +}
> +
> +/*
>   * The inode lookup is done in batches to keep the amount of lock traffic and
>   * radix tree lookups to a minimum. The batch size is a trade off between
>   * lookup reduction and stack usage. This is in the reclaim path, so we can't
> diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> index 9183f77..bff4d85 100644
> --- a/fs/xfs/xfs_icache.h
> +++ b/fs/xfs/xfs_icache.h
> @@ -47,6 +47,7 @@ struct xfs_eofblocks {
>  #define XFS_IGET_CREATE		0x1
>  #define XFS_IGET_UNTRUSTED	0x2
>  #define XFS_IGET_DONTCACHE	0x4
> +#define XFS_IGET_INCORE		0x8	/* don't read from disk or reinit */
>  
>  /*
>   * flags for AG inode iterator
> @@ -126,4 +127,7 @@ xfs_fs_eofblocks_from_user(
>  	return 0;
>  }
>  
> +int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
> +				  xfs_ino_t ino, bool *inuse);
> +
>  #endif

* [PATCH 15/13] xfs: grab dquots without taking the ilock
  2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
                   ` (13 preceding siblings ...)
  2017-06-02 22:19 ` [PATCH 14/13] xfs: allow reading of already-locked remote symbolic link Darrick J. Wong
@ 2017-06-26  6:04 ` Darrick J. Wong
  2017-06-27 11:00   ` Brian Foster
  14 siblings, 1 reply; 56+ messages in thread
From: Darrick J. Wong @ 2017-06-26  6:04 UTC (permalink / raw)
  To: linux-xfs

Add a new dqget flag that grabs the dquot without taking the ilock.
This will be used by the scrubber (which will have already grabbed
the ilock) to perform basic sanity checking of the quota data.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_quota_defs.h |    2 ++
 fs/xfs/xfs_dquot.c             |   11 +++++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_quota_defs.h b/fs/xfs/libxfs/xfs_quota_defs.h
index d69c772..2834574 100644
--- a/fs/xfs/libxfs/xfs_quota_defs.h
+++ b/fs/xfs/libxfs/xfs_quota_defs.h
@@ -136,6 +136,8 @@ typedef uint16_t	xfs_qwarncnt_t;
  */
 #define XFS_QMOPT_INHERIT	0x1000000
 
+#define XFS_QMOPT_NOLOCK	0x2000000 /* don't ilock during dqget */
+
 /*
  * flags to xfs_trans_mod_dquot.
  */
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index e57c6cc..3519efc 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -472,18 +472,20 @@ xfs_qm_dqtobp(
 	struct xfs_mount	*mp = dqp->q_mount;
 	xfs_dqid_t		id = be32_to_cpu(dqp->q_core.d_id);
 	struct xfs_trans	*tp = (tpp ? *tpp : NULL);
-	uint			lock_mode;
+	uint			lock_mode = 0;
 
 	quotip = xfs_quota_inode(dqp->q_mount, dqp->dq_flags);
 	dqp->q_fileoffset = (xfs_fileoff_t)id / mp->m_quotainfo->qi_dqperchunk;
 
-	lock_mode = xfs_ilock_data_map_shared(quotip);
+	if (!(flags & XFS_QMOPT_NOLOCK))
+		lock_mode = xfs_ilock_data_map_shared(quotip);
 	if (!xfs_this_quota_on(dqp->q_mount, dqp->dq_flags)) {
 		/*
 		 * Return if this type of quotas is turned off while we
 		 * didn't have the quota inode lock.
 		 */
-		xfs_iunlock(quotip, lock_mode);
+		if (lock_mode)
+			xfs_iunlock(quotip, lock_mode);
 		return -ESRCH;
 	}
 
@@ -493,7 +495,8 @@ xfs_qm_dqtobp(
 	error = xfs_bmapi_read(quotip, dqp->q_fileoffset,
 			       XFS_DQUOT_CLUSTER_SIZE_FSB, &map, &nmaps, 0);
 
-	xfs_iunlock(quotip, lock_mode);
+	if (lock_mode)
+		xfs_iunlock(quotip, lock_mode);
 	if (error)
 		return error;
 

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 15/13] xfs: grab dquots without taking the ilock
  2017-06-26  6:04 ` [PATCH 15/13] xfs: grab dquots without taking the ilock Darrick J. Wong
@ 2017-06-27 11:00   ` Brian Foster
  0 siblings, 0 replies; 56+ messages in thread
From: Brian Foster @ 2017-06-27 11:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Sun, Jun 25, 2017 at 11:04:46PM -0700, Darrick J. Wong wrote:
> Add a new dqget flag that grabs the dquot without taking the ilock.
> This will be used by the scrubber (which will have already grabbed
> the ilock) to perform basic sanity checking of the quota data.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

It might be good to add an assert somewhere after where we expect the
lock to be held one way or another. Otherwise looks Ok to me:

Reviewed-by: Brian Foster <bfoster@redhat.com>
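
To illustrate the suggestion above, a toy userspace model of the
"lock unless the caller already did" pattern with an assertion after
the point where the lock must be held.  Everything here is a stand-in:
struct toy_inode, the toy_* functions, and LOCK_SHARED are hypothetical,
and the reader count is not how the kernel's ilock works.  In the kernel
the check would presumably be an ASSERT() on xfs_isilocked().

```c
#include <assert.h>

#define QMOPT_NOLOCK	0x2000000	/* mirrors XFS_QMOPT_NOLOCK in the patch */
#define LOCK_SHARED	0x1		/* hypothetical nonzero lock mode */

/* Toy stand-in for the quota inode and its ilock. */
struct toy_inode {
	int	readers;
};

unsigned int
toy_lock_shared(struct toy_inode *ip)
{
	ip->readers++;
	return LOCK_SHARED;
}

void
toy_unlock(struct toy_inode *ip, unsigned int mode)
{
	(void)mode;
	ip->readers--;
}

/*
 * Same shape as the patched xfs_qm_dqtobp(): take the lock only when
 * the caller has not promised to hold it, check that it is held either
 * way, and drop only what we took ourselves.
 */
int
toy_dqtobp(struct toy_inode *quotip, unsigned int flags)
{
	unsigned int	lock_mode = 0;

	if (!(flags & QMOPT_NOLOCK))
		lock_mode = toy_lock_shared(quotip);

	/* The check suggested in review: somebody must hold the lock here. */
	assert(quotip->readers > 0);

	/* ... the extent mapping lookup would happen here ... */

	if (lock_mode)
		toy_unlock(quotip, lock_mode);
	return 0;
}
```

A NOLOCK caller that has not actually taken the lock trips the
assertion, which is the bug the check is meant to catch.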

>  fs/xfs/libxfs/xfs_quota_defs.h |    2 ++
>  fs/xfs/xfs_dquot.c             |   11 +++++++----
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_quota_defs.h b/fs/xfs/libxfs/xfs_quota_defs.h
> index d69c772..2834574 100644
> --- a/fs/xfs/libxfs/xfs_quota_defs.h
> +++ b/fs/xfs/libxfs/xfs_quota_defs.h
> @@ -136,6 +136,8 @@ typedef uint16_t	xfs_qwarncnt_t;
>   */
>  #define XFS_QMOPT_INHERIT	0x1000000
>  
> +#define XFS_QMOPT_NOLOCK	0x2000000 /* don't ilock during dqget */
> +
>  /*
>   * flags to xfs_trans_mod_dquot.
>   */
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index e57c6cc..3519efc 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
> @@ -472,18 +472,20 @@ xfs_qm_dqtobp(
>  	struct xfs_mount	*mp = dqp->q_mount;
>  	xfs_dqid_t		id = be32_to_cpu(dqp->q_core.d_id);
>  	struct xfs_trans	*tp = (tpp ? *tpp : NULL);
> -	uint			lock_mode;
> +	uint			lock_mode = 0;
>  
>  	quotip = xfs_quota_inode(dqp->q_mount, dqp->dq_flags);
>  	dqp->q_fileoffset = (xfs_fileoff_t)id / mp->m_quotainfo->qi_dqperchunk;
>  
> -	lock_mode = xfs_ilock_data_map_shared(quotip);
> +	if (!(flags & XFS_QMOPT_NOLOCK))
> +		lock_mode = xfs_ilock_data_map_shared(quotip);
>  	if (!xfs_this_quota_on(dqp->q_mount, dqp->dq_flags)) {
>  		/*
>  		 * Return if this type of quotas is turned off while we
>  		 * didn't have the quota inode lock.
>  		 */
> -		xfs_iunlock(quotip, lock_mode);
> +		if (lock_mode)
> +			xfs_iunlock(quotip, lock_mode);
>  		return -ESRCH;
>  	}
>  
> @@ -493,7 +495,8 @@ xfs_qm_dqtobp(
>  	error = xfs_bmapi_read(quotip, dqp->q_fileoffset,
>  			       XFS_DQUOT_CLUSTER_SIZE_FSB, &map, &nmaps, 0);
>  
> -	xfs_iunlock(quotip, lock_mode);
> +	if (lock_mode)
> +		xfs_iunlock(quotip, lock_mode);
>  	if (error)
>  		return error;
>  

end of thread, other threads:[~2017-06-27 11:00 UTC | newest]

Thread overview: 56+ messages
2017-06-02 21:24 [PATCH v7 00/13] xfs: preparing for online scrub support Darrick J. Wong
2017-06-02 21:24 ` [PATCH 01/13] xfs: optimize _btree_query_all Darrick J. Wong
2017-06-06 13:32   ` Brian Foster
2017-06-06 17:43     ` Darrick J. Wong
2017-06-07  1:18   ` [PATCH v2 " Darrick J. Wong
2017-06-07 14:22     ` Brian Foster
2017-06-02 21:24 ` [PATCH 02/13] xfs: remove double-underscore integer types Darrick J. Wong
2017-06-02 21:24 ` [PATCH 03/13] xfs: always compile the btree inorder check functions Darrick J. Wong
2017-06-06 13:32   ` Brian Foster
2017-06-02 21:24 ` [PATCH 04/13] xfs: export various function for the online scrubber Darrick J. Wong
2017-06-06 13:32   ` Brian Foster
2017-06-02 21:24 ` [PATCH 05/13] xfs: plumb in needed functions for range querying of various btrees Darrick J. Wong
2017-06-06 13:33   ` Brian Foster
2017-06-02 21:24 ` [PATCH 06/13] xfs: export _inobt_btrec_to_irec and _ialloc_cluster_alignment for scrub Darrick J. Wong
2017-06-06 16:27   ` Brian Foster
2017-06-06 17:46     ` Darrick J. Wong
2017-06-02 21:24 ` [PATCH 07/13] xfs: check if an inode is cached and allocated Darrick J. Wong
2017-06-06 16:28   ` Brian Foster
2017-06-06 18:40     ` Darrick J. Wong
2017-06-07 14:22       ` Brian Foster
2017-06-15  5:00         ` Darrick J. Wong
2017-06-07  1:21   ` [PATCH v2 " Darrick J. Wong
2017-06-16 17:59   ` [PATCH v3 " Darrick J. Wong
2017-06-19 12:07     ` Brian Foster
2017-06-02 21:24 ` [PATCH 08/13] xfs: reflink find shared should take a transaction Darrick J. Wong
2017-06-06 16:28   ` Brian Foster
2017-06-02 21:24 ` [PATCH 09/13] xfs: separate function to check if reflink flag needed Darrick J. Wong
2017-06-06 16:28   ` Brian Foster
2017-06-06 18:05     ` Darrick J. Wong
2017-06-07  1:26   ` [PATCH v2 " Darrick J. Wong
2017-06-07 14:22     ` Brian Foster
2017-06-02 21:25 ` [PATCH 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
2017-06-06 16:29   ` Brian Foster
2017-06-06 18:51     ` Darrick J. Wong
2017-06-06 20:35       ` Darrick J. Wong
2017-06-07  1:29   ` [PATCH v2 9.9/13] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior Darrick J. Wong
2017-06-07 15:11     ` Brian Foster
2017-06-07 16:19       ` Darrick J. Wong
2017-06-07  1:29   ` [PATCH v2 10/13] xfs: refactor the ifork block counting function Darrick J. Wong
2017-06-07 15:11     ` Brian Foster
2017-06-02 21:25 ` [PATCH 11/13] xfs: return the hash value of a leaf1 directory block Darrick J. Wong
2017-06-08 13:02   ` Brian Foster
2017-06-08 15:53     ` Darrick J. Wong
2017-06-08 16:31       ` Brian Foster
2017-06-08 16:43         ` Darrick J. Wong
2017-06-08 16:52           ` Brian Foster
2017-06-08 18:22   ` [PATCH v2 " Darrick J. Wong
2017-06-09 12:54     ` Brian Foster
2017-06-02 21:25 ` [PATCH 12/13] xfs: pass along transaction context when reading directory block buffers Darrick J. Wong
2017-06-08 13:02   ` Brian Foster
2017-06-02 21:25 ` [PATCH 13/13] xfs: pass along transaction context when reading xattr " Darrick J. Wong
2017-06-08 13:02   ` Brian Foster
2017-06-02 22:19 ` [PATCH 14/13] xfs: allow reading of already-locked remote symbolic link Darrick J. Wong
2017-06-08 13:02   ` Brian Foster
2017-06-26  6:04 ` [PATCH 15/13] xfs: grab dquots without taking the ilock Darrick J. Wong
2017-06-27 11:00   ` Brian Foster
