linux-xfs.vger.kernel.org archive mirror
* [PATCH 0/4] xfsprogs: Extend per-inode extent counters
@ 2020-08-31 13:00 Chandan Babu R
  2020-08-31 13:00 ` [PATCH 1/4] xfsprogs: Introduce xfs_iext_max() helper Chandan Babu R
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Chandan Babu R @ 2020-08-31 13:00 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, david, darrick.wong, bfoster

The kernel commit "xfs: fix inode fork extent count overflow"
(3f8a4f1d876d3e3e49e50b0396eaffcc4ba71b08) mentions that it should be
possible to create 10 billion data fork extents. However, the
corresponding on-disk field has a signed 32-bit type. Hence this
patchset extends the per-inode data extent counter to 47 bits. A
width of 47 bits was chosen because:
Maximum file size = 2^63 bytes.
Maximum extent count when using 64k block size = 2^63 / 2^16 = 2^47.
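
The arithmetic above can be sketched in C as follows. This is only an
illustration of the calculation, not code from the patchset; the
function and parameter names are hypothetical, and it assumes the worst
case in which every extent covers a single block.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: number of bits needed to count the extents of a
 * maximally sized file when each extent is one block long.
 * Both parameters are base-2 logarithms of sizes in bytes.
 */
static unsigned int extent_count_bits(unsigned int max_file_size_log2,
				      unsigned int block_size_log2)
{
	assert(block_size_log2 <= max_file_size_log2);
	/* 2^a / 2^b == 2^(a - b) */
	return max_file_size_log2 - block_size_log2;
}

/* Largest value representable in an unsigned counter of the given width. */
static uint64_t max_counter_value(unsigned int bits)
{
	assert(bits > 0 && bits < 64);
	return (UINT64_C(1) << bits) - 1;
}
```

Calling extent_count_bits(63, 16) yields 47, and a 47-bit counter can
hold far more than the 10 billion extents mentioned above, whereas a
signed 32-bit counter cannot.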

Also, XFS has a per-inode xattr extent counter which is 16 bits
wide. The following workload causes the xattr extent counter to
overflow:
1. Create 1 million 255-byte xattrs.
2. Delete 50% of these xattrs in an alternating manner.
3. Try to insert 400,000 new 255-byte xattrs.

Dave tells me that there are instances where a single file has more
than 100 million hardlinks. With parent pointers being stored in
xattrs, we will overflow the signed 16-bit xattr extent counter
when a large number of hardlinks are created. Hence this patchset
extends the on-disk field to 32 bits.

The following changes are made to accomplish this:
1. Add a new incompat superblock flag to prevent older kernels from
   mounting the filesystem. This flag has to be set at mkfs time.
2. Carve out a new 32-bit field from xfs_dinode->di_pad2[]. This field
   holds the most significant 15 bits of the data extent counter.
3. Carve out a new 16-bit field from xfs_dinode->di_pad2[]. This field
   holds the most significant 16 bits of the attr extent counter.
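
As a rough sketch of how the split counters described above could be
recombined, assuming the existing 32-bit and 16-bit fields keep the
least significant bits: the helper names below are hypothetical and the
code is not taken from the patches. On-disk fields are big-endian, so
real code would convert them (for example with be32_to_cpu()) before
this step; the sketch takes values already in CPU byte order.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the 47-bit data fork extent count from a
 * 32-bit low part and a high part of which only 15 bits are used.
 */
static uint64_t data_extent_count(uint32_t lo, uint32_t hi)
{
	/* 32 low bits + 15 high bits = 47 bits in total. */
	assert((hi >> 15) == 0);
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Hypothetical helper: build the 32-bit attr fork extent count from
 * two 16-bit halves.
 */
static uint32_t attr_extent_count(uint16_t lo, uint16_t hi)
{
	return ((uint32_t)hi << 16) | lo;
}
```

With every usable bit set, data_extent_count() returns 2^47 - 1 and
attr_extent_count() returns 2^32 - 1. An inode whose high parts are
zero yields the same value the old single field would have held.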

This patchset can also be obtained from
https://github.com/chandanr/xfsprogs-dev.git at branch
xfs-incompat-extend-extcnt-v1.

Chandan Babu R (4):
  xfsprogs: Introduce xfs_iext_max() helper
  xfsprogs: Introduce xfs_dfork_nextents() helper
  xfsprogs: Extend data/attr fork extent counter width
  xfsprogs: Add wideextcnt mkfs option

 db/bmap.c                  |  8 +--
 db/btdump.c                |  4 +-
 db/check.c                 |  2 +-
 db/field.c                 |  4 --
 db/field.h                 |  2 -
 db/frag.c                  |  8 +--
 db/inode.c                 | 31 +++++++++---
 db/metadump.c              |  4 +-
 include/libxlog.h          |  6 ++-
 libxfs/xfs_bmap.c          | 21 ++++----
 libxfs/xfs_format.h        | 24 +++++----
 libxfs/xfs_inode_buf.c     | 78 +++++++++++++++++++++++-------
 libxfs/xfs_inode_buf.h     |  6 ++-
 libxfs/xfs_inode_fork.c    |  7 +--
 libxfs/xfs_inode_fork.h    | 17 +++++++
 libxfs/xfs_log_format.h    |  8 +--
 libxfs/xfs_types.h         |  6 ++-
 logprint/log_misc.c        | 21 ++++++--
 logprint/log_print_all.c   | 30 +++++++++---
 logprint/log_print_trans.c |  2 +-
 man/man8/mkfs.xfs.8        |  7 +++
 mkfs/xfs_mkfs.c            | 23 +++++++++
 repair/attr_repair.c       |  2 +-
 repair/dinode.c            | 99 ++++++++++++++++++++++----------------
 repair/prefetch.c          |  2 +-
 25 files changed, 292 insertions(+), 130 deletions(-)

-- 
2.28.0



* [PATCH 1/4] xfsprogs: Introduce xfs_iext_max() helper
  2020-08-31 13:00 [PATCH 0/4] xfsprogs: Extend per-inode extent counters Chandan Babu R
@ 2020-08-31 13:00 ` Chandan Babu R
  2020-08-31 13:01 ` [PATCH 2/4] xfsprogs: Introduce xfs_dfork_nextents() helper Chandan Babu R
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Chandan Babu R @ 2020-08-31 13:00 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, david, darrick.wong, bfoster

xfs_iext_max() returns the maximum number of extents possible for either
the data fork or the attribute fork. This helper will be extended further
in a future commit when the maximum extent counts associated with the
data/attribute forks are increased.

No functional changes have been made.
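
One possible shape for the later extension mentioned above is sketched
here. It is a guess, not the code from the follow-up patch: all names
are hypothetical stand-ins, the boolean stands for whatever superblock
feature check is eventually used, and the wide limits are assumed from
the 47-bit and 32-bit widths given in the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for XFS_DATA_FORK and XFS_ATTR_FORK. */
enum sketch_fork { SKETCH_DATA_FORK, SKETCH_ATTR_FORK };

/* Limits of the existing signed 32-bit and signed 16-bit counters. */
#define SKETCH_MAXEXTNUM	((uint64_t)INT32_MAX)
#define SKETCH_MAXAEXTNUM	((uint64_t)INT16_MAX)

/* Assumed limits for the widened 47-bit and 32-bit counters. */
#define SKETCH_MAXEXTNUM_WIDE	((UINT64_C(1) << 47) - 1)
#define SKETCH_MAXAEXTNUM_WIDE	((uint64_t)UINT32_MAX)

/* Pick the per-fork limit depending on whether wide counters are on. */
static uint64_t sketch_iext_max(bool wide_counters, enum sketch_fork whichfork)
{
	assert(whichfork == SKETCH_DATA_FORK || whichfork == SKETCH_ATTR_FORK);

	if (whichfork == SKETCH_DATA_FORK)
		return wide_counters ? SKETCH_MAXEXTNUM_WIDE : SKETCH_MAXEXTNUM;
	return wide_counters ? SKETCH_MAXAEXTNUM_WIDE : SKETCH_MAXAEXTNUM;
}
```

Routing every limit check through one helper means callers such as the
verifier and xfs_repair would not need to know which counter width a
given filesystem uses.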

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 libxfs/xfs_bmap.c       |  9 ++++-----
 libxfs/xfs_inode_buf.c  |  9 ++++-----
 libxfs/xfs_inode_fork.h | 10 ++++++++++
 repair/dinode.c         | 23 ++++++++++++++---------
 4 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/libxfs/xfs_bmap.c b/libxfs/xfs_bmap.c
index 11f3f5f9..dae4d339 100644
--- a/libxfs/xfs_bmap.c
+++ b/libxfs/xfs_bmap.c
@@ -67,13 +67,12 @@ xfs_bmap_compute_maxlevels(
 	 * for both ATTR1 and ATTR2 we have to assume the worst case scenario
 	 * of a minimum size available.
 	 */
-	if (whichfork == XFS_DATA_FORK) {
-		maxleafents = MAXEXTNUM;
+	maxleafents = xfs_iext_max(&mp->m_sb, whichfork);
+	if (whichfork == XFS_DATA_FORK)
 		sz = XFS_BMDR_SPACE_CALC(MINDBTPTRS);
-	} else {
-		maxleafents = MAXAEXTNUM;
+	else
 		sz = XFS_BMDR_SPACE_CALC(MINABTPTRS);
-	}
+
 	maxrootrecs = xfs_bmdr_maxrecs(sz, 0);
 	minleafrecs = mp->m_bmap_dmnr[0];
 	minnoderecs = mp->m_bmap_dmnr[1];
diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
index b65cd0b1..ae71a19e 100644
--- a/libxfs/xfs_inode_buf.c
+++ b/libxfs/xfs_inode_buf.c
@@ -363,6 +363,8 @@ xfs_dinode_verify_fork(
 	int			whichfork)
 {
 	uint32_t		di_nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
+	xfs_extnum_t		max_extents;
+
 
 	switch (XFS_DFORK_FORMAT(dip, whichfork)) {
 	case XFS_DINODE_FMT_LOCAL:
@@ -384,12 +386,9 @@ xfs_dinode_verify_fork(
 			return __this_address;
 		break;
 	case XFS_DINODE_FMT_BTREE:
-		if (whichfork == XFS_ATTR_FORK) {
-			if (di_nextents > MAXAEXTNUM)
-				return __this_address;
-		} else if (di_nextents > MAXEXTNUM) {
+		max_extents = xfs_iext_max(&mp->m_sb, whichfork);
+		if (di_nextents > max_extents)
 			return __this_address;
-		}
 		break;
 	default:
 		return __this_address;
diff --git a/libxfs/xfs_inode_fork.h b/libxfs/xfs_inode_fork.h
index 668ee942..e318dfdd 100644
--- a/libxfs/xfs_inode_fork.h
+++ b/libxfs/xfs_inode_fork.h
@@ -86,6 +86,16 @@ struct xfs_ifork {
 	(XFS_IFORK_FORMAT((ip), (w)) == XFS_DINODE_FMT_EXTENTS || \
 	 XFS_IFORK_FORMAT((ip), (w)) == XFS_DINODE_FMT_BTREE)
 
+static inline xfs_extnum_t xfs_iext_max(struct xfs_sb *sbp, int whichfork)
+{
+	ASSERT(whichfork == XFS_DATA_FORK || whichfork == XFS_ATTR_FORK);
+
+	if (whichfork == XFS_DATA_FORK)
+		return MAXEXTNUM;
+	else
+		return MAXAEXTNUM;
+}
+
 struct xfs_ifork *xfs_iext_state_to_fork(struct xfs_inode *ip, int state);
 
 int		xfs_iformat_fork(struct xfs_inode *, struct xfs_dinode *);
diff --git a/repair/dinode.c b/repair/dinode.c
index 526ecde3..de9a3286 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -1727,13 +1727,16 @@ _("bad attr fork offset %d in inode %" PRIu64 ", max=%zu\n"),
  */
 static int
 process_inode_blocks_and_extents(
-	xfs_dinode_t	*dino,
-	xfs_rfsblock_t	nblocks,
-	uint64_t	nextents,
-	uint64_t	anextents,
-	xfs_ino_t	lino,
-	int		*dirty)
+	struct xfs_mount	*mp,
+	xfs_dinode_t		*dino,
+	xfs_rfsblock_t		nblocks,
+	uint64_t		nextents,
+	uint64_t		anextents,
+	xfs_ino_t		lino,
+	int			*dirty)
 {
+	xfs_extnum_t		max_extents;
+
 	if (nblocks != be64_to_cpu(dino->di_nblocks))  {
 		if (!no_modify)  {
 			do_warn(
@@ -1750,7 +1753,8 @@ _("bad nblocks %llu for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
 		}
 	}
 
-	if (nextents > MAXEXTNUM)  {
+	max_extents = xfs_iext_max(&mp->m_sb, XFS_DATA_FORK);
+	if (nextents > max_extents)  {
 		do_warn(
 _("too many data fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
 			nextents, lino);
@@ -1773,7 +1777,8 @@ _("bad nextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
 		}
 	}
 
-	if (anextents > MAXAEXTNUM)  {
+	max_extents = xfs_iext_max(&mp->m_sb, XFS_ATTR_FORK);
+	if (anextents > max_extents)  {
 		do_warn(
 _("too many attr fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
 			anextents, lino);
@@ -2712,7 +2717,7 @@ _("Bad CoW extent size %u on inode %" PRIu64 ", "),
 	/*
 	 * correct space counters if required
 	 */
-	if (process_inode_blocks_and_extents(dino, totblocks + atotblocks,
+	if (process_inode_blocks_and_extents(mp, dino, totblocks + atotblocks,
 			nextents, anextents, lino, dirty) != 0)
 		goto clear_bad_out;
 
-- 
2.28.0



* [PATCH 2/4] xfsprogs: Introduce xfs_dfork_nextents() helper
  2020-08-31 13:00 [PATCH 0/4] xfsprogs: Extend per-inode extent counters Chandan Babu R
  2020-08-31 13:00 ` [PATCH 1/4] xfsprogs: Introduce xfs_iext_max() helper Chandan Babu R
@ 2020-08-31 13:01 ` Chandan Babu R
  2020-08-31 20:54   ` Darrick J. Wong
  2020-08-31 13:01 ` [PATCH 3/4] xfsprogs: Extend data/attr fork extent counter width Chandan Babu R
  2020-08-31 13:01 ` [PATCH 4/4] xfsprogs: Add wideextcnt mkfs option Chandan Babu R
  3 siblings, 1 reply; 10+ messages in thread
From: Chandan Babu R @ 2020-08-31 13:01 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, david, darrick.wong, bfoster

This commit replaces the macro XFS_DFORK_NEXTENTS() with the helper
function xfs_dfork_nextents(). As of this commit, xfs_dfork_nextents()
returns the same value as XFS_DFORK_NEXTENTS(). A future commit that
extends the inode's extent counter fields will add more logic to this
helper.

This commit also replaces direct accesses to xfs_dinode->di_[a]nextents
with calls to xfs_dfork_nextents().

No functional changes have been made.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 db/bmap.c               |  6 +++---
 db/btdump.c             |  4 ++--
 db/check.c              |  2 +-
 db/frag.c               |  8 ++++---
 db/inode.c              | 14 ++++++------
 db/metadump.c           |  4 ++--
 libxfs/xfs_format.h     |  4 ----
 libxfs/xfs_inode_buf.c  | 26 ++++++++++++++++------
 libxfs/xfs_inode_buf.h  |  2 ++
 libxfs/xfs_inode_fork.c |  3 ++-
 repair/attr_repair.c    |  2 +-
 repair/dinode.c         | 48 +++++++++++++++++++++++------------------
 repair/prefetch.c       |  2 +-
 13 files changed, 74 insertions(+), 51 deletions(-)

diff --git a/db/bmap.c b/db/bmap.c
index fdc70e95..9800a909 100644
--- a/db/bmap.c
+++ b/db/bmap.c
@@ -68,7 +68,7 @@ bmap(
 	ASSERT(fmt == XFS_DINODE_FMT_LOCAL || fmt == XFS_DINODE_FMT_EXTENTS ||
 		fmt == XFS_DINODE_FMT_BTREE);
 	if (fmt == XFS_DINODE_FMT_EXTENTS) {
-		nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
+		nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
 		xp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
 		for (ep = xp; ep < &xp[nextents] && n < nex; ep++) {
 			if (!bmap_one_extent(ep, &curoffset, eoffset, &n, bep))
@@ -158,9 +158,9 @@ bmap_f(
 		push_cur();
 		set_cur_inode(iocur_top->ino);
 		dip = iocur_top->data;
-		if (be32_to_cpu(dip->di_nextents))
+		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK))
 			dfork = 1;
-		if (be16_to_cpu(dip->di_anextents))
+		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
 			afork = 1;
 		pop_cur();
 	}
diff --git a/db/btdump.c b/db/btdump.c
index 920f595b..9ced71d4 100644
--- a/db/btdump.c
+++ b/db/btdump.c
@@ -166,13 +166,13 @@ dump_inode(
 
 	dip = iocur_top->data;
 	if (attrfork) {
-		if (!dip->di_anextents ||
+		if (!xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) ||
 		    dip->di_aformat != XFS_DINODE_FMT_BTREE) {
 			dbprintf(_("attr fork not in btree format\n"));
 			return 0;
 		}
 	} else {
-		if (!dip->di_nextents ||
+		if (!xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK) ||
 		    dip->di_format != XFS_DINODE_FMT_BTREE) {
 			dbprintf(_("data fork not in btree format\n"));
 			return 0;
diff --git a/db/check.c b/db/check.c
index 12c03b6d..2d1823a4 100644
--- a/db/check.c
+++ b/db/check.c
@@ -2686,7 +2686,7 @@ process_exinode(
 	xfs_bmbt_rec_t		*rp;
 
 	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
-	*nex = XFS_DFORK_NEXTENTS(dip, whichfork);
+	*nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
 	if (*nex < 0 || *nex > XFS_DFORK_SIZE(dip, mp, whichfork) /
 						sizeof(xfs_bmbt_rec_t)) {
 		if (!sflag || id->ilist)
diff --git a/db/frag.c b/db/frag.c
index 1cfc6c2c..20fb1306 100644
--- a/db/frag.c
+++ b/db/frag.c
@@ -262,9 +262,11 @@ process_exinode(
 	int			whichfork)
 {
 	xfs_bmbt_rec_t		*rp;
+	xfs_extnum_t		nextents;
 
 	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
-	process_bmbt_reclist(rp, XFS_DFORK_NEXTENTS(dip, whichfork), extmapp);
+	nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
+	process_bmbt_reclist(rp, nextents, extmapp);
 }
 
 static void
@@ -273,9 +275,9 @@ process_fork(
 	int		whichfork)
 {
 	extmap_t	*extmap;
-	int		nex;
+	xfs_extnum_t	nex;
 
-	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
+	nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
 	if (!nex)
 		return;
 	extmap = extmap_alloc(nex);
diff --git a/db/inode.c b/db/inode.c
index 0cff9d63..3853092c 100644
--- a/db/inode.c
+++ b/db/inode.c
@@ -271,7 +271,7 @@ inode_a_bmx_count(
 		return 0;
 	ASSERT((char *)XFS_DFORK_APTR(dip) - (char *)dip == byteize(startoff));
 	return dip->di_aformat == XFS_DINODE_FMT_EXTENTS ?
-		be16_to_cpu(dip->di_anextents) : 0;
+		xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) : 0;
 }
 
 static int
@@ -325,6 +325,7 @@ inode_a_size(
 {
 	xfs_attr_shortform_t	*asf;
 	xfs_dinode_t		*dip;
+	xfs_extnum_t		nextents;
 
 	ASSERT(startoff == 0);
 	ASSERT(idx == 0);
@@ -334,8 +335,8 @@ inode_a_size(
 		asf = (xfs_attr_shortform_t *)XFS_DFORK_APTR(dip);
 		return bitize(be16_to_cpu(asf->hdr.totsize));
 	case XFS_DINODE_FMT_EXTENTS:
-		return (int)be16_to_cpu(dip->di_anextents) *
-							bitsz(xfs_bmbt_rec_t);
+		nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
+		return (int)(nextents * bitsz(xfs_bmbt_rec_t));
 	case XFS_DINODE_FMT_BTREE:
 		return bitize((int)XFS_DFORK_ASIZE(dip, mp));
 	default:
@@ -496,7 +497,7 @@ inode_u_bmx_count(
 	dip = obj;
 	ASSERT((char *)XFS_DFORK_DPTR(dip) - (char *)dip == byteize(startoff));
 	return dip->di_format == XFS_DINODE_FMT_EXTENTS ?
-		be32_to_cpu(dip->di_nextents) : 0;
+		xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK) : 0;
 }
 
 static int
@@ -582,6 +583,7 @@ inode_u_size(
 	int		idx)
 {
 	xfs_dinode_t	*dip;
+	xfs_extnum_t	nextents;
 
 	ASSERT(startoff == 0);
 	ASSERT(idx == 0);
@@ -592,8 +594,8 @@ inode_u_size(
 	case XFS_DINODE_FMT_LOCAL:
 		return bitize((int)be64_to_cpu(dip->di_size));
 	case XFS_DINODE_FMT_EXTENTS:
-		return (int)be32_to_cpu(dip->di_nextents) *
-						bitsz(xfs_bmbt_rec_t);
+		nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK);
+		return (int)(nextents * bitsz(xfs_bmbt_rec_t));
 	case XFS_DINODE_FMT_BTREE:
 		return bitize((int)XFS_DFORK_DSIZE(dip, mp));
 	case XFS_DINODE_FMT_UUID:
diff --git a/db/metadump.c b/db/metadump.c
index e5cb3aa5..6a6757a2 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -2282,7 +2282,7 @@ process_exinode(
 
 	whichfork = (itype == TYP_ATTR) ? XFS_ATTR_FORK : XFS_DATA_FORK;
 
-	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
+	nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
 	used = nex * sizeof(xfs_bmbt_rec_t);
 	if (nex < 0 || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
 		if (show_warnings)
@@ -2335,7 +2335,7 @@ static int
 process_dev_inode(
 	xfs_dinode_t		*dip)
 {
-	if (XFS_DFORK_NEXTENTS(dip, XFS_DATA_FORK)) {
+	if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK)) {
 		if (show_warnings)
 			print_warning("inode %llu has unexpected extents",
 				      (unsigned long long)cur_ino);
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index a738cd8b..188deada 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -993,10 +993,6 @@ enum xfs_dinode_fmt {
 	((w) == XFS_DATA_FORK ? \
 		(dip)->di_format : \
 		(dip)->di_aformat)
-#define XFS_DFORK_NEXTENTS(dip,w) \
-	((w) == XFS_DATA_FORK ? \
-		be32_to_cpu((dip)->di_nextents) : \
-		be16_to_cpu((dip)->di_anextents))
 
 /*
  * For block and character special files the 32bit dev_t is stored at the
diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
index ae71a19e..d5584372 100644
--- a/libxfs/xfs_inode_buf.c
+++ b/libxfs/xfs_inode_buf.c
@@ -362,9 +362,10 @@ xfs_dinode_verify_fork(
 	struct xfs_mount	*mp,
 	int			whichfork)
 {
-	uint32_t		di_nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
 	xfs_extnum_t		max_extents;
+	uint32_t		di_nextents;
 
+	di_nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
 
 	switch (XFS_DFORK_FORMAT(dip, whichfork)) {
 	case XFS_DINODE_FMT_LOCAL:
@@ -396,6 +397,15 @@ xfs_dinode_verify_fork(
 	return NULL;
 }
 
+xfs_extnum_t
+xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip, int whichfork)
+{
+	if (whichfork == XFS_DATA_FORK)
+		return be32_to_cpu(dip->di_nextents);
+	else
+		return be16_to_cpu(dip->di_anextents);
+}
+
 static xfs_failaddr_t
 xfs_dinode_verify_forkoff(
 	struct xfs_dinode	*dip,
@@ -432,6 +442,8 @@ xfs_dinode_verify(
 	uint16_t		flags;
 	uint64_t		flags2;
 	uint64_t		di_size;
+	xfs_extnum_t            nextents;
+	int64_t			nblocks;
 
 	if (dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC))
 		return __this_address;
@@ -462,10 +474,12 @@ xfs_dinode_verify(
 	if ((S_ISLNK(mode) || S_ISDIR(mode)) && di_size == 0)
 		return __this_address;
 
-	/* Fork checks carried over from xfs_iformat_fork */
-	if (mode &&
-	    be32_to_cpu(dip->di_nextents) + be16_to_cpu(dip->di_anextents) >
-			be64_to_cpu(dip->di_nblocks))
+	nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK);
+	nextents += xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
+	nblocks = be64_to_cpu(dip->di_nblocks);
+
+        /* Fork checks carried over from xfs_iformat_fork */
+	if (mode && nextents > nblocks)
 		return __this_address;
 
 	if (mode && XFS_DFORK_BOFF(dip) > mp->m_sb.sb_inodesize)
@@ -522,7 +536,7 @@ xfs_dinode_verify(
 		default:
 			return __this_address;
 		}
-		if (dip->di_anextents)
+		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
 			return __this_address;
 	}
 
diff --git a/libxfs/xfs_inode_buf.h b/libxfs/xfs_inode_buf.h
index 9b373dcf..f97b3428 100644
--- a/libxfs/xfs_inode_buf.h
+++ b/libxfs/xfs_inode_buf.h
@@ -71,5 +71,7 @@ xfs_failaddr_t xfs_inode_validate_extsize(struct xfs_mount *mp,
 xfs_failaddr_t xfs_inode_validate_cowextsize(struct xfs_mount *mp,
 		uint32_t cowextsize, uint16_t mode, uint16_t flags,
 		uint64_t flags2);
+xfs_extnum_t xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip,
+			int whichfork);
 
 #endif	/* __XFS_INODE_BUF_H__ */
diff --git a/libxfs/xfs_inode_fork.c b/libxfs/xfs_inode_fork.c
index 80ba6c12..8c32f993 100644
--- a/libxfs/xfs_inode_fork.c
+++ b/libxfs/xfs_inode_fork.c
@@ -205,9 +205,10 @@ xfs_iformat_extents(
 	int			whichfork)
 {
 	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_sb		*sbp = &mp->m_sb;
 	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
 	int			state = xfs_bmap_fork_to_state(whichfork);
-	int			nex = XFS_DFORK_NEXTENTS(dip, whichfork);
+	xfs_extnum_t		nex = xfs_dfork_nextents(sbp, dip, whichfork);
 	int			size = nex * sizeof(xfs_bmbt_rec_t);
 	struct xfs_iext_cursor	icur;
 	struct xfs_bmbt_rec	*dp;
diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 6cec0f70..b6ca564b 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -1083,7 +1083,7 @@ process_longform_attr(
 	bno = blkmap_get(blkmap, 0);
 	if (bno == NULLFSBLOCK) {
 		if (dip->di_aformat == XFS_DINODE_FMT_EXTENTS &&
-				be16_to_cpu(dip->di_anextents) == 0)
+			xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) == 0)
 			return(0); /* the kernel can handle this state */
 		do_warn(
 	_("block 0 of inode %" PRIu64 " attribute fork is missing\n"),
diff --git a/repair/dinode.c b/repair/dinode.c
index de9a3286..98bb4a17 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -68,7 +68,7 @@ _("clearing inode %" PRIu64 " attributes\n"), ino_num);
 		fprintf(stderr,
 _("would have cleared inode %" PRIu64 " attributes\n"), ino_num);
 
-	if (be16_to_cpu(dino->di_anextents) != 0)  {
+	if (xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK) != 0) {
 		if (no_modify)
 			return(1);
 		dino->di_anextents = cpu_to_be16(0);
@@ -882,7 +882,7 @@ process_exinode(
 	lino = XFS_AGINO_TO_INO(mp, agno, ino);
 	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
 	*tot = 0;
-	numrecs = XFS_DFORK_NEXTENTS(dip, whichfork);
+	numrecs = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
 
 	/*
 	 * We've already decided on the maximum number of extents on the inode,
@@ -981,7 +981,7 @@ _("mismatch between format (%d) and size (%" PRId64 ") in symlink inode %" PRIu6
 	}
 
 	rp = (xfs_bmbt_rec_t *)XFS_DFORK_DPTR(dino);
-	numrecs = be32_to_cpu(dino->di_nextents);
+	numrecs = xfs_dfork_nextents(&mp->m_sb, dino, XFS_DATA_FORK);
 
 	/*
 	 * the max # of extents in a symlink inode is equal to the
@@ -1496,6 +1496,8 @@ process_check_sb_inodes(
 	int		*type,
 	int		*dirty)
 {
+	xfs_extnum_t	nextents;
+
 	if (lino == mp->m_sb.sb_rootino) {
 		if (*type != XR_INO_DIR)  {
 			do_warn(_("root inode %" PRIu64 " has bad type 0x%x\n"),
@@ -1550,10 +1552,12 @@ _("realtime summary inode %" PRIu64 " has bad type 0x%x, "),
 				do_warn(_("would reset to regular file\n"));
 			}
 		}
-		if (mp->m_sb.sb_rblocks == 0 && dinoc->di_nextents != 0)  {
+
+		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
+		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
 			do_warn(
 _("bad # of extents (%u) for realtime summary inode %" PRIu64 "\n"),
-				be32_to_cpu(dinoc->di_nextents), lino);
+				nextents, lino);
 			return 1;
 		}
 		return 0;
@@ -1571,10 +1575,12 @@ _("realtime bitmap inode %" PRIu64 " has bad type 0x%x, "),
 				do_warn(_("would reset to regular file\n"));
 			}
 		}
-		if (mp->m_sb.sb_rblocks == 0 && dinoc->di_nextents != 0)  {
+
+		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
+		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
 			do_warn(
 _("bad # of extents (%u) for realtime bitmap inode %" PRIu64 "\n"),
-				be32_to_cpu(dinoc->di_nextents), lino);
+				nextents, lino);
 			return 1;
 		}
 		return 0;
@@ -1735,6 +1741,7 @@ process_inode_blocks_and_extents(
 	xfs_ino_t		lino,
 	int			*dirty)
 {
+	xfs_extnum_t		dnextents;
 	xfs_extnum_t		max_extents;
 
 	if (nblocks != be64_to_cpu(dino->di_nblocks))  {
@@ -1760,20 +1767,19 @@ _("too many data fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
 			nextents, lino);
 		return 1;
 	}
-	if (nextents != be32_to_cpu(dino->di_nextents))  {
+
+	dnextents = xfs_dfork_nextents(&mp->m_sb, dino, XFS_DATA_FORK);
+	if (nextents != dnextents)  {
 		if (!no_modify)  {
 			do_warn(
 _("correcting nextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
-				lino,
-				be32_to_cpu(dino->di_nextents),
-				nextents);
+				lino, dnextents, nextents);
 			dino->di_nextents = cpu_to_be32(nextents);
 			*dirty = 1;
 		} else  {
 			do_warn(
 _("bad nextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
-				be32_to_cpu(dino->di_nextents),
-				lino, nextents);
+				dnextents, lino, nextents);
 		}
 	}
 
@@ -1784,19 +1790,19 @@ _("too many attr fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
 			anextents, lino);
 		return 1;
 	}
-	if (anextents != be16_to_cpu(dino->di_anextents))  {
+
+	dnextents = xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK);
+	if (anextents != dnextents)  {
 		if (!no_modify)  {
 			do_warn(
 _("correcting anextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
-				lino,
-				be16_to_cpu(dino->di_anextents), anextents);
+				lino, dnextents, anextents);
 			dino->di_anextents = cpu_to_be16(anextents);
 			*dirty = 1;
 		} else  {
 			do_warn(
 _("bad anextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
-				be16_to_cpu(dino->di_anextents),
-				lino, anextents);
+				dnextents, lino, anextents);
 		}
 	}
 
@@ -1831,14 +1837,14 @@ process_inode_data_fork(
 {
 	xfs_ino_t	lino = XFS_AGINO_TO_INO(mp, agno, ino);
 	int		err = 0;
-	int		nex;
+	xfs_extnum_t	nex;
 
 	/*
 	 * extent count on disk is only valid for positive values. The kernel
 	 * uses negative values in memory. hence if we see negative numbers
 	 * here, trash it!
 	 */
-	nex = be32_to_cpu(dino->di_nextents);
+	nex = xfs_dfork_nextents(&mp->m_sb, dino, XFS_DATA_FORK);
 	if (nex < 0)
 		*nextents = 1;
 	else
@@ -1959,7 +1965,7 @@ process_inode_attr_fork(
 		return 0;
 	}
 
-	*anextents = be16_to_cpu(dino->di_anextents);
+	*anextents = xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK);
 	if (*anextents > be64_to_cpu(dino->di_nblocks))
 		*anextents = 1;
 
diff --git a/repair/prefetch.c b/repair/prefetch.c
index 686bf7be..6eb7c06b 100644
--- a/repair/prefetch.c
+++ b/repair/prefetch.c
@@ -393,7 +393,7 @@ pf_read_exinode(
 	xfs_dinode_t		*dino)
 {
 	pf_read_bmbt_reclist(args, (xfs_bmbt_rec_t *)XFS_DFORK_DPTR(dino),
-			be32_to_cpu(dino->di_nextents));
+			xfs_dfork_nextents(&mp->m_sb, dino, XFS_DATA_FORK));
 }
 
 static void
-- 
2.28.0



* [PATCH 3/4] xfsprogs: Extend data/attr fork extent counter width
  2020-08-31 13:00 [PATCH 0/4] xfsprogs: Extend per-inode extent counters Chandan Babu R
  2020-08-31 13:00 ` [PATCH 1/4] xfsprogs: Introduce xfs_iext_max() helper Chandan Babu R
  2020-08-31 13:01 ` [PATCH 2/4] xfsprogs: Introduce xfs_dfork_nextents() helper Chandan Babu R
@ 2020-08-31 13:01 ` Chandan Babu R
  2020-08-31 21:00   ` Darrick J. Wong
  2020-08-31 13:01 ` [PATCH 4/4] xfsprogs: Add wideextcnt mkfs option Chandan Babu R
  3 siblings, 1 reply; 10+ messages in thread
From: Chandan Babu R @ 2020-08-31 13:01 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, david, darrick.wong, bfoster

The kernel commit "xfs: fix inode fork extent count overflow"
(3f8a4f1d876d3e3e49e50b0396eaffcc4ba71b08) mentions that it should be
possible to create 10 billion data fork extents. However, the
corresponding on-disk field has a signed 32-bit type. Hence this
commit extends the per-inode data extent counter to 47 bits. A
width of 47 bits was chosen because:
Maximum file size = 2^63 bytes.
Maximum extent count when using 64k block size = 2^63 / 2^16 = 2^47.

Also, XFS has a per-inode xattr extent counter which is 16 bits
wide. The following workload causes the xattr extent counter to
overflow:
1. Create 1 million 255-byte xattrs.
2. Delete 50% of these xattrs in an alternating manner.
3. Try to insert 400,000 new 255-byte xattrs.

Dave tells me that there are instances where a single file has more than
100 million hardlinks. With parent pointers being stored in xattrs, we
will overflow the signed 16-bit xattr extent counter when a large
number of hardlinks are created. Hence this commit extends the on-disk
field to 32 bits.

The following changes are made to accomplish this:

1. Add a new incompat superblock flag to prevent older kernels from
   mounting the filesystem. This flag has to be set at mkfs time.
2. Carve out a new 32-bit field from xfs_dinode->di_pad2[]. This field
   holds the most significant 15 bits of the data extent counter.
3. Carve out a new 16-bit field from xfs_dinode->di_pad2[]. This field
   holds the most significant 16 bits of the attr extent counter.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 db/bmap.c                  |  2 +-
 db/field.c                 |  4 ---
 db/field.h                 |  2 --
 db/inode.c                 | 17 ++++++++++--
 include/libxlog.h          |  6 +++--
 libxfs/xfs_bmap.c          | 12 +++++----
 libxfs/xfs_format.h        | 20 ++++++++++----
 libxfs/xfs_inode_buf.c     | 53 ++++++++++++++++++++++++++++++--------
 libxfs/xfs_inode_buf.h     |  4 +--
 libxfs/xfs_inode_fork.c    |  4 +--
 libxfs/xfs_inode_fork.h    | 15 ++++++++---
 libxfs/xfs_log_format.h    |  8 +++---
 libxfs/xfs_types.h         |  6 +++--
 logprint/log_misc.c        | 21 +++++++++++----
 logprint/log_print_all.c   | 30 ++++++++++++++++-----
 logprint/log_print_trans.c |  2 +-
 repair/dinode.c            | 28 ++++++++++++--------
 17 files changed, 165 insertions(+), 69 deletions(-)

diff --git a/db/bmap.c b/db/bmap.c
index 9800a909..c374fa48 100644
--- a/db/bmap.c
+++ b/db/bmap.c
@@ -47,7 +47,7 @@ bmap(
 	int			n;
 	int			nex;
 	xfs_fsblock_t		nextbno;
-	int			nextents;
+	xfs_extnum_t		nextents;
 	xfs_bmbt_ptr_t		*pp;
 	xfs_bmdr_block_t	*rblock;
 	typnm_t			typ;
diff --git a/db/field.c b/db/field.c
index aa0154d8..2d707e4e 100644
--- a/db/field.c
+++ b/db/field.c
@@ -25,8 +25,6 @@
 #include "symlink.h"
 
 const ftattr_t	ftattrtab[] = {
-	{ FLDT_AEXTNUM, "aextnum", fp_num, "%d", SI(bitsz(xfs_aextnum_t)),
-	  FTARG_SIGNED, NULL, NULL },
 	{ FLDT_AGBLOCK, "agblock", fp_num, "%u", SI(bitsz(xfs_agblock_t)),
 	  FTARG_DONULL, fa_agblock, NULL },
 	{ FLDT_AGBLOCKNZ, "agblocknz", fp_num, "%u", SI(bitsz(xfs_agblock_t)),
@@ -300,8 +298,6 @@ const ftattr_t	ftattrtab[] = {
 	  FTARG_DONULL, fa_drtbno, NULL },
 	{ FLDT_EXTLEN, "extlen", fp_num, "%u", SI(bitsz(xfs_extlen_t)), 0, NULL,
 	  NULL },
-	{ FLDT_EXTNUM, "extnum", fp_num, "%d", SI(bitsz(xfs_extnum_t)),
-	  FTARG_SIGNED, NULL, NULL },
 	{ FLDT_FSIZE, "fsize", fp_num, "%lld", SI(bitsz(xfs_fsize_t)),
 	  FTARG_SIGNED, NULL, NULL },
 	{ FLDT_INO, "ino", fp_num, "%llu", SI(bitsz(xfs_ino_t)), FTARG_DONULL,
diff --git a/db/field.h b/db/field.h
index 15065373..7ebc9a1e 100644
--- a/db/field.h
+++ b/db/field.h
@@ -5,7 +5,6 @@
  */
 
 typedef enum fldt	{
-	FLDT_AEXTNUM,
 	FLDT_AGBLOCK,
 	FLDT_AGBLOCKNZ,
 	FLDT_AGF,
@@ -143,7 +142,6 @@ typedef enum fldt	{
 	FLDT_DRFSBNO,
 	FLDT_DRTBNO,
 	FLDT_EXTLEN,
-	FLDT_EXTNUM,
 	FLDT_FSIZE,
 	FLDT_INO,
 	FLDT_INOBT,
diff --git a/db/inode.c b/db/inode.c
index 3853092c..50a942b6 100644
--- a/db/inode.c
+++ b/db/inode.c
@@ -37,6 +37,7 @@ static int	inode_u_muuid_count(void *obj, int startoff);
 static int	inode_u_sfdir2_count(void *obj, int startoff);
 static int	inode_u_sfdir3_count(void *obj, int startoff);
 static int	inode_u_symlink_count(void *obj, int startoff);
+static int	inode_v3_wideextcnt_count(void *obj, int startoff);
 
 static const cmdinfo_t	inode_cmd =
 	{ "inode", NULL, inode_f, 0, 1, 1, "[inode#]",
@@ -100,8 +101,8 @@ const field_t	inode_core_flds[] = {
 	{ "size", FLDT_FSIZE, OI(COFF(size)), C1, 0, TYP_NONE },
 	{ "nblocks", FLDT_DRFSBNO, OI(COFF(nblocks)), C1, 0, TYP_NONE },
 	{ "extsize", FLDT_EXTLEN, OI(COFF(extsize)), C1, 0, TYP_NONE },
-	{ "nextents", FLDT_EXTNUM, OI(COFF(nextents)), C1, 0, TYP_NONE },
-	{ "naextents", FLDT_AEXTNUM, OI(COFF(anextents)), C1, 0, TYP_NONE },
+	{ "nextents_lo", FLDT_UINT32D, OI(COFF(nextents_lo)), C1, 0, TYP_NONE },
+	{ "naextents_lo", FLDT_UINT16D, OI(COFF(anextents_lo)), C1, 0, TYP_NONE },
 	{ "forkoff", FLDT_UINT8D, OI(COFF(forkoff)), C1, 0, TYP_NONE },
 	{ "aformat", FLDT_DINODE_FMT, OI(COFF(aformat)), C1, 0, TYP_NONE },
 	{ "dmevmask", FLDT_UINT32X, OI(COFF(dmevmask)), C1, 0, TYP_NONE },
@@ -162,6 +163,10 @@ const field_t	inode_v3_flds[] = {
 	{ "lsn", FLDT_UINT64X, OI(COFF(lsn)), C1, 0, TYP_NONE },
 	{ "flags2", FLDT_UINT64X, OI(COFF(flags2)), C1, 0, TYP_NONE },
 	{ "cowextsize", FLDT_EXTLEN, OI(COFF(cowextsize)), C1, 0, TYP_NONE },
+	{ "nextents_hi", FLDT_UINT32D, OI(COFF(nextents_hi)),
+	  inode_v3_wideextcnt_count, FLD_COUNT, TYP_NONE },
+	{ "naextents_hi", FLDT_UINT16D, OI(COFF(anextents_hi)),
+	  inode_v3_wideextcnt_count, FLD_COUNT, TYP_NONE },
 	{ "pad2", FLDT_UINT8X, OI(OFF(pad2)), CI(12), FLD_ARRAY|FLD_SKIPALL, TYP_NONE },
 	{ "crtime", FLDT_TIMESTAMP, OI(COFF(crtime)), C1, 0, TYP_NONE },
 	{ "inumber", FLDT_INO, OI(COFF(ino)), C1, 0, TYP_NONE },
@@ -396,6 +401,14 @@ inode_core_projid_count(
 	return dic->di_version >= 2;
 }
 
+static int
+inode_v3_wideextcnt_count(
+	void		*obj,
+	int		startoff)
+{
+	return xfs_sb_version_haswideextcnt(&mp->m_sb);
+}
+
 static int
 inode_f(
 	int		argc,
diff --git a/include/libxlog.h b/include/libxlog.h
index 5e94fa1e..1aab108c 100644
--- a/include/libxlog.h
+++ b/include/libxlog.h
@@ -89,13 +89,15 @@ extern int	xlog_find_tail(struct xlog *log, xfs_daddr_t *head_blk,
 
 extern int	xlog_recover(struct xlog *log, int readonly);
 extern void	xlog_recover_print_data(char *p, int len);
-extern void	xlog_recover_print_logitem(xlog_recover_item_t *item);
+extern void	xlog_recover_print_logitem(struct xlog *log,
+			xlog_recover_item_t *item);
 extern void	xlog_recover_print_trans_head(struct xlog_recover *tr);
 extern int	xlog_print_find_oldest(struct xlog *log, xfs_daddr_t *last_blk);
 
 /* for transactional view */
 extern void	xlog_recover_print_trans_head(struct xlog_recover *tr);
-extern void	xlog_recover_print_trans(struct xlog_recover *trans,
+extern void	xlog_recover_print_trans(struct xlog *log,
+				struct xlog_recover *trans,
 				struct list_head *itemq, int print);
 extern int	xlog_do_recovery_pass(struct xlog *log, xfs_daddr_t head_blk,
 				xfs_daddr_t tail_blk, int pass);
diff --git a/libxfs/xfs_bmap.c b/libxfs/xfs_bmap.c
index dae4d339..118b6e96 100644
--- a/libxfs/xfs_bmap.c
+++ b/libxfs/xfs_bmap.c
@@ -45,19 +45,21 @@ xfs_bmap_compute_maxlevels(
 	xfs_mount_t	*mp,		/* file system mount structure */
 	int		whichfork)	/* data or attr fork */
 {
+	xfs_extnum_t	maxleafents;	/* max leaf entries possible */
 	int		level;		/* btree level */
 	uint		maxblocks;	/* max blocks at this level */
-	uint		maxleafents;	/* max leaf entries possible */
 	int		maxrootrecs;	/* max records in root block */
 	int		minleafrecs;	/* min records in leaf block */
 	int		minnoderecs;	/* min records in node block */
 	int		sz;		/* root block size */
 
 	/*
-	 * The maximum number of extents in a file, hence the maximum
-	 * number of leaf entries, is controlled by the type of di_nextents
-	 * (a signed 32-bit number, xfs_extnum_t), or by di_anextents
-	 * (a signed 16-bit number, xfs_aextnum_t).
+	 * The maximum number of extents in a file, hence the maximum number of
+	 * leaf entries, is controlled by the size of the on-disk extent count,
+	 * either a signed 32-bit number for the data fork, or a signed 16-bit
+	 * number for the attr fork. With mkfs.xfs' wideextcnt option
+	 * enabled, the data fork extent count is an unsigned 47-bit value,
+	 * while the attr fork extent count is an unsigned 32-bit value.
 	 *
 	 * Note that we can no longer assume that if we are in ATTR1 that
 	 * the fork offset of all the inodes will be
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 188deada..ab44bcb4 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -464,11 +464,13 @@ xfs_sb_has_ro_compat_feature(
 
 #define XFS_SB_FEAT_INCOMPAT_FTYPE	(1 << 0)	/* filetype in dirent */
 #define XFS_SB_FEAT_INCOMPAT_SPINODES	(1 << 1)	/* sparse inode chunks */
-#define XFS_SB_FEAT_INCOMPAT_META_UUID	(1 << 2)	/* metadata UUID */
+#define XFS_SB_FEAT_INCOMPAT_META_UUID	(1 << 2)        /* metadata UUID */
+#define XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT	(1 << 3)	/* Wider data/attr fork extent counters */
 #define XFS_SB_FEAT_INCOMPAT_ALL \
 		(XFS_SB_FEAT_INCOMPAT_FTYPE|	\
 		 XFS_SB_FEAT_INCOMPAT_SPINODES|	\
-		 XFS_SB_FEAT_INCOMPAT_META_UUID)
+		 XFS_SB_FEAT_INCOMPAT_META_UUID| \
+		 XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT)
 
 #define XFS_SB_FEAT_INCOMPAT_UNKNOWN	~XFS_SB_FEAT_INCOMPAT_ALL
 static inline bool
@@ -551,6 +553,12 @@ static inline bool xfs_sb_version_hasmetauuid(struct xfs_sb *sbp)
 		(sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_META_UUID);
 }
 
+static inline bool xfs_sb_version_haswideextcnt(struct xfs_sb *sbp)
+{
+	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
+		(sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT);
+}
+
 static inline bool xfs_sb_version_hasrmapbt(struct xfs_sb *sbp)
 {
 	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
@@ -873,8 +881,8 @@ typedef struct xfs_dinode {
 	__be64		di_size;	/* number of bytes in file */
 	__be64		di_nblocks;	/* # of direct & btree blocks used */
 	__be32		di_extsize;	/* basic/minimum extent size for file */
-	__be32		di_nextents;	/* number of extents in data fork */
-	__be16		di_anextents;	/* number of extents in attribute fork*/
+	__be32		di_nextents_lo;	/* lower part of data fork extent count */
+	__be16		di_anextents_lo;/* lower part of attr fork extent count */
 	__u8		di_forkoff;	/* attr fork offs, <<3 for 64b align */
 	__s8		di_aformat;	/* format of attr fork's data */
 	__be32		di_dmevmask;	/* DMIG event mask */
@@ -891,7 +899,9 @@ typedef struct xfs_dinode {
 	__be64		di_lsn;		/* flush sequence */
 	__be64		di_flags2;	/* more random flags */
 	__be32		di_cowextsize;	/* basic cow extent size for file */
-	__u8		di_pad2[12];	/* more padding for future expansion */
+	__be32		di_nextents_hi; /* higher part of data fork extent count */
+	__be16		di_anextents_hi;/* higher part of attr fork extent count */
+	__u8		di_pad2[6];	/* more padding for future expansion */
 
 	/* fields only written to during inode creation */
 	xfs_timestamp_t	di_crtime;	/* time created */
diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
index d5584372..219d0234 100644
--- a/libxfs/xfs_inode_buf.c
+++ b/libxfs/xfs_inode_buf.c
@@ -188,6 +188,7 @@ xfs_inode_from_disk(
 	struct xfs_inode	*ip,
 	struct xfs_dinode	*from)
 {
+	struct xfs_sb		*sbp = &ip->i_mount->m_sb;
 	struct xfs_icdinode	*to = &ip->i_d;
 	struct inode		*inode = VFS_I(ip);
 
@@ -228,8 +229,8 @@ xfs_inode_from_disk(
 	to->di_size = be64_to_cpu(from->di_size);
 	to->di_nblocks = be64_to_cpu(from->di_nblocks);
 	to->di_extsize = be32_to_cpu(from->di_extsize);
-	to->di_nextents = be32_to_cpu(from->di_nextents);
-	to->di_anextents = be16_to_cpu(from->di_anextents);
+	to->di_nextents = be32_to_cpu(from->di_nextents_lo);
+	to->di_anextents = be16_to_cpu(from->di_anextents_lo);
 	to->di_forkoff = from->di_forkoff;
 	to->di_aformat	= from->di_aformat;
 	to->di_dmevmask	= be32_to_cpu(from->di_dmevmask);
@@ -243,6 +244,13 @@ xfs_inode_from_disk(
 		to->di_crtime.tv_nsec = be32_to_cpu(from->di_crtime.t_nsec);
 		to->di_flags2 = be64_to_cpu(from->di_flags2);
 		to->di_cowextsize = be32_to_cpu(from->di_cowextsize);
+
+		if (xfs_sb_version_haswideextcnt(sbp)) {
+			to->di_nextents |=
+				((uint64_t)(be32_to_cpu(from->di_nextents_hi)) << 32);
+			to->di_anextents |=
+				((uint32_t)(be16_to_cpu(from->di_anextents_hi)) << 16);
+		}
 	}
 }
 
@@ -252,6 +260,7 @@ xfs_inode_to_disk(
 	struct xfs_dinode	*to,
 	xfs_lsn_t		lsn)
 {
+	struct xfs_sb		*sbp = &ip->i_mount->m_sb;
 	struct xfs_icdinode	*from = &ip->i_d;
 	struct inode		*inode = VFS_I(ip);
 
@@ -278,8 +287,8 @@ xfs_inode_to_disk(
 	to->di_size = cpu_to_be64(from->di_size);
 	to->di_nblocks = cpu_to_be64(from->di_nblocks);
 	to->di_extsize = cpu_to_be32(from->di_extsize);
-	to->di_nextents = cpu_to_be32(from->di_nextents);
-	to->di_anextents = cpu_to_be16(from->di_anextents);
+	to->di_nextents_lo = cpu_to_be32(from->di_nextents);
+	to->di_anextents_lo = cpu_to_be16(from->di_anextents);
 	to->di_forkoff = from->di_forkoff;
 	to->di_aformat = from->di_aformat;
 	to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
@@ -293,6 +302,12 @@ xfs_inode_to_disk(
 		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.tv_nsec);
 		to->di_flags2 = cpu_to_be64(from->di_flags2);
 		to->di_cowextsize = cpu_to_be32(from->di_cowextsize);
+		if (xfs_sb_version_haswideextcnt(sbp)) {
+			to->di_nextents_hi
+				= cpu_to_be32(from->di_nextents >> 32);
+			to->di_anextents_hi
+				= cpu_to_be16(from->di_anextents >> 16);
+		}
 		to->di_ino = cpu_to_be64(ip->i_ino);
 		to->di_lsn = cpu_to_be64(lsn);
 		memset(to->di_pad2, 0, sizeof(to->di_pad2));
@@ -306,9 +321,12 @@ xfs_inode_to_disk(
 
 void
 xfs_log_dinode_to_disk(
+	struct xfs_mount	*mp,
 	struct xfs_log_dinode	*from,
 	struct xfs_dinode	*to)
 {
+	struct xfs_sb		*sbp = &mp->m_sb;
+
 	to->di_magic = cpu_to_be16(from->di_magic);
 	to->di_mode = cpu_to_be16(from->di_mode);
 	to->di_version = from->di_version;
@@ -331,8 +349,8 @@ xfs_log_dinode_to_disk(
 	to->di_size = cpu_to_be64(from->di_size);
 	to->di_nblocks = cpu_to_be64(from->di_nblocks);
 	to->di_extsize = cpu_to_be32(from->di_extsize);
-	to->di_nextents = cpu_to_be32(from->di_nextents);
-	to->di_anextents = cpu_to_be16(from->di_anextents);
+	to->di_nextents_lo = cpu_to_be32(from->di_nextents_lo);
+	to->di_anextents_lo = cpu_to_be16(from->di_anextents_lo);
 	to->di_forkoff = from->di_forkoff;
 	to->di_aformat = from->di_aformat;
 	to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
@@ -346,6 +364,10 @@ xfs_log_dinode_to_disk(
 		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.t_nsec);
 		to->di_flags2 = cpu_to_be64(from->di_flags2);
 		to->di_cowextsize = cpu_to_be32(from->di_cowextsize);
+		if (xfs_sb_version_haswideextcnt(sbp)) {
+			to->di_nextents_hi = cpu_to_be32(from->di_nextents_hi);
+			to->di_anextents_hi = cpu_to_be16(from->di_anextents_hi);
+		}
 		to->di_ino = cpu_to_be64(from->di_ino);
 		to->di_lsn = cpu_to_be64(from->di_lsn);
 		memcpy(to->di_pad2, from->di_pad2, sizeof(to->di_pad2));
@@ -363,7 +385,7 @@ xfs_dinode_verify_fork(
 	int			whichfork)
 {
 	xfs_extnum_t		max_extents;
-	uint32_t		di_nextents;
+	xfs_extnum_t		di_nextents;
 
 	di_nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
 
@@ -400,10 +422,19 @@ xfs_dinode_verify_fork(
 xfs_extnum_t
 xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip, int whichfork)
 {
-	if (whichfork == XFS_DATA_FORK)
-		return be32_to_cpu(dip->di_nextents);
-	else
-		return be16_to_cpu(dip->di_anextents);
+	xfs_extnum_t nextents;
+
+	if (whichfork == XFS_DATA_FORK) {
+		nextents = be32_to_cpu(dip->di_nextents_lo);
+		if (xfs_sb_version_haswideextcnt(sbp))
+			nextents |= ((uint64_t)be32_to_cpu(dip->di_nextents_hi) << 32);
+	} else {
+		nextents = be16_to_cpu(dip->di_anextents_lo);
+		if (xfs_sb_version_haswideextcnt(sbp))
+			nextents |= ((uint32_t)be16_to_cpu(dip->di_anextents_hi) << 16);
+	}
+
+	return nextents;
 }
 
 static xfs_failaddr_t
diff --git a/libxfs/xfs_inode_buf.h b/libxfs/xfs_inode_buf.h
index f97b3428..0dee0235 100644
--- a/libxfs/xfs_inode_buf.h
+++ b/libxfs/xfs_inode_buf.h
@@ -55,8 +55,8 @@ void	xfs_dinode_calc_crc(struct xfs_mount *, struct xfs_dinode *);
 void	xfs_inode_to_disk(struct xfs_inode *ip, struct xfs_dinode *to,
 			  xfs_lsn_t lsn);
 void	xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
-void	xfs_log_dinode_to_disk(struct xfs_log_dinode *from,
-			       struct xfs_dinode *to);
+void	xfs_log_dinode_to_disk(struct xfs_mount *mp,
+			struct xfs_log_dinode *from, struct xfs_dinode *to);
 
 #if defined(DEBUG)
 void	xfs_inobp_check(struct xfs_mount *, struct xfs_buf *);
diff --git a/libxfs/xfs_inode_fork.c b/libxfs/xfs_inode_fork.c
index 8c32f993..af4f893f 100644
--- a/libxfs/xfs_inode_fork.c
+++ b/libxfs/xfs_inode_fork.c
@@ -213,14 +213,14 @@ xfs_iformat_extents(
 	struct xfs_iext_cursor	icur;
 	struct xfs_bmbt_rec	*dp;
 	struct xfs_bmbt_irec	new;
-	int			i;
+	xfs_extnum_t		i;
 
 	/*
 	 * If the number of extents is unreasonable, then something is wrong and
 	 * we just bail out rather than crash in kmem_alloc() or memcpy() below.
 	 */
 	if (unlikely(size < 0 || size > XFS_DFORK_SIZE(dip, mp, whichfork))) {
-		xfs_warn(ip->i_mount, "corrupt inode %Lu ((a)extents = %d).",
+		xfs_warn(ip->i_mount, "corrupt inode %Lu ((a)extents = %llu).",
 			(unsigned long long) ip->i_ino, nex);
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED,
 				"xfs_iformat_extents(1)", dip, sizeof(*dip),
diff --git a/libxfs/xfs_inode_fork.h b/libxfs/xfs_inode_fork.h
index e318dfdd..22f3c9b3 100644
--- a/libxfs/xfs_inode_fork.h
+++ b/libxfs/xfs_inode_fork.h
@@ -90,10 +90,17 @@ static inline xfs_extnum_t xfs_iext_max(struct xfs_sb *sbp, int whichfork)
 {
 	ASSERT(whichfork == XFS_DATA_FORK || whichfork == XFS_ATTR_FORK);
 
-	if (whichfork == XFS_DATA_FORK)
-		return MAXEXTNUM;
-	else
-		return MAXAEXTNUM;
+	if (whichfork == XFS_DATA_FORK) {
+		if (xfs_sb_version_haswideextcnt(sbp))
+			return MAXEXTNUM_HI;
+		else
+			return MAXEXTNUM;
+	} else {
+		if (xfs_sb_version_haswideextcnt(sbp))
+			return MAXAEXTNUM_HI;
+		else
+			return MAXAEXTNUM;
+	}
 }
 
 struct xfs_ifork *xfs_iext_state_to_fork(struct xfs_inode *ip, int state);
diff --git a/libxfs/xfs_log_format.h b/libxfs/xfs_log_format.h
index e3400c9c..809f8ce6 100644
--- a/libxfs/xfs_log_format.h
+++ b/libxfs/xfs_log_format.h
@@ -396,8 +396,8 @@ struct xfs_log_dinode {
 	xfs_fsize_t	di_size;	/* number of bytes in file */
 	xfs_rfsblock_t	di_nblocks;	/* # of direct & btree blocks used */
 	xfs_extlen_t	di_extsize;	/* basic/minimum extent size for file */
-	xfs_extnum_t	di_nextents;	/* number of extents in data fork */
-	xfs_aextnum_t	di_anextents;	/* number of extents in attribute fork*/
+	uint32_t	di_nextents_lo;	/* lower part of data fork extent count */
+	uint16_t	di_anextents_lo;/* lower part of attr fork extent count*/
 	uint8_t		di_forkoff;	/* attr fork offs, <<3 for 64b align */
 	int8_t		di_aformat;	/* format of attr fork's data */
 	uint32_t	di_dmevmask;	/* DMIG event mask */
@@ -414,7 +414,9 @@ struct xfs_log_dinode {
 	xfs_lsn_t	di_lsn;		/* flush sequence */
 	uint64_t	di_flags2;	/* more random flags */
 	uint32_t	di_cowextsize;	/* basic cow extent size for file */
-	uint8_t		di_pad2[12];	/* more padding for future expansion */
+	uint32_t	di_nextents_hi; /* higher part of data fork extent count */
+	uint16_t	di_anextents_hi;/* higher part of attr fork extent count */
+	uint8_t		di_pad2[6];	/* more padding for future expansion */
 
 	/* fields only written to during inode creation */
 	xfs_ictimestamp_t di_crtime;	/* time created */
diff --git a/libxfs/xfs_types.h b/libxfs/xfs_types.h
index 397d9477..23ff8166 100644
--- a/libxfs/xfs_types.h
+++ b/libxfs/xfs_types.h
@@ -12,8 +12,8 @@ typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
 typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
 typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
 typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
-typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
-typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
+typedef uint64_t	xfs_extnum_t;	/* # of extents in a file */
+typedef uint32_t	xfs_aextnum_t;	/* # extents in an attribute fork */
 typedef int64_t		xfs_fsize_t;	/* bytes in a file */
 typedef uint64_t	xfs_ufsize_t;	/* unsigned bytes in a file */
 
@@ -61,6 +61,8 @@ typedef void *		xfs_failaddr_t;
 #define	MAXEXTLEN	((xfs_extlen_t)0x001fffff)	/* 21 bits */
 #define	MAXEXTNUM	((xfs_extnum_t)0x7fffffff)	/* signed int */
 #define	MAXAEXTNUM	((xfs_aextnum_t)0x7fff)		/* signed short */
+#define MAXEXTNUM_HI	((xfs_extnum_t)0x7fffffffffff)	/* unsigned 47 bits */
+#define MAXAEXTNUM_HI	((xfs_aextnum_t)0xffffffff)	/* unsigned 32 bits */
 
 /*
  * Minimum and maximum blocksize and sectorsize.
diff --git a/logprint/log_misc.c b/logprint/log_misc.c
index be889887..4d09f357 100644
--- a/logprint/log_misc.c
+++ b/logprint/log_misc.c
@@ -438,8 +438,11 @@ xlog_print_trans_qoff(char **ptr, uint len)
 
 static void
 xlog_print_trans_inode_core(
+	struct xfs_mount	*mp,
 	struct xfs_log_dinode	*ip)
 {
+	xfs_extnum_t		nextents;
+
     printf(_("INODE CORE\n"));
     printf(_("magic 0x%hx mode 0%ho version %d format %d\n"),
 	   ip->di_magic, ip->di_mode, (int)ip->di_version,
@@ -448,11 +451,19 @@ xlog_print_trans_inode_core(
 	   ip->di_nlink, ip->di_uid, ip->di_gid);
     printf(_("atime 0x%x mtime 0x%x ctime 0x%x\n"),
 	   ip->di_atime.t_sec, ip->di_mtime.t_sec, ip->di_ctime.t_sec);
-    printf(_("size 0x%llx nblocks 0x%llx extsize 0x%x nextents 0x%x\n"),
+
+    nextents = ip->di_nextents_lo;
+    if (xfs_sb_version_haswideextcnt(&mp->m_sb))
+	    nextents |= ((xfs_extnum_t)ip->di_nextents_hi << 32);
+    printf(_("size 0x%llx nblocks 0x%llx extsize 0x%x nextents 0x%" PRIx64 "\n"),
 	   (unsigned long long)ip->di_size, (unsigned long long)ip->di_nblocks,
-	   ip->di_extsize, ip->di_nextents);
-    printf(_("naextents 0x%x forkoff %d dmevmask 0x%x dmstate 0x%hx\n"),
-	   ip->di_anextents, (int)ip->di_forkoff, ip->di_dmevmask,
+	   ip->di_extsize, nextents);
+
+    nextents = ip->di_anextents_lo;
+    if (xfs_sb_version_haswideextcnt(&mp->m_sb))
+	    nextents |= ((xfs_extnum_t)ip->di_anextents_hi << 16);
+    printf(_("naextents 0x%" PRIx64 " forkoff %d dmevmask 0x%x dmstate 0x%hx\n"),
+	   nextents, (int)ip->di_forkoff, ip->di_dmevmask,
 	   ip->di_dmstate);
     printf(_("flags 0x%x gen 0x%x\n"),
 	   ip->di_flags, ip->di_gen);
@@ -562,7 +573,7 @@ xlog_print_trans_inode(
     memmove(&dino, *ptr, sizeof(dino));
     mode = dino.di_mode & S_IFMT;
     size = (int)dino.di_size;
-    xlog_print_trans_inode_core(&dino);
+    xlog_print_trans_inode_core(log->l_mp, &dino);
     *ptr += xfs_log_dinode_size(log->l_mp);
     skip_count--;
 
diff --git a/logprint/log_print_all.c b/logprint/log_print_all.c
index e2e28b9c..aa171dfb 100644
--- a/logprint/log_print_all.c
+++ b/logprint/log_print_all.c
@@ -238,9 +238,14 @@ xlog_recover_print_dquot(
 
 STATIC void
 xlog_recover_print_inode_core(
+	struct xlog		*log,
 	struct xfs_log_dinode	*di)
 {
-	printf(_("	CORE inode:\n"));
+	struct xfs_sb		*sbp = &log->l_mp->m_sb;
+	xfs_aextnum_t		anextents;
+	xfs_extnum_t		nextents;
+
+        printf(_("	CORE inode:\n"));
 	if (!print_inode)
 		return;
 	printf(_("		magic:%c%c  mode:0x%x  ver:%d  format:%d\n"),
@@ -252,10 +257,17 @@ xlog_recover_print_inode_core(
 	printf(_("		atime:%d  mtime:%d  ctime:%d\n"),
 	       di->di_atime.t_sec, di->di_mtime.t_sec, di->di_ctime.t_sec);
 	printf(_("		flushiter:%d\n"), di->di_flushiter);
+
+	nextents = di->di_nextents_lo;
+	anextents = di->di_anextents_lo;
+	if (xfs_sb_version_haswideextcnt(sbp)) {
+		nextents |= ((xfs_extnum_t)di->di_nextents_hi << 32);
+		anextents |= ((xfs_aextnum_t)di->di_anextents_hi << 16);
+	}
 	printf(_("		size:0x%llx  nblks:0x%llx  exsize:%d  "
-	     "nextents:%d  anextents:%d\n"), (unsigned long long)
+	     "nextents:%" PRIu64 "  anextents:%u\n"), (unsigned long long)
 	       di->di_size, (unsigned long long)di->di_nblocks,
-	       di->di_extsize, di->di_nextents, (int)di->di_anextents);
+	       di->di_extsize, nextents, anextents);
 	printf(_("		forkoff:%d  dmevmask:0x%x  dmstate:%d  flags:0x%x  "
 	     "gen:%u\n"),
 	       (int)di->di_forkoff, di->di_dmevmask, (int)di->di_dmstate,
@@ -268,6 +280,7 @@ xlog_recover_print_inode_core(
 
 STATIC void
 xlog_recover_print_inode(
+	struct xlog		*log,
 	xlog_recover_item_t	*item)
 {
 	struct xfs_inode_log_format	f_buf;
@@ -289,7 +302,7 @@ xlog_recover_print_inode(
 	ASSERT(item->ri_buf[1].i_len ==
 			offsetof(struct xfs_log_dinode, di_next_unlinked) ||
 	       item->ri_buf[1].i_len == sizeof(struct xfs_log_dinode));
-	xlog_recover_print_inode_core((struct xfs_log_dinode *)
+	xlog_recover_print_inode_core(log, (struct xfs_log_dinode *)
 				      item->ri_buf[1].i_addr);
 
 	hasdata = (f->ilf_fields & XFS_ILOG_DFORK) != 0;
@@ -384,6 +397,7 @@ xlog_recover_print_icreate(
 
 void
 xlog_recover_print_logitem(
+	struct xlog		*log,
 	xlog_recover_item_t	*item)
 {
 	switch (ITEM_TYPE(item)) {
@@ -394,7 +408,7 @@ xlog_recover_print_logitem(
 		xlog_recover_print_icreate(item);
 		break;
 	case XFS_LI_INODE:
-		xlog_recover_print_inode(item);
+		xlog_recover_print_inode(log, item);
 		break;
 	case XFS_LI_EFD:
 		xlog_recover_print_efd(item);
@@ -434,6 +448,7 @@ xlog_recover_print_logitem(
 
 static void
 xlog_recover_print_item(
+	struct xlog		*log,
 	xlog_recover_item_t	*item)
 {
 	int			i;
@@ -493,11 +508,12 @@ xlog_recover_print_item(
 		       (long)item->ri_buf[i].i_addr, item->ri_buf[i].i_len);
 	}
 	printf("\n");
-	xlog_recover_print_logitem(item);
+	xlog_recover_print_logitem(log, item);
 }
 
 void
 xlog_recover_print_trans(
+	struct xlog		*log,
 	struct xlog_recover	*trans,
 	struct list_head	*itemq,
 	int			print)
@@ -510,5 +526,5 @@ xlog_recover_print_trans(
 	print_xlog_record_line();
 	xlog_recover_print_trans_head(trans);
 	list_for_each_entry(item, itemq, ri_list)
-		xlog_recover_print_item(item);
+		xlog_recover_print_item(log, item);
 }
diff --git a/logprint/log_print_trans.c b/logprint/log_print_trans.c
index 2004b5a0..c6386fb0 100644
--- a/logprint/log_print_trans.c
+++ b/logprint/log_print_trans.c
@@ -24,7 +24,7 @@ xlog_recover_do_trans(
 	struct xlog_recover	*trans,
 	int			pass)
 {
-	xlog_recover_print_trans(trans, &trans->r_itemq, 3);
+	xlog_recover_print_trans(log, trans, &trans->r_itemq, 3);
 	return 0;
 }
 
diff --git a/repair/dinode.c b/repair/dinode.c
index 98bb4a17..5a8de0f6 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -71,7 +71,9 @@ _("would have cleared inode %" PRIu64 " attributes\n"), ino_num);
 	if (xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK) != 0) {
 		if (no_modify)
 			return(1);
-		dino->di_anextents = cpu_to_be16(0);
+		dino->di_anextents_lo = cpu_to_be16(0);
+		if (xfs_sb_version_haswideextcnt(&mp->m_sb))
+			dino->di_anextents_hi = cpu_to_be16(0);
 	}
 
 	if (dino->di_aformat != XFS_DINODE_FMT_EXTENTS)  {
@@ -959,7 +961,7 @@ process_symlink_extlist(xfs_mount_t *mp, xfs_ino_t lino, xfs_dinode_t *dino)
 	xfs_fileoff_t		expected_offset;
 	xfs_bmbt_rec_t		*rp;
 	xfs_bmbt_irec_t		irec;
-	int			numrecs;
+	xfs_extnum_t		numrecs;
 	int			i;
 	int			max_blocks;
 
@@ -989,7 +991,7 @@ _("mismatch between format (%d) and size (%" PRId64 ") in symlink inode %" PRIu6
 	 */
 	if (numrecs > max_symlink_blocks)  {
 		do_warn(
-_("bad number of extents (%d) in symlink %" PRIu64 " data fork\n"),
+_("bad number of extents (%" PRIu64 ") in symlink %" PRIu64 " data fork\n"),
 			numrecs, lino);
 		return(1);
 	}
@@ -1556,7 +1558,7 @@ _("realtime summary inode %" PRIu64 " has bad type 0x%x, "),
 		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
 		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
 			do_warn(
-_("bad # of extents (%u) for realtime summary inode %" PRIu64 "\n"),
+_("bad # of extents (%" PRIu64 ") for realtime summary inode %" PRIu64 "\n"),
 				nextents, lino);
 			return 1;
 		}
@@ -1579,7 +1581,7 @@ _("realtime bitmap inode %" PRIu64 " has bad type 0x%x, "),
 		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
 		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
 			do_warn(
-_("bad # of extents (%u) for realtime bitmap inode %" PRIu64 "\n"),
+_("bad # of extents (%" PRIu64 ") for realtime bitmap inode %" PRIu64 "\n"),
 				nextents, lino);
 			return 1;
 		}
@@ -1772,13 +1774,15 @@ _("too many data fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
 	if (nextents != dnextents)  {
 		if (!no_modify)  {
 			do_warn(
-_("correcting nextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
+_("correcting nextents for inode %" PRIu64 ", was %" PRIu64 " - counted %" PRIu64 "\n"),
 				lino, dnextents, nextents);
-			dino->di_nextents = cpu_to_be32(nextents);
+			dino->di_nextents_lo = cpu_to_be32(nextents);
+			if (xfs_sb_version_haswideextcnt(&mp->m_sb))
+				dino->di_nextents_hi = cpu_to_be32(nextents >> 32);
 			*dirty = 1;
 		} else  {
 			do_warn(
-_("bad nextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
+_("bad nextents %" PRIu64 " for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
 				dnextents, lino, nextents);
 		}
 	}
@@ -1795,13 +1799,15 @@ _("too many attr fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
 	if (anextents != dnextents)  {
 		if (!no_modify)  {
 			do_warn(
-_("correcting anextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
+_("correcting anextents for inode %" PRIu64 ", was %" PRIu64 " - counted %" PRIu64 "\n"),
 				lino, dnextents, anextents);
-			dino->di_anextents = cpu_to_be16(anextents);
+			dino->di_anextents_lo = cpu_to_be16(anextents);
+			if (xfs_sb_version_haswideextcnt(&mp->m_sb))
+				dino->di_anextents_hi = cpu_to_be16(anextents >> 16);
 			*dirty = 1;
 		} else  {
 			do_warn(
-_("bad anextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
+_("bad anextents %" PRIu64 " for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
 				dnextents, lino, anextents);
 		}
 	}
-- 
2.28.0
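
Reader's note (illustration only, not part of the patch): the limits this
patch adds to xfs_types.h can be sanity-checked in a few lines of C. The
sketch below mirrors the shape of the updated xfs_iext_max() and re-derives
the 47-bit figure from the cover letter's arithmetic (2^63 / 2^16 = 2^47).
The names demo_iext_max(), demo_worst_case_extents() and the DEMO_*
constants are hypothetical stand-ins, and the "wide" flag stands in for
xfs_sb_version_haswideextcnt().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical copies of the limits; values taken from the patch. */
#define DEMO_MAXEXTNUM		((uint64_t)0x7fffffff)		/* signed 32 bits */
#define DEMO_MAXAEXTNUM		((uint64_t)0x7fff)		/* signed 16 bits */
#define DEMO_MAXEXTNUM_HI	((uint64_t)0x7fffffffffff)	/* unsigned 47 bits */
#define DEMO_MAXAEXTNUM_HI	((uint64_t)0xffffffff)		/* unsigned 32 bits */

enum demo_fork { DEMO_DATA_FORK, DEMO_ATTR_FORK };

/* Pick the per-fork extent count limit, as xfs_iext_max() does. */
static uint64_t demo_iext_max(bool wide, enum demo_fork fork)
{
	if (fork == DEMO_DATA_FORK)
		return wide ? DEMO_MAXEXTNUM_HI : DEMO_MAXEXTNUM;
	return wide ? DEMO_MAXAEXTNUM_HI : DEMO_MAXAEXTNUM;
}

/*
 * Worst case: a file of 2^size_log2 bytes mapped by single-block extents
 * of 2^blk_log2 bytes each needs 2^(size_log2 - blk_log2) extents.
 */
static uint64_t demo_worst_case_extents(unsigned int size_log2,
					unsigned int blk_log2)
{
	return (uint64_t)1 << (size_log2 - blk_log2);
}
```

With a 2^63-byte file and 64k blocks the worst case is 2^47 extents, while
the largest value a 47-bit counter can hold is 2^47 - 1, which is the value
of MAXEXTNUM_HI.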



* [PATCH 4/4] xfsprogs: Add wideextcnt mkfs option
  2020-08-31 13:00 [PATCH 0/4] xfsprogs: Extend per-inode extent counters Chandan Babu R
                   ` (2 preceding siblings ...)
  2020-08-31 13:01 ` [PATCH 3/4] xfsprogs: Extend data/attr fork extent counter width Chandan Babu R
@ 2020-08-31 13:01 ` Chandan Babu R
  3 siblings, 0 replies; 10+ messages in thread
From: Chandan Babu R @ 2020-08-31 13:01 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, david, darrick.wong, bfoster

Enabling the wideextcnt option on the mkfs.xfs command line gives the
filesystem's inodes 47-bit data fork extent counters and 32-bit attr
fork extent counters. It also sets the XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT
incompat flag in the superblock, which prevents older kernels from
mounting such a filesystem.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
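Reader's note (illustration only, not part of the patch): a minimal sketch
of how a 47-bit data fork count and a 32-bit attr fork count can be split
across the existing low fields and the new high fields, assuming the field
widths used in this series. The struct and helpers below (demo_counters,
demo_pack(), demo_unpack_data(), demo_unpack_attr()) are hypothetical and
host-endian; the real code works on struct xfs_dinode and converts each
field with cpu_to_be32()/cpu_to_be16() and their inverses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, host-endian stand-in for the four counter fields. */
struct demo_counters {
	uint32_t nextents_lo;	/* low 32 bits of the data fork count */
	uint16_t anextents_lo;	/* low 16 bits of the attr fork count */
	uint32_t nextents_hi;	/* high bits of the data fork count */
	uint16_t anextents_hi;	/* high 16 bits of the attr fork count */
};

/* Split both counts; the high halves are only written when "wide" is set. */
static void demo_pack(struct demo_counters *c, uint64_t nextents,
		      uint32_t anextents, bool wide)
{
	c->nextents_lo = (uint32_t)nextents;
	c->anextents_lo = (uint16_t)anextents;
	if (wide) {
		c->nextents_hi = (uint32_t)(nextents >> 32);
		c->anextents_hi = (uint16_t)(anextents >> 16);
	}
}

/* Rebuild the data fork count; without "wide" only the low half counts. */
static uint64_t demo_unpack_data(const struct demo_counters *c, bool wide)
{
	uint64_t n = c->nextents_lo;

	if (wide)
		n |= (uint64_t)c->nextents_hi << 32;
	return n;
}

/* Rebuild the attr fork count in the same way. */
static uint32_t demo_unpack_attr(const struct demo_counters *c, bool wide)
{
	uint32_t n = c->anextents_lo;

	if (wide)
		n |= (uint32_t)c->anextents_hi << 16;
	return n;
}
```

Packing 2^47 - 1 this way leaves 0x7fff in the high data field, i.e. only
15 of its 32 bits are used.
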
 man/man8/mkfs.xfs.8 |  7 +++++++
 mkfs/xfs_mkfs.c     | 23 +++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/man/man8/mkfs.xfs.8 b/man/man8/mkfs.xfs.8
index 9d762a43..80378722 100644
--- a/man/man8/mkfs.xfs.8
+++ b/man/man8/mkfs.xfs.8
@@ -522,6 +522,13 @@ space over time such that no free extents are large enough to
 accommodate a chunk of 64 inodes. Without this feature enabled, inode
 allocations can fail with out of space errors under severe fragmented
 free space conditions.
+.TP
+.BI wideextcnt[= value]
+Extend the inode data and attr fork extent counters from signed 32 bits and
+signed 16 bits to unsigned 47 bits and unsigned 32 bits respectively. If the
+value is omitted, 1 is assumed. The wide extent count feature is disabled by
+default. This feature is only available for filesystems formatted with -m crc=1.
+.TP
 .RE
 .TP
 .BI \-l " log_section_options"
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 2e6cd280..ff3e0705 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -71,6 +71,7 @@ enum {
 	I_ATTR,
 	I_PROJID32BIT,
 	I_SPINODES,
+	I_WIDEEXTCNT,
 	I_MAX_OPTS,
 };
 
@@ -383,6 +384,7 @@ static struct opt_params iopts = {
 		[I_ATTR] = "attr",
 		[I_PROJID32BIT] = "projid32bit",
 		[I_SPINODES] = "sparse",
+		[I_WIDEEXTCNT] = "wideextcnt",
 	},
 	.subopt_params = {
 		{ .index = I_ALIGN,
@@ -431,6 +433,12 @@ static struct opt_params iopts = {
 		  .maxval = 1,
 		  .defaultval = 1,
 		},
+		{ .index = I_WIDEEXTCNT,
+		  .conflicts = { { NULL, LAST_CONFLICT } },
+		  .minval = 0,
+		  .maxval = 1,
+		  .defaultval = 1,
+		}
 	},
 };
 
@@ -734,6 +742,7 @@ struct sb_feat_args {
 	bool	reflink;		/* XFS_SB_FEAT_RO_COMPAT_REFLINK */
 	bool	nodalign;
 	bool	nortalign;
+	bool	wideextcnt;
 };
 
 struct cli_params {
@@ -1469,6 +1478,9 @@ inode_opts_parser(
 	case I_SPINODES:
 		cli->sb_feat.spinodes = getnum(value, opts, subopt);
 		break;
+	case I_WIDEEXTCNT:
+		cli->sb_feat.wideextcnt = getnum(value, opts, subopt);
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -1972,6 +1984,14 @@ _("reflink not supported without CRC support\n"));
 			usage();
 		}
 		cli->sb_feat.reflink = false;
+
+		if (cli->sb_feat.wideextcnt &&
+			cli_opt_set(&iopts, I_WIDEEXTCNT)) {
+			fprintf(stderr,
+_("wideextcnt inodes not supported without CRC support\n"));
+			usage();
+		}
+		cli->sb_feat.wideextcnt = false;
 	}
 
 	if ((cli->fsx.fsx_xflags & FS_XFLAG_COWEXTSIZE) &&
@@ -2953,6 +2973,8 @@ sb_set_features(
 		sbp->sb_features_incompat |= XFS_SB_FEAT_INCOMPAT_SPINODES;
 	}
 
+	if (fp->wideextcnt)
+		sbp->sb_features_incompat |= XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT;
 }
 
 /*
@@ -3608,6 +3630,7 @@ main(
 			.parent_pointers = false,
 			.nodalign = false,
 			.nortalign = false,
+			.wideextcnt = false,
 		},
 	};
 
-- 
2.28.0



* Re: [PATCH 2/4] xfsprogs: Introduce xfs_dfork_nextents() helper
  2020-08-31 13:01 ` [PATCH 2/4] xfsprogs: Introduce xfs_dfork_nextents() helper Chandan Babu R
@ 2020-08-31 20:54   ` Darrick J. Wong
  2020-09-01 14:17     ` Chandan Babu R
  0 siblings, 1 reply; 10+ messages in thread
From: Darrick J. Wong @ 2020-08-31 20:54 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david, bfoster

On Mon, Aug 31, 2020 at 06:31:00PM +0530, Chandan Babu R wrote:
> This commit replaces the macro XFS_DFORK_NEXTENTS() with the helper
> function xfs_dfork_nextents(). As of this commit, xfs_dfork_nextents()
> returns the same value as XFS_DFORK_NEXTENTS(). A future commit which
> extends inode's extent counter fields will add more logic to this
> helper.
> 
> This commit also replaces direct accesses to xfs_dinode->di_[a]nextents
> with calls to xfs_dfork_nextents().
> 
> No functional changes have been made.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> ---
>  db/bmap.c               |  6 +++---
>  db/btdump.c             |  4 ++--
>  db/check.c              |  2 +-
>  db/frag.c               |  8 ++++---
>  db/inode.c              | 14 ++++++------
>  db/metadump.c           |  4 ++--
>  libxfs/xfs_format.h     |  4 ----
>  libxfs/xfs_inode_buf.c  | 26 ++++++++++++++++------
>  libxfs/xfs_inode_buf.h  |  2 ++
>  libxfs/xfs_inode_fork.c |  3 ++-
>  repair/attr_repair.c    |  2 +-
>  repair/dinode.c         | 48 +++++++++++++++++++++++------------------
>  repair/prefetch.c       |  2 +-
>  13 files changed, 74 insertions(+), 51 deletions(-)
> 
> diff --git a/db/bmap.c b/db/bmap.c
> index fdc70e95..9800a909 100644
> --- a/db/bmap.c
> +++ b/db/bmap.c
> @@ -68,7 +68,7 @@ bmap(
>  	ASSERT(fmt == XFS_DINODE_FMT_LOCAL || fmt == XFS_DINODE_FMT_EXTENTS ||
>  		fmt == XFS_DINODE_FMT_BTREE);
>  	if (fmt == XFS_DINODE_FMT_EXTENTS) {
> -		nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
> +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
>  		xp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
>  		for (ep = xp; ep < &xp[nextents] && n < nex; ep++) {
>  			if (!bmap_one_extent(ep, &curoffset, eoffset, &n, bep))
> @@ -158,9 +158,9 @@ bmap_f(
>  		push_cur();
>  		set_cur_inode(iocur_top->ino);
>  		dip = iocur_top->data;
> -		if (be32_to_cpu(dip->di_nextents))
> +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK))

Suggestion: Shift these kinds of changes to a separate patch to minimize
the amount of non-libxfs changes in a patch that will (eventually) be
ported from the kernel.  Ideally, the only changes to db/ and repair/
and mkfs/ would be the ones that are necessary to avoid breaking the
build.

Once you've separated the other conversions (like this one here) into a
separate patch, we can review that as a separate refactoring change to
userspace.

The reason for this ofc is that when the maintainers run libxfs-apply to
pull in the kernel patches, they're totally going to miss things like
this conversion unless you make them an explicit separate change.

FWIW the conversions themselves mostly look ok...

>  			dfork = 1;
> -		if (be16_to_cpu(dip->di_anextents))
> +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
>  			afork = 1;
>  		pop_cur();
>  	}
> diff --git a/db/btdump.c b/db/btdump.c
> index 920f595b..9ced71d4 100644
> --- a/db/btdump.c
> +++ b/db/btdump.c
> @@ -166,13 +166,13 @@ dump_inode(
>  
>  	dip = iocur_top->data;
>  	if (attrfork) {
> -		if (!dip->di_anextents ||
> +		if (!xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) ||
>  		    dip->di_aformat != XFS_DINODE_FMT_BTREE) {
>  			dbprintf(_("attr fork not in btree format\n"));
>  			return 0;
>  		}
>  	} else {
> -		if (!dip->di_nextents ||
> +		if (!xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK) ||
>  		    dip->di_format != XFS_DINODE_FMT_BTREE) {
>  			dbprintf(_("data fork not in btree format\n"));
>  			return 0;
> diff --git a/db/check.c b/db/check.c
> index 12c03b6d..2d1823a4 100644
> --- a/db/check.c
> +++ b/db/check.c
> @@ -2686,7 +2686,7 @@ process_exinode(
>  	xfs_bmbt_rec_t		*rp;
>  
>  	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
> -	*nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> +	*nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
>  	if (*nex < 0 || *nex > XFS_DFORK_SIZE(dip, mp, whichfork) /
>  						sizeof(xfs_bmbt_rec_t)) {
>  		if (!sflag || id->ilist)
> diff --git a/db/frag.c b/db/frag.c
> index 1cfc6c2c..20fb1306 100644
> --- a/db/frag.c
> +++ b/db/frag.c
> @@ -262,9 +262,11 @@ process_exinode(
>  	int			whichfork)
>  {
>  	xfs_bmbt_rec_t		*rp;
> +	xfs_extnum_t		nextents;
>  
>  	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
> -	process_bmbt_reclist(rp, XFS_DFORK_NEXTENTS(dip, whichfork), extmapp);
> +	nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> +	process_bmbt_reclist(rp, nextents, extmapp);
>  }
>  
>  static void
> @@ -273,9 +275,9 @@ process_fork(
>  	int		whichfork)
>  {
>  	extmap_t	*extmap;
> -	int		nex;
> +	xfs_extnum_t	nex;
>  
> -	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> +	nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
>  	if (!nex)
>  		return;
>  	extmap = extmap_alloc(nex);
> diff --git a/db/inode.c b/db/inode.c
> index 0cff9d63..3853092c 100644
> --- a/db/inode.c
> +++ b/db/inode.c
> @@ -271,7 +271,7 @@ inode_a_bmx_count(
>  		return 0;
>  	ASSERT((char *)XFS_DFORK_APTR(dip) - (char *)dip == byteize(startoff));
>  	return dip->di_aformat == XFS_DINODE_FMT_EXTENTS ?
> -		be16_to_cpu(dip->di_anextents) : 0;
> +		xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) : 0;
>  }
>  
>  static int
> @@ -325,6 +325,7 @@ inode_a_size(
>  {
>  	xfs_attr_shortform_t	*asf;
>  	xfs_dinode_t		*dip;
> +	xfs_extnum_t		nextents;
>  
>  	ASSERT(startoff == 0);
>  	ASSERT(idx == 0);
> @@ -334,8 +335,8 @@ inode_a_size(
>  		asf = (xfs_attr_shortform_t *)XFS_DFORK_APTR(dip);
>  		return bitize(be16_to_cpu(asf->hdr.totsize));
>  	case XFS_DINODE_FMT_EXTENTS:
> -		return (int)be16_to_cpu(dip->di_anextents) *
> -							bitsz(xfs_bmbt_rec_t);
> +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
> +		return (int)(nextents * bitsz(xfs_bmbt_rec_t));
>  	case XFS_DINODE_FMT_BTREE:
>  		return bitize((int)XFS_DFORK_ASIZE(dip, mp));
>  	default:
> @@ -496,7 +497,7 @@ inode_u_bmx_count(
>  	dip = obj;
>  	ASSERT((char *)XFS_DFORK_DPTR(dip) - (char *)dip == byteize(startoff));
>  	return dip->di_format == XFS_DINODE_FMT_EXTENTS ?
> -		be32_to_cpu(dip->di_nextents) : 0;
> +		xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK) : 0;
>  }
>  
>  static int
> @@ -582,6 +583,7 @@ inode_u_size(
>  	int		idx)
>  {
>  	xfs_dinode_t	*dip;
> +	xfs_extnum_t	nextents;
>  
>  	ASSERT(startoff == 0);
>  	ASSERT(idx == 0);
> @@ -592,8 +594,8 @@ inode_u_size(
>  	case XFS_DINODE_FMT_LOCAL:
>  		return bitize((int)be64_to_cpu(dip->di_size));
>  	case XFS_DINODE_FMT_EXTENTS:
> -		return (int)be32_to_cpu(dip->di_nextents) *
> -						bitsz(xfs_bmbt_rec_t);
> +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK);
> +		return (int)(nextents * bitsz(xfs_bmbt_rec_t));
>  	case XFS_DINODE_FMT_BTREE:
>  		return bitize((int)XFS_DFORK_DSIZE(dip, mp));
>  	case XFS_DINODE_FMT_UUID:
> diff --git a/db/metadump.c b/db/metadump.c
> index e5cb3aa5..6a6757a2 100644
> --- a/db/metadump.c
> +++ b/db/metadump.c
> @@ -2282,7 +2282,7 @@ process_exinode(
>  
>  	whichfork = (itype == TYP_ATTR) ? XFS_ATTR_FORK : XFS_DATA_FORK;
>  
> -	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> +	nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
>  	used = nex * sizeof(xfs_bmbt_rec_t);
>  	if (nex < 0 || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
>  		if (show_warnings)
> @@ -2335,7 +2335,7 @@ static int
>  process_dev_inode(
>  	xfs_dinode_t		*dip)
>  {
> -	if (XFS_DFORK_NEXTENTS(dip, XFS_DATA_FORK)) {
> +	if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK)) {
>  		if (show_warnings)
>  			print_warning("inode %llu has unexpected extents",
>  				      (unsigned long long)cur_ino);
> diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
> index a738cd8b..188deada 100644
> --- a/libxfs/xfs_format.h
> +++ b/libxfs/xfs_format.h
> @@ -993,10 +993,6 @@ enum xfs_dinode_fmt {
>  	((w) == XFS_DATA_FORK ? \
>  		(dip)->di_format : \
>  		(dip)->di_aformat)
> -#define XFS_DFORK_NEXTENTS(dip,w) \
> -	((w) == XFS_DATA_FORK ? \
> -		be32_to_cpu((dip)->di_nextents) : \
> -		be16_to_cpu((dip)->di_anextents))
>  
>  /*
>   * For block and character special files the 32bit dev_t is stored at the
> diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
> index ae71a19e..d5584372 100644
> --- a/libxfs/xfs_inode_buf.c
> +++ b/libxfs/xfs_inode_buf.c
> @@ -362,9 +362,10 @@ xfs_dinode_verify_fork(
>  	struct xfs_mount	*mp,
>  	int			whichfork)
>  {
> -	uint32_t		di_nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
>  	xfs_extnum_t		max_extents;
> +	uint32_t		di_nextents;
>  
> +	di_nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
>  
>  	switch (XFS_DFORK_FORMAT(dip, whichfork)) {
>  	case XFS_DINODE_FMT_LOCAL:
> @@ -396,6 +397,15 @@ xfs_dinode_verify_fork(
>  	return NULL;
>  }
>  
> +xfs_extnum_t
> +xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip, int whichfork)
> +{
> +	if (whichfork == XFS_DATA_FORK)
> +		return be32_to_cpu(dip->di_nextents);
> +	else
> +		return be16_to_cpu(dip->di_anextents);
> +}
> +
>  static xfs_failaddr_t
>  xfs_dinode_verify_forkoff(
>  	struct xfs_dinode	*dip,
> @@ -432,6 +442,8 @@ xfs_dinode_verify(
>  	uint16_t		flags;
>  	uint64_t		flags2;
>  	uint64_t		di_size;
> +	xfs_extnum_t            nextents;
> +	int64_t			nblocks;
>  
>  	if (dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC))
>  		return __this_address;
> @@ -462,10 +474,12 @@ xfs_dinode_verify(
>  	if ((S_ISLNK(mode) || S_ISDIR(mode)) && di_size == 0)
>  		return __this_address;
>  
> -	/* Fork checks carried over from xfs_iformat_fork */
> -	if (mode &&
> -	    be32_to_cpu(dip->di_nextents) + be16_to_cpu(dip->di_anextents) >
> -			be64_to_cpu(dip->di_nblocks))
> +	nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK);
> +	nextents += xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
> +	nblocks = be64_to_cpu(dip->di_nblocks);
> +
> +        /* Fork checks carried over from xfs_iformat_fork */
> +	if (mode && nextents > nblocks)
>  		return __this_address;
>  
>  	if (mode && XFS_DFORK_BOFF(dip) > mp->m_sb.sb_inodesize)
> @@ -522,7 +536,7 @@ xfs_dinode_verify(
>  		default:
>  			return __this_address;
>  		}
> -		if (dip->di_anextents)
> +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
>  			return __this_address;
>  	}
>  
> diff --git a/libxfs/xfs_inode_buf.h b/libxfs/xfs_inode_buf.h
> index 9b373dcf..f97b3428 100644
> --- a/libxfs/xfs_inode_buf.h
> +++ b/libxfs/xfs_inode_buf.h
> @@ -71,5 +71,7 @@ xfs_failaddr_t xfs_inode_validate_extsize(struct xfs_mount *mp,
>  xfs_failaddr_t xfs_inode_validate_cowextsize(struct xfs_mount *mp,
>  		uint32_t cowextsize, uint16_t mode, uint16_t flags,
>  		uint64_t flags2);
> +xfs_extnum_t xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip,
> +			int whichfork);
>  
>  #endif	/* __XFS_INODE_BUF_H__ */
> diff --git a/libxfs/xfs_inode_fork.c b/libxfs/xfs_inode_fork.c
> index 80ba6c12..8c32f993 100644
> --- a/libxfs/xfs_inode_fork.c
> +++ b/libxfs/xfs_inode_fork.c
> @@ -205,9 +205,10 @@ xfs_iformat_extents(
>  	int			whichfork)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
> +	struct xfs_sb		*sbp = &mp->m_sb;
>  	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
>  	int			state = xfs_bmap_fork_to_state(whichfork);
> -	int			nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> +	xfs_extnum_t		nex = xfs_dfork_nextents(sbp, dip, whichfork);
>  	int			size = nex * sizeof(xfs_bmbt_rec_t);
>  	struct xfs_iext_cursor	icur;
>  	struct xfs_bmbt_rec	*dp;
> diff --git a/repair/attr_repair.c b/repair/attr_repair.c
> index 6cec0f70..b6ca564b 100644
> --- a/repair/attr_repair.c
> +++ b/repair/attr_repair.c
> @@ -1083,7 +1083,7 @@ process_longform_attr(
>  	bno = blkmap_get(blkmap, 0);
>  	if (bno == NULLFSBLOCK) {
>  		if (dip->di_aformat == XFS_DINODE_FMT_EXTENTS &&
> -				be16_to_cpu(dip->di_anextents) == 0)
> +			xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) == 0)

		    ^
This continuation line should /not/ be indented so that it lines up with
the if body.
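
As a rough illustration of the indentation point, here is a minimal
sketch. The function and constant names are hypothetical and do not come
from xfsprogs; the only thing it is meant to show is a continuation line
that sits one level deeper than the statement body.

```c
#include <stdbool.h>

/* Hypothetical helper; the names here are illustrative only. */
static bool demo_attr_fork_empty(int aformat, unsigned int anextents)
{
	/* Assumed stand-in for XFS_DINODE_FMT_EXTENTS. */
	const int demo_fmt_extents = 2;

	/*
	 * The continuation of the condition is indented one level deeper
	 * than the body, so the two cannot be mistaken for each other.
	 */
	if (aformat == demo_fmt_extents &&
			anextents == 0)
		return true;
	return false;
}
```

With the condition and the body at different depths, a reader skimming
the left margin can tell where the test ends and the return begins.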

--D

>  			return(0); /* the kernel can handle this state */
>  		do_warn(
>  	_("block 0 of inode %" PRIu64 " attribute fork is missing\n"),
> diff --git a/repair/dinode.c b/repair/dinode.c
> index de9a3286..98bb4a17 100644
> --- a/repair/dinode.c
> +++ b/repair/dinode.c
> @@ -68,7 +68,7 @@ _("clearing inode %" PRIu64 " attributes\n"), ino_num);
>  		fprintf(stderr,
>  _("would have cleared inode %" PRIu64 " attributes\n"), ino_num);
>  
> -	if (be16_to_cpu(dino->di_anextents) != 0)  {
> +	if (xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK) != 0) {
>  		if (no_modify)
>  			return(1);
>  		dino->di_anextents = cpu_to_be16(0);
> @@ -882,7 +882,7 @@ process_exinode(
>  	lino = XFS_AGINO_TO_INO(mp, agno, ino);
>  	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
>  	*tot = 0;
> -	numrecs = XFS_DFORK_NEXTENTS(dip, whichfork);
> +	numrecs = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
>  
>  	/*
>  	 * We've already decided on the maximum number of extents on the inode,
> @@ -981,7 +981,7 @@ _("mismatch between format (%d) and size (%" PRId64 ") in symlink inode %" PRIu6
>  	}
>  
>  	rp = (xfs_bmbt_rec_t *)XFS_DFORK_DPTR(dino);
> -	numrecs = be32_to_cpu(dino->di_nextents);
> +	numrecs = xfs_dfork_nextents(&mp->m_sb, dino, XFS_DATA_FORK);
>  
>  	/*
>  	 * the max # of extents in a symlink inode is equal to the
> @@ -1496,6 +1496,8 @@ process_check_sb_inodes(
>  	int		*type,
>  	int		*dirty)
>  {
> +	xfs_extnum_t	nextents;
> +
>  	if (lino == mp->m_sb.sb_rootino) {
>  		if (*type != XR_INO_DIR)  {
>  			do_warn(_("root inode %" PRIu64 " has bad type 0x%x\n"),
> @@ -1550,10 +1552,12 @@ _("realtime summary inode %" PRIu64 " has bad type 0x%x, "),
>  				do_warn(_("would reset to regular file\n"));
>  			}
>  		}
> -		if (mp->m_sb.sb_rblocks == 0 && dinoc->di_nextents != 0)  {
> +
> +		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
> +		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
>  			do_warn(
>  _("bad # of extents (%u) for realtime summary inode %" PRIu64 "\n"),
> -				be32_to_cpu(dinoc->di_nextents), lino);
> +				nextents, lino);
>  			return 1;
>  		}
>  		return 0;
> @@ -1571,10 +1575,12 @@ _("realtime bitmap inode %" PRIu64 " has bad type 0x%x, "),
>  				do_warn(_("would reset to regular file\n"));
>  			}
>  		}
> -		if (mp->m_sb.sb_rblocks == 0 && dinoc->di_nextents != 0)  {
> +
> +		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
> +		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
>  			do_warn(
>  _("bad # of extents (%u) for realtime bitmap inode %" PRIu64 "\n"),
> -				be32_to_cpu(dinoc->di_nextents), lino);
> +				nextents, lino);
>  			return 1;
>  		}
>  		return 0;
> @@ -1735,6 +1741,7 @@ process_inode_blocks_and_extents(
>  	xfs_ino_t		lino,
>  	int			*dirty)
>  {
> +	xfs_extnum_t		dnextents;
>  	xfs_extnum_t		max_extents;
>  
>  	if (nblocks != be64_to_cpu(dino->di_nblocks))  {
> @@ -1760,20 +1767,19 @@ _("too many data fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
>  			nextents, lino);
>  		return 1;
>  	}
> -	if (nextents != be32_to_cpu(dino->di_nextents))  {
> +
> +	dnextents = xfs_dfork_nextents(&mp->m_sb, dino, XFS_DATA_FORK);
> +	if (nextents != dnextents)  {
>  		if (!no_modify)  {
>  			do_warn(
>  _("correcting nextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
> -				lino,
> -				be32_to_cpu(dino->di_nextents),
> -				nextents);
> +				lino, dnextents, nextents);
>  			dino->di_nextents = cpu_to_be32(nextents);
>  			*dirty = 1;
>  		} else  {
>  			do_warn(
>  _("bad nextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
> -				be32_to_cpu(dino->di_nextents),
> -				lino, nextents);
> +				dnextents, lino, nextents);
>  		}
>  	}
>  
> @@ -1784,19 +1790,19 @@ _("too many attr fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
>  			anextents, lino);
>  		return 1;
>  	}
> -	if (anextents != be16_to_cpu(dino->di_anextents))  {
> +
> +	dnextents = xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK);
> +	if (anextents != dnextents)  {
>  		if (!no_modify)  {
>  			do_warn(
>  _("correcting anextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
> -				lino,
> -				be16_to_cpu(dino->di_anextents), anextents);
> +				lino, dnextents, anextents);
>  			dino->di_anextents = cpu_to_be16(anextents);
>  			*dirty = 1;
>  		} else  {
>  			do_warn(
>  _("bad anextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
> -				be16_to_cpu(dino->di_anextents),
> -				lino, anextents);
> +				dnextents, lino, anextents);
>  		}
>  	}
>  
> @@ -1831,14 +1837,14 @@ process_inode_data_fork(
>  {
>  	xfs_ino_t	lino = XFS_AGINO_TO_INO(mp, agno, ino);
>  	int		err = 0;
> -	int		nex;
> +	xfs_extnum_t	nex;
>  
>  	/*
>  	 * extent count on disk is only valid for positive values. The kernel
>  	 * uses negative values in memory. hence if we see negative numbers
>  	 * here, trash it!
>  	 */
> -	nex = be32_to_cpu(dino->di_nextents);
> +	nex = xfs_dfork_nextents(&mp->m_sb, dino, XFS_DATA_FORK);
>  	if (nex < 0)
>  		*nextents = 1;
>  	else
> @@ -1959,7 +1965,7 @@ process_inode_attr_fork(
>  		return 0;
>  	}
>  
> -	*anextents = be16_to_cpu(dino->di_anextents);
> +	*anextents = xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK);
>  	if (*anextents > be64_to_cpu(dino->di_nblocks))
>  		*anextents = 1;
>  
> diff --git a/repair/prefetch.c b/repair/prefetch.c
> index 686bf7be..6eb7c06b 100644
> --- a/repair/prefetch.c
> +++ b/repair/prefetch.c
> @@ -393,7 +393,7 @@ pf_read_exinode(
>  	xfs_dinode_t		*dino)
>  {
>  	pf_read_bmbt_reclist(args, (xfs_bmbt_rec_t *)XFS_DFORK_DPTR(dino),
> -			be32_to_cpu(dino->di_nextents));
> +			xfs_dfork_nextents(&mp->m_sb, dino, XFS_DATA_FORK));
>  }
>  
>  static void
> -- 
> 2.28.0
> 

* Re: [PATCH 3/4] xfsprogs: Extend data/attr fork extent counter width
  2020-08-31 13:01 ` [PATCH 3/4] xfsprogs: Extend data/attr fork extent counter width Chandan Babu R
@ 2020-08-31 21:00   ` Darrick J. Wong
  2020-09-01 14:17     ` Chandan Babu R
  0 siblings, 1 reply; 10+ messages in thread
From: Darrick J. Wong @ 2020-08-31 21:00 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david, bfoster

On Mon, Aug 31, 2020 at 06:31:01PM +0530, Chandan Babu R wrote:
> The kernel commit xfs: fix inode fork extent count overflow
> (3f8a4f1d876d3e3e49e50b0396eaffcc4ba71b08) mentions that 10 billion
> data fork extents should be possible to create. However the
> corresponding on-disk field has a signed 32-bit type. Hence this
> commit extends the per-inode data extent counter to 47 bits. The
> length of 47-bits was chosen because,
> Maximum file size = 2^63.
> Maximum extent count when using 64k block size = 2^63 / 2^16 = 2^47.
> 
> Also, XFS has a per-inode xattr extent counter which is 16 bits
> wide. A workload which
> 1. Creates 1 million 255-byte sized xattrs,
> 2. Deletes 50% of these xattrs in an alternating manner,
> 3. Tries to insert 400,000 new 255-byte sized xattrs
>    causes the xattr extent counter to overflow.
> 
> Dave tells me that there are instances where a single file has more than
> 100 million hardlinks. With parent pointers being stored in xattrs, we
> will overflow the signed 16-bits wide xattr extent counter when large
> number of hardlinks are created. Hence this commit extends the on-disk
> field to 32-bits.
> 
> The following changes are made to accomplish this,
> 
> 1. A new incompat superblock flag to prevent older kernels from mounting
>    the filesystem. This flag has to be set during mkfs time.
> 2. Carve out a new 32-bit field from xfs_dinode->di_pad2[]. This field
>    holds the most significant 15 bits of the data extent counter.
> 3. Carve out a new 16-bit field from xfs_dinode->di_pad2[]. This field
>    holds the most significant 16 bits of the attr extent counter.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> ---
>  db/bmap.c                  |  2 +-
>  db/field.c                 |  4 ---
>  db/field.h                 |  2 --
>  db/inode.c                 | 17 ++++++++++--
>  include/libxlog.h          |  6 +++--
>  libxfs/xfs_bmap.c          | 12 +++++----
>  libxfs/xfs_format.h        | 20 ++++++++++----
>  libxfs/xfs_inode_buf.c     | 53 ++++++++++++++++++++++++++++++--------
>  libxfs/xfs_inode_buf.h     |  4 +--
>  libxfs/xfs_inode_fork.c    |  4 +--
>  libxfs/xfs_inode_fork.h    | 15 ++++++++---
>  libxfs/xfs_log_format.h    |  8 +++---
>  libxfs/xfs_types.h         |  6 +++--
>  logprint/log_misc.c        | 21 +++++++++++----
>  logprint/log_print_all.c   | 30 ++++++++++++++++-----
>  logprint/log_print_trans.c |  2 +-
>  repair/dinode.c            | 28 ++++++++++++--------
>  17 files changed, 165 insertions(+), 69 deletions(-)
> 
> diff --git a/db/bmap.c b/db/bmap.c
> index 9800a909..c374fa48 100644
> --- a/db/bmap.c
> +++ b/db/bmap.c
> @@ -47,7 +47,7 @@ bmap(
>  	int			n;
>  	int			nex;
>  	xfs_fsblock_t		nextbno;
> -	int			nextents;
> +	xfs_extnum_t		nextents;
>  	xfs_bmbt_ptr_t		*pp;
>  	xfs_bmdr_block_t	*rblock;
>  	typnm_t			typ;
> diff --git a/db/field.c b/db/field.c
> index aa0154d8..2d707e4e 100644
> --- a/db/field.c
> +++ b/db/field.c
> @@ -25,8 +25,6 @@
>  #include "symlink.h"
>  
>  const ftattr_t	ftattrtab[] = {
> -	{ FLDT_AEXTNUM, "aextnum", fp_num, "%d", SI(bitsz(xfs_aextnum_t)),
> -	  FTARG_SIGNED, NULL, NULL },
>  	{ FLDT_AGBLOCK, "agblock", fp_num, "%u", SI(bitsz(xfs_agblock_t)),
>  	  FTARG_DONULL, fa_agblock, NULL },
>  	{ FLDT_AGBLOCKNZ, "agblocknz", fp_num, "%u", SI(bitsz(xfs_agblock_t)),
> @@ -300,8 +298,6 @@ const ftattr_t	ftattrtab[] = {
>  	  FTARG_DONULL, fa_drtbno, NULL },
>  	{ FLDT_EXTLEN, "extlen", fp_num, "%u", SI(bitsz(xfs_extlen_t)), 0, NULL,
>  	  NULL },
> -	{ FLDT_EXTNUM, "extnum", fp_num, "%d", SI(bitsz(xfs_extnum_t)),
> -	  FTARG_SIGNED, NULL, NULL },
>  	{ FLDT_FSIZE, "fsize", fp_num, "%lld", SI(bitsz(xfs_fsize_t)),
>  	  FTARG_SIGNED, NULL, NULL },
>  	{ FLDT_INO, "ino", fp_num, "%llu", SI(bitsz(xfs_ino_t)), FTARG_DONULL,
> diff --git a/db/field.h b/db/field.h
> index 15065373..7ebc9a1e 100644
> --- a/db/field.h
> +++ b/db/field.h
> @@ -5,7 +5,6 @@
>   */
>  
>  typedef enum fldt	{
> -	FLDT_AEXTNUM,
>  	FLDT_AGBLOCK,
>  	FLDT_AGBLOCKNZ,
>  	FLDT_AGF,
> @@ -143,7 +142,6 @@ typedef enum fldt	{
>  	FLDT_DRFSBNO,
>  	FLDT_DRTBNO,
>  	FLDT_EXTLEN,
> -	FLDT_EXTNUM,
>  	FLDT_FSIZE,
>  	FLDT_INO,
>  	FLDT_INOBT,
> diff --git a/db/inode.c b/db/inode.c
> index 3853092c..50a942b6 100644
> --- a/db/inode.c
> +++ b/db/inode.c
> @@ -37,6 +37,7 @@ static int	inode_u_muuid_count(void *obj, int startoff);
>  static int	inode_u_sfdir2_count(void *obj, int startoff);
>  static int	inode_u_sfdir3_count(void *obj, int startoff);
>  static int	inode_u_symlink_count(void *obj, int startoff);
> +static int	inode_v3_wideextcnt_count(void *obj, int startoff);
>  
>  static const cmdinfo_t	inode_cmd =
>  	{ "inode", NULL, inode_f, 0, 1, 1, "[inode#]",
> @@ -100,8 +101,8 @@ const field_t	inode_core_flds[] = {
>  	{ "size", FLDT_FSIZE, OI(COFF(size)), C1, 0, TYP_NONE },
>  	{ "nblocks", FLDT_DRFSBNO, OI(COFF(nblocks)), C1, 0, TYP_NONE },
>  	{ "extsize", FLDT_EXTLEN, OI(COFF(extsize)), C1, 0, TYP_NONE },
> -	{ "nextents", FLDT_EXTNUM, OI(COFF(nextents)), C1, 0, TYP_NONE },
> -	{ "naextents", FLDT_AEXTNUM, OI(COFF(anextents)), C1, 0, TYP_NONE },
> +	{ "nextents_lo", FLDT_UINT32D, OI(COFF(nextents_lo)), C1, 0, TYP_NONE },
> +	{ "naextents_lo", FLDT_UINT16D, OI(COFF(anextents_lo)), C1, 0, TYP_NONE },
>  	{ "forkoff", FLDT_UINT8D, OI(COFF(forkoff)), C1, 0, TYP_NONE },
>  	{ "aformat", FLDT_DINODE_FMT, OI(COFF(aformat)), C1, 0, TYP_NONE },
>  	{ "dmevmask", FLDT_UINT32X, OI(COFF(dmevmask)), C1, 0, TYP_NONE },
> @@ -162,6 +163,10 @@ const field_t	inode_v3_flds[] = {
>  	{ "lsn", FLDT_UINT64X, OI(COFF(lsn)), C1, 0, TYP_NONE },
>  	{ "flags2", FLDT_UINT64X, OI(COFF(flags2)), C1, 0, TYP_NONE },
>  	{ "cowextsize", FLDT_EXTLEN, OI(COFF(cowextsize)), C1, 0, TYP_NONE },
> +	{ "nextents_hi", FLDT_UINT32D, OI(COFF(nextents_hi)),
> +	  inode_v3_wideextcnt_count, FLD_COUNT, TYP_NONE },
> +	{ "naextents_hi", FLDT_UINT16D, OI(COFF(anextents_hi)),
> +	  inode_v3_wideextcnt_count, FLD_COUNT, TYP_NONE },

Frankly, I would rather see you add new fp_ functions to db/fprint.c to
extract the relevant bits and keep them a single field rather than
splitting them into separate nextents_lo and nextents_hi fields.  I
don't really want to go doing that bit shifting in my head to figure out
how many extents an inode has.
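
A minimal sketch of that idea, assuming the two halves are stored
big-endian as in the patch. Every demo_* name below is hypothetical, and
this is not the real db/fprint.c interface; it only shows the bit
extraction such a print routine would do so that the user sees one
number.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the two on-disk halves, stored big-endian. */
struct demo_extcnt {
	uint8_t	lo[4];	/* low 32 bits, like di_nextents_lo */
	uint8_t	hi[4];	/* high bits, like di_nextents_hi */
};

/* Decode a 32-bit big-endian value byte by byte (endian-neutral). */
static uint32_t demo_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Combine both halves into the single count a user wants to read. */
static uint64_t demo_data_nextents(const struct demo_extcnt *c)
{
	return ((uint64_t)demo_be32(c->hi) << 32) | demo_be32(c->lo);
}

/* Format the combined count, roughly as a custom print routine might. */
static int demo_format_nextents(char *buf, size_t len,
		const struct demo_extcnt *c)
{
	return snprintf(buf, len, "%llu",
			(unsigned long long)demo_data_nextents(c));
}
```

The caller passes the raw on-disk bytes and gets back one decimal
string, so nobody has to shift and OR two fields by hand.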

Also: Same suggestion as the last patch -- API conversions to non-libxfs
code are fine to include in the "xfs:" patches to avoid breaking the
build, but all the other changes should be separate.

Notice how this patch has gotten very long because it adds wide-extcount
support to xfs_db, log dumping support to xfs_logprint, and repair
support to xfs_repair?

--D

>  	{ "pad2", FLDT_UINT8X, OI(OFF(pad2)), CI(12), FLD_ARRAY|FLD_SKIPALL, TYP_NONE },
>  	{ "crtime", FLDT_TIMESTAMP, OI(COFF(crtime)), C1, 0, TYP_NONE },
>  	{ "inumber", FLDT_INO, OI(COFF(ino)), C1, 0, TYP_NONE },
> @@ -396,6 +401,14 @@ inode_core_projid_count(
>  	return dic->di_version >= 2;
>  }
>  
> +static int
> +inode_v3_wideextcnt_count(
> +	void		*obj,
> +	int		startoff)
> +{
> +	return xfs_sb_version_haswideextcnt(&mp->m_sb);
> +}
> +
>  static int
>  inode_f(
>  	int		argc,
> diff --git a/include/libxlog.h b/include/libxlog.h
> index 5e94fa1e..1aab108c 100644
> --- a/include/libxlog.h
> +++ b/include/libxlog.h
> @@ -89,13 +89,15 @@ extern int	xlog_find_tail(struct xlog *log, xfs_daddr_t *head_blk,
>  
>  extern int	xlog_recover(struct xlog *log, int readonly);
>  extern void	xlog_recover_print_data(char *p, int len);
> -extern void	xlog_recover_print_logitem(xlog_recover_item_t *item);
> +extern void	xlog_recover_print_logitem(struct xlog *log,
> +			xlog_recover_item_t *item);
>  extern void	xlog_recover_print_trans_head(struct xlog_recover *tr);
>  extern int	xlog_print_find_oldest(struct xlog *log, xfs_daddr_t *last_blk);
>  
>  /* for transactional view */
>  extern void	xlog_recover_print_trans_head(struct xlog_recover *tr);
> -extern void	xlog_recover_print_trans(struct xlog_recover *trans,
> +extern void	xlog_recover_print_trans(struct xlog *log,
> +				struct xlog_recover *trans,
>  				struct list_head *itemq, int print);
>  extern int	xlog_do_recovery_pass(struct xlog *log, xfs_daddr_t head_blk,
>  				xfs_daddr_t tail_blk, int pass);
> diff --git a/libxfs/xfs_bmap.c b/libxfs/xfs_bmap.c
> index dae4d339..118b6e96 100644
> --- a/libxfs/xfs_bmap.c
> +++ b/libxfs/xfs_bmap.c
> @@ -45,19 +45,21 @@ xfs_bmap_compute_maxlevels(
>  	xfs_mount_t	*mp,		/* file system mount structure */
>  	int		whichfork)	/* data or attr fork */
>  {
> +	xfs_extnum_t	maxleafents;	/* max leaf entries possible */
>  	int		level;		/* btree level */
>  	uint		maxblocks;	/* max blocks at this level */
> -	uint		maxleafents;	/* max leaf entries possible */
>  	int		maxrootrecs;	/* max records in root block */
>  	int		minleafrecs;	/* min records in leaf block */
>  	int		minnoderecs;	/* min records in node block */
>  	int		sz;		/* root block size */
>  
>  	/*
> -	 * The maximum number of extents in a file, hence the maximum
> -	 * number of leaf entries, is controlled by the type of di_nextents
> -	 * (a signed 32-bit number, xfs_extnum_t), or by di_anextents
> -	 * (a signed 16-bit number, xfs_aextnum_t).
> +	 * The maximum number of extents in a file, hence the maximum number of
> +	 * leaf entries, is controlled by the size of the on-disk extent count,
> +	 * either a signed 32-bit number for the data fork, or a signed 16-bit
> +	 * number for the attr fork. With mkfs.xfs' wide-extcount option
> +	 * enabled, the data fork extent count is unsigned 47-bits wide, while
> +	 * the corresponding attr fork extent count is unsigned 32-bits wide.
>  	 *
>  	 * Note that we can no longer assume that if we are in ATTR1 that
>  	 * the fork offset of all the inodes will be
> diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
> index 188deada..ab44bcb4 100644
> --- a/libxfs/xfs_format.h
> +++ b/libxfs/xfs_format.h
> @@ -464,11 +464,13 @@ xfs_sb_has_ro_compat_feature(
>  
>  #define XFS_SB_FEAT_INCOMPAT_FTYPE	(1 << 0)	/* filetype in dirent */
>  #define XFS_SB_FEAT_INCOMPAT_SPINODES	(1 << 1)	/* sparse inode chunks */
> -#define XFS_SB_FEAT_INCOMPAT_META_UUID	(1 << 2)	/* metadata UUID */
> +#define XFS_SB_FEAT_INCOMPAT_META_UUID	(1 << 2)        /* metadata UUID */
> +#define XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT	(1 << 3)	/* Wider data/attr fork extent counters */
>  #define XFS_SB_FEAT_INCOMPAT_ALL \
>  		(XFS_SB_FEAT_INCOMPAT_FTYPE|	\
>  		 XFS_SB_FEAT_INCOMPAT_SPINODES|	\
> -		 XFS_SB_FEAT_INCOMPAT_META_UUID)
> +		 XFS_SB_FEAT_INCOMPAT_META_UUID| \
> +		 XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT)
>  
>  #define XFS_SB_FEAT_INCOMPAT_UNKNOWN	~XFS_SB_FEAT_INCOMPAT_ALL
>  static inline bool
> @@ -551,6 +553,12 @@ static inline bool xfs_sb_version_hasmetauuid(struct xfs_sb *sbp)
>  		(sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_META_UUID);
>  }
>  
> +static inline bool xfs_sb_version_haswideextcnt(struct xfs_sb *sbp)
> +{
> +	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
> +		(sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT);
> +}
> +
>  static inline bool xfs_sb_version_hasrmapbt(struct xfs_sb *sbp)
>  {
>  	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
> @@ -873,8 +881,8 @@ typedef struct xfs_dinode {
>  	__be64		di_size;	/* number of bytes in file */
>  	__be64		di_nblocks;	/* # of direct & btree blocks used */
>  	__be32		di_extsize;	/* basic/minimum extent size for file */
> -	__be32		di_nextents;	/* number of extents in data fork */
> -	__be16		di_anextents;	/* number of extents in attribute fork*/
> +	__be32		di_nextents_lo;	/* lower part of data fork extent count */
> +	__be16		di_anextents_lo;/* lower part of attr fork extent count */
>  	__u8		di_forkoff;	/* attr fork offs, <<3 for 64b align */
>  	__s8		di_aformat;	/* format of attr fork's data */
>  	__be32		di_dmevmask;	/* DMIG event mask */
> @@ -891,7 +899,9 @@ typedef struct xfs_dinode {
>  	__be64		di_lsn;		/* flush sequence */
>  	__be64		di_flags2;	/* more random flags */
>  	__be32		di_cowextsize;	/* basic cow extent size for file */
> -	__u8		di_pad2[12];	/* more padding for future expansion */
> +	__be32		di_nextents_hi; /* higher part of data fork extent count */
> +	__be16		di_anextents_hi;/* higher part of attr fork extent count */
> +	__u8		di_pad2[6];	/* more padding for future expansion */
>  
>  	/* fields only written to during inode creation */
>  	xfs_timestamp_t	di_crtime;	/* time created */
> diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
> index d5584372..219d0234 100644
> --- a/libxfs/xfs_inode_buf.c
> +++ b/libxfs/xfs_inode_buf.c
> @@ -188,6 +188,7 @@ xfs_inode_from_disk(
>  	struct xfs_inode	*ip,
>  	struct xfs_dinode	*from)
>  {
> +	struct xfs_sb		*sbp = &ip->i_mount->m_sb;
>  	struct xfs_icdinode	*to = &ip->i_d;
>  	struct inode		*inode = VFS_I(ip);
>  
> @@ -228,8 +229,8 @@ xfs_inode_from_disk(
>  	to->di_size = be64_to_cpu(from->di_size);
>  	to->di_nblocks = be64_to_cpu(from->di_nblocks);
>  	to->di_extsize = be32_to_cpu(from->di_extsize);
> -	to->di_nextents = be32_to_cpu(from->di_nextents);
> -	to->di_anextents = be16_to_cpu(from->di_anextents);
> +	to->di_nextents = be32_to_cpu(from->di_nextents_lo);
> +	to->di_anextents = be16_to_cpu(from->di_anextents_lo);
>  	to->di_forkoff = from->di_forkoff;
>  	to->di_aformat	= from->di_aformat;
>  	to->di_dmevmask	= be32_to_cpu(from->di_dmevmask);
> @@ -243,6 +244,13 @@ xfs_inode_from_disk(
>  		to->di_crtime.tv_nsec = be32_to_cpu(from->di_crtime.t_nsec);
>  		to->di_flags2 = be64_to_cpu(from->di_flags2);
>  		to->di_cowextsize = be32_to_cpu(from->di_cowextsize);
> +
> +		if (xfs_sb_version_haswideextcnt(sbp)) {
> +			to->di_nextents |=
> +				((uint64_t)(be32_to_cpu(from->di_nextents_hi)) << 32);
> +			to->di_anextents |=
> +				((uint32_t)(be16_to_cpu(from->di_anextents_hi)) << 16);
> +		}
>  	}
>  }
>  
> @@ -252,6 +260,7 @@ xfs_inode_to_disk(
>  	struct xfs_dinode	*to,
>  	xfs_lsn_t		lsn)
>  {
> +	struct xfs_sb		*sbp = &ip->i_mount->m_sb;
>  	struct xfs_icdinode	*from = &ip->i_d;
>  	struct inode		*inode = VFS_I(ip);
>  
> @@ -278,8 +287,8 @@ xfs_inode_to_disk(
>  	to->di_size = cpu_to_be64(from->di_size);
>  	to->di_nblocks = cpu_to_be64(from->di_nblocks);
>  	to->di_extsize = cpu_to_be32(from->di_extsize);
> -	to->di_nextents = cpu_to_be32(from->di_nextents);
> -	to->di_anextents = cpu_to_be16(from->di_anextents);
> +	to->di_nextents_lo = cpu_to_be32(from->di_nextents);
> +	to->di_anextents_lo = cpu_to_be16(from->di_anextents);
>  	to->di_forkoff = from->di_forkoff;
>  	to->di_aformat = from->di_aformat;
>  	to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
> @@ -293,6 +302,12 @@ xfs_inode_to_disk(
>  		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.tv_nsec);
>  		to->di_flags2 = cpu_to_be64(from->di_flags2);
>  		to->di_cowextsize = cpu_to_be32(from->di_cowextsize);
> +		if (xfs_sb_version_haswideextcnt(sbp)) {
> +			to->di_nextents_hi
> +				= cpu_to_be32(from->di_nextents >> 32);
> +			to->di_anextents_hi
> +				= cpu_to_be16(from->di_anextents >> 16);
> +		}
>  		to->di_ino = cpu_to_be64(ip->i_ino);
>  		to->di_lsn = cpu_to_be64(lsn);
>  		memset(to->di_pad2, 0, sizeof(to->di_pad2));
> @@ -306,9 +321,12 @@ xfs_inode_to_disk(
>  
>  void
>  xfs_log_dinode_to_disk(
> +	struct xfs_mount	*mp,
>  	struct xfs_log_dinode	*from,
>  	struct xfs_dinode	*to)
>  {
> +	struct xfs_sb		*sbp = &mp->m_sb;
> +
>  	to->di_magic = cpu_to_be16(from->di_magic);
>  	to->di_mode = cpu_to_be16(from->di_mode);
>  	to->di_version = from->di_version;
> @@ -331,8 +349,8 @@ xfs_log_dinode_to_disk(
>  	to->di_size = cpu_to_be64(from->di_size);
>  	to->di_nblocks = cpu_to_be64(from->di_nblocks);
>  	to->di_extsize = cpu_to_be32(from->di_extsize);
> -	to->di_nextents = cpu_to_be32(from->di_nextents);
> -	to->di_anextents = cpu_to_be16(from->di_anextents);
> +	to->di_nextents_lo = cpu_to_be32(from->di_nextents_lo);
> +	to->di_anextents_lo = cpu_to_be16(from->di_anextents_lo);
>  	to->di_forkoff = from->di_forkoff;
>  	to->di_aformat = from->di_aformat;
>  	to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
> @@ -346,6 +364,10 @@ xfs_log_dinode_to_disk(
>  		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.t_nsec);
>  		to->di_flags2 = cpu_to_be64(from->di_flags2);
>  		to->di_cowextsize = cpu_to_be32(from->di_cowextsize);
> +		if (xfs_sb_version_haswideextcnt(sbp)) {
> +			to->di_nextents_hi = cpu_to_be32(from->di_nextents_hi);
> +			to->di_anextents_hi = cpu_to_be16(from->di_anextents_hi);
> +		}
>  		to->di_ino = cpu_to_be64(from->di_ino);
>  		to->di_lsn = cpu_to_be64(from->di_lsn);
>  		memcpy(to->di_pad2, from->di_pad2, sizeof(to->di_pad2));
> @@ -363,7 +385,7 @@ xfs_dinode_verify_fork(
>  	int			whichfork)
>  {
>  	xfs_extnum_t		max_extents;
> -	uint32_t		di_nextents;
> +	xfs_extnum_t		di_nextents;
>  
>  	di_nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
>  
> @@ -400,10 +422,19 @@ xfs_dinode_verify_fork(
>  xfs_extnum_t
>  xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip, int whichfork)
>  {
> -	if (whichfork == XFS_DATA_FORK)
> -		return be32_to_cpu(dip->di_nextents);
> -	else
> -		return be16_to_cpu(dip->di_anextents);
> +	xfs_extnum_t nextents;
> +
> +	if (whichfork == XFS_DATA_FORK) {
> +		nextents = be32_to_cpu(dip->di_nextents_lo);
> +		if (xfs_sb_version_haswideextcnt(sbp))
> +			nextents |= ((uint64_t)be32_to_cpu(dip->di_nextents_hi) << 32);
> +	} else {
> +		nextents = be16_to_cpu(dip->di_anextents_lo);
> +		if (xfs_sb_version_haswideextcnt(sbp))
> +			nextents |= ((uint32_t)be16_to_cpu(dip->di_anextents_hi) << 16);
> +	}
> +
> +	return nextents;
>  }
>  
>  static xfs_failaddr_t
> diff --git a/libxfs/xfs_inode_buf.h b/libxfs/xfs_inode_buf.h
> index f97b3428..0dee0235 100644
> --- a/libxfs/xfs_inode_buf.h
> +++ b/libxfs/xfs_inode_buf.h
> @@ -55,8 +55,8 @@ void	xfs_dinode_calc_crc(struct xfs_mount *, struct xfs_dinode *);
>  void	xfs_inode_to_disk(struct xfs_inode *ip, struct xfs_dinode *to,
>  			  xfs_lsn_t lsn);
>  void	xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
> -void	xfs_log_dinode_to_disk(struct xfs_log_dinode *from,
> -			       struct xfs_dinode *to);
> +void	xfs_log_dinode_to_disk(struct xfs_mount *mp,
> +			struct xfs_log_dinode *from, struct xfs_dinode *to);
>  
>  #if defined(DEBUG)
>  void	xfs_inobp_check(struct xfs_mount *, struct xfs_buf *);
> diff --git a/libxfs/xfs_inode_fork.c b/libxfs/xfs_inode_fork.c
> index 8c32f993..af4f893f 100644
> --- a/libxfs/xfs_inode_fork.c
> +++ b/libxfs/xfs_inode_fork.c
> @@ -213,14 +213,14 @@ xfs_iformat_extents(
>  	struct xfs_iext_cursor	icur;
>  	struct xfs_bmbt_rec	*dp;
>  	struct xfs_bmbt_irec	new;
> -	int			i;
> +	xfs_extnum_t		i;
>  
>  	/*
>  	 * If the number of extents is unreasonable, then something is wrong and
>  	 * we just bail out rather than crash in kmem_alloc() or memcpy() below.
>  	 */
>  	if (unlikely(size < 0 || size > XFS_DFORK_SIZE(dip, mp, whichfork))) {
> -		xfs_warn(ip->i_mount, "corrupt inode %Lu ((a)extents = %d).",
> +		xfs_warn(ip->i_mount, "corrupt inode %Lu ((a)extents = %llu).",
>  			(unsigned long long) ip->i_ino, nex);
>  		xfs_inode_verifier_error(ip, -EFSCORRUPTED,
>  				"xfs_iformat_extents(1)", dip, sizeof(*dip),
> diff --git a/libxfs/xfs_inode_fork.h b/libxfs/xfs_inode_fork.h
> index e318dfdd..22f3c9b3 100644
> --- a/libxfs/xfs_inode_fork.h
> +++ b/libxfs/xfs_inode_fork.h
> @@ -90,10 +90,17 @@ static inline xfs_extnum_t xfs_iext_max(struct xfs_sb *sbp, int whichfork)
>  {
>  	ASSERT(whichfork == XFS_DATA_FORK || whichfork == XFS_ATTR_FORK);
>  
> -	if (whichfork == XFS_DATA_FORK)
> -		return MAXEXTNUM;
> -	else
> -		return MAXAEXTNUM;
> +	if (whichfork == XFS_DATA_FORK) {
> +		if (xfs_sb_version_haswideextcnt(sbp))
> +			return MAXEXTNUM_HI;
> +		else
> +			return MAXEXTNUM;
> +	} else {
> +		if (xfs_sb_version_haswideextcnt(sbp))
> +			return MAXAEXTNUM_HI;
> +		else
> +			return MAXAEXTNUM;
> +	}
>  }
>  
>  struct xfs_ifork *xfs_iext_state_to_fork(struct xfs_inode *ip, int state);
> diff --git a/libxfs/xfs_log_format.h b/libxfs/xfs_log_format.h
> index e3400c9c..809f8ce6 100644
> --- a/libxfs/xfs_log_format.h
> +++ b/libxfs/xfs_log_format.h
> @@ -396,8 +396,8 @@ struct xfs_log_dinode {
>  	xfs_fsize_t	di_size;	/* number of bytes in file */
>  	xfs_rfsblock_t	di_nblocks;	/* # of direct & btree blocks used */
>  	xfs_extlen_t	di_extsize;	/* basic/minimum extent size for file */
> -	xfs_extnum_t	di_nextents;	/* number of extents in data fork */
> -	xfs_aextnum_t	di_anextents;	/* number of extents in attribute fork*/
> +	uint32_t	di_nextents_lo;	/* lower part of data fork extent count */
> +	uint16_t	di_anextents_lo;/* lower part of attr fork extent count*/
>  	uint8_t		di_forkoff;	/* attr fork offs, <<3 for 64b align */
>  	int8_t		di_aformat;	/* format of attr fork's data */
>  	uint32_t	di_dmevmask;	/* DMIG event mask */
> @@ -414,7 +414,9 @@ struct xfs_log_dinode {
>  	xfs_lsn_t	di_lsn;		/* flush sequence */
>  	uint64_t	di_flags2;	/* more random flags */
>  	uint32_t	di_cowextsize;	/* basic cow extent size for file */
> -	uint8_t		di_pad2[12];	/* more padding for future expansion */
> +	uint32_t	di_nextents_hi; /* higher part of data fork extent count */
> +	uint16_t	di_anextents_hi;/* higher part of attr fork extent count */
> +	uint8_t		di_pad2[6];	/* more padding for future expansion */
>  
>  	/* fields only written to during inode creation */
>  	xfs_ictimestamp_t di_crtime;	/* time created */
> diff --git a/libxfs/xfs_types.h b/libxfs/xfs_types.h
> index 397d9477..23ff8166 100644
> --- a/libxfs/xfs_types.h
> +++ b/libxfs/xfs_types.h
> @@ -12,8 +12,8 @@ typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
>  typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
>  typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
>  typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
> -typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
> -typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
> +typedef uint64_t	xfs_extnum_t;	/* # of extents in a file */
> +typedef uint32_t	xfs_aextnum_t;	/* # extents in an attribute fork */
>  typedef int64_t		xfs_fsize_t;	/* bytes in a file */
>  typedef uint64_t	xfs_ufsize_t;	/* unsigned bytes in a file */
>  
> @@ -61,6 +61,8 @@ typedef void *		xfs_failaddr_t;
>  #define	MAXEXTLEN	((xfs_extlen_t)0x001fffff)	/* 21 bits */
>  #define	MAXEXTNUM	((xfs_extnum_t)0x7fffffff)	/* signed int */
>  #define	MAXAEXTNUM	((xfs_aextnum_t)0x7fff)		/* signed short */
> +#define MAXEXTNUM_HI	((xfs_extnum_t)0x7fffffffffff)	/* unsigned 47 bits */
> +#define MAXAEXTNUM_HI	((xfs_aextnum_t)0xffffffff)	/* unsigned 32 bits */
>  
>  /*
>   * Minimum and maximum blocksize and sectorsize.
> diff --git a/logprint/log_misc.c b/logprint/log_misc.c
> index be889887..4d09f357 100644
> --- a/logprint/log_misc.c
> +++ b/logprint/log_misc.c
> @@ -438,8 +438,11 @@ xlog_print_trans_qoff(char **ptr, uint len)
>  
>  static void
>  xlog_print_trans_inode_core(
> +	struct xfs_mount	*mp,
>  	struct xfs_log_dinode	*ip)
>  {
> +	xfs_extnum_t		nextents;
> +
>      printf(_("INODE CORE\n"));
>      printf(_("magic 0x%hx mode 0%ho version %d format %d\n"),
>  	   ip->di_magic, ip->di_mode, (int)ip->di_version,
> @@ -448,11 +451,19 @@ xlog_print_trans_inode_core(
>  	   ip->di_nlink, ip->di_uid, ip->di_gid);
>      printf(_("atime 0x%x mtime 0x%x ctime 0x%x\n"),
>  	   ip->di_atime.t_sec, ip->di_mtime.t_sec, ip->di_ctime.t_sec);
> -    printf(_("size 0x%llx nblocks 0x%llx extsize 0x%x nextents 0x%x\n"),
> +
> +    nextents = ip->di_nextents_lo;
> +    if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> +	    nextents |= ((xfs_extnum_t)ip->di_nextents_hi << 32);
> +    printf(_("size 0x%llx nblocks 0x%llx extsize 0x%x nextents 0x%lx\n"),
>  	   (unsigned long long)ip->di_size, (unsigned long long)ip->di_nblocks,
> -	   ip->di_extsize, ip->di_nextents);
> -    printf(_("naextents 0x%x forkoff %d dmevmask 0x%x dmstate 0x%hx\n"),
> -	   ip->di_anextents, (int)ip->di_forkoff, ip->di_dmevmask,
> +	   ip->di_extsize, nextents);
> +
> +    nextents = ip->di_anextents_lo;
> +    if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> +	    nextents |= ((xfs_extnum_t)ip->di_anextents_hi << 16);
> +    printf(_("naextents 0x%lx forkoff %d dmevmask 0x%x dmstate 0x%hx\n"),
> +	   nextents, (int)ip->di_forkoff, ip->di_dmevmask,
>  	   ip->di_dmstate);
>      printf(_("flags 0x%x gen 0x%x\n"),
>  	   ip->di_flags, ip->di_gen);
> @@ -562,7 +573,7 @@ xlog_print_trans_inode(
>      memmove(&dino, *ptr, sizeof(dino));
>      mode = dino.di_mode & S_IFMT;
>      size = (int)dino.di_size;
> -    xlog_print_trans_inode_core(&dino);
> +    xlog_print_trans_inode_core(log->l_mp, &dino);
>      *ptr += xfs_log_dinode_size(log->l_mp);
>      skip_count--;
>  
> diff --git a/logprint/log_print_all.c b/logprint/log_print_all.c
> index e2e28b9c..aa171dfb 100644
> --- a/logprint/log_print_all.c
> +++ b/logprint/log_print_all.c
> @@ -238,9 +238,14 @@ xlog_recover_print_dquot(
>  
>  STATIC void
>  xlog_recover_print_inode_core(
> +	struct xlog		*log,
>  	struct xfs_log_dinode	*di)
>  {
> -	printf(_("	CORE inode:\n"));
> +	struct xfs_sb		*sbp = &log->l_mp->m_sb;
> +	xfs_aextnum_t		anextents;
> +	xfs_extnum_t		nextents;
> +
> +        printf(_("	CORE inode:\n"));
>  	if (!print_inode)
>  		return;
>  	printf(_("		magic:%c%c  mode:0x%x  ver:%d  format:%d\n"),
> @@ -252,10 +257,17 @@ xlog_recover_print_inode_core(
>  	printf(_("		atime:%d  mtime:%d  ctime:%d\n"),
>  	       di->di_atime.t_sec, di->di_mtime.t_sec, di->di_ctime.t_sec);
>  	printf(_("		flushiter:%d\n"), di->di_flushiter);
> +
> +	nextents = di->di_nextents_lo;
> +	anextents = di->di_anextents_lo;
> +	if (xfs_sb_version_haswideextcnt(sbp)) {
> +		nextents |= ((xfs_extnum_t)di->di_nextents_hi << 32);
> +		anextents |= ((xfs_aextnum_t)di->di_anextents_hi << 16);
> +	}
>  	printf(_("		size:0x%llx  nblks:0x%llx  exsize:%d  "
> -	     "nextents:%d  anextents:%d\n"), (unsigned long long)
> +	     "nextents:%lu  anextents:%u\n"), (unsigned long long)
>  	       di->di_size, (unsigned long long)di->di_nblocks,
> -	       di->di_extsize, di->di_nextents, (int)di->di_anextents);
> +	       di->di_extsize, nextents, anextents);
>  	printf(_("		forkoff:%d  dmevmask:0x%x  dmstate:%d  flags:0x%x  "
>  	     "gen:%u\n"),
>  	       (int)di->di_forkoff, di->di_dmevmask, (int)di->di_dmstate,
> @@ -268,6 +280,7 @@ xlog_recover_print_inode_core(
>  
>  STATIC void
>  xlog_recover_print_inode(
> +	struct xlog		*log,
>  	xlog_recover_item_t	*item)
>  {
>  	struct xfs_inode_log_format	f_buf;
> @@ -289,7 +302,7 @@ xlog_recover_print_inode(
>  	ASSERT(item->ri_buf[1].i_len ==
>  			offsetof(struct xfs_log_dinode, di_next_unlinked) ||
>  	       item->ri_buf[1].i_len == sizeof(struct xfs_log_dinode));
> -	xlog_recover_print_inode_core((struct xfs_log_dinode *)
> +	xlog_recover_print_inode_core(log, (struct xfs_log_dinode *)
>  				      item->ri_buf[1].i_addr);
>  
>  	hasdata = (f->ilf_fields & XFS_ILOG_DFORK) != 0;
> @@ -384,6 +397,7 @@ xlog_recover_print_icreate(
>  
>  void
>  xlog_recover_print_logitem(
> +	struct xlog		*log,
>  	xlog_recover_item_t	*item)
>  {
>  	switch (ITEM_TYPE(item)) {
> @@ -394,7 +408,7 @@ xlog_recover_print_logitem(
>  		xlog_recover_print_icreate(item);
>  		break;
>  	case XFS_LI_INODE:
> -		xlog_recover_print_inode(item);
> +		xlog_recover_print_inode(log, item);
>  		break;
>  	case XFS_LI_EFD:
>  		xlog_recover_print_efd(item);
> @@ -434,6 +448,7 @@ xlog_recover_print_logitem(
>  
>  static void
>  xlog_recover_print_item(
> +	struct xlog		*log,
>  	xlog_recover_item_t	*item)
>  {
>  	int			i;
> @@ -493,11 +508,12 @@ xlog_recover_print_item(
>  		       (long)item->ri_buf[i].i_addr, item->ri_buf[i].i_len);
>  	}
>  	printf("\n");
> -	xlog_recover_print_logitem(item);
> +	xlog_recover_print_logitem(log, item);
>  }
>  
>  void
>  xlog_recover_print_trans(
> +	struct xlog		*log,
>  	struct xlog_recover	*trans,
>  	struct list_head	*itemq,
>  	int			print)
> @@ -510,5 +526,5 @@ xlog_recover_print_trans(
>  	print_xlog_record_line();
>  	xlog_recover_print_trans_head(trans);
>  	list_for_each_entry(item, itemq, ri_list)
> -		xlog_recover_print_item(item);
> +		xlog_recover_print_item(log, item);
>  }
> diff --git a/logprint/log_print_trans.c b/logprint/log_print_trans.c
> index 2004b5a0..c6386fb0 100644
> --- a/logprint/log_print_trans.c
> +++ b/logprint/log_print_trans.c
> @@ -24,7 +24,7 @@ xlog_recover_do_trans(
>  	struct xlog_recover	*trans,
>  	int			pass)
>  {
> -	xlog_recover_print_trans(trans, &trans->r_itemq, 3);
> +	xlog_recover_print_trans(log, trans, &trans->r_itemq, 3);
>  	return 0;
>  }
>  
> diff --git a/repair/dinode.c b/repair/dinode.c
> index 98bb4a17..5a8de0f6 100644
> --- a/repair/dinode.c
> +++ b/repair/dinode.c
> @@ -71,7 +71,9 @@ _("would have cleared inode %" PRIu64 " attributes\n"), ino_num);
>  	if (xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK) != 0) {
>  		if (no_modify)
>  			return(1);
> -		dino->di_anextents = cpu_to_be16(0);
> +		dino->di_anextents_lo = cpu_to_be16(0);
> +		if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> +			dino->di_anextents_hi = cpu_to_be16(0);
>  	}
>  
>  	if (dino->di_aformat != XFS_DINODE_FMT_EXTENTS)  {
> @@ -959,7 +961,7 @@ process_symlink_extlist(xfs_mount_t *mp, xfs_ino_t lino, xfs_dinode_t *dino)
>  	xfs_fileoff_t		expected_offset;
>  	xfs_bmbt_rec_t		*rp;
>  	xfs_bmbt_irec_t		irec;
> -	int			numrecs;
> +	xfs_extnum_t		numrecs;
>  	int			i;
>  	int			max_blocks;
>  
> @@ -989,7 +991,7 @@ _("mismatch between format (%d) and size (%" PRId64 ") in symlink inode %" PRIu6
>  	 */
>  	if (numrecs > max_symlink_blocks)  {
>  		do_warn(
> -_("bad number of extents (%d) in symlink %" PRIu64 " data fork\n"),
> +_("bad number of extents (%lu) in symlink %" PRIu64 " data fork\n"),
>  			numrecs, lino);
>  		return(1);
>  	}
> @@ -1556,7 +1558,7 @@ _("realtime summary inode %" PRIu64 " has bad type 0x%x, "),
>  		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
>  		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
>  			do_warn(
> -_("bad # of extents (%u) for realtime summary inode %" PRIu64 "\n"),
> +_("bad # of extents (%lu) for realtime summary inode %" PRIu64 "\n"),
>  				nextents, lino);
>  			return 1;
>  		}
> @@ -1579,7 +1581,7 @@ _("realtime bitmap inode %" PRIu64 " has bad type 0x%x, "),
>  		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
>  		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
>  			do_warn(
> -_("bad # of extents (%u) for realtime bitmap inode %" PRIu64 "\n"),
> +_("bad # of extents (%lu) for realtime bitmap inode %" PRIu64 "\n"),
>  				nextents, lino);
>  			return 1;
>  		}
> @@ -1772,13 +1774,15 @@ _("too many data fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
>  	if (nextents != dnextents)  {
>  		if (!no_modify)  {
>  			do_warn(
> -_("correcting nextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
> +_("correcting nextents for inode %" PRIu64 ", was %lu - counted %" PRIu64 "\n"),
>  				lino, dnextents, nextents);
> -			dino->di_nextents = cpu_to_be32(nextents);
> +			dino->di_nextents_lo = cpu_to_be32(nextents);
> +			if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> +				dino->di_nextents_hi = cpu_to_be32(nextents >> 32);
>  			*dirty = 1;
>  		} else  {
>  			do_warn(
> -_("bad nextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
> +_("bad nextents %lu for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
>  				dnextents, lino, nextents);
>  		}
>  	}
> @@ -1795,13 +1799,15 @@ _("too many attr fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
>  	if (anextents != dnextents)  {
>  		if (!no_modify)  {
>  			do_warn(
> -_("correcting anextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
> +_("correcting anextents for inode %" PRIu64 ", was %lu - counted %" PRIu64 "\n"),
>  				lino, dnextents, anextents);
> -			dino->di_anextents = cpu_to_be16(anextents);
> +			dino->di_anextents_lo = cpu_to_be16(anextents);
> +			if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> +				dino->di_anextents_hi = cpu_to_be16(anextents >> 16);
>  			*dirty = 1;
>  		} else  {
>  			do_warn(
> -_("bad anextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
> +_("bad anextents %lu for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
>  				dnextents, lino, anextents);
>  		}
>  	}
> -- 
> 2.28.0
> 


* Re: [PATCH 2/4] xfsprogs: Introduce xfs_dfork_nextents() helper
  2020-08-31 20:54   ` Darrick J. Wong
@ 2020-09-01 14:17     ` Chandan Babu R
  2020-09-01 15:42       ` Darrick J. Wong
  0 siblings, 1 reply; 10+ messages in thread
From: Chandan Babu R @ 2020-09-01 14:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, bfoster

On Tuesday 1 September 2020 2:24:26 AM IST Darrick J. Wong wrote:
> On Mon, Aug 31, 2020 at 06:31:00PM +0530, Chandan Babu R wrote:
> > This commit replaces the macro XFS_DFORK_NEXTENTS() with the helper
> > function xfs_dfork_nextents(). As of this commit, xfs_dfork_nextents()
> > returns the same value as XFS_DFORK_NEXTENTS(). A future commit which
> > extends inode's extent counter fields will add more logic to this
> > helper.
> > 
> > This commit also replaces direct accesses to xfs_dinode->di_[a]nextents
> > with calls to xfs_dfork_nextents().
> > 
> > No functional changes have been made.
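
The "no functional change" claim can be illustrated outside the tree. This is a minimal standalone sketch, not the libxfs code: the struct, the byte-wise big-endian readers and every sketch_/SKETCH_ name are hypothetical stand-ins, and the sb argument is left out because the helper does not look at it yet in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fork selectors, standing in for XFS_DATA_FORK/XFS_ATTR_FORK. */
enum { SKETCH_DATA_FORK = 0, SKETCH_ATTR_FORK = 1 };

/* Hypothetical cut-down inode: both counters kept as big-endian bytes. */
struct sketch_old_dinode {
	uint8_t di_nextents[4];		/* data fork extent count, be32 */
	uint8_t di_anextents[2];	/* attr fork extent count, be16 */
};

/* Byte-wise big-endian readers, so the sketch is host-endian independent. */
uint32_t sketch_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

uint16_t sketch_be16(const uint8_t *p)
{
	return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Same shape as the macro that this patch removes. */
#define SKETCH_DFORK_NEXTENTS(dip, w) \
	((w) == SKETCH_DATA_FORK ? \
		sketch_be32((dip)->di_nextents) : \
		(uint32_t)sketch_be16((dip)->di_anextents))

/* Same shape as the helper that this patch adds. */
uint32_t sketch_dfork_nextents(const struct sketch_old_dinode *dip,
			       int whichfork)
{
	if (whichfork == SKETCH_DATA_FORK)
		return sketch_be32(dip->di_nextents);
	else
		return sketch_be16(dip->di_anextents);
}
```

For any inode contents the macro and the function return the same value for both forks, which is all this patch is meant to guarantee.
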
> > 
> > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > ---
> >  db/bmap.c               |  6 +++---
> >  db/btdump.c             |  4 ++--
> >  db/check.c              |  2 +-
> >  db/frag.c               |  8 ++++---
> >  db/inode.c              | 14 ++++++------
> >  db/metadump.c           |  4 ++--
> >  libxfs/xfs_format.h     |  4 ----
> >  libxfs/xfs_inode_buf.c  | 26 ++++++++++++++++------
> >  libxfs/xfs_inode_buf.h  |  2 ++
> >  libxfs/xfs_inode_fork.c |  3 ++-
> >  repair/attr_repair.c    |  2 +-
> >  repair/dinode.c         | 48 +++++++++++++++++++++++------------------
> >  repair/prefetch.c       |  2 +-
> >  13 files changed, 74 insertions(+), 51 deletions(-)
> > 
> > diff --git a/db/bmap.c b/db/bmap.c
> > index fdc70e95..9800a909 100644
> > --- a/db/bmap.c
> > +++ b/db/bmap.c
> > @@ -68,7 +68,7 @@ bmap(
> >  	ASSERT(fmt == XFS_DINODE_FMT_LOCAL || fmt == XFS_DINODE_FMT_EXTENTS ||
> >  		fmt == XFS_DINODE_FMT_BTREE);
> >  	if (fmt == XFS_DINODE_FMT_EXTENTS) {
> > -		nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
> > +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> >  		xp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
> >  		for (ep = xp; ep < &xp[nextents] && n < nex; ep++) {
> >  			if (!bmap_one_extent(ep, &curoffset, eoffset, &n, bep))
> > @@ -158,9 +158,9 @@ bmap_f(
> >  		push_cur();
> >  		set_cur_inode(iocur_top->ino);
> >  		dip = iocur_top->data;
> > -		if (be32_to_cpu(dip->di_nextents))
> > +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK))
> 
> Suggestion: Shift these kinds of changes to a separate patch to minimize
> the amount of non-libxfs changes in a patch that will (eventually) be
> ported from the kernel.  Ideally, the only changes to db/ and repair/
> and mkfs/ would be the ones that are necessary to avoid breaking the
> build.
> 
> Once you've separated the other conversions (like this one here) into a
> separate patch, we can review that as a separate refactoring change to
> userspace.

So the changes should be split into two patches: one containing the
conversion changes to code inside libxfs, and the other containing the
conversions in non-libxfs code (db/, repair/, etc.). Please correct me if
I have misunderstood.

> 
> The reason for this ofc is that when the maintainers run libxfs-apply to
> pull in the kernel patches, they're totally going to miss things like
> this conversion unless you make them an explicit separate change.
> 
> FWIW the conversions themselves mostly look ok...
> 
> >  			dfork = 1;
> > -		if (be16_to_cpu(dip->di_anextents))
> > +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
> >  			afork = 1;
> >  		pop_cur();
> >  	}
> > diff --git a/db/btdump.c b/db/btdump.c
> > index 920f595b..9ced71d4 100644
> > --- a/db/btdump.c
> > +++ b/db/btdump.c
> > @@ -166,13 +166,13 @@ dump_inode(
> >  
> >  	dip = iocur_top->data;
> >  	if (attrfork) {
> > -		if (!dip->di_anextents ||
> > +		if (!xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) ||
> >  		    dip->di_aformat != XFS_DINODE_FMT_BTREE) {
> >  			dbprintf(_("attr fork not in btree format\n"));
> >  			return 0;
> >  		}
> >  	} else {
> > -		if (!dip->di_nextents ||
> > +		if (!xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK) ||
> >  		    dip->di_format != XFS_DINODE_FMT_BTREE) {
> >  			dbprintf(_("data fork not in btree format\n"));
> >  			return 0;
> > diff --git a/db/check.c b/db/check.c
> > index 12c03b6d..2d1823a4 100644
> > --- a/db/check.c
> > +++ b/db/check.c
> > @@ -2686,7 +2686,7 @@ process_exinode(
> >  	xfs_bmbt_rec_t		*rp;
> >  
> >  	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
> > -	*nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > +	*nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> >  	if (*nex < 0 || *nex > XFS_DFORK_SIZE(dip, mp, whichfork) /
> >  						sizeof(xfs_bmbt_rec_t)) {
> >  		if (!sflag || id->ilist)
> > diff --git a/db/frag.c b/db/frag.c
> > index 1cfc6c2c..20fb1306 100644
> > --- a/db/frag.c
> > +++ b/db/frag.c
> > @@ -262,9 +262,11 @@ process_exinode(
> >  	int			whichfork)
> >  {
> >  	xfs_bmbt_rec_t		*rp;
> > +	xfs_extnum_t		nextents;
> >  
> >  	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
> > -	process_bmbt_reclist(rp, XFS_DFORK_NEXTENTS(dip, whichfork), extmapp);
> > +	nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> > +	process_bmbt_reclist(rp, nextents, extmapp);
> >  }
> >  
> >  static void
> > @@ -273,9 +275,9 @@ process_fork(
> >  	int		whichfork)
> >  {
> >  	extmap_t	*extmap;
> > -	int		nex;
> > +	xfs_extnum_t	nex;
> >  
> > -	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > +	nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> >  	if (!nex)
> >  		return;
> >  	extmap = extmap_alloc(nex);
> > diff --git a/db/inode.c b/db/inode.c
> > index 0cff9d63..3853092c 100644
> > --- a/db/inode.c
> > +++ b/db/inode.c
> > @@ -271,7 +271,7 @@ inode_a_bmx_count(
> >  		return 0;
> >  	ASSERT((char *)XFS_DFORK_APTR(dip) - (char *)dip == byteize(startoff));
> >  	return dip->di_aformat == XFS_DINODE_FMT_EXTENTS ?
> > -		be16_to_cpu(dip->di_anextents) : 0;
> > +		xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) : 0;
> >  }
> >  
> >  static int
> > @@ -325,6 +325,7 @@ inode_a_size(
> >  {
> >  	xfs_attr_shortform_t	*asf;
> >  	xfs_dinode_t		*dip;
> > +	xfs_extnum_t		nextents;
> >  
> >  	ASSERT(startoff == 0);
> >  	ASSERT(idx == 0);
> > @@ -334,8 +335,8 @@ inode_a_size(
> >  		asf = (xfs_attr_shortform_t *)XFS_DFORK_APTR(dip);
> >  		return bitize(be16_to_cpu(asf->hdr.totsize));
> >  	case XFS_DINODE_FMT_EXTENTS:
> > -		return (int)be16_to_cpu(dip->di_anextents) *
> > -							bitsz(xfs_bmbt_rec_t);
> > +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
> > +		return (int)(nextents * bitsz(xfs_bmbt_rec_t));
> >  	case XFS_DINODE_FMT_BTREE:
> >  		return bitize((int)XFS_DFORK_ASIZE(dip, mp));
> >  	default:
> > @@ -496,7 +497,7 @@ inode_u_bmx_count(
> >  	dip = obj;
> >  	ASSERT((char *)XFS_DFORK_DPTR(dip) - (char *)dip == byteize(startoff));
> >  	return dip->di_format == XFS_DINODE_FMT_EXTENTS ?
> > -		be32_to_cpu(dip->di_nextents) : 0;
> > +		xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK) : 0;
> >  }
> >  
> >  static int
> > @@ -582,6 +583,7 @@ inode_u_size(
> >  	int		idx)
> >  {
> >  	xfs_dinode_t	*dip;
> > +	xfs_extnum_t	nextents;
> >  
> >  	ASSERT(startoff == 0);
> >  	ASSERT(idx == 0);
> > @@ -592,8 +594,8 @@ inode_u_size(
> >  	case XFS_DINODE_FMT_LOCAL:
> >  		return bitize((int)be64_to_cpu(dip->di_size));
> >  	case XFS_DINODE_FMT_EXTENTS:
> > -		return (int)be32_to_cpu(dip->di_nextents) *
> > -						bitsz(xfs_bmbt_rec_t);
> > +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK);
> > +		return (int)(nextents * bitsz(xfs_bmbt_rec_t));
> >  	case XFS_DINODE_FMT_BTREE:
> >  		return bitize((int)XFS_DFORK_DSIZE(dip, mp));
> >  	case XFS_DINODE_FMT_UUID:
> > diff --git a/db/metadump.c b/db/metadump.c
> > index e5cb3aa5..6a6757a2 100644
> > --- a/db/metadump.c
> > +++ b/db/metadump.c
> > @@ -2282,7 +2282,7 @@ process_exinode(
> >  
> >  	whichfork = (itype == TYP_ATTR) ? XFS_ATTR_FORK : XFS_DATA_FORK;
> >  
> > -	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > +	nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> >  	used = nex * sizeof(xfs_bmbt_rec_t);
> >  	if (nex < 0 || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> >  		if (show_warnings)
> > @@ -2335,7 +2335,7 @@ static int
> >  process_dev_inode(
> >  	xfs_dinode_t		*dip)
> >  {
> > -	if (XFS_DFORK_NEXTENTS(dip, XFS_DATA_FORK)) {
> > +	if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK)) {
> >  		if (show_warnings)
> >  			print_warning("inode %llu has unexpected extents",
> >  				      (unsigned long long)cur_ino);
> > diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
> > index a738cd8b..188deada 100644
> > --- a/libxfs/xfs_format.h
> > +++ b/libxfs/xfs_format.h
> > @@ -993,10 +993,6 @@ enum xfs_dinode_fmt {
> >  	((w) == XFS_DATA_FORK ? \
> >  		(dip)->di_format : \
> >  		(dip)->di_aformat)
> > -#define XFS_DFORK_NEXTENTS(dip,w) \
> > -	((w) == XFS_DATA_FORK ? \
> > -		be32_to_cpu((dip)->di_nextents) : \
> > -		be16_to_cpu((dip)->di_anextents))
> >  
> >  /*
> >   * For block and character special files the 32bit dev_t is stored at the
> > diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
> > index ae71a19e..d5584372 100644
> > --- a/libxfs/xfs_inode_buf.c
> > +++ b/libxfs/xfs_inode_buf.c
> > @@ -362,9 +362,10 @@ xfs_dinode_verify_fork(
> >  	struct xfs_mount	*mp,
> >  	int			whichfork)
> >  {
> > -	uint32_t		di_nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
> >  	xfs_extnum_t		max_extents;
> > +	uint32_t		di_nextents;
> >  
> > +	di_nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> >  
> >  	switch (XFS_DFORK_FORMAT(dip, whichfork)) {
> >  	case XFS_DINODE_FMT_LOCAL:
> > @@ -396,6 +397,15 @@ xfs_dinode_verify_fork(
> >  	return NULL;
> >  }
> >  
> > +xfs_extnum_t
> > +xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip, int whichfork)
> > +{
> > +	if (whichfork == XFS_DATA_FORK)
> > +		return be32_to_cpu(dip->di_nextents);
> > +	else
> > +		return be16_to_cpu(dip->di_anextents);
> > +}
> > +
> >  static xfs_failaddr_t
> >  xfs_dinode_verify_forkoff(
> >  	struct xfs_dinode	*dip,
> > @@ -432,6 +442,8 @@ xfs_dinode_verify(
> >  	uint16_t		flags;
> >  	uint64_t		flags2;
> >  	uint64_t		di_size;
> > +	xfs_extnum_t            nextents;
> > +	int64_t			nblocks;
> >  
> >  	if (dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC))
> >  		return __this_address;
> > @@ -462,10 +474,12 @@ xfs_dinode_verify(
> >  	if ((S_ISLNK(mode) || S_ISDIR(mode)) && di_size == 0)
> >  		return __this_address;
> >  
> > -	/* Fork checks carried over from xfs_iformat_fork */
> > -	if (mode &&
> > -	    be32_to_cpu(dip->di_nextents) + be16_to_cpu(dip->di_anextents) >
> > -			be64_to_cpu(dip->di_nblocks))
> > +	nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK);
> > +	nextents += xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
> > +	nblocks = be64_to_cpu(dip->di_nblocks);
> > +
> > +        /* Fork checks carried over from xfs_iformat_fork */
> > +	if (mode && nextents > nblocks)
> >  		return __this_address;
> >  
> >  	if (mode && XFS_DFORK_BOFF(dip) > mp->m_sb.sb_inodesize)
> > @@ -522,7 +536,7 @@ xfs_dinode_verify(
> >  		default:
> >  			return __this_address;
> >  		}
> > -		if (dip->di_anextents)
> > +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
> >  			return __this_address;
> >  	}
> >  
> > diff --git a/libxfs/xfs_inode_buf.h b/libxfs/xfs_inode_buf.h
> > index 9b373dcf..f97b3428 100644
> > --- a/libxfs/xfs_inode_buf.h
> > +++ b/libxfs/xfs_inode_buf.h
> > @@ -71,5 +71,7 @@ xfs_failaddr_t xfs_inode_validate_extsize(struct xfs_mount *mp,
> >  xfs_failaddr_t xfs_inode_validate_cowextsize(struct xfs_mount *mp,
> >  		uint32_t cowextsize, uint16_t mode, uint16_t flags,
> >  		uint64_t flags2);
> > +xfs_extnum_t xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip,
> > +			int whichfork);
> >  
> >  #endif	/* __XFS_INODE_BUF_H__ */
> > diff --git a/libxfs/xfs_inode_fork.c b/libxfs/xfs_inode_fork.c
> > index 80ba6c12..8c32f993 100644
> > --- a/libxfs/xfs_inode_fork.c
> > +++ b/libxfs/xfs_inode_fork.c
> > @@ -205,9 +205,10 @@ xfs_iformat_extents(
> >  	int			whichfork)
> >  {
> >  	struct xfs_mount	*mp = ip->i_mount;
> > +	struct xfs_sb		*sbp = &mp->m_sb;
> >  	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
> >  	int			state = xfs_bmap_fork_to_state(whichfork);
> > -	int			nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > +	xfs_extnum_t		nex = xfs_dfork_nextents(sbp, dip, whichfork);
> >  	int			size = nex * sizeof(xfs_bmbt_rec_t);
> >  	struct xfs_iext_cursor	icur;
> >  	struct xfs_bmbt_rec	*dp;
> > diff --git a/repair/attr_repair.c b/repair/attr_repair.c
> > index 6cec0f70..b6ca564b 100644
> > --- a/repair/attr_repair.c
> > +++ b/repair/attr_repair.c
> > @@ -1083,7 +1083,7 @@ process_longform_attr(
> >  	bno = blkmap_get(blkmap, 0);
> >  	if (bno == NULLFSBLOCK) {
> >  		if (dip->di_aformat == XFS_DINODE_FMT_EXTENTS &&
> > -				be16_to_cpu(dip->di_anextents) == 0)
> > +			xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) == 0)
> 
> 		    ^
> This should /not/ be indented so that it lines up with the if body.

Sorry about that. I will fix it up.

-- 
chandan





* Re: [PATCH 3/4] xfsprogs: Extend data/attr fork extent counter width
  2020-08-31 21:00   ` Darrick J. Wong
@ 2020-09-01 14:17     ` Chandan Babu R
  0 siblings, 0 replies; 10+ messages in thread
From: Chandan Babu R @ 2020-09-01 14:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david, bfoster

On Tuesday 1 September 2020 2:30:32 AM IST Darrick J. Wong wrote:
> On Mon, Aug 31, 2020 at 06:31:01PM +0530, Chandan Babu R wrote:
> > The kernel commit xfs: fix inode fork extent count overflow
> > (3f8a4f1d876d3e3e49e50b0396eaffcc4ba71b08) mentions that 10 billion
> > data fork extents should be possible to create. However the
> > corresponding on-disk field has a signed 32-bit type. Hence this
> > commit extends the per-inode data extent counter to 47 bits. A width
> > of 47 bits was chosen because:
> > Maximum file size = 2^63.
> > Maximum extent count when using 64k block size = 2^63 / 2^16 = 2^47.
> > 
> > Also, XFS has a per-inode xattr extent counter which is 16 bits
> > wide. A workload which
> > 1. Creates 1 million 255-byte sized xattrs,
> > 2. Deletes 50% of these xattrs in an alternating manner,
> > 3. Tries to insert 400,000 new 255-byte sized xattrs
> >    causes the xattr extent counter to overflow.
> > 
> > Dave tells me that there are instances where a single file has more than
> > 100 million hardlinks. With parent pointers being stored in xattrs, we
> > will overflow the signed 16-bit wide xattr extent counter when a large
> > number of hardlinks are created. Hence this commit extends the on-disk
> > field to 32 bits.
> > 
> > The following changes are made to accomplish this,
> > 
> > 1. A new incompat superblock flag to prevent older kernels from mounting
> >    the filesystem. This flag has to be set during mkfs time.
> > 2. Carve out a new 32-bit field from xfs_dinode->di_pad2[]. This field
> >    holds the most significant 15 bits of the data extent counter.
> > 3. Carve out a new 16-bit field from xfs_dinode->di_pad2[]. This field
> >    holds the most significant 16 bits of the attr extent counter.
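
The lo/hi split described in the list above can be sketched as a standalone program fragment. This is only an illustration under stated assumptions: the struct and all sketch_/SKETCH_ names are hypothetical, the fields are kept in CPU byte order (the on-disk fields are big-endian and go through cpu_to_be32()/be32_to_cpu() and friends), and the superblock feature check is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the four counter fields, in CPU byte order. */
struct sketch_wide_dinode {
	uint32_t nextents_lo;	/* low 32 bits of data fork extent count */
	uint32_t nextents_hi;	/* high part; only 15 bits are needed */
	uint16_t anextents_lo;	/* low 16 bits of attr fork extent count */
	uint16_t anextents_hi;	/* high 16 bits of attr fork extent count */
};

/* 2^47 - 1: largest data fork extent count with the wide counter. */
#define SKETCH_MAXEXTNUM_HI	((uint64_t)0x7fffffffffffULL)
/* 2^32 - 1: largest attr fork extent count with the wide counter. */
#define SKETCH_MAXAEXTNUM_HI	((uint32_t)0xffffffffU)

/* Split a 47-bit data fork extent count across the lo and hi fields. */
void sketch_set_nextents(struct sketch_wide_dinode *d, uint64_t n)
{
	d->nextents_lo = (uint32_t)n;
	d->nextents_hi = (uint32_t)(n >> 32);
}

/* Recombine the data fork extent count from its two halves. */
uint64_t sketch_get_nextents(const struct sketch_wide_dinode *d)
{
	return (uint64_t)d->nextents_lo | ((uint64_t)d->nextents_hi << 32);
}

/* Split a 32-bit attr fork extent count across the lo and hi fields. */
void sketch_set_anextents(struct sketch_wide_dinode *d, uint32_t n)
{
	d->anextents_lo = (uint16_t)n;
	d->anextents_hi = (uint16_t)(n >> 16);
}

/* Recombine the attr fork extent count from its two halves. */
uint32_t sketch_get_anextents(const struct sketch_wide_dinode *d)
{
	return (uint32_t)d->anextents_lo | ((uint32_t)d->anextents_hi << 16);
}
```

Storing SKETCH_MAXEXTNUM_HI leaves 0xffffffff in the low field and 0x7fff in the high field, i.e. 32 + 15 = 47 bits, which matches 2^63 / 2^16 = 2^47 from the commit message.
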
> > 
> > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > ---
> >  db/bmap.c                  |  2 +-
> >  db/field.c                 |  4 ---
> >  db/field.h                 |  2 --
> >  db/inode.c                 | 17 ++++++++++--
> >  include/libxlog.h          |  6 +++--
> >  libxfs/xfs_bmap.c          | 12 +++++----
> >  libxfs/xfs_format.h        | 20 ++++++++++----
> >  libxfs/xfs_inode_buf.c     | 53 ++++++++++++++++++++++++++++++--------
> >  libxfs/xfs_inode_buf.h     |  4 +--
> >  libxfs/xfs_inode_fork.c    |  4 +--
> >  libxfs/xfs_inode_fork.h    | 15 ++++++++---
> >  libxfs/xfs_log_format.h    |  8 +++---
> >  libxfs/xfs_types.h         |  6 +++--
> >  logprint/log_misc.c        | 21 +++++++++++----
> >  logprint/log_print_all.c   | 30 ++++++++++++++++-----
> >  logprint/log_print_trans.c |  2 +-
> >  repair/dinode.c            | 28 ++++++++++++--------
> >  17 files changed, 165 insertions(+), 69 deletions(-)
> > 
> > diff --git a/db/bmap.c b/db/bmap.c
> > index 9800a909..c374fa48 100644
> > --- a/db/bmap.c
> > +++ b/db/bmap.c
> > @@ -47,7 +47,7 @@ bmap(
> >  	int			n;
> >  	int			nex;
> >  	xfs_fsblock_t		nextbno;
> > -	int			nextents;
> > +	xfs_extnum_t		nextents;
> >  	xfs_bmbt_ptr_t		*pp;
> >  	xfs_bmdr_block_t	*rblock;
> >  	typnm_t			typ;
> > diff --git a/db/field.c b/db/field.c
> > index aa0154d8..2d707e4e 100644
> > --- a/db/field.c
> > +++ b/db/field.c
> > @@ -25,8 +25,6 @@
> >  #include "symlink.h"
> >  
> >  const ftattr_t	ftattrtab[] = {
> > -	{ FLDT_AEXTNUM, "aextnum", fp_num, "%d", SI(bitsz(xfs_aextnum_t)),
> > -	  FTARG_SIGNED, NULL, NULL },
> >  	{ FLDT_AGBLOCK, "agblock", fp_num, "%u", SI(bitsz(xfs_agblock_t)),
> >  	  FTARG_DONULL, fa_agblock, NULL },
> >  	{ FLDT_AGBLOCKNZ, "agblocknz", fp_num, "%u", SI(bitsz(xfs_agblock_t)),
> > @@ -300,8 +298,6 @@ const ftattr_t	ftattrtab[] = {
> >  	  FTARG_DONULL, fa_drtbno, NULL },
> >  	{ FLDT_EXTLEN, "extlen", fp_num, "%u", SI(bitsz(xfs_extlen_t)), 0, NULL,
> >  	  NULL },
> > -	{ FLDT_EXTNUM, "extnum", fp_num, "%d", SI(bitsz(xfs_extnum_t)),
> > -	  FTARG_SIGNED, NULL, NULL },
> >  	{ FLDT_FSIZE, "fsize", fp_num, "%lld", SI(bitsz(xfs_fsize_t)),
> >  	  FTARG_SIGNED, NULL, NULL },
> >  	{ FLDT_INO, "ino", fp_num, "%llu", SI(bitsz(xfs_ino_t)), FTARG_DONULL,
> > diff --git a/db/field.h b/db/field.h
> > index 15065373..7ebc9a1e 100644
> > --- a/db/field.h
> > +++ b/db/field.h
> > @@ -5,7 +5,6 @@
> >   */
> >  
> >  typedef enum fldt	{
> > -	FLDT_AEXTNUM,
> >  	FLDT_AGBLOCK,
> >  	FLDT_AGBLOCKNZ,
> >  	FLDT_AGF,
> > @@ -143,7 +142,6 @@ typedef enum fldt	{
> >  	FLDT_DRFSBNO,
> >  	FLDT_DRTBNO,
> >  	FLDT_EXTLEN,
> > -	FLDT_EXTNUM,
> >  	FLDT_FSIZE,
> >  	FLDT_INO,
> >  	FLDT_INOBT,
> > diff --git a/db/inode.c b/db/inode.c
> > index 3853092c..50a942b6 100644
> > --- a/db/inode.c
> > +++ b/db/inode.c
> > @@ -37,6 +37,7 @@ static int	inode_u_muuid_count(void *obj, int startoff);
> >  static int	inode_u_sfdir2_count(void *obj, int startoff);
> >  static int	inode_u_sfdir3_count(void *obj, int startoff);
> >  static int	inode_u_symlink_count(void *obj, int startoff);
> > +static int	inode_v3_wideextcnt_count(void *obj, int startoff);
> >  
> >  static const cmdinfo_t	inode_cmd =
> >  	{ "inode", NULL, inode_f, 0, 1, 1, "[inode#]",
> > @@ -100,8 +101,8 @@ const field_t	inode_core_flds[] = {
> >  	{ "size", FLDT_FSIZE, OI(COFF(size)), C1, 0, TYP_NONE },
> >  	{ "nblocks", FLDT_DRFSBNO, OI(COFF(nblocks)), C1, 0, TYP_NONE },
> >  	{ "extsize", FLDT_EXTLEN, OI(COFF(extsize)), C1, 0, TYP_NONE },
> > -	{ "nextents", FLDT_EXTNUM, OI(COFF(nextents)), C1, 0, TYP_NONE },
> > -	{ "naextents", FLDT_AEXTNUM, OI(COFF(anextents)), C1, 0, TYP_NONE },
> > +	{ "nextents_lo", FLDT_UINT32D, OI(COFF(nextents_lo)), C1, 0, TYP_NONE },
> > +	{ "naextents_lo", FLDT_UINT16D, OI(COFF(anextents_lo)), C1, 0, TYP_NONE },
> >  	{ "forkoff", FLDT_UINT8D, OI(COFF(forkoff)), C1, 0, TYP_NONE },
> >  	{ "aformat", FLDT_DINODE_FMT, OI(COFF(aformat)), C1, 0, TYP_NONE },
> >  	{ "dmevmask", FLDT_UINT32X, OI(COFF(dmevmask)), C1, 0, TYP_NONE },
> > @@ -162,6 +163,10 @@ const field_t	inode_v3_flds[] = {
> >  	{ "lsn", FLDT_UINT64X, OI(COFF(lsn)), C1, 0, TYP_NONE },
> >  	{ "flags2", FLDT_UINT64X, OI(COFF(flags2)), C1, 0, TYP_NONE },
> >  	{ "cowextsize", FLDT_EXTLEN, OI(COFF(cowextsize)), C1, 0, TYP_NONE },
> > +	{ "nextents_hi", FLDT_UINT32D, OI(COFF(nextents_hi)),
> > +	  inode_v3_wideextcnt_count, FLD_COUNT, TYP_NONE },
> > +	{ "naextents_hi", FLDT_UINT16D, OI(COFF(anextents_hi)),
> > +	  inode_v3_wideextcnt_count, FLD_COUNT, TYP_NONE },
> 
> Frankly, I would rather see you add new fp_ functions to db/fprint.c to
> extract the relevant bits and keep them a single field rather than
> splitting them into separate nextents_lo and nextents_hi fields.  I
> don't really want to go doing that bit shifting in my head to figure out
> how many extents an inode has.

I agree. I will make the above suggested change.
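
As a rough sketch of what "keep them a single field" could mean, the
helpers below recombine the low and high halves into one value that a
print routine could display directly. The struct and function names are
hypothetical and are not part of xfs_db or db/fprint.c; the fields are
assumed to have already been converted to CPU endianness.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU-endian copies of the four on-disk counter fields. */
struct extcnt_fields {
	uint32_t	nextents_lo;	/* low 32 bits of data fork count */
	uint32_t	nextents_hi;	/* high bits of data fork count */
	uint16_t	anextents_lo;	/* low 16 bits of attr fork count */
	uint16_t	anextents_hi;	/* high 16 bits of attr fork count */
};

/* Data fork: the high half is only meaningful with the wide-counter feature. */
static uint64_t
data_fork_extents(const struct extcnt_fields *f, bool wide)
{
	uint64_t	n = f->nextents_lo;

	if (wide)
		n |= (uint64_t)f->nextents_hi << 32;
	return n;
}

/* Attr fork: same idea, with a 16-bit low half and a 16-bit high half. */
static uint32_t
attr_fork_extents(const struct extcnt_fields *f, bool wide)
{
	uint32_t	n = f->anextents_lo;

	if (wide)
		n |= (uint32_t)f->anextents_hi << 16;
	return n;
}
```

With something along these lines, the tool would print a single extent
count per fork and nobody has to do the shifting by hand.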

> 
> Also: Same suggestion as the last patch -- API conversions to non-libxfs
> code are fine to include in the "xfs:" patches to avoid breaking the
> build, but all the other changes should be separate.
> 
> Notice how this patch has gotten very long because it adds widextcount
> support to xfs_db, log dumping support to xfs_logprint, and the ability
> to fix things to xfs_repair?

I will fix this up too.

> 
> --D
> 
> >  	{ "pad2", FLDT_UINT8X, OI(OFF(pad2)), CI(12), FLD_ARRAY|FLD_SKIPALL, TYP_NONE },
> >  	{ "crtime", FLDT_TIMESTAMP, OI(COFF(crtime)), C1, 0, TYP_NONE },
> >  	{ "inumber", FLDT_INO, OI(COFF(ino)), C1, 0, TYP_NONE },
> > @@ -396,6 +401,14 @@ inode_core_projid_count(
> >  	return dic->di_version >= 2;
> >  }
> >  
> > +static int
> > +inode_v3_wideextcnt_count(
> > +	void		*obj,
> > +	int		startoff)
> > +{
> > +	return xfs_sb_version_haswideextcnt(&mp->m_sb);
> > +}
> > +
> >  static int
> >  inode_f(
> >  	int		argc,
> > diff --git a/include/libxlog.h b/include/libxlog.h
> > index 5e94fa1e..1aab108c 100644
> > --- a/include/libxlog.h
> > +++ b/include/libxlog.h
> > @@ -89,13 +89,15 @@ extern int	xlog_find_tail(struct xlog *log, xfs_daddr_t *head_blk,
> >  
> >  extern int	xlog_recover(struct xlog *log, int readonly);
> >  extern void	xlog_recover_print_data(char *p, int len);
> > -extern void	xlog_recover_print_logitem(xlog_recover_item_t *item);
> > +extern void	xlog_recover_print_logitem(struct xlog *log,
> > +			xlog_recover_item_t *item);
> >  extern void	xlog_recover_print_trans_head(struct xlog_recover *tr);
> >  extern int	xlog_print_find_oldest(struct xlog *log, xfs_daddr_t *last_blk);
> >  
> >  /* for transactional view */
> >  extern void	xlog_recover_print_trans_head(struct xlog_recover *tr);
> > -extern void	xlog_recover_print_trans(struct xlog_recover *trans,
> > +extern void	xlog_recover_print_trans(struct xlog *log,
> > +				struct xlog_recover *trans,
> >  				struct list_head *itemq, int print);
> >  extern int	xlog_do_recovery_pass(struct xlog *log, xfs_daddr_t head_blk,
> >  				xfs_daddr_t tail_blk, int pass);
> > diff --git a/libxfs/xfs_bmap.c b/libxfs/xfs_bmap.c
> > index dae4d339..118b6e96 100644
> > --- a/libxfs/xfs_bmap.c
> > +++ b/libxfs/xfs_bmap.c
> > @@ -45,19 +45,21 @@ xfs_bmap_compute_maxlevels(
> >  	xfs_mount_t	*mp,		/* file system mount structure */
> >  	int		whichfork)	/* data or attr fork */
> >  {
> > +	xfs_extnum_t	maxleafents;	/* max leaf entries possible */
> >  	int		level;		/* btree level */
> >  	uint		maxblocks;	/* max blocks at this level */
> > -	uint		maxleafents;	/* max leaf entries possible */
> >  	int		maxrootrecs;	/* max records in root block */
> >  	int		minleafrecs;	/* min records in leaf block */
> >  	int		minnoderecs;	/* min records in node block */
> >  	int		sz;		/* root block size */
> >  
> >  	/*
> > -	 * The maximum number of extents in a file, hence the maximum
> > -	 * number of leaf entries, is controlled by the type of di_nextents
> > -	 * (a signed 32-bit number, xfs_extnum_t), or by di_anextents
> > -	 * (a signed 16-bit number, xfs_aextnum_t).
> > +	 * The maximum number of extents in a file, hence the maximum number of
> > +	 * leaf entries, is controlled by the size of the on-disk extent count,
> > +	 * either a signed 32-bit number for the data fork, or a signed 16-bit
> > +	 * number for the attr fork. With mkfs.xfs' wide-extcount option
> > +	 * enabled, the data fork extent count is unsigned 47-bits wide, while
> > +	 * the corresponding attr fork extent count is unsigned 32-bits wide.
> >  	 *
> >  	 * Note that we can no longer assume that if we are in ATTR1 that
> >  	 * the fork offset of all the inodes will be
> > diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
> > index 188deada..ab44bcb4 100644
> > --- a/libxfs/xfs_format.h
> > +++ b/libxfs/xfs_format.h
> > @@ -464,11 +464,13 @@ xfs_sb_has_ro_compat_feature(
> >  
> >  #define XFS_SB_FEAT_INCOMPAT_FTYPE	(1 << 0)	/* filetype in dirent */
> >  #define XFS_SB_FEAT_INCOMPAT_SPINODES	(1 << 1)	/* sparse inode chunks */
> > -#define XFS_SB_FEAT_INCOMPAT_META_UUID	(1 << 2)	/* metadata UUID */
> > +#define XFS_SB_FEAT_INCOMPAT_META_UUID	(1 << 2)        /* metadata UUID */
> > +#define XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT	(1 << 3)	/* Wider data/attr fork extent counters */
> >  #define XFS_SB_FEAT_INCOMPAT_ALL \
> >  		(XFS_SB_FEAT_INCOMPAT_FTYPE|	\
> >  		 XFS_SB_FEAT_INCOMPAT_SPINODES|	\
> > -		 XFS_SB_FEAT_INCOMPAT_META_UUID)
> > +		 XFS_SB_FEAT_INCOMPAT_META_UUID| \
> > +		 XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT)
> >  
> >  #define XFS_SB_FEAT_INCOMPAT_UNKNOWN	~XFS_SB_FEAT_INCOMPAT_ALL
> >  static inline bool
> > @@ -551,6 +553,12 @@ static inline bool xfs_sb_version_hasmetauuid(struct xfs_sb *sbp)
> >  		(sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_META_UUID);
> >  }
> >  
> > +static inline bool xfs_sb_version_haswideextcnt(struct xfs_sb *sbp)
> > +{
> > +	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
> > +		(sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_WIDEEXTCNT);
> > +}
> > +
> >  static inline bool xfs_sb_version_hasrmapbt(struct xfs_sb *sbp)
> >  {
> >  	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
> > @@ -873,8 +881,8 @@ typedef struct xfs_dinode {
> >  	__be64		di_size;	/* number of bytes in file */
> >  	__be64		di_nblocks;	/* # of direct & btree blocks used */
> >  	__be32		di_extsize;	/* basic/minimum extent size for file */
> > -	__be32		di_nextents;	/* number of extents in data fork */
> > -	__be16		di_anextents;	/* number of extents in attribute fork*/
> > +	__be32		di_nextents_lo;	/* lower part of data fork extent count */
> > +	__be16		di_anextents_lo;/* lower part of attr fork extent count */
> >  	__u8		di_forkoff;	/* attr fork offs, <<3 for 64b align */
> >  	__s8		di_aformat;	/* format of attr fork's data */
> >  	__be32		di_dmevmask;	/* DMIG event mask */
> > @@ -891,7 +899,9 @@ typedef struct xfs_dinode {
> >  	__be64		di_lsn;		/* flush sequence */
> >  	__be64		di_flags2;	/* more random flags */
> >  	__be32		di_cowextsize;	/* basic cow extent size for file */
> > -	__u8		di_pad2[12];	/* more padding for future expansion */
> > +	__be32		di_nextents_hi; /* higher part of data fork extent count */
> > +	__be16		di_anextents_hi;/* higher part of attr fork extent count */
> > +	__u8		di_pad2[6];	/* more padding for future expansion */
> >  
> >  	/* fields only written to during inode creation */
> >  	xfs_timestamp_t	di_crtime;	/* time created */
> > diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
> > index d5584372..219d0234 100644
> > --- a/libxfs/xfs_inode_buf.c
> > +++ b/libxfs/xfs_inode_buf.c
> > @@ -188,6 +188,7 @@ xfs_inode_from_disk(
> >  	struct xfs_inode	*ip,
> >  	struct xfs_dinode	*from)
> >  {
> > +	struct xfs_sb		*sbp = &ip->i_mount->m_sb;
> >  	struct xfs_icdinode	*to = &ip->i_d;
> >  	struct inode		*inode = VFS_I(ip);
> >  
> > @@ -228,8 +229,8 @@ xfs_inode_from_disk(
> >  	to->di_size = be64_to_cpu(from->di_size);
> >  	to->di_nblocks = be64_to_cpu(from->di_nblocks);
> >  	to->di_extsize = be32_to_cpu(from->di_extsize);
> > -	to->di_nextents = be32_to_cpu(from->di_nextents);
> > -	to->di_anextents = be16_to_cpu(from->di_anextents);
> > +	to->di_nextents = be32_to_cpu(from->di_nextents_lo);
> > +	to->di_anextents = be16_to_cpu(from->di_anextents_lo);
> >  	to->di_forkoff = from->di_forkoff;
> >  	to->di_aformat	= from->di_aformat;
> >  	to->di_dmevmask	= be32_to_cpu(from->di_dmevmask);
> > @@ -243,6 +244,13 @@ xfs_inode_from_disk(
> >  		to->di_crtime.tv_nsec = be32_to_cpu(from->di_crtime.t_nsec);
> >  		to->di_flags2 = be64_to_cpu(from->di_flags2);
> >  		to->di_cowextsize = be32_to_cpu(from->di_cowextsize);
> > +
> > +		if (xfs_sb_version_haswideextcnt(sbp)) {
> > +			to->di_nextents |=
> > +				((uint64_t)(be32_to_cpu(from->di_nextents_hi)) << 32);
> > +			to->di_anextents |=
> > +				((uint32_t)(be16_to_cpu(from->di_anextents_hi)) << 16);
> > +		}
> >  	}
> >  }
> >  
> > @@ -252,6 +260,7 @@ xfs_inode_to_disk(
> >  	struct xfs_dinode	*to,
> >  	xfs_lsn_t		lsn)
> >  {
> > +	struct xfs_sb		*sbp = &ip->i_mount->m_sb;
> >  	struct xfs_icdinode	*from = &ip->i_d;
> >  	struct inode		*inode = VFS_I(ip);
> >  
> > @@ -278,8 +287,8 @@ xfs_inode_to_disk(
> >  	to->di_size = cpu_to_be64(from->di_size);
> >  	to->di_nblocks = cpu_to_be64(from->di_nblocks);
> >  	to->di_extsize = cpu_to_be32(from->di_extsize);
> > -	to->di_nextents = cpu_to_be32(from->di_nextents);
> > -	to->di_anextents = cpu_to_be16(from->di_anextents);
> > +	to->di_nextents_lo = cpu_to_be32(from->di_nextents);
> > +	to->di_anextents_lo = cpu_to_be16(from->di_anextents);
> >  	to->di_forkoff = from->di_forkoff;
> >  	to->di_aformat = from->di_aformat;
> >  	to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
> > @@ -293,6 +302,12 @@ xfs_inode_to_disk(
> >  		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.tv_nsec);
> >  		to->di_flags2 = cpu_to_be64(from->di_flags2);
> >  		to->di_cowextsize = cpu_to_be32(from->di_cowextsize);
> > +		if (xfs_sb_version_haswideextcnt(sbp)) {
> > +			to->di_nextents_hi
> > +				= cpu_to_be32(from->di_nextents >> 32);
> > +			to->di_anextents_hi
> > +				= cpu_to_be16(from->di_anextents >> 16);
> > +		}
> >  		to->di_ino = cpu_to_be64(ip->i_ino);
> >  		to->di_lsn = cpu_to_be64(lsn);
> >  		memset(to->di_pad2, 0, sizeof(to->di_pad2));
> > @@ -306,9 +321,12 @@ xfs_inode_to_disk(
> >  
> >  void
> >  xfs_log_dinode_to_disk(
> > +	struct xfs_mount	*mp,
> >  	struct xfs_log_dinode	*from,
> >  	struct xfs_dinode	*to)
> >  {
> > +	struct xfs_sb		*sbp = &mp->m_sb;
> > +
> >  	to->di_magic = cpu_to_be16(from->di_magic);
> >  	to->di_mode = cpu_to_be16(from->di_mode);
> >  	to->di_version = from->di_version;
> > @@ -331,8 +349,8 @@ xfs_log_dinode_to_disk(
> >  	to->di_size = cpu_to_be64(from->di_size);
> >  	to->di_nblocks = cpu_to_be64(from->di_nblocks);
> >  	to->di_extsize = cpu_to_be32(from->di_extsize);
> > -	to->di_nextents = cpu_to_be32(from->di_nextents);
> > -	to->di_anextents = cpu_to_be16(from->di_anextents);
> > +	to->di_nextents_lo = cpu_to_be32(from->di_nextents_lo);
> > +	to->di_anextents_lo = cpu_to_be16(from->di_anextents_lo);
> >  	to->di_forkoff = from->di_forkoff;
> >  	to->di_aformat = from->di_aformat;
> >  	to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
> > @@ -346,6 +364,10 @@ xfs_log_dinode_to_disk(
> >  		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.t_nsec);
> >  		to->di_flags2 = cpu_to_be64(from->di_flags2);
> >  		to->di_cowextsize = cpu_to_be32(from->di_cowextsize);
> > +		if (xfs_sb_version_haswideextcnt(sbp)) {
> > +			to->di_nextents_hi = cpu_to_be32(from->di_nextents_hi);
> > +			to->di_anextents_hi = cpu_to_be16(from->di_anextents_hi);
> > +		}
> >  		to->di_ino = cpu_to_be64(from->di_ino);
> >  		to->di_lsn = cpu_to_be64(from->di_lsn);
> >  		memcpy(to->di_pad2, from->di_pad2, sizeof(to->di_pad2));
> > @@ -363,7 +385,7 @@ xfs_dinode_verify_fork(
> >  	int			whichfork)
> >  {
> >  	xfs_extnum_t		max_extents;
> > -	uint32_t		di_nextents;
> > +	xfs_extnum_t		di_nextents;
> >  
> >  	di_nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> >  
> > @@ -400,10 +422,19 @@ xfs_dinode_verify_fork(
> >  xfs_extnum_t
> >  xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip, int whichfork)
> >  {
> > -	if (whichfork == XFS_DATA_FORK)
> > -		return be32_to_cpu(dip->di_nextents);
> > -	else
> > -		return be16_to_cpu(dip->di_anextents);
> > +	xfs_extnum_t nextents;
> > +
> > +	if (whichfork == XFS_DATA_FORK) {
> > +		nextents = be32_to_cpu(dip->di_nextents_lo);
> > +		if (xfs_sb_version_haswideextcnt(sbp))
> > +			nextents |= ((uint64_t)be32_to_cpu(dip->di_nextents_hi) << 32);
> > +	} else {
> > +		nextents = be16_to_cpu(dip->di_anextents_lo);
> > +		if (xfs_sb_version_haswideextcnt(sbp))
> > +			nextents |= ((uint32_t)be16_to_cpu(dip->di_anextents_hi) << 16);
> > +	}
> > +
> > +	return nextents;
> >  }
> >  
> >  static xfs_failaddr_t
> > diff --git a/libxfs/xfs_inode_buf.h b/libxfs/xfs_inode_buf.h
> > index f97b3428..0dee0235 100644
> > --- a/libxfs/xfs_inode_buf.h
> > +++ b/libxfs/xfs_inode_buf.h
> > @@ -55,8 +55,8 @@ void	xfs_dinode_calc_crc(struct xfs_mount *, struct xfs_dinode *);
> >  void	xfs_inode_to_disk(struct xfs_inode *ip, struct xfs_dinode *to,
> >  			  xfs_lsn_t lsn);
> >  void	xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
> > -void	xfs_log_dinode_to_disk(struct xfs_log_dinode *from,
> > -			       struct xfs_dinode *to);
> > +void	xfs_log_dinode_to_disk(struct xfs_mount *mp,
> > +			struct xfs_log_dinode *from, struct xfs_dinode *to);
> >  
> >  #if defined(DEBUG)
> >  void	xfs_inobp_check(struct xfs_mount *, struct xfs_buf *);
> > diff --git a/libxfs/xfs_inode_fork.c b/libxfs/xfs_inode_fork.c
> > index 8c32f993..af4f893f 100644
> > --- a/libxfs/xfs_inode_fork.c
> > +++ b/libxfs/xfs_inode_fork.c
> > @@ -213,14 +213,14 @@ xfs_iformat_extents(
> >  	struct xfs_iext_cursor	icur;
> >  	struct xfs_bmbt_rec	*dp;
> >  	struct xfs_bmbt_irec	new;
> > -	int			i;
> > +	xfs_extnum_t		i;
> >  
> >  	/*
> >  	 * If the number of extents is unreasonable, then something is wrong and
> >  	 * we just bail out rather than crash in kmem_alloc() or memcpy() below.
> >  	 */
> >  	if (unlikely(size < 0 || size > XFS_DFORK_SIZE(dip, mp, whichfork))) {
> > -		xfs_warn(ip->i_mount, "corrupt inode %Lu ((a)extents = %d).",
> > +		xfs_warn(ip->i_mount, "corrupt inode %Lu ((a)extents = %llu).",
> >  			(unsigned long long) ip->i_ino, nex);
> >  		xfs_inode_verifier_error(ip, -EFSCORRUPTED,
> >  				"xfs_iformat_extents(1)", dip, sizeof(*dip),
> > diff --git a/libxfs/xfs_inode_fork.h b/libxfs/xfs_inode_fork.h
> > index e318dfdd..22f3c9b3 100644
> > --- a/libxfs/xfs_inode_fork.h
> > +++ b/libxfs/xfs_inode_fork.h
> > @@ -90,10 +90,17 @@ static inline xfs_extnum_t xfs_iext_max(struct xfs_sb *sbp, int whichfork)
> >  {
> >  	ASSERT(whichfork == XFS_DATA_FORK || whichfork == XFS_ATTR_FORK);
> >  
> > -	if (whichfork == XFS_DATA_FORK)
> > -		return MAXEXTNUM;
> > -	else
> > -		return MAXAEXTNUM;
> > +	if (whichfork == XFS_DATA_FORK) {
> > +		if (xfs_sb_version_haswideextcnt(sbp))
> > +			return MAXEXTNUM_HI;
> > +		else
> > +			return MAXEXTNUM;
> > +	} else {
> > +		if (xfs_sb_version_haswideextcnt(sbp))
> > +			return MAXAEXTNUM_HI;
> > +		else
> > +			return MAXAEXTNUM;
> > +	}
> >  }
> >  
> >  struct xfs_ifork *xfs_iext_state_to_fork(struct xfs_inode *ip, int state);
> > diff --git a/libxfs/xfs_log_format.h b/libxfs/xfs_log_format.h
> > index e3400c9c..809f8ce6 100644
> > --- a/libxfs/xfs_log_format.h
> > +++ b/libxfs/xfs_log_format.h
> > @@ -396,8 +396,8 @@ struct xfs_log_dinode {
> >  	xfs_fsize_t	di_size;	/* number of bytes in file */
> >  	xfs_rfsblock_t	di_nblocks;	/* # of direct & btree blocks used */
> >  	xfs_extlen_t	di_extsize;	/* basic/minimum extent size for file */
> > -	xfs_extnum_t	di_nextents;	/* number of extents in data fork */
> > -	xfs_aextnum_t	di_anextents;	/* number of extents in attribute fork*/
> > +	uint32_t	di_nextents_lo;	/* lower part of data fork extent count */
> > +	uint16_t	di_anextents_lo;/* lower part of attr fork extent count*/
> >  	uint8_t		di_forkoff;	/* attr fork offs, <<3 for 64b align */
> >  	int8_t		di_aformat;	/* format of attr fork's data */
> >  	uint32_t	di_dmevmask;	/* DMIG event mask */
> > @@ -414,7 +414,9 @@ struct xfs_log_dinode {
> >  	xfs_lsn_t	di_lsn;		/* flush sequence */
> >  	uint64_t	di_flags2;	/* more random flags */
> >  	uint32_t	di_cowextsize;	/* basic cow extent size for file */
> > -	uint8_t		di_pad2[12];	/* more padding for future expansion */
> > +	uint32_t	di_nextents_hi; /* higher part of data fork extent count */
> > +	uint16_t	di_anextents_hi;/* higher part of attr fork extent count */
> > +	uint8_t		di_pad2[6];	/* more padding for future expansion */
> >  
> >  	/* fields only written to during inode creation */
> >  	xfs_ictimestamp_t di_crtime;	/* time created */
> > diff --git a/libxfs/xfs_types.h b/libxfs/xfs_types.h
> > index 397d9477..23ff8166 100644
> > --- a/libxfs/xfs_types.h
> > +++ b/libxfs/xfs_types.h
> > @@ -12,8 +12,8 @@ typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
> >  typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
> >  typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
> >  typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
> > -typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
> > -typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
> > +typedef uint64_t	xfs_extnum_t;	/* # of extents in a file */
> > +typedef uint32_t	xfs_aextnum_t;	/* # extents in an attribute fork */
> >  typedef int64_t		xfs_fsize_t;	/* bytes in a file */
> >  typedef uint64_t	xfs_ufsize_t;	/* unsigned bytes in a file */
> >  
> > @@ -61,6 +61,8 @@ typedef void *		xfs_failaddr_t;
> >  #define	MAXEXTLEN	((xfs_extlen_t)0x001fffff)	/* 21 bits */
> >  #define	MAXEXTNUM	((xfs_extnum_t)0x7fffffff)	/* signed int */
> >  #define	MAXAEXTNUM	((xfs_aextnum_t)0x7fff)		/* signed short */
> > +#define MAXEXTNUM_HI	((xfs_extnum_t)0x7fffffffffff)	/* unsigned 47 bits */
> > +#define MAXAEXTNUM_HI	((xfs_aextnum_t)0xffffffff)	/* unsigned 32 bits */
> >  
> >  /*
> >   * Minimum and maximum blocksize and sectorsize.
> > diff --git a/logprint/log_misc.c b/logprint/log_misc.c
> > index be889887..4d09f357 100644
> > --- a/logprint/log_misc.c
> > +++ b/logprint/log_misc.c
> > @@ -438,8 +438,11 @@ xlog_print_trans_qoff(char **ptr, uint len)
> >  
> >  static void
> >  xlog_print_trans_inode_core(
> > +	struct xfs_mount	*mp,
> >  	struct xfs_log_dinode	*ip)
> >  {
> > +	xfs_extnum_t		nextents;
> > +
> >      printf(_("INODE CORE\n"));
> >      printf(_("magic 0x%hx mode 0%ho version %d format %d\n"),
> >  	   ip->di_magic, ip->di_mode, (int)ip->di_version,
> > @@ -448,11 +451,19 @@ xlog_print_trans_inode_core(
> >  	   ip->di_nlink, ip->di_uid, ip->di_gid);
> >      printf(_("atime 0x%x mtime 0x%x ctime 0x%x\n"),
> >  	   ip->di_atime.t_sec, ip->di_mtime.t_sec, ip->di_ctime.t_sec);
> > -    printf(_("size 0x%llx nblocks 0x%llx extsize 0x%x nextents 0x%x\n"),
> > +
> > +    nextents = ip->di_nextents_lo;
> > +    if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> > +	    nextents |= ((xfs_extnum_t)ip->di_nextents_hi << 32);
> > +    printf(_("size 0x%llx nblocks 0x%llx extsize 0x%x nextents 0x%lx\n"),
> >  	   (unsigned long long)ip->di_size, (unsigned long long)ip->di_nblocks,
> > -	   ip->di_extsize, ip->di_nextents);
> > -    printf(_("naextents 0x%x forkoff %d dmevmask 0x%x dmstate 0x%hx\n"),
> > -	   ip->di_anextents, (int)ip->di_forkoff, ip->di_dmevmask,
> > +	   ip->di_extsize, nextents);
> > +
> > +    nextents = ip->di_anextents_lo;
> > +    if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> > +	    nextents |= ((xfs_extnum_t)ip->di_anextents_hi << 16);
> > +    printf(_("naextents 0x%lx forkoff %d dmevmask 0x%x dmstate 0x%hx\n"),
> > +	   nextents, (int)ip->di_forkoff, ip->di_dmevmask,
> >  	   ip->di_dmstate);
> >      printf(_("flags 0x%x gen 0x%x\n"),
> >  	   ip->di_flags, ip->di_gen);
> > @@ -562,7 +573,7 @@ xlog_print_trans_inode(
> >      memmove(&dino, *ptr, sizeof(dino));
> >      mode = dino.di_mode & S_IFMT;
> >      size = (int)dino.di_size;
> > -    xlog_print_trans_inode_core(&dino);
> > +    xlog_print_trans_inode_core(log->l_mp, &dino);
> >      *ptr += xfs_log_dinode_size(log->l_mp);
> >      skip_count--;
> >  
> > diff --git a/logprint/log_print_all.c b/logprint/log_print_all.c
> > index e2e28b9c..aa171dfb 100644
> > --- a/logprint/log_print_all.c
> > +++ b/logprint/log_print_all.c
> > @@ -238,9 +238,14 @@ xlog_recover_print_dquot(
> >  
> >  STATIC void
> >  xlog_recover_print_inode_core(
> > +	struct xlog		*log,
> >  	struct xfs_log_dinode	*di)
> >  {
> > -	printf(_("	CORE inode:\n"));
> > +	struct xfs_sb		*sbp = &log->l_mp->m_sb;
> > +	xfs_aextnum_t		anextents;
> > +	xfs_extnum_t		nextents;
> > +
> > +        printf(_("	CORE inode:\n"));
> >  	if (!print_inode)
> >  		return;
> >  	printf(_("		magic:%c%c  mode:0x%x  ver:%d  format:%d\n"),
> > @@ -252,10 +257,17 @@ xlog_recover_print_inode_core(
> >  	printf(_("		atime:%d  mtime:%d  ctime:%d\n"),
> >  	       di->di_atime.t_sec, di->di_mtime.t_sec, di->di_ctime.t_sec);
> >  	printf(_("		flushiter:%d\n"), di->di_flushiter);
> > +
> > +	nextents = di->di_nextents_lo;
> > +	anextents = di->di_anextents_lo;
> > +	if (xfs_sb_version_haswideextcnt(sbp)) {
> > +		nextents |= ((xfs_extnum_t)di->di_nextents_hi << 32);
> > +		anextents |= ((xfs_aextnum_t)di->di_anextents_hi << 16);
> > +	}
> >  	printf(_("		size:0x%llx  nblks:0x%llx  exsize:%d  "
> > -	     "nextents:%d  anextents:%d\n"), (unsigned long long)
> > +	     "nextents:%lu  anextents:%u\n"), (unsigned long long)
> >  	       di->di_size, (unsigned long long)di->di_nblocks,
> > -	       di->di_extsize, di->di_nextents, (int)di->di_anextents);
> > +	       di->di_extsize, nextents, anextents);
> >  	printf(_("		forkoff:%d  dmevmask:0x%x  dmstate:%d  flags:0x%x  "
> >  	     "gen:%u\n"),
> >  	       (int)di->di_forkoff, di->di_dmevmask, (int)di->di_dmstate,
> > @@ -268,6 +280,7 @@ xlog_recover_print_inode_core(
> >  
> >  STATIC void
> >  xlog_recover_print_inode(
> > +	struct xlog		*log,
> >  	xlog_recover_item_t	*item)
> >  {
> >  	struct xfs_inode_log_format	f_buf;
> > @@ -289,7 +302,7 @@ xlog_recover_print_inode(
> >  	ASSERT(item->ri_buf[1].i_len ==
> >  			offsetof(struct xfs_log_dinode, di_next_unlinked) ||
> >  	       item->ri_buf[1].i_len == sizeof(struct xfs_log_dinode));
> > -	xlog_recover_print_inode_core((struct xfs_log_dinode *)
> > +	xlog_recover_print_inode_core(log, (struct xfs_log_dinode *)
> >  				      item->ri_buf[1].i_addr);
> >  
> >  	hasdata = (f->ilf_fields & XFS_ILOG_DFORK) != 0;
> > @@ -384,6 +397,7 @@ xlog_recover_print_icreate(
> >  
> >  void
> >  xlog_recover_print_logitem(
> > +	struct xlog		*log,
> >  	xlog_recover_item_t	*item)
> >  {
> >  	switch (ITEM_TYPE(item)) {
> > @@ -394,7 +408,7 @@ xlog_recover_print_logitem(
> >  		xlog_recover_print_icreate(item);
> >  		break;
> >  	case XFS_LI_INODE:
> > -		xlog_recover_print_inode(item);
> > +		xlog_recover_print_inode(log, item);
> >  		break;
> >  	case XFS_LI_EFD:
> >  		xlog_recover_print_efd(item);
> > @@ -434,6 +448,7 @@ xlog_recover_print_logitem(
> >  
> >  static void
> >  xlog_recover_print_item(
> > +	struct xlog		*log,
> >  	xlog_recover_item_t	*item)
> >  {
> >  	int			i;
> > @@ -493,11 +508,12 @@ xlog_recover_print_item(
> >  		       (long)item->ri_buf[i].i_addr, item->ri_buf[i].i_len);
> >  	}
> >  	printf("\n");
> > -	xlog_recover_print_logitem(item);
> > +	xlog_recover_print_logitem(log, item);
> >  }
> >  
> >  void
> >  xlog_recover_print_trans(
> > +	struct xlog		*log,
> >  	struct xlog_recover	*trans,
> >  	struct list_head	*itemq,
> >  	int			print)
> > @@ -510,5 +526,5 @@ xlog_recover_print_trans(
> >  	print_xlog_record_line();
> >  	xlog_recover_print_trans_head(trans);
> >  	list_for_each_entry(item, itemq, ri_list)
> > -		xlog_recover_print_item(item);
> > +		xlog_recover_print_item(log, item);
> >  }
> > diff --git a/logprint/log_print_trans.c b/logprint/log_print_trans.c
> > index 2004b5a0..c6386fb0 100644
> > --- a/logprint/log_print_trans.c
> > +++ b/logprint/log_print_trans.c
> > @@ -24,7 +24,7 @@ xlog_recover_do_trans(
> >  	struct xlog_recover	*trans,
> >  	int			pass)
> >  {
> > -	xlog_recover_print_trans(trans, &trans->r_itemq, 3);
> > +	xlog_recover_print_trans(log, trans, &trans->r_itemq, 3);
> >  	return 0;
> >  }
> >  
> > diff --git a/repair/dinode.c b/repair/dinode.c
> > index 98bb4a17..5a8de0f6 100644
> > --- a/repair/dinode.c
> > +++ b/repair/dinode.c
> > @@ -71,7 +71,9 @@ _("would have cleared inode %" PRIu64 " attributes\n"), ino_num);
> >  	if (xfs_dfork_nextents(&mp->m_sb, dino, XFS_ATTR_FORK) != 0) {
> >  		if (no_modify)
> >  			return(1);
> > -		dino->di_anextents = cpu_to_be16(0);
> > +		dino->di_anextents_lo = cpu_to_be16(0);
> > +		if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> > +			dino->di_anextents_hi = cpu_to_be16(0);
> >  	}
> >  
> >  	if (dino->di_aformat != XFS_DINODE_FMT_EXTENTS)  {
> > @@ -959,7 +961,7 @@ process_symlink_extlist(xfs_mount_t *mp, xfs_ino_t lino, xfs_dinode_t *dino)
> >  	xfs_fileoff_t		expected_offset;
> >  	xfs_bmbt_rec_t		*rp;
> >  	xfs_bmbt_irec_t		irec;
> > -	int			numrecs;
> > +	xfs_extnum_t		numrecs;
> >  	int			i;
> >  	int			max_blocks;
> >  
> > @@ -989,7 +991,7 @@ _("mismatch between format (%d) and size (%" PRId64 ") in symlink inode %" PRIu6
> >  	 */
> >  	if (numrecs > max_symlink_blocks)  {
> >  		do_warn(
> > -_("bad number of extents (%d) in symlink %" PRIu64 " data fork\n"),
> > +_("bad number of extents (%lu) in symlink %" PRIu64 " data fork\n"),
> >  			numrecs, lino);
> >  		return(1);
> >  	}
> > @@ -1556,7 +1558,7 @@ _("realtime summary inode %" PRIu64 " has bad type 0x%x, "),
> >  		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
> >  		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
> >  			do_warn(
> > -_("bad # of extents (%u) for realtime summary inode %" PRIu64 "\n"),
> > +_("bad # of extents (%lu) for realtime summary inode %" PRIu64 "\n"),
> >  				nextents, lino);
> >  			return 1;
> >  		}
> > @@ -1579,7 +1581,7 @@ _("realtime bitmap inode %" PRIu64 " has bad type 0x%x, "),
> >  		nextents = xfs_dfork_nextents(&mp->m_sb, dinoc, XFS_DATA_FORK);
> >  		if (mp->m_sb.sb_rblocks == 0 && nextents != 0)  {
> >  			do_warn(
> > -_("bad # of extents (%u) for realtime bitmap inode %" PRIu64 "\n"),
> > +_("bad # of extents (%lu) for realtime bitmap inode %" PRIu64 "\n"),
> >  				nextents, lino);
> >  			return 1;
> >  		}
> > @@ -1772,13 +1774,15 @@ _("too many data fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
> >  	if (nextents != dnextents)  {
> >  		if (!no_modify)  {
> >  			do_warn(
> > -_("correcting nextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
> > +_("correcting nextents for inode %" PRIu64 ", was %lu - counted %" PRIu64 "\n"),
> >  				lino, dnextents, nextents);
> > -			dino->di_nextents = cpu_to_be32(nextents);
> > +			dino->di_nextents_lo = cpu_to_be32(nextents);
> > +			if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> > +				dino->di_nextents_hi = cpu_to_be32(nextents >> 32);
> >  			*dirty = 1;
> >  		} else  {
> >  			do_warn(
> > -_("bad nextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
> > +_("bad nextents %lu for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
> >  				dnextents, lino, nextents);
> >  		}
> >  	}
> > @@ -1795,13 +1799,15 @@ _("too many attr fork extents (%" PRIu64 ") in inode %" PRIu64 "\n"),
> >  	if (anextents != dnextents)  {
> >  		if (!no_modify)  {
> >  			do_warn(
> > -_("correcting anextents for inode %" PRIu64 ", was %d - counted %" PRIu64 "\n"),
> > +_("correcting anextents for inode %" PRIu64 ", was %lu - counted %" PRIu64 "\n"),
> >  				lino, dnextents, anextents);
> > -			dino->di_anextents = cpu_to_be16(anextents);
> > +			dino->di_anextents_lo = cpu_to_be16(anextents);
> > +			if (xfs_sb_version_haswideextcnt(&mp->m_sb))
> > +				dino->di_anextents_hi = cpu_to_be16(anextents >> 16);
> >  			*dirty = 1;
> >  		} else  {
> >  			do_warn(
> > -_("bad anextents %d for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
> > +_("bad anextents %lu for inode %" PRIu64 ", would reset to %" PRIu64 "\n"),
> >  				dnextents, lino, anextents);
> >  		}
> >  	}
> 


-- 
chandan





* Re: [PATCH 2/4] xfsprogs: Introduce xfs_dfork_nextents() helper
  2020-09-01 14:17     ` Chandan Babu R
@ 2020-09-01 15:42       ` Darrick J. Wong
  0 siblings, 0 replies; 10+ messages in thread
From: Darrick J. Wong @ 2020-09-01 15:42 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david, bfoster

On Tue, Sep 01, 2020 at 07:47:41PM +0530, Chandan Babu R wrote:
> On Tuesday 1 September 2020 2:24:26 AM IST Darrick J. Wong wrote:
> > On Mon, Aug 31, 2020 at 06:31:00PM +0530, Chandan Babu R wrote:
> > > This commit replaces the macro XFS_DFORK_NEXTENTS() with the helper
> > > function xfs_dfork_nextents(). As of this commit, xfs_dfork_nextents()
> > > returns the same value as XFS_DFORK_NEXTENTS(). A future commit which
> > > extends inode's extent counter fields will add more logic to this
> > > helper.
> > > 
> > > This commit also replaces direct accesses to xfs_dinode->di_[a]nextents
> > > with calls to xfs_dfork_nextents().
> > > 
> > > No functional changes have been made.
> > > 
> > > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > > ---
> > >  db/bmap.c               |  6 +++---
> > >  db/btdump.c             |  4 ++--
> > >  db/check.c              |  2 +-
> > >  db/frag.c               |  8 ++++---
> > >  db/inode.c              | 14 ++++++------
> > >  db/metadump.c           |  4 ++--
> > >  libxfs/xfs_format.h     |  4 ----
> > >  libxfs/xfs_inode_buf.c  | 26 ++++++++++++++++------
> > >  libxfs/xfs_inode_buf.h  |  2 ++
> > >  libxfs/xfs_inode_fork.c |  3 ++-
> > >  repair/attr_repair.c    |  2 +-
> > >  repair/dinode.c         | 48 +++++++++++++++++++++++------------------
> > >  repair/prefetch.c       |  2 +-
> > >  13 files changed, 74 insertions(+), 51 deletions(-)
> > > 
> > > diff --git a/db/bmap.c b/db/bmap.c
> > > index fdc70e95..9800a909 100644
> > > --- a/db/bmap.c
> > > +++ b/db/bmap.c
> > > @@ -68,7 +68,7 @@ bmap(
> > >  	ASSERT(fmt == XFS_DINODE_FMT_LOCAL || fmt == XFS_DINODE_FMT_EXTENTS ||
> > >  		fmt == XFS_DINODE_FMT_BTREE);
> > >  	if (fmt == XFS_DINODE_FMT_EXTENTS) {
> > > -		nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
> > > +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> > >  		xp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
> > >  		for (ep = xp; ep < &xp[nextents] && n < nex; ep++) {
> > >  			if (!bmap_one_extent(ep, &curoffset, eoffset, &n, bep))
> > > @@ -158,9 +158,9 @@ bmap_f(
> > >  		push_cur();
> > >  		set_cur_inode(iocur_top->ino);
> > >  		dip = iocur_top->data;
> > > -		if (be32_to_cpu(dip->di_nextents))
> > > +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK))
> > 
> > Suggestion: Shift these kinds of changes to a separate patch to minimize
> > the amount of non-libxfs changes in a patch that will (eventually) be
> > ported from the kernel.  Ideally, the only changes to db/ and repair/
> > and mkfs/ would be the ones that are necessary to avoid breaking the
> > build.
> > 
> > Once you've separated the other conversions (like this one here) into a
> > separate patch, we can review that as a separate refactoring change to
> > userspace.
> 
> So the changes should be split into two patches -- One patch containing
> conversion changes to code inside libxfs and the other one for non-libxfs
> code. Please correct me if my understanding is incorrect.

I usually try to split the responsibilities in this manner:

"libxfs: prepare for FROB API change" -- make whatever change I need
to so that the next patch doesn't become insanely difficult.  This patch
is optional.

"xfs: change FROB API to frob" -- this is a strict backport of changes
from the kernel git tree to xfsprogs.  The only changes outside of
libxfs/ are fixes for whatever tool breakage happens.

"xfs_db: support new FROB" -- add whatever new code you need to add to
xfs_db to support the new frob

"xfs_repair: support new FROB" -- same thing with repair

<repeat with the other tools>

"mkfs: support new FROB" -- same thing with mkfs

"xfs: officially enable FROB feature" -- make it so that libxfs will
actually recognize whatever new feature you're adding, if you're adding
one.
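
To make the "prepare" step concrete, here is a minimal, self-contained
sketch of the macro-to-helper conversion that the quoted patch performs.
Everything prefixed with demo_ or DEMO_ is a hypothetical stand-in, not
the real xfsprogs definition: the inode struct is reduced to the two
counter fields, the byte-swapping helpers are hand-rolled, and the
return type is an assumed 64-bit integer.  The point is only that the
function returns the same values as the macro, so callers can be
converted before the counters are widened.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for XFS_DATA_FORK / XFS_ATTR_FORK. */
enum { DEMO_DATA_FORK = 0, DEMO_ATTR_FORK = 1 };

/* Assumed wide return type for illustration; not the real xfs_extnum_t. */
typedef uint64_t demo_extnum_t;

/* Reduced on-disk inode: only the two big-endian extent counters. */
struct demo_dinode {
	uint8_t	di_nextents[4];		/* data fork count, 32 bits */
	uint8_t	di_anextents[2];	/* attr fork count, 16 bits */
};

/* Portable big-endian decoders standing in for be32/be16_to_cpu(). */
static uint32_t demo_be32_to_cpu(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

static uint16_t demo_be16_to_cpu(const uint8_t b[2])
{
	return (uint16_t)(((unsigned int)b[0] << 8) | b[1]);
}

/* Old style: the macro form, mirroring the shape removed by the patch. */
#define DEMO_DFORK_NEXTENTS(dip, w) \
	((w) == DEMO_DATA_FORK ? \
		demo_be32_to_cpu((dip)->di_nextents) : \
		demo_be16_to_cpu((dip)->di_anextents))

/*
 * New style: a function with the same behaviour.  A later change could
 * add feature-dependent logic here without touching any caller.
 */
static demo_extnum_t
demo_dfork_nextents(const struct demo_dinode *dip, int whichfork)
{
	if (whichfork == DEMO_DATA_FORK)
		return demo_be32_to_cpu(dip->di_nextents);
	return demo_be16_to_cpu(dip->di_anextents);
}

/* Checks that macro and helper agree for both forks of one inode. */
static void demo_selfcheck(const struct demo_dinode *dip)
{
	assert(DEMO_DFORK_NEXTENTS(dip, DEMO_DATA_FORK) ==
	       demo_dfork_nextents(dip, DEMO_DATA_FORK));
	assert(DEMO_DFORK_NEXTENTS(dip, DEMO_ATTR_FORK) ==
	       demo_dfork_nextents(dip, DEMO_ATTR_FORK));
}
```

Because the helper and the macro agree, the caller conversions in db/
and repair/ can be reviewed as pure refactoring, separate from the
libxfs change.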

--D

> > 
> > The reason for this ofc is that when the maintainers run libxfs-apply to
> > pull in the kernel patches, they're totally going to miss things like
> > this conversion unless you make them an explicit separate change.
> > 
> > FWIW the conversions themselves mostly look ok...
> > 
> > >  			dfork = 1;
> > > -		if (be16_to_cpu(dip->di_anextents))
> > > +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
> > >  			afork = 1;
> > >  		pop_cur();
> > >  	}
> > > diff --git a/db/btdump.c b/db/btdump.c
> > > index 920f595b..9ced71d4 100644
> > > --- a/db/btdump.c
> > > +++ b/db/btdump.c
> > > @@ -166,13 +166,13 @@ dump_inode(
> > >  
> > >  	dip = iocur_top->data;
> > >  	if (attrfork) {
> > > -		if (!dip->di_anextents ||
> > > +		if (!xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) ||
> > >  		    dip->di_aformat != XFS_DINODE_FMT_BTREE) {
> > >  			dbprintf(_("attr fork not in btree format\n"));
> > >  			return 0;
> > >  		}
> > >  	} else {
> > > -		if (!dip->di_nextents ||
> > > +		if (!xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK) ||
> > >  		    dip->di_format != XFS_DINODE_FMT_BTREE) {
> > >  			dbprintf(_("data fork not in btree format\n"));
> > >  			return 0;
> > > diff --git a/db/check.c b/db/check.c
> > > index 12c03b6d..2d1823a4 100644
> > > --- a/db/check.c
> > > +++ b/db/check.c
> > > @@ -2686,7 +2686,7 @@ process_exinode(
> > >  	xfs_bmbt_rec_t		*rp;
> > >  
> > >  	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
> > > -	*nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > > +	*nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> > >  	if (*nex < 0 || *nex > XFS_DFORK_SIZE(dip, mp, whichfork) /
> > >  						sizeof(xfs_bmbt_rec_t)) {
> > >  		if (!sflag || id->ilist)
> > > diff --git a/db/frag.c b/db/frag.c
> > > index 1cfc6c2c..20fb1306 100644
> > > --- a/db/frag.c
> > > +++ b/db/frag.c
> > > @@ -262,9 +262,11 @@ process_exinode(
> > >  	int			whichfork)
> > >  {
> > >  	xfs_bmbt_rec_t		*rp;
> > > +	xfs_extnum_t		nextents;
> > >  
> > >  	rp = (xfs_bmbt_rec_t *)XFS_DFORK_PTR(dip, whichfork);
> > > -	process_bmbt_reclist(rp, XFS_DFORK_NEXTENTS(dip, whichfork), extmapp);
> > > +	nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> > > +	process_bmbt_reclist(rp, nextents, extmapp);
> > >  }
> > >  
> > >  static void
> > > @@ -273,9 +275,9 @@ process_fork(
> > >  	int		whichfork)
> > >  {
> > >  	extmap_t	*extmap;
> > > -	int		nex;
> > > +	xfs_extnum_t	nex;
> > >  
> > > -	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > > +	nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> > >  	if (!nex)
> > >  		return;
> > >  	extmap = extmap_alloc(nex);
> > > diff --git a/db/inode.c b/db/inode.c
> > > index 0cff9d63..3853092c 100644
> > > --- a/db/inode.c
> > > +++ b/db/inode.c
> > > @@ -271,7 +271,7 @@ inode_a_bmx_count(
> > >  		return 0;
> > >  	ASSERT((char *)XFS_DFORK_APTR(dip) - (char *)dip == byteize(startoff));
> > >  	return dip->di_aformat == XFS_DINODE_FMT_EXTENTS ?
> > > -		be16_to_cpu(dip->di_anextents) : 0;
> > > +		xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) : 0;
> > >  }
> > >  
> > >  static int
> > > @@ -325,6 +325,7 @@ inode_a_size(
> > >  {
> > >  	xfs_attr_shortform_t	*asf;
> > >  	xfs_dinode_t		*dip;
> > > +	xfs_extnum_t		nextents;
> > >  
> > >  	ASSERT(startoff == 0);
> > >  	ASSERT(idx == 0);
> > > @@ -334,8 +335,8 @@ inode_a_size(
> > >  		asf = (xfs_attr_shortform_t *)XFS_DFORK_APTR(dip);
> > >  		return bitize(be16_to_cpu(asf->hdr.totsize));
> > >  	case XFS_DINODE_FMT_EXTENTS:
> > > -		return (int)be16_to_cpu(dip->di_anextents) *
> > > -							bitsz(xfs_bmbt_rec_t);
> > > +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
> > > +		return (int)(nextents * bitsz(xfs_bmbt_rec_t));
> > >  	case XFS_DINODE_FMT_BTREE:
> > >  		return bitize((int)XFS_DFORK_ASIZE(dip, mp));
> > >  	default:
> > > @@ -496,7 +497,7 @@ inode_u_bmx_count(
> > >  	dip = obj;
> > >  	ASSERT((char *)XFS_DFORK_DPTR(dip) - (char *)dip == byteize(startoff));
> > >  	return dip->di_format == XFS_DINODE_FMT_EXTENTS ?
> > > -		be32_to_cpu(dip->di_nextents) : 0;
> > > +		xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK) : 0;
> > >  }
> > >  
> > >  static int
> > > @@ -582,6 +583,7 @@ inode_u_size(
> > >  	int		idx)
> > >  {
> > >  	xfs_dinode_t	*dip;
> > > +	xfs_extnum_t	nextents;
> > >  
> > >  	ASSERT(startoff == 0);
> > >  	ASSERT(idx == 0);
> > > @@ -592,8 +594,8 @@ inode_u_size(
> > >  	case XFS_DINODE_FMT_LOCAL:
> > >  		return bitize((int)be64_to_cpu(dip->di_size));
> > >  	case XFS_DINODE_FMT_EXTENTS:
> > > -		return (int)be32_to_cpu(dip->di_nextents) *
> > > -						bitsz(xfs_bmbt_rec_t);
> > > +		nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK);
> > > +		return (int)(nextents * bitsz(xfs_bmbt_rec_t));
> > >  	case XFS_DINODE_FMT_BTREE:
> > >  		return bitize((int)XFS_DFORK_DSIZE(dip, mp));
> > >  	case XFS_DINODE_FMT_UUID:
> > > diff --git a/db/metadump.c b/db/metadump.c
> > > index e5cb3aa5..6a6757a2 100644
> > > --- a/db/metadump.c
> > > +++ b/db/metadump.c
> > > @@ -2282,7 +2282,7 @@ process_exinode(
> > >  
> > >  	whichfork = (itype == TYP_ATTR) ? XFS_ATTR_FORK : XFS_DATA_FORK;
> > >  
> > > -	nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > > +	nex = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> > >  	used = nex * sizeof(xfs_bmbt_rec_t);
> > >  	if (nex < 0 || used > XFS_DFORK_SIZE(dip, mp, whichfork)) {
> > >  		if (show_warnings)
> > > @@ -2335,7 +2335,7 @@ static int
> > >  process_dev_inode(
> > >  	xfs_dinode_t		*dip)
> > >  {
> > > -	if (XFS_DFORK_NEXTENTS(dip, XFS_DATA_FORK)) {
> > > +	if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK)) {
> > >  		if (show_warnings)
> > >  			print_warning("inode %llu has unexpected extents",
> > >  				      (unsigned long long)cur_ino);
> > > diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
> > > index a738cd8b..188deada 100644
> > > --- a/libxfs/xfs_format.h
> > > +++ b/libxfs/xfs_format.h
> > > @@ -993,10 +993,6 @@ enum xfs_dinode_fmt {
> > >  	((w) == XFS_DATA_FORK ? \
> > >  		(dip)->di_format : \
> > >  		(dip)->di_aformat)
> > > -#define XFS_DFORK_NEXTENTS(dip,w) \
> > > -	((w) == XFS_DATA_FORK ? \
> > > -		be32_to_cpu((dip)->di_nextents) : \
> > > -		be16_to_cpu((dip)->di_anextents))
> > >  
> > >  /*
> > >   * For block and character special files the 32bit dev_t is stored at the
> > > diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
> > > index ae71a19e..d5584372 100644
> > > --- a/libxfs/xfs_inode_buf.c
> > > +++ b/libxfs/xfs_inode_buf.c
> > > @@ -362,9 +362,10 @@ xfs_dinode_verify_fork(
> > >  	struct xfs_mount	*mp,
> > >  	int			whichfork)
> > >  {
> > > -	uint32_t		di_nextents = XFS_DFORK_NEXTENTS(dip, whichfork);
> > >  	xfs_extnum_t		max_extents;
> > > +	uint32_t		di_nextents;
> > >  
> > > +	di_nextents = xfs_dfork_nextents(&mp->m_sb, dip, whichfork);
> > >  
> > >  	switch (XFS_DFORK_FORMAT(dip, whichfork)) {
> > >  	case XFS_DINODE_FMT_LOCAL:
> > > @@ -396,6 +397,15 @@ xfs_dinode_verify_fork(
> > >  	return NULL;
> > >  }
> > >  
> > > +xfs_extnum_t
> > > +xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip, int whichfork)
> > > +{
> > > +	if (whichfork == XFS_DATA_FORK)
> > > +		return be32_to_cpu(dip->di_nextents);
> > > +	else
> > > +		return be16_to_cpu(dip->di_anextents);
> > > +}
> > > +
> > >  static xfs_failaddr_t
> > >  xfs_dinode_verify_forkoff(
> > >  	struct xfs_dinode	*dip,
> > > @@ -432,6 +442,8 @@ xfs_dinode_verify(
> > >  	uint16_t		flags;
> > >  	uint64_t		flags2;
> > >  	uint64_t		di_size;
> > > +	xfs_extnum_t            nextents;
> > > +	int64_t			nblocks;
> > >  
> > >  	if (dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC))
> > >  		return __this_address;
> > > @@ -462,10 +474,12 @@ xfs_dinode_verify(
> > >  	if ((S_ISLNK(mode) || S_ISDIR(mode)) && di_size == 0)
> > >  		return __this_address;
> > >  
> > > -	/* Fork checks carried over from xfs_iformat_fork */
> > > -	if (mode &&
> > > -	    be32_to_cpu(dip->di_nextents) + be16_to_cpu(dip->di_anextents) >
> > > -			be64_to_cpu(dip->di_nblocks))
> > > +	nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK);
> > > +	nextents += xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
> > > +	nblocks = be64_to_cpu(dip->di_nblocks);
> > > +
> > > +        /* Fork checks carried over from xfs_iformat_fork */
> > > +	if (mode && nextents > nblocks)
> > >  		return __this_address;
> > >  
> > >  	if (mode && XFS_DFORK_BOFF(dip) > mp->m_sb.sb_inodesize)
> > > @@ -522,7 +536,7 @@ xfs_dinode_verify(
> > >  		default:
> > >  			return __this_address;
> > >  		}
> > > -		if (dip->di_anextents)
> > > +		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
> > >  			return __this_address;
> > >  	}
> > >  
> > > diff --git a/libxfs/xfs_inode_buf.h b/libxfs/xfs_inode_buf.h
> > > index 9b373dcf..f97b3428 100644
> > > --- a/libxfs/xfs_inode_buf.h
> > > +++ b/libxfs/xfs_inode_buf.h
> > > @@ -71,5 +71,7 @@ xfs_failaddr_t xfs_inode_validate_extsize(struct xfs_mount *mp,
> > >  xfs_failaddr_t xfs_inode_validate_cowextsize(struct xfs_mount *mp,
> > >  		uint32_t cowextsize, uint16_t mode, uint16_t flags,
> > >  		uint64_t flags2);
> > > +xfs_extnum_t xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip,
> > > +			int whichfork);
> > >  
> > >  #endif	/* __XFS_INODE_BUF_H__ */
> > > diff --git a/libxfs/xfs_inode_fork.c b/libxfs/xfs_inode_fork.c
> > > index 80ba6c12..8c32f993 100644
> > > --- a/libxfs/xfs_inode_fork.c
> > > +++ b/libxfs/xfs_inode_fork.c
> > > @@ -205,9 +205,10 @@ xfs_iformat_extents(
> > >  	int			whichfork)
> > >  {
> > >  	struct xfs_mount	*mp = ip->i_mount;
> > > +	struct xfs_sb		*sbp = &mp->m_sb;
> > >  	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
> > >  	int			state = xfs_bmap_fork_to_state(whichfork);
> > > -	int			nex = XFS_DFORK_NEXTENTS(dip, whichfork);
> > > +	xfs_extnum_t		nex = xfs_dfork_nextents(sbp, dip, whichfork);
> > >  	int			size = nex * sizeof(xfs_bmbt_rec_t);
> > >  	struct xfs_iext_cursor	icur;
> > >  	struct xfs_bmbt_rec	*dp;
> > > diff --git a/repair/attr_repair.c b/repair/attr_repair.c
> > > index 6cec0f70..b6ca564b 100644
> > > --- a/repair/attr_repair.c
> > > +++ b/repair/attr_repair.c
> > > @@ -1083,7 +1083,7 @@ process_longform_attr(
> > >  	bno = blkmap_get(blkmap, 0);
> > >  	if (bno == NULLFSBLOCK) {
> > >  		if (dip->di_aformat == XFS_DINODE_FMT_EXTENTS &&
> > > -				be16_to_cpu(dip->di_anextents) == 0)
> > > +			xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK) == 0)
> > 
> > 		    ^
> > This should /not/ be indented so that it lines up with the if body.
> 
> Sorry about that. I will fix it up.
> 
> -- 
> chandan
> 
> 
> 

Thread overview: 10+ messages
2020-08-31 13:00 [PATCH 0/4] xfsprogs: Extend per-inode extent counters Chandan Babu R
2020-08-31 13:00 ` [PATCH 1/4] xfsprogs: Introduce xfs_iext_max() helper Chandan Babu R
2020-08-31 13:01 ` [PATCH 2/4] xfsprogs: Introduce xfs_dfork_nextents() helper Chandan Babu R
2020-08-31 20:54   ` Darrick J. Wong
2020-09-01 14:17     ` Chandan Babu R
2020-09-01 15:42       ` Darrick J. Wong
2020-08-31 13:01 ` [PATCH 3/4] xfsprogs: Extend data/attr fork extent counter width Chandan Babu R
2020-08-31 21:00   ` Darrick J. Wong
2020-09-01 14:17     ` Chandan Babu R
2020-08-31 13:01 ` [PATCH 4/4] xfsprogs: Add wideextcnt mkfs option Chandan Babu R
