linux-xfs.vger.kernel.org archive mirror
* [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow
@ 2020-10-03  5:56 Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 01/12] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
                   ` (11 more replies)
  0 siblings, 12 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

XFS does not check for possible overflow of per-inode extent counter
fields when adding extents to either data or attr fork.

For example:
1. Insert 5 million xattrs (each having a value size of 255 bytes) and
   then delete 50% of them in an alternating manner.

2. On a 4k block sized XFS filesystem instance, the above causes 98511
   extents to be created in the attr fork of the inode.

   xfsaild/loop0  2008 [003]  1475.127209: probe:xfs_inode_to_disk: (ffffffffa43fb6b0) if_nextents=98511 i_ino=131

3. The incore inode fork extent counter is a signed 32-bit
   quantity. However, the on-disk extent counter is an unsigned 16-bit
   quantity and hence cannot hold 98511 extents.

4. The following incorrect value is stored in the xattr extent counter,
   # xfs_db -f -c 'inode 131' -c 'print core.naextents' /dev/loop0
   core.naextents = -32561
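
The value printed by xfs_db can be reproduced with plain integer
arithmetic. The following is only an illustrative sketch, not kernel
code: truncate_to_16bit() is a hypothetical helper that keeps the low
16 bits of the incore count and reads them back as a signed 16-bit
value, which is how the counter was displayed above.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of XFS): model what happens when a 32-bit
 * incore extent count is squeezed into a 16-bit field and then displayed
 * as a signed 16-bit number.
 */
static int32_t truncate_to_16bit(int32_t incore_nextents)
{
	/* Conversion to an unsigned type keeps the value modulo 65536. */
	uint16_t low16 = (uint16_t)incore_nextents;

	/* Reinterpret the low 16 bits as a two's complement signed value. */
	if (low16 >= 0x8000)
		return (int32_t)low16 - 0x10000;
	return (int32_t)low16;
}
```

With 98511 as input, the low 16 bits are 98511 - 65536 = 32975, and
32975 - 65536 = -32561, matching the core.naextents value shown above.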

This patchset adds a new helper function
(i.e. xfs_iext_count_may_overflow()) to check for overflow of the
per-inode data and xattr extent counters and invokes it before
starting an fs operation (e.g. creating a new directory entry). With
this patchset applied, XFS detects counter overflows and returns with
an error rather than causing a silent corruption.
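
As a rough user-space model of the check described above (the real
helper is in patch 01), the sketch below takes the current extent
count, the worst-case number of extents an operation may add, and the
largest count the on-disk field can hold. The function name and the
explicit max_exts parameter are assumptions made for illustration; the
kernel helper derives the limit from the fork type instead.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical model of the overflow check: return -EFBIG if adding
 * nr_to_add extents would exceed max_exts (or wrap the 64-bit sum),
 * and 0 otherwise. nr_to_add is expected to be non-negative.
 */
static int model_iext_count_may_overflow(uint64_t nextents, int nr_to_add,
					 uint64_t max_exts)
{
	uint64_t nr_exts = nextents + (uint64_t)nr_to_add;

	/* The first test catches wraparound of the unsigned addition. */
	if (nr_exts < nextents || nr_exts > max_exts)
		return -EFBIG;

	return 0;
}
```

A caller would run this check after locking the inode and before
modifying the fork, and cancel the transaction when it returns an
error.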

The patchset has been tested by executing xfstests with the following
mkfs.xfs options,
1. -m crc=0 -b size=1k
2. -m crc=0 -b size=4k
3. -m crc=0 -b size=512
4. -m rmapbt=1,reflink=1 -b size=1k
5. -m rmapbt=1,reflink=1 -b size=4k

The patches can also be obtained from
https://github.com/chandanr/linux.git at branch xfs-reserve-extent-count-v5.

I have two patches that define the newly introduced error injection
tags in xfsprogs
(https://github.com/chandanr/xfsprogs-dev/commit/7fd7aeef1cefbcc9abd6dd5887e710c80e48079d
and
https://github.com/chandanr/xfsprogs-dev/commit/3cbe12f6fdf306de06c4096eb50641fa2d834dc5).

I have also written tests
(https://github.com/chandanr/check-iext-overflow/blob/master/check-iext-overflow.sh/)
for verifying the checks introduced in the kernel. The tests have to
be edited to make them suitable for merging with xfstests, and they
also depend on the error tags introduced in these patches. Hence, I am
planning to post the changes for xfsprogs and xfstests if other
developers are fine with the changes made in this patchset.

Changelog:
V4 -> V5:
  1. Introduce new error tag XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT to
     let userspace programs guarantee that free space requests for
     files are satisfied by allocating minlen sized extents.
  2. Change xfs_bmap_btalloc() and xfs_alloc_vextent() to allocate
     minlen sized extents when XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT is
     enabled.
  3. Introduce a new patch that causes tp->t_firstblock to be assigned
     a value only when its previous value is NULLFSBLOCK.
  4. Replace the previously introduced MAXERRTAGEXTNUM (maximum inode
     fork extent count) with the hardcoded value of 10.
  5. xfs_bui_item_recover(): Use XFS_IEXT_ADD_NOSPLIT_CNT when mapping
     an extent.
  6. xfs_swap_extent_rmap(): Use xfs_bmap_is_real_extent() instead of
     xfs_bmap_is_update_needed() to assess if the extent really needs
     to be swapped.

V3 -> V4:
  1. Introduce new patch which lets userspace programs test "extent
     count overflow detection" by injecting an error tag. The new
     error tag reduces the maximum allowed extent count to 10.
  2. Injecting the newly defined error tag prevents
     xfs_bmap_add_extent_hole_real() from merging a new extent with
     its neighbours to allow writing deterministic tests for testing
     extent count overflow for Directories, Xattr and growing realtime
     devices. This is required because the new extent being allocated
     can be contiguous with its neighbours (w.r.t both file and disk
     offsets).
  3. Injecting the newly defined error tag forces block sized extents
     to be allocated for summary/bitmap files when growing a realtime
     device. This is required because xfs_growfs_rt_alloc() allocates
     as large an extent as possible for summary/bitmap files and hence
     it would be impossible to write deterministic tests.
  4. Rename XFS_IEXT_REMOVE_CNT to XFS_IEXT_PUNCH_HOLE_CNT to reflect
     the actual meaning of the fs operation.
  5. Fold XFS_IEXT_INSERT_HOLE_CNT code into that associated with
     XFS_IEXT_PUNCH_HOLE_CNT since both perform the same job.
  6. xfs_swap_extent_rmap(): Check for extent overflow should be made
     on the source file only if the donor file extent has a valid
     on-disk mapping and vice versa.

V2 -> V3:
  1. Move the definition of xfs_iext_count_may_overflow() from
     libxfs/xfs_trans_resv.c to libxfs/xfs_inode_fork.c. Also, I tried
     to make xfs_iext_count_may_overflow() an inline function by
     placing the definition in libxfs/xfs_inode_fork.h. However this
     required that the definition of 'struct xfs_inode' be available,
     since xfs_iext_count_may_overflow() uses a 'struct xfs_inode *'
     type variable.
  2. Handle XFS_COW_FORK within xfs_iext_count_may_overflow() by
     returning a success value.
  3. Rename XFS_IEXT_ADD_CNT to XFS_IEXT_ADD_NOSPLIT_CNT. Thanks to
     Darrick for suggesting the new name.
  4. Expand comments to make use of 80 columns.

V1 -> V2:
  1. Rename helper function from xfs_trans_resv_ext_cnt() to
     xfs_iext_count_may_overflow().
  2. Define and use macros to represent fs operations and the
     corresponding increase in extent count.
  3. Split the patches based on the fs operation being performed.


Chandan Babu R (12):
  xfs: Add helper for checking per-inode extent count overflow
  xfs: Check for extent overflow when trivally adding a new extent
  xfs: Check for extent overflow when punching a hole
  xfs: Check for extent overflow when adding/removing xattrs
  xfs: Check for extent overflow when adding/removing dir entries
  xfs: Check for extent overflow when writing to unwritten extent
  xfs: Check for extent overflow when moving extent from cow to data
    fork
  xfs: Check for extent overflow when remapping an extent
  xfs: Check for extent overflow when swapping extents
  xfs: Introduce error injection to reduce maximum inode fork extent
    count
  xfs: Set tp->t_firstblock only once during a transaction's lifetime
  xfs: Introduce error injection to allocate only minlen size extents
    for files

 fs/xfs/libxfs/xfs_alloc.c      | 46 ++++++++++++++++++++
 fs/xfs/libxfs/xfs_alloc.h      |  1 +
 fs/xfs/libxfs/xfs_attr.c       | 13 ++++++
 fs/xfs/libxfs/xfs_bmap.c       | 38 +++++++++++++----
 fs/xfs/libxfs/xfs_errortag.h   |  6 ++-
 fs/xfs/libxfs/xfs_inode_fork.c | 27 ++++++++++++
 fs/xfs/libxfs/xfs_inode_fork.h | 77 ++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_bmap_item.c         | 10 +++++
 fs/xfs/xfs_bmap_util.c         | 31 ++++++++++++++
 fs/xfs/xfs_dquot.c             |  8 +++-
 fs/xfs/xfs_error.c             |  6 +++
 fs/xfs/xfs_inode.c             | 27 ++++++++++++
 fs/xfs/xfs_iomap.c             | 10 +++++
 fs/xfs/xfs_reflink.c           | 10 +++++
 fs/xfs/xfs_rtalloc.c           |  5 +++
 fs/xfs/xfs_symlink.c           |  5 +++
 16 files changed, 309 insertions(+), 11 deletions(-)

-- 
2.28.0



* [PATCH V5 01/12] xfs: Add helper for checking per-inode extent count overflow
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 02/12] xfs: Check for extent overflow when trivally adding a new extent Chandan Babu R
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

XFS does not check for possible overflow of per-inode extent counter
fields when adding extents to either data or attr fork.

For example:
1. Insert 5 million xattrs (each having a value size of 255 bytes) and
   then delete 50% of them in an alternating manner.

2. On a 4k block sized XFS filesystem instance, the above causes 98511
   extents to be created in the attr fork of the inode.

   xfsaild/loop0  2008 [003]  1475.127209: probe:xfs_inode_to_disk: (ffffffffa43fb6b0) if_nextents=98511 i_ino=131

3. The incore inode fork extent counter is a signed 32-bit
   quantity. However, the on-disk extent counter is an unsigned 16-bit
   quantity and hence cannot hold 98511 extents.

4. The following incorrect value is stored in the attr extent counter,
   # xfs_db -f -c 'inode 131' -c 'print core.naextents' /dev/loop0
   core.naextents = -32561

This commit adds a new helper function (i.e.
xfs_iext_count_may_overflow()) to check for overflow of the per-inode
data and xattr extent counters. Future patches will use this function to
make sure that an FS operation won't cause the extent counter to
overflow.

Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_fork.c | 23 +++++++++++++++++++++++
 fs/xfs/libxfs/xfs_inode_fork.h |  2 ++
 2 files changed, 25 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 7575de5cecb1..8d48716547e5 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -23,6 +23,7 @@
 #include "xfs_da_btree.h"
 #include "xfs_dir2_priv.h"
 #include "xfs_attr_leaf.h"
+#include "xfs_types.h"
 
 kmem_zone_t *xfs_ifork_zone;
 
@@ -728,3 +729,25 @@ xfs_ifork_verify_local_attr(
 
 	return 0;
 }
+
+int
+xfs_iext_count_may_overflow(
+	struct xfs_inode	*ip,
+	int			whichfork,
+	int			nr_to_add)
+{
+	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
+	uint64_t		max_exts;
+	uint64_t		nr_exts;
+
+	if (whichfork == XFS_COW_FORK)
+		return 0;
+
+	max_exts = (whichfork == XFS_ATTR_FORK) ? MAXAEXTNUM : MAXEXTNUM;
+
+	nr_exts = ifp->if_nextents + nr_to_add;
+	if (nr_exts < ifp->if_nextents || nr_exts > max_exts)
+		return -EFBIG;
+
+	return 0;
+}
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index a4953e95c4f3..0beb8e2a00be 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -172,5 +172,7 @@ extern void xfs_ifork_init_cow(struct xfs_inode *ip);
 
 int xfs_ifork_verify_local_data(struct xfs_inode *ip);
 int xfs_ifork_verify_local_attr(struct xfs_inode *ip);
+int xfs_iext_count_may_overflow(struct xfs_inode *ip, int whichfork,
+		int nr_to_add);
 
 #endif	/* __XFS_INODE_FORK_H__ */
-- 
2.28.0



* [PATCH V5 02/12] xfs: Check for extent overflow when trivally adding a new extent
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 01/12] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-06  4:18   ` Darrick J. Wong
  2020-10-03  5:56 ` [PATCH V5 03/12] xfs: Check for extent overflow when punching a hole Chandan Babu R
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

When adding a new data extent (without modifying an inode's existing
extents) the extent count increases only by 1. This commit checks for
extent count overflow in such cases.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_bmap.c       | 6 ++++++
 fs/xfs/libxfs/xfs_inode_fork.h | 6 ++++++
 fs/xfs/xfs_bmap_item.c         | 7 +++++++
 fs/xfs/xfs_bmap_util.c         | 5 +++++
 fs/xfs/xfs_dquot.c             | 8 +++++++-
 fs/xfs/xfs_iomap.c             | 5 +++++
 fs/xfs/xfs_rtalloc.c           | 5 +++++
 7 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 1b0a01b06a05..51c2d2690f05 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4527,6 +4527,12 @@ xfs_bmapi_convert_delalloc(
 		return error;
 
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
+
+	error = xfs_iext_count_may_overflow(ip, whichfork,
+			XFS_IEXT_ADD_NOSPLIT_CNT);
+	if (error)
+		goto out_trans_cancel;
+
 	xfs_trans_ijoin(tp, ip, 0);
 
 	if (!xfs_iext_lookup_extent(ip, ifp, offset_fsb, &bma.icur, &bma.got) ||
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 0beb8e2a00be..7fc2b129a2e7 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -34,6 +34,12 @@ struct xfs_ifork {
 #define	XFS_IFEXTENTS	0x02	/* All extent pointers are read in */
 #define	XFS_IFBROOT	0x04	/* i_broot points to the bmap b-tree root */
 
+/*
+ * Worst-case increase in the fork extent count when we're adding a single
+ * extent to a fork and there's no possibility of splitting an existing mapping.
+ */
+#define XFS_IEXT_ADD_NOSPLIT_CNT	(1)
+
 /*
  * Fork handling.
  */
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index ec3691372e7c..6a7dcea4ad40 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -519,6 +519,13 @@ xfs_bui_item_recover(
 	}
 	xfs_trans_ijoin(tp, ip, 0);
 
+	if (bui_type == XFS_BMAP_MAP) {
+		error = xfs_iext_count_may_overflow(ip, whichfork,
+				XFS_IEXT_ADD_NOSPLIT_CNT);
+		if (error)
+			goto err_inode;
+	}
+
 	count = bmap->me_len;
 	error = xfs_trans_log_finish_bmap_update(tp, budp, type, ip, whichfork,
 			bmap->me_startoff, bmap->me_startblock, &count, state);
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index f2a8a0e75e1f..dcd6e61df711 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -822,6 +822,11 @@ xfs_alloc_file_space(
 		if (error)
 			goto error1;
 
+		error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+				XFS_IEXT_ADD_NOSPLIT_CNT);
+		if (error)
+			goto error0;
+
 		xfs_trans_ijoin(tp, ip, 0);
 
 		error = xfs_bmapi_write(tp, ip, startoffset_fsb,
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 3072814e407d..5bf22d2e50cb 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -314,8 +314,14 @@ xfs_dquot_disk_alloc(
 		return -ESRCH;
 	}
 
-	/* Create the block mapping. */
 	xfs_trans_ijoin(tp, quotip, XFS_ILOCK_EXCL);
+
+	error = xfs_iext_count_may_overflow(quotip, XFS_DATA_FORK,
+			XFS_IEXT_ADD_NOSPLIT_CNT);
+	if (error)
+		return error;
+
+	/* Create the block mapping. */
 	error = xfs_bmapi_write(tp, quotip, dqp->q_fileoffset,
 			XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, 0, &map,
 			&nmaps);
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 3abb8b9d6f4c..a302a96823b8 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -250,6 +250,11 @@ xfs_iomap_write_direct(
 	if (error)
 		goto out_trans_cancel;
 
+	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+			XFS_IEXT_ADD_NOSPLIT_CNT);
+	if (error)
+		goto out_trans_cancel;
+
 	xfs_trans_ijoin(tp, ip, 0);
 
 	/*
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 9d4e33d70d2a..3e841a75f272 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -804,6 +804,11 @@ xfs_growfs_rt_alloc(
 		xfs_ilock(ip, XFS_ILOCK_EXCL);
 		xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 
+		error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+				XFS_IEXT_ADD_NOSPLIT_CNT);
+		if (error)
+			goto out_trans_cancel;
+
 		/*
 		 * Allocate blocks to the bitmap file.
 		 */
-- 
2.28.0



* [PATCH V5 03/12] xfs: Check for extent overflow when punching a hole
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 01/12] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 02/12] xfs: Check for extent overflow when trivally adding a new extent Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-06  4:18   ` Darrick J. Wong
  2020-10-03  5:56 ` [PATCH V5 04/12] xfs: Check for extent overflow when adding/removing xattrs Chandan Babu R
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

The extent mapping the file offset at which a hole has to be
inserted will be split into two extents, causing the extent count to
increase by 1.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_fork.h |  7 +++++++
 fs/xfs/xfs_bmap_item.c         | 15 +++++++++------
 fs/xfs/xfs_bmap_util.c         | 10 ++++++++++
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 7fc2b129a2e7..bcac769a7df6 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -40,6 +40,13 @@ struct xfs_ifork {
  */
 #define XFS_IEXT_ADD_NOSPLIT_CNT	(1)
 
+/*
+ * Punching out an extent from the middle of an existing extent can cause the
+ * extent count to increase by 1.
+ * i.e. | Old extent | Hole | Old extent |
+ */
+#define XFS_IEXT_PUNCH_HOLE_CNT		(1)
+
 /*
  * Fork handling.
  */
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 6a7dcea4ad40..323cee00bd45 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -440,6 +440,7 @@ xfs_bui_item_recover(
 	bool				op_ok;
 	unsigned int			bui_type;
 	int				whichfork;
+	int				iext_delta;
 	int				error = 0;
 
 	/* Only one mapping operation per BUI... */
@@ -519,12 +520,14 @@ xfs_bui_item_recover(
 	}
 	xfs_trans_ijoin(tp, ip, 0);
 
-	if (bui_type == XFS_BMAP_MAP) {
-		error = xfs_iext_count_may_overflow(ip, whichfork,
-				XFS_IEXT_ADD_NOSPLIT_CNT);
-		if (error)
-			goto err_inode;
-	}
+	if (bui_type == XFS_BMAP_MAP)
+		iext_delta = XFS_IEXT_ADD_NOSPLIT_CNT;
+	else
+		iext_delta = XFS_IEXT_PUNCH_HOLE_CNT;
+
+	error = xfs_iext_count_may_overflow(ip, whichfork, iext_delta);
+	if (error)
+		goto err_inode;
 
 	count = bmap->me_len;
 	error = xfs_trans_log_finish_bmap_update(tp, budp, type, ip, whichfork,
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index dcd6e61df711..0776abd0103c 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -891,6 +891,11 @@ xfs_unmap_extent(
 
 	xfs_trans_ijoin(tp, ip, 0);
 
+	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+			XFS_IEXT_PUNCH_HOLE_CNT);
+	if (error)
+		goto out_trans_cancel;
+
 	error = xfs_bunmapi(tp, ip, startoffset_fsb, len_fsb, 0, 2, done);
 	if (error)
 		goto out_trans_cancel;
@@ -1176,6 +1181,11 @@ xfs_insert_file_space(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, 0);
 
+	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+			XFS_IEXT_PUNCH_HOLE_CNT);
+	if (error)
+		goto out_trans_cancel;
+
 	/*
 	 * The extent shifting code works on extent granularity. So, if stop_fsb
 	 * is not the starting block of extent, we need to split the extent at
-- 
2.28.0



* [PATCH V5 04/12] xfs: Check for extent overflow when adding/removing xattrs
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (2 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 03/12] xfs: Check for extent overflow when punching a hole Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-06  4:23   ` Darrick J. Wong
  2020-10-03  5:56 ` [PATCH V5 05/12] xfs: Check for extent overflow when adding/removing dir entries Chandan Babu R
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to be
added. One extra extent can be added for the dabtree in case a local
attr is large enough to cause a double split. It can also cause the
extent count to increase in proportion to the size of a remote xattr's
value.
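
The arithmetic behind XFS_IEXT_ATTR_MANIP_CNT() can be sketched in
user space as follows. This is an illustration only:
MODEL_DA_NODE_MAXDEPTH is a stand-in for XFS_DA_NODE_MAXDEPTH and is
assumed here to be 5, and model_attr_manip_cnt() is a hypothetical
name.

```c
/* Assumed stand-in for XFS_DA_NODE_MAXDEPTH. */
#define MODEL_DA_NODE_MAXDEPTH	5

/*
 * Hypothetical model of XFS_IEXT_ATTR_MANIP_CNT(rmt_blks): one possible
 * new extent per dabtree level, plus either the single extra extent for
 * a double split or one extent per remote value block, whichever is
 * larger.
 */
static int model_attr_manip_cnt(int rmt_blks)
{
	int extra = rmt_blks > 1 ? rmt_blks : 1;

	return MODEL_DA_NODE_MAXDEPTH + extra;
}
```

Under that assumption, a local attr (rmt_blks == 0) reserves room for
5 + 1 = 6 new extents, while a remote value spanning 16 blocks reserves
5 + 16 = 21.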

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_attr.c       | 13 +++++++++++++
 fs/xfs/libxfs/xfs_inode_fork.h | 10 ++++++++++
 2 files changed, 23 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index fd8e6418a0d3..be51e7068dcd 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -396,6 +396,7 @@ xfs_attr_set(
 	struct xfs_trans_res	tres;
 	bool			rsvd = (args->attr_filter & XFS_ATTR_ROOT);
 	int			error, local;
+	int			rmt_blks = 0;
 	unsigned int		total;
 
 	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
@@ -442,11 +443,15 @@ xfs_attr_set(
 		tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
 		tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
 		total = args->total;
+
+		if (!local)
+			rmt_blks = xfs_attr3_rmt_blocks(mp, args->valuelen);
 	} else {
 		XFS_STATS_INC(mp, xs_attr_remove);
 
 		tres = M_RES(mp)->tr_attrrm;
 		total = XFS_ATTRRM_SPACE_RES(mp);
+		rmt_blks = xfs_attr3_rmt_blocks(mp, XFS_XATTR_SIZE_MAX);
 	}
 
 	/*
@@ -460,6 +465,14 @@ xfs_attr_set(
 
 	xfs_ilock(dp, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(args->trans, dp, 0);
+
+	if (args->value || xfs_inode_hasattr(dp)) {
+		error = xfs_iext_count_may_overflow(dp, XFS_ATTR_FORK,
+				XFS_IEXT_ATTR_MANIP_CNT(rmt_blks));
+		if (error)
+			goto out_trans_cancel;
+	}
+
 	if (args->value) {
 		unsigned int	quota_flags = XFS_QMOPT_RES_REGBLKS;
 
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index bcac769a7df6..5de2f07d0dd5 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -47,6 +47,16 @@ struct xfs_ifork {
  */
 #define XFS_IEXT_PUNCH_HOLE_CNT		(1)
 
+/*
+ * Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to
+ * be added. One extra extent for dabtree in case a local attr is
+ * large enough to cause a double split.  It can also cause extent
+ * count to increase proportional to the size of a remote xattr's
+ * value.
+ */
+#define XFS_IEXT_ATTR_MANIP_CNT(rmt_blks) \
+	(XFS_DA_NODE_MAXDEPTH + max(1, rmt_blks))
+
 /*
  * Fork handling.
  */
-- 
2.28.0



* [PATCH V5 05/12] xfs: Check for extent overflow when adding/removing dir entries
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (3 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 04/12] xfs: Check for extent overflow when adding/removing xattrs Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 06/12] xfs: Check for extent overflow when writing to unwritten extent Chandan Babu R
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

Directory entry addition/removal can cause the following,
1. Data block can be added/removed.
   A new extent can cause extent count to increase by 1.
2. Free disk block can be added/removed.
   Same behaviour as described above for Data block.
3. Dabtree blocks.
   XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these
   can be new extents. Hence extent count can increase by
   XFS_DA_NODE_MAXDEPTH.
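
The three contributions listed above add up as in the sketch below,
which mirrors the XFS_IEXT_DIR_MANIP_CNT() macro introduced by this
patch. It is illustrative only: MODEL_DA_NODE_MAXDEPTH is assumed to be
5, the function name is hypothetical, and dir_fsbcount stands for
(mp)->m_dir_geo->fsbcount, i.e. the number of filesystem blocks per
directory block.

```c
/* Assumed stand-in for XFS_DA_NODE_MAXDEPTH. */
#define MODEL_DA_NODE_MAXDEPTH	5

/*
 * Hypothetical model of XFS_IEXT_DIR_MANIP_CNT(mp): dabtree blocks, one
 * data block and one free block may each become a new extent, and every
 * directory block may span dir_fsbcount filesystem blocks.
 */
static unsigned int model_dir_manip_cnt(unsigned int dir_fsbcount)
{
	return (MODEL_DA_NODE_MAXDEPTH + 1 + 1) * dir_fsbcount;
}
```

With one filesystem block per directory block this yields 7; with four
filesystem blocks per directory block it yields 28.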

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_fork.h | 13 +++++++++++++
 fs/xfs/xfs_inode.c             | 27 +++++++++++++++++++++++++++
 fs/xfs/xfs_symlink.c           |  5 +++++
 3 files changed, 45 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 5de2f07d0dd5..fd93fdc67ee4 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -57,6 +57,19 @@ struct xfs_ifork {
 #define XFS_IEXT_ATTR_MANIP_CNT(rmt_blks) \
 	(XFS_DA_NODE_MAXDEPTH + max(1, rmt_blks))
 
+/*
+ * Directory entry addition/removal can cause the following,
+ * 1. Data block can be added/removed.
+ *    A new extent can cause extent count to increase by 1.
+ * 2. Free disk block can be added/removed.
+ *    Same behaviour as described above for Data block.
+ * 3. Dabtree blocks.
+ *    XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new
+ *    extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH.
+ */
+#define XFS_IEXT_DIR_MANIP_CNT(mp) \
+	((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount)
+
 /*
  * Fork handling.
  */
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 49624973eecc..f347b1911d9c 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1159,6 +1159,11 @@ xfs_create(
 	if (error)
 		goto out_trans_cancel;
 
+	error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
+			XFS_IEXT_DIR_MANIP_CNT(mp));
+	if (error)
+		goto out_trans_cancel;
+
 	/*
 	 * A newly created regular or special file just has one directory
 	 * entry pointing to them, but a directory also the "." entry
@@ -1375,6 +1380,11 @@ xfs_link(
 	xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL);
 
+	error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK,
+			XFS_IEXT_DIR_MANIP_CNT(mp));
+	if (error)
+		goto error_return;
+
 	/*
 	 * If we are using project inheritance, we only allow hard link
 	 * creation in our tree when the project IDs are the same; else
@@ -2850,6 +2860,11 @@ xfs_remove(
 	xfs_trans_ijoin(tp, dp, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 
+	error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
+			XFS_IEXT_DIR_MANIP_CNT(mp));
+	if (error)
+		goto out_trans_cancel;
+
 	/*
 	 * If we're removing a directory perform some additional validation.
 	 */
@@ -3210,6 +3225,18 @@ xfs_rename(
 	if (wip)
 		xfs_trans_ijoin(tp, wip, XFS_ILOCK_EXCL);
 
+	error = xfs_iext_count_may_overflow(src_dp, XFS_DATA_FORK,
+			XFS_IEXT_DIR_MANIP_CNT(mp));
+	if (error)
+		goto out_trans_cancel;
+
+	if (target_ip == NULL) {
+		error = xfs_iext_count_may_overflow(target_dp, XFS_DATA_FORK,
+				XFS_IEXT_DIR_MANIP_CNT(mp));
+		if (error)
+			goto out_trans_cancel;
+	}
+
 	/*
 	 * If we are using project inheritance, we only allow renames
 	 * into our tree when the project IDs are the same; else the
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 8e88a7ca387e..581a4032a817 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -220,6 +220,11 @@ xfs_symlink(
 	if (error)
 		goto out_trans_cancel;
 
+	error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
+			XFS_IEXT_DIR_MANIP_CNT(mp));
+	if (error)
+		goto out_trans_cancel;
+
 	/*
 	 * Allocate an inode for the symlink.
 	 */
-- 
2.28.0



* [PATCH V5 06/12] xfs: Check for extent overflow when writing to unwritten extent
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (4 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 05/12] xfs: Check for extent overflow when adding/removing dir entries Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 07/12] xfs: Check for extent overflow when moving extent from cow to data fork Chandan Babu R
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

A write to a sub-interval of an existing unwritten extent causes
the original extent to be split into 3 extents
i.e. | Unwritten | Real | Unwritten |
Hence extent count can increase by 2.
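
The worst case of 2 can be checked with a small interval model. The
sketch below is not kernel code; the function name is hypothetical and
extents are modelled as half-open block ranges [start, end). It counts
how many unwritten pieces remain on either side of the written range.

```c
#include <stdint.h>

/*
 * Hypothetical model: a write to [wr_start, wr_end) lands inside an
 * unwritten extent [ext_start, ext_end). Return how many extents are
 * added to the fork when the written range is converted to a real
 * extent.
 */
static int unwritten_write_extent_delta(uint64_t ext_start, uint64_t ext_end,
					uint64_t wr_start, uint64_t wr_end)
{
	int delta = 0;

	if (wr_start > ext_start)
		delta++;	/* unwritten piece left before the write */
	if (wr_end < ext_end)
		delta++;	/* unwritten piece left after the write */

	return delta;
}
```

A write in the middle of the extent leaves pieces on both sides
(delta 2), a write at either edge leaves one (delta 1), and a write
covering the whole extent leaves none (delta 0).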

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_fork.h | 8 ++++++++
 fs/xfs/xfs_iomap.c             | 5 +++++
 2 files changed, 13 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index fd93fdc67ee4..afb647e1e3fa 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -70,6 +70,14 @@ struct xfs_ifork {
 #define XFS_IEXT_DIR_MANIP_CNT(mp) \
 	((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount)
 
+/*
+ * A write to a sub-interval of an existing unwritten extent causes the original
+ * extent to be split into 3 extents
+ * i.e. | Unwritten | Real | Unwritten |
+ * Hence extent count can increase by 2.
+ */
+#define XFS_IEXT_WRITE_UNWRITTEN_CNT	(2)
+
 /*
  * Fork handling.
  */
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index a302a96823b8..2aa788379611 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -566,6 +566,11 @@ xfs_iomap_write_unwritten(
 		if (error)
 			goto error_on_bmapi_transaction;
 
+		error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+				XFS_IEXT_WRITE_UNWRITTEN_CNT);
+		if (error)
+			goto error_on_bmapi_transaction;
+
 		/*
 		 * Modify the unwritten extent state of the buffer.
 		 */
-- 
2.28.0



* [PATCH V5 07/12] xfs: Check for extent overflow when moving extent from cow to data fork
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (5 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 06/12] xfs: Check for extent overflow when writing to unwritten extent Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 08/12] xfs: Check for extent overflow when remapping an extent Chandan Babu R
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

Moving an extent to data fork can cause a sub-interval of an existing
extent to be unmapped. This will increase extent count by 1. Mapping in
the new extent can increase the extent count by 1 again i.e.
 | Old extent | New extent | Old extent |
Hence number of extents increases by 2.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_fork.h | 9 +++++++++
 fs/xfs/xfs_reflink.c           | 5 +++++
 2 files changed, 14 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index afb647e1e3fa..b99e67e7b59b 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -78,6 +78,15 @@ struct xfs_ifork {
  */
 #define XFS_IEXT_WRITE_UNWRITTEN_CNT	(2)
 
+/*
+ * Moving an extent to data fork can cause a sub-interval of an existing extent
+ * to be unmapped. This will increase extent count by 1. Mapping in the new
+ * extent can increase the extent count by 1 again i.e.
+ * | Old extent | New extent | Old extent |
+ * Hence number of extents increases by 2.
+ */
+#define XFS_IEXT_REFLINK_END_COW_CNT	(2)
+
 /*
  * Fork handling.
  */
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 16098dc42add..4f0198f636ad 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -628,6 +628,11 @@ xfs_reflink_end_cow_extent(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, 0);
 
+	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+			XFS_IEXT_REFLINK_END_COW_CNT);
+	if (error)
+		goto out_cancel;
+
 	/*
 	 * In case of racing, overlapping AIO writes no COW extents might be
 	 * left by the time I/O completes for the loser of the race.  In that
-- 
2.28.0
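
The extent arithmetic in the commit message above can be checked with a
small userspace model. This is only a sketch under stated assumptions:
`struct range` and `replace_delta()` are hypothetical names, not kernel
code, and the model assumes the new extent never merges with a
neighbouring extent, which is the worst case the macro budgets for.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical half-open block range [start, start + len). */
struct range {
	uint64_t start;
	uint64_t len;
};

/*
 * Change in the extent count when the part of 'old' covered by 'repl' is
 * unmapped and 'repl' is mapped in its place. Assumes 'repl' lies inside
 * 'old' and never merges with a neighbouring extent.
 */
static int replace_delta(struct range old, struct range repl)
{
	uint64_t old_end = old.start + old.len;
	uint64_t repl_end = repl.start + repl.len;
	int remainders = 0;

	assert(repl.len > 0);
	assert(repl.start >= old.start && repl_end <= old_end);

	if (repl.start > old.start)
		remainders++;	/* left remainder of the old extent */
	if (repl_end < old_end)
		remainders++;	/* right remainder of the old extent */

	/* (remainders + 1 new extent) - 1 old extent */
	return remainders;
}
```

Replacing the middle of an extent gives a delta of 2, which matches
XFS_IEXT_REFLINK_END_COW_CNT. Replacing only an initial range gives 1,
and replacing the whole extent gives 0.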


* [PATCH V5 08/12] xfs: Check for extent overflow when remapping an extent
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (6 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 07/12] xfs: Check for extent overflow when moving extent from cow to data fork Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 09/12] xfs: Check for extent overflow when swapping extents Chandan Babu R
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

Remapping an extent involves unmapping the existing extent and mapping
in the new extent. When unmapping, an extent containing the entire unmap
range can be split into two extents,
i.e. | Old extent | hole | Old extent |
Hence the extent count increases by 1.

Mapping the new extent into the destination file can increase the
extent count by 1.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_fork.h | 15 +++++++++++++++
 fs/xfs/xfs_reflink.c           |  5 +++++
 2 files changed, 20 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index b99e67e7b59b..ded3c1b56c94 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -87,6 +87,21 @@ struct xfs_ifork {
  */
 #define XFS_IEXT_REFLINK_END_COW_CNT	(2)
 
+/*
+ * Remapping an extent involves unmapping the existing extent and mapping in the
+ * new extent.
+ *
+ * When unmapping, an extent containing the entire unmap range can be split into
+ * two extents,
+ * i.e. | Old extent | hole | Old extent |
+ * Hence extent count increases by 1.
+ *
+ * Mapping in the new extent into the destination file can increase the extent
+ * count by 1.
+ */
+#define XFS_IEXT_REFLINK_REMAP_CNT(smap_real, dmap_written) \
+	(((smap_real) ? 1 : 0) + ((dmap_written) ? 1 : 0))
+
 /*
  * Fork handling.
  */
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 4f0198f636ad..c9f9ff68b5bb 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1099,6 +1099,11 @@ xfs_reflink_remap_extent(
 			goto out_cancel;
 	}
 
+	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+			XFS_IEXT_REFLINK_REMAP_CNT(smap_real, dmap_written));
+	if (error)
+		goto out_cancel;
+
 	if (smap_real) {
 		/*
 		 * If the extent we're unmapping is backed by storage (written
-- 
2.28.0
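
The budget computed by XFS_IEXT_REFLINK_REMAP_CNT() in the patch above
can be tabulated with a short standalone sketch. The macro body is copied
from the diff; `remap_extent_budget()` is a hypothetical wrapper added
only for illustration. Judging from the surrounding code, `smap_real`
says whether the mapping being unmapped is a real extent and
`dmap_written` says whether the mapping being added is a written extent,
but that reading is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the patch above so the sketch compiles on its own. */
#define XFS_IEXT_REFLINK_REMAP_CNT(smap_real, dmap_written) \
	(((smap_real) ? 1 : 0) + ((dmap_written) ? 1 : 0))

/* Hypothetical wrapper: extents to budget for one remap step. */
static int remap_extent_budget(bool smap_real, bool dmap_written)
{
	int cnt = XFS_IEXT_REFLINK_REMAP_CNT(smap_real, dmap_written);

	/* One for the possible split on unmap, one for the new mapping. */
	assert(cnt >= 0 && cnt <= 2);
	return cnt;
}
```

The result is 0 when neither condition holds, 1 when exactly one holds,
and 2 when both the unmap and the map can add an extent.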


* [PATCH V5 09/12] xfs: Check for extent overflow when swapping extents
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (7 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 08/12] xfs: Check for extent overflow when remapping an extent Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-06  4:23   ` Darrick J. Wong
  2020-10-03  5:56 ` [PATCH V5 10/12] xfs: Introduce error injection to reduce maximum inode fork extent count Chandan Babu R
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

Removing an initial range of the source/donor file's extent and adding a
new extent (from the donor/source file) in its place will cause the
extent count to increase by 1.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_fork.h |  7 +++++++
 fs/xfs/xfs_bmap_util.c         | 16 ++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index ded3c1b56c94..837c01595439 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -102,6 +102,13 @@ struct xfs_ifork {
 #define XFS_IEXT_REFLINK_REMAP_CNT(smap_real, dmap_written) \
 	(((smap_real) ? 1 : 0) + ((dmap_written) ? 1 : 0))
 
+/*
+ * Removing an initial range of source/donor file's extent and adding a new
+ * extent (from donor/source file) in its place will cause extent count to
+ * increase by 1.
+ */
+#define XFS_IEXT_SWAP_RMAP_CNT		(1)
+
 /*
  * Fork handling.
  */
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 0776abd0103c..b6728fdf50ae 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1407,6 +1407,22 @@ xfs_swap_extent_rmap(
 					irec.br_blockcount);
 			trace_xfs_swap_extent_rmap_remap_piece(tip, &uirec);
 
+			if (xfs_bmap_is_real_extent(&uirec)) {
+				error = xfs_iext_count_may_overflow(ip,
+						XFS_DATA_FORK,
+						XFS_IEXT_SWAP_RMAP_CNT);
+				if (error)
+					goto out;
+			}
+
+			if (xfs_bmap_is_real_extent(&irec)) {
+				error = xfs_iext_count_may_overflow(tip,
+						XFS_DATA_FORK,
+						XFS_IEXT_SWAP_RMAP_CNT);
+				if (error)
+					goto out;
+			}
+
 			/* Remove the mapping from the donor file. */
 			xfs_bmap_unmap_extent(tp, tip, &uirec);
 
-- 
2.28.0


* [PATCH V5 10/12] xfs: Introduce error injection to reduce maximum inode fork extent count
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (8 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 09/12] xfs: Check for extent overflow when swapping extents Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-06  4:24   ` Darrick J. Wong
  2020-10-03  5:56 ` [PATCH V5 11/12] xfs: Set tp->t_firstblock only once during a transaction's lifetime Chandan Babu R
  2020-10-03  5:56 ` [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R
  11 siblings, 1 reply; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

This commit adds the XFS_ERRTAG_REDUCE_MAX_IEXTENTS error tag, which
enables userspace programs to test "Inode fork extent count overflow
detection" by reducing the maximum possible inode fork extent count to 10.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_errortag.h   | 4 +++-
 fs/xfs/libxfs/xfs_inode_fork.c | 4 ++++
 fs/xfs/xfs_error.c             | 3 +++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
index 53b305dea381..1c56fcceeea6 100644
--- a/fs/xfs/libxfs/xfs_errortag.h
+++ b/fs/xfs/libxfs/xfs_errortag.h
@@ -56,7 +56,8 @@
 #define XFS_ERRTAG_FORCE_SUMMARY_RECALC			33
 #define XFS_ERRTAG_IUNLINK_FALLBACK			34
 #define XFS_ERRTAG_BUF_IOERROR				35
-#define XFS_ERRTAG_MAX					36
+#define XFS_ERRTAG_REDUCE_MAX_IEXTENTS			36
+#define XFS_ERRTAG_MAX					37
 
 /*
  * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -97,5 +98,6 @@
 #define XFS_RANDOM_FORCE_SUMMARY_RECALC			1
 #define XFS_RANDOM_IUNLINK_FALLBACK			(XFS_RANDOM_DEFAULT/10)
 #define XFS_RANDOM_BUF_IOERROR				XFS_RANDOM_DEFAULT
+#define XFS_RANDOM_REDUCE_MAX_IEXTENTS			1
 
 #endif /* __XFS_ERRORTAG_H_ */
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 8d48716547e5..e080d7e07643 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -24,6 +24,7 @@
 #include "xfs_dir2_priv.h"
 #include "xfs_attr_leaf.h"
 #include "xfs_types.h"
+#include "xfs_errortag.h"
 
 kmem_zone_t *xfs_ifork_zone;
 
@@ -745,6 +746,9 @@ xfs_iext_count_may_overflow(
 
 	max_exts = (whichfork == XFS_ATTR_FORK) ? MAXAEXTNUM : MAXEXTNUM;
 
+	if (XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
+		max_exts = 10;
+
 	nr_exts = ifp->if_nextents + nr_to_add;
 	if (nr_exts < ifp->if_nextents || nr_exts > max_exts)
 		return -EFBIG;
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index 7f6e20899473..3780b118cc47 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -54,6 +54,7 @@ static unsigned int xfs_errortag_random_default[] = {
 	XFS_RANDOM_FORCE_SUMMARY_RECALC,
 	XFS_RANDOM_IUNLINK_FALLBACK,
 	XFS_RANDOM_BUF_IOERROR,
+	XFS_RANDOM_REDUCE_MAX_IEXTENTS,
 };
 
 struct xfs_errortag_attr {
@@ -164,6 +165,7 @@ XFS_ERRORTAG_ATTR_RW(force_repair,	XFS_ERRTAG_FORCE_SCRUB_REPAIR);
 XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
 XFS_ERRORTAG_ATTR_RW(iunlink_fallback,	XFS_ERRTAG_IUNLINK_FALLBACK);
 XFS_ERRORTAG_ATTR_RW(buf_ioerror,	XFS_ERRTAG_BUF_IOERROR);
+XFS_ERRORTAG_ATTR_RW(reduce_max_iextents,	XFS_ERRTAG_REDUCE_MAX_IEXTENTS);
 
 static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -202,6 +204,7 @@ static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(bad_summary),
 	XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
 	XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
+	XFS_ERRORTAG_ATTR_LIST(reduce_max_iextents),
 	NULL,
 };
 
-- 
2.28.0
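
The effect of the error tag on the overflow check can be modelled in
userspace. This is a sketch, not the kernel function: the SKETCH_*
limits are assumptions intended to mirror MAXEXTNUM and MAXAEXTNUM in
kernels of this era, and the `errtag_on` parameter stands in for
XFS_TEST_ERROR() returning true.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits, mirroring MAXEXTNUM and MAXAEXTNUM. */
#define SKETCH_MAXEXTNUM	0x7fffffffULL
#define SKETCH_MAXAEXTNUM	0x7fffULL
/* Limit used by the patch while the error tag is active. */
#define SKETCH_INJECTED_MAX	10ULL

/* The injected limit only makes sense if it is below the real ones. */
static_assert(SKETCH_INJECTED_MAX < SKETCH_MAXAEXTNUM,
	      "injected limit must reduce the maximum");

/* Returns 0 if nr_to_add more extents fit, -EFBIG otherwise. */
static int sketch_iext_count_may_overflow(uint64_t nextents, bool attr_fork,
					  uint64_t nr_to_add, bool errtag_on)
{
	uint64_t max_exts = attr_fork ? SKETCH_MAXAEXTNUM : SKETCH_MAXEXTNUM;
	uint64_t nr_exts;

	if (errtag_on)
		max_exts = SKETCH_INJECTED_MAX;

	nr_exts = nextents + nr_to_add;
	/* The first test catches wraparound of the addition itself. */
	if (nr_exts < nextents || nr_exts > max_exts)
		return -EFBIG;
	return 0;
}
```

With the tag active, a fork holding 9 extents can still take one more,
while a fork holding 10 extents gets -EFBIG. That lets a test reach the
limit with a handful of extents instead of tens of thousands.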


* [PATCH V5 11/12] xfs: Set tp->t_firstblock only once during a transaction's lifetime
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (9 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 10/12] xfs: Introduce error injection to reduce maximum inode fork extent count Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-06  4:26   ` Darrick J. Wong
  2020-10-03  5:56 ` [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R
  11 siblings, 1 reply; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

tp->t_firstblock is supposed to hold the first fs block allocated by the
transaction. There are two cases in the current code base where
tp->t_firstblock is assigned a value unconditionally. This commit makes
sure that we assign to tp->t_firstblock only if its current value is
NULLFSBLOCK.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 51c2d2690f05..5156cbd476f2 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -724,7 +724,8 @@ xfs_bmap_extents_to_btree(
 	 */
 	ASSERT(tp->t_firstblock == NULLFSBLOCK ||
 	       args.agno >= XFS_FSB_TO_AGNO(mp, tp->t_firstblock));
-	tp->t_firstblock = args.fsbno;
+	if (tp->t_firstblock == NULLFSBLOCK)
+		tp->t_firstblock = args.fsbno;
 	cur->bc_ino.allocated++;
 	ip->i_d.di_nblocks++;
 	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, 1L);
@@ -875,7 +876,8 @@ xfs_bmap_local_to_extents(
 	/* Can't fail, the space was reserved. */
 	ASSERT(args.fsbno != NULLFSBLOCK);
 	ASSERT(args.len == 1);
-	tp->t_firstblock = args.fsbno;
+	if (tp->t_firstblock == NULLFSBLOCK)
+		tp->t_firstblock = args.fsbno;
 	error = xfs_trans_get_buf(tp, args.mp->m_ddev_targp,
 			XFS_FSB_TO_DADDR(args.mp, args.fsbno),
 			args.mp->m_bsize, 0, &bp);
-- 
2.28.0
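
The "assign only once" rule from the patch above can be shown in
isolation. This is a minimal sketch: `struct sketch_trans` and
`sketch_set_firstblock()` are hypothetical stand-ins for the transaction
structure and the two call sites, and SKETCH_NULLFSBLOCK is assumed to
mirror the kernel's all-ones NULLFSBLOCK sentinel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sentinel, mirroring the kernel's NULLFSBLOCK (all ones). */
#define SKETCH_NULLFSBLOCK	((uint64_t)-1)

/* Hypothetical stand-in for the one transaction field used here. */
struct sketch_trans {
	uint64_t t_firstblock;
};

/* Record fsbno as the first allocated block unless one is already set. */
static void sketch_set_firstblock(struct sketch_trans *tp, uint64_t fsbno)
{
	/* A successful allocation never hands back the sentinel. */
	assert(fsbno != SKETCH_NULLFSBLOCK);

	if (tp->t_firstblock == SKETCH_NULLFSBLOCK)
		tp->t_firstblock = fsbno;
}
```

A second allocation in the same transaction leaves the recorded value
untouched, so the field keeps naming the first block allocated.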


* [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files
  2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
                   ` (10 preceding siblings ...)
  2020-10-03  5:56 ` [PATCH V5 11/12] xfs: Set tp->t_firstblock only once during a transaction's lifetime Chandan Babu R
@ 2020-10-03  5:56 ` Chandan Babu R
  2020-10-06  4:25   ` Chandan Babu R
  2020-10-06  4:34   ` Darrick J. Wong
  11 siblings, 2 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-03  5:56 UTC (permalink / raw)
  To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, david

This commit adds the XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag, which
helps userspace test programs get xfs_bmap_btalloc() to always
allocate minlen sized extents.

This is required for test programs which need a guarantee that minlen
extents allocated for a file do not get merged with their existing
neighbours in the inode's BMBT. The "Inode fork extent overflow check"
for directories, xattrs and extension of realtime inodes needs this since
the file offset at which the extents are being allocated cannot be
explicitly controlled from userspace.

One way to use this error tag is to:
1. Consume all of the free space by sequentially writing to a file.
2. Punch alternate blocks of the file. This causes the CNTBT to contain
   a sufficient number of one block sized extent records.
3. Inject the XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag.
After step 3, xfs_bmap_btalloc() will issue space allocation
requests for minlen sized extents only.

The ENOSPC error code is returned to userspace when there aren't any
"one block sized" extents left in any of the AGs.

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
 fs/xfs/libxfs/xfs_alloc.c    | 46 ++++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_alloc.h    |  1 +
 fs/xfs/libxfs/xfs_bmap.c     | 26 ++++++++++++++------
 fs/xfs/libxfs/xfs_errortag.h |  4 +++-
 fs/xfs/xfs_error.c           |  3 +++
 5 files changed, 72 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 852b536551b5..d8d8ab1478db 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2473,6 +2473,45 @@ xfs_defer_agfl_block(
 	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
 }
 
+STATIC int
+minlen_freespace_available(
+	struct xfs_alloc_arg	*args,
+	struct xfs_buf		*agbp,
+	int			*stat)
+{
+	xfs_btree_cur_t		*cnt_cur;
+	xfs_agblock_t		fbno;
+	xfs_extlen_t		flen;
+	int			btree_error = XFS_BTREE_NOERROR;
+	int			error = 0;
+
+	cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, agbp,
+			args->agno, XFS_BTNUM_CNT);
+	error = xfs_alloc_lookup_ge(cnt_cur, 0, args->minlen, stat);
+	if (error) {
+		btree_error = XFS_BTREE_ERROR;
+		goto out;
+	}
+
+	ASSERT(*stat == 1);
+
+	error = xfs_alloc_get_rec(cnt_cur, &fbno, &flen, stat);
+	if (error) {
+		btree_error = XFS_BTREE_ERROR;
+		goto out;
+	}
+
+	if (flen == args->minlen)
+		*stat = 1;
+	else
+		*stat = 0;
+
+out:
+	xfs_btree_del_cursor(cnt_cur, btree_error);
+
+	return error;
+}
+
 /*
  * Decide whether to use this allocation group for this allocation.
  * If so, fix up the btree freelist's size.
@@ -2490,6 +2529,7 @@ xfs_alloc_fix_freelist(
 	struct xfs_alloc_arg	targs;	/* local allocation arguments */
 	xfs_agblock_t		bno;	/* freelist block */
 	xfs_extlen_t		need;	/* total blocks needed in freelist */
+	int			i;
 	int			error = 0;
 
 	/* deferred ops (AGFL block frees) require permanent transactions */
@@ -2544,6 +2584,12 @@ xfs_alloc_fix_freelist(
 	if (!xfs_alloc_space_available(args, need, flags))
 		goto out_agbp_relse;
 
+	if (args->alloc_minlen_only) {
+		error = minlen_freespace_available(args, agbp, &i);
+		if (error || !i)
+			goto out_agbp_relse;
+	}
+
 	/*
 	 * Make the freelist shorter if it's too long.
 	 *
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index 6c22b12176b8..1d04089b7fb4 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -75,6 +75,7 @@ typedef struct xfs_alloc_arg {
 	char		wasfromfl;	/* set if allocation is from freelist */
 	struct xfs_owner_info	oinfo;	/* owner of blocks being allocated */
 	enum xfs_ag_resv_type	resv;	/* block reservation to use */
+	bool		alloc_minlen_only;
 } xfs_alloc_arg_t;
 
 /*
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 5156cbd476f2..fab4097e7492 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3510,12 +3510,19 @@ xfs_bmap_btalloc(
 		ASSERT(ap->length);
 	}
 
+	memset(&args, 0, sizeof(args));
+
+	args.alloc_minlen_only = XFS_TEST_ERROR(false, mp,
+					XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
 
 	nullfb = ap->tp->t_firstblock == NULLFSBLOCK;
 	fb_agno = nullfb ? NULLAGNUMBER : XFS_FSB_TO_AGNO(mp,
 							ap->tp->t_firstblock);
 	if (nullfb) {
-		if ((ap->datatype & XFS_ALLOC_USERDATA) &&
+		if (args.alloc_minlen_only) {
+			ag = 0;
+			ap->blkno = XFS_AGB_TO_FSB(mp, ag, 0);
+		} else if ((ap->datatype & XFS_ALLOC_USERDATA) &&
 		    xfs_inode_is_filestream(ap->ip)) {
 			ag = xfs_filestream_lookup_ag(ap->ip);
 			ag = (ag != NULLAGNUMBER) ? ag : 0;
@@ -3523,10 +3530,12 @@ xfs_bmap_btalloc(
 		} else {
 			ap->blkno = XFS_INO_TO_FSB(mp, ap->ip->i_ino);
 		}
-	} else
+	} else {
 		ap->blkno = ap->tp->t_firstblock;
+	}
 
-	xfs_bmap_adjacent(ap);
+	if (!args.alloc_minlen_only)
+		xfs_bmap_adjacent(ap);
 
 	/*
 	 * If allowed, use ap->blkno; otherwise must use firstblock since
@@ -3540,7 +3549,6 @@ xfs_bmap_btalloc(
 	 * Normal allocation, done through xfs_alloc_vextent.
 	 */
 	tryagain = isaligned = 0;
-	memset(&args, 0, sizeof(args));
 	args.tp = ap->tp;
 	args.mp = mp;
 	args.fsbno = ap->blkno;
@@ -3549,7 +3557,10 @@ xfs_bmap_btalloc(
 	/* Trim the allocation back to the maximum an AG can fit. */
 	args.maxlen = min(ap->length, mp->m_ag_max_usable);
 	blen = 0;
-	if (nullfb) {
+	if (args.alloc_minlen_only) {
+		args.type = XFS_ALLOCTYPE_START_AG;
+		args.total = args.minlen = args.maxlen = ap->minlen;
+	} else if (nullfb) {
 		/*
 		 * Search for an allocation group with a single extent large
 		 * enough for the request.  If one isn't found, then adjust
@@ -3595,7 +3606,8 @@ xfs_bmap_btalloc(
 	 * is only set if the allocation length is >= the stripe unit and the
 	 * allocation offset is at the end of file.
 	 */
-	if (!(ap->tp->t_flags & XFS_TRANS_LOWMODE) && ap->aeof) {
+	if (!(ap->tp->t_flags & XFS_TRANS_LOWMODE) && ap->aeof &&
+		!args.alloc_minlen_only) {
 		if (!ap->offset) {
 			args.alignment = stripe_align;
 			atype = args.type;
@@ -3681,7 +3693,7 @@ xfs_bmap_btalloc(
 		if ((error = xfs_alloc_vextent(&args)))
 			return error;
 	}
-	if (args.fsbno == NULLFSBLOCK && nullfb) {
+	if (args.fsbno == NULLFSBLOCK && nullfb && !args.alloc_minlen_only) {
 		args.fsbno = 0;
 		args.type = XFS_ALLOCTYPE_FIRST_AG;
 		args.total = ap->minlen;
diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
index 1c56fcceeea6..6ca9084b6934 100644
--- a/fs/xfs/libxfs/xfs_errortag.h
+++ b/fs/xfs/libxfs/xfs_errortag.h
@@ -57,7 +57,8 @@
 #define XFS_ERRTAG_IUNLINK_FALLBACK			34
 #define XFS_ERRTAG_BUF_IOERROR				35
 #define XFS_ERRTAG_REDUCE_MAX_IEXTENTS			36
-#define XFS_ERRTAG_MAX					37
+#define XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT		37
+#define XFS_ERRTAG_MAX					38
 
 /*
  * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -99,5 +100,6 @@
 #define XFS_RANDOM_IUNLINK_FALLBACK			(XFS_RANDOM_DEFAULT/10)
 #define XFS_RANDOM_BUF_IOERROR				XFS_RANDOM_DEFAULT
 #define XFS_RANDOM_REDUCE_MAX_IEXTENTS			1
+#define XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT		1
 
 #endif /* __XFS_ERRORTAG_H_ */
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index 3780b118cc47..028560bb596a 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -55,6 +55,7 @@ static unsigned int xfs_errortag_random_default[] = {
 	XFS_RANDOM_IUNLINK_FALLBACK,
 	XFS_RANDOM_BUF_IOERROR,
 	XFS_RANDOM_REDUCE_MAX_IEXTENTS,
+	XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT,
 };
 
 struct xfs_errortag_attr {
@@ -166,6 +167,7 @@ XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
 XFS_ERRORTAG_ATTR_RW(iunlink_fallback,	XFS_ERRTAG_IUNLINK_FALLBACK);
 XFS_ERRORTAG_ATTR_RW(buf_ioerror,	XFS_ERRTAG_BUF_IOERROR);
 XFS_ERRORTAG_ATTR_RW(reduce_max_iextents,	XFS_ERRTAG_REDUCE_MAX_IEXTENTS);
+XFS_ERRORTAG_ATTR_RW(bmap_alloc_minlen_extent, XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
 
 static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -205,6 +207,7 @@ static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
 	XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
 	XFS_ERRORTAG_ATTR_LIST(reduce_max_iextents),
+	XFS_ERRORTAG_ATTR_LIST(bmap_alloc_minlen_extent),
 	NULL,
 };
 
-- 
2.28.0


* Re: [PATCH V5 02/12] xfs: Check for extent overflow when trivially adding a new extent
  2020-10-03  5:56 ` [PATCH V5 02/12] xfs: Check for extent overflow when trivially adding a new extent Chandan Babu R
@ 2020-10-06  4:18   ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-06  4:18 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david

On Sat, Oct 03, 2020 at 11:26:23AM +0530, Chandan Babu R wrote:
> When adding a new data extent (without modifying an inode's existing
> extents) the extent count increases only by 1. This commit checks for
> extent count overflow in such cases.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>

Looks reasonable,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/libxfs/xfs_bmap.c       | 6 ++++++
>  fs/xfs/libxfs/xfs_inode_fork.h | 6 ++++++
>  fs/xfs/xfs_bmap_item.c         | 7 +++++++
>  fs/xfs/xfs_bmap_util.c         | 5 +++++
>  fs/xfs/xfs_dquot.c             | 8 +++++++-
>  fs/xfs/xfs_iomap.c             | 5 +++++
>  fs/xfs/xfs_rtalloc.c           | 5 +++++
>  7 files changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 1b0a01b06a05..51c2d2690f05 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -4527,6 +4527,12 @@ xfs_bmapi_convert_delalloc(
>  		return error;
>  
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
> +
> +	error = xfs_iext_count_may_overflow(ip, whichfork,
> +			XFS_IEXT_ADD_NOSPLIT_CNT);
> +	if (error)
> +		goto out_trans_cancel;
> +
>  	xfs_trans_ijoin(tp, ip, 0);
>  
>  	if (!xfs_iext_lookup_extent(ip, ifp, offset_fsb, &bma.icur, &bma.got) ||
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> index 0beb8e2a00be..7fc2b129a2e7 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.h
> +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> @@ -34,6 +34,12 @@ struct xfs_ifork {
>  #define	XFS_IFEXTENTS	0x02	/* All extent pointers are read in */
>  #define	XFS_IFBROOT	0x04	/* i_broot points to the bmap b-tree root */
>  
> +/*
> + * Worst-case increase in the fork extent count when we're adding a single
> + * extent to a fork and there's no possibility of splitting an existing mapping.
> + */
> +#define XFS_IEXT_ADD_NOSPLIT_CNT	(1)
> +
>  /*
>   * Fork handling.
>   */
> diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
> index ec3691372e7c..6a7dcea4ad40 100644
> --- a/fs/xfs/xfs_bmap_item.c
> +++ b/fs/xfs/xfs_bmap_item.c
> @@ -519,6 +519,13 @@ xfs_bui_item_recover(
>  	}
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> +	if (bui_type == XFS_BMAP_MAP) {
> +		error = xfs_iext_count_may_overflow(ip, whichfork,
> +				XFS_IEXT_ADD_NOSPLIT_CNT);
> +		if (error)
> +			goto err_inode;
> +	}
> +
>  	count = bmap->me_len;
>  	error = xfs_trans_log_finish_bmap_update(tp, budp, type, ip, whichfork,
>  			bmap->me_startoff, bmap->me_startblock, &count, state);
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index f2a8a0e75e1f..dcd6e61df711 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -822,6 +822,11 @@ xfs_alloc_file_space(
>  		if (error)
>  			goto error1;
>  
> +		error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
> +				XFS_IEXT_ADD_NOSPLIT_CNT);
> +		if (error)
> +			goto error0;
> +
>  		xfs_trans_ijoin(tp, ip, 0);
>  
>  		error = xfs_bmapi_write(tp, ip, startoffset_fsb,
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index 3072814e407d..5bf22d2e50cb 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
> @@ -314,8 +314,14 @@ xfs_dquot_disk_alloc(
>  		return -ESRCH;
>  	}
>  
> -	/* Create the block mapping. */
>  	xfs_trans_ijoin(tp, quotip, XFS_ILOCK_EXCL);
> +
> +	error = xfs_iext_count_may_overflow(quotip, XFS_DATA_FORK,
> +			XFS_IEXT_ADD_NOSPLIT_CNT);
> +	if (error)
> +		return error;
> +
> +	/* Create the block mapping. */
>  	error = xfs_bmapi_write(tp, quotip, dqp->q_fileoffset,
>  			XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, 0, &map,
>  			&nmaps);
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 3abb8b9d6f4c..a302a96823b8 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -250,6 +250,11 @@ xfs_iomap_write_direct(
>  	if (error)
>  		goto out_trans_cancel;
>  
> +	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
> +			XFS_IEXT_ADD_NOSPLIT_CNT);
> +	if (error)
> +		goto out_trans_cancel;
> +
>  	xfs_trans_ijoin(tp, ip, 0);
>  
>  	/*
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index 9d4e33d70d2a..3e841a75f272 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -804,6 +804,11 @@ xfs_growfs_rt_alloc(
>  		xfs_ilock(ip, XFS_ILOCK_EXCL);
>  		xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
>  
> +		error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
> +				XFS_IEXT_ADD_NOSPLIT_CNT);
> +		if (error)
> +			goto out_trans_cancel;
> +
>  		/*
>  		 * Allocate blocks to the bitmap file.
>  		 */
> -- 
> 2.28.0
> 

* Re: [PATCH V5 03/12] xfs: Check for extent overflow when punching a hole
  2020-10-03  5:56 ` [PATCH V5 03/12] xfs: Check for extent overflow when punching a hole Chandan Babu R
@ 2020-10-06  4:18   ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-06  4:18 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david

On Sat, Oct 03, 2020 at 11:26:24AM +0530, Chandan Babu R wrote:
> The extent mapping the file offset at which a hole has to be
> inserted will be split into two extents causing extent count to
> increase by 1.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>

Looks fine,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/libxfs/xfs_inode_fork.h |  7 +++++++
>  fs/xfs/xfs_bmap_item.c         | 15 +++++++++------
>  fs/xfs/xfs_bmap_util.c         | 10 ++++++++++
>  3 files changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> index 7fc2b129a2e7..bcac769a7df6 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.h
> +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> @@ -40,6 +40,13 @@ struct xfs_ifork {
>   */
>  #define XFS_IEXT_ADD_NOSPLIT_CNT	(1)
>  
> +/*
> + * Punching out an extent from the middle of an existing extent can cause the
> + * extent count to increase by 1.
> + * i.e. | Old extent | Hole | Old extent |
> + */
> +#define XFS_IEXT_PUNCH_HOLE_CNT		(1)
> +
>  /*
>   * Fork handling.
>   */
> diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
> index 6a7dcea4ad40..323cee00bd45 100644
> --- a/fs/xfs/xfs_bmap_item.c
> +++ b/fs/xfs/xfs_bmap_item.c
> @@ -440,6 +440,7 @@ xfs_bui_item_recover(
>  	bool				op_ok;
>  	unsigned int			bui_type;
>  	int				whichfork;
> +	int				iext_delta;
>  	int				error = 0;
>  
>  	/* Only one mapping operation per BUI... */
> @@ -519,12 +520,14 @@ xfs_bui_item_recover(
>  	}
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> -	if (bui_type == XFS_BMAP_MAP) {
> -		error = xfs_iext_count_may_overflow(ip, whichfork,
> -				XFS_IEXT_ADD_NOSPLIT_CNT);
> -		if (error)
> -			goto err_inode;
> -	}
> +	if (bui_type == XFS_BMAP_MAP)
> +		iext_delta = XFS_IEXT_ADD_NOSPLIT_CNT;
> +	else
> +		iext_delta = XFS_IEXT_PUNCH_HOLE_CNT;
> +
> +	error = xfs_iext_count_may_overflow(ip, whichfork, iext_delta);
> +	if (error)
> +		goto err_inode;
>  
>  	count = bmap->me_len;
>  	error = xfs_trans_log_finish_bmap_update(tp, budp, type, ip, whichfork,
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index dcd6e61df711..0776abd0103c 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -891,6 +891,11 @@ xfs_unmap_extent(
>  
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> +	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
> +			XFS_IEXT_PUNCH_HOLE_CNT);
> +	if (error)
> +		goto out_trans_cancel;
> +
>  	error = xfs_bunmapi(tp, ip, startoffset_fsb, len_fsb, 0, 2, done);
>  	if (error)
>  		goto out_trans_cancel;
> @@ -1176,6 +1181,11 @@ xfs_insert_file_space(
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> +	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
> +			XFS_IEXT_PUNCH_HOLE_CNT);
> +	if (error)
> +		goto out_trans_cancel;
> +
>  	/*
>  	 * The extent shifting code works on extent granularity. So, if stop_fsb
>  	 * is not the starting block of extent, we need to split the extent at
> -- 
> 2.28.0
> 

* Re: [PATCH V5 04/12] xfs: Check for extent overflow when adding/removing xattrs
  2020-10-03  5:56 ` [PATCH V5 04/12] xfs: Check for extent overflow when adding/removing xattrs Chandan Babu R
@ 2020-10-06  4:23   ` Darrick J. Wong
  2020-10-06  9:21     ` Chandan Babu R
  0 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-06  4:23 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david

On Sat, Oct 03, 2020 at 11:26:25AM +0530, Chandan Babu R wrote:
> Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to be
> added. One extra extent for dabtree in case a local attr is large enough
> to cause a double split.  It can also cause extent count to increase
> proportional to the size of a remote xattr's value.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>

Didn't I already review this?  AFAICT it hasn't changed much, but did
something change enough to warrant dropping the old RVB tag?

> ---
>  fs/xfs/libxfs/xfs_attr.c       | 13 +++++++++++++
>  fs/xfs/libxfs/xfs_inode_fork.h | 10 ++++++++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index fd8e6418a0d3..be51e7068dcd 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -396,6 +396,7 @@ xfs_attr_set(
>  	struct xfs_trans_res	tres;
>  	bool			rsvd = (args->attr_filter & XFS_ATTR_ROOT);
>  	int			error, local;
> +	int			rmt_blks = 0;
>  	unsigned int		total;
>  
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
> @@ -442,11 +443,15 @@ xfs_attr_set(
>  		tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
>  		tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
>  		total = args->total;
> +
> +		if (!local)
> +			rmt_blks = xfs_attr3_rmt_blocks(mp, args->valuelen);
>  	} else {
>  		XFS_STATS_INC(mp, xs_attr_remove);
>  
>  		tres = M_RES(mp)->tr_attrrm;
>  		total = XFS_ATTRRM_SPACE_RES(mp);
> +		rmt_blks = xfs_attr3_rmt_blocks(mp, XFS_XATTR_SIZE_MAX);
>  	}
>  
>  	/*
> @@ -460,6 +465,14 @@ xfs_attr_set(
>  
>  	xfs_ilock(dp, XFS_ILOCK_EXCL);
>  	xfs_trans_ijoin(args->trans, dp, 0);
> +
> +	if (args->value || xfs_inode_hasattr(dp)) {
> +		error = xfs_iext_count_may_overflow(dp, XFS_ATTR_FORK,
> +				XFS_IEXT_ATTR_MANIP_CNT(rmt_blks));
> +		if (error)
> +			goto out_trans_cancel;

Hmm.  If you hit this while trying to remove an xattr, what then?
I suppose you really don't want to overflow naextents, but I suppose the
only other option is to delete the file.  Oh well, attr forks with 65533
extents should be vanishingly rare, right?  Right? :)

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> +	}
> +
>  	if (args->value) {
>  		unsigned int	quota_flags = XFS_QMOPT_RES_REGBLKS;
>  
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> index bcac769a7df6..5de2f07d0dd5 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.h
> +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> @@ -47,6 +47,16 @@ struct xfs_ifork {
>   */
>  #define XFS_IEXT_PUNCH_HOLE_CNT		(1)
>  
> +/*
> + * Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to
> + * be added. One extra extent for dabtree in case a local attr is
> + * large enough to cause a double split.  It can also cause extent
> + * count to increase proportional to the size of a remote xattr's
> + * value.
> + */
> +#define XFS_IEXT_ATTR_MANIP_CNT(rmt_blks) \
> +	(XFS_DA_NODE_MAXDEPTH + max(1, rmt_blks))
> +
>  /*
>   * Fork handling.
>   */
> -- 
> 2.28.0
> 
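For anyone following the arithmetic behind XFS_IEXT_ATTR_MANIP_CNT(), a
minimal userspace sketch is below. The sketch_ names are hypothetical, and
the value 5 for XFS_DA_NODE_MAXDEPTH is an assumption about the kernel
headers of this era, not something stated in the patch itself.

```c
/* Assumed value: the kernel headers define XFS_DA_NODE_MAXDEPTH as 5. */
#define SKETCH_DA_NODE_MAXDEPTH	5

/*
 * Hypothetical stand-in for XFS_IEXT_ATTR_MANIP_CNT(): the worst-case
 * number of extents a single xattr set/remove may add to the attr fork.
 */
static int
sketch_attr_manip_cnt(int rmt_blks)
{
	/* max(1, rmt_blks): one extra dabtree extent, or one per remote block */
	return SKETCH_DA_NODE_MAXDEPTH + (rmt_blks > 1 ? rmt_blks : 1);
}
```

Under that assumption a local attr (rmt_blks == 0) reserves 6 extents and
a remote value spanning 16 blocks reserves 21.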


* Re: [PATCH V5 09/12] xfs: Check for extent overflow when swapping extents
  2020-10-03  5:56 ` [PATCH V5 09/12] xfs: Check for extent overflow when swapping extents Chandan Babu R
@ 2020-10-06  4:23   ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-06  4:23 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david

On Sat, Oct 03, 2020 at 11:26:30AM +0530, Chandan Babu R wrote:
> Removing an initial range of source/donor file's extent and adding a new
> extent (from donor/source file) in its place will cause extent count to
> increase by 1.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/libxfs/xfs_inode_fork.h |  7 +++++++
>  fs/xfs/xfs_bmap_util.c         | 16 ++++++++++++++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> index ded3c1b56c94..837c01595439 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.h
> +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> @@ -102,6 +102,13 @@ struct xfs_ifork {
>  #define XFS_IEXT_REFLINK_REMAP_CNT(smap_real, dmap_written) \
>  	(((smap_real) ? 1 : 0) + ((dmap_written) ? 1 : 0))
>  
> +/*
> + * Removing an initial range of source/donor file's extent and adding a new
> + * extent (from donor/source file) in its place will cause extent count to
> + * increase by 1.
> + */
> +#define XFS_IEXT_SWAP_RMAP_CNT		(1)
> +
>  /*
>   * Fork handling.
>   */
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 0776abd0103c..b6728fdf50ae 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -1407,6 +1407,22 @@ xfs_swap_extent_rmap(
>  					irec.br_blockcount);
>  			trace_xfs_swap_extent_rmap_remap_piece(tip, &uirec);
>  
> +			if (xfs_bmap_is_real_extent(&uirec)) {
> +				error = xfs_iext_count_may_overflow(ip,
> +						XFS_DATA_FORK,
> +						XFS_IEXT_SWAP_RMAP_CNT);
> +				if (error)
> +					goto out;
> +			}
> +
> +			if (xfs_bmap_is_real_extent(&irec)) {
> +				error = xfs_iext_count_may_overflow(tip,
> +						XFS_DATA_FORK,
> +						XFS_IEXT_SWAP_RMAP_CNT);
> +				if (error)
> +					goto out;
> +			}
> +
>  			/* Remove the mapping from the donor file. */
>  			xfs_bmap_unmap_extent(tp, tip, &uirec);
>  
> -- 
> 2.28.0
> 
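The "+1" in XFS_IEXT_SWAP_RMAP_CNT can be illustrated with a small model.
This is only a sketch: the function name is hypothetical, and it assumes
the new extent cannot be merged with a neighbouring mapping.

```c
/*
 * Hypothetical model: how many extents cover a range that used to be a
 * single extent of `len` blocks once its first `swapped` blocks have been
 * replaced by a new extent from the other file.
 */
static unsigned int
sketch_extents_after_swap(unsigned int len, unsigned int swapped)
{
	/* Nothing swapped, or the whole extent replaced: count is unchanged. */
	if (swapped == 0 || swapped >= len)
		return 1;

	/* New head extent plus the remaining tail: count grows by one. */
	return 2;
}
```

So a 10-block extent whose first 4 blocks are swapped ends up as 2 extents,
which is the single-extent increase that the check reserves for each file.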


* Re: [PATCH V5 10/12] xfs: Introduce error injection to reduce maximum inode fork extent count
  2020-10-03  5:56 ` [PATCH V5 10/12] xfs: Introduce error injection to reduce maximum inode fork extent count Chandan Babu R
@ 2020-10-06  4:24   ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-06  4:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david

On Sat, Oct 03, 2020 at 11:26:31AM +0530, Chandan Babu R wrote:
> This commit adds XFS_ERRTAG_REDUCE_MAX_IEXTENTS error tag which enables
> userspace programs to test "Inode fork extent count overflow detection"
> by reducing maximum possible inode fork extent count to 10.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>

Looks decent,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/libxfs/xfs_errortag.h   | 4 +++-
>  fs/xfs/libxfs/xfs_inode_fork.c | 4 ++++
>  fs/xfs/xfs_error.c             | 3 +++
>  3 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
> index 53b305dea381..1c56fcceeea6 100644
> --- a/fs/xfs/libxfs/xfs_errortag.h
> +++ b/fs/xfs/libxfs/xfs_errortag.h
> @@ -56,7 +56,8 @@
>  #define XFS_ERRTAG_FORCE_SUMMARY_RECALC			33
>  #define XFS_ERRTAG_IUNLINK_FALLBACK			34
>  #define XFS_ERRTAG_BUF_IOERROR				35
> -#define XFS_ERRTAG_MAX					36
> +#define XFS_ERRTAG_REDUCE_MAX_IEXTENTS			36
> +#define XFS_ERRTAG_MAX					37
>  
>  /*
>   * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
> @@ -97,5 +98,6 @@
>  #define XFS_RANDOM_FORCE_SUMMARY_RECALC			1
>  #define XFS_RANDOM_IUNLINK_FALLBACK			(XFS_RANDOM_DEFAULT/10)
>  #define XFS_RANDOM_BUF_IOERROR				XFS_RANDOM_DEFAULT
> +#define XFS_RANDOM_REDUCE_MAX_IEXTENTS			1
>  
>  #endif /* __XFS_ERRORTAG_H_ */
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> index 8d48716547e5..e080d7e07643 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.c
> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> @@ -24,6 +24,7 @@
>  #include "xfs_dir2_priv.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_types.h"
> +#include "xfs_errortag.h"
>  
>  kmem_zone_t *xfs_ifork_zone;
>  
> @@ -745,6 +746,9 @@ xfs_iext_count_may_overflow(
>  
>  	max_exts = (whichfork == XFS_ATTR_FORK) ? MAXAEXTNUM : MAXEXTNUM;
>  
> +	if (XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> +		max_exts = 10;
> +
>  	nr_exts = ifp->if_nextents + nr_to_add;
>  	if (nr_exts < ifp->if_nextents || nr_exts > max_exts)
>  		return -EFBIG;
> diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
> index 7f6e20899473..3780b118cc47 100644
> --- a/fs/xfs/xfs_error.c
> +++ b/fs/xfs/xfs_error.c
> @@ -54,6 +54,7 @@ static unsigned int xfs_errortag_random_default[] = {
>  	XFS_RANDOM_FORCE_SUMMARY_RECALC,
>  	XFS_RANDOM_IUNLINK_FALLBACK,
>  	XFS_RANDOM_BUF_IOERROR,
> +	XFS_RANDOM_REDUCE_MAX_IEXTENTS,
>  };
>  
>  struct xfs_errortag_attr {
> @@ -164,6 +165,7 @@ XFS_ERRORTAG_ATTR_RW(force_repair,	XFS_ERRTAG_FORCE_SCRUB_REPAIR);
>  XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
>  XFS_ERRORTAG_ATTR_RW(iunlink_fallback,	XFS_ERRTAG_IUNLINK_FALLBACK);
>  XFS_ERRORTAG_ATTR_RW(buf_ioerror,	XFS_ERRTAG_BUF_IOERROR);
> +XFS_ERRORTAG_ATTR_RW(reduce_max_iextents,	XFS_ERRTAG_REDUCE_MAX_IEXTENTS);
>  
>  static struct attribute *xfs_errortag_attrs[] = {
>  	XFS_ERRORTAG_ATTR_LIST(noerror),
> @@ -202,6 +204,7 @@ static struct attribute *xfs_errortag_attrs[] = {
>  	XFS_ERRORTAG_ATTR_LIST(bad_summary),
>  	XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
>  	XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
> +	XFS_ERRORTAG_ATTR_LIST(reduce_max_iextents),
>  	NULL,
>  };
>  
> -- 
> 2.28.0
> 
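The effect of the new error tag on the overflow check can be modelled in
userspace roughly as follows. This is a sketch, not the kernel function:
the sketch_ names are hypothetical, cur_exts stands in for
ifp->if_nextents, fork_max for MAXEXTNUM or MAXAEXTNUM, and the inject
flag for XFS_ERRTAG_REDUCE_MAX_IEXTENTS being armed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Limit that the error tag substitutes for the real per-fork maximum. */
#define SKETCH_INJECTED_MAX_EXTS	10

/* Hypothetical model of the check in xfs_iext_count_may_overflow(). */
static int
sketch_iext_count_may_overflow(uint64_t cur_exts, uint32_t nr_to_add,
		uint64_t fork_max, bool inject)
{
	uint64_t max_exts = inject ? SKETCH_INJECTED_MAX_EXTS : fork_max;
	uint64_t nr_exts = cur_exts + nr_to_add;

	/* Reject both wraparound of the sum and exceeding the limit. */
	if (nr_exts < cur_exts || nr_exts > max_exts)
		return -EFBIG;
	return 0;
}
```

With the knob armed, a fork that already holds 10 extents fails the check
for any further addition, so a test needs only a handful of extents to
reach the -EFBIG path.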


* Re: [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files
  2020-10-03  5:56 ` [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R
@ 2020-10-06  4:25   ` Chandan Babu R
  2020-10-06  4:27     ` Darrick J. Wong
  2020-10-06  4:34   ` Darrick J. Wong
  1 sibling, 1 reply; 26+ messages in thread
From: Chandan Babu R @ 2020-10-06  4:25 UTC (permalink / raw)
  To: linux-xfs; +Cc: darrick.wong, david

On Saturday 3 October 2020 11:26:33 AM IST Chandan Babu R wrote:
> This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which
> helps userspace test programs to get xfs_bmap_btalloc() to always
> allocate minlen sized extents.
> 
> This is required for test programs which need a guarantee that minlen
> extents allocated for a file do not get merged with their existing
> neighbours in the inode's BMBT. "Inode fork extent overflow check" for
> Directories, Xattrs and extension of realtime inodes need this since the
> file offset at which the extents are being allocated cannot be
> explicitly controlled from userspace.
> 
> One way to use this error tag is to,
> 1. Consume all of the free space by sequentially writing to a file.
> 2. Punch alternate blocks of the file. This causes CNTBT to contain
>    sufficient number of one block sized extent records.
> 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag.
> After step 3, xfs_bmap_btalloc() will issue space allocation
> requests for minlen sized extents only.
> 
> ENOSPC error code is returned to userspace when there aren't any "one
> block sized" extents left in any of the AGs.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> ---
>  fs/xfs/libxfs/xfs_alloc.c    | 46 ++++++++++++++++++++++++++++++++++++
>  fs/xfs/libxfs/xfs_alloc.h    |  1 +
>  fs/xfs/libxfs/xfs_bmap.c     | 26 ++++++++++++++------
>  fs/xfs/libxfs/xfs_errortag.h |  4 +++-
>  fs/xfs/xfs_error.c           |  3 +++
>  5 files changed, 72 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 852b536551b5..d8d8ab1478db 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -2473,6 +2473,45 @@ xfs_defer_agfl_block(
>  	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
>  }
>  
> +STATIC int
> +minlen_freespace_available(
> +	struct xfs_alloc_arg	*args,
> +	struct xfs_buf		*agbp,
> +	int			*stat)
> +{
> +	xfs_btree_cur_t		*cnt_cur;
> +	xfs_agblock_t		fbno;
> +	xfs_extlen_t		flen;
> +	int			btree_error = XFS_BTREE_NOERROR;
> +	int			error = 0;
> +
> +	cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, agbp,
> +			args->agno, XFS_BTNUM_CNT);
> +	error = xfs_alloc_lookup_ge(cnt_cur, 0, args->minlen, stat);
> +	if (error) {
> +		btree_error = XFS_BTREE_ERROR;
> +		goto out;
> +	}
> +
> +	ASSERT(*stat == 1);
> +
> +	error = xfs_alloc_get_rec(cnt_cur, &fbno, &flen, stat);
> +	if (error) {
> +		btree_error = XFS_BTREE_ERROR;
> +		goto out;
> +	}
> +
> +	if (flen == args->minlen)
> +		*stat = 1;
> +	else
> +		*stat = 0;
> +
> +out:
> +	xfs_btree_del_cursor(cnt_cur, btree_error);
> +
> +	return error;
> +}
> +
>  /*
>   * Decide whether to use this allocation group for this allocation.
>   * If so, fix up the btree freelist's size.
> @@ -2490,6 +2529,7 @@ xfs_alloc_fix_freelist(
>  	struct xfs_alloc_arg	targs;	/* local allocation arguments */
>  	xfs_agblock_t		bno;	/* freelist block */
>  	xfs_extlen_t		need;	/* total blocks needed in freelist */
> +	int			i;
>  	int			error = 0;
>  
>  	/* deferred ops (AGFL block frees) require permanent transactions */
> @@ -2544,6 +2584,12 @@ xfs_alloc_fix_freelist(
>  	if (!xfs_alloc_space_available(args, need, flags))
>  		goto out_agbp_relse;
>  
> +	if (args->alloc_minlen_only) {
> +		error = minlen_freespace_available(args, agbp, &i);
> +		if (error || !i)
> +			goto out_agbp_relse;
> +	}
> +
>  	/*
>  	 * Make the freelist shorter if it's too long.
>  	 *
> diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
> index 6c22b12176b8..1d04089b7fb4 100644
> --- a/fs/xfs/libxfs/xfs_alloc.h
> +++ b/fs/xfs/libxfs/xfs_alloc.h
> @@ -75,6 +75,7 @@ typedef struct xfs_alloc_arg {
>  	char		wasfromfl;	/* set if allocation is from freelist */
>  	struct xfs_owner_info	oinfo;	/* owner of blocks being allocated */
>  	enum xfs_ag_resv_type	resv;	/* block reservation to use */
> +	bool		alloc_minlen_only;
>  } xfs_alloc_arg_t;
>  
>  /*
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 5156cbd476f2..fab4097e7492 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -3510,12 +3510,19 @@ xfs_bmap_btalloc(
>  		ASSERT(ap->length);
>  	}
>  
> +	memset(&args, 0, sizeof(args));
> +
> +	args.alloc_minlen_only = XFS_TEST_ERROR(false, mp,
> +					XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
>  
>  	nullfb = ap->tp->t_firstblock == NULLFSBLOCK;
>  	fb_agno = nullfb ? NULLAGNUMBER : XFS_FSB_TO_AGNO(mp,
>  							ap->tp->t_firstblock);
>  	if (nullfb) {
> -		if ((ap->datatype & XFS_ALLOC_USERDATA) &&
> +		if (args.alloc_minlen_only) {
> +			ag = 0;
> +			ap->blkno = XFS_AGB_TO_FSB(mp, ag, 0);
> +		} else if ((ap->datatype & XFS_ALLOC_USERDATA) &&
>  		    xfs_inode_is_filestream(ap->ip)) {
>  			ag = xfs_filestream_lookup_ag(ap->ip);
>  			ag = (ag != NULLAGNUMBER) ? ag : 0;
> @@ -3523,10 +3530,12 @@ xfs_bmap_btalloc(
>  		} else {
>  			ap->blkno = XFS_INO_TO_FSB(mp, ap->ip->i_ino);
>  		}
> -	} else
> +	} else {
>  		ap->blkno = ap->tp->t_firstblock;
> +	}
>  
> -	xfs_bmap_adjacent(ap);
> +	if (!args.alloc_minlen_only)
> +		xfs_bmap_adjacent(ap);
>  
>  	/*
>  	 * If allowed, use ap->blkno; otherwise must use firstblock since
> @@ -3540,7 +3549,6 @@ xfs_bmap_btalloc(
>  	 * Normal allocation, done through xfs_alloc_vextent.
>  	 */
>  	tryagain = isaligned = 0;
> -	memset(&args, 0, sizeof(args));
>  	args.tp = ap->tp;
>  	args.mp = mp;
>  	args.fsbno = ap->blkno;
> @@ -3549,7 +3557,10 @@ xfs_bmap_btalloc(
>  	/* Trim the allocation back to the maximum an AG can fit. */
>  	args.maxlen = min(ap->length, mp->m_ag_max_usable);
>  	blen = 0;
> -	if (nullfb) {
> +	if (args.alloc_minlen_only) {
> +		args.type = XFS_ALLOCTYPE_START_AG;

The above should have been,

args.type = XFS_ALLOCTYPE_FIRST_AG;

In my experiments I had introduced a new args.type value, and later
realized that XFS_ALLOCTYPE_FIRST_AG would suffice for my requirements. I
tested the changed version (which was in my git stash) but forgot to apply
it to this commit once testing was complete, so I ended up sending a
slightly stale patch. I am sorry about this; I will resend the series.

-- 
chandan





* Re: [PATCH V5 11/12] xfs: Set tp->t_firstblock only once during a transaction's lifetime
  2020-10-03  5:56 ` [PATCH V5 11/12] xfs: Set tp->t_firstblock only once during a transaction's lifetime Chandan Babu R
@ 2020-10-06  4:26   ` Darrick J. Wong
  2020-10-06  5:17     ` Chandan Babu R
  0 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-06  4:26 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david

On Sat, Oct 03, 2020 at 11:26:32AM +0530, Chandan Babu R wrote:
> tp->t_firstblock is supposed to hold the first fs block allocated by the
> transaction. There are two cases in the current code base where
> tp->t_firstblock is assigned a value unconditionally. This commit makes
> sure that we assign to tp->t_firstblock only if its current value is
> NULLFSBLOCK.

Do we hit this currently?  This seems like a regression fix, since I'm
guessing you hit this fairly soon after adding the next patch and
twisting the "shatter everything" debug knob it establishes?  And if
you can hit it there, you could hit this on a severely fragmented fs?

--D

> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 51c2d2690f05..5156cbd476f2 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -724,7 +724,8 @@ xfs_bmap_extents_to_btree(
>  	 */
>  	ASSERT(tp->t_firstblock == NULLFSBLOCK ||
>  	       args.agno >= XFS_FSB_TO_AGNO(mp, tp->t_firstblock));
> -	tp->t_firstblock = args.fsbno;
> +	if (tp->t_firstblock == NULLFSBLOCK)
> +		tp->t_firstblock = args.fsbno;
>  	cur->bc_ino.allocated++;
>  	ip->i_d.di_nblocks++;
>  	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, 1L);
> @@ -875,7 +876,8 @@ xfs_bmap_local_to_extents(
>  	/* Can't fail, the space was reserved. */
>  	ASSERT(args.fsbno != NULLFSBLOCK);
>  	ASSERT(args.len == 1);
> -	tp->t_firstblock = args.fsbno;
> +	if (tp->t_firstblock == NULLFSBLOCK)
> +		tp->t_firstblock = args.fsbno;
>  	error = xfs_trans_get_buf(tp, args.mp->m_ddev_targp,
>  			XFS_FSB_TO_DADDR(args.mp, args.fsbno),
>  			args.mp->m_bsize, 0, &bp);
> -- 
> 2.28.0
> 
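The "set only once" rule that both hunks apply can be sketched in
isolation. The struct, sentinel and function names below are hypothetical
stand-ins for struct xfs_trans, NULLFSBLOCK and the open-coded assignments
in the patch.

```c
#include <stdint.h>

/* Hypothetical sentinel standing in for NULLFSBLOCK ("nothing allocated"). */
#define SKETCH_NULLFSBLOCK	((uint64_t)-1)

struct sketch_trans {
	uint64_t	t_firstblock;	/* first block this transaction allocated */
};

/* Record an allocation; only the first one may set t_firstblock. */
static void
sketch_note_alloc(struct sketch_trans *tp, uint64_t fsbno)
{
	if (tp->t_firstblock == SKETCH_NULLFSBLOCK)
		tp->t_firstblock = fsbno;
}
```

A second call with a different block number leaves t_firstblock at the
value recorded by the first call, which is the behaviour the commit
message describes.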


* Re: [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files
  2020-10-06  4:25   ` Chandan Babu R
@ 2020-10-06  4:27     ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-06  4:27 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david

On Tue, Oct 06, 2020 at 09:55:03AM +0530, Chandan Babu R wrote:
> On Saturday 3 October 2020 11:26:33 AM IST Chandan Babu R wrote:
> > This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which
> > helps userspace test programs to get xfs_bmap_btalloc() to always
> > allocate minlen sized extents.
> > 
> > This is required for test programs which need a guarantee that minlen
> > extents allocated for a file do not get merged with their existing
> > neighbours in the inode's BMBT. "Inode fork extent overflow check" for
> > Directories, Xattrs and extension of realtime inodes need this since the
> > file offset at which the extents are being allocated cannot be
> > explicitly controlled from userspace.
> > 
> > One way to use this error tag is to,
> > 1. Consume all of the free space by sequentially writing to a file.
> > 2. Punch alternate blocks of the file. This causes CNTBT to contain
> >    sufficient number of one block sized extent records.
> > 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag.
> > After step 3, xfs_bmap_btalloc() will issue space allocation
> > requests for minlen sized extents only.
> > 
> > ENOSPC error code is returned to userspace when there aren't any "one
> > block sized" extents left in any of the AGs.
> > 
> > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > ---
> >  fs/xfs/libxfs/xfs_alloc.c    | 46 ++++++++++++++++++++++++++++++++++++
> >  fs/xfs/libxfs/xfs_alloc.h    |  1 +
> >  fs/xfs/libxfs/xfs_bmap.c     | 26 ++++++++++++++------
> >  fs/xfs/libxfs/xfs_errortag.h |  4 +++-
> >  fs/xfs/xfs_error.c           |  3 +++
> >  5 files changed, 72 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > index 852b536551b5..d8d8ab1478db 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.c
> > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > @@ -2473,6 +2473,45 @@ xfs_defer_agfl_block(
> >  	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
> >  }
> >  
> > +STATIC int
> > +minlen_freespace_available(
> > +	struct xfs_alloc_arg	*args,
> > +	struct xfs_buf		*agbp,
> > +	int			*stat)
> > +{
> > +	xfs_btree_cur_t		*cnt_cur;
> > +	xfs_agblock_t		fbno;
> > +	xfs_extlen_t		flen;
> > +	int			btree_error = XFS_BTREE_NOERROR;
> > +	int			error = 0;
> > +
> > +	cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, agbp,
> > +			args->agno, XFS_BTNUM_CNT);
> > +	error = xfs_alloc_lookup_ge(cnt_cur, 0, args->minlen, stat);
> > +	if (error) {
> > +		btree_error = XFS_BTREE_ERROR;
> > +		goto out;
> > +	}
> > +
> > +	ASSERT(*stat == 1);
> > +
> > +	error = xfs_alloc_get_rec(cnt_cur, &fbno, &flen, stat);
> > +	if (error) {
> > +		btree_error = XFS_BTREE_ERROR;
> > +		goto out;
> > +	}
> > +
> > +	if (flen == args->minlen)
> > +		*stat = 1;
> > +	else
> > +		*stat = 0;
> > +
> > +out:
> > +	xfs_btree_del_cursor(cnt_cur, btree_error);
> > +
> > +	return error;
> > +}
> > +
> >  /*
> >   * Decide whether to use this allocation group for this allocation.
> >   * If so, fix up the btree freelist's size.
> > @@ -2490,6 +2529,7 @@ xfs_alloc_fix_freelist(
> >  	struct xfs_alloc_arg	targs;	/* local allocation arguments */
> >  	xfs_agblock_t		bno;	/* freelist block */
> >  	xfs_extlen_t		need;	/* total blocks needed in freelist */
> > +	int			i;
> >  	int			error = 0;
> >  
> >  	/* deferred ops (AGFL block frees) require permanent transactions */
> > @@ -2544,6 +2584,12 @@ xfs_alloc_fix_freelist(
> >  	if (!xfs_alloc_space_available(args, need, flags))
> >  		goto out_agbp_relse;
> >  
> > +	if (args->alloc_minlen_only) {
> > +		error = minlen_freespace_available(args, agbp, &i);
> > +		if (error || !i)
> > +			goto out_agbp_relse;
> > +	}
> > +
> >  	/*
> >  	 * Make the freelist shorter if it's too long.
> >  	 *
> > diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
> > index 6c22b12176b8..1d04089b7fb4 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.h
> > +++ b/fs/xfs/libxfs/xfs_alloc.h
> > @@ -75,6 +75,7 @@ typedef struct xfs_alloc_arg {
> >  	char		wasfromfl;	/* set if allocation is from freelist */
> >  	struct xfs_owner_info	oinfo;	/* owner of blocks being allocated */
> >  	enum xfs_ag_resv_type	resv;	/* block reservation to use */
> > +	bool		alloc_minlen_only;
> >  } xfs_alloc_arg_t;
> >  
> >  /*
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index 5156cbd476f2..fab4097e7492 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -3510,12 +3510,19 @@ xfs_bmap_btalloc(
> >  		ASSERT(ap->length);
> >  	}
> >  
> > +	memset(&args, 0, sizeof(args));
> > +
> > +	args.alloc_minlen_only = XFS_TEST_ERROR(false, mp,
> > +					XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
> >  
> >  	nullfb = ap->tp->t_firstblock == NULLFSBLOCK;
> >  	fb_agno = nullfb ? NULLAGNUMBER : XFS_FSB_TO_AGNO(mp,
> >  							ap->tp->t_firstblock);
> >  	if (nullfb) {
> > -		if ((ap->datatype & XFS_ALLOC_USERDATA) &&
> > +		if (args.alloc_minlen_only) {
> > +			ag = 0;
> > +			ap->blkno = XFS_AGB_TO_FSB(mp, ag, 0);
> > +		} else if ((ap->datatype & XFS_ALLOC_USERDATA) &&
> >  		    xfs_inode_is_filestream(ap->ip)) {
> >  			ag = xfs_filestream_lookup_ag(ap->ip);
> >  			ag = (ag != NULLAGNUMBER) ? ag : 0;
> > @@ -3523,10 +3530,12 @@ xfs_bmap_btalloc(
> >  		} else {
> >  			ap->blkno = XFS_INO_TO_FSB(mp, ap->ip->i_ino);
> >  		}
> > -	} else
> > +	} else {
> >  		ap->blkno = ap->tp->t_firstblock;
> > +	}
> >  
> > -	xfs_bmap_adjacent(ap);
> > +	if (!args.alloc_minlen_only)
> > +		xfs_bmap_adjacent(ap);
> >  
> >  	/*
> >  	 * If allowed, use ap->blkno; otherwise must use firstblock since
> > @@ -3540,7 +3549,6 @@ xfs_bmap_btalloc(
> >  	 * Normal allocation, done through xfs_alloc_vextent.
> >  	 */
> >  	tryagain = isaligned = 0;
> > -	memset(&args, 0, sizeof(args));
> >  	args.tp = ap->tp;
> >  	args.mp = mp;
> >  	args.fsbno = ap->blkno;
> > @@ -3549,7 +3557,10 @@ xfs_bmap_btalloc(
> >  	/* Trim the allocation back to the maximum an AG can fit. */
> >  	args.maxlen = min(ap->length, mp->m_ag_max_usable);
> >  	blen = 0;
> > -	if (nullfb) {
> > +	if (args.alloc_minlen_only) {
> > +		args.type = XFS_ALLOCTYPE_START_AG;
> 
> The above should have been,
> 
> args.type = XFS_ALLOCTYPE_FIRST_AG;
> 
> In my experiments, I had introduced a new args.type value and had later
> realized that XFS_ALLOCTYPE_FIRST_AG would suffice for my requirements. I had
> tested the changed version (which was in my git stash) and forgot to apply
> that to this commit after testing was completed. Hence I ended up sending a
> slightly stale patch. I am sorry about this. I will resend the series.

Ok, but wait till I've gotten all the way through the replies (nearly
done now).

--D

> -- 
> chandan
> 
> 
> 


* Re: [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files
  2020-10-03  5:56 ` [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R
  2020-10-06  4:25   ` Chandan Babu R
@ 2020-10-06  4:34   ` Darrick J. Wong
  2020-10-06  9:17     ` Chandan Babu R
  1 sibling, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-06  4:34 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: linux-xfs, david

On Sat, Oct 03, 2020 at 11:26:33AM +0530, Chandan Babu R wrote:
> This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which
> helps userspace test programs to get xfs_bmap_btalloc() to always
> allocate minlen sized extents.
> 
> This is required for test programs which need a guarantee that minlen
> extents allocated for a file do not get merged with their existing
> neighbours in the inode's BMBT. "Inode fork extent overflow check" for
> Directories, Xattrs and extension of realtime inodes need this since the
> file offset at which the extents are being allocated cannot be
> explicitly controlled from userspace.
> 
> One way to use this error tag is to,
> 1. Consume all of the free space by sequentially writing to a file.
> 2. Punch alternate blocks of the file. This causes CNTBT to contain
>    sufficient number of one block sized extent records.
> 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag.
> After step 3, xfs_bmap_btalloc() will issue space allocation
> requests for minlen sized extents only.

Is step #2 required?  What happens if I only turn the knob?

> ENOSPC error code is returned to userspace when there aren't any "one
> block sized" extents left in any of the AGs.
> 
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> ---
>  fs/xfs/libxfs/xfs_alloc.c    | 46 ++++++++++++++++++++++++++++++++++++
>  fs/xfs/libxfs/xfs_alloc.h    |  1 +
>  fs/xfs/libxfs/xfs_bmap.c     | 26 ++++++++++++++------
>  fs/xfs/libxfs/xfs_errortag.h |  4 +++-
>  fs/xfs/xfs_error.c           |  3 +++
>  5 files changed, 72 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 852b536551b5..d8d8ab1478db 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -2473,6 +2473,45 @@ xfs_defer_agfl_block(
>  	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
>  }
>  
> +STATIC int
> +minlen_freespace_available(

This ought to have an 'xfs_' prefix.

Also, what does this function do?  Does it decide if there's even enough
space to go ahead with a minlen allocation?

> +	struct xfs_alloc_arg	*args,
> +	struct xfs_buf		*agbp,
> +	int			*stat)
> +{
> +	xfs_btree_cur_t		*cnt_cur;

struct xfs_btree_cur	*cnt_cur;

> +	xfs_agblock_t		fbno;
> +	xfs_extlen_t		flen;
> +	int			btree_error = XFS_BTREE_NOERROR;
> +	int			error = 0;
> +
> +	cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, agbp,
> +			args->agno, XFS_BTNUM_CNT);
> +	error = xfs_alloc_lookup_ge(cnt_cur, 0, args->minlen, stat);
> +	if (error) {
> +		btree_error = XFS_BTREE_ERROR;
> +		goto out;
> +	}
> +
> +	ASSERT(*stat == 1);

Is it ok to keep going with stat==0?  Or should we just ... I don't
know?  Bail out with -EFSCORRUPTED?

> +
> +	error = xfs_alloc_get_rec(cnt_cur, &fbno, &flen, stat);
> +	if (error) {
> +		btree_error = XFS_BTREE_ERROR;
> +		goto out;
> +	}
> +
> +	if (flen == args->minlen)
> +		*stat = 1;
> +	else
> +		*stat = 0;
> +
> +out:
> +	xfs_btree_del_cursor(cnt_cur, btree_error);

Note that due to a sloppy quirk of error handling, you can pass @error
to this function, no need for a separate btree_error.

> +
> +	return error;
> +}
> +
>  /*
>   * Decide whether to use this allocation group for this allocation.
>   * If so, fix up the btree freelist's size.
> @@ -2490,6 +2529,7 @@ xfs_alloc_fix_freelist(
>  	struct xfs_alloc_arg	targs;	/* local allocation arguments */
>  	xfs_agblock_t		bno;	/* freelist block */
>  	xfs_extlen_t		need;	/* total blocks needed in freelist */
> +	int			i;
>  	int			error = 0;
>  
>  	/* deferred ops (AGFL block frees) require permanent transactions */
> @@ -2544,6 +2584,12 @@ xfs_alloc_fix_freelist(
>  	if (!xfs_alloc_space_available(args, need, flags))
>  		goto out_agbp_relse;
>  
> +	if (args->alloc_minlen_only) {
> +		error = minlen_freespace_available(args, agbp, &i);
> +		if (error || !i)
> +			goto out_agbp_relse;
> +	}
> +
>  	/*
>  	 * Make the freelist shorter if it's too long.
>  	 *
> diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
> index 6c22b12176b8..1d04089b7fb4 100644
> --- a/fs/xfs/libxfs/xfs_alloc.h
> +++ b/fs/xfs/libxfs/xfs_alloc.h
> @@ -75,6 +75,7 @@ typedef struct xfs_alloc_arg {
>  	char		wasfromfl;	/* set if allocation is from freelist */
>  	struct xfs_owner_info	oinfo;	/* owner of blocks being allocated */
>  	enum xfs_ag_resv_type	resv;	/* block reservation to use */
> +	bool		alloc_minlen_only;
>  } xfs_alloc_arg_t;
>  
>  /*
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 5156cbd476f2..fab4097e7492 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -3510,12 +3510,19 @@ xfs_bmap_btalloc(
>  		ASSERT(ap->length);
>  	}
>  
> +	memset(&args, 0, sizeof(args));
> +
> +	args.alloc_minlen_only = XFS_TEST_ERROR(false, mp,
> +					XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);

Can we just set maxlen = minlen here?

Also, should this debug knob also be applied to rt file allocations?

>  
>  	nullfb = ap->tp->t_firstblock == NULLFSBLOCK;
>  	fb_agno = nullfb ? NULLAGNUMBER : XFS_FSB_TO_AGNO(mp,
>  							ap->tp->t_firstblock);
>  	if (nullfb) {
> -		if ((ap->datatype & XFS_ALLOC_USERDATA) &&
> +		if (args.alloc_minlen_only) {
> +			ag = 0;

Hm, so setting this magic knob also makes everyone fight for space in AG 0?

> +			ap->blkno = XFS_AGB_TO_FSB(mp, ag, 0);
> +		} else if ((ap->datatype & XFS_ALLOC_USERDATA) &&
>  		    xfs_inode_is_filestream(ap->ip)) {
>  			ag = xfs_filestream_lookup_ag(ap->ip);
>  			ag = (ag != NULLAGNUMBER) ? ag : 0;
> @@ -3523,10 +3530,12 @@ xfs_bmap_btalloc(
>  		} else {
>  			ap->blkno = XFS_INO_TO_FSB(mp, ap->ip->i_ino);
>  		}
> -	} else
> +	} else {
>  		ap->blkno = ap->tp->t_firstblock;
> +	}
>  
> -	xfs_bmap_adjacent(ap);
> +	if (!args.alloc_minlen_only)
> +		xfs_bmap_adjacent(ap);
>  
>  	/*
>  	 * If allowed, use ap->blkno; otherwise must use firstblock since
> @@ -3540,7 +3549,6 @@ xfs_bmap_btalloc(
>  	 * Normal allocation, done through xfs_alloc_vextent.
>  	 */
>  	tryagain = isaligned = 0;
> -	memset(&args, 0, sizeof(args));
>  	args.tp = ap->tp;
>  	args.mp = mp;
>  	args.fsbno = ap->blkno;
> @@ -3549,7 +3557,10 @@ xfs_bmap_btalloc(
>  	/* Trim the allocation back to the maximum an AG can fit. */
>  	args.maxlen = min(ap->length, mp->m_ag_max_usable);
>  	blen = 0;
> -	if (nullfb) {
> +	if (args.alloc_minlen_only) {
> +		args.type = XFS_ALLOCTYPE_START_AG;
> +		args.total = args.minlen = args.maxlen = ap->minlen;
> +	} else if (nullfb) {
>  		/*
>  		 * Search for an allocation group with a single extent large
>  		 * enough for the request.  If one isn't found, then adjust
> @@ -3595,7 +3606,8 @@ xfs_bmap_btalloc(
>  	 * is only set if the allocation length is >= the stripe unit and the
>  	 * allocation offset is at the end of file.
>  	 */
> -	if (!(ap->tp->t_flags & XFS_TRANS_LOWMODE) && ap->aeof) {
> +	if (!(ap->tp->t_flags & XFS_TRANS_LOWMODE) && ap->aeof &&
> +		!args.alloc_minlen_only) {
>  		if (!ap->offset) {

Yikes, the conditional lines up with the body!

--D

>  			args.alignment = stripe_align;
>  			atype = args.type;
> @@ -3681,7 +3693,7 @@ xfs_bmap_btalloc(
>  		if ((error = xfs_alloc_vextent(&args)))
>  			return error;
>  	}
> -	if (args.fsbno == NULLFSBLOCK && nullfb) {
> +	if (args.fsbno == NULLFSBLOCK && nullfb && !args.alloc_minlen_only) {
>  		args.fsbno = 0;
>  		args.type = XFS_ALLOCTYPE_FIRST_AG;
>  		args.total = ap->minlen;
> diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
> index 1c56fcceeea6..6ca9084b6934 100644
> --- a/fs/xfs/libxfs/xfs_errortag.h
> +++ b/fs/xfs/libxfs/xfs_errortag.h
> @@ -57,7 +57,8 @@
>  #define XFS_ERRTAG_IUNLINK_FALLBACK			34
>  #define XFS_ERRTAG_BUF_IOERROR				35
>  #define XFS_ERRTAG_REDUCE_MAX_IEXTENTS			36
> -#define XFS_ERRTAG_MAX					37
> +#define XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT		37
> +#define XFS_ERRTAG_MAX					38
>  
>  /*
>   * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
> @@ -99,5 +100,6 @@
>  #define XFS_RANDOM_IUNLINK_FALLBACK			(XFS_RANDOM_DEFAULT/10)
>  #define XFS_RANDOM_BUF_IOERROR				XFS_RANDOM_DEFAULT
>  #define XFS_RANDOM_REDUCE_MAX_IEXTENTS			1
> +#define XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT		1
>  
>  #endif /* __XFS_ERRORTAG_H_ */
> diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
> index 3780b118cc47..028560bb596a 100644
> --- a/fs/xfs/xfs_error.c
> +++ b/fs/xfs/xfs_error.c
> @@ -55,6 +55,7 @@ static unsigned int xfs_errortag_random_default[] = {
>  	XFS_RANDOM_IUNLINK_FALLBACK,
>  	XFS_RANDOM_BUF_IOERROR,
>  	XFS_RANDOM_REDUCE_MAX_IEXTENTS,
> +	XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT,
>  };
>  
>  struct xfs_errortag_attr {
> @@ -166,6 +167,7 @@ XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
>  XFS_ERRORTAG_ATTR_RW(iunlink_fallback,	XFS_ERRTAG_IUNLINK_FALLBACK);
>  XFS_ERRORTAG_ATTR_RW(buf_ioerror,	XFS_ERRTAG_BUF_IOERROR);
>  XFS_ERRORTAG_ATTR_RW(reduce_max_iextents,	XFS_ERRTAG_REDUCE_MAX_IEXTENTS);
> +XFS_ERRORTAG_ATTR_RW(bmap_alloc_minlen_extent, XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
>  
>  static struct attribute *xfs_errortag_attrs[] = {
>  	XFS_ERRORTAG_ATTR_LIST(noerror),
> @@ -205,6 +207,7 @@ static struct attribute *xfs_errortag_attrs[] = {
>  	XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
>  	XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
>  	XFS_ERRORTAG_ATTR_LIST(reduce_max_iextents),
> +	XFS_ERRORTAG_ATTR_LIST(bmap_alloc_minlen_extent),
>  	NULL,
>  };
>  
> -- 
> 2.28.0
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V5 11/12] xfs: Set tp->t_firstblock only once during a transaction's lifetime
  2020-10-06  4:26   ` Darrick J. Wong
@ 2020-10-06  5:17     ` Chandan Babu R
  0 siblings, 0 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-06  5:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david

On Tuesday 6 October 2020 9:56:29 AM IST Darrick J. Wong wrote:
> On Sat, Oct 03, 2020 at 11:26:32AM +0530, Chandan Babu R wrote:
> > tp->t_firstblock is supposed to hold the first fs block allocated by the
> > transaction. There are two cases in the current code base where
> > tp->t_firstblock is assigned a value unconditionally. This commit makes
> > sure that we assign to tp->t_firstblock only if its current value is
> > NULLFSBLOCK.
> 
> Do we hit this currently?  This seems like a regression fix, since I'm
> guessing you hit this fairly soon after adding the next patch and
> twisting the "shatter everything" debug knob it establishes?  And if
> you can hit it there, you could hit this on a severely fragmented fs?

I came across this when I was trying to understand the code flow w.r.t
xfs_bmap_btalloc() => xfs_alloc_vextent() => etc. I noticed that if a
transaction does the following,

1. Satisfy the first allocation request from AG X.
2. Satisfy the second allocation request from AG X+1, since say the second
   allocation request was for a larger minlen value.

... A new space allocation request with minlen equal to what was issued in
step 1 could fail (even though AG X could still have minlen free space)
because step 2 ended up updating tp->t_firstblock to a block from AG X+1 and
hence AG X could never be scanned for free blocks even though the transaction
holds a lock on the corresponding AGF.

This behaviour is most likely true when the "alloc minlen" debug knob
(introduced in the next patch) is enabled. However I didn't execute any
workload on a severely fragmented fs to actually see this behaviour on a
mounted filesystem.
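
The scenario above can be modelled with a small userspace sketch. It assumes
a toy rule in which a transaction only scans AGs whose number is greater than
or equal to the AG recorded in tp->t_firstblock. All toy_* names are
hypothetical and this is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stands in for "tp->t_firstblock == NULLFSBLOCK". */
#define TOY_NULLAG	UINT32_MAX

/* Hypothetical stand-in for the part of the transaction that matters here. */
struct toy_trans {
	uint32_t	first_agno;	/* AG of first allocated block, or TOY_NULLAG */
};

/* Unpatched behaviour: every allocation overwrites the recorded AG. */
static void toy_record_alloc_always(struct toy_trans *tp, uint32_t agno)
{
	tp->first_agno = agno;
}

/* Patched behaviour: only the first allocation is recorded. */
static void toy_record_alloc_once(struct toy_trans *tp, uint32_t agno)
{
	if (tp->first_agno == TOY_NULLAG)
		tp->first_agno = agno;
}

/* Assumed rule: AGs below the recorded first AG are never scanned. */
static bool toy_can_scan_ag(const struct toy_trans *tp, uint32_t agno)
{
	return tp->first_agno == TOY_NULLAG || agno >= tp->first_agno;
}
```

With the "always" variant, allocating from AG X and then from AG X+1 leaves
AG X unreachable for a third request. With the "once" variant, AG X stays
eligible.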

> 
> --D
> 
> > 
> > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > ---
> >  fs/xfs/libxfs/xfs_bmap.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index 51c2d2690f05..5156cbd476f2 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -724,7 +724,8 @@ xfs_bmap_extents_to_btree(
> >  	 */
> >  	ASSERT(tp->t_firstblock == NULLFSBLOCK ||
> >  	       args.agno >= XFS_FSB_TO_AGNO(mp, tp->t_firstblock));
> > -	tp->t_firstblock = args.fsbno;
> > +	if (tp->t_firstblock == NULLFSBLOCK)
> > +		tp->t_firstblock = args.fsbno;
> >  	cur->bc_ino.allocated++;
> >  	ip->i_d.di_nblocks++;
> >  	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, 1L);
> > @@ -875,7 +876,8 @@ xfs_bmap_local_to_extents(
> >  	/* Can't fail, the space was reserved. */
> >  	ASSERT(args.fsbno != NULLFSBLOCK);
> >  	ASSERT(args.len == 1);
> > -	tp->t_firstblock = args.fsbno;
> > +	if (tp->t_firstblock == NULLFSBLOCK)
> > +		tp->t_firstblock = args.fsbno;
> >  	error = xfs_trans_get_buf(tp, args.mp->m_ddev_targp,
> >  			XFS_FSB_TO_DADDR(args.mp, args.fsbno),
> >  			args.mp->m_bsize, 0, &bp);
> 


-- 
chandan




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files
  2020-10-06  4:34   ` Darrick J. Wong
@ 2020-10-06  9:17     ` Chandan Babu R
  2020-10-07  5:09       ` Chandan Babu R
  0 siblings, 1 reply; 26+ messages in thread
From: Chandan Babu R @ 2020-10-06  9:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david

On Tuesday 6 October 2020 10:04:24 AM IST Darrick J. Wong wrote:
> On Sat, Oct 03, 2020 at 11:26:33AM +0530, Chandan Babu R wrote:
> > This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which
> > helps userspace test programs to get xfs_bmap_btalloc() to always
> > allocate minlen sized extents.
> > 
> > This is required for test programs which need a guarantee that minlen
> > extents allocated for a file do not get merged with their existing
> > neighbours in the inode's BMBT. "Inode fork extent overflow check" for
> > Directories, Xattrs and extension of realtime inodes need this since the
> > file offset at which the extents are being allocated cannot be
> > explicitly controlled from userspace.
> > 
> > One way to use this error tag is to,
> > 1. Consume all of the free space by sequentially writing to a file.
> > 2. Punch alternate blocks of the file. This causes CNTBT to contain
> >    sufficient number of one block sized extent records.
> > 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag.
> > After step 3, xfs_bmap_btalloc() will issue space allocation
> > requests for minlen sized extents only.
> 
> Is step #2 required?  What happens if I only turn the knob?

If there are no minlen sized free space extents in the CNTBT, we would return
-ENOSPC to the userspace process. The reason behind forcing allocation of
minlen sized CNTBT records is to make sure that these newly allocated extents
do not get merged with their neighbouring extents in the inode's BMBT. On the
other hand, if we did allow slicing off minlen sized chunks of a larger free
space extent record in the CNTBT, the newly allocated extent records could be
contiguous (w.r.t both disk offset and file offset) with their neighbours in the
BMBT and hence merged, thereby reducing the inode fork extent count. This will
prevent us from writing deterministic "Inode extent count overflow" tests for
Directories, xattrs and realtime inodes.
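
To illustrate the merging concern, here is a minimal sketch of the contiguity
test, assuming a simplified extent record with only offset, block and length.
The struct and function names are hypothetical. The real BMBT merge logic
also checks things such as extent state and maximum extent length, which are
left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified extent record; all units are fs blocks. */
struct toy_extent {
	uint64_t	file_off;	/* starting file offset */
	uint64_t	disk_blk;	/* starting disk block */
	uint64_t	len;		/* length */
};

/*
 * True if @right begins exactly where @left ends, both in the file and
 * on disk.  Only then could the two records collapse into one.
 */
static bool toy_extents_mergeable(const struct toy_extent *left,
				  const struct toy_extent *right)
{
	return left->file_off + left->len == right->file_off &&
	       left->disk_blk + left->len == right->disk_blk;
}
```

Two one-block allocations sliced from the same large free extent would
typically land on adjacent disk blocks and so pass this test. Two allocations
taken from separate one-block free records left behind by punching alternate
blocks would not.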

> > ENOSPC error code is returned to userspace when there aren't any "one
> > block sized" extents left in any of the AGs.
> > 
> > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > ---
> >  fs/xfs/libxfs/xfs_alloc.c    | 46 ++++++++++++++++++++++++++++++++++++
> >  fs/xfs/libxfs/xfs_alloc.h    |  1 +
> >  fs/xfs/libxfs/xfs_bmap.c     | 26 ++++++++++++++------
> >  fs/xfs/libxfs/xfs_errortag.h |  4 +++-
> >  fs/xfs/xfs_error.c           |  3 +++
> >  5 files changed, 72 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > index 852b536551b5..d8d8ab1478db 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.c
> > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > @@ -2473,6 +2473,45 @@ xfs_defer_agfl_block(
> >  	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
> >  }
> >  
> > +STATIC int
> > +minlen_freespace_available(
> 
> This ought to have an 'xfs_' prefix.

Ok. I will fix this up.
> 
> Also, what does this function do?  Does it decide if there's even enough
> space to go ahead with a minlen allocation?

I will come up with a better name for this function. This function checks if
there is a freespace extent record whose length is exactly equal to
args->minlen.
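
Roughly, the check amounts to the following userspace sketch. It assumes the
by-size records are available as an array sorted by ascending length. The
names are hypothetical and the btree cursor handling is replaced by a plain
binary search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one by-size free space record. */
struct toy_free_rec {
	uint32_t	bno;	/* starting block within the AG */
	uint32_t	len;	/* length in blocks */
};

/*
 * @recs must be sorted by ascending length.  Returns 1 if some record is
 * exactly @minlen blocks long, 0 if the smallest record that fits is
 * longer than @minlen, and -1 if no record is at least @minlen long.
 */
static int toy_exact_minlen_available(const struct toy_free_rec *recs,
				      size_t nrecs, uint32_t minlen)
{
	size_t lo = 0, hi = nrecs;

	/* "lookup_ge": find the first record with len >= minlen. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (recs[mid].len < minlen)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == nrecs)
		return -1;
	return recs[lo].len == minlen;
}
```

Because the records are ordered by length, the first record found by the
"greater than or equal" lookup is the only one that needs to be inspected.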

> 
> > +	struct xfs_alloc_arg	*args,
> > +	struct xfs_buf		*agbp,
> > +	int			*stat)
> > +{
> > +	xfs_btree_cur_t		*cnt_cur;
> 
> struct xfs_btree_cur	*cnt_cur;

Sorry, I will fix that up.

> 
> > +	xfs_agblock_t		fbno;
> > +	xfs_extlen_t		flen;
> > +	int			btree_error = XFS_BTREE_NOERROR;
> > +	int			error = 0;
> > +
> > +	cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, agbp,
> > +			args->agno, XFS_BTNUM_CNT);
> > +	error = xfs_alloc_lookup_ge(cnt_cur, 0, args->minlen, stat);
> > +	if (error) {
> > +		btree_error = XFS_BTREE_ERROR;
> > +		goto out;
> > +	}
> > +
> > +	ASSERT(*stat == 1);
> 
> Is it ok to keep going with stat==0?  Or should we just ... I don't
> know?  Bail out with -EFSCORRUPTED?

I think returning with -EFSCORRUPTED is a better option since before
executing the code here, we would have already executed
xfs_alloc_space_available() to make sure that at least minlen free space is
available in the AG whose CNTBT is being traversed. Thanks for the
suggestion.

> 
> > +
> > +	error = xfs_alloc_get_rec(cnt_cur, &fbno, &flen, stat);
> > +	if (error) {
> > +		btree_error = XFS_BTREE_ERROR;
> > +		goto out;
> > +	}
> > +
> > +	if (flen == args->minlen)
> > +		*stat = 1;
> > +	else
> > +		*stat = 0;
> > +
> > +out:
> > +	xfs_btree_del_cursor(cnt_cur, btree_error);
> 
> Note that due to a sloppy quirk of error handling, you can pass @error
> to this function, no need for a separate btree_error.

Ok. Thanks for pointing that out. I will fix this.

> 
> > +
> > +	return error;
> > +}
> > +
> >  /*
> >   * Decide whether to use this allocation group for this allocation.
> >   * If so, fix up the btree freelist's size.
> > @@ -2490,6 +2529,7 @@ xfs_alloc_fix_freelist(
> >  	struct xfs_alloc_arg	targs;	/* local allocation arguments */
> >  	xfs_agblock_t		bno;	/* freelist block */
> >  	xfs_extlen_t		need;	/* total blocks needed in freelist */
> > +	int			i;
> >  	int			error = 0;
> >  
> >  	/* deferred ops (AGFL block frees) require permanent transactions */
> > @@ -2544,6 +2584,12 @@ xfs_alloc_fix_freelist(
> >  	if (!xfs_alloc_space_available(args, need, flags))
> >  		goto out_agbp_relse;
> >  
> > +	if (args->alloc_minlen_only) {
> > +		error = minlen_freespace_available(args, agbp, &i);
> > +		if (error || !i)
> > +			goto out_agbp_relse;
> > +	}
> > +
> >  	/*
> >  	 * Make the freelist shorter if it's too long.
> >  	 *
> > diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
> > index 6c22b12176b8..1d04089b7fb4 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.h
> > +++ b/fs/xfs/libxfs/xfs_alloc.h
> > @@ -75,6 +75,7 @@ typedef struct xfs_alloc_arg {
> >  	char		wasfromfl;	/* set if allocation is from freelist */
> >  	struct xfs_owner_info	oinfo;	/* owner of blocks being allocated */
> >  	enum xfs_ag_resv_type	resv;	/* block reservation to use */
> > +	bool		alloc_minlen_only;
> >  } xfs_alloc_arg_t;
> >  
> >  /*
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index 5156cbd476f2..fab4097e7492 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -3510,12 +3510,19 @@ xfs_bmap_btalloc(
> >  		ASSERT(ap->length);
> >  	}
> >  
> > +	memset(&args, 0, sizeof(args));
> > +
> > +	args.alloc_minlen_only = XFS_TEST_ERROR(false, mp,
> > +					XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
> 
> Can we just set maxlen = minlen here?

I had noticed that xfs_bmap_btalloc() is structured as described below,
1. Compute the appropriate filesystem-wide block number (and hence the AG)
   to start searching for free space extents.
2. Compute xfs_alloc_arg->{type, total, minlen, maxlen}.
3. Compute xfs_alloc_arg->alignment and adjust xfs_alloc_arg->{type, maxlen}
   as required.
4. Invoke xfs_alloc_vextent().

To keep up with the existing code flow, I had set
xfs_alloc_arg->{minlen, maxlen, total} to xfs_bmalloca->minlen at function
location corresponding to step 2.

> 
> Also, should this debug knob also be applied to rt file allocations?

I had missed xfs_bmap_alloc_userdata() => xfs_bmap_rtalloc() sequence. I will
add the error tag to rt file allocations as well. Thanks for pointing that
out.

> 
> >  
> >  	nullfb = ap->tp->t_firstblock == NULLFSBLOCK;
> >  	fb_agno = nullfb ? NULLAGNUMBER : XFS_FSB_TO_AGNO(mp,
> >  							ap->tp->t_firstblock);
> >  	if (nullfb) {
> > -		if ((ap->datatype & XFS_ALLOC_USERDATA) &&
> > +		if (args.alloc_minlen_only) {
> > +			ag = 0;
> 
> Hm, so setting this magic knob also makes everyone fight for space in AG 0?

For the normal use case, each AGF tracks the longest extent via
xfs_agf->agf_longest. When the transaction is allocating its first
extent, xfs_bmap_btalloc_nullfb() loops over each AG until it finds an AG
whose longest extent can be used for allocating xfs_alloc_arg->maxlen free
space extent. 

However, there is no such existing facility for tracking "minimum length"
extent in an AG. This could be done by adding a new member to the in-memory
data structure and initializing the new member by assigning the "length" value
of the leftmost record from CNTBT during xfs_alloc_read_agf(). However I
refrained from doing this since we will never need this on production
machines.

Also, since xfs_alloc_arg->type is being set to XFS_ALLOCTYPE_FIRST_AG later in
the code, AG 0 is just the first AG being scanned for "exact minlen"
extents. We end up looping across remaining AGs if previously searched AGs do
not contain "exact minlen" extents.
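
The AG scan described above could be sketched as follows. This is a toy
model, not the allocator itself: it assumes the caller already knows, for
each AG, the length of the smallest free record that can hold minlen blocks
(0 if there is none). The function name and the array are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * @smallest_fit[agno] is the length of the smallest free space record in
 * that AG with length >= @minlen, or 0 if the AG has no such record.
 * Scans from AG 0 upwards and returns the first AG holding an "exact
 * minlen" record, or -ENOSPC if no AG has one.
 */
static int toy_pick_ag_exact_minlen(const uint32_t *smallest_fit,
				    uint32_t agcount, uint32_t minlen)
{
	for (uint32_t agno = 0; agno < agcount; agno++) {
		if (smallest_fit[agno] == minlen)
			return (int)agno;
	}
	return -ENOSPC;
}
```

AG 0 is only the starting point of the loop. An AG that lacks an exact-length
record is skipped rather than having a larger record split.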

> 
> > +			ap->blkno = XFS_AGB_TO_FSB(mp, ag, 0);
> > +		} else if ((ap->datatype & XFS_ALLOC_USERDATA) &&
> >  		    xfs_inode_is_filestream(ap->ip)) {
> >  			ag = xfs_filestream_lookup_ag(ap->ip);
> >  			ag = (ag != NULLAGNUMBER) ? ag : 0;
> > @@ -3523,10 +3530,12 @@ xfs_bmap_btalloc(
> >  		} else {
> >  			ap->blkno = XFS_INO_TO_FSB(mp, ap->ip->i_ino);
> >  		}
> > -	} else
> > +	} else {
> >  		ap->blkno = ap->tp->t_firstblock;
> > +	}
> >  
> > -	xfs_bmap_adjacent(ap);
> > +	if (!args.alloc_minlen_only)
> > +		xfs_bmap_adjacent(ap);
> >  
> >  	/*
> >  	 * If allowed, use ap->blkno; otherwise must use firstblock since
> > @@ -3540,7 +3549,6 @@ xfs_bmap_btalloc(
> >  	 * Normal allocation, done through xfs_alloc_vextent.
> >  	 */
> >  	tryagain = isaligned = 0;
> > -	memset(&args, 0, sizeof(args));
> >  	args.tp = ap->tp;
> >  	args.mp = mp;
> >  	args.fsbno = ap->blkno;
> > @@ -3549,7 +3557,10 @@ xfs_bmap_btalloc(
> >  	/* Trim the allocation back to the maximum an AG can fit. */
> >  	args.maxlen = min(ap->length, mp->m_ag_max_usable);
> >  	blen = 0;
> > -	if (nullfb) {
> > +	if (args.alloc_minlen_only) {
> > +		args.type = XFS_ALLOCTYPE_START_AG;
> > +		args.total = args.minlen = args.maxlen = ap->minlen;
> > +	} else if (nullfb) {
> >  		/*
> >  		 * Search for an allocation group with a single extent large
> >  		 * enough for the request.  If one isn't found, then adjust
> > @@ -3595,7 +3606,8 @@ xfs_bmap_btalloc(
> >  	 * is only set if the allocation length is >= the stripe unit and the
> >  	 * allocation offset is at the end of file.
> >  	 */
> > -	if (!(ap->tp->t_flags & XFS_TRANS_LOWMODE) && ap->aeof) {
> > +	if (!(ap->tp->t_flags & XFS_TRANS_LOWMODE) && ap->aeof &&
> > +		!args.alloc_minlen_only) {
> >  		if (!ap->offset) {
> 
> Yikes, the conditional lines up with the body!

Sorry, I will fix this.

> 
> --D
> 
> >  			args.alignment = stripe_align;
> >  			atype = args.type;
> > @@ -3681,7 +3693,7 @@ xfs_bmap_btalloc(
> >  		if ((error = xfs_alloc_vextent(&args)))
> >  			return error;
> >  	}
> > -	if (args.fsbno == NULLFSBLOCK && nullfb) {
> > +	if (args.fsbno == NULLFSBLOCK && nullfb && !args.alloc_minlen_only) {
> >  		args.fsbno = 0;
> >  		args.type = XFS_ALLOCTYPE_FIRST_AG;
> >  		args.total = ap->minlen;
> > diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
> > index 1c56fcceeea6..6ca9084b6934 100644
> > --- a/fs/xfs/libxfs/xfs_errortag.h
> > +++ b/fs/xfs/libxfs/xfs_errortag.h
> > @@ -57,7 +57,8 @@
> >  #define XFS_ERRTAG_IUNLINK_FALLBACK			34
> >  #define XFS_ERRTAG_BUF_IOERROR				35
> >  #define XFS_ERRTAG_REDUCE_MAX_IEXTENTS			36
> > -#define XFS_ERRTAG_MAX					37
> > +#define XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT		37
> > +#define XFS_ERRTAG_MAX					38
> >  
> >  /*
> >   * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
> > @@ -99,5 +100,6 @@
> >  #define XFS_RANDOM_IUNLINK_FALLBACK			(XFS_RANDOM_DEFAULT/10)
> >  #define XFS_RANDOM_BUF_IOERROR				XFS_RANDOM_DEFAULT
> >  #define XFS_RANDOM_REDUCE_MAX_IEXTENTS			1
> > +#define XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT		1
> >  
> >  #endif /* __XFS_ERRORTAG_H_ */
> > diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
> > index 3780b118cc47..028560bb596a 100644
> > --- a/fs/xfs/xfs_error.c
> > +++ b/fs/xfs/xfs_error.c
> > @@ -55,6 +55,7 @@ static unsigned int xfs_errortag_random_default[] = {
> >  	XFS_RANDOM_IUNLINK_FALLBACK,
> >  	XFS_RANDOM_BUF_IOERROR,
> >  	XFS_RANDOM_REDUCE_MAX_IEXTENTS,
> > +	XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT,
> >  };
> >  
> >  struct xfs_errortag_attr {
> > @@ -166,6 +167,7 @@ XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
> >  XFS_ERRORTAG_ATTR_RW(iunlink_fallback,	XFS_ERRTAG_IUNLINK_FALLBACK);
> >  XFS_ERRORTAG_ATTR_RW(buf_ioerror,	XFS_ERRTAG_BUF_IOERROR);
> >  XFS_ERRORTAG_ATTR_RW(reduce_max_iextents,	XFS_ERRTAG_REDUCE_MAX_IEXTENTS);
> > +XFS_ERRORTAG_ATTR_RW(bmap_alloc_minlen_extent, XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
> >  
> >  static struct attribute *xfs_errortag_attrs[] = {
> >  	XFS_ERRORTAG_ATTR_LIST(noerror),
> > @@ -205,6 +207,7 @@ static struct attribute *xfs_errortag_attrs[] = {
> >  	XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
> >  	XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
> >  	XFS_ERRORTAG_ATTR_LIST(reduce_max_iextents),
> > +	XFS_ERRORTAG_ATTR_LIST(bmap_alloc_minlen_extent),
> >  	NULL,
> >  };
> >  
> 


-- 
chandan




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V5 04/12] xfs: Check for extent overflow when adding/removing xattrs
  2020-10-06  4:23   ` Darrick J. Wong
@ 2020-10-06  9:21     ` Chandan Babu R
  0 siblings, 0 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-06  9:21 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david

On Tuesday 6 October 2020 9:53:29 AM IST Darrick J. Wong wrote:
> On Sat, Oct 03, 2020 at 11:26:25AM +0530, Chandan Babu R wrote:
> > Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to be
> > added. One extra extent for dabtree in case a local attr is large enough
> > to cause a double split.  It can also cause extent count to increase
> > proportional to the size of a remote xattr's value.
> > 
> > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> 
> Didn't I already review this?  AFAICT it hasn't changed much, but did
> something change enough to warrant dropping the old RVB tag?

Yes, you had reviewed it earlier. Sorry, I missed out on adding the RVB before
sending the patch.

> 
> > ---
> >  fs/xfs/libxfs/xfs_attr.c       | 13 +++++++++++++
> >  fs/xfs/libxfs/xfs_inode_fork.h | 10 ++++++++++
> >  2 files changed, 23 insertions(+)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> > index fd8e6418a0d3..be51e7068dcd 100644
> > --- a/fs/xfs/libxfs/xfs_attr.c
> > +++ b/fs/xfs/libxfs/xfs_attr.c
> > @@ -396,6 +396,7 @@ xfs_attr_set(
> >  	struct xfs_trans_res	tres;
> >  	bool			rsvd = (args->attr_filter & XFS_ATTR_ROOT);
> >  	int			error, local;
> > +	int			rmt_blks = 0;
> >  	unsigned int		total;
> >  
> >  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
> > @@ -442,11 +443,15 @@ xfs_attr_set(
> >  		tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
> >  		tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
> >  		total = args->total;
> > +
> > +		if (!local)
> > +			rmt_blks = xfs_attr3_rmt_blocks(mp, args->valuelen);
> >  	} else {
> >  		XFS_STATS_INC(mp, xs_attr_remove);
> >  
> >  		tres = M_RES(mp)->tr_attrrm;
> >  		total = XFS_ATTRRM_SPACE_RES(mp);
> > +		rmt_blks = xfs_attr3_rmt_blocks(mp, XFS_XATTR_SIZE_MAX);
> >  	}
> >  
> >  	/*
> > @@ -460,6 +465,14 @@ xfs_attr_set(
> >  
> >  	xfs_ilock(dp, XFS_ILOCK_EXCL);
> >  	xfs_trans_ijoin(args->trans, dp, 0);
> > +
> > +	if (args->value || xfs_inode_hasattr(dp)) {
> > +		error = xfs_iext_count_may_overflow(dp, XFS_ATTR_FORK,
> > +				XFS_IEXT_ATTR_MANIP_CNT(rmt_blks));
> > +		if (error)
> > +			goto out_trans_cancel;
> 
> Hmm.  If you hit this while trying to remove an xattr, what then?
> I suppose you really don't want to overflow naextents, but I suppose the
> only other option is to delete the file.  Oh well, attr forks with 65533
> extents should be vanishingly rare, right?  Right? :)

Yes, deleting the corresponding file would be the only option. If we did allow
this operation to succeed we would end up having a silent corruption of the
attr extent counter.
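
As a worked example of the arithmetic, the sketch below mirrors the shape of
XFS_IEXT_ATTR_MANIP_CNT() and of the overflow check, and shows how an
unchecked count wraps in a 16-bit field. It is a userspace toy. The value 5
for XFS_DA_NODE_MAXDEPTH, the -EFBIG return code and the caller-supplied
limit are assumptions, and the toy_* names are hypothetical. The 98511 and
-32561 figures are the ones from the cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_DA_NODE_MAXDEPTH	5	/* assumed value of XFS_DA_NODE_MAXDEPTH */

/* Same shape as XFS_IEXT_ATTR_MANIP_CNT(rmt_blks) in the patch. */
static int toy_attr_manip_cnt(int rmt_blks)
{
	return TOY_DA_NODE_MAXDEPTH + (rmt_blks > 1 ? rmt_blks : 1);
}

/* Returns -EFBIG (assumed) if the new count would exceed @max_exts. */
static int toy_iext_count_may_overflow(uint32_t nextents, uint32_t nr_to_add,
				       uint32_t max_exts)
{
	/* Widen before adding so the sum itself cannot wrap. */
	uint64_t total = (uint64_t)nextents + nr_to_add;

	return total > max_exts ? -EFBIG : 0;
}

/* Value seen when a count is stored unchecked in 16 bits and read as signed. */
static int toy_wrapped_16bit(uint32_t nextents)
{
	uint16_t raw = (uint16_t)nextents;	/* keeps the low 16 bits */

	return raw >= 0x8000 ? (int)raw - 0x10000 : (int)raw;
}
```

Here toy_wrapped_16bit(98511) gives -32561, matching the core.naextents value
printed by xfs_db in the cover letter. Rejecting the operation up front
avoids writing such a value to disk.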

> 
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> --D
> 
> > +	}
> > +
> >  	if (args->value) {
> >  		unsigned int	quota_flags = XFS_QMOPT_RES_REGBLKS;
> >  
> > diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> > index bcac769a7df6..5de2f07d0dd5 100644
> > --- a/fs/xfs/libxfs/xfs_inode_fork.h
> > +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> > @@ -47,6 +47,16 @@ struct xfs_ifork {
> >   */
> >  #define XFS_IEXT_PUNCH_HOLE_CNT		(1)
> >  
> > +/*
> > + * Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to
> > + * be added. One extra extent for dabtree in case a local attr is
> > + * large enough to cause a double split.  It can also cause extent
> > + * count to increase proportional to the size of a remote xattr's
> > + * value.
> > + */
> > +#define XFS_IEXT_ATTR_MANIP_CNT(rmt_blks) \
> > +	(XFS_DA_NODE_MAXDEPTH + max(1, rmt_blks))
> > +
> >  /*
> >   * Fork handling.
> >   */
> 


-- 
chandan




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files
  2020-10-06  9:17     ` Chandan Babu R
@ 2020-10-07  5:09       ` Chandan Babu R
  0 siblings, 0 replies; 26+ messages in thread
From: Chandan Babu R @ 2020-10-07  5:09 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, david

On Tuesday 6 October 2020 2:47:02 PM IST Chandan Babu R wrote:
> On Tuesday 6 October 2020 10:04:24 AM IST Darrick J. Wong wrote:
> > On Sat, Oct 03, 2020 at 11:26:33AM +0530, Chandan Babu R wrote:
> > > This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which
> > > helps userspace test programs to get xfs_bmap_btalloc() to always
> > > allocate minlen sized extents.
> > > 
> > > This is required for test programs which need a guarantee that minlen
> > > extents allocated for a file do not get merged with their existing
> > > neighbours in the inode's BMBT. "Inode fork extent overflow check" for
> > > Directories, Xattrs and extension of realtime inodes need this since the
> > > file offset at which the extents are being allocated cannot be
> > > explicitly controlled from userspace.
> > > 
> > > One way to use this error tag is to,
> > > 1. Consume all of the free space by sequentially writing to a file.
> > > 2. Punch alternate blocks of the file. This causes CNTBT to contain
> > >    sufficient number of one block sized extent records.
> > > 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag.
> > > After step 3, xfs_bmap_btalloc() will issue space allocation
> > > requests for minlen sized extents only.
> > 
> > Is step #2 required?  What happens if I only turn the knob?
> 
> If there are no minlen sized free space extents in the CNTBT, we would return
> -ENOSPC to the userspace process. The reason behind forcing allocation of
> minlen sized CNTBT records is to make sure that these newly allocated extents
> do not get merged with their neighbouring extents in the inode's BMBT. On the
> other hand, if we did allow slicing off minlen sized chunks of a larger free
> space extent record in the CNTBT, the newly allocated extent records could be
> contiguous (w.r.t both disk offset and file offset) with their neighbours in the
> BMBT and hence merged, thereby reducing the inode fork extent count. This will
> prevent us from writing deterministic "Inode extent count overflow" tests for
> Directories, xattrs and realtime inodes.
> 
> > > ENOSPC error code is returned to userspace when there aren't any "one
> > > block sized" extents left in any of the AGs.
> > > 
> > > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_alloc.c    | 46 ++++++++++++++++++++++++++++++++++++
> > >  fs/xfs/libxfs/xfs_alloc.h    |  1 +
> > >  fs/xfs/libxfs/xfs_bmap.c     | 26 ++++++++++++++------
> > >  fs/xfs/libxfs/xfs_errortag.h |  4 +++-
> > >  fs/xfs/xfs_error.c           |  3 +++
> > >  5 files changed, 72 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > index 852b536551b5..d8d8ab1478db 100644
> > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > @@ -2473,6 +2473,45 @@ xfs_defer_agfl_block(
> > >  	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
> > >  }
> > >  
> > > +STATIC int
> > > +minlen_freespace_available(
> > 
> > This ought to have an 'xfs_' prefix.
> 
> Ok. I will fix this up.
> > 
> > Also, what does this function do?  Does it decide if there's even enough
> > space to go ahead with a minlen allocation?
> 
> I will come up with a better name for this function. This function checks if
> there is a freespace extent record whose length is exactly equal to
> args->minlen.
> 
> > 
> > > +	struct xfs_alloc_arg	*args,
> > > +	struct xfs_buf		*agbp,
> > > +	int			*stat)
> > > +{
> > > +	xfs_btree_cur_t		*cnt_cur;
> > 
> > struct xfs_btree_cur	*cnt_cur;
> 
> Sorry, I will fix that up.
> 
> > 
> > > +	xfs_agblock_t		fbno;
> > > +	xfs_extlen_t		flen;
> > > +	int			btree_error = XFS_BTREE_NOERROR;
> > > +	int			error = 0;
> > > +
> > > +	cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, agbp,
> > > +			args->agno, XFS_BTNUM_CNT);
> > > +	error = xfs_alloc_lookup_ge(cnt_cur, 0, args->minlen, stat);
> > > +	if (error) {
> > > +		btree_error = XFS_BTREE_ERROR;
> > > +		goto out;
> > > +	}
> > > +
> > > +	ASSERT(*stat == 1);
> > 
> > Is it ok to keep going with stat==0?  Or should we just ... I don't
> > know?  Bail out with -EFSCORRUPTED?
> 
> I think returning with -EFSCORRUPTED is a better option since before
> executing the code here, we would have already executed
> xfs_alloc_space_available() to make sure that at least minlen free space is
> available in the AG whose CNTBT is being traversed. Thanks for the
> suggestion.
> 
> > 
> > > +
> > > +	error = xfs_alloc_get_rec(cnt_cur, &fbno, &flen, stat);
> > > +	if (error) {
> > > +		btree_error = XFS_BTREE_ERROR;
> > > +		goto out;
> > > +	}
> > > +
> > > +	if (flen == args->minlen)
> > > +		*stat = 1;
> > > +	else
> > > +		*stat = 0;
> > > +
> > > +out:
> > > +	xfs_btree_del_cursor(cnt_cur, btree_error);
> > 
> > Note that due to a sloppy quirk of error handling, you can pass @error
> > to this function, no need for a separate btree_error.
> 
> Ok. Thanks for pointing that out. I will fix this.
> 
> > 
> > > +
> > > +	return error;
> > > +}
> > > +
> > >  /*
> > >   * Decide whether to use this allocation group for this allocation.
> > >   * If so, fix up the btree freelist's size.
> > > @@ -2490,6 +2529,7 @@ xfs_alloc_fix_freelist(
> > >  	struct xfs_alloc_arg	targs;	/* local allocation arguments */
> > >  	xfs_agblock_t		bno;	/* freelist block */
> > >  	xfs_extlen_t		need;	/* total blocks needed in freelist */
> > > +	int			i;
> > >  	int			error = 0;
> > >  
> > >  	/* deferred ops (AGFL block frees) require permanent transactions */
> > > @@ -2544,6 +2584,12 @@ xfs_alloc_fix_freelist(
> > >  	if (!xfs_alloc_space_available(args, need, flags))
> > >  		goto out_agbp_relse;
> > >  
> > > +	if (args->alloc_minlen_only) {
> > > +		error = minlen_freespace_available(args, agbp, &i);
> > > +		if (error || !i)
> > > +			goto out_agbp_relse;
> > > +	}
> > > +
> > >  	/*
> > >  	 * Make the freelist shorter if it's too long.
> > >  	 *
> > > diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
> > > index 6c22b12176b8..1d04089b7fb4 100644
> > > --- a/fs/xfs/libxfs/xfs_alloc.h
> > > +++ b/fs/xfs/libxfs/xfs_alloc.h
> > > @@ -75,6 +75,7 @@ typedef struct xfs_alloc_arg {
> > >  	char		wasfromfl;	/* set if allocation is from freelist */
> > >  	struct xfs_owner_info	oinfo;	/* owner of blocks being allocated */
> > >  	enum xfs_ag_resv_type	resv;	/* block reservation to use */
> > > +	bool		alloc_minlen_only;
> > >  } xfs_alloc_arg_t;
> > >  
> > >  /*
> > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > > index 5156cbd476f2..fab4097e7492 100644
> > > --- a/fs/xfs/libxfs/xfs_bmap.c
> > > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > > @@ -3510,12 +3510,19 @@ xfs_bmap_btalloc(
> > >  		ASSERT(ap->length);
> > >  	}
> > >  
> > > +	memset(&args, 0, sizeof(args));
> > > +
> > > +	args.alloc_minlen_only = XFS_TEST_ERROR(false, mp,
> > > +					XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
> > 
> > Can we just set maxlen = minlen here?
> 
> I had noticed that xfs_bmap_btalloc() is structured as described below,
> 1. Compute the appropriate filesystem-wide block number (and hence the AG)
>    to start searching for free space extents.
> 2. Compute xfs_alloc_arg->{type, total, minlen, maxlen}.
> 3. Compute xfs_alloc_arg->alignment and adjust xfs_alloc_arg->{type, maxlen}
>    as required.
> 4. Invoke xfs_alloc_vextent().
> 
> To keep up with the existing code flow, I had set
> xfs_alloc_arg->{minlen, maxlen, total} to xfs_bmalloca->minlen at function
> location corresponding to step 2.
> 
> > 
> > Also, should this debug knob also be applied to rt file allocations?
> 
> I had missed xfs_bmap_alloc_userdata() => xfs_bmap_rtalloc() sequence. I will
> add the error tag to rt file allocations as well. Thanks for pointing that
> out.

Actually the debug knob is not required for rt file allocations because they
take the same path as direct i/o writes and hence a userspace test program
could control the file offsets at which writes take place in order to prevent
neighbouring extents from getting merged into a single one. An example test
program is given below,

# realtime file
add_nosplit_5_iext_count_overflow_check()
{
        umount $dev

        mkfs.xfs -f -K -d size=${fssize} -r rtdev=${rtdev} -m reflink=0,rmapbt=0 $dev || \
                { echo "Unable to mkfs.xfs $dev"; exit 1; }

        mount -o rtdev=${rtdev} $dev $mntpnt || { echo "Unable to mount $dev"; exit 1; }

        testfile=${mntpnt}/testfile

        nr_blks=$((15 * 2))

        xfs_io -x -c 'inject reduce_max_iextents' $mntpnt

        for i in $(seq 0 2 $(($nr_blks - 1))); do
                xfs_io -Rf -c "pwrite $(($i * $bsize)) $bsize" -c fsync $testfile > /dev/null 2>&1
                [[ $? != 0 ]] && { echo "Failed to write at block $i"; break; }
        done

        ls -i $testfile
        # Make sure that this is a realtime file
        xfs_io -c 'lsattr' $testfile
        xfs_io -f -c "fiemap" $testfile | grep -i -v hole
}

In the above script, we write at non-contiguous file offsets and hence this is
sufficient to guarantee that the resulting file extents do not get merged with
their neighbours.
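
The extent count produced by such a write pattern can be estimated with the
sketch below. It assumes the most merge-friendly case, in which any two
blocks that are adjacent in file offset end up in the same extent. The
function name is hypothetical.

```c
#include <assert.h>

/*
 * Writes one block at file blocks 0, stride, 2 * stride, ... below
 * @nr_blks and returns the number of extents that remain if every pair
 * of file-offset-adjacent blocks is merged.
 */
static unsigned int toy_count_extents(unsigned int nr_blks, unsigned int stride)
{
	unsigned int extents = 0;
	unsigned int prev_end = 0;	/* file block just past the last write */
	int have_prev = 0;

	if (stride == 0)
		return 0;	/* not a valid write pattern */

	for (unsigned int blk = 0; blk < nr_blks; blk += stride) {
		/* A gap before this block starts a new extent. */
		if (!have_prev || blk != prev_end)
			extents++;
		prev_end = blk + 1;
		have_prev = 1;
	}
	return extents;
}
```

For the script above, nr_blks is 30 and the stride is 2, so
toy_count_extents(30, 2) yields 15 extents, one per written block. A stride
of 1 could collapse to a single extent in this model.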

> 
> > 
> > >  
> > >  	nullfb = ap->tp->t_firstblock == NULLFSBLOCK;
> > >  	fb_agno = nullfb ? NULLAGNUMBER : XFS_FSB_TO_AGNO(mp,
> > >  							ap->tp->t_firstblock);
> > >  	if (nullfb) {
> > > -		if ((ap->datatype & XFS_ALLOC_USERDATA) &&
> > > +		if (args.alloc_minlen_only) {
> > > +			ag = 0;
> > 
> > Hm, so setting this magic knob also makes everyone fight for space in AG 0?
> 
> For the normal use case, each AGF tracks the longest extent via
> xfs_agf->agf_longest. When the transaction is allocating its first
> extent, xfs_bmap_btalloc_nullfb() loops over each AG until it finds an AG
> whose longest extent can be used for allocating xfs_alloc_arg->maxlen free
> space extent. 
> 
> However, there is no such existing facility for tracking "minimum length"
> extent in an AG. This could be done by adding a new member to the in-memory
> data structure and initializing the new member by assigning the "length" value
> of the leftmost record from CNTBT during xfs_alloc_read_agf(). However I
> refrained from doing this since we will never need this on production
> machines.
> 
> Also, since xfs_alloc_arg->type is being set to XFS_ALLOCTYPE_FIRST_AG later in
> the code, AG 0 is just the first AG being scanned for "exact minlen"
> extents. We end up looping across remaining AGs if previously searched AGs do
> not contain "exact minlen" extents.
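
To make the scan order described above concrete, here is a toy C sketch.
Everything in it (struct sketch_ag, sketch_ag_has_exact(),
sketch_find_minlen_ag(), SKETCH_NULLAG) is hypothetical and only models the
behaviour; it is not the allocator code from the patch. It assumes each AG
can be summarized as a list of free extent lengths.

```c
#include <stddef.h>

#define SKETCH_NULLAG	(-1)	/* hypothetical "no AG found" marker */

/* Hypothetical per-AG summary: lengths of the free extents in that AG. */
struct sketch_ag {
	const unsigned int *free_lens;
	size_t nr_free;
};

/* Return 1 if the AG has a free extent of exactly `minlen` blocks. */
static int sketch_ag_has_exact(const struct sketch_ag *ag, unsigned int minlen)
{
	for (size_t i = 0; i < ag->nr_free; i++)
		if (ag->free_lens[i] == minlen)
			return 1;
	return 0;
}

/*
 * Scan the AGs in order, starting from AG 0, and return the index of the
 * first one holding an exact-minlen free extent. AG 0 gets no special
 * treatment beyond being looked at first. Returns SKETCH_NULLAG if no AG
 * qualifies.
 */
static int sketch_find_minlen_ag(const struct sketch_ag *ags, size_t nr_ags,
				 unsigned int minlen)
{
	for (size_t agno = 0; agno < nr_ags; agno++)
		if (sketch_ag_has_exact(&ags[agno], minlen))
			return (int)agno;
	return SKETCH_NULLAG;
}
```

In this model, if AG 0 has no free extent of the requested length, the search
simply moves on to AG 1 and so on, so allocations are not confined to AG 0.
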
> 
> > 
> > > +			ap->blkno = XFS_AGB_TO_FSB(mp, ag, 0);
> > > +		} else if ((ap->datatype & XFS_ALLOC_USERDATA) &&
> > >  		    xfs_inode_is_filestream(ap->ip)) {
> > >  			ag = xfs_filestream_lookup_ag(ap->ip);
> > >  			ag = (ag != NULLAGNUMBER) ? ag : 0;
> > > @@ -3523,10 +3530,12 @@ xfs_bmap_btalloc(
> > >  		} else {
> > >  			ap->blkno = XFS_INO_TO_FSB(mp, ap->ip->i_ino);
> > >  		}
> > > -	} else
> > > +	} else {
> > >  		ap->blkno = ap->tp->t_firstblock;
> > > +	}
> > >  
> > > -	xfs_bmap_adjacent(ap);
> > > +	if (!args.alloc_minlen_only)
> > > +		xfs_bmap_adjacent(ap);
> > >  
> > >  	/*
> > >  	 * If allowed, use ap->blkno; otherwise must use firstblock since
> > > @@ -3540,7 +3549,6 @@ xfs_bmap_btalloc(
> > >  	 * Normal allocation, done through xfs_alloc_vextent.
> > >  	 */
> > >  	tryagain = isaligned = 0;
> > > -	memset(&args, 0, sizeof(args));
> > >  	args.tp = ap->tp;
> > >  	args.mp = mp;
> > >  	args.fsbno = ap->blkno;
> > > @@ -3549,7 +3557,10 @@ xfs_bmap_btalloc(
> > >  	/* Trim the allocation back to the maximum an AG can fit. */
> > >  	args.maxlen = min(ap->length, mp->m_ag_max_usable);
> > >  	blen = 0;
> > > -	if (nullfb) {
> > > +	if (args.alloc_minlen_only) {
> > > +		args.type = XFS_ALLOCTYPE_START_AG;
> > > +		args.total = args.minlen = args.maxlen = ap->minlen;
> > > +	} else if (nullfb) {
> > >  		/*
> > >  		 * Search for an allocation group with a single extent large
> > >  		 * enough for the request.  If one isn't found, then adjust
> > > @@ -3595,7 +3606,8 @@ xfs_bmap_btalloc(
> > >  	 * is only set if the allocation length is >= the stripe unit and the
> > >  	 * allocation offset is at the end of file.
> > >  	 */
> > > -	if (!(ap->tp->t_flags & XFS_TRANS_LOWMODE) && ap->aeof) {
> > > +	if (!(ap->tp->t_flags & XFS_TRANS_LOWMODE) && ap->aeof &&
> > > +		!args.alloc_minlen_only) {
> > >  		if (!ap->offset) {
> > 
> > Yikes, the conditional lines up with the body!
> 
> Sorry, I will fix this.
> 
> > 
> > --D
> > 
> > >  			args.alignment = stripe_align;
> > >  			atype = args.type;
> > > @@ -3681,7 +3693,7 @@ xfs_bmap_btalloc(
> > >  		if ((error = xfs_alloc_vextent(&args)))
> > >  			return error;
> > >  	}
> > > -	if (args.fsbno == NULLFSBLOCK && nullfb) {
> > > +	if (args.fsbno == NULLFSBLOCK && nullfb && !args.alloc_minlen_only) {
> > >  		args.fsbno = 0;
> > >  		args.type = XFS_ALLOCTYPE_FIRST_AG;
> > >  		args.total = ap->minlen;
> > > diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
> > > index 1c56fcceeea6..6ca9084b6934 100644
> > > --- a/fs/xfs/libxfs/xfs_errortag.h
> > > +++ b/fs/xfs/libxfs/xfs_errortag.h
> > > @@ -57,7 +57,8 @@
> > >  #define XFS_ERRTAG_IUNLINK_FALLBACK			34
> > >  #define XFS_ERRTAG_BUF_IOERROR				35
> > >  #define XFS_ERRTAG_REDUCE_MAX_IEXTENTS			36
> > > -#define XFS_ERRTAG_MAX					37
> > > +#define XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT		37
> > > +#define XFS_ERRTAG_MAX					38
> > >  
> > >  /*
> > >   * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
> > > @@ -99,5 +100,6 @@
> > >  #define XFS_RANDOM_IUNLINK_FALLBACK			(XFS_RANDOM_DEFAULT/10)
> > >  #define XFS_RANDOM_BUF_IOERROR				XFS_RANDOM_DEFAULT
> > >  #define XFS_RANDOM_REDUCE_MAX_IEXTENTS			1
> > > +#define XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT		1
> > >  
> > >  #endif /* __XFS_ERRORTAG_H_ */
> > > diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
> > > index 3780b118cc47..028560bb596a 100644
> > > --- a/fs/xfs/xfs_error.c
> > > +++ b/fs/xfs/xfs_error.c
> > > @@ -55,6 +55,7 @@ static unsigned int xfs_errortag_random_default[] = {
> > >  	XFS_RANDOM_IUNLINK_FALLBACK,
> > >  	XFS_RANDOM_BUF_IOERROR,
> > >  	XFS_RANDOM_REDUCE_MAX_IEXTENTS,
> > > +	XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT,
> > >  };
> > >  
> > >  struct xfs_errortag_attr {
> > > @@ -166,6 +167,7 @@ XFS_ERRORTAG_ATTR_RW(bad_summary,	XFS_ERRTAG_FORCE_SUMMARY_RECALC);
> > >  XFS_ERRORTAG_ATTR_RW(iunlink_fallback,	XFS_ERRTAG_IUNLINK_FALLBACK);
> > >  XFS_ERRORTAG_ATTR_RW(buf_ioerror,	XFS_ERRTAG_BUF_IOERROR);
> > >  XFS_ERRORTAG_ATTR_RW(reduce_max_iextents,	XFS_ERRTAG_REDUCE_MAX_IEXTENTS);
> > > +XFS_ERRORTAG_ATTR_RW(bmap_alloc_minlen_extent, XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
> > >  
> > >  static struct attribute *xfs_errortag_attrs[] = {
> > >  	XFS_ERRORTAG_ATTR_LIST(noerror),
> > > @@ -205,6 +207,7 @@ static struct attribute *xfs_errortag_attrs[] = {
> > >  	XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
> > >  	XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
> > >  	XFS_ERRORTAG_ATTR_LIST(reduce_max_iextents),
> > > +	XFS_ERRORTAG_ATTR_LIST(bmap_alloc_minlen_extent),
> > >  	NULL,
> > >  };
> > >  
> > 
> 
> 
> 

-- 
chandan





end of thread, other threads:[~2020-10-07  5:09 UTC | newest]

Thread overview: 26+ messages
2020-10-03  5:56 [PATCH V5 00/12] Bail out if transaction can cause extent count to overflow Chandan Babu R
2020-10-03  5:56 ` [PATCH V5 01/12] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
2020-10-03  5:56 ` [PATCH V5 02/12] xfs: Check for extent overflow when trivally adding a new extent Chandan Babu R
2020-10-06  4:18   ` Darrick J. Wong
2020-10-03  5:56 ` [PATCH V5 03/12] xfs: Check for extent overflow when punching a hole Chandan Babu R
2020-10-06  4:18   ` Darrick J. Wong
2020-10-03  5:56 ` [PATCH V5 04/12] xfs: Check for extent overflow when adding/removing xattrs Chandan Babu R
2020-10-06  4:23   ` Darrick J. Wong
2020-10-06  9:21     ` Chandan Babu R
2020-10-03  5:56 ` [PATCH V5 05/12] xfs: Check for extent overflow when adding/removing dir entries Chandan Babu R
2020-10-03  5:56 ` [PATCH V5 06/12] xfs: Check for extent overflow when writing to unwritten extent Chandan Babu R
2020-10-03  5:56 ` [PATCH V5 07/12] xfs: Check for extent overflow when moving extent from cow to data fork Chandan Babu R
2020-10-03  5:56 ` [PATCH V5 08/12] xfs: Check for extent overflow when remapping an extent Chandan Babu R
2020-10-03  5:56 ` [PATCH V5 09/12] xfs: Check for extent overflow when swapping extents Chandan Babu R
2020-10-06  4:23   ` Darrick J. Wong
2020-10-03  5:56 ` [PATCH V5 10/12] xfs: Introduce error injection to reduce maximum inode fork extent count Chandan Babu R
2020-10-06  4:24   ` Darrick J. Wong
2020-10-03  5:56 ` [PATCH V5 11/12] xfs: Set tp->t_firstblock only once during a transaction's lifetime Chandan Babu R
2020-10-06  4:26   ` Darrick J. Wong
2020-10-06  5:17     ` Chandan Babu R
2020-10-03  5:56 ` [PATCH V5 12/12] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R
2020-10-06  4:25   ` Chandan Babu R
2020-10-06  4:27     ` Darrick J. Wong
2020-10-06  4:34   ` Darrick J. Wong
2020-10-06  9:17     ` Chandan Babu R
2020-10-07  5:09       ` Chandan Babu R
