* [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow
@ 2021-01-04 10:31 Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 01/14] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
` (13 more replies)
0 siblings, 14 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
XFS does not check for possible overflow of per-inode extent counter
fields when adding extents to either data or attr fork.
For example:
1. Insert 5 million xattrs (each having a value size of 255 bytes) and
then delete 50% of them in an alternating manner.
2. On a 4k block sized XFS filesystem instance, the above causes 98511
extents to be created in the attr fork of the inode.
xfsaild/loop0 2008 [003] 1475.127209: probe:xfs_inode_to_disk: (ffffffffa43fb6b0) if_nextents=98511 i_ino=131
3. The incore inode fork extent counter is a signed 32-bit
quantity. However, the on-disk extent counter is an unsigned 16-bit
quantity and hence cannot hold 98511 extents.
4. The following incorrect value is stored in the xattr extent counter,
# xfs_db -f -c 'inode 131' -c 'print core.naextents' /dev/loop0
core.naextents = -32561
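The arithmetic behind the value printed above can be checked with a small
userspace sketch. This is an illustration only and not kernel code; the
function name is hypothetical. It assumes the counter keeps only the low 16
bits of the in-core value and that xfs_db displays those bits as a signed
two's-complement number.

```c
#include <stdint.h>

/*
 * Illustration only (hypothetical helper, not part of XFS): the value a
 * 16-bit counter ends up showing when a larger in-core extent count is
 * stored into it.
 */
int32_t
truncated_16bit_view(uint32_t incore_nextents)
{
	/* Only the low 16 bits survive the store. */
	uint32_t low16 = incore_nextents & 0xffffu;

	/* Reinterpret those bits as a two's-complement signed value. */
	if (low16 >= 0x8000u)
		return (int32_t)low16 - 0x10000;
	return (int32_t)low16;
}
```

Under those assumptions, truncated_16bit_view(98511) evaluates to 98511 -
65536 = 32975, which has the sign bit set and therefore reads back as 32975 -
65536 = -32561.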
This patchset adds a new helper function
(i.e. xfs_iext_count_may_overflow()) to check for overflow of the
per-inode data and xattr extent counters and invokes it before
starting an fs operation (e.g. creating a new directory entry). With
this patchset applied, XFS detects counter overflows and returns with
an error rather than causing a silent corruption.
The patchset has been tested by executing xfstests with the following
mkfs.xfs options,
1. -m crc=0 -b size=1k
2. -m crc=0 -b size=4k
3. -m crc=0 -b size=512
4. -m rmapbt=1,reflink=1 -b size=1k
5. -m rmapbt=1,reflink=1 -b size=4k
The patches can also be obtained from
https://github.com/chandanr/linux.git at branch xfs-reserve-extent-count-v12.
I have two patches that define the newly introduced error injection
tags in xfsprogs
(https://lore.kernel.org/linux-xfs/20201104114900.172147-1-chandanrlinux@gmail.com/).
I have also written tests
(https://github.com/chandanr/xfstests/commits/extent-overflow-tests)
for verifying the checks introduced in the kernel.
Changelog:
V11 -> V12:
1. Rebase patches on top of Linux v5.11-rc1.
2. Revert to using a pseudo max inode extent count of 10.
Hence the patches
- [PATCH V12 05/14] xfs: Check for extent overflow when adding/removing xattrs
- [PATCH V12 10/14] xfs: Introduce error injection to reduce maximum
inode fork extent count
have been reverted (retaining the corresponding RVB tags) to how
they were in V10 of the patchset.
V11 of the patchset had increased the pseudo max extent count to
35 to allow the "directory entry remove" operation to always
succeed. However, the corresponding logic was incorrect. Please
refer to "[PATCH V12 04/14] xfs: Check for extent overflow when
adding/removing dir entries" for the newer logic and its
explanation.
"[PATCH V12 04/14] xfs: Check for extent overflow when
adding/removing dir entries" is the only patch yet to be reviewed.
V10 -> V11:
1. For directory/xattr insert operations we now reserve a sufficient
number of extents in the extent count so as to guarantee that a future
directory/xattr remove operation succeeds.
2. The pseudo max extent count value has been increased to 35.
V9 -> V10:
1. Pull back changes which cause xfs_bmap_compute_alignments() to
return "stripe alignment" into 12th patch i.e. "xfs: Compute bmap
extent alignments in a separate function".
V8 -> V9:
1. Enabling XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag will
always allocate single block sized free extents (if
available).
2. xfs_bmap_compute_alignments() now returns stripe alignment as its
return value.
3. Dropped Allison's RVB tag for "xfs: Compute bmap extent
alignments in a separate function" and "xfs: Introduce error
injection to allocate only minlen size extents for files".
V7 -> V8:
1. Rename local variable in xfs_alloc_fix_freelist() from "i" to "stat".
V6 -> V7:
1. Create new function xfs_bmap_exact_minlen_extent_alloc() (enabled
only when CONFIG_XFS_DEBUG is set to y) which issues allocation
requests for minlen sized extents only. In order to achieve this,
common code from xfs_bmap_btalloc() has been refactored into new
functions.
2. All major functions implementing logic associated with
XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag are compiled only
when CONFIG_XFS_DEBUG is set to y.
3. Remove XFS_IEXT_REFLINK_REMAP_CNT macro and replace it with an
integer which holds the number of new extents to be
added to the data fork.
V5 -> V6:
1. Rebased the patchset on xfs-linux/for-next branch.
2. Drop "xfs: Set tp->t_firstblock only once during a transaction's
lifetime" patch from the patchset.
3. Add a comment to xfs_bmap_btalloc() describing why it was chosen
to start "free space extent search" from AG 0 when
XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT is enabled and when the
transaction is allocating its first extent.
4. Fix review comments associated with coding style.
V4 -> V5:
1. Introduce new error tag XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT to
let user space programs guarantee that free space
requests for files are satisfied by allocating minlen sized
extents.
2. Change xfs_bmap_btalloc() and xfs_alloc_vextent() to allocate
minlen sized extents when XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT is
enabled.
3. Introduce a new patch that causes tp->t_firstblock to be assigned
to a value only when its previous value is NULLFSBLOCK.
4. Replace the previously introduced MAXERRTAGEXTNUM (maximum inode
fork extent count) with the hardcoded value of 10.
5. xfs_bui_item_recover(): Use XFS_IEXT_ADD_NOSPLIT_CNT when mapping
an extent.
6. xfs_swap_extent_rmap(): Use xfs_bmap_is_real_extent() instead of
xfs_bmap_is_update_needed() to assess if the extent really needs
to be swapped.
V3 -> V4:
1. Introduce new patch which lets userspace programs test "extent
count overflow detection" by injecting an error tag. The new
error tag reduces the maximum allowed extent count to 10.
2. Injecting the newly defined error tag prevents
xfs_bmap_add_extent_hole_real() from merging a new extent with
its neighbours to allow writing deterministic tests for testing
extent count overflow for Directories, Xattr and growing realtime
devices. This is required because the new extent being allocated
can be contiguous with its neighbours (w.r.t both file and disk
offsets).
3. Injecting the newly defined error tag forces block sized extents
to be allocated for summary/bitmap files when growing a realtime
device. This is required because xfs_growfs_rt_alloc() allocates
as large an extent as possible for summary/bitmap files and hence
it would be impossible to write deterministic tests.
4. Rename XFS_IEXT_REMOVE_CNT to XFS_IEXT_PUNCH_HOLE_CNT to reflect
the actual meaning of the fs operation.
5. Fold XFS_IEXT_INSERT_HOLE_CNT code into that associated with
XFS_IEXT_PUNCH_HOLE_CNT since both perform the same job.
6. xfs_swap_extent_rmap(): Check for extent overflow should be made
on the source file only if the donor file extent has a valid
on-disk mapping and vice versa.
V2 -> V3:
1. Move the definition of xfs_iext_count_may_overflow() from
libxfs/xfs_trans_resv.c to libxfs/xfs_inode_fork.c. Also, I tried
to make xfs_iext_count_may_overflow() an inline function by
placing the definition in libxfs/xfs_inode_fork.h. However this
required that the definition of 'struct xfs_inode' be available,
since xfs_iext_count_may_overflow() uses a 'struct xfs_inode *'
type variable.
2. Handle XFS_COW_FORK within xfs_iext_count_may_overflow() by
returning a success value.
3. Rename XFS_IEXT_ADD_CNT to XFS_IEXT_ADD_NOSPLIT_CNT. Thanks to
Darrick for suggesting the new name.
4. Expand comments to make use of 80 columns.
V1 -> V2:
1. Rename helper function from xfs_trans_resv_ext_cnt() to
xfs_iext_count_may_overflow().
2. Define and use macros to represent fs operations and the
corresponding increase in extent count.
3. Split the patches based on the fs operation being performed.
Chandan Babu R (14):
xfs: Add helper for checking per-inode extent count overflow
xfs: Check for extent overflow when trivally adding a new extent
xfs: Check for extent overflow when punching a hole
xfs: Check for extent overflow when adding/removing dir entries
xfs: Check for extent overflow when adding/removing xattrs
xfs: Check for extent overflow when writing to unwritten extent
xfs: Check for extent overflow when moving extent from cow to data
fork
xfs: Check for extent overflow when remapping an extent
xfs: Check for extent overflow when swapping extents
xfs: Introduce error injection to reduce maximum inode fork extent
count
xfs: Remove duplicate assert statement in xfs_bmap_btalloc()
xfs: Compute bmap extent alignments in a separate function
xfs: Process allocated extent in a separate function
xfs: Introduce error injection to allocate only minlen size extents
for files
fs/xfs/libxfs/xfs_alloc.c | 50 ++++++
fs/xfs/libxfs/xfs_alloc.h | 3 +
fs/xfs/libxfs/xfs_attr.c | 13 ++
fs/xfs/libxfs/xfs_bmap.c | 279 ++++++++++++++++++++++++---------
fs/xfs/libxfs/xfs_errortag.h | 6 +-
fs/xfs/libxfs/xfs_inode_fork.c | 27 ++++
fs/xfs/libxfs/xfs_inode_fork.h | 63 ++++++++
fs/xfs/xfs_bmap_item.c | 10 ++
fs/xfs/xfs_bmap_util.c | 31 ++++
fs/xfs/xfs_dquot.c | 8 +-
fs/xfs/xfs_error.c | 6 +
fs/xfs/xfs_inode.c | 45 ++++++
fs/xfs/xfs_iomap.c | 10 ++
fs/xfs/xfs_reflink.c | 16 ++
fs/xfs/xfs_rtalloc.c | 5 +
fs/xfs/xfs_symlink.c | 5 +
16 files changed, 499 insertions(+), 78 deletions(-)
--
2.29.2
* [PATCH V12 01/14] xfs: Add helper for checking per-inode extent count overflow
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 02/14] xfs: Check for extent overflow when trivally adding a new extent Chandan Babu R
` (12 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
XFS does not check for possible overflow of per-inode extent counter
fields when adding extents to either data or attr fork.
For example:
1. Insert 5 million xattrs (each having a value size of 255 bytes) and
then delete 50% of them in an alternating manner.
2. On a 4k block sized XFS filesystem instance, the above causes 98511
extents to be created in the attr fork of the inode.
xfsaild/loop0 2008 [003] 1475.127209: probe:xfs_inode_to_disk: (ffffffffa43fb6b0) if_nextents=98511 i_ino=131
3. The incore inode fork extent counter is a signed 32-bit
quantity. However the on-disk extent counter is an unsigned 16-bit
quantity and hence cannot hold 98511 extents.
4. The following incorrect value is stored in the attr extent counter,
# xfs_db -f -c 'inode 131' -c 'print core.naextents' /dev/loop0
core.naextents = -32561
This commit adds a new helper function (i.e.
xfs_iext_count_may_overflow()) to check for overflow of the per-inode
data and xattr extent counters. Future patches will use this function to
make sure that an FS operation won't cause the extent counter to
overflow.
Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
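For readers who want to experiment with the boundary conditions outside the
kernel, the check can be modelled in userspace roughly as below. This is a
sketch under stated assumptions and not the patch itself: the enum, the
MODEL_* names and the function name are hypothetical, the extent count is
passed in directly instead of being read from a struct xfs_ifork, and the
limits are assumed to be 0x7fffffff for the data fork and 0x7fff for the attr
fork.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's fork identifiers. */
enum model_fork { MODEL_DATA_FORK, MODEL_ATTR_FORK, MODEL_COW_FORK };

/* Assumed limits: signed 32-bit (data fork) and signed 16-bit (attr fork). */
#define MODEL_MAXEXTNUM		((uint64_t)0x7fffffff)
#define MODEL_MAXAEXTNUM	((uint64_t)0x7fff)

/*
 * Userspace model of the overflow check: returns 0 when nr_to_add more
 * extents still fit in the fork's counter, -EFBIG otherwise.
 */
int
model_iext_count_may_overflow(uint64_t nextents, enum model_fork whichfork,
			      uint32_t nr_to_add)
{
	uint64_t max_exts;
	uint64_t nr_exts;

	/* The CoW fork has no on-disk extent counter to overflow. */
	if (whichfork == MODEL_COW_FORK)
		return 0;

	max_exts = (whichfork == MODEL_ATTR_FORK) ?
			MODEL_MAXAEXTNUM : MODEL_MAXEXTNUM;

	/* Reject both arithmetic wraparound and exceeding the fork limit. */
	nr_exts = nextents + nr_to_add;
	if (nr_exts < nextents || nr_exts > max_exts)
		return -EFBIG;

	return 0;
}
```

With these assumed limits, an attr fork holding 32766 extents can take one
more, while one holding 32767 (or the 98510 that precede the failure case in
the commit message) cannot.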
fs/xfs/libxfs/xfs_inode_fork.c | 23 +++++++++++++++++++++++
fs/xfs/libxfs/xfs_inode_fork.h | 2 ++
2 files changed, 25 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 7575de5cecb1..8d48716547e5 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -23,6 +23,7 @@
#include "xfs_da_btree.h"
#include "xfs_dir2_priv.h"
#include "xfs_attr_leaf.h"
+#include "xfs_types.h"
kmem_zone_t *xfs_ifork_zone;
@@ -728,3 +729,25 @@ xfs_ifork_verify_local_attr(
return 0;
}
+
+int
+xfs_iext_count_may_overflow(
+ struct xfs_inode *ip,
+ int whichfork,
+ int nr_to_add)
+{
+ struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork);
+ uint64_t max_exts;
+ uint64_t nr_exts;
+
+ if (whichfork == XFS_COW_FORK)
+ return 0;
+
+ max_exts = (whichfork == XFS_ATTR_FORK) ? MAXAEXTNUM : MAXEXTNUM;
+
+ nr_exts = ifp->if_nextents + nr_to_add;
+ if (nr_exts < ifp->if_nextents || nr_exts > max_exts)
+ return -EFBIG;
+
+ return 0;
+}
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index a4953e95c4f3..0beb8e2a00be 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -172,5 +172,7 @@ extern void xfs_ifork_init_cow(struct xfs_inode *ip);
int xfs_ifork_verify_local_data(struct xfs_inode *ip);
int xfs_ifork_verify_local_attr(struct xfs_inode *ip);
+int xfs_iext_count_may_overflow(struct xfs_inode *ip, int whichfork,
+ int nr_to_add);
#endif /* __XFS_INODE_FORK_H__ */
--
2.29.2
* [PATCH V12 02/14] xfs: Check for extent overflow when trivally adding a new extent
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 01/14] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 03/14] xfs: Check for extent overflow when punching a hole Chandan Babu R
` (11 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
When adding a new data extent (without modifying an inode's existing
extents) the extent count increases only by 1. This commit checks for
extent count overflow in such cases.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_bmap.c | 6 ++++++
fs/xfs/libxfs/xfs_inode_fork.h | 6 ++++++
fs/xfs/xfs_bmap_item.c | 7 +++++++
fs/xfs/xfs_bmap_util.c | 5 +++++
fs/xfs/xfs_dquot.c | 8 +++++++-
fs/xfs/xfs_iomap.c | 5 +++++
fs/xfs/xfs_rtalloc.c | 5 +++++
7 files changed, 41 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index bc446418e227..32aeacf6f055 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4527,6 +4527,12 @@ xfs_bmapi_convert_delalloc(
return error;
xfs_ilock(ip, XFS_ILOCK_EXCL);
+
+ error = xfs_iext_count_may_overflow(ip, whichfork,
+ XFS_IEXT_ADD_NOSPLIT_CNT);
+ if (error)
+ goto out_trans_cancel;
+
xfs_trans_ijoin(tp, ip, 0);
if (!xfs_iext_lookup_extent(ip, ifp, offset_fsb, &bma.icur, &bma.got) ||
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 0beb8e2a00be..7fc2b129a2e7 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -34,6 +34,12 @@ struct xfs_ifork {
#define XFS_IFEXTENTS 0x02 /* All extent pointers are read in */
#define XFS_IFBROOT 0x04 /* i_broot points to the bmap b-tree root */
+/*
+ * Worst-case increase in the fork extent count when we're adding a single
+ * extent to a fork and there's no possibility of splitting an existing mapping.
+ */
+#define XFS_IEXT_ADD_NOSPLIT_CNT (1)
+
/*
* Fork handling.
*/
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 93e4d8ae6e92..0534304ed0a7 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -508,6 +508,13 @@ xfs_bui_item_recover(
xfs_ilock(ip, XFS_ILOCK_EXCL);
xfs_trans_ijoin(tp, ip, 0);
+ if (bui_type == XFS_BMAP_MAP) {
+ error = xfs_iext_count_may_overflow(ip, whichfork,
+ XFS_IEXT_ADD_NOSPLIT_CNT);
+ if (error)
+ goto err_cancel;
+ }
+
count = bmap->me_len;
error = xfs_trans_log_finish_bmap_update(tp, budp, bui_type, ip,
whichfork, bmap->me_startoff, bmap->me_startblock,
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 7371a7f7c652..db44bfaabe88 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -822,6 +822,11 @@ xfs_alloc_file_space(
if (error)
goto error1;
+ error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+ XFS_IEXT_ADD_NOSPLIT_CNT);
+ if (error)
+ goto error0;
+
xfs_trans_ijoin(tp, ip, 0);
error = xfs_bmapi_write(tp, ip, startoffset_fsb,
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 1d95ed387d66..175f544f7c45 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -314,8 +314,14 @@ xfs_dquot_disk_alloc(
return -ESRCH;
}
- /* Create the block mapping. */
xfs_trans_ijoin(tp, quotip, XFS_ILOCK_EXCL);
+
+ error = xfs_iext_count_may_overflow(quotip, XFS_DATA_FORK,
+ XFS_IEXT_ADD_NOSPLIT_CNT);
+ if (error)
+ return error;
+
+ /* Create the block mapping. */
error = xfs_bmapi_write(tp, quotip, dqp->q_fileoffset,
XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, 0, &map,
&nmaps);
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 7b9ff824e82d..f53690febb22 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -250,6 +250,11 @@ xfs_iomap_write_direct(
if (error)
goto out_trans_cancel;
+ error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+ XFS_IEXT_ADD_NOSPLIT_CNT);
+ if (error)
+ goto out_trans_cancel;
+
xfs_trans_ijoin(tp, ip, 0);
/*
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index b4999fb01ff7..161b0e8992ba 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -804,6 +804,11 @@ xfs_growfs_rt_alloc(
xfs_ilock(ip, XFS_ILOCK_EXCL);
xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
+ error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+ XFS_IEXT_ADD_NOSPLIT_CNT);
+ if (error)
+ goto out_trans_cancel;
+
/*
* Allocate blocks to the bitmap file.
*/
--
2.29.2
* [PATCH V12 03/14] xfs: Check for extent overflow when punching a hole
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 01/14] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 02/14] xfs: Check for extent overflow when trivally adding a new extent Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries Chandan Babu R
` (10 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
The extent mapping the file offset at which a hole has to be
inserted will be split into two extents causing extent count to
increase by 1.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
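The "increase by 1" claim in the commit message can be illustrated with a
simplified userspace model. The struct and function below are hypothetical
and ignore everything a real extent record carries besides offset and length;
they only count how many mapped pieces of a single extent remain after a
range is punched out of it.

```c
#include <stdint.h>

/* Hypothetical, simplified extent: [start, start + len) in file blocks. */
struct model_extent {
	uint64_t start;
	uint64_t len;
};

/*
 * Number of pieces of 'ext' that remain mapped after the range
 * [hole_start, hole_start + hole_len) is punched out of it.
 */
int
model_pieces_after_punch(struct model_extent ext, uint64_t hole_start,
			 uint64_t hole_len)
{
	uint64_t ext_end = ext.start + ext.len;
	uint64_t hole_end = hole_start + hole_len;
	int pieces = 0;

	/* Hole does not overlap the extent: it survives intact. */
	if (hole_len == 0 || hole_end <= ext.start || hole_start >= ext_end)
		return 1;

	if (hole_start > ext.start)
		pieces++;	/* left remainder */
	if (hole_end < ext_end)
		pieces++;	/* right remainder */

	return pieces;
}
```

Only a hole strictly inside the extent leaves two pieces, so the extent count
grows by at most one per punched extent, which is the worst case that
XFS_IEXT_PUNCH_HOLE_CNT accounts for.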
fs/xfs/libxfs/xfs_inode_fork.h | 7 +++++++
fs/xfs/xfs_bmap_item.c | 15 +++++++++------
fs/xfs/xfs_bmap_util.c | 10 ++++++++++
3 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 7fc2b129a2e7..bcac769a7df6 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -40,6 +40,13 @@ struct xfs_ifork {
*/
#define XFS_IEXT_ADD_NOSPLIT_CNT (1)
+/*
+ * Punching out an extent from the middle of an existing extent can cause the
+ * extent count to increase by 1.
+ * i.e. | Old extent | Hole | Old extent |
+ */
+#define XFS_IEXT_PUNCH_HOLE_CNT (1)
+
/*
* Fork handling.
*/
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 0534304ed0a7..2344757ede63 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -471,6 +471,7 @@ xfs_bui_item_recover(
xfs_exntst_t state;
unsigned int bui_type;
int whichfork;
+ int iext_delta;
int error = 0;
if (!xfs_bui_validate(mp, buip)) {
@@ -508,12 +509,14 @@ xfs_bui_item_recover(
xfs_ilock(ip, XFS_ILOCK_EXCL);
xfs_trans_ijoin(tp, ip, 0);
- if (bui_type == XFS_BMAP_MAP) {
- error = xfs_iext_count_may_overflow(ip, whichfork,
- XFS_IEXT_ADD_NOSPLIT_CNT);
- if (error)
- goto err_cancel;
- }
+ if (bui_type == XFS_BMAP_MAP)
+ iext_delta = XFS_IEXT_ADD_NOSPLIT_CNT;
+ else
+ iext_delta = XFS_IEXT_PUNCH_HOLE_CNT;
+
+ error = xfs_iext_count_may_overflow(ip, whichfork, iext_delta);
+ if (error)
+ goto err_cancel;
count = bmap->me_len;
error = xfs_trans_log_finish_bmap_update(tp, budp, bui_type, ip,
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index db44bfaabe88..6ac7a6ac2658 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -891,6 +891,11 @@ xfs_unmap_extent(
xfs_trans_ijoin(tp, ip, 0);
+ error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+ XFS_IEXT_PUNCH_HOLE_CNT);
+ if (error)
+ goto out_trans_cancel;
+
error = xfs_bunmapi(tp, ip, startoffset_fsb, len_fsb, 0, 2, done);
if (error)
goto out_trans_cancel;
@@ -1168,6 +1173,11 @@ xfs_insert_file_space(
xfs_ilock(ip, XFS_ILOCK_EXCL);
xfs_trans_ijoin(tp, ip, 0);
+ error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+ XFS_IEXT_PUNCH_HOLE_CNT);
+ if (error)
+ goto out_trans_cancel;
+
/*
* The extent shifting code works on extent granularity. So, if stop_fsb
* is not the starting block of extent, we need to split the extent at
--
2.29.2
* [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (2 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 03/14] xfs: Check for extent overflow when punching a hole Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-08 1:17 ` Darrick J. Wong
2021-01-04 10:31 ` [PATCH V12 05/14] xfs: Check for extent overflow when adding/removing xattrs Chandan Babu R
` (9 subsequent siblings)
13 siblings, 1 reply; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
Directory entry addition can cause the following,
1. Data block can be added/removed.
A new extent can cause extent count to increase by 1.
2. Free disk block can be added/removed.
Same behaviour as described above for Data block.
3. Dabtree blocks.
XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these
can be new extents. Hence extent count can increase by
XFS_DA_NODE_MAXDEPTH.
Directory entry remove and rename (applicable only to the source
directory entry) operations are handled specially to allow them to
succeed when few extents remain available, i.e.
xfs_bmap_del_extent_real() will now return -ENOSPC when a possible
extent count overflow is detected. -ENOSPC is already handled by higher
layers of XFS as follows,
1. Empty Data/Free space index blocks are left to linger around until a
future remove operation frees them.
2. Dabtree blocks are swapped with the last block in the leaf space,
followed by unmapping of the new last block.
Also, an extent overflow check is performed for the target directory entry
of the rename operation only when the entry does not exist and a
non-zero space reservation is obtained successfully.
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
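As a worked example of the reservation computed by XFS_IEXT_DIR_MANIP_CNT(),
the sketch below evaluates the same expression in userspace. The names are
hypothetical, and the value 5 for XFS_DA_NODE_MAXDEPTH is an assumption about
the kernel headers of this era. The argument stands in for
(mp)->m_dir_geo->fsbcount, i.e. the number of filesystem blocks per directory
block.

```c
/* Assumed value of XFS_DA_NODE_MAXDEPTH; check the kernel headers. */
#define MODEL_DA_NODE_MAXDEPTH	5

/*
 * Worst-case extent count increase for one directory entry operation:
 * one data block, one free block and up to MAXDEPTH dabtree blocks, each
 * spanning dir_fsbcount filesystem blocks that may all be discontiguous.
 */
unsigned int
model_iext_dir_manip_cnt(unsigned int dir_fsbcount)
{
	return (MODEL_DA_NODE_MAXDEPTH + 1 + 1) * dir_fsbcount;
}
```

Under that assumption the check reserves 7 extents when a directory block is
a single filesystem block, and 28 when a directory block spans four
filesystem blocks.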
fs/xfs/libxfs/xfs_bmap.c | 15 ++++++++++++
fs/xfs/libxfs/xfs_inode_fork.h | 13 ++++++++++
fs/xfs/xfs_inode.c | 45 ++++++++++++++++++++++++++++++++++
fs/xfs/xfs_symlink.c | 5 ++++
4 files changed, 78 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 32aeacf6f055..5fd804534e67 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5151,6 +5151,21 @@ xfs_bmap_del_extent_real(
/*
* Deleting the middle of the extent.
*/
+
+ /*
+ * For directories, -ENOSPC will be handled by higher layers of
+ * XFS by letting the corresponding empty Data/Free blocks to
+ * linger around until a future remove operation. Dabtree blocks
+ * would be swapped with the last block in the leaf space and
+ * then the new last block will be unmapped.
+ */
+ if (S_ISDIR(VFS_I(ip)->i_mode) &&
+ whichfork == XFS_DATA_FORK &&
+ xfs_iext_count_may_overflow(ip, whichfork, 1)) {
+ error = -ENOSPC;
+ goto done;
+ }
+
old = got;
got.br_blockcount = del->br_startoff - got.br_startoff;
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index bcac769a7df6..ea1a9dd8a763 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -47,6 +47,19 @@ struct xfs_ifork {
*/
#define XFS_IEXT_PUNCH_HOLE_CNT (1)
+/*
+ * Directory entry addition can cause the following,
+ * 1. Data block can be added/removed.
+ * A new extent can cause extent count to increase by 1.
+ * 2. Free disk block can be added/removed.
+ * Same behaviour as described above for Data block.
+ * 3. Dabtree blocks.
+ * XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new
+ * extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH.
+ */
+#define XFS_IEXT_DIR_MANIP_CNT(mp) \
+ ((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount)
+
/*
* Fork handling.
*/
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index b7352bc4c815..0db21368c7e1 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1042,6 +1042,11 @@ xfs_create(
if (error)
goto out_trans_cancel;
+ error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
+ XFS_IEXT_DIR_MANIP_CNT(mp));
+ if (error)
+ goto out_trans_cancel;
+
/*
* A newly created regular or special file just has one directory
* entry pointing to them, but a directory also the "." entry
@@ -1258,6 +1263,11 @@ xfs_link(
xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL);
xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL);
+ error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK,
+ XFS_IEXT_DIR_MANIP_CNT(mp));
+ if (error)
+ goto error_return;
+
/*
* If we are using project inheritance, we only allow hard link
* creation in our tree when the project IDs are the same; else
@@ -3106,6 +3116,35 @@ xfs_rename(
/*
* Check for expected errors before we dirty the transaction
* so we can return an error without a transaction abort.
+ *
+ * Extent count overflow check:
+ *
+ * From the perspective of src_dp, a rename operation is essentially a
+ * directory entry remove operation. Hence the only place where we check
+ * for extent count overflow for src_dp is in
+ * xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns
+ * -ENOSPC when it detects a possible extent count overflow and in
+ * response, the higher layers of directory handling code do the
+ * following:
+ * 1. Data/Free blocks: XFS lets these blocks linger around until a
+ * future remove operation removes them.
+ * 2. Dabtree blocks: XFS swaps the blocks with the last block in the
+ * Leaf space and unmaps the last block.
+ *
+ * For target_dp, there are two cases depending on whether the
+ * destination directory entry exists or not.
+ *
+ * When destination directory entry does not exist (i.e. target_ip ==
+ * NULL), extent count overflow check is performed only when transaction
+ * has a non-zero sized space reservation associated with it. With a
+ * zero-sized space reservation, XFS allows a rename operation to
+ * continue only when the directory has sufficient free space in its
+ * data/leaf/free space blocks to hold the new entry.
+ *
+ * When destination directory entry exists (i.e. target_ip != NULL), all
+ * we need to do is change the inode number associated with the already
+ * existing entry. Hence there is no need to perform an extent count
+ * overflow check.
*/
if (target_ip == NULL) {
/*
@@ -3116,6 +3155,12 @@ xfs_rename(
error = xfs_dir_canenter(tp, target_dp, target_name);
if (error)
goto out_trans_cancel;
+ } else {
+ error = xfs_iext_count_may_overflow(target_dp,
+ XFS_DATA_FORK,
+ XFS_IEXT_DIR_MANIP_CNT(mp));
+ if (error)
+ goto out_trans_cancel;
}
} else {
/*
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 1f43fd7f3209..0b8136a32484 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -220,6 +220,11 @@ xfs_symlink(
if (error)
goto out_trans_cancel;
+ error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
+ XFS_IEXT_DIR_MANIP_CNT(mp));
+ if (error)
+ goto out_trans_cancel;
+
/*
* Allocate an inode for the symlink.
*/
--
2.29.2
* [PATCH V12 05/14] xfs: Check for extent overflow when adding/removing xattrs
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (3 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 06/14] xfs: Check for extent overflow when writing to unwritten extent Chandan Babu R
` (8 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to be
added. One extra extent is needed for the dabtree in case a local attr
is large enough to cause a double split. The operation can also cause the
extent count to increase in proportion to the size of a remote xattr's
value.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
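The expression behind XFS_IEXT_ATTR_MANIP_CNT() can be evaluated in userspace
as sketched below. The names are hypothetical, the value 5 for
XFS_DA_NODE_MAXDEPTH is an assumption, and rmt_blks is taken as a plain input
rather than being derived from xfs_attr3_rmt_blocks().

```c
/* Assumed value of XFS_DA_NODE_MAXDEPTH; check the kernel headers. */
#define MODEL_DA_NODE_MAXDEPTH	5

/*
 * Worst-case extent count increase for one xattr operation: up to
 * MAXDEPTH dabtree extents, plus either one extra dabtree extent (local
 * value, double split) or one extent per remote value block.
 */
int
model_iext_attr_manip_cnt(int rmt_blks)
{
	int extra = (rmt_blks > 1) ? rmt_blks : 1;

	return MODEL_DA_NODE_MAXDEPTH + extra;
}
```

With that assumption, a local attribute (rmt_blks of 0) reserves 6 extents,
and a remote value occupying, say, 16 blocks reserves 21.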
fs/xfs/libxfs/xfs_attr.c | 13 +++++++++++++
fs/xfs/libxfs/xfs_inode_fork.h | 10 ++++++++++
2 files changed, 23 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index fd8e6418a0d3..be51e7068dcd 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -396,6 +396,7 @@ xfs_attr_set(
struct xfs_trans_res tres;
bool rsvd = (args->attr_filter & XFS_ATTR_ROOT);
int error, local;
+ int rmt_blks = 0;
unsigned int total;
if (XFS_FORCED_SHUTDOWN(dp->i_mount))
@@ -442,11 +443,15 @@ xfs_attr_set(
tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
total = args->total;
+
+ if (!local)
+ rmt_blks = xfs_attr3_rmt_blocks(mp, args->valuelen);
} else {
XFS_STATS_INC(mp, xs_attr_remove);
tres = M_RES(mp)->tr_attrrm;
total = XFS_ATTRRM_SPACE_RES(mp);
+ rmt_blks = xfs_attr3_rmt_blocks(mp, XFS_XATTR_SIZE_MAX);
}
/*
@@ -460,6 +465,14 @@ xfs_attr_set(
xfs_ilock(dp, XFS_ILOCK_EXCL);
xfs_trans_ijoin(args->trans, dp, 0);
+
+ if (args->value || xfs_inode_hasattr(dp)) {
+ error = xfs_iext_count_may_overflow(dp, XFS_ATTR_FORK,
+ XFS_IEXT_ATTR_MANIP_CNT(rmt_blks));
+ if (error)
+ goto out_trans_cancel;
+ }
+
if (args->value) {
unsigned int quota_flags = XFS_QMOPT_RES_REGBLKS;
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index ea1a9dd8a763..8d89838e23f8 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -60,6 +60,16 @@ struct xfs_ifork {
#define XFS_IEXT_DIR_MANIP_CNT(mp) \
((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount)
+/*
+ * Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to
+ * be added. One extra extent for dabtree in case a local attr is
+ * large enough to cause a double split. It can also cause extent
+ * count to increase proportional to the size of a remote xattr's
+ * value.
+ */
+#define XFS_IEXT_ATTR_MANIP_CNT(rmt_blks) \
+ (XFS_DA_NODE_MAXDEPTH + max(1, rmt_blks))
+
/*
* Fork handling.
*/
--
2.29.2
* [PATCH V12 06/14] xfs: Check for extent overflow when writing to unwritten extent
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (4 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 05/14] xfs: Check for extent overflow when adding/removing xattrs Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 07/14] xfs: Check for extent overflow when moving extent from cow to data fork Chandan Babu R
` (7 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
A write to a sub-interval of an existing unwritten extent causes
the original extent to be split into 3 extents
i.e. | Unwritten | Real | Unwritten |
Hence extent count can increase by 2.
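The split described above can be modelled in userspace. This is only a
minimal sketch: struct ext, convert_range() and convert_delta() are
hypothetical names that do not exist in the kernel, and merging with
neighbouring extents is ignored. It illustrates why a write to the middle
of an unwritten extent raises the extent count by 2, while a write that
touches one or both ends of the extent raises it by less.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a file extent; not a kernel structure. */
struct ext {
	uint64_t	off;		/* start offset, in blocks */
	uint64_t	len;		/* length, in blocks */
	bool		unwritten;
};

/*
 * Convert [off, off + len) inside the unwritten extent 'src' to real.
 * Stores the resulting extents in out[] and returns how many there are.
 */
static int
convert_range(const struct ext *src, uint64_t off, uint64_t len,
	      struct ext out[3])
{
	uint64_t	src_end = src->off + src->len;
	uint64_t	end = off + len;
	int		n = 0;

	assert(src->unwritten && len > 0);
	assert(off >= src->off && end <= src_end);

	if (off > src->off)		/* left unwritten remainder */
		out[n++] = (struct ext){ src->off, off - src->off, true };
	out[n++] = (struct ext){ off, len, false };	/* converted part */
	if (end < src_end)		/* right unwritten remainder */
		out[n++] = (struct ext){ end, src_end - end, true };
	return n;
}

/* Net change in extent count caused by the conversion. */
static int
convert_delta(const struct ext *src, uint64_t off, uint64_t len)
{
	struct ext	out[3];

	return convert_range(src, off, len, out) - 1;
}
```

The worst case in this model is a delta of 2, which is the value the
patch reserves through XFS_IEXT_WRITE_UNWRITTEN_CNT.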
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_inode_fork.h | 9 +++++++++
fs/xfs/xfs_iomap.c | 5 +++++
2 files changed, 14 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 8d89838e23f8..917e289ad962 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -70,6 +70,15 @@ struct xfs_ifork {
#define XFS_IEXT_ATTR_MANIP_CNT(rmt_blks) \
(XFS_DA_NODE_MAXDEPTH + max(1, rmt_blks))
+/*
+ * A write to a sub-interval of an existing unwritten extent causes the original
+ * extent to be split into 3 extents
+ * i.e. | Unwritten | Real | Unwritten |
+ * Hence extent count can increase by 2.
+ */
+#define XFS_IEXT_WRITE_UNWRITTEN_CNT (2)
+
+
/*
* Fork handling.
*/
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index f53690febb22..5bf84622421d 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -566,6 +566,11 @@ xfs_iomap_write_unwritten(
if (error)
goto error_on_bmapi_transaction;
+ error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+ XFS_IEXT_WRITE_UNWRITTEN_CNT);
+ if (error)
+ goto error_on_bmapi_transaction;
+
/*
* Modify the unwritten extent state of the buffer.
*/
--
2.29.2
* [PATCH V12 07/14] xfs: Check for extent overflow when moving extent from cow to data fork
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (5 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 06/14] xfs: Check for extent overflow when writing to unwritten extent Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 08/14] xfs: Check for extent overflow when remapping an extent Chandan Babu R
` (6 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
Moving an extent to data fork can cause a sub-interval of an existing
extent to be unmapped. This will increase extent count by 1. Mapping in
the new extent can increase the extent count by 1 again i.e.
| Old extent | New extent | Old extent |
Hence number of extents increases by 2.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_inode_fork.h | 9 +++++++++
fs/xfs/xfs_reflink.c | 5 +++++
2 files changed, 14 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 917e289ad962..c8f279edc5c1 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -79,6 +79,15 @@ struct xfs_ifork {
#define XFS_IEXT_WRITE_UNWRITTEN_CNT (2)
+/*
+ * Moving an extent to data fork can cause a sub-interval of an existing extent
+ * to be unmapped. This will increase extent count by 1. Mapping in the new
+ * extent can increase the extent count by 1 again i.e.
+ * | Old extent | New extent | Old extent |
+ * Hence number of extents increases by 2.
+ */
+#define XFS_IEXT_REFLINK_END_COW_CNT (2)
+
/*
* Fork handling.
*/
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 6fa05fb78189..ca0ac1426d74 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -628,6 +628,11 @@ xfs_reflink_end_cow_extent(
xfs_ilock(ip, XFS_ILOCK_EXCL);
xfs_trans_ijoin(tp, ip, 0);
+ error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+ XFS_IEXT_REFLINK_END_COW_CNT);
+ if (error)
+ goto out_cancel;
+
/*
* In case of racing, overlapping AIO writes no COW extents might be
* left by the time I/O completes for the loser of the race. In that
--
2.29.2
* [PATCH V12 08/14] xfs: Check for extent overflow when remapping an extent
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (6 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 07/14] xfs: Check for extent overflow when moving extent from cow to data fork Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 09/14] xfs: Check for extent overflow when swapping extents Chandan Babu R
` (5 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
Remapping an extent involves unmapping the existing extent and mapping
in the new extent. When unmapping, an extent containing the entire unmap
range can be split into two extents,
i.e. | Old extent | hole | Old extent |
Hence extent count increases by 1.
Mapping in the new extent into the destination file can increase the
extent count by 1.
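The accounting described above can be sketched as a small standalone
function. remap_iext_delta() is a hypothetical name used only for
illustration; it mirrors how the diff below accumulates iext_delta from
smap_real (the existing mapping being unmapped is backed by storage) and
dmap_written (the new mapping being added is a written extent).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the iext_delta computation added to
 * xfs_reflink_remap_extent(); the function name is not from the kernel.
 */
static int
remap_iext_delta(bool smap_real, bool dmap_written)
{
	int	iext_delta = 0;

	if (smap_real)		/* unmapping may split an existing extent */
		++iext_delta;
	if (dmap_written)	/* mapping in the new extent may add one */
		++iext_delta;

	/* At most one split plus one new mapping. */
	assert(iext_delta <= 2);
	return iext_delta;
}
```

The resulting value is what gets passed as the third argument to
xfs_iext_count_may_overflow() before the remap proceeds.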
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/xfs_reflink.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index ca0ac1426d74..e1c98dbf79e4 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1006,6 +1006,7 @@ xfs_reflink_remap_extent(
unsigned int resblks;
bool smap_real;
bool dmap_written = xfs_bmap_is_written_extent(dmap);
+ int iext_delta = 0;
int nimaps;
int error;
@@ -1099,6 +1100,16 @@ xfs_reflink_remap_extent(
goto out_cancel;
}
+ if (smap_real)
+ ++iext_delta;
+
+ if (dmap_written)
+ ++iext_delta;
+
+ error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK, iext_delta);
+ if (error)
+ goto out_cancel;
+
if (smap_real) {
/*
* If the extent we're unmapping is backed by storage (written
--
2.29.2
* [PATCH V12 09/14] xfs: Check for extent overflow when swapping extents
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (7 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 08/14] xfs: Check for extent overflow when remapping an extent Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 10/14] xfs: Introduce error injection to reduce maximum inode fork extent count Chandan Babu R
` (4 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
Removing an initial range of source/donor file's extent and adding a new
extent (from donor/source file) in its place will cause extent count to
increase by 1.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_inode_fork.h | 7 +++++++
fs/xfs/xfs_bmap_util.c | 16 ++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index c8f279edc5c1..9e2137cd7372 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -88,6 +88,13 @@ struct xfs_ifork {
*/
#define XFS_IEXT_REFLINK_END_COW_CNT (2)
+/*
+ * Removing an initial range of source/donor file's extent and adding a new
+ * extent (from donor/source file) in its place will cause extent count to
+ * increase by 1.
+ */
+#define XFS_IEXT_SWAP_RMAP_CNT (1)
+
/*
* Fork handling.
*/
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 6ac7a6ac2658..f3f8c48ff5bf 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1399,6 +1399,22 @@ xfs_swap_extent_rmap(
irec.br_blockcount);
trace_xfs_swap_extent_rmap_remap_piece(tip, &uirec);
+ if (xfs_bmap_is_real_extent(&uirec)) {
+ error = xfs_iext_count_may_overflow(ip,
+ XFS_DATA_FORK,
+ XFS_IEXT_SWAP_RMAP_CNT);
+ if (error)
+ goto out;
+ }
+
+ if (xfs_bmap_is_real_extent(&irec)) {
+ error = xfs_iext_count_may_overflow(tip,
+ XFS_DATA_FORK,
+ XFS_IEXT_SWAP_RMAP_CNT);
+ if (error)
+ goto out;
+ }
+
/* Remove the mapping from the donor file. */
xfs_bmap_unmap_extent(tp, tip, &uirec);
--
2.29.2
* [PATCH V12 10/14] xfs: Introduce error injection to reduce maximum inode fork extent count
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (8 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 09/14] xfs: Check for extent overflow when swapping extents Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 11/14] xfs: Remove duplicate assert statement in xfs_bmap_btalloc() Chandan Babu R
` (3 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
This commit adds the XFS_ERRTAG_REDUCE_MAX_IEXTENTS error tag, which lets
userspace programs test "Inode fork extent count overflow detection"
by reducing the maximum possible inode fork extent count to 10.
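The effect of the tag on the overflow check can be modelled in userspace.
The sketch below is not the kernel function: model_iext_count_may_overflow()
and the MODEL_* limits are hypothetical, and the two per-fork limits are
assumed stand-ins for MAXEXTNUM and MAXAEXTNUM. Only the comparison logic
and the reduced limit of 10 are taken from the diff that follows.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins for the kernel's MAXEXTNUM and MAXAEXTNUM. */
#define MODEL_MAX_DATA_EXTENTS	INT32_MAX
#define MODEL_MAX_ATTR_EXTENTS	INT16_MAX
/* Limit applied when the error tag is set, as in the patch. */
#define MODEL_REDUCED_MAX	10

/*
 * Returns 0 if 'nr_to_add' more extents fit in the fork, -EFBIG otherwise.
 * 'errtag_set' models XFS_ERRTAG_REDUCE_MAX_IEXTENTS being injected.
 */
static int
model_iext_count_may_overflow(uint64_t nextents, bool attr_fork,
			      int nr_to_add, bool errtag_set)
{
	uint64_t	max_exts;
	uint64_t	nr_exts;

	assert(nr_to_add >= 0);

	max_exts = attr_fork ? MODEL_MAX_ATTR_EXTENTS : MODEL_MAX_DATA_EXTENTS;
	if (errtag_set)
		max_exts = MODEL_REDUCED_MAX;

	nr_exts = nextents + (uint64_t)nr_to_add;
	/* Reject both arithmetic wrap-around and exceeding the limit. */
	if (nr_exts < nextents || nr_exts > max_exts)
		return -EFBIG;
	return 0;
}
```

With the tag set, a test only has to create about ten extents in a fork
to exercise the -EFBIG path, instead of tens of thousands.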
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_errortag.h | 4 +++-
fs/xfs/libxfs/xfs_inode_fork.c | 4 ++++
fs/xfs/xfs_error.c | 3 +++
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
index 53b305dea381..1c56fcceeea6 100644
--- a/fs/xfs/libxfs/xfs_errortag.h
+++ b/fs/xfs/libxfs/xfs_errortag.h
@@ -56,7 +56,8 @@
#define XFS_ERRTAG_FORCE_SUMMARY_RECALC 33
#define XFS_ERRTAG_IUNLINK_FALLBACK 34
#define XFS_ERRTAG_BUF_IOERROR 35
-#define XFS_ERRTAG_MAX 36
+#define XFS_ERRTAG_REDUCE_MAX_IEXTENTS 36
+#define XFS_ERRTAG_MAX 37
/*
* Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -97,5 +98,6 @@
#define XFS_RANDOM_FORCE_SUMMARY_RECALC 1
#define XFS_RANDOM_IUNLINK_FALLBACK (XFS_RANDOM_DEFAULT/10)
#define XFS_RANDOM_BUF_IOERROR XFS_RANDOM_DEFAULT
+#define XFS_RANDOM_REDUCE_MAX_IEXTENTS 1
#endif /* __XFS_ERRORTAG_H_ */
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 8d48716547e5..e080d7e07643 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -24,6 +24,7 @@
#include "xfs_dir2_priv.h"
#include "xfs_attr_leaf.h"
#include "xfs_types.h"
+#include "xfs_errortag.h"
kmem_zone_t *xfs_ifork_zone;
@@ -745,6 +746,9 @@ xfs_iext_count_may_overflow(
max_exts = (whichfork == XFS_ATTR_FORK) ? MAXAEXTNUM : MAXEXTNUM;
+ if (XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
+ max_exts = 10;
+
nr_exts = ifp->if_nextents + nr_to_add;
if (nr_exts < ifp->if_nextents || nr_exts > max_exts)
return -EFBIG;
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index 7f6e20899473..3780b118cc47 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -54,6 +54,7 @@ static unsigned int xfs_errortag_random_default[] = {
XFS_RANDOM_FORCE_SUMMARY_RECALC,
XFS_RANDOM_IUNLINK_FALLBACK,
XFS_RANDOM_BUF_IOERROR,
+ XFS_RANDOM_REDUCE_MAX_IEXTENTS,
};
struct xfs_errortag_attr {
@@ -164,6 +165,7 @@ XFS_ERRORTAG_ATTR_RW(force_repair, XFS_ERRTAG_FORCE_SCRUB_REPAIR);
XFS_ERRORTAG_ATTR_RW(bad_summary, XFS_ERRTAG_FORCE_SUMMARY_RECALC);
XFS_ERRORTAG_ATTR_RW(iunlink_fallback, XFS_ERRTAG_IUNLINK_FALLBACK);
XFS_ERRORTAG_ATTR_RW(buf_ioerror, XFS_ERRTAG_BUF_IOERROR);
+XFS_ERRORTAG_ATTR_RW(reduce_max_iextents, XFS_ERRTAG_REDUCE_MAX_IEXTENTS);
static struct attribute *xfs_errortag_attrs[] = {
XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -202,6 +204,7 @@ static struct attribute *xfs_errortag_attrs[] = {
XFS_ERRORTAG_ATTR_LIST(bad_summary),
XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
+ XFS_ERRORTAG_ATTR_LIST(reduce_max_iextents),
NULL,
};
--
2.29.2
* [PATCH V12 11/14] xfs: Remove duplicate assert statement in xfs_bmap_btalloc()
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (9 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 10/14] xfs: Introduce error injection to reduce maximum inode fork extent count Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 12/14] xfs: Compute bmap extent alignments in a separate function Chandan Babu R
` (2 subsequent siblings)
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
The check for verifying if the allocated extent is from an AG whose
index is greater than or equal to that of tp->t_firstblock is already
done a couple of statements earlier in the same function. Hence this
commit removes the redundant assert statement.
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_bmap.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 5fd804534e67..90147b9f5184 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3699,7 +3699,6 @@ xfs_bmap_btalloc(
ap->blkno = args.fsbno;
if (ap->tp->t_firstblock == NULLFSBLOCK)
ap->tp->t_firstblock = args.fsbno;
- ASSERT(nullfb || fb_agno <= args.agno);
ap->length = args.len;
/*
* If the extent size hint is active, we tried to round the
--
2.29.2
* [PATCH V12 12/14] xfs: Compute bmap extent alignments in a separate function
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (10 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 11/14] xfs: Remove duplicate assert statement in xfs_bmap_btalloc() Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 13/14] xfs: Process allocated extent " Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 14/14] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
This commit moves the code which computes stripe alignment and
extent size hint alignment into a separate function. Apart from
xfs_bmap_btalloc(), the new function will be used by another function
introduced in a future commit.
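The prod/mod arithmetic that ends up in the new function can be
illustrated with a standalone sketch. struct model_prod_mod and
model_compute_prod_mod() are hypothetical names; the block size and page
size are passed in as plain parameters here, whereas the kernel code reads
them from the mount structure and PAGE_SIZE. Both are assumed to be powers
of two, so the division below matches the shift used in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; models args->prod and args->mod. */
struct model_prod_mod {
	uint32_t	prod;
	uint32_t	mod;
};

static struct model_prod_mod
model_compute_prod_mod(uint64_t offset, uint32_t align,
		       uint32_t blocksize, uint32_t page_size)
{
	struct model_prod_mod	r = { .prod = 1, .mod = 0 };

	assert(blocksize > 0 && page_size > 0);

	if (align)			/* extent size hint is active */
		r.prod = align;
	else if (blocksize < page_size)	/* several blocks per page */
		r.prod = page_size / blocksize;
	/* otherwise block size >= page size: prod stays 1, mod stays 0 */

	/* Distance from 'offset' up to the next multiple of prod. */
	r.mod = (uint32_t)(offset % r.prod);
	if (r.mod)
		r.mod = r.prod - r.mod;
	return r;
}
```

For example, with an extent size hint of 8 blocks and a file offset of 10,
this model yields prod = 8 and mod = 6.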
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_bmap.c | 89 +++++++++++++++++++++++-----------------
1 file changed, 52 insertions(+), 37 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 90147b9f5184..3479cb1b8178 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3463,13 +3463,59 @@ xfs_bmap_btalloc_accounting(
args->len);
}
+static int
+xfs_bmap_compute_alignments(
+ struct xfs_bmalloca *ap,
+ struct xfs_alloc_arg *args)
+{
+ struct xfs_mount *mp = args->mp;
+ xfs_extlen_t align = 0; /* minimum allocation alignment */
+ int stripe_align = 0;
+ int error;
+
+ /* stripe alignment for allocation is determined by mount parameters */
+ if (mp->m_swidth && (mp->m_flags & XFS_MOUNT_SWALLOC))
+ stripe_align = mp->m_swidth;
+ else if (mp->m_dalign)
+ stripe_align = mp->m_dalign;
+
+ if (ap->flags & XFS_BMAPI_COWFORK)
+ align = xfs_get_cowextsz_hint(ap->ip);
+ else if (ap->datatype & XFS_ALLOC_USERDATA)
+ align = xfs_get_extsz_hint(ap->ip);
+ if (align) {
+ error = xfs_bmap_extsize_align(mp, &ap->got, &ap->prev,
+ align, 0, ap->eof, 0, ap->conv,
+ &ap->offset, &ap->length);
+ ASSERT(!error);
+ ASSERT(ap->length);
+ }
+
+ /* apply extent size hints if obtained earlier */
+ if (align) {
+ args->prod = align;
+ div_u64_rem(ap->offset, args->prod, &args->mod);
+ if (args->mod)
+ args->mod = args->prod - args->mod;
+ } else if (mp->m_sb.sb_blocksize >= PAGE_SIZE) {
+ args->prod = 1;
+ args->mod = 0;
+ } else {
+ args->prod = PAGE_SIZE >> mp->m_sb.sb_blocklog;
+ div_u64_rem(ap->offset, args->prod, &args->mod);
+ if (args->mod)
+ args->mod = args->prod - args->mod;
+ }
+
+ return stripe_align;
+}
+
STATIC int
xfs_bmap_btalloc(
struct xfs_bmalloca *ap) /* bmap alloc argument struct */
{
xfs_mount_t *mp; /* mount point structure */
xfs_alloctype_t atype = 0; /* type for allocation routines */
- xfs_extlen_t align = 0; /* minimum allocation alignment */
xfs_agnumber_t fb_agno; /* ag number of ap->firstblock */
xfs_agnumber_t ag;
xfs_alloc_arg_t args;
@@ -3489,25 +3535,11 @@ xfs_bmap_btalloc(
mp = ap->ip->i_mount;
- /* stripe alignment for allocation is determined by mount parameters */
- stripe_align = 0;
- if (mp->m_swidth && (mp->m_flags & XFS_MOUNT_SWALLOC))
- stripe_align = mp->m_swidth;
- else if (mp->m_dalign)
- stripe_align = mp->m_dalign;
-
- if (ap->flags & XFS_BMAPI_COWFORK)
- align = xfs_get_cowextsz_hint(ap->ip);
- else if (ap->datatype & XFS_ALLOC_USERDATA)
- align = xfs_get_extsz_hint(ap->ip);
- if (align) {
- error = xfs_bmap_extsize_align(mp, &ap->got, &ap->prev,
- align, 0, ap->eof, 0, ap->conv,
- &ap->offset, &ap->length);
- ASSERT(!error);
- ASSERT(ap->length);
- }
+ memset(&args, 0, sizeof(args));
+ args.tp = ap->tp;
+ args.mp = mp;
+ stripe_align = xfs_bmap_compute_alignments(ap, &args);
nullfb = ap->tp->t_firstblock == NULLFSBLOCK;
fb_agno = nullfb ? NULLAGNUMBER : XFS_FSB_TO_AGNO(mp,
@@ -3538,9 +3570,6 @@ xfs_bmap_btalloc(
* Normal allocation, done through xfs_alloc_vextent.
*/
tryagain = isaligned = 0;
- memset(&args, 0, sizeof(args));
- args.tp = ap->tp;
- args.mp = mp;
args.fsbno = ap->blkno;
args.oinfo = XFS_RMAP_OINFO_SKIP_UPDATE;
@@ -3571,21 +3600,7 @@ xfs_bmap_btalloc(
args.total = ap->total;
args.minlen = ap->minlen;
}
- /* apply extent size hints if obtained earlier */
- if (align) {
- args.prod = align;
- div_u64_rem(ap->offset, args.prod, &args.mod);
- if (args.mod)
- args.mod = args.prod - args.mod;
- } else if (mp->m_sb.sb_blocksize >= PAGE_SIZE) {
- args.prod = 1;
- args.mod = 0;
- } else {
- args.prod = PAGE_SIZE >> mp->m_sb.sb_blocklog;
- div_u64_rem(ap->offset, args.prod, &args.mod);
- if (args.mod)
- args.mod = args.prod - args.mod;
- }
+
/*
* If we are not low on available data blocks, and the underlying
* logical volume manager is a stripe, and the file offset is zero then
--
2.29.2
* [PATCH V12 13/14] xfs: Process allocated extent in a separate function
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (11 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 12/14] xfs: Compute bmap extent alignments in a separate function Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 14/14] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
This commit moves the code in xfs_bmap_btalloc() which is
responsible for processing an allocated extent into a new function. Apart
from xfs_bmap_btalloc(), the new function will be invoked by another
function introduced in a future commit.
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_bmap.c | 74 ++++++++++++++++++++++++----------------
1 file changed, 45 insertions(+), 29 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 3479cb1b8178..a2ff4818b8df 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3510,6 +3510,48 @@ xfs_bmap_compute_alignments(
return stripe_align;
}
+static void
+xfs_bmap_process_allocated_extent(
+ struct xfs_bmalloca *ap,
+ struct xfs_alloc_arg *args,
+ xfs_fileoff_t orig_offset,
+ xfs_extlen_t orig_length)
+{
+ int nullfb;
+
+ nullfb = ap->tp->t_firstblock == NULLFSBLOCK;
+
+ /*
+ * check the allocation happened at the same or higher AG than
+ * the first block that was allocated.
+ */
+ ASSERT(nullfb ||
+ XFS_FSB_TO_AGNO(args->mp, ap->tp->t_firstblock) <=
+ XFS_FSB_TO_AGNO(args->mp, args->fsbno));
+
+ ap->blkno = args->fsbno;
+ if (nullfb)
+ ap->tp->t_firstblock = args->fsbno;
+ ap->length = args->len;
+ /*
+ * If the extent size hint is active, we tried to round the
+ * caller's allocation request offset down to extsz and the
+ * length up to another extsz boundary. If we found a free
+ * extent we mapped it in starting at this new offset. If the
+ * newly mapped space isn't long enough to cover any of the
+ * range of offsets that was originally requested, move the
+ * mapping up so that we can fill as much of the caller's
+ * original request as possible. Free space is apparently
+ * very fragmented so we're unlikely to be able to satisfy the
+ * hints anyway.
+ */
+ if (ap->length <= orig_length)
+ ap->offset = orig_offset;
+ else if (ap->offset + ap->length < orig_offset + orig_length)
+ ap->offset = orig_offset + orig_length - ap->length;
+ xfs_bmap_btalloc_accounting(ap, args);
+}
+
STATIC int
xfs_bmap_btalloc(
struct xfs_bmalloca *ap) /* bmap alloc argument struct */
@@ -3702,36 +3744,10 @@ xfs_bmap_btalloc(
return error;
ap->tp->t_flags |= XFS_TRANS_LOWMODE;
}
+
if (args.fsbno != NULLFSBLOCK) {
- /*
- * check the allocation happened at the same or higher AG than
- * the first block that was allocated.
- */
- ASSERT(ap->tp->t_firstblock == NULLFSBLOCK ||
- XFS_FSB_TO_AGNO(mp, ap->tp->t_firstblock) <=
- XFS_FSB_TO_AGNO(mp, args.fsbno));
-
- ap->blkno = args.fsbno;
- if (ap->tp->t_firstblock == NULLFSBLOCK)
- ap->tp->t_firstblock = args.fsbno;
- ap->length = args.len;
- /*
- * If the extent size hint is active, we tried to round the
- * caller's allocation request offset down to extsz and the
- * length up to another extsz boundary. If we found a free
- * extent we mapped it in starting at this new offset. If the
- * newly mapped space isn't long enough to cover any of the
- * range of offsets that was originally requested, move the
- * mapping up so that we can fill as much of the caller's
- * original request as possible. Free space is apparently
- * very fragmented so we're unlikely to be able to satisfy the
- * hints anyway.
- */
- if (ap->length <= orig_length)
- ap->offset = orig_offset;
- else if (ap->offset + ap->length < orig_offset + orig_length)
- ap->offset = orig_offset + orig_length - ap->length;
- xfs_bmap_btalloc_accounting(ap, &args);
+ xfs_bmap_process_allocated_extent(ap, &args, orig_offset,
+ orig_length);
} else {
ap->blkno = NULLFSBLOCK;
ap->length = 0;
--
2.29.2
* [PATCH V12 14/14] xfs: Introduce error injection to allocate only minlen size extents for files
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
` (12 preceding siblings ...)
2021-01-04 10:31 ` [PATCH V12 13/14] xfs: Process allocated extent " Chandan Babu R
@ 2021-01-04 10:31 ` Chandan Babu R
13 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-04 10:31 UTC (permalink / raw)
To: linux-xfs; +Cc: Chandan Babu R, darrick.wong, hch, allison.henderson
This commit adds the XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag, which
helps userspace test programs get xfs_bmap_btalloc() to always
allocate minlen sized extents.
This is required for test programs which need a guarantee that minlen
extents allocated for a file do not get merged with their existing
neighbours in the inode's BMBT. The "Inode fork extent overflow check"
tests for directories, xattrs and extension of realtime inodes need this
since the file offset at which the extents are being allocated cannot be
explicitly controlled from userspace.
One way to use this error tag is as follows:
1. Consume all of the free space by sequentially writing to a file.
2. Punch alternate blocks of the file. This causes the CNTBT to contain
a sufficient number of one block sized extent records.
3. Inject the XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag.
After step 3, xfs_bmap_btalloc() will issue space allocation
requests for minlen sized extents only.
The ENOSPC error code is returned to userspace when there aren't any "one
block sized" extents left in any of the AGs.
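The per-AG availability check that backs this behaviour can be sketched in
userspace. model_exact_minlen_available() is a hypothetical name, and a
sorted array stands in for the by-size free space btree (CNTBT). The sketch
mirrors the idea of xfs_exact_minlen_extent_available() in the diff below:
look up the first free extent record whose length is >= minlen and accept
the AG only if that record is exactly minlen long. Unlike the kernel
function, which treats a failed lookup as -EFSCORRUPTED, the model just
reports "not available".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 'lens' models the free extent lengths of one AG in ascending order,
 * i.e. the order in which a by-size index would return them.
 */
static bool
model_exact_minlen_available(const uint32_t *lens, size_t nr, uint32_t minlen)
{
	size_t	lo = 0;
	size_t	hi = nr;

	assert(minlen > 0);

	/* Binary search for the first record with length >= minlen. */
	while (lo < hi) {
		size_t	mid = lo + (hi - lo) / 2;

		if (lens[mid] < minlen)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Fail if no record is long enough or the shortest match is longer. */
	return lo < nr && lens[lo] == minlen;
}
```

In the scenario above, punching alternate blocks fills the index with
length-1 records, so the check keeps succeeding for minlen == 1 until those
records run out, at which point the allocation fails with ENOSPC.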
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
---
fs/xfs/libxfs/xfs_alloc.c | 50 ++++++++++++++
fs/xfs/libxfs/xfs_alloc.h | 3 +
fs/xfs/libxfs/xfs_bmap.c | 124 ++++++++++++++++++++++++++++-------
fs/xfs/libxfs/xfs_errortag.h | 4 +-
fs/xfs/xfs_error.c | 3 +
5 files changed, 159 insertions(+), 25 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 7cb9f064ac64..0c623d3c1036 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2474,6 +2474,47 @@ xfs_defer_agfl_block(
xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
}
+#ifdef DEBUG
+/*
+ * Check if an AGF has a free extent record whose length is equal to
+ * args->minlen.
+ */
+STATIC int
+xfs_exact_minlen_extent_available(
+ struct xfs_alloc_arg *args,
+ struct xfs_buf *agbp,
+ int *stat)
+{
+ struct xfs_btree_cur *cnt_cur;
+ xfs_agblock_t fbno;
+ xfs_extlen_t flen;
+ int error = 0;
+
+ cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, agbp,
+ args->agno, XFS_BTNUM_CNT);
+ error = xfs_alloc_lookup_ge(cnt_cur, 0, args->minlen, stat);
+ if (error)
+ goto out;
+
+ if (*stat == 0) {
+ error = -EFSCORRUPTED;
+ goto out;
+ }
+
+ error = xfs_alloc_get_rec(cnt_cur, &fbno, &flen, stat);
+ if (error)
+ goto out;
+
+ if (*stat == 1 && flen != args->minlen)
+ *stat = 0;
+
+out:
+ xfs_btree_del_cursor(cnt_cur, error);
+
+ return error;
+}
+#endif
+
/*
* Decide whether to use this allocation group for this allocation.
* If so, fix up the btree freelist's size.
@@ -2545,6 +2586,15 @@ xfs_alloc_fix_freelist(
if (!xfs_alloc_space_available(args, need, flags))
goto out_agbp_relse;
+#ifdef DEBUG
+ if (args->alloc_minlen_only) {
+ int stat;
+
+ error = xfs_exact_minlen_extent_available(args, agbp, &stat);
+ if (error || !stat)
+ goto out_agbp_relse;
+ }
+#endif
/*
* Make the freelist shorter if it's too long.
*
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index 6c22b12176b8..a4427c5775c2 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -75,6 +75,9 @@ typedef struct xfs_alloc_arg {
char wasfromfl; /* set if allocation is from freelist */
struct xfs_owner_info oinfo; /* owner of blocks being allocated */
enum xfs_ag_resv_type resv; /* block reservation to use */
+#ifdef DEBUG
+ bool alloc_minlen_only; /* allocate exact minlen extent */
+#endif
} xfs_alloc_arg_t;
/*
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index a2ff4818b8df..edfc4a01d83f 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3552,34 +3552,101 @@ xfs_bmap_process_allocated_extent(
xfs_bmap_btalloc_accounting(ap, args);
}
-STATIC int
-xfs_bmap_btalloc(
- struct xfs_bmalloca *ap) /* bmap alloc argument struct */
+#ifdef DEBUG
+static int
+xfs_bmap_exact_minlen_extent_alloc(
+ struct xfs_bmalloca *ap)
{
- xfs_mount_t *mp; /* mount point structure */
- xfs_alloctype_t atype = 0; /* type for allocation routines */
- xfs_agnumber_t fb_agno; /* ag number of ap->firstblock */
- xfs_agnumber_t ag;
- xfs_alloc_arg_t args;
- xfs_fileoff_t orig_offset;
- xfs_extlen_t orig_length;
- xfs_extlen_t blen;
- xfs_extlen_t nextminlen = 0;
- int nullfb; /* true if ap->firstblock isn't set */
- int isaligned;
- int tryagain;
- int error;
- int stripe_align;
+ struct xfs_mount *mp = ap->ip->i_mount;
+ struct xfs_alloc_arg args = { .tp = ap->tp, .mp = mp };
+ xfs_fileoff_t orig_offset;
+ xfs_extlen_t orig_length;
+ int error;
ASSERT(ap->length);
+
+ if (ap->minlen != 1) {
+ ap->blkno = NULLFSBLOCK;
+ ap->length = 0;
+ return 0;
+ }
+
orig_offset = ap->offset;
orig_length = ap->length;
- mp = ap->ip->i_mount;
+ args.alloc_minlen_only = 1;
- memset(&args, 0, sizeof(args));
- args.tp = ap->tp;
- args.mp = mp;
+ xfs_bmap_compute_alignments(ap, &args);
+
+ if (ap->tp->t_firstblock == NULLFSBLOCK) {
+ /*
+ * Unlike the longest extent available in an AG, we don't track
+ * the length of an AG's shortest extent.
+ * XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT is a debug only knob and
+ * hence we can afford to start traversing from the 0th AG since
+ * we need not be concerned about a drop in performance in
+ * "debug only" code paths.
+ */
+ ap->blkno = XFS_AGB_TO_FSB(mp, 0, 0);
+ } else {
+ ap->blkno = ap->tp->t_firstblock;
+ }
+
+ args.fsbno = ap->blkno;
+ args.oinfo = XFS_RMAP_OINFO_SKIP_UPDATE;
+ args.type = XFS_ALLOCTYPE_FIRST_AG;
+ args.total = args.minlen = args.maxlen = ap->minlen;
+
+ args.alignment = 1;
+ args.minalignslop = 0;
+
+ args.minleft = ap->minleft;
+ args.wasdel = ap->wasdel;
+ args.resv = XFS_AG_RESV_NONE;
+ args.datatype = ap->datatype;
+
+ error = xfs_alloc_vextent(&args);
+ if (error)
+ return error;
+
+ if (args.fsbno != NULLFSBLOCK) {
+ xfs_bmap_process_allocated_extent(ap, &args, orig_offset,
+ orig_length);
+ } else {
+ ap->blkno = NULLFSBLOCK;
+ ap->length = 0;
+ }
+
+ return 0;
+}
+#else
+
+#define xfs_bmap_exact_minlen_extent_alloc(bma) (-EFSCORRUPTED)
+
+#endif
+
+STATIC int
+xfs_bmap_btalloc(
+ struct xfs_bmalloca *ap)
+{
+ struct xfs_mount *mp = ap->ip->i_mount;
+ struct xfs_alloc_arg args = { .tp = ap->tp, .mp = mp };
+ xfs_alloctype_t atype = 0;
+ xfs_agnumber_t fb_agno; /* ag number of ap->firstblock */
+ xfs_agnumber_t ag;
+ xfs_fileoff_t orig_offset;
+ xfs_extlen_t orig_length;
+ xfs_extlen_t blen;
+ xfs_extlen_t nextminlen = 0;
+ int nullfb; /* true if ap->firstblock isn't set */
+ int isaligned;
+ int tryagain;
+ int error;
+ int stripe_align;
+
+ ASSERT(ap->length);
+ orig_offset = ap->offset;
+ orig_length = ap->length;
stripe_align = xfs_bmap_compute_alignments(ap, &args);
@@ -4113,6 +4180,10 @@ xfs_bmap_alloc_userdata(
return xfs_bmap_rtalloc(bma);
}
+ if (unlikely(XFS_TEST_ERROR(false, mp,
+ XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT)))
+ return xfs_bmap_exact_minlen_extent_alloc(bma);
+
return xfs_bmap_btalloc(bma);
}
@@ -4149,10 +4220,15 @@ xfs_bmapi_allocate(
else
bma->minlen = 1;
- if (bma->flags & XFS_BMAPI_METADATA)
- error = xfs_bmap_btalloc(bma);
- else
+ if (bma->flags & XFS_BMAPI_METADATA) {
+ if (unlikely(XFS_TEST_ERROR(false, mp,
+ XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT)))
+ error = xfs_bmap_exact_minlen_extent_alloc(bma);
+ else
+ error = xfs_bmap_btalloc(bma);
+ } else {
error = xfs_bmap_alloc_userdata(bma);
+ }
if (error || bma->blkno == NULLFSBLOCK)
return error;
diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
index 1c56fcceeea6..6ca9084b6934 100644
--- a/fs/xfs/libxfs/xfs_errortag.h
+++ b/fs/xfs/libxfs/xfs_errortag.h
@@ -57,7 +57,8 @@
#define XFS_ERRTAG_IUNLINK_FALLBACK 34
#define XFS_ERRTAG_BUF_IOERROR 35
#define XFS_ERRTAG_REDUCE_MAX_IEXTENTS 36
-#define XFS_ERRTAG_MAX 37
+#define XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT 37
+#define XFS_ERRTAG_MAX 38
/*
* Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -99,5 +100,6 @@
#define XFS_RANDOM_IUNLINK_FALLBACK (XFS_RANDOM_DEFAULT/10)
#define XFS_RANDOM_BUF_IOERROR XFS_RANDOM_DEFAULT
#define XFS_RANDOM_REDUCE_MAX_IEXTENTS 1
+#define XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT 1
#endif /* __XFS_ERRORTAG_H_ */
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index 3780b118cc47..185b4915b7bf 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -55,6 +55,7 @@ static unsigned int xfs_errortag_random_default[] = {
XFS_RANDOM_IUNLINK_FALLBACK,
XFS_RANDOM_BUF_IOERROR,
XFS_RANDOM_REDUCE_MAX_IEXTENTS,
+ XFS_RANDOM_BMAP_ALLOC_MINLEN_EXTENT,
};
struct xfs_errortag_attr {
@@ -166,6 +167,7 @@ XFS_ERRORTAG_ATTR_RW(bad_summary, XFS_ERRTAG_FORCE_SUMMARY_RECALC);
XFS_ERRORTAG_ATTR_RW(iunlink_fallback, XFS_ERRTAG_IUNLINK_FALLBACK);
XFS_ERRORTAG_ATTR_RW(buf_ioerror, XFS_ERRTAG_BUF_IOERROR);
XFS_ERRORTAG_ATTR_RW(reduce_max_iextents, XFS_ERRTAG_REDUCE_MAX_IEXTENTS);
+XFS_ERRORTAG_ATTR_RW(bmap_alloc_minlen_extent, XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT);
static struct attribute *xfs_errortag_attrs[] = {
XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -205,6 +207,7 @@ static struct attribute *xfs_errortag_attrs[] = {
XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
XFS_ERRORTAG_ATTR_LIST(reduce_max_iextents),
+ XFS_ERRORTAG_ATTR_LIST(bmap_alloc_minlen_extent),
NULL,
};
--
2.29.2
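
As a rough userspace sketch of how the new error tag steers allocation in the hunks above: a per-mount table holds one random factor per tag, and the minlen allocator is chosen only when the tag fires. All toy_* names are hypothetical, rand() stands in for the kernel's random source, and only the tag number 37 and the "1 means always, 2 means 1/2 time" rule are taken from the patch. The realtime check that precedes the test in xfs_bmap_alloc_userdata() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names; 37 and 38 mirror the values added to xfs_errortag.h. */
enum {
	TOY_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT = 37,
	TOY_ERRTAG_MAX = 38
};

struct toy_mount {
	unsigned int errortag[TOY_ERRTAG_MAX];	/* 0 means the tag is disabled */
};

/* Fires with probability 1/factor: 1 means always, 2 means half the time. */
static bool toy_test_error(const struct toy_mount *mp, int tag)
{
	unsigned int factor;

	assert(tag >= 0 && tag < TOY_ERRTAG_MAX);
	factor = mp->errortag[tag];
	if (factor == 0 || (unsigned int)rand() % factor != 0)
		return false;
	return true;
}

enum toy_allocator { TOY_BTALLOC, TOY_EXACT_MINLEN };

/* Same shape as the patched call sites: test the tag, else fall through. */
static enum toy_allocator toy_pick_allocator(const struct toy_mount *mp)
{
	if (toy_test_error(mp, TOY_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT))
		return TOY_EXACT_MINLEN;
	return TOY_BTALLOC;
}
```

With the factor left at 0 the normal allocator is always picked; setting it to 1 (the default this patch assigns to the tag) routes every allocation through the minlen path.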
* Re: [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries
From: Darrick J. Wong @ 2021-01-08 1:17 UTC (permalink / raw)
To: Chandan Babu R; +Cc: linux-xfs, hch, allison.henderson
On Mon, Jan 04, 2021 at 04:01:10PM +0530, Chandan Babu R wrote:
> Directory entry addition can cause the following,
> 1. Data block can be added/removed.
> A new extent can cause extent count to increase by 1.
> 2. Free disk block can be added/removed.
> Same behaviour as described above for Data block.
> 3. Dabtree blocks.
> XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these
> can be new extents. Hence extent count can increase by
> XFS_DA_NODE_MAXDEPTH.
>
> Directory entry remove and rename (applicable only to the source
> directory entry) operations are handled specially to allow them to
> succeed in low extent count availability scenarios
> i.e. xfs_bmap_del_extent_real() will now return -ENOSPC when a possible
> extent count overflow is detected. -ENOSPC is already handled by higher
> layers of XFS by letting,
> 1. Empty Data/Free space index blocks to linger around until a future
> remove operation frees them.
> 2. Dabtree blocks would be swapped with the last block in the leaf space
> followed by unmapping of the new last block.
>
> Also, Extent overflow check is performed for the target directory entry
> of the rename operation only when the entry does not exist and a
> non-zero space reservation is obtained successfully.
>
> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> ---
> fs/xfs/libxfs/xfs_bmap.c | 15 ++++++++++++
> fs/xfs/libxfs/xfs_inode_fork.h | 13 ++++++++++
> fs/xfs/xfs_inode.c | 45 ++++++++++++++++++++++++++++++++++
> fs/xfs/xfs_symlink.c | 5 ++++
> 4 files changed, 78 insertions(+)
>
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 32aeacf6f055..5fd804534e67 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -5151,6 +5151,21 @@ xfs_bmap_del_extent_real(
> /*
> * Deleting the middle of the extent.
> */
> +
> + /*
> + * For directories, -ENOSPC will be handled by higher layers of
> + * XFS by letting the corresponding empty Data/Free blocks to
> + * linger around until a future remove operation. Dabtree blocks
> + * would be swapped with the last block in the leaf space and
> + * then the new last block will be unmapped.
> + */
> + if (S_ISDIR(VFS_I(ip)->i_mode) &&
> + whichfork == XFS_DATA_FORK &&
> + xfs_iext_count_may_overflow(ip, whichfork, 1)) {
> + error = -ENOSPC;
> + goto done;
Hmm... it strikes me as a little odd that we're checking file mode and
fork type in the middle of the bmap code. However, I think it's the
case that the only place where anyone would punch a hole in the /middle/
of an extent is xattr trees and regular files, right? And both of those
cases are checked before we end up in the bmap code, right?
So we only really need this check to prevent extent count overflows when
removing dirents from directories, like the comment says, and only
because directories don't have a hard requirement that the bunmapi
succeeds. And I think this logic covers xfs_remove too? That's a bit
subtle, but as there's no extent count check in that function, there's
not much to attach a comment to... :)
Hm. I think I'd like xfs_rename to get a brief comment that we're
protected from extent count overflows in xfs_remove() by virtue of this
"leave the dir block in place if we ENOSPC" capability:
	/*
	 * NOTE: We don't need to check for extent overflows here
	 * because the dir removename code will leave the dir block
	 * in place if the extent count would overflow.
	 */
	error = xfs_dir_removename(...);
Do xattr trees also have the same ability? I think they do, at least
for the dabtree part...?
I think I would've split this patch into three pieces:
- create, link, and symlink in one patch (adding dirents),
- the xfs_bmap_del_extent_real change and a comment for xfs_remove
(removing dirents)
- all the xfs_rename changes (adding and removing dirents)
Though I dunno, this series is already 14 patches, and the part that I
care most about is not leaving that subtlety in xfs_remove(). :)
Other than that, I follow the logic in this patch and will give it a
testrun tonight.
--D
> + }
> +
> old = got;
>
> got.br_blockcount = del->br_startoff - got.br_startoff;
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> index bcac769a7df6..ea1a9dd8a763 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.h
> +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> @@ -47,6 +47,19 @@ struct xfs_ifork {
> */
> #define XFS_IEXT_PUNCH_HOLE_CNT (1)
>
> +/*
> + * Directory entry addition can cause the following,
> + * 1. Data block can be added/removed.
> + * A new extent can cause extent count to increase by 1.
> + * 2. Free disk block can be added/removed.
> + * Same behaviour as described above for Data block.
> + * 3. Dabtree blocks.
> + * XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new
> + * extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH.
> + */
> +#define XFS_IEXT_DIR_MANIP_CNT(mp) \
> + ((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount)
> +
> /*
> * Fork handling.
> */
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index b7352bc4c815..0db21368c7e1 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1042,6 +1042,11 @@ xfs_create(
> if (error)
> goto out_trans_cancel;
>
> + error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
> + XFS_IEXT_DIR_MANIP_CNT(mp));
> + if (error)
> + goto out_trans_cancel;
> +
> /*
> * A newly created regular or special file just has one directory
> * entry pointing to them, but a directory also the "." entry
> @@ -1258,6 +1263,11 @@ xfs_link(
> xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL);
> xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL);
>
> + error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK,
> + XFS_IEXT_DIR_MANIP_CNT(mp));
> + if (error)
> + goto error_return;
> +
> /*
> * If we are using project inheritance, we only allow hard link
> * creation in our tree when the project IDs are the same; else
> @@ -3106,6 +3116,35 @@ xfs_rename(
> /*
> * Check for expected errors before we dirty the transaction
> * so we can return an error without a transaction abort.
> + *
> + * Extent count overflow check:
> + *
> + * From the perspective of src_dp, a rename operation is essentially a
> + * directory entry remove operation. Hence the only place where we check
> + * for extent count overflow for src_dp is in
> + * xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns
> + * -ENOSPC when it detects a possible extent count overflow and in
> + * response, the higher layers of directory handling code do the
> + * following:
> + * 1. Data/Free blocks: XFS lets these blocks linger around until a
> + * future remove operation removes them.
> + * 2. Dabtree blocks: XFS swaps the blocks with the last block in the
> + * Leaf space and unmaps the last block.
> + *
> + * For target_dp, there are two cases depending on whether the
> + * destination directory entry exists or not.
> + *
> + * When destination directory entry does not exist (i.e. target_ip ==
> + * NULL), extent count overflow check is performed only when transaction
> + * has a non-zero sized space reservation associated with it. With a
> + * zero-sized space reservation, XFS allows a rename operation to
> + * continue only when the directory has sufficient free space in its
> + * data/leaf/free space blocks to hold the new entry.
> + *
> + * When destination directory entry exists (i.e. target_ip != NULL), all
> + * we need to do is change the inode number associated with the already
> + * existing entry. Hence there is no need to perform an extent count
> + * overflow check.
> */
> if (target_ip == NULL) {
> /*
> @@ -3116,6 +3155,12 @@ xfs_rename(
> error = xfs_dir_canenter(tp, target_dp, target_name);
> if (error)
> goto out_trans_cancel;
> + } else {
> + error = xfs_iext_count_may_overflow(target_dp,
> + XFS_DATA_FORK,
> + XFS_IEXT_DIR_MANIP_CNT(mp));
> + if (error)
> + goto out_trans_cancel;
> }
> } else {
> /*
> diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
> index 1f43fd7f3209..0b8136a32484 100644
> --- a/fs/xfs/xfs_symlink.c
> +++ b/fs/xfs/xfs_symlink.c
> @@ -220,6 +220,11 @@ xfs_symlink(
> if (error)
> goto out_trans_cancel;
>
> + error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
> + XFS_IEXT_DIR_MANIP_CNT(mp));
> + if (error)
> + goto out_trans_cancel;
> +
> /*
> * Allocate an inode for the symlink.
> */
> --
> 2.29.2
>
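
For reference, the reservation computed by XFS_IEXT_DIR_MANIP_CNT() in the quoted patch can be worked through in userspace. This is only a sketch under stated assumptions: the toy_* names are hypothetical, XFS_DA_NODE_MAXDEPTH is assumed to be 5, the per-fork limits are assumed to be 0x7fffffff (data) and 0x7fff (attr), and the helper is assumed to report overflow as -EFBIG. None of those values appear in this patch, and the real helper lives in an earlier patch of the series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_DA_NODE_MAXDEPTH	5		/* assumed XFS_DA_NODE_MAXDEPTH */
#define TOY_MAXEXTNUM		0x7fffffffULL	/* assumed data fork limit */
#define TOY_MAXAEXTNUM		0x7fffULL	/* assumed attr fork limit */

enum toy_whichfork { TOY_DATA_FORK, TOY_ATTR_FORK };

/*
 * One data block, one free block and up to MAXDEPTH dabtree blocks,
 * each multiplied by fsbcount as in the quoted macro.
 */
static uint32_t toy_dir_manip_cnt(uint32_t fsbcount)
{
	assert(fsbcount > 0);
	return (TOY_DA_NODE_MAXDEPTH + 1 + 1) * fsbcount;
}

/* Returns 0 if nr_to_add more extents still fit, -EFBIG otherwise. */
static int toy_iext_count_may_overflow(uint64_t nextents,
				       enum toy_whichfork whichfork,
				       uint32_t nr_to_add)
{
	uint64_t max_exts = (whichfork == TOY_ATTR_FORK) ?
				TOY_MAXAEXTNUM : TOY_MAXEXTNUM;

	if (nextents + nr_to_add > max_exts)
		return -EFBIG;
	return 0;
}
```

Under these assumptions an fsbcount of 1 reserves room for 7 new extents and an fsbcount of 4 reserves 28, so a create, link or symlink is refused once the directory's data fork is within that many extents of its limit.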
* Re: [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries
From: Chandan Babu R @ 2021-01-08 4:55 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch, allison.henderson
On Thu, 07 Jan 2021 17:17:13 -0800, Darrick J. Wong wrote:
> On Mon, Jan 04, 2021 at 04:01:10PM +0530, Chandan Babu R wrote:
> > Directory entry addition can cause the following,
> > 1. Data block can be added/removed.
> > A new extent can cause extent count to increase by 1.
> > 2. Free disk block can be added/removed.
> > Same behaviour as described above for Data block.
> > 3. Dabtree blocks.
> > XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these
> > can be new extents. Hence extent count can increase by
> > XFS_DA_NODE_MAXDEPTH.
> >
> > Directory entry remove and rename (applicable only to the source
> > directory entry) operations are handled specially to allow them to
> > succeed in low extent count availability scenarios
> > i.e. xfs_bmap_del_extent_real() will now return -ENOSPC when a possible
> > extent count overflow is detected. -ENOSPC is already handled by higher
> > layers of XFS by letting,
> > 1. Empty Data/Free space index blocks to linger around until a future
> > remove operation frees them.
> > 2. Dabtree blocks would be swapped with the last block in the leaf space
> > followed by unmapping of the new last block.
> >
> > Also, Extent overflow check is performed for the target directory entry
> > of the rename operation only when the entry does not exist and a
> > non-zero space reservation is obtained successfully.
> >
> > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > ---
> > fs/xfs/libxfs/xfs_bmap.c | 15 ++++++++++++
> > fs/xfs/libxfs/xfs_inode_fork.h | 13 ++++++++++
> > fs/xfs/xfs_inode.c | 45 ++++++++++++++++++++++++++++++++++
> > fs/xfs/xfs_symlink.c | 5 ++++
> > 4 files changed, 78 insertions(+)
> >
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index 32aeacf6f055..5fd804534e67 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -5151,6 +5151,21 @@ xfs_bmap_del_extent_real(
> > /*
> > * Deleting the middle of the extent.
> > */
> > +
> > + /*
> > + * For directories, -ENOSPC will be handled by higher layers of
> > + * XFS by letting the corresponding empty Data/Free blocks to
> > + * linger around until a future remove operation. Dabtree blocks
> > + * would be swapped with the last block in the leaf space and
> > + * then the new last block will be unmapped.
> > + */
> > + if (S_ISDIR(VFS_I(ip)->i_mode) &&
> > + whichfork == XFS_DATA_FORK &&
> > + xfs_iext_count_may_overflow(ip, whichfork, 1)) {
> > + error = -ENOSPC;
> > + goto done;
>
> Hmm... it strikes me as a little odd that we're checking file mode and
> fork type in the middle of the bmap code. However, I think it's the
> case that the only place where anyone would punch a hole in the /middle/
> of an extent is xattr trees and regular files, right? And both of those
> cases are checked before we end up in the bmap code, right?
Yes, your observation is correct. I will remove the file mode and fork type
checks.
>
> So we only really need this check to prevent extent count overflows when
> removing dirents from directories, like the comment says, and only
> because directories don't have a hard requirement that the bunmapi
> succeeds. And I think this logic covers xfs_remove too? That's a bit
> subtle, but as there's no extent count check in that function, there's
> not much to attach a comment to... :)
Yes. To provide more clarity, I will replace the above comment with the
following:
/*
* -ENOSPC is returned since a directory entry remove operation must not fail
* due to low extent count availability. -ENOSPC will be handled by higher
* layers of XFS by letting the corresponding empty Data/Free blocks to linger
* around until a future remove operation. Dabtree blocks would be swapped with
* the last block in the leaf space and then the new last block will be
* unmapped.
*
* The above logic also applies to the source directory entry of a rename
* operation.
*/
>
> Hm. I think I'd like xfs_rename to get a brief comment that we're
> protected from extent count overflows in xfs_remove() by virtue of this
> "leave the dir block in place if we ENOSPC" capability:
>
> /*
> * NOTE: We don't need to check for extent overflows here
> * because the dir removename code will leave the dir block
> * in place if the extent count would overflow.
> */
> error = xfs_dir_removename(...);
Sure, I will add that.
>
> Do xattr trees also have the same ability? I think they do, at least
> for the dabtree part...?
The following code snippet from xfs_da_shrink_inode() does the special casing
only for the data fork i.e. for blocks holding directory entries.
	for (;;) {
		/*
		 * Remove extents. If we get ENOSPC for a dir we have to move
		 * the last block to the place we want to kill.
		 */
		error = xfs_bunmapi(tp, dp, dead_blkno, count,
				    xfs_bmapi_aflag(w), 0, &done);
		if (error == -ENOSPC) {
			if (w != XFS_DATA_FORK)
				break;
			error = xfs_da3_swap_lastblock(args, &dead_blkno,
						      &dead_buf);
So this facility is available only for directory entries.
Hence for xattrs, if we ever reach the extent count limit, the only way out is
to delete the corresponding file.
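
The difference between the two forks can be illustrated with a small userspace model. This is a sketch only, not the kernel code: every toy_* name is hypothetical, a fork is reduced to an array of mapped blocks, and "swapping the last block" is modelled simply by unmapping the last mapped block instead of the requested one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_NBLOCKS 8

struct toy_fork {
	bool		is_data_fork;
	bool		mapped[TOY_NBLOCKS];
	unsigned int	max_extents;
};

/* An extent is a maximal run of consecutive mapped blocks. */
static unsigned int toy_count_extents(const struct toy_fork *f)
{
	unsigned int n = 0;

	for (int i = 0; i < TOY_NBLOCKS; i++)
		if (f->mapped[i] && (i == 0 || !f->mapped[i - 1]))
			n++;
	return n;
}

/* Unmap one block; refuse with -ENOSPC if that would exceed max_extents. */
static int toy_bunmapi(struct toy_fork *f, int blkno)
{
	assert(blkno >= 0 && blkno < TOY_NBLOCKS && f->mapped[blkno]);
	f->mapped[blkno] = false;
	if (toy_count_extents(f) > f->max_extents) {
		f->mapped[blkno] = true;	/* undo the tentative unmap */
		return -ENOSPC;
	}
	return 0;
}

static int toy_last_mapped(const struct toy_fork *f)
{
	for (int i = TOY_NBLOCKS - 1; i >= 0; i--)
		if (f->mapped[i])
			return i;
	return -1;
}

/* Only the data fork gets the "move the last block here" fallback. */
static int toy_shrink(struct toy_fork *f, int dead_blkno)
{
	int error = toy_bunmapi(f, dead_blkno);

	if (error != -ENOSPC || !f->is_data_fork)
		return error;
	/* Pretend the last block's contents were moved into dead_blkno. */
	return toy_bunmapi(f, toy_last_mapped(f));
}
```

With eight mapped blocks and max_extents set to 1, shrinking block 3 of a data fork succeeds by giving up block 7 instead, so the extent count stays at 1. The same call on an attr fork returns -ENOSPC and leaves the mapping untouched.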
>
> I think I would've split this patch into three pieces:
>
> - create, link, and symlink in one patch (adding dirents),
> - the xfs_bmap_del_extent_real change and a comment for xfs_remove
> (removing dirents)
> - all the xfs_rename changes (adding and removing dirents)
>
> Though I dunno, this series is already 14 patches, and the part that I
> care most about is not leaving that subtlety in xfs_remove(). :)
I think you are right about that. I will split this patch according to what
you have mentioned above.
>
> Other than that, I follow the logic in this patch and will give it a
> testrun tonight.
Thank you!
--
chandan
* Re: [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries
From: Darrick J. Wong @ 2021-01-09 1:33 UTC (permalink / raw)
To: Chandan Babu R; +Cc: linux-xfs, hch, allison.henderson
On Fri, Jan 08, 2021 at 10:25:35AM +0530, Chandan Babu R wrote:
> On Thu, 07 Jan 2021 17:17:13 -0800, Darrick J. Wong wrote:
> > On Mon, Jan 04, 2021 at 04:01:10PM +0530, Chandan Babu R wrote:
> > > Directory entry addition can cause the following,
> > > 1. Data block can be added/removed.
> > > A new extent can cause extent count to increase by 1.
> > > 2. Free disk block can be added/removed.
> > > Same behaviour as described above for Data block.
> > > 3. Dabtree blocks.
> > > XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these
> > > can be new extents. Hence extent count can increase by
> > > XFS_DA_NODE_MAXDEPTH.
> > >
> > > Directory entry remove and rename (applicable only to the source
> > > directory entry) operations are handled specially to allow them to
> > > succeed in low extent count availability scenarios
> > > i.e. xfs_bmap_del_extent_real() will now return -ENOSPC when a possible
> > > extent count overflow is detected. -ENOSPC is already handled by higher
> > > layers of XFS by letting,
> > > 1. Empty Data/Free space index blocks to linger around until a future
> > > remove operation frees them.
> > > 2. Dabtree blocks would be swapped with the last block in the leaf space
> > > followed by unmapping of the new last block.
> > >
> > > Also, Extent overflow check is performed for the target directory entry
> > > of the rename operation only when the entry does not exist and a
> > > non-zero space reservation is obtained successfully.
> > >
> > > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > > ---
> > > fs/xfs/libxfs/xfs_bmap.c | 15 ++++++++++++
> > > fs/xfs/libxfs/xfs_inode_fork.h | 13 ++++++++++
> > > fs/xfs/xfs_inode.c | 45 ++++++++++++++++++++++++++++++++++
> > > fs/xfs/xfs_symlink.c | 5 ++++
> > > 4 files changed, 78 insertions(+)
> > >
> > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > > index 32aeacf6f055..5fd804534e67 100644
> > > --- a/fs/xfs/libxfs/xfs_bmap.c
> > > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > > @@ -5151,6 +5151,21 @@ xfs_bmap_del_extent_real(
> > > /*
> > > * Deleting the middle of the extent.
> > > */
> > > +
> > > + /*
> > > + * For directories, -ENOSPC will be handled by higher layers of
> > > + * XFS by letting the corresponding empty Data/Free blocks to
> > > + * linger around until a future remove operation. Dabtree blocks
> > > + * would be swapped with the last block in the leaf space and
> > > + * then the new last block will be unmapped.
> > > + */
> > > + if (S_ISDIR(VFS_I(ip)->i_mode) &&
> > > + whichfork == XFS_DATA_FORK &&
> > > + xfs_iext_count_may_overflow(ip, whichfork, 1)) {
> > > + error = -ENOSPC;
> > > + goto done;
> >
> > Hmm... it strikes me as a little odd that we're checking file mode and
> > fork type in the middle of the bmap code. However, I think it's the
> > case that the only place where anyone would punch a hole in the /middle/
> > of an extent is xattr trees and regular files, right? And both of those
> > cases are checked before we end up in the bmap code, right?
>
> Yes, your observation is correct. I will remove the file mode and fork type
> checks.
You might want to leave an assert there just in case someone else trips
over that...
>
> >
> > So we only really need this check to prevent extent count overflows when
> > removing dirents from directories, like the comment says, and only
> > because directories don't have a hard requirement that the bunmapi
> > succeeds. And I think this logic covers xfs_remove too? That's a bit
> > subtle, but as there's no extent count check in that function, there's
> > not much to attach a comment to... :)
>
> Yes, To provide more clarity, I should replace the above comment with
> following,
>
> /*
> * -ENOSPC is returned since a directory entry remove operation must not fail
> * due to low extent count availability. -ENOSPC will be handled by higher
> * layers of XFS by letting the corresponding empty Data/Free blocks to linger
grammar nit: "...by letting the corresponding empty Data/Free blocks linger
until a future remove operation."
> * around until a future remove operation. Dabtree blocks would be swapped with
> * the last block in the leaf space and then the new last block will be
> * unmapped.
> *
> * The above logic also applies to the source directory entry of a rename
> * operation.
> */
>
> >
> > Hm. I think I'd like xfs_rename to get a brief comment that we're
> > protected from extent count overflows in xfs_remove() by virtue of this
> > "leave the dir block in place if we ENOSPC" capability:
> >
> > /*
> > * NOTE: We don't need to check for extent overflows here
> > * because the dir removename code will leave the dir block
> > * in place if the extent count would overflow.
> > */
> > error = xfs_dir_removename(...);
>
> Sure, I will add that.
>
> >
> > Do xattr trees also have the same ability? I think they do, at least
> > for the dabtree part...?
>
> The following code snippet from xfs_da_shrink_inode() does the special casing
> only for the data fork i.e. for blocks holding directory entries.
>
> for (;;) {
> /*
> * Remove extents. If we get ENOSPC for a dir we have to move
> * the last block to the place we want to kill.
> */
> error = xfs_bunmapi(tp, dp, dead_blkno, count,
> xfs_bmapi_aflag(w), 0, &done);
> if (error == -ENOSPC) {
> if (w != XFS_DATA_FORK)
> break;
> error = xfs_da3_swap_lastblock(args, &dead_blkno,
> &dead_buf);
>
> So this facility is available only for directory entries.
>
> Hence for xattrs, if we ever reach the extent count limit, the only way out is
> to delete the corresponding file.
Ok. Just checking. :)
> >
> > I think I would've split this patch into three pieces:
> >
> > - create, link, and symlink in one patch (adding dirents),
> > - the xfs_bmap_del_extent_real change and a comment for xfs_remove
> > (removing dirents)
> > - all the xfs_rename changes (adding and removing dirents)
> >
> > Though I dunno, this series is already 14 patches, and the part that I
> > care most about is not leaving that subtlety in xfs_remove(). :)
>
> I think you are right about that. I will split this patch according to what
> you have mentioned above.
Thanks!
FWIW I didn't see any regressions in fstests...
--D
* Re: [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries
2021-01-09 1:33 ` Darrick J. Wong
@ 2021-01-09 5:26 ` Chandan Babu R
0 siblings, 0 replies; 19+ messages in thread
From: Chandan Babu R @ 2021-01-09 5:26 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Chandan Babu R, linux-xfs, hch, allison.henderson
On 09 Jan 2021 at 07:03, Darrick J. Wong wrote:
> On Fri, Jan 08, 2021 at 10:25:35AM +0530, Chandan Babu R wrote:
>> On Thu, 07 Jan 2021 17:17:13 -0800, Darrick J. Wong wrote:
>> > On Mon, Jan 04, 2021 at 04:01:10PM +0530, Chandan Babu R wrote:
>> > > Directory entry addition can cause the following,
>> > > 1. Data block can be added/removed.
>> > > A new extent can cause extent count to increase by 1.
>> > > 2. Free disk block can be added/removed.
>> > > Same behaviour as described above for Data block.
>> > > 3. Dabtree blocks.
>> > > XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these
>> > > can be new extents. Hence extent count can increase by
>> > > XFS_DA_NODE_MAXDEPTH.
>> > >
>> > > Directory entry remove and rename (applicable only to the source
>> > > directory entry) operations are handled specially to allow them to
>> > > succeed in low extent count availability scenarios
>> > > i.e. xfs_bmap_del_extent_real() will now return -ENOSPC when a possible
>> > > extent count overflow is detected. -ENOSPC is already handled by higher
>> > > layers of XFS by letting,
>> > > 1. Empty Data/Free space index blocks to linger around until a future
>> > > remove operation frees them.
>> > > 2. Dabtree blocks would be swapped with the last block in the leaf space
>> > > followed by unmapping of the new last block.
>> > >
>> > > Also, Extent overflow check is performed for the target directory entry
>> > > of the rename operation only when the entry does not exist and a
>> > > non-zero space reservation is obtained successfully.
>> > >
>> > > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
>> > > ---
>> > > fs/xfs/libxfs/xfs_bmap.c | 15 ++++++++++++
>> > > fs/xfs/libxfs/xfs_inode_fork.h | 13 ++++++++++
>> > > fs/xfs/xfs_inode.c | 45 ++++++++++++++++++++++++++++++++++
>> > > fs/xfs/xfs_symlink.c | 5 ++++
>> > > 4 files changed, 78 insertions(+)
>> > >
>> > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
>> > > index 32aeacf6f055..5fd804534e67 100644
>> > > --- a/fs/xfs/libxfs/xfs_bmap.c
>> > > +++ b/fs/xfs/libxfs/xfs_bmap.c
>> > > @@ -5151,6 +5151,21 @@ xfs_bmap_del_extent_real(
>> > > /*
>> > > * Deleting the middle of the extent.
>> > > */
>> > > +
>> > > + /*
>> > > + * For directories, -ENOSPC will be handled by higher layers of
>> > > + * XFS by letting the corresponding empty Data/Free blocks to
>> > > + * linger around until a future remove operation. Dabtree blocks
>> > > + * would be swapped with the last block in the leaf space and
>> > > + * then the new last block will be unmapped.
>> > > + */
>> > > + if (S_ISDIR(VFS_I(ip)->i_mode) &&
>> > > + whichfork == XFS_DATA_FORK &&
>> > > + xfs_iext_count_may_overflow(ip, whichfork, 1)) {
>> > > + error = -ENOSPC;
>> > > + goto done;
>> >
>> > Hmm... it strikes me as a little odd that we're checking file mode and
>> > fork type in the middle of the bmap code. However, I think it's the
>> > case that the only place where anyone would punch a hole in the /middle/
>> > of an extent is xattr trees and regular files, right? And both of those
>> > cases are checked before we end up in the bmap code, right?
>>
>> Yes, your observation is correct. I will remove the file mode and fork type
>> checks.
>
> You might want to leave an assert there just in case someone else trips
> over that...
Ok.
>
>>
>> >
>> > So we only really need this check to prevent extent count overflows when
>> > removing dirents from directories, like the comment says, and only
>> > because directories don't have a hard requirement that the bunmapi
>> > succeeds. And I think this logic covers xfs_remove too? That's a bit
>> > subtle, but as there's no extent count check in that function, there's
>> > not much to attach a comment to... :)
>>
>> Yes. To provide more clarity, I will replace the above comment with the
>> following:
>>
>> /*
>> * -ENOSPC is returned since a directory entry remove operation must not fail
>> * due to low extent count availability. -ENOSPC will be handled by higher
>> * layers of XFS by letting the corresponding empty Data/Free blocks to linger
>
> grammar nit: "...by letting the corresponding empty Data/Free blocks linger
> until a future remove operation."
>
Ok, I will fix that.
>> * around until a future remove operation. Dabtree blocks would be swapped with
>> * the last block in the leaf space and then the new last block will be
>> * unmapped.
>> *
>> * The above logic also applies to the source directory entry of a rename
>> * operation.
>> */
>>
>> >
>> > Hm. I think I'd like xfs_rename to get a brief comment that we're
>> > protected from extent count overflows in xfs_remove() by virtue of this
>> > "leave the dir block in place if we ENOSPC" capability:
>> >
>> > /*
>> > * NOTE: We don't need to check for extent overflows here
>> > * because the dir removename code will leave the dir block
>> > * in place if the extent count would overflow.
>> > */
>> > error = xfs_dir_removename(...);
>>
>> Sure, I will add that.
>>
>> >
>> > Do xattr trees also have the same ability? I think they do, at least
>> > for the dabtree part...?
>>
>> The following code snippet from xfs_da_shrink_inode() does the special casing
>> only for the data fork i.e. for blocks holding directory entries.
>>
>> for (;;) {
>>         /*
>>          * Remove extents. If we get ENOSPC for a dir we have to move
>>          * the last block to the place we want to kill.
>>          */
>>         error = xfs_bunmapi(tp, dp, dead_blkno, count,
>>                             xfs_bmapi_aflag(w), 0, &done);
>>         if (error == -ENOSPC) {
>>                 if (w != XFS_DATA_FORK)
>>                         break;
>>                 error = xfs_da3_swap_lastblock(args, &dead_blkno,
>>                                                &dead_buf);
>>
>> So this facility is available only for directory entries.
>>
>> Hence for xattrs, if we ever reach the extent count limit, the only way out is
>> to delete the corresponding file.
>
> Ok. Just checking. :)
>
>> >
>> > I think I would've split this patch into three pieces:
>> >
>> > - create, link, and symlink in one patch (adding dirents),
>> > - the xfs_bmap_del_extent_real change and a comment for xfs_remove
>> > (removing dirents)
>> > - all the xfs_rename changes (adding and removing dirents)
>> >
>> > Though I dunno, this series is already 14 patches, and the part that I
>> > care most about is not leaving that subtlety in xfs_remove(). :)
>>
>> I think you are right about that. I will split this patch according to what
>> you have mentioned above.
>
>
> Thanks!
>
> FWIW I didn't see any regressions in fstests...
Cool. Thanks for the test run.
>
>> >
>> > Other than that, I follow the logic in this patch and will give it a
>> > testrun tonight.
>>
>> Thank you!
>>
>> >
>> > --D
>> >
>> > > + }
>> > > +
>> > > old = got;
>> > >
>> > > got.br_blockcount = del->br_startoff - got.br_startoff;
>> > > diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
>> > > index bcac769a7df6..ea1a9dd8a763 100644
>> > > --- a/fs/xfs/libxfs/xfs_inode_fork.h
>> > > +++ b/fs/xfs/libxfs/xfs_inode_fork.h
>> > > @@ -47,6 +47,19 @@ struct xfs_ifork {
>> > > */
>> > > #define XFS_IEXT_PUNCH_HOLE_CNT (1)
>> > >
>> > > +/*
>> > > + * Directory entry addition can cause the following,
>> > > + * 1. Data block can be added/removed.
>> > > + * A new extent can cause extent count to increase by 1.
>> > > + * 2. Free disk block can be added/removed.
>> > > + * Same behaviour as described above for Data block.
>> > > + * 3. Dabtree blocks.
>> > > + * XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new
>> > > + * extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH.
>> > > + */
>> > > +#define XFS_IEXT_DIR_MANIP_CNT(mp) \
>> > > + ((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount)
>> > > +
>> > > /*
>> > > * Fork handling.
>> > > */
>> > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>> > > index b7352bc4c815..0db21368c7e1 100644
>> > > --- a/fs/xfs/xfs_inode.c
>> > > +++ b/fs/xfs/xfs_inode.c
>> > > @@ -1042,6 +1042,11 @@ xfs_create(
>> > > if (error)
>> > > goto out_trans_cancel;
>> > >
>> > > + error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
>> > > + XFS_IEXT_DIR_MANIP_CNT(mp));
>> > > + if (error)
>> > > + goto out_trans_cancel;
>> > > +
>> > > /*
>> > > * A newly created regular or special file just has one directory
>> > > * entry pointing to them, but a directory also the "." entry
>> > > @@ -1258,6 +1263,11 @@ xfs_link(
>> > > xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL);
>> > > xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL);
>> > >
>> > > + error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK,
>> > > + XFS_IEXT_DIR_MANIP_CNT(mp));
>> > > + if (error)
>> > > + goto error_return;
>> > > +
>> > > /*
>> > > * If we are using project inheritance, we only allow hard link
>> > > * creation in our tree when the project IDs are the same; else
>> > > @@ -3106,6 +3116,35 @@ xfs_rename(
>> > > /*
>> > > * Check for expected errors before we dirty the transaction
>> > > * so we can return an error without a transaction abort.
>> > > + *
>> > > + * Extent count overflow check:
>> > > + *
>> > > + * From the perspective of src_dp, a rename operation is essentially a
>> > > + * directory entry remove operation. Hence the only place where we check
>> > > + * for extent count overflow for src_dp is in
>> > > + * xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns
>> > > + * -ENOSPC when it detects a possible extent count overflow and in
>> > > + * response, the higher layers of directory handling code do the
>> > > + * following:
>> > > + * 1. Data/Free blocks: XFS lets these blocks linger around until a
>> > > + * future remove operation removes them.
>> > > + * 2. Dabtree blocks: XFS swaps the blocks with the last block in the
>> > > + * Leaf space and unmaps the last block.
>> > > + *
>> > > + * For target_dp, there are two cases depending on whether the
>> > > + * destination directory entry exists or not.
>> > > + *
>> > > + * When destination directory entry does not exist (i.e. target_ip ==
>> > > + * NULL), extent count overflow check is performed only when transaction
>> > > + * has a non-zero sized space reservation associated with it. With a
>> > > + * zero-sized space reservation, XFS allows a rename operation to
>> > > + * continue only when the directory has sufficient free space in its
>> > > + * data/leaf/free space blocks to hold the new entry.
>> > > + *
>> > > + * When destination directory entry exists (i.e. target_ip != NULL), all
>> > > + * we need to do is change the inode number associated with the already
>> > > + * existing entry. Hence there is no need to perform an extent count
>> > > + * overflow check.
>> > > */
>> > > if (target_ip == NULL) {
>> > > /*
>> > > @@ -3116,6 +3155,12 @@ xfs_rename(
>> > > error = xfs_dir_canenter(tp, target_dp, target_name);
>> > > if (error)
>> > > goto out_trans_cancel;
>> > > + } else {
>> > > + error = xfs_iext_count_may_overflow(target_dp,
>> > > + XFS_DATA_FORK,
>> > > + XFS_IEXT_DIR_MANIP_CNT(mp));
>> > > + if (error)
>> > > + goto out_trans_cancel;
>> > > }
>> > > } else {
>> > > /*
>> > > diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
>> > > index 1f43fd7f3209..0b8136a32484 100644
>> > > --- a/fs/xfs/xfs_symlink.c
>> > > +++ b/fs/xfs/xfs_symlink.c
>> > > @@ -220,6 +220,11 @@ xfs_symlink(
>> > > if (error)
>> > > goto out_trans_cancel;
>> > >
>> > > + error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
>> > > + XFS_IEXT_DIR_MANIP_CNT(mp));
>> > > + if (error)
>> > > + goto out_trans_cancel;
>> > > +
>> > > /*
>> > > * Allocate an inode for the symlink.
>> > > */
>> >
>>
--
chandan
Thread overview: 19+ messages
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 01/14] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 02/14] xfs: Check for extent overflow when trivally adding a new extent Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 03/14] xfs: Check for extent overflow when punching a hole Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries Chandan Babu R
2021-01-08 1:17 ` Darrick J. Wong
2021-01-08 4:55 ` Chandan Babu R
2021-01-09 1:33 ` Darrick J. Wong
2021-01-09 5:26 ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 05/14] xfs: Check for extent overflow when adding/removing xattrs Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 06/14] xfs: Check for extent overflow when writing to unwritten extent Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 07/14] xfs: Check for extent overflow when moving extent from cow to data fork Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 08/14] xfs: Check for extent overflow when remapping an extent Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 09/14] xfs: Check for extent overflow when swapping extents Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 10/14] xfs: Introduce error injection to reduce maximum inode fork extent count Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 11/14] xfs: Remove duplicate assert statement in xfs_bmap_btalloc() Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 12/14] xfs: Compute bmap extent alignments in a separate function Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 13/14] xfs: Process allocated extent " Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 14/14] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R